Identification of the physicochemical characteristics of peptides that influence their hydrolysis by pepsin
Ousmane SUWAREH
Joint work with: Françoise NAU and David CAUSEUR
Context
Variation of food structure has significant effects on the digestion process:
• Digestion kinetics
• Type of peptides released
Nyemb et al. 2016
Aim: understanding the action of pepsin
Which physicochemical characteristics would favor the hydrolysis of a cleavage site by pepsin during the gastric phase?
Probabilistic modelling of a peptide cleavage
OUTLINE
1. Materials and Methods
   Nyemb et al. (2016) study design
   Data processing
   Pepsinolysis dynamics modelling
   Overview of data
   Generalized Additive Modelling (GAM)
   Model selection and assessment
2. Results
   Overall approach
   Specific approach
3. Conclusion and Perspectives
Nyemb et al. (2016) study
Egg white gels: pH 2, pH 5, pH 7, pH 9
In vitro digestion
Identification of peptides by mass spectrometry at T0, T2, T10 and T60
Selection of peptides obtained from ovalbumin (OVA)
Nyemb et al. 2016
Data processing
One data set per type of gel and digestion time:
OVA_pH2_T0    OVA_pH5_T0    OVA_pH7_T0    OVA_pH9_T0
OVA_pH2_T2    OVA_pH5_T2    OVA_pH7_T2    OVA_pH9_T2
OVA_pH2_T10   OVA_pH5_T10   OVA_pH7_T10   OVA_pH9_T10
OVA_pH2_T60   OVA_pH5_T60   OVA_pH7_T60   OVA_pH9_T60
Columns of the resulting table: peptide (e.g. OVA.100.106), digestion time (T0, T2, T10, T60), type of gel (pH2, pH5, pH7, pH9), intensity, characteristics of the peptide
1043 peptides, 4 measuring times, 4 types of gel, 9039 intensities
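A minimal sketch of how the 16 data sets could be stacked into one long table. The data set names follow the pattern shown on the slide; the `Record` type, the function names and the toy intensities are assumptions made for illustration, not the authors' pipeline.

```python
import re
from typing import Dict, List, NamedTuple, Tuple


class Record(NamedTuple):
    peptide: str      # e.g. "OVA.100.106"
    gel: str          # "pH2", "pH5", "pH7" or "pH9"
    time_min: int     # digestion time in minutes: 0, 2, 10 or 60
    intensity: float


# Pattern of the data set names shown on the slide, e.g. "OVA_pH2_T60".
_NAME = re.compile(r"^OVA_(pH\d+)_T(\d+)$")


def parse_dataset_name(name: str) -> Tuple[str, int]:
    """Split a data set name into (type of gel, digestion time)."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unexpected data set name: {name!r}")
    return match.group(1), int(match.group(2))


def to_long(datasets: Dict[str, Dict[str, float]]) -> List[Record]:
    """Stack {data set name: {peptide: intensity}} into one row per intensity."""
    rows: List[Record] = []
    for name, intensities in datasets.items():
        gel, time_min = parse_dataset_name(name)
        for peptide, intensity in intensities.items():
            rows.append(Record(peptide, gel, time_min, intensity))
    return rows


# Toy input: the intensities are invented for the example.
toy = {
    "OVA_pH2_T0": {"OVA.100.106": 1.5e6},
    "OVA_pH2_T60": {"OVA.100.106": 2.0e5, "OVA.50.60": 3.1e4},
}
rows = to_long(toy)
```

Each row then carries the peptide, the gel, the time and the intensity, which matches the column layout listed on the slide.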
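Peptide intensities are converted into presence/absence data
Does the peptide "a" come from the cleavage of B, C or D?
Example (positions in the ovalbumin sequence): peptide a spans 50-60; peptides B, C and D (spans 50-72, 40-70 and 30-80) all contain a; peptide E (30-55) does not
B, C and D are potentially cleaved peptides; E is not

Peptide   Tx son
B         1
C         1
D         1
E         0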
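The "son" indicator of the table above can be sketched as a containment test on sequence positions. This assumes that a peptide counts as potentially cleaved when a strictly shorter peptide detected at time Tx lies within its span; the function names and the assignment of the three spans to the labels B, C and D are illustrative.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (first, last) residue position in ovalbumin, inclusive


def contains(parent: Span, child: Span) -> bool:
    """True if child lies within parent and is not the same peptide."""
    return parent[0] <= child[0] and child[1] <= parent[1] and parent != child


def son_indicator(parents: Dict[str, Span], detected_at_tx: List[Span]) -> Dict[str, int]:
    """1 if at least one peptide detected at Tx could be a fragment of the parent."""
    return {
        name: int(any(contains(span, child) for child in detected_at_tx))
        for name, span in parents.items()
    }


# Spans from the slide; which of B, C, D carries which span is an assumption.
parents = {"B": (50, 72), "C": (40, 70), "D": (30, 80), "E": (30, 55)}
detected_at_tx = [(50, 60)]  # peptide "a"

indicator = son_indicator(parents, detected_at_tx)
print(indicator)  # {'B': 1, 'C': 1, 'D': 1, 'E': 0}
```

The indicator only says that a cleavage is possible: peptide a may come from B, C or D, so all three are flagged.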
Post-translational modifications (PTM) of ovalbumin
The phosphoserines potentially found in positions 68 and 344 are PTMs (Nisbet, Saundry, Moir, Fothergill, & Fothergill, 1981; Perlmann, 1952)
Other modifications (acetylation, oxidation...) are chemical modifications (CM) generated during MS identification
Example: OVA.313.358...S32..79.96633
The serine residue in position 32 of the peptide is phosphorylated
Peptides that only differ by PTMs are considered to be different
Peptides that only differ by the presence of CM are assumed to be the same
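A short sketch of how an identifier such as the example above could be decoded. The identifier grammar is inferred from that single example and is therefore an assumption, as are the function names. With 1-based positions, residue 32 of a peptide starting at 313 falls on position 344 of ovalbumin, one of the two phosphoserine sites mentioned on the slide.

```python
import re
from typing import NamedTuple, Optional


class Peptide(NamedTuple):
    start: int                    # first residue position in ovalbumin
    stop: int                     # last residue position in ovalbumin
    residue: Optional[str]        # modified residue (one-letter code), if any
    position: Optional[int]       # 1-based position of that residue in the peptide
    mass_shift: Optional[float]   # mass shift of the modification, in Da


# Assumed grammar: OVA.<start>.<stop>[...<residue><position>..<mass shift>]
_ID = re.compile(r"^OVA\.(\d+)\.(\d+)(?:\.\.\.([A-Z])(\d+)\.\.(\d+\.\d+))?$")


def parse_peptide_id(peptide_id: str) -> Peptide:
    match = _ID.match(peptide_id)
    if match is None:
        raise ValueError(f"unexpected identifier: {peptide_id!r}")
    start, stop, residue, position, shift = match.groups()
    return Peptide(
        int(start),
        int(stop),
        residue,
        int(position) if position is not None else None,
        float(shift) if shift is not None else None,
    )


def protein_position(peptide: Peptide) -> Optional[int]:
    """Position of the modified residue in the ovalbumin sequence."""
    if peptide.position is None:
        return None
    return peptide.start + peptide.position - 1


def is_phosphoserine(peptide: Peptide, tolerance: float = 0.01) -> bool:
    """Serine carrying the phosphorylation mass shift (79.96633 Da)."""
    return (
        peptide.residue == "S"
        and peptide.mass_shift is not None
        and abs(peptide.mass_shift - 79.96633) < tolerance
    )


example = parse_peptide_id("OVA.313.358...S32..79.96633")
print(protein_position(example), is_phosphoserine(example))  # 344 True
```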
OUTLINE
1. Materials and Methods
   Nyemb et al. (2016) study design
   Data processing
   Approaches used
   Description of the explanatory variables
   Generalized Additive Models (GAMs)
   Building of models
2. Results
   Overall approach
   Specific approach
3. Conclusion and Perspectives
Non-parametric modelling of pepsinolysis dynamics
Construction of generalized additive models (GAMs) of the probability of a cleavage based on the physicochemical characteristics of peptides
2 separate approaches using the same regression framework:
• Global approach: for all peptides
• Specific approach: for the "preferential" types of cleavage made by pepsin (Tonda, Grosvenor, Clerens, & Le Feunteun, 2017; Hamuro, 2008)
A brief overview of data
[Figure: number of peptides potentially cleaved by pepsin for the global approach (among all those detected at 0'), for each gel (pH 2, pH 5, pH 7, pH 9), before 2', before 10' and before 60']
[Figure: number of peptides potentially cleaved by pepsin for the specific approach (among those detected at 0' showing a preferential cleavage site), for each gel (pH 2, pH 5, pH 7, pH 9), before 2', before 10' and before 60']
The explanatory variables
Peptide location
• Position of the 1st amino acid residue in the protein sequence of ovalbumin
• Position of the last amino acid residue
Peptide size
• Total number of amino acid residues of the peptide
• Number of each type of amino acid residue
• Number of essential amino acid residues
Physicochemical characteristics of the peptide
• Number of aromatic amino acid residues
• Number of sulphur-containing amino acid residues
• Presence/absence of a PTM on the peptide
• GRAVY index (hydrophobicity index)
• Isoelectric point
• PeptideRanker score (prediction of bioactivity)
• Aliphatic index (thermostability index)
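Several of these variables can be computed directly from the peptide sequence. The sketch below assumes the usual definitions: GRAVY as the mean Kyte-Doolittle hydropathy, the aliphatic index of Ikai (1980), aromatic residues F, W, Y, sulphur-containing residues C, M, and the nine essential amino acids. The study may have used other conventions or tools. The isoelectric point and the PeptideRanker score are left out because they require a pKa set and an external predictor. The key `nbAAsulphur` and the toy sequence are hypothetical.

```python
from typing import Dict, Union

# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = set("FWY")        # assumption: histidine not counted as aromatic
SULPHUR = set("CM")
ESSENTIAL = set("HILKMFTWV")


def gravy(seq: str) -> float:
    """Mean hydropathy of the residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """100 * (x_Ala + 2.9 * x_Val + 3.9 * (x_Ile + x_Leu)), x = mole fraction."""
    n = len(seq)
    x = {aa: seq.count(aa) / n for aa in "AVIL"}
    return 100.0 * (x["A"] + 2.9 * x["V"] + 3.9 * (x["I"] + x["L"]))


def descriptors(seq: str, start: int) -> Dict[str, Union[int, float]]:
    """Sequence-based explanatory variables for one peptide."""
    return {
        "Start": start,
        "Stop": start + len(seq) - 1,
        "nbreAA": len(seq),
        "nbAAaromatic": sum(aa in AROMATIC for aa in seq),
        "nbAAsulphur": sum(aa in SULPHUR for aa in seq),
        "nbAAessentiels": sum(aa in ESSENTIAL for aa in seq),
        "GRAVY": gravy(seq),
        "Aliphatic": aliphatic_index(seq),
    }


# Toy sequence, not taken from ovalbumin.
toy = descriptors("AVILF", start=10)
```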
Why Generalized Additive Models?
With:
• The response: with cleavage / no cleavage
• The explanatory variables of the model
• The smooth functions associated with each variable
For linear logistic regression:
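In standard notation, the two models compared on this slide can be written as follows. The symbols $Y$, $x_j$, $f_j$, $\beta_j$ and $p$ are assumptions chosen for this sketch, not necessarily those of the original slide: $Y = 1$ with cleavage and $Y = 0$ with no cleavage, $x_1, \dots, x_p$ are the explanatory variables and $f_1, \dots, f_p$ are the smooth functions associated with each variable.

```latex
% Logistic generalized additive model
\operatorname{logit} P(Y = 1 \mid x_1, \dots, x_p)
  = \log \frac{P(Y = 1 \mid x_1, \dots, x_p)}{1 - P(Y = 1 \mid x_1, \dots, x_p)}
  = \beta_0 + \sum_{j=1}^{p} f_j(x_j)

% Linear logistic regression: special case f_j(x_j) = \beta_j x_j
\operatorname{logit} P(Y = 1 \mid x_1, \dots, x_p)
  = \beta_0 + \sum_{j=1}^{p} \beta_j x_j
```

Linear logistic regression is thus the special case in which every smooth function is a straight line.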
Why Generalized Additive Models?
• Additivity allows the separate evaluation of the effect of each physicochemical variable on the probability of cleavage, using model selection strategies
• Testing the significance of differences between pepsinolysis dynamics in different conditions (gels)
• Nonparametric regression offers great flexibility for the shapes of marginal effects
[Figure: cleavage probabilities of the 4 gels (pH 2, pH 5, pH 7, pH 9) between 0' and 2' along the isoelectric point of the peptide; annotation: S-shape curve]
Model fitting and assessment
• Model fitting
• Not all explanatory variables are kept in the model: forward stepwise selection based on minimum AIC
• Cross-validation of the models for the prediction of peptide cleavage
• Sequence of thresholds evaluated from 0 to 1
[Figure: proportion of true and false positives along the threshold]
[Figure: ROC curve of the model predicting the cleavage of a peptide between 0' and 2' for the pH 9 gel]
AUC of the different ROC curves
AIC: Akaike Information Criterion; AUC: Area Under the Curve; ROC: Receiver Operating Characteristic (sensitivity/specificity curve)
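The two steps of this slide can be sketched in plain Python: a greedy forward selection driven by AIC, and a ROC curve built from thresholds between 0 and 1 with its AUC obtained by the trapezoidal rule. This is a hedged illustration, not the authors' code. The `aic` callback stands for "fit the GAM on this subset of variables and return its AIC" and is replaced here by a made-up function. In practice the probabilities passed to `roc_points` would be the cross-validated predictions.

```python
from typing import Callable, FrozenSet, Iterable, List, Sequence, Tuple


def forward_select(
    candidates: Iterable[str], aic: Callable[[FrozenSet[str]], float]
) -> List[str]:
    """Add, one at a time, the variable that lowers AIC most; stop when none does."""
    selected: List[str] = []
    remaining = list(candidates)
    best = aic(frozenset())
    while remaining:
        score, var = min((aic(frozenset(selected + [v])), v) for v in remaining)
        if score >= best:
            break
        selected.append(var)
        remaining.remove(var)
        best = score
    return selected


def roc_points(
    y: Sequence[int], prob: Sequence[float], n_thresholds: int = 101
) -> List[Tuple[float, float]]:
    """(false positive rate, true positive rate) for thresholds from 0 to 1."""
    n_pos = sum(y)
    n_neg = len(y) - n_pos
    points = set()
    for i in range(n_thresholds):
        t = i / (n_thresholds - 1)
        tp = sum(1 for yi, pi in zip(y, prob) if pi >= t and yi == 1)
        fp = sum(1 for yi, pi in zip(y, prob) if pi >= t and yi == 0)
        points.add((fp / n_neg, tp / n_pos))
    return sorted(points)


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


def toy_aic(subset: FrozenSet[str]) -> float:
    """Made-up AIC: two useful variables and a penalty of 2 per variable."""
    return 100 - 10 * ("nbreAA" in subset) - 5 * ("Stop" in subset) + 2 * len(subset)


chosen = forward_select(["nbreAA", "Stop", "pI"], toy_aic)
area = auc(roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

With the toy AIC, `pI` is rejected because adding it raises the criterion, which is the behaviour expected from a minimum-AIC stopping rule.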
Global approach
For each model: combination of 2 to 8 variables
Selected physicochemical variables depend on gel type and digestion time
21 of the 33 variables are part of at least one model

Presence in the model (12 models in total)   Variables
11                                           nbreAA
8                                            Stop
4                                            PTM
3                                            Start, nbAAaromatic, nbA, nbS, nbF, nbT, Charges(+)
2                                            nbAAessentiels, nbY, Charges(-), pI, Aliphatic
1                                            GRAVY, nbQ, nbE, nbK, nbV, Peptideranker

AUC of the different ROC curves
          pH 2     pH 5     pH 7     pH 9
T2son     0.7371   0.6299   0.7307   0.768
T10son    0.6639   0.6288   0.7328   0.7595
T60son    0.6656   0.5455   0.798    0.7406

Application of conditions
          pH 2     pH 5     pH 7     pH 9
T2son     0.9678   0.9423   0.986    0.9747
T10son    0.9432   0.9687   0.9493   0.9763
T60son    0.9614   0.938    0.9381   0.9657
With the model of T60son: 0.9509
Approach specific to preferential cleavage types
• Improve the accuracy of the models by considering the specificity of pepsin
• Only observe a "structural effect"
1. Reduction of the dataset to the peptides concerned
2. Addition of new filters linked to certain pepsin specificities
• Reduced activity of pepsin near terminal amino acids (Power et al. 1977)
Approach specific to preferential cleavage types
For each model: combination of 2 to 4 variables
Selected physicochemical variables depend on gel type and digestion time
14 of the 33 variables are part of at least one model

Presence in the model (12 models in total)   Variables
9                                            nbreAA
9                                            Start
4                                            pI
3                                            Stop, PTM, nbP
2                                            nbAAessentials
1                                            Aliphatic, charges(+), nbAAaromatic, Peptideranker, nbI, nbA, nbF

AUC of the different ROC curves
          pH 2     pH 5     pH 7     pH 9
T2son     0.9721   0.9293   0.9992   0.9959
T10son    0.9608   0.9221   0.955    0.9801
T60son    0.95     0.9596   0.9704   0.9607
• AUC values: between 0.9221 and 0.9959
Conclusion