
1.5 Protein Dynamics and Protein Disorder


Academic year: 2021

Contents

1 Introduction 1

1.1 Some numbers about proteins . . . . 4

1.2 Relation between sequence and structure . . . . 6

1.3 Experimentally investigating the protein structure . . . . 8

1.3.1 Protein Crystallography . . . . 8

1.3.2 Nuclear Magnetic Resonance data . . . . 9

1.4 Anfinsen’s dogma, dynamics and protein behavior . . . 11

1.5 Protein Dynamics and Protein Disorder . . . 15

1.6 Estimation of dynamics from sequence . . . 17

1.7 Proteins as Probabilistic entities . . . 18

1.8 Computational models of biological processes . . . 18

1.9 Sequence analysis in bioinformatics . . . 19

1.10 Goal of the thesis . . . 20

1.11 Contributions and list of publications . . . 20

2 Methods 23

2.1 Building models for biology . . . 23

2.1.1 Machine learning . . . 23

2.1.2 Supervised and unsupervised models . . . 23

2.1.3 Training and testing the model: the concept of overfitting . . 24

2.1.4 Performance Evaluation . . . 26

2.1.5 Markov chains and Hidden Markov Models . . . 27

2.1.6 Support Vector Machines . . . 31

2.1.7 A brief overview on Neural Networks . . . 33

2.2 Tracking protein evolution . . . 36

2.2.1 The most used alignment tools . . . 37


2.2.1.1 Clustal . . . 38

2.2.1.2 Mafft . . . 38

2.3 Predicting properties of Protein sequences . . . 40

2.3.1 Disorder prediction . . . 40

2.3.1.1 IUpred . . . 41

2.3.1.2 ESpritz . . . 42

2.3.2 Protein beta aggregation prediction . . . 42

2.3.3 Annotation of Archaeal DNA-binding proteins . . . 44

3 Contributions 47

3.1 Rigapollo: feature-based protein alignment . . . 47

3.1.1 Translating amino acids into feature vectors . . . 50

3.1.2 Emission probabilities using SVMs . . . 52

3.1.3 Summary of the methodology . . . 54

3.1.4 Performance evaluation . . . 57

3.1.5 Datasets design . . . 57

3.1.6 Results . . . 58

3.1.7 Discussion . . . 63

3.2 AgMata: a beta-amyloid propensity predictor . . . 64

3.2.1 Approach . . . 65

3.2.1.1 Datasets . . . 66

3.2.1.2 Selection of the structural data . . . 66

3.2.1.3 Probability of pairing as discriminative problem . . . 66

3.2.1.4 Feature Vectors and application of a discriminative model . . . 67

3.2.1.5 Single-residue interaction probability calculation . . . 67

3.2.1.6 Beta Pairing Propensity . . . 68

3.2.2 Results . . . 69

3.2.3 Discussion . . . 70

3.3 DisoMine: a webserver for disorder prediction . . . 74

3.3.1 Training and testing Datasets . . . 74

3.3.2 Approach . . . 74

3.3.3 Results . . . 77

3.3.4 Discussion . . . 77


3.4 Xenusia: identification of archaeal DNA-binding proteins . . . 78

3.4.1 Datasets . . . 79

3.4.2 Approach . . . 80

3.4.2.1 Prediction of DNA-interacting residues . . . 80

3.4.2.2 Predicting the DNA binding domain . . . 81

3.4.3 Validation on archaeal proteins . . . 82

3.4.4 Discussion . . . 84

4 Experimental applications 86

4.1 In-silico mutagenesis of human Ataxin-3 . . . 86

4.1.1 Preliminary experimental results . . . 90

4.2 Identification of Archaeal DNA-binding proteins . . . 99

4.2.1 Asgard . . . 99

4.2.2 Sulfolobus acidocaldarius . . . 101

4.2.3 Experimental investigation . . . 102

4.3 Summary of the experimental procedures . . . 107

4.3.1 In-silico mutagenesis of human Ataxin-3 . . . 107

4.3.2 DNA-binding assay . . . 108

5 Conclusions and Future work 109

5.1 Obtaining reliable predictions from noisy data . . . 109

5.2 Computational and non-computational sciences . . . 110

5.3 Dynamics and predicted dynamics . . . 110

5.4 Concerns about the developed tools . . . 112

5.4.1 Protein Alignments and Rigapollo . . . 112

5.4.2 Predicting protein disorder with DisoMine . . . 113

5.4.3 DNA-binding protein identification in archaea . . . 113

5.4.4 Beta aggregation prediction with AgMata . . . 114

5.5 Future work . . . 115

6 Appendix 117

6.1 Glossary . . . 117

6.2 Methods Supplementary details . . . 120

6.3 Acronyms . . . 121


7 Acknowledgements 123

Bibliography 124
