Récemment recherché

Aucun résultat trouvé

Étiquettes

Aucun résultat trouvé

Document

Aucun résultat trouvé

Accueil Écoles Thèmes

Connexion

Using residues coevolution to search for protein homologs through alignment of Potts models

Partager "Using residues coevolution to search for protein homologs through alignment of Potts models"

N/A

N/A

Protected

Année scolaire: 2021

Info

Protected

Academic year: 2021

Partager "Using residues coevolution to search for protein homologs through alignment of Potts models"

Copied!

2

0

0

2

0

0

Chargement.... (Voir le texte intégral maintenant)

Télécharger maintenant ( 2 Page )

Texte intégral

(1)

HAL Id: hal-02402687

https://hal.inria.fr/hal-02402687

Submitted on 10 Dec 2019

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Using residues coevolution to search for protein homologs through alignment of Potts models

Hugo Talibart, François Coste

To cite this version:

Hugo Talibart, François Coste. Using residues coevolution to search for protein homologs through

alignment of Potts models. JOBIM 2019 - Journées Ouvertes Biologie, Informatique et Mathématiques,

Jul 2019, Nantes, France. pp.1. �hal-02402687�

(2)

Using residues coevolution to search for protein homologs through alignment of Potts models

Hugo Talibart

¹

and Fran¸ cois Coste

¹

Univ Rennes, Inria, CNRS, IRISA, Campus de Beaulieu, 35042, Rennes, France Corresponding author: hugo.talibart@irisa.fr

Thanks to sequencing technologies, the number of available protein sequences has considerably increased in the past years, but their functional and structural annotation remains a challenge.

This task is classically performed in silico by retrieving well-annotated homologs with profile Hid- den Markov Models (pHMMs), which are probabilistic models of families of homologous proteins capturing position-specific information with admissible insertion and deletion states. Two well-known software packages using pHMMs are widely used today : HMMER [1] aligns sequences to pHMMs to perform similarity searches, and HH-suite [2] takes it further by aligning pHMMs to pHMMs, enabling more sensitive remote homology searches.

Despite their solid performance, pHMMs are innerly limited by their positional nature. Yet, it is well-known that residues that are distant in the sequence can interact and co-evolve, e.g. due to their spatial proximity, resulting in correlated positions. Analyzing such correlations in a multiple sequence alignment by Direct Coupling Analysis [3], a statistical method to disentangle direct from indirect correlations, led to a breakthrough in the field of contact prediction [4]. Direct couplings are identified by inferring a Markov Random Field referred to as Potts model, and this model is of interest beyond its application in structure prediction. Indeed, its parameters can describe both positional conservation and direct couplings between residues of a protein. Such features drove us to examine Potts models for the purposes of modeling proteins and searching for their homologs.

In this poster, we focus on the use of Potts models for homology search, more specifically on our method for aligning and comparing Potts models. We present here our tool, named ComPotts, which formulates alignment of Potts models as an Integer Linear Programming problem and relies on a solver initially dedicated to pairwise protein alignment [5] to find efficiently the exact solution, and we present our first experimental results. Our ambition is to develop a package which would be equivalent to HH-suite but with Potts models rather than pHMMs, and to investigate on the added value of the direct couplings they provide.

Acknowledgements

HT is supported by a PhD grant from Minist` ere de l’Enseignement Sup´ erieur et de la Recherche (MESR).

This work benefited from the support of the French government through the National Research Agency with regard to an investment expenditure program, IDEALG (ANR-10-BTBR-04).

We would like to warmly thank Inken Wohlers for providing us with her code.

References

[1] Sean R. Eddy. Profile hidden markov models. Bioinformatics (Oxford, England), 14(9):755–763, 1998.

[2] Martin Steinegger, Markus Meier, Milot Mirdita, Harald Voehringer, Stephan J Haunsberger, and Johannes Soeding. Hh-suite3 for fast remote homology detection and deep protein annotation. bioRxiv, page 560029, 2019.

[3] Martin Weigt, Robert A White, Hendrik Szurmant, James A Hoch, and Terence Hwa. Identification of direct residue contacts in protein–protein interaction by message passing. Proceedings of the National Academy of Sciences, 106(1):67–72, 2009.

[4] Bohdan Monastyrskyy, Daniel D’Andrea, Krzysztof Fidelis, Anna Tramontano, and Andriy Kryshtafovych.

New encouraging developments in contact prediction: Assessment of the casp 11 results. Proteins: Structure, Function, and Bioinformatics, 84:131–144, 2016.

[5] Inken Wohlers. Exact Algorithms For Pairwise Protein Structure Alignment. PhD thesis, Vrije Universiteit,

01 2012.

Références

Télécharger maintenant ( PDF - 2 Page - 163.45 KB )

Documents relatifs

Designing for Innovation: Using Enterprise Ontology Theory to Improve Business-IT Alignment

As argumented by Enterprise Engineering, a systematic and scientific approach for constructing enterprise architectures should contribute to a solution for these three

Using Goals to Model Strategy Map for Business IT Alignment

Regulatory and social processes is the fourth cluster of core processes, required for regulatory and environmental sustainability compliance purposes, which is not being

Commercial-Off-The-Shelf Simulation Packages (CSPs) are widely used in industry to simulate discrete-event models.

Some of the motivations for using distributed simulation are as follows (Fujimoto 1999; Fujimoto 2003) - (a) Distributed simulation can facilitate model reuse by “hooking

Graph Search of Software Models Using Multidimensional Scaling

Our approach consists of an indexing phase, in which ver- tices of the model graphs are indexed for eﬃcient search, and a search phase, in which (i) an index lookup is performed on

Interval extensions are used to compute with partially known values defined by intervals

In this examination, we study interval constraint propagation methods to verify the reachability of a set of states (defined by intervals of values for each variable) from

Using Docker to deploy computing software

Created image based on tensorflow containing the Jupyter notebook implemented in Python convolutional neural network using bibltoteki Keras. This image was uploaded to the public

ComPotts: Optimal alignment of coevolutionary models for protein sequences

MRFalign showed better alignment precision and recall than HHalign and HMMER on a dataset of 60K non-redundant SCOP70 protein pairs with less than 40% identity with respect to

Using residues coevolution to search for protein homologs through alignment of Potts models

Using residues coevolution to search for protein homologs through alignment of Potts models.. Hugo Talibart,

Téléchargez tous les documents en téléchargeant vos documents d'étude.

Votre document sera enrichi, partagé sur 123dok FR pour vous aider à étudier.

Documents relatifs

Temporalité et surveillance dans l'art de David Rokeby

Temporalité et surveillance dans l'art de David Rokeby

111

0

0

Using simulation models to simulation models to improve the

Using simulation models to simulation models to improve the

21

0

0

The more-than-economic dimensions of cooperation in food production

The more-than-economic dimensions of cooperation in food production

16

0

0

Accord-cadre autonome des partenaires sociaux européens sur la numérisation

Accord-cadre autonome des partenaires sociaux européens sur la numérisation

14

0

0

Berkovich spaces embed in Euclidean spaces

Berkovich spaces embed in Euclidean spaces

16

0

0

Fine-grained parallelization of similarity search between protein sequences

Fine-grained parallelization of similarity search between protein sequences

47

0

0

Etude de fils semi-conducteurs dopés individuels par techniques locales d'analyse de surface

Etude de fils semi-conducteurs dopés individuels par techniques locales d'analyse de surface

209

0

0

Le post-opératoire

Le post-opératoire

12

0

0