Application of Lempel-Ziv complexity to alignment-free sequence comparison of protein families
Related documents
Central to the InterPro database are predictive models, known as signatures, from a range of different protein family databases that have different biological focuses and
The assignment is based on support vector machine classification of binary feature vectors denoting the presence or absence in the protein of highly conserved
It can be used to comprehensively evaluate AF methods under five different sequence analysis scenarios: protein sequence classification, gene tree inference, regulatory
Abstract: We give a new analysis and proof of the Normal limiting distribution of the number of phrases in the 1978 Lempel-Ziv compression algorithm on random sequences built from
The process of producing the database of homology-derived structures is effectively a partial merger of the database of known three-dimensional structures, here the
This motivated us to replace the notion of common contact by the more general notion of similar internal distance (according to a fixed threshold), and then to propose
When co-variation analysis aims to gain evolutionary information, OMES and ELSC are well suited to identify the co-evolving residues that contributed to the divergence within
Maximum Entropy modelling of a few protein families with pairwise interaction Potts models gives values of σ ranging between 1.2 (when all amino acids present in the multiple