
Genetic distances : description, limits and future


HAL Id: hal-02702484

https://hal.inrae.fr/hal-02702484

Submitted on 1 Jun 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Genetic distances : description, limits and future

Jacques Cabaret

To cite this version:

Jacques Cabaret. Genetic distances : description, limits and future. Veterinary Research, BioMed Central, 1994, 25 (6), pp.609-611. ⟨hal-02702484⟩


de biologie moléculaire, hôpital Saint-Antoine, 75012 Paris, France)

Despite major progress in both diagnosis and treatment, Pneumocystis carinii remains a leading opportunistic agent, and many puzzles persist, particularly regarding its epidemiology. The mode of transmission has yet to be elucidated, a question made all the more topical because several authors have raised the possibility that this infection is nosocomial: Chave et al (1991) and Goesch et al (1990). An answer could come from demonstrating differences between strains. With this in mind, we looked for genomic variation among P carinii isolated from different patients, and we present the preliminary results of this study.

We carried out a PCR, following the technique described by Wakefield et al (1990), on DNA extracted from 21 bronchoalveolar lavages from patients with pneumocystosis confirmed by the usual diagnostic techniques. After purification of the amplification products, direct sequencing was performed. The sequences obtained were compared with one another and with those initially described by Sinclair et al (1991) and by Lee et al (1993).

The results obtained indicate that the nucleotide sequences of P carinii differ from one another and from the published sequences.

References

Chave JP, Davis S, Melle GV, Francioli P (1991) Transmission of Pneumocystis carinii from AIDS patients to other immunosuppressed patients: a cluster of Pneumocystis carinii pneumonia in renal transplant recipients. AIDS 5, 927-932

Goesch TR, Gotz G, Steillbrinck KH, Albrecht H, Hossfeld DK (1990) Possible transfer of Pneumocystis carinii between immunodeficient patients. Lancet 336, 627

Lee CH, Lu JJ, Bartlett MS et al (1993) Nucleotide sequence variation in Pneumocystis carinii strains that infect humans. J Clin Microbiol 31, 754-757

Sinclair K, Wakefield AE, Banerji S, Hopkin JM (1991) Pneumocystis carinii organisms derived from rat and human hosts are genetically distinct. Mol Biochem Parasitol 45, 183-184

Wakefield AE, Pixley F, Banerji S et al (1990) Detection of Pneumocystis carinii with DNA amplification. Lancet 336, 451-453

Genetic distances: description, limits and future. J Cabaret (INRA, station de pathologie aviaire et de parasitologie, laboratoire d'écologie des parasites, 37380 Nouzilly, France)

The assessment of resemblance between objects has preoccupied many researchers involved in morphological, ecological or genetic investigations. Legendre and Legendre (1979) described a large array of critical distances used in the field of ecology. Geneticists have also developed their own distances, which were reviewed by de Vienne and Damerval (1985). Since that review, advances in molecular biology and in statistics (resampling methods have become practical thanks to progress in computing capacity) justify a re-examination of distances. The use of distances as basic material for phylogenetic reconstruction is now widespread (Darlu and Tassy, 1993), and further investigation of the limits of distances is needed. The present work focuses on current developments in the use of distances.

There are 'good' and 'bad' distances from the mathematical point of view. The 'bad' ones do not allow comparisons between several populations. The 'good' distances should obey the triangle inequality: if the distances between 3 taxa or populations (A, B, C) are compared, distance (AC) must be less than or equal to distance (AB) plus distance (BC). Distances used by ecologists, such as Jaccard, Sokal and Sneath, and chi-squared, obey this triangle inequality. The genetic distances of Rogers, Edwards and Gregorius also obey the triangle inequality, whereas the Nei distance does not. The latter, although very widely used in genetic investigations, is the least efficient at comparing populations from the mathematical point of view. Euclidean distances derived from multivariate analyses (eg, principal component analysis of phenotypic frequencies between populations) may prove of wider interest when two-by-two independent comparisons are required (Gasnier et al, 1992). The multivariate approach is probably best when the aim is to compare populations on morphological, ecological and genetic grounds (Hoste and Cabaret, 1992).
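The triangle inequality described above can be checked numerically. The following is a minimal sketch, assuming single-locus allele frequencies, the usual textbook forms of the Rogers distance and of Nei's standard distance (without sample-size correction), and made-up frequencies for three populations; the function names are illustrative only.

```python
import math
from itertools import permutations


def rogers_distance(x, y):
    """Rogers distance for one locus: sqrt(0.5 * sum of squared frequency differences)."""
    return math.sqrt(0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))


def nei_standard_distance(x, y):
    """Nei's standard distance for one locus: -ln(Jxy / sqrt(Jx * Jy)), no bias correction."""
    jxy = sum(a * b for a, b in zip(x, y))
    jx = sum(a * a for a in x)
    jy = sum(b * b for b in y)
    return -math.log(jxy / math.sqrt(jx * jy))


def obeys_triangle_inequality(populations, distance, tol=1e-12):
    """Return True if d(A, C) <= d(A, B) + d(B, C) for every ordered triple."""
    for a, b, c in permutations(populations, 3):
        if distance(a, c) > distance(a, b) + distance(b, c) + tol:
            return False
    return True


# Hypothetical allele frequencies (three alleles at one locus) for populations A, B, C.
POPULATIONS = [
    (0.8, 0.1, 0.1),
    (0.4, 0.4, 0.2),
    (0.1, 0.1, 0.8),
]

print("Rogers obeys triangle inequality:", obeys_triangle_inequality(POPULATIONS, rogers_distance))
print("Nei obeys triangle inequality:", obeys_triangle_inequality(POPULATIONS, nei_standard_distance))
```

With these illustrative frequencies the Rogers check passes, while the Nei check fails because the distance between the first and third populations (about 1.36) exceeds the sum of the two other distances (about 0.96).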

One of the most important drawbacks in the use of distances is the absence of an accurate estimation of variability. Gregorius (1984), Katz (1986) and Katz and Goux (1986) attempted to estimate the variability of the Gregorius or Nei distances; formal estimates were not obtained and only simulations could assess the extent of variability. Rogers (1991) also assessed the variability of several genetic distances by means of computer simulations. Resampling methods (jack-knife and bootstrap) should be applied in the future if genetic distances are to be used further. Genetic distances should also be related to the F-statistics of Wright, which are widely used by geneticists.
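As an illustration of the resampling idea mentioned above, here is a minimal bootstrap sketch. It assumes that loci are the resampling unit (so it captures variation between loci, not the sampling of individuals), uses the Rogers distance averaged over loci, and runs on made-up allele frequencies; the names and the percentile interval are illustrative choices, not a method taken from the text.

```python
import math
import random


def rogers_distance(pop_x, pop_y):
    """Mean over loci of sqrt(0.5 * sum of squared allele-frequency differences)."""
    per_locus = [
        math.sqrt(0.5 * sum((x - y) ** 2 for x, y in zip(locus_x, locus_y)))
        for locus_x, locus_y in zip(pop_x, pop_y)
    ]
    return sum(per_locus) / len(per_locus)


def bootstrap_distance(pop_x, pop_y, n_replicates=1000, seed=0):
    """Resample loci with replacement; return mean, standard error and a 95% percentile interval."""
    rng = random.Random(seed)
    n_loci = len(pop_x)
    replicates = []
    for _ in range(n_replicates):
        idx = [rng.randrange(n_loci) for _ in range(n_loci)]
        replicates.append(
            rogers_distance([pop_x[i] for i in idx], [pop_y[i] for i in idx])
        )
    replicates.sort()
    mean = sum(replicates) / n_replicates
    variance = sum((r - mean) ** 2 for r in replicates) / (n_replicates - 1)
    lower = replicates[int(0.025 * n_replicates)]
    upper = replicates[int(0.975 * n_replicates) - 1]
    return mean, math.sqrt(variance), (lower, upper)


# Hypothetical allele frequencies at 5 diallelic loci for two populations.
POP_A = [(0.9, 0.1), (0.5, 0.5), (0.2, 0.8), (0.7, 0.3), (0.6, 0.4)]
POP_B = [(0.6, 0.4), (0.5, 0.5), (0.3, 0.7), (0.2, 0.8), (0.6, 0.4)]

point_estimate = rogers_distance(POP_A, POP_B)
boot_mean, boot_se, (ci_low, ci_high) = bootstrap_distance(POP_A, POP_B)
print(f"distance = {point_estimate:.3f}, bootstrap SE = {boot_se:.3f}, "
      f"95% interval = ({ci_low:.3f}, {ci_high:.3f})")
```

A jack-knife version would follow the same pattern, leaving out one locus at a time instead of resampling with replacement.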

Distances are always estimated for a particular purpose: identification or phylogenetic construction. In the latter case, patristic distance (the number of changes of state during evolution) is most often used. Genetic distances such as the Rogers, Manhattan and Edwards distances underestimate patristic distances. Conversely, genetic distances overestimate differences between populations, mostly when small samples are studied (Rogers, 1991).
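The small-sample effect can be illustrated with a simple simulation. This is a sketch under stated assumptions: two samples are drawn from the same hypothetical population (so the true distance is zero), the single-locus Rogers distance is used, and the sample sizes and allele frequencies are arbitrary.

```python
import math
import random


def sample_frequencies(true_freqs, n_genes, rng):
    """Draw n_genes alleles from the true frequencies and return the observed frequencies."""
    draws = rng.choices(range(len(true_freqs)), weights=true_freqs, k=n_genes)
    return [draws.count(allele) / n_genes for allele in range(len(true_freqs))]


def mean_distance_same_population(true_freqs, n_genes, n_reps=2000, seed=1):
    """Average Rogers distance between two independent samples of one population."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_reps):
        x = sample_frequencies(true_freqs, n_genes, rng)
        y = sample_frequencies(true_freqs, n_genes, rng)
        total += math.sqrt(0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))
    return total / n_reps


# Hypothetical allele frequencies at one locus; the true distance between samples is 0.
TRUE_FREQS = (0.5, 0.3, 0.2)

for n_genes in (10, 50, 200):
    mean_d = mean_distance_same_population(TRUE_FREQS, n_genes)
    print(f"sample size {n_genes:>3}: mean estimated distance = {mean_d:.3f}")
```

The estimated distance is positive even though both samples come from one population, and it shrinks as the sample size grows, which is the overestimation the paragraph above refers to.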

Resemblance-divergence patterns between populations or taxa are established by different approaches: phenetic (general resemblance between taxa) or phylogenetic (evolution of descriptive characters). The grouping of taxa relies on hypotheses described in Darlu and Tassy (1993). The best match between genetic (patristic) distance and classification algorithm is still to be found.

Most of the algorithms used in phenetic classification are described in Roux (1985). Cluster analyses are either ascending or descending. They minimise the within-group distances or variance. Cladists refer to parsimony models (minimisation of changes of state), among which the most common are Wagner (convergence and reversion are accepted), Camin-Sokal (reversion towards the ancestral state is not accepted), and Dollo (convergence is not accepted).
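An ascending cluster analysis of the kind mentioned above can be sketched as follows. This is an illustrative average-linkage (UPGMA-style) procedure, chosen here as one common example rather than as the algorithm the author had in mind; the taxa and distances are made up.

```python
def average_linkage_clustering(labels, dist):
    """Ascending clustering: repeatedly merge the two clusters with the smallest
    mean pairwise distance. `dist` maps frozenset({a, b}) to the distance between
    labels a and b. Returns the list of merges as (cluster_1, cluster_2, distance)."""
    clusters = [(label,) for label in labels]
    merges = []

    def cluster_distance(c1, c2):
        pairs = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        d, i, j = min(
            (cluster_distance(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters)
            if i < j
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges


# Hypothetical pairwise distances between four taxa.
LABELS = ["A", "B", "C", "D"]
DIST = {
    frozenset(("A", "B")): 0.10,
    frozenset(("C", "D")): 0.20,
    frozenset(("A", "C")): 0.60,
    frozenset(("A", "D")): 0.70,
    frozenset(("B", "C")): 0.65,
    frozenset(("B", "D")): 0.75,
}

for left, right, d in average_linkage_clustering(LABELS, DIST):
    print(f"merge {left} + {right} at distance {d:.3f}")
```

Here A and B are joined first, then C and D, and the two pairs are joined last at the mean of the four between-group distances.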

The choice of distances and the estimation of their variability have evolved little over the last few years. The same observation can be made for algorithms. Multifactorial analyses, although more easily available on ordinary computers, have not developed as much as one could have expected. This could be related to the apparent difficulty of assessing statistical inference in such analyses. Discriminant analysis should easily overcome this drawback: Mahalanobis distances have a well-known distribution and should satisfy the needs of geneticists. Multivariate analyses could provide (directly or indirectly) an efficient array of distances between data of different status (morphological, ecological and genetic); these metrics could provide good-quality estimates of mean and variability when associated with resampling procedures.
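For reference, the squared Mahalanobis distance between two group means is the quadratic form of their difference with the inverse of the pooled covariance matrix. The sketch below assumes only two variables so that the 2 x 2 inverse can be written out by hand; the function name and the numbers are illustrative.

```python
def mahalanobis_squared_2d(mean_1, mean_2, cov):
    """Squared Mahalanobis distance between two bivariate means,
    given a 2 x 2 pooled covariance matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx = mean_1[0] - mean_2[0]
    dy = mean_1[1] - mean_2[1]
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


# Hypothetical group means and pooled covariance for two measured characters.
print(mahalanobis_squared_2d((2.0, 1.0), (0.0, 0.0), [[2.0, 0.0], [0.0, 0.5]]))
print(mahalanobis_squared_2d((1.0, 1.0), (0.0, 0.0), [[1.0, 0.5], [0.5, 1.0]]))
```

Unlike a plain Euclidean distance, the result accounts for the variances of the characters and for the correlation between them.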

References

Darlu P, Tassy P (1993) Reconstruction phylogénétique. Concepts et méthodes. Masson, collection Biologie théorique 7, Paris, 245 p

De Vienne D, Damerval C (1985) Mesures de la divergence génétique. 3. Distances calculées à partir de marqueurs moléculaires. In: Les distances génétiques. Estimations et applications (Lefort-Busson M, de Vienne D, eds), INRA, Paris, 39-57

Gasnier N, Cabaret J, Moulia C (1992) Allozyme variation between laboratory reared and wild populations of Teladorsagia circumcincta. Int J Parasitol 22, 581-587

Gregorius HR (1984) A unique genetic distance. Biom J 26, 13-18

Hoste H, Cabaret J (1992) Intergeneric relations between nematodes of the digestive tract in lambs: a multivariate approach. Int J Parasitol 22, 173-179

Katz M (1986) Étude des propriétés de certains indices de distance génétique et de leurs estimateurs. Thèse de sciences, Paris VII

Katz M, Goux JM (1986) The statistical properties of genetic absolute distance. Biom J 28, 729-739

Legendre L, Legendre P (1979) Écologie numérique. 2. La structure des données écologiques. Masson, collection d'Écologie 13, Paris, 254 p

Rogers JS (1991) A comparison of the suitability of the Rogers, modified Rogers, Manhattan and Cavalli-Sforza and Edwards distances for inferring phylogenetic trees from allele frequencies. Syst Zool 40, 63-73

Roux M (1985) Algorithmes de classification. Masson, collection Méthodes + Programmes, Paris, 151 p

Taxonomic sampling, sequence length and the robustness of molecular phylogenies. G Lecointre (Laboratoire d'ichtyologie générale et appliquée, et service commun de systématique moléculaire du muséum (GDR 1005), muséum national d'Histoire naturelle, 43, rue Cuvier, 75231 Paris cedex 05, France)

Before presenting original results on the robustness of molecular phylogenies, I will make a few remarks about the way trees are used in applied biology, which lead to 2 paradoxes.

First, in both applied and fundamental sciences, the 'operational taxonomic units' compared (strains, populations, species, genera, families, etc) are biological entities, ie entities with historical links between them. The comparison of the characters studied (molecular, morphological, etc) makes sense only in the light of the concept of descent (of organisms) with modification (of characters). This concept makes a large difference between classifying living organisms (which means producing a phylogeny) and classifying, for instance, toys or cartoon characters (which means producing groupings based on non-historical criteria).

This concept implies that every tree built from biological entities concerns phylogeny*. The first paradox is that sometimes, in applied sciences symposia, people present dendrograms, claiming that they are not phylogenies but just 'classifications', while their conclusions about these trees are explicitly historical. Even in agronomy, the classification of biological entities is about phylogenies. Thus everybody who produces trees from molecular data sets necessarily practises molecular phylogeny. As stressed by Ernst Mayr, nothing makes sense in biology, except in the light of evolution.

The second paradox is that the trees that should represent phylogenies are not built with optimal tools, but sometimes with obsolete tools and, in the worst cases, with ill-employed tools (eg, UPGMA (unweighted pair group method using arithmetic averaging) applied without knowing anything about the constancy of the rate of change of the characters used). To avoid this, some basic books are available, for instance those of Hillis and Moritz (1990), Li and Graur (1991), or Darlu and Tassy (1993). Moreover, in the great majority of studies in applied sciences, no information is given on the reliability of the trees produced. A very widely used tool in fundamental molecular phylogeny studies, the bootstrap (Felsenstein, 1985, 1988; Hillis and Bull, 1993), should be used. In this I follow the opinion of Penny and Hendy

* Some authors, like P Tassy (personal communication) and G Nelson, offer a more restrictive definition of the phylogenetic tree: phylogenetic trees are those which reveal homoplastic characters (character convergences, reversions, etc) and 'true' homologous characters (synapomorphies). This is only possible with parsimony methods and, to a lesser extent, with maximum likelihood methods. According to this definition, distance matrix methods cannot claim to be phylogenetic.
