A molecular time-scale for eukaryote evolution: combining

A fundamental aspect that is inextricably related to the origin and history of eukaryotes, but that we have not discussed in this manuscript so far, is the timing of eukaryotic evolution. How ancient are eukaryotes, and when did the major groups diverge? These are the types of questions that will need to be accurately answered before one can propose a complete scenario for the tree of eukaryotes that also takes into account the essential temporal facet. Of course, numerous attempts have been made. On the one hand, paleontologists have for example suggested that eukaryotes originated about 1.8 billion years (Gyr) ago based on the fossil record [Zhang 1986], or even 2.7 Gyr ago based on traces of biomarkers (molecular fossils) indicative of eukaryotes [Brocks et al. 1999]. But these dates depend heavily on the interpretation of early eukaryotic fossils, which is hotly debated. On the other hand, molecular data have also been used to infer a time-scale of eukaryote evolution, but the various studies generally neither confirmed the paleontological data nor agreed with one another: the inferred origin of eukaryotes varies between more than 2 Gyr ago and about 950 Myr ago depending on the calibration points and molecular markers used [Douzery et al. 2004; Hedges et al. 2004; Berney and Pawlowski 2006].

In order to reduce the discrepancies between molecular-based time estimates, and to see how conflicting fossils can fit onto such a molecular time-scale, several steps can be improved: (1) use of a large number of genes and species to decrease phylogenetic errors associated with lack of signal and poor taxon-sampling; (2) use of advanced relaxed clock methods; and (3) most importantly, use of multiple accurate calibration intervals to account for paleontological uncertainties. Taking advantage of our phylogenomic dataset, we recently undertook a study that aims to infer a molecular time-scale for eukaryote evolution while addressing all of the above weaknesses of earlier attempts. This is an ongoing project, and only very preliminary (and thus incomplete) results are available at the moment.

Nevertheless, I decided to briefly describe our project here, as it is for us the logical follow-up of our work on phylogenomics, and more accurate results should be available shortly. It is a collaborative study that, in addition to Jan Pawlowski, also involves Hervé Philippe (leading the dating analyses), and Colomban de Vargas and Ian Probert (who provided the new coccolithophorid species).

A large number of genes and a broad taxon-sampling are important for reducing stochastic errors in phylogenetic reconstructions; they are also essential for obtaining the most accurate inferred dates. The dataset we used to investigate the phylogenetic position of telonemids and centrohelids already contained over 100 genes for 72 species belonging to all major groups of eukaryotes, so it was a good starting point for building a new alignment (see chapter 6). Another key aspect of molecular dating is the availability of several accurate calibrations [Hug and Roger 2007]. We added about 20 new species to our taxon sampling, in part selected for their calibration value (i.e., giving access to bifurcations that can be calibrated with fossils). These species typically belonged to plants or metazoans, because these groups represent the most abundant source of macrofossils that can be mapped on trees. However, macrofossils are usually discrete entities that cannot be dissociated from the imperfection of the fossil record, a consequence of the non-preservation of the earliest fossils of most lineages or of incorrect identification. Thus, more precise complementary calibration intervals need to be obtained to avoid conflicts associated with calibration errors.

One approach that has been proposed is to use the well-documented Phanerozoic (540 million years (Myr) ago to present time) microfossil record of protists as a source of calibrations [Berney and Pawlowski 2006]. These microfossils have a key advantage: they usually present a continuous record, so one has access to the detailed succession of forms in the different stratigraphic levels from their time of appearance. Thus it is in principle possible to reliably date the first radiation of certain groups, or the divergence of two lineages that do fossilize, without being affected by the uncertainties related to macrofossils. In this project we massively sequenced three protistan species, which allowed us to add two new calibration intervals based on the continuous microfossil record.

Specifically, we sequenced a cDNA library of an early diverging pennate diatom, Fragilaria pinnata, by 454 pyrosequencing, and two coccolithophorids (haptophytes), Calcidiscus leptoporus and Coccolithus braarudii, by 454 pyrosequencing and the classical Sanger method, respectively. The precise time of appearance of the pennate diatoms is unclear because of a period of poor silica deposition in the Upper Cretaceous [Kooistra and Medlin 1996]. However, they are abundant in the Tertiary and totally absent from the Mid-Cretaceous, so 110 Myr ago can be conservatively chosen as a reliable upper limit for the divergence of pennate diatoms from their centric ancestor (lower limit of 65 Myr, corresponding to the divergence of the raphid pennates). The divergence between C. leptoporus and C. braarudii is well known to have occurred in the Palaeocene [Bown et al. 2007], that is, between the first occurrence of the Coccolithaceae in the basal Danian, 64 Myr ago, and the first occurrence of the Calcidiscaceae, which are now known to range down to the Late Palaeocene, 58 Myr ago. These two mostly unambiguous calibrations were combined with 17 others in a set of 19 calibration intervals (Table 1-ch.8), and used together with 93 species and more than 100 genes to estimate the divergence times of the major eukaryotic clades under a Bayesian relaxed clock.

Table 1-ch.8. Nodes and corresponding calibrations.

| Node* | Calibration# (upper/lower, Myr ago) | Reference |
|---|---|---|
| Radiation of extant crown dinoflagellate lineages | 250/210 | [Fensome et al. 1996] |
| **Radiation of pennate diatoms** | **110/65** | [Kooistra and Medlin 1996] |
| Centric/pennate diatom split | -/185 | [Rothpletz 1896] |
| R. filosa/Quinqueloculina split | -/525 | [Culver 1991] |
| **Calcidiscus/Coccolithus split** | **64/58** | [Bown et al. 2007] |
| Radiation of coccolithophores | -/215 | [Bown 1998] |
| Pinus/Ginkgo split | -/307 | [Magallón and Sanderson 2005] |
| Radiation of Eudicots | 121/0 | [Magallón and Sanderson 2005] |
| Gymnosperms/Angiosperms split | 359/299 | [Magallón and Sanderson 2005] |
| Bryophytes/Angiosperms split | 488/443 | [Magallón and Sanderson 2005] |
| Neurospora/Schizosaccharomyces split | -/400 | [Padovan et al. 2005] |
| Mus/Rattus split | 16/12 | [Benton and Donoghue 2007] |
| Primates/Rodents split | 100/62 | [Benton and Donoghue 2007] |
| Monotremata/Theria split | 191/162 | [Benton and Donoghue 2007] |
| Bird/Mammal split | 330/312 | [Benton and Donoghue 2007] |
| Actinopterygii/Sarcopterygii split | 422/416 | [Benton and Donoghue 2007] |
| Origin of crown-group Eumetazoa | -/635 | [Peterson and Butterfield 2005] |
| Acanthamoeba/Hartmannella split | -/750 | [Porter et al. 2003] |
| Origin of Rhodophytes | -/570 | [Wellman et al. 2003] |

* Nodes to which the calibrations apply.

# Calibration intervals, as deduced from the corresponding references. The upper limit is on the left of the slash, the lower limit on the right. A dash means that no upper limit is known.

Bold text corresponds to the two new calibrated nodes added in this study (the radiation of pennate diatoms and the Calcidiscus/Coccolithus split).
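To make the reading of Table 1-ch.8 concrete, the following minimal Python sketch encodes a few of its rows and tests whether a candidate node age is compatible with a calibration interval. It assumes the intervals are treated as hard bounds, which the text does not state; the names `Calibration` and `satisfies` are hypothetical and are not part of any dating software.

```python
from typing import NamedTuple, Optional


class Calibration(NamedTuple):
    node: str
    upper: Optional[float]  # oldest allowed age (Myr ago); None = no known upper limit
    lower: float            # youngest allowed age (Myr ago)


# A few rows copied from Table 1-ch.8 (upper/lower limits in Myr ago).
CALIBRATIONS = [
    Calibration("Radiation of pennate diatoms", 110, 65),
    Calibration("Calcidiscus/Coccolithus split", 64, 58),
    Calibration("Centric/pennate diatom split", None, 185),
    Calibration("Mus/Rattus split", 16, 12),
]


def satisfies(cal: Calibration, age_myr: float) -> bool:
    """Return True if a node age lies inside the calibration interval.

    Assumption: the interval is a hard bound. A dash in the table
    (upper is None) leaves the node free to be arbitrarily old.
    """
    if age_myr < cal.lower:
        return False
    return cal.upper is None or age_myr <= cal.upper


if __name__ == "__main__":
    for cal in CALIBRATIONS:
        print(cal.node, satisfies(cal, 90.0))
```

With a candidate age of 90 Myr, only the pennate diatom radiation is satisfied among these four rows: the age is too old for the two bounded splits and too young for the centric/pennate split.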

According to our preliminary estimates, the basal radiation of extant eukaryotes took place 1126-1306 Myr ago (mean 1211), and the unikonts (1126-1309, mean 1210) and bikonts (1102-1281, mean 1188) originated at about the same period. The SAR and CCTH groups originated shortly after, 1044-1213 Myr ago (mean 1124) and 1047-1225 Myr ago (mean 1134), respectively. Interestingly, the present-day red algal groups included in our analyses started to diverge 702-924 Myr ago (mean 812), and their split from the green line occurred 1020-1191 Myr ago (mean 1101). We also observe a very ancient origin for Rhizaria, 975-1132 Myr ago (mean 1048). For the most basal eukaryotic splits, these results seem to give slightly older dates (but within the same range) than the two most recent reference studies, namely Berney and Pawlowski (2006), who used microfossil calibrations but only the SSU rDNA gene, and Douzery et al. (2004), who used 129 proteins but a limited taxon-sampling and only 6 macrofossil calibrations. On the other hand, they are in strong disagreement with the recent proposal of much younger basal radiations for the major groups, i.e. after the Snowball Earth thawed about 635 Myr ago [Cavalier-Smith and Chao 2006; Cavalier-Smith 2009].
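The disagreement with a post-Snowball Earth radiation can be checked with simple arithmetic on the intervals quoted above. The short Python sketch below is only an illustration: the figures are the preliminary estimates given in the text, while the function names `all_older_than` and `smallest_margin` are hypothetical.

```python
# Preliminary estimates quoted in the text, in Myr ago:
# (younger bound, older bound, mean).
ESTIMATES = {
    "basal radiation of extant eukaryotes": (1126, 1306, 1211),
    "unikonts": (1126, 1309, 1210),
    "bikonts": (1102, 1281, 1188),
    "SAR": (1044, 1213, 1124),
    "CCTH": (1047, 1225, 1134),
    "red algal crown group": (702, 924, 812),
    "red/green split": (1020, 1191, 1101),
    "Rhizaria": (975, 1132, 1048),
}

SNOWBALL_THAW_MYR = 635  # approximate date of the thaw cited in the text


def all_older_than(estimates, threshold):
    """True if every interval lies entirely before (older than) the threshold."""
    return all(younger > threshold for younger, _older, _mean in estimates.values())


def smallest_margin(estimates, threshold):
    """Smallest gap (Myr) between a younger bound and the threshold."""
    return min(younger - threshold for younger, _older, _mean in estimates.values())


if __name__ == "__main__":
    print(all_older_than(ESTIMATES, SNOWBALL_THAW_MYR))
    print(smallest_margin(ESTIMATES, SNOWBALL_THAW_MYR))
```

Every interval lies entirely before the thaw. The closest one is the red algal crown group, whose younger bound (702 Myr ago) still predates the thaw by 67 Myr.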

Once again, these example dates are only raw results and much caution is thus required. Detailed analyses will follow, in particular to assess the confidence of the dates (by comparing several chains and running the chains longer), and to test different sets of calibrations, different relaxed clock models (correlated vs uncorrelated), and different topologies or alternative rootings.
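One common way of comparing several chains is the Gelman-Rubin potential scale reduction factor, sketched below in plain Python. This is a generic diagnostic and is not necessarily the one used in this project. The function name `gelman_rubin` is hypothetical, and the sample values are invented toy numbers that serve only to exercise the code, not results.

```python
from math import sqrt
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor for one parameter (e.g. a node age).

    `chains` is a list of equal-length lists of posterior samples, one per
    independent MCMC run. Values close to 1 suggest the chains agree.
    """
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])  # W: mean within-chain variance
    between = n * variance(chain_means)           # B: between-chain variance
    pooled = (n - 1) / n * within + between / n   # pooled variance estimate
    return sqrt(pooled / within)


if __name__ == "__main__":
    # Toy samples (invented for illustration only).
    chain_a = [1200, 1210, 1220, 1205, 1215]
    chain_b = [1208, 1212, 1202, 1218, 1210]
    chain_c = [900, 910, 905, 915, 895]
    print(gelman_rubin([chain_a, chain_b]))  # chains agree: value near 1
    print(gelman_rubin([chain_a, chain_c]))  # chains disagree: value far above 1
```

A factor well above 1 for a node age would indicate that the chains have not converged on the same posterior and should be run longer before the dates are interpreted.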