

4.4.3 Statistics

To compare the taxonomic richness and structure of the dataset, non-metric multidimensional scaling (NMDS) was applied to the dataset at the different taxonomic resolutions, as implemented in PAST v 2.17 (Hammer et al., 2001). We used the Dice distance for the presence/absence data and the Bray-Curtis distance for the absolute values and the relative proportions of the single units in the samples (Fig. 4.3).
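As an illustration of the two measures named above, the following is a minimal Python sketch, not the PAST implementation. It assumes the standard textbook definitions of the Dice and Bray-Curtis dissimilarities; the function names and the toy read counts are hypothetical.

```python
def dice_dissimilarity(a, b):
    """Dice dissimilarity on presence/absence: 1 - 2*shared / (n_a + n_b)."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    total = len(present_a) + len(present_b)
    if total == 0:
        return 0.0  # two empty samples are treated as identical (assumption)
    return 1.0 - 2.0 * len(present_a & present_b) / total


def bray_curtis_dissimilarity(a, b):
    """Bray-Curtis dissimilarity: sum(|a_i - b_i|) / sum(a_i + b_i)."""
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0  # two empty samples are treated as identical (assumption)
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


def to_proportions(counts):
    """Convert absolute read counts to relative proportions."""
    total = sum(counts)
    return [c / total for c in counts] if total else list(counts)


# Hypothetical read counts of four taxonomic units in two samples.
sample_1 = [120, 30, 0, 5]
sample_2 = [10, 0, 40, 5]

d_dice = dice_dissimilarity(sample_1, sample_2)
d_bc_absolute = bray_curtis_dissimilarity(sample_1, sample_2)
d_bc_relative = bray_curtis_dissimilarity(
    to_proportions(sample_1), to_proportions(sample_2)
)
```

The Dice measure ignores abundances entirely, whereas Bray-Curtis on absolute counts is sensitive to differences in total read number between samples; converting to proportions first removes that sensitivity.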

To compare the taxonomic richness of the samples with census counts, we used the MARGO database (Kučera et al., 2005) and calculated the Shannon index using PAST v 2.17. The Shannon index of each sample was also calculated for each taxonomic resolution (Fig. 4.4).
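For reference, the Shannon index of a sample can be sketched as follows. This is a minimal illustration that assumes the usual definition H = -Σ p_i ln(p_i) with the natural logarithm; the function name and the example counts are hypothetical and are not taken from the dataset.

```python
import math


def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0  # an empty sample is given H = 0 (assumption)
    h = 0.0
    for count in counts:
        if count > 0:
            p = count / total
            h -= p * math.log(p)
    return h


# A monospecific assemblage has H = 0; evenly shared taxa maximise H.
h_monospecific = shannon_index([250])
h_even = shannon_index([50, 50, 50, 50])
h_uneven = shannon_index([185, 5, 5, 5])
```

The index rises with both the number of taxa and the evenness of their proportions, which is why a sample dominated by a single taxon scores low even when several taxa are present.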

4.5 Results

To assess the preservation potential of planktonic DNA in sediments, we used the dataset of Lecroq et al. (2011), consisting of 78,613,888 raw rDNA sequence reads of 36 nt of the foraminiferal-specific 37f hypervariable region generated from 31 surface samples collected in 5 oceanic regions (Fig. 4.1). After quality filtering, the reads were dereplicated and compared with 461 unique reference sequences of planktonic foraminifera with curated taxonomy, covering the same genomic region, from the Planktonic Foraminifera Ribosomal Reference database PFR2 (Morard et al., 2015). Reads with between 0 and 10 differences from the reference sequences were retained and subsequently compared to an extensive database of benthic foraminifera used in Pawlowski et al. (2014b) to ensure that none of the reads belonged to benthic foraminifera. Lastly, we retained the unique reads, herein named ribotypes, that occurred in at least 2 samples or had an abundance of at least 10 reads in the dataset.
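The final retention rule can be sketched in Python as below. This is only an illustration of the criterion as described, not the pipeline actually used: the table layout, the function name and the toy counts are hypothetical, and the abundance criterion is assumed to mean a total of at least 10 reads across the dataset.

```python
def retain_ribotypes(table, min_samples=2, min_abundance=10):
    """Keep ribotypes found in >= min_samples samples OR with >= min_abundance reads.

    `table` maps each ribotype to a dict of {sample_name: read_count}.
    """
    kept = {}
    for ribotype, per_sample in table.items():
        n_samples = sum(1 for count in per_sample.values() if count > 0)
        abundance = sum(per_sample.values())
        if n_samples >= min_samples or abundance >= min_abundance:
            kept[ribotype] = per_sample
    return kept


# Hypothetical toy table of dereplicated reads.
toy_table = {
    "ribo_a": {"sample_1": 3, "sample_2": 1},  # 2 samples: kept
    "ribo_b": {"sample_1": 25},                # 1 sample, 25 reads: kept
    "ribo_c": {"sample_2": 4},                 # 1 sample, 4 reads: discarded
}

retained = retain_ribotypes(toy_table)
```

Combining the two criteria with a logical "or" keeps rare ribotypes that recur across samples as well as abundant ribotypes restricted to a single sample.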

As a result, we retained 697 unique ribotypes representing a total volume of 486,435 reads assigned to 16 morphological species of planktonic foraminifera. Forty of the unique ribotypes are identical to planktonic reference sequences and represent 387,083 reads (79.57% of the volume of the dataset). To infer the structure of the cryptic diversity, the 697 unique ribotypes were grouped according to their morphospecies-level attribution based on the reference sequences and clustered into genotypes using a phylogenetic approach. We used a phylogenetic approach instead of direct assignment to the references because direct assignment cannot account for potentially genuine genotypes absent from the reference database. The reads received the assignment of the reference sequences with which they clustered. Ribotypes that grouped without a reference sequence received an artificial attribution. As a result, the 697 unique ribotypes were grouped into 42 genotypes. Of these ribotypes, 601 were attributed to 26 genotypes observed in the plankton and represent 451,797 reads (80% of the ribotypic diversity and 92% of the volume of the dataset, Fig. 4.2).

Figure 4.2: Heat map of the relative proportions of the genotypes detected in the 28 samples containing planktonic foraminifera reads. The histogram on the left side of the heat map indicates the total abundance of reads belonging to planktonic foraminifera on a logarithmic scale.

Between 48 (SFA-17) and 124,355 (SFA-15) reads were retained in 28 samples (Fig. 4.1). Only 3 samples, located in the Arctic Ocean, did not contain reads belonging to planktonic foraminifera (Fig. 4.2). The ribotypes belonging to the microperforate species Globigerinita glutinata dominate the dataset (375,145 reads) and are mainly abundant in the tropical assemblages. The common subtropical species Orbulina universa, Globorotalia menardii, Globorotalia hirsuta, Hastigerina pelagica, Neogloboquadrina dutertrei, Globigerina falconensis, Globigerinella siphonifera, Pulleniatina obliquiloculata, Gallitellia vivans and Candeina nitida are also present in the subtropical samples, whereas the species Neogloboquadrina pachyderma was detected in all but 2 subpolar or polar samples. In addition, ribotypes belonging to genotype IV of N. pachyderma are mostly found in the Southern Ocean, whereas the ribotypes of genotype I are only observed in the subpolar samples of the Northern Hemisphere.

The biogeography of the cryptic species of Globigerina bulloides is also recovered: Type I is only detected in subtropical samples, whereas the genotypes of Type II are found in subpolar assemblages. Similarly, Type II of Globigerinita uvula is mostly detected in samples from subpolar environments of both hemispheres (Fig. 4.2).

We next compiled the data into three datasets with three levels of taxonomic resolution: ribotype, genotype and morphospecies. We inferred the patterns of diversity structure within the three datasets according to compositional similarity using non-metric multidimensional scaling (NMDS, Fig. 4.3). The Dice and Bray-Curtis similarity measures computed on absolute read abundances (Boxes I and VII) highlight the high reproducibility of community composition, as samples from the same locality are more similar to each other than to those from other localities. This shows that the same ribotypes were sequenced in similar quantities and proportions in samples originating from the same area. The Bray-Curtis measure computed on the relative proportions of the ribotype, genotype and morphospecies datasets (Boxes IV, V and VI) showed a gradual bipolarization of the assemblages with an increasing gap. This reflects the dominance and almost mutual exclusion of ribotypes belonging to the morphospecies G. glutinata and N. pachyderma in subtropical and polar assemblages, respectively. Finally, applying the Dice distance measure to the datasets of different taxonomic resolution tends to blur the gap between the different assemblages (Boxes VIII and IX), showing that genotypes and morphospecies are detected in all environments, but in low abundances.
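The three levels of taxonomic resolution can be derived from a single ribotype table by summing read counts over the ribotypes assigned to the same genotype or morphospecies. The sketch below illustrates this under stated assumptions: the annotation table, the labels and the counts are hypothetical placeholders, not values from the dataset.

```python
from collections import defaultdict

# Hypothetical annotation: ribotype -> (genotype, morphospecies).
ANNOTATION = {
    "ribo_1": ("genotype_A", "species_X"),
    "ribo_2": ("genotype_A", "species_X"),
    "ribo_3": ("genotype_B", "species_X"),
    "ribo_4": ("genotype_C", "species_Y"),
}

GENOTYPE, MORPHOSPECIES = 0, 1


def aggregate(ribotype_counts, level):
    """Sum ribotype read counts at the requested taxonomic level."""
    totals = defaultdict(int)
    for ribotype, count in ribotype_counts.items():
        totals[ANNOTATION[ribotype][level]] += count
    return dict(totals)


def relative_proportions(counts):
    """Convert read counts per unit to proportions of the sample total."""
    total = sum(counts.values())
    return {unit: n / total for unit, n in counts.items()} if total else {}


# Hypothetical read counts for one sample at ribotype resolution.
sample = {"ribo_1": 60, "ribo_2": 20, "ribo_3": 10, "ribo_4": 10}

by_genotype = aggregate(sample, GENOTYPE)
by_morphospecies = aggregate(sample, MORPHOSPECIES)
morphospecies_share = relative_proportions(by_morphospecies)
```

Aggregation preserves the total read count of a sample but reduces the number of units, so presence/absence measures such as Dice become less discriminating at coarser resolution.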

Finally, we calculated the Shannon index of each sample at the three degrees of taxonomic resolution and compared it with the same index calculated from fossil assemblages (Fig. 4.4). The latitudinal gradient in the fossil assemblages revealed the well-known mid-domain effect (Letten et al., 2013), characterized by low values at high latitudes, a maximum of taxonomic richness in subtropical areas and a decrease at low latitudes. The molecular assemblages displayed the same characteristics at all degrees of taxonomic resolution: the high-latitude assemblages display low to null values (monospecific assemblages), and the index reaches its maximum at mid-latitudes, reflecting assemblages with higher taxonomic richness and evenness.

4.6 Discussion

The factors controlling the release of extracellular DNA in the marine environment, its survival in the water column, its transport to the deep sea and its burial are not yet fully understood (Corinaldesi et al., 2007, 2011, 2014), but we provide evidence that the imprint of extracellular DNA in sediments conserves the major features of the planktonic community. The high reproducibility of the taxonomic composition and read numbers of samples located in the same area (Fig. 4.3) suggests that these factors are constant at the regional scale and produce similar preservation conditions. Although the number of reads recovered differs by several orders of magnitude between the Caribbean (62 to 212 reads per sample) and Japan (3,620 to 124,355 reads per sample) regions, the structure of the assemblages is almost identical (Fig. 4.3), showing that the basic taxonomic composition of a community can be recovered regardless of the preservation conditions. However, taxonomic richness is higher where the number of reads is higher (Fig. 4.2), showing that the potential diversity spectrum of a given locality cannot be uncovered unless a sufficient sequencing effort is made. This effort would have to be scaled to the preservation potential of the sediment. In the case of planktonic foraminifera, the ancient DNA competes with recent, autochthonous and probably more abundant benthic foraminiferal DNA, which is more likely to be amplified.

Figure 4.3: Grouping of the samples according to their taxonomic composition using non-metric multidimensional scaling based on three different degrees of taxonomic resolution (ribotypes, genotypes and morphospecies) and two similarity measures (Bray-Curtis, on absolute numbers and relative abundances, and Dice). Each symbol represents one locality.

Figure 4.4: Macroecological pattern of spatial diversity known as the mid-domain effect. The grey areas represent the distribution of the Shannon index calculated on the census counts of planktonic foraminifera in core-top samples from the MARGO database against latitude (Kučera et al., 2005). The dark grey area represents the 1st-3rd quartiles (50% confidence interval), the light grey area the 5th-95th percentiles (90% confidence interval) and the black line the median. The same index has been calculated at each location for the environmental data using the relative abundances at the 3 degrees of taxonomic resolution.

As reported in studies comparing environmental or ancient DNA content with micro- or macrofossils in sediments (Lejzerowicz et al., 2013a; Pedersen et al., 2013; Pawłowska et al., 2014), the diversity unveiled by the DNA content only partly overlaps with the morphological diversity of the fossil content. In the present dataset, this could result from (i) preferential amplification by the primers used, (ii) differences in gene copy number among foraminiferal species and possibly (iii) a differential representation of certain morphospecies in small size fractions compared with the large size fraction used in the census counts. In the present dataset, the microperforate species represent 55 to 99% of the molecular assemblages at low latitudes, whilst they represent only 0 to 30% of the morphological assemblages (Kučera et al., 2005). Therefore, developing specific primers might be needed to recover the full diversity spectrum, rather than increasing the sequencing depth. In this regard, a priori knowledge of the group of interest is essential to avoid overlooking potential groups not targeted by the primers used.

On the other hand, studying diversity from sediments has the advantage of not being affected by seasonality, reproductive cycle or habitat depth, and the sediment record could be regarded as a compressed picture of the water column. A priori knowledge of the zonation and trophic behavior of organisms is crucial to disentangle which taxa live in the water column from those that live in the benthos. Targeting chloroplast genes (Decelle et al., 2015) in deep-sea sediment would be an obvious way of tracking photosynthesizing organisms, which live only in the sunlit ocean. However, a recent metabarcoding approach at the global scale showed that chloroplast-bearing organisms represent only 20% of protist diversity (de Vargas et al., 2015). In this regard, the role of reference databases and barcoding is crucial for a meaningful interpretation of aDNA metabarcoding datasets (Pawlowski et al., 2012).