• Aucun résultat trouvé

A constrained SSU-rRNA phylogeny reveals the unsequenced diversity of photosynthetic Cyanobacteria (Oxyphotobacteria).

N/A
N/A
Protected

Academic year: 2021

Partager "A constrained SSU-rRNA phylogeny reveals the unsequenced diversity of photosynthetic Cyanobacteria (Oxyphotobacteria)."

Copied!
6
0
0

Texte intégral

(1)

RESEARCH NOTE

A constrained SSU-rRNA phylogeny

reveals the unsequenced diversity

of photosynthetic Cyanobacteria

(Oxyphotobacteria)

Luc Cornet

1,2

, Annick Wilmotte

3,4

, Emmanuelle J. Javaux

2

and Denis Baurain

1*

Abstract

Objective: Cyanobacteria are an ancient phylum of prokaryotes that contain the class Oxyphotobacteria. This

group has been extensively studied by phylogenomics notably because it is widely accepted that Cyanobacteria were responsible for the spread of photosynthesis to the eukaryotic domain. The aim of this study was to evaluate the fraction of the oxyphotobacterial diversity for which sequenced genomes are available for genomic studies. For this, we built a phylogenomic-constrained SSU rRNA (16S) tree to pinpoint unexploited clusters of Oxyphotobacteria that should be targeted for future genome sequencing, so as to improve our understanding of Oxyphotobacteria evolution.

Results: We show that only a little fraction of the oxyphotobacterial diversity has been sequenced so far. Indeed 31

rRNA clusters of the 60 composing the photosynthetic Cyanobacteria have a fraction of sequenced genomes < 1%. This fraction remains low (min = 1%, median = 11.1%, IQR = 7.3%) within the remaining “sequenced” clusters that already contain some representative genomes. The “unsequenced” clusters are scattered across the whole Oxypho-tobacteria tree, at the exception of very basal clades. Yet, these clades still feature some (sub)clusters without any representative genome. This last result is especially important, as these basal clades are prime candidate for plastid emergence.

Keywords: Cyanobacteria, Phylogeny, Diversity, Genomics, SSU rRNA (16S), Sequenced genome fraction

© The Author(s) 2018. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creat iveco mmons .org/licen ses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creat iveco mmons .org/ publi cdoma in/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Introduction

Oxyphotobacteria are a class of the phylum

Cyanobacte-ria, one of the most important groups of prokaryotes [1].

This morphologically diversified class is the only group of Bacteria that can carry out oxygenic photosynthesis. It is commonly assumed that this bioenergetic process appeared in Oxyphotobacteria (or their ancestor) and caused the Great Oxidation Event (GOE) 2.4 billion years

ago [2–5]. The atmospheric increase in free oxygen had

a critical impact on early Earth evolution, including on

Oxyphotobacteria themselves, with the appearance of

new morphological features (e.g., akinetes) [6].

Oxypho-tobacteria were also responsible for the spread of photo-synthesis to eukaryotic lineages, through an initial event of primary endosymbiosis, followed by higher-order

endosymbiosis [7]. In consequence of their great

scien-tific interest, numerous phylogenies of

Oxyphotobacte-ria have been published during the last decade (e.g., [5,

8–19]), using ever more genomes as it appears in

pub-lic repositories. The diversity within Oxyphotobacteria is very high, both morphologically and genetically, and

their GC-content ranges from 35 to 71 [20]. Therefore,

it is important to assess how well the existing genomes represent the whole diversity of the class and detect the lineages for which representative genomes are not yet

Open Access

*Correspondence: denis.baurain@uliege.be

1 InBioS-PhytoSYSTEMS, Eukaryotic Phylogenomics, University of Liège,

4000 Liège, Belgium

(2)

available. Indeed, only 659 genomes of Oxyphotobacte-ria have been sequenced, whereas 6997 non-redundant rRNA sequences can be found in the SILVA database (as of October 2017). In this short communication, we built a SSU rRNA (16S) phylogeny to investigate the distribu-tion of representative genomes across oxyphotobacterial diversity, so as to guide the selection of taxa for future sequencing projects.

Main text

We took advantage of 7603 different SSU rRNA (16S)

sequences [6997 from SILVA 128 Ref NR [21], 466 from

genome assemblies publicly available on the NCBI

[22] and 140 from the BCCM/ULC public collection of

Cyanobacteria (http://bccm.belsp o.be/about

-us/bccm-ulc)], leading to a dataset of 2302 unique Operational

Taxonomic Units (OTUs) after dereplication at an iden-tity threshold of 97.5%. Because multiple sequence align-ment (MSA) quality is essential in single-gene analyses

[23–25], we used an in-house hand-curated MSA as a

starting point to carefully align all the OTUs. This MSA represented a high fraction of known Oxyphotobacteria diversity, as it completely included the broadly sampled

dataset of Schirrmeister et al. [26]. To alleviate the

sto-chastic error plaguing single-gene analyses [27], we

con-strained our SSU rRNA (16S) phylogeny using a more reliable multi-gene phylogeny computed on a subset of the OTUs at hand. To this end, we used the phylogeny

of Oxyphotobacteria developed in Cornet et al. [28], the

largest ever-built in terms of conserved sites (> 170,000 unambiguously aligned amino-acid positions from 675 genes). This strategy allowed us to easily compare the genome diversity available for phylogenomic stud-ies to the more completely sampled rRNA diversity of Oxyphotobacteria.

As shown in Fig. 1, 31 rRNA clusters have a

frac-tion of sequenced genomes < 1% and are thus nearly completely ignored in phylogenomic studies. These

0.2 U-2 (7-0C-0%) U-12 (12-0C-0%) C3-3 (198-10C-5.95%) A-5 (51-2C-8%) Gloeomargarita (6-0C-16.6%) HM241069.1.1309 U-19 (2-0C-0%) B2-9 (69-2C-7.2%) U-16 (7-0C-0%) U-18 (8-0C-0%) G-1 (4-2C-50%) B2-7 (9-0C-11.1%) U-7 (14-0C-0%) U-30 (6-0C-0%) U-10 (10-0C-0%) U-21 (30-0C-0%) F (55-7C-12.7%) Geitlerinema sp._PCC7407_GCF_000317045.1* B1-4 (112-4C-18.7%) C1 (231-4C-4.8%) U-22 (2-0C-0%) FJ790611.1.1311 AY151729.1.1399 B2-8 (11-0C-9%) U-24 (11-0C-0%) U-20 (5-0C-0%) U-13 (10-0C-0%) U-15 (6-0C-0%) U-1 (4-0C-0%) B1-5 (239-8C-14.2%) B2-3 (16-0C-6.3%) A-3 (6-0C-16.7%) B2-2 (11-1C-9.1%) U-25 (140-0C-0.7%) B1-3 (40-2C-5%) A-1 (16-1C-12.5%) U-6 (15-0C-0%) C2 (7-1C-14.3%) U-23 (50-0C-0%) G-2 (22O-3C-13.6%) U-11 (4-0C-0%) U-28 (2-0C-0%) U-9 (3-0C-0%) B1-1 (12-0C-8.3%) AB696385.1.1339 A-4 (39-3C-12.8%) U-27 (6-0C-0%) U-5 (4-0C-0%) Melainabacteria U-14 (212-0C-0.9%) B2-6 (61-2C-11.5%) U-4 (6-0C-0%) U-17 (44-0C-0%) U-29 (3-0C-0%) U-3 (11-0C-0%) KF037642.1.1459 B2-1 (29-1C-6.9%) C3-1 (23-1C-4.3%) Neosynechococcus sphagnicola_sy1_GCF_000775285.1* U-31 (33-0C-0%) B2-5 (92-6C-12%) C3-2 (68-4C-8.82%) B1-2 (104-0C-2.9%) B2-4 (50-1C-4%) E (28-4C-14.3%) A-2 (19-0C-21.1%) U-26 (3-0C-0%) U-8 (3-0C-0%) 7 2 10 0 10 0 8 2 10 0 10 0 9 9 7 8 9 0 7 5 9 8 10 0 G F E C2C1 C3 A B1 B2 1 0 0 7 5 9 6

Fig. 1 Collapsed constrained phylogeny of Oxyphotobacteria based on an alignment of 2302 SSU rRNA (16S) sequences (1233 unambiguously aligned nucleotide positions). The backbone of the tree shown in thick lines corresponds to the branches constrained by a phylogenomic tree of 71 OTUs × 170,983 positions [28]. The tree is rooted on Melainabacteria. Clusters on a red background have a fraction of sequenced genomes < 1% and can be considered as “unsequenced” due to the lack of representative genomes. Clusters on another coloured background have a higher fraction of sequenced genomes (“sequenced” clusters) and were annotated using the clade names of Shih et al. [12]. For each cluster (at the exception of Melainabacteria), we report in brackets the number of OTUs, the number of constrained genome sequences and the fraction of sequenced genomes

(3)

“unsequenced” clusters are scattered across the whole oxyphotobacterial tree, at the exception of very basal clades (G, F, E) and the Oscillatoriales clade (A), which have higher fractions of representative genomes. The group C (C1, C2, C3) appears paraphyletic in our tree, and is especially rich in “unsequenced” clusters (14 out of 31), including the largest one (212 OTUs). In all the remaining “‘sequenced” clusters that contain repre-sentative genomes, the fraction of sequenced genomes remains low (min = 1%, median = 11.1%, IQR = 7.3%). It is thus clear that genomes have not been evenly sampled across the tree. To help researchers to select organisms for future sequencing projects, our final dataset and tree

are made available (see below). In these files, SSU rRNA (16S) genes can be readily identified by their accession number, whether for SILVA/ULC strains or for rRNA genes predicted from genome assemblies.

The emergence of eukaryotic plastids from Oxypho-tobacteria has been the subject of different hypotheses during the last decade, the most frequent being an early origin around clade E, composed of Acaryochloris and

Thermosynechococcus species [17, 18]. By focusing on the

base of our oxyphotobacterial phylogeny, Fig. 2 shows

that in spite of the presence of a number of representa-tive genomes, very basal clades (G, F, E) still feature some clusters without any genome (highlighted in yellow in

0.2 AY862014.1.1296 FR667359.1.1336 JU016970.1.1457 KJ566932.1.1550 Synechococcus sp._PCC6312_GCF_000316685.1* KP676748.1.1334 JQ404420.1.1452 Pseudanabaena sp._PCC6802_GCF_000332175.1* JN438072.1.1385 AM710386.1.1460 EF205530.1.1452 FJ230802.1.1481 JU521378.1.1437 Melainabacteria Pseudanabaena frigida_O-155_945775* AB491712.1.1405 Synechococcus sp._JA23Ba213_GCF_000013225.1* LC085885.1.2032 Pseudanabaena biceps_PCC7429_GCF_000332215.1* AB862893.1.1444 FR667316.1.1341 FR798916.1.1462 AY151736.1.1397 FJ886676.1.1290 AB491710.1.1346 KF944970.1.1245 GU935353.1.1957 KM020012.1.1471 FM995185.1.1482 Thermosynechococcus sp._NK55a_GCF_000505665.1* AB757744.1.1451 EF205455.1.1408 AB039019.1.1436 JU088101.1.1400 FJ381988.1.1381 GU437456.1.1470 JQ062629.1.1394 JN825341.1.1275 GU935357.1.1962 KP676744.1.1323 FJ382000.1.1333 AF317627.1.1369 Cyanothece sp._PCC7425_GCA_000022045.1* KU382127.1.1405 EF205477.1.1409 Synechococcus sp._JA33Ab_GCF_000013205.1* KP676738.1.1359 AB630682.1.1386 AF445691.1.1446 EF205462.1.1417 FR667314.1.1309 Synechococcus sp._PCC7336_GCF_000332275.1* JN447700.1.1402 KF975568.1.1441 GU118090.1.1425 EF205526.1.1391 KU382126.1.1407 Acaryochloris marina_MBIC11017_GCF_000018105.1* AY151725.1.1397 KT952785.1.1466 HQ904122.1.1449 Gloeobacter kilaueensis_JS1_GCA_000484535.1* EF126240.1.1441 HM240889.1.1308 FR798944.1.1462 FR667331.1.1354 JU127188.1.1471 KF944967.1.1254 GQ370819.1.1463 GU437443.1.1466 EF205554.1.1444 AB486797.1.1338 JN437179.1.1347 KC831415.1.1441 HQ904134.1.1446 other Cyanobacteria Pseudanabaena sp._PCC7367_GCF_000317065.1* Pseudanabaena frigida_0-202_1153* DQ917811.1.1455 Cyanobium sp._FW064_1153* Gloeobacter violaceus_PCC7421_GCF_000011385.1* LC016778.1.2024 KF944974.1.1234 EF205458.1.1409 DQ181677.1.1436 DQ131175.1.1474 HM241072.1.1310 FR667294.1.1342 HF677155.1.1325 JQ404415.1.1455 KM386855.1.2051 Pseudanabaena sp._PCC6802_GCF_000332175.1* EF429508.1.1457 GQ324966.1.1425 FR667443.1.1363 JQ404422.1.1490 HM241066.1.1308 FJ891026.1.1346 HM585026.1.1263 AB757741.1.1454 FJ902633.1.1351 AY151732.1.1395 FN984873.1.1324 LC016776.1.1997 AB630391.1.1408 KF179651.1.1407 KC621875.1.1436 JU521389.1.1433 FN984806.1.1299 AB862910.1.1358 EF205448.1.1398 9 8 8 7 8 2 9 9 9 4 1 0 0 1 0 0 1 0 0 1 0 0 8 0 7 4 9 6 1 0 0 1 0 0 8 5 9 4 8 9 7 7 7 6 8 3 1 0 0 7 7 7 5 9 7 9 2 9 9 9 0 8 7 7 5 9 4 7 2 1 0 0 1 0 0 1 0 0 8 1 9 8 8 2 1 0 0 9 9 1 0 0 7 5 1 0 0 1 0 0 7 6 9 8 9 1 1 0 0 9 7 G-1 G-2 F E

Fig. 2 Expanded basal part of the constrained rRNA phylogeny of Oxyphotobacteria, with clusters E, F, G1 and G2. The tree is rooted on Melainabacteria. Asterisk indicate constrained genome sequences. The colour of the backbone corresponds to the groups of Shih et al. [12]. “Unsequenced” clusters (both monophyletic and paraphyletic) are shown on a yellow background

(4)

Fig. 2). Therefore, these clusters should be targeted in pri-ority for genome sequencing if one wants to resolve the early diversification of Oxyphotobacteria and the origin of plastids by assembling phylogenomic datasets more faithful to the true diversity.

Methods

Cyanobacterial (651 Oxyphotobacteria and 8 Melaina-bacteria) genome assemblies were downloaded from the NCBI FTP server on October 1st, 2017. SSU rRNA (16S) genes were predicted for all assemblies with RNAmmer

v1.2 [29]. Predicted rRNA genes were taxonomically

clas-sified with SINA v1.2.11 [30], using the release 128 of

SILVA database composed of 1,922,213 SSU rRNA (16S)

reference sequences [21]. Among the 651

Oxyphotobac-teria assemblies, 193 were devoid of OxyphotobacOxyphotobac-teria rRNA genes (176 without any prediction and 17 with only non-cyanobacterial predicted genes). All Oxypho-tobacteria rRNA genes (7001 sequences) of SILVA 128 Ref NR (non-redundant) were downloaded on Octo-ber 1st, 2017. Four SSU rRNA sequences with a SILVA-provided incongruent lineage (e.g., Proteobacteria for Prochlorothrix hollandica PCC 9006) were deleted from the dataset, whereas the 466 sequences predicted with RNAmmer from genome assemblies were added, along with 140 new sequences from strains of the BBMC/

ULC Cyanobacteria collection (http://bccm.belsp o.be/

about -us/bccm-ulc), hence leading to a total of 7603 SSU rRNA (16S) sequences (including 8 Melainabacteria).

The dataset was dereplicated with CD-HIT (cd-hit-est)

v4.6 [31], using an identity threshold of 97.5%, an

align-ment coverage control of 0.0 for the longest sequence and 0.9 for the shortest sequence. The value of 97.5% for the identity threshold was chosen by following the OTU

definition of Taton et al. [32]. It probably underestimates

the true number of different species, since Edgar [33]

has recently shown that the bacterial species delineation threshold is closer to 99% for full-length sequences. How-ever, for the purpose of this study aimed at pinpointing large patches of unsequenced diversity, a coarse-grained clustering was sufficient. The cluster file produced by CD-HIT was retrieved and 13 representative sequences initially selected by CD-HIT were replaced by genomic sequences from the same CD-HIT clusters, so as to be able to enforce the phylogenomic constraints from

Cor-net et al. [28] when inferring the rRNA tree. This process

led to a dataset of 2302 OTUs.

The 2302 sequences were aligned using the

BLASTN-based software “two-scalp” (D. Baurain; https ://metac

pan.org/relea se/Bio-MUST-Apps-TwoSc alp), start-ing from a hand-curated MSA (1252 sites post-BMGE; 1.52% missing character states) of 1128 SSU rRNA (16S) sequences (L. Cornet; unpublished). The 1128 sequences

had been selected in order to be as close as possible to

the dataset of Schirrmeister et al. [26], which represented

all the Oxyphotobacteria diversity known at that time. Unambiguously aligned sites were selected with BMGE

v1.12 [34], using moderately severe settings (entropy

cut-off 0.5, gap cut-cut-off 0.2) and yielding a high-quality MSA of 1233 conserved sites with only 1.83% missing charac-ter states. In comparison, building a MSA from scratch

on the same dataset with MAFFT v7.273 [35], using its

automatic mode, led to only 1013 conserved sites (2.52% missing character states) after applying the same protocol to select the unambiguously aligned sites.

The tree was inferred with RAxML v8.1.17 [36] under

a GTR + Γ4 model with a 100× rapid bootstrap

analy-sis, using a constraining tree of 75 different Oxyphoto-bacteria. The constraining tree was a phylogenomic tree based on a concatenation of 675 genes (> 170,000

unam-biguously aligned amino-acid positions [28]). However,

four organisms (Leptolyngbya sp. GCF_000733415.1, Oscillatoriales cyanobacterium GCF_000309945.1, Lep-tolyngbya sp. OTC1/1, Phormidium priestleyi ANT. PROGRESS2.5) had to be pruned before inference due to the lack of genuine rRNA genes in their assemblies. The rRNA tree was then rooted on the Melainabacteria clan (sister-group of Oxyphotobacteria, composed of eight sequences), manually collapsed using FigTree v1.4.3 (http://tree.bio.ed.ac.uk/softw are/figtr ee) and further

arranged in InkScape v0.92 [37]. For each collapsed

sub-tree, a “sequenced genome fraction” statistic was com-puted by taking the ratio between the number of OTUs having a publicly available genome in their CD-HIT clus-ter to the total number of OTUs in the subtree.

Limitations

Among the 659 available Oxyphotobacteria genomes, 193 genomes were either completely devoid of rRNA genes (or only featured taxonomically incongruent rRNA

genes) and were thus unusable for phylogeny (see “

Meth-ods” for details). Obviously, the results of this study

might have been different if we had used these genomes to evaluate the sequenced genome fraction. The absence of rRNA genes is explained by the fact that the majority of these genomes (143 out of 193) were assembled with a metagenomic pipeline, according to NCBI metadata. This

phenomenon was noticed in Cornet et al. [38] and is due

to the frequent loss of rRNA sequences during

metagen-omic assembly [28]. Given that neither the isolation

source nor the assembly pipeline was easy to determine, we decided not to consider these 193 genomes in our analysis by precautionary principle. Hence, a significant number of these metagenomes were found in the genome section of RefSeq, even though it is not allowed. This decision is moreover consolidated by the observation

(5)

that only a small number (about 20) of SSU rRNA (16S) sequences could be recovered after extensive searches on the NCBI using the “strain” names of these 193 genomes. Abbreviations

OTUs: Operational Taxonomic Units; MSA: multiple sequence alignment. Authors’ contributions

LC and DB designed the experiments, AW generated new SSU rRNA (16S) sequences for 140 strains from the BCCM/ULC public collection of Cyanobac-teria, LC performed the computational analyses and drew the figures, EJJ and AW helped with the biogeological interpretations of the results, LC and DB wrote the manuscript, with the assistance of AW and EJJ. All authors read and approved the final manuscript.

Author details

1 InBioS-PhytoSYSTEMS, Eukaryotic Phylogenomics, University of Liège,

4000 Liège, Belgium. 2 UR

Geology-Palaeobiogeology-Palaeobotany-Palae-opalynology, University of Liège, 4000 Liège, Belgium. 3 InBioS-CIP, Centre

for Protein Engineering, University of Liège, 4000 Liège, Belgium. 4 BCCM/ULC

Collection of Cyanobacteria, University of Liège, 4000 Liège, Belgium. Acknowledgements

We thank Yannick Lara (ULiège) for the cultivation and sequencing of some the BCCM/ULC strains. Rosa Gago is acknowledged for her help when drawing the figures. Finally, we are grateful to the editor and the anonymous reviewer for their insightful comments.

Competing interests

The authors declare that they have no competing interests. Availability of data and materials

The dataset (hand-curated MSA, final raw MSA, final MSA post-processed by BMGE, phylogenomic constraining tree, and final SSU rRNA ML tree) sup-porting the conclusions of this article is available as a compressed archive in the figshare repository through the following link (https ://doi.org/10.6084/ m9.figsh are.62256 41).

Consent for publication Not applicable.

Ethics approval and consent to participate Not applicable.

Funding

This work was supported by operating funds from FRS-FNRS (National Fund for Scientific Research of Belgium) (AW), the European Research Council Stg ELITE FP7/308074 (EJJ), the BELSPO project CCAMBIO (SD/BA/03A) (AW), the BELSPO Interuniversity Attraction Pole Planet TOPERS (EJJ and LC). LC was a FRIA fellow of the FRS-FNRS. AW is a Research Associate of the FRS-FNRS. Com-putational resources were provided through two grants to DB (University of Liège “Crédit de démarrage 2012” SFRD-12/04; FRS-FNRS “Crédit de recherche 2014” CDR J.0080.15).

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in pub-lished maps and institutional affiliations.

Received: 10 May 2018 Accepted: 26 June 2018

References

1. Soo RM, Hemp J, Parks DH, Fischer WW, Hugenholtz P. On the origins of oxygenic photosynthesis and aerobic respiration in Cyanobacteria. Sci-ence. 2017;355:1436–40.

2. Bekker A, Holland HD, Wang P-L, Rumble D, Stein HJ, Hannah JL, et al. Dating the rise of atmospheric oxygen. Nature. 2004;427:117–20. 3. Knoll AH. The geological consequences of evolution. Geobiology.

2003;1:3–14.

4. Kopp RE, Kirschvink JL, Hilburn IA, Nash CZ. The Paleoproterozoic snowball Earth: a climate disaster triggered by the evolution of oxygenic photosynthesis. Proc Natl Acad Sci USA. 2005;102:11131–6.

5. de Alda JAGO, Esteban R, Diago ML, Houmard J. The plastid ancestor originated among one of the major cyanobacterial lineages. Nat Com-mun. 2014;5:4937.

6. Gonzalez-Esquer CR, Smarda J, Rippka R, Axen SD, Guglielmi G, Gugger M, et al. Cyanobacterial ultrastructure in light of genomic sequence data. Photosynth Res. 2016;129:147–57.

7. Archibald JM. The puzzle of plastid evolution. Curr Biol. 2009;19:R81–8. 8. Rodríguez-Ezpeleta N, Brinkmann H, Burey SC, Roure B, Burger G,

Löf-felhardt W, et al. Monophyly of primary photosynthetic eukaryotes: green plants, red algae, and glaucophytes. Curr Biol. 2005;15:1325–30. 9. Blank CE, Sánchez-Baracaldo P. Timing of morphological and ecological

innovations in the cyanobacteria—a key to understanding the rise in atmospheric oxygen. Geobiology. 2010;8:1–23.

10. Criscuolo A, Gribaldo S. Large-scale phylogenomic analyses indicate a deep origin of primary plastids within Cyanobacteria. Mol Biol Evol. 2011;28:3019–32.

11. Dagan T, Roettger M, Stucken K, Landan G, Koch R, Major P, et al. Genomes of Stigonematalean cyanobacteria (subsection V) and the evo-lution of oxygenic photosynthesis from prokaryotes to plastids. Genome Biol Evol. 2013;5:31–44.

12. Shih PM, Wu D, Latifi A, Axen SD, Fewer DP, Talla E, et al. Improving the coverage of the cyanobacterial phylum using diversity-driven genome sequencing. Proc Natl Acad Sci. 2013;110:1053–8.

13. Blank CE. Origin and early evolution of photosynthetic eukaryotes in freshwater environments: reinterpreting proterozoic paleobiology and biogeochemical processes in light of trait evolution. J Phycol. 2013;49:1040–55.

14. Li B, Lopes JS, Foster PG, Embley TM, Cox CJ. Compositional biases among synonymous substitutions cause conflict between gene and protein trees for plastid origins. Mol Biol Evol. 2014;31:1697–709.

15. Sánchez-Baracaldo P. Origin of marine planktonic cyanobacteria. Sci Rep. 2015;5. https ://www.ncbi.nlm.nih.gov/pmc/artic les/PMC46 65016 /. Accessed 7 Nov 2017.

16. Uyeda JC, Harmon LJ, Blank CE. A comprehensive study of cyanobacte-rial morphological and ecological evolutionary dynamics through deep geologic time. PLoS ONE. 2016;11:e0162539.

17. Ponce-Toledo RI, Deschamps P, López-García P, Zivanovic Y, Benzerara K, Moreira D. An early-branching freshwater cyanobacterium at the origin of plastids. Curr Biol. 2017;27:386–91.

18. Sánchez-Baracaldo P, Raven JA, Pisani D, Knoll AH. Early photosyn-thetic eukaryotes inhabited low-salinity habitats. Proc Natl Acad Sci. 2017;114:E7737–45.

19. Walter JM, Coutinho FH, Dutilh BE, Swings J, Thompson FL, Thompson CC. Ecogenomics and taxonomy of Cyanobacteria phylum. Front Microbiol. 2017;8:2132. https ://doi.org/10.3389/fmicb .2017.02132 .

20. Herdman M, Janvier M, Waterbury JB, Rippka R, Stanier RY, Mandel M. Deoxyribonucleic acid base composition of cyanobacteria. Microbiology. 1979;111:63–71.

21. Quast C, Pruesse E, Yilmaz P, Gerken J, Schweer T, Yarza P, et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 2013;41:D590–6.

22. Geer LY, Marchler-Bauer A, Geer RC, Han L, He J, He S, et al. The NCBI biosystems database. Nucleic Acids Res. 2010;38:D492–6.

23. Lunter G, Rocco A, Mimouni N, Heger A, Caldeira A, Hein J. Uncertainty in homology inferences: assessing and improving genomic sequence align-ment. Genome Res. 2008;18:298–309.

24. Wong KM, Suchard MA, Huelsenbeck JP. Alignment uncertainty and genomic analysis. Science. 2008;319:473–6.

25. Dessimoz C, Gil M. Phylogenetic assessment of alignments reveals neglected tree signal in gaps. Genome Biol. 2010;11:R37.

26. Schirrmeister BE, Anisimova M, Antonelli A, Bagheri HC. Evolution of cyanobacterial morphotypes. Commun Integr Biol. 2011;4:424–7. 27. Delsuc F, Brinkmann H, Philippe H. Phylogenomics and the

(6)

fast, convenient online submission

thorough peer review by experienced researchers in your field rapid publication on acceptance

support for research data, including large and complex data types

gold Open Access which fosters wider collaboration and increased citations maximum visibility for your research: over 100M website views per year

At BMC, research is always in progress. Learn more biomedcentral.com/submissions

Ready to submit your research? Choose BMC and benefit from:

28. Cornet L, Bertrand AR, Hanikenne M, Javaux EJ, Wilmotte A, Baurain D. Metagenomic assembly of new (sub)arctic Cyanobacteria and their associated microbiome from non-axenic cultures. bioRxiv. 2018. https :// doi.org/10.1101/28773 0.

29. Lagesen K, Hallin P, Rødland EA, Stærfeldt H-H, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007;35:3100–8.

30. Pruesse E, Peplies J, Glöckner FO. SINA: accurate high-throughput multiple sequence alignment of ribosomal RNA genes. Bioinformatics. 2012;28:1823–9.

31. Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22:1658–9. 32. Taton A, Grubisic S, Brambilla E, Wit RD, Wilmotte A. Cyanobacterial

diversity in natural and artificial microbial mats of Lake Fryxell (McMurdo Dry Valleys, Antarctica): a morphological and molecular approach. Appl Environ Microbiol. 2003;69:5157–69.

33. Edgar RC. Accuracy of taxonomy prediction for 16S rRNA and fungal ITS sequences. PeerJ. 2018;6:e4652.

34. Criscuolo A, Gribaldo S. BMGE (Block Mapping and Gathering with Entropy): a new software for selection of phylogenetic informative regions from multiple sequence alignments. BMC Evol Biol. 2010;10:210. 35. Katoh K, Standley DM. MAFFT multiple sequence alignment software

version 7: improvements in performance and usability. Mol Biol Evol. 2013;30:772–80.

36. Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006;22:2688–90.

37. Bah T. Inkscape: guide to a vector drawing program (Digital Short Cut). New York: Pearson Education; 2009.

38. Cornet L, Meunier L, Vlierberghe MV, Leonard RR, Durieu B, Lara Y, et al. Consensus assessment of the contamination level of publicly available cyanobacterial genomes. PLOS One. 2018. https ://doi.org/10.1101/30178 8.

Figure

Fig. 1  Collapsed constrained phylogeny of Oxyphotobacteria based on an alignment of 2302 SSU rRNA (16S) sequences (1233 unambiguously  aligned nucleotide positions)
Fig. 2  Expanded basal part of the constrained rRNA phylogeny of Oxyphotobacteria, with clusters E, F, G1 and G2

Références

Documents relatifs

Instrumentation 2x Alto Saxophone 2x Tenor Saxophone Baritone Saxophone 4x Trumpet / Flugelhorn 3x Tenor Trombone Bass Trombone Electric Guitar Bass Guitar

We considered the possibility that there was only very weak molecular correlation along the texture axis (a classical nematic) so that the observed meridional arcs could be

[r]

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des

The cunning feature of the equation is that instead of calculating the number of ways in which a sample including a given taxon could be drawn from the population, which is

Unit´e de recherche INRIA Rennes, Irisa, Campus universitaire de Beaulieu, 35042 RENNES Cedex Unit´e de recherche INRIA Rhˆone-Alpes, 655, avenue de l’Europe, 38330 MONTBONNOT ST

Since welding is bound to be indispensable for construction of large-scale structures and long-term repairs in space environment, a need for advancement of metals

Traditional contact theory assumes that when two surfaces meet at a single contact spot, the electrical current flow lines spread in all directions upon exiting the contact spot,