
Evaluation of the relative importance between gene loss and gene gain during mollicute evolution


Academic year: 2021

[Figure: phylogenetic tree of mollicutes (scale bar: 0.2); branch support values range from 67 to 100. Taxa shown: M. mycoides subsp. mycoides, M. mycoides subsp. capri, M. capricolum subsp. capripneumoniae, M. capricolum subsp. capricolum, M. leachii, M. yeatsii, M. putrefaciens, Me. florum, S. citri, M. bovigenitalium, M. fermentans, M. bovis, M. agalactiae, M. synoviae, M. crocodyli, M. pulmonis, M. hyopneumoniae, M. ovipneumoniae, M. conjunctivae, M. hyorhinis, M. mobile, M. alkalescens, M. auris, M. arginini, M. hominis, M. arthritidis, U. parvum, U. urealyticum, U. diversum, M. penetrans, M. genitalium, M. pneumoniae, M. gallisepticum, Ca. phytoplasma asteris Onion Yellows strain, Ca. phytoplasma asteris Aster Yellows strain, Ca. phytoplasma australiense, Ca. phytoplasma mali, A. laidlawii.]

Genome sequencing of 20 genomes from ruminant mycoplasmas

Sequencing and assembly of 20 genomes from the EVOLMYCO project. Genomes were sequenced with several technologies, resulting in drafts with coverages ranging from 31X to 215X.

(1) 454 sequencing (8 kbp mate-paired libraries); (2) Sanger sequencing (0.5 - 1X coverage); (3) whole genome amplification step.

Evaluation of the relative importance between gene loss and gene gain during mollicute evolution

Pascal Sirand-Pugnet (1), Marc Breton (1), Emilie Dordet-Frisoni (2), Eric Baranowski (2), Aurélien Barré (3), Carole Couture (1), Virginie Dupuy (4), Patrice Gaurivaud (5), Daniel Jacob (1), Claire Lemaitre (6), Lucia Manso-Silvan (4), Macha Nikolski (7), Laurent Xavier Nouvel (2), François Poumarat (5), Florence Tardy (5), Patricia Thébault (7), Sébastien Theil (4), Christine Citti (2), François Thiaucourt (4), Alain Blanchard (1)

(1) Univ. Bordeaux, INRA, UMR 1332 de Biologie du Fruit et Pathologie, F-33140 Villenave d’Ornon, France
(2) Univ. Toulouse, INRA, ENVT, UMR 1225, F-31076 Toulouse, France
(3) Univ. Bordeaux, Centre de Bioinformatique de Bordeaux, F-33076 Bordeaux cedex, France
(4) CIRAD, UMR CMAEE, Campus de Baillarguet, F-34398 Montpellier, France
(5) Anses, Lyon Laboratory, UMR Mycoplasmoses of Ruminants, F-69364 Lyon cedex 07, France
(6) Equipe GenScale, INRIA, Rennes Bretagne Atlantique, Campus de Beaulieu, F-35042 Rennes, France
(7) Univ. Bordeaux, Laboratoire Bordelais de Recherche en Informatique, Centre de Bioinformatique de Bordeaux, UMR 5800, F-33405 Talence, France

Introduction

Conclusion

This collective study gathered an impressive amount of data. The first and most significant outputs are:

Species/subspecies                    Strain      Sequencing strategy                  Scaffold number  Total bp (contigs >500b)
M. agalactiae                         14628-cl    454 + Solexa (+ 454MP(1))            1                919591
M. alkalescens                        14918-cl    Sanger(2) + 454 + Solexa (+ 454MP)   1                771939
M. arginini                           7264-cl     Sanger + 454 + Solexa (+ 454MP)      2                615621
M. auris                              15026-cl    454 MP + Solexa                      1                767861
M. bovis                              8790        454 + Solexa (+ 454MP)               1                905798
M. bovis                              1067        454 MP + Solexa                      4                907147
M. bovigenitalium                     cl-51080    454 MP + Solexa                      1                862247
M. capricolum subsp. capricolum       14232-cl    454 + Solexa (+ 454MP)               3                1032227
M. capricolum subsp. capripneumoniae  99108-P1    454 MP + Solexa                      10               1006277
M. leachii                            06049-C3    Sanger + 454 + Solexa (+ 454MP)      2                1008437
M. mycoides subsp. capri              PG3 (TS)    454 MP + Solexa                      2                1024498
M. ovipneumoniae                      cl-14811    454 MP + Solexa                      7                1071470
M. mycoides subsp. mycoides           PO-67       454 + Solexa (+ 454MP)               10               967062
M. mycoides subsp. mycoides           B345/93-cl  454 MP + Solexa                      5                957208
M. mycoides subsp. mycoides           C425/93-cl  454 MP + Solexa                      10               967852
M. mycoides subsp. mycoides           KH3J        454 MP + Solexa                      15               955221
M. mycoides subsp. mycoides           Gemu Goffa  454 MP + Solexa                      5                966678
M. mycoides subsp. mycoides           8740/53     454 MP + Solexa                      8                965363
M. yeatsii                            13926-cl    Sanger + 454 + Solexa (+ 454MP)      2                897474
U. diversum                           246         454 MP + Solexa (WGA(3))             64               1011155

Whole genome phylogeny of mollicutes

• 59 genomes
• 38 taxa
• 94 concatenated proteins (1 single gene per genome)
• 17635 amino acid residues
• 95% of branches with statistical support >98%
• Method:
  • MUSCLE alignment
  • Gblocks cleaning
  • PhyML (LG model)
  • aLRT statistical test
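The concatenation step behind the "94 concatenated proteins" can be sketched as follows. This is a minimal illustration only, not the consortium's pipeline: the function name `concatenate_alignments`, the toy taxa and sequences, and the choice to pad missing taxa with gaps are all assumptions of this sketch. It takes per-protein alignments (for example, as produced by MUSCLE and trimmed by Gblocks) and joins them into one supermatrix, which is the kind of input a tree-building program such as PhyML expects.

```python
from typing import Dict, List


def concatenate_alignments(alignments: List[Dict[str, str]],
                           taxa: List[str]) -> Dict[str, str]:
    """Join per-protein alignments (taxon -> aligned sequence) into one supermatrix."""
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in one alignment must have the same length")
        width = lengths.pop()
        for taxon in taxa:
            # Assumption of this sketch: a taxon absent from a block is padded with gaps.
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(blocks) for taxon, blocks in parts.items()}


# Toy data (hypothetical sequences, for illustration only).
toy_alignments = [
    {"M_bovis": "MKV-L", "M_agalactiae": "MKVAL", "U_diversum": "MRV-L"},
    {"M_bovis": "GTA", "M_agalactiae": "GSA", "U_diversum": "G-A"},
]
taxa = ["M_bovis", "M_agalactiae", "U_diversum"]
supermatrix = concatenate_alignments(toy_alignments, taxa)
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

Every row of the supermatrix has the same length, so column positions stay homologous across taxa.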

[Figure: whole-genome phylogenetic tree of mollicutes, with the phylogenetic groups Spiroplasma, Hominis, Pneumoniae, Acholeplasma and Phytoplasma labelled, and per-taxon genome counts (for example "2" and "+5") shown next to the tips.]

By contrast with 16S rDNA-based phylogeny, the concatenation of 94 conserved mollicute proteins yields a high-resolution phylogeny in which nearly all branches have maximal statistical support. In this reconstruction, the Hominis and Pneumoniae phylogenetic groups are unambiguously related, as descendants of a common ancestor. SEM: Spiroplasmataceae - Entomoplasmataceae - Mycoplasmataceae branch; AAP: Acholeplasmataceae - Anaeroplasmataceae - Phytoplasma branch.

Core and pan genomes of 59 mollicutes

• Pan and core genomes calculated from 43147 genes predicted in 59 genomes from 38 taxa
• Pan genome: 5203 gene families
• Core genome (predicted in 90% of genomes): 182 gene families
• 90% of gene families with 0 or 1 gene per genome
• 65% of gene families present in only one genome
• Only 8% of gene families present in at least 50% of genomes
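The pan and core genome figures above come down to counting how many genomes each gene family occurs in. The sketch below shows that bookkeeping on toy data. The function name `pan_core_summary`, the family and genome identifiers, and reading the 90% criterion as "at least 90% of genomes" are assumptions made for illustration, not details taken from the poster.

```python
from typing import Dict, Set


def pan_core_summary(families: Dict[str, Set[str]],
                     n_genomes: int,
                     core_fraction: float = 0.9) -> Dict[str, int]:
    """Count pan-genome families, core families and single-genome families."""
    # Pan genome: every gene family seen in at least one genome.
    pan = len(families)
    # Core genome: families found in at least `core_fraction` of the genomes (assumed reading).
    core = sum(1 for genomes in families.values()
               if len(genomes) >= core_fraction * n_genomes)
    # Families restricted to a single genome.
    single = sum(1 for genomes in families.values() if len(genomes) == 1)
    return {"pan": pan, "core": core, "single_genome": single}


# Toy presence data: family -> set of genomes where it was predicted (hypothetical).
toy_families = {
    "famA": {"g1", "g2", "g3", "g4"},
    "famB": {"g1", "g2", "g3", "g4"},
    "famC": {"g1", "g2"},
    "famD": {"g3"},
    "famE": {"g4"},
}
summary = pan_core_summary(toy_families, n_genomes=4)
print(summary)
```

With real data, `families` would hold the 5203 gene families and `n_genomes` would be 59.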

Gene families were computed from the 43147 CDS predicted in 59 genomes using a stepwise process: (1) all-against-all alignment using ssearch and the dedicated substitution matrix MOLLI60 (Lemaitre et al, 2011, BMC Bioinformatics), (2) clustering of the predicted homologs, (3) multiple alignment with MUSCLE (Edgar, 2004, NAR), (4) phylogenetic tree reconstruction using PhyML (Guindon et al, 2010, Syst Biol) and (5) analysis of tree topology to obtain families of orthologs. In the figure above, each row of dots corresponds to a gene family. Each of the 59 columns represents a genome.
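Step (2), the clustering of predicted homologs, can be illustrated with a small sketch. The poster does not say which clustering algorithm was used, so the single-linkage grouping below (a union-find over pairs of homologous genes) is only an assumed stand-in. The function name `cluster_homologs` and the gene identifiers are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def cluster_homologs(genes: Iterable[str],
                     pairs: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group genes into clusters: two genes share a cluster if a chain of hits links them."""
    parent: Dict[str, str] = {gene: gene for gene in genes}

    def find(gene: str) -> str:
        # Walk up to the cluster representative, shortening the path on the way.
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    clusters: Dict[str, List[str]] = {}
    for gene in parent:
        clusters.setdefault(find(gene), []).append(gene)
    return sorted(sorted(members) for members in clusters.values())


# Toy input: significant pairwise hits from an all-against-all search (hypothetical).
genes = ["bovis_001", "agal_001", "auris_007", "bovis_002", "yeatsii_010"]
hits = [("bovis_001", "agal_001"), ("agal_001", "auris_007"),
        ("bovis_002", "yeatsii_010")]
clusters = cluster_homologs(genes, hits)
print(clusters)
```

Single linkage tends to merge families through a single spurious hit, which is one reason a later tree-based step, such as steps (4) and (5), is useful for separating orthologs from paralogs.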

Horizontal gene transfers among mollicutes

A phylogeny-based analysis was conducted to predict horizontal gene transfer (HGT) between mollicutes belonging to different phylogenetic groups. As expected, a large majority of the predicted HGT events involve species that share the same host: human (M. hominis/Ureaplasmas), avian (M. gallisepticum/M. synoviae) or ruminant. Besides previously reported transfers between the Mycoplasma mycoides cluster and the M. bovis/agalactiae cluster, we predicted numerous exchanges between newly sequenced ruminant mycoplasmas belonging to the Hominis group (M. auris, M. alkalescens and M. bovigenitalium), the Spiroplasma group (M. yeatsii) and the Pneumoniae group (U. diversum).

Mobile genetic elements found in mollicutes

A systematic search identified potential Mobile Genetic Elements (MGEs) in 58 genomes out of 59. The main MGE categories found in mollicutes are: insertion sequences (54 genomes), ICE/genomic islands (29 genomes), phages (9 genomes), plasmids (6 genomes) and restriction/modification systems (56 genomes). Though they are ubiquitous, mollicute MGEs are heterogeneously distributed within genera and even within species. This observation illustrates the ability of MGEs to modify mollicute genomes over a short time scale.

This work was supported by a grant for the EVOLMYCO project (ANR-07-GMGE-001) from ANR to A. Blanchard (principal investigator [PI]), F. Thiaucourt (co-PI), M. Nikolski (co-PI), F. Poumarat (co-PI), and C. Citti (co-PI).

Evolution of gene content from the mollicutes LCA to the strain level

The evolution of gene content was simulated from the mollicutes LCA to the strain level. While gene family losses and contractions were predominant during the early steps of evolution, these forces were less active in the following steps. Moreover, genome erosion was partly counterbalanced by gene family gains and expansions, especially at the species level. These novel gene families might have been involved in the adaptation to a wide diversity of hosts, even within groups of phylogenetically related species.

After the two massive gene losses on the branches leading to the SEM and phytoplasma LCAs:

• Persistence of gene family losses and contractions, but at a slower rate
• Counterbalancing effect of gene family gains and expansions
• Main gene family gains occurred at the species (or cluster of species) level

Two massive gene losses occurred during the early steps of mollicutes evolution

The evolution of gene content in mollicutes was inferred using the Wagner parsimony method. Numbers of gene families in the mollicutes Last Common Ancestor (LCA) and in other LCAs are written in yellow. Gene family losses (blue, -) and gains (green, +) are indicated on branches. The early steps of mollicutes evolution were marked by two massive losses that occurred during the speciation of the SEM and phytoplasma LCAs. Only very few gene family gains were predicted for these early steps.
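Wagner parsimony can be illustrated on a single gene family with a small dynamic-programming sketch. The poster does not give the cost scheme or the software used, so the version below assumes the simplest setting: changing the copy number by one costs one unit, for gains and losses alike. The function name `wagner_parsimony`, the tree encoding and the toy tree are hypothetical. For each internal node, the code computes the cheapest total change below it for every candidate copy number, then traces back one optimal assignment of ancestral copy numbers.

```python
from typing import Dict, List, Optional, Tuple


def wagner_parsimony(children: Dict[str, List[str]], root: str,
                     leaf_counts: Dict[str, int]) -> Tuple[int, Dict[str, int]]:
    """Return (minimal total change, one optimal copy number per node) for one gene family."""
    states = range(max(leaf_counts.values()) + 1)
    inf = float("inf")
    cost: Dict[str, List[float]] = {}

    def up(node: str) -> None:
        kids = children.get(node, [])
        if not kids:
            # A leaf is fixed to its observed copy number.
            cost[node] = [0 if s == leaf_counts[node] else inf for s in states]
            return
        for kid in kids:
            up(kid)
        # Cost of state s = sum over children of the best child state plus |s - t|.
        cost[node] = [sum(min(cost[kid][t] + abs(s - t) for t in states)
                          for kid in kids)
                      for s in states]

    assigned: Dict[str, int] = {}

    def down(node: str, parent_state: Optional[int]) -> None:
        if parent_state is None:
            best = min(states, key=lambda s: cost[node][s])
        else:
            best = min(states, key=lambda s: cost[node][s] + abs(parent_state - s))
        assigned[node] = best
        for kid in children.get(node, []):
            down(kid, best)

    up(root)
    down(root, None)
    return int(cost[root][assigned[root]]), assigned


# Toy tree: root -> (A, inner), inner -> (B, C); copy numbers observed at the leaves.
toy_tree = {"root": ["A", "inner"], "inner": ["B", "C"]}
toy_counts = {"A": 2, "B": 0, "C": 0}
total_change, ancestral = wagner_parsimony(toy_tree, "root", toy_counts)
print(total_change, ancestral)
```

Comparing each node's assigned copy number with its parent's then gives the gains (increase), losses (decrease to zero), expansions and contractions per branch. Several assignments can be equally parsimonious; this sketch returns only one of them.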

Mollicutes colonize a wide diversity of hosts and are characterized by a highly reduced genome. Until recently, gene loss was considered the only force driving their evolution. Our previous results challenged this dogma by showing that massive gene exchange had occurred between ruminant mycoplasmas. In order to evaluate the evolutionary forces shaping mollicutes’ genomes, we conducted a large-scale genome study, with a particular focus on ruminant mycoplasmas. Twenty genomes representing 12 species or subspecies were sequenced using Next Generation Sequencing technologies. Homogeneous, expert annotation was achieved by the EVOLMYCO consortium, and genome drafts were integrated in the MolliGen database. Comparative genomics analyses were performed on a total set of 59 genomes with the aim of deciphering the main evolutionary forces that led to the current wide diversity of mollicutes species.

• An improved phylogenetic tree showing reliable branching using a much larger set of informative sites
• The identification of a very limited number of genes shared by all mollicutes
• The reconstruction of the mollicutes evolution including the main events of gene loss and/or gain
• A detailed view of gene transfers among all mollicutes sharing a common host
• A complete survey of Mobile Genetic Elements

2: number of genomes available at the time of the study

+5: number of genomes sequenced by the EVOLMYCO consortium

[Figure: distribution of gene families across genomes. Columns 1 to 59 correspond to the genomes listed below.]

1 M. leachii PG50
2 M. leachii 99/014/6
3 M. leachii 06049-C3
4 M. capricolum subsp. capricolum ATCC 27343
5 M. capricolum subsp. capricolum 14232-cl
6 M. capricolum subsp. capripneumoniae 99108-P1
7 M. mycoides subsp. mycoides SC PG1
8 M. mycoides subsp. mycoides SC Gladysdale
9 M. mycoides subsp. mycoides SC Gemu Goffa
10 M. mycoides subsp. mycoides SC KH3J
11 M. mycoides subsp. mycoides SC C425/93-cl
12 M. mycoides subsp. mycoides SC PO-67
13 M. mycoides subsp. mycoides SC B345/93-cl
14 M. mycoides subsp. capri PG3
15 M. mycoides subsp. capri 95010
16 M. mycoides subsp. capri GM12
17 M. yeatsii 13926-cl
18 M. putrefaciens
19 Mesoplasma florum L1
20 Spiroplasma citri G II-3
21 M. auris 15026-cl
22 M. alkalescens 14918-cl
23 M. arginini 7264-cl
24 M. hominis PG21
25 M. arthritidis 158L3-1
26 M. mobile 163K
27 M. hyorhinis HUB-1
28 M. conjunctivae HRC/581T
29 M. ovipneumoniae cl-14811
30 M. hyopneumoniae 232
31 M. hyopneumoniae J
32 M. hyopneumoniae 7448
33 M. pulmonis UAB CTIP
34 M. synoviae 53
35 M. crocodyli MP145
36 M. fermentans JER
37 M. bovigenitalium cl-51080
38 M. bovis PG45
39 M. bovis 1067
40 M. bovis Hubei-1
41 M. bovis 8790
42 M. agalactiae PG2
43 M. agalactiae 5632
44 M. agalactiae 14628-cl
45 U. parvum serovar 3 ATCC 700970
46 U. parvum serovar 3 ATCC 27815
47 U. urealyticum Western serovar 10 ATCC 33699
48 U. diversum 246
49 M. penetrans HF-2
50 M. pneumoniae M129
51 M. genitalium G37
52 M. gallisepticum R (low)
53 M. gallisepticum R (high)
54 M. gallisepticum F
55 Acholeplasma laidlawii PG-8A
56 Phytoplasma mali AT
57 Phytoplasma australiense
58 Phytoplasma asteris OY
59 Phytoplasma asteris AYWB

[Figure legend: predicted horizontal gene transfers, binned as 2 to 4, 5 to 9 and ≥ 10; a separate category is labelled "Different hosts".]

[Figure: distribution of mobile genetic elements across the same 59 genomes. Column labels: IS256, IS1182, IS1634, IS21, IS3, IS30, IS481, ISNCY, Tyr, DDE, ICE, PMUs, Tra Isl., RCR, Theta, I, II, III, IV. Panel labels include "Insertion sequences" and "Prophages".]
