Detecting Highways of Horizontal Gene Transfer
MUKUL S. BANSAL, 1,3 GUY BANAY, 1 J. PETER GOGARTEN, 2 and RON SHAMIR 1
In a horizontal gene transfer (HGT) event, a gene is transferred between two species that do not have an ancestor-descendant relationship. Typically, no more than a few genes are horizontally transferred between any two species. However, several studies identified pairs of species between which many different genes were horizontally transferred. Such a pair is said to be linked by a highway of gene sharing. We present a method for inferring such highways. Our method is based on the fact that the evolutionary histories of horizontally transferred genes disagree with the corresponding species phylogeny. Specifically, given a set of gene trees and a trusted rooted species tree, each gene tree is first decomposed into its constituent quartet trees and the quartets that are inconsistent with the species tree are identified. Our method finds a pair of species such that a highway between them explains the largest (normalized) fraction of inconsistent quartets. For a problem on n species and m input quartet trees, we give an efficient O(m + n^2)-time algorithm for detecting highways, which is optimal with respect to the quartets input size. An application of our method to a dataset of 1128 genes from 11 cyanobacterial species, as well as to simulated datasets, illustrates the efficacy of our method. Key words: algorithms, horizontal gene transfer, microbial evolution, quartets.
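The quartet decomposition step described in this abstract can be illustrated with a small sketch. This is a naive enumeration, not the authors' O(m + n^2) algorithm, and it stops at listing inconsistent quartets; it does not score candidate highways. It assumes rooted binary trees written as nested Python tuples with one leaf per species, and the function names are hypothetical.

```python
from itertools import combinations

def clades(tree):
    """Return (leaf set of tree, leaf sets of every subtree below the root)."""
    if isinstance(tree, str):
        return frozenset([tree]), []
    leaves = frozenset()
    found = []
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found.extend(child_clades)
        found.append(child_leaves)
    return leaves, found

def quartet_topology(clade_list, quartet):
    """Return the split {{a, b}, {c, d}} that the tree induces on four leaves."""
    q = frozenset(quartet)
    for clade in clade_list:
        inside = clade & q
        if len(inside) == 2:  # this clade separates two of the leaves from the other two
            return frozenset([inside, q - inside])
    return None  # unresolved, which can only happen in non-binary trees

def inconsistent_quartets(gene_tree, species_tree):
    """List the gene-tree quartets whose topology differs from the species tree."""
    gene_leaves, gene_clades = clades(gene_tree)
    species_leaves, species_clades = clades(species_tree)
    bad = []
    for quartet in combinations(sorted(gene_leaves), 4):
        if not set(quartet) <= species_leaves:
            continue  # assumption: skip taxa missing from the species tree
        g = quartet_topology(gene_clades, quartet)
        s = quartet_topology(species_clades, quartet)
        if g is not None and s is not None and g != s:
            bad.append(quartet)
    return bad
```

For example, with species tree `((("A", "B"), "C"), ("D", "E"))` and gene tree `((("A", "B"), "D"), ("C", "E"))`, the two quartets containing both C and D together with E are reported as inconsistent, while the three quartets containing both A and B agree with the species tree.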
Horizontal gene transfer (HGT) is an important mode of adaptation and diversification of prokaryotes and eukaryotes and a major event underlying the emergence of bacterial pathogens and mutualists. Yet it remains unclear how complex phenotypic traits such as the ability to fix nitrogen with legumes have successfully spread over large phylogenetic distances. Here we show, using experimental evolution coupled with whole genome sequencing, that co-transfer of imuABC error-prone DNA polymerase genes with key symbiotic genes accelerates the evolution of a soil bacterium into a legume symbiont. Following introduction of the symbiotic plasmid of Cupriavidus taiwanensis, the Mimosa symbiont, into pathogenic Ralstonia solanacearum we challenged transconjugants to become Mimosa symbionts through serial plant-bacteria co-cultures. We demonstrate that a mutagenesis imuABC cassette encoded on the C. taiwanensis symbiotic plasmid triggered a transient hypermutability stage in R. solanacearum transconjugants that occurred before the cells entered the plant. The generated burst in genetic diversity accelerated symbiotic adaptation of the recipient genome under plant selection pressure, presumably by improving the exploration of the fitness landscape. Finally, we show that plasmid imuABC cassettes are over-represented in rhizobial lineages harboring symbiotic plasmids. Our findings shed light on a mechanism that may have facilitated the dissemination of symbiotic competency among α- and β-proteobacteria in natura and provide evidence for the positive role of environment-induced mutagenesis in the acquisition of a complex lifestyle trait. We speculate that co-transfer of complex phenotypic traits with mutagenesis determinants might frequently enhance the ecological success of HGT.
highly expressed enzyme in the Peruphasma schultei anterior midgut 7 .
Evidence for Horizontal Gene Transfer. The Bayesian analysis for insect, nematode, and microbial GH28 pectinases converged after 300,000 generations with an average standard deviation of split frequencies of 0.014. The alpha-shape parameter of the gamma distribution was 1.8. Due to the bootstrap algorithm in RAxML, the Maximum Likelihood (ML) analysis converged after 450 bootstrap replicates. The Le and Gascuel model (LG) was determined as best-scoring evolutionary model; the estimated alpha-shape parameter of the gamma distribution was 1.7. The ML and Bayesian phylogenetic analyses converged, providing trees with almost the same topology. The Bayesian topology was chosen as reference topology, and the more conservative 16,17 bootstrap support
Received 19 July 2006/Accepted 19 October 2006
The phylogenetically closely related species Streptococcus salivarius and Streptococcus vestibularis are oral bacteria that are considered commensals, although they can also be found in human infections. The relationship between these two species and the relationship between strains isolated from carriers and strains responsible for invasive infections were investigated by multilocus sequence typing and additional sequence analysis. The clustering of several S. vestibularis alleles and the extent of genomic divergence at certain loci support the conclusion that S. salivarius and S. vestibularis are separate species. The level of sequence diversity in S. salivarius alleles is generally high, whereas that in S. vestibularis alleles is low at certain loci, indicating that the latter species might have evolved recently. Cluster analysis indicated that there has been genetic exchange between S. salivarius and S. vestibularis at three of the nine loci investigated. Horizontal gene transfer between streptococci belonging to the S. salivarius group and other oral streptococci was also detected at several loci. A high level of recombination in S. salivarius was revealed by allele index association and split decomposition sequence analyses. Commensal and infection-associated S. salivarius strains could not be distinguished by cluster analysis, suggesting that the pathogen isolates are opportunistic. Taken together, our results indicate that there is a high level of gene exchange that contributes to the evolution of two streptococcal species from the human oral cavity.
Keywords: glycoside hydrolase; horizontal gene transfer; nematodes; plant parasitism
Plant-parasitic nematodes (PPN) cause damage to crops across the world and are a major threat to global food security. A phylogenetic analysis of the phylum Nematoda [1, 2] has shown that the ability to parasitize plants has arisen independently on at least four separate occasions within the phylum. The majority of the most economically important PPN species are located in Clade 12 (Tylenchida), and include migratory endoparasitic species as well as the biotrophic, sedentary endoparasitic root-knot and cyst nematodes. These nematodes, and the Clade 10 plant parasite Bursaphelenchus xylophilus, have been intensively studied, and extensive genome and transcriptome resources are available for them. These resources include full genome sequences for several root-knot and cyst nematodes (e.g., [3–6]) and B. xylophilus [7] as well as extensive transcriptome analysis for a wide range of other species in these clades (reviewed in [8]). In contrast to endoparasitic nematodes, which are restricted to Clades 12 and 10, ectoparasitic nematode species can be found in all four plant parasite Clades [9]. However, very little genome or transcriptome information is available for the ectoparasitic nematodes in Clades 1 (e.g., Trichodoridae) and 2 (e.g., Longidoridae) other than a small-scale expressed sequence tag project for the longidorid Xiphinema index [10]. Consequently, the molecular process by which Clade 1 and 2 ectoparasitic nematodes infect plants is poorly understood. Ectoparasitic nematodes from Clades 1 and 2 cause damage to plants, either through direct feeding or by transmission of plant viruses. The economic damage caused by plant viruses explains why major vector species, including the nematodes X. index and Longidorus elongatus, are among the most studied ectoparasites. Both these nematodes belong to the family Longidoridae and are members
A putative function can be predicted for only a few of the specific genes, such as genes coding for proteins involved in sugar and nucleotide metabolism, for uridine diphosphoglucuronate 5’-epimerase, or for a UDP-glucose 6-dehydrogenase. Furthermore, a specific ANK motif-containing protein and a leucine-rich repeat protein are present in strain HL 0604 1035. In strain Lorraine we identified mainly specific metabolic enzymes, such as a putative flavanone 3-dioxygenase, an enzyme involved in flavonoid metabolism and in the biosynthesis of phenylpropanoids, which are secondary metabolites of plants and algae. In addition, lpo2614 is predicted to encode a kynurenine-oxoglutarate transaminase, an enzyme that is part of tryptophan metabolism, and lpo2960 codes for a putative glycolate oxidase that catalyses the conversion of glycolate and oxygen to glyoxylate and hydrogen peroxide. lpo2502 encodes a homologue of CsbD, a general stress response protein of Bacillus subtilis. However, the best BLASTp hit is with the homologue from Protochlamydia amoebophila, an Acanthamoeba sp. symbiont. This gene was probably acquired by HGT between these two bacteria within their amoeba host. Quite surprisingly, we identified a gene coding for a putative methyl-accepting chemotaxis sensory transducer (lpv1770), although none of the L. pneumophila strains analyzed to date encode chemotaxis systems. This gene shares 71.34% amino acid identity with Llo3301 of L. longbeachae, a protein that is part of its chemotaxis system, which is also present in L. drancourtii. A common ancestor probably encoded a chemotaxis system that was lost in L. pneumophila through a deletion and degradation process.
licates in Z. subfasciatus, since it survived ~5–10 generations in the laboratory.
Ecological context of HGT
As this HGT event has been detected only between individuals from the Altiplano, where the three species co-occur and share the same host plant, it gives clues to the ecological context that may favor such a gene exchange. Indeed, the ecologically distinct A. argillaceus does not show an HGT pattern for cytb, despite its phylogenetic proximity to A. obtectus and A. obvelatus. This suggests that the probability of lateral gene transfer occurring is partly controlled by the environment. The occurrence of HGT between bruchid species that (i) present a high degree of phylogenetic divergence but (ii) share some dimensions of their ecological niche (e.g., their host plant) could indicate that ecology, as well as phylogeny, might play a role in the distribution of genetic variation across taxa.
Collectively, our data suggest that HGT is an important factor in the evolution and ubiquity of the c-di-GMP signaling system. This indicates an overarching relationship between HGT and the c-di-GMP system, which, to our knowledge, has not previously been described despite the ubiquity of the c-di-GMP system and the great importance of HGT for the evolution of bacteria. In support of this, Bordeleau et al. [19] showed that some integrating conjugative elements of Vibrio cholerae encode active DGCs affecting biofilm formation as well as motility, and it was noted that GGDEFs could be identified in some conjugative plasmids and a bacteriophage. Richter et al. [20] found that the human pathogen E. coli O104:H4 expressed DgcX (a c-di-GMP DGC) at high levels and that this facilitated a unique biofilm phenotype related to the strain’s high virulence. Interestingly, the dgcX gene is encoded at an attB phage integration site and flanked by prophage elements, suggesting that the gene was acquired via HGT. Kulesekara et al. [4] found that several c-di-GMP DGC and PDE genes of P. aeruginosa are located on presumptive horizontally acquired genomic islands.
Experimental Evolution after HGT Induces Global and Specific Changes in the Proteome
Extensive global changes in the cellular proteomic profile were observed, the main driver being antibiotic selection. Additionally, changes were associated with both the plasmid and the chromosome. Unsupervised cluster analysis of the results of a first differential quantitative proteomics experiment (comparing ancestral to evolved populations at g1000) revealed that populations clustered by selection history, yielding three groups (fig. 4A): generation zero (i.e., the reference starting point before selection), populations selected in chloramphenicol, and populations selected in ampicillin, with differences between populations carrying the different versions of the cat gene being of smaller amplitude. Interestingly, the three ancestral populations transformed with the three cat versions did not display the same proteomic profile, meaning that the introduction of synonymous genes differentially impacts the cellular proteome. Such an impact of the CUP of a few genes on an important part of the proteome has recently been shown in another study in E. coli (Frumkin et al. 2018). Cluster analysis for the second differential quantitative proteomics experiment showed that evolved populations presented a different proteomic profile from the ancestral bacteria transformed with the corresponding evolved plasmid: although the three populations carrying cat-AT and evolved in chloramphenicol clustered together (labeled as ATCam1-3 in fig. 4B), ancestral bacteria transformed with plasmids extracted from the evolved populations (labeled as ATCam1-3tr in fig. 4B) did not cluster together, nor were they closer to the respective evolved population from which the plasmids originated (e.g., ATCam1 did not cluster with ATCam1tr in fig. 4B). This means that evolutionary changes in the proteome were far from being fully determined by the evolved plasmid.
However, we assert that these changes are far too fundamental to be explained by the hypothesis of gene duplication plus LBA alone, but require a much longer evolutionary history and a more distant shared origin, accessed via HGT. A radical within-lineage functional divergence would suggest the acquisition of a new function for the rare SerRS variant; however, this is unlikely to be the case, as after its introduction some lineages retained the ancestral form and lost the rare form, while other lineages kept the rare form and lost the ancestral form. This is even the case for closely related groups, such as the genus Methanosarcina: M. acetivorans and M. mazei both possess the common form of SerRS, while M. barkeri carries the rare form. This suggests functional equivalence, which is more consistent with an invasion followed by sorting. Even if the functional roles of the rare and common SerRS types were somehow different, they both still recognize the same cognate tRNA type, which is highly conserved. Therefore, the tRNA recognition domain should be under strong continuous purifying selection, and be least likely to change over such a short time, even if the rest of the protein is under positive selection to acquire novel function(s). In fact, tRNA recognition is under such strong purifying selection within aaRS proteins that HGT events between domains can even result in recombined displacements, with chimeric gene products retaining the vertically-inherited tRNA recognition domain. Yet, in the case of SerRS the exact opposite scenario is observed, with the most radical difference between the rare and common form being the region involved in tRNA recognition. This suggests that the rare form coevolved in a distant lineage with a different tRNA, undergoing subsequent adaptations to accommodate the methanogen tRNA following transfer.
Penguins are widely distributed from Antarctica to the tropics. Some penguin species occupy large latitudinal and climatic ranges, and others more limited areas characterized by stable but extreme climatic conditions. Penguins’ evolutionary history has been strongly correlated with historical events of climate change, due to the co-occurrence between diversification of penguin species and global cooling periods that resulted in Antarctica becoming ice-covered. How did penguin species adapt? Has temperature been a driver in penguin speciation? Thanks to current advances in genomic technologies, it is possible to search for genetic signatures across populations that could shed light on evolutionary processes. In this study, we sequenced 12 penguin genomes from 6 species living under different temperature conditions to evaluate inter- and intra-specific genetic diversity patterns with emphasis on genes known to be involved in temperature regulation. We observed higher genetic diversity among Antarctic species and species from lower latitudes. Also, according to SNPs and posterior Gene Ontology enrichment analysis, we found that “Biosynthetic Process” was the most represented biological process, and that genes belonging to the “Response to temperature” associated process were found to be enriched (p-value < 0.001). In addition, temperature-related genes showed a higher inter-specific nucleotide diversity than the rest of the genes analyzed. Our results thus far suggest that substitutions have been accumulating in genes associated with metabolism and temperature to adapt to new environments. Phylogenetic and positive selection analyses would help to better understand the effects of the observed genetic diversity among penguin species.
Serial horizontal gene transfer underlies multi-partner co-obligate mutualistic association
Both the biotin- and thiamin-biosynthetic genes of the Erwinia endosymbiont’s plasmids (bioA, bioB, thiC, thiE, thiF, thiS, thiG, thiH, and thiD) encoded in this molecule consistently showed top BLASTP hits against Sodalis and Sodalis-like bacteria. Surprisingly, the thiamin-biosynthetic genes present in the Hamiltonella symbiont’s genome showed similar results. These results contrasted with what we observed for the plasmid-encoded nupC, apbE, and gpmA, where the top BLASTP hits were consistently against Erwinia bacteria. To test for HGT events across the Erwinia genome, we ran BLASTP similarity searches of the proteins of Erwinia endosymbionts vs. a database built from the proteomes of Erwinia and Sodalis species. The search revealed 11 proteins that putatively originated from an HGT event from Sodalis-related bacteria: the plasmidic bioA, bioB, thiC, thiE, thiF, thiS, thiG, thiH, and thiD genes; and the chromosomal bioD and thiI genes. None of these genes were found to have a “native” copy in the genome that hosts them. To validate these HGT events, we collected orthologous genes across different enterobacterial species and reconstructed Bayesian phylogenies (Fig. 4). All 11 genes supported a single event of HGT for both Erwinia and Hamiltonella symbionts of Cinara, as the Erwinia and Hamiltonella sequences were consistently recovered as a monophyletic group nested within or sister to Sodalis spp. This contrasted with what is observed for the gpmA and nupC genes, which are confidently recovered nested within Erwinia spp. (supplementary Fig. S8, Supplementary Material online). Additionally, the majority of the genes’ subtrees are congruent with the topology of the Erwinia endosymbionts’ subtree (supplementary Fig. S9, Supplementary Material online). No Sodalis-related bacteria were detected during the assembly and binning process in any of the analysed samples.
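The top-hit screen described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it assumes BLASTP results in the standard 12-column tabular format (`-outfmt 6`, with the bit score in the last column), and the function name `top_hit_genera` and the `genus_of` lookup from subject identifier to genus are hypothetical.

```python
import csv
from io import StringIO

def top_hit_genera(blast_tsv, genus_of):
    """Map each query to the genus of its highest-bit-score subject.

    blast_tsv: text in BLAST tabular format (-outfmt 6); columns 1, 2 and 12
               are the query id, subject id and bit score.
    genus_of:  hypothetical dict from subject id to genus name.
    """
    best = {}  # query id -> (bit score, subject id)
    for row in csv.reader(StringIO(blast_tsv), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][0]:
            best[query] = (bitscore, subject)
    return {q: genus_of.get(s, "unknown") for q, (_, s) in best.items()}
```

Queries whose top hit falls outside the host genus are only candidates for HGT; as in the text, they still need phylogenetic validation.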
Abstract: Phylogenetic studies reveal that horizontal gene transfer (HGT) plays a prominent role in evolution and genetic variability of life. Five biotic mechanisms of HGT among prokaryotic organisms have been extensively characterized: conjugation, competence, transduction, gene transfer agent particles, and transitory fusion with recombination, but it is not known whether they can account for all natural HGT. It is even less clear how HGT could have occurred before any of these mechanisms had developed. Here, we consider contemporary conditions and experiments on microorganisms to estimate possible roles of abiotic HGT, both currently and throughout evolution. Candidate mechanisms include freeze-and-thaw, microbeads-agitation, and electroporation-based transformation, and we posit that these laboratory techniques have analogues in nature acting as mechanisms of abiotic HGT: freeze-and-thaw cycles in polar waters, agitation by sand at foreshores and riverbeds, and lightning-triggered electroporation in near-surface aqueous habitats. We derive conservative order-of-magnitude estimates for rates of microorganisms subjected to freeze-and-thaw cycles, sand agitation, and lightning-triggered electroporation, at 10^24, 10^19, and 10^17 per year, respectively. Considering the yield of viable transformants, which is by far the highest in electroporation, we argue this may still favor lightning-triggered transformation over the other two mechanisms.
More recently, a reverse genetic approach revealed that the virulence master regulators are largely the same in D. solani and D. dadantii.
The analysis of population genome structure and dynamics, including additive or replacing horizontal gene transfer (HGT), may bring valuable clues on the mechanisms of emergence of D. solani. While additive HGT allows the acquisition of novel genes by a population [15–20], replacing HGT causes the replacement of an allele by another from close relatives. HGT events inform about the genome diversification and adaptation processes, but also about the companion populations that the pathogens met during the emergence and dissemination steps. Replacing HGT is also a major concern in pathogen diagnostics, as it may cause false identification when the alleles exchanged by replacing HGT are used as molecular taxonomic markers.
HGTs between species from the same environment
We analyzed 42 instances of HGT with phylogenetic support and identified new instances of transfer into the genome of R. bellii from different bacterial phyla, including Firmicutes, Nitrospirales, Bacteroidetes, and Proteobacteria (Beta and Gamma). R. bellii has been previously reported to exchange genes related to amoebal symbionts, and there have been no reported cases of gene loss. The newly introduced genes may be maintained as a consequence of the large size of the R. bellii genome. Our phylogenetic analyses also identified four instances of horizontal gene transfer among amoeba-resistant microorganisms (Additional file 1, Table S7, gi:157964915, gi:157825906, gi:157827021, gi:229586611). Three out of these four instances were among Amoebophilus asiaticus 5a2 and Rickettsia species, and one case was among Legionella species and Rickettsia species. Recently, genes were found to have been transferred between M. avium and L. pneumophila, which have sympatric lifestyles in free-living protozoa. Additionally, empirical observations of transferred genes have been described in sympatric populations [46,47]. Here, our phylogenetic reconstructions identified possible instances of HGT between the ancestors of Rickettsia and the amoeba parasites. Our results support a model in which free-living amoebae act as a “melting pot” for genetic exchange, particularly HGT [47-49]. In addition to amoeba-resistant bacteria, fungi, giant viruses and virophages have been detected in free-living protozoa [50-52], which suggests that bacteria living in sympatric populations facilitate more genetic exchange than those living in allopatric populations and that protists can serve as a hot spot for horizontal gene transfer.
Keywords: horizontal gene transfer; alien index; lateral gene transfer
Horizontal gene transfer (HGT) is the transmission of genes between organisms by means other than direct (vertical) inheritance from parental lineages to their offspring. HGT is prevalent in prokaryotes [1], with substantial proportions of bacterial genes jumping horizontally, rather than being vertically inherited [2]. These horizontally acquired genes play important roles in bacteria, including the spread of antibiotic resistance and the emergence of pathogenicity [3–6]. Although HGT is much less prevalent in eukaryotes, and particularly multicellular eukaryotes, there are reported cases in the literature, including in Viridiplantae and Metazoa [7–12]. Some of the reported examples also suggest important associated roles in the recipient organism. This suggests that the genomic and biological impact of HGT could be more widespread in the tree of life than initially thought [13].
Competition at the plasmid level of selection. In Model (2.2) we assume that plasmid-bearing strains always carry drug resistance genes. This assumption does not have any impact in the case of the treatment-free model. However, although a significant proportion of plasmid-bearing strains is involved in drug resistance, a small proportion can carry ‘non-resistant plasmids’. To focus even more on the question of drug action with the model, it could be interesting to introduce additional bacterial strains with non-resistant plasmids but paying a cost of plasmid carriage, as in . This would allow us to consider competition between plasmidic strains carrying the drug resistance gene and other plasmidic strains that do not carry it. Indeed, competition at the plasmid level can be of great importance, since the spread of a resistant plasmid can be slowed or entirely stopped by a nonresistant version of the same plasmid. This issue will be rigorously investigated in a forthcoming work.
Many horizontal gene transfer events into Wolbachia are likely to be mediated by bacteriophage, which are known to transfer laterally between Wolbachia strains coinfecting the same host and are capable of transferring flanking non-phage genes in the process, thus facilitating horizontal gene transfer and genome diversification. To identify possible recent horizontal gene transfers into the wBol1-b genome, we used wBol1-b genes that had not been clustered with any other gene in the orthoMCL analysis (and were therefore putatively wBol1-b-specific) as blastp queries against the NR database. 26 of these genes had blastp matches to genes from other Wolbachia strains not included in the clustering analysis, and are thus components of the accessory genome but are not wBol1-b-specific. A total of 44 genes are present in wBol1-b but no other currently sequenced Wolbachia strain (Additional file 2: Table S4). Of these, 35 had no NR matches; these may be very rapidly evolving genes, genes in the late stages of degeneration, or the result of horizontal transfer from a genome not yet represented in the database. It is also possible that some, especially the shorter of these genes, could be artefacts of the annotation process. Finally, nine wBol1-b-specific genes lacked Wolbachia homologs but had high-quality matches to non-Wolbachia genes in the NR database. All but one of these genes are either within or adjacent to phage regions. We searched for degenerate or unannotated copies of these genes in the wPip genome and found no evidence of them, and it is likely that they represent recent phage-mediated horizontal gene transfers into the wBol1-b genome that occurred subsequent to divergence from wPip. These genes and their homologs are described below.
Besides cellulase genes, initial analysis of the genome of Pristionchus revealed the presence of Diapausin genes. Diapausins encode antifungal peptides specifically produced during diapause. These genes are absent from the genomes of other nematodes and are otherwise found in insects, including the beetles they live in association with. This finding could well represent the first example of a successful gene transfer between the genomes of two different animals. An alternative hypothesis is that a Diapausin gene was present in the last common ancestor of nematodes and insects and that those observed in extant genomes derive from this ancestral gene through vertical transmission. However, this would imply an improbably high number of independent gene losses to explain their absence from all the other currently sequenced nematode genomes and transcriptomes. Their presence in the only insect-associated nematode with a fully sequenced genome further argues in favor of an HGT event from insects. A recent detailed analysis of the gene content of Pristionchus pacificus revealed that a subset of the genes lacking homology to other nematodes had a peculiar codon usage distinct from that of the core genes conserved with other nematodes. 12
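To make the codon usage comparison concrete, the sketch below computes relative codon frequencies for a coding sequence and a simple distance between two usage profiles. The excerpt does not say which statistic the cited analysis used, so the L1 distance here is an assumption chosen for illustration, and both function names are hypothetical.

```python
from collections import Counter

def codon_frequencies(cds):
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}

def usage_distance(freq_a, freq_b):
    """L1 distance between two codon usage profiles (0 = identical, 2 = disjoint)."""
    codons = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(c, 0.0) - freq_b.get(c, 0.0)) for c in codons)
```

In practice one would pool codons over a whole gene set (for instance, candidate transferred genes versus core genes) before comparing profiles, since single short genes give noisy frequencies.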