A draft genome sequence of the common, or spectacled caiman Caiman crocodilus

HAL Id: hal-03241670

https://hal.archives-ouvertes.fr/hal-03241670

Preprint submitted on 28 May 2021

Kenichi W. Okamoto, Nichole S. Dopkins, Elias S. Kinfu

Department of Biology

Abstract

The common, or spectacled, caiman Caiman crocodilus is an abundant, widely distributed Neotropical crocodilian exhibiting notable morphological and molecular diversification. The species also accounts by far for the largest share of crocodilian hides on the global market, with the C. crocodilus hide trade alone valued at about US$86.5 million per year. We obtained 239,911,946 paired-end reads comprising approximately 72 gigabases using Illumina sequencing of tissue sampled from a single Caiman crocodilus individual. These reads were assembled de novo and progressively aligned against the genomes of increasingly related crocodilians; liftoff was used to annotate the draft C. crocodilus genome assembly based on an annotation of Alligator mississippiensis (a confamilial species). The draft assembly and annotation are available at doi.org/10.5281/zenodo.4755063.

keywords: Caiman crocodilus, spectacled caiman, genome, assembly, next-generation sequencing, crocodilian, vertebrate genome

Introduction

The common, or spectacled, caiman Caiman crocodilus is one of the most widely distributed and abundant crocodilian species, ranging continuously from Mexico to Argentina (Busack and Pandya 2001; US Fish and Wildlife Service 2018). A generalist predator, C. crocodilus is remarkably adaptable, occupying a wide range of habitats from urban areas to seasonal savannahs to tropical rainforests (Medem 1981, 1983), and has recently been introduced to Cuba, Puerto Rico and Florida, where it is considered an invasive species (US Fish and Wildlife Service 2018). The broad distribution and diversity of habitats have facilitated considerable intraspecific diversification within C. crocodilus; a recent single-locus molecular analysis by Roberto et al. (2020) identified between seven and ten lineages within C. crocodilus across differing biogeographic regions and watersheds throughout Central and South America. Within-species diversity is also morphologically apparent, with skull shape in particular exhibiting systematic patterns of regional differentiation (Medem 1955; Gans 1980; Medem 1981, 1983; Ayarzagüena 1984; Escobedo-Galván et al. 2015). These intraspecific patterns of cranial shape variation within C. crocodilus have been shown to parallel patterns of interspecific cranial diversity found in extant crocodilians (Okamoto et al. 2015).

Additionally, C. crocodilus is a species of commercial importance, chiefly in the leather industry. While the hides of C. crocodilus contain osteoderms that render the manufacturing process more difficult than for other crocodilians, a majority of the approximately 1.5 million crocodilian skins traded globally come from C. crocodilus (Brazaitis et al. 1998; Caldwell 2015). As with other crocodilians, most legal hides come from commercial farming operations, and the market for caiman hides is estimated to be over US$85 million (Caldwell 2015). Wild populations of C. crocodilus are also hunted for meat and even fishing bait (Da Silveira and Thorbjarnarson 1999; Brum et al. 2015; Pimenta et al. 2018) and provide ecosystem services including nutrient cycling and biological control (Valencia-Aguilar et al. 2013; Marley et al. 2019). Due to its role as an apex predator, C. crocodilus exhibits considerable bioaccumulation, with genotoxic analyses demonstrating molecular signatures of pollution on the C. crocodilus genome (Oliveira et al. 2021).

Thus, a draft genome sequence for C. crocodilus can not only help provide insight into evolutionary processes driving intraspecific diversification, but can also assist with improved husbandry, ecotoxicology and wildlife management.

Methods

DNA was extracted from a tissue sample belonging to a single Caiman crocodilus museum specimen (UF-FLMNH 171438) using the DNeasy kit from Qiagen (Hilden, Germany). DNA was quantitated using Thermo Fisher's (Waltham, MA, USA) PicoGreen kit (for a final PicoGreen concentration of 77.78 ng/µL). Tecan's (Männedorf, Switzerland) NuGEN Celero kit was then used to construct a paired-end library, which was subsequently sequenced on a single Illumina (San Diego, CA, USA) NovaSeq S4 lane. This yielded 239,911,946 paired-end reads of 2x150 bp each. Nucleic acid isolation, quantitation, library generation and raw-read sequencing were performed at the University of Minnesota Genomics Center.
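As a quick arithmetic check on the sequencing yield, the sketch below multiplies the reported number of read pairs by the two 150 bp mates in each pair. The function and variable names are ours, chosen for illustration only, and the calculation assumes every mate is full length (no trimming).

```python
# Values reported for the C. crocodilus sequencing run.
READ_PAIRS = 239_911_946   # number of paired-end reads (pairs)
READ_LENGTH_BP = 150       # length of each mate (2x150 bp)
MATES_PER_PAIR = 2

def total_bases(read_pairs: int, read_length: int, mates: int = 2) -> int:
    """Total sequenced bases, assuming every mate is full length."""
    return read_pairs * mates * read_length

yield_bp = total_bases(READ_PAIRS, READ_LENGTH_BP, MATES_PER_PAIR)
print(f"{yield_bp:,} bases (about {yield_bp / 1e9:.0f} gigabases)")
```

The product is 71,973,583,800 bases, consistent with the roughly 72 gigabases stated in the abstract.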

The reads were assembled de novo using the Iterative de Bruijn Graph Assembler (IDBA-UD; Peng et al. 2012). To assess the reliability of our pipeline from sequencing to de novo assembly using IDBA-UD, we repeated the sequencing and assembly using a museum-derived tissue sample from a single Alligator mississippiensis individual (UF-FLMNH 175565). This resulted in 249,325,204 paired-end reads of 2x150 bp each. As was the case for the C. crocodilus individual, the reads were then assembled de novo using IDBA-UD, and we used QUAST (Gurevich et al. 2013) to determine that the IDBA-UD assembly of A. mississippiensis captured approximately 94.2% of a recently published A. mississippiensis assembly (GCA_000281125.4; Rice et al. 2017), with an N50 of 21,172 bp based on de novo assembled contigs alone.
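For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of one common definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. It is not the implementation used by QUAST or Proch::N50, whose tie-breaking conventions may differ, and the toy contig lengths are invented for illustration.

```python
from typing import Iterable

def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that pushes the cumulative
        # length to at least half of the assembly size.
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths supplied")

# Hypothetical contig lengths (total 20, half 10).
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))
```

In the toy example the two longest contigs (6 and 5) sum to 11, which exceeds half of the total length, so the N50 is 5.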

We scaffolded the resulting draft C. crocodilus contigs using a two-step procedure. First, we scaffolded the caiman's contigs against a Crocodylus porosus assembly (GCF_001723895.1; Ghosh et al. 2020) using ragtag (Alonge et al. 2019). We then re-scaffolded the resulting contigs/scaffolds against the confamilial Alligator mississippiensis assembly (GCA_000281125.4), again using ragtag. The draft assembly was then submitted to the National Center for Biotechnology Information (NCBI). Contaminants, mitochondrial DNA, vectors, adapters, and sequences shorter than 200 bp identified by NCBI were manually removed using seqkit (Shen et al. 2016) and custom scripts (available at http://github.com/kewok/ncbi_scrubber).
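The clean-up step described above can be illustrated with a short sketch that drops FASTA records whose identifiers have been flagged or whose sequences fall below a minimum length. This is only an assumption-laden illustration in plain Python: it is not the authors' ncbi_scrubber code and does not use seqkit, and the function names, the flagged identifier and the toy records are all hypothetical.

```python
from typing import Iterable, Iterator, Set, Tuple

def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def scrub(records: Iterable[Tuple[str, str]],
          flagged_ids: Set[str],
          min_length: int = 200) -> Iterator[Tuple[str, str]]:
    """Keep records that are unflagged and at least min_length long."""
    for header, seq in records:
        seq_id = header.split()[0] if header else ""
        if seq_id in flagged_ids or len(seq) < min_length:
            continue
        yield header, seq

# Hypothetical input: one good scaffold, one short contig, one flagged contig.
toy_fasta = [
    ">scaffold_1 long enough and not flagged",
    "A" * 250,
    ">contig_2 shorter than 200 bp",
    "ACGT" * 10,
    ">contig_3 flagged as a contaminant",
    "G" * 300,
]
kept = [h.split()[0] for h, _ in scrub(read_fasta(toy_fasta), {"contig_3"})]
print(kept)
```

Only scaffold_1 survives in the toy input, because contig_2 is 40 bp long and contig_3 is on the flagged list.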

The resulting scaffolded assembly (10.5281/zenodo.4755063) was then masked using RepeatMasker (Smit et al. 2015), relying on the HMMER database (Finn et al. 2011) and with "alligator" specified as the species. Finally, liftoff (Shumate and Salzberg 2020) was used to generate a draft annotation based on the masked assembly, using the annotations associated with A. mississippiensis (GCA_000281125.4; Rice et al. 2017) as a reference. table2asn gff (National Center for Biotechnology Information 2020) was then used to generate a Sequin file (National Center for Biotechnology Information (US) 2014), and features flagged as errors were manually removed using custom scripts (available at https://github.com/kewok/ncbi_scrubber); the draft annotation is available at 10.5281/zenodo.4755063.

Results

Our pipeline yielded a draft assembly of length 2,341,057,913 bp with 465,471 scaffolds and contigs, and an N50 of 70,464,410 bp (computed with Proch::N50; Telatin 2018). A total of 297,374 gene features were predicted.
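A feature tally of the kind reported above can be obtained by counting the entries in the third (feature type) column of a GFF3 annotation. The sketch below shows one way to do this; the function name and the toy annotation lines are hypothetical, and we do not know whether the authors' count refers only to features of type "gene" or to all annotated features, so the sketch reports every type separately.

```python
from collections import Counter
from typing import Iterable

def count_feature_types(gff_lines: Iterable[str]) -> Counter:
    """Count GFF3 records by feature type (column 3)."""
    counts: Counter = Counter()
    for line in gff_lines:
        # Skip blank lines, directives and comments.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:   # a GFF3 record has nine tab-separated columns
            continue
        counts[fields[2]] += 1
    return counts

# Hypothetical annotation lines for illustration.
toy_gff = [
    "##gff-version 3",
    "\t".join(["scaffold_1", "Liftoff", "gene", "100", "900", ".", "+", ".", "ID=gene1"]),
    "\t".join(["scaffold_1", "Liftoff", "mRNA", "100", "900", ".", "+", ".", "ID=mrna1;Parent=gene1"]),
    "\t".join(["scaffold_2", "Liftoff", "gene", "50", "400", ".", "-", ".", "ID=gene2"]),
]
counts = count_feature_types(toy_gff)
print(counts["gene"], sum(counts.values()))
```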

Conclusion

Here we have described the first draft assembly and annotation of the C. crocodilus genome. We believe this resource can assist natural resource management, agriculture and research into broader questions about the interplay between microevolutionary and macroevolutionary processes across broad biogeographic scales.

Acknowledgments

We are especially indebted to Dr. P. S. Soltis, T. A. Lott and the Herpetology Collection at the University of Florida - Florida Museum of Natural History (UF-FLMNH) for generously providing us with tissue samples. We would also like to thank Dr. A. Deshpande, D. Johnson, E. Froehling and the staff at the University of Minnesota Genomics Center (Minneapolis, MN, USA) for isolating DNA from museum samples, library preparation and raw sequencing. We wish to thank S. Landwehr and the Minnesota Supercomputing Institute (MSI) at the University of Minnesota and Dr. J. P. Layfield at the University of St. Thomas for allowing us to access critical computational resources. Finally, we are very grateful to Dr. S. Pirro at Iridian Genomes (Bethesda, MD, USA) for valuable insight on scaffolding the draft assemblies. This research was made possible by start-up funds to KWO from the University of St. Thomas.

References

Alonge, M., Soyk, S., Ramakrishnan, S., Wang, X., Goodwin, S., Sedlazeck, F. J., Lippman, Z. B., and Schatz, M. C. 2019. RaGOO: Fast and accurate reference-guided scaffolding of draft genomes. Genome Biology 20:1–17.

Ayarzagüena, J. 1984. Variaciones en la dieta de Caiman sclerops. La relacion entre morfologia bucal y dieta. Memoria De La Sociedad De Ciencias Naturales La Salle 44:123–140.

Brazaitis, P., Watanabe, M. E., and Amato, G. 1998. The Caiman Trade. Scientific American 278:70–76.

Brum, S. M., Da Silva, V. M., Rossoni, F., and Castello, L. 2015. Use of dolphins and caimans as bait for Calophysus macropterus (Lichtenstein, 1819) (Siluriforme: Pimelodidae) in the Amazon. Journal of Applied Ichthyology 31:675–680.

Busack, S. D. and Pandya, S. 2001. Geographic variation in Caiman crocodilus and Caiman yacare (Crocodylia: Alligatoridae): Systematic and legal implications. Herpetologica 57:294–312.

Caldwell, J. 2015. World Trade in Crocodilian Skins 2013-2015. Technical report, UN Environment Programme World Conservation Monitoring Centre.

Da Silveira, R. and Thorbjarnarson, J. B. 1999. Conservation implications of commercial hunting of black and spectacled caiman in the Mamiraua Sustainable Development Reserve, Brazil. Biological Conservation 88:103–109.

Escobedo-Galván, A. H., Velasco, J. A., González-Maya, J. F., and Resetar, A. 2015. Morphometric analysis of the Rio Apaporis Caiman (Reptilia, Crocodylia, Alligatoridae). Zootaxa 4059:541–54.

Finn, R. D., Clements, J., and Eddy, S. R. 2011. HMMER web server: Interactive sequence similarity searching. Nucleic Acids Research 39:W29–W37.

Gans, C. 1980. Allometric Changes in the Skull and Brain of Caiman crocodilus. Journal of Herpetology 14:297–301.

Ghosh, A., Johnson, M. G., Osmanski, A. B., Louha, S., Bayona-Vásquez, N. J., Glenn, T. C., Gongora, J., Green, R. E., Isberg, S., Stevens, R. D., and Ray, D. A. 2020. A High-Quality Reference Genome Assembly of the Saltwater Crocodile, Crocodylus porosus, Reveals Patterns of Selection in Crocodylidae. Genome Biology and Evolution 12:3635–3646.

Gurevich, A., Saveliev, V., Vyahhi, N., and Tesler, G. 2013. QUAST: Quality assessment tool for genome assemblies. Bioinformatics 29:1072–1075.

Marley, G., Lawrence, A. J., Phillip, D. A., and Hayden, B. 2019. Mangrove and mudflat food webs are segregated across four trophic levels, yet connected by highly mobile top predators. Marine Ecology Progress Series 632:13–25.

Medem, F. 1955. A new subspecies of Caiman sclerops from Colombia. Fieldiana: Zoology 37:339–343.

Medem, F. 1981. Los Crocodylia de Sur America Volumen I. Ministerio de Educación Nacional, Bogotá, Colombia.

Medem, F. 1983. Los Crocodylia de Sur America Volumen II. Ministerio de Educación Nacional, Bogotá, Colombia.

National Center for Biotechnology Information (US) 2014. Submitting Sequences using Specific NCBI Submission Tools, p. NBK566995. In The GenBank Submissions Handbook [Internet]. Bethesda, MD.

National Center for Biotechnology Information 2020. table2asn gff.

Okamoto, K. W., Langerhans, R. B., Rashid, R., and Amarasekare, P. 2015. Microevolutionary patterns in the common caiman predict macroevolutionary trends across extant crocodilians. Biological Journal of the Linnean Society, in press.

Oliveira, V. C. S., Viana, P. F., Gross, M. C., Feldberg, E., Da Silveira, R., de Bello Cioffi, M., Bertollo, L. A. C., and Schneider, C. H. 2021. Looking for genetic effects of polluted anthropized environments on Caiman crocodilus crocodilus (Reptilia, Crocodylia): A comparative genotoxic and chromosomal analysis. Ecotoxicology and Environmental Safety 209:111835.

Peng, Y., Leung, H. C., Yiu, S. M., and Chin, F. Y. 2012. IDBA-UD: A de novo assembler for single-cell and metagenomic sequencing data with highly uneven depth. Bioinformatics 28:1420–1428.

Pimenta, N. C., Barnett, A. A., Botero-Arias, R., and Marmontel, M. 2018. When predators become prey: Community-based monitoring of caiman and dolphin hunting for the catfish fishery and the broader implications on Amazonian human-natural systems. Biological Conservation 222:154–163.

Rice, E. S., Kohno, S., St John, J., Pham, S., Howard, J., Lareau, L. F., O’Connell, B. L., Hickey, G., Armstrong, J., Deran, A., Fiddes, I., Platt, R. N., Gresham, C., McCarthy, F., Kern, C., Haan, D., Phan, T., Schmidt, C., Sanford, J. R., Ray, D. A., Paten, B., Guillette, L. J., and Green, R. E. 2017. Improved genome assembly of American alligator genome reveals conserved architecture of estrogen signaling. Genome Research 27:686–696.

Roberto, I. J., Bittencourt, P. S., Muniz, F. L., Hernández-Rangel, S. M., Nóbrega, Y. C., Ávila, R. W., Souza, B. C., Alvarez, G., Miranda-Chumacero, G., Campos, Z., Farias, I. P., and Hrbek, T. 2020. Unexpected but unsurprising lineage diversity within the most widespread Neotropical crocodilian genus Caiman (Crocodylia, Alligatoridae). Systematics and Biodiversity 18:377–395.

Shen, W., Le, S., Li, Y., and Hu, F. 2016. SeqKit: A cross-platform and ultrafast toolkit for FASTA/Q file manipulation. PLoS ONE 11:e0163962.

Shumate, A. and Salzberg, S. L. 2020. Liftoff: accurate mapping of gene annotations. Bioinformatics, in press.

Smit, A., Hubley, R., and Green, P. 2015. RepeatMasker Open-4.0.

Telatin, A. 2018. Proch::N50.

US Fish and Wildlife Service 2018. Common Caiman (Caiman crocodilus) Ecological Risk Screening Summary. Technical report, US Fish and Wildlife Service.

Valencia-Aguilar, A., Cortés-Gómez, A. M., and Ruiz-Agudelo, C. A. 2013. Ecosystem services provided by amphibians and reptiles in Neotropical ecosystems. International Journal of Biodiversity Science, Ecosystem Services and Management 9:257–272.
