Ecology and Evolution. 2019;1–20. www.ecolevol.org
DOI: 10.1002/ece3.5141
Received: 29 August 2018 | Revised: 12 March 2019 | Accepted: 15 March 2019
ORIGINAL RESEARCH
Development of a cost‐effective single nucleotide polymorphism genotyping array for management of greater yam germplasm collections
Fabien Cormier 1,2 | Pierre Mournet 2,3 | Sandrine Causse 2,3 | Gemma Arnau 1,2 | Erick Maledon 1,2 | Rose‐Marie Gomez 4 | Claudie Pavis 4 | Hâna Chair 2,3
1 CIRAD, UMR AGAP, Petit‐Bourg, France
2 CIRAD, INRA, Univ Montpellier, Montpellier SupAgro, Montpellier, France
3 CIRAD, UMR AGAP, Montpellier, France
4 INRA, UR ASTRO Agrosystèmes Tropicaux, Petit‐Bourg, France
Correspondence
Fabien Cormier, CIRAD, UMR AGAP, Petit‐Bourg, France.
Email: fabien.cormier@cirad.fr
Funding information
European Union and Guadeloupe Region
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
© 2019 The Authors. Ecology and Evolution published by John Wiley & Sons Ltd.
Abstract
Using genome‐wide single nucleotide polymorphism (SNP) discovery in greater yam (Dioscorea alata L.), 4,593 good quality SNPs were identified in 40 accessions. One hundred ninety‐two of these SNPs were selected to represent the overall dataset and used to design a competitive allele‐specific PCR array (KASPar). This array was validated on 141 accessions from the Tropical Plants Biological Resources Centre (CRB‐PT) and CIRAD collections that encompass worldwide D. alata diversity. Overall, 129 SNPs were successfully converted into cost‐effective genotyping tools. The results showed that the ploidy levels of accessions could be accurately estimated using this array. The rate of redundant accessions within the collections was high, in agreement with the low genetic diversity of D. alata and its diversification by somatic clone selection. The overall diversity resulting from these 129 polymorphic SNPs was consistent with the findings of previously published studies. This KASPar array will be useful for collection management and ploidy level inference, while complementing accurate agro‐morphological descriptions.
1 | INTRODUCTION
Greater yam (Dioscorea alata L.) is one of the major cultivated yam species (Dioscorea spp.) and the most widely distributed across tropical and subtropical regions. The high importance of D. alata for food security has prompted the establishment of several international and national ex situ collections. Due to the limited shelf‐life of stored tubers, yam genetic resources are conserved in vitro and/or in the field. All of these repeated manipulations are time‐consuming and may affect long‐term conservation. Quality control of genotype purity and general collection management is mainly based on morphological descriptors (IPGRI/IITA, 1997; Mahalakshmi et al., 2007). However, these descriptors are not reliable enough to rationalize ex situ D. alata collections. Indeed, several studies have revealed that morphological variations are not necessarily linked to geographic origin or genetic lineage (Arnau et al., 2017; Lebot, Trilles, Noyer, & Modesto, 1998; Vandenbroucke et al., 2016). Complementary characterization tools are thus required for the conservation and dynamic management of ex situ collections related to germplasm exchange, the development of core collections or the identification of future parents for breeding programs. D. alata is also a polyploid species with ploidy levels of 2n = 2x, 3x, or 4x and a basic chromosome number of x = 20 (Arnau, Némorin, Maledon, & Abraham, 2009). Ploidy level detection is consequently a prerequisite for the identification of possible parents, as crosses between different ploidy levels can fail (Nemorin et al., 2013).
Molecular markers have been used to characterize D. alata diversity: random amplified polymorphic DNA (RAPD; Asemota, Ramser, Lopez‐Peralta, Weising, & Kahl, 1996), isoenzymes (Lebot et al., 1998), amplified fragment length polymorphism (AFLP; Malapa, Arnau, Noyer, & Lebot, 2005), simple sequence repeats (SSRs; Siqueira, Marconi, Bonatelli, Zucchi, & Veasey, 2011; Sartie, Asiedu, & Franco, 2012; Otoo, Anokye, Asare, & Telleh, 2015; Chaïr et al., 2016; Arnau et al., 2017), plastid sequences (Chaïr et al., 2016), and Diversity Arrays Technology (DArT; Vandenbroucke et al., 2016). These studies generated essential information on the diversity and representativity of the germplasm collections. However, these tools were not tailored for routine collection management: they were either poorly discriminating within D. alata or complex and not cost‐effective to use. In addition, the development of high‐throughput methods for genome‐wide variant detection, such as genotyping‐by‐sequencing (Davey et al., 2011), paired with cost‐effective SNP assays (Broccanello et al., 2018) such as KASPar, can lead to the development of appropriate markers for collection management. This approach has been successfully implemented in maize (Semagn et al., 2012), chickpea (Hiremath et al., 2012), Citrus (Garcia‐Lor, Ancillo, Navarro, & Ollitrault, 2013), pigeon pea (Saxena et al., 2014), and Brassica rapa (Su et al., 2018). Given the recent release of yam (Dioscorea spp.) genomic resources (Saski, Bhattacharjee, Scheffler, & Asiedu, 2015; Tamiru et al., 2017), the design of such markers for D. alata collection management would be worthwhile. Indeed, once developed, they do not require any specific bioinformatics or wet chemistry skills. The results contain few erroneous and missing data and can be easily analyzed and interpreted.
The main objectives of this study were (a) to identify genome‐wide polymorphic SNP markers, (b) to develop a cost‐effective SNP genotyping array using KASPar technology, and (c) to test its use as a tool in managing yam ex situ collections.
2 | MATERIALS AND METHODS
2.1 | Materials
Based on a previous microsatellite marker study (Arnau et al., 2017), a set of 48 accessions representing worldwide D. alata diversity was selected and genotyped to identify polymorphic SNPs and design KASPar markers. Then, to validate these markers, 141 landraces from the Tropical Plants Biological Resources Centre (CRB‐PT) and CIRAD ex situ collections maintained in the French West Indies (Guadeloupe) were used.
2.2 | Genotyping‐by‐sequencing (GBS) and SNP discovery
SNP discovery was based on genotyping‐by‐sequencing (GBS). First, DNA was extracted from dried leaves of the 48 accessions as described by Risterucci et al. (2009). The genomic DNA quality was checked using agarose gel electrophoresis, and the quantity was estimated using a Nanodrop ND‐1000 spectrophotometer (Thermo Scientific, Wilmington, USA). For GBS, a genomic library was prepared using the PstI‐MseI restriction enzymes (New England Biolabs, Hitchin, UK) with a normalized DNA quantity of 200 ng per sample. The procedures published by Elshire et al. (2011) were adapted as described in Cormier et al. (2019).
Digestion and ligation reactions were conducted in the same plate. Digestion was conducted at 37°C for 2 hr, followed by 65°C for 20 min to inactivate the enzymes. The ligation reaction was achieved using T4 DNA ligase (New England Biolabs, Hitchin, UK) at 22°C for 1 hr, and the ligase was then inactivated, prior to sample pooling, by heating at 65°C for 20 min. Pooled samples were PCR‐amplified in a single tube. Single‐end sequencing was performed on a paired‐end lane of an Illumina HiSeq3000 (at the GeT‐PlaGe platform, Toulouse, France). The Tassel 5.2 pipeline (Glaubitz et al., 2014) was used for SNP and indel calling. Sequence tags were aligned to D. alata contigs (http://www.ebi.ac.uk/ena/data/view/PRJEB10904) using Bowtie2 v2.2.6 (Langmead & Salzberg, 2012). Accessions with more than 70% missing data were removed. VCF filtering was performed using VCFtools 0.1.14 (Danecek et al., 2011; options: --minDP 8, --maf 0.1, --max-missing 0.60, --max-alleles 2, --thin 64).
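As a hedged illustration, the filtering options listed above could be assembled into a VCFtools call as in the following Python sketch. The file names raw.vcf and filtered are hypothetical placeholders, and the --vcf, --recode and --out options are assumptions added to make the command complete; only the five filtering options come from the text. The sketch builds and prints the command without running it.

```python
import shlex
from typing import List


def vcftools_filter_args(vcf_in: str, out_prefix: str) -> List[str]:
    """Build a VCFtools command line with the filters reported in the text."""
    return [
        "vcftools",
        "--vcf", vcf_in,          # hypothetical input file
        "--minDP", "8",           # minimum depth per genotype
        "--maf", "0.1",           # minimum minor allele frequency
        "--max-missing", "0.60",  # minimum proportion of called genotypes per site
        "--max-alleles", "2",     # keep biallelic sites only
        "--thin", "64",           # minimum distance (bp) between retained sites
        "--recode",               # assumption: write the filtered VCF
        "--out", out_prefix,      # hypothetical output prefix
    ]


args = vcftools_filter_args("raw.vcf", "filtered")
command = shlex.join(args)
print(command)
```

In VCFtools, --max-missing 0.60 retains sites genotyped in at least 60% of individuals, so it is a call-rate threshold rather than a missing-rate threshold.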
2.3 | KASPar genotyping and allele calling
Polymorphic SNP flanking sequences (60 bp upstream and 60 bp downstream of the variant position) were extracted using SNiPlay3 (Dereeper et al., 2011). To assess their putative physical positions, these sequences were then aligned by BLAST (Basic Local Alignment Search Tool) against the D. rotundata reference genome (TDr96_F1 Pseudo_Chromosome: BDMI01000001–BDMI01000021; Tamiru et al., 2017). The physical position of each SNP was defined as that of the best hit of its flanking sequence, using a BLAST E‐value threshold of 1e−30. Finally, 192 SNPs were selected by forming 192 k‐means clusters based on their relative physical distance (Euclidean distance) and selecting the SNP nearest to the centroid of each cluster using R 3.4.0 (R Core Team, 2017).
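The selection step can be sketched as follows. This is a minimal Python illustration, not the R code used in the study: it assumes that SNP positions are expressed on a single one‐dimensional coordinate, uses a plain Lloyd's k‐means with a deterministic start, and the function names and toy positions are hypothetical.

```python
from typing import List


def kmeans_1d(positions: List[float], k: int, n_iter: int = 100) -> List[float]:
    """Lloyd's algorithm on 1-D coordinates; returns the k centroids."""
    pts = sorted(positions)
    # Deterministic start: centroids taken at evenly spaced ranks.
    centroids = [float(pts[(2 * i + 1) * len(pts) // (2 * k)]) for i in range(k)]
    for _ in range(n_iter):
        clusters: List[List[float]] = [[] for _ in range(k)]
        for p in pts:
            nearest = min(range(k), key=lambda c: abs(p - centroids[c]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def pick_representatives(positions: List[float], k: int) -> List[float]:
    """For each centroid, keep the SNP position closest to it.

    Fewer than k positions are returned if two centroids share a nearest SNP.
    """
    centroids = kmeans_1d(positions, k)
    return sorted({min(positions, key=lambda p: abs(p - c)) for c in centroids})


# Toy positions forming three obvious groups (hypothetical values).
snp_positions = [10, 12, 15, 200, 210, 220, 900, 905, 930]
chosen = pick_representatives(snp_positions, k=3)
print(chosen)
```

Because the representative of each cluster is the SNP nearest to the centroid, dense regions contribute more selected SNPs than sparse ones, which keeps the selection representative of the initial SNP distribution rather than evenly spaced.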
The 192 SNPs were converted into a KASPar assay at LGC Genomics (Middlesex, UK), where the primer design and wet chemistry were conducted on a validation panel of 141 landraces from the CRB‐PT and CIRAD ex situ collections. From raw fluorescence data, allele calling was performed using LGC Kluster Caller software by defining fluorescence clusters. Some accessions with known ploidy levels were used as references to identify fluorescence clusters and assess allelic dosage.
2.4 | Diversity analysis
To identify duplicate accessions and compare accessions with different ploidy levels, a dissimilarity matrix between all accession pairs was computed from the percentage of shared alleles, based on allele presence/absence.
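One possible way to compute such a presence/absence dissimilarity is sketched below in Python. The exact formula is not given in the text, so the definition used here (one minus the number of shared alleles divided by the number of distinct alleles observed in either accession, summed over SNPs) is an assumption, as are the accession names and genotypes.

```python
from typing import Dict, FrozenSet, List, Optional, Tuple

# Alleles present at one SNP; None stands for a missing call.
Genotype = Optional[FrozenSet[str]]


def dissimilarity(a: List[Genotype], b: List[Genotype]) -> float:
    """1 - (shared alleles / alleles seen in either accession), over all SNPs."""
    shared = total = 0
    for ga, gb in zip(a, b):
        if ga is None or gb is None:
            continue  # skip SNPs with a missing call in either accession
        shared += len(ga & gb)
        total += len(ga | gb)
    if total == 0:
        raise ValueError("no SNP called in both accessions")
    return 1.0 - shared / total


def dissimilarity_matrix(panel: Dict[str, List[Genotype]]) -> Dict[Tuple[str, str], float]:
    """Pairwise dissimilarities, keyed by sorted accession-name pairs."""
    names = sorted(panel)
    return {(x, y): dissimilarity(panel[x], panel[y])
            for i, x in enumerate(names) for y in names[i + 1:]}


# Hypothetical accessions genotyped at three SNPs.
panel = {
    "diploid_acc": [frozenset("A"), frozenset("AG"), frozenset("C")],
    "triploid_acc": [frozenset("A"), frozenset("AG"), frozenset("CT")],
}
matrix = dissimilarity_matrix(panel)
print(matrix)
```

Working on allele presence/absence ignores dosage, which is what makes accessions of different ploidy levels directly comparable.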
Then, to refine the kinship assessment, similarities between accessions with the same ploidy level were computed in the same way but using the allelic dosage. For diploid accessions, genotypes were coded as 0, 1, or 2, where the number represents the number of nonreference alleles; heterozygous genotypes assessed as polyploid‐like during allele calling were converted to 1. For triploid accessions, genotypes were coded as 0, 1, 2, or 3, and genotypes scored with a 1:1 allelic dosage during allele calling were converted to 1.5. For tetraploid accessions, genotypes were coded as 0, 1, 2, 3, or 4 and no correction was needed.
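The coding rules above can be written as a small lookup table. In this Python sketch the five call labels (REF, UNB_REF, BAL, UNB_ALT, ALT) are hypothetical names for the homozygous reference, reference‐rich unbalanced, balanced, nonreference‐rich unbalanced and homozygous nonreference fluorescence clusters; they are not identifiers from the genotyping software.

```python
from typing import Optional

# Number of nonreference alleles assigned to each call class, per ploidy level.
DOSAGE_CODES = {
    2: {"REF": 0, "UNB_REF": 1, "BAL": 1, "UNB_ALT": 1, "ALT": 2},
    3: {"REF": 0, "UNB_REF": 1, "BAL": 1.5, "UNB_ALT": 2, "ALT": 3},
    4: {"REF": 0, "UNB_REF": 1, "BAL": 2, "UNB_ALT": 3, "ALT": 4},
}


def recode(call: Optional[str], ploidy: int) -> Optional[float]:
    """Convert a call class into a dosage code; None (missing) is passed through."""
    if call is None:
        return None
    return float(DOSAGE_CODES[ploidy][call])


# Diploid: any heterozygous call, even a polyploid-like one, becomes 1.
print(recode("UNB_ALT", 2))
# Triploid: a balanced (1:1) call is impossible, so it is set halfway, at 1.5.
print(recode("BAL", 3))
# Tetraploid: calls are used as they are.
print(recode("UNB_ALT", 4))
```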
Diversity analysis was conducted in two steps. During the first step, groups of duplicate accessions (redundancy groups) were defined by grouping accessions having up to one allele mismatch. Then, in the second step, the diversity analysis focused on the similarity between those groups. Clustering based on allele frequencies within redundancy groups followed by a bootstrap approach (pvclust R package, ward.D2, 10,000 bootstrap replicates, AU threshold = 0.95; Suzuki & Shimodaira, 2006) was used to identify gene pools. A diversity network between redundancy groups was also drawn using significant kinship detected through genotype permutations (1,000 permutations), with a significance threshold of 0.05.
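The first step can be illustrated with a union‐find grouping. This Python sketch assumes single‐linkage grouping (two accessions end up in the same group if they are connected by a chain of pairs with at most one allele mismatch), which the text does not specify; the mismatch count on allele presence/absence, the accession names and the genotypes are also hypothetical.

```python
from typing import Dict, FrozenSet, List


def allele_mismatches(a: List[FrozenSet[str]], b: List[FrozenSet[str]]) -> int:
    """Number of alleles present in one accession but not in the other."""
    return sum(len(ga ^ gb) for ga, gb in zip(a, b))


def redundancy_groups(genotypes: Dict[str, List[FrozenSet[str]]],
                      max_mismatch: int = 1) -> List[List[str]]:
    """Group accessions linked by pairs with at most `max_mismatch` mismatches."""
    names = sorted(genotypes)
    parent = {n: n for n in names}

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for i, x in enumerate(names):
        for y in names[i + 1:]:
            if allele_mismatches(genotypes[x], genotypes[y]) <= max_mismatch:
                parent[find(y)] = find(x)

    groups: Dict[str, List[str]] = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Hypothetical accessions: acc1 and acc2 differ by a single allele.
genos = {
    "acc1": [frozenset("A"), frozenset("AG"), frozenset("C")],
    "acc2": [frozenset("A"), frozenset("AG"), frozenset("CT")],
    "acc3": [frozenset("G"), frozenset("G"), frozenset("T")],
}
groups = redundancy_groups(genos)
print(groups)
```

Missing calls are not handled in this sketch; in practice they would need to be excluded from the mismatch count.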
3 | RESULTS
3.1 | KASPar assay development and validation
Genotyping‐by‐sequencing (GBS) produced more than 344 million reads resulting in 521,918 sequence tags, of which 207,810 (39.82%) aligned exactly once on D. alata contigs. The remaining tags aligned at multiple locations (25.18%) or did not align to any contig (35%). From these sequence tags, SNP calling produced a raw vcf file of 158,695 SNPs. This raw vcf file was then filtered, resulting in a dataset of 40 accessions (Appendix A) and 4,593 good quality SNPs, of which 3,879 (84%) were mapped by BLAST on the D. rotundata reference genome. The KASPar assay was then developed by selecting 192 SNPs representative of the SNPs mapped along the D. rotundata reference sequence, and they were tested on 141 accessions.
Among the 192 SNPs, 26 (13%) failed as they did not produce any amplification signal. From the remaining 166 SNPs (87%), 129 SNPs (Appendix C) with less than 20% missing data and a minor allele frequency of over 5% were retained as high‐quality SNPs. This final dataset (129 SNPs × 141 accessions) contained an overall missing data rate of only 0.5%, with a maximum of 3% missing data per accession.
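The two quality thresholds can be expressed as a simple per‐SNP filter. In this Python sketch the minor allele frequency is computed from dosage codes and ploidy levels (sum of nonreference allele counts divided by the total number of allele copies); this is an assumption, since the text does not state how the frequency was computed in a panel of mixed ploidy, and the example genotypes are hypothetical.

```python
from typing import List, Optional


def snp_passes(dosages: List[Optional[float]], ploidies: List[int],
               max_missing: float = 0.20, min_maf: float = 0.05) -> bool:
    """Keep a SNP with less than 20% missing calls and a MAF above 5%."""
    called = [(d, p) for d, p in zip(dosages, ploidies) if d is not None]
    n_missing = len(dosages) - len(called)
    if n_missing >= max_missing * len(dosages):
        return False
    # Frequency of the nonreference allele among all called allele copies.
    alt_freq = sum(d for d, _ in called) / sum(p for _, p in called)
    maf = min(alt_freq, 1 - alt_freq)
    return maf > min_maf


ploidies = [2] * 10
polymorphic = [0, 1, 2, None, 1, 0, 0, 1, 2, 1]  # 10% missing, MAF about 0.44
monomorphic = [0] * 10                            # MAF = 0
sparse = [None] * 3 + [1] * 7                     # 30% missing

print(snp_passes(polymorphic, ploidies))
print(snp_passes(monomorphic, ploidies))
print(snp_passes(sparse, ploidies))
```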
The 129 validated KASPar SNPs were distributed on all linkage groups used to construct the D. rotundata reference genome (Figure 1). Their distribution was not homogeneous along chromosomes, as their positions were chosen to be representative of the initial set of 3,879 mapped SNPs rather than equally spaced.
3.2 | Assessment of ploidy levels
FIGURE 1 Location of KASPar SNPs on the D. rotundata reference genome (Tamiru et al., 2017). The 21 linkage groups are aligned from left to right. Black dots, failed or bad quality SNPs; red dots, the 129 validated SNPs
In our D. alata validation panel, three ploidy levels (2x, 3x and 4x) coexisted (Appendix B). Thus, the KASPar assay could theoretically produce a maximum of seven types of fluorescence signal (Table 1): two types corresponding to homozygous states (2:0 = 3:0 = 4:0; 0:2 = 0:3 = 0:4), one type corresponding to mixed and balanced allelic dosages (1:1 for diploids or 2:2 for tetraploids), and four types corresponding to the possible unbalanced allelic dosages at heterozygous loci (“polyploid‐like” in Table 1) of triploids and tetraploids (1:3; 1:2; 2:1; 3:1). In our case, due to insufficient fluorescence resolution, it was not possible to distinguish the fluorescence signal of the 1:3 tetraploid allelic dosage from that of the 1:2 triploid allelic dosage, or the 2:1 triploid allelic dosage from the 3:1 tetraploid allelic dosage. Consequently, a maximum of five types of fluorescence signal were identified. Overall, five, four, three, and two allelic dosages were detected for 64 (50%), 41 (32%), 19 (15%), and 5 (4%) SNPs, respectively, because some allelic dosages were not present in the validation panel or were confounded.
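The count of seven theoretical and five observed signal types can be reproduced with a short Python sketch. It assumes, as a simplification not stated in the text, that each fluorescence signal type is determined by the fraction of one allele in the genotype; the resolution of 1/10 is a hypothetical value chosen only to show how neighbouring fractions that are too close become indistinguishable.

```python
from fractions import Fraction
from typing import List, Sequence


def allele_fractions(ploidies: Sequence[int] = (2, 3, 4)) -> List[Fraction]:
    """Distinct fractions of one allele over all dosages of the given ploidies."""
    return sorted({Fraction(y, p) for p in ploidies for y in range(p + 1)})


def merge_close(fractions: List[Fraction], resolution: Fraction) -> List[List[Fraction]]:
    """Merge neighbouring signal types closer than `resolution`."""
    groups = [[fractions[0]]]
    for f in fractions[1:]:
        if f - groups[-1][-1] < resolution:
            groups[-1].append(f)
        else:
            groups.append([f])
    return groups


theoretical = allele_fractions()
observed = merge_close(theoretical, Fraction(1, 10))  # hypothetical resolution

print(len(theoretical), [str(f) for f in theoretical])
print(len(observed), [[str(f) for f in g] for g in observed])
```

Under this simplification, the fractions 1/4 and 1/3 differ by only 1/12, as do 2/3 and 3/4, which is consistent with the two pairs of unbalanced dosages that could not be separated.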
However, the overall quality of allele calls and allelic dosage assessment was good. Indeed, the ratio of genotypes scored as “polyploid‐like” to all heterozygous genotypes, computed per accession, was low for diploids (0.09 ± 0.05) and high for triploids (0.83 ± 0.05). In addition, the distributions of this ratio for the three ploidy levels barely overlapped (Figure 2).
We were thus not able to differentiate all allelic dosages from each other when looking at a single SNP. However, ploidy level could be deduced by taking the whole KASPar array into account and considering the proportion of genotypes scored as “polyploid‐like” per accession. This KASPar assay thus differentiated accession ploidy levels and allowed us to assign a ploidy level to 12 accessions of previously unknown ploidy: nine were set as diploid and three as triploid.
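The per‐accession ratio and a threshold‐based assignment can be sketched as follows in Python. The call labels (BAL for balanced, UNB_REF and UNB_ALT for unbalanced heterozygous calls) are hypothetical, and the two cut‐offs are illustrative values supplied by the user, not thresholds from this study; the sketch also assumes that tetraploids show intermediate ratios because they carry both balanced and unbalanced heterozygous genotypes.

```python
from typing import List, Optional

HETEROZYGOUS = ("BAL", "UNB_REF", "UNB_ALT")


def polyploid_like_ratio(calls: List[Optional[str]]) -> Optional[float]:
    """Share of unbalanced ("polyploid-like") calls among heterozygous calls."""
    het = [c for c in calls if c in HETEROZYGOUS]
    if not het:
        return None  # ratio undefined without heterozygous genotypes
    return sum(1 for c in het if c != "BAL") / len(het)


def guess_ploidy(ratio: float, diploid_max: float, triploid_min: float) -> int:
    """Assign 2x below `diploid_max`, 3x above `triploid_min`, 4x in between."""
    if ratio <= diploid_max:
        return 2
    if ratio >= triploid_min:
        return 3
    return 4


# Hypothetical accessions.
diploid_calls = ["REF", "ALT"] + ["BAL"] * 9 + ["UNB_ALT"]
triploid_calls = ["REF", None] + ["UNB_REF"] * 5 + ["BAL"]

r2 = polyploid_like_ratio(diploid_calls)
r3 = polyploid_like_ratio(triploid_calls)
# Illustrative cut-offs only; they would have to be calibrated on
# reference accessions of known ploidy.
print(r2, guess_ploidy(r2, diploid_max=0.3, triploid_min=0.7))
print(r3, guess_ploidy(r3, diploid_max=0.3, triploid_min=0.7))
```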
FIGURE 2 Distribution of the percentage of polyploid‐like genotypes (1:3, 1:2, 2:1, and 3:1 allelic dosage) among all heterozygous genotypes, by ploidy level (red, diploid; green, triploid; blue, tetraploid)
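TABLE 1 Summary of genotype, allelic composition and fluorescence signals
Type of genotype | Ploidy | Allelic dosage | Allelic composition | Type of fluorescence signal (Theo.) | Type of fluorescence signal (Obs.)
Diploid‐like | Diploid | 0:2 | X:X | 1 | 1
Diploid‐like | Diploid | 1:1 | X:Y | 4 | 3
Diploid‐like | Diploid | 2:0 | Y:Y | 7 | 5
Diploid‐like | Triploid | 0:3 | X:X:X | 1 | 1
Diploid‐like | Triploid | 3:0 | Y:Y:Y | 7 | 5
Diploid‐like | Tetraploid | 0:4 | X:X:X:X | 1 | 1
Diploid‐like | Tetraploid | 2:2 | X:X:Y:Y | 4 | 3
Diploid‐like | Tetraploid | 4:0 | Y:Y:Y:Y | 7 | 5
Polyploid‐like | Triploid | 1:2 | X:X:Y | 3 | 2
Polyploid‐like | Triploid | 2:1 | X:Y:Y | 5 | 4
Polyploid‐like | Tetraploid | 1:3 | X:X:X:Y | 2 | 2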
3.3 | Diversity analysis
Overall, 141 accessions from CRB‐PT and CIRAD ex situ collections in Guadeloupe were used to validate the KASPar assay (96 diploids, 36 triploids, and nine tetraploids including accessions with known and deduced ploidy level).
The presence and/or absence of alleles was used to assess the similarity between accessions and thus to identify duplicate accessions (Figure 3). By defining redundancy groups, we ended up with 43 nonredundant groups, each containing one to 24 accessions.
These groups of genetically similar accessions were partially expected based on the accession vernacular names. For example, the second biggest group (redundancy group 6, Appendix B) was composed of 18 accessions, five of which had a name related to “Saint Vincent.” The third biggest group contained 14 accessions, four of which had a name related to “Pacala.”
FIGURE 3 Dendrogram of dissimilarity between 141 D. alata accessions (red, diploid; green, triploid; blue, tetraploid)
The main group of redundant accessions was composed of 24 triploids collected at several distant locations (Caribbean islands, New Caledonia and Madagascar). This group comprised 67% (24/36) of the triploid accessions present in the CRB‐PT and CIRAD collections.
More generally, redundancy groups only consisted of accessions with the same ploidy level (Figure 4). Moreover, similarities within triploids or within tetraploids were higher than within diploids.
The diversity analysis was based on these 43 redundancy groups to avoid bias. After clustering, the bootstrap procedure detected five significant gene pools, referred to here as “clusters” and represented in the kinship network (Figure 5). Only one (cluster C, Figure 5) consisted of accessions from the three ploidy levels. This cluster encompassed accessions from the Caribbean and Pacific regions. Clusters A, B, and D contained triploids from the Caribbean and Madagascar, tetraploids from the Pacific, and diploids from the Caribbean, respectively (Figure 5, Appendix B). Cluster E was the largest, with 21 nonredundant diploid accessions originating from India, Nigeria, Côte d'Ivoire, the Caribbean and the Pacific (Figure 5, Appendix B).
Genotype permutations and network analysis gave a more detailed view of kinship between redundancy groups and clusters. This approach revealed a low number of significant links between diversity clusters D or E and the others (Figure 5), suggesting that these clusters could represent original gene pools.
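A permutation test of pairwise kinship could look like the following Python sketch. The permutation scheme is an assumption, since the text does not describe it: here the genotypes of each SNP are shuffled among accessions to build a null distribution of the similarity between two rows, and the similarity itself (fraction of SNPs with identical dosage codes) is a simplified stand‐in. The toy matrix is hypothetical.

```python
import random
from typing import List


def similarity(a: List[float], b: List[float]) -> float:
    """Fraction of SNPs at which two accessions carry the same dosage code."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def kinship_p_value(matrix: List[List[float]], i: int, j: int,
                    n_perm: int = 1000, seed: int = 0) -> float:
    """One-sided permutation p-value for the similarity between rows i and j."""
    rng = random.Random(seed)
    observed = similarity(matrix[i], matrix[j])
    columns = [list(col) for col in zip(*matrix)]  # one list per SNP
    hits = 0
    for _ in range(n_perm):
        for col in columns:
            rng.shuffle(col)  # permute genotypes among accessions at this SNP
        row_i = [col[i] for col in columns]
        row_j = [col[j] for col in columns]
        if similarity(row_i, row_j) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical panel: 6 accessions x 8 SNPs; rows 0 and 1 are identical,
# rows 0 and 2 differ at every SNP.
toy = [
    [0, 0, 1, 2, 0, 1, 2, 0],
    [0, 0, 1, 2, 0, 1, 2, 0],
    [1, 2, 2, 0, 1, 0, 0, 2],
    [1, 2, 2, 0, 1, 0, 0, 2],
    [2, 1, 0, 1, 2, 2, 1, 1],
    [2, 1, 0, 1, 2, 2, 1, 1],
]
p_identical = kinship_p_value(toy, 0, 1)
p_unrelated = kinship_p_value(toy, 0, 2)
print(p_identical < 0.05, p_unrelated)
```

Pairs with a p‐value below the 0.05 threshold would then be drawn as edges of the network.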
4 | DISCUSSION
4.1 | Assessment of allelic dosage and detection of ploidy levels
KASPar technology is based on competitive allele‐specific amplification followed by allele‐specific fluorescence assessment (Semagn, Babu, Hearne, & Olsen, 2014). Detection of allelic dosage in polyploid species is thus possible (Cuenca, Aleza, Navarro, & Ollitrault, 2013). However, several parameters, such as DNA quality or primer specificity, may influence the fluorescence and consequently the ability to discriminate fluorescence signals and allelic dosages. In our case, we were able to discriminate five types of fluorescence signal. At heterozygous loci, fluorescence signals are a mixture of the two allele‐specific fluorescences. Fluorescence signals should be balanced for diploids, which have a balanced allelic dosage (1:1) at heterozygous loci. Diploids should therefore theoretically have no genotypes assessed as “polyploid‐like.” Conversely, because a balanced allelic dosage is impossible for triploids, they should theoretically have only genotypes assessed as “polyploid‐like” at heterozygous loci. Our results showed that 91 ± 5% and 83 ± 5% of heterozygous genotypes were correctly called for diploids and triploids, respectively. With the recent explosion of genotyping based on next‐generation sequencing, bioinformatics tools have been developed to accurately determine dosages (e.g., GBS2ploidy; Gompert & Mock, 2017). However, this requires deep sequencing and usually an assumption about the ploidy levels present in the dataset (Bourke, Voorrips, Visser, & Maliepaard, 2018).
Application in collection management may nevertheless not require allelic dosage assessment at each locus. Our aim was thus to develop a tool for estimating ploidy levels and not variations in copy number. The results showed that the ploidy level of each accession can be accurately deduced from the percentage of “polyploid‐like” genotypes among all heterozygous genotypes. Given the overlap between the distributions of this ratio (Figure 2), the only risk is confusing triploids with tetraploids, which was estimated at 3%. Consequently, ploidy level assessment is possible and fairly accurate for D. alata using the KASPar assay developed in this study.
FIGURE 5 Network of kinship for the 43 D. alata redundancy groups based on significant similarity (p < 0.05, edge‐weighted spring‐embedded layout). Node shape and letter, cluster of diversity identified by a bootstrap procedure; red nodes, diploids; green nodes, triploids; blue nodes, tetraploids; edge colors, similarity from gray (0.64) to black (1)
4.2 | Identification of duplicate accessions
The dataset included 129 SNPs validated on 141 accessions corresponding to 43 unique redundancy groups. The reduction of the 141 accessions to 43 unique redundancy groups was related to the narrow genetic diversity of D. alata, above all in polyploid germplasm (i.e., triploids and tetraploids), already identified in previous studies. For example, using DArT markers, Vandenbroucke et al. (2016) revealed a low varietal richness: they studied 80 landraces from six different Vanuatu islands and differentiated only seven unique genotypes. Using isozyme markers, Lebot et al. (1998) studied 269 cultivars distributed worldwide and concluded that the genetic diversity of the most widespread cultivars was narrow.
Given the vernacular names of the accessions, redundant accessions were expected in our sample. Some of these redundancy groups contained accessions detected as duplicates although they could be differentiated by morphological characterization. For example, redundancy group five (including Lupias, Malalagi, or Malankon) exhibited diversity in tuber shape and tuber flesh color, in agreement with previous genetic diversity studies that already pooled these accessions together and highlighted this intragroup variability in tubers (Arnau et al., 2017; Malapa et al., 2005).
Morphological variability within a redundancy cluster mostly arises via D. alata clonal reproduction and farmers' selection of new morphotypes resulting from somatic mutations (Lebot et al., 1998; Malapa et al., 2005; Vandenbroucke et al., 2016). Small genetic or epigenetic variations are commonly selected to create new diversity in horticultural crops such as yam as reviewed by Krishna et al. (2016).
The ability of the KASPar assay developed in this study to differentiate duplicates in collections from genetically close accessions was related to (a) the low number of studied loci (129), (b) the D. alata diversification process (i.e., selection of somaclonal mutants), and (c) the presence of real duplicates within collections. This tool is thus efficient for attributing accessions to a genetic lineage (e.g., for germplasm exchange), but a good complementary agro‐morphological and ecophysiological characterization of collections should also be carried out to completely differentiate somaclonal mutant clones from duplicates (e.g., for the identification of promising parents for breeding programs).
4.3 | Diversity and collection management
The CRB‐PT collection has been shown to be representative of worldwide D. alata diversity (Arnau et al., 2017). Only a subset of this ex situ collection was genotyped in this study. However, all diversity groups identified by Arnau et al. (2017) were present (except one containing five very similar Indian accessions). Our validation panel was thus representative of worldwide D. alata diversity. Moreover, a good correlation was obtained between the findings of the previous study of worldwide D. alata diversity by Arnau et al. (2017) and the gene pools identified in this study (Appendix B). We can thus hypothesize that the 129‐SNP KASPar array developed for D. alata allows genetic diversity to be assessed accurately and that the findings may be transferable to other collections. Moreover, this genotyping tool is a robust method (a) to assess complementarity/redundancy between the different collections, (b) to identify underrepresented genetic groups, and (c) to plan future collecting missions to fill gaps in collections.
5 | CONCLUSION
This is the first SNP array designed for D. alata and validated on a subset of accessions representative of worldwide D. alata diversity. This tool will allow users to estimate accession ploidy levels and genetic lineages. The results showed a good correlation between the diversity assessed by this KASPar array and the findings of previous studies. This KASPar array is a robust and cost‐effective tool for diversity assessment and collection management. Given the importance of vegetative reproduction and somaclonal selection in D. alata, it is a good tool to complement agro‐morphological description in collections.
ACKNOWLEDGMENTS
This study was financially supported by the European Union and Guadeloupe Region (Programme Opérationnel FEDER— Guadeloupe—Conseil Régional 2014–2017). The authors would like to thank Suzia Gélabale, Marie‐Claire Gravillon, Jean‐Luc Irep, David Lange, and Elie Nudol for their involvement in CRB‐PT and CIRAD in vitro and field collections conservation. Finally, we are grateful to Patrick Ollitrault for his valuable discussion and to David Manley for English proofing.
CONFLICT OF INTEREST
The authors declare that they have no conflict of interest.
AUTHOR CONTRIBUTIONS
C.P., F.C., H.C., and P.M. designed the study. C.P., F.C., E.M., G.A., and R‐M.G. contributed to collecting materials and sample preparation. P.M. and S.C. developed the GBS protocol and carried out DNA extraction and GBS library preparation. H.C. and P.M. performed SNP discovery. F.C. and H.C. designed the KASPar assay and performed its analysis. C.P., F.C., and H.C. wrote the manuscript with input from all authors.
DATA ACCESSIBILITY
Plant materials may be requested from the CRB‐PT of Guadeloupe (http://intertrop.antilles.inra.fr/Portail/accessions/find/11). KASPar primer sequences are available in Appendix C.
REFERENCES
Arnau, G., Bhattacharjee, R., Mn, S., Chair, H., Malapa, R., Lebot, V., … Pavis, C. (2017). Understanding the genetic diversity and population structure of yam (Dioscorea alata L.) using microsatellite markers. PLoS One, 12(3), e0174150.
Arnau, G., Némorin, A., Maledon, E., & Abraham, K. (2009). Revision of ploidy status of Dioscorea alata L. (Dioscoreaceae) by cytogenetic and microsatellite segregation analysis. Theoretical and Applied Genetics, 118, 1239–1249. https://doi.org/10.1007/s00122-009-0977-6
Asemota, H. N., Ramser, J., Lopez‐Peralta, C., Weising, K., & Kahl, G. (1996). Genetic variation and cultivar identification of Jamaican yam germplasm by random amplified polymorphic DNA analysis. Euphytica, 92, 341–351. https://doi.org/10.1007/BF00037118
Bourke, P. M., Voorrips, R. E., Visser, R. G. F., & Maliepaard, C. (2018). Tools for genetic studies in experimental populations of polyploids. Frontiers in Plant Science, 9, 513. https://doi.org/10.3389/fpls.2018.00513
Broccanello, C., Chiodi, C., Funk, A., McGrath, J. M., Panella, L., & Stevanato, P. (2018). Comparison of three PCR‐based assays for SNP genotyping in plants. Plant Methods, 14, 28. https://doi.org/10.1186/s13007-018-0295-6
Chaïr, H., Sardos, J., Supply, A., Mournet, P., Malapa, R., & Lebot, V. (2016). Plastid phylogenetics of Oceania yams (Dioscorea spp., Dioscoreaceae) reveals natural interspecific hybridization of the greater yam (D. alata). Botanical Journal of the Linnean Society, 180, 319–333.
Cormier, F., Lawac, F., Maledon, E., Gravillon, M.‐C., Nudol, E., Mournet, P., … Arnau, G. (2019). A reference high‐density genetic map of greater yam (Dioscorea alata L.). Theoretical and Applied Genetics. https://doi.org/10.1007/s00122-019-03311-6
Cuenca, J., Aleza, P., Navarro, L., & Ollitrault, P. (2013). Assignment of SNP allelic configuration in polyploids using competitive allele‐specific PCR: Application to citrus triploid progeny. Annals of Botany, 111, 731–742. https://doi.org/10.1093/aob/mct032
Danecek, P., Auton, A., Abecasis, G., Albers, C. A., Banks, E., DePristo, M. A., … 1000 Genomes Project Analysis Group (2011). The variant call format and VCFtools. Bioinformatics, 27, 2156–2158. https://doi.org/10.1093/bioinformatics/btr330
Davey, J. W., Hohenlohe, P., Etter, P., Boone, J., Catchen, J., & Blaxter, M. (2011). Genome‐wide genetic marker discovery and genotyping using next‐generation sequencing. Nature Reviews Genetics, 12, 499–510. https://doi.org/10.1038/nrg3012
Dereeper, A., Nicolas, S., Lecunff, L., Bacilieri, R., Doligez, A., Peros, J. P., … This, P. (2011). SNiPlay: a web‐based tool for detection, management and analysis of SNPs. Application to grapevine diversity projects. BMC Bioinformatics, 12, 134.
Elshire, R. J., Glaubitz, J. C., Sun, Q., Poland, J. A., Kawamoto, K., Buckler, E. S., & Mitchell, S. E. (2011). A robust, simple genotyping‐by‐sequencing (GBS) approach for high diversity species. PLoS One, 6, e19379. https://doi.org/10.1371/journal.pone.0019379
Garcia‐Lor, A., Ancillo, G., Navarro, L., & Ollitrault, P. (2013). Citrus (Rutaceae) SNP markers based on Competitive Allele‐Specific PCR; transferability across the Aurantioideae subfamily. Applications in Plant Sciences, 1(4), apps.1200406. https://doi.org/10.3732/apps.1200406
Glaubitz, J. C., Casstevens, T. M., Lu, F., Harriman, J., Elshire, R. J., Sun, Q. I., & Buckler, E. S. (2014). TASSEL‐GBS: A high capacity genotyping by sequencing analysis pipeline. PLoS One, 9(2), e90346. https://doi.org/10.1371/journal.pone.0090346
Gompert, Z., & Mock, K. E. (2017). Detection of individual ploidy levels with genotyping‐by‐sequencing (GBS) analysis. Molecular Ecology Resources, 17, 1156–1167. https://doi.org/10.1111/1755-0998.12657
Hiremath, P. J., Kumar, A., Penmetsa, R. V., Farmer, A., Schlueter, J. A., Chamarthi, S. K., … Varshney, R. K. (2012). Large‐scale development of cost‐effective SNP marker assays for diversity assessment and genetic mapping in chickpea and comparative mapping in legumes. Plant Biotechnology Journal, 10, 716–732. https://doi.org/10.1111/j.1467-7652.2012.00710.x
IPGRI/IITA (1997). Descriptors for Yam (Dioscorea spp.). International Institute of Tropical Agriculture, Ibadan, Nigeria/International Plant Genetic Resources Institute, Rome, Italy.
Krishna, H., Alizadeh, M., Singh, D., Singh, U., Chauhan, N., Eftekhari, M., & Sadh, R. K. (2016). Somaclonal variations and their applica‐ tions in horticultural crops improvement. 3 Biotech, 6, 54. https://doi. org/10.1007/s13205‐016‐0389‐7
Langmead, B., & Salzberg, S. (2012). Fast gapped‐read alignment with Bowtie 2. Nature Methods, 9, 357–359. https://doi.org/10.1038/ nmeth.1923
Lebot, V., Trilles, B., Noyer, L. J., & Modesto, J. (1998). Genetic relation‐ ships between Dioscorea alata L. cultivars. Genetic Resources and Crop
Evolution, 45, 499–509.
Mahalakshmi, V., Ng, Q., Atalobor, J., Ogunsola, D., Lawson, M., & Ortiz, R. (2007). Development of a West African yam Dioscorea spp. core collection. Genetic Resources and Crop Evolution, 54, 1817–1825. https://doi.org/10.1007/s10722‐006‐9203‐4
Malapa, R., Arnau, G., Noyer, J. L., & Lebot, V. (2005). Genetic diversity of the greater yam (Dioscorea alata L.) and relatedness to D. nummularia Lam. and D. transversa Br. as revealed with AFLP markers. Genetic
Resources and Crop Evolution, 52, 919–929. https://doi.org/10.1007/
s10722‐003‐6122‐5
Nemorin, A., David, J., Maledon, E., Nudol, E., Dalon, J., & Arnau, G. (2013). Microsatellite and flow cytometry analysis to help under‐ stand the origin of Dioscorea alata polyploids. Annals of Botany, 112, 811–819. https://doi.org/10.1093/aob/mct145
Otoo, E., Anokye, M. L., Asare, P. A., & Telleh, J. P. (2015). Molecular categorization of some water yam (Dioscorea alata L.) germplasm in Ghana using microsatellites (SSR) markers. Journal of Agricultural
Science 7(10), 226–238.
R Core Team (2017). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from https://www.R‐project.org/
Risterucci, A.‐M., Hippolyte, I., Perrier, X., Xia, L., Caig, V., Evers, M., … Glaszmann, J.‐C. (2009). Development and assessment of diversity arrays technology for high‐throughput DNA analyses in Musa. Theoretical and Applied Genetics, 119, 1093–1103. https://doi.org/10.1007/s00122‐009‐1111‐5
Sartie, A., Asiedu, R., & Franco, J. (2012). Genetic and phenotypic diversity in a germplasm working collection of cultivated tropical yams (Dioscorea spp.). Genetic Resources and Crop Evolution, 59, 1753–1765. https://doi.org/10.1007/s10722‐012‐9797‐7
Saski, C. A., Bhattacharjee, R., Scheffler, B. E., & Asiedu, R. (2015). Genomic resources for water yam (Dioscorea alata L.): Analyses of EST‐sequences, de novo sequencing and GBS libraries. PLoS One, 10(7), e0134031.
Saxena, R. K., von Wettberg, E., Upadhyaya, H. D., Sanchez, V., Songok, S., Saxena, K., … Varshney, R. K. (2014). Genetic diversity and demographic history of Cajanus spp. illustrated from genome‐wide SNPs. PLoS One, 9, e88568.
Semagn, K., Babu, H., Hearne, S., & Olsen, M. (2014). Single nucleotide polymorphism genotyping using Kompetitive Allele Specific PCR (KASP): Overview of the technology and its application in crop improvement. Molecular Breeding, 33, 1–14. https://doi.org/10.1007/s11032‐013‐9917‐x
Semagn, K., Beyene, Y., Makumbi, D., Mugo, S., Prasanna, B. M., Magorokosho, C., & Atlin, G. (2012). Quality control genotyping for assessment of genetic identity and purity in diverse tropical maize inbred lines. Theoretical and Applied Genetics, 125, 1487–1501. https://doi.org/10.1007/s00122‐012‐1928‐1
Siqueira, M. V., Marconi, T. G., Bonatelli, M. L., Zucchi, M. I., & Veasey, E. A. (2011). New microsatellite loci for water yam (Dioscorea alata, Dioscoreaceae) and cross‐amplification for other Dioscorea species. American Journal of Botany, 98, 144–146.
Su, T., Li, P., Yang, J., Sui, G., Yu, Y., Zhang, D., … Zhang, F. (2018). Development of cost‐effective single nucleotide polymorphism marker assays for genetic diversity analysis in Brassica rapa. Molecular Breeding, 38, 42. https://doi.org/10.1007/s11032‐018‐0795‐0
Suzuki, R., & Shimodaira, H. (2006). Pvclust: An R package for assessing the uncertainty in hierarchical clustering. Bioinformatics, 22(12), 1540–1542. https://doi.org/10.1093/bioinformatics/btl117
Tamiru, M., Natsume, S., Takagi, H., White, B., Yaegashi, H., Shimizu, M., … Terauchi, R. (2017). Genome sequencing of the staple food crop white Guinea yam enables the development of a molecular marker for sex determination. BMC Biology, 15, 86. https://doi.org/10.1186/s12915‐017‐0419‐x
Vandenbroucke, H., Mournet, P., Vignes, H., Chaïr, H., Malapa, R., Duval, M. F., & Lebot, V. (2016). Somaclonal variants of taro (Colocasia esculenta Schott) and yam (Dioscorea alata L.) are incorporated into farmers’ varietal portfolios in Vanuatu. Genetic Resources and Crop Evolution, 63, 495–511. https://doi.org/10.1007/s10722‐015‐0267‐x
How to cite this article: Cormier F, Mournet P, Causse S, et al. Development of a cost‐effective single nucleotide polymorphism genotyping array for management of greater yam germplasm collections. Ecol Evol. 2019;00:1–20. https://doi.org/10.1002/ece3.5141
APPENDIX A
TABLE A1 Description of the 40 D. alata accessions used to detect polymorphic SNP
Collection Code Name Origin Ploidy
CRB‐PT PT‐IG‐00002 Pakutrany New Caledonia
PT‐IG‐00006 Fénakué Puerto Rico 2
PT‐IG‐00010 Divin 1 Guadeloupe 2
PT‐IG‐00020 DA 26 French Guyana 3
PT‐IG‐00338 HYB 30 Guadeloupe
PT‐IG‐00350 Pacala Guadeloupe 2
PT‐IG‐00029 Plimbite Haïti 2
PT‐IG‐00033 Pyramide Puerto Rico 2
PT‐IG‐00046 Sea 190 Puerto Rico 2
PT‐IG‐00053 Kokoéta New Caledonia 2
PT‐IG‐00686 Roujol 4
PT‐IG‐00687 INRA C 143
PT‐IG‐00688 INRA AL 56
PT‐IG‐00690 INRA AL 18
PT‐IG‐00692 INRA X 154 Guadeloupe
PT‐IG‐00693 INRA X 17 Guadeloupe
PT‐IG‐00694 Dou 4
PT‐IG‐00695 INRA X 142 Guadeloupe
PT‐IG‐00696 Ciradienne 4
PT‐IG‐00697 TiViolet 4
PT‐IG‐00698 Malalagi Vanuatu 2
PT‐IG‐00702 Manlankon Vanuatu 2
PT‐IG‐00689 Nureangdan Vanuatu 3
PT‐IG‐00077 Kinabayo Puerto Rico 2
PT‐IG‐00078 Toro Haïti 3
Cirad Vu 024a Tépuva Vanuatu 2
Vu 528a Tacharamivar 2
Vu 564a Mendrovar Vanuatu 2
Vu 567a Homb Vanuatu 2
Vu 754a Intejegan Vanuatu 4
Vu 231a Tagabé Vanuatu 4
Ovy taty Madagascar
Vu 247a n.a Vanuatu 2
Vu 401a Basa Vanuatu 2
Kabusa 2
74F 2
42F 2
61F 2
14M 2
H4x200 4
APPENDIX B
TABLE B1 Description of the 141 D. alata accessions used as the KASPar assay validation panel
Collection Code Ploidyᵃ Div. Clust.ᵇ Redund. Grpᶜ Accession name Origin SSRᵈ
PT‐IG‐00087 3 A 26 65 Martinique XII
PT‐IG‐00070 3 A 26 66 Martinique XII
PT‐IG‐00090 3 A 26 Caillade 1 Haïti XII
PT‐IG‐00020 3 A 26 DA 26 French Guyana XII
PT‐IG‐00037 3 A 26 DA 27 French Guyana XII
PT‐IG‐00022 3 A 26 De agua Puerto Rico XII
PT‐IG‐00061 3 A 26 Igname d eau Martinique XII
PT‐IG‐00550 3 A 26 Montpellier XII
PT‐IG‐00075 3 A 26 Renta Yam Jamaica XII
PT‐IG‐00072 3 A 26 Sassa 1 Martinique XII
PT‐IG‐00063 3 A 26 Sassa 2 Martinique
PT‐IG‐00088 3 A 26 St Martin Martinique XII
PT‐IG‐00034 3 A 26 Sweet yam Jamaica XII
PT‐IG‐00557 3 A 26 Tahiti couleuvre Guadeloupe XII
PT‐IG‐00068 3 A 26 Tahiti cultivé Guadeloupe XII
PT‐IG‐00069 3 A 26 Tahiti French Guadeloupe XII
PT‐IG‐00018 3 A 26 Tahiti messien Guadeloupe
PT‐IG‐00064 3 A 26 Tana New Caledonia XII
PT‐IG‐00021 3 A 26 Telemaque Martinique XII
PT‐IG‐00044 3 A 26 Ti Joseph 1 Haïti XII
PT‐IG‐00078 3 A 26 Toro Haïti XII
CT257_CIV 3 A 26 OvyTaty AmbalaKindresy‐Ambohimasoa Madagascar
CT258_CIV 3 A 26 OvyTaty Amboasary‐Ambohimasoa Madagascar
PT‐IG‐00685 3 A 26 Sainte Anne
PT‐IG‐00030 3 A 33 67 Martinique XII
PT‐IG‐00558 4 B 3 Wabé New Caledonia XVIII
Vu472a 4 B 3 Toufi Tetea Vanuatu XVIII
Vu231a 4 B 3 Vanuatu XVIII
Vu750a 4 B 3 Wanorak Vanuatu
Vu534a 4 B 3 Bisoro Vanuatu XVIII
Vu754a 4 B 30 Noulelcae Vanuatu XVI
Vu408a 4 B 31 Manioc Vanuatu
PT‐IG‐00039 2 C 2 Americano Dominican Republic VII
PT‐IG‐00023 2 C 2 Florido Puerto Rico
PT‐IG‐00553 2 C 2 Pro 1 VII
PT‐IG‐00095 2 C 2 SEA 144 Puerto Rico IV
PT‐IG‐00555 2 C 2 SRT 29 VII
PT‐IG‐00041 2 C 2 St Domingue Dominican Republic VII
Vu401a 2 C 2 Basa Vanuatu VII
CT256 2 C 2
PT‐IG‐00009 4 C 12 Nouméa New Caledonia XVI
Vu247a 2 C 14 Vanuatu
Vu528a 2 C 16 Sinoua Vanuatu
PT‐IG‐00025 3 C 22 Goana New Caledonia XIII
PT‐IG‐00002 3 C 22 Pakutrany New Caledonia XIII
Vu699a 3 C 22 Tumas Vanuatu
Vu461a 3 C 22 Tumas Vanuatu XIII
Vu755a 4 C 24 Nepelev Vanuatu
PT‐IG‐00014 2 C 37 Divin 2 Guadeloupe
PT‐IG‐00006 2 C 37 Fénakué Puerto Rico
PT‐IG‐00053 2 C 37 Kokoéta New Caledonia
PT‐IG‐00559 2 C 39 Wassa New Caledonia
PT‐IG‐00001 2 D 7 64 Martinique
PT‐IG‐00010 2 D 7 Divin 1 Guadeloupe
PT‐IG‐00568 2 D 25 77 Martinique IV
PT‐IG‐00092 2 D 34 Caplaou Puerto Rico
PT‐IG‐00561 2 D 42 H 23
PT‐IG‐00562 2 D 42 H 50
74F 2 E 4 India
PT‐IG‐00049 2 E 5 Cinq Puerto Rico III
PT‐IG‐00027 2 E 5 Lupias New Caledonia III
PT‐IG‐00046 2 E 5 Sea 190 Puerto Rico III
Vu590a 2 E 5 Vanuatu III
Vu423a 2 E 5 Manlankon Vanuatu III
Vu639a 2 E 5 Malalagi Vanuatu III
Vu024a 2 E 5 Ptris Vanuatu III
PT‐IG‐00065 2 E 6 DA 28 French Guyana IV
PT‐IG‐00093 2 E 6 DA 32
PT‐IG‐00395 2 E 6 Fafadro bis IV
PT‐IG‐00060 2 E 6 Grand Etang Guadeloupe IV
PT‐IG‐00051 2 E 6 Morado Cuba IV
PT‐IG‐00073 2 E 6 Purple Lisbon Puerto Rico IV
PT‐IG‐00333 2 E 6 Sainte Catherine Guadeloupe IV
PT‐IG‐00052 2 E 6 Smooth Statia Puerto Rico IV
PT‐IG‐00024 2 E 6 St Vincent blanc 1 Martinique IV
PT‐IG‐00036 2 E 6 St Vincent blanc 2 Martinique IV
PT‐IG‐00556 2 E 6 St Vincent mart. Guadeloupe IV
PT‐IG‐00045 2 E 6 St Vincent Violet Martinique IV
PT‐IG‐00016 2 E 6 St Vincent Yam St. Lucia IV
PT‐IG‐00374 2 E 6 Ti Joseph Haïti IV
PT‐IG‐00067 2 E 6 Wénéféla bis New Caledonia IV
Vu487a 2 E 6 Teroosi Vanuatu VI
770 2 E 6
PT‐IG‐00623 2 E 6
PT‐IG‐00396 2 E 8 A 24
PT‐IG‐00071 2 E 10 72 Martinique VIII
PT‐IG‐00055 2 E 10 76 Martinique VIII
PT‐IG‐00089 2 E 10 Asmhore
PT‐IG‐00058 2 E 10 Bété Bété Côte d'Ivoire VIII
PT‐IG‐00091 2 E 10 Campêche 2
PT‐IG‐00546 2 E 10 Jardin Haitien VIII
PT‐IG‐00547 2 E 10 Kourou 1 French Guyana VIII
PT‐IG‐00548 2 E 10 Kourou 2 French Guyana VIII
PT‐IG‐00350 2 E 10 Pacala Guadeloupe VIII
PT‐IG‐00551 2 E 10 Pacala cacao French Guyana VIII
PT‐IG‐00552 2 E 10 Pacala Guyane French Guyana VIII
PT‐IG‐00017 2 E 10 Pacala station Guadeloupe VIII
PT‐IG‐00554 2 E 10 SRT 24 VIII
19 2 E 10
PT‐IG‐00057 2 E 11 Vino Purple forme Puerto Rico
61F 2 E 15 India
PT‐IG‐00019 2 E 19 Gordito New Caledonia IX
PT‐IG‐00047 2 E 20 Buet New Caledonia
PT‐IG‐00029 2 E 20 Plimbite Haïti
PT‐IG‐00048 2 E 21 Bacala 1 Haïti
PT‐IG‐00413 2 E 21 St Vincent St. Vincent
Cuba6 2 E 23 Cuba
PT‐IG‐00542 2 E 27 AL 10 I
PT‐IG‐00042 2 E 27 Brazzo Fuerte Puerto Rico I
PT‐IG‐00038 2 E 27 Brésil 1 I
PT‐IG‐00564 2 E 27 KL 10 I
PT‐IG‐00565 2 E 27 KL 21
PT‐IG‐00566 2 E 27 KL 40 I
PT‐IG‐00054 2 E 27 MP1 16H56 I
PT‐IG‐00033 2 E 27 Pyramide Puerto Rico I
PT‐IG‐00074 2 E 28 Oriental Barbados II
14M 2 E 29 India
PT‐IG‐00077 2 E 32 Kinabayo Puerto Rico II
PT‐IG‐00085 2 E 35 St Sauveur Guadeloupe
PT‐IG‐00560 2 E 35 Yam jamaïque
PT‐IG‐00543 2 E 36 Cross lisbon
PT‐IG‐00392 2 E 38 A 13
PT‐IG‐00398 2 E 38 A 2
PT‐IG‐00563 2 E 40 Sc.c 1.1
PT‐IG‐00008 2 E 41 AIA 445 Nigeria
PT‐IG‐00015 2 E 43 Igname rouge Guadeloupe X
Vu703a 3 F 1 Nawanurunkimanga Vanuatu
PT‐IG‐00544 3 F 9 Cuello largo Puerto Rico XV
PT‐IG‐00026 3 F 9 Féo Puerto Rico XV
Vu696a 3 F 9 Nowateknempian Vanuatu XV
PT‐IG‐00076 3 F 13 Bélep New Caledonia XIV
Vu735a 3 F 13 Noplon Vanuatu XIV
Vu760a 3 F 13 Nureangdan Vanuatu XIV
PT‐IG‐00397 2 F 17 SEA 119, Toki
Vu613a 2 F 17 Peter Vanuatu VI
Vu589a 2 F 17 Makila Vanuatu XI
VU590a 2 F 18 Vanuatu III
Vu554a 2 F 18 Nourembor Vanuatu VI
Vu567a 2 F 18 Letsletsbolos Vanuatu IV
Vu564a 2 F 18 Makila Vanuatu VI
Vu026a 2 F 18 Dammasis Vanuatu VI
ᵃ In italic, ploidy inferred from the percentage of polyploid genotype types among all heterozygous loci.
ᵇ Diversity group from the diversity analysis.
ᶜ Similarity group used to select nonredundant accessions (genotypes in the same group have a maximum of one allele mismatch).
ᵈ Diversity cluster identified by SSR in Arnau et al. (2017).
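The redundancy groups in Table B1 are defined by a maximum of one allele mismatch between genotypes of the same group. The following Python sketch illustrates one way such groups could be formed; it is not the authors' procedure. The function names (`allele_mismatches`, `redundancy_groups`), the genotype encoding (one allele string per locus, `None` for a missing call), the greedy assignment order, and the toy accessions are all assumptions made for illustration.

```python
from collections import Counter


def allele_mismatches(geno_a, geno_b):
    """Count allele differences between two multilocus genotypes.

    Each genotype is a list of per-locus allele strings (e.g. "AG").
    Loci where either call is None (missing) are skipped.
    """
    total = 0
    for call_a, call_b in zip(geno_a, geno_b):
        if call_a is None or call_b is None:
            continue
        # Multiset difference: alleles of call_a left unmatched by call_b.
        total += sum((Counter(call_a) - Counter(call_b)).values())
    return total


def redundancy_groups(genotypes, max_mismatch=1):
    """Greedily assign accessions to groups (hypothetical procedure).

    An accession joins the first group in which it differs from every
    member by at most max_mismatch alleles; otherwise it starts a new group.
    """
    groups = []
    for name, geno in genotypes.items():
        for group in groups:
            if all(
                allele_mismatches(geno, genotypes[member]) <= max_mismatch
                for member in group
            ):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Toy data only; these are not accessions or genotypes from the study.
toy = {
    "acc1": ["AA", "AG", "TT"],
    "acc2": ["AA", "AG", "TT"],
    "acc3": ["AA", "GG", "TT"],
    "acc4": ["GG", "GG", "CC"],
}
groups = redundancy_groups(toy)
```

With this toy input, `acc1`, `acc2`, and `acc3` fall into one group because each pair differs by at most one allele, and `acc4` forms its own group. Because the assignment is greedy, the result can depend on the order of the accessions.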
APPENDIX C
TABLE C1 KASPar assay description for the 129 high‐quality SNPs: Number of fluorescence type detected, chromosome and position on D. rotundata reference genome assessed by BLAST (E‐value)
SNP_ID # Fluo. type Chr. Pos. E‐value Sequence
S1_78464789 4 1 1066677 8E−52 GTTTCCCAATGGTAACACTTTCTGCAAAGCCTGAAAGGCACTTGACTTGACATTGCCAAG[T/G]GCATTAGTTGCCACAGCCCCAATTCTAACTATAGCTGCAGCAGCAGCTAACGGTGAAGCT
S1_166177831 4 1 3002321 2E−40 CATCACAAGCGAAACAATGCAAGATCACTGCAGCGCTAAACAAGACGATGAAAACTGCTA[G/T]AACTGCCCACTCTTCCAAAGATAGACTGCAGCAAAACAAAAGCCGCTTGGATGATCACAC
S1_73817882 5 1 5214076 8E−52 TGATTCTTCTTCCTCTTCATCTGCAGACTTTTTGGATGATGCTACTTCTTCACTAAACAA[A/G]CAACCTCTTTATCAGATGTCTTCTATCAAGGCTGAACTTCCAATCAAGTTGTGTGATTTG
S1_308276216 4 1 22422993 2E−54 GAAGCTGATTGAGCTGCTTGATATCGATCTGCAGTGGAGGATGCATAAAGTTTCTGATGG[G/C]CAGCGTCGTCGTGTGCAAATTTGCATGGGACTTTTACAACCATACAAGGCAATGATTTTT
S3_3283493 4 2 6259005 3E−51 TCTCAAGTAACTATTATGGTAGTAACAGATGATGCAAATGTGAAGGCAAGATAAGAAATA[C/G]CATACCTCCCCATCTGCAGCACAAGTAACGAGGGTCCGATCATCTGTGTAAGGCATGAAT
S1_15620393 5 2 21404053 8E−52 GATCAACTGCAATGCCAATGGTTGGTGCAAGTTTCTTGGGAATACCTGCTGCCTGAAATG[T/G]AAAACCCGTACAATATGATACAAATAAGTGGAGTGCCTGTGCTGCAGCTGAGAATTGAGA
S1_194680124 3 2 25133441 8E−52 TTTTGACTGACAGCCTTTAGTGAACTGCAGGCTTACATGGAAAACCTCTTGACCTCGCTG[C/T]GAGGCTGGATATAGCAATTGATGTTGCTCATGCTATTACATACCTTCACATGTACACAGG
S1_149013038 4 2 29033873 1E−43 CTCTCAGGTATGAATGGATGGTGCCCAAATGATTTTGAAACGCCCACATGGGTTTGTTGA[A/T]TGTGTTATCTACGCAACCAAATTGTAATAAATGACTAAATGTGGTAAATCTTTTCCCTGC
S1_54908604 5 2 31946176 8E−52 AACTGACAAAATGGCAATGCAATGCCCTTTGTCACTGATCACAAAGAAGAGAAGACATAT[A/T]AGGGTATTTTTATGGAAAAACAAAGATGGTCCATCTTATTATTATTTTCTCCTGCAGGGG
S1_220358774 4 3 562364 8E−52 GGGATTGACAAAAGCACAATCATTTACATGCTGCAGATTCGGCAGATTTTGCTGCAGATG[G/T]TGCTCCACCATCATCATTGGCAAAGGGGTAGCCTGATTTCCATGGGACACTTGGAGAAAG
S1_294347489 4 3 3182257 5E−48 AGGAAGAGTATGTTCTCCATCAATTACATTCTCATTACGCAACTTCAATATATCCATCAG[A/G]AAAGGGTTATTCTGGTGAGCAATACAATACACATTTTCTGCAGCAGGAATAGAACATATG
S1_213496700 5 3 6608590 2E−53 TATATAAACATTCCATTTTGATGAGAATGAGAGACCATTGTTGCTAGCATCCCATTGACT[A/G]CCATATCTGCAGGGATCTGTATGGAAAAAGTGCATGCATGAAAGACAAATATAATAATAT
S1_123502462 5 3 12211409 2E−46 ATAGAGAAAAGACCTGCAGAAGCAGAAGCAACACGATCATCCTTGTTGACATCTCGCAAA[C/T]GAAGAGAAAGCTTTTTGGTGAAGTTTGAGTGAGAATTGTAGAAGTCCTCCATGGCCATGG
S2_13394502 4 3 12986600 1E−50 GTCTATTTAGCATGTCTTAGTTTCTTGGTGATGATGACTGCAGTTGAAGTCAAAATTTGA[G/T]GATCTCTCATCTGAACATCATCATGCTTGTGAAGAAATGAATAAATTGCAAGAAAAGCTG
S1_212477984 4 3 18808497 2E−53 TTGTAGTCGTAATCGAAATCGTATTCTTTGTGGAGTATTATTTTAGGTGGAAGATGTTGA[T/G]ATTTCTGAAAGAGTTCAGAGAGGATTAGAATCACCAGCCTACTGCAGTGGAAGATATGTA
S1_40517926 5 4 2442039 3E−50 TGGTTAATCGCAGATGGGGCTTGGAAAGACTCTGCAGGCGATGTCTTTGTTGAGCTATCT[G/A]AAGGTCAATGGCATCTCAACGGGGCCGTTCTGTGAGTCTTGGTTTATCTCCGATGAGCTT
S1_341690721 3 4 4172027 7E−46 TTGTGCATTCCTCCCTGCATCTCTTGGAACTGCAGCCTGCCCACTCCATCCTCCCATGCT[A/G]CTCTTGTGAACCATCTCAACCACTCTTCTTCTTTCTCTCTCTCTTTCTCTCTGTTGTCTC
S1_94822591 5 4 6518043 4E−37 TCTGATGATCGTGCTTCTCTCATCAAATGTTAGATGTTGTTCTAACTCTTCAAGCAATCA[G/A]GAACTTATTTGCTACTATGAGTTGTGACTTATTGTTGCTGGTCACTGGATACTGCAGGTC
S1_223059854 5 4 9590872 4E−49 GTTGCTGCCAGCAATCATGAGACCATTATGAGCTATGTTGATGGATGGTGAGGTTCGGGA[C/T]GTCTGTGGATAAGCTGTAACTGCAGGCAGGCTACCTACCATTGGAGAAAATTGGCCAGCT
S3_1940455 4 4 11602024 4E−49 ATGCCTGAATCTGGAGGACAAGCTACTGCAGTGTGTCAAGAAATAGACATACTTGAAGAG[A/C]ATTATGAATCAGAACAGTTCCAGGCCGGTAATGTTGAGTTCATGCTTTCTGTTCCTTTCT
S2_59089982 4 4 13022373 2E−47 CTCAATAATGGTGGACAAATGTGCCTTCTAATTCAGAAAAAAAATGTCTACTAATTACCT[C/G]TCCAGCTCTGTCATTTACTTGTTATAGGATGACATAATTGATGACTCTGCAGAGGATGAT
S1_29508975 5 4 13956140 8E−52 TAGTGAGCAGTTGAGATCATTAACGAGCATAGAAGAACTCACTGCAGAAGCCACAAAGCC[A/T]GAAGTCAATGGAGTTTCTATGGAACATAATGACGAGGAATTGGGAACACTATATTGCACT
S1_78099239 4 4 19009721 2E−33 CAGGTTTTCTTTTCAATTGCAAGAGACATCAAGCAAAGGCTTGCAGAAACCGATACCAAA[C/G]CTGAGGTAGTATGCTTATCATTTGTGATAAATTCAGTAAACCTGCAGGCAATTAGTAATG
S1_98182899 5 4 25059970 2E−47 TGTGCACAATGCTATGACCACACATATGGTTTCAAACAGACAAGGAACAGAATCAACAGA[T/C]ACACTACTTACAGTGACAAGCTCCGATATCAGTTGCGCAGACAACCCTGTTTTCTGCAGC
S1_13376874 3 4 25692581 2E−40 CTTGATATGTCTGTATTGGGTGTCATTTCTGCAGTATTATTTGGTCCGCAAAGGTAAATC[A/G]ATAGAAATTGGCGTCATTTAGCAATATCTAAACTGTATTTGAATTTGTAGTTCGAGCATT
S1_50863270 5 4 27382503 2E−54 GTGGCTCTCAAAGACTGAGGAAGTGCGTGAAGATAGGGCTCATTGGGGTACCAATATAAC[T/C]GGTGATATTTATGGTCAGGGTTGGATCAGTGAAATGTATGGATATTCATTTGGTGCTGCA
S1_30240813 5 5 4418552 4E−43 TCAGAGAGTGAGTCACATAAAAATAAGATTCATGTTGCATTGTGATGCCCCCTGATTCTT[A/C]TTACTTGCCCCCAAACGATAAGGACATCTTTTCTGAAAACTGCAGAGCCTAAGGAAATTA
S1_284257251 5 5 8809466 2E−41 AATTTTCTGATCTGAGTATTGGTCAACAAGAATCCAACATAAAACTCAAGTAAAATGCAG[T/C]AAATTACAATTGTTACATAATTGTTTCTCCACATAACTTGCTAATAATTATTTCTGCAGA
S1_172013713 2 5 15915339 5E−48 AACGCATGATACTCAATGTGTTGTTACTAATTGAATCTCAATTAATATGACCTGCAGTTG[C/T]TTGAATTTTCATGCTATGTTTGTAAGGCCTCTAGTGTTGCCAAAACCTCAGACATCTTCG
S2_46046547 5 5 18636385 5E−54 CACTGCAGGGGCTGCTTCCTTGATGATGAACCCAAAGAACTCTATTTCTCAAATAAAGCG[C/A]TTCATTGGAAAGAAATTCTCTGACCCGGAGCTTCAGTCTGACTTGCAGTTATTTCCTTTT
S1_161404508 4 5 19741536 5E−42 TGCTAATCAAAACTGATGCCTCTGCAGGGACCAAGTTGATCATTGAAAGAATCAGGGATT[A/C]TCACTTTCTATCACCTGTAGAGTACTGCATGTACATAAGTCACCATGAAAAGCACCGGGC
S1_86749745 5 5 20784276 1E−48 CCCTTCTGATATTTGCTTGGAGTTGAGATGTCTGTTGAACTTTAGCTGGAAATTTTACAG[T/C]CAACTCTATGAATTGTGTTTTCTGATTCATACGCACATTTGTGATTTTGTGCCTGCAGAG
S1_349430697 5 5 22888769 8E−52 TCCATCCTGGGAAGCACTGGCAATGGTTGATTTCGGTAGGCCAAGATTTGGAGCCCAAGC[G/A]ACATCTCTAACCCAATCGGAATGCATCTGCAGGGCAGGAAAGCAGTCCATTTTCCAGCTC
S3_43353096 5 5 24048351 8E−52 TGATGGGGGGAAATAACCCAAACTGGTGGAGTTCTATCAGCAACATGAAGCCAACTGCAG[G/C]AGATGAAACCTCTCTTCTTTATCCTTTTCCATCTTCTTCACCTCTCTTCCATCACTACTC
S1_51060666 3 5 26180154 1E−42 GAAGGAAGGCAGCAGCCTTTCAAACCTGCAGATGAAGTCGCCACGTCTTTTGAAAATTGC[A/T]AACCCAGCCAATTTTGCAGCTGCTAGTTTATAGGTAATGATAGATAGTTATCTAGGCTAT
S1_126396048 5 5 27405531 4E−49 CAACTCTGTCGTAGGAAAAGAAACTGACCCGTCCATGATCAACAAAACAGAATTTTAGAG[G/A]AATGCCAAGGCACTTCTTCTCAAGGTTTTCTAATTCTGTTCTTGAGGGTGGCTTGTCTCT
S1_189523417 3 5 30765836 4E−49 TTTAAAGCTTGTGACAGCGAGTTTGAAGACCTCTGCTGCAGGCATGCATCCAGCTTGCGC[A/G]GCAGGAACAGCTACACCGGCTGTTACTGCCGTGGCGCTTTCACTCAGTGGACTGAAAACA
S1_210690510 5 5 31577924 2E−53 ATCACCTGTCATTCTTAGCATACGCTGACAGCATCCAGCAATAAGCCATGATGCTGGGCA[A/T]GATTCCCAAGGACTGGATCGCTTGAAACTGCTCTTTACTCTCATCTACAAGCCCTGCAGA
S1_199164936 5 6 10295314 1E−37 ATTATTATTATTATTATTTCTTCTTCTGCAGTAACGGGTCACATTGCTTGGAAGAAGTTG[T/C]TGGAGCTTGAAACGCAAATAATGATACGAACACAGCCTCAGCAATGCACGATTACTCGCT
S1_282211588 5 6 17925427 2E−40 CACTTTGCACAATTATCTGCTGTAGAATGTTCTATTTGTTAAACCTGCAGATTAGGAAAT[C/T]CTAAATTCTATCTGCTGTTATGAAGTCCTGGTAGTATGTACAAGCAGGTTGATTATACAT
S1_116006917 4 6 21845952 8E−52 TTGTCAAGGAACGATCCCTTCACCTCCTCGGAGAAGAATCGCCCGAACACACGAGAGATG[G/C]ATATGGCCGGAACCTGCAGAGGAGAAAGCGAAGCCCTAACCCTGAGGTGCTTCAGCACAG
S1_45761963 5 6 27083504 1E−50 TTTTATCATGAACCGATCATCCTGAGACAGGTAGAAGAAGCTCCCACTCTTCCCAGGGGA[G/A]GATAATTCCCTCAAAGCATCACTTCCACAGATCGTCAACATATAATCTGCAGCATCAACC
S1_289563297 3 6 31210760 2E−53 AATGAACCATATCATAATCAACTAGATGTGAAAAAAGAATATTTGCACAACTGCAGGTGG[A/G]CAGGAAACCAAGGGGCTAAATAGACACACCTCATGACCTAGTTTCACACCCATCTCCTGT
S1_210284742 5 6 32022412 1E−50 AAAAGACCCAAGGAAATGACACAGCAGAACCATTGTCCCATTGGACATTTTCAACTACAT[T/G]CAAACTGCAGCATAAAAACCAAGATTTATATCACATATCCACACTAGTTCAATGAAACAA
S1_244041680 5 7 891269 4E−49 ATCAAAACATCGCTCTCTGCAGCCAAATCACAGACGTTAGAGAAATATTTATAGGCGAGT[G/A]ATGGCCTTGTTGTTCTAGAATGGTACAAGATTGTGCAGCCAAAGGCTTCGAGTCGTTTTG
S4_1831247 3 7 3180748 5E−48 GACAAACCAGAAATCTTTCCTTTCCATTAAGGAAGCAAATCCACCAAGGAGAACTGCAGT[T/A]GCCCAAATGAATCCGAGGGCTCCCAAACCACTGGCTGCCTTTTCAAGGATTGCCAAGCGA
S1_156520859 4 7 10367143 8E−52 TCTTTTACTGATATAAAGAGACTACCAGAATCCATTTGTATGTTGGTTAATCTGCAGACA[C/T]TGAAACTCTATTGTTGTTATAAACTTTCCGAGCTTCCCAAGAGCATAACATACATGAACA
S1_109907043 3 7 15658966 2E−53 AGGAGAAAAAATTCATGTGATGTCCTCCATATCTCAGCCTCGTCTCGGGTGGTCAAATGA[A/G]ACTGCAGCAAGTATTGGGACAATTGCAAGAATAGACATGGATGGCACTCTCAATGTGAGT
S1_365833705 3 7 17541018 1E−50 ACACATCTTCACCATTCAATCACTTTCATCCAACTGCAGCAACGTCTCAACAAGATCTCC[C/A]TGAGCTAGGTATCATCAATTTTCTACAAGCAATCTGCATTGGAAAGTGATCATGGACCGA
S1_215375978 5 7 17828247 4E−49 TGCTGTGCTCACGCCGATGGATACTGTGAAGCAGCGGCTGCAGCTTGAGAGTAGTCCGTA[C/T]AGAGGGGTGGGTGATTGTGTGAGGAGAGTGATGAGGGAAGAGGGGGTGCGTGCGTTTTAT
S1_5956960 5 8 1990861 5E−35 AGATAAGCACTTTGTATCTTGCTATTTTTGTTGCTCTTTATTATTGATGTGCAACAATGT[C/T]CCCAACAACCACACACACACACACACACACAATTTTGTATTTTTATGTTAGCTACTTCAT
S1_142832546 5 8 4073006 4E−49 AACTAACATGAATTTTGGCTCAATGATATAAGATTAACAACAAAAACGTTTTTGCTGCAG[G/A]GTTCTTGAACAAGTTTGATGAAATCACAAAATGGATATTGAAAGTTTGTAAGAATGTTAT
S1_102926938 3 8 5771893 3E−50 AATCTTTGATGACAAAGCTGCAGCTTCTTTTCATGCAAAACAATAAAAAGTATACCGGAT[C/T]TGATGTGATATGGGATGATCAGATCACTATACTGAAAATGAAACCTGTGCCAGCTTCTCT
S1_29231327 5 8 6476874 1E−48 TCCAGCAAATAGGTGGGGAACATCATACAACGGGCACGGAATGTTCATCGAAAATGCACC[C/T]GCTAACATCGTGGCCAAAGCTATGGAAAACATTGCAATAGAAAGTATAACCTGCAGCTGA
S3_54678463 5 8 7748403 4E−56 TCAAGAGCTTCAAGAAGAGGAAAGAAGGATAATAAGTGAAATATCAGAGTTAGAGTGTGG[G/A]AGACCTGCAGAAGAACAAAGAGTTTGTGTTACTGAAATGATGGATTGTGTTATTGATCCA
S1_208236889 2 8 9479207 4E−49 AGGAAGGAAAGAGAAAGAAGTTTCTGCTGCAGTCTCAGCCCCTTCTTCGAATTCTTCTTG[T/C]AGTTTATCAACAATCACTCAATCATACGGTGACAATGCCACTCTTCAAATAACTCAGCAT
S1_169356495 5 8 12133164 8E−52 ATACAGAAATTATCAGTGTAATATATTACAGAAGTAGAATGCTTCATCACCAGAATCTGA[T/A]TTTATATGAAAACACACTGACCTCTTGATGAAGAATTAGGCAAAACAGGGAGTCTGCAGA
S3_35309770 5 8 19352521 2E−47 CATTTCCAAATTTCAGAAAATAAATCGGTTTCCATAAGATTTGAGGTACAAATAGTTTCC[A/G]CAAAGGAGGTTTATCTGATACAACACTGCAGCTTGAATATGGTAAATAACTAGTCTCACA
S1_235419648 4 8 23125222 5E−48 GAACTTCAGAAATTGTTATACGCTGCAGATTGCCCAAAATGAAGCATTCATTAGACATAA[C/T]TGATCCCATAAATCAGCCCAGCTTCTTTTATGTTGTACATAAAAGTTCAATTAGCAAGAT
S2_30875426 3 8 27554786 1E−43 TTCATGTTTGGAAGATCTAATGTCAATTTAGATGTCATATGGTTTAGTTTTGTATTAGTA[C/G]TTTATGTTGTATTATTCCAATATAATCCAATATCATATCTGCAATTCTGCAGCAGGTCTT
S1_71285261 4 9 1039396 2E−53 AGAGAATGTCCGGATAAATCCTCAGAGAAGACTCCACCTTCGCAAGGTGCCCAGCTCGCA[C/A]GGCCATGAAGAACTCGAGCTCTGTGGGGTTACGGCTGCAGCTCAGGCCATGCCCCATCTT
S1_352413390 4 9 3168774 2E−53 CTTTCTTGGCAAACAGTTCTGCAGTAGATTTGAAGTCAGCTTCTTCTACAAGTCTACCAA[A/G]AGAGGATTCCAAGTCAGTGAATGGATATGATCAATTTGCATGACTGCTTGAAAAGTCGGG
S1_58454213 5 9 3570970 2E−53 TGTCCTGTGGCCTCATCGAGGAGCCATTTCTCTAGCATTGATAGAGGAGGGTTGTTATGG[C/T]TCTCAACTCTCTCCTTCCCTTCAGAATCACCATTGAAGATTGCTGATTCTGCAGATGATT
S2_9856110 5 9 5066064 1E−50 TGGAAAATTAGGTATCCCAGTTACCATGGAAATCGCTAGTGATCTGCTTGATAGGCAAGG[T/C]CCAATTTACAGAGAGGATACTGCCGTGTTCGTTAGCCAATCTGGAGAAACTGCAGATACC
S1_157448006 5 9 7773168 2E−46 TTTAACTTTTGAAAGGCTGCAGGGTATGAAATCACAGGCCCCGAGCCTGCTAATGTAGAT[C/T]ATGGTGAAGAAGCTGCATCTGAGGATGAAGAGGAATCTTATGATGGACATGATGCAGATG
S3_14699960 5 9 14771343 1E−43 AACTAAAAGCAAACCAAAAATAAATTCCTCTGCCGTTAAATAACCTGCAGAAAAAAATAG[C/A]GAAATGTGACAAGGAGAATATTTACATACCTTCGCCTCCATGACACTTCTTTCAGAGTTC
S2_58843160 5 9 17855377 5E−54 CAATTCATTTCAGGCATAATGTTATCAAGTAATGCATATTCTACCAGAAATGAACTTTAT[G/A]TGGAACATCATTCTTGACATTTGAAGAACTGCAGTTGATTACAAGTGAATTGCTTATAAC
S1_108505610 5 9 23138342 1E−48 GGAAAATATCCAAAAAGCAATAAAAGCTGCAATGGACGCAGCCAATGCCGCTGTTTCACA[A/G]TCGAAGGCATTCTGCCTAAGCCGCGTCGATGTGGGTCTAGACACCACTGCAGTTCGAGAA
S1_96409136 4 10 273696 9E−45 TGAAGTTTGATGATTCAATTTACCAATGATTTTCATGCACAGGAGTATATCTGAGAATAA[C/T]GAGCAAAATGATGCAGAGGTTACTTCAGCAAGAAATGCTGCAGAGAAGGATGTGACCAAG
S1_75907479 5 10 1295979 8E−52 AATGTCTTGACAGAAGCCATGAAAAAGCTCCACAAAATAAGTCCAAATTGAGATTGGAGA[G/A]CATACTCTCCACTGCAGCAAGTCTGTACCCTGTCTATGTGACTGCTGGAGGGGCTCTTGC
S3_17911820 3 10 2220748 8E−39 AGTAGTTTCTGAACTGGTACTTTGATCAATACCTGCAGAGTTAGTAGCAGTAGCAATAAG[C/T]GAGGAGACCTCAATATCTTGCACATGATCACTCACCGAAAATGGAACATTATCAGCATGA
S1_282032037 4 10 5002351 7E−40 GATTTACAGGACAATTACATTTCAGATTTCCATAATGATGGTAACTACAAGAATATTATT[C/A]TGTAAGCACCATGATACTTGTATCCATTACATGCATTGAATCAAAAGAACTGCAGTTTTA
S2_66969333 2 10 5976818 2E−52 TTTAACTGTATTGGCAGTGTCTGCAGACAGAGCTACGTCTAGCAAAGTGAGCAACTCATC[A/G]TCTGAAACAACTCCAATCTGTAAAACAGAGCAAGTCACAGAAAATTTATACCAAGATAAG