Development of a cost‐effective single nucleotide polymorphism genotyping array for management of greater yam germplasm collections


Ecology and Evolution. 2019;1–20. www.ecolevol.org


1 | INTRODUCTION

Greater yam (Dioscorea alata L.) is one of the major cultivated yam species (Dioscorea spp.) and the most widely distributed across tropical and subtropical regions. The importance of D. alata for food security has prompted the establishment of several international and national ex situ collections. Because stored tubers have a limited shelf‐life, yam genetic resources are conserved in vitro and/or in the field. The repeated manipulations that this requires are time‐consuming and may affect long‐term conservation. Quality control of genotype purity and general collection management are mainly based on morphological descriptors (IPGRI/IITA, 1997; Mahalakshmi et al., 2007). However, these descriptors are not reliable enough to rationalize ex situ D. alata collections. Indeed, several studies have revealed that morphological variations are not necessarily linked to geographic origin or genetic lineage (Arnau et al., 2017; Lebot, Trilles, Noyer, & Modesto, 1998; Vandenbroucke et al., 2016). Complementary characterization tools are thus required for the conservation and dynamic management of ex situ collections, including germplasm exchange, the development of core collections, and the identification of future parents for breeding programs. D. alata is also a polyploid species with ploidy levels of 2n = 2x, 3x, or 4x and a basic chromosome number of x = 20 (Arnau, Némorin,

Received: 29 August 2018 | Revised: 12 March 2019 | Accepted: 15 March 2019

DOI: 10.1002/ece3.5141

ORIGINAL RESEARCH


Fabien Cormier 1,2 | Pierre Mournet 2,3 | Sandrine Causse 2,3 | Gemma Arnau 1,2 | Erick Maledon 1,2 | Rose‐Marie Gomez 4 | Claudie Pavis 4 | Hâna Chair 2,3

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

1 CIRAD, UMR AGAP, Petit‐Bourg, France

2 CIRAD, INRA, Univ Montpellier, Montpellier SupAgro, Montpellier, France

3 CIRAD, UMR AGAP, Montpellier, France

4 INRA, UR ASTRO Agrosystèmes Tropicaux, Petit‐Bourg, France

Correspondence

Fabien Cormier, CIRAD, UMR AGAP, Petit‐Bourg, France.

Email: [email protected]

Funding information

European Union and Guadeloupe Region

Abstract

Using genome‐wide single nucleotide polymorphism (SNP) discovery in greater yam (Dioscorea alata L.), 4,593 good quality SNPs were identified in 40 accessions. One hundred ninety‐two of these SNPs were selected to represent the overall dataset and used to design a competitive allele‐specific PCR array (KASPar). This array was validated on 141 accessions from the Tropical Plants Biological Resources Centre (CRB‐PT) and CIRAD collections that encompass worldwide D. alata diversity. Overall, 129 SNPs were successfully converted into cost‐effective genotyping tools. The results showed that the ploidy levels of accessions could be accurately estimated using this array. The rate of redundant accessions within the collections was high, in agreement with the low genetic diversity of D. alata and its diversification by somatic clone selection. The overall diversity resulting from these 129 polymorphic SNPs was consistent with the findings of previously published studies. This KASPar array will be useful for collection management and ploidy level inference, while complementing accurate agro‐morphological descriptions.



Maledon, & Abraham, 2009). Ploidy level detection is consequently a prerequisite for identifying possible parents, as crosses between different ploidy levels can fail (Nemorin et al., 2013).

Molecular markers have been used to characterize D. alata diversity: random amplified polymorphic DNA (RAPD; Asemota, Ramser, Lopez‐Peralta, Weising, & Kahl, 1996), isoenzymes (Lebot et al., 1998), amplified fragment length polymorphism (AFLP; Malapa, Arnau, Noyer, & Lebot, 2005), simple sequence repeats (SSRs; Siqueira, Marconi, Bonatelli, Zucchi, & Veasey, 2011; Sartie, Asiedu, & Franco, 2012; Otoo, Anokye, Asare, & Telleh, 2015; Chaïr et al., 2016; Arnau et al., 2017), plastid sequences (Chaïr et al., 2016), and Diversity Arrays Technology (DArT; Vandenbroucke et al., 2016). These studies generated essential information on the diversity and representativeness of the germplasm collections. However, these tools were not tailored for routine collection management: they were either poorly discriminating within D. alata, or complex and not cost‐effective to use. In contrast, high‐throughput methods for genome‐wide variant detection, such as genotyping‐by‐sequencing (Davey et al., 2011), paired with a cost‐effective SNP assay such as KASPar (Broccanello et al., 2018), can lead to the development of appropriate markers for collection management. This approach has been successfully implemented in maize (Semagn et al., 2012), chickpea (Hiremath et al., 2012), Citrus (Garcia‐Lor, Ancillo, Navarro, & Ollitrault, 2013), pigeon pea (Saxena et al., 2014), and Brassica rapa (Su et al., 2018). Given the recent release of yam (Dioscorea spp.) genomic resources (Saski, Bhattacharjee, Scheffler, & Asiedu, 2015; Tamiru et al., 2017), designing such markers for D. alata collection management is worthwhile. Indeed, once developed, they do not require any specific bioinformatics or wet chemistry skills, and the results contain few erroneous or missing data and can be easily analyzed and interpreted.

The main objectives of this study were (a) to identify genome‐wide polymorphic SNP markers, (b) to develop a cost‐effective SNP genotyping array using KASPar technology, and (c) to test its use as a tool for managing yam ex situ collections.

2 | MATERIALS AND METHODS

2.1 | Materials

Based on a previous microsatellite marker study (Arnau et al., 2017), a set of 48 accessions representing worldwide D. alata diversity was selected and genotyped to identify polymorphic SNPs and design KASPar markers. Then, to validate these markers, 141 landraces from the Tropical Plants Biological Resources Centre (CRB‐PT) and CIRAD ex situ collections maintained in the French West Indies (Guadeloupe) were used.

2.2 | Genotyping‐by‐sequencing (GBS) and SNP discovery

SNP discovery was based on genotyping‐by‐sequencing (GBS). First, DNA was extracted from dried leaves of the 48 accessions as described by Risterucci et al. (2009). Genomic DNA quality was checked using agarose gel electrophoresis, and the quantity was estimated using a Nanodrop ND‐1000 spectrophotometer (Thermo Scientific, Wilmington, USA). For GBS, a genomic library was prepared using the PstI‐MseI restriction enzymes (New England Biolabs, Hitchin, UK) with a normalized DNA quantity of 200 ng per sample. The procedures published by Elshire et al. (2011) were adapted as described in Cormier et al. (2019).

Digestion and ligation reactions were conducted in the same plate. Digestion was conducted at 37°C for 2 hr, followed by 20 min at 65°C to inactivate the enzymes. Ligation was performed using T4 DNA ligase (New England Biolabs, Hitchin, UK) at 22°C for 1 hr, and the ligase was then inactivated, prior to sample pooling, by heating at 65°C for 20 min. Pooled samples were PCR‐amplified in a single tube. Single‐end sequencing was performed on a single lane of an Illumina HiSeq3000 (at the GeT‐PlaGe platform, Toulouse, France). The Tassel 5.2 pipeline (Glaubitz et al., 2014) was used for SNP and indel calling. Sequence tags were aligned to D. alata contigs (http://www.ebi.ac.uk/ena/data/view/PRJEB10904) using Bowtie2 v2.2.6 (Langmead & Salzberg, 2012). Accessions with more than 70% missing data were removed. VCF filtering was performed using VCFtools 0.1.14 (Danecek et al., 2011; options: --minDP 8, --maf 0.1, --max-missing 0.60, --max-alleles 2, --thin 64).
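As an illustration only, the depth, missing‐data, and minor allele frequency filters described above can be sketched in Python. This is not VCFtools code: the genotype representation, the function name `keep_site`, and the choice to treat calls below the depth threshold as missing before computing the call rate are assumptions made for this sketch. The thinning and biallelic checks are not reproduced.

```python
from typing import List, Optional, Tuple

# One genotype call: (number of alternate alleles, read depth); None = missing call.
Call = Optional[Tuple[int, int]]


def keep_site(calls: List[Call], ploidy: int = 2, min_dp: int = 8,
              min_maf: float = 0.1, min_call_rate: float = 0.60) -> bool:
    """Return True if a biallelic site passes the depth, call-rate and MAF filters."""
    # Assumption of this sketch: calls below the depth threshold count as missing.
    kept = [call[0] for call in calls if call is not None and call[1] >= min_dp]
    if not kept or len(kept) / len(calls) < min_call_rate:
        return False
    alt_freq = sum(kept) / (ploidy * len(kept))
    maf = min(alt_freq, 1.0 - alt_freq)
    return maf >= min_maf
```

Here a call rate threshold of 0.60 means that at least 60% of the accessions must carry a usable call at the site.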

2.3 | KASPar genotyping and allele calling

Polymorphic SNP flanking sequences (60 bp upstream and 60 bp downstream of the variant position) were extracted using SNiPlay3 (Dereeper et al., 2011). To assess their putative physical positions, these sequences were aligned with BLAST (Basic Local Alignment Search Tool) against the D. rotundata reference genome (TDr96_F1 Pseudo_Chromosome: BDMI01000001–BDMI01000021; Tamiru et al., 2017). The physical position of each SNP was defined as that of the best hit of its flanking sequence, using an E‐value threshold of 1e−30. Finally, 192 SNPs were selected by forming 192 k‐means clusters based on their relative physical distance (Euclidean distance) and selecting the SNP nearest to the centroid of each cluster, using R 3.4.0 (R Core Team, 2017).
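The marker selection step can be sketched as follows. The authors worked in R; this Python version is only an assumption about how such a selection might look. It uses a plain Lloyd's k‐means on a single coordinate per SNP (a simplification of positions spread over several pseudo‐chromosomes), a deterministic initialization, and hypothetical function names.

```python
from typing import List


def assign(points: List[float], centroids: List[float]) -> List[List[float]]:
    """Assign each point to its nearest centroid (Euclidean distance in 1-D)."""
    clusters: List[List[float]] = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda j: abs(p - centroids[j]))
        clusters[nearest].append(p)
    return clusters


def kmeans_representatives(positions: List[float], k: int,
                           max_iter: int = 100) -> List[float]:
    """Cluster positions with k-means and return the position nearest each centroid."""
    pts = sorted(positions)
    if not 0 < k <= len(pts):
        raise ValueError("k must be between 1 and the number of positions")
    # Deterministic start (an assumption): centroids at evenly spaced ranks.
    centroids = [float(pts[(2 * i + 1) * len(pts) // (2 * k)]) for i in range(k)]
    clusters = assign(pts, centroids)
    for _ in range(max_iter):
        # Empty clusters keep their previous centroid.
        updated = [sum(c) / len(c) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
        clusters = assign(pts, centroids)
    # One representative per non-empty cluster: the member closest to the centroid.
    return [min(c, key=lambda p: abs(p - centroids[j]))
            for j, c in enumerate(clusters) if c]
```

Picking the SNP nearest to each centroid, rather than the centroid itself, guarantees that every selected position corresponds to a real marker.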

The 192 SNPs were converted into a KASPar assay at LGC Genomics (Middlesex, UK), where primer design and wet chemistry were conducted on a validation panel of 141 landraces from the CRB‐PT and CIRAD ex situ collections. From raw fluorescence data, allele calling was performed with the LGC Kluster Caller software by defining fluorescence clusters. Some accessions of known ploidy level were used as references to identify fluorescence clusters and assess allelic dosage.

2.4 | Diversity analysis

To identify duplicate accessions and compare accessions with different ploidy levels, a dissimilarity matrix between all accession pairs was computed from the percentage of shared alleles, based on allele presence/absence.
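A minimal sketch of such a presence/absence comparison is given below. The exact formula is not stated in the text, so the Jaccard‐style ratio used here (alleles present in both accessions over alleles present in either, summed across SNPs) is an assumption, as are the type alias and the function name.

```python
from typing import FrozenSet, Optional, Sequence

# Alleles present at one SNP for one accession; None = missing call.
Alleles = Optional[FrozenSet[str]]


def allele_sharing_dissimilarity(a: Sequence[Alleles], b: Sequence[Alleles]) -> float:
    """Return 1 minus the proportion of shared alleles, ignoring allelic dosage."""
    shared = total = 0
    for ga, gb in zip(a, b):
        if ga is None or gb is None:
            continue  # skip SNPs that are missing in either accession
        shared += len(ga & gb)
        total += len(ga | gb)
    if total == 0:
        raise ValueError("no SNP genotyped in both accessions")
    return 1.0 - shared / total
```

Because only the presence of each allele is used, a diploid X:Y genotype and a triploid X:X:Y genotype are treated as identical, which is what allows accessions of different ploidy levels to be compared.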


Then, to refine the kinship assessment, similarities between accessions with the same ploidy level were computed in the same way but using the allelic dosage. For diploid accessions, genotypes were coded as 0, 1, or 2, where the number represents the number of nonreference alleles; heterozygous genotypes scored as polyploid during allele calling were converted to 1. For triploid accessions, genotypes were coded as 0, 1, 2, or 3, and genotypes scored with a 1:1 allelic dosage during allele calling were converted to 1.5. For tetraploid accessions, genotypes were coded as 0, 1, 2, 3, or 4 and no correction was needed.
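The coding rules above can be written as a small lookup table. The cluster labels (`hom_ref`, `ref_heavy`, `balanced`, `alt_heavy`, `hom_alt`) are hypothetical names for the five observed fluorescence clusters and do not come from the allele‐calling software; the mapping itself is one reading of the rules stated in the text.

```python
from typing import Dict

# Number of nonreference alleles assigned to each fluorescence cluster, by ploidy.
DOSAGE: Dict[int, Dict[str, float]] = {
    # Diploids: unbalanced ("polyploid-like") heterozygous calls are corrected to 1.
    2: {"hom_ref": 0, "ref_heavy": 1, "balanced": 1, "alt_heavy": 1, "hom_alt": 2},
    # Triploids: a balanced (1:1) call is impossible and is corrected to 1.5.
    3: {"hom_ref": 0, "ref_heavy": 1, "balanced": 1.5, "alt_heavy": 2, "hom_alt": 3},
    # Tetraploids: every cluster maps to a valid dosage, so no correction is needed.
    4: {"hom_ref": 0, "ref_heavy": 1, "balanced": 2, "alt_heavy": 3, "hom_alt": 4},
}


def code_dosage(cluster: str, ploidy: int) -> float:
    """Convert a fluorescence cluster label into a nonreference allele count."""
    return float(DOSAGE[ploidy][cluster])
```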

Diversity analysis was conducted in two steps. In the first step, groups of duplicate accessions (redundancy groups) were defined by grouping accessions having up to one allele mismatch. In the second step, the diversity analysis focused on the similarity between those groups. Clustering based on allele frequencies within redundancy groups, followed by a bootstrap approach (pvclust R package, ward.D2, 10,000 bootstrap replicates, AU threshold = 0.95; Suzuki & Shimodaira, 2006), was used to identify gene pools. A diversity network between redundancy groups was also drawn using significant kinship detected through genotype permutations (1,000 permutations), with a significance threshold of 0.05.
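The first step, grouping accessions that differ by at most one allele, can be sketched with a union‐find structure. The text does not say how chains of near‐identical accessions were handled, so the transitive (single‐linkage) grouping below is an assumption, as are the function names and the presence/absence profile format.

```python
from typing import Dict, FrozenSet, List, Sequence

Profile = Sequence[FrozenSet[str]]  # alleles present at each SNP


def allele_mismatches(a: Profile, b: Profile) -> int:
    """Count alleles present in one accession but not the other, over all SNPs."""
    return sum(len(ga ^ gb) for ga, gb in zip(a, b))


def redundancy_groups(profiles: Dict[str, Profile],
                      max_mismatch: int = 1) -> List[List[str]]:
    """Group accessions linked by pairs with at most max_mismatch allele mismatches."""
    names = list(profiles)
    parent = {n: n for n in names}

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path compression
            n = parent[n]
        return n

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if allele_mismatches(profiles[a], profiles[b]) <= max_mismatch:
                parent[find(b)] = find(a)

    groups: Dict[str, List[str]] = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())
```

With single linkage, two accessions that differ by two alleles can still end up in the same group if a third accession is within one mismatch of both; a stricter, complete‐linkage definition would split such chains.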

3 | RESULTS

3.1 | KASPar assay development and validation

Genotyping‐by‐sequencing (GBS) produced more than 344 million reads, resulting in 521,918 sequence tags, of which 207,810 (39.82%) aligned exactly once to D. alata contigs. The remaining tags aligned at multiple locations (25.18%) or did not align to any contig (35%). From these sequence tags, SNP calling produced a raw VCF file of 158,695 SNPs. Filtering this raw VCF file resulted in a dataset of 40 accessions (Appendix A) and 4,593 good‐quality SNPs, of which 3,879 (84%) were mapped by BLAST to the D. rotundata reference genome. The KASPar assay was then developed by selecting 192 SNPs representative of the SNPs mapped along the D. rotundata reference sequence, and these were tested on 141 accessions.

Among the 192 SNPs, 26 (13%) failed as they did not produce any amplification signal. Of the remaining 166 SNPs (87%), 129 (Appendix C) with less than 20% missing data and a minor allele frequency above 5% were retained as high‐quality SNPs. This final dataset (129 SNPs × 141 accessions) had an overall missing data rate of only 0.5%, with a maximum of 3% missing data per accession.

The 129 validated KASPar SNPs were distributed across all linkage groups used to construct the D. rotundata reference genome (Figure 1). Their distribution along chromosomes was not homogeneous because their positions were chosen to be representative of the initial set of 3,879 mapped SNPs rather than equally spaced.

3.2 | Assessment of ploidy levels

FIGURE 1 Location of KASPar SNPs on the D. rotundata reference genome (Tamiru et al., 2017). The 21 linkage groups are aligned from left to right. Black dots, failed or bad quality SNPs; red dots, the 129 validated SNPs

In our D. alata validation panel, three ploidy levels (2x, 3x, and 4x) coexisted (Appendix B). Thus, the KASPar assay could theoretically produce a maximum of seven types of fluorescence signal (Table 1): two types corresponding to homozygous states (2:0 = 3:0 = 4:0; 0:2 = 0:3 = 0:4), one type corresponding to balanced allelic dosages (1:1 for diploids or 2:2 for tetraploids), and four types corresponding to the possible unbalanced allelic dosages at heterozygous loci of triploids and tetraploids (1:3, 1:2, 2:1, and 3:1; “polyploid‐like” in Table 1). In our case, due to insufficient fluorescence resolution, it was not possible to distinguish the fluorescence signal of the 1:3 tetraploid allelic dosage from that of the 1:2 triploid allelic dosage, or that of the 2:1 triploid allelic dosage from that of the 3:1 tetraploid allelic dosage. Consequently, a maximum of five types of fluorescence signal were identified. Overall, five, four, three, and two allelic dosages were detected for 64 (50%), 41 (32%), 19 (15%), and 5 (4%) SNPs, respectively, because some allelic dosages were not present in the validation panel or were confounded.
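The counts of seven theoretical and five observed signal types can be checked with a short enumeration. This sketch assumes that a fluorescence signal is determined only by the fraction of one allele in the genotype; the function names and class labels are hypothetical.

```python
from fractions import Fraction
from typing import List, Tuple


def theoretical_signals(ploidies: Tuple[int, ...] = (2, 3, 4)) -> List[Fraction]:
    """Distinct allele fractions over all possible dosages of the given ploidy levels."""
    return sorted({Fraction(n, p) for p in ploidies for n in range(p + 1)})


def observed_class(fraction: Fraction) -> str:
    """Merge the unbalanced dosages that could not be separated by fluorescence."""
    half = Fraction(1, 2)
    if fraction == 0:
        return "homozygous_first_allele"
    if fraction == 1:
        return "homozygous_second_allele"
    if fraction == half:
        return "balanced"
    # 1/4 and 1/3 fall in one class, 2/3 and 3/4 in the other.
    return "unbalanced_low" if fraction < half else "unbalanced_high"
```

The seven fractions are 0, 1/4, 1/3, 1/2, 2/3, 3/4, and 1; merging 1/4 with 1/3 and 2/3 with 3/4 leaves five classes.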

However, the overall quality of allele calls and allelic dosage assessment was good. Indeed, the ratio of genotypes scored as “polyploid‐like” to all heterozygous genotypes per accession was low for diploids (0.09 ± 0.05) and high for triploids (0.83 ± 0.05). In addition, the distributions of this ratio for the three ploidy levels barely overlapped (Figure 2).

We were thus not able to differentiate all allelic dosages from each other at the level of a single SNP. However, ploidy level could be deduced by taking the whole KASPar array into account and considering the proportion of genotypes scored as “polyploid‐like” per accession. This KASPar assay thus differentiated accession ploidy levels and allowed us to assign a ploidy level to 12 accessions of previously unknown ploidy: nine were classified as diploid and three as triploid.
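A minimal sketch of this inference is shown below. The text does not describe the decision rule used to assign ploidy, so assigning an accession to the ploidy level whose reference accessions have the closest mean ratio is an assumption of this example, as are the cluster labels and function names.

```python
from typing import Dict, Iterable, List

HET_CLUSTERS = {"ref_heavy", "balanced", "alt_heavy"}   # hypothetical cluster labels
POLYPLOID_LIKE = {"ref_heavy", "alt_heavy"}             # unbalanced heterozygous calls


def polyploid_like_ratio(calls: Iterable[str]) -> float:
    """Share of heterozygous genotypes scored as polyploid-like for one accession."""
    het = [c for c in calls if c in HET_CLUSTERS]
    if not het:
        raise ValueError("no heterozygous genotype for this accession")
    return sum(c in POLYPLOID_LIKE for c in het) / len(het)


def infer_ploidy(ratio: float, reference_ratios: Dict[int, List[float]]) -> int:
    """Return the ploidy whose reference accessions have the closest mean ratio."""
    means = {p: sum(r) / len(r) for p, r in reference_ratios.items()}
    return min(means, key=lambda p: abs(ratio - means[p]))
```

The reference ratios would come from accessions of known ploidy, in the same spirit as the reference accessions used during allele calling.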

FIGURE 2 Distribution of the percentage of polyploid‐like genotypes (1:3, 1:2, 2:1, and 3:1 allelic dosage) among all heterozygous genotypes, by ploidy level (red, diploid; green, triploid; blue, tetraploid)

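TABLE 1 Summary of genotype, allelic composition and fluorescence signals

Type of genotype | Ploidy | Allelic dosage | Allelic composition | Fluorescence signal type (Theo.) | Fluorescence signal type (Obs.)
Diploid‐like | Diploid | 0:2 | X:X | 1 | 1
Diploid‐like | Diploid | 1:1 | X:Y | 4 | 3
Diploid‐like | Diploid | 2:0 | Y:Y | 7 | 5
Diploid‐like | Triploid | 0:3 | X:X:X | 1 | 1
Diploid‐like | Triploid | 3:0 | Y:Y:Y | 7 | 5
Diploid‐like | Tetraploid | 0:4 | X:X:X:X | 1 | 1
Diploid‐like | Tetraploid | 2:2 | X:X:Y:Y | 4 | 3
Diploid‐like | Tetraploid | 4:0 | Y:Y:Y:Y | 7 | 5
Polyploid‐like | Triploid | 1:2 | X:X:Y | 3 | 2
Polyploid‐like | Triploid | 2:1 | X:Y:Y | 5 | 4
Polyploid‐like | Tetraploid | 1:3 | X:X:X:Y | 2 | 2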

3.3 | Diversity analysis

Overall, 141 accessions from CRB‐PT and CIRAD ex situ collections in Guadeloupe were used to validate the KASPar assay (96 diploids, 36 triploids, and nine tetraploids including accessions with known and deduced ploidy level).

The presence and/or absence of alleles was used to assess the similarity between accessions and thus to identify duplicate accessions (Figure 3). By defining redundancy groups, we obtained 43 nonredundant groups, each containing one to 24 accessions.

These groups of genetically similar accessions were partially expected based on the accessions' vernacular names. For example, the second biggest group (redundancy group 6, Appendix B) was composed of 18 accessions, five of which had a name related to “Saint Vincent.” The third biggest group contained 14 accessions, four of which had a name related to “Pacala.”

FIGURE 3 Dendrogram of dissimilarity between 141 D. alata accessions (red, diploid; green, triploid; blue, tetraploid)


The main group of redundant accessions was composed of 24 triploids collected at several distant locations (Caribbean islands, New Caledonia and Madagascar). This group consisted of 67% (24/36) of the triploid accessions present in the CRB‐PT and CIRAD collections.

More generally, redundancy groups only consisted of accessions with the same ploidy level (Figure 4). Moreover, similarities within triploids or within tetraploids were higher than within diploids.

The diversity analysis was based on these 43 redundancy groups to avoid bias. After clustering, the bootstrap procedure detected five significant gene pools, named “clusters” here, represented in the kinship network (Figure 5). Only one (cluster C, Figure 5) consisted of accessions from the three ploidy levels. This cluster encompassed accessions from the Caribbean and Pacific regions. Clusters A, B, and D contained triploids from the Caribbean and Madagascar, tetraploids from the Pacific, and diploids from the Caribbean, respectively (Figure 5, Appendix B). Cluster E was the biggest one, with 21 nonredundant diploid accessions originating from India, Nigeria, Côte d'Ivoire, the Caribbean, and the Pacific (Figure 5, Appendix B).

Genotype permutations and network analysis gave a more detailed view of kinship between redundancy groups and clusters. This approach revealed a low number of significant links between diversity clusters D or E and the others (Figure 5), suggesting that these clusters could consist of original gene pools.

4 | DISCUSSION

4.1 | Assessment of allelic dosage and detection of ploidy levels

KASPar technology is based on competitive allele‐specific amplification followed by allele‐specific fluorescence assessment (Semagn, Babu, Hearne, & Olsen, 2014). Detection of allelic dosage in polyploid species is thus possible (Cuenca, Aleza, Navarro, & Ollitrault, 2013). However, several parameters, such as DNA quality or primer specificity, may influence the fluorescence and consequently the ability to discriminate fluorescence signals and allelic dosages. In our case, we were able to discriminate five types of fluorescence signal. At heterozygous loci, fluorescence signals were a mixture of the two allele‐specific fluorescences. Fluorescence signals should be balanced for diploids, which have a balanced allelic dosage (1:1) at heterozygous loci; diploids should therefore theoretically have no genotypes assessed as “polyploid‐like.” Conversely, because a balanced allelic dosage is impossible for triploids, triploids should theoretically have only genotypes assessed as “polyploid‐like” at heterozygous loci. Our results showed that 91 ± 5% and 83 ± 5% of heterozygous genotypes were correctly called for diploids and triploids, respectively. With the recent explosion of genotyping based on next‐generation sequencing, bioinformatics tools have been developed to accurately determine dosages (e.g., GBS2ploidy; Gompert & Mock, 2017). However, this requires deep sequencing and usually an assumption about the ploidy levels present in the dataset (Bourke, Voorrips, Visser, & Maliepaard, 2018).

Application in collection management may nevertheless not require allelic dosage assessment at each locus. Our aim was thus to develop a tool for estimating ploidy levels, not variations in copy number. The results showed that the ploidy level of each accession can be accurately deduced from the percentage of “polyploid‐like” genotypes among all heterozygous genotypes. Given the overlap between the distributions of this ratio (Figure 2), the only risk is confusing triploids with tetraploids, estimated at 3%. Consequently, ploidy level assessment is possible and fairly accurate for D. alata using the KASPar assay developed in this study.

FIGURE 5 Network of kinship for the 43 D. alata redundancy groups based on significant similarity (p < 0.05, edge‐weighted spring‐embedded layout). Node shape and letter, cluster of diversity identified by a bootstrap procedure; red nodes, diploids; green nodes, triploids; blue nodes, tetraploids; edge colors, similarity from gray (0.64) to black (1)

4.2 | Identification of duplicate accessions

The dataset included 129 SNPs validated on 141 accessions corresponding to 43 unique redundancy groups. The reduction of the 141 accessions to 43 unique redundancy groups reflects the narrow genetic diversity of D. alata, above all in polyploid germplasm (i.e., triploids and tetraploids), already identified in previous studies. For example, using DArT markers, Vandenbroucke et al. (2016) revealed a low varietal richness: among 80 landraces from six different Vanuatu islands, they differentiated only seven unique genotypes. Using isozyme markers, Lebot et al. (1998) studied 269 cultivars distributed worldwide and concluded that the genetic diversity of the most widespread cultivars was narrow.

Given the accessions' vernacular names, redundant accessions were expected in our sample. Some redundancy groups contained accessions detected as duplicates even though they could be differentiated by morphological characterization. For example, redundancy group five (including Lupias, Malalagi, and Malankon) exhibited diversity in tuber shape and tuber flesh color, in agreement with previous genetic diversity studies that already pooled these accessions together and highlighted this intragroup variability in tubers (Arnau et al., 2017; Malapa et al., 2005).

Morphological variability within a redundancy cluster mostly arises via D. alata clonal reproduction and farmers' selection of new morphotypes resulting from somatic mutations (Lebot et al., 1998; Malapa et al., 2005; Vandenbroucke et al., 2016). Small genetic or epigenetic variations are commonly selected to create new diversity in horticultural crops such as yam as reviewed by Krishna et al. (2016).

The ability of the KASPar assay developed in this study to distinguish true duplicates from genetically close accessions within collections depends on (a) the low number of loci studied (129), (b) the D. alata diversification process (i.e., selection of somaclonal mutants), and (c) the presence of real duplicates within collections. This tool is thus efficient for attributing accessions to a genetic lineage (e.g., for germplasm exchange), but thorough complementary agro‐morphological and ecophysiological characterization of collections is also needed to fully differentiate somaclonal mutant clones from duplicates (e.g., for identifying promising genitors for breeding programs).

4.3 | Diversity and collection management

The CRB‐PT collection has been shown to be representative of worldwide D. alata diversity (Arnau et al., 2017). Only a subset of this ex situ collection was genotyped in this study. However, all diversity groups identified by Arnau et al. (2017) were present, except one containing five very similar Indian accessions. Our validation panel was thus representative of worldwide D. alata diversity. Moreover, the gene pools identified in this study correlated well with the findings of the previous study of worldwide D. alata diversity by Arnau et al. (2017) (Appendix B). We can thus hypothesize that the 129‐SNP KASPar array developed for D. alata allows accurate assessment of genetic diversity and that the findings may be transferable to other collections. Moreover, this genotyping tool is a robust method (a) to assess complementarity/redundancy between different collections, (b) to identify underrepresented genetic groups, and (c) to plan future collecting missions to fill gaps in collections.

5 | CONCLUSION

This is the first SNP array designed for D. alata and validated on a subset of accessions representative of worldwide D. alata diversity. This tool will allow users to estimate accession ploidy levels and genetic lineages. The results showed a good correlation between the diversity assessed by this KASPar array and the findings of previous studies. This KASPar array is a robust and cost‐effective tool for diversity assessment and collection management. Given the importance of vegetative reproduction and somaclonal selection in D. alata, it is a good tool to complement agro‐morphological description in collections.

ACKNOWLEDGMENTS

This study was financially supported by the European Union and Guadeloupe Region (Programme Opérationnel FEDER— Guadeloupe—Conseil Régional 2014–2017). The authors would like to thank Suzia Gélabale, Marie‐Claire Gravillon, Jean‐Luc Irep, David Lange, and Elie Nudol for their involvement in CRB‐PT and CIRAD in vitro and field collections conservation. Finally, we are grateful to Patrick Ollitrault for his valuable discussion and to David Manley for English proofing.

CONFLICT OF INTEREST

The authors declare that they have no conflict of interest.

AUTHOR CONTRIBUTIONS

C.P., F.C., H.C., and P.M. designed the study. C.P., F.C., E.M., G.A., and R‐M.G. contributed to collecting materials and sample preparation. P.M. and S.C. developed the GBS protocol and carried out DNA extraction and GBS library preparation. H.C. and P.M. performed SNP discovery. F.C. and H.C. designed the KASPar assay and performed its analysis. C.P., F.C., and H.C. wrote the manuscript with input from all authors.

DATA ACCESSIBILITY

Plant materials may be requested from the CRB‐PT of Guadeloupe (http://intertrop.antilles.inra.fr/Portail/accessions/find/11). KASPar primer sequences are available in Appendix C.


REFERENCES

Arnau, G., Bhattacharjee, R., Mn, S., Chair, H., Malapa, R., Lebot, V., … Pavis, C. (2017). Understanding the genetic diversity and population structure of yam (Dioscorea alata L.) using microsatellite markers.

PLoS One, 12(3), e0174150.

Arnau, G., Némorin, A., Maledon, E., & Abraham, K. (2009). Revision of ploidy status of Dioscorea alata L. (Dioscoreaceae) by cytogenetic and microsatellite segregation analysis. Theoretical and Applied Genetics,

118, 1239–1249. https://doi.org/10.1007/s00122‐009‐0977‐6

Asemota, H. N., Ramser, J., Lopez‐Peralta, C., Weising, K., & Kahl, G. (1996). Genetic variation and cultivar identification of Jamaican yam germplasm by random amplified polymorphic DNA analysis.

Euphytica, 92, 341–351. https://doi.org/10.1007/BF00037118

Bourke, P. M., Voorrips, R. E., Visser, R. G. F., & Maliepaard, C. (2018). Tools for genetic studies in experimental populations of poly‐ ploids. Frontiers in Plant Science, 9, 513. https://doi.org/10.3389/ fpls.2018.00513

Broccanello, C., Chiodi, C., Funk, A., McGrath, J. M., Panella, L., & Stevanato, P. (2018). Comparison of three PCR‐based assays for SNP genotyping in plants. Plant Methods, 14, 28. https://doi.org/10.1186/ s13007‐018‐0295‐6

Chaïr, H., Sardos, J., Supply, A., Mournet, P., Malapa, R., & Lebot, V. (2016). Plastid phylogenetics of Oceania yams (Dioscorea spp., Dioscoreaceae) reveals natural interspecific hybridization of the greater yam (D. alata). Botanical Journal of the Linnean Society, 180, 319–333.

Cormier, F., Lawac, F., Maledon, E., Gravillon, M.‐C., Nudol, E., Mournet, P., … Arnau, G. (2019). A reference high‐density genetic map of greater yam (Dioscorea alata L.). Theoretical and Applied Genetics. https://doi.org/10.1007/s00122‐019‐03311‐6

Cuenca, J., Aleza, P., Navarro, L., & Ollitrault, P. (2013). Assignment of SNP allelic configuration in polyploids using competitive allele‐spe‐ cific PCR: Application to citrus triploid progeny. Annals of Botany,

111, 731–742. https://doi.org/10.1093/aob/mct032

Danecek, P., Auton, A., Abecasis, G., Albers, C. A., Banks, E., DePristo, M. A., … 1000 Genomes Project Analysis Group (2011). The variant call format and VCFtools. Bioinformatics, 27, 2156–2158. https://doi. org/10.1093/bioinformatics/btr330

Davey, J. W., Hohenlohe, P., Etter, P., Boone, J., Catchen, J., & Blaxter, M. (2011). Genome‐wide genetic marker discovery and genotyping using next‐generation sequencing. Nature Reviews Genetics, 12, 499– 510. https://doi.org/10.1038/nrg3012

Dereeper, A., Nicolas, S., Lecunff, L., Bacilieri, R., Doligez, A., Peros, J. P., … This, P. (2011). SNiPlay: a web‐based tool for detection, manage‐ ment and analysis of SNPs. Application to grapevine diversity proj‐ ects. BMC Bioinformatics, 12, 134.

Elshire, R. J., Glaubitz, J. C., Sun, Q., Poland, J. A., Kawamoto, K., Buckler, E. S., & Mitchell, S. E. (2011). A robust, simple genotyping‐by‐se‐ quencing (GBS) approach for high diversity species. PLoS One, 6, e19379. https://doi.org/10.1371/journal.pone.0019379

Garcia‐Lor, A., Ancillo, G., Navarro, L., & Ollitrault, P. (2013). Citrus (Rutaceae) SNP markers based on Competitive Allele‐Specific PCR; transferability across the Aurantioideae subfamily. Applications

in Plant Sciences, 1(4), apps.1200406. https://doi.org/10.3732/

apps.1200406

Glaubitz, J. C., Casstevens, T. M., Lu, F., Harriman, J., Elshire, R. J., Sun, Q. I., & Buckler, E. S. (2014). TASSEL‐GBS: A high capacity genotyping by sequencing analysis pipeline. PLoS One, 9(2), e90346. https://doi. org/10.1371/journal.pone.0090346

Gompert, Z., & Mock, K. E. (2017). Detection of individual ploidy lev‐ els with genotyping‐by‐sequencing (GBS) analysis. Molecular Ecology

Resources, 17, 1156–1167. https://doi.org/10.1111/1755‐0998.12657

Hiremath, P. J., Kumar, A., Penmetsa, R. V., Farmer, A., Schlueter, J. A., Chamarthi, S. K., … Varshney, R. K. (2012). Large‐scale development

of cost‐effective SNP marker assays for diversity assessment and genetic mapping in chickpea and comparative mapping in legumes. Plant Biotechnology Journal, 10, 716–732. https://doi. org/10.1111/j.1467‐7652.2012.00710.x

IPGRI/IITA (1997). Descriptors for Yam (Dioscorea spp.). International Institute of Tropical Agriculture, Ibadan, Nigeria/International Plant Genetic Resources Institute, Rome, Italy.

Krishna, H., Alizadeh, M., Singh, D., Singh, U., Chauhan, N., Eftekhari, M., & Sadh, R. K. (2016). Somaclonal variations and their applica‐ tions in horticultural crops improvement. 3 Biotech, 6, 54. https://doi. org/10.1007/s13205‐016‐0389‐7

Langmead, B., & Salzberg, S. (2012). Fast gapped‐read alignment with Bowtie 2. Nature Methods, 9, 357–359. https://doi.org/10.1038/ nmeth.1923

Lebot, V., Trilles, B., Noyer, L. J., & Modesto, J. (1998). Genetic relation‐ ships between Dioscorea alata L. cultivars. Genetic Resources and Crop

Evolution, 45, 499–509.

Mahalakshmi, V., Ng, Q., Atalobor, J., Ogunsola, D., Lawson, M., & Ortiz, R. (2007). Development of a West African yam Dioscorea spp. core collection. Genetic Resources and Crop Evolution, 54, 1817–1825. https://doi.org/10.1007/s10722‐006‐9203‐4

Malapa, R., Arnau, G., Noyer, J. L., & Lebot, V. (2005). Genetic diversity of the greater yam (Dioscorea alata L.) and relatedness to D. nummularia Lam. and D. transversa Br. as revealed with AFLP markers. Genetic

Resources and Crop Evolution, 52, 919–929. https://doi.org/10.1007/

s10722‐003‐6122‐5

Nemorin, A., David, J., Maledon, E., Nudol, E., Dalon, J., & Arnau, G. (2013). Microsatellite and flow cytometry analysis to help understand the origin of Dioscorea alata polyploids. Annals of Botany, 112, 811–819. https://doi.org/10.1093/aob/mct145

Otoo, E., Anokye, M. L., Asare, P. A., & Telleh, J. P. (2015). Molecular categorization of some water yam (Dioscorea alata L.) germplasm in Ghana using microsatellites (SSR) markers. Journal of Agricultural Science, 7(10), 226–238.

R Core Team (2017). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. Retrieved from https://www.R-project.org/

Risterucci, A.‐M., Hippolyte, I., Perrier, X., Xia, L., Caig, V., Evers, M., … Glaszmann, J.‐C. (2009). Development and assessment of diversity arrays technology for high‐throughput DNA analyses in Musa. Theoretical and Applied Genetics, 119, 1093–1103. https://doi.org/10.1007/s00122-009-1111-5

Sartie, A., Asiedu, R., & Franco, J. (2012). Genetic and phenotypic diversity in a germplasm working collection of cultivated tropical yams (Dioscorea spp.). Genetic Resources and Crop Evolution, 59, 1753–1765. https://doi.org/10.1007/s10722-012-9797-7

Saski, C. A., Bhattacharjee, R., Scheffler, B. E., & Asiedu, R. (2015). Genomic resources for water yam (Dioscorea alata L.): Analyses of EST‐sequences, de novo sequencing and GBS libraries. PLoS One, 10(7), e0134031.

Saxena, R. K., von Wettberg, E., Upadhyaya, H. D., Sanchez, V., Songok, S., Saxena, K., … Varshney, R. K. (2014). Genetic diversity and demographic history of Cajanus spp. illustrated from genome‐wide SNPs. PLoS One, 9, e88568.

Semagn, K., Babu, H., Hearne, S., & Olsen, M. (2014). Single nucleotide polymorphism genotyping using Kompetitive Allele Specific PCR (KASP): Overview of the technology and its application in crop improvement. Molecular Breeding, 33, 1–14. https://doi.org/10.1007/s11032-013-9917-x

Semagn, K., Beyene, Y., Makumbi, D., Mugo, S., Prasanna, B. M., Magorokosho, C., & Atlin, G. (2012). Quality control genotyping for assessment of genetic identity and purity in diverse tropical maize inbred lines. Theoretical and Applied Genetics, 125, 1487–1501. https://doi.org/10.1007/s00122-012-1928-1


Siqueira, M. V., Marconi, T. G., Bonatelli, M. L., Zucchi, M. I., & Veasey, E. A. (2011). New microsatellite loci for water yam (Dioscorea alata, Dioscoreaceae) and cross‐amplification for other Dioscorea species. American Journal of Botany, 98, 144–146.

Su, T., Li, P., Yang, J., Sui, G., Yu, Y., Zhang, D., … Zhang, F. (2018). Development of cost‐effective single nucleotide polymorphism marker assays for genetic diversity analysis in Brassica rapa. Molecular Breeding, 38, 42. https://doi.org/10.1007/s11032-018-0795-0

Suzuki, R., & Shimodaira, H. (2006). Pvclust: An R package for assessing the uncertainty in hierarchical clustering. Bioinformatics, 22(12), 1540–1542. https://doi.org/10.1093/bioinformatics/btl117

Tamiru, M., Natsume, S., Takagi, H., White, B., Yaegashi, H., Shimizu, M., … Terauchi, R. (2017). Genome sequencing of the staple food crop white Guinea yam enables the development of a molecular marker for sex determination. BMC Biology, 15, 86. https://doi.org/10.1186/s12915-017-0419-x

Vandenbroucke, H., Mournet, P., Vignes, H., Chaïr, H., Malapa, R., Duval, M. F., & Lebot, V. (2016). Somaclonal variants of taro (Colocasia esculenta Schott) and yam (Dioscorea alata L.) are incorporated into farmers’ varietal portfolios in Vanuatu. Genetic Resources and Crop Evolution, 63, 495–511. https://doi.org/10.1007/s10722-015-0267-x

How to cite this article: Cormier F, Mournet P, Causse S, et al. Development of a cost‐effective single nucleotide polymorphism genotyping array for management of greater yam germplasm collections. Ecol Evol. 2019;00:1–20. https://doi.org/10.1002/ece3.5141

APPENDIX A

TABLE A1 Description of the 40 D. alata accessions used to detect polymorphic SNPs

Collection Code Name Origin Ploidy

CRB‐PT PT‐IG‐00002 Pakutrany Nlle Calédonie

PT‐IG‐00006 Fénakué Puerto Rico 2

PT‐IG‐00010 Divin 1 Guadeloupe 2

PT‐IG‐00020 DA 26 Guyane Fr 3

PT‐IG‐00338 HYB 30 Guadeloupe

PT‐IG‐00350 Pacala Guadeloupe 2

PT‐IG‐00029 Plimbite Haïti 2

PT‐IG‐00033 Pyramide Puerto Rico 2

PT‐IG‐00046 Sea 190 Puerto Rico 2

PT‐IG‐00053 Kokoéta Nlle Calédonie 2

PT‐IG‐00686 Roujol 4

PT‐IG‐00687 INRA C 143

PT‐IG‐00688 INRA AL 56

PT‐IG‐00690 INRA AL 18

PT‐IG‐00692 INRA X 154 Guadeloupe

PT‐IG‐00694 Dou 4

PT‐IG‐00696 Ciradienne 4

PT‐IG‐00697 TiViolet 4

PT‐IG‐00698 Malalagi Vanuatu 2

PT‐IG‐00702 Manlankon Vanuatu 2

PT‐IG‐00689 Nureangdan Vanuatu 3

PT‐IG‐00077 Kinabayo Puerto Rico 2

PT‐IG‐00078 Toro Haïti 3

Cirad Vu 024a Tépuva Vanuatu 2

Vu 528a Tacharamivar 2

Vu 564a Mendrovar Vanuatu 2

Vu 567a Homb Vanuatu 2

Vu 754a Intejegan Vanuatu 4

Vu 231a Tagabé Vanuatu 4

Ovy taty Madagascar

Vu 247a n.a Vanuatu 2

Vu 401a Basa Vanuatu 2

Kabusa 2

74F 2

42F 2

61F 2

14M 2

H4x200 4

APPENDIX B

TABLE B1 Description of the 141 D. alata used as the KASPar assay validation panel

Collection Code Ploidyᵃ Div. Clust.ᵇ Redund. Grpᶜ Accession name Origin SSRᵈ

PT‐IG‐00087 3 A 26 65 Martinique XII

PT‐IG‐00090 3 A 26 Caillade 1 Haïti XII

PT‐IG‐00020 3 A 26 DA 26 French Guyana XII

PT‐IG‐00037 3 A 26 DA 27 French Guyana XII

PT‐IG‐00022 3 A 26 De agua Puerto Rico XII

PT‐IG‐00061 3 A 26 Igname d eau Martinique XII

PT‐IG‐00550 3 A 26 Montpellier XII

PT‐IG‐00075 3 A 26 Renta Yam Jamaica XII

PT‐IG‐00072 3 A 26 Sassa 1 Martinique XII

PT‐IG‐00063 3 A 26 Sassa 2 Martinique

PT‐IG‐00088 3 A 26 St Martin Martinique XII

PT‐IG‐00034 3 A 26 Sweet yam Jamaica XII

PT‐IG‐00557 3 A 26 Tahiti couleuvre Guadeloupe XII

PT‐IG‐00068 3 A 26 Tahiti cultivé Guadeloupe XII

PT‐IG‐00069 3 A 26 Tahiti French Guadeloupe XII

PT‐IG‐00018 3 A 26 Tahiti messien Guadeloupe

PT‐IG‐00064 3 A 26 Tana New Caledonia XII

PT‐IG‐00021 3 A 26 Telemaque Martinique XII

PT‐IG‐00044 3 A 26 Ti Joseph 1 Haïti XII

PT‐IG‐00078 3 A 26 Toro Haïti XII

CT257_CIV 3 A 26 OvyTaty AmbalaKindresy‐Ambohimasoa Madagascar

CT258_CIV 3 A 26 OvyTaty Amboasary‐Ambohimasoa Madagascar


PT‐IG‐00685 3 A 26 Sainte Anne

PT‐IG‐00558 4 B 3 Wabé New Caledonia XVIII

Vu472a 4 B 3 Toufi Tetea Vanuatu XVIII

Vu231a 4 B 3 Vanuatu XVIII

Vu750a 4 B 3 Wanorak Vanuatu

Vu534a 4 B 3 Bisoro Vanuatu XVIII

Vu754a 4 B 30 Noulelcae Vanuatu XVI

Vu408a 4 B 31 Manioc Vanuatu

PT‐IG‐00039 2 C 2 Americano Dominican Republic VII

PT‐IG‐00023 2 C 2 Florido Puerto Rico

PT‐IG‐00553 2 C 2 Pro 1 VII

PT‐IG‐00095 2 C 2 SEA 144 Puerto Rico IV

PT‐IG‐00555 2 C 2 SRT 29 VII

PT‐IG‐00041 2 C 2 St Domingue Dominican Republic VII

Vu401a 2 C 2 Basa Vanuatu VII

CT256 2 C 2

PT‐IG‐00009 4 C 12 Nouméa New Caledonia XVI

Vu247a 2 C 14 Vanuatu

Vu528a 2 C 16 Sinoua Vanuatu

PT‐IG‐00025 3 C 22 Goana New Caledonia XIII

PT‐IG‐00002 3 C 22 Pakutrany New Caledonia XIII

Vu699a 3 C 22 Tumas Vanuatu

Vu461a 3 C 22 Tumas Vanuatu XIII

Vu755a 4 C 24 Nepelev Vanuatu

PT‐IG‐00014 2 C 37 Divin 2 Guadeloupe

PT‐IG‐00006 2 C 37 Fénakué Puerto Rico

PT‐IG‐00053 2 C 37 Kokoéta New Caledonia

PT‐IG‐00559 2 C 39 Wassa New Caledonia

PT‐IG‐00001 2 D 7 64 Martinique

PT‐IG‐00010 2 D 7 Divin 1 Guadeloupe

PT‐IG‐00568 2 D 25 77 Martinique IV

PT‐IG‐00092 2 D 34 Caplaou Puerto Rico

PT‐IG‐00561 2 D 42 H 23

PT‐IG‐00562 2 D 42 H 50

74F 2 E 4 India

PT‐IG‐00049 2 E 5 Cinq Puerto Rico III

PT‐IG‐00027 2 E 5 Lupias New Caledonia III

PT‐IG‐00046 2 E 5 Sea 190 Puerto Rico III

Vu590a 2 E 5 Vanuatu III

Vu423a 2 E 5 Manlankon Vanuatu III

Vu639a 2 E 5 Malalagi Vanuatu III

Vu024a 2 E 5 Ptris Vanuatu III

PT‐IG‐00065 2 E 6 DA 28 French Guyana IV


PT‐IG‐00093 2 E 6 DA 32

PT‐IG‐00395 2 E 6 Fafadro bis IV

PT‐IG‐00060 2 E 6 Grand Etang Guadeloupe IV

PT‐IG‐00051 2 E 6 Morado Cuba IV

PT‐IG‐00073 2 E 6 Purple Lisbon Puerto Rico IV

PT‐IG‐00333 2 E 6 Sainte Catherine Guadeloupe IV

PT‐IG‐00052 2 E 6 Smooth Statia Puerto Rico IV

PT‐IG‐00024 2 E 6 St Vincent blanc 1 Martinique IV

PT‐IG‐00036 2 E 6 St Vincent blanc 2 Martinique IV

PT‐IG‐00556 2 E 6 St Vincent mart. Guadeloupe IV

PT‐IG‐00045 2 E 6 St Vincent Violet Martinique IV

PT‐IG‐00016 2 E 6 St Vincent Yam St. Lucia IV

PT‐IG‐00374 2 E 6 Ti Joseph Haïti IV

PT‐IG‐00067 2 E 6 Wénéféla bis New Caledonia IV

Vu487a 2 E 6 Teroosi Vanuatu VI

770 2 E 6

PT‐IG‐00623 2 E 6

PT‐IG‐00396 2 E 8 A 24

PT‐IG‐00071 2 E 10 72 Martinique VIII

PT‐IG‐00055 2 E 10 76 Martinique VIII

PT‐IG‐00089 2 E 10 Asmhore

PT‐IG‐00058 2 E 10 Bété Bété Côte d'Ivoire VIII

PT‐IG‐00091 2 E 10 Campêche 2

PT‐IG‐00546 2 E 10 Jardin Haitien VIII

PT‐IG‐00547 2 E 10 Kourou 1 French Guyana VIII

PT‐IG‐00548 2 E 10 Kourou 2 French Guyana VIII

PT‐IG‐00350 2 E 10 Pacala Guadeloupe VIII

PT‐IG‐00551 2 E 10 Pacala cacao French Guyana VIII

PT‐IG‐00552 2 E 10 Pacala Guyane French Guyana VIII

PT‐IG‐00017 2 E 10 Pacala station Guadeloupe VIII

PT‐IG‐00554 2 E 10 SRT 24 VIII

19 2 E 10

PT‐IG‐00057 2 E 11 Vino Purple forme Puerto Rico

61F 2 E 15 India

PT‐IG‐00019 2 E 19 Gordito New Caledonia IX

PT‐IG‐00047 2 E 20 Buet New Caledonia

PT‐IG‐00029 2 E 20 Plimbite Haïti

PT‐IG‐00048 2 E 21 Bacala 1 Haïti

PT‐IG‐00413 2 E 21 St Vincent St. Vincent

Cuba6 2 E 23 Cuba

PT‐IG‐00542 2 E 27 AL 10 I

PT‐IG‐00042 2 E 27 Brazzo Fuerte Puerto Rico I

PT‐IG‐00038 2 E 27 Brésil 1 I

PT‐IG‐00564 2 E 27 KL 10 I

PT‐IG‐00565 2 E 27 KL 21


PT‐IG‐00566 2 E 27 KL 40 I

PT‐IG‐00054 2 E 27 MP1 16H56 I

PT‐IG‐00033 2 E 27 Pyramide Puerto Rico I

PT‐IG‐00074 2 E 28 Oriental Barbados II

14M 2 E 29 India

PT‐IG‐00077 2 E 32 Kinabayo Puerto Rico II

PT‐IG‐00085 2 E 35 St Sauveur Guadeloupe

PT‐IG‐00560 2 E 35 Yam jamaïque

PT‐IG‐00543 2 E 36 Cross lisbon

PT‐IG‐00392 2 E 38 A 13

PT‐IG‐00398 2 E 38 A 2

PT‐IG‐00563 2 E 40 Sc.c 1.1

PT‐IG‐00008 2 E 41 AIA 445 Nigeria

PT‐IG‐00015 2 E 43 Igname rouge Guadeloupe X

Vu703a 3 F 1 Nawanurunkimanga Vanuatu

PT‐IG‐00544 3 F 9 Cuello largo Puerto Rico XV

PT‐IG‐00026 3 F 9 Féo Puerto Rico XV

Vu696a 3 F 9 Nowateknempian Vanuatu XV

PT‐IG‐00076 3 F 13 Bélep New Caledonia XIV

Vu735a 3 F 13 Noplon Vanuatu XIV

Vu760a 3 F 13 Nureangdan Vanuatu XIV

PT‐IG‐00397 2 F 17 SEA 119, Toki

Vu613a 2 F 17 Peter Vanuatu VI

Vu589a 2 F 17 Makila Vanuatu XI

VU590a 2 F 18 Vanuatu III

Vu554a 2 F 18 Nourembor Vanuatu VI

Vu567a 2 F 18 Letsletsbolos Vanuatu IV

Vu564a 2 F 18 Makila Vanuatu VI

Vu026a 2 F 18 Dammasis Vanuatu VI

ᵃIn italic, ploidy detected using the percentage of polyploid genotype type on overall heterozygous loci. ᵇGroup of diversity from diversity analysis. ᶜGroup of similarity used to select nonredundant accessions. Genotypes in the same group have a maximum of one allele mismatch. ᵈCluster of diversity identified by SSR in Arnau et al. (2017).
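The redundancy groups in Table B1 are defined by a threshold of at most one allele mismatch between genotypes of the same group. The Python sketch below shows one way such groups could be formed; it is an illustration under stated assumptions, not the procedure used by the authors. It assumes diploid calls encoded as two-letter strings per SNP, a greedy assignment in input order, and hypothetical accession names (`acc1` to `acc4`); the functions `allele_mismatches` and `redundancy_groups` are likewise hypothetical.

```python
def allele_mismatches(g1, g2):
    """Count allele differences between two multilocus genotypes.

    Each genotype is a list of per-locus calls such as "AG" (assumed
    diploid). Alleles are sorted within a locus so "AG" equals "GA".
    """
    total = 0
    for call1, call2 in zip(g1, g2):
        total += sum(x != y for x, y in zip(sorted(call1), sorted(call2)))
    return total


def redundancy_groups(genotypes, max_mismatch=1):
    """Greedily group accessions so that every pair inside a group
    differs by at most `max_mismatch` alleles (assumed grouping rule)."""
    groups = []
    for name in genotypes:
        for group in groups:
            # Join the first group whose members are all close enough.
            if all(
                allele_mismatches(genotypes[name], genotypes[member]) <= max_mismatch
                for member in group
            ):
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical toy data: three SNPs scored on four accessions.
toy = {
    "acc1": ["AA", "AG", "GG"],
    "acc2": ["AA", "GA", "GG"],
    "acc3": ["AA", "GG", "GG"],
    "acc4": ["GG", "AA", "AA"],
}
print(redundancy_groups(toy))
```

Because an accession joins a group only if it is within the threshold of every current member, the pairwise limit stated in the footnote holds inside each group. The result can depend on input order, which is a limitation of this greedy sketch.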

APPENDIX C

TABLE C1 KASPar assay description for the 129 high‐quality SNPs: Number of fluorescence type detected, chromosome and position on D. rotundata reference genome assessed by BLAST (E‐value)

SNP_ID # Fluo. type Chr. Pos. E‐value Sequence

S1_78464789 4 1 1066677 8E−52 GTTTCCCAATGGTAACACTTTCTGCAAAGCCTGAAAGGCACTTGACTTGACATTGCCAAG[T/G]GCATTAGTTGCCACAGCCCCAATTCTAACTATAGCTGCAGCAGCAGCTAACGGTGAAGCT

S1_166177831 4 1 3002321 2E−40 CATCACAAGCGAAACAATGCAAGATCACTGCAGCGCTAAACAAGACGATGAAAACTGCTA[G/T]AACTGCCCACTCTTCCAAAGATAGACTGCAGCAAAACAAAAGCCGCTTGGATGATCACAC

S1_73817882 5 1 5214076 8E−52 TGATTCTTCTTCCTCTTCATCTGCAGACTTTTTGGATGATGCTACTTCTTCACTAAACAA[A/G]CAACCTCTTTATCAGATGTCTTCTATCAAGGCTGAACTTCCAATCAAGTTGTGTGATTTG

S1_308276216 4 1 22422993 2E−54 GAAGCTGATTGAGCTGCTTGATATCGATCTGCAGTGGAGGATGCATAAAGTTTCTGATGG[G/C]CAGCGTCGTCGTGTGCAAATTTGCATGGGACTTTTACAACCATACAAGGCAATGATTTTT

S3_3283493 4 2 6259005 3E−51 TCTCAAGTAACTATTATGGTAGTAACAGATGATGCAAATGTGAAGGCAAGATAAGAAATA[C/G]CATACCTCCCCATCTGCAGCACAAGTAACGAGGGTCCGATCATCTGTGTAAGGCATGAAT

S1_15620393 5 2 21404053 8E−52 GATCAACTGCAATGCCAATGGTTGGTGCAAGTTTCTTGGGAATACCTGCTGCCTGAAATG[T/G]AAAACCCGTACAATATGATACAAATAAGTGGAGTGCCTGTGCTGCAGCTGAGAATTGAGA

S1_194680124 3 2 25133441 8E−52 TTTTGACTGACAGCCTTTAGTGAACTGCAGGCTTACATGGAAAACCTCTTGACCTCGCTG[C/T]GAGGCTGGATATAGCAATTGATGTTGCTCATGCTATTACATACCTTCACATGTACACAGG

S1_149013038 4 2 29033873 1E−43 CTCTCAGGTATGAATGGATGGTGCCCAAATGATTTTGAAACGCCCACATGGGTTTGTTGA[A/T]TGTGTTATCTACGCAACCAAATTGTAATAAATGACTAAATGTGGTAAATCTTTTCCCTGC

S1_54908604 5 2 31946176 8E−52 AACTGACAAAATGGCAATGCAATGCCCTTTGTCACTGATCACAAAGAAGAGAAGACATAT[A/T]AGGGTATTTTTATGGAAAAACAAAGATGGTCCATCTTATTATTATTTTCTCCTGCAGGGG

S1_220358774 4 3 562364 8E−52 GGGATTGACAAAAGCACAATCATTTACATGCTGCAGATTCGGCAGATTTTGCTGCAGATG[G/T]TGCTCCACCATCATCATTGGCAAAGGGGTAGCCTGATTTCCATGGGACACTTGGAGAAAG

S1_294347489 4 3 3182257 5E−48 AGGAAGAGTATGTTCTCCATCAATTACATTCTCATTACGCAACTTCAATATATCCATCAG[A/G]AAAGGGTTATTCTGGTGAGCAATACAATACACATTTTCTGCAGCAGGAATAGAACATATG

S1_213496700 5 3 6608590 2E−53 TATATAAACATTCCATTTTGATGAGAATGAGAGACCATTGTTGCTAGCATCCCATTGACT[A/G]CCATATCTGCAGGGATCTGTATGGAAAAAGTGCATGCATGAAAGACAAATATAATAATAT

S1_123502462 5 3 12211409 2E−46 ATAGAGAAAAGACCTGCAGAAGCAGAAGCAACACGATCATCCTTGTTGACATCTCGCAAA[C/T]GAAGAGAAAGCTTTTTGGTGAAGTTTGAGTGAGAATTGTAGAAGTCCTCCATGGCCATGG

S2_13394502 4 3 12986600 1E−50 GTCTATTTAGCATGTCTTAGTTTCTTGGTGATGATGACTGCAGTTGAAGTCAAAATTTGA[G/T]GATCTCTCATCTGAACATCATCATGCTTGTGAAGAAATGAATAAATTGCAAGAAAAGCTG

S1_212477984 4 3 18808497 2E−53 TTGTAGTCGTAATCGAAATCGTATTCTTTGTGGAGTATTATTTTAGGTGGAAGATGTTGA[T/G]ATTTCTGAAAGAGTTCAGAGAGGATTAGAATCACCAGCCTACTGCAGTGGAAGATATGTA

S1_40517926 5 4 2442039 3E−50 TGGTTAATCGCAGATGGGGCTTGGAAAGACTCTGCAGGCGATGTCTTTGTTGAGCTATCT[G/A]AAGGTCAATGGCATCTCAACGGGGCCGTTCTGTGAGTCTTGGTTTATCTCCGATGAGCTT

S1_341690721 3 4 4172027 7E−46 TTGTGCATTCCTCCCTGCATCTCTTGGAACTGCAGCCTGCCCACTCCATCCTCCCATGCT[A/G]CTCTTGTGAACCATCTCAACCACTCTTCTTCTTTCTCTCTCTCTTTCTCTCTGTTGTCTC

S1_94822591 5 4 6518043 4E−37 TCTGATGATCGTGCTTCTCTCATCAAATGTTAGATGTTGTTCTAACTCTTCAAGCAATCA[G/A]GAACTTATTTGCTACTATGAGTTGTGACTTATTGTTGCTGGTCACTGGATACTGCAGGTC

S1_223059854 5 4 9590872 4E−49 GTTGCTGCCAGCAATCATGAGACCATTATGAGCTATGTTGATGGATGGTGAGGTTCGGGA[C/T]GTCTGTGGATAAGCTGTAACTGCAGGCAGGCTACCTACCATTGGAGAAAATTGGCCAGCT

S3_1940455 4 4 11602024 4E−49 ATGCCTGAATCTGGAGGACAAGCTACTGCAGTGTGTCAAGAAATAGACATACTTGAAGAG[A/C]ATTATGAATCAGAACAGTTCCAGGCCGGTAATGTTGAGTTCATGCTTTCTGTTCCTTTCT

S2_59089982 4 4 13022373 2E−47 CTCAATAATGGTGGACAAATGTGCCTTCTAATTCAGAAAAAAAATGTCTACTAATTACCT[C/G]TCCAGCTCTGTCATTTACTTGTTATAGGATGACATAATTGATGACTCTGCAGAGGATGAT

S1_29508975 5 4 13956140 8E−52 TAGTGAGCAGTTGAGATCATTAACGAGCATAGAAGAACTCACTGCAGAAGCCACAAAGCC[A/T]GAAGTCAATGGAGTTTCTATGGAACATAATGACGAGGAATTGGGAACACTATATTGCACT

S1_78099239 4 4 19009721 2E−33 CAGGTTTTCTTTTCAATTGCAAGAGACATCAAGCAAAGGCTTGCAGAAACCGATACCAAA[C/G]CTGAGGTAGTATGCTTATCATTTGTGATAAATTCAGTAAACCTGCAGGCAATTAGTAATG

S1_98182899 5 4 25059970 2E−47 TGTGCACAATGCTATGACCACACATATGGTTTCAAACAGACAAGGAACAGAATCAACAGA[T/C]ACACTACTTACAGTGACAAGCTCCGATATCAGTTGCGCAGACAACCCTGTTTTCTGCAGC

S1_13376874 3 4 25692581 2E−40 CTTGATATGTCTGTATTGGGTGTCATTTCTGCAGTATTATTTGGTCCGCAAAGGTAAATC[A/G]ATAGAAATTGGCGTCATTTAGCAATATCTAAACTGTATTTGAATTTGTAGTTCGAGCATT

S1_50863270 5 4 27382503 2E−54 GTGGCTCTCAAAGACTGAGGAAGTGCGTGAAGATAGGGCTCATTGGGGTACCAATATAAC[T/C]GGTGATATTTATGGTCAGGGTTGGATCAGTGAAATGTATGGATATTCATTTGGTGCTGCA

S1_30240813 5 5 4418552 4E−43 TCAGAGAGTGAGTCACATAAAAATAAGATTCATGTTGCATTGTGATGCCCCCTGATTCTT[A/C]TTACTTGCCCCCAAACGATAAGGACATCTTTTCTGAAAACTGCAGAGCCTAAGGAAATTA

S1_284257251 5 5 8809466 2E−41 AATTTTCTGATCTGAGTATTGGTCAACAAGAATCCAACATAAAACTCAAGTAAAATGCAG[T/C]AAATTACAATTGTTACATAATTGTTTCTCCACATAACTTGCTAATAATTATTTCTGCAGA

S1_172013713 2 5 15915339 5E−48 AACGCATGATACTCAATGTGTTGTTACTAATTGAATCTCAATTAATATGACCTGCAGTTG[C/T]TTGAATTTTCATGCTATGTTTGTAAGGCCTCTAGTGTTGCCAAAACCTCAGACATCTTCG

S2_46046547 5 5 18636385 5E−54 CACTGCAGGGGCTGCTTCCTTGATGATGAACCCAAAGAACTCTATTTCTCAAATAAAGCG[C/A]TTCATTGGAAAGAAATTCTCTGACCCGGAGCTTCAGTCTGACTTGCAGTTATTTCCTTTT

S1_161404508 4 5 19741536 5E−42 TGCTAATCAAAACTGATGCCTCTGCAGGGACCAAGTTGATCATTGAAAGAATCAGGGATT[A/C]TCACTTTCTATCACCTGTAGAGTACTGCATGTACATAAGTCACCATGAAAAGCACCGGGC

S1_86749745 5 5 20784276 1E−48 CCCTTCTGATATTTGCTTGGAGTTGAGATGTCTGTTGAACTTTAGCTGGAAATTTTACAG[T/C]CAACTCTATGAATTGTGTTTTCTGATTCATACGCACATTTGTGATTTTGTGCCTGCAGAG

S1_349430697 5 5 22888769 8E−52 TCCATCCTGGGAAGCACTGGCAATGGTTGATTTCGGTAGGCCAAGATTTGGAGCCCAAGC[G/A]ACATCTCTAACCCAATCGGAATGCATCTGCAGGGCAGGAAAGCAGTCCATTTTCCAGCTC

S3_43353096 5 5 24048351 8E−52 TGATGGGGGGAAATAACCCAAACTGGTGGAGTTCTATCAGCAACATGAAGCCAACTGCAG[G/C]AGATGAAACCTCTCTTCTTTATCCTTTTCCATCTTCTTCACCTCTCTTCCATCACTACTC

S1_51060666 3 5 26180154 1E−42 GAAGGAAGGCAGCAGCCTTTCAAACCTGCAGATGAAGTCGCCACGTCTTTTGAAAATTGC[A/T]AACCCAGCCAATTTTGCAGCTGCTAGTTTATAGGTAATGATAGATAGTTATCTAGGCTAT

S1_126396048 5 5 27405531 4E−49 CAACTCTGTCGTAGGAAAAGAAACTGACCCGTCCATGATCAACAAAACAGAATTTTAGAG[G/A]AATGCCAAGGCACTTCTTCTCAAGGTTTTCTAATTCTGTTCTTGAGGGTGGCTTGTCTCT

S1_189523417 3 5 30765836 4E−49 TTTAAAGCTTGTGACAGCGAGTTTGAAGACCTCTGCTGCAGGCATGCATCCAGCTTGCGC[A/G]GCAGGAACAGCTACACCGGCTGTTACTGCCGTGGCGCTTTCACTCAGTGGACTGAAAACA

S1_210690510 5 5 31577924 2E−53 ATCACCTGTCATTCTTAGCATACGCTGACAGCATCCAGCAATAAGCCATGATGCTGGGCA[A/T]GATTCCCAAGGACTGGATCGCTTGAAACTGCTCTTTACTCTCATCTACAAGCCCTGCAGA

S1_199164936 5 6 10295314 1E−37 ATTATTATTATTATTATTTCTTCTTCTGCAGTAACGGGTCACATTGCTTGGAAGAAGTTG[T/C]TGGAGCTTGAAACGCAAATAATGATACGAACACAGCCTCAGCAATGCACGATTACTCGCT

S1_282211588 5 6 17925427 2E−40 CACTTTGCACAATTATCTGCTGTAGAATGTTCTATTTGTTAAACCTGCAGATTAGGAAAT[C/T]CTAAATTCTATCTGCTGTTATGAAGTCCTGGTAGTATGTACAAGCAGGTTGATTATACAT

S1_116006917 4 6 21845952 8E−52 TTGTCAAGGAACGATCCCTTCACCTCCTCGGAGAAGAATCGCCCGAACACACGAGAGATG[G/C]ATATGGCCGGAACCTGCAGAGGAGAAAGCGAAGCCCTAACCCTGAGGTGCTTCAGCACAG

S1_45761963 5 6 27083504 1E−50 TTTTATCATGAACCGATCATCCTGAGACAGGTAGAAGAAGCTCCCACTCTTCCCAGGGGA[G/A]GATAATTCCCTCAAAGCATCACTTCCACAGATCGTCAACATATAATCTGCAGCATCAACC

S1_289563297 3 6 31210760 2E−53 AATGAACCATATCATAATCAACTAGATGTGAAAAAAGAATATTTGCACAACTGCAGGTGG[A/G]CAGGAAACCAAGGGGCTAAATAGACACACCTCATGACCTAGTTTCACACCCATCTCCTGT

S1_210284742 5 6 32022412 1E−50 AAAAGACCCAAGGAAATGACACAGCAGAACCATTGTCCCATTGGACATTTTCAACTACAT[T/G]CAAACTGCAGCATAAAAACCAAGATTTATATCACATATCCACACTAGTTCAATGAAACAA

S1_244041680 5 7 891269 4E−49 ATCAAAACATCGCTCTCTGCAGCCAAATCACAGACGTTAGAGAAATATTTATAGGCGAGT[G/A]ATGGCCTTGTTGTTCTAGAATGGTACAAGATTGTGCAGCCAAAGGCTTCGAGTCGTTTTG

S4_1831247 3 7 3180748 5E−48 GACAAACCAGAAATCTTTCCTTTCCATTAAGGAAGCAAATCCACCAAGGAGAACTGCAGT[T/A]GCCCAAATGAATCCGAGGGCTCCCAAACCACTGGCTGCCTTTTCAAGGATTGCCAAGCGA

S1_156520859 4 7 10367143 8E−52 TCTTTTACTGATATAAAGAGACTACCAGAATCCATTTGTATGTTGGTTAATCTGCAGACA[C/T]TGAAACTCTATTGTTGTTATAAACTTTCCGAGCTTCCCAAGAGCATAACATACATGAACA

S1_109907043 3 7 15658966 2E−53 AGGAGAAAAAATTCATGTGATGTCCTCCATATCTCAGCCTCGTCTCGGGTGGTCAAATGA[A/G]ACTGCAGCAAGTATTGGGACAATTGCAAGAATAGACATGGATGGCACTCTCAATGTGAGT

S1_365833705 3 7 17541018 1E−50 ACACATCTTCACCATTCAATCACTTTCATCCAACTGCAGCAACGTCTCAACAAGATCTCC[C/A]TGAGCTAGGTATCATCAATTTTCTACAAGCAATCTGCATTGGAAAGTGATCATGGACCGA

S1_215375978 5 7 17828247 4E−49 TGCTGTGCTCACGCCGATGGATACTGTGAAGCAGCGGCTGCAGCTTGAGAGTAGTCCGTA[C/T]AGAGGGGTGGGTGATTGTGTGAGGAGAGTGATGAGGGAAGAGGGGGTGCGTGCGTTTTAT

S1_5956960 5 8 1990861 5E−35 AGATAAGCACTTTGTATCTTGCTATTTTTGTTGCTCTTTATTATTGATGTGCAACAATGT[C/T]CCCAACAACCACACACACACACACACACACAATTTTGTATTTTTATGTTAGCTACTTCAT

S1_142832546 5 8 4073006 4E−49 AACTAACATGAATTTTGGCTCAATGATATAAGATTAACAACAAAAACGTTTTTGCTGCAG[G/A]GTTCTTGAACAAGTTTGATGAAATCACAAAATGGATATTGAAAGTTTGTAAGAATGTTAT

S1_102926938 3 8 5771893 3E−50 AATCTTTGATGACAAAGCTGCAGCTTCTTTTCATGCAAAACAATAAAAAGTATACCGGAT[C/T]TGATGTGATATGGGATGATCAGATCACTATACTGAAAATGAAACCTGTGCCAGCTTCTCT

S1_29231327 5 8 6476874 1E−48 TCCAGCAAATAGGTGGGGAACATCATACAACGGGCACGGAATGTTCATCGAAAATGCACC[C/T]GCTAACATCGTGGCCAAAGCTATGGAAAACATTGCAATAGAAAGTATAACCTGCAGCTGA

S3_54678463 5 8 7748403 4E−56 TCAAGAGCTTCAAGAAGAGGAAAGAAGGATAATAAGTGAAATATCAGAGTTAGAGTGTGG[G/A]AGACCTGCAGAAGAACAAAGAGTTTGTGTTACTGAAATGATGGATTGTGTTATTGATCCA

S1_208236889 2 8 9479207 4E−49 AGGAAGGAAAGAGAAAGAAGTTTCTGCTGCAGTCTCAGCCCCTTCTTCGAATTCTTCTTG[T/C]AGTTTATCAACAATCACTCAATCATACGGTGACAATGCCACTCTTCAAATAACTCAGCAT

S1_169356495 5 8 12133164 8E−52 ATACAGAAATTATCAGTGTAATATATTACAGAAGTAGAATGCTTCATCACCAGAATCTGA[T/A]TTTATATGAAAACACACTGACCTCTTGATGAAGAATTAGGCAAAACAGGGAGTCTGCAGA

S3_35309770 5 8 19352521 2E−47 CATTTCCAAATTTCAGAAAATAAATCGGTTTCCATAAGATTTGAGGTACAAATAGTTTCC[A/G]CAAAGGAGGTTTATCTGATACAACACTGCAGCTTGAATATGGTAAATAACTAGTCTCACA

S1_235419648 4 8 23125222 5E−48 GAACTTCAGAAATTGTTATACGCTGCAGATTGCCCAAAATGAAGCATTCATTAGACATAA[C/T]TGATCCCATAAATCAGCCCAGCTTCTTTTATGTTGTACATAAAAGTTCAATTAGCAAGAT

S2_30875426 3 8 27554786 1E−43 TTCATGTTTGGAAGATCTAATGTCAATTTAGATGTCATATGGTTTAGTTTTGTATTAGTA[C/G]TTTATGTTGTATTATTCCAATATAATCCAATATCATATCTGCAATTCTGCAGCAGGTCTT

S1_71285261 4 9 1039396 2E−53 AGAGAATGTCCGGATAAATCCTCAGAGAAGACTCCACCTTCGCAAGGTGCCCAGCTCGCA[C/A]GGCCATGAAGAACTCGAGCTCTGTGGGGTTACGGCTGCAGCTCAGGCCATGCCCCATCTT

S1_352413390 4 9 3168774 2E−53 CTTTCTTGGCAAACAGTTCTGCAGTAGATTTGAAGTCAGCTTCTTCTACAAGTCTACCAA[A/G]AGAGGATTCCAAGTCAGTGAATGGATATGATCAATTTGCATGACTGCTTGAAAAGTCGGG

S1_58454213 5 9 3570970 2E−53 TGTCCTGTGGCCTCATCGAGGAGCCATTTCTCTAGCATTGATAGAGGAGGGTTGTTATGG[C/T]TCTCAACTCTCTCCTTCCCTTCAGAATCACCATTGAAGATTGCTGATTCTGCAGATGATT

S2_9856110 5 9 5066064 1E−50 TGGAAAATTAGGTATCCCAGTTACCATGGAAATCGCTAGTGATCTGCTTGATAGGCAAGG[T/C]CCAATTTACAGAGAGGATACTGCCGTGTTCGTTAGCCAATCTGGAGAAACTGCAGATACC

S1_157448006 5 9 7773168 2E−46 TTTAACTTTTGAAAGGCTGCAGGGTATGAAATCACAGGCCCCGAGCCTGCTAATGTAGAT[C/T]ATGGTGAAGAAGCTGCATCTGAGGATGAAGAGGAATCTTATGATGGACATGATGCAGATG

S3_14699960 5 9 14771343 1E−43 AACTAAAAGCAAACCAAAAATAAATTCCTCTGCCGTTAAATAACCTGCAGAAAAAAATAG[C/A]GAAATGTGACAAGGAGAATATTTACATACCTTCGCCTCCATGACACTTCTTTCAGAGTTC

S2_58843160 5 9 17855377 5E−54 CAATTCATTTCAGGCATAATGTTATCAAGTAATGCATATTCTACCAGAAATGAACTTTAT[G/A]TGGAACATCATTCTTGACATTTGAAGAACTGCAGTTGATTACAAGTGAATTGCTTATAAC

S1_108505610 5 9 23138342 1E−48 GGAAAATATCCAAAAAGCAATAAAAGCTGCAATGGACGCAGCCAATGCCGCTGTTTCACA[A/G]TCGAAGGCATTCTGCCTAAGCCGCGTCGATGTGGGTCTAGACACCACTGCAGTTCGAGAA

S1_96409136 4 10 273696 9E−45 TGAAGTTTGATGATTCAATTTACCAATGATTTTCATGCACAGGAGTATATCTGAGAATAA[C/T]GAGCAAAATGATGCAGAGGTTACTTCAGCAAGAAATGCTGCAGAGAAGGATGTGACCAAG

S1_75907479 5 10 1295979 8E−52 AATGTCTTGACAGAAGCCATGAAAAAGCTCCACAAAATAAGTCCAAATTGAGATTGGAGA[G/A]CATACTCTCCACTGCAGCAAGTCTGTACCCTGTCTATGTGACTGCTGGAGGGGCTCTTGC

S3_17911820 3 10 2220748 8E−39 AGTAGTTTCTGAACTGGTACTTTGATCAATACCTGCAGAGTTAGTAGCAGTAGCAATAAG[C/T]GAGGAGACCTCAATATCTTGCACATGATCACTCACCGAAAATGGAACATTATCAGCATGA

S1_282032037 4 10 5002351 7E−40 GATTTACAGGACAATTACATTTCAGATTTCCATAATGATGGTAACTACAAGAATATTATT[C/A]TGTAAGCACCATGATACTTGTATCCATTACATGCATTGAATCAAAAGAACTGCAGTTTTA

S2_66969333 2 10 5976818 2E−52 TTTAACTGTATTGGCAGTGTCTGCAGACAGAGCTACGTCTAGCAAAGTGAGCAACTCATC[A/G]TCTGAAACAACTCCAATCTGTAAAACAGAGCAAGTCACAGAAAATTTATACCAAGATAAG

S2_53836832 3 10 10477461 5E−48 GAACACTTTCCTGGATTGAAAATTATTTCTGCTGCAGATCATCGTTTCTTTGGCACCACA[C/A]CCTTTTTCTAATAAAATATTCTTGAGCTCTTTCTCATCTTTGAGATGTGAAACATCTAAC

S1_21668183 4 10 16038685 2E−53 AGCTGCAGAAATTACATCAAGGATGGATCTCAGAGCTTCCAACCTGCATAACATCTCATC[G/A]ATTGCAATGTCATCAACTGGAAGGTGCCCTATGATCCCTGCCACGGTTGATCTCTCCTCT

S1_333790152 5 11 2622759 4E−56 TGCTTTGAAAGGCCAGCATGATCTTTTTTCTATTTTGTGTTCTTGCAATGAGTTCTGTCT[A/T]ACATTGCTGATTTTTGTTCTTGTGCAGTCTGCAGTAGATTTTGATGATAAAACCAGTTGG

S1_238131512 2 11 14447426 1E−43 AAATATCGAGATGAATTTCTGGGAACAAGCCTGCAGTTAGTTTGAGGATGTGTTTGGAAT[A/T]AGTGATCAAATGGCATTTGAGAAGCATAGTATGTAATTTTTCCGTAAAAAATGATCAAGA

S1_38918393 3 12 17162245 2E−46 AATCCCATCAAATTTGTCTTTCAAATTTCTGAATTTTTTCCCCTATCCCAAAAAATTCAG[C/G]CCATCAAGAAAATATGAGGCAAAGCATAAAACTGCAGCATAATCTCTATTAGATCTCATC

S1_102041015 4 12 24327364 3E−45 GCAGTATACAACTTCATTACCATGTTAGACCAGCAAATCTCAGTCTGATTTCTTCCTAGC[T/C]AACACACACACACACACACACAAAAGAAAACTTCAATCTCTGTTTGTTTTCTGAGTGCAT

S1_65460849 4 13 1824700 1E−42 GCATCTGCTAGTTCAAAATAAAAGGCAAGCATGGTATCACTAATACTGCAGAAAAATGAT[A/G]GTGCATAAATATCTAATGGGAAATGATGCAGCGAAGATCAATAACTTAGAATGAAATTCA

S2_23695000 5 13 8252985 8E−52 TACATCTGGGGGACTGCAGAGCTTTGAGCATCCACTGAACCTGGTGAAACCAGTGCCCAG[G/A]GTTGACAGCAAAGGGCAAGTATGTGGTGCCAATTTCAAAGTTGATGCCCAAGCTAAGAAG

S1_279910017 5 13 27313051 2E−53 GGCAGCTTCCAGGGCTCGGGAGCTACCAAGATGGAGTTTGCTATCCACGGATTTGTTCCA[C/T]GAATCTGTAGCTCCATCGTATACCCTGAGCCTGCAGCCGTCTCTGCAGTCCAGAGCATAA

S1_297529258 3 14 3933202 7E−46 CTTATCTATGGCCAGGTTCATCAACATATAAGGCTTCAATGGTCTCAAATTTCATCGGTG[A/C]TGCTTCTGCTGTGTGTACTTTTACTTGTCACTTACGCCAAAGTAACTGCAGCCTGCAGGT

S1_252699764 5 14 12391182 4E−49 ACTACTGAATAAATGGAATCAACTTTATTTGCTGCAGTTGGACTAGCCTAAGAGGAACTA[A/T]GTGGCTTTGGAAGAGTGTTGATACTTGGGATTTTATATCAATGTGTGAAAATCAGTGACT

S1_64347285 5 14 12868762 1E−50 GGATCAAAGTTCTCAGAGATTATTGATTTTGAGAAGTCAAGATATGCATCAAATCCGTGG[T/G]TGATCTCTACTGCAGCATTTACCATCCCTTTCATGTCAATTGACCAGAGAGGTGTCAGAT

S1_108652759 4 14 13736821 3E−45 GAGGTAATCTGTGACTTGTCCATATTACTGCAGAACAGCAATTTATTGCTGATCTTGGAC[T/C]ACTGATATCCAGCTTTCCCCAGATTATGTCATATTGCACCCAGCAAACCAGTACAATTTA

S2_68074878 5 15 320619 6E−47 CACCACCATCACATCCGCACCATTGTTGTCCACCTTTGAACCCAGTGTGCCTGAGAAGGA[C/T]ACCTGCAACATATCCGGTGATGACTTCTGGAAAGTTGCCTGCAGGGCAGATTAATAAAAT

S2_15031349 4 15 2754888 3E−50 TAAGAGAAAAATAGCTTACATTATCTGTGTCGACCTCTGATATAATCTCTTTGATTGTAG[T/C]GGCATCGCCCATTCCGTATTTCTCCATGGCTGATTCCAACTCATCTCTTGTGATATAGCT

S1_361168697 5 15 3199245 2E−54 ATTACTAAGATGCAATTTATAAACATATCTGCAGTTCATTGCTCTTGAAATCTCTGCACA[A/G]GCAGAAGAAGTTGAGATTTCCGTAAAAAAGCCTGAAAATGGTGGTAGTGCTTCAGAAGAG

S1_42812024 5 15 3456280 8E−52 GAACATTTTATAGTACCTCAAGGGGAGTGCCTTTACTGTCAAGAGGAAGGGGAACACCAA[C/T]TCCGCATCCTCTCGGAAATCTGAAGCAGCTAGGCCTGTCATCAATGGCTGCTGCAGTAGC

S1_31917523 5 15 3839679 2E−54 TTTACATCTATTGAACTCTCTGCAGGTTGAATCTGAAATATTTTGCTTGCATGGTGGTTT[G/A]TCCCCTTCTATTGAGACCCTTGATAACATACGCAATTTTGATCGTGTTCAAGAGGTTCCT

S1_16064479 5 15 5173749 1E−50 GCTCAGAAGAGCTCCATATGTAAGTTGGTTCTGGACTACCTCCGGGAGACTATTGAAGTA[A/T]CTCTCTGCAGCATCTACCCCCTTGACTTTGGATATAAGATCTATTCGTATTGCATGGGTC