
Analysis of expression patterns of candidate genes for bud set and cold tolerance in Scots pine (Pinus sylvestris L.)


HAL Id: hal-02947464

https://hal.inrae.fr/hal-02947464

Submitted on 24 Sep 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Distributed under a Creative Commons Attribution 4.0 International License


Komlan Avia, Katri Kärkkäinen, Outi Savolainen

To cite this version:

Komlan Avia, Katri Kärkkäinen, Outi Savolainen. Analysis of expression patterns of candidate genes for bud set and cold tolerance in Scots pine (Pinus sylvestris L.). IUFRO Tree Biotechnology Conference 2011: From Genomes to Integration and Delivery, Jun 2011, Arraial d’Ajuda, Brazil. BMC Proceedings, 2011, ⟨10.1186/1753-6561-5-S7-P3⟩. ⟨hal-02947464⟩


MEETING ABSTRACTS Open Access

IUFRO Tree Biotechnology Conference 2011:

From Genomes to Integration and Delivery

Arraial d’Ajuda, Bahia, Brazil. 26 June - 2 July 2011

Edited by Dario Grattapaglia

Published: 13 September 2011

These abstracts are available online at http://www.biomedcentral.com/1753-6561/5?issue=S7

INTRODUCTION

A1

IUFRO Tree Biotechnology 2011: “From genomes to integration and delivery”

Dario Grattapaglia1,2

1EMBRAPA Genetic Resources and Biotechnology, Estação Parque Biológico, 70770-910, Brasilia, DF, Brazil; 2Graduate Program in Genomics and Biotechnology - Universidade Catolica de Brasilia - SGAN 916 Modulo B, 70790-160 Brasilia, Brazil

BMC Proceedings 2011, 5(Suppl 7):A1

Forest trees have unquestionably entered the genomic era. The updated version of the Populus genome, the recently released Eucalyptus grandis genome and the concerted efforts towards the generation of genome sequences for spruces (Picea sp.) and pines (Pinus sp.) by several groups worldwide are fueling a multitude of inter-disciplinary studies and applications in sustainable forest production and conservation. Time now calls for the integration of scientific fields with an increased sense of urgency for delivery of effective biotechnologies.

The IUFRO (International Union of Forestry Research Organizations) Tree Biotechnology biennial conference has established a solid tradition for over 20 years as the official meeting of the IUFRO working group 2.04.06, Molecular biology of forest trees. This conference has convened scientists and foresters interested in the genetics, genomics, molecular biology and physiology of forest trees, and the application of this knowledge to tree improvement and conservation. The Tree Biotechnology Conference has undoubtedly been the premier international forum where the most cutting-edge research in tree biotechnology, developed both in academia and industry, is presented. “From genomes to integration and delivery” was the theme chosen for the 2011 edition of the IUFRO Tree Biotechnology Conference, the first to be held in South America. Our intention was to promote a more integrated and applied dialogue on tree biotechnology and genomics, beyond the mainstream discussion of the fundamental advances on the genetic mechanisms that underlie tree phenotypes.

In nine scientific sessions, some of the current advances of genomics applied to forest conservation, tree physiology, stress response, molecular breeding, in vitro and propagation technologies, wood development and genetically modified (GM) trees were highlighted. With 340 registered participants, the Conference brought to Brazil most of the world’s brain power in forest tree genomics and biotechnology. An outstanding team of international scientists shared their results and visions on the present and future of this fast-moving area of forest science, while a brilliant group of young scientists and students delivered a very energetic and diverse collection of high-quality scientific presentations. Forty-two countries were represented at the Conference, with almost 100 different laboratories from tens of universities, research institutions and private companies.

During the seven days of the Conference, 26 invited lectures, 63 oral and 185 poster presentations were delivered, totaling 274 papers made available as extended abstracts in this BMC Proceedings supplement. The special workshop on the hot topic of “Genomic Selection in tree breeding” and the several reports on whole-genome studies made this conference edition inaugurate a deliberate effort towards a better integration between the quantitative genomics, the “single-gene” and the systems biology approaches to more efficiently unravel the complex relationships between genotypes and phenotypes in forest trees. A field trip to the forest plantations, nurseries and mill of VERACEL Cellulose was a definite highlight and a welcome break from the scientific sessions, providing an overview of some of the advances and challenges facing the translation of research into plantation forestry.

In closing this introductory statement, acknowledgements are due to the outstanding financial support provided by the competitive grants of the Brazilian Ministry of Science and Technology through the National Research Council (CNPq) and the Ministry of Education through its agency for graduate studies (CAPES). Major support was also provided by EMBRAPA (Brazilian Corporation of Agricultural Research), and VERACEL Cellulose, the host organizations, together with an exceptional suite of private sponsors.

Besides the organizations that backed this conference and an active Scientific Committee involved in abstract review, a number of people were involved in the organization and logistics. The conference would not have been possible without the valuable contributions of all these players.

Given the rewarding feedback received after the Conference, the original goal of providing an exceptional mix of science, social activities and field exploration in a relaxed atmosphere was truly accomplished. The IUFRO Tree Biotechnology Conference 2011 made a significant contribution to moving the forest biotechnology research community one step ahead in the challenging task of going from gene and genome discoveries to the delivery of valuable technologies into sustainable forestry.

S1. POPULATION GENOMICS, CONSERVATION AND ADAPTATION

I1

Missing heritability and missing Fst of candidate genes: why does gene variation differ from trait variation in trees?

Antoine Kremer

INRA, 69 Route d’Arcachon, Cestas 33610, France

E-mail: antoine.kremer@pierroton.inra.fr

BMC Proceedings 2011, 5(Suppl 7):I1

© 2011 various authors, licensee BioMed Central Ltd. All articles published in this supplement are distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Molecular footprints of phenotypic variation are usually explored by two different approaches in trees. On the one hand, association genetics seeks statistical correlations between SNPs and traits, owing to the significant heritability that is usually observed in progeny tests. On the other hand, detection of outlier Fst values of single SNPs has also been implemented to account for the very large differentiation of traits that is observed in provenance tests. The rationale of both approaches is based on the extensive within- or between-population genetic variability that has been widely recorded in field tests. While the cataloguing of candidate genes has steadily increased in trees, so has the inventory of their diversity in natural or breeding populations. There is now a rapidly growing body of experimental results showing a very large discrepancy between the expectations of both approaches and the SNP variation that is monitored in candidate genes. Most association studies show that single SNPs usually explain less than 5% of the phenotypic variation of the trait, while Fst values are at best of the same magnitude as those of neutral markers. In my presentation I will explore the reasons for the decoupling between trait and gene variation, by focusing on the multilocus structure of traits, as compared to the monolocus SNP information. Indeed, association studies and Fst outlier detection are essentially based on single-locus approaches, while traits are multidimensional structures.

There are at least three properties of multilocus structures that will be investigated: cumulativity, interactions and covariation of gene effects. I will show that cumulativity and interactions may explain the discrepancy in the case of association studies, while covariation of gene effects (at the between-population level) explains the missing Fst of genes underlying adaptive traits. These conclusions are supported by experimental results, theoretical background and simulation predictions.
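As an illustration of the single-locus Fst that outlier scans rely on, the following minimal sketch computes Fst = (H_T - H_S) / H_T for one biallelic SNP from population allele frequencies. The function name, the equal default weights and the example frequencies are assumptions made for illustration; the abstract does not specify which Fst estimator was used.

```python
def fst_biallelic(allele_freqs, pop_sizes=None):
    """Return Fst = (H_T - H_S) / H_T for one biallelic SNP.

    allele_freqs: frequency of the reference allele in each population.
    pop_sizes: optional relative population weights (equal weights if omitted).
    """
    if pop_sizes is None:
        pop_sizes = [1.0] * len(allele_freqs)
    total = float(sum(pop_sizes))
    weights = [size / total for size in pop_sizes]

    # Mean allele frequency across populations.
    p_bar = sum(w * p for w, p in zip(weights, allele_freqs))
    # Expected heterozygosity in the pooled (total) population.
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    # Weighted mean expected heterozygosity within populations.
    h_within = sum(w * 2.0 * p * (1.0 - p) for w, p in zip(weights, allele_freqs))

    if h_total == 0.0:
        return 0.0  # monomorphic SNP: no differentiation is measurable
    return (h_total - h_within) / h_total


# Hypothetical allele frequencies of one SNP in two provenances.
print(fst_biallelic([0.4, 0.6]))  # weak differentiation at a single locus
print(fst_biallelic([0.0, 1.0]))  # fixed difference between populations
```

A modest frequency shift such as 0.4 versus 0.6 gives a small single-locus Fst, which is consistent with the point made above that a trait can be strongly differentiated even when each underlying SNP is not.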

I2

Population and conservation genomics of forest trees: seeing the forest for the trees

Andrew Eckert

Department of Ecology and Evolution, University of California at Davis, Davis, CA, 95616, USA

E-mail: ajeckert@ucdavis.edu

BMC Proceedings 2011, 5(Suppl 7):I2

Background: Forest trees exhibit striking adaptations to the environments in which they grow. A long history of quantitative genetic experimentation has established the genetic basis for many traits, which are likely adaptive since many of them are also correlated with environmental heterogeneity. The genes underlying these traits, however, have largely remained elusive. Recent applications of high-throughput sequencing and genotyping technologies to natural populations of forest trees have identified several promising candidates for genes underlying complex and adaptive traits (reviewed by [1]). The diversity of analytical approaches employed in those studies, however, raises the question of the generality of reported results. Here, I exploited the diversity of analytical approaches used previously to identify genes underlying adaptive traits for two conifer species to assess the logical consistency among results generated from different conceptual frameworks.

Figure 1 (abstract I2) The distribution of Douglas-fir and loblolly pine in North America. The black box surrounds the sampling area for Douglas-fir, while the entire range for loblolly pine was sampled.

Figure 2 (abstract I2) An example of logical consistency for loblolly pine between genes associated to phenotypes or environmental variables and gene categories that have average deviations from neutrality consistent with positive or negative selection. Error bars give 99% bootstrap confidence intervals for the direction of selection statistic (DoS), with yellow bars having confidence intervals excluding zero. Lines give average Tajima’s D for nonsynonymous (blue) and synonymous sites (green), as well as the proportion of genes associated to at least one phenotype (red). Shaded areas give null distributions (99% quantiles) generated via permutations of genes among categories (n = 10,000 permutations). All lines, including those forming the null distributions, were smoothed using lowess smoothing.

Materials and methods: Data sets composed of genotyped single nucleotide polymorphisms (SNPs) were gathered for two North American conifers (Fig. 1): loblolly pine (Pinus taeda L.) and coastal Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco var. menziesii). These data sets were generated through resequencing of diversity panels (n = 18-24 megagametophytes) for a sample of expressed sequence tag (EST) unigenes (n = 7,535) and cold-hardiness related candidate genes (n = 121), respectively. Genotyping was performed using Illumina’s Infinium (loblolly pine) or GoldenGate (coastal Douglas-fir) array technologies. For each data set, associations to phenotypes and environmental variables were gathered from the literature or performed as described elsewhere (see references in [1]). For each associated gene, I asked two questions: (1) Are genes associated to phenotypes more often also associated to environmental variables than randomly chosen genes? (2) Are associated genes also outliers for nucleotide diversity and site-frequency spectrum based statistics, nucleotide divergence or FST more often than randomly chosen genes? Permutation tests were used to assess whether or not observed patterns were different from those produced by chance.
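The first question above can be sketched as a simple permutation test: count how many phenotype-associated genes are also environment-associated, then compare that count with the overlap obtained from random gene sets of the same size. This is only an illustrative sketch; the function name, the one-sided test and the toy gene identifiers are assumptions, and the abstract does not describe the exact resampling scheme that was used.

```python
import random


def overlap_permutation_test(all_genes, phenotype_genes, environment_genes,
                             n_permutations=10000, seed=0):
    """One-sided permutation test for overlap between two gene sets.

    Returns (observed_overlap, p_value), where p_value is the proportion of
    random gene sets, the same size as phenotype_genes, whose overlap with
    environment_genes is at least as large as the observed overlap.
    """
    rng = random.Random(seed)
    all_genes = list(all_genes)
    phenotype_genes = set(phenotype_genes)
    environment_genes = set(environment_genes)

    observed = len(phenotype_genes & environment_genes)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Draw a random gene set of the same size, without replacement.
        random_set = rng.sample(all_genes, len(phenotype_genes))
        overlap = sum(1 for gene in random_set if gene in environment_genes)
        if overlap >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the estimated P-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical gene identifiers, for illustration only.
genes = ["gene%03d" % i for i in range(100)]
observed, p = overlap_permutation_test(genes, genes[:10], genes[:10],
                                       n_permutations=1000)
print(observed, p)
```

Fixing the seed makes the sketch reproducible; in practice the number of permutations limits the smallest P-value that can be reported.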

Results: A total of ~30 genes were associated with cold-hardiness phenotypes or with climate data for coastal Douglas-fir, while ~850 associated genes were identified for loblolly pine. Genes associated with phenotypes for coastal Douglas-fir were more often associated to environmental variables than randomly chosen genes (P = 0.005), which was only moderately apparent for loblolly pine (P = 0.067). Genes associated to phenotypes or environmental variables were not more likely to be outliers for nucleotide diversity, site-frequency spectrum based statistics, nucleotide divergence or FST for Douglas-fir (P > 0.05). There was some evidence, however, for non-neutral evolution for associated genes using a statistic based on the McDonald-Kreitman table [2]. In this case, associated genes had more extreme values for the direction of selection (DoS) statistic than expected from randomly resampling the available genes (P < 0.05).

Associated genes on average had skewed site-frequency spectra for loblolly pine, especially for synonymous sites, as well as more extreme values for the DoS statistic than expected. Further classification of genes for loblolly pine into functional categories revealed striking trends indicative of non-neutral processes underlying some of the associations (Fig. 2). A total of 11 gene categories were consistent with positive (n = 7) or negative (n = 4) selection. In both cases, the frequency of associated loci increased for these categories and the synonymous site-frequency spectra became more skewed. Many associations for loblolly pine may thus reflect linked selection, with molecular phenotypes (e.g. gene expression) accounting for all the associations in gene categories indicative of negative selection and environmental associations (e.g. aridity) being largely located in gene categories consistent with positive selection.
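For readers unfamiliar with the direction of selection statistic, the sketch below computes it from the four cells of a McDonald-Kreitman table as DoS = Dn/(Dn + Ds) - Pn/(Pn + Ps), following the definition in [2]. The function name and the example counts are hypothetical; positive values are usually read as consistent with positive selection and negative values with segregating slightly deleterious variants.

```python
def direction_of_selection(dn, ds, pn, ps):
    """Direction of selection (DoS) from a McDonald-Kreitman table.

    dn, ds: nonsynonymous and synonymous fixed differences (divergence).
    pn, ps: nonsynonymous and synonymous polymorphisms.
    Returns None when either margin is empty and DoS is undefined.
    """
    if dn + ds == 0 or pn + ps == 0:
        return None
    return dn / (dn + ds) - pn / (pn + ps)


# Hypothetical counts for two genes.
print(direction_of_selection(10, 10, 5, 15))  # positive DoS
print(direction_of_selection(2, 8, 5, 5))     # negative DoS
```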

Conclusions: Many of the genes associated to phenotypes for both species were also correlated to environmental variables and exhibited patterns of non-neutral evolution. Thus, associated genes are prime targets for conservation efforts. The questions posed here, however, make the strong assumption that genes associated to phenotypes or environmental variables should also show non-neutral patterns of evolution. This is not always expected to be the case [3,4], yet the lack of consistency is often interpreted as such and is one explanation for those genes or sets of genes reported here as lacking non-neutral signals. The search for logical consistency among analytical approaches, however, often focuses on uninformative patterns. To illustrate this point, I employed a novel environmental association approach that correlates genetic divergence to environmental change and show that most of the site-frequency spectrum based outliers for coastal Douglas-fir [5] are correlated to change in climate variables but not to extant climate patterns. Taken together, these results illustrate that non-neutral genes are often identified during association analyses, that departures from neutrality for genes driving associations are not only those due to recent directional selection, and that further work is needed to understand the population genetic processes underlying associations between genotypes, phenotypes and the environment.

Acknowledgements: I would like to thank J. L. Liechty, B. N. Figueroa G. Rosa, and J. L. Wegrzyn for bioinformatics and computational support. I would like to thank all of the collaborators on the ADEPT2 project (http://dendrome.ucdavis.edu/NealeLab/adept2/). This work was funded by the National Science Foundation (IOS-PGRP-0501763, IOS-PGRP 0638502) and the United States Department of Agriculture (NRI Plant Genome 04-712-0084).

References

1. Neale DB, Kremer A: Forest tree genomics: Growing resources and applications. Nat Rev Genet 2011, 12:111-122.

2. Stoletzki N, Eyre-Walker A: Estimation of the neutrality index. Mol Biol Evol 2011, 28:63-70.

3. Le Corre V, Kremer A: Genetic variability at neutral markers, quantitative trait loci and trait in a subdivided population under selection. Genetics 2003, 164:1205-1219.

4. Pritchard JK, Pickrell JK, Coop G: The genetics of human adaptation: hard sweeps, soft sweeps, and polygenic adaptation. Curr Biol 2010, 20:R208-R215.

5. Eckert AJ, Bower AD, Pande B, Jermstad KD, Krutovsky KV, St. Clair JB, Neale DB: Association genetics of coastal Douglas fir (Pseudotsuga menziesii var. menziesii, Pinaceae). I. Cold-hardiness related traits. Genetics 2009, 182:1289-1302.

I3

Tracing the history of South-American neotropical savannas and seasonally dry forests evidences from comparative phylogeography

Rosane Collevatti1*, Suelen Rabelo1, Joao Nabout2, Jose Diniz-Filho1

1Universidade Federal de Goias, Brazil; 2Universidade Estadual de Goias, Brazil

E-mail: rosanegc68@hotmail.com

BMC Proceedings 2011, 5(Suppl 7):I3

Background: The Neotropical Seasonally Dry Forests (SDF) are tree-dominated ecosystems that occur in disjunct areas of fertile soils throughout the Neotropics. The Pleistocenic Arc Hypothesis, which proposes the vicariance of a formerly continuous seasonal woodland formation that may have reached its maximum extension during a dry-cool period 18,000–12,000 BP (the Last Glacial Maximum, LGM), has been raised to explain the disjunct distribution [1]. Alternatively, based on the distribution of contemporary SDF species in the Amazonian forest and on palynological data, Pennington [2] proposed that SDF may have expanded into the Amazon Basin during the Pleistocene, with rain forest and montane taxa largely confined to gallery forest. In addition, a number of studies based on the fossil pollen record now available show that during the early Holocene (until ca. 6000-5000 14C B.P.), the climate was drier in most of the South American savannas, and the distribution of savanna-like vegetation in Central and Southeast Brazil was more extensive in the early than in the late Holocene [3-5]. In southeastern Brazil, the current vegetation has existed in the region only since the latest Holocene (since 970 or 600 B.P. for some regions) under the current wet climatic conditions, with an annual dry season of about 4 months. Hence, the fossil record shows that savanna expansion in the Quaternary, especially in southeastern Brazil, was characterized mainly by herbaceous and grass savanna, which were favored by the drier and highly seasonal climate. It is possible that arboreal savanna taxa became restricted to sites with moist climatic conditions, which served as refugia. We are interested in testing these hypotheses using the genus Tabebuia as a model group. We have chosen five species based on their pattern of geographical distribution: T. aurea and T. ochracea, from savanna vegetation (cerrado sensu stricto); T. impetiginosa and T. roseo-alba, from SDF; and T. serratifolia, widely distributed in Mata Atlantica, SDF, riparian forests and Amazonia. Here, we present the results based on the phylogeography of T. impetiginosa.

Methods: We first generated a phylogenetic hypothesis based on sequences from three non-coding regions of cpDNA and ITS from nuclear rDNA. At least 16 individuals from 14 populations were sampled and sequenced for three chloroplast intergenic spacers (1635 bp). We also sequenced individuals of T. ochracea, T. chrysantha, T. chrysotricha and Cybistax antisyphilitica as outgroups to estimate coalescence time and better understand the biogeographical history of T. impetiginosa. Time to most recent common ancestor was estimated based on coalescent analysis implemented in the BEAST 1.4.7 software [6]. For distribution modeling, we used 77 records of T. impetiginosa and four climatic variables (mean annual rainfall and variability, average temperature of the warmest and coldest months) derived from four different coupled Atmospheric-Oceanic Global Circulation Models (AOGCM): CCSM3, CSIRO, HADCM3 and ECHAM. Five different niche models were used: BIOCLIM, Euclidean Distances (EUCL), Mahalanobis Distances (MAHAL), Genetic Algorithm for Rule Set Production (GARP), and Maximum Entropy (MAXENT). For each of the niche models, a total of 750 different models were generated, using distinct combinations of dataset partition and variable selection. The paleoclimate scenarios were based on the GCM Genesis 2 [7].
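To give a sense of how the simplest of these niche models works, the sketch below fits a rectilinear climatic envelope of the kind BIOCLIM is built on: a site is predicted suitable when every climatic variable falls within the range observed at the occurrence records. This is a deliberately simplified min-max version; the variable names, the record values and the use of the full range rather than percentile limits are all assumptions, since the abstract does not give the settings that were used.

```python
def fit_envelope(records):
    """Return {variable: (min, max)} over a list of occurrence records.

    Each record is a dict mapping a climatic variable name to its value.
    """
    variables = list(records[0].keys())
    return {
        var: (min(rec[var] for rec in records), max(rec[var] for rec in records))
        for var in variables
    }


def inside_envelope(envelope, site):
    """True when every variable of the site lies within the fitted range."""
    return all(low <= site[var] <= high for var, (low, high) in envelope.items())


# Hypothetical occurrence records (rainfall in mm, temperatures in deg C).
occurrences = [
    {"annual_rainfall": 900.0, "warmest_month_temp": 27.0, "coldest_month_temp": 18.0},
    {"annual_rainfall": 1400.0, "warmest_month_temp": 29.5, "coldest_month_temp": 20.0},
    {"annual_rainfall": 1100.0, "warmest_month_temp": 28.0, "coldest_month_temp": 16.5},
]
envelope = fit_envelope(occurrences)

candidate = {"annual_rainfall": 1000.0, "warmest_month_temp": 28.5, "coldest_month_temp": 17.0}
too_wet = {"annual_rainfall": 2600.0, "warmest_month_temp": 28.0, "coldest_month_temp": 19.0}
print(inside_envelope(envelope, candidate), inside_envelope(envelope, too_wet))
```

Projecting the same envelope onto paleoclimate layers, instead of present-day values, is what turns such a model into a paleodistribution estimate.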

Results and discussion: We have no evidence of a recent connection of the SDF in the LGM. Coalescent analyses showed that T. impetiginosa populations probably originated at ~7 Myr BP, but populations started to diverge only ~2 Myr BP, with the divergence of two major clades, one that comprises the populations from the East and West boundaries of the Cerrado Biome, and the other that corresponds to the forests from Central, West and Northeast Brazil. Major divergences in this last clade coincide with the Quaternary glaciations. Our results strongly support that the disjunct distribution of T. impetiginosa may be derived from vicariance and that long-distance gene flow is unlikely, since we found a deep genealogy with allopatric lineages and a remarkable differentiation among populations (AMOVA, Arlequin v. 3.11) [8]. Paleodistribution modeling together with coalescent simulation showed no connection among SDFs in the LGM. Thus, glaciations most likely caused population divergence and differentiation due to the retreat and range shift of an ancient forest widely distributed throughout Central, Southwest and Northeast Brazil. Hence, we hypothesize that the isolation of an ancient forest due to global cooling and drying during the Pliocene/Pleistocene glaciations is responsible for the disjunct distribution of the SDF and T. impetiginosa.

Our results also indicated that the last Pleistocene glaciation was most likely responsible for local differentiation, because lineage diversification within localities coincides with ~100 kyr BP, but without important demographic effects, because populations showed a constant effective population size and no exponential growth or decline. In conclusion, the range shift of T. impetiginosa toward the Northwest and the lack of a connection among SDFs in the LGM support the hypothesis of a kind of SDF in the Amazonian Basin during Pleistocene glaciations but do not support the Pleistocenic Arc Hypothesis.

Acknowledgments: Phylogeographic analyses of Cerrado tree species have been continuously supported by competitive grants of CNPq to RGC. Our research program integrating macroecology and molecular ecology has been continuously supported by grants to the research network GENPAC (Geographical Genetics and Regional Planning for natural resources in Brazilian Cerrado) supported by CNPq/MCT/CAPES (projects # 564717/2010-0 and 563624/2010-8).

References

1. Prado DE, Gibbs PE: Patterns of species distributions in the dry seasonal forests of South America. Ann Missouri Bot Gard 1993, 80:902-927.

2. Pennington RT, Prado DA, Pendry C: Neotropical seasonally dry forests and Pleistocene vegetation changes. J Biogeogr 2000, 27:261-273.

3. Salgado-Labouriau ML, Barberi M, Ferraz-Vicentini KR, Parizzi MG: A dry climatic event during the late Quaternary of tropical Brazil. Rev Paleobot Palyn 1998, 99:115-129.

4. Behling H, Hooghiemstra H: Neotropical savanna environments in space and time: Late Quaternary. In Interhemispheric climate linkages. Edited by Markgraf V. Oxford: Academic Press; 2000:307-323.

5. Behling H: Late glacial and Holocene vegetation, climate and fire history inferred from Lagoa Nova in the southeastern Brazilian lowland. Veget Hist Archaeobot 2003, 12:263-270.

6. Drummond AJ, Rambaut A: BEAST: Bayesian Evolutionary Analysis by Sampling Trees. BMC Evolutionary Biology 2007, 7:214.

7. Thompson SL, Pollard D: Greenland and Antarctic mass balances for present and doubled atmospheric CO2 from the GENESIS version-2 global model. J Climate 1997, 10:871-900.

8. Schneider S, Roessli D, Excoffier L: Arlequin Ver. 2000: A software for population genetic data analysis. Switzerland: Genetics and Biometry Laboratory, University of Geneva; 2000.

S2. LINKAGE AND ASSOCIATION MAPPING

I4

Genomics-based breeding in forest trees: are we there yet?

David Neale

University of California-Davis, USA

E-mail: DBNeale@ucdavis.edu

BMC Proceedings 2011, 5(Suppl 7):I4

Efforts to develop genetic marker based approaches to breeding forest trees began in the late 1980s. Approaches based on the first generation of markers, allozymes, were not feasible due to the very limited number of markers (<50). The first DNA-based markers, RFLPs, brought more hope, as moderately dense genetic maps could be constructed to scan the genome and map quantitative trait loci (QTLs). This approach was quite effective for mapping QTLs in many forest tree species, but it could not be brought to application in tree breeding due to low levels of linkage disequilibrium (LD) in forest tree breeding populations and recombination between flanking markers and QTLs with each generation. The next generation of DNA markers, based on the polymerase chain reaction (PCR), such as RAPD, AFLP and SSR, did not solve the LD and recombination problem, even though more markers were available and throughput increased.

The situation began to change in the early 2000s with the availability of automated DNA sequencing technology and single nucleotide polymorphism (SNP) genetic markers. Association studies could now be performed in which SNPs within candidate genes controlling complex traits could be identified, thus “solving” or minimizing the LD and recombination limitation. This approach to complex trait dissection has been widely applied in forest trees, and the early approach of QTL mapping in segregating populations has been mostly abandoned. The association genetic approach has been used to find candidate gene SNPs associated to a broad array of quantitative traits of interest (wood properties, growth, abiotic stresses and disease resistance).

However, as with QTL mapping before, individual SNP x trait associations only account for a small proportion of the variation (generally less than 2-3% of the total phenotypic variance), and the total variation explained by all markers is generally less than 50%. In human genetics, this situation is called the “missing heritability”, and there has been great debate over whether this problem can ever be solved in complex trait dissection in humans. In a few agricultural systems, however, notably dairy cattle, researchers are now accounting for nearly all the heritable variation.
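To make the scale of such single-SNP effects concrete, the sketch below uses the standard quantitative genetic expression for the additive variance contributed by one biallelic locus, 2p(1 - p)a², and divides it by the phenotypic variance. It assumes Hardy-Weinberg proportions and a purely additive effect; the function name and the example numbers are hypothetical and are not taken from the studies summarized above.

```python
def proportion_variance_explained(allele_freq, allele_effect, phenotypic_variance):
    """Share of phenotypic variance explained by one additive biallelic SNP.

    allele_freq: frequency p of the allele whose effect is given.
    allele_effect: average effect a of one allele substitution (trait units).
    phenotypic_variance: total phenotypic variance of the trait.
    Assumes Hardy-Weinberg proportions and no dominance.
    """
    if phenotypic_variance <= 0:
        raise ValueError("phenotypic_variance must be positive")
    additive_variance = 2.0 * allele_freq * (1.0 - allele_freq) * allele_effect ** 2
    return additive_variance / phenotypic_variance


# A common allele (p = 0.5) shifting the trait by 0.2 phenotypic standard
# deviations per copy, with the phenotypic variance scaled to 1.
print(proportion_variance_explained(0.5, 0.2, 1.0))
```

With these illustrative values the SNP explains about 2% of the phenotypic variance, which is the order of magnitude reported above; a rarer allele with the same effect would explain even less.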

Efforts to realize genomics-based breeding in conifer tree improvement in the US have culminated under a collaborative research project funded by the USDA. The Conifer Translational Genomics Network Coordinated Agricultural Project (CTGN CAP) has successfully brought genomics-based breeding to application in the four major cooperative forest tree breeding programs in the US. The project has completed the basic research on allele discovery of economic traits in conifers, conducted the translational research, and provided education and training to tree breeders to realize this goal during the four-year project period (2007 to 2011). Genomics-based breeding can now be applied to the production of over 1.3 billion loblolly pine, slash pine and Douglas-fir seedlings planted annually in the US.

The CTGN CAP obtained high-density single nucleotide polymorphism (SNP) genotypes (over 7000 SNPs) on nearly 10,000 loblolly pine, slash pine, and Douglas-fir trees from the breeding cooperatives. These data were used to estimate molecular breeding values that can now be used to make selections as an alternative to breeding values based on mature traits, which can take many years to evaluate and can be expensive to measure in field tests. The CTGN CAP was the first to apply the advanced Illumina Infinium SNP-genotyping technology that is now being used in the breeding of most major crops such as corn, wheat, tomato, potato, and many more.
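In its simplest form, a molecular breeding value is the sum over markers of the allele dosage (0, 1 or 2 copies) multiplied by an estimated marker effect. The sketch below shows only that final scoring step; the tree identifiers, dosages and marker effects are hypothetical, and the statistical model used by the CTGN CAP to estimate the effects is not described in this abstract.

```python
def molecular_breeding_values(genotypes, marker_effects):
    """Score each tree as the sum of allele dosage times marker effect.

    genotypes: dict mapping a tree identifier to a list of allele dosages
        (0, 1 or 2 copies of the scored allele at each SNP).
    marker_effects: list of estimated additive effects, one per SNP.
    """
    values = {}
    for tree, dosages in genotypes.items():
        if len(dosages) != len(marker_effects):
            raise ValueError("tree %s has the wrong number of markers" % tree)
        values[tree] = sum(d * e for d, e in zip(dosages, marker_effects))
    return values


# Hypothetical dosages at three SNPs and hypothetical marker effects.
genotypes = {"tree_A": [0, 1, 2], "tree_B": [2, 2, 0], "tree_C": [1, 0, 1]}
effects = [0.5, -0.2, 0.1]

values = molecular_breeding_values(genotypes, effects)
ranking = sorted(values, key=values.get, reverse=True)
print(ranking)
```

Because the score needs only a genotype, candidate trees can be ranked at the seedling stage, before the mature traits could be measured in a field test.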

The CTGN CAP culminates a long-term investment by the USDA in this team of researchers and their work on the technological advancement of tree breeding in the US. USDA’s efforts began more than 20 years ago with funding of a series of grants under the National Research Initiative Plant Genome Program and continued under the CSREES IFAFS and AFRI programs. Recently, NIFA has awarded a grant to members of this team to conduct full genome sequencing in loblolly pine, sugar pine, and Douglas-fir. The full genome sequence will soon allow tree breeders to practice genomics-based breeding using genetic variation from the entire genome (genomic selection), thus minimizing or even eliminating the missing heritability problem in forest trees. The partnership between USDA research programs and this team of researchers has resulted in a modernization of tree breeding technology in the US that will secure US competitiveness in the production of forest products.

I5

Identification of genes and alleles influencing wood development in Eucalyptus

Simon Southerton*, Shannon Dillon, Bala Thumma

CSIRO Plant Industry, Australia

E-mail: simon.southerton@csiro.au

BMC Proceedings 2011, 5(Suppl 7):I5

The goal of many forest tree breeding programs is to increase the quantity and quality of wood products from plantations. Due to their outcrossing breeding systems, long generation times and relatively short history of domestication, breeding populations of most forest trees closely resemble the wild state. Consequently, vast stores of genetic variation are available for selection. Because wood traits are under polygenic (quantitative) control, genetic improvement will rely on selection of multiple alleles, each of relatively small individual effect.

Marker-assisted selection may enhance tree breeding programs by enabling informed selection of parents for crossing; fixing desirable alleles in the homozygous state; increasing selection intensity through screening large numbers of individuals; enabling early selection in seedlings and by reducing phenotyping costs. The low linkage disequilibrium found in most forest trees makes them ideally suited to candidate gene-based association mapping approaches for marker discovery. This approach seeks to find alleles which affect phenotype and that remain linked to the trait across populations and over many generations. This methodology is well suited to tree breeding programs which aim to maintain a broad genetic base i.e. programs with a large number of families.
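Linkage disequilibrium between two SNPs is commonly summarized by r², computed from haplotype frequencies as D² / (pA(1 - pA) pB(1 - pB)) with D = pAB - pA pB. The sketch below illustrates that calculation on phased haplotypes coded 0/1; the function name and the toy haplotypes are assumptions for illustration, not data from the studies described here.

```python
def ld_r_squared(haplotypes):
    """Return r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes: list of (allele_at_snp_a, allele_at_snp_b) pairs coded 0 or 1.
    Returns None when either SNP is monomorphic and r^2 is undefined.
    """
    n = len(haplotypes)
    p_a = sum(h[0] for h in haplotypes) / n
    p_b = sum(h[1] for h in haplotypes) / n
    # Frequency of haplotypes carrying allele 1 at both SNPs.
    p_ab = sum(1 for h in haplotypes if h[0] == 1 and h[1] == 1) / n

    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        return None
    return d * d / denominator


# Two SNPs always inherited together versus two independent SNPs.
print(ld_r_squared([(1, 1), (0, 0)] * 5))
print(ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

When r² decays quickly with physical distance, as described above for most forest trees, a marker that stays associated with a trait must lie very close to the causal variant, which is why candidate gene SNPs are attractive.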

We are using association studies to identify genes and allelic variation that influence wood fibre properties in Eucalyptus nitens. Candidate genes are being selected on the basis of their known involvement in cell wall synthesis pathways expected to impact wood traits. Single nucleotide polymorphisms (SNPs) are identified in candidate genes by sequencing in a number of unrelated individuals. Selected SNPs are being genotyped across large unrelated E. nitens populations that have been extensively phenotyped for wood properties including cellulose and lignin content, pulp yield, microfibril angle (MFA), and density. Several SNPs significantly associated with wood properties have been identified and subsequently validated in other provenance or mapping populations growing in different environments.

Selected SNPs are being investigated further to determine whether or not the SNP is the causative polymorphism and how the polymorphism influences the trait. DNA markers identified in this research may be used to complement existing index selection strategies in E. nitens breeding programs. Strategies for exploiting SNPs for marker-assisted selection in seedling-based breeding programs will be discussed.

I6

Gene mapping in white spruce (P. glauca): QTL and association studies integrating population and expression data

John MacKay1*, Brian Boyle2, Walid El Kayal3, Marie-Claire Namroud2, Trevor Doerksen4, Janice Cooke3, Nathalie Isabel5, Jean Beaulieu4, Philippe Rigault6, Paul Bicho7, Jean Bousquet2,1

1Université Laval, Arborea, Centre for Forest Research and Institute for Systems and Integrative Biology, Québec, Québec, Canada;2Université Laval, Arborea, Centre for Forest Research and Institute for Systems and Integrative Biology, Québec, Québec, Canada;3University of Alberta, Department of Biological Sciences, Edmonton, Canada;4Natural Resources Canada, Laurentian Forestry Center and Canadian Wood Fibre Centre, Québec, Québec, Canada;5Natural Resources Canada, Laurentian Forestry Center, Québec, Québec, Canada;6Gydle Inc., Québec, Québec, Canada;7FPInnovations, Vancouver, British Columbia, Canada E-mail: john.mackay@sbf.ulaval.ca

BMC Proceedings 2011, 5(Suppl 7):I6

Background: Connecting phenotype with genotype is the basis for developing forest genetic applications such as marker-assisted selection (MAS). Quantitative trait locus (QTL) mapping and genetic association mapping (or linkage disequilibrium (LD) mapping) are two major approaches to finding genes that control phenotypes of interest in forest trees. Our QTL and association mapping experiments in white spruce (Picea glauca [Moench] Voss) aimed to identify genes linked to or associated with growth, adaptation, and wood property traits.

Gene mapping in conifer trees presents specific challenges, including very large genome sizes, low levels of linkage disequilibrium, and large effective population sizes in breeding populations. Therefore, association mapping experiments have relied on testing candidate gene targets rather than genome-wide association scans. We have explored different approaches that use gene expression and SNP outlier data to identify candidate genes, to help explain the findings of gene mapping experiments, and to provide a broader understanding of observed phenotypic variation.

Results: Growth and phenology: The genomic architecture of bud phenology and height growth was investigated by assessing QTLs across pedigrees, years, and environments (1). A total of 11 distinct QTLs for bud flush, 13 for bud set, and 10 for height growth were localized on a linkage map highly enriched in gene markers. Nearly 50% of the QTLs were stable across environments and/or years and 20% were replicated between populations. The proportion of phenotypic variance explained by individual QTLs ranged from 3% to 22.2%, and QTLs together accounted for up to 70% of trait variance. These outcomes were integrated with findings from studies aimed at identifying local adaptation genes and gene expression associated with bud formation.

A genome-wide scan of 534 SNPs localized in 345 expressed genes was used to detect genes putatively linked to local adaptation (2). We identified 5.5% of genes as FST outliers at the 95% confidence level, and 14% of genes as candidates for local adaptation with a Bayesian method. The list of candidate genes and outliers includes sequences that co-localized with the QTLs for bud phenology.
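As an illustration of the FST outlier idea, the following minimal sketch computes Wright's FST for a single biallelic SNP from per-population allele frequencies. It assumes equally weighted populations and the simple (HT - HS)/HT estimator; the abstract does not state which estimator or which Bayesian method was used, and the function name and example frequencies are hypothetical.

```python
def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic SNP.

    freqs: allele frequency of the same allele in each population.
    Assumes equal population weights (an illustrative simplification).
    """
    n = len(freqs)
    p_bar = sum(freqs) / n
    h_t = 2 * p_bar * (1 - p_bar)                  # expected heterozygosity, pooled
    h_s = sum(2 * p * (1 - p) for p in freqs) / n  # mean within-population value
    if h_t == 0:
        return 0.0  # SNP is monomorphic overall, no differentiation to measure
    return (h_t - h_s) / h_t


# Hypothetical frequencies in two populations
weak = fst_biallelic([0.45, 0.55])
strong = fst_biallelic([0.2, 0.8])
```

In an outlier scan, SNPs whose FST falls in the upper tail of the genome-wide distribution (here, beyond the 95% level) would be flagged as candidates for local adaptation.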

A bud set roadmap was constructed by comprehensive microarray and qRT-PCR analysis of the dormancy transition in bud, stem, needle, and root tissues over a time course, under short and long days (3). Tissue expression profiles were used to identify genes expressed only or preferentially in developing buds, which we hypothesize play a more prominent role in bud formation. A core group of genes likely involved in the initiation of bud formation included about 100 of the bud-prominent genes and several sequences encoding potential regulatory proteins. Several of the bud set roadmap genes, including bud-prominent genes, co-localized with QTLs for the time of bud set.

Wood properties: Wood physical traits were assessed using SilviScan technology in a population of 1700 trees comprising 215 open-pollinated families. In a pilot study, we tested for associations between single nucleotide polymorphisms (SNPs) in 550 candidate genes and wood traits (4). We found 13 SNPs significantly associated with wood traits. The phenotypic variance explained reached up to 11% with approaches combining several SNPs.
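The notion of phenotypic variance explained by a SNP can be sketched as the R² of a simple regression of the trait on allele dosage (0, 1 or 2 copies of one allele). This assumes an additive single-marker model; the analysis reported in (4) may have used a different model (for example, one accounting for family structure), and the function name and data below are hypothetical.

```python
def snp_association(dosage, trait):
    """Least-squares regression of a trait on SNP allele dosage.

    Returns (slope, r_squared): the additive allele effect and the
    proportion of phenotypic variance explained by the SNP.
    """
    n = len(dosage)
    mean_x = sum(dosage) / n
    mean_y = sum(trait) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, trait))
    sxx = sum((x - mean_x) ** 2 for x in dosage)
    syy = sum((y - mean_y) ** 2 for y in trait)
    if sxx == 0 or syy == 0:
        raise ValueError("SNP or trait has no variation")
    return sxy / sxx, (sxy * sxy) / (sxx * syy)


# Hypothetical dosages and trait values for three trees
slope, r2 = snp_association([0, 1, 2], [1.0, 3.0, 2.0])
```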

Most association studies of wood properties have tested candidate genes that are highly expressed in secondary xylem, hypothesizing that genes that are preferentially or strongly expressed during wood formation are more likely to control wood properties. However, this hypothesis had not been tested. The genotyped sequences included genes with diverse expression profiles. Their transcript accumulation profiles were determined in trees grown under controlled conditions with a large-scale custom oligonucleotide microarray representing 25,094 different spruce genes. Of the 550 genes tested for association, 29% showed preferential transcript accumulation in secondary xylem compared to both secondary phloem and needles, but as many genes (29%) were phloem preferential. Xylem-preferential RNA accumulation was found for 10 of the 13 genes harbouring SNPs significantly associated with wood traits. Our findings confirm that expression data were relevant for selecting candidate genes, but not all of the genes containing significant SNPs were xylem preferential.

Transcript accumulation was also studied in secondary xylem of trees from the provenance-progeny trial, to further characterize the genes containing SNPs significantly associated with wood traits. In some cases, significantly different transcript levels were found among the different SNP genotypes. Xylem-preferential RNA accumulation was shown for the majority of these genes. Our results suggest that differential expression may be associated with SNP genotypes.

Large-scale genotyping: A meta-analysis was used to integrate data from multiple experiments in order to identify and assign priorities to approximately 5000 candidate genes for association mapping experiments. The candidate gene selection considered the above gene expression data, findings from transcriptomic investigations of gene regulation, studies investigating transcriptional variation among trees within mapping populations, and outlier data related to local adaptation. A large-scale genotyping chip was developed and data were obtained for 7000 SNPs from nearly 2500 genes.

Conclusions: This report describes QTL mapping and genetic association mapping results. We have illustrated ways in which gene expression and population data may be of value in these approaches, whether they are used to select candidate genes or to characterize the physiological processes underlying marker-trait associations.

References

1. Pelgas B, Bousquet J, Meirmans PG, Ritland K, Isabel N: QTL mapping in white spruce: gene maps and genomic regions underlying adaptive traits across pedigrees, years and environments. BMC Genomics 2011, 12:145.

2. Namroud M-C, Guillet-Claude C, Mackay J, Isabel N, Bousquet J: Molecular evolution of regulatory genes in spruces from different species and continents: heterogeneous patterns of linkage disequilibrium and selection but correlated recent demographic changes. J Mol Evol 2010, 70:371-386.

3. El Kayal W, Allen CCG, Ju CJ-T, Adams E, King-Jones S, Zaharia LI, Abrams SR, Cooke JEK: Molecular events of apical bud formation in white spruce, Picea glauca. Plant Cell Environ 2011, 34:480-500.

4. Beaulieu J, Doerksen T, Boyle B, Clément S, Deslauriers M, Beauseigle S, Blais S, Poulin P-L, Lenz P, Caron S, et al: Association genetics of wood physical traits in the conifer white spruce and relationships with gene expression. Genetics 2011, 188:197-214.

S3. GENOMICS ASSISTED BREEDING

I7

Capturing and genotyping the genome-wide genetic diversity of trees for association mapping and genomic selection

Matias Kirst1*, Márcio Resende2, Patricio Munoz3, Leandro Neves3

1School of Forest Resources and Conservation, University of Florida, P.O. Box 110410, Gainesville, FL 32611, USA;2Genetics and Genomics Graduate Program, University of Florida, P.O. Box 110690, Gainesville, FL 32611, USA;3Plant Molecular and Cellular Biology Graduate Program, University of Florida, P.O. Box 110690, Gainesville, FL 32611, USA E-mail: mkirst@ufl.edu

BMC Proceedings 2011, 5(Suppl 7):I7

Background: Growing demand for food and fiber and a rapidly changing climate will require that plant breeders accelerate the improvement of germplasm adapted to new sources of biotic and abiotic stress. In trees, the threat from climate change is more evident and the solutions more challenging than in any other plant species, due to the complexity and cost of breeding programs and the long breeding cycles. Therefore, the discovery of genetic polymorphisms that can be exploited for early selection of better-adapted and more productive individuals is essential.

Quantitative trait loci (QTL) analysis provided an initial glimpse of the architecture of complex traits, but limited transferability across populations and limited resolution hampered the adoption of markers in tree breeding programs. Recently, association studies have become the method of choice for detecting markers implicated in trait variation, because of their higher resolution, population transferability and allelic diversity captured relative to the QTL approach. However, in tree species, association studies have been largely constrained to sampling the genetic diversity in a limited fraction of the genome, and in small populations. Evidence from genome-wide association studies (GWAS) in humans and advanced crops clearly shows that larger populations and the sampling of regulatory variants and rare alleles are critical to dissect the genetic control of complex traits for marker-assisted breeding (MAB). As the limitations of QTL and GWAS approaches become evident, hybrid, intermediate strategies that combine the advantages of both methods have emerged. Notably, genomic selection has become an alternative to MAB. Genomic selection (GS), which relies on developing genome-wide marker-based models that predict the genetic value of progeny, will be particularly valuable for early selection in tree breeding programs. However, GS may also be highly valuable for identifying mating designs that generate progeny with optimal allelic combinations for superior growth, wood properties, and adaptive capabilities.

Methods: To address the limitations of recent association studies in forest tree species, we have adopted an approach that combines targeted sequence capture with high-throughput sequencing to genotype eastern cottonwood (Populus deltoides) and loblolly pine (Pinus taeda) populations. To identify genetic polymorphisms that regulate biomass productivity, wood quality, and disease resistance, we have optimized methods for sequence capture of targeted regions of the genome for unbiased, high-throughput and low-cost recovery of coding and regulatory sequences. This establishes the foundation for GWAS in both species, addressing a critical limitation of these studies in tree genomes, i.e. sampling of regulatory variants and rare alleles.

While the work described above resolves part of the challenges of association studies, GWAS will be most useful for discovery of genes to be targeted for genetic modification, rather than for MAB. This is the case because the fraction of the total genetic variation explained by association studies is likely to remain small, considering the limitations of the existing populations. In an effort to implement marker-based breeding, we recently completed the first assessment of the utility of GS in an experimental population of loblolly pine, where we explored the contribution of factors such as age of model estimation and site location to the accuracy of prediction models. Furthermore, the incorporation of non-additive effects into prediction models has been evaluated for improvement of their accuracy.

Results and conclusions: Two association studies are currently underway, in loblolly pine and Populus deltoides, in which a large fraction of the coding and regulatory sequences of their genomes is being re-sequenced for SNP detection. Existing results from sequence capture indicate that the majority of targeted regions can be effectively captured, even in genomes of very high complexity, such as that of loblolly pine. In parallel, we have now verified the suitability of applying genomic selection to a breeding population of loblolly pine. Estimates generated for prediction models developed for this population indicate that accuracies comparable to those of traditional phenotypic selection can be obtained. Therefore, considering the significant reduction in breeding cycle length due to early identification of elite genotypes, the selection response per unit of time of genomic selection is almost twice as high as that of traditional breeding. By combining GS with advanced methods of vegetative propagation, breeding and seedling production, a breeding cycle can be reduced from decades to less than 5 years. While early selection of superior genotypes is an obvious application of GS to tree improvement programs, prediction models can also be utilized to guide the mating design of future breeding cycles to favor stacking of favorable alleles over multiple generations. Towards this end, we have initiated crosses aimed at generating families that are predicted to have exceptional adaptation, as well as biomass growth and quality properties for bioenergy, pulp and paper, and timber production.
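The efficiency argument can be made concrete with the standard expression for expected genetic gain per year, gain = i × r × σA / L, where i is the selection intensity, r the selection accuracy, σA the additive genetic standard deviation and L the breeding cycle length in years. The sketch below is an illustration only: the function name is hypothetical and the numeric values are arbitrary assumptions chosen to show the arithmetic, not results from this study.

```python
def gain_per_year(intensity, accuracy, sigma_a, cycle_years):
    """Expected genetic gain per year: i * r * sigma_A / L."""
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    return intensity * accuracy * sigma_a / cycle_years


# Hypothetical values, for illustration only (not taken from the study):
# same intensity and accuracy, but the genomic cycle is half as long.
traditional = gain_per_year(intensity=1.75, accuracy=0.7, sigma_a=1.0, cycle_years=12)
genomic = gain_per_year(intensity=1.75, accuracy=0.7, sigma_a=1.0, cycle_years=6)
relative_efficiency = genomic / traditional
```

With equal accuracy and intensity, the ratio depends only on the two cycle lengths, which is why comparable accuracies combined with a shorter cycle translate directly into a higher response per unit of time.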

Acknowledgements: This work was supported by the University of Florida Genetics and Genomics Graduate Program, the Plant Molecular and Cellular Biology Graduate Program, the Forest Biology Research Cooperative (FBRC), the US Department of Agriculture National Institute of Food and Agriculture Plant Breeding and Education Program (award no. 2010-85117-20569) and the National Science Foundation (award no. DBI-0501763).

I8

Genomic selection in loblolly pine - from lab to field

Fikret Isik*, Ross Whetten, Jaime Zapata-Valenzuela, Funda Ogut, Steven McKeand

Cooperative Tree Improvement Program, North Carolina State University, Raleigh, USA

E-mail: fisik@ncsu.edu

BMC Proceedings 2011, 5(Suppl 7):I8

Background: Tree breeding is logistically complex and expensive, and breeders have long sought to use molecular markers to accelerate breeding. A candidate gene approach, based on testing for association between DNA sequence variation in or near candidate genes and phenotypic variation in a population, has long been explored [1,2]. However, the candidate gene (QTL) approach has not been successful in breeding [3,4]. QTL-trait associations detected in one genetic background are often not observed in other families, because of recombination during segregation and low levels of linkage disequilibrium in the population. A new technology called genomic selection (GS) is revolutionizing dairy cattle breeding. In GS, marker effects are first estimated in a large training population (>500) with both phenotypic and genotypic data. Subsequently, estimated marker effects are used to predict breeding values in validation populations for which marker genotypes but not phenotypes are available [4]. Several dairy cattle breeding companies now routinely use GS to select and market bulls. The success of GS in cattle breeding is largely based on bovine genome sequencing and the discovery of thousands of SNP markers. GS, if successfully applied, will have a great impact on forest tree breeding because tree breeding programs are complex and logistically difficult. Although there have been several simulation studies examining the effects of effective population size, linkage disequilibrium, and heritability on the predicted accuracy of GS in tree breeding [5], GS has not yet been demonstrated for forest trees using empirical marker data, mainly due to the lack of sufficiently dense markers.

Methods: Biallelic SNP markers provided by the CTGN project (http://dendrome.ucdavis.edu/ctgn/) were used for genotyping. A population of 149 cloned full-sib offspring of loblolly pine (Pinus taeda L.) was phenotyped. Fitting 3406 informative SNP markers simultaneously, we estimated genome-wide breeding values and compared them with breeding values based on a pedigree model. Variances explained by the marker additive and dominance effects were obtained.
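The abstract does not name the statistical model used to fit all markers simultaneously (the Gibbs sampler mentioned in the Results suggests a Bayesian method). As a stand-in, the following sketch uses ridge regression, a common baseline in the GS literature, to show the two steps of GS: estimating marker effects in a training set, then predicting individuals that have genotypes only. The shrinkage parameter lam, the function names and the toy data are all assumptions made for illustration.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_marker_effects(genotypes, phenotypes, lam):
    """Ridge estimates of marker effects: solve (Z'Z + lam*I) u = Z'(y - mean)."""
    n, p = len(genotypes), len(genotypes[0])
    mean_y = sum(phenotypes) / n
    col_means = [sum(row[j] for row in genotypes) / n for j in range(p)]
    z = [[row[j] - col_means[j] for j in range(p)] for row in genotypes]  # centred
    yc = [y - mean_y for y in phenotypes]
    ztz = [[sum(z[i][a] * z[i][b] for i in range(n)) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    zty = [sum(z[i][a] * yc[i] for i in range(n)) for a in range(p)]
    return mean_y, col_means, solve_linear(ztz, zty)


def predict_values(genotypes, mean_y, col_means, effects):
    """Predict values for genotyped individuals that have no phenotype."""
    return [mean_y + sum((g - c) * u for g, c, u in zip(row, col_means, effects))
            for row in genotypes]


# Toy training set: 4 trees, 2 SNPs coded as allele dosage (hypothetical data)
train_geno = [[0, 1], [1, 2], [2, 0], [1, 1]]
train_pheno = [8.0, 9.0, 13.0, 10.0]
mean_y, col_means, effects = fit_marker_effects(train_geno, train_pheno, lam=1.0)
predicted = predict_values([[2, 2]], mean_y, col_means, effects)
```

A positive lam shrinks all marker effects towards zero, which is what makes the system solvable when markers far outnumber phenotyped trees, as with 3406 SNPs and 149 individuals. Prediction accuracy would then be assessed as the correlation between predicted and reference breeding values.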

Results: The accuracy of the genomic estimated breeding values ranged from 0.30 to 0.83 for growth and wood quality traits. Lignin and cellulose content had greater accuracy values from GS than growth traits. The accuracies were comparable with those of breeding values calculated from the traditional pedigree model. If we take into account the time needed to complete progeny testing, GS would be more efficient than classical progeny testing for some traits. The marker additive effects explained 18% and 23% of the variance for lignin and cellulose, respectively. Variances could not be determined for height and volume, because the Gibbs sampler failed to converge, even after five million iterations. We speculate that the accuracies observed in this study trace familial linkage rather than historical LD with trait loci, because of the small population size and relatively deep pedigrees. The markers are sampling the haplotypes and thus constructing the pedigrees rather than explaining phenotypic variance. Nevertheless, the results are promising, and we expect that with decreases in genotyping cost, GS has the potential to fundamentally change tree breeding in the near future.

Challenges of GS applications in tree breeding: Despite promising results from some early work based on empirical data, there are some challenges to overcome before GS can be used routinely in tree breeding. Conifer genome sizes range between 18,000 and 40,000 Mbp [6]. Their populations have low levels of LD, which decays rapidly. LD in loblolly pine decays to less than r² = 0.25 within 2000 bp [7]. Low LD is due to genetic recombination over the evolutionary history of the species and causes inconsistency of QTL-marker associations. Large genome size and historically low LD mean that large numbers of dense markers are required to explain a considerable amount of phenotypic variation in complex traits.
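For readers unfamiliar with the r² measure of LD quoted above, the following minimal sketch computes it for two biallelic loci from phased haplotypes, using D = pAB - pA × pB and r² = D² / (pA(1 - pA) pB(1 - pB)). The 0/1 allele coding, the function name and the example haplotypes are assumptions made for illustration.

```python
def ld_r2(haplotypes):
    """Squared allelic correlation (r^2) between two biallelic loci.

    haplotypes: list of (a, b) pairs, alleles coded 0/1 at locus A and locus B.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


# Hypothetical samples of four haplotypes
complete_ld = ld_r2([(1, 1), (1, 1), (0, 0), (0, 0)])
no_ld = ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)])
```

Plotting such r² values against the physical distance between SNP pairs gives the LD decay curve; rapid decay means a marker is informative only about causal variants very close to it.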

Another challenge is the lack of genetic maps in forest trees. With a few exceptions, the genomes of forest trees have not been sequenced, and thus precise locations of SNP markers are lacking, which hinders the use of haplotypes. Using haplotypes reduces the dimensions of the data and thus requires far fewer computing resources for analysis. More importantly, with haplotypes, larger variation between trees can be obtained using allelic combinations, although larger training populations are required to adequately sample the effects of all the haplotypes.

High marker genotyping cost is the major obstacle to applications of GS in forest trees. More cost-efficient genotyping technologies, such as genotyping-by-sequencing and restriction digestion, are being explored to reduce the cost of markers. On the other hand, advances in computing power have made it possible to analyze large amounts of complex data, but bioinformatics challenges remain in analyzing sequence data and calling SNP markers.

Further research is needed on the development of training models and the calibration of prediction models. The number of generations over which statistical models can be used before losing accuracy remains to be determined in forest trees. Another question is the validity of models across different populations. In cattle breeding, lower accuracies of GS for dairy versus beef cattle remain a challenge. For some tree species, GxE interaction could be an issue to be addressed; a marker-trait association observed in one population may not hold in another environment.

Conclusions: We are currently working on constructing a realized genomic relationship matrix based on SNP markers for use in predicting breeding values. This method provides flexibility in terms of fitting common environmental effects in mixed models. We expect that decreases in marker genotyping costs will make GS in pine breeding feasible in the near future. Our group will work on pilot projects with forestry companies in the southern US, and plans are underway to revise breeding strategies to incorporate genomic selection.
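A realized genomic relationship matrix can be built in several ways, and the abstract does not say which one is intended. The sketch below assumes one widely used construction (VanRaden's first method), G = WW' / (2 Σ pj(1 - pj)), where W holds allele dosages centred by twice the allele frequency pj. Estimating the frequencies from the sample itself, the function name and the toy genotypes are assumptions made for illustration.

```python
def genomic_relationship_matrix(dosages):
    """Realized relationship matrix G from SNP allele dosages (0, 1, 2).

    dosages: one row per individual, one column per SNP.
    Allele frequencies are estimated from the sample itself (an assumption).
    """
    n, m = len(dosages), len(dosages[0])
    freqs = [sum(row[j] for row in dosages) / (2 * n) for j in range(m)]
    scale = 2 * sum(p * (1 - p) for p in freqs)
    if scale == 0:
        raise ValueError("all markers are monomorphic")
    # Centre each dosage by its expected value 2p
    w = [[row[j] - 2 * freqs[j] for j in range(m)] for row in dosages]
    return [[sum(w[i][k] * w[l][k] for k in range(m)) / scale for l in range(n)]
            for i in range(n)]


# Three hypothetical trees genotyped at two SNPs
g = genomic_relationship_matrix([[0, 2], [2, 0], [1, 1]])
```

In a mixed model, G takes the place of the pedigree-based relationship matrix, so that breeding values are predicted from the observed rather than the expected sharing of alleles between relatives.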

References

1. Tabor HK, Risch NJ, Myers RM: Candidate-gene approaches for studying complex genetic traits: practical considerations. Nat Rev Genet 2002, 3:391-397.

2. Hayes BJ, Bowman PJ, Chamberlain AJ, Goddard ME: Invited review: Genomic selection in dairy cattle: progress and challenges. J Dairy Sci 2009, 92:433-443.

3. Goddard ME, Hayes BJ: Genomic selection. J Anim Breed Genet 2007, 124:323-330.

4. Meuwissen THE, Hayes BJ, Goddard ME: Prediction of total genetic value using genome wide dense marker maps. Genetics 2001, 157:1819-1829.

5. Grattapaglia D, Resende M: Genomic selection in forest tree breeding. Tree Genetics and Genomes 2010, 7(2):241-255.

6. Morse AM, Peterson DG, Islam-Faridi MN, Smith KE, Magbanua Z: Evolution of genome size and complexity in Pinus. PLoS ONE 2009, 4(2):e4332, doi:10.1371/journal.pone.0004332.

7. Brown GR, Gill GP, Kuntz RJ, Langley CH, Neale DB: Nucleotide diversity and linkage disequilibrium in loblolly pine. PNAS 2004, 101:15255-15260.

S4. REPRODUCTION, GROWTH AND DEVELOPMENT

I9

Salicylate metabolism in Populus

Chung Jui Tsai1*, Wenbing Guo1, Benjamin Babst1, Batbayar Nyamdari1, Yinan Yuan2, Raja Payyavula1, Han-Yi Chen1, Xue Liangjiao1, Kate Tay1, Vanessa Michelizzi1, Scott Harding1

1University of Georgia, USA;2Michigan Technological University, USA E-mail: cjtsai@uga.edu

BMC Proceedings 2011, 5(Suppl 7):I9

Phenolic metabolites that contain salicylic acid (SA)-like moieties are major non-structural constituents in Populus leaves, shoots and roots. These so-called phenolic glycosides (PGs) are taxonomically limited to the Salicaceae, where they are known to mitigate insect and animal herbivory. SA itself is an important signaling molecule in plant defense and abiotic stress responses, and is derived from the isochorismate or phenylpropanoid pathways, depending on the species [1,2]. Hyper-accumulation of SA reduces growth in Arabidopsis [3], while high levels of PG have been associated with reduced growth in Populus [4]. Although the common PGs, salicin, salicortin, and their derivatives, contain a salicyl moiety, a direct metabolic relationship between PG and SA in Populus has not been shown. We have therefore been using stable isotope tracers, cell culture feeding, functional genomics and transgenic perturbation to probe the relationship between SA, PG and growth in Populus.

Stable isotope incorporation confirmed a hydroxycinnamate-benzoate origin of the salicyl moiety of PGs, and provided the first empirical evidence that the bioactive hydroxycyclohexenone moiety of salicortin is also a phenylpropanoid derivative [5]. Therefore it is unlikely that PG biosynthesis involves the salicylic acid pathway. To further address any possible involvement of the isochorismate pathway in PG biosynthesis, isochorismate synthase (ICS) was characterized. ICS is present as a single gene in most sequenced plant genomes, but is duplicated in Arabidopsis. In Arabidopsis, chorismate-derived SA is the obligatory route for defense signaling, mediated by AtICS1 in response to various biotic and abiotic cues. In contrast, the single-copy Populus ICS is not stress-inducible, and is involved in biosynthesis of phylloquinone for photosynthetic electron transport [2]. We found no evidence of ICS involvement in SA or PG biosynthesis in transgenic Populus, pointing to lineage-specific evolution of the ICS-derived SA pathway in Arabidopsis.

We transformed Populus with bacterial genes for the biosynthesis and degradation of SA for a more in-depth investigation of a possible interaction between SA and PG regulation. SA-hyperaccumulating and SA-deficient lines were generated by expressing a salicylate synthase and a salicylate hydroxylase, respectively. SA was converted to SA-glycoside (SAG) and gentisic acid-glucoside (GAG) in the hyperaccumulating lines. SAG and GAG were very low in wild-type and were not detected in the SA-deficient lines. Despite these clear differences, no changes in PG levels were detected. These results argue against SA as a potential precursor of PGs.

Survival and establishment of rooted cuttings were negatively affected by SA-hyperaccumulation. Once established, however, growth rates were similar among plant lines, in sharp contrast to SA-over-producing Arabidopsis. SA-hyperaccumulating lines exhibited altered thermal tolerance, based on electrolyte leakage assays and metabolite profiling.
