David CROS, Florence JACOB, Achille NYOUMA, Billy TCHOUNKE, Dadang AFANDI, Indra SYAHPUTRA, Benoit COCHARD


Advances in oil palm genomic selection


1/8 - Principle of genomic selection

TRAINING POPULATION: phenotyping + high density genotyping → prediction models, e.g. $y_i = \mu + \sum_{j=1}^{n} Z_{ij}\,\hat{m}_j + e_i$, …

VALIDATION POPULATION (related to the training population): high density genotyping → validated model

APPLICATION POPULATION (selection candidates): high density genotyping + validated model → (pre-)selection on GEGV/GEBV
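The training / validation / application workflow can be sketched in a few lines. This is only an illustration on simulated data, not the authors' pipeline: the marker model is fitted by ridge regression (an RR-BLUP-like estimator), and the population sizes, the shrinkage parameter `lam` and the helper names `fit_marker_effects` and `predict_gebv` are all assumptions made for the example.

```python
import numpy as np

def fit_marker_effects(Z, y, lam):
    """Ridge-regression estimates for y_i = mu + sum_j Z_ij m_j + e_i."""
    mu = y.mean()                       # intercept taken as the phenotypic mean
    col_means = Z.mean(axis=0)
    Zc = Z - col_means                  # centre marker codes on the training set
    k = Z.shape[1]
    m_hat = np.linalg.solve(Zc.T @ Zc + lam * np.eye(k), Zc.T @ (y - mu))
    return mu, col_means, m_hat

def predict_gebv(Z, mu, col_means, m_hat):
    """Genomic estimated value from genotypes only."""
    return mu + (Z - col_means) @ m_hat

# Simulated genotypes (0/1/2) and phenotypes with heritability around 0.5 (hypothetical).
rng = np.random.default_rng(0)
n_train, n_valid, n_cand, n_markers = 400, 100, 30, 100
Z = rng.integers(0, 3, size=(n_train + n_valid + n_cand, n_markers)).astype(float)
g = Z @ rng.normal(0.0, 0.1, n_markers)
y = g + rng.normal(0.0, g.std(), g.size)

train = slice(0, n_train)
valid = slice(n_train, n_train + n_valid)
cand = slice(n_train + n_valid, None)

# 1) train, 2) validate (accuracy = correlation between predictions and phenotypes),
# 3) apply to unphenotyped selection candidates and preselect the best ones.
model = fit_marker_effects(Z[train], y[train], lam=50.0)
accuracy = np.corrcoef(predict_gebv(Z[valid], *model), y[valid])[0, 1]
gebv = predict_gebv(Z[cand], *model)
preselected = np.argsort(gebv)[::-1][:5]
print(f"validation accuracy: {accuracy:.2f}; preselected candidates: {preselected}")
```

In the simulation the candidates' phenotypes exist but are never used, which mimics candidates that are genotyped only.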

2/8 - Information captured by markers

GS models can, depending on trait and population:

– predict the genetic value of unevaluated selection candidates, with prediction accuracies ranging from 0.14 to 0.73 for various yield components

– capture genetic differences within full-sib families and between families, enabling the selection of the best individuals of the best families

(Cros et al. 2015b, 2017; Kwong et al. 2017a, b)

3/8 - Molecular data

Initial empirical validations with SSRs → high throughput SNP genotyping

• SSRs are suitable for validation, not for practical application (Cros et al. 2015b; Marchal et al. 2016)

• High throughput SNP genotyping is required because:

- a large number of individuals must be genotyped

- high marker density maximizes GS accuracy (Kwong et al. 2017b)

SNP genotyping:

→ SNP arrays (Kwong et al. 2016, 2017a, b; Ithnin et al. 2017)

→ genotyping by sequencing (GBS) (Cros et al. 2017; Nyouma et al. 2019b)

• Minimum marker density for GS is affected by marker type, marker sampling, trait and population, but a few thousand SNPs are enough

(Marchal et al. 2016; Kwong et al. 2017a; Cros et al. 2017; Nyouma et al. 2019b)

E.g., in empirical between-site hybrid cross value prediction with GBS:

(Cros et al. 2017, BMC Genomics)


• SNP filtering can reduce marker density, with GS accuracies equal to or higher than with all the SNPs:

- GBS-SNPs with the least missing data (Cros et al. 2017, BMC Genomics)

- SNPs with the highest association scores (rrBLUP-B) (Kwong et al. 2017, Sci Report)
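As an illustration of filtering on missing data, the sketch below keeps the SNPs with the lowest missing-call rate. It is an assumption-laden toy example: the function name `keep_least_missing`, the encoding of missing calls as `np.nan` and the number of SNPs kept are hypothetical, and the cited studies may have used different criteria.

```python
import numpy as np

def keep_least_missing(genotypes, n_keep):
    """Keep the n_keep SNPs (columns) with the lowest proportion of missing calls.

    genotypes: individuals x SNPs array, missing calls encoded as np.nan.
    Returns the filtered matrix and the kept column indices in their original order.
    """
    missing_rate = np.isnan(genotypes).mean(axis=0)
    best = np.argsort(missing_rate, kind="stable")[:n_keep]
    kept = np.sort(best)                # preserve the original SNP order
    return genotypes[:, kept], kept

# Toy matrix: 4 individuals x 4 SNPs, missing rates 0, 0.25, 0.75 and 0.25.
genotypes = np.array([
    [0.0, 1.0, np.nan, 2.0],
    [1.0, np.nan, np.nan, 2.0],
    [2.0, 1.0, np.nan, 0.0],
    [0.0, 1.0, 1.0, np.nan],
])
filtered, kept = keep_least_missing(genotypes, n_keep=2)
print(kept)            # indices of the retained SNPs
print(filtered.shape)  # (4, 2)
```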

4/8 - Training and application populations

• GS accuracy increases with the relationship between training and application individuals (Cros et al. 2015b)

→ implementing GS in full-sibs or progenies of the training individuals will maximize GS efficiency

E.g., % oil in pulp in Deli oil palm (Cros et al. 2015, TAAG): figure of GBLUP accuracy against maximum relationship between training and application individuals
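The "maximum relationship between training and application individuals" can be computed from marker data. The sketch below is a hedged illustration on simulated genotypes: it uses a VanRaden-type genomic relationship matrix with allele frequencies estimated from the sample itself, which is an assumption and not necessarily the estimator used in the cited study; the helper names are hypothetical.

```python
import numpy as np

def genomic_relationship(M):
    """VanRaden-type G matrix from a 0/1/2 genotype matrix (individuals x markers)."""
    p = M.mean(axis=0) / 2.0            # allele frequencies estimated from the sample
    W = M - 2.0 * p
    return W @ W.T / (2.0 * np.sum(p * (1.0 - p)))

def gamete(parent, rng):
    """Random gamete (0/1 per marker) from a 0/1/2 parental genotype, markers unlinked."""
    return rng.binomial(1, parent / 2.0)

rng = np.random.default_rng(1)
n_markers = 2000
freq = rng.uniform(0.1, 0.9, n_markers)
training = rng.binomial(2, freq, size=(8, n_markers)).astype(float)

# Two application individuals: a progeny of training individuals 0 and 1, and an unrelated one.
progeny = gamete(training[0], rng) + gamete(training[1], rng)
unrelated = rng.binomial(2, freq)

M = np.vstack([training, progeny, unrelated])
G = genomic_relationship(M)
train_idx, app_idx = np.arange(8), np.array([8, 9])
max_rel = G[np.ix_(app_idx, train_idx)].max(axis=1)
print(max_rel)  # the progeny shows a much higher maximum relationship than the unrelated individual
```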

5/8 - Statistical methods of predictions

A wide range of statistical methods has been applied for oil palm GS, and the choice of method did not significantly affect GS accuracy (although BayesB and SVM could be slightly better in some cases) (Cros et al. 2015b; Kwong et al. 2017a, b; Ithnin et al. 2017)

→ GBLUP and RR-BLUP (widely used for GS predictions due to their simplicity and computational efficiency) are suitable for oil palm
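GBLUP and RR-BLUP are two formulations of the same ridge-type model, which a small numerical check can show. The sketch below is a simplified illustration: the intercept is taken as the phenotypic mean, the relationship matrix is simply ZZ'/k, and the shrinkage parameter `lam` is a hypothetical value rather than an estimated variance ratio.

```python
import numpy as np

def rrblup_predict(Z_train, y_train, Z_new, lam):
    """RR-BLUP form: estimate one effect per marker, then sum effects for new genotypes."""
    mu = y_train.mean()
    k = Z_train.shape[1]
    m_hat = np.linalg.solve(Z_train.T @ Z_train + lam * np.eye(k),
                            Z_train.T @ (y_train - mu))
    return mu + Z_new @ m_hat

def gblup_predict(Z_train, y_train, Z_new, lam):
    """GBLUP form: work with relationships between individuals instead of marker effects."""
    mu = y_train.mean()
    n, k = Z_train.shape
    G_tt = Z_train @ Z_train.T / k      # relationships among training individuals
    G_nt = Z_new @ Z_train.T / k        # relationships between new and training individuals
    alpha = np.linalg.solve(G_tt + (lam / k) * np.eye(n), y_train - mu)
    return mu + G_nt @ alpha

rng = np.random.default_rng(2)
n_train, n_new, k = 60, 10, 300
Z = rng.integers(0, 3, size=(n_train + n_new, k)).astype(float)
Z -= Z[:n_train].mean(axis=0)           # centre on the training set
y = Z[:n_train] @ rng.normal(0.0, 0.1, k) + rng.normal(0.0, 1.0, n_train)

pred_rr = rrblup_predict(Z[:n_train], y, Z[n_train:], lam=100.0)
pred_g = gblup_predict(Z[:n_train], y, Z[n_train:], lam=100.0)
print(np.allclose(pred_rr, pred_g, atol=1e-6))
```

The GBLUP form solves a system of size n (individuals) instead of k (markers), which is convenient when markers far outnumber individuals.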

6/8 - Data modeling

Prediction approaches, genotypes and data records used so far:

Independent GS predictions in each parental group

- genotypes: parents

- data records: parent performances in hybrid crosses (GCAs, testcross phenotypic means) or parent phenotypes

(Cros et al. 2015b; Wong and Bernardo 2008; Ithnin et al. 2017; Kwong et al. 2017b)

Joint GS predictions in the 2 parental groups

- genotypes: hybrids or parents

- data records: hybrid phenotypes

- using or not using the parental origin of alleles

(Cros et al. 2015a, 2017, 2018; Marchal et al. 2016; Kwong et al. 2017a; Nyouma et al. 2019b)

→ Comparison of these different modeling approaches is lacking

7/8 - Annual genetic progress

accuracy (r) = key parameter to evaluate GS, as directly related to the annual genetic progress:

$\text{Annual response to selection} = \dfrac{i \times r_{\hat{a},a} \times \sigma_a}{L}$

…but comparing GS and conventional phenotypic selection must take into account r, i (selection intensity) and L (generation interval)

→ Simulations showed GS could largely increase annual selection progress by decreasing the length of the breeding cycles and by increasing the selection intensity (Wong and Bernardo 2008, Cros et al. 2015a, Cros et al. 2018)
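A short numerical sketch shows how i, r and L trade off in the annual response formula. All numbers below are hypothetical and chosen only for illustration; the selection intensity is derived from the selected proportion under the usual assumption of truncation selection in a large, normally distributed population.

```python
from statistics import NormalDist

def selection_intensity(p):
    """Selection intensity i for truncation selection of a proportion p."""
    z = NormalDist().inv_cdf(1.0 - p)   # truncation point on the standard normal scale
    return NormalDist().pdf(z) / p

def annual_response(i, r, sigma_a, L):
    """Annual response to selection = i * r * sigma_a / L."""
    return i * r * sigma_a / L

# Hypothetical scenarios (not taken from the cited studies):
# phenotypic scheme: high accuracy, few candidates, long cycle
conventional = annual_response(selection_intensity(0.20), r=0.9, sigma_a=1.0, L=20)
# genomic scheme: lower accuracy, more candidates, shorter cycle
genomic = annual_response(selection_intensity(0.05), r=0.5, sigma_a=1.0, L=10)

print(f"i for 10% selected: {selection_intensity(0.10):.3f}")
print(f"conventional: {conventional:.3f} sigma_a/year; genomic: {genomic:.3f} sigma_a/year")
```

With these made-up values the genomic scheme progresses faster despite its lower accuracy, because the higher selection intensity and shorter generation interval more than compensate.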

RRS:

• 120 candidates / population / generation

• progeny tests every generation

RRGS2:

• Calibration:

- genotypes = progeny-tested individuals

- phenotypes = hybrid individuals

• 300 candidates

• progeny tests 1 generation / 2

RRGS1:

• Calibration:

- genotypes = idem RRGS2 + hybrids

- phenotypes = hybrid individuals

• 300 candidates

• progeny tests 1 generation / 4

(figure annotations: +72%, +45%)

Cros et al. 2015, BMC Genomics

Very promising simulation results…

but empirical studies showed GS can also have low accuracy (<0.2) and/or fail to capture within-family genetic differences in some traits / populations (Cros et al. 2015b, 2017, Ithnin et al. 2017, Nyouma et al. 2019)

→ field evaluations remain necessary in all cycles

→ current possible practical application = genomic preselection prior to field trials (progeny tests, clonal trials):

Current method for preselecting parents of hybrid crosses before progeny tests = phenotypic preselection on 1 or 2 traits with high h²

… but in an empirical dataset, FFB in the hybrid crosses could have been more than 10% higher with genomic preselection in the parental populations prior to progeny tests (thanks to higher selection intensity) (Cros et al. 2017, BMC Genomics)

→ new possible method = preselection on more yield components

Current method for preselecting hybrid individuals before clonal trials = phenotypic preselection on 1 or 2 traits with high H²

…but for most of the oil yield components, GS models better estimate the individual genetic values than conventional phenotyping (Nyouma et al., in prep)

→ New possible methods = preselection on all yield components / early GS preselection

Oil palm GS simulations showed faster increase in inbreeding in the parental populations than conventional breeding (possibly detrimental for seed production and long term progress)

→ better to combine GS and inbreeding management:

Simulation results, Tchounke et al. (in prep), comparing three strategies:

- no inbreeding management

- max selected La Mé per full-sib family = 1

- mate selection

Mate selection reduces inbreeding in parents, but also increases genetic progress (by optimizing matings between selected individuals)
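The simplest of the inbreeding-management rules above, a cap on the number of individuals selected per full-sib family, can be sketched as follows. This is a toy illustration with made-up candidates and GEBVs; it is not the simulation code of the cited work and it does not implement mate selection, which also optimizes the matings between the selected individuals.

```python
from collections import defaultdict

def select_with_family_cap(candidates, n_select, max_per_family=1):
    """Pick the best candidates by GEBV, taking at most max_per_family per full-sib family.

    candidates: list of (candidate_id, family_id, gebv) tuples.
    """
    selected = []
    per_family = defaultdict(int)
    for cand_id, family, _gebv in sorted(candidates, key=lambda c: c[2], reverse=True):
        if len(selected) >= n_select:
            break
        if per_family[family] < max_per_family:
            selected.append(cand_id)
            per_family[family] += 1
    return selected

# Hypothetical candidates: (id, full-sib family, GEBV)
candidates = [
    ("A1", "F1", 1.9), ("A2", "F1", 1.8),
    ("B1", "F2", 1.2), ("B2", "F2", 0.5),
    ("C1", "F3", 0.7),
]
capped = select_with_family_cap(candidates, n_select=3, max_per_family=1)
uncapped = select_with_family_cap(candidates, n_select=3, max_per_family=3)
print(capped)    # one individual per family
print(uncapped)  # plain truncation selection on GEBV
```

The cap trades some immediate gain (A2 is skipped) for a broader set of selected families.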

8/8 - Conclusions

GS has the potential to speed up oil palm genetic progress to an unprecedented level by:

- allowing better preselection before progeny tests and clonal trials (more traits can be submitted to GS than to phenotypic preselection)

- allowing earlier preselection before progeny tests and clonal trials (at the nursery stage)

- predicting the genetic value of (many) more selection candidates

- (in the future?) avoiding field trials in some cycles

Further studies will optimize oil palm GS:

- comparing the GS modeling approaches

- optimizing the design of training and application populations

- making predictions for new traits (e.g. resistance to diseases)

- modeling G × E interactions

- using multi-omics data

- quantifying variations of GS accuracy between families

- making inter-specific predictions

- comparing imputation methods (for GBS)

Etc.


For more information, see the PIPOC article and Nyouma et al. 2019, TGG (and 2020 articles!).

Pobè, Benin; Yaoundé, Cameroon; Montpellier, France; Indonesia


Thanks for your attention