HAL Id: hal-00894195
https://hal.archives-ouvertes.fr/hal-00894195
Submitted on 1 Jan 1998
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
On the relevance of three genetic models for the
description of genetic variance in small populations
undergoing selection
Florence Fournet-Hanocq, Jean-Michel Elsen
To cite this version:
Original
article
On
the relevance of three
genetic
models
for
the
description
of
genetic
variance
in small
populations undergoing
selection
Florence
Fournet-Hanocq
Jean-Michel
Elsen
Station d’amélioration
génétique
des animaux, Institut national de la rechercheagronomique,
31326 Castanet-Tolosancedex,
France(Received
15 October 1996;accepted
15 December1997)
Abstract - The conservation of
genetic variability
isrecognized
as a necessaryobjective
forthe
optimization
of selectionschemes, particularly
whenpopulations
are small. Numerousmodels, differing by
thegenetic
modelthey rely
on, are available to better understand andpredict
the evolution ofgenetic
variance in a smallpopulation undergoing
selection.This paper compares three
genetic models,
treated eitheranalytically
or with Monte-Carlosimulations,
first in order to validate thepredictions provided by
a ’full-finite model’ forwell-known
phenomena
(e.g.
the effect ofpopulation
management ongenetic
variability),
and
second,
to evaluate when and how theassumptions
made in the twoanalytical
models induce thedeparture
from the third model. The FFM isshown, first,
to be in close agreement with the Gaussiantheory
when used with alarge
number ofloci,
thestochastic
approach making
it much more flexible than the twoalgebraic
models. In the second part of thestudy,
the infinitesimal model appears to be more robust than thesemi-infinitesimal one.
Major
sources ofdiscrepancy
between the deterministic models and theFFM are
identified, notably
thehypothesis
ofindependence
betweenloci,
and then the infinite number of loci or alleles per locus.©
Inra/Elsevier,
Parismodelling
/
genetic variance/
selection/
small population*
Correspondence
andreprints
Résumé - De l’intérêt de trois modèles génétiques pour la description de la variance
génétique dans les petites populations sélectionnées. La conservation de la variabilité
génétique
est reconnue comme unobjectif
nécessaire pourl’optimisation
des schémas desélection,
notamment pour lespetites populations.
De nombreuxmodèles,
différant par leshypothèses
de déterminismegénétique
surlesquels
ils reposent, sontdisponibles
pourune meilleure
compréhension
et une meilleureprédiction
de l’évolution de la variabilitégénétique
dans unepetite population
soumise à sélection. Cet article compare trois modèlesgénétiques différents,
traités soit par la voieanalytique
soit par des simulationsMonte-Carlo,
d’une part pour valider lesprédictions
fournies par un « modèle finicomplet
»pour des
phénomènes
connus(comme
l’influence de lagestion
de lapopulation
sur ladans les deux modèles
analytiques
induisent un écart avec le dernier modèle. Le modèlefini
complet apparaît,
dans unpremier
temps, en bon accord avec la théoriegaussienne
quand
il est utilisé avec ungrand
nombre delocus, l’approche stochastique
le rendant deplus beaucoup plus souple
que les deux modèlesalgébriques.
Dans la secondepartie
del’étude,
le modèle infinitésimalapparaît plus
robuste que le modèle semi-infinitésimal. Dessources
majeures
d’écart entre les modèles déterministes et le modèle finicomplet
sontidentifiées,
notammentl’hypothèse
del’indépendance
entre leslocus, puis
le nombre infini de locus et d’allèles par locus.©
Inra/Elsevier,
Parismodélisation
/
variance génétique/
sélection/
petite population1. INTRODUCTION
Genetic
variability
is necessary toprovide genetic
progressthrough
selection and the conservation ofgenetic variability
is ofincreasing
concern in theoptimization
of selection
schemes,
particularly
when selectedpopulations
are small. Numerousexperiments
[1, 6]
have shown a decrease ingenetic
variability
overtime,
due togenetic
drift andselection,
until the exhaustion of variance in many cases.Therefore,
thorough knowledge
of the evolution ofgenetic
variance in relation to these twophenomena
isimportant
whenconstructing
optimal
selection schemes.Several kinds of models are available to describe the evolution of
genetic
variability, depending
on thehypotheses
usedconcerning
genetic
determinism.Analytical
models,
whoseproperties
oftenrely
on thehypothesis
ofnormality,
provide
aquite
simple
formalization of thephenomena acting
ongenetic
variance.Monte-Carlo simulations allow a more detailed and
complex description
ofpolygenic
inheritance.
This paper aims to compare the
predictions
ofgenetic variability
in a smallpopulation
undergoing long-term
selectionprovided
by
three differentgenetic
models,
two of them treatedanalytically
and the last with Monte-Carlo simulations. Thiscomparison
concernsessentially
theirsensitivity
to the characteristics of thegenome. The
genetic
modelsunderlying
the Monte-Carlo simulations and the twoanalytical
models arepresented
first. Thepredictions provided by
the three models will then becompared
first to validate thegenetic
model used in the MC simulation when itapproaches
the infinitesimalhypothesis,
andsecondly
to determine when theanalytical
modelsdepart
from the ’full-finitemodel’,
used here as reference.2. DESCRIPTION OF THE MODELS
Three
genetic
models arepresented,
two of thembeing
treatedanalytically
andthe last with Monte-Carlo simulations. The Monte-Carlo model will be
developed
first as the mostcomplete
situation,
where the effects ofgenetic
drift and selectionare not
algebraically
formalized butimplicitly
accounted for in the simulation. Theanalytical
models of Verrier et al.[14]
and Chevalet[3]
will then bedescribed,
eachcorresponding
to a different choice ofhypotheses
for thealgebraic
formalization of2.1. The ’full-finite model’
(or FFM)
The
genetic
modelunderlying
the simulation is based on theassumption
thata selected trait is controlled
by
a finite number of linked genesindividually
identified,
located onchromosomes,
with a finite number of alleles per locus.This model is then called the ’full-finite model’ or FFM. Individual
genotypes
aregenerated
according
to the number ofloci,
alleles perlocus,
loci perchromosome,
recombination rates and relative effects of the loci on the selected trait. For a
given
individual,
an environmental value(assumed
to benormally
distributed)
is added to itsgenerated
genotype,
giving
itsphenotypic
value. Awithin-generation
massselection is based on these
phenotypes.
Selectedbreeding
individuals arerandomly
mated and
produce
a newgeneration.
This isperformed
through
the simulation ofmeiosis and
pairing
ofgametes.
Generations are assumed to be discrete.The simulation
algorithm,
initiatedby Hospital
[8]
anddeveloped by
Fournet et al.[7],
uses the Monte-Carloprinciple
andprovides
the mean values and standard deviations forgenetic
mean and variance over time.2.2.
Analytical
modelsThe first model
presented,
establishedby
Verrier et al.[14],
relies on thehypothesis
that the selected trait is controlledby
an infinite number ofindependent
loci,
with identical and small effects. It will then be called in thefollowing
’infinitesimal model’ or IM. Thecomputation
of theinbreeding
coefficient in this model accounts for the effect of selection onfamily
structure(for
moredetails,
see[14]).
The second
model,
developed
by
Chevalet[3]
for a monoeciouspopulation
of Nindividuals,
assumes a finite number of unlinked loci L with an infinite number of alleles perlocus,
and will be called the ’semi-finite’ model or SFM. Thejoint
distribution of the gene effects is then assumed to be multivariate normal. Derivations of the variances of gene effects and covariances between gene effects under selection lead to theprediction
of thejoint
evolutions ofgenetic
andgenic
variances
by
two recurrenceequations.
The number N of individuals has beenreplaced
hereby
the effective size ofpopulation N
e
,
derived from the Latter-Hillequation
[11],
with a Poisson distribution of the number ofoffspring
for eachgenealogical
path, accounting
forgenetic
drift.3. CASES STUDIED
A
polygenic-like
situation with an infinite number of alleles per locus was firstsimulated in the
FFM,
to check if the evolution ofgenetic
variancegiven
by
the three models was similar. This basicsystem
was defined as apopulation
of size N(with
asmany males as
females),
evaluated on their ownperformances
for eachgeneration,
with,
respectively,
25 and 50%
males and females retained. Theheritability
of the selected trait was assumed to be 0.3. A thousandindependent
loci with 500 alleles per locus were simulated in the Monte-Carlo model. Thesensitivity
of the models3.1.
Population
sizeDifferent sizes of candidate
population
N(total
sizes of96,
192 and480,
with asmany males as females for each
size)
were tested. 3.2. Parametersof genetic
determinismIn this
study 1 000, 100,
10 and 2independent
loci were assumed in the FFM and the SFM. Thiscomparison
ofsensitivity
to the number of loci wasperformed
assuming
that all loci were located on one chromosome in the FFM.Here
100,
10 and 2 alleles perlocus,
in the case of 100 or 10loci,
were simulatedin the FFM in order to show the deviation of the SFM from the FFM when this
parameter
decreases.The recombination rate r was assumed to be
0.5,
0.1,
0.01 and0.001,
in the case of1000,
100 and 10 multiallelic locicontrolling
the selectedtrait,
in order to check the effect oflinkage
on theprediction
ofgenetic
variance over time in the FFM. Theeffect of
linkage
was not studied in the model ofChevalet,
asassuming
ri! !
0.5 for anypair
of loci(i,j)
wouldproduce
as manyequations
as differentpairs
of loci.In the ’full finite’
model,
the relative contributions of the loci to thegenetic
variance of the selected trait were assumed to be identical in the
preceding
comparisons.
To test the robustness of the results withrespect
to thisassumption,
andfollowing
Lande andThompson
(1990),
variances were assumed to follow ageometric series,
with the lth locuscontributing
V7l(l —a)a!B
where the constanta determines the relative
magnitude
of the contributions of each locus. This constantis related to the effective number of loci as:
L
e
=(1
+a)/(1 - a).
Simulations wereperformed
with 1 000 biallelicloci,
with effectsfollowing
ageometric
series where theparameter
a wasgiven
valuescorresponding
to2, 5, 10, 50
and 100 effective loci. Theresulting
evolutions ofgenetic
variability
werecompared
to thecorresponding
curves of loci with identical contributions.Thirty
generations
of selection were simulated. A hundred simulations wereperformed
for each combination of factors. With as manysimulations,
a differencejust
higher
than 5%
between thepredictions
of the different models would besignificant
(at
the 5%
level).
A 10%
difference was then considered assignificantly
larger.
4. RESULTS
The
predictions provided by
the three models for thejoint
effects ofpopulation
size and selectionintensity
on the evolution ofgenetic
variance werecompared.
Figure
1 illustrates the case of intermediate value of selectionintensity
(i.e.
25%
of males
selected),
as the trend was the same for the other two tested values(50
%
and 6.25%
of malesselected).
It can be
pointed
out that the three modelsprovided
almost the same evolution ofgenetic variance,
whatever thepopulation
size and selectionintensity.
Asexpected
[4],
thehigher
the selectionintensity,
thegreater
the influence of thepopulation
predictions given
by analytical
and stochastic models for well-knownphenomena
allowed validation of the
genetic
model used for the MC simulations. 4.1. Effect of number of lociTable I
presents
the values ofRV[30]
provided by
the three models whendecreasing
the number of loci L(with
multiallelic loci in theFFM).
RV
[30
]
in the IMdeparted
by
more than 10%
from the FFMprediction
for studied values of L lower than250,
while thepredictions
of SFM and FFM were inquite
good
agreement
for more than 50 loci.Although
the semi-infinitesimal model accounts for a finite number ofloci,
itssensitivity
to thisparameter
isquite
weak: the difference between IM and SFM exceeds 10%
only
when the number of loci considered in the SFM is lower than 10. This behaviour must be related to thehypothesis
of infinite number of alleles per locus. Results of FFMpresented later,
where the effect of the number of loci decreases whenincreasing
the number of alleles perlocus,
areconsistent with this observation.
Figure 2
illustrates the evolution ofRV’
I
overtime for
1 000, 100,
10 or 2 loci. For 100loci,
IM and SFMdepart
from FFMonly
after 20generations,
while for 10loci, they
bothdepart
from FFM asearly
asgeneration
7 and with 2loci,
IM and SFMdepart
from FFM aftergenerations
3 and6,
respectively.
The additivegenetic
variance in FFM decreaseddramatically
However,
the infinitesimal model remainsquite
robust solong
as the number of loci assumed is not too small.4.2. Effect of the number of alleles per locus
since 100 alleles per locus for L = 10. A combination of the two variables
indicating
the amount of available
variability
influenced the behaviour of thegenetic
variance overtime,
more than did each variable alone: at agiven generation,
1000 biallelic loci and 100 loci with 50 alleles per locusprovided
the sameRV’
I
(results
notshown).
4.3. Effect
of linkage
between lociThe decrease in
genetic
variance whenincreasing
thelinkage
between loci washigher
when the number of loci was lower. For 1000 multiallelicloci, only
verystrong
linkage
(r
=0.001)
gave a
significant
decrease in thepredicted RV[30]
in theFFM
(38
%
lower than for r =0.5).
For 100(figure
4)
or 10 multiallelicloci,
nodifference was observed between the absence of
linkage
and a recombination rate of0.1,
but below r =0.01,
stronger
linkage
led to ahigher
decrease ingenetic
variance in theearly generations
and a lowerasymptotic
plateau
in the latter ones.In
fact,
alarge
number of linked loci behave as a small number ofindependent
loci. To a certain
extent,
the genome size seemed the main factoracting
ongenetic
variance,
inagreement
with Robertson[12]
who showed that the limit ingenetic
response for agiven
recombination rate wasdirectly
linked to the chromosomelength.
Nevertheless,
thecomparison
remained limited as Robertson inferred ageneral description
of the selection process in finitepopulations
only
for alarge
4.4. Effect of
inequality
between loci effectsPredictions of the
genetic
variance for 1 000 biallelic loci whose effects followed ageometric series,
withvarying
parameter
agiving
a numberL
e
of effectiveloci,
andpredictions
for L =L
e
loci with identical effects arecompared
infigure
5. Itcan be seen that the
predictions
for L = x loci of identical effects and for 1 000loci of differential effects
giving L
e
= x effective loci werevery
close,
whatever the value of x. This showed that even 1000 loci could not be considered an infinitesimalgenome, if the
hypothesis
of differential effects of the loci(a
few loci withrelatively
large
effects and many others with smalleffects)
is true[10].
5. DISCUSSION AND CONCLUSION
The first
part
aimed toinvestigate
thevalidity
of thegenetic
modelunderlying
the MCsimulations,
whencompared
toanalytical
modelscorresponding
to differenthypotheses
on thegenetic
determinism. The first result to bepointed
out was that the threemodels,
when alarge
number of loci were assumed for the ’semi-finite’ and ’the full-finite’models,
were in closeagreement,
whatever thepopulation
size and selectionintensity
assumed.Moreover,
thecomputation
of the effectivepopulation
size for the monoecious model of Chevalet seemed to bequite
valid as itprovided
the sameprediction
as the dioecious models. Thispart
verified that under thehypothesis
of a verylarge
number ofindependent
identicalloci,
theoligogenic
modeland the Gaussian
theory
agreeclosely.
These first results validate the use of theFurthermore,
the stochasticapproach intrinsically
takes account of the reduction in selectionintensity
ascompared
with thetheory,
of therelationships
betweenmates and
inbreeding
induced in theoffspring
and of thechanging
variance of gene effects due togenetic
drift and faster or slower fixation of alleles. The need of unknownparameters
to introduce in the model does notprevent
its use, as the other two models also makeassumptions
on the unknownparameters
they
use. And theincreasing
knowledge
onQTLs
implied
in thegenetic
determinism of selected traits willprovide
elements for such amodelling.
In the second
part,
thehypothesis
of an infinite number of locicontrolling
theselected trait was shown to be very
important:
similarprediction
ofRVI’
L
with infinitesimal and ’full-finite’ models was obtainedonly
whensimulating
more than500 loci in the FFM. Under this
value,
thedeparture
of the Verrier et al.[14]
model from the FFM was substantial. One can wonder if such alarge
number of loci(500
or1000)
controlling
one trait is realistic?Tanksley
[13]
suggests
that the number ofQTL implicated
in various traits for alarge
set of animal orvegetable species
should vary between 1 and 18. But famous selectionexperiments
on maize
[5]
expressed genetic
progress over 76generations, suggesting
thatgenetic
variability
is infiniteaccording
to the infinitesimaltheory.
In animalbreeding also,
selectionexperiments
have been based on thistheory
for along
time and noevidence of
important discrepancy
between results andtheory
has been found. This contradictionstrengthens
the value ofestimating
the effective number ofQTL
controlling
a trait. It alsopoints
out the lack ofcomplexity
in the FFMhypotheses:
adding
interactions between genes ormutations,
orlocating
the genes on severalchromosomes,
might
be more realistic and informative.Indeed,
recent studies[2, 9]
investigated
the contribution of mutations as a source of newgenetic
variance inmice.
But,
in ourstudy,
since the other two models did not account for interactionsor for new
genetic
variancearising
frommutation,
thishypothesis
was excluded in this version of FFM.The second
hypothesis
of an infinite number of alleles per locus mayexplain
the
discrepancy
observed between the ’semi-finite’ and the ’full-finite’ models for various numbers of loci. The total amount of availablegenetic
variability,
i.e. the combination between the numbers of loci and alleles perlocus,
is not taken into account in the same way in the two models. Is an infinite number of alleles perlocus more realistic than an infinite number of loci? Mutations
might
lead to ahuge
number of very close allelicforms,
whose effects arenormally
distributed;
on thecontrary,
thelarge potential variability
is reduced because some of themutational allelic forms are not
functional,
and different allelesgive
exactly
thesame
phenotype.
In theend,
a discrete distribution of allelic effects ispossible.
In this case, theprediction provided by
the ’semi-finite’ model holds trueonly
in the shortterm,
with a very slow decrease in variance after the firstgenerations
ofselection,
whereas the ’full-finite’ model shows a cleartendency
to arapid
fixationof available alleles.
The effect of
linkage
was also studiedby Hospital
[8].
Twophases
can be observedin our results when the
linkage
isstrong:
a firstphase
with arapid
decreasein
genetic
variability
and a secondphase
of slow decreasereaching
aplateau.
genetic pool,
with arapid
reduction ingenetic
variability,
andproduces hitch-hiking
phenomena:
unfavourable genes are selectedjointly
with favourables ones and arepoorly
eliminated because of the low recombination rate in the secondphase.
Whenlinkage
is lessintense,
thegametes
arereorganized
by
recombination over alonger
period
and the decrease ingenetic
variance is moreregular.
Thispart
of thestudy
demonstrates amajor
source ofpossible
discrepancy
fromreality
whenusing
thetwo
analytical
models: asthey
both considerindependent loci, they
arelikely
to overestimate theremaining genetic
variance over time. It may then be underlinedthat the
ability
to consider apossible linkage
between loci confers agreater
concern to thegenetic
model in theFFM,
aslinkage
is known to exist between loci and isnotably
an essentialprinciple
forQTL
detection.Finally,
the identical effects of the lociappeared
to be astrong
hypothesis.
Indeed,
following
Lande andThompson
[10],
the distribution of gene effectsaccording
to ageometric
seriespointed
out that the number of effective loci was ofgreater
concern than the real number of loci. Thispoint
is also ofgreat
concern in thisstudy. Indeed,
results of moleculargenetics
indicate,
for most of thequantitative
traits understudy,
the influence of a mixedheredity,
i.e. a small number of genes withlarge
effects and alarge
number of genes with small effects. But this kind of genome structure is not easy tointegrate
inanalytical
models. Theapproach developed by
Chevalet[3]
accounts for differential distribution of geneeffects, by joining
genes with variable contributions to thegenetic
variance of the trait into severalindependent
clusters of genes withequal
contributions.Moreover,
within onegiven
group of genes in thismodel,
the loci may beclosely
linked. The maindisadvantage
of thisapproach
is the overestimation ofgenetic
variance after the firstgeneration,
probably
due to thedeparture
of the gene effects from Gaussian distribution.By accounting
for a finite number of loci and a finite number of alleles perlocus,
the ’full-finite’genetic
modeldeveloped
in this paper, with a stochasticapproach,
seems to be an easy andquite
consistent way forstudying
the behaviour ofgenetic
mean and variance in a situation of mixedheredity.
REFERENCES
[1]
Al Murrani W.K., RobertsR.C.,
Genetic variation in a line of mice selected to its limit forhigh body weight,
Anim. Prod. 19(1974)
273-289.[2]
CaballeroA., Keightley
P.D., HillW.G.,
Accumulation of mutationsaffecting body
weight
in inbred mouselines,
Genet. Res. 65(1995)
145-149.[3]
ChevaletC.,
Anapproximate theory
of selectionassuming
a finite number ofquantitative
traitloci,
Genet. Sel. Evol. 26(1994)
379-400.[4]
CrowJ.F.,
KimuraM.,
An Introduction toPopulation
GeneticsTheory, Harper
andRow,
NewYork,
1970.[5]
Dudley J.W., Seventy-six generations
of selection for oil andprotein
percentagein
maize,
in: PollackE.,
Kempthorne 0., Bailey
T.B. Jr(Eds.),
Proceedings
of International Conference onQuantitative Genetics,
Iowa State Univ. Press,Ames,
1977,
pp. 459-473.[7]
FournetF., Hospital F.,
ElsenJ.M.,
A FORTRAN program to simulate the evolution ofgenetic variability
in a smallpopulation,
Comput. Applic.
Biosci. 11(1995)
469-475.[8]
Hospital F.,
Effets de la liaisongénique
et des effectifs finis sur la variability descaract6res
quantitatifs
sousselection,
these deDoctorat,
UniversiteMontpellier
II.[9]
Keightley P.D.,
HillW.G., Quantitative genetic
variation inbody
size of mice fromnew mutations, Genetics 131