HAL Id: hal-00199755
https://hal.archives-ouvertes.fr/hal-00199755
Submitted on 19 Dec 2007
HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.
Alternative models for QTL detection in livestock. III.
Heteroskedastic model and models corresponding to several distributions of the QTL effect
B. Goffinet, P. Le Roy, D. Boichard, J.M. Elsen, B. Mangin
To cite this version:
B. Goffinet, P. Le Roy, D. Boichard, J.M. Elsen, B. Mangin. Alternative models for QTL detection
in livestock. III. Heteroskedastic model and models corresponding to several distributions of the QTL
effect. Genetics Selection Evolution, BioMed Central, 1999, 31, pp.341-350. �hal-00199755�
Original article
Alternative models for QTL
detection in livestock.
III. Heteroskedastic model and models corresponding
to several distributions of the QTL effect
Bruno Goffinet Pascale Le Roy Didier Boichard Jean
Michel ElsenBrigitte Mangin
,
a Biométrie et
intelligence artificielle,
Institut national de la rechercheagronomique,
BP27, 31326Castanet-Tolosan,
Franceb
Station
degénétique quantitative
etappliquée,
Institut nationalde la recherche
agronomique,
78352Jouy-en-Josas,
Francec Station d’amélioration
génétique
des animaux, Institut national de la recherche agronomique, BP27, 31326 Castanet-Tolosan, France(Received
20 November1998; accepted
22April 1999)
Abstract - This paper describes two kinds of alternative models for
QTL
detection inlivestock: an heteroskedastic
model,
and modelscorresponding
to severalhypotheses concerning
the distribution of theQTL
substitution effect among the sires: a fixed and limited number of alleles or an infinite number of alleles. The power of different tests built with thesehypotheses
werecomputed
under different situations. Thegenetic
variance associated with the
QTL
was shown in some situations. The results showed small power differences between the differentmodels,
butimportant
differences in thequality
of the estimations. In addition, a model was built in asimplified
situation to investigate the gain in usingpossible linkage disequilibrium. © Inra/Elsevier,
Parishalf-sib families
/
heteroskedastic model / linkage disequilibrium/
QTLdetection
Résumé - Modèles alternatifs pour la détection de QTL dans les populations
animales. III. Modèle hétéroscédastique et modèles correspondant à différentes distributions de l’effet du QTL. Ce papier décrit deux types de modèles alternatifs pour la détection de
QTL
dans lespopulations
animales : un modèlehétéroscédastique
*
Correspondence
andreprints
E-mail: elsen@toulouse.inra.fr
d’une part, et des modèles correspondants à différentes
hypothèses
sur la distribution de l’effet de substitution duQTL
pourchaque
mâle : un nombre fixe et limité d’allèlesou au contraire un nombre infini d’allèles. Les
puissances
des différents tests construitsavec ces
hypothèses
sont calculées dans différentes situations. L’estimation de la variancegénétique
liée auQTL
est donnée dans certaines situations. Les résultats montrent de faibles différences de puissance entre les différentsmodèles,
mais desdifférences
importantes
dans laqualité
des estimations. Deplus,
on construit unmodèle dans une situation
simplifiée
pour étudier legain
que l’on peut obtenir en utilisant un éventueldéséquilibre
de liaison.© Inra/Elsevier,
Parisfamilles de demi-frères / modèle hétéroscédastique / déséquilibre de liaison
/
détection de QTL
1. INTRODUCTION
In theoretical papers
dealing
withQTL
detection inlivestock,
theQTL
effects are most often considered to be different across the sires
i,
and the residual variance within theQTL genotype
as constant among the sires(e.g.
[9, 10]).
Thesehypotheses
were made in the twoprevious
papers about alternative models forQTL
detection in livestock[4, 8!.
In this third paper, these two sets ofparameters
are studied.First,
a heteroskedastic model with residual variancea/ specific
to each sirei is evaluated. The rationale for this test is that it should be more robust
against
true
heteroskedasticity,
for instance when different alleles aresegregating
atanother
QTL
than theQTL
under consideration.However,
the power of the tests may be smaller than in the homoskedastic model if the homoskedastic model is correct.Different
possibilities concerning
the within sireQTL
substitution effecto!
will also be considered: a fixed and limited number of
alleles,
or an infinitenumber of alleles.
Taking
into account these distributions of theQTL
effectcan increase the power of the tests if the model is correct, and decrease this power if the model is incorrect.
Therefore,
the behaviour of the tests based onthese different models will be
compared
under different situationsconcerning
the distribution of the
QTL
effect. Morespecifically,
the case of a biallelicQTL
in
linkage disequilibrium
with themarker,
will beexplored
ingreater
detail.Jansen et al.
[6]
also considered the same kind of modelconcerning
theresidual variances and the number of
alleles,
but did not compare the power of the tests.Coppieters
et al.[3]
also considered these kinds of models andcompared
the power ofregression analysis
and of anon-parametric approach.
Most
hypotheses
and notations aregiven
in Elsen et al.[4].
Tosimplify
the
computations,
all thecomparisons
were madeusing
the mostprobable
siregenotype hsi
=argmax hS i P (hs il M d
and the linearisedapproximation
of the likelihood described inthe previous
paper. All the simulations were made with 5 000replications,
and thelength
of the confidence interval for the simulated power was smaller than 1%.
When ananalytical
solution could not befound,
we used a
quasi
Newtonalgorithm
tocompute
the maximum likelihood. The chromosomelength
was 1Morgan,
with 3 or 11markers, equally spaced,
eachwith two alleles
segregating
at anequal frequency
in thepopulation.
2. EVALUATION OF A HETEROSKEDASTIC MODEL
In this
section,
the power of theT 2
test built under a homoskedastic model[8]
will becompared
to the power of theT 6
test builtunder a
heteroskedasticmodel,
whereo, e’i 2
is used inplace
ofQ2 in
the likelihoodÂ’r, hs .
This compar- ison will be made for both homoskedastic and heteroskedastic situations. The heteroskedastic situation will be modelledassuming
the existence of an inde-pendent QTL,
i.e. located on another chromosome. ThisQTL
is assumed tobe
biallelic,
with balancedfrequencies (0.5)
in the sirepopulation
and with anadditive effect. Dams are
homozygous
for thisQTL.
Under thishypothesis,
thewithin
offspring
residual variance is lower for sireshomozygous
for thisQTL
than for the
heterozygous
sire. Powers were calculatedconsidering
anH o
re-jection
thresholdcorresponding
to a correcttype
I error, which iscomputed
in the same
situation,
homoskedastic orheteroskedastic,
with noQTL
on thetested chromosome.
Table I concerns true homoskedastic
situations,
with a residual varianceo l2
=
1. In thistable,
the power of theT Z and T 6
tests aregiven
for different values of the number of progeny per sire(20
or50),
of the number of markers in the differentlinkage
group(3
or11),
of theposition
of theQTL (0.05
or0.35)
and of the additive effect of theQTL (a =
0.5 or1).
The twopossible QTL
alleles thus had the sameprobability.
Note that in this case, theQTL
substitution effect
equals
theQTL
additive effect.Tables II and III concern true heteroskedastic situations. A
QTL
located onanother chromosome was simulated with an a2 effect. The thresholds of the
T
Z and T 6 tests
aregiven
in table II for different values of the a2 effect and for 20sires,
50 progeny per sire and 11 markers. The results were obtained with 5 000 simulations. The power of theT 2
andT 6
tests aregiven
in table IIIfor different values of the linked
QTL
additive effect(a
= 0.5 or1.0),
of theposition
of this linkedQTL (x -
0.05 or0.35)
and of theindependent QTL
additive effect
(a 2 = 0, 1,
1.5 or2).
For eachQTL,
the twopossible
alleles had the sameprobability.
In the true homoskedastic
situation,
and for agiven
number of sires andmarkers,
the thresholds of the two tests appear to be very close to each other for all cases(data
notshown),
which is inagreement
with theasymptotic theory
in linear models. In a linearmodel,
theasymptotic
distribution of Fisher test statistic is the same if the residual variance used in the denominator isreplaced by
any consistent estimate of this variance. The estimate of the residual variances in the modelcorresponding
to theT!’
test isconsistent,
asis the estimate in the other model. The thresholds
given
in table II show that theT 6
test is not sensitive at all to the value of a2, whereasT 2 is slightly
moresensitive. The use of the threshold
corresponding
to a2 = 0 when it is not true can lead to a firsttype
error of 5.5%
instead of 5%.
The power of the
T!
test appears to beonly slightly
smaller than the power of theT 2
test in the case of o,,i r = 0’e’ This very small decrease is inagreement
with the difference in powerof
ananalysis
of variance test when the number ofdegrees
of freedom of the residual varies from 50 to1000,
i.e. from the number of progeny per sire to the total number of progeny.The power of the
T!
test isslightly larger
than that of theT 2 test only
incases where the
QTL leading
toheteroskedasticity
has alarge
effect. Even in these cases, the differences between the power of the two tests remain small and of the same order as for homoskedasticsituations,
but with theopposite sign.
From these
results,
andconsidering
that the tests based on the heteroskedas- tic model take a little less time tocompute (about
5%),
thefollowing
tests willbe based on this model.
3. VARIOUS NUMBERS OF ALLELES AT THE
QTL
LOCUSIn the
previous
papers[4, 8!, QTL
substitution effectsai
were defined withinwith each sire i. In this paper, two
possible
alternative situationsconcerning
these effects are considered.
- A limited number of
QTL alleles,
and therefore a set ofonly
a fewpossible
values for
ai .
In this case, theparameters
are these values and theprobability
of
QTL genotypes.
This is the model usedby
Knott et al.(7!.
- An infinite number of
possible values,
drawn at random in a normal distribution. This is the model usedby Grignola
et al.(5!.
In these two
situations,
we will consider that theQTL
effects areindepen- dently
andidentically
distributed between the sires.In the two cases, the linearised version of the likelihood can be written as:
where
f(a7)
is thedensity
of the distribution ofa2 .
In the situation with two
possible
alleles at theQTL locus,
the likelihood becomes:where
p’
=p(ai
=a)
=p(ai
=-a)
and a are the twoparameters
of the distribution.In the situation with a normal distribution of the
QTL effect,
thedensity
f (a2 )
is the normaldensity 0(a’; 0, o, 2)
and the likelihood is written asA3!!
(normal).
The test built with the likelihood
AHhs(two alleles) will be T 7
and the test
built with the likelihood
A3!! (normal) , T 8 .
In table
IV, T 7
andT’
test thresholds aregiven
for different situationsconcerning
the number of markers and the number of progeny per sire. In tableV,
the power of theT 6 , 7 T
andT 8
tests arepresented
for two kinds ofsituations. In the
first,
theQTL
had twopossible equiprobable (p a
=1/2)
alleles with no dominance and an additive effect a. The
QTL
substitution effect ai for each sire i is therefore 0 with aprobability
of1/2
and a witha
probability
of1/2.
We haveE(an
=/2. 2 a
TheQTL
variance due to the sirein the progeny of i is
a2/4,
andglobally / a
=E(a2/4)
=a 2 /8.
In thesecond,
the effect of each value ai was drawn at random in a normal
distribution,
ol
=a 2 /2
of nullexpectation
and variance.Therefore, E(a?)
=a 2 /2
andor = E(af /4)
=a 2 /8
as in the first case. The results arepresented
for differentvalues of the
parameters.
It is
interesting
to note that the thresholds areappreciably
smaller than the thresholdspresented
in table Il. This is due to the fact that there isonly
one
parameter
for theQTL
effect inT 7
and, T 8
and 20 inT 6 .
The differences between the two kinds of thresholds can becompared
with the differences between thexi ddl
95% quantile, 3.84,
and theX!oddl 95 % quantile,
31.41.The main and
quite strange
result was that the power ofT! is always larger
than or
equal
to the power of the other tests.In order to compare the
T!
andT 7 tests
morethoroughly
when the modelreally
has twoalleles,
a verylarge
number of simulations wereperformed
in asimplified
situation. A very informativemarker,
linkedtotally
to theQTL
wasassumed to
exist,
and the residual variance was assumed to be known(20
siresand 50 progeny per
sire).
TheT 6
andT 7 tests
weresimplified accordingly.
The
T 6 test
was found to be morepowerful (with
a difference of 3-4%)
than the
T 7 test
for 0.1 <p’ < 0.9,
andT 7 was
morepowerful (with
thesame
differences)
thanT 6
for the other values ofp’.
This confirms that theloglikelihood
ratio test is not the morepowerful
test in mixturesituations,
forall values of the alternative
parameters.
Andrews andPloberger !1, 2]
showedthat the
loglikelihood
ratio test is admissible but notoptimal
in cases, such asmixture
models,
where aparameter disappears
under the nullhypothesis (here
the
probability
ofhaving
one of the twoalleles).
We tried a value pa = 0.05 inthe
general
framework with md =50,
L =11,
a =0.5,
butunfortunately
theT
6
test
remains morepowerful (with
a difference of 2%)
than theT 7 test.
Concerning
thecomparison
betweenT! and T’
in situations where theQTL
effect isnormally distributed,
it is clear in suchsimple
and balanced situations that bothT 6
andT 8
areasymptotically equivalent
to the test basedon the value of
6Z where
thea,
are the maximum likelihood estimatorsi
of the
QTL
substitution effect.Therefore,
their power should have beenquite
the same. The
relatively
poorperformance
ofT’
isperhaps partially
due tonumerical
problems,
because in some cases(2 %),
thealgorithm
had difficulties inconverging
and thecorresponding
simulations were excluded from the results.The estimation of the
QTL
variance due to the sireQ2 obtained
with thedifferent models is shown in table VI. With the models used in
T 6
andT 7 ,
thisestimation is obtained as a function of the estimates of the ai or a; with
T’,
it is estimated
directly.
The value 0.03125(resp. 0.125)
of( T2 corresponds
tovalues a = 0.5 and
o,2
= 0.125(resp.
1.0 and0.5).
It appears that the estimator obtained
using T 8
is theonly quite
unbiasedestimator of
u.;.
The bias is verylarge
whenusing
the other tests. Apractical
solution would be to use the
simple T 6
test to detect aQTL
and to use theestimate associated with
T 8
when aQTL
is detected.4. BETWEEN SIRES LINKAGE
DISEQUILIBRIUM
To
investigate
the usefulness ofusing
a modelincluding
alinkage disequilib-
rium between markers and
QTL
alleles at the between sireslevel,
asimplified situation,
which mimics the realsituation,
but which isconsiderably
easier tocompute,
was considered.The
QTL
issupposed
to be located on a markerlocus,
with all the 20 sires consideredA,
Bheterozygous
for this marker. The dams are considered ascarrying
other alleles and therefore all the progeny are informative. We denoteY A
(i) (resp. Ya(i))
the mean of then A (i) (resp. n B (i))
progeny of sirei carrying
allele A
(resp B).
The twopossible
alleles at theQTL
are denotedQ,
with anadditive effect of
a/2
and q, with an additive effect-a/2.
The model for theexpectation
ofY A (i)
andY B (i)
is:The
variability
around thisexpectation
will be considered asnormally distributed,
with mean 0 and variancen (i) A / 2 a (resp. u 2 / n B (i))
assumedto be known. We will consider two tests: the
analysis
of variance test whichcorresponds
to the modelE(Y A (i)) - E(Y B (i))
= ai, without anassumption concerning
the distribution of the ai, and the likelihood ratio testcorresponding
to the mixture model
concerning
the sire allele. The first test isanalogous
totest
T 6
and will be denotedT6!
and thesecond, analogous
to testT 7
will bedenoted
T 7’ .
This isonly
ananalogy
because the residual variance is assumed to beknown,
all the progeny are informative and the tests arecomputed only
on the marker.
The powers of these two tests for
U 2
=1,
a =0.5,
with different numbers of informative progenyn A (i)
+ rzB(i)
= constant across thesires,
and differentvalues of the
parameters
pi and p2, aregiven
in table VII. Note that the 25 informative progeny wouldcorrespond
to the mean number of informative progeny for 50 dams and asingle
biallelic marker.It appears that the use of a model with a
linkage disequilibrium
canincrease the power if there is
really
alinkage disequilibrium (that
is alarge
difference between pi and
p 2 )
but can lose power when there is a smalllinkage disequilibrium.
These resultsdepend heavily
however on thehypothesis
madein this
simplified
situation.-
QTL
locationknowledge;
thisknowledge
increases the power of the two tests butperhaps
does notchange
the difference between the two tests.- The females do not carry either of the sire’s
alleles;
it is not a very realisticsituation,
but it leads to easiercomputations
and one can think that it does notchange
the power difference between the two tests.- The use of a
completely
linkedmarker;
it isconsiderably
more difficultto build a model with one or several
partially
linked markers and thegain
inusing
this information would be smaller than thegain presented
in table VIL5. CONCLUSIONS
In many
situations,
the power of thesimple T! test,
which is easier and fasterto
compute,
isequal
to or a little bit better than the power of the other tests.This result could be
specific
toQTLs
of little effect. In thepresent study,
wefocused on
QTL
effects of such arelatively
smallmagnitude because,
with(aTLs
with
larger effects,
all the tests would have had the same power, one. For(aTLs
with
large effects,
thecomparison
shouldrely
upon other criteria than power, such as thelength
of theQTL
location confidence interval.Nevertheless,
theT
8
test isappreciably
better than the other test inestimating QTL
variance.The model
using
alinkage disequilibrium
can lead to more power in some situations.Nevertheless,
it is of interestonly
if one can be sure that there isreally
alinkage disequilibrium.
The otherproblem
for the use of this model is the extension to ageneral
situation where theQTL
is not located on a marker.REFERENCES
[1]
AndrewsD.W.K., Ploberger
W.,Optimal
tests when a nuisance parameter is presentonly
under thealternative,
Econometrica 62(1994)
1383-1414.[2]
AndrewsD.W.K., Ploberger
W.,Admissibility
of the likelihood ratio test when a nuisance parameter is presentonly
under the alternative, Ann. Stat. 23(1995)
1609-1629.
[3] Coppieters
W., KvaszA., Farnir F., Arranz J.-J., Grisart B., Mackinnon
M.,Georges M.,
A rank-basednonparametric
method formapping quantitative
trait lociin outbred half-sib
pedigrees: application
to milkproduction
in agranddaughter design,
Genetics 149(1998)
1547-1555.[4]
Elsen J.M.,Mangin
B., Goffinet B., LeRoy
P., Boichard D., Alternative models forQTL
detection in livestock. I. General introduction, Genet. Sel. Evol.31
(1999)
213-224.[5] Grignola F.E., Zhang Q.,
Hoeschele I.,Mapping
linkedquantitative
trait locivia residual maximum likelihood, Genet. Sel. Evol. 29
(1997)
529-544.[6]
JansenR.C.,
Johnson D.L., Van ArendonkJ.A.M.,
A mixture model ap-proach
to themapping
ofquantitative
trait loci incomplex populations
with anapplication
tomultiple
cattlefamilies,
Genetics 148(1988)
391-400.[7]
KnottS.A.,
Elsen J.M.,Haley C.,
Methods formultiple-marker mapping
ofquantitative
trait loci in half-sibspopulations,
Theor.Appl.
Genet.93(1996)
71-80.[8] Mangin
B., Goffinet B., LeRoy
P., Boichard D., Elsen J.M., Alternative models forQTL
detection in livestock. II. Likelihoodapproximations
and sire markergenotype estimations, Genet. Sel. Evol. 31