HAL Id: hal-00893817
https://hal.archives-ouvertes.fr/hal-00893817
Submitted on 1 Jan 1989
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Marker assisted selection using best linear unbiased
prediction
R.L. Fernando, M. Grossman
To cite this version:
Original
article
Marker
assisted selection
using
best
linear
unbiased
prediction
R.L. Fernando
M.Grossman
University of Idlinois, Department of Animal Sciences, 1207 West Gregory Drive Urbana,
IL 61801, USA
(received
12 May 1989; accepted 28 August1989)
Summary - Best linear unbiased prediction
(BLUP)
is applied to a mixed linear modelwith additive effects for alleles at a market quantitative trait locus
(MQTL)
and additive effects for alleles at the remaining quantitative trait loci(QTL).
A recursivealgorithm
is developed to obtain the covariance matrix of the effects of MQTL alleles. A simplemethod is presented to obtain its inverse. This approach allows simultaneous evaluation of fixed effects, effects of MQTL alleles, and effects of alleles at the remaining QTLs, using known relationships and phenotypic and marker information. The approach is sufficiently
general
to accommodate individuals with partial or no marker information. Extension ofthe approach to BLUP with multiple markers is discussed.
marker-assisted selection - best linear unbasied prediction - genetic marker
Résumé - Sélection assistée par un marqueur: utilisation du meilleur prédicteur
linéaire sans biais (BLUP). La méthode du BLUP
(meilleure
prédiction linéaire sansbiais)
est appliquée à un modèle linéaire mixte comprenant deseffets
additifs associé aux allèles d’un locus quantitatif flanqué d’un gène marqueur, et d’effetsadditifs
pour les autres locus quantitatifs. Un algorithme récursif permet d’obtenir la matrice de covariancesassociée aux
effets
des allèles du locus marqué. Une méthode simple est aussi proposée pour calculer l’inverse de cette matrice. Cette approche permet d’évaluer simultanémentles
effets fixés,
les effets des allèles du locus marqué, et leseffets génétiques additifs
de l’ensemble des autres locus, d’après les relations de parenté, les données phénotypiqueset l’information sur les marqueurs. Cette approche est assez générade pour tenir compte
de données incomplètes chez certains individus. On discute l’extension à un BL UP avec
plusieurs marqueurs.
sélection assistée par un marqueur - meilleure prédiction linéaire sans biais - marqueur
génétique
INTRODUCTION .
Genetic
engineering
techniques
haveproduced
avariety
of moleculargenetic
mark-ers with thepotential
toidentify
alarge
number ofgenetic polymorphisms
(Soller
*
and
Beckmann, 1982;
Smith andSimpson,
1986;
Schumm etal.,
1988).
Marker-assisted selection is oneapplication
of thesetechniques
to animal andplant
breeding.
Information on marker loci that are linked toquantitative
traitloci,
to-gether
withphenotypic information,
could be used to increasegenetic
progressby
increasing
accuracy of selection andby reducing generation
interval(Soller,
1978;
Smith andSimpson,
1986).
Geldermann
(1975)
proposed
aleast-squares procedure
to estimate effects ofmarker alleles on
quantitative
traits. Based on selection indexprinciples,
Soller(1978)
combined marker information andphenotypic
information to obtaingenetic
evaluations. This method has been used tostudy
the additionalgenetic
progressexpected
from marker-assisted selection(Soller,
1978;
Soller andBeckmann, 1983,
Smith andSimpson,
1986).
Because of thecomplex
nature of animalbreeding
data,
however,
these methods may not beapplicable directly
to marker-assisted selection with field data.Data from field-recorded
populations
are affectedby non-genetic
nuisancefac-tors, such as age of
animal,
age ofdam,
management system, season of birth andherb.
Also,
non-randommating,
selection andoverlapping generations
contribute to thecomplexity
of the data. Best linear unbasiedprediction
(BLUP;
Henderson,
1973, 1975,
1982)
deals with thesecomplications
whenpredicting breeding
val-ues from
phenotypic
data. Theobjective
of this paper is to presentmethodology
for the
application
of BLUP to marker-assisted selection in animalbreeding.
Eachmethodological
development
is illustrated with a numericalexample using
asingle
hypothetical pedigree
METHODOLOGYConsider a
single polymorphic
marker locus(ML),
closely
linked to aquantitative
trait locus
(QTL).
LetMP
andMi
l
denote alleles at the ML that individual i inherited from itspaternal
(p)
and its maternal(m)
parent, and letQP
andQ7
denote alleles at the marketQTL
(MQTL)
linked toM!
andMil,
as shown below:Let
vf
andvi&dquo;
be the additive effects ofQ
p
andQ7.
Additive effects of alleles at theremaining QTLs,
unlinked to theML,
will be denotedby
the residual additive effect ui.Now,
the additive effect forindividual i,
ai, can be written asThe usual model to obtain BLUP if additive
effects, given phenotypic information,
iswhere yiis the
phenotypic
value ofindividual i,
xi
is a vector of known constants,/
3
is a vector of unknown fixed
effects,
and ei is a random error.Using
equ.(2),
BLUPcovariance matrix of ai values. Note that this covariance matrix
depends
on thetype of
genetic
information available. Whenonly relationship
information(r)
isavailable,
the covariance of ai values iswhich is
proportional
to the numeratorrelationship
matrix(e.g.,
Henderson,
1976).
When marker information(m)
is alsoavailable,
the covariance matrix ai values isIt can be shown that
G
alr
i-
,
m
,
alr
G
ingeneral.
Forexample,
the covariance between half-sibs that receive the same ML allele from their common parent ishigher
thanthe covariance between half-sibs that receive different ML alleles. This is because half-sibs
receiving
the same ML allele also receive the sameMQTL
allele withgreater
frequency
than half-sibsreceiving
different ML alleles. A. Marker modelI
To obtain BLUP with
phenotypic
and markerinformation,
it is convenient to usewhich is
equivalent
toequ.(2).
The covariance matrix of vi values(G&dquo;)
depends
on
relationship
and marker information. The covariance matrix of ui values(G
u
)
depends only
onrelationship
information and isproportional
to the numeratorrelationship
matrix(e.g.,
Henderson,
1976).
Given the covariance matricesG
v
andG
u
,
BLUPs of vi and ui values can be obtainedusing
the mixed modelequations
(Henderson, 1973).
The inverse ofG
u
,
which isrequired
on the mixed modelequations,
usually
is obtainedusing
analgorithm given by
Henderson(1976).
A recursivealgorithm
to constructG
v
isgiven
in sectionB,
and analgorithm
toobtain its inverse is in section C.
B. Covariance matrix of
MQTL
effectsl.
Theory.
To constructG
v
,
consider the covariance between additive effects ofMQTL
alleles. Without loss ofgenerality,
consideronly paternal MQTL
alleles.Suppose arbitrary
individuals o and o’ have sires s and s’. TheMQTL
allelesinherited
by
o and o’ from their sires areQP
andQ!,
having
additive effectsvP
andV
’ .
Forpaternal MQTL
alleles in o ando’,
the covariance between their additiveeffects
vo
P
andvo,
iswhere
Var(vo)
=w
is the additive variance of anMQTL
allele andP(Q!
Q
pt)
is theprobability
thatQo
is identicalby
descent toQP
I
.
For anarbitrary
pair
ofindividuals,
one is not a direct descendent of other. If o is not a direct descendantof the
o’,
QP
can be identicalby
descent toQP,
in 2mutually
exclusive ways:1) Qo
is
identicalby
descent to the maternalMQTL
allele of the sire of o’(1!9, )
and o’ inherits
QP
I
or2) QP
is identicalby
descent to thepaternal
MQTL
allele of the sire of o’and
If marker information is
available,
the conditionalprobability
that o’ inheritsQ!/,
given
that o’ inheritsM
:
’,
is(1 - r),
where r is the recombination rate between the ML and theMQTL.
Thus if o’ inheritsM
:
’,
theprobability
inequ.(4)
can becalculated
recursively
asSimilarly,
if o’ inheritsMS
If marker information is not
available,
so that it is not known whether o’ inheritsM9
orM
fi
,
0.5replaces
r inequs.(5)
and(6).
This isbecause,
in the absence ofmarker
information,
Q!I
andQfi
haveequal probability
ofbeing
transmitted to o’.The above
development
leads to a tabular method to constructG
v
,
which is similar to the method used to construct the numeratorrelationship
matrix(e.g.
Henderson,
1976).
Note thatG
v
has twice as many rows as individuals because each individual has 2 effects: 1 for thepaternal
and 1 for the maternalMQTL
allele. The rows and columns ofG2!
should be ordered so that thosecorresponding
to progeny follow those for their parents. Let the row indices of
G
v
,
corresponding
to the effects of
MQTL
alleles of individualo( vg, v’),
beiP, io ;
of its sires(vP, v7 ) ,
bei!,i!;
and of its damd(vd,
!),
beid,
i’.
Also,
let elementi j
ofG
v
be gij. Then fromequs.(4), (5)
and(6),
the elements of rowio,
below thediagonal,
are obtained asfor j
= 1...io -
1,where p
§
= r if o inheritsM9
orpP,
=(1 —
r)
if o inheritsMfi.
Elements of column
iP,
above thediagonal,
are obtained from thecorresponding
row elements becauseG
v
issymmetric.
Similarly,
elements of row17,
below thediagonal,
are obtained asfor
j
= 1...io
-1
,
wherep
7
=rif oinheritsMd
andp
7
=(1 -
r
)
if o inheritsMd .
Elements of column
im,
above thediagonal,
are obtained from thecorresponding
row elements.From
equ.(4),
thediagonal
elements of Gv areequal
too,2.
If marker informationcannot be used to determine which of the 2 marker alleles o are inherited from its
sire or its
dam,
then 0.5replaces
pP
inequ.(7a)
orp
7
in(7b).
2. Numerical
example.
Consider thepedigree
in Table 1. To constructG
v
,
rowsand columns are
arranged by
individual andby
paternal
and maternalMQTL
alleleswithin individual
(Table II).
Forconvenience,
we will assume thatav =
1 and that r = 0.1. The first two individuals are assumed to beunrelated;
thus the upper left4 x 4 submatrix of
G
v
is theidentity
matrix. Elements on thediagonal
areequal
to
or2=
1.Now,
row elements below thediagonal
can be obtained fromequs.(7a)
and
(7b);
column elements above thediagonal
are obtainedby
symmetry. Each rowelement for
vl
is
equal
to(1 - r)
= 0.9 times thecorresponding
row element forvi
1
plus
r = 0.1 times thecorresponding
row element forvi .
Each row element forv3
is
equal
to r = 0.1 times thecorresponding
row element forv2
plus
(1 — r)
= 0.9its sire is unknown.
Thus,
each row element forf!
is
the mean(r
=0.5)
of thecorresponding
row element forvi
and forvi .
Marker information is available forv4 ,
so that each row element forv4
is(1 — r)
= 0.9 times thecorresponding
rowelement for
v3
plus
r = 0.1 times thecorresponding
row element forv3 .
C.
Algorithm
forinverting
G
v
1.
Theory.
Theapproach
taken here follows thatby Quaas
et al.(1984)
andQuaas
(1988)
to invert the matrix of additiverelationships.
We define a linear model torelate the effect of the
paternal MQTL
allele of an individual(o)
to effects ofpaternal
and maternalMQTL
alleles of its sire(s)
where
EP
is
a residual effect.Similarly,
a linear model for effect of the maternalMQTL
allele of o isIt can be shown that the residuals
eP
inequ.(8a)
andE
m
in(8b)
have adiagonal
where P is a matrix with each row
containing
only
two non-zeroelements,
if theparent is known or
containing
only
zeros, if the parent isunknown;
and where isa vector of residuals. For
example,
row iP
will have(1 —
po)
in columniP
andpa
in columni
l
,
if the sire of i is known.Similarly,
row17
will have(1 —
pl)
in thecolumn
iP
andp7
in columnid ,
if the dam of i is known.To
proceed,
we need thediagonal
elements ofG
e
.
Consider,
forexample,
thevariance of
eo.
Fromequ.(8a),
if the sire of o is knownbecause effects of
MQTL
alleles of sire s are uncorrelated with residuals of itsoffspring
o(see Appendix).
HenceThe covariance between the effects of
paternal
and maternalMQTL
alleles can bewritten as
where
F
S
is theinbreeding
of sire s.Now,
equ.(10)
can be written asbecause
Var(vo)
=Var(v§ )
=Var(v7 )
=Q
v,
and where(1-
!)!
=(1- r)r
forpo
or for
pg
=(1 - r).
When the sire is not inbred:Var(eo)
=2o!(l —
r)r,
if markerinformation is
available;
orVar(e§)
=a!/2,
if marker information is not available.If the sire is not
known,
Var(e§) =
w.
Similarly,
if dam of o isknown,
the variance ofe7
iswhere
(1 -
p7 )p§!
=(1 - r)r
forp’
= r or forp-
=(1 - r)
and whereF
d
is theinbreeding
of dam d. When the dam is not inbred:Var(eo )
=2o,2 (1 - r)r,
if markerinformation is
available;
orVar(eo )
=u§ /2,
if marker information is not available.If the dam is not
known,
Var(eo )
=Q
v.
Rearranging
(9),
v can be written asfor
non-singular
(I -
P),
and thusG.&dquo;
can be written asFrom
equ.(14),
it is clear thata
;
;l
can be written asAs shown
earlier,
P has asimple
structure, with each rowcontaining
at most 2non-zero
elements,
andGE
isdiagonal.
To obtain the rules for
inverting
G
v
1
,
equ.(15)
is written aswhere n is number of individuals in the
pedigree,
jq iscolumn j
ofQ,
andd
j
isdiagonal element j
ofG§! .
By
definition ofQ, element j
of qj isunity. Further,
qj will
have,
at most,only
2 other non-zeroelements;
for j = iP,
elementiP
equals
- (1 - pP,)
and elementis
equals -p
P
o,
if the sire of o is known.Similarly,
forj
=i!,
element id
equals -(1 -
p!)
and elementid
equals - p!,
if the damof o is known.
Thus, given
parent and marker information of anindividual,
thecontributions to
G
v
1
,
corresponding
to effects ofpaternal
and maternalMQTL
alleles of theindividual,
areeasily
obtained.Now,
to obtain the inverse ofG&dquo;:
1)
calculatediagonals
ofGs :
when the parent isknown,
thediagonal
isgiven by
equ.(12a)
or(12b),
and when the parent isunknown,
thediagonal
iso, V; 2
2)
setv
G
to
the nullmatrix;
3)
for eachoffspring
o, with sire s and damd,
add thefollowing
to the indicated elements ofG-
1
:
if sire is
known,
add(1 -
p!)2di!
todiagonal
elementiP, iP;
if dam is
known,
add(1 —
p:;’ )
2
d
i:
;.
todiagonal
elementil, il;
.. - - 0 .... - -
d d
2. Numerical
example.
Consider thepedigree
in Table 1. To constructGo
weagain
takeQ
v
= 1 and r = 0.1. Because the parents ofindividuals
1 and 2 are notknown,
the first 4 elements on thediagonal
ofG,
arew
= 1. For individual3,
eachparent is known and marker information is available.
Thus,
fromequs.(12a)
and(12b),
the twodiagonals
ofG,
corresponding
to effects ofpaternal
and maternalMQTL
alleles of individual 3 are2(1—r)r
= 0.18. Each parent of individual 4 is alsoknown,
but the marker inherited from the sire is not known.Therefore,
thediagonal
ofG,
corresponding
tov4
is
0.5,
and thatcorresponding
tov4
is2(1—
r)r
= 0.18.The P matrix for this
example
isgiven
in Table III. The first 4 rows of P are nullbecause parents of the first 2 individuals are not known. The sire of individual 3 is
1,
andMi
was transmitted to 3.Thus,
the rowcorresponding
tov3
has
(1 —
r)
= 0.9in the column
corresponding
tovi
and
r = 0.1 in the columncorresponding
tovr. Similarly,
the dam of individual 3 is2,
andM2
was transmitted to 3.Thus,
the row
corresponding
tov3
has r = 0.1 in the columncorresponding
tov2
and(1 -
r)
= 0.9 in the columncorresponding
tov2 .
The sire of individual 4 is1,
butwas transmitted to 4.
Thus,
the rowcorresponding
tov4
has(1 - r)
= 0.9 in thecolumn
corresponding
tovp
and r = 0.1 in the columncorresponding
tov7n
The matrix
Q
=(I -
P’)
isgiven
in Table IV. Theproduct
QG
E
1
Q’
isgiven
in Table V. It can be verified that this is identical to the inverse of the matrix
G
v
D. BLUP with
multiple
markersIf information on another marker locus linked to a
QTL
isavailable,
the model canbe
expanded
to include effects of alleles of thisMQTL.
Thisapproach, however,
results in 2n additionalequations
for each marker introduced into theanalysis.
Thus,
for alarge
number of individuals(n)
and alarge
number ofMQTLs, solving
the mixed model
equations
may not be feasible. An alternative would be to useequ.(2),
withwhere
v!i
andvk
j
are effects ofpaternal
and maternal alleles of thek
t
i’
MQTL.
The covariance matrix of effects of
MQTL
alleles at each locus( Gv! )
can be constructedusing
the tabular method described in Section ILB.Then,
assuming
gametic equilibrium,
the covariance of matrix ai values(Ga!T n!)
can be obtained aswhere Z is a n x 2n matrix with elements for row
i containing
a 1corresponding
toeach of the
paternal
and maternalMQTL
effects of individual i and zeros for theremaining
elements. Theproblem
with thisapproach, however,
is that it could notbe
applied
tolarge
systems, unless asimple
algorithm
to invertGalr,
m
is available.DISCUSSION
Results
presented
here are anapplication
of BLUP to marker-assisted selection. This is ageneralization
of the methodpresented by
Soller(1978)
and Sollerand Beckmann
(1983).
Thisgeneralization
allows simultaneous evaluation of fixedeffects,
MQTL
effects and the residualQTL effects,
using
knownrelationships
andphenotypic
and marker information. It issufficiently general
to accommodate individuals withpartial
or no marker information.Several authors have calculated the additional
genetic
progressexpected
from marker-assisted selection(Soller,
1978,
Soller andBeckmann, 1983;
Smith andSimpson,
1986).
Because the methodpresented
here is ageneralization
of the method consideredby
theseauthors,
their resultsgive
an indication of theadvantage
expected by using
marker-assisted BLUP.Application
of thisprocedure
requires
knowledge
of the recombination rate(r)
between the marker and the
MQTL
and the variance of the additive effect of theMQTL
alleles(a’).
Assuming
that effects ofMQTL
alleles arenormally distributed,
the model
presented
here could be used to estimate r andufl
by
restricted maximum likelihood(REML;
Patterson andThompson,
1971).
The robustness of REMLREFERENCES
Geldermann H.
(1975)
Investigation
on inheritance ofquantitative
characters inanimals
by
gene markers. 1. Methods. Theor.Appl.
Genet.46,
319-330Henderson C.R.
(1973)
Sire evaluation andgenetic
trends. In: AnimalBreeding
and GeneticsSymposium
in Honorof Dr Jay
L.Lush,
AmericanSociety
of Animal Science and AmericanDairy
ScienceAssociation,
Champaign,
IL,
pp. 10-41 Henderson C.R.(1975)
Best linear unbiased estimation andprediction
under aselection model. Biometrics
31,
423-439Henderson C.R.
(1976)
Asimple
method forcomputing
the inverse of a numeratorrelationship
matrix used inprediction
ofbreeding
values. Biometrics32,
69-83 Henderson C.R.(1982)
Best linear unbiasedprediction
inpopulations
that haveundergone
selection. In: WorldCongress
onSheep
andBeef
CattleBreeding,
(Barton
R.A. & Smith W.C.ed.)
vol.1,
DunsmorePress,
PalmerstonNorth, NZ,
pp. 191-200Henderson C.R.
(1988)
Use of an average numeratorrelationship
matrix formultiple-sire joining.
J. Anim. Sci.66,
1614-1621Patterson H.D. &
Thompson
R.(1971)
Recovery
of inter-block information when block sizes areunequal.
Biometrika58,
545-554Quaas
R.L.(1988)
Additivegenetic
model with groups andrelationships.
J.Dairy
Sci.71,
1338-1345Quaas
R.L.,
Anderson R.D. & Gilmour A.R.(1984)
BLUP SchoolHandbook;
Useof
Mixed Modelsfor
Prediction andfor
Estimationof
(Co)variance
Components.
Animal Genetics andBreeding
Unit,
University
of NewEngland,
N.S.W.2351,
Australia.
Schumm
J.W.,
KnowltonR.G.,
BramanJ.C.,
BarkerD.F.,
BotsteinD.,
AkotsG.,
BrownV.A.,
GraviusT.C.,
HelmsC.,
HsiaoK.,
RedikerK.,
Thurston J.G. & Donis-Keller H.(1988)
Identification of more that 500 RFLPsby
screening
randomgenomic
clones. Am. J. Hum. Genet.42,
143-159Smith C. &
Simpson
S.P.(1986)
The use ofgenetic
polymorphisms
in livestockimprovement.
J. Anim.Breeding.
Genet. 103, 205-217Soller M.
(1978)
The use of loci associated withquantitative
traits indairy
cattleimprovement.
Anim. Prod.27,
133-139Soller
M.,
Beckmann J.S.(1982)
Restrictionfragment length polymorphisms
andgenetic
improvement.
In: WorldCongress
on GeneticsApplied
to LivestockPro-duction, Madrid,
vol.6,
pp. 396-404APPENDIX
Proof that
G,
isdiagonal
Let o be an individual that is not a direct descendant of o’. From
equs.(8a)
and(8b),
the additive effects of theMQTL
alleles of o and o’ areand
where z can take values p or
m,
=
s when z = p,or
=
d when z = m.Similarly,
z’ can take values p or
m, !’
= s’ when z’ =p, or
!’ =
d when z’ = m. Notethat for an
arbitrary
pair
ofindividuals,
ones is not direct descendant of the other.Therefore,
to prove thatG,
isdiagonal,
it is sufficient to show thar the covariance betweenE
’
andeo,
is null.From
equs.(A1)
and(A2),
the covariance between additive effects ofMQTL
allelesv!
andv’,
can be written as0 0
But,
fromequs.(7a)
and(7b)
Thus,
forequ.(A3)
toequal
(A4),
the third term inequ.(A3),
Cov(vz,
E
z,),
must bezero. The same
reasoning
can be used to show thatCov( vf, E&dquo;!;)
andCov(v!7ejl )
are zero.
Therefore, given
thatCov( v!,
E
z,), Cov( vf, and
Cov( vr
’
ejl )
are zero,Cov(eo,
ejl )
must be zero.Further, taking
o to be the a parent ofo’,
the residual(e
j
l
)
inequ.(A2)
isuncorrelated with