HAL Id: hal-00893877
https://hal.archives-ouvertes.fr/hal-00893877
Submitted on 1 Jan 1991
Reduced animal model for marker assisted selection
using best linear unbiased prediction
RJC Cantet*, C Smith

Original article

University of Guelph, Centre for Genetic Improvement of Livestock, Department of Animal and Poultry Science, Guelph, Ontario, N1G 2W1, Canada

(Received 15 October 1990; accepted 11 April 1991)
Summary - A reduced animal model (RAM) version of the animal model (AM) incorporating independent marked quantitative trait loci (MQTL's) of Fernando and Grossman (1989) is presented. Both AM and RAM permit obtaining Best Linear Unbiased Predictions of MQTL effects plus the remaining portion of the breeding value that is not accounted for by independent MQTL's. RAM reduces computational requirements by reducing the size of the system of equations. Non-parental MQTL effects are expressed as a linear function of parental MQTL effects using marker information and the recombination rate ($r$) between the marker locus and the MQTL. The resulting fraction of the MQTL variance that is explained by the regression on parental MQTL effects is $2[(1 - r)^2 + r^2]/2$ when the individual is not inbred and both parents are known. Formulae are obtained to simplify the computations when backsolving for non-parental MQTL and breeding values in case all non-parents have one record. A small numerical example is also presented.
marker assisted selection / best linear unbiased prediction / reduced animal model / genetic marker
Résumé - A reduced animal model for marker assisted selection with BLUP. A reduced animal model (RAM) version of the animal model (AM) of Fernando and Grossman (1989) with independent marked quantitative trait loci (MQTL) is presented. In both cases, RAM and AM, best linear unbiased predictions (BLUP) of the MQTL effects are obtained, in addition to the remaining portion of the breeding value that is not explained by the independent MQTL. Using RAM lowers the computing requirements by reducing the size of the system of equations. Non-parental MQTL effects are expressed as a linear function of parental MQTL effects using information from the marker and the recombination rate ($r$) between the marker locus and the MQTL. The resulting proportion of the MQTL variance explained by the regression on parental MQTL effects is given by the expression $2[(1 - r)^2 + r^2]/2$ for a non-inbred individual with known parents. Formulae are derived to simplify the computations when solving for non-parental MQTL effects and breeding values in the case where all non-parents have a single record. A numerical example is also given.

marker assisted selection / BLUP / reduced animal model / genetic marker

* On leave from: Departamento de Zootecnia, Facultad de Agronomia, Universidad de Buenos Aires, Argentina
INTRODUCTION

In a recent paper, Fernando and Grossman (1989) obtained best linear unbiased predictors (Henderson, 1984) of the additive effects for alleles at a marked quantitative trait locus (MQTL) and of the remaining portion of the breeding value. They used an animal model (AM; Henderson, 1984) under a purely additive mode of inheritance. Letting $p$ be the number of fixed effects in the model, $n$ the number of animals in the pedigree file and $m$ the number of MQTL's, the number of equations in the system for this AM is $p + n(2m + 1)$. For large $m$, $n$ or both, solving such a system may not always be feasible. The reduced animal model (RAM; Quaas and Pollak, 1980) is an equivalent model, in the sense of Henderson (1985), to the AM and provides the same results, but with a smaller number of equations to be solved. In this paper, the RAM version of the model of Fernando and Grossman (1989) is obtained. The resulting system of equations is of order $p + s(2m + 1)$, $s$ being the number of parents. In general $s$ is much smaller than $n$. Therefore, the advantage due to the reduction in the number of equations by using RAM is considerable. A numerical example is included to illustrate the application.
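The two system orders quoted above are easy to compare numerically. The short Python sketch below evaluates $p + n(2m + 1)$ and $p + s(2m + 1)$; the function name `n_equations` is only illustrative, and the value $p = 2$ is an assumption inferred from the system sizes (14 and 11) reported in the numerical example of this paper.

```python
def n_equations(p: int, animals: int, m: int) -> int:
    """Order of the mixed model system: p + animals * (2m + 1).

    p       : number of fixed effects
    animals : n (all animals) for AM, s (parents only) for RAM
    m       : number of independent MQTL's
    """
    return p + animals * (2 * m + 1)


# Sizes in the numerical example: 4 animals, 3 parents, 1 MQTL.
# p = 2 is an assumption inferred from the reported system orders.
am_order = n_equations(p=2, animals=4, m=1)
ram_order = n_equations(p=2, animals=3, m=1)
print(am_order, ram_order)  # prints: 14 11
```

The saving grows linearly with the number of non-parents, $(n - s)(2m + 1)$ equations in total, so it is largest when many animals leave no progeny and several MQTL's are fitted.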
THEORY

For simplicity, derivations are presented for a model with one MQTL. The extension to the case of 2 or more independent MQTL's is covered in the section entitled More than one MQTL.

In the notation of Fernando and Grossman (1989), $M_i^p$ and $M_i^m$ are alleles at the marker locus that individual $i$ inherited from its paternal ($p$) and its maternal ($m$) parents, and $v_i^p$ and $v_i^m$ are the additive effects of the paternal and maternal MQTL's, respectively. The recombination frequency between the marker allele and the MQTL is denoted as $r$. We will use the expression "breeding value" to refer to the additive effects of all genes that affect the trait excluding the MQTL(s).
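Matrix expressions for the animal model with genetic marker information

A matrix version of equation (3) in Fernando and Grossman (1989) is:

$$y = X\beta + Zu + Wv + e \qquad (1)$$

where $y$ is the vector of records, and $X$, $Z$ and $W$ are incidence matrices relating the records to the vector of fixed effects $\beta$, the random vector of additive breeding values $u$ and the random vector $v$ of additive effects of the individual MQTL effects, respectively; $e$ is the vector of environmental effects. The $2n \times 1$ vector $v$ is ordered within animal such that $v_i^p$ always precedes $v_i^m$. The matrices $Z$ and $W$ will have zero rows for animals that do not have records on themselves but that are related to animals with records. Non-zero rows of $Z$ and $W$ have 1 and 2 elements equal to 1, respectively, with the remaining elements being zero. First and second moments of $y$ are given by:

$$E(y) = X\beta, \qquad Var(y) = ZAZ'\sigma_A^2 + WG_vW'\sigma_v^2 + I\sigma_e^2$$

where $A\sigma_A^2$ and $G_v\sigma_v^2$ are the variance-covariance matrices of $u$ and $v$, respectively. The scalars $\sigma_A^2$, $\sigma_v^2$ and $\sigma_e^2$ are the variance components of the additive effects of breeding values, the MQTL additive effects and of the environmental effects.

RAM requires partitioning the data vector $y$ into records of individuals with progeny ($y_P$; parents) and records of individuals without progeny ($y_N$; non-parents) so that $y' = [y_P', y_N']$. A conformable partition can be used in $X$, $Z$, $W$, $u$, $v$ and $e$. Using this idea (1) can be written as:

$$\begin{bmatrix} y_P \\ y_N \end{bmatrix} = \begin{bmatrix} X_P \\ X_N \end{bmatrix}\beta + \begin{bmatrix} Z_P & 0 \\ 0 & Z_N \end{bmatrix}\begin{bmatrix} u_P \\ u_N \end{bmatrix} + \begin{bmatrix} W_P & 0 \\ 0 & W_N \end{bmatrix}\begin{bmatrix} v_P \\ v_N \end{bmatrix} + \begin{bmatrix} e_P \\ e_N \end{bmatrix} \qquad (2)$$

To obtain RAM, $u_N$ and $v_N$ should be expressed as linear functions of $u_P$ and $v_P$, respectively. Since an individual's breeding value can be described as the average of the breeding values of its parents plus an independently distributed Mendelian sampling residual ($\phi$) (Quaas and Pollak, 1980), for $u_N$ we can write:

$$u_N = Pu_P + \phi \qquad (3)$$

where $P$ is an $(n - s) \times s$ matrix relating non-parental to parental breeding values. Each row of $P$ contains at most two 0.5 values in the columns pertaining to the BV's of the sire and of the dam.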
Now, $E(\phi) = 0$ and $Var(\phi) = D_A\sigma_A^2$, where $D_A$ is a diagonal matrix with diagonal elements equal to:

$1 - 0.25(a_{ss} + a_{dd})$, if both sire and dam of the non-parent are known
$1 - 0.25a_{ss}$, if only the sire is known
$1 - 0.25a_{dd}$, if only the dam is known
$1$, if both parents are unknown

with $a_{ss}$ and $a_{dd}$ being the diagonal elements of $A$ corresponding to the sire and the dam, respectively.
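The four cases for the diagonal elements of $D_A$ can be collected in one small function. The Python sketch below is only an illustration: the name `mendelian_diag` and the use of `None` to mark an unknown parent are assumptions made for this example, not notation from the paper.

```python
from typing import Optional


def mendelian_diag(a_ss: Optional[float], a_dd: Optional[float]) -> float:
    """Diagonal element of D_A for one non-parent.

    a_ss, a_dd : diagonal elements of A for the sire and the dam;
                 None marks an unknown parent (illustrative convention).
    """
    d = 1.0
    if a_ss is not None:
        d -= 0.25 * a_ss  # known sire removes a quarter of its diagonal
    if a_dd is not None:
        d -= 0.25 * a_dd  # known dam removes a quarter of its diagonal
    return d


# Two known, non-inbred parents (a_ss = a_dd = 1) give 0.5,
# the value used for individual 4 in the numerical example.
print(mendelian_diag(1.0, 1.0))
print(mendelian_diag(1.0, None))
print(mendelian_diag(None, None))
```

Because the diagonal of $A$ for a parent equals one plus its inbreeding coefficient, inbred parents reduce the Mendelian sampling variance of their progeny below 0.5.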
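A scalar version of the relationship between $v_N$ and $v_P$ can be obtained from:

$$v_o^p = b_1v_s^p + b_2v_s^m + \varepsilon_o^p, \qquad v_o^m = b_3v_d^p + b_4v_d^m + \varepsilon_o^m$$

The subscripts $o$, $s$ and $d$ denote the individual, its sire and its dam, respectively. The coefficients $b_i$ are either $1 - r$ or $r$ according to any of these 4 possible patterns of inheritance of the marker alleles:

Paternal marker                               Maternal marker
$M_o^p = M_s^p$: $b_1 = 1 - r$, $b_2 = r$     $M_o^m = M_d^p$: $b_3 = 1 - r$, $b_4 = r$
$M_o^p = M_s^m$: $b_1 = r$, $b_2 = 1 - r$     $M_o^m = M_d^m$: $b_3 = r$, $b_4 = 1 - r$

The above developments lead us to the following relationship between $v_N$ and $v_P$:

$$v_N = Fv_P + \varepsilon \qquad (4)$$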
The $2(n - s) \times 2s$ matrix $F$ relates the additive effects of the MQTL of non-parents to the additive effects of the MQTL of parents, and $\varepsilon$ is the vector with element $i$ equal to the residual $\varepsilon_o^p$ and element $i + 1$ equal to the residual $\varepsilon_o^m$. Each row of $F$ contains, at most, 2 non-zero elements: the $b_i$'s. Let $i$ and $k$ ($= i + 1$) be the row indices for the MQTL marked by $M_o^p$ and $M_o^m$, respectively. Let $j$ and $j + 1$ be the column indices corresponding to the additive effects of the MQTL for the sire that transmits $i$: $j$ refers to the paternal grandsire and $j + 1$ to the paternal granddam. Also, let $l$ and $l + 1$ be the column indices corresponding to the dam that transmits $k$: $l$ corresponds to the maternal grandsire and $l + 1$ to the maternal granddam. Then $F_{i,j} = b_1$, $F_{i,j+1} = b_2$, $F_{k,l} = b_3$ and $F_{k,l+1} = b_4$. All remaining elements of $F$ are 0. When marker information is unavailable, $r$ is taken to be 0.5 (Fernando and Grossman, 1989) and all $b_i$'s are 0.5.

To exemplify, consider individuals 1 (male), 2 (female) and 3 (progeny of 1 and 2). Animals 1 and 2 are unrelated and 3 has paternal and maternal marker alleles originating from the dams of 1 and 2, namely alleles $M_1^m$ and $M_2^m$, respectively. Then, $v = [v_1^p, v_1^m, v_2^p, v_2^m, v_3^p, v_3^m]'$ with $v_P' = [v_1^p, v_1^m, v_2^p, v_2^m]$ and $v_N' = [v_3^p, v_3^m]$. With one record on each animal, the matrix $W$ is:

$$W = \begin{bmatrix} 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 1 \end{bmatrix}$$

For $r = 0.2$, the matrix $F$ is $2 \times 4$ and equal to:

$$F = \begin{bmatrix} 0.2 & 0.8 & 0 & 0 \\ 0 & 0 & 0.2 & 0.8 \end{bmatrix}$$

The residuals $\varepsilon$ have $E(\varepsilon) = 0$ and $Var(\varepsilon) = G_\varepsilon\sigma_v^2$. Fernando and Grossman (1989) showed that $G_\varepsilon\sigma_v^2$ is diagonal with non-zero elements equal to $Var(\varepsilon_o^p) = 2r(1 - r)(1 - f_s)\sigma_v^2$ and $Var(\varepsilon_o^m) = 2r(1 - r)(1 - f_d)\sigma_v^2$, where $f_s$ and $f_d$ are the inbreeding coefficients at the MQTL of the sire and of the dam, respectively. They are the probabilities that the paternal and maternal alleles of the parent at a given MQTL are the same. These $f$'s are the off-diagonal elements in the $2 \times 2$ diagonal blocks of the matrix $G_v$ (Fernando and Grossman, 1989).
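The rules for filling one non-parent's rows of $F$, and the residual variances $2r(1 - r)(1 - f)\sigma_v^2$, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names, the `'p'`/`'m'`/`None` coding of marker origin, and the fixed column order [sire paternal, sire maternal, dam paternal, dam maternal] for two distinct parents are all choices made for this example.

```python
from typing import List, Optional


def b_pair(r: float, origin: Optional[str]) -> List[float]:
    """Coefficients on a parent's (paternal, maternal) MQTL effects.

    origin : 'p' if the offspring's marker allele is the parent's paternal
             allele, 'm' if it is the parent's maternal allele, None when
             marker information is unavailable (r is then taken as 0.5).
    """
    if origin is None:
        return [0.5, 0.5]
    if origin == "p":
        return [1.0 - r, r]
    if origin == "m":
        return [r, 1.0 - r]
    raise ValueError("origin must be 'p', 'm' or None")


def f_rows(r: float, sire_origin: Optional[str],
           dam_origin: Optional[str]) -> List[List[float]]:
    """Two rows of F for one non-parent with two distinct known parents."""
    b1, b2 = b_pair(r, sire_origin)
    b3, b4 = b_pair(r, dam_origin)
    return [[b1, b2, 0.0, 0.0], [0.0, 0.0, b3, b4]]


def residual_variance(r: float, f_parent: float, sigma2_v: float) -> float:
    """Var of an MQTL residual: 2 r (1 - r) (1 - f_parent) sigma2_v."""
    return 2.0 * r * (1.0 - r) * (1.0 - f_parent) * sigma2_v


# Three-animal illustration from the text: both marker alleles of animal 3
# come from the dams of its parents, and r = 0.2.
for row in f_rows(0.2, "m", "m"):
    print(row)
```

With `origin=None` the rows reduce to 0.5 and 0.5, and the residual variance reaches its maximum of $0.5(1 - f)\sigma_v^2$, which is the case without marker information.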
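Using (3) and (4) in (2) gives:

$$\begin{bmatrix} y_P \\ y_N \end{bmatrix} = \begin{bmatrix} X_P \\ X_N \end{bmatrix}\beta + \begin{bmatrix} Z_P & 0 \\ 0 & Z_N \end{bmatrix}\begin{bmatrix} u_P \\ Pu_P + \phi \end{bmatrix} + \begin{bmatrix} W_P & 0 \\ 0 & W_N \end{bmatrix}\begin{bmatrix} v_P \\ Fv_P + \varepsilon \end{bmatrix} + \begin{bmatrix} e_P \\ e_N \end{bmatrix} \qquad (5)$$

or

$$\begin{bmatrix} y_P \\ y_N \end{bmatrix} = \begin{bmatrix} X_P \\ X_N \end{bmatrix}\beta + \begin{bmatrix} Z_P \\ Z_NP \end{bmatrix}u_P + \begin{bmatrix} W_P \\ W_NF \end{bmatrix}v_P + \begin{bmatrix} e_P \\ e_N + Z_N\phi + W_N\varepsilon \end{bmatrix} \qquad (6)$$

On letting $e^* = e_N + Z_N\phi + W_N\varepsilon$, we have that:

$$Var\begin{bmatrix} e_P \\ e^* \end{bmatrix} = \begin{bmatrix} I & 0 \\ 0 & Q \end{bmatrix}\sigma_e^2$$

where $Q = I_{(n-s)} + Z_ND_AZ_N'\alpha_A^{-1} + W_NG_\varepsilon W_N'\alpha_v^{-1}$, $\alpha_A = \sigma_e^2/\sigma_A^2$ and $\alpha_v = \sigma_e^2/\sigma_v^2$. Mixed model equations for (6) are:

$$\begin{bmatrix}
X_P'X_P + X_N'Q^{-1}X_N & X_P'Z_P + X_N'Q^{-1}Z_NP & X_P'W_P + X_N'Q^{-1}W_NF \\
Z_P'X_P + P'Z_N'Q^{-1}X_N & Z_P'Z_P + P'Z_N'Q^{-1}Z_NP + A_P^{-1}\alpha_A & Z_P'W_P + P'Z_N'Q^{-1}W_NF \\
W_P'X_P + F'W_N'Q^{-1}X_N & W_P'Z_P + F'W_N'Q^{-1}Z_NP & W_P'W_P + F'W_N'Q^{-1}W_NF + G_{vP}^{-1}\alpha_v
\end{bmatrix}
\begin{bmatrix} \hat\beta \\ \hat u_P \\ \hat v_P \end{bmatrix} =
\begin{bmatrix} X_P'y_P + X_N'Q^{-1}y_N \\ Z_P'y_P + P'Z_N'Q^{-1}y_N \\ W_P'y_P + F'W_N'Q^{-1}y_N \end{bmatrix} \qquad (7)$$

The matrices $A_P$ and $G_{vP}$ are the corresponding submatrices of $A$ and $G_v$ that belong to parents. Equations (7) give the solutions for RAM with genetic markers.

Of practical importance is the case where all non-parents have only one record so that $Z_N = I$. Then, $W_NG_\varepsilon W_N'$ and $Q^{-1}$ are diagonal (see Appendix A). The diagonal elements of $W_NG_\varepsilon W_N'$ are derived in Appendix A and they are equal to:

$2r(1 - r)(2 - f_s - f_d)$, when both the sire and the dam of the non-parent are known
$2r(1 - r)(1 - f_s) + 1$, when only the sire is known
$2r(1 - r)(1 - f_d) + 1$, when only the dam is known
$2$, if both the sire and the dam of the non-parent are unknown.

If there is zero probability that the paternal and maternal alleles at the MQTL of parent $p$ are the same (ie $f_p = 0$), the contribution to the diagonal element of $W_NG_\varepsilon W_N'$ is $2r(1 - r)$ (if marker information is available) or $1/2$ (if marker information is unavailable). This occurs because, in the absence of marker information, there is equal probability of receiving the MQTL from the grandsire and from the granddam.

A further simplification to (7) occurs when parents do not have records so that $Z_P$ and $W_P$ are zero and the model becomes a sire-dam model. A program for RAM, such as the one presented by Schaeffer and Wilton (1987) and modified to include marker information, can be employed to solve equations (7).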
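When every non-parent has one record, each diagonal element of $Q$ depends only on that animal's $D_A$ element and on the two diagonal elements of $G_\varepsilon$. The Python sketch below evaluates it for a single non-parent. It assumes the variance ratios are defined as $\alpha_A = \sigma_e^2/\sigma_A^2$ and $\alpha_v = \sigma_e^2/\sigma_v^2$, which is the convention that reproduces the value $Q = 1.1136$ quoted in the numerical example; the function names and the use of `None` for an unknown parent are illustrative only.

```python
from typing import Optional


def g_eps(r: float, f_parent: Optional[float]) -> float:
    """Diagonal element of G_eps for one MQTL allele of a non-parent.

    f_parent : inbreeding coefficient of the transmitting parent at the
               MQTL; None marks an unknown parent, for which the element is 1.
    """
    if f_parent is None:
        return 1.0
    return 2.0 * r * (1.0 - r) * (1.0 - f_parent)


def q_element(d_a: float, g_sum: float, sigma2_a: float,
              sigma2_v: float, sigma2_e: float) -> float:
    """Diagonal element of Q for a non-parent with a single record."""
    alpha_a = sigma2_e / sigma2_a  # assumed definition of alpha_A
    alpha_v = sigma2_e / sigma2_v  # assumed definition of alpha_v
    return 1.0 + d_a / alpha_a + g_sum / alpha_v


# Individual 4 of the numerical example: sire marker uninformative (r = 0.5),
# dam marker informative (r = 0.1), both parents non-inbred at the MQTL.
g_sum = g_eps(0.5, 0.0) + g_eps(0.1, 0.0)
q = q_element(d_a=0.5, g_sum=g_sum,
              sigma2_a=100.0, sigma2_v=10.0, sigma2_e=500.0)
print(round(g_sum, 4), round(q, 4))  # close to 0.68 and 1.1136
```

Dividing a non-parent record by its element of $Q$ before adding it to the parental equations is what produces right-hand-side entries such as $250 + 255/1.1136$ in the example.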
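More than one MQTL

Multiple MQTL ($k$, say) can be dealt with assuming independence by the following modification of model (1):

$$y = X\beta + Zu + (j_k' \otimes W)v + e \qquad (8)$$

where $j_k$ is a $k \times 1$ vector with all elements equal to 1, $\otimes$ denotes the direct product and $v' = [v_1', \ldots, v_k']$. We will assume that $Var(v_i) = G_{vi}\sigma_{vi}^2$ and $Cov(v_i, v_{i'}') = 0$. For $k = 2$ and letting $Q^* = I_{(n-s)} + Z_ND_AZ_N'\alpha_A^{-1} + W_N(G_{\varepsilon 1}\alpha_{v1}^{-1} + G_{\varepsilon 2}\alpha_{v2}^{-1})W_N'$, with $\alpha_{vi} = \sigma_e^2/\sigma_{vi}^2$ and $F_i$ the matrix $F$ for MQTL $i$, RAM equations for (8) are:

$$\begin{bmatrix}
X_P'X_P + X_N'Q^{*-1}X_N & X_P'Z_P + X_N'Q^{*-1}Z_NP & X_P'W_P + X_N'Q^{*-1}W_NF_1 & X_P'W_P + X_N'Q^{*-1}W_NF_2 \\
 & Z_P'Z_P + P'Z_N'Q^{*-1}Z_NP + A_P^{-1}\alpha_A & Z_P'W_P + P'Z_N'Q^{*-1}W_NF_1 & Z_P'W_P + P'Z_N'Q^{*-1}W_NF_2 \\
 & & W_P'W_P + F_1'W_N'Q^{*-1}W_NF_1 + G_{vP1}^{-1}\alpha_{v1} & W_P'W_P + F_1'W_N'Q^{*-1}W_NF_2 \\
\text{symmetric} & & & W_P'W_P + F_2'W_N'Q^{*-1}W_NF_2 + G_{vP2}^{-1}\alpha_{v2}
\end{bmatrix}
\begin{bmatrix} \hat\beta \\ \hat u_P \\ \hat v_{P1} \\ \hat v_{P2} \end{bmatrix} =
\begin{bmatrix} X_P'y_P + X_N'Q^{*-1}y_N \\ Z_P'y_P + P'Z_N'Q^{*-1}y_N \\ W_P'y_P + F_1'W_N'Q^{*-1}y_N \\ W_P'y_P + F_2'W_N'Q^{*-1}y_N \end{bmatrix} \qquad (9)$$

Backsolving for non-parents

After solving for fixed effects, parental breeding values and parental effects of the MQTL, the breeding values and additive MQTL effects of non-parents can be calculated. This is accomplished by writing the equations for $\phi$ and $\varepsilon$ from the mixed model equations of (5). This gives:

$$\begin{aligned}
Z_N'X_N\hat\beta + Z_N'Z_NP\hat u_P + Z_N'W_NF\hat v_P + (Z_N'Z_N + D_A^{-1}\alpha_A)\hat\phi + Z_N'W_N\hat\varepsilon &= Z_N'y_N \\
W_N'X_N\hat\beta + W_N'Z_NP\hat u_P + W_N'W_NF\hat v_P + W_N'Z_N\hat\phi + (W_N'W_N + G_\varepsilon^{-1}\alpha_v)\hat\varepsilon &= W_N'y_N
\end{aligned}$$

and after a little algebra:

$$\begin{bmatrix} Z_N'Z_N + D_A^{-1}\alpha_A & Z_N'W_N \\ W_N'Z_N & W_N'W_N + G_\varepsilon^{-1}\alpha_v \end{bmatrix}\begin{bmatrix} \hat\phi \\ \hat\varepsilon \end{bmatrix} = \begin{bmatrix} Z_N' \\ W_N' \end{bmatrix}(y_N - X_N\hat\beta - Z_NP\hat u_P - W_NF\hat v_P) \qquad (10)$$

Appendix B shows how to obtain solutions of equations (10), when all non-parents have one record, by solving $(n - s)$ independent systems of order 2. Using the predictors obtained from (7) and (10) in (3) and (4), solutions for non-parents are obtained.

EXAMPLE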
We use the same data that Fernando and Grossman (1989) employed. There are 4 individuals, 3 of them are parents and 1 is a non-parent. The file is:

[Pedigree and marker table]

Notice that individual 4 is inbred. A fixed effect was included and the matrix resulting from adjoining the incidence matrix $X$ and the vector of observations $y$, ie $[X|y]$ is:

$$[X|y] = \begin{bmatrix} 1 & 0 & 235 \\ 1 & 0 & 210 \\ 0 & 1 & 250 \\ 0 & 1 & 255 \end{bmatrix}$$

Variance components used were $\sigma_A^2 = 100$, $\sigma_v^2 = 10$ and $\sigma_e^2 = 500$ and $r = 0.1$. The matrices $G_v$ and $G_\varepsilon$ are presented in Fernando and Grossman (1989). First, solutions for AM were obtained. The coefficient matrix for AM is:

[Coefficient matrix for AM, of order 14]

and the right-hand side vector is [445, 505, 235, 210, 250, 255, 235, 235, 210, 210, 250, 250, 255, 255]'. The vector of solutions is [222.5, 251.764, 2.08109, -2.08109, -0.083214, 1.16537, 0.213435, 0.216098, -0.202783, -0.226749, 0.213102, -0.229745, 0.231409, 0.174809].
There are 11 equations in the system for RAM (as compared to 14 in AM) since there is only 1 non-parent (individual 4) and $Q$ is a scalar: $1.1136 = 1 + 0.5/5 + [2(0.5)(0.5) + 2(0.1)(0.9)]/50$. The right-hand side vector of (7) is [445, 478.987, 349.494, 210, 364.494, 349.494, 349.494, 210, 210, 456.088, 272.899]' and the coefficient matrix is:

[Coefficient matrix for RAM, of order 11]

Solutions for RAM are 222.5, 251.764, 2.08109, -2.08109, -0.083214, 0.213435, 0.216098, -0.202783, -0.226749, 0.213102 and -0.229745. The next step is to backsolve for individual 4 (non-parent) using equation (B.2). Since both parents of 4 are known, $d_{A44} = 0.5$ and $d_{o44} = 5/(5 + 0.5) = 10/11 = 0.90909...$ The diagonal elements of the $2 \times 2$ system in (B.2) are functions of $r$. However, as the information from the sire marker is unavailable, $r = 0.5$ for the first diagonal element. Also, $f_1 = f_3 = 0$ and $d_{o44}y_4^* = 0.90909[255 - 251.764 - 0.5(2.08109 - 0.083214 + 0.213435 + 0.216098) - 0.9(0.213102) - 0.1(-0.229745)] = 1.6848545$. For animal 4, we then have:

$$\begin{bmatrix} 0.90909 + 100 & 0.90909 \\ 0.90909 & 0.90909 + 277.778 \end{bmatrix}\begin{bmatrix} \hat\varepsilon_4^p \\ \hat\varepsilon_4^m \end{bmatrix} = \begin{bmatrix} 1.6848545 \\ 1.6848545 \end{bmatrix}$$

which has solutions $\hat\varepsilon_4^p = 0.0166428$ and $\hat\varepsilon_4^m = 0.00599141$. Putting these into (B.3) gives $\hat\phi_4 = 0.166428$. Therefore, BLUP($u_4$) = 0.5 BLUP($u_1$) + 0.5 BLUP($u_3$) + BLUP($\phi_4$) = 0.5[2.08109 + (-0.083214)] + 0.166428 = 1.16537. Also, BLUP($v_4^p$) = 0.5 BLUP($v_1^p$) + 0.5 BLUP($v_1^m$) + BLUP($\varepsilon_4^p$) = 0.5[0.213435 + 0.216098] + 0.0166428 = 0.231409 and BLUP($v_4^m$) = 0.9 BLUP($v_3^p$) + 0.1 BLUP($v_3^m$) + BLUP($\varepsilon_4^m$) = 0.9(0.213102) + 0.1(-0.229745) + 0.00599141 = 0.174809. As expected, solutions obtained by both AM and RAM are the same.

DISCUSSION
The advantage of RAM over AM increases as both the ratio between the number of non-parents and the number of parents and the number of independent MQTL increase. Goddard (1991) suggested the use of RAM to decrease the size of the resulting system of equations when working with information on flanking markers.

As shown in Appendix A and for a non-inbred individual, the fraction of the variance of the MQTL that is due to Mendelian segregation is $4r(1 - r)/2$. Now, $1 = (r + 1 - r)^2 = r^2 + 2r(1 - r) + (1 - r)^2$, so that $2[1 - 2r(1 - r)] = 2[r^2 + (1 - r)^2]$. Therefore, the fraction of the variance of the MQTL that is explained by parental segregation is $2[r^2 + (1 - r)^2]/2$. These proportions can also be worked out from formulae derived by Dekkers and Dentine (1991). A slight difference between their result and the one obtained here stems from the fact that they define the variance of the MQTL as one half the variance as defined by Fernando and Grossman (1989) ($0.5\sigma_v^2$).
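The two fractions above are complementary for every recombination rate, which can be checked numerically. The following Python sketch simply evaluates both expressions from the text over a few values of $r$; the function names are illustrative.

```python
def mendelian_fraction(r: float) -> float:
    """Share of MQTL variance due to Mendelian segregation: 4r(1 - r)/2."""
    return 4.0 * r * (1.0 - r) / 2.0


def parental_fraction(r: float) -> float:
    """Share explained by parental MQTL effects: 2[r^2 + (1 - r)^2]/2."""
    return 2.0 * (r * r + (1.0 - r) ** 2) / 2.0


for r in (0.0, 0.1, 0.2, 0.5):
    m = mendelian_fraction(r)
    p = parental_fraction(r)
    # The two shares always add up to one (up to rounding error).
    print(r, round(m, 4), round(p, 4), round(m + p, 4))
```

At $r = 0$ the parental MQTL effects explain all of the variance, and at $r = 0.5$ (no marker information) they explain one half, the same share that parent average explains for a polygenic breeding value.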
Both AM and RAM rest on knowing the variance components as well as the recombination rate between the marker gene and the QTL. As the latter parameter enters into the variance-covariance matrix of QTL effects in a rather complex manner, its estimation by the classical methods employed in animal breeding seems to be difficult, as discussed by Fernando (1990).

When more than one MQTL is being considered, covariances between pairs of MQTL effects are likely to be non-zero due to linkage disequilibrium caused by selection (Bulmer, 1985). Model (8) assumes that these covariances are zero. The extent of the error in predicting $v$ (or functions of $v$) due to incorrectly assuming null covariances between MQTL effects will depend on the magnitude and sign of the covariance. If the covariances are mostly negative, which is likely to happen on a trait undergoing selection (Bulmer, 1985), MQTL effects may be overpredicted. Research is in progress to overcome this restriction of model (8).
APPENDIX A

Derivation of the diagonal elements of $W_NG_\varepsilon W_N'$ when all non-parents have one record

First we show that $W_NG_\varepsilon W_N'$ is diagonal. Because $G_\varepsilon$ is diagonal (Fernando and Grossman, 1989), we can write:

$$W_NG_\varepsilon W_N' = \sum_j w_jw_j'g_j$$

where $w_j$ is the column $j$ of $W_N$ and $g_j$ is diagonal element $j$ of $G_\varepsilon$. Now, $w_j$ has all its elements equal to zero except for a 1 in position $j$. Therefore, the matrix $w_jw_j'g_j$ has all elements equal to zero except for element $j, j$ which is equal to $g_j$. The paternal and maternal MQTL additive effects of an animal are in consecutive columns of the matrix $W$ (and $W_N$), $w_j$ and $w_{j+1}$ say, and these are equal. We then have:

$$w_jw_j'g_j + w_{j+1}w_{j+1}'g_{j+1} = w_jw_j'(g_j + g_{j+1})$$

and $W_NG_\varepsilon W_N'$ is diagonal with non-zero elements equal to $g_j + g_{j+1}$. Now,

$$(g_j + g_{j+1})\sigma_v^2 = Var(\varepsilon_o^p) + Var(\varepsilon_o^m) = [2r(1 - r)(1 - f_s) + 2r(1 - r)(1 - f_d)]\sigma_v^2$$

where $f_s$ and $f_d$ are the inbreeding coefficients of sire and dam for the MQTL, respectively. The last equality follows from expressions (12a) and (12b) in Fernando and Grossman (1989). After some rearranging, the diagonal element of $W_NG_\varepsilon W_N'$ is:

$$2r(1 - r)(2 - f_s - f_d)$$

when both parents of the individual are known. If the sire is unknown, $\varepsilon_o^p = v_o^p$ and the diagonal element is $2r(1 - r)(1 - f_d) + 1$. If the dam is unknown, $\varepsilon_o^m = v_o^m$ and the diagonal element is $2r(1 - r)(1 - f_s) + 1$. If both parents are unknown the diagonal element of $W_NG_\varepsilon W_N'$ is 2.

APPENDIX B

Solutions of
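equations (10) when non-parents have one record

When non-parents have one record ($Z_N = I$), equations (10) reduce to:

$$\begin{bmatrix} I + D_A^{-1}\alpha_A & W_N \\ W_N' & W_N'W_N + G_\varepsilon^{-1}\alpha_v \end{bmatrix}\begin{bmatrix} \hat\phi \\ \hat\varepsilon \end{bmatrix} = \begin{bmatrix} I \\ W_N' \end{bmatrix}(y_N - X_N\hat\beta - P\hat u_P - W_NF\hat v_P)$$

On absorbing the equations for $\phi$, the solution for $\varepsilon$ is:

$$(W_N'D_oW_N + G_\varepsilon^{-1}\alpha_v)\hat\varepsilon = W_N'D_o(y_N - X_N\hat\beta - P\hat u_P - W_NF\hat v_P) \qquad (B.1)$$

The matrix $D_o = I - (I + D_A^{-1}\alpha_A)^{-1}$ is diagonal with element $d_{oii}$ being equal to:

$$d_{oii} = \alpha_A/(\alpha_A + d_{Aii})$$

and $d_{Aii}$ is diagonal element $i$ of $D_A$. Since $W$ (and $W_N$) has rows with 2 consecutive elements equal to 1 and the rest equal to 0, $W_N'D_oW_N$ is block diagonal, each block being of order $2 \times 2$ with all elements equal to $d_{oii}$. Adding $G_\varepsilon^{-1}\alpha_v$ gives the coefficient matrix on the left-hand side of (B.1) and solutions for $\varepsilon$ can be obtained by solving $(n - s)$ systems of order 2. The system for animal $i$ is equal to:

$$\begin{bmatrix} d_{oii} + \alpha_v/g_j & d_{oii} \\ d_{oii} & d_{oii} + \alpha_v/g_{j+1} \end{bmatrix}\begin{bmatrix} \hat\varepsilon_i^p \\ \hat\varepsilon_i^m \end{bmatrix} = \begin{bmatrix} d_{oii}y_i^* \\ d_{oii}y_i^* \end{bmatrix} \qquad (B.2)$$

where $g_j$ and $g_{j+1}$ are the diagonal elements of $G_\varepsilon$ for the paternal and maternal MQTL residuals of animal $i$, and $y_i^*$ is element $i$ of the vector $y_N - X_N\hat\beta - P\hat u_P - W_NF\hat v_P$. Since the coefficient matrix of the system for $\phi$ is diagonal, BLUP($\phi_i$) is:

$$\hat\phi_i = \frac{d_{Aii}}{d_{Aii} + \alpha_A}(y_i^* - \hat\varepsilon_i^p - \hat\varepsilon_i^m) \qquad (B.3)$$

If there is more than one MQTL the matrix of system (10) becomes poorly conditioned. The reason is that all off-diagonal elements are equal to 1 and the diagonals are relatively large (may be in the order of hundreds, depending on the $\alpha$'s). An exact solution can be obtained by writing the matrix of system (10) for each animal as $jj' + S$, where $j$ is a $(1 + 2m)$ vector with all elements equal to one and $S$ is a diagonal matrix. Using the inverse of the sum of matrices formula (Henderson and Searle, 1981), we have that:

$$(jj' + S)^{-1} = S^{-1} - S^{-1}j(1 + j'S^{-1}j)^{-1}j'S^{-1} \qquad (B.4)$$

Notice that the expression in parentheses on the right-hand side of (B.4) is a scalar. Using (B.4) the inverse of the matrix of system (10) when there are $m$ MQTL's for a non-parent is:

$$(jj' + S)^{-1} = S^{-1} - g^{-1}S^{-1}jj'S^{-1} \qquad (B.5)$$

and $g = 1 + S_1 + S_2 + \ldots + S_{2m+1}$, where the $S_i$ denote the diagonal elements of $S^{-1}$. The $S$'s are such that $S_1 = d_{Aii}\alpha_A^{-1}$, $S_2 = 2r_1(1 - r_1)(1 - f_s)\alpha_{v1}^{-1}$, $S_3 = 2r_1(1 - r_1)(1 - f_d)\alpha_{v1}^{-1}$, and so on for MQTL's 2 to $m$. Expression (B.5) is easy to program, does not require iteration and, more importantly, it is not subject to the numerical problems that occur when solving such a system of equations.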
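The backsolving step for a non-parent with one record can be sketched with the closed-form inverse of $jj' + S$. The Python example below is a minimal illustration, not the authors' program: it assumes the per-animal system is $(jj' + S)x = jy_i^*$ with $x = [\hat\phi_i, \hat\varepsilon_i^p, \hat\varepsilon_i^m]'$ and takes as input the diagonal of $S^{-1}$; applying the sum-of-matrices inverse to the right-hand side $jy_i^*$ collapses to $x_k = S_k y_i^*/g$. The numbers are those quoted for individual 4 in the example, and the function names are hypothetical.

```python
from typing import List


def backsolve_one_record(y_star: float, s_inv: List[float]) -> List[float]:
    """Solve (jj' + S) x = j * y_star for diagonal S.

    s_inv : diagonal elements of S^{-1}, ordered as
            [d_A / alpha_A, g_p / alpha_v, g_m / alpha_v, ...].
    Uses (jj' + S)^{-1} = S^{-1} - S^{-1} j j' S^{-1} / g, g = 1 + sum(s_inv).
    """
    g = 1.0 + sum(s_inv)
    return [s * y_star / g for s in s_inv]


def max_residual(x: List[float], y_star: float, s_inv: List[float]) -> float:
    """Largest absolute residual of (jj' + S) x - j * y_star."""
    total = sum(x)
    return max(abs(total + xi / si - y_star) for xi, si in zip(x, s_inv))


# Record of individual 4 adjusted for the RAM solutions quoted in the text.
y_star = (255.0 - 251.764
          - 0.5 * (2.08109 + (-0.083214))       # parent average, sire 1 and dam 3
          - 0.5 * (0.213435 + 0.216098)         # sire MQTL effects, r = 0.5
          - (0.9 * 0.213102 + 0.1 * (-0.229745)))  # dam MQTL effects, r = 0.1

# alpha_A = 500/100 = 5 and alpha_v = 500/10 = 50 (assumed definitions).
s_inv = [0.5 / 5.0, 0.5 / 50.0, 0.18 / 50.0]

phi, eps_p, eps_m = backsolve_one_record(y_star, s_inv)
u4 = 0.5 * (2.08109 + (-0.083214)) + phi
print(round(phi, 4), round(eps_p, 4), round(eps_m, 4), round(u4, 4))
```

The results agree with the values reported in the example to about four decimal places; the small differences come from the rounding of the published solutions used as inputs. Note that the scalar $g$ equals the element of $Q$ for the same animal, so no extra quantity has to be stored for the backsolving step.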
ACKNOWLEDGMENTS

This work was supported by the Natural Sciences and Engineering Research Council of Canada. LR Schaeffer provided many helpful comments. G Banos and N Caron translated the summary into French. Thanks are also extended to the editor for his comments and to a referee for pointing out a problem with the definition of the inbreeding coefficient at the MQTL.

REFERENCES
Bulmer MG (1985) The Mathematical Theory of Quantitative Genetics. Clarendon Press, Oxford
Dekkers JCM, Dentine MR (1991) Quantitative genetic variance associated with chromosomal markers in segregating populations. Theor Appl Genet 81, 212-220
Fernando RL (1990) Statistical problems in marker assisted selection for QTL. In: 4th World Congr Genetics Applied to Livestock Production, Edinburgh, 23-27 July 1990, vol 13 (Hill WG, Thompson R, Woolliams JA, eds) 433-436
Fernando RL, Grossman M (1989) Marker assisted selection using best linear unbiased prediction. Genet Sel Evol 21, 467-477
Goddard ME (1991) A mixed model for analyses of data on multiple genetic markers. Theor Appl Genet (submitted)
Henderson CR (1984) Applications of Linear Models in Animal Breeding. University of Guelph, Guelph, Ontario
Henderson CR (1985) Equivalent linear models to reduce computations. J Dairy Sci 68, 2267-2277
Henderson HV, Searle SR (1981) On deriving the inverse of a sum of matrices. SIAM Rev 23, 53-59
Quaas RL, Pollak EJ (1980) Mixed model methodology for farm and ranch beef cattle testing programs. J Anim Sci 51, 1277-1287
Schaeffer