HAL Id: hal-00893833
https://hal.archives-ouvertes.fr/hal-00893833
Submitted on 1 Jan 1990
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Estimating genetic parameters using an animal model
with imaginary effects
R Thompson, K Meyer
To cite this version:
Original
article
Estimating
genetic
parameters
using
an
animal model with
imaginary
effects
R
Thompson
1
K
Meyer
2
1
AFRC
Instituteof
AnimalPhysiology
and Genetics,Edinburgh
ResearchStation,
Roslin, Midlothian, EH25 9PS;2
Institute of Animal Genetics, University of Edinburgh,
King’s
Buildings,Edinburgh EH9 3JN, UK
(Received
8 July 1988; accepted 24 November1989)
Summary - The estimation of additive genetic variance
by
maximum likelihood isdiscussed. The extension of reduced animal models, when parents are not inbred to
allow the use of
existing
algorithms for maximum likelihood estimation of additive geneticvariance, is described. This involves the introduction of imaginary effects with
negative
variance, and leads to computation using complex arithmetic. Methods are developed to
allow the computation to be carried out
using
only real arithmetic. This method hascomputational advantages
whenonly
a small proportion of animals haveoffspring.
genetic parameter / animal model / estimation / maximum likelihood
R.ésumé - Estimation de paramètres génétiques selon un modèle animal incluant
des effets imaginaires - On discute l’estimation de la variance génétique additive par le
maximum de vraisemblance. On décrit l’extension des algorithmes
disponibles
d’estimationpar le maximum de vraisemblance de la variance génétique additive, à la situation d’un
modèle animal réduit au cas où les parents ne sont pas consanguins. Ceci passe par
l’introduction
d’effets
imaginaires, de variancenégative,
et conduit à des calculs enarithmétique complexe. Des méthodes sont développées pour n’utiliser que l’arithmétique
réelle. Cette méthode présente des avantages numériques quand une
faible
proportionseulement des animaux ont des descendants.
paramètre génétique / modèle animal / estimation / maximum de vraisemblance
INTRODUCTION
Additive
genetic
variance andheritability
have mostcommonly
been estimated inanimal
breeding
data information from collateralrelatives,
such as half-sibs orfull-sibs,
or non-collateralrelatives,
such asparent-offspring.
Covariancesgenerated
by
theserelationships provide
the most information forestimating
additive ge-netic variance.However,
there is interest incombining
these alternative estimates*
(Nicholas
andHill,
1974),
and alsousing
the additional information from othergenetic
relationships.
Linear models can be formed that contain
genetic
and environmental effects foreach
animal,
commonly
called individual animal models(eg,
Quaas
andPollak,
1980).
Intheory
atleast,
standard or restricted maximum likelihood estimation(ML
orREML)
procedures
can beapplied
to individual animal models for variance component estimation(Harville,
1977).
Oneproblem
is that this model generatesas many
equations
as there are animals in thepedigree
and so there have beenattempts to reduce the
computational
effort.Thompson
(1977)
considered the caseof two
generations
and showed how todevelop
REMLestimating equations just
for animals in the firstgeneration.
Moregenerally,
this estimation canusually
beinterpreted using
predicted
breeding
values.Quaas
and Pollak(1980)
showed thata reduced animal model
(RAM)
hasadvantages
incalculating predicted breeding
values.
Notably, only
equations
for animals withoffspring
are needed. Henderson(1986)
and Sorensen andKennedy
(1986)
showed that REML and minimum normquadratic
(MINQUE)
estimation can beexpressed
withadvantage using
a RAM.However,
these(REML)
methods are iterativeand,
aspresented,
require
the inversion of a matrix with as many rows and columns as the number of parents in each round of iteration. For some variance componentproblems,
thiscomputa-tional burden can be reduced
by using orthogonal
transformations togive simpler
equations,
either in terms of adiagonal
matrix(Patterson
andThompson,
1971;
Dempster
etal,
1984),
or in termsof a tridiagonal
matrix(Smith
andGraser,
1986).
These results are notdirectly applicable
to estimation methods based on a RAM.This paper shows how a RAM can be modified
by
the introduction of extra random effects withnegative
variances,
so that theresulting
matrices involved in estimation can betridiagonalized using
existing
algorithms
based on Householder transforma-tions.Complex
numbers are needed to take account ofnegative
variances,
but a newalgorithm
isgiven
that takes account of thespecial
structure of the matricesinvolved,
and reduces the need for arithmetic based oncomplex
numbers.THE MODEL
If additive
genetic
covariances are theonly
source of covariances betweenrecords,
then a linear model for the individual animal model(IAM)
can be written as:with
E(y)
=Xb,
E(a) = E(e)
= 0 andV(a)
=Ao, 2,
V(e)
=Ia5
andcov(a,e’)
=0,
where y,b,
a and e denote the vectors ofobservations,
fixedeffects,
animal effects and residual errors. X and Z are thecorresponding
incidencematrices,
and forsimplicity,
we assume X had r columns and rank r. Withsingle
records per
animal,
Z =I
nt
,
where m denotes the number of observations. A is thenumerator
relationship
matrix(NRM)
between animals.As described
by Quaas
and Pollak(1980),
with aRAM,
the vector of animals isdivided into parents,
ie,
animals which haveoffspring
(subscript
P in thefollowing),
and non-parents,ie,
animals without progeny(denoted
by subscript
N).
The additivegenetic
value for non-parents is thenpartitioned
into contributions fromwith the residual e in
eqn(l)
into a new residual error,gives
the RAM as areparameterisation
of eqn(1):
Let p denote the number of parents and q the number of non-parents, then
(p
+ q =m).
Z
N
is then a matrix of orderqxp, with elements z2! =
0.5,
if j
isa parent of
i,
and zero otherwise. Thisrepresentation,
strictly speaking,
assumesall parents of non-parents are in the data and one record per individual.
However,
eqn(2)
can beeasily
modified,
if that is not the case. The residual error for anon-parent has variance
at
v
=
a5
+0.5
QA
,
if both parents areknown,
and varianceat
v
=
QE
+0.75
QA
,
ifonly
one parent isknown, provided
they
are not inbred. In thefollowing,
it is assumed that there isequal parental
information for all non-parentsand that parents are non inbred
ie,
thata 2 w
is constant.Estimation
equations,
based on theRAM, usually
have to be reconstructed after each variance iteration because the ratio of error to additive variancechanges.
Therefore,
forcomputational
reasons, it is desirable to have a model where thevector of residuals has
homogeneous
variance. This can be achievedby
a furtherreparameterisation,
adding
effects to either parents or non-parents. As there areusually
fewer parents than non-parents, fewerequations
will begenerated
ifeqn(2)
is
expanded
to:The variance of e, is such that
var(e
j
)
+-
var(e
P
e
j
)
=var(ep)
=Ipas
andvar(ep —
e
l
)
=lo,2&dquo;
so that ep — ej and eN have the same variance.Hence,
var(e
1
)
_
-
C
2
UA
Ip,
with2
c
= 0.75 and 0.50 for one and both parentsknown,
respectively.
Because of theassumption
that there isequal parental
information for all non-parents,c
2
is the same for all e, values.Normally,
variances are assumed tobe
positive
so there is aslight difficulty
ininterpreting
el with anegative
variance.However,
if aD are effects with variancea A 2 Ip,
thendefining
ej =ica
D
,
wherei =
i
,
then the variance of ej =,
A Ip.
2
2
0
C
-
Hence,
ei can bethought
of asimaginary
effects asthey
are amultiple
of i.Hence,
eqn(3)
can be written in terms of real effects as:with e =
[(ep - e’) eN]&dquo;
This
gives
mixed modelequations
(MME)
as:where A denotes the variance ratio
!yt,/!A,
andAp
is the numeratorrelationship
Eliminating
b fromeqn(4)
gives
equations
of the formwhere
Hence,
The component matrices
H
u
,
H
12
andH
22
are all real.Equation
(7)
gives predictors
of the effects ap and aD. Note that ap is real and aD is
imaginary,
so that iaR = aD with aRreal,
and thepredictor
for el =-ca
R
is a real
quantity,
as would beanticipated.
REMLequations
to estimateQA
andQ
yy
are(in
terms of realquantities):
where r is the rank of
(X’ p Xp
+X%XN ) ,
and n is the number of elements in bothap and a
D
.
These
equations
involvea
2
A
andQ
yy
through A =
!yy/QA
on both sidesof eqns(8)
and
(10),
and have to be solvediteratively.
Tosimplify
eqns(8)
and(10)
and avoid the inversion associated withAp
l
,
Smith and Graser(1986)
andMeyer
(1987)
suggest
writing
Ap
as LL’ andusing
aL =L-
l
ap
in a modelequivalent
toeqn(3).
(A
I
is not to be confused with A orAp,
numeratorrelationship
matrices).
This iterative scheme
requires
the inversion of(A
I
+ IA)
in each iteration. Smith and Graser(1986)
have shown that thesecomputations
can be reducedby writing
A
l
as(PA!P’)
where P is anorthogonal
matrix andA*
is
atridiagonal
matrix.Using algebra
similar to that of Smith and Graser(1986),
it can be shown thatquantities
arising
ineqns(11)
and(12)
can be written in terms of(A*
+ >’1)-
1
and Forexample:
A!
can be foundusing
a sequence of Householdertransformations, using complex
arithmetic. In theAppendix,
it is shown how thespecial
structure ofA
l
with realquadrants
(L’BL
andD)
andimaginary quadrants
(iL’C)
and(iL’C)’
can be usedto
give
analgorithm
forA*
and
q
*
and to evaluateeqns(12)
to(16),
using only
real arithmetic. It should be noted that this
computational
strategy isusing
anexisting
iterative scheme and ismanipulating
theequations
to reduce the numberof
computations.
NUMERICAL EXAMPLE
To illustrate the formation of the model and mixed model
equations,
we use anexample
of 5 observations of the same sex with individuals 2 and3,
theoffspring
ofindividual 1,
and individuals 4 and5,
theoffspring
of individual 2. There is 1 fixedwith the observation vector coded so that observations on parents
(1
and2)
occur first. The variance of ei is
a
5
+3/4<r!,
as animals haveonly
one known parent. The variance of(y¡)
=QA
-
3/4<7!
+a5
+3/4
Q
A
=a5
+!A,
var(
y
s) _
1/4
QA
+
QE
+ 3/4
QA
=
QE
+
QA
.
and
eqn(5)
becomes:Eliminating
61 andb
2
gives:
This shows the structure of
eqn(6)
with two realquadrants
and twoimaginary
quadrants
in the matrix on the left hand side. For thisexample,
estimates of the effects can be foundby
partitioning
the coefficient matrix into 2 x 2 real andimaginary
parts andusing
results on inverses ofpartitioned
matrices.DISCUSSION
This is a novel
approach
with theadvantage of working
with matrices of size2p,
rather than(n+p),
and thecomputations
involved,
are of the order of(2p)
3
,
ratherthan
(n
+p)
3
.
Thistechnique
is,
therefore,
of more use inpopulations
with a lowproportion
of animals used as parents. Thetechnique
can beeasily
extended toestimates 2 multivariate residual and additive variance components matrices when
measurements are taken on all animals
using
theprocedures developed by Meyer
(1985).
Two
assumptions
are made.First,
that all non-parents haveequal parental
information. If this
assumption
is notsatisfied,
non-parents with unknown parentscan have
dummy
parents inserted into the model. In most animalbreedings
setswhere
inbreeding
isconsciously avoided,
the secondassumption
(parents
are notinbred)
isunlikely
to beimportant.
There is morelikely
to be concern about residual variancehomogeneity,
than aboutinbreeding generated genetic
variancehomogeneity.
obvious
competitor,
but it is difficult to sayprecisely
when each method is tobe
preferred.
The timerequired,
depends
on the number ofanimals,
structure ofpopulation
the sequence used incalculating
thelikelihood,
the number of variatesmeasured and the
speed
of convergence of the iterativeprocedure.
Their commentssuggest that our method could be
advantageous
if the number of parents is lessthan 350.
The method has been
presented
for a model with additivegenetic
covariances,
but in many cases, other components, such as littervariances,
need estimation. In such cases, theprocedure
can be used for agiven
ratio of litter variance to residual variance andrepeated
for different values of theratio,
in a similar manner to thatsuggested
by
Smith and Graser(1986).
REFERENCES
Dempster AP,
Selwyn
MR,
PatelIM,
Roth AJ(1984)
Statistical andcomputational
aspects of mixed model
analysis. Appl
Stat33,
203-214Graser
H-U,
SmithSP,
Tier B(1987)
A derivative-freeapproach
forestimating
variance components in animal models
by
restricted maximum likelihood. J AnimSci
64,
1362-1370Harville DA
(1977)
Maximum likelihoodapproaches
to variance component esti-mation and to relatedproblems.
J Am Stat Assoc72,
320-338Henderson CR
(1986)
Estimation of variances in Animal Model and Reduced Animal Model forsingle
traits andsingle
records. JDairy
Sci60,
1394-1402Meyer
K(1985)
Maximum likelihood estimation of variances components for a multivariate mixed model withequal
design
matrices. Biometrics41,
153-166Meyer
K(1987)
A note on the use of anequivalent
model to account forrelationships
between animals inestimating
variance components. J Anirrc Breed Genet104,
163-168Nicholas
FW,
Hill WG(1974)
Estimation ofheritability by
bothregression
ofoffspring
on parent and intra-class correlation of sibs in oneexperiment.
Biometrics30,
447-68Patterson
HD,
Thompson
R(1971)
Recovery
of inter-block information when blocksizes are
unequal.
Biorn.etrika58,
545-554Quaas
RL,
Pollack EJ(1980)
Mixed modelmethodology
for farm and ranch beefcattle
testing
programs. J Anirn, Sci51,
1277-1287Smith
SP,
Graser H-U(1986)
Estimating
variance components in a class of mixed modelsby
restricted maximum likelihood. JDairy
Sci69,
1156-1165Sorensen
DA, Kennedy
BW(1986)
Analysis
of selectionexperiments
using
mixedmodel
methodology.
J Anim Sci63,
245-258Thompson
R(1977)
The estimation ofheritability
with unbalanced data. II. Data available on more than twogenerations. Biometrics, 33,
497-504APPENDIX
Reduction of
complex
matrices totridiagonal
formA square matrix is said to be
tridiagonal
(in
the first trows)
if theonly
non-zeroelements are in the r,
s elements,
where s = r y-1,
r, or r + 1(r
<t).
A realsymmetric
matrixA
i
,
of size n x n, can be reduced totridiagonal
formAn-1
by
a sequence of
(n — 2)
Householder transformations(for
example,
Wilkinson andReinsch,
1971).
In this sequence of Householder
transformations,
at thet
th
stage, thetransfor-mation
P
t
is chosen to make thett!’
row ofAt+,
contain all zeroelements,
exceptpossibly
those in the(t — 1),
t, and(t
+1)
columns. Thisoperation
will be calledpivoting
on thet
th
row. The non-zero elements in thej
th
(1
< j
<n)
row ofA
n
-
i
are in the
(j - 1)t’
and(j
+&dquo;
t
1)
columns,
indicating
theprevious
(j -1)
and next( j
+1 )
pivots.
For the
algorithm presented
in this paper, acomplex
matrixA
l
of the form:where
B, C, D
are real matrices[of
size(n,
xn
l
), (n,
x)
n
2
and(n
2
xn
2
)]
is tobe
tridiagonalized.
In order to illustrate the
method,
a numericalexample
will begiven.
The matrix Alsatisfying
eqn(A1)
with:will be
used,
with n, = n2 = 4(for
convenience,
when matrices aresymmetrical,
only
the lowertriangular
part isgiven).
The sequence of Householdertransfor-mations derived for real matrices could be used
but,
as a square root of a sumof squares, at, is used and as this could be
negative,
this would involvedcom-plex arithmetic,
which iscomputationally considerably
moredemanding
than realarithmetic. A modification of the
tridiagonalization
is nowpresented
which avoidsThere are two stages involved. In the
first,
a sequence of Householder transfor-mation is used thatgives
At+,
=t
Pt,
A
t
P
whereAt+,
andP
t
are found to havethe same form as A in
eqn(A1)
withquadrants
of real andimaginary
numbers. This involveschanging
theordering
of thepivoting.
The rows ofA
l
aresplit
into two sets 1 to nl and(1 + n
i
)
to(n,
+ n
2
),
corresponding
to the division intoquadrants
in
eqn(Al).
Thepivoting
is started on row 1 and continues in the first set untilat <
0,
then wepivot
on the first row of the second set and continue in this setuntil
again
at < 0. The process is thenrepeated
untilnl + n2 - 1 = n -
1 rowshave been
pivoted
on. Thisprocedure produces
a matrixAt+,
with as many non-zero elements as the usualprocedure,
butA
t+i
is notnecessarily tridiagonal.
Forexample,
for the numericalexample,
theordering
ofpivoting
is found to be rows 1,2, 5, 3, 6, 7,
giving
A
7
.
There are atmost, 3 non-zero elements in each row of
A
7
.
In the secondstage, A
n
-
i
will bepermuted
togive
atridiagonal
matrix.The first stage is now illustrated
using
recursive arguments. Somehousekeeping
notation is needed to
identify pivoted
rows. In the real case, the index t was relatedto the number of transformations carried out
(t — 1),
the rows which have beentransformed
(1... t - 1)
and the next row(t)
to bepivoted.
Thecomplex
case ismore
complicated;
forexample,
formation ofA
4
for the numericalexample
involvespivoting
on row1,
2 and 5 in turn, and the next row to bepivoted
is row 3. It isconvenient to define
R
t
,
S
t
andT
t
to indicate thatprevious
operations
have used the first(R
t
-
1)
rows of the first set of rows and the first(S
t
-
1)
rows of thesecond set of rows as
pivots
and thatT
t
is the nextpivot. K
t
is used to indicate if the nextpivot
(T
t
)
is in the first set of rows(K
t
=1)
or in the second set ofrows
(K
t
=2).
Within the two sets ofrows, the rows are transformed in sequence
so that
T
t
=R
t
,
ifK
t
=1,
andT
t
=n
1 +
S
t
,
ifK
t
= 2.Hence,
in the numericalexample,
R4= 2 + 1; S
4
= 5 - 4 + 1; T
4
= 3; K
4
=1; R
5
= 4. As a total of(t - 1)
rows of
A
l
have beenpivoted,
thisequals
the sum of rowspivoted
in the two sets; i e,(R
t
-
1)
+(S
t
-
1)
= t - 1.Initially,
t =1, R
l
=1, S
l
= 1 and the first row inthe first set can be used as the first
pivot,
and so,T
i
=1, K
l
r 1.Two
slightly
differentstrategies
are needed forK
t
= 1 andK
t
= 2. Thederivation of
At+1
isgiven
firstwhen K
t
= 1. Tosimplify
notation in thist
th
stage, let r =
Rt,
s =S
t
, k
=,
K
t+1
and supposeAt
is of the form(Al).
AsK
t
=1,
then the next row to bepivoted
is r, soT
t
= r; the number of rows in thetwo sets
already
used aspivots
inAt+,
will be r, and s -1,
so thatR
t+1
= r + 1A matrix
A
t+i
isrequired,
so thatAt+,
=P
t
A
t
Pt.
Using
the same argumentas in usual Householder transformations:
with terms in
(A2)
to(A4)
expressed
as real numbers wherepossible.
A sum ofsquares ot is
needed,
given
by:
with
b
rm
,
crm andd
rm
being
elements of the real matricesB,
C, D,
thattogether
form
At
(using (A1)).
The elements off,
and,
2
f
added to ut, and Ut2, aregiven
by,
There are two
possibilities
in(A6):
if Qt > 0, then usual Householder transfor-mation isused,
and the nextpivot
will be the(r
+1)t!’
row inthe first
set(k
=1),
and the
(r,
r +1)
element ofAt+,
will be non-zero.However,
if at <0,
then the(n,
+s)
th
row is used as apivot
(k
=2)
in order thatAt+1
andP
t
have the sameform as
(Al).
Then
At+1
can be calculatedusing:
1- - ..
--For the numerical
example
with T1 =1,
then from(A5)
and(A7),
<ri =0.375;
f
l
=-0.6124;
f
2
=0;
k =K
2
= 1.So from
(A4):
Hence, using
(A10),
A
2
has the form of(Al)
with:The first column of
A
2
hasonly
1 non-zerooff diagonal
element;
the(2,1)
element.A
2
has R
2
= 2 andS
2
=
1;
asonly
the first row has beenpivoted.
As 172 <0,
thenK
2
=1,
and the next row to bepivoted
is 2 andT
2
= 2.Equations
(A5)
and(A7)
show that a2 =
-75.8750;
f
l
=0.0000;
f
2
=8.7106; K
3
= k = 2.
Formulae are now
presented
to show how topivot
on a row in the second set ofAt
i e, K
t
= 2. These formulae are similar to(A2)
to(A10),
with somechanges
inThen,
Using
(All)
to(A19),
matrixA
4
can be constructed from:A
7 is indicated above.
Hence,
a matrix has been constructedwith,
at most,2(n - 1)
off diagonal
elements. In this
example,
there are12,
as the(6,4)
and(4,6)
elements are zero. The second stage is to transformAt-1
totridiagonal
formAt_
1
.
This can besimply
achievedby
recording
the rows and columns in the order that thepivoting
was carried out;ie, T
t
=(t
= 1, ... ,n),
whereT
n
is the row not included in the set(Ti
7
T
2
, ...
)T
n
-
1
)
7
and is either n1 or(n,
+n
2
)-In the numerical
example:
T
1
=1,
T2 = 2,T
3
=5,
T4 =3,
T5 =6, T
6
=4,
T
Formation of
equations
(12)
to(15)
If
(A*
+
AI)
andq
*
are realmatrices,
say G and q, then Smith and Graser(1986)
suggestwriting
G as U’DU(where
U is a lowertriangular
matrix withdiagonal
elements
1,
and D is adiagonal
matrix)
andgive
analgorithm
to find the non-zeroelements of D and
U;
ie,D
j
andU!,!.
and the other elements of D and G are zero.
The elements of a+ =
G-
l
q
can be foundby
eliminating
the elements of a+ in turn, andback-substituting
to findat.
That is in the firstplace
form:Expression
(13)
can be found from eitheraj
+
2 / D
j
oratat
+
.
In our case,
(A*
+
AI), q
*
and a+ have real andimaginary
terms. The indicesK
j
indicate if the elements of(A*
+AI),
q
*
and a+ are real orimaginary. Suppose
Then Smith and Graser’s
algorithm
can be usedagain,
and in terms of realarithmetic if 3 extra sets of coefficients are defined. There are 4 cases to consider:
The real coefficients associated with U and D are now
given by:
The real coefhcients of
a
+
,
a, can be found fromFor instance, for the numerical