HAL Id: hal-00894120
https://hal.archives-ouvertes.fr/hal-00894120
Submitted on 1 Jan 1996
Computation of all eigenvalues of matrices used in
restricted maximum likelihood estimation of variance
components using sparse matrix techniques
C Robert, V Ducrocq
Original article
Station de génétique quantitative et appliquée, Centre de recherches de Jouy-en-Josas, Institut national de la recherche agronomique, 78352 Jouy-en-Josas cedex, France

(Received 16 March 1995; accepted 3 October 1995)
Summary - Restricted maximum likelihood (REML) estimates of variance components have desirable properties but can be very expensive computationally. Large costs result from the need for the repeated inversion of the large coefficient matrix of the mixed-model equations. This paper presents a method based on the computation of all eigenvalues using the Lanczos method, a technique reducing a large sparse symmetric matrix to a tridiagonal form. Dense matrix inversion is not required. It is accurate and not very demanding on storage requirements. The Lanczos method, the computation of eigenvalues, its application in a genetic context, and an example are presented.

Lanczos method / sparse matrix / restricted maximum likelihood / eigenvalue

Résumé - Computation of all eigenvalues of the matrices used in restricted maximum likelihood estimation of variance components using sparse matrix techniques. Restricted maximum likelihood (REML) estimates of variance components have attractive properties but can be costly in computing time and in memory requirements. The problem stems from the need to repeatedly invert the coefficient matrix of the mixed-model equations. This article presents a method based on the computation of the eigenvalues and on the use of the Lanczos method, a technique that reduces a large, sparse, symmetric matrix to a tridiagonal matrix. Inversion of dense matrices is not required. The method gives accurate results and demands very little storage in memory. The Lanczos method, the computation of the eigenvalues, its application in the genetic context and an example are presented.
INTRODUCTION
The accuracy of estimates of variance components is dependent on the choice of data, method and model. The estimation of (co)variance components by restricted maximum likelihood (REML; Patterson and Thompson, 1971) procedures is generally considered to be the best method for animal breeding data. Furthermore, the animal model is considered to be the model which utilizes the information from the data in the most efficient way. Several different REML algorithms (derivative-free, expectation-maximization, Fisher-scoring) have been used with animal breeding data. Most methods are iterative and require the repeated manipulation of the mixed-model equations (Henderson, 1973).
The derivative-free (DF) algorithm (Smith and Graser, 1986) involves evaluating the log likelihood function explicitly and directly finding the parameters that maximize it. Estimation of (co)variance components does not involve matrix inversion but requires the evaluation of expression [1] at each iteration, where C represents the coefficient matrix of the mixed-model equations.

The expectation-maximization (EM) algorithm (Dempster et al, 1977), which utilizes only first derivative information to obtain estimates that maximize the likelihood function of the data, requires the diagonal blocks of the inverse of the coefficient matrix for random effects (C^uu) and traces of their products with the corresponding inverse of the numerator relationship matrix (A^-1). These traces can be written as in [2], where Z is the incidence matrix for any random effect, M is the absorption matrix of the fixed effects and α is the variance ratio. For a comparison of EM- and DF-REML, see Misztal (1994).

The Fisher-scoring algorithm (Meyer, 1985), which is based on second derivatives of the likelihood function, requires the calculation of more complicated traces like [3]. Expressions [1-3] have straightforward multitrait extensions. Calculation of the inverses in [2] and [3] is the main computational limitation for the use of REML, particularly when the coefficient matrix is very large.
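Expressions [1]-[3] themselves are not reproduced in this copy of the article. The quantities usually required by these three algorithms, written so as to be consistent with the eigenvalue forms [4]-[6] given later, would be the following (a reconstruction under that assumption, not the original typesetting):

```latex
% Plausible forms of [1]-[3] (reconstruction). C is the coefficient matrix of
% the mixed-model equations, Z the incidence matrix of the random effect,
% M the absorption matrix of the fixed effects, A the numerator relationship
% matrix and \alpha the variance ratio.
\begin{align}
  &\log\lvert C\rvert \tag{1}\\
  &\operatorname{tr}\!\left[A^{-1}\left(Z'MZ + \alpha A^{-1}\right)^{-1}\right] \tag{2}\\
  &\operatorname{tr}\!\left[\left(A^{-1}\left(Z'MZ + \alpha A^{-1}\right)^{-1}\right)^{2}\right] \tag{3}
\end{align}
```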
Several attempts have been made to find direct or indirect methods for alleviating these numerical computations. While algorithms such as DF-REML have proven robust and easy to use, they are generally slow to converge, often requiring many likelihood evaluations, in particular for multivariate analyses fitting several random factors. However, as noted by Graser et al (1987), they only require the factorization of a large matrix rather than its inverse and can be implemented efficiently using sparse matrix techniques; sparse matrix inversion (FSPACK) has also been used to estimate variance components in an animal model by an EM-REML algorithm (Misztal and Perez-Enciso, 1993). Other techniques for reducing computational demands based on algorithms using derivatives of the likelihood function (EM or method of scoring procedures) have involved the use of approximations (Boichard et al, 1992) or sampling techniques (Misztal, 1990; Thallman and Taylor, 1991; García-Cortés et al, 1992).
Along the same lines, Smith and Graser (1986) have advocated the use of sequences of Householder transformations to reduce the coefficient matrix C to a tridiagonal form, thus eliminating the need for direct matrix inversion. This is based on the observation that tr[C] = tr[QCQ'] for any Q such that QQ' = Q'Q = I and chosen so that QCQ' is tridiagonal. Furthermore, this idea has been extended by Colleau et al (1989) to compute, from the tridiagonalized coefficient matrix, the trace of the matrix products required in a Fisher-scoring algorithm applied in a multivariate setting, and by Ducrocq (1993) with an extra diagonalization step.
Diagonalization is the problem of finding eigenvectors and eigenvalues of C. As noted by Dempster et al (1984), Lin (1987) and Harville and Callanan (1990), matrix inversion is avoided if a spectral decomposition of the coefficient matrix is used. Then REML estimation of variance components in mixed models becomes computationally trivial and requires little computer storage.
In expressions [1-3], the problem amounts to finding eigenvalues of large sparse symmetric matrices. Indeed, [1] can be written in terms of eigenvalues as in [4], where the λ_i are the eigenvalues of the coefficient matrix C and, if L is the Cholesky factor of A (A = LL'; Henderson, 1976), [2] and [3] can be simply expressed as in [5] and [6], where the γ_i are the eigenvalues of L'Z'MZL.
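Equations [4]-[6] are also missing from this copy; the forms implied by the surrounding text (a reconstruction) would be:

```latex
% Plausible forms of [4]-[6] (reconstruction), with \lambda_i the eigenvalues
% of C and \gamma_i the eigenvalues of L'Z'MZL.
\begin{align}
  \log\lvert C\rvert &= \sum_{i} \log \lambda_i \tag{4}\\
  \operatorname{tr}\!\left[A^{-1}\left(Z'MZ + \alpha A^{-1}\right)^{-1}\right]
    &= \sum_{i} \frac{1}{\gamma_i + \alpha} \tag{5}\\
  \operatorname{tr}\!\left[\left(A^{-1}\left(Z'MZ + \alpha A^{-1}\right)^{-1}\right)^{2}\right]
    &= \sum_{i} \frac{1}{(\gamma_i + \alpha)^{2}} \tag{6}
\end{align}
```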
Until now, all of these methods have implied working on dense matrices stored in the core of the computer. These transformations are therefore limited to data sets of moderate size. The purpose of this paper is to present and apply in a genetic context a method for computing some or all eigenvalues of a very large sparse matrix with very little computer storage.
This method, attributed to Lanczos (1950), generates a sequence of tridiagonal matrices T_j with the property that the eigenvalues of T_j are progressively better estimates of the eigenvalues of the large matrix as j increases. This method was developed by Cullum and Willoughby (1985), whose computer programs were used here.
METHODS
Computing all eigenvalues of a very sparse symmetric matrix

Although the problem of computing eigensystems of symmetric dense matrices that fit easily in the core has been satisfactorily solved using sequences of Householder transformations or Givens rotations (Golub and Van Loan, 1983), the same cannot be said for very large sparse matrices. One way to find some or all eigenvalues of a large sparse symmetric matrix B is to use the Lanczos method (Lanczos, 1950).

If a large matrix is sparse, then the advantages of methods which only use matrix-vector multiplications are obvious, as the matrix B need not be stored explicitly. All that is needed is a subroutine for efficiently computing the matrix-vector product Bv for any given vector v. In the genetic context considered, it will be seen that such a computation is often easy. No storage of large matrices is needed. In the Lanczos method, only two vectors and the relevant information on B needed to build the sparse matrix-vector product Bv have to be used at each step if only the eigenvalues are required.
In theory, the Lanczos algorithm reduces the original matrix B (n x n) to a sequence of tridiagonal matrices. For a real symmetric matrix B, it involves the construction of a sequence of orthonormal vectors v_j (for j = 1, ..., n) from an arbitrary initial vector v_1 by recursively applying equation [7], with β_1 = 0 and v_0 = 0 (a reconstruction of [7] is sketched below). The vectors v_j are referred to as the 'Lanczos vectors'. If, at each stage, the coefficient α_j is chosen to make v_{j+1} orthogonal to v_j and β_{j+1} is chosen to normalize v_{j+1} to unity, the vectors should form an orthonormal set and the tridiagonal matrix T_n with diagonal elements α_j and off-diagonal elements β_{j+1} (j = 1, ..., n) should have the same eigenvalues as B. The resulting collection of scalar coefficients α_j and β_{j+1} obtained in these orthogonalizations defines the corresponding Lanczos matrices T_j.
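The recursion [7] and the formulas for α_j and β_{j+1} are not reproduced here. The standard three-term Lanczos recursion, which matches the description above, reads (a reconstruction):

```latex
% Standard Lanczos recursion (reconstruction of the missing equation [7]),
% with v_0 = 0, \beta_1 = 0 and v_1 an arbitrary vector of unit norm.
\begin{align}
  \beta_{j+1} v_{j+1} &= B v_j - \alpha_j v_j - \beta_j v_{j-1} \tag{7}\\
  \alpha_j &= v_j' B v_j, \qquad
  \beta_{j+1} = \left\lVert B v_j - \alpha_j v_j - \beta_j v_{j-1} \right\rVert
\end{align}
```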
Paige (1971) studied in his thesis the behaviour of the basic Lanczos recursion [7] without reorthogonalization and of the resulting Lanczos matrices T_j (j = 1, ..., n). Cullum and Willoughby (1985) showed that the eigenvalues of the T_j provide good approximations to some of the eigenvalues of B. So, if the Lanczos recursion is continued until j = n, then the eigenvalues of T_n will be the eigenvalues of B. In this case, T_n is simply an orthogonal transformation of B and must have the same eigenvalues as B.

Theoretically, the Lanczos method cannot determine the multiplicity of any multiple eigenvalue of the matrix B. If the matrix B has only m distinct eigenvalues (m < n), then the Lanczos recursion would terminate after m steps. Additional computations will be needed to determine which of these eigenvalues are multiple and also to compute their multiplicities.
The Lanczos method seems attractive for large sparse matrices because the requirements for storage are very small (the values of α_j and β_j for all j). The elements of v_j and v_{j-1} for the current value of j must be stored, as well as B, in such a way that the subroutine for computing Bv from v takes full advantage of the sparsity of B. The eigenvalues of B can be found from those of the more easily handled symmetric matrices T_j (j = 1, ..., n), which are then determined by a standard method (eg, the so-called QL algorithm or a bisection method; Martin and Wilkinson, 1968).
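As an illustration of the recursion described above, the following is a minimal Python/NumPy sketch (not the authors' code, which was supplied by Cullum) of a plain Lanczos loop without reorthogonalization; it only needs a routine `matvec` returning Bv and produces the coefficients α_j and β_{j+1} of T_k, whose eigenvalues are then obtained with a standard tridiagonal eigensolver.

```python
import numpy as np
from scipy.linalg import eigvalsh_tridiagonal

def lanczos_coefficients(matvec, n, k, rng=np.random.default_rng(0)):
    """Run k steps of the plain Lanczos recursion (no reorthogonalization).

    matvec: function returning B @ v for a symmetric n x n operator B.
    Returns the diagonal (alpha_1..alpha_k) and off-diagonal (beta_2..beta_k)
    of the Lanczos tridiagonal matrix T_k.
    """
    v_prev = np.zeros(n)                      # v_0 = 0
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)                    # arbitrary normalized starting vector v_1
    alpha = np.zeros(k)
    beta = np.zeros(k - 1)                    # beta_1 = 0 is implicit
    for j in range(k):
        w = matvec(v)
        if j > 0:
            w -= beta[j - 1] * v_prev         # subtract beta_j * v_{j-1}
        alpha[j] = v @ w
        w -= alpha[j] * v
        if j < k - 1:
            beta[j] = np.linalg.norm(w)       # beta_{j+1}
            v_prev, v = v, w / beta[j]
    return alpha, beta

# Small dense check (in the genetic application, matvec exploits sparsity):
B = np.diag([1.0, 2.0, 3.0, 4.0])
alpha, beta = lanczos_coefficients(lambda x: B @ x, n=4, k=4)
print(eigvalsh_tridiagonal(alpha, beta))      # approximations to the eigenvalues of B
```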
However, because of rounding errors, v_{j+1} will never be exactly orthogonal to v_k (for all k ≤ j - 2). Consequently, the values of the coefficients α_j and β_{j+1} are inaccurate and the nice properties of the T_j described above are quickly lost. Paige (1971) and Edwards et al (1979) showed that the loss in the orthogonality of the Lanczos vectors occurs essentially when the quantity β_j (e_j'x) becomes small (e_j is the jth base vector, ie, all components are equal to zero except for the jth one, which is equal to one, and x is an arbitrary vector).
A first approach to correct the problem consists of reorthogonalizing all or some of the vectors v_j at each iteration, but the costs of this operation and the computer storage required for all v_j can be extremely large. Another approach is not to force the orthogonality of the Lanczos vectors by reorthogonalizing but to work directly with the basic Lanczos recursion, accepting the losses in orthogonality and then unravelling the effects of these losses. Because of this failure of orthogonality, the process does not terminate after n steps but can continue indefinitely to produce a tridiagonal matrix of any desired size.
Paige (1971) showed that if the tridiagonal matrix is truncated after a number of iterative steps k much greater than n, the resulting tridiagonal matrix T_k has a group of eigenvalues very close to the correct eigenvalues for each eigenvalue of the original matrix. He also showed that rounding errors only delayed convergence but did not stop it. Indeed, the eigenvalues of the Lanczos matrices are either 'good' eigenvalues, which are true approximations to the eigenvalues of the matrix B, or 'extra' or 'spurious' eigenvalues, which are caused by the losses in orthogonality. Two types of 'spurious' eigenvalues can be distinguished. Type one is a less accurate copy or 'ghost' copy of a good eigenvalue; type two is genuinely spurious. The main difficulty is to find out which of these eigenvalues correspond to eigenvalues of the original matrix B. For that, the inverse iteration method and the corresponding Rayleigh quotient (Wilkinson, 1965; Chatelin, 1988) are used.
The identification test which separates the bad eigenvalues from the good ones compares the eigenvalues of T_j with those of the matrix obtained by deleting the first row and column of the matrix T_j. Any eigenvalue of T_j which is also an eigenvalue of this corresponding reduced matrix is labelled as 'spurious' and is discarded from the list of computed eigenvalues. All remaining eigenvalues, including all numerically multiple ones, are accepted and labelled as 'good'. So one can directly identify those eigenvalues which are spurious.
In the Lanczos procedure, numerically multiple eigenvalues, which differ from each other by less than a user-specified relative tolerance parameter, are accepted as accurate approximations of eigenvalues of the original matrix B, and the others (simple eigenvalues) may or may not have converged. This is checked by computing error estimates (only on the resulting single isolated eigenvalues). These error estimates are obtained by calculating the product of the kth component (k being the size of the Lanczos matrix considered) of the corresponding Lanczos matrix eigenvector u by β_{k+1} (Cullum and Willoughby, 1985).
Therefore, in order to apply this technique, it is necessary to first choose a value k for the size of the Lanczos matrix (often greater than the size of the B matrix if all eigenvalues are required). Paige (1971) and Cullum and Willoughby (1985) showed that the primary factor determining whether or not it is feasible to compute large numbers of eigenvalues for a given k is the gap ratio (the ratio of the largest gap between two adjacent eigenvalues to the smallest such gap). The smaller this ratio, the easier it is to compute all the eigenvalues of the original matrix.
The larger this ratio, the larger the size of the Lanczos matrix required to obtain all eigenvalues of the B matrix. Cullum and Willoughby (1985) showed that reasonably well-separated eigenvalues on the extremes of the spectrum of B appear as eigenvalues of the Lanczos matrices for relatively small sizes of the Lanczos matrix. Therefore this method can also be applied to efficiently find the extreme eigenvalues of a sparse matrix, which are required, for example, in finding optimal relaxation parameters when iterative successive relaxation methods are used to solve linear systems (Golub and Van Loan, 1983). Since there is no reorthogonalization, eigenvalues which have converged by a given size of the Lanczos matrix may begin to replicate as the size of the Lanczos matrix is increased further.
The Lanczos algorithm does not give rules for determining how large the Lanczos matrix must be in order to compute all the eigenvalues. Paige (1971) showed that it takes more than n steps (generally between 2n and 8n are needed) to compute all eigenvalues of the spectrum. The stopping criterion cannot be determined beforehand.

A method to determine the multiplicities of multiple eigenvalues
The Lanczos procedure cannot directly determine the multiplicities of the computed eigenvalues as eigenvalues of B. Unfortunately, the multiplicity of an eigenvalue of a Lanczos matrix has no relationship with its multiplicity in the original matrix B. Good eigenvalues may replicate many times as eigenvalues of a Lanczos matrix but be only single eigenvalues of the original matrix B. In fact, numerically multiple eigenvalues are accepted as converged approximations to eigenvalues of B. Cullum (personal communication) proposed an approach to determine and compute the multiplicities of multiple eigenvalues of the matrix B, but it requires that the algorithm presented previously be applied several times. The different matrices required are simple modifications of the original B matrix. The approach of Cullum is based upon the following property.
If B is a real symmetric matrix and λ is an eigenvalue of B with multiplicity p (p ≥ 2), then λ will also be an eigenvalue of the matrix B~ obtained from B by adding a symmetric rank-one matrix (a sketch is given below), where v_1 is the starting vector used for B in the Lanczos algorithm and ν_1 is an arbitrary scalar. Theoretically, if the Lanczos procedure is applied to B~ with the same v_1 as starting vector, then the tridiagonal matrices T~_j generated for B~ are related in a simple way to those generated for B.
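The rank-one modification and the relation it induces on the Lanczos matrices are not reproduced in this copy. A minimal reconstruction consistent with the description (and with the second modification B^ given below) is:

```latex
% Reconstruction of the rank-one modification of B (possibly the equation
% referred to as [9] later in the text) and of its effect on the Lanczos
% matrices, assuming exact arithmetic and the same starting vector v_1.
\begin{align}
  \tilde{B} &= B + \nu_1 v_1 v_1' ,\\
  \tilde{T}_j &= T_j + \nu_1 e_1 e_1'
  \quad\text{(only the leading diagonal element changes, since }
  \tilde{B}v_1 = Bv_1 + \nu_1 v_1 \text{ and } v_1 \perp v_j \text{ for } j \ge 2\text{).}
\end{align}
```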
One could continue this approach to determine the specific multiplicities of each of these multiple eigenvalues by considering the matrix B^ = B~ + ν_2 v_2 v_2', where v_2 is the second Lanczos vector generated for B or any vector orthogonal to v_1, and ν_2 is a scalar which can be equal to ν_1. Any eigenvalue of B that has multiplicity greater than two will be in the spectrum of B^, and those with multiplicity equal to two will be in the spectrum of B~ and not in the spectrum of B^. Thus, the procedure could continue with successive transformations of the B matrix until a matrix is obtained with no eigenvalues corresponding to eigenvalues of B. This approach seems attractive if the multiplicities of the multiple eigenvalues of the matrix B are small. If this is not the case, this operation will be expensive, because the algorithm must be applied as many times as the largest multiplicity of the multiple eigenvalues.
We present here another approach which can be used when the number of multiple eigenvalues of the B matrix is small but their multiplicities are large. The above procedure is applied only once to determine which eigenvalues are multiple, and then the following expressions are used to compute the multiplicities of each multiple eigenvalue. First, one makes use of the fact that the total number of eigenvalues of B is equal to its dimension ([11]), where N is the dimension of matrix B, N_s is the number of single nonzero eigenvalues, r is the number of multiple nonzero eigenvalues and m_i is the multiplicity of the ith multiple eigenvalue λ_i. Two further relations, in which the λ_j are the simple eigenvalues, are obtained from the traces of B and B² (a reconstruction of these expressions is sketched below). Finally, the m_i are obtained by solving the (possibly overdetermined) system of three equations, for example using integer linear programming subroutines (the m_i must be integers).
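The three relations themselves (numbered [11], [14] and [15] in one passage of the text and [11]-[13] in another) are missing from this copy. A plausible reconstruction, under the assumption that the zero eigenvalue is counted through its own multiplicity m_0, is:

```latex
% Reconstruction (not the original equations).  N_s single nonzero
% eigenvalues \lambda_j, r multiple nonzero eigenvalues \lambda_i with
% multiplicities m_i, and m_0 the multiplicity of the zero eigenvalue.
\begin{align}
  N &= N_s + m_0 + \sum_{i=1}^{r} m_i ,\\
  \operatorname{tr}(B)   &= \sum_{j=1}^{N_s} \lambda_j   + \sum_{i=1}^{r} m_i \lambda_i ,\\
  \operatorname{tr}(B^2) &= \sum_{j=1}^{N_s} \lambda_j^2 + \sum_{i=1}^{r} m_i \lambda_i^2 .
\end{align}
```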
The traces of B and B² are simply obtained by using the subroutine for the sparse matrix-vector multiplications which will be presented in the next part in a particular setting. The matrix B need not be stored explicitly. Only two vectors are needed to compute these traces (the corresponding identities are sketched below); in these expressions, the b_ii are the diagonal elements of the B matrix, e_i is the ith base vector, and (Be_i)_i represents the ith element of the vector (Be_i).
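The two trace expressions are missing from this copy; for a symmetric B they are the standard identities (a reconstruction):

```latex
% Traces obtained from n matrix-vector products Be_i (reconstruction).
\begin{equation}
  \operatorname{tr}(B) = \sum_{i=1}^{n} b_{ii} = \sum_{i=1}^{n} (Be_i)_i ,
  \qquad
  \operatorname{tr}(B^2) = \sum_{i=1}^{n} (Be_i)'(Be_i) .
\end{equation}
```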
The validity of the approach proposed here to determine multiplicities was verified on several examples for matrices of moderate size (up to n = 1000). For such matrices, a regular approach (for example, routine F02AAF of the NAG Fortran Library) working on dense matrices stored in the core can be used to compute all eigenvalues and their multiplicities. It was found that Cullum and Willoughby's method, applied using a Lanczos matrix of size k = 2n and computing multiplicities from equations [11], [14] and [15] as proposed above, leads to exactly the same results as the regular approach.
Sparse computation of the matrix-vector product Bv

Here we illustrate the previous method and its use with sparse matrix techniques to compute [2] and/or [3] in the context of a simple animal model with one fixed effect. In a genetic context, a typical mixed animal model is characterized by:

y = Xβ + Za + e [16]

where:
y = data vector,
β = vector of fixed effects,
a = vector of random additive genetic (animal) effects, such that a ~ N(0, A σ_a²),
X, Z = known incidence matrices associated with vectors β and a respectively,
e = vector of random residuals, such that e ~ N(0, I σ_e²),
A = numerator relationship matrix among animals (Henderson, 1975).
The mixed-model equations (MME) of Henderson (1973) are built from this model, and absorption of the fixed effect equations leads to a reduced system in the random effects only (both are sketched below). The estimation of the parameters σ_a² and σ_e² by an EM-REML (or a Fisher-scoring) procedure involves the computation of the trace given in [5] (or in [5] and [6] for Fisher-scoring). The matrix B considered in the Lanczos method corresponds to L'Z'MZL here.
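The MME and the absorbed system are not reproduced in this copy; for the model above they take the standard forms (a reconstruction, with α = σ_e²/σ_a²):

```latex
% Henderson's mixed-model equations and the system obtained after
% absorption of the fixed effect equations (reconstruction).
\begin{equation}
  \begin{bmatrix} X'X & X'Z \\ Z'X & Z'Z + \alpha A^{-1} \end{bmatrix}
  \begin{bmatrix} \hat{\beta} \\ \hat{a} \end{bmatrix}
  =
  \begin{bmatrix} X'y \\ Z'y \end{bmatrix},
  \qquad
  \left(Z'MZ + \alpha A^{-1}\right)\hat{a} = Z'My,
  \quad M = I - X(X'X)^{-}X'.
\end{equation}
```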
We will assume that, before the computation of any matrix-vector product Bv = u (where the vector v is arbitrary), a pedigree file and a data file have been read and that their information has been stored in three vectors of size n, where n is the size of A; the sire, dam and level of the fixed effect of each animal are stored in s_i, d_i and h_i, respectively (h_i = 0 if the animal has no record). Progeny must precede parents. Simultaneously, the number of observations in each level of the fixed effect is cumulated in a vector of size n_f. Note that u = Bv = L'Z'MZLv = L'Z'ZLv - L'Z'X(X'X)^-X'ZLv. The sparse computation of Bv = u involves the following loops:
(1) compute w = Lv and t = (X'Z)Lv;
(2) compute f = (X'X)^- t;
(3) compute u = L'Z'(Zw - Xf) = L'r.
The computation of w = Lv in (1) and of u = L'r in (3) is performed by solving the sparse triangular systems L^-1 w = v and L^-T u = r. Each line i of L^-1 and each column of L^-T contains at most three nonzero elements, which can be easily identified (the diagonal element plus the elements in columns (or lines) s_i and d_i). Vector t is obtained by simply cumulating elements of w corresponding to animals with records. Similarly, elements of r are equal to the difference between the elements of w and the appropriate elements of f for animals with records, and to zero for the others. Note that only five vectors of size n (s, d, h, u and v) and two of size n_f (f and t) are required.
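A minimal Python sketch of such a matrix-vector routine is given below. It is not the authors' code: the array names, the coding of unknown parents and missing records, and the treatment of the Mendelian sampling variances (inbreeding ignored) are assumptions made for illustration. It follows the three steps (1)-(3) above and returns a `matvec` function that can be passed directly to the `lanczos_coefficients` sketch given earlier.

```python
import numpy as np

def make_matvec(sire, dam, level, n_levels):
    """Return matvec(v) = Bv = L'Z'MZLv for a simple animal model.

    sire[i], dam[i] = -1 for an unknown parent; level[i] = -1 if animal i
    has no record.  Animals are ordered so that progeny precede parents.
    """
    sire, dam, level = map(np.asarray, (sire, dam, level))
    n = len(sire)
    # Mendelian sampling variances d_i (inbreeding ignored):
    # 0.5 if both parents known, 0.75 if one known, 1.0 otherwise.
    d = np.where((sire >= 0) & (dam >= 0), 0.5,
                 np.where((sire >= 0) | (dam >= 0), 0.75, 1.0))
    sqrt_d = np.sqrt(d)
    has_rec = level >= 0
    # number of records per fixed-effect level (diagonal of X'X)
    nrec = np.bincount(level[has_rec], minlength=n_levels).astype(float)

    def matvec(v):
        # (1a) w = Lv: solve L^-1 w = v; row i of L^-1 has nonzeros on the
        #      diagonal and in columns s_i, d_i (parents have larger indices)
        w = np.zeros(n)
        for i in range(n - 1, -1, -1):
            w[i] = sqrt_d[i] * v[i]
            if sire[i] >= 0:
                w[i] += 0.5 * w[sire[i]]
            if dam[i] >= 0:
                w[i] += 0.5 * w[dam[i]]
        # (1b) t = X'Z w: cumulate w over fixed-effect levels for recorded animals
        t = np.zeros(n_levels)
        np.add.at(t, level[has_rec], w[has_rec])
        # (2) f = (X'X)^- t  (diagonal system, since there is one fixed effect)
        f = np.divide(t, nrec, out=np.zeros_like(t), where=nrec > 0)
        # (3a) r = Z'(Zw - Xf): deviation from the level solution, 0 if no record
        r = np.zeros(n)
        r[has_rec] = w[has_rec] - f[level[has_rec]]
        # (3b) u = L'r: solve L^-T u = r (progeny first, propagate to parents)
        z = r.copy()
        for j in range(n):
            if sire[j] >= 0:
                z[sire[j]] += 0.5 * z[j]
            if dam[j] >= 0:
                z[dam[j]] += 0.5 * z[j]
        return sqrt_d * z

    return matvec
```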
EXAMPLE
To illustrate the method, a data set including the type scores of 9 686 dairy cows and their ancestors (21 269 animals in total) for 18 type traits was created. To estimate the genetic parameters of these type traits, the animal model [16] was used, assuming precorrection of the data for age at calving and stage of lactation, and using a month-herd-classifier class effect as the unique fixed effect (292 levels).
A canonical transformation was applied so that all traits could be analyzed according to the same model with equal design matrices. However, it was considered that the repeated computations of expressions [2] and/or [3] for matrices of size n = 21 269 could be advantageously replaced by a unique computation of all eigenvalues of the matrix B = L'Z'MZL, followed by the repeated use of equalities [5] and [6] in a Fisher-scoring REML iterative procedure.
The main programs were supplied by Cullum, and the sparse computation of the matrix-vector product Bv was done according to the strategy described above on an IBM RISC 6000/590 computer. Tridiagonal matrices T_k with k = 2n, 4n, 6n, 8n and 10n were computed using the Lanczos recursion [7].
The procedure was applied twice, once on B and once on B~ [9], in order to detect multiple eigenvalues; then their multiplicities were calculated using the relationships [11-13] and an integer linear programming subroutine. REML estimates of genetic and residual (co)variances were obtained repeatedly using the eigenvalues computed from Lanczos matrices of increasing size in [5] and [6].
The quadratic forms required at each Fisher-scoring iteration were calculated by iteratively solving the mixed-model equations of a multiple-trait best linear unbiased prediction (BLUP) animal model, after canonical transformation and with starting values equal to the solutions of the previous system of equations. Estimates of asymptotic standard errors of (co)variance parameters were available from the inverse of the information matrix at convergence of the Fisher-scoring iterative procedure. Estimates of heritabilities, genetic and residual correlations and the corresponding asymptotic errors were also compared.
RESULTS
Table I shows the number of distinct eigenvalues computed, the calculated multiplicities of the four multiple eigenvalues (0, 0.50, 0.6875, 0.75) encountered, and the CPU time required for the different values of k (2n, 4n, 6n, 8n and 10n). CPU time mainly consisted of two parts: the time required for the computation of the Lanczos matrices T_k increased linearly with k, while the computing cost for the determination of the eigenvalues of the tridiagonal matrices T_k increased quadratically with k and was always much larger than the calculation of T_k, for the simple model with only one fixed effect considered here.
Obviously, very large Lanczos matrices must be used in order to detect all eigenvalues: 31 new distinct eigenvalues were detected when the size of the Lanczos matrix was increased from 8n to 10n. For k = 10n, all the eigenvalues found had a good precision (at least 10^-5, with very few precisions worse than 10^-10), but there is a possibility that other eigenvalues may still have remained undetected. As a result, the multiplicities of the multiple eigenvalues obtained using [11-13] differ when the size of the Lanczos matrix increases. However, a close look at the eigenvalues shows that all distinct eigenvalues had been found with the Lanczos matrix T_k for k = 4n, with the rare exception of some of those located in the intervals [0.72; 0.75] and [0.84; 0.87]. As k increases, these intervals become smaller: for k = 6n, they are [0.73; 0.75] and [0.86; 0.87], and for k = 8n, [0.74; 0.75] only. The difficulty of detecting all eigenvalues in clusters of very close eigenvalues was clearly apparent.
Table II presents the values of expressions [2] and [3] obtained with k = 10n, and the relative precision of expressions [2] and [3] computed from the eigenvalues and the multiplicities obtained in table I, for three different values of α (99, 4 and 1/3) corresponding to an extreme range of heritabilities (h² = 0.01, 0.20 and 0.75). The eigenvalues obtained for k = 10n are assumed to be the true values. The striking result is that, whatever the value of α considered, and although some distinct eigenvalues are undetected and the multiplicities of the multiple eigenvalues are incorrect, the traces considered are very well approximated when a Lanczos matrix of size k = 2n is used and are virtually exact when k ≥ 4n.

It is well known that small departures from the true values of the traces can lead to rather large biases in the final REML estimates due to the iterative algorithms used. This explains some disappointing conclusions when approximations of traces were proposed (eg, Boichard et al, 1992).
Table III reports some characteristics of the estimates of the genetic parameters calculated using the eigenvalues obtained from Lanczos matrices of different sizes for the computations of traces like those in [5] and [6] in each run; convergence was practically achieved on all parameters after five or six iterations. Each complete run for 18 traits treated simultaneously took about 4 min of CPU time, mainly devoted to the solution of the mixed-model equations. Estimates of parameters (variances, heritabilities and correlations) were virtually identical regardless of the value of k. This is obviously a consequence of the good quality of the approximation of the traces reported in table II.

DISCUSSION AND CONCLUSION
This example clearly shows the feasibility of the computation of all eigenvalues for very large sparse matrices. The main difficulty is to determine the size k of the tridiagonal matrix T_k to be used in practice. Cullum and Willoughby's approach has two main limitations: a) eigenvalues in small dense clusters are difficult to find; and b) there is no really satisfying way to compute the multiplicities of multiple eigenvalues when these multiplicities are very large. However, in the particular case when eigenvalues are used only to calculate traces, these two drawbacks cancel each other out, at least when the approach proposed here to determine the multiplicities is chosen and when the number of multiple eigenvalues is small: the use of incorrect multiplicities compensates for undetected eigenvalues.
Therefore, there is no need to find all eigenvalues to have an excellent approximation of the quantities required in first- and second-order REML algorithms. Our main objective was to show that the use of the simple expressions [5] and [6] is not limited to small, dense matrices. Many new directions of research are opened.
The efficient computation of the matrix product Bv for any vector v in more complex models is a key point for extending this approach to other situations. For example, we were able to compute Bv = L'Z'MZLv when another fixed effect (a group of unknown parents effect) was added to the model used in our example. CPU time for the Lanczos recursion was doubled, but the computation of the eigenvalues of the tridiagonal matrices remained by far the most time-consuming part.
Other promising techniques, like the 'divide and conquer' approach (Dongarra and Sorensen, 1987), could be much more attractive than the traditional QL method used here. These techniques, designed to take full advantage of parallel computations when these are possible, may still significantly reduce the time required for the diagonalization of the Lanczos matrices when used in serial mode (Sorensen, personal communication).
The value of knowledge of the eigenvalues of large sparse matrices is not limited to REML estimation. It appears, for example, in the analysis of the contributions of different lines to the estimation of genetic parameters from selection experiments (Thompson and Atkins, 1994). Sometimes, only extreme eigenvalues are needed, eg, in the computation of the optimal relaxation factors in iterative algorithms for solving linear systems (Golub and Van Loan, 1983). In the Lanczos procedure, information about B's extreme eigenvalues tends to emerge long before the tridiagonalization is complete (for k < n). In our example, the largest eigenvalue was detected with a value of k as low as 5.
In situations where the knowledge of all eigenvalues is necessary, the problem of the multiplicities of multiple eigenvalues may be tackled differently. The three nonzero multiple eigenvalues encountered here correspond to groups of animals with similar characteristics (full-sibs, same sire and maternal grandsire, half-sibs), as already pointed out by Thompson and Shaw (1990). However, we were not able to determine the expected multiplicities by a careful look at the pedigree file (except for full-sibs). If an efficient algorithm to compute these eigenvalues were available, the computation of eigenvalues using the Lanczos method could be limited to a modified smaller matrix, as suggested by Thompson and Shaw (1990), at least as long as the sparsity of the matrix is not significantly altered.

ACKNOWLEDGMENT
The authors are extremely grateful to JK Cullum (IBM, Yorktown Heights, NY, USA) for providing the Lanczos subroutine, for her valuable comments and for her help in understanding and implementing the Lanczos algorithm in a genetic context. The initial help of A Tabary (IBM France, Paris, France) and some interesting discussions with R Thompson (Roslin Institute, Edinburgh, UK) are also acknowledged.
REFERENCES
Boichard D, Schaeffer LR, Lee AJ (1992) Approximate restricted maximum likelihood and approximate prediction error variance of the mendelian sampling effect. Genet Sel Evol 24, 331-343
Chatelin F (1988) Valeurs propres de matrices. Masson, Paris
Colleau JJ, Beaumont C, Regaldo D (1989) Restricted maximum likelihood (REML) estimation of genetic parameters for type traits in Normande cattle breed. Livest Prod Sci 23, 47-66
Cullum JK, Willoughby RA (1985) Lanczos Algorithms for Large Symmetric Eigenvalue Computations. Vol I: Theory, Vol II: Programs. Birkhäuser Boston Inc, Boston, MA
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Statist Soc B 39, 1-38
Dempster AP, Selwyn MR, Patel M, Roth AJ (1984) Statistical and computational aspects of mixed model analysis. Appl Stat 33, 203-214
Dongarra JJ, Sorensen DC (1987) A fully parallel algorithm for the symmetric eigenvalue problem. SIAM J Sci Stat Comput 8, 139-154
Ducrocq V (1993) Genetic parameters for type traits in the French Holstein breed based on a multiple trait animal model. Livest Prod Sci 36, 143-156
Edwards JT, Licciardello DC, Thouless DJ (1979) Use of the Lanczos method for finding complete sets of eigenvalues of large sparse matrices. J Inst Math Appl 23, 277-283
García-Cortés LA, Moreno C, Varona L, Altarriba J (1992) Variance component estimation by resampling. J Anim Breed Genet 109, 358-363
Golub GH, Van Loan CF (1983) Matrix Computations. Johns Hopkins Univ Press, Baltimore, MD
Graser HU, Smith SP, Tier B (1987) A derivative-free approach for estimating variance components in animal models by restricted maximum likelihood. J Anim Sci 64, 1362-1370
Henderson CR (1973) Sire evaluation and genetic trends. In: Proceedings of the Animal Breeding and Genetics Symposium in Honor of Dr JL Lush. Am Soc Anim Sci-Am Dairy Sci Assoc, Champaign, IL, 10-41
Henderson CR (1975) Use of all relatives in intraherd prediction of breeding values and producing abilities. J Dairy Sci 58, 1910-1916
Henderson CR (1976) A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values. Biometrics 32, 69-83
Lanczos C (1950) An iterative method for the solution of the eigenvalue problem of linear differential and integral operators. J Res Nat Bur Standards 45, 255-282
Lin CY (1987) Application of singular value decomposition to restricted maximum likelihood estimation of variance components. J Dairy Sci 70, 2680-2784
Martin RS, Wilkinson JH (1968) The implicit QL algorithm. Numer Math 12, 377-383
Meyer K (1985) Maximum likelihood estimation of variance components for a multivariate mixed model with equal design matrices. Biometrics 41, 153-165
Misztal I (1990) Restricted maximum likelihood estimation of variance components in animal model using sparse matrix inversion and a supercomputer. J Dairy Sci 73, 163-172
Misztal I, Perez-Enciso M (1993) Sparse matrix inversion for restricted maximum likelihood estimation of variance components by expectation-maximization. J Dairy Sci 76, 1479-1483
Misztal I (1994) Comparison of computing properties of derivative and derivative-free algorithms in variance component estimation by REML. J Anim Breed Genet 111, 346-355
Paige CC (1971) The computation of eigenvalues and eigenvectors of very large sparse matrices. PhD Thesis, London
Patterson HD, Thompson R (1971) Recovery of interblock information when block sizes are unequal. Biometrika 58, 545-554
Smith SP, Graser HU (1986) Estimating variance components in a class of mixed models by restricted maximum likelihood. J Dairy Sci 69, 1156-1165
Thallman RM, Taylor JF (1991) An indirect method of computing REML estimates of variance components from large data sets using an animal model. J Dairy Sci 74 (Suppl 1), 160 (Abstr)
Thompson EA, Shaw RG (1990) Pedigree analysis for quantitative traits: variance components without matrix inversion. Biometrics 46, 399-413
Thompson R, Atkins KD (1994) Sources of information for estimating heritability from selection experiments