HAL Id: hal-00894120
https://hal.archives-ouvertes.fr/hal-00894120
Submitted on 1 Jan 1996



Original article

Computation of all eigenvalues of matrices used in restricted maximum likelihood estimation of variance components using sparse matrix techniques

C Robert, V Ducrocq

Station de génétique quantitative et appliquée, Centre de recherches de Jouy-en-Josas, Institut national de la recherche agronomique, 78352 Jouy-en-Josas cedex, France

(Received 16 March 1995; accepted 3 October 1995)

Summary - Restricted maximum likelihood (REML) estimates of variance components have desirable properties but can be very expensive computationally. Large costs result from the need for the repeated inversion of the large coefficient matrix of the mixed-model equations. This paper presents a method based on the computation of all eigenvalues using the Lanczos method, a technique reducing a large sparse symmetric matrix to a tridiagonal form. Dense matrix inversion is not required. It is accurate and not very demanding on storage requirements. The Lanczos method, the computation of eigenvalues, its application in a genetic context, and an example are presented.

Lanczos method / sparse matrix / restricted maximum likelihood / eigenvalue

Résumé - Computation of all eigenvalues of the matrices used in restricted maximum likelihood estimation of variance components using techniques applicable to sparse matrices. Restricted maximum likelihood (REML) estimates of variance components have attractive properties but can be costly in computing time and in memory requirements. The problem comes from the need to repeatedly invert the coefficient matrix of the mixed-model equations. This article presents a method based on the computation of the eigenvalues and on the use of the Lanczos method, a technique for reducing a large sparse symmetric matrix to a tridiagonal matrix. Inversion of dense matrices is not necessary. This method gives accurate results and requires very little memory storage. The Lanczos method, the computation of the eigenvalues, its application in the genetic context and an example are presented.

INTRODUCTION

The accuracy of estimates of variance components is dependent on the choice of data, method and model. The estimation of (co)variance components by restricted maximum likelihood (REML; Patterson and Thompson, 1971) procedures is generally considered to be the best method for animal breeding data. Furthermore, the animal model is considered to be the model which utilizes the information from the data in the most efficient way. Several different REML algorithms (derivative-free, expectation-maximization, Fisher-scoring) have been used with animal breeding data. Most methods are iterative and require the repeated manipulation of the mixed-model equations (Henderson, 1973).

The derivative-free (DF) algorithm (Smith and Graser, 1986) involves evaluating the log likelihood function explicitly and directly finding the parameters that maximize it. Estimation of (co)variance components does not involve matrix inversion but requires evaluation of:

log |C|   [1]

at each iteration, where C represents the coefficient matrix of the mixed-model equations.

The expectation-maximization (EM) algorithm (Dempster et al, 1977), which utilizes only first derivative information to obtain estimates that maximize the likelihood function of the data, requires the diagonal blocks of the inverse of the coefficient matrix for random effects (C^uu) and traces of their products with the corresponding inverse of the numerator relationship matrix (A⁻¹). These traces can be written as:

tr(A⁻¹C^uu) = tr[A⁻¹(Z′MZ + αA⁻¹)⁻¹]   [2]

where Z is the incidence matrix for any random effect, M is the absorption matrix of fixed effects and α is the variance ratio. For a comparison of EM- and DF-REML, see Misztal (1994).

The Fisher-scoring algorithm (Meyer, 1985), which is based on second derivatives of the likelihood function, requires the calculation of more complicated traces like:

tr[(A⁻¹C^uu)²]   [3]

Expressions [1-3] have straightforward multitrait extensions. Calculation of the inverses in [2] and [3] is the main computational limitation for the use of REML, particularly when the coefficient matrix is very large. Several attempts have been made to find direct or indirect methods for alleviating these numerical computations. While algorithms such as DF-REML have proven robust and easy to use, they are generally slow to converge, often requiring many likelihood evaluations, in particular for multivariate analyses fitting several random factors. However, as noted by Graser et al (1987), they only require the factorization of a large matrix rather than its inverse and can be implemented efficiently using sparse matrix techniques. Misztal and Perez-Enciso (1993) estimated variance components in an animal model by an EM-REML algorithm using sparse matrix inversion (FSPACK).

Other techniques for reducing computational demands based on algorithms using derivatives of the likelihood function (EM or method of scoring procedures) have involved the use of approximations (Boichard et al, 1992) or sampling techniques (Misztal, 1990; Thallman and Taylor, 1991; García-Cortés et al, 1992).

Along the same lines, Smith and Graser (1986) have advocated the use of sequences of Householder transformations to reduce the coefficient matrix C to a tridiagonal form, thus eliminating the need for direct matrix inversion. This is based on the observation that tr[C] = tr[QCQ′] for any Q such that QQ′ = Q′Q = I and so that QCQ′ is tridiagonal. Furthermore, this idea has been extended by Colleau et al (1989) to compute from the tridiagonalized coefficient matrix the trace of matrix products required in a Fisher-scoring algorithm applied in a multivariate setting, and by Ducrocq (1993) with an extra diagonalization step.

Diagonalization is the problem of finding eigenvectors and eigenvalues of C. As noted by Dempster et al (1984), Lin (1987) and Harville and Callanan (1990), matrix inversion is avoided if a spectral decomposition of the coefficient matrix is used. Then REML estimation of variance components in mixed models becomes computationally trivial and requires little computer storage. In expressions [1-3], the problem amounts to finding eigenvalues of large sparse symmetric matrices. Indeed, [1] can be written in terms of eigenvalues:

log |C| = Σ_i log λ_i   [4]

where the λ_i are the eigenvalues of the coefficient matrix C and, if L is the Cholesky factor of A (A = LL′; Henderson, 1976), [2] and [3] can be simply expressed as:

tr(A⁻¹C^uu) = Σ_i 1/(γ_i + α)   [5]

tr[(A⁻¹C^uu)²] = Σ_i 1/(γ_i + α)²   [6]

where the γ_i are the eigenvalues of L′Z′MZL.
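For illustration, here is a minimal NumPy sketch of how [1-3] would be evaluated once the eigenvalues are available. The closed forms are the reconstructions [4-6] given above, and the function name is ours, not the paper's:

```python
import numpy as np

def reml_traces(lam, gamma, alpha):
    """Evaluate [4]-[6]: lam holds the eigenvalues of C, gamma those of
    L'Z'MZL, alpha is the variance ratio (reconstructed forms, see text)."""
    log_det_c = np.sum(np.log(lam))              # [4], ie expression [1]
    tr_em = np.sum(1.0 / (gamma + alpha))        # [5], ie expression [2]
    tr_fs = np.sum(1.0 / (gamma + alpha) ** 2)   # [6], ie expression [3]
    return log_det_c, tr_em, tr_fs
```

Once the γ_i are known, re-evaluating [5] and [6] for a new variance ratio α costs only O(n) operations, which is what makes a one-time eigenvalue computation attractive inside an iterative REML procedure.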

Until now, all of these methods have implied working on dense matrices stored in the core of the computer. These transformations are therefore limited to data sets of moderate size. The purpose of this paper is to present and apply in a genetic context a method for computing some or all eigenvalues of a very large sparse matrix with very little computer storage. This method, attributed to Lanczos (1950), generates a sequence of tridiagonal matrices T_j, with the property that the eigenvalues of T_j are progressively better estimates of the eigenvalues of the large matrix as j increases. The method and the computer programs used here were developed by Cullum and Willoughby (1985).

METHODS

Computing

all

eigenvalues

of a very sparse

symmetric

matrix

Although

the

problem

of

computing eigensystems

of

symmetric

dense matrices that fit

easily

in the core has been

satisfactorily

solved

using

sequences of Householder

transformations or Givens rotations

(Golub

and Van

Loan,

1983),

the same cannot be said for very

large

sparse matrices. One way to find some or all

eigenvalues

of

large

sparse

symmetric

matrices B is to use the Lanczos method

(Lanczos, 1950).

If a large matrix is sparse, then the advantages of methods which only use matrix-vector multiplications are obvious, as the matrix B need not be stored explicitly. All that is needed is a subroutine for efficiently computing the matrix-vector product Bv for any given vector v. In the genetic context considered, it will be seen that such a computation is often easy. No storage of large matrices is needed. In the Lanczos method, only two vectors and the relevant information on B to build the sparse matrix-vector product Bv need to be used at each step if only the eigenvalues are required.

In theory, the Lanczos algorithm reduces the original matrix B (n × n) to a sequence of tridiagonal matrices T_1, T_2, ..., T_n, where T_j is the j × j matrix with diagonal elements α_1, ..., α_j and off-diagonal elements β_2, ..., β_j.

For a real symmetric matrix B, it involves the construction of a sequence of orthonormal vectors v_j (for j = 1, ..., n) from an arbitrary initial vector v_1 by recursively applying the equation:

β_{j+1} v_{j+1} = B v_j − α_j v_j − β_j v_{j−1}   [7]

with β_1 = 0 and v_0 = 0. The vectors v_j are referred to as the 'Lanczos vectors'. If at each stage the coefficient α_j is chosen to make v_{j+1} orthogonal to v_j, and β_{j+1} is chosen to normalize v_{j+1} to unity, the vectors should form an orthonormal set, and the tridiagonal matrix T_n with diagonal elements α_j and off-diagonal elements β_{j+1} (j = 1, ..., n) should have the same eigenvalues as B. The coefficients α_j and β_{j+1} are determined as follows:

α_j = v_j′ B v_j,   β_{j+1} = ‖B v_j − α_j v_j − β_j v_{j−1}‖   [8]

The resulting collection of scalar coefficients α_j and β_{j+1} obtained in these orthogonalizations defines the corresponding Lanczos matrices T_j.
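As an illustration, here is a minimal matrix-free sketch of the recursion [7]-[8]. This is our own Python rendering, not the Fortran programs of Cullum and Willoughby; as in the paper, only the two current Lanczos vectors are kept in memory:

```python
import numpy as np

def lanczos(matvec, n, k, rng=np.random.default_rng(0)):
    """Basic Lanczos recursion [7]-[8] without reorthogonalization.
    matvec(v) must return Bv; returns the diagonal alpha and the
    off-diagonal beta of the k x k Lanczos matrix T_k."""
    alpha = np.zeros(k)
    beta = np.zeros(k + 1)                 # beta[0] plays the role of beta_1 = 0
    v_prev = np.zeros(n)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)                 # arbitrary starting vector v_1
    for j in range(k):
        w = matvec(v) - beta[j] * v_prev   # B v_j - beta_j v_{j-1}
        alpha[j] = v @ w                   # alpha_j, as in [8]
        w -= alpha[j] * v
        beta[j + 1] = np.linalg.norm(w)    # beta_{j+1} normalizes v_{j+1}
        if beta[j + 1] == 0.0:             # exact invariant subspace found
            return alpha[: j + 1], beta[1 : j + 1]
        v_prev, v = v, w / beta[j + 1]
    return alpha, beta[1:k]
```

The eigenvalues of T_k can then be obtained with any standard tridiagonal eigensolver, eg, scipy.linalg.eigvalsh_tridiagonal(alpha, beta).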

Paige (1971) showed in his thesis that, for the basic Lanczos recursion [7], strict orthogonality of the Lanczos vectors is not required in order to obtain useful Lanczos matrices T_j (j = 1, ..., n). Cullum and Willoughby (1985) showed that the eigenvalues of the T_j provide good approximations to some of the eigenvalues of B. So, if the Lanczos recursion is continued until j = n, then the eigenvalues of T_n will be the eigenvalues of B. In this case, T_n is simply an orthogonal transformation of B and must have the same eigenvalues as B.

Theoretically, the Lanczos method cannot determine the multiplicity of any multiple eigenvalue of the matrix B. If the matrix B has only m distinct eigenvalues (m < n), then the Lanczos recursion would terminate after m steps. Additional computations will be needed to determine which of these eigenvalues are multiple and also to compute their multiplicities.

The Lanczos method seems attractive for large sparse matrices because the requirements for storage are very small (the values of α_j and β_j for all j). The elements of v_j and v_{j−1} for the current value of j must be stored, as well as B, in such a way that the subroutine for computing Bv from v takes full advantage of the sparsity of B. The eigenvalues of B can be found from those of the more easily handled symmetric matrices T_j (j = 1, ..., n), which are then determined by a standard method (eg, the so-called QL algorithm or a bisection method; Martin and Wilkinson, 1968).

However, because of rounding errors, v_{j+1} will never be exactly orthogonal to v_k (for all k ≤ j − 2). Consequently, the values of the coefficients α_j and β_{j+1} are inaccurate and the nice properties of the T_j described above are quickly lost.

Paige (1971) and Edwards et al (1979) showed that the loss in the orthogonality of the Lanczos vectors occurs essentially when the quantity β_j(e_j′x) becomes small (e_j is the jth base vector, ie, all components are equal to zero except for the jth one, which is equal to one, and x is an arbitrary vector).

A first approach to correct the problem consists of reorthogonalizing all or some of the vectors v_j at each iteration, but the costs of this operation and the computer storage required for all v_j can be extremely large. Another approach is not to force the orthogonality of the Lanczos vectors by reorthogonalizing, but to work directly with the basic Lanczos recursion, accepting the losses in orthogonality and then unravelling the effects of these losses. Because of this failure of orthogonality, the process does not terminate after n steps but can continue indefinitely to produce a tridiagonal matrix of any desired size.

Paige (1971) showed that if the tridiagonal matrix is truncated after a number of iterative steps k much greater than n, the resulting tridiagonal matrix T_k has a group of eigenvalues very close to the correct eigenvalues for each eigenvalue of the original matrix. He also showed that rounding errors only delayed convergence but did not stop it.

Indeed, the eigenvalues of the Lanczos matrices are either 'good' eigenvalues, which are true approximations to the eigenvalues of the matrix B, or 'extra' or 'spurious' eigenvalues, which are caused by the losses in orthogonality. Two types of 'spurious' eigenvalues can be distinguished. Type one is a less accurate copy or 'ghost' copy of a good eigenvalue; type two is genuinely spurious.

The main difficulty is to find out which of these eigenvalues correspond to eigenvalues of the original matrix B. For that, the inverse iteration method and the corresponding Rayleigh quotient (Wilkinson, 1965; Chatelin, 1988) are used.

The identification test which separates the bad eigenvalues from the good ones relies on the matrix T̂_j obtained by deleting the first row and column of the matrix T_j. Any eigenvalue of T_j which is also an eigenvalue of the corresponding T̂_j matrix is labelled as 'spurious' and is discarded from the list of computed eigenvalues. All remaining eigenvalues, including all numerically multiple ones, are accepted and labelled as 'good'. So one can directly identify those eigenvalues which are spurious.
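A minimal sketch of this identification test, assuming the diagonal alpha and off-diagonal beta of T_k produced by the recursion sketched above. In the actual procedure, numerically multiple eigenvalues of T_k are accepted as good before the test is applied to the remaining simple ones, a refinement this naive version omits:

```python
import numpy as np
from scipy.linalg import eigvalsh_tridiagonal

def identify_good(alpha, beta, tol=1e-8):
    """Cullum-Willoughby test (sketch): eigenvalues of T_k that are also
    eigenvalues of the matrix obtained by deleting the first row and
    column of T_k are labelled 'spurious' and discarded."""
    ev = eigvalsh_tridiagonal(alpha, beta)               # spectrum of T_k
    ev_hat = eigvalsh_tridiagonal(alpha[1:], beta[1:])   # first row/col deleted
    keep = np.array([np.abs(ev_hat - lam).min() > tol for lam in ev])
    return ev[keep]
```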

In the Lanczos procedure, numerically multiple eigenvalues, which differ from each other by less than a user-specified relative tolerance parameter, are accepted as accurate approximations of eigenvalues of the original matrix B, and the others (simple eigenvalues) may or may not have converged. This is checked by computing error estimates (only on the resulting single isolated eigenvalues). These error estimates are obtained by calculating the product of the kth component (k being the size of the Lanczos matrix considered) of the corresponding Lanczos matrix eigenvector u by β_{k+1} (Cullum and Willoughby, 1985).

Therefore, in order to apply this technique, it is necessary to first choose a value k for the size of the Lanczos matrix (often greater than the size of the B matrix if all eigenvalues are required).

Paige (1971) and Cullum and Willoughby (1985) showed that the primary factor determining whether or not it is feasible to compute large numbers of eigenvalues for a given k is the gap ratio (the ratio of the largest gap between two adjacent eigenvalues to the smallest such gap). The smaller this ratio, the easier it is to compute all the eigenvalues of the original matrix. The larger this ratio, the larger the size of the Lanczos matrix required to obtain all eigenvalues of the B matrix. Cullum and Willoughby (1985) showed that reasonably well-separated eigenvalues on the extremes of the spectrum of B appear as eigenvalues of the Lanczos matrices for relatively small sizes of the Lanczos matrix. Therefore this method can also be applied to efficiently find the extreme eigenvalues of a sparse matrix, which are required for example in finding optimal relaxation parameters when iterative successive relaxation methods are used to solve linear systems (Golub and Van Loan, 1983).

Since there is no reorthogonalization, eigenvalues which have converged by a given size of the Lanczos matrix may begin to replicate as the size of the Lanczos matrix is increased further.

The Lanczos algorithm does not give rules for determining how large the Lanczos matrix must be in order to compute all the eigenvalues. Paige (1971) showed that it takes more than n steps (generally between 2n and 8n are needed) to compute all eigenvalues of the spectrum. The stopping criterion cannot be determined beforehand.

A method to determine the multiplicities of multiple eigenvalues

The Lanczos procedure cannot directly determine the multiplicities of the computed eigenvalues as eigenvalues of B. Unfortunately, the multiplicity of an eigenvalue of a Lanczos matrix has no relationship with its multiplicity in the original matrix B. Good eigenvalues may replicate many times as eigenvalues of a Lanczos matrix but be only single eigenvalues of the original matrix B. In fact, numerically multiple eigenvalues are accepted as converged approximations to eigenvalues of B. Cullum (personal communication) proposed an approach to determine and compute the multiplicities of multiple eigenvalues of the matrix B, but it requires the algorithm presented previously to be applied several times. The different matrices required are simple modifications of the original B matrix.

The approach of Cullum is based upon the following property: if B is a real symmetric matrix and λ is an eigenvalue of B with multiplicity p (p ≥ 2), then λ will also be an eigenvalue of the matrix obtained from B by adding a symmetric rank-one matrix:

B̂ = B + ν₁ v₁ v₁′   [9]

where v₁ is the starting vector used for B in the Lanczos algorithm and ν₁ is an arbitrary scalar.

Theoretically, if the Lanczos procedure is applied to B̂ with the same v₁ as starting vector, then the tridiagonal matrices T̂_j generated for B̂ would be related to those generated for B by:

T̂_j = T_j + ν₁ e₁ e₁′   [10]

One could continue this approach to determine the specific multiplicities of each of these multiple eigenvalues by considering the matrix B̃ = B̂ + ν₂ v₂ v₂′, where v₂ is the second Lanczos vector generated for B (or any vector orthogonal to v₁) and ν₂ is a scalar which can be equal to ν₁.

Any eigenvalue of B that has multiplicity greater than two will be in the spectrum of B̃, and those with multiplicity equal to two will be in the spectrum of B̂ but not in the spectrum of B̃.
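A short sketch of the resulting detection step, assuming two Lanczos runs (on B and on B̂ from [9]) have already produced lists of 'good' eigenvalues as in the identification-test sketch above:

```python
import numpy as np

def eigenvalues_with_multiplicity(ev_b, ev_bhat, tol=1e-6):
    """Rank-one test (sketch): a good eigenvalue of B that also appears
    in the spectrum of B-hat = B + nu1 * v1 v1' has multiplicity >= 2
    in B; simple eigenvalues of B are shifted away by the update."""
    ev_bhat = np.asarray(ev_bhat, dtype=float)
    return [lam for lam in ev_b if np.abs(ev_bhat - lam).min() < tol]
```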

Thus, the procedure could continue with successive transformations of the B matrix until a matrix is obtained with no eigenvalues corresponding to eigenvalues of B. This approach seems attractive if the multiplicities of the multiple eigenvalues of the matrix B are small. If this is not the case, the operation will be expensive, because the algorithm must be applied as many times as the largest multiplicity of the multiple eigenvalues.

We present here another approach, which can be used when the number of multiple eigenvalues of the B matrix is small but their multiplicities are large. The above procedure is applied only once, to determine which eigenvalues are multiple, and then the following expressions are used to compute the multiplicities of each multiple eigenvalue.

First, one makes use of the fact that the total number of eigenvalues of B is equal to its dimension:

N = N_S + m₀ + Σ_{i=1}^{r} m_i   [11]

where N is the dimension of matrix B, N_S is the number of single nonzero eigenvalues, m₀ is the multiplicity of the zero eigenvalue, r is the number of multiple nonzero eigenvalues and m_i is the multiplicity of the ith multiple eigenvalue λ_i. Second, the traces of B and B² can be written in terms of the eigenvalues:

tr(B) = Σ_j λ_j + Σ_{i=1}^{r} m_i λ_i   [12]

tr(B²) = Σ_j λ_j² + Σ_{i=1}^{r} m_i λ_i²   [13]

where the λ_j are the simple eigenvalues.

Finally, the m_i are obtained by solving the (possibly overdetermined) system of three equations, for example using integer linear programming subroutines (the m_i must be integers).

The traces of B and B² are simply obtained by using the subroutine for sparse matrix-vector multiplication which will be presented in the next part in a particular setting. The matrix B need not be stored explicitly. Only two vectors are needed to compute these traces:

tr(B) = Σ_i b_ii = Σ_i (Be_i)_i   [14]

tr(B²) = Σ_i (Be_i)′(Be_i)   [15]

where the b_ii are the diagonal elements of the B matrix, e_i is the ith base vector, and (Be_i)_i represents the ith element of the vector (Be_i).
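A sketch of this multiplicity computation under the reconstructed equations [11]-[15]. A rounded least-squares step stands in for the integer linear programming routine used in the paper; it pins the m_i down only when [12]-[13] determine them, and the integer constraint must be enforced explicitly in the general case:

```python
import numpy as np

def multiplicities(lam_mult, lam_simple, N, trB, trB2):
    """Solve [12]-[13] for the multiplicities m_i of the multiple nonzero
    eigenvalues lam_mult, then recover the zero-eigenvalue count m0 from
    the dimension equation [11]; trB and trB2 are computed matrix-free
    with [14]-[15]."""
    lam = np.asarray(lam_mult, dtype=float)
    rhs = np.array([trB - np.sum(lam_simple),                # multiple part of tr(B)
                    trB2 - np.sum(np.square(lam_simple))])   # and of tr(B^2)
    A = np.vstack([lam, lam ** 2])       # coefficients of the m_i in [12]-[13]
    m, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    m = np.rint(m).astype(int)           # the m_i must be integers
    m0 = N - len(lam_simple) - int(m.sum())  # zero-eigenvalue multiplicity, from [11]
    return m, m0
```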

The validity of the approach proposed here to determine multiplicities was verified on several examples for matrices of moderate size (up to n = 1000). For such matrices, a regular approach (for example, routine F02AAF of the NAG Fortran Library) working on dense matrices stored in the core can be used to compute all eigenvalues and their multiplicities. It was found that Cullum and Willoughby's method, applied using a Lanczos matrix of size k = 2n and computing multiplicities from equations [11], [14] and [15] as proposed above, leads to exactly the same results as the regular approach.

Sparse computation of the matrix vector product Bv

Here we illustrate the previous method and its use with sparse matrix techniques to compute [2] and/or [3] in the context of a simple animal model with one fixed effect. In a genetic context, a typical mixed animal model is characterized by:

y = Xβ + Za + e   [16]

where:
y = data vector,
β = vector of fixed effects,
a = vector of random additive genetic effects, such that a ~ N(0, Aσ²_a),
X, Z = known incidence matrices associated with vectors β and a respectively,
e = vector of random residuals, such that e ~ N(0, Iσ²_e),
A = numerator relationship matrix among animals (Henderson, 1975).

The mixed-model equations (MME) of Henderson (1973) are:

[ X′X        X′Z       ] [ β̂ ]   [ X′y ]
[ Z′X   Z′Z + αA⁻¹ ] [ â ] = [ Z′y ]

with α = σ²_e/σ²_a. Absorption of the fixed effect equations leads to:

(Z′MZ + αA⁻¹) â = Z′My,   where M = I − X(X′X)⁻X′.

The estimation of the parameters σ²_a and σ²_e by an EM-REML (or a Fisher-scoring) procedure involves the computation of the trace given in [5] (or [5] and [6] for Fisher-scoring). The matrix B considered in the Lanczos method corresponds to L′Z′MZL here.

here. We will assume

We will assume that, before the computation of any matrix-vector product Bv = u (where the vector v is arbitrary), a pedigree file and a data file have been read and that their information has been stored into three vectors of size n, where n is the size of A: the sire, dam and level of the fixed effect of each animal are stored in s_i, d_i and h_i, respectively (h_i = 0 if the animal has no record). Progeny must precede parents. Simultaneously, the number of observations in each level of the fixed effect is cumulated in a vector of size n_f.

Note that u = Bv = L′Z′MZLv = L′Z′ZLv − L′Z′X(X′X)⁻X′ZLv. The sparse computation of Bv = u involves the following loops:

(1) compute w = Lv and t = (X′Z)Lv
(2) compute f = (X′X)⁻t
(3) compute u = L′Z′(Zw − Xf) = L′r

The computation of w = Lv in loop (1) and of u = L′r in loop (3) is performed by solving the sparse triangular systems L⁻¹w = v and L⁻ᵀu = r. Each line i of L⁻¹ and each column of L⁻ᵀ contains at most three nonzero elements, which can be easily identified (the diagonal element plus the elements in columns (or lines) s_i and d_i). Vector t is obtained by simply cumulating the elements of w corresponding to animals with records. Similarly, the elements of r are equal to the difference between the elements of w and the appropriate elements of f for animals with records, and to zero for the others. Note that only five vectors of size n (s, d, h, u and v) and two of size n_f (f and t) are required.
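A sketch of these three loops as a matrix-free operator (Python in place of the original Fortran). It assumes the decomposition A = (I − P)⁻¹D(I − P)⁻ᵀ of Henderson (1976), so that L = (I − P)⁻¹D^{1/2} is one concrete choice of the factor L; mend holds the diagonal of D (0.5, 0.75 or 1.0 according to the number of known parents, ignoring inbreeding); indices are 0-based and −1 flags an unknown parent or a missing record:

```python
import numpy as np

def make_Bv(s, d, h, nlev, mend):
    """Return a function computing u = Bv = L'Z'MZLv for the one-fixed-
    effect animal model; s, d are sire/dam indices (progeny stored before
    parents), h the fixed-effect level of each animal."""
    n = len(s)
    counts = np.bincount(h[h >= 0], minlength=nlev)  # records per level (X'X)
    rec = h >= 0

    def Bv(v):
        # loop (1): w = Lv, ie solve (I - P)w = D^(1/2)v, parents first
        w = np.sqrt(mend) * v
        for i in range(n - 1, -1, -1):       # parents are stored last
            if s[i] >= 0: w[i] += 0.5 * w[s[i]]
            if d[i] >= 0: w[i] += 0.5 * w[d[i]]
        t = np.bincount(h[rec], weights=w[rec], minlength=nlev)  # t = X'ZLv
        # loop (2): f = (X'X)^- t, X'X diagonal for a single fixed effect
        f = np.divide(t, counts, out=np.zeros_like(t), where=counts > 0)
        # loop (3): r = Zw - Xf for animals with a record, 0 otherwise
        r = np.where(rec, w - f[np.where(rec, h, 0)], 0.0)
        # u = L'r: accumulate along (I - P)', progeny first, then scale
        u = r.copy()
        for i in range(n):
            if s[i] >= 0: u[s[i]] += 0.5 * u[i]
            if d[i] >= 0: u[d[i]] += 0.5 * u[i]
        return np.sqrt(mend) * u

    return Bv
```

Passing this operator as matvec to the Lanczos sketch given earlier yields the α_j and β_j of the Lanczos matrices without ever forming B explicitly.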

EXAMPLE

To illustrate the method, a data set including the type scores of 9 686 dairy cows and their ancestors (21 269 animals in total) for 18 type traits was created. To estimate the genetic parameters of these type traits, the animal model [16] was used, assuming precorrection of the data for age at calving and stage of lactation, and using a month-herd-classifier class effect as a unique fixed effect (292 levels).

A canonical transformation was applied so that the 18 traits could be analyzed according to the same model with equal design matrices. However, it was considered that the repeated computations of expressions [2] and/or [3] for matrices of size n = 21 269 could be advantageously replaced by a unique computation of all eigenvalues of the matrix B = L′Z′MZL, followed by the repeated use of equalities [5] and [6] in a Fisher-scoring REML iterative procedure.

The main programs were supplied by Cullum, and the sparse computation of the matrix-vector product Bv was done according to the strategy described above on an IBM RISC 6000/590 computer. Tridiagonal matrices T_k with k = 2n, 4n, 6n, 8n and 10n were computed using the Lanczos recursion [7].

The procedure was applied twice, once on B and once on B̂ [9], in order to detect multiple eigenvalues; their multiplicities were then calculated using the relationships [11-13] and an integer linear programming subroutine.

REML estimates of genetic and residual (co)variances were obtained repeatedly, using in [5] and [6] the eigenvalues computed from Lanczos matrices of increasing size.

The quadratic forms required at each Fisher-scoring iteration were calculated by iteratively solving the mixed-model equations of a multiple-trait best linear unbiased prediction (BLUP) animal model, after canonical transformation and with starting values equal to the solutions of the previous system of equations.

Estimates of the asymptotic standard errors of the (co)variance parameters were available from the inverse of the information matrix at convergence of the Fisher-scoring iterative procedure. Estimates of heritabilities, genetic and residual correlations and the corresponding asymptotic errors were also compared.

RESULTS

Table I shows the number of distinct eigenvalues computed, the calculated multiplicities of the four multiple eigenvalues encountered (0, 0.50, 0.6875, 0.75), and the CPU time required for the different values of k (2n, 4n, 6n, 8n and 10n). CPU time mainly consisted of two parts: the time required for the computation of the Lanczos matrices T_k, which increased linearly with k, and the computing cost for the determination of the eigenvalues of the tridiagonal matrices T_k, which increased quadratically with k and was always much larger than the calculation of T_k for the simple model with only one fixed effect considered here.

Obviously, very large Lanczos matrices must be used in order to detect all eigenvalues: 31 new distinct eigenvalues were detected when the size of the Lanczos matrix was increased from 8n to 10n. For k = 10n, all the eigenvalues found had a good precision (at least 10⁻⁵, with very few precisions worse than 10⁻¹⁰), but there is a possibility that other eigenvalues may still have remained undetected. As a result, the multiplicities of the multiple eigenvalues obtained using [11-13] differ when the size of the Lanczos matrix increases.

However, a close look at the eigenvalues shows that all distinct eigenvalues had been found with the Lanczos matrix T_k for k = 4n, with the rare exception of some of those located in the intervals [0.72; 0.75] and [0.84; 0.87]. As k increases, these intervals become smaller: for k = 6n, these intervals are [0.73; 0.75] and [0.86; 0.87], and for k = 8n, [0.74; 0.75] only. The difficulty of detecting all eigenvalues in clusters of very close eigenvalues was clearly apparent here.

Table II presents the values of expressions [2] and [3] obtained with k = 10n, and the relative precision of expressions [2] and [3] computed from the eigenvalues and multiplicities reported in table I, for three different values of α (99, 4 and 1/3) corresponding to an extreme range of heritabilities (h² = 0.01, 0.20 and 0.75).

The eigenvalues obtained for k = 10n are assumed to be true values. The striking result is that, whatever the value of α considered, and although some distinct eigenvalues are undetected and the multiplicities of the multiple eigenvalues are incorrect, the traces considered are very well approximated when a Lanczos matrix of size k = 2n is used and are virtually exact when k ≥ 4n.

It is well known that small departures from the true values of the traces can lead to rather large biases in the final REML estimates, due to the iterative algorithms used. This explains some disappointing conclusions when approximations of traces were proposed (eg, Boichard et al, 1992).

Table III reports some characteristics of the estimates of the genetic parameters calculated using the eigenvalues obtained from Lanczos matrices of different sizes for the computation of traces like those in [5] and [6]. The same number of Fisher-scoring iterations was performed in each run, although convergence was practically achieved on all parameters after five or six iterations. Each complete run for 18 traits treated simultaneously took about 4 min of CPU time, mainly devoted to the solution of the mixed-model equations.

Estimates of the parameters (variances, heritabilities and correlations) were virtually identical regardless of the value of k. This is obviously a consequence of the good quality of the approximation of the traces reported in table II.

DISCUSSION AND CONCLUSION

This example clearly shows the feasibility of the computation of all eigenvalues for very large sparse matrices. The main difficulty is to determine the size k of the tridiagonal matrix T_k to be used in practice. Cullum and Willoughby's approach has two main limitations: a) eigenvalues in small dense clusters are difficult to find; and b) there is no really satisfying way to compute the multiplicities of multiple eigenvalues when these multiplicities are very large. However, in the particular case when eigenvalues are used only to calculate traces, these two drawbacks cancel each other out, at least when the approach proposed here to determine the multiplicities is chosen and when the number of multiple eigenvalues is small: the use of incorrect multiplicities compensates for undetected eigenvalues. Therefore, there is no need to find all eigenvalues to have an excellent approximation of the quantities required in first- and second-order REML algorithms. Our main objective was to show that the use of the simple expressions [5] and [6] is not limited to small, dense matrices.

Many new directions of research are opened. The efficient computation of the matrix product Bv for any vector v in more complex models is a key point for extending this approach to other situations. For example, we were able to compute Bv = L′Z′MZLv when another fixed effect (an unknown parent group effect) was added to the model used in our example. CPU time for the Lanczos recursion was doubled, but the computation of the eigenvalues of the tridiagonal matrices remained by far the most time-consuming part.

Other promising techniques, like the 'divide and conquer' approach (Dongarra and Sorensen, 1987), could be much more attractive than the traditional QL method used here. These techniques, designed to take full advantage of parallel computations when these are possible, may still significantly reduce the time required for the diagonalization of the Lanczos matrices even when used in serial mode (Sorensen, personal communication).

The value of knowledge of the eigenvalues of large sparse matrices is not limited to REML estimation. It appears, for example, in the analysis of the contributions of different lines to the estimation of genetic parameters from selection experiments (Thompson and Atkins, 1994). Sometimes only extreme eigenvalues are needed, eg, in the computation of the optimal relaxation factors in iterative algorithms for solving linear systems (Golub and Van Loan, 1983). In the Lanczos procedure, information about B's extreme eigenvalues tends to emerge long before the tridiagonalization is complete (for k < n). In our example, the largest eigenvalue was detected with a value of k as low as 5.

In situations where the knowledge of all eigenvalues is necessary, the problem of the multiplicities of multiple eigenvalues may be tackled differently. The three nonzero multiple eigenvalues correspond to groups of animals with similar characteristics (full sibs, same sire and maternal grandsire, half sibs), as already pointed out by Thompson and Shaw (1990). However, we were not able to determine the expected multiplicities by a careful look at the pedigree file (except for full sibs). If an efficient algorithm to compute these eigenvalues were available, the computation of eigenvalues using the Lanczos method could be limited to a modified smaller matrix, as suggested by Thompson and Shaw (1990), at least as long as the sparsity of the matrix is not significantly altered.

ACKNOWLEDGMENT

The authors are extremely grateful to JK Cullum (IBM, Yorktown Heights, NY, USA) for providing the Lanczos subroutine, for her valuable comments and for her help in understanding and implementing the Lanczos algorithm in a genetic context. The initial help of A Tabary (IBM France, Paris, France) and some interesting discussions with R Thompson (Roslin Institute, Edinburgh, UK) are also acknowledged.

REFERENCES

Boichard D, Schaeffer LR, Lee AJ (1992) Approximate restricted maximum likelihood and approximate prediction error variance of the Mendelian sampling effect. Genet Sel Evol 24, 331-343

Chatelin F (1988) Valeurs propres de matrices. Masson, Paris

Colleau JJ, Beaumont C, Regaldo D (1989) Restricted maximum likelihood (REML) estimation of genetic parameters for type traits in the Normande cattle breed. Livest Prod Sci 23, 47-66

Cullum JK, Willoughby RA (1985) Lanczos Algorithms for Large Symmetric Eigenvalue Computations. Vol I: Theory, Vol II: Programs. Birkhäuser Boston Inc, Boston, MA

Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Statist Soc B 39, 1-38

Dempster AP, Selwyn MR, Patel M, Roth AJ (1984) Statistical and computational aspects of mixed model analysis. Appl Stat 33, 203-214

Dongarra JJ, Sorensen DC (1987) A fully parallel algorithm for the symmetric eigenvalue problem. SIAM J Sci Stat Comput 8, 139-154

Ducrocq V (1993) Genetic parameters for type traits in the French Holstein breed based on a multiple trait animal model. Livest Prod Sci 36, 143-156

Edwards JT, Licciardello DC, Thouless DJ (1979) Use of the Lanczos method for finding complete sets of eigenvalues of large sparse matrices. J Inst Math Appl 23, 277-283

García-Cortés LA, Moreno C, Varona L, Altarriba J (1992) Variance component estimation by resampling. J Anim Breed Genet 109, 358-363

Golub GH, Van Loan CF (1983) Matrix Computations. Johns Hopkins Univ Press, Baltimore, MD

Graser HU, Smith SP, Tier B (1987) A derivative-free approach for estimating variance components in animal models by restricted maximum likelihood. J Anim Sci 64, 1362-1370

Henderson CR (1973) Sire evaluation and genetic trends. In: Proceedings of the Animal Breeding and Genetics Symposium in Honor of Dr JL Lush. Am Soc Anim Sci-Am Dairy Sci Assoc, Champaign, IL, 10-41

Henderson CR (1975) Use of all relatives in intraherd prediction of breeding values and producing abilities. J Dairy Sci 58, 1910-1916

Henderson CR (1976) A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding value. Biometrics 32, 69-83

Lanczos C (1950) An iterative method for the solution of the eigenvalue problem of linear differential and integral operators. J Res Nat Bur Standards 45, 255-282

Lin CY (1987) Application of singular value decomposition to restricted maximum likelihood estimation of variance components. J Dairy Sci 70, 2680-2784

Martin RS, Wilkinson JH (1968) The implicit QL algorithm. Numer Math 12, 377-383

Meyer K (1985) Maximum likelihood estimation of variance components for a multivariate mixed model with equal design matrices. Biometrics 41, 153-165

Misztal I (1990) Restricted maximum likelihood estimation of variance components in animal model using sparse matrix inversion and a supercomputer. J Dairy Sci 73, 163-172

Misztal I, Perez-Enciso M (1993) Sparse matrix inversion for restricted maximum likelihood estimation of variance components by expectation-maximization. J Dairy Sci 76, 1479-1483

Misztal I (1994) Comparison of computing properties of derivative and derivative-free algorithms in variance component estimation by REML. J Anim Breed Genet 111, 346-355

Paige CC (1971) The computation of eigenvalues and eigenvectors of very large sparse matrices. PhD Thesis, London

Patterson HD, Thompson R (1971) Recovery of interblock information when block sizes are unequal. Biometrika 58, 545-554

Smith SP, Graser HU (1986) Estimating variance components in a class of mixed models by restricted maximum likelihood. J Dairy Sci 69, 1156-1165

Thallman RM, Taylor JF (1991) An indirect method of computing REML estimates of variance components from large data sets using an animal model. J Dairy Sci 74 (Suppl 1), 160 (Abstr)

Thompson EA, Shaw RG (1990) Pedigree analysis for quantitative traits: variance components without matrix inversion. Biometrics 46, 399-413

Thompson R, Atkins KD (1994) Sources of information for estimating heritability from selection experiments. Genet Res 63, 49-55
