Inference about multiplicative heteroskedastic components of variance in a mixed linear Gaussian model with an application to beef cattle breeding

(1)

HAL Id: hal-00893978

https://hal.archives-ouvertes.fr/hal-00893978

Submitted on 1 Jan 1993

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Inference about multiplicative heteroskedastic

components of variance in a mixed linear Gaussian

model with an application to beef cattle breeding

M San Cristobal, Jl Foulley, E Manfredi

To cite this version:

(2)

Original

article

Inference about

_{multiplicative}

heteroskedastic

_components

of variance

in

a mixed

linear

Gaussian

model with

an

application

to

beef cattle

_breeding

M

San

Cristobal

JL

_Foulley

E Manfredi

1

INRA, Station de Genetique Quantitative et Appliquée,

78352 _{Jouy-en-Josas Cedex;}

2

INRA, Station d’Amelioration G6n6tique des _{Animaux, BP, 27,} 31326 Castanet-Tolosan Cedex, France

(Received

28 _{April 1992 ; accepted}23 _September

₁₉₉₂₎

Summary - A statistical method for _{identifying meaningful}sources of _{heterogeneity}of residual and genetic variances in mixed linear Gaussian models is presented. The method is

based on a structural linear model for _logvariances. Inference about dispersion parameters is based on the

marginal

likelihood after _integratingout location _parameters.A likelihood ratio test _usingthe

_marginal

likelihood is also _proposedto test for

_hypotheses

about sources of variation involved. A _Bayesianextension of the estimation procedure of the dispersion parameters is _presentedwhich consists of _determiningthe mode of their

_marginal

_posterior distribution _{using log}inverted _chi-squareor Gaussian distributions as _priors.Procedures

presented in the paper are illustrated with the _analysisof muscle development scores

at _weaningof 8575 progeny of 142 sires in the Maine-Anjou breed. In this

analysis,

heteroskedasticity is _found,both for the sire and residual _componentsof variance.

heteroskedasticity / mixed linear _{model / Bayesian}technique

R.ésumé - Inférence sur une _{hétérogénéité multiplicative}des composantes de la

variance dans un modèle linéaire mixte _{gaussien: application}à la sélection des

bovins à viande. Une méthode statistique est _{présentée, capable d’identifier}les sources

significatives d’hétérogénéité de variances résiduelles et _génétiquesdans un modèle linéaire mixte gaussien. La méthode est

_fondée

sur un modèle structurel de décomposition du

logarithme des variances. _{L’inférence}concernant les _paramètresde _dispersionest basée sur la vraisemblance _marginaleobtenue _{après intégration}des _paramètresde position. Un

*

Correspondence

and reprints

**

(3)

test du rapport des vraisemblances utilisant la vraisemblance _marginaleest aussi _proposé

afin de tester des hypothèses sur _différentessources de variation. Une extension _bayésienne

de la procédure d’estimation des _paramètresde _dispersionest _présentée;elle consiste en la maximisation de leur distribution _marginalea _posteriori,_pourdes distributions a _priori

log x

2

inverse ou _gaussienne.Les _{procédures présentées}dans ce _papiersont illustrées _par

l’analyse de notes de _pointagessur le _{développement}musculaire au _sevragede 8 575 _jeunes veaux de race _Maine-Anjou,issus de _{142 pères.}Dans cette _analyse,une hétéroscédasticité a été trouvée sur les composantes père et résiduelle de la variance.

hétéroscédasticité _/modèles linéaires mixtes _/_techniques_bayésiennes

INTRODUCTION

One of the main concerns of

_{quantitative geneticists}

lies in evaluation of individuals

for selection. The statistical framework to achieve that is

_nowadays

the mixed linear

model

_{(Searle, 1971),}

_usually

under the

_assumptions

of

_normality

and

_homogeneity

of variances. The estimation of the location _parametersis

_performed

with BLUE-BLUP

_(Best

Linear Unbiased

_{Estimation-Prediction),}

_leading

to the well-known Mixed Model

_Equations

_(MME)

of Henderson

_(1973),

and REML

_(acronym

for REstricted -or REsidual- Maximum

_Likelihood)

turns out to be the method of

choice for

_estimating

variance _components

_(Patterson

and

_Thompson,

_1971):

However,

heterogeneous

variances are often encountered in

_practice,

eg for milk

yield

in cattle

_(Hill

et

al,

1983; Meinert et

al,

1988;

Dong

and

Mao, 1990;

Visscher et

_al,

_1991;

_Weigel,

₁₉₉₂₎

for meat traits in swine

_(Tholen,

₁₉₉₀₎

and for

_growth

performance

in beef cattle

_(Garrick

et

_al,

_1989).

This

_{heterogeneity}

of

_variances,

also called

_{heteroskedasticity}

_{(McCullogh, 1985),}

can be due to _many

_factors,

_eg

management

level,

genotype x environment _{interactions,}

_{segregating major}

_genes,

preferential

treatments

_(Visscher

et

al,

1991).

Ignoring heterogeneity

of variance may reduce the

_reliability

of

_ranking

and

selection

_procedures

_although,

in cattle for _instance,dam evaluation is

_likely

to

be more affected than sire evaluation

(Hill,

_1984;

Vinson,

_1987;Winkelman and

Schaeffer,

1988).

To overcome this

problem,

3 main alternatives are

possible.

First,

a

transfor-mation of data can be

performed

in order to match the usual

_assumption

of

ho-mogeneity

of variance. A

_log

transformation was

proposed by

several authors in

quantitative

genetics

(see

eg Everett and

Keown,

1984; De Veer and Van

Vleck,

1987; Short et

al, 1990,

for milk

_production

traits in

_cattle).

However,

while ge-netic variances tend to

_stabilize,

residual variances of

_{log-transformed}

records are

larger

in herds with the lowest

_production

level

_(De

Veer and Van

Vleck,

1987; Boldman and

_Freeman,

_1990;Visscher et

al,

1991).

’

The second alternative is to

_develop

robust methods which are insensitive to

moderate

_{heteroskedasticity}

_{(Brown, 1982).}

The last choice is to take

_{heteroskedasticity}

into account. Factors

_(eg

_region,

herd,

year,

parity,

sex)

to

adjust

for

heterogeneous

variances can be identified. But

such a stratification _generatesa _very

_large

number of cells

_{(800 000}

levels of herd x _{year in}the French Holstein

_file)

with obvious

_problems

of

_{estimability.}

Hence,

(4)

via a

modelling

(or structural)

_approach

so as to reduce the _parameter_space,

_by

appropriate

identification and

_testing

of

_meaningful

sources of variation of such

variances.

The model for the variance _componentsis described in the Model section. Model

fitting

and estimation of _parametersbased on

_marginal

likelihood

_procedures

are

presented

in the Estimation

_{of Parameters,}

followed

_by

a test statistic in

_Hypothesis

Testing.

A

_Bayesian

alternative to maximum

_marginal

likelihood estimation is

presented

in A

_Bayesian

_Approach

to a Mixed Model Structure In the Numerical

application

section,

data on French beef cattle are

_analyzed

to illustrate the

procedures

given

in _{the paper.}

_Finally,

some comments on the

_methodology

are

made in the Discussion and Conclusion.

MODEL

Following Foulley

et al

_(1990,

₁₉₉₂₎

and Gianola et al

_(1992),

the

_population

is

assumed to be stratified into I

_{subpopulations,}

or strata

_(indexed

_by

i =

_{1, 2, ... ,}

I)

with an

(n

i

x

₁₎

data vector _y_i_,

_sampled

from

a

normal distribution

_having

mean

i ii and variance

R. i

=

a2 ei I&dquo;

i

.

Given ii

i

and

_R

_i

Following

Henderson

_(1973),

the vector _II_i is

_{decomposed according}

to a linear

mixed model structure:

where

i

X

and _Z;are

(n

i

x p)

and

x q

(n

i

)

incidence

matrices,

corresponding

to fixed

J3 (p

x 1 )

and random

_u

_i

_(q

_i

_{x 1 )}

effects

_{respectively.}

Fixed effects can be factors or

covariates,

but it is assumed in the

_following

that,

without loss of

_{generality, they}

represent factors.

In the animal

_breeding

_context,ui is the vector of

_genetic

merits

_pertaining

to

breeding

individuals used

_(sires

_{spread by}

artificial

_{insemination)}

or _present

(males

and

_females)

in stratum i. These individuals are related via the so-called numerator

relationship

matrix

_A

_i

_,

which is assumed known and

_positive

definite

_(of

rank

_q

_i

_).

Elements of ui are not

_usually

the same from one stratum to another. A borderline case is the &dquo;animal&dquo; model

_((auaas

and

_Pollak,

₁₉₈₀₎

where animals

with records are

completely

different from one herd to another.

_{Nevertheless,}

such individuals are

genetically

related across herds.

Therefore,

model

[3]

has to

be refined to take into account _{covariances among}elements of different

_u!s.

As

proposed by

Gianola et al

_(1992),

this can be

accomplished by relating

Ui to a

(5)

with A

_being

the overall

_relationship

matrix of rank q,

relating the

q breeding

I

animals involved in the whole

_population,

with _{q x}

_L

qj.

i=l

Thus,

S

i

is an incidence matrix with 0 and 1 elements

_{relating the q}

_i

levels of

u

* _presentin the ith

_{subpopulation}

to the whole vector

_(q

x

₁₎

of u elements. For

instance, if stratification is made

_by

herd

level,

the matrices

S

i

and

_S

_i’

_{(i !}

_i’)

do not share _anynon-zero elements in their

columns,

since animals

_usually

have

records

_only

in one herd. On the _contrary,in a sire

_model,

a

_given

sire k _mayhave

progeny in 2 different herds

(i,

i’)

thus

resulting

in ones in both kth columns of

S

i

and

_Si.

Notice that in this

model,

any genotype x stratum interaction is due

_entirely

to

scaling

(Gianola

et

_al,

_1992).

Formulae

_[2],

_(3!,

_[4]

and

_[5]

define the model for means; a further _stepconsists

in

_modelling

variance _components

_{{!e! !i=1,...1}

_and

_{{Q!. },!=1,...t}

in a similar _way,

ie

_using

a structural model.

’ ’

The

_approach

taken here comes from the

_theory

of

generalized

linear models

involving

the use of a link function so as to _expressthe transformed _parameters with a linear

_predictor

_(McCullagh

and

_Nelder,

_1989).

For variances, a common

and convenient choice is the

_log

link function

_(Aitkin,

1987;

Box and

Meyer, 1986;

Leonard,

1975;

Nair and

_Pregibon,

_1988):

where

_wey

and

_{w’ .}

are incidence row vectors of size

k

e

and

_k

_u

_,

_{respectively,}

corresponding

to

_dispersion

_parameters_f_gand _!u.These incidence vectors can

be a subset of the factors for the mean in

(2!,

but _exogeneousinformation is also

allowed.

_Equations

_[6]

and

_[7]

define the variance component models. These models can be rewritten in a more _compactform as follows.

Let

y =

(y!,..., y!,...,

y’)’

be the n x 1 vector of data for the whole

_population,

I

with n

= ! ni,

i=l

IIi

xil

3 +

0&dquo;&dquo;izisiu*

11

=

(II!,... ,11:,... ,

ll

’)’

be the mean vector of _y,

I

R

_{= ®}

_R_i be the variance-covariance matrix of _y,with ?

representing

the

i=l

direct sum

_{(Searle, 1982).}

Equation

[1]

can then be rewritten as:

(6)

In the same _way,

[2]

becomes:

X!

the

_(n

_i

x

p)

incidence matrix defined in

_!2J;

Z

_{= (Z1 ,...,ZZ ,...,ZI ) ,}

Z!

=

o,,,iZ

i

S

i

the

(n

i

x

_q)

&dquo;incidence&dquo; matrix

_pertaining

to

u

*

,

T = (X, Z

*

)

and

0 =

Q3’,

U

*

’ I’.

The vector 0 includes _p+ q location parameters. The matrix T can be viewed as an &dquo;incidence&dquo;

_matrix,

but which

_depends

here on the

_dispersion

_parameters_T

u

through

the variances

_Q

_ua.

Both variance models can also be

compactly

written as:

The _k_e+ ku

dispersion

parameters !e and y! can be concatenated into a vector

(

T

=

(T!, T!)’

with

_{corresponding}

incidence matrix W =

W

e

EÐ W

u’

The

dispersion

model then reduces to:

where

a

2

=

_{(CF e 2&dquo; cF! 2’ )’}

and

1n a

2 is

a

_symbolic

notation for

_{(In a;}

₁

_’

...

Inaejl 2

In a!1 ’ , .. , In a![)’.

ESTIMATION OF PARAMETERS

In

_{sampling theory,}

a _{way to}eliminate nuisance _parametersis to use the

_marginal

likelihood

_{(Kalbfleisch, 1986). &dquo;Roughly}

_speaking,

the

_suggestion

is to break the data in two _{parts, one, part}whose distribution

_{depends only}

on the _parameterof

interest,

and another _partwhose distribution may well

depend

on the _parameter

of interest but which

_will,

in

_{addition, depend}

on the nuisance _parameter.

_!...!

This second _part

_will,

in

_general,

contain information about the _parameterof

interest,

but in such a _waythat this information is

inextricably

mixed _upwith

the nuisance

_{parameter&dquo;}

_{(Barnard, 1970).}

Patterson and

_Thompson

₍₁₉₇₁₎

used

this

_approach

for

_estimating

variance components in mixed linear Gaussian models. Their derivations were based on error contrasts. The

corresponding

estimator

_(the

so-called

_REML)

takes into account the loss in

_degrees

of freedom due to the

(7)

Alternatively,

Harville

₍₁₉₇₄₎

_proved

that REML can be obtained

_using

the

non-informative

_{Bayesian paradigm.}

_According

to the definition of

_{marginalization}

in

Bayesian

inference

_(Box

and

_{Tiao, 1973;}

_Robert,

_1992),

nuisance _parametersare

eliminated

_by

_integrating

them out of the

_{joint posterior}

_density.

Keeping

in mind that the

_sampling

and the non-informative

_{Bayesian approaches}

give

rise to the same estimation

_equations,

we have chosen the

_{Bayesian techniques}

for reasons of coherence and

_simplicity.

The _parametersof interest are here the

dispersion

_parameters_r,and the location

parameters 6 appear to be nuisance _parameters.Inference is hence based on the

log marginal

likelihood

_L(

_T

_{; y)}

_{of r:}

An

_{estimator y}

of T is

given

by

the mode of

L(

T

;

y):

where r is a _{compact part}of

R

k

e

+k

u.

This maximization can be

_{performed using}

a result

by

_Foulley

et al

_{(1990, 1992)}

which avoids the

_integration

in

_[13].

Details can be found in the

_A

_PP

_endix.

This

procedure

results in an iterative

_{algorithm. Numerically,}

let

_[t]

denote the iteration

t; the current estimate

9 [Hl]

_{of r}

is

_computed

from the

_following

_system:

where

i

lt]

is the current estimate at iteration _t, W the incidence matrix defined in

_!12!,

Q

M

is the

_weight

matrix

_depending

on

0

and on

ê

[t]

,

which are the solution and

the inverse coefficient matrix

_respectively

of the current _systemin 0

_(this

_system is described

_next),

z!

is the score vector

_depending

on

6

and

C!.

Elements of

Q

l

’

l

and

i

lt

)

are

_given

in the

Appendix.

The second _systemis:

where

i

[t]

is the &dquo;incidence&dquo; matrix T defined in

_[9]

and evaluated at T =

y

[t]

;

ft- 1

1 ’]

is the

_weight

matrix evaluated _at_T

= y[

t]

,

with R defined as in

[8];

E-C

0

1 )

and takes into account the

_prior

distribution of u* in

_!5!.

(8)

Regarding

computations

involved in

_!15!,

2 _typesof

_algorithms

can be considered as in San Cristobal

_(1992).

A second order

_algorithm

_{(Newton-Raphson type)}

converges

rapidly

and

gives

estimates of standard errors of

y,

but

_computing

time

can be excessive with the

_large

data sets

_typical

of animal

_{breeding problems.}

As shown in

_Foulley

et al

_(1990),

a first order

_algorithm

can be

_easily

obtained

_by

approximating

the

_(a

matrix in

_[15]

_by

its _{expectation component}

_(Qa!,E

in the

appendix

notations).

This EM

_{(Expectation-Maximization;}

_Dempster

et

al,

1977)

algorithm

converges more

_slowly,

but needs fewer calculations at each iteration

_and,

on the

_whole,

less total CPU time for

_large

data sets.

HYPOTHESIS TESTING

An

_{adequate modelling}

of

_{heteroskedasticity}

in variance _components

_requires

a

procedure

for

_hypothesis

_testing.

Let

H

o

:

H !

= 0 be the null

hypothesis

with

H

_being

a full

(row)

rank matrix with row size

equal

to the number of

_linearly

independent

estimable functions of _T

_defining

H

o

,

and

H

1

its alternative. For

example,

one can be interested in

_testing

the

_hypothesis

of

_homogeneity

of residual

variances

_H

_o

_:

_{u2 e}

_i

=

exp

(-y,,)

= Const for all i.

Letting

Ye =

f

7 R

,

&dquo;fe2-’7R. - - -, Yel

&dquo;f

R

}

f

with _&dquo;f_R

_being

the

_dispersion

_parameterfor the residual variance in the first

stratum taken as reference.

_H

_o

can be

_expressed

as

_e

_r

_H

= 0, _or

(H

e

, 0

h

= 0

with

_He

=

(O(

I

-I

)x

l

,,

I

-1

).

Let

_M

_o

and

_Nft

be the models

_{corresponding}

to

_H

_o

and

_H

₁

_,

_{respectively.}

Since

P(

YIT

)

=

e u

the

marginal

likelihood _canbe

interpreted

_{as a}likelihood

of error contrasts

_{(Harville, 1974),}

hence the likelihood ratio test based on the

marginal

likelihood can be

applied:

Under

H

o

, A

is

_{asymptotically}

distributed

_according

to

_a

_X

₂

with

_degrees

of freedom

_equal

to the rank of H. In the normal _case,

_explicit

calculation of

_L(

_T

_;

_y)

is

_analytically

feasible:

A BAYESIAN APPROACH TO A MIXED MODEL STRUCTURE

One can be interested to

_generalise

Henderson’s BLUP for subclass means

₍

₁₁

=

T9)

to

_dispersion

_parameters

_{(ln a}

₂

=

W7 )

ie

_proceed

as

if

T

had a mixed model

structure

_(Garrick

and Van

_Vleck,

_1987).

To overcome the

difficulty

of a realistic

interpretation

of fixed and random effects for

_{conceptual populations}

of variances

from a

_frequentist

(sampling)

_perspective,

one can

alternatively

use

Bayesian

procedures.

It is then _{necessary to}

_place

suitable

_prior

distributions on

dispersion

(9)

In linear Gaussian

_methodology,

theoretical considerations

_{regarding conjugate}

priors

or fiducial _argumentslead to the use of the inverted _gammadistribution as a

_prior

for a variance

a

2 _(Cox

and

_Hinkley,

_1974;

_Robert,

_1992).

Such a

density

depends

on

hyperparameters

₇₇ and

s

2 .

The former _conveysthe so-called

degrees

of

belief,

and the latter is a location _parameter.The ideas

_{briefly exposed}

in the

following

are similar to those described in

_Foulley

et al

_(1992).

Hence,

a

_prior

_{density for y}

=

ln

Q

2 can

be obtained _{as a}

log

inverted

gamma

density.

As a matter of

_fact,

it is more

_interesting

to consider the

_prior

distribution of v =

&dquo;y —

T

°,

with

q°

= In

s

2 ,

ie

where

_r(.)

refers to the gamma function.

Let us consider a K-dimensional &dquo;random&dquo; factor v such that

Vk

1 77k

(k

= _{1, ...}

K)

is distributed as a

log

inverted _gamma

’

)

1]k

(

l

InG-

Since the levels of each random

factor are

_{usually exchangeable,}

it is _{assumed that 1]k = 1] for}_everyk in

_{{1, ... K}:}

For v

k

in

_[20]

small

_enough,

the kernel of the

_product

of

_independent

distributions

having

densities as in

_[19]

can be

_{approximated (using}

a

Taylor expansion

of

[19]

about v

_equal

to

₀₎

_by

a Gaussian

_kernel,

_leading

to the

_following

_prior

for v:

As

_{explained by Foulley}

et al

_(1992),

this

_{parametrization}

allows

_expression

of the _T vector of

_dispersion

parameters under a mixed model _typeform.

Briefly,

from

_[19]

one

has

1

=

1 °

₊_v

or

1 =

P

’oS

₊_vif _onewrites the location

parameter

-to

= In

S

2

as a linear function of some vector 8 of

_explanatory

variables

_(p’

_being

a row incidence vector

_{of coefficients).}

_Extending

this

_writing

to several classifications

in v leads to the

_{following general}

expression:

where P and

Q

are incidence matrices

corresponding

to fixed effects E and random effects v,

respectively,

with

[20]

or

[21]

as

_prior

distribution for v.

Regarding dispersion

parameters T, it is then

possible

to

_proceed

as Henderson

(1973)

did for location _parameters₁₁_{, ie}describe them with a mixed model

structure.

_Again,

as illustrated

_by

formula

[22],

the statistical treatment of this

model can be

_{conveniently implemented}

via the

Bayesian

paradigm.

In

_fact,

_equations

_[22]

define a model on residual variances:

(10)

where

_{Pe, Pu, Qe, Qu}

are incidence matrices

_{corresponding respectively}

to fixed effects

5 e

, <!

and random effects _v_e =

(v!, Ve2, ... ,

v!,...)’,

Vu =

(v!, V

u2,

’

’ ’ ,

vu,

;

...)!

with,

for the

_jth

and kth random classification in ve and vu

respectively,

Let 11 =

(11!, 11!)’

with ₁₁_e =

{77ej}

and

1 1

u =

{77Uk}

be the vectors of

hyperparameters

introduced in the variance _componentmodels

_{[23], [24],}

_[25]

and

[26].

An

_{empirical Bayes procedure}

is chosen to estimate the _parameters.The

hyperparameters,

11

(or § =

(!e, !u)’)

are estimated

by

the mode of the

marginal

likelihood of these

_{hyperparameters}

_(Berger,

_{1985; Robert,}

_1992):

Then,

the

_dispersion

_parametersare obtained

by

the mode of the

posterior

density of

T

given

the

_{hyperparameters equal}

to their estimates:

or

_similarly

_for

t.

Maximization in

_[27]

and

_[28]

can be

performed

with a

Newton-Raphson

or an EM

_{algorithm, following}

ideas in the Estimation

_of

_parameters,

_{Unfortunately,}

the

_algorithm

derived from

_[27]

is

_{computationally demanding,}

since it involves

digamma

and

_trigamma

functions. On the other

hand,

an EM

_algorithm

derived

from

_[28]

has the same form as the EM-REML

_algorithm

for variance components.

It

_just

involves the solution and the inverse coefficient matrix of the system in T

at iteration

_(t).

This latter _systemis similar to

_(15),

but it takes into account the

informative _prioron the

_dispersion

_parameters.In the case of a Gaussian

_prior,

this

system can be written as

where

r

is the matrix

I- (!) = ( 0

i

.

)

evaluated

at the current estimate

I

of

!,

tanking

into account the

_priors

via

_{A(!) =}

Var

_{(v’, v’)}

=

A

e

?

A!

with

A,

=

0!1!,

and

A!, _

₍₁₎

_I

_K.,,.

i ’ k

(11)

NUMERICAL APPLICATION

Sires of French beef breeds are

routinely

evaluated for muscular

development

(MD)

based on

_{phenotypic performance}

of their male and female _progeny.

_Qualified

personnel subjectively classify

the calves at about 8 months of _age,with MD

scores

_ranging

from 0 to 100. Variance _componentsand sire

_genetic

values are

then estimated

_by

_applying

classical

_procedures,

ie REML and BLUP

_(Henderson,

1973;

Thompson,

1979),

to a mixed model

_including

the random sire effect and a

set of fixed effects described in table I. The second factor listed in table

_I,

condition

score

(&dquo;Condsc&dquo;),

accounts for the

_previous

environmental conditions

_{( eg}

nutrition via

_fatness)

in which calves have been raised.

Some _{factors among those}described in table I _mayinduce

_{heterogeneous}

variances. In

_particular,

different classifiers are

_expected

to _generatenot

_only

different MD means, but different MD variances as well.

Thus,

the usual sire model

with

_assumption

of

_homogeneous

variances _maybe

_inadequate.

This

_hypothesis

was

tested on the

Maine-Anjou

breed. After elimination of twins and further

editing

described in table

_I,

the

_Maine-Anjou

file included

_performance

records on 8 575

progeny out of 142 sires

_{(&dquo;Sire&dquo;)}

recorded in 5

_regions

_{(&dquo;Region&dquo;)}

and 7 _years

(&dquo;Year&dquo;).

Other factors taken into account were: sex of calves

_{(&dquo;Sex&dquo;),}

_ageat

scoring

(&dquo;Age&dquo;),

claving

parity

(&dquo;Parity&dquo;),

month of birth

_{(&dquo;Month&dquo;)}

and classifier

( &dquo;Classi&dquo; ).

In most strata defined as combinations of levels of the

_{previous factors,}

only

one observation was _present.

Preliminary analysis

A

_histogram

of the MD variable can be found in

_figure

1. The distribution of MD

seems close to

_normality,

with a fair

_PP-plot

_(although

the use of this

_procedure

is somewhat

_{controversial),}

and skewness and kurtosis coefficients were estimated as

(12)

null

_hypothesis,

while others did not

_reject

_it,

_namely

_Geary’s

u, Pearson’s tests for

skewness and kurtosis

_{(Morice, 1972)}

at the 1% level.

Bartlett’s test for

_homogeneity

of variances was

computed

for each of the first

8 factors described in table I. Results in table IIa indicate strong evidence for heteroskedastic variances _amongsubclasses of each factor considered in this data set.

The usual sire model with all factors from table I in the mean

model,

and variance _components

_estimated

_by

_EM-REML,

was

fitted,

leading

to estimates

6d

=

70.1l,a,2,

=

6.91,

and

h

2

=

46fl /(6d

₊

3!)

=

0.36. Note that this model is

equivalent,

in our

_notation,

to the

_homogeneous

model in _f_g_{and Yu}_.

Search for a model for the variances

The

_following

additive mean model

_M

_B

was considered as true

_throughout

the whole

analysis

(13)

A forward selection of factors _strategywas chosen to find a

good

variance model

My

but in 2 _stages;a backward selection _strategywould have been difficult to

implement

because of the

_large

number of models to _compareand the small amount of information in some strata

_{generated by}

those models.

(i)

since

_a2

_represents

> 90% of the total

variation,

it was decided to model that component

first, assuming

the ru- part

homogenous;

’

(ii)

the &dquo;best&dquo;

_T

_u

_-model

was thereafter chosen while

_{keeping unchanged}

the

&dquo;best&dquo;

_T

_e-model.

The different nested models were fitted

_using

the maximum

marginal

likelihood

ratio test

_{(MLRT) A}

described in

_!17J.

During

the first _stage

_(i),

the

_homogeneous

sire variance was

_estimated,

for

computational

ease, with an EM-REML

_algorithm,

and the _Te_parameterestimates

were calculated as in

_Foulley

et al

_(1992).

This _strategy

_leads,

of course, to the

same results as those obtained with the

_algorithm

described in the Estimation

_of

parameters.

The first _stepconsisted of

_choosing

the best one-factor variance model from results

_presented

in table lib. The next _steps,ie the choice of an

_adequate

2-factor

model,

and then of a 3-factor

model,

_etc,are summarised in table III.

Finally,

the

(14)

The model can also be

simplified

after

_comparing

estimates of factor

levels,

and

then

_collapsing

these levels if there are not

_{significantly}

different. For the

_(ii)

_stage,the &dquo;best&dquo;

_r

_u

_-model

was the model

(see

table

IV):

We were not able to reach convergence of the iterative

procedure

for the models

(Mo,

M.y

e

,

Classi)

and

_(Mo,

_M!(,,

_Region),

_although

some levels of the Classi factor were

_collapsed.

This

_phenomenon

is related to a _strongunbalance of the

_design:

for

instance,

one classifier noted the calves of

_only

4 _sires,

_making

_quite

_impossible

a

coherent estimation of

_{Classi-heterogeneous}

sire variances. The other factors

_(except

Year)

had no

_significant

effect on the variation of the sire variances. Because of

imbalance,

the model

gave

unsatisfactory

results _eg

_heritability

estimates _greaterthan one.

Results

Estimates of the

_dispersion

_parametersfor the selected model

_designated

here

as

_{(MB, M!.!, M!.!)}

are shown in table Va. As

_expected,

the

_T

_,-estimates

of the

(Mo, M&dquo;

f

c’

homogeneity)

model,

ie of the best

_r

_e

_-model

with

_only

one

_genetic

variance,

are

quite

similar to the

_-estimates

_e

_T

of the

_(Mo,

_M

_7e

_,

_My&dquo;)

model

_(table

Va).

In _contrast,

_T

_e

_-estimates

of the

_{(Mo, M&dquo;}

_f

_c’

_homogeneity)

model,

with:

M!c :

Classi

_(random)

+ Condsc + Year

_(random)

+ Month

_(random)

_[35]

are different for the &dquo;random&dquo; factors

(see

table

Vb).

Estimated

(15)

!e,Year

= 0.009 and

!e,Month

= 0.0024

respectively,

or

_{alternatively}

_using

% values of

the coefficient of variation for

_ae,

_(!e

_CV

₂

₎

_Class

_i

_e

_,

_CV

=

14.5%, CV,,

Yar

= 9.5%

and

_CV

_e

_,

_Month

= 4.9%

respectively.

In

fact,

the smaller the cell size

(n

i

),

and the

smaller

CV,

the

greater

the

shrinkage

of the

sample

estimates

(6

f

)

toward the

mean variance

(3 )

since the

_regression

coefficient toward this mean in the

equa-tion

_Q2

_{= õ’2}

+ b(6

i

2 _

_õ’2)

is

_{approximately}

b =

n

d[

n

i

₊

(2/CV!)]

with

!7 =

2/CV

2 :

(1992).

The

_genetic

variation in heifers turns out to be less than one half what it is in

bulls even

_though

the

_phenotypic

variance was

_virtually

the same. This may be due

to the fact that classifiers do not score

_exactly

the same trait in males

_(muscling)

as

in females

_{(size and/or fatness).}

It may also suggest that the

_regime

of male calves

is

_supplemented

with concentrate.

Location _parametersare

_compared

in

_figures

2a-d under different

_dispersion

models,

through

scatter

_plots

of estimates of standardized sire merits

_(u

_*

_).

Indexes

based on &dquo;subclass means&dquo;

_(V

_i

=

y

i

,

i = 1, ...

I,

with

homogeneous

variances)

and

those based on the &dquo;sire model&dquo; under the

_homogeneity

of variance

_assumption

are

far away from each other

(see

fig

2a).

Figure

2a is

_just

a reference of

_discrepancy,

which illustrates the

impact

of the BLUP

_methodology.

When

_{heterogeneity}

is

introduced _amongresidual

_variances,

sires’

_genetic

values do not vary too

much,

as

shown in

_figure

2b.

_Modelling

of the

_genetic

variances has a

_larger

_impact

on the sire

_genetic

values

_(see

_figure

_2c)

than

_modelling

of residual variances.

_Finally,

the

Bayesian

treatment of

_r

_e

_{-parameters by introducing}

random effects in the model

(M

B

,

M&dquo;

YJ

does not have any influence on the sire

_genetic

merits

(fig 3d).

Evaluation of sires can be biased if true

_{heterogeneity}

of variance is not taken into

account. As shown in table

_VI,

sire number 13 went down from the 16th to the 24th

position because

his calves were scored

mostly by

classifier no 1 who uses a

large

scale of notation

_(see

_T

_-estimates

in table

_V).

On the other

hand,

sire 103 _{went up}

from the 25th to the 14th

_place

since the

_{corresponding}

Classi and Condsc levels have low residual variance

_(for

the other factor levels

_represented,

the variances

were at the

_average).

For the same _reason,the sire

_genetic

merits were also affected

by

modelling

In ad.

The difference in

_genetic

merit for sire 56

_{(1.40 vs}

1.74 under the homoskedastic and the residual heteroskedastic models

_{respectively)}

is also

explained by

the fact that the calves of this sire were scored

_{exclusively by}

classifier no 12 and in 1983

(Year

=

1).

Due _to

modelling

Q

u,

this sire _wentdown

_again

(from

1.74 to 1.63 under the full heteroskedastic

_model)

because all its progeny

are females with a lower

Q

u

component

than in males. Other

_{things being equal,}

a reduction in the

oru

variance results in a

larger

_ratio,

or

equivalently

a smaller

heritability

and

_consequently

in a

_{higher shrinkage}

of the estimated

_breeding

value

toward the mean. In other

words,

if a decrease in

_genetic

variance is

ignored,

sires

above the mean are overevaluated and sires below the mean are underevaluated.

Hypothesis checking

(16)

(17)

The

_estimated

sire variance was

_7.08.

Variances of the

_Classi,

Year and Month factors are,

(18)

(19)

After

_modelling

residual

_variances,

the distribution of standardized residuals

became closer to

_normality,

in terms of skewness and

_especially

kurtosis. This

phenomenon

was observed in the whole

_sample

and also in the

_subsamples

defined

by

the levels of the factor considered in _r_e_.On the other

hand,

normality

of the residuals was stable in the

subsamples

defined

_by

the factors absent from the _r_e

-model.

Normality

of the distribution of the standardized sire values in terms of

kurto-sis and

_PP-plot

was

continuously

damaged

at each _stepof the variance

_modelling:

estimated kurtosis was

_0.61,

0.72 and

_0.90,

for the

_{homoskedastic,}

residual

het-eroskedastic and

_fully

heteroskedastic models

_{respectively.}

_Moreover,

skewness for

the 142 sire

_genetic

merits

_improved

_{slightly during}

that process: -0.09, -0.003

and -0.03 for the same models

respectively.

Computational

aspects

Programmes

were written in Fortran 77 on an IBM 3090

by implementing

an

EM

_{algorithm corresponding}

to

[15].

The _convergencewas fast: 15-20

_cycles

for heteroskedastic

_{T e-models}

with

_<7J

estimated

_by

EM-R.EML

_{((i) stage),}

and 15-40

cycles

for

_fully

heteroskedastic

_T

_-models

or heteroskedastic

_T

_e

_-models

with random

effects. CPU time was between 2-5 min per model fit

(estimation

of parameters and

computation

of the

_{log marginal}

likelihood.

DISCUSSION AND CONCLUSION

This paper is an extension to u-components of variances of the

_{approach developed}

by Foulley

et al

₍₁₉₉₂₎

to consider

_{heterogeneity}

in residual variances

using

a

structural model to describe

_dispersion

_parameters,in a similar _wayas

usually

done on subclass means.

In that _respect,our main concern focuses on _waysto render models as

parsi-monious as

_possible

so as to reduce the number of _parametersneeded to assess

heteroskedasticity

of variances. An

_interesting

feature of this

_procedure

is _{to assess,}

through

a kind of

analysis

of variance, the effects of factors

marginally

or

jointly.

For _instance,one can test

_{heterogeneity}

of _{sire variances among}breeds of dams

(20)

be related to a

_segregating

_major

_gene)

can be tested while

_taking

into account the

influence of other nuisance factors

_{(season, nutrition...).}

_However,

the _powerof the

likelihood ratio test for

_detecting

_{heterogeneity}

of variance can be a real issue in

many

practical

instances.

From the

_{genetic point}

of _view,the

_approach

is

_quite

_general

since it can deal with

heterogeneity

among within and between

_family

_componentsof

_variances,

or _among

genetic

and environmental variances. Factors involved for u and e _componentsof variance _maybe different or the same,

making

the method

_especially

flexible. Our

modelling

allows one to assume

_(or

even

test)

whether the ratios of variances or

heritabilities are constant over levels of some

_single

factor or

_combination

of factors

(Visscher

and

_Hill,

_1992).

If a constant

_heritability

or ratio of variances a =

or

2 i/

or

2

among strata is

assumed,

the model involves the parameters y and a

only,

and

reduces to

_{In o, ei 2}

=

we!re

with

_{oru 2i}

_replaced

_by

_a;

_j

_0:

in the likelihood function. The

_shrinkage

estimator for the variances

_{proposed by}

_eg,Gianola et al

_(1992),

follows the same idea of the

_Bayesian

estimator described in the

_{Bayesian approach}

section. When a Gaussian

_prior

_density

is

_employed

for the

_dispersion

_parameters

Y

, the

hyperparameter

6 acts

as a shrinker. But the

_{Bayesian approach}

for a direct

shrinkage

of variance _componentsassumes that

_{heterogeneity}

in such components

(residual

and u

_components)

is due

_only

to one factor. The

_{approach presented}

in this _paperis more

_general

since it can _{cope with}more

complex

structures of stratification which _{may differ}from one _componentto the other.

_Moreover,

its mixed model structure allows _great

_flexibility

to

_adjust

variances in relation to the amount of information for factors in the

_model;

eg when data

provide

little information for some factors

_{(or levels)}

or considerable for

_others,

our

procedure

behaves like BLUP

_{(or James-Stein)}

ie shrink estimates of

_dispersion

parameters toward zero if there is little

_information;

_only

with sufficient information

can the estimate deviate. For

instance,

our

methodology provides

a

_simple

and

rational

_procedure

to shrink herd variances

_(whatever

_they

are,

genetic,

residual

or

_phenotypic)

towards different

_population

values

_(eg

_regions,

as

_{proposed by}

Wiggans

and

_VanRaden,

₁₉₉₁₎

due to _{poor accuracy}of within herd or

herd-year

variances

_{(Brotherstone}

and

_Hill,

_1986).

It then suffices to use a hierarchical

_(linear)

mixed model for herd

_{log-variances}

and take the

_population

factor

_{( eg region)}

as

fixed and herd as random within that factor. An illustration of the

_flexibility

and

feasibility

of our

_procedure

was

_recently

_given

_{by Weigel}

₍₁₉₉₂₎

in

_analyzing

sources

of

_{heterogeneous}

variances for milk and fat

_yield

in US Holsteins.

Coming

back to the case of a

_unique

factor of variation for the sire

_variances,

one can think of a

_{simpler model,}

such as _yZ!,!=

m

i + _u_ij +

_e2!!(i

= 1, ...

I; j

=

1, ... J; k

=

l, ...

n2! ),

where _J_.l_i is the mean effect of environment

_i,

_u_ij is the

(random)

effect of

_{the jth}

sire in the ith

_environment,

such that ui =

{u2!}! !

N(O,a!,A),

and e2!! is the residual effect

pertaining

to the kth calf of the

_jth

sire in the ith environment.

_Usually

in such hierarchical

_models,

it is assumed that Cov

_(u

_i

_,

_Ui’

₎

= 0 if i

=A

i’. On the _contrary,our

_{modelling procedure}

via the

change

in variables

_u!

=

Qu! u2!

(see

!4) )

takes into account covariances among

the same

_(or

_genetically

_related)

sires used in different

herds,

ie

_{Cov (ui, uj, )}

=

a u , a u

&dquo;

A!i! (Aii!

=

relationship

_matrix

pertaining to

ui and

u

i

, )

and so recovers

the inter-block information. The loss in _powerin

_hypothesis

_testing

due to

_ignoring

(21)

Although

this

_presentation

is restricted to a

single

random factor

u

*

,

it can be

generalized

to a

multiple

random factor situation. If such factors are

uncorrelated,

the extension is

_{straightforward.}

When covariances

_exist,

one _may

_simply

_assume, as

proposed

by

Quaas

et al

_(1989),

that

_{heterogeneity}

in covariances is due to

scaling.

This _means,for

instance,

that in a sire

_(s

_i

_)-maternal

_grand

sire

_(t!)

model Y

ijk =

X!. k 13 + Si

+t!

+e2!k,

one will model

as

h’

2 a th 2

as

_previously,

and assume that

the _{covariance is Q9t}_,,=

pa sha t h

for stratum h. If the model is

parameterized

in

terms of direct ao and maternal am effects as follows

_through

the transformation

s

i =

ao

i/

2 and t

j

=

ao!/4

₊

a

m;

/2,

one can set the

_genetic

correlation pa to a

constant, ie

_a

_ao

=

P

a

aoh

O

a

,

n

-

Notice that this condition is not

equivalent

to

the

_previous

one, except if

_a

_aoh

_/!d&dquo;,h

does not

_depend

on h.

Although

the

_methodology

is

_appealing,

attention must be drawn to the

feasi-bility

of the method. The first

_problem

is the inversion of the coefficient matrix in

[16]

required

for the

_computation

of the variance _system

_(15].

In animal

_breeding

applications,

this matrix is

_usually

_very

_large.

This

_limiting

factor is

_already

becom-ing

less

_important

due to _{constant progress}in

_computing

software and hardware.

The

_technique

of

_absorption

is

_usually

used to reduce the size of matrices to invert.

Another

_approach

is to

_approximate

the inverse. One can, for instance, use a

Taylor

series

_expansion

of order N for a _squareinvertible matrix A

where the square matrix

A

o

is a matrix close to A and _is,of course, easy to invert, and

_{where 11}

-

11

denotes some norm on the space of invertible matrices. Methods

viewed in Boichard et al

₍₁₉₉₂₎

can also

help

to

_{approximate A-}

1

in

_particular

cases such as _sparsematrices, &dquo;animal

model&dquo;,

etc.

Statistical power for likelihood ratio tests was

_investigated

for detection of

heterogeneous

variances in the usual

_designs

of

_{quantitative genetics}

and animal

breeding.

Results

_given

_by

Visscher

₍₁₉₉₂₎

and Shaw

₍₁₉₉₁₎

indicate

_generally

low power values for

detecting heterogeneity

in

_genetic

variance.

_According

to

_Shaw,

a nested

_design

of 900 individuals out of 100 sire families

_provides

a _powerof 0.5 for

_genetic

variances

_differing

_by

a factor of 2.5. This

clearly

indicates the

minimum

_requirements

in

_sample

size and

_family

numbers which should be met

before

_carrying

out such an

analysis

and the limits therein. Therefore it seems

unrealistic to model

_genetic

variances in

_practice

_according

to more than 1 or 2

factors,

and it

_might

be wise to consider some of them as random if little information

is

_provided

_by

the data in each level of such factors.

Although

statistical constraints are satisfied for estimates of the _parametersof

the model

_(positive

variance _estimates,intra-class correlations within

[-1, + l]J,

some

_genetic

constraints such as about the

_heritability

estimate

_(h

₂

=

_4a!/(a!+ae)

for a sire

model)

_ranging

within

[0,1]

are not

_{imposed by}

our model. This can be

dealt with

_by

_choosing

_appropriate

_priordistributions on the

dispersion

_parameters

that would take this constraint into _account,but this

_procedure

appears to be

extremely complicated. Fortunately,

the

_{unconstrained}

solutions are the constrained

(22)

maximization

_procedure

under constraints must be

_performed

or the

_posterior

distribution under the constraint can be obtained from the unconstrained

_posterior

distribution

_{multiplied by}

a corrective factor

_(Box

and

_Tiao,

_1973).

This

_problem

does not occur with an &dquo;animal

model&dquo;,

but can arise when a &dquo;sire model&dquo; is

_used,

and is not

_specifically

related to

_{heteroskedasticity.}

From a statistical _pointof _view,the

procedure

uses the _conceptof variance function

_(Davidian

and

_Carroll,

₁₉₈₇₎

as an extension to

_dispersion

_parametersof the link function. Our _presentationfocuses on the

_log

link function which is the

most common choice in this field

_(see

for instance San

Cristobal,

1992,

for a review of variance

_models)

&dquo;for

_physical

and numerical reasons&dquo;

_(Nair

and

_Pregibon,

_1988).

Following

Davidian and Carroll

₍₁₉₈₇₎

or

_Duby

et al

_(1975),

the

_question

can be

asked whether or not variances vary

according

to means or location _parameters.In

the

_Maine-Anjou

_{data, however,}

it does not seem to be the case, thus

validating

our choice in

!10!.

It would be

_interesting

to extend our method to a

fully generalized

linear mixed

model on means and on variances with or without common _parametersbetween

the mean model and the variance model. Numerical

_integration

or Gibbs

sampling

procedures

would then be

_required

_although

_approximatemethods of inference

can also be used for such models

_(Breslow

and

_Clayton,

_1992;

_Firth,

_1992).

Statistical

_{problems arising}

with common _parametersare

_{already highlighted by}

van

_Houwelingen

_(1988).

With a

_fully

fixed effect variance

_model,

_techniques

of estimation and

_hypothesis

testing

for

_dispersion

_parameters

_presented

here are those of the classical

_theory

of

likelihood inference

_(likelihood

and likelihood ratio

_test),

_exceptthat the

_marginal

likelihood function

_L(

_Y

_;y)

was

_preferred

to the usual likelihood

_L(13,

_T_;

_y),

in the

light

of ideas behind REML estimators of variance _components.This test reduces

to Bartlett’s test

_{(Bartlett, 1937)}

for a one classification model in variances and

under a saturated fixed model on the means

_(ie

_i

_ji

=

_y

_i

_,

i = 1, ...

I).

Unfortunately,

Bartlett’s test is known to be sensitive to

_departure

from

_normality

_{(Box, 1953).}

Simulations are needed to

_study

the robustness of this test and other

_competing

tests. From a

Bayesian

_perspective,

the

Bayes

factor is

usually applied

for

hypothesis

testing

(see

Robert, 1992,

for a

_discussion).

The

_posterior

_Bayes

factor

_(Aitkin,

1991)

could also be used to _compare

_dispersion

_models,

but numerical

_integration

would then be

_required

_(see

the

_expression

of the likelihood in

_!18!).

In this paper, focus was on an

_appropriate

way to model

_{heterogeneous}

vari-ances, but the initial motivation was a best

fitting

of location _parameters

(animal

evaluation for animal

_breeders).

This difficult

_problem

of

feedback,

also related to

the Behrens-Fisher

_problem,

has to be solved in our

_{particular approach.}

Moreover,

a _great

research

_perspective

is _openon the

_important

and

complicated

question of the

_{joint modelling}

of means and variances

(Aitkin,

1987;