HAL Id: hal-00894170
https://hal.archives-ouvertes.fr/hal-00894170
Submitted on 1 Jan 1997
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
ECM approaches to heteroskedastic mixed models with
constant variance ratios
Jl Foulley
To cite this version:
Original
article
ECM
approaches
to
heteroskedastic
mixed models with
constant
variance ratios
JL
Foulley
Station de
génétique quantitative
etappliquée,
Institut national de la rechercheagronomique,
78352
Jouy-en-Josas cedex,
France(Received
6February
1997;accepted
28May
1997)
Summary - This paper presents
techniques
of parameter estimation in heteroskedasticmixed models
having
constant variance ratios andheterogeneous log
residual variancesthat are described
by
a linear model. Estimation ofdispersion
parameters isby
standard(ML)
and residual(REML)
maximum likelihood.Estimating equations
are derivedusing
the
expectation-conditional
maximization(ECM)
algorithm
andsimplified
versions of it(gradient ECM).
Direct and indirectapproaches
areproposed
with the latterallowing
hypothesis testing
about the variance ratios. Theanalysis
of a smallexample
is outlined to illustrate thetheory.
heteroskedasticity
/
mixedmodel / maximum
likelihood/
EM algorithmRésumé - Approches ECM des modèles mixtes hétéroscédastiques à rapports de variances constants. Cet article
présente
destechniques
d’estimation desparamètres
intervenant dans des modèles mixtes ayant des rapports de variance constants et des variances résiduelles décrites par un modèle linéaire de leurs
logarithmes.
Lesparamètres
de
dispersion
sont estimés par le maximum de vraisemblanceclassique
(ML)
et restreint(REML).
Leséquations
à résoudre pour obtenir ces estimations sont établies à partir del’algorithme d’espérance-maximisation
conditionnelle(ECM)
et d’une versionsimplifiée
dite du
gradient
ECM. Desapproches
directe et indirecte sontproposées,
cette dernière conduisant à un testd’hypothèse
sur le rapport de variances. La théorie est illustrée parl’analyse numérique
d’un petitexemple.
hétéroscédasticité
/
modèle mixte/
maximum de vraisemblance / algorithme EMINTRODUCTION
Heteroskedasticity
hasrecently
generated
much interest inquantitative genetics
and animal
breeding.
Tobegin with,
there is now alarge
amount ofexperimental
evidence ofheterogeneous
variances for mostimportant
livestockproduction
traitsof
heterogeneous
variancesarising
in univariate mixed models(Foulley
etal, 1990;
Gianola etal, 1992; Weigel
etal,
1993;
DeStefano,
1994;
Foulley
andQuaas,
1995).
For many reasons(accuracy
ofestimation,
ease ofhandling large
datasets),
amajor
objective
in this area lies inmaking
models asparsimonious
aspossible.
This can be
accomplished
in at least two ways:i)
by modelling
variances in thecase of
potentially
numerous sources ofheteroskedasticity,
andii)
by
assuming
thatsome functions of those
parameters
(eg,
intra-class correlation orheritability)
areconstant. The first
aspect
corresponds
to the so-called structuralapproach
in which theheterogeneity
of thelog
components
of variances is described via a linear modelstructure similar to that used for means
(Foulley
etal; 1990,
1992;
SanCristobal,
1993).
Restrictions as inii)
were consideredby
Meuwissen et al(1996)
and Robert et al(1995a,b).
Meuwissen et al(1996)
introduced amultiplicative
mixed model to estimatebreeding
values andheteroskedasticity
factorsassuming
heritability
(h
2
)
constant across
herd-years.
Robert et al(1995a,b)
developed
estimation andtesting
procedures
forhomogeneity
ofheritability
withinand/or
genetic
correlations acrossenvironments. But Meuwissen’s
study
postulates
knownh
2
and
Robert’s researchapplies
toonly
asingle
classification ofheteroskedasticity.
The purpose of this paper is to propose a
complete
inferenceapproach
forparameters
having
both featuresi)
andii),
ie,
for continuous data describedby
mixed models with constant variance ratios and
heteroskedasticity analyzed
viaa structural
approach.
Forsimplicity,
thetheory
will bepresented using
aone-way random mixed model for data and afterwards it will be
generalized
to severalu-components.
Inference is based on likelihoodprocedures
(REML
andML)
andestimating equations
derived from theexpectation-maximization
(EM)
theory,
more
precisely
theexpectation/conditional
maximization(ECM)
algorithm
recently
introduced
by Meng
and Rubin(1993).
THEORY
Statistical model
As
usual,
it is assumed that thepopulation
can be structured into strata(i
=1, 2, ....,1)
corresponding
topotential
factors ofheterogeneity.
Let the one-way random model be written as:where yi is the
(n
2
x1)
data vector for stratumi;
j3
is a(p
x1)
vector of unknown fixed effects with incidence matrixX
i
,
and ei is the(n
i
x1)
vector of residuals. The contribution of random effects isexpressed
as inFoulley
andQuaas
(1995)
as
O&dquo;uiZiU’
where u* is a(q
x1)
vector of standardizeddeviations, Z
i
is thecorresponding
incidence matrix and au, is the square root of theu-component
of variance the value of which
depends
on stratum i. Classicalassumptions
aremade
for
the distributions of u* and ei,ie,
Nu*N(0, A),
eiNN(0, ae.In! ),
and The notation in[1]
is unusual ascompared
to that used in the statistical literaturean
expression
of the randompart
especially
in animalbreeding.
For instance the between sire variance may varyaccording
to the environment in which the progenyof the sires are raised. Note also that (JUi can be viewed as a
regression
coefficient ofany element of yi on the
corresponding
element ofZ
i
u
*
.
Thus,
in animalbreeding,
au
, acts as a
scaling
factor of a vector u* of standardized sire values onwhich,
forinstance,
selection can be based.A structure is
hypothesized
on the residual variance so as to model the influence of factorscausing heteroskedasticity.
This is carried outalong
the linespresented
in
Foulley
et al(1990, 1992)
via a linearregression
onlog-variances:
where 5 is an unknown
(r
x1)
real-valued vector ofparameters
andp’
is thecorresponding
(1
xr)
row incidence vector ofqualitative
or continuous covariates.Furthermore,
theassumption
of a constant intra-class correlation(or heritability)
implies
setting
EM-REML estimation
Use is made here of the EM
algorithm
ofDempster
et al(1977)
tocompute
REML estimates ofparameters
involved in variancecomponents
(Patterson
andThompson, 1971;
Searle etal,
1992).
The basicprocedure
proposed by Foulley
andQuaas
(1995)
isapplied
here after someadjustment
of theM-step
taking
advantage
of the ECM
algorithm
ofMeng
and Rubin(1993).
-the ECM
algorithm
is based on acomplete
data set definedby
x =(0’, u
*
’,
e’)’
and itslog-likelihood
L(y; x).
The iterative process takesplace
as follows.The
E-step
is defined asusual,
ie,
at iteration[t],
calculate the conditionalexpectation
ofL(y; x)
given
the data y and y =y!t!
which,
as shown inFoulley
andQuaas
(1995),
reduces towhere
E!t]
(.)
is a condensed notation for a conditionalexpectation
taken withrespect
to the distribution ofx!y, y
=-yf
t
l.
Since the
parameters
to be estimated areheterogeneous,
theestimating equations
are derived at the maximizationstage
from aslightly
different version of the EMalgorithm,
the so-called ECMalgorithm.
Asexplained
in detail inMeng
and Rubin(1993),
a CMstage
replaces
theM-step
by
a sequence of several conditionalascent maximization
procedure
(Zangwill,
1969).
Wesuggest
here thefollowing
procedure:
-
maximize
Q
over y toget
6
[tH]
with T set at its last valueT
[t]
,
ie-
then,
maximizeQ
over Tto obtainT!’+’l
with 5 in y of1’[
])
t
Q(
I
’
1
set to5!!!,
ie,
Thus,
the maximizationstep
consists of twoCM-steps
within the sameE-step
in order to reduce the need to
compute
the conditionalexpectation
ofeie
i
,
and itscomponents
more than once. Thealgebra
of differentiation isgiven
inAppendix
A.The iterative
system
forcomputing
formulae 5 can be written aswith the elements of the
right-hand
sidebeing
Note that for this
algorithm
to be a trueECM,
one would have to iterate the NRalgorithm
in[7]
within an innercycle
(index £)
until convergence to the conditionalmaximizer
y[
tH]
=yl’,’]
at eachM-step
[t].
Inpractice
it may beadvantageous
toreduce the number of inner
iterations,
even up toonly
one,ie,
by solving
just
onceHowever,
caution should be exercised whenapplying
such ahybrid algorithm
that no
longer
guarantees
the monotonic convergence in likelihood values(Lange,
The formula to
update
Treduces tomimicking
the form of a scaledregression
coefficientpooled
over strata.The elements to
compute
at theE-step
can beexpressed
as functions of the sumsX’yi, Z’yi,
the sums of squaresyiyi
withinstrata,
and GLS-BLUP solutions ofHenderson’s mixed model
equations
and of their accuracy(Henderson, 1984),
ieThus,
deleting
[t]
for the sake ofsimplicity,
one has:where
(3 and
u* are mixed modelequations
for13
andu
*
,
and C - _[Cf
3f
3
Cuf3
C
f
3
u
]
Cuu
J
is the
partitioned
inverse of the coefficient matrix.Expressions
in[12a-c]
caneasily
accommodategrouped
data(see
Appendix
B).
The close connection between the
system
ofequations
[7]
for residualparameters
and formula[12]
given
inFoulley
et al(1990)
can be observed. There is also aremarkable
similarity
between formula[9]
for the ratio and formula[7]
inFoulley
and
Quaas
(1995).
This means that thecomputations
can beimplemented
withvery little
change
in the code usedpreviously.
True orgradient
EM could also haveExtension to several
u-components
Formulae
(7!, [8ab]
and[9]
caneasily
begeneralized
to a mixed modelincluding
several
(k
=1,
2,...,
K)
independent
u-components
with Tk =
a
Uik
/a
ei
constant over strata i.Letting
y =(b’,
T
’)’
aspreviously
but now with T=I
Tk
being
a vector of ratiosof standard
deviations,
theQ
function to be maximized has the same form as in[4]
with eiexpressed
from!13!.
One canperform
theCM-steps using
eitheri)
thesequence
6, ’r
l
, T
2
....
I
Tk
, - - - ,
, KTie,
each Tk oneby
one, theremaining
onesbeing
held
constant,
orii)
the sequence/5,
and T as a whole with all the Tks maximizedjointly.
In both cases, thealgorithm
forcomputing
5 isformally
the same as in[7]
withonly
aslight
change
in the definition of the elements ofW
bb
,
vbbeing
unchanged
If the conditional maximization of
the T
k
s
takesplace
oneby
one(case i),
formula[9]
stillapplies
for each of them. Otherwise(case ii),
one has to solve thefollowing
system:
An indirect
approach
The
original
model with a constant T ratiospecified
in[1-3]
can be viewed as aspecial
case of a moregeneral
modelwith,
aspreviously,
fno, 2 -
p§5,
but also with a linear structure onlog-ratios
Letting
y =(6’, 71’)’
here,
thesequence of the
CM-steps
areThe
algorithm
for S is the same as in[7].
Thealgebra
for A is shown in theAppendix,
and leads to asystem
that can be written under a similar form as thatof 6
1--J For
practical
reasons, one may also wish to limit the number of inner iterations(index
£)
even toonly
one in order to reduce the volume ofcomputation
but theapplication
of this ECMgradient algorithm
should beperformed
carefully.
Furtherempirical
simplifications
for the elements of[22]
can beproposed
along
the samelines as in
Foulley
et al(1990).
Again,
these results can be extended to a model with several randomindependent
factors(k
=1, 2,...,
K)
by
setting
Actually,
if theCM-steps
areperformed
for each vector71
k
separately,
the sameformulae as in
[20], [21]
and[22]
apply: just replace
Ti,,
i
Z
u*by
Ti,k,Zi
k
,
uk and
ML estimation
It may be
interesting
in some instances to use ML rather than REML forestimating
variance
components
(see Discussion).
The ECMprocedure developed
in this papercan be
easily
adapted
to obtain MLparameter
estimates.13
is nowpart
of therespect
to the distribution of u*given
y, y =y!t!,
and13
=13
[
’
I
.
Maximization withrespect
to13
can be based on theequation
<9Q/<9j3
=0,
ieOne can
proceed
aspreviously,
ie,
run twoCM-steps
for thedispersion
parameters
based on the same
E-step
so as to obtain6!t+
and
T
]
t+1
]
(or
!ft+1]),
and thenperform
an additionalCM-step
forcomputing
¡3
[t+l]
based on!23!,
iel &dquo;&dquo; ’ -!J
Alternatively,
it may beadvantageous
toperform
theCM-step
forj3
and thenext
E-step
jointly by solving
Henderson’s mixed modelequations
inI3
[Hl]
andu*[
t
+
i
]
=E!u*!y,
6
1
tH
], rrl
t
H
]
)
based on6[
Hl
]
andT
f
c
+
1
1.
Formulae for the two
CM-steps
do notchange.
Theonly
additional modification results fromtaking
the conditionalexpectation
ofcomponents
ofe!e,
given
y, y =y[
t
],13
=l3
[t]
instead ofy, y =
y
[t]
.
Formulae in[12]
reduce towhere
M
uu
is the uby
u block of the coefficient matrix!11!.
Note that the trace terms inside those formulae have
disappeared
or have beengreatly simplified owing
toconditioning
withrespect
to(3
=l3
[t]
.
Moregenerally,
for models
[13]
involving
severalu-components,
[25c]
becomeswhere
(M§) )
k£
is the blockpertaining
to random factors k and in the inverse of the randompart
of the coefficient matrix.Numerical
example
The
procedures presented
in this paper are illustrated with a small data set obtained from simulation. Data weregenerated
according
to a cross-classified modelhaving
two
(environmental)
fixed factors(A =
2 levels;
B = 3levels)
and one(genetic)
random factor
(S
= 9levels).
Thegenetic
contribution consists of sire and maternalgrand
sireeffects,
the latterbeing
assumed to have half the value of the first one.where p is a
general
mean, ai the effectof
environmental factor A(i
=1, 2), b!
the effect of environmental factor B(j
=1, 2, 3), s* the
standardized contribution of male k as asire,
and1/2se
the standardized contribution of male as a maternalgrand
sire,
and eZ!w&dquo;, the residual term.Values chosen for the fixed effects were
(using
a full-rankparameterization):
¡.
.t
+al
+b1
=100;
az-on =20;
b2
-
b
,
=-10;
b3
-
bl
= -20. The vector s* =fs
*
kl
}
of sire effects is assumed to beN(0, A)
with elements of therelationship
matrix A shown at the bottom of table I.Residual variances were obtained from
with a base line value
(]&dquo;!11
=exp(p
*
+ai +bl)
=400,
andmultiplicative adjustment
factors:exp(a2 - a*)
=2;
exp(b2 - bi)
=1/2
andexp(b3 - b*)
=3/2.
The ratioT =
(
]
&dquo; 8ij
/ (]&dquo;
eij
of the square root of the sire to the residual variance was taken asconstant over A x B cells and set to
8.75-
1/2
(heritability
equal
to0.41).
There were 267 observations distributed among 18 different AB x sire x maternal
grand
sire subclasses. The data structure isdisplayed
in table I as well as cell sizeTests of
hypotheses
about the locationparameters
{3,
the residualdispersion
parameters
5 and the ratiosr
ij
were carried out via the likelihood ratio statistic asdescribed in
previous
studies(Foulley
etal,
19901992;
San Cristobal etal, 1993;
Meyer
etal, 1993; Foulley
andQuaas,
1995).
Formulaeby
Quaas
(1992)
were usedto
compute
maximized likelihood functions(Ln,
aX
).
Results can be
arranged
as ananalysis
of variance(or
deviance)
table: seetable II for
hypothesis testing
about{3,
and table III for residual(b)
and ratio(A)
parameters.
Note also that the test statistic for13
relies on-2L
n
,aX
evaluatedfrom the ML estimates of all
parameters,
whereas a maximized residual likelihood can be betteremployed
for 5 and 7!.Interaction effects on location
parameters
areconstantly rejected
under differentassumptions
for the otherparameters.
Thehypothesis
of residual variancehomo-geneity
isstrongly
rejected
as well assingle
factordescriptions
ofheterogenity.
Theassumption
of a constant ratio Tturns out to be a reasonable one. The test resultseventually
agree with the simulationmodel; they
support
thepractical
conclusion that the p + A + B model is the mostappropriate
to account for variation both inlocation and in
log-residual
variances,
the ratio Tbeing
constant.The estimation
procedure
for l5 and T(or J!)
is illustrated in table IV for this model and an alternative oneusing
both standard and residual maximum likelihoodmethods of estimation. ML and REML estimates of residual variances do not differ
very
much;
on thecontrary,
the ML estimates of the ratio T turns out tobe,
asexpected,
lower than the REML ones, the values of the latterbeing
close to thetrue value.
DISCUSSION AND CONCLUSION
The main purpose of this paper was to extend the
general
structuralapproach
toheteroskedasticity
in mixed modelsproposed by Foulley
et al(1990, 1992)
to thecase of
homogeneous
ratios of u to e variancecomponents.
In a sire
by
environmentinteraction,
this isequivalent
topostulating
homo-geneous intra-class correlations or heritabilities. This seems to be a reasonable
assumption
inpractice,
or at least serves as a suitablecompromise
between the existence ofheteroskedasticity
andparsimony
of models. Less restrictive assump-tionsmight
also beinvestigated
(Quaas,
1995,
perscomm).
This paper alsoprovides
a
generalization
of LR tests of thisassumption
to unbalanced data andcomplex
model structures: see theprevious
work of Visscher(1992)
on a one-way randombalanced
design,
and thatby
Robert et al(1995a,b)
forheterogeneous
variancesdue to a
single
classification.The EM
algorithm
turns out to be a convenient andpowerful
tool forsolving
variance
component
estimationproblems.
The ECMalgorithm
allows us tosimplify
theestimating equations,
inparticular
the ECMgradient
version. Theadvantage
of thisalgorithm
wasespecially
clear here in the case of the indirectapproach.
A few
examples
of this for the mixed model have beenalready
mentioned(Meng
andRubin,
1993example 1; Walker,
1996).
It offersgreat
flexibility
indefining
thesequence of the conditional maximization
steps,
all the alternatives of which havegenerated by
the EMalgorithm
arestrikingly
natural(see
Appendix
B)
thusgiving
a flavour ofsimplicity
to the wholeprocedure.
It also makes it
possible
to switch from REML to ML or vice versa with very littlechange
inimplementing
computations
(Foulley
etal,
1994).
Some authors such asLeonard
(1975),
Denis(1983)
and Anderson(1984)
in statistics and Shaw(1987)
inquantitative genetics
have advocated the use of ML rather than REMLprocedures
to estimate variancecomponents.
Although
the interest of ML versus REML inthat case remains
questionable,
ML estimates remainmandatory
forhypothesis
testing
about13
via the LR statistic(see
the numericalexample).
Bayesian point
estimators can also be envisioned via EM
(Foulley
etal, 1992;
Gianola etal, 1992;
San Cristobal et
al,
1993;
Weigel
andGianola,
1993;
Foulley
andQuaas,
1995).
Inaddition,
asalready pointed
outby
Denis et al(1996),
a LR test about(3
requires
5 and T(or 71)
being
the same for the nullhypothesis
and itsalternative;
the same rule holds in
hypothesis testing
about 5(or 71)
by keeping
the otherand difficult
problem
ofjoint
modelling
of means andvariances,
which is related to such issues as the Behrens-Fisherproblem
andmulti-stage hypothesis testing,
and which needs further consideration.Another area that deserves caution and further
development
is that ofestima-bility.
Difficulties areexpected
when all the cellscontributing
to anelement j
of5 or A
(or
to a linear combination ofthem)
have aweight tending
towards zero.This may arise due to
i)
purely overparameterization problems,
or due toii)
pa-rameter valuesbecoming
extreme(eg,
ratios Titending
to zeroimplying
elementsof 71
being
infinite).
This lastphenomenon
is similar to whathappens
in theanal-ysis
ofbinary
and ordinal data with latent variable models(Misztal
etal, 1989;
Fahrmeir andTutz,
1994).
Such difficulties can be avoidedby reparameterization
(i),
orby
setting
lower bounds to thediagonal
elements of thesystem
ofequations
to solve(ii).
Finally,
asymptotic
accuracy can also be worked outnumerically
within the EM framework via the so-called SEMalgorithm
(Meng
andRubin,
1991).
Although
itonly requires
the code of thecomplete
data variance covariance matrix and of the EM or ECMoutputs,
the burden of calculations is thenheavier,
so that wesuggest
that it is restricted to the
simplest
models. Othercomputing
alternatives should also beconsidered,
eg, the average information-restricted maximum likelihoodalgorithm
(AI-REML)
of Gilmour et al(1995).
ACKNOWLEDGMENTS
The author is
grateful
to ElinorThompson
(INRA, Jouy-en-Josas)
and Dr Max Rothschild(ISU,
Ames)
for theEnglish
revision of themanuscript
and to the two anonymous referees for their valuable commentsespecially
about some technicalitiesof the EM
theory.
REFERENCES .
Anderson TW
(1984)
An Introduction to Multivariate StatisticalAnalysis.
JWiley
andSons,
New YorkDempster AP,
Laird NM, Rubin DB(1977)
Maximum likelihood fromincomplete
data via the EMalgorithm.
J R Statist Soc B 39, 1-38Denis JB
(1983)
Extension du mod6le additifd’analyse
de variance par modélisationmultiplicative
des variances. Biometrics39,
849-856Denis
JB, Piepho HP,
vanEeuwijk
A(1996)
Mixed models for genotypeby
environmenttables with an
emphasis
onheteroskedasticity.
TechnicalReport,
Laboratoire deBiom6trie
DeStefano AL
(1994)
Identifying
andquantifying
sources ofheterogeneous
residual andsire variances in
dairy production
data. PhDthesis,
CornellUniversity, Ithaca,
New YorkFahrmeir L, Tutz G
(1994)
Multivariate StatisticalModelling
Based on Generalized LinearModels.
Springer Verlag,
BerlinFoulley JL,
GianolaD,
San CristobalM,
Im S(1990)
A method forassessing
extent andFoulley JL,
San CristobalM,
GianolaD,
Im S(1992)
Marginal
likelihood andBayesian
approaches
to theanalysis
ofheterogeneous
residual variances in mixed linear Gaussian models.Comput
Stat Data Anal13,
291-305Foulley
JL, H6bertD,
Quaas
RL(1994)
Inferences onhomogeneity
of betweenfamily
components of variance and covariance among environments in balanced cross-classified
designs.
Genet SelEvol 26,
117-136Foulley JL, Quaas
RL(1995)
Heterogeneous
variances in Gaussian linear mixed models. Genet SelEvol 27,
211-228Garrick
DJ,
PollakEJ, Quaas RL,
Van Vleck LD(1989)
Varianceheterogeneity
in direct and maternalweight
traitsby
sex and percentpurebred
for Simmental sired calves.J Anim Sci 67, 2515-2528
Gianola
D, Foulley JL,
FernandoRL,
HendersonCR, Weigel
KA(1992)
Estimation ofheterogeneous
variancesusing empirical Bayes
methods: theoretical considerations. JDairy
Sci75,
2805-2823Gilmour
AR, Thompson
R, Cullis BR(1995)
Average
information REML: an efficientalgorithm
for variance parameter estimation in linear mixed models. Biometrics 51, 1440-1450Henderson CR
(1984)
Applications of
Linear Models in AnimalBreeding. University
ofGuelph, Guelph, Ontario,
Canada.Lange
K(1995)
Agradient algorithm locally equivalent
to the EMalgorithm.
J R Statist Soc B57,
425-437Laird
NM, Lange N,
Stram D(1987)
Maximum likelihoodcomputations
withrepeated
measures:
application
to the EMalgorithm.
J Am Statist Assoc 82, 97-105Leonard T
(1975)
ABayesian approach
to the linear model withunequal
variances. Technometrics 17, 95-102Meuwissen
THE,
DeJong G, Engel
B(1996)
Joint estimation ofbreeding
values andheterogeneous
variances oflarge
data files. JDairy
Sci79,
310-316Misztal I, Gianola
D, Foulley
JL(1989)
Computing
aspects of a non linear method of sireevaluation for
categorical
data. JDairy
Sci 72, 1557-1568Meng XL.,
Rubin DB(1991)
Using
EM to obtainasymptotic
variance-covariance matrices: the SEMalgorithm.
J Am Stat Assoc86,
899-909Meng XL,
Rubin DB(1993)
Maximum likelihood estimation via the ECMalgorithm:
Ageneral
framework. Biometrika 80, 267-278Meyer K,
CarrickJ,
Donnelly
BJP(1993)
Genetic parameters forgrowth
traits of Australian beef cattle from a multibreed selectionexperiment.
J Anim Sci 71,2614-2622
Patterson
HD, Thompson
R(1971)
Recovery
of interblock information when block sizesare
unequal.
Biometrika 58, 545-554Quaas
RL(1992)
RL. REML Notebook, Mimeo,
CornellUniversity, Ithaca,
New York RobertC, Foulley JL, Ducrocq
V(1995a)
Inference onhomogeneity
of intra-classcorre-lations among environments
using
heteroskedastic models. Genet Sel Evol 27, 51-65Robert
C, Foulley JL, Ducrocq
V(1995b)
Estimation andtesting
of constantgenetic
and intra-class correlation coefficients among environments. Genet Sel Evol 27, 125-134San Cristobal
M, Foulley JL,
Manfredi E(1993)
Inference aboutmultiplicative
het-eroskedastic components of variance in a mixed linear Gaussian model with anap-plication
to beef cattlebreeding.
Genet Sel Evol 25, 3-30Searle
SR,
CasellaG,
McCulloch CE(1992)
VarianceComponents.
JWiley
andSons,
New-York
Visscher PM
(1992)
On the power of likelihood ratio tests fordetecting heterogeneity
of intra-class correlations and variances in balanced half-sib
designs.
JDairy
Sci75,
1320-1330Visscher
PM, Thompson R,
Hill WG(1991)
Estimation ofgenetic
and environmentalvariances for fat
yield
in individual herds and an investigation intoheterogeneity
ofvariance between herds. Livest Prod Sci 28, 273-290
Visscher
PM,
Hill WG(1992)
Heterogeneity
of variance anddairy
cattlebreeding.
Anim Prod 55, 321-329Walker S
(1996)
An EMalgorithm
for non linear random effects model. Biometrics 52,934-944
Weigel KA,
Gianola D(1993)
Acomputationally simple Bayesian
method for estimationof
heterogeneous
within-herdphenotypic
variances. JDairy
Sci 76, 1455-1465Weigel KA,
Gianola D, YandelBS,
Keown JF(1993)
Identification of factorscausing
heterogeneous
within-herd variance componentsusing
a structural model for variances.J
Dairy
Sci 76, 1466-1478Zangwill
W(1969)
Non LinearProgramming:
AUnified Approach.
Prentice Hall,Engle-wood Cliffs
APPENDIX A:
Algebra
for theestimating equations
TheQ
function to be maximized is(in
condensednotation)
Letting
!=<9(2Q)/<9 In
(y2 ei
Second derivatives: from
!A4a!,
one hasor,
alternatively
Finally,
the non-linearsystem
to solve reduces toDerivatives with
respect
to T(ratio)
Additional derivatives for the exact EM
From the second
expression
in[A7]
it followsimmediately
One has also to express
Now,
from[A7]
where
W r8
is a(I x 1)
definedby
The
system
for true EM(or
gradient
EM without inneriteractions)
is thenExtension to K
independent
random factorsQ
in[Al]
remainsformally unchanged
with ei in[A3]
being
nowBased on this
expression
of theresidual,
formulae[A4ab]
still hold.or,
alternatively
As far as Tis
concerned,
[A6]
becomesleading
to thefollowing
system
Derivatives with
respect
to 7!(parameters
of thelog-ratio)
Q
and the model forIn (
T
e
,
2
are the same as in[Al]
and[A2]
but the vector of residuals is definedas
o , I I
or, after some
algebra
Furthermore,
This can be
easily generalized
to severalindependent
random factorsif conditional maximization is
performed
factorby
factor. Thesystem
[AI8]
applies
to each factor k with
APPENDIX B: Formulae
[12]
forgrouped
dataIn some instances
(see,
eg, theexample
in tableI)
data can begrouped
so that then
2 observations within a stratum i share the same covariates
Substituting
X
i
by
itsexpression
in[Bl]
gives
where yzj is the