HAL Id: hal-00894076
https://hal.archives-ouvertes.fr/hal-00894076
Submitted on 1 Jan 1995
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Inference for threshold models with variance components
from the generalized linear mixed model perspective
B Engel, W Buist, A Visscher
To cite this version:
Original
article
Inference for threshold models
with
variance
components
from the
generalized
linear mixed model
perspective
B Engel
W
Buist
A Visscher
1
DLO
Agricultural
MathematicsGroup
(GLW-DLO),
PO Box100,
6700 ACWageningen;
2
DLO
Institutefor
Animal Science and Health(ID-DLO),
PO Bo!501,
3700 AMZeist,
The Netherlands(Received
10 December1993; accepted
5 October1994)
Summary - The
analysis
of threshold models with fixed and random effects and associated variance components is discussed from theperspective
ofgeneralized
linear mixed models(GLMMs).
Parameters are estimatedby
an interativeprocedure,
referred to as iteratedre-weighted
REML(IRREML).
Thisprocedure
is an extension of the iterativere-weighted
least squaresalgorithm
forgeneralized
linear models. Anadvantage
of thisapproach
is thatit
immediately
suggests how to extendordinary
mixed-modelmethodology
to GLMMs. This is illustrated forlambing difficulty
data. IRREML can beimplemented
with standardsoftware available for
ordinary
normal data mixed models. The connection with other estimationprocedures,
eg, the maximum apo8teriori
(MAP)
approach,
is discussed. Acomparison by
simulation with a relatedapproach
shows a distinct pattern of the bias of MAP and IRREML forheritability.
When the number of fixed effects isreduced,
while the total number of observations iskept
about the same, bias decreases from alarge positive
to a
large negative value, seemingly independently
of the sizes of the fixed effects.binomial data
/
thresholdmodel / variance
components/
generalized linear model/
restricted maximum likelihood
Résumé - Inférence sur les composantes de variance des modèles à seuil dans une perspective de modèle linéaire mixte généralisé.
L’analyse
des modèles à seuils aveceffets fixes
et aléatoires et des composantes de variancecorrespondantes
est iciplacée
dans la perspective des modèles linéaires mixtes
généralisés
(GLMMs!.
Lesparamètres
sont estimés par une
procédure itérative, appelée
maximum de vraisemblance restreintere-pondéré
obtenu par itération(IRREML).
Cetteprocédure
est une extension del’algorithme
itératif
des moindres carrésrepondérés
pour les modèles linéairesgénéralisés.
Elle al’avantage
desuggérer
immédiatement une manière d’étendre laméthodologie
habituelle du modèle mixte aux GLMMs. Uneapplication
à des données dedifficultés d’agnelage
estprésentée.
IRREML peut être mis en ceuvre avec leslogiciels
standarddisponibles
pour lesmodèles linéaires mixtes normaux habituels. Le lien avec d’autres
procédures d’estimation,
par simulation avec une méthode voisine montre un biais
caractéristique
du MAP et de l’IRREML pour l’héritabilité.Quand
le nombre deseffets fixés
estdiminué,
à nombreto-tal d’observations constant, le biais passe d’une valeur
fortement
positive à une valeurfortement négative,
apparemmentindépendantes
del’importance
deseffets fixés.
distribution binomiale
/
modèle à seuil/
composante de variance/
modèle linéairegénéralisé
/
maximum de vraisemblance restreinteINTRODUCTION
In his paper on sire evaluation
Thompson
(1979)
already pointed
out thepotential
interest for binomial data inmodifying
thegeneralized
linear model(GLM)
esti-mating equations
to allow for random effects. Heconjectured
that if modification isfeasible,
generalization
towards other distributions such as the Poisson or gammadistribution should be easy. The iterated
re-weighted
restricted maximumlikeli-hood
(IRREML)
procedure
(Schall,
1991;
Engel
andKeen,
1994)
forgeneralized
linear mixed models(GLMM)
proves to beexactly
such a modification. IRREMLis motivated
by
the fact that in GLMMs theadjusted dependent
variate in theiterated
re-weighted
least squares(IRLS)
algorithm
(McCullagh
andNelder,
1989,
§ 2.5)
approximately
follows anordinary
mixed-model structure withweights
for the residual errorsand,
in the absence of under- oroverdispersion,
residual error variance fixed at a constant value(typically 1).
IRREML isquite
flexible and notonly
covers avariety
ofunderlying
distributions for the threshold model but alsoeasily
extends to othertypes
of data such as countdata,
forexample,
litter size. This entailssimple
changes
in thealgorithm
withrespect
to link and variance func-tionemployed.
When the residual error variance for theadjusted dependent
variate is notfixed,
itrepresents
an additional under- oroverdispersion
parameter
which is a usefulfeature,
forexample,
under- oroverdispersed
Poisson counts. Calculationsin this paper are
performed
with REML(Patterson
andThompson,
1971)
facilities forordinary
mixed models in Genstat 5(1993).
Software for animal models such as DFREML(Meyer, 1989),
after somemodification,
can be used for IRREML as well.Methods for inference in
ordinary
normal data mixedmodels,
eg, the Wald test(Cox
andHinkley, 1974,
p323)
for fixedeffects,
are alsopotentially
useful forGLMMs,
as will be illustrated for thelambing difficulty
data. Simulation results for the Wald test in a GLMM for(overdispersed)
binomial data werepresented
inEngel
and Buist(1995).
For threshold models with normal
underlying
distributions and knowncompo-nents of
variance,
Gianola andFoulley
(1983)
observe that theirBayesian
maximum aposteriori
(MAP)
approach produces estimating
equations
for fixed and random effects such as thoseanticipated
by Thompson.
Undernormality
assumptions,
forfixed
components
ofvariance,
IRREML will be shown to beequivalent
to MAP. IRREML therefore offers analternative, non-Bayesian,
derivation of MAP. TheMAP
approach
was alsopresented
in Harville and Mee(1984),
including
estima-tion of variancecomponents.
Theirupdates
of thecomponents
of variance are akinto those of the estimation maximization
(EM)
algorithm
(Searle
etal,
1992,
§
8.3)
paper, is related to Fisher
scoring.
Bothalgorithms
solve the same finalestimating
equations,
but the latter isconsiderably
faster than the former.Gilmour et al
(1985)
presented
an iterativeprocedure
for threshold models withnormal
underlying
distributions,
which also uses anadjusted dependent
variate andresidual
weights.
Thisapproach,
which will be referred to asGAR,
is different fromMAP and IRREML. In the
terminology
ofZeger
et al(1988)
MAP and IRREML areclosely
related to thesubject-specific
nature of theGLMM,
while GAR is of apopulation-averaged
nature,
as will beexplained
in more detail in this paper.A number of
authors,
eg, Preisler(1988),
Im and Gianola(1988),
and Jansen(1992),
have discussed maximum likelihood estimation for threshold models.Apart
from the fact thatstraightforward
maximum likelihood estimation does not correctfor loss of
degrees
of freedom due to estimation of fixedeffects,
as REML does in the conventional mixedmodel,
it is alsohandicapped
by
the need forhigh-dimensional numerical
integration.
Maximum likelihood estimation for models with severalcomponents
ofvariance,
especially
with crossed randomeffects,
ispractically
impossible.
IRREML is more akin toquasi-likelihood
estimation(McCullagh
andNelder, 1989, chap 9;
McCullagh,
1991):
conditional upon the random effectsonly;
the
relationship
between the first 2 moments isemployed
while no full distributionalassumptions
are neededbeyond
existence of the first 4 moments.Since
practical
differences between various methodsproposed
pertain
mainly
totheir
subject-specific
orpopulation-averaged
nature,
we willgive
some attentionto a
comparison
between GAR and IRREML. Simulation studies werereported
in Gilmour et al(1985),
Breslow andClayton
(1993),
Hoeschele and Gianola(1989),
andEngel
and Buist(1995).
Conclusions from the Hoeschele and Gianolastudy
differ from conclusions from the other studies with
respect
to bias ofMAP/IRREML
and GAR. Since the Hoeschele and Gianolastudy
was rather modest insize,
it was decided torepeat
it here in moredetail,
ie under avariety
ofparameter
configurations
and forlarger
numbers of simulations.GLMMs and threshold models The GLMM model
Suppose
that random effects are collected in a random vector u, with zero meansand
dispersion
matrixG,
eg, for a sire model G =Ao, 8 2,
where A is the additiverelationship
matrix and0
&dquo;;
the
sirecomponent
of variance. Conditional upon u, eg,for
given
sires,
observations y are assumedindependent,
with variancesproportional
to known functions V of the means p:For
binary
data,
y = 1 may denote a difficult birthand
y = 0 a normal birth. The mean f.1, is theprobability
of a difficult birth foroffspring
of aparticular
sire. The conditional variance isVar(y!u)
=V(p)
=/l
(1-
p)
and 0 equals
1. Forproportions
Parameter 0
may be included to allow for under- oroverdispersion
relativeto binomial variation
(McCullagh
andNelder,
1989,
§
4.5).
Observe that[2]
isinappropriate
when n ispredominantly
small orlarge:
for n = 1 nooverdispersion
ispossible
and 0
shouldequal
1 and for n -! oo,[2]
vanishes to 0 whileextra-binomial variation should remain. More
complicated
variances(Williams,
1982)
may be obtainedby replacing 0
in[1]
by
{1
+(n - 1)!0{
orby
{1
+(n -
1)u5
f
.1,(1- f.1,n.
In both
expressions
ao
is a variancecorresponding
to a source ofoverdispersion
(for
a discussion ofunderdispersion
seeEngel
and TeBrake,
1993).
Limits for n -> oo of the variances areo,
2
t
L
(1 -
0
f
.1,)
ando, 0 2 /-t
2
(1_M)2
respectively.
Both can be accomodated in a GLMM for continuousproportions,
eg,motility
ofspermatozoa,
and are coveredby
IRREML.The mean f.1, is related to a linear
predictor
by
means of a known link function g:=
g(f.1,).
The linearpredictor
is a combination of fixed and random effects: ! =x
l
l3
+z’u,
where x and z aredesign
vectors for fixed and random effectscollected in vectors
13
and u,respectively.
Fordifficulty
ofbirth,
forinstance,
77may include main effects for
parity
of the dam and a covariable forbirthweight
as fixed effects and thegenetic
contribution of the sire as a random effect.Popular
link functions for
binary
or binomial data are thelogit
andprobit
link functions:logit(p)
=log(f.1,/(l-¡.¡,))
=
qand
probit (p) =
!-1(p.)
=17
, where
!-1
is the inverseof the cumulative
density
function(cdf)
of the standard normal distribution.The threshold model
Suppose
that r is the’liability’,
anunderlying
random variable such that y = 1 whenr exceeds a threshold value 0
and
y = 0 otherwise. Without loss ofgenerality
it may be assumed that 0 = 0. Let 77 be the mean of r, conditional upon u.Furthermore,
let the cdf of the residual e =(r — 7!),
say
F,
beindependent
of u. Thenwhere
F-
1
is the inverse of F. It follows that the threshold model is a GLMM with link functiong(
f
.1,) =
-F-
1
(1 -
f.1,),
whichsimplifies
tog(f.1,)
=F-
1
(f.1,)
when e issymmetrically
distributed. Residual e mayrepresent
variation due to Mendeliansampling
and environment. Probabilities p do notchange
when r ismultiplied
by
anarbitrary
positive
constant and the variance of e can be fixed at any convenientconstant
value,
sayor
2
.
When F is the cdf L of the standardlogistic distribution,
ie
F(e)
=L(e)
=1/(1
+exp(-e)), g
is thelogit
link anda
2
=!r2/3.
WhenF is the cdf 4) of the standard normal
distribution,
g will
be theprobit
link andQ
Z
= 1.Although
thelogistic
distribution hasrelatively
longer
tails than the normaldistribution,
to a closeapproximation
(Jonhson
andKotz, 1970,
p6):
where c =
(15/16)!/!.
Results ofanalyses
with aprobit
orlogit
link areusually
virtually equivalent,
apart
from thescaling
factor c for the effects andC
2
for thecomponents
of variance.Heritability
may be defined on theliability
scale,
eg, for a sire model:h
2
=4a;/(a;+a2).
As afunction
of0
,2/or
2
heritability
does notdepend
Conditional and
marginal
effectsIn a
GLMM,
effects are introduced in the link-transformed conditional means, ie inthe linear
predictor q
=g(p).
Consequently,
effects refer tosubjects
or individuals.The GLMM and the threshold model are both
subject-specific models,
using
theterminology
ofZeger
et al(1988).
This is in contrast with apopulation-averaged
model where effects are introduced in the link-transformed
marginal
meansg(E(f.1,))
and refer to the
population
as a whole. In animalbreeding,
where sources of variation have a directphysical
interpretation
and are ofprimary
interest,
asubject-specific
model,
whichexplicitly
introduces these sources of variationthrough
randomeffects,
seems the natural choice. For fixed effectshowever, presentation
in terms of averages over thepopulation
is often moreappropriate.
In the thresholdmodel,
there is no information in the data about thephenotypic
variance of theliability,
allowing
QZ
to be fixed at anarbitrary
value.Intuitively
onewould expect
theexpressions
formarginal
effects to involve some form ofscaling by
theunderlying
phenotypic
standard deviation. Fornormally
distributed randomeffects
andprobit
link this is indeed so. From r -N(x’j3,
z’Gz +1)
themarginal probability,
say p,follows
directly:
Hence,
theprobit
link also holds formarginal probabilities,
but the effects are shrunkenby
a factorAp
=(z’Gz
+1)-°!5.
For a sire modelAp
=(u
£
+
1) -
0
.
5
.
Thatthe same link
applies
for both conditional means f.1, andmarginal
means p is ratherexceptional.
For thelogit link,
the exactintegral
expression
for p cannot be reducedto any
simple
form(Aitchison
andShen,
1980).
However,
from[3]
it follows that thelogit
link holdsapproximately for p,
withshrinkage
factorA
L
=((z’Gz/c
2
) + 1 )-
0
.
5
.
Without full distributional
assumptions,
forrelatively
smallcomponents
ofvariance,
marginal
moments may also be obtainedby
aTaylor
seriesexpansion
(see
Engel
and
Keen,
1994).
Binary
observations y
i
and y
j
corresponding
to,
forinstance,
the same sire will be correlated. For theprobit
link the covariance follows from:Here
V2
(a,
b;
p)
is the cdf of the bivariate normal distribution with zero means, unit variances and correlation coefficient p, p2! is the correlation on theunderlying
scale,
eg, in asimple
sire model Pij =or 8 2/(
0
,2
+
1).
For thelogit
link,
using
!3!,
Ap
should be
replaced by
!L/c,
while the value of thecorrelation,
expressed
in termsof the
components
of variance in thelogit
model,
is about the same. The doubleintegral
in!2
mayeffectively
be reduced to asingle integral
(Sowden
andAshford,
1969),
which can be evaluatedby
Gaussquadrature
(Abramowitz
andStegun,
1965,
and
Stegun,
1965, 26.3.29,
p940)
may be used:where
T
(
t
)
is the tth derivative of theprobability
density
function(pdf)
T of thestandard normal distribution. For a sire
model,
undernormality
assumptions,
the first-orderapproximation
appears to besatisfactory,
except
for extreme incidencerates p
(Gilmour
etal,
1985).
By grouping
of nbinary
observationspertaining
tothe same fixed and random
effect,
moments for binomialproportions
y immediately
follow from!4),
eg:where p is the intra-class correlation on the
liability
scale.Expression
[6]
can besimplified by
using
!5!.
Results for thelogit
link follow from!3!.
Estimation of
parameters
Thealgorithm
for IRREMLThe
algorithm
will be describedbriefly.
For details seeEngel
and Keen(1994)
andEngel
and Buist(1993a).
Suppose
that[3
o
and uo arestarting
values obtained from anordinary
GLM fitwith,
forexample,
random effects treated as ifthey
were fixed or with random effectsignored,
ie uo = 0. After the initial GLM has been fittedby
IRLS,
theadjusted dependent variate
and
iterativeweights
w(McCullagh
andNelder, 1989,
§
2.5)
are saved:where
g’
is the derivative of the link function withrespect
to f.1&dquo; eg, for theprobit
link: w =
/{f.1,
0
(1 -
2
)
o
nr(ry
f
.1,0)},
4
approximately
follows anordinary
mixed-modelstructure with
weights
w for the residual errors and residual variance0.
Now a minimum normquadratic
unbiased estimation(MINQUE) (Rao,
1973,
§
4j)
isapplied
tol&dquo;
employing
the Fisherscoring
algorithm
for REML(1
step
of thisalgorithm corresponds
toMINQUE).
From themixed-model
equations
(MMEs)
(Henderson,
1963;
Searle etal, 1992,
§
7.6)
newvalues [3
and6
for the fixed andrandom effects are solved:
Here,
X and Z are thedesign
matrices for the fixed and random effectsdistributional
assumptions
beyond
the existence of the first 4 moments and may bepresented
as aweighted least-squares
method(Searle
etal, 1992,
Ch12).
Some
properties
of IRREMLWhen the MMEs are
expressed
in terms of theoriginal
observations y, it isreadily
shown that at convergence thefollowing
equations
are solved:where *
denotes a direct elementwise
(Hadamard)
product.
Theseequations
are similar to the GLMequations
for fixed u(with
appropriate
sideconditions)
except
for theterm 0
G-
1
u
on theright-hand
side.Equations
[8]
may also be obtainedby
setting
first derivatives withrespect
to elements ofj3
and u of D +u’G-
l
u
equal
to zero, where D is the(quasi)
deviance(see
McCullagh
andNelder,
1989,
§
2.3 and§
9.2.2.)
conditional upon u. Theassumption
of randomness for uimposes
a’penalty’
on values which are ’too far’ from 0. When thenormally
distributed random effects u is in the GLMexponential
family,
eg, a binomial or Poissondistribution,
maximization ofD
+
u’G-
l
u
iseasily
shown to beequivalent
to maximization of thejoint
Suppose
that we have a sire modelwith
q sires
and sire variancecomponent
or
2
The IRREMLestimating equations
foro,2
s and 0
(see,
forexample, Engel,
1990)
are:
Here
Z
o
=I,
Z
I
is thedesign
matrix for thesires, A
o
=W-
1
,
A
l
=A,
P =
SZ-
1
_
[21 X(X’[2-1 X)-l X’[2-
I
and S2 =ZGZ’
+cpW-
1
.
The difference withordinary
REMLequations
isthat
depends
on theparameter
values as well. TheMINQUE/Fisher
scoring
update
of IRREML can be recovered from!9!,
by
using
P = P[2P8
AZ’P
+cpPW-
1p
on the left-hand side:where
0’5
= ø
andat
=as.
When 0
is fixed at value1,
theequation
fork’
= 0 isdropped
from(10!.
Alternativeupdates
related to the EMalgorithm
may also be obtained from[9]
(see,
forexample,
Engel,
1990),
and will be of interest when other estimationprocedures
are discussed:Here
T / ø
is thepart
of the inverse of the MME coefficient matrixcorresponding
to u.
With
quasi-likelihood
(QL)
forindependent data,
it issuggested
(McCullagh
andNelder, 1989,
§
4.5 andchap
9)
that one can estimate0
from Pearson’s(generalized)
chi-square
statistic. From[9]
it may be shownthat !
=variances and d = N -
rank(X) -
{q -
trace (A -
I
T/â;)}
is an associated ’numberof
degrees
of freedom’.Application
to birth difficulties insheep
The data are
part
of astudy
into the scope for a Texelsheep breeding
programin the Netherlands
employing
artificial insemination.Lambing difficulty
will beanalyzed
as abinary
variable: 0 for a normal birth and 1 for a difficult birth. There are 43herd-year-season
(HYS)
effects. Herds are nested withinregions
andregions
are nested within years. There are 2 years, 3regions
per year, about 4 herds perregion,
and 3 seasons. The 33 sires are nested withinregions.
The 433 dams arenested within herds with about 20 dams per herd. Observations are available from
674
offspring
of the sires and dams.Variability
on theliability
scale maydepend
on litter size.Therefore,
observationscorresponding
to a litter size of 1 and litter sizes of 2 or more areanalyzed separately.
Corresponding
data sets are referred to as the S-set(single;
191observations)
and
M-set
(multiple;
483observations).
The M-set isreproduced
inEngel
and Buist(1993)
and is available from the authors. Some summary statistics are shown intable I.
Table II shows some results for
components
ofvariance,
for models fitted to theS- and M-sets. Dam effects are absorbed. To stabilize convergence, the occurrence of extreme
weights
wasprevented
by
limiting
fitted values on theprobit
scale to therange
[-3.5, 3.5!.
In addition to fixed HYSeffects,
factors for age andparity
of the dam(P),
sex of the lamb(S),
and for the M-set a covariate for litter size(L
= littersize -
2)
and included. Levels for factor P consist of thefollowing
6 combinationsof age and
parity:
(1;1), (2;1), (2;2),
(3; !
2),
(4; ! 3)
and(>
5; !
4).
In models3,
4 and 5 a factor D for
pelvic
dimension of the dam(’wide’,
’normal’ or’narrow’),
and in models 4 and 5 a covariate W =birthweight -
average
birthweight
of thelamb is also
included,
withseparate
averages of 4.27 and 3.63 for the S- and M-setsrespectively.
Fixed effects may be screened
by applying
the Wald test to the values ofStandard errors are in
parentheses. S,
P and D denote maineffects,
W and L arecovariables,
S x L and D x W are interactions. When an estimate isnegative
(-),
thecomponent is assumed to be
negligible
and set to 0.table III. In all cases, test statistics are calculated for the values of the variance
components
obtained for thecorresponding
fullmodel,
ie model 1-5.Variability
due
to estimation of the variance
components
isignored.
For each line in the table thecorresponding
test statistic accounts for effects above thatline,
butignores
effects below the line.Referring
to achi-square
distribution,
in model 1 seasonal effects seem to beunimportant
and are excluded from thesubsequently
fitted models.In model 3 for the
M-set,
thefollowing
contrasts forpelvic
opening
(D)
are found: 0.520(0.231),
2.019(0.547)
and 1.499(0.531),
for ’normal’ versus’wide’,
’narrow’ versus ’wide’ and ’narrow’ versus’normal’,
respectively.
Pairwisecomparison,
with a normalapproximation,
shows that any 2 levels aresignificantly
(P
<0.05)
different.The effects refer to the
probits
of the conditionalprobabilities.
For theprobits
ofthe
marginal
probabilities,
effects have to bemultiplied
by
(1+(J&dquo;!+(J&dquo;5)-0.5 =
0.769(from
tableII).
The difference between ’narrow’ and’wide’,
forexample,
becomes 1.55(0.43).
In model 5 for theM-set,
separate
coefficients forbirthweight
are fitted for the 3 levels ofpelvic
opening.
The estimated coefficient forbirthweight
for a dam with a narrowpelvic
opening
is 0.72(0.61);
this is about 0.47(0.63)
higher
than the estimates for the other 2levels,
which are about the same.Although
alarger
coefficient is to beexpected
for a narrowpelvic opening,
the difference found is far fromsignificant.
Fitting
a commoncoefficient,
iedropping
the interaction D x W betweenpelvic opening
andbirthweight
in model4, gives
an estimatedcoefficient for
birthweight
of 0.28(0.15),
which becomes 0.21(0.12)
aftershrinkage.
By
comparison,
the coefficient for theS-set, after
shrinkage,
is 0.92(0.30).
The reduced effect ofbirthweight
for the M-set agrees with thenegligible
values foundRelation to other methods
We will
mainly
concentrate on differences between GAR and IRREML.GAR is based on
QL
for themarginal
moments with aprobit
link andnor-mally
distributed random effects(see
alsoFoulley
etal,
1990).
QL-estimating
equations
fordependent
data are(McCullagh
andNelder, 1989,
§
9.3):
D’Var(y)-1
(y - p)
=0,
where the matrix of derivatives D =(d
ij
),
dij =
((
3
pt/<9/3,t
j
),
follows from p= <I>(X(3
*
),
and(3
*
=where R =
diag({p(l - p) -
pr
2
(ÀpXj3)}/n),
B =diag(r(ÀpXj3)),
C =A2G
andp
represents
theappropriate
intraclass correlation.Var(y)
will bereplaced by
[12],
ignoring
terms of orderp
2
andhigher.
TheQL
equations
may be solvedby
iteratedgeneralized
least squares on anadjusted dependent variate,
say(
GAR
:
where
B
is B evaluated atj3,.
By
comparison,
theadjusted dependent
variate from[7]
is:The latter variate relates to a first-order
approximation
of the conditional means p, while the former relates to afirst-order
approximation
of
themarginal
means p.Approximately
Var((
GAR
)
=1
Var(y)B-
l
B-
=B-
I
RB-
1
+ZCZ’ =
w
G1
R +ZCZ’,
and Gilmour et al(1985)
solve MMEs in terms of!GAR
andWGAR!
For
example,
in a sire modelwith
q sires,
predictions
from these MMEs forApu,
say
Û
GAR
,
are used toupdate
the intraclass correlation p.Analogous
to!11!
with<!=1:
Both GAR and IRREML are based on an
approximate
mixed-model structure for anadjusted dependent
variate. Note however that inGAR,
in contrast toIRREML,
predictions
of the random effects areonly
used toupdate
thecomponents
of variance and do not enterW
GAR
and(
GAR
directly.
At the end of this section we shall seethat this may be an
advantage
for GAR over IRREML in situations where there islittle information in the data about individual random
effects,
eg, in a sire schemewith small
family
sizes or extreme incidence. Themarginal quasi-likelihood
method(MQL)
presented
in Breslow andClayton
(1993)
can beregarded
as ageneralization
of GAR.
However,
in its moregeneral
setting,
MQL
does not have the benefit of some of the exact results for binomial data andunderlying
normal distributionsused to derive GAR.
In
MAP,
assuming
a vagueprior
for(3
and a normalprior
for u, theposterior
mean for(0’,
u’)’
isapproximated by
theposterior
mode.Hence,
(13 , û’)’
maximizethe
joint pdf
of y and u. Undernormality
this isequivalent
to maximization of apenalized
deviance.Hence,
theestimating equations
for13
and u for MAP andIRREML are the same and
given
by
[8].
As shown inFoulley
et al(1987)
an estimatorfor,
forinstance,
a sirecomponent
QS
,
may be solved from:Eo[
8/8
u
£
slog (f (u
I
or 2))]
= 0.Here,
f(.[.)
denotes a(conditional)
indicated,
andexpectation E
o
is withrespect
tof (u!y,
as).
The latterapproximated
by
a normaldensity
with mean u anddispersion
T. In an iterative scheme this leads to theEM-type update
!11!.
Consequently,
MAP and IRREML also solve the same finalestimating equations
withrespect
to thecomponents
ofvariance,
although
thealgorithms
used are different.The
penalized quasi-likelihood
method(PQL)
presented
in Breslow andClayton
(1993)
is based onLaplace
integration.
Random effects are assumed to benormally
1987)
can beused,
seeEngel
and Buist(1993).
In theresulting
integral expression,
thelogarithm
of theintegrand
in terms of u isapproximated by
aquadratic
function around itsoptimum
inu,
employing
expected
second-order derivatives. Now random effects areeasily integrated
out. Some furtherapproximations,
eg,approximating
conditional deviance residualsby
Pearsonresiduals,
result in anormal
log
likelihood for theadjusted dependent variate
( from
[7].
An REMLtype
adjustment
for loss ofdegrees
of freedom due to estimation of fixed effectsfinally yields
an REMLlog likelihood,
which is maximizedby
Fisherscoring. Hence,
although
motivatedquite
differently,
PQL
isequivalent
to IRREML(for
details seeEngel
andBuist,
1993).
A
comparison
between GAR andMAP/P(!L/IRREML
by
simulation waspre-sented in
Gilmour
et al(1985).
They
consider asimple
one-waymodel,
eg, a sire model with q unrelatedsires,
with noffspring
persire,
and with abinary
observation peroffspring.
An overall mean is theonly
fixed effect. Gilmour etal
(1985)
observe that there is atendency
forMAP/P(!L/IRREML
to under-estimateh
2
for
smallfamily
size n or extreme incidence p. For extremeinci-dence p, GAR tends to overestimate
h’.
As also notedby Thompson
(1990),
forthis
simple
set-up,
closed-formexpressions
can be derived from[8]
and!11!,
viz,
P
GAR
={MS
B
-
y
(1 -
y
)}/{(n -
1)T(!-1(y)z}.
HereMS
B
is the mean sum ofsquares between sires for the
binary
observationsand y
is the overall mean. Theadjusted dependent
variable andweights
are:(
GAR
=!-1(y)
+(y -
y)/T(!’-1(y))
and
W
GAR
=T(!-1(y))z/{!(1 -
y) -
/&dquo;’(!!07))!}-
Aslong
as the first-order
ap-proximation
from[5]
holds, and
q is
not toosmall,
P
GAR
will benearly
unbiased.Actually,
for thissimple
scheme,
GAR may be shown to beequivalent
to Williams’ method of moments forestimating
overdispersion
(Williams,
1982).
Starting
fromu
o = 0 and
?7o =
!-1 (y),
theadjusted dependent
for IRREML is the same as forGAR,
but theweights
w =T(!-1(y))z/{y(1 -
V)l
are smaller!Consequently,
the first estimatea
2
={MS
B
- y(l - y)}/{nr(<p-
l
(y))
2
}
underestimatesU
s
approx-imately by
a factor(n -
1)/n.
Simulation results in Gilmour et al(1985)
suggest
that with further iteration underestimationby
this factorpersists.
For extremeincidence,
say close to1,
IRREMLweights
areapproximately T[r(i¡) ,
as follows from Abramowitz andStegun
(1965,
26.2.12,
p932).
Most of the information is inthe
negative-valued
randomeffects,
andover-shrinkage
of random effects willyield
smallerweights
in the nextiteration,
ie ‘residual variances’ will be toohigh.
There-fore,
underestimationby
IRREML may beexpected
to be more serious for extremeincidence,
even when the first-orderapproximation
from[5]
still holds. Lack ofin-formation about individual random
effects,
because of smallfamily
size or extremeincidence,
seems a seriousproblem
withMAP/P(aL/IRREML.
The smallerweights
in IRREML are a consequence of ’residual variances’being
derived frompredicted
values of conditional variances. Alternativeweights
wo =E(g’(f.1,)
2
V(f.1,))-
1
forIR-REML are
suggested
inEngel
and Keen(1994).
For thelogit
link evaluation of alter-nativeweights
isstraightforward:
wo= {2
+2exp(
QZ/
2)
cosh(x’I3)}-
B
but for theprobit
link it isproblematic.
It issuggested
that we use wo ={g’ (ji)2 E(V (f.1,))) -1
instead. Note that in the first
step
for thesimple
sire scheme thisweight equals
WGAR
effects.
MAP/PQL/IRREML
has smaller bias and smallermean-squared
error thanGAR. For both methods an
upward
bias in the order of20%
(based
on 45simula-tions)
of the trueh
2
= 0.25 is observed. Anupward
bias is alsoapparent
for some of the models simulated in Hoeschele et al(1987).
Simulation results foroverdis-persed
binomial data inEngel
and Buist(1995)
only
indicate a seriousupward
bias for variancecomponents
estimatedby
IRREML when the truecomponent
value is small and thenon-negativity
constraints are active.Simulation results
Data are
generated
as described in Hoeschele and Gianola(1989) (referred
to asHG).
The 135 HYS effects aregenerated
from aN(0,
U2
H y
s
)
distribution. For each HYSclass,
a sire has0,
1 or 2offspring
withprobabilities
1 - pHyS,PHYS/
2
and
p
HY
s/2
respectively.
The HYS effects anddesign
for sires andoffspring
aregenerated only
once, after which the same HYS values and the samedesign
are used in allsubsequent
simulations. HYS effects are included in the model as fixed effects. Other fixed effects are sire group effects: 4 groups with effects-0.40,
-0.15(the
value -0.10 in HG was assumed to be atyping
error),
0.15 and 0.40. The numbers of sires in the groups are12, 14,
13 and 11. The 50independent
sireeffects are
generated
from anN(0,
<7
g)
distribution,
whereor
2
=h
2
/(4 - h
2
).
The residuals for theliability
are from anN(0, 1)
distribution. The overall constant on theprobit
scale is-!-1(po)
(1
+Q
H
YS
+a!)0.5,
where po determines the overallincidence.
Suppose
thataH
YS
corresponds
to aproportion
HYS
/
of 1 +o
h
ys 2
+e
r
g,
then
QHYS
=(4f
HYS/
(1 -
,I
HYS
))/(4 -
).
h
2
InHG, h
2
=0.25,
f
HYS
=
0.30,
po = 0.9 and
PHYS = 0.2. The
expected
total number of records is2 025,
withabout 40
offspring
per sire. This isconfiguration
10,
which ispresented
with some of the otherconfigurations
ofparameter
values studied in table IV.For each
configuration,
either 1 or 2 series of 200 simulations areperformed.
In aseries,
HYSeffects,
sire-offspring configuration
and data are allgenerated
from a sequence of random numbers from the same seed. The first seriescorresponds
tothe same seed for all
configurations.
Therefore,
for the firstseries,
forexample
forconfigurations
3 and8,
thedesign
is the same. Seeds of the second series are all different. In table IV bias(%bias)
and root mean square error(8
l
4Q)
are bothexpressed
as apercentage
of the true value ofh
2
.
In contrast with the resultsin Gilmour et al
(1985)
and inagreement
with Hoeschele and Gianola(1989),
with oneexception,
MAP/PGZL/IRREML
shows apositive
bias. Theexception
isconfiguration
13 where HYS effects are notincluded,
neither in thegeneration
of the data nor in the model fitted. Thenegative
value for thisconfiguration
indicatesthat, although
the bias seemsfairly independent
of the size of the fixedeffects,
itmay
depend
on the(relative)
number of fixed effects in the model.Table V shows that this is indeed so.
Starting
from theoriginal
HGscheme,
using
new seeds forgenerating
thedata,
the number of fixed effects is reducedby
factors1/3, 1/2, 2/3
and3/4.
A reduction of1/3,
forinstance,
is effectedby
combination of HYS effects(2, 3),
(5,
6), (8,
9)
and soforth,
replacing
theoriginal
values forStandard errors in
parentheses;
h
2
=
heritability;
Po = incidence rate; PHYS =probability
for a sire for progeny in a HYS
class; f
HYS
=
proportion
of varianceexplained by
theThe
negative
bias agrees with the calculations in thepreceding
section for thesimple
sire model and with the results in Gilmour et al(1985),
where an overallconstant was the
only
fixed effect in the model. Data forconfigurations
10 and 16 was alsogenerated
with random HYSeffects,
with a new set of HYS effects for eachsimulation,
andanalysed
with HYS as a random factor in the model. Estimated bias was -35.8(2.7)%
forconfiguration
10 and -11.8(1.7)%
forconfiguration
16.Corresponding
root mean square errors were 52.0 and27.3%,
respectively.
Bias and root mean square error are reduced when the progeny size isincreased,
see,for
example, configurations
18,
21 and 22. Estimators are also less biased when incidence is lessextreme,
eg,configurations
3 versus 8 and 1 versus 7. Differencesbetween series within
configurations
are sometimesgreater
than would beexpected
on the basis of the standard errorsinvolved, showing
that theconfiguration
of siresand
offspring
is also ofimportance.
Bias and root mean square error for estimated
heritability
areexpressed
as a percentageof the true value of
h
2
.
Standard errors inparentheses.
GAR was
programmed,
similar toMAP/PQL/IRREML,
in terms of Fisherscoring
for theupdates
of the sire variance. Estimates are the same as for theoriginal
methodproposed,
which uses EMsteps,
butspeed
of convergence is often increasedby
at least a factor 10. Someresults, using
the same seeds asfor
MAP/P(aL/IRREML,
arepresented
in tables IV and V. Forhigh
incidence(po
=0.9)
results ofMAP/PQL/IRREML
and GAR arecomparable.
Formod-erate incidence
(p
o
=0.6),
the bias for GAR isclearly
less than forMAP/P(aL/
mean
square error isconsiderably larger
than thecorresponding
bias. Plots of the es-timatedh
2
values ofMAP/P(aL/IRREML
against
GARsuggest
astrong,
fairly
lin-earrelationship.
Correlations between sirepredictions
fromMAP/P(aL/IRREML
and GAR arehigh,
eg, over 0.99 for the first series ofconfigurations
10 and 16.Table V also includes results obtained for IRREML with the alternative
weights
Wo =
T
(
1]?/
{p(l- p) -
pT(Àp1])2}
from theprevious
section. There is a distinctchange
in the bias due to thechange
ofweights.
However,
although
bias isconsiderably
improved
for thetop
lines in thetable,
underestimation for the bottomlines has become more serious.
In table
VI, approximate
standard errors ofheritability
estimates fromMAP/
/
PQL/IRREML
arecompared
with standard errors estimated from the series of 200 simulations(referred
to asempirical
values).
Theapproximate
standard errors, based on Fisher informationassuming
normality
for theadjusted dependent
variate,
perform quite well,
as was also observed inEngel
and Buist(1995).
These standard errors are standardoutput
in Genstat 5.Mean,
10 and 90% percentagepoints
of theapproximate
standard errors arepresented
as percentages of theempirical
value.The
approximately
linearrelationship
between IRREML and GAR estimates within aconfiguration
ofparameter
values,
and the similar standard errors,suggest
that it should bepossible
to correct for bias inIRREML,
at the least in those caseswhere
GARperforms
well. For incidence 0.90 and the full set of 135 HYSeffects,
shrinkage
towards theorigin.
Possibly,
the results of Cordeiro andMcCullagh
(1991),
involving
an extra iterationstep
foradjusted
responsevariables,
may
beextended to minimization of the
penalized
deviance,
thusimproving
estimates /3
andpredictions
u. This wouldimply
modification of theadjusted dependent
variate(.
Users ofMAP/P(aL/IRREML
should be aware of theproblems
involved whenfamily
sizes aresmall,
or in the context of an animalmodel,
when many animals areonly weakly
related. With extremeincidence,
say over0.90,
and a sizeable number of HYSeffects,
say in the order of6%
of the number ofbinary
observations,
MAP,
PQL,
IRREML and GAR are liable toseriously
overestimateheritability,
and actual selection response may beconsiderably
less thanexpected
on the basis of model calculations.ACKNOWLEDGMENTS
The assistence of A Keen with the
analysis
of thelambing difficulty
data and the useof his Genstat
procedure
IRREML isgreatly appreciated.
We also thank J deBree,
PW Goedhart and an anonymous reviewer forhelpful
comments on earlier versions of themanuscript.
REFERENCES
Aitchison
J,
Shen SM(1980)
Logistic-normal
distributions: someproperties
and uses.Biometrika
67,
261-272Abramowitz
M, Stegun
I(1965)
Handbookof
Mathematical Functions.Dover, NY,
USA BreslowNE,
Clayton
DG(1993)
Approximate
inference ingeneralized
linear mixedmodels. J Am Stat Assoc 88, 9-25
Cordeiro
GM, McCullagh
P(1991)
Bias correction ingeneralized
linear models. J R Stat Soc B53,
629-643Cox
DR, Hinkley
DV(1974)
Theoretical Statistics.Chapman
andHall, London,
UKEngel
B(1990)
Theanalysis
of unbalanced linear models with variance components.Statistica Neerlandica
44,
195-219Engel B,
Te Brake J(1993)
Analysis
ofembryonic development
with a model for under-oroverdispersion
relative to binomial variation. Biometrics49,
269-279Engel B,
Buist W(1993)
IteratedRe-weighted
REMLApplied
to Threshold Models withComponents of
Variancefor Binary
Data.Agricultural
MathematicsGroup.
Researchreport
LWA-93-16,
available from first authorEngel B,
Buist W(1995)
Analysis
of ageneralized
linear mixed model: a casestudy
andsimulation results. Biometr J
(in press)
Engel B,
Keen A(1994)
Asimple approach
for theanalysis
ofgeneralized
linear mixed models. Statistica Neerlandica48,
1-22Foulley JL,
ImS,
GianolaD,
Hoeschele I(1987)
Empirical
estimation of parameters forn
polygenic binary
traits. Genet Sel Evol 19, 197-224Foulley JL,
GianolaD,
Im S(1990)
Genetic evaluation for discretepolygenic
traitsin animal
breeding.
In: Advances in Statistical Methodsfor
GeneticImprovement of
Livestock(D
Gianola,
KHammond,
eds).
Springer Verlag, Berlin, Germany,
361-409 Genstat 5 committee(1993)
Censtat 5 Release 3Reference
Manual.(RW
Payne
Gianola
D, Foulley
JL(1983)
Sire evaluation for orderedcategorical
data with a threshold model. Genet SelEvol 15,
201-223Gilmour
AR,
AndersonRD,
Rae AL(1985)
Theanalysis
of binomial databy
ageneralized
linear mixed model. Biometrika
72,
593-599Harville
D,
Mee RW(1984)
A mixed-modelprocedure
foranalyzing
orderedcategorical
data. Biometrics40,
393-408Henderson CR
(1963)
Selection index andexpected genetic
advance. NAS-NRC 982 HoescheleI,
Gianola D(1989)
Bayesian
versus maximumquasi-likelihood
methods for sireevaluation with
categorical
data. JDairy
Sci72,
1569-1577Hoeschele
I,
GianolaD, Foulley
JL(1987)
Estimation of variance components with quasi-continuous datausing Bayesian
methods. J An Breed Genet104,
334-349Im
S,
Gianola D(1988)
Mixed models for binomial data with anapplication
to lambmortality Appl
Stat2,
196-204Jansen J
(1992)
Statisticalanalysis
of threshold data fromexperiments
with nested errors.Comp
Stat DataA!,al 13,
319-330Johnson
NL,
Kotz S(1970)
Continuous univariate distributions. -2.Houghton Miffiin,
Boston,
USAMcCullagh
P(1991)
Quasi-likelihood
andestimating
functions. In: StatisticalTheory
andModelling
(DV
Hinkley,
NReid,
EJSnell,
eds)
Chapman
andHall, London, UK,
265-286
McCullagh P,
Nelder JA(1989)
Generalized Linear Models.Chapman
andHall, London,
UK,
2nd ednMeyer
K(1989)
Restricted maximum likelihood to estimate variance components for animal models with several random effectsusing
a derivative-freealgorithm.
GenetSel Evol
31,
317-340Nelder
J, Pregibon
D(1987)
An extendedquasi-likelihood
function. Biometrika74,
221-232
Patterson
HD,
Thompson
R(1971)
Recovery
of inter-block information when block sizesare
unequal.
Biometrika58,
545-554Pearson K
(1901)
Mathematical contributions to thetheory
of evolution. VII. On the correlation of characters notquantitatively
measureable. Phil Trans R Soc(Lond)
A195,
1-47Preisler HK
(1988)
Maximum likelihood estimates forbinary
data with random effects. Biometr J3,
339-350Rao CR
(1973)
Linear StatisticalInference
and itsApplications.
JohnWiley
&Sons,
NewYork, USA,
2nd ednSchall R
(1991)
Estimation ingeneralized
linear models with random effects. Bio!n,etrika78,
719-728Searle
SR,
CasellaG,
McCulloch CE(1992)
VarianceComponents.
JohnWiley
& Sons. NewYork,
USASowden
RR,
Ashford JR(1969)
Computation
of the bi-variate normalintegral. Appl
Stat18, 169-180
Thompson
R(1979)
Sire evaluation. Biometrics35,
339-353Thompson
R(1990)
Generalized linear models andapplications
to animalbreeding.
In: Advarcces in Statistical Methodsfor
GeneticImprovement of
Livestock(D
Gianola,
K
Hammond,
eds)
Springer Verlag, Berlin, Germany,
312-328Williams DA