HAL Id: hal-00894151
https://hal.archives-ouvertes.fr/hal-00894151
Submitted on 1 Jan 1996
A Bayesian analysis of mixed survival models
V Ducrocq, G Casella
Original article
V Ducrocq 1, G Casella 2
1 Department of Animal Science, Cornell University;
2 Biometrics Unit, Cornell University, Ithaca, NY 14852, USA
(Received 13 August 1996; accepted 1 October 1996)
Summary - In proportional hazards models, the hazard of an animal $\lambda(t)$, ie, its probability of dying or being culled at time $t$ given it is alive prior to $t$, is described as $\lambda(t) = \lambda_0(t)\, e^{\mathbf{w}'\boldsymbol\theta}$, where $\lambda_0(t)$ is a 'baseline' hazard function and $e^{\mathbf{w}'\boldsymbol\theta}$ represents the effect of covariates $\mathbf{w}$ on culling rate. A distribution can be attached to elements $s_q$ in $\boldsymbol\theta$, identifying, for example, genetic effects and leading to mixed survival models, also called 'frailty' models. To estimate the parameters $\tau$ of the distribution of frailty terms, a Bayesian analysis is proposed. Inferences are drawn from the marginal posterior density $\pi(\tau)$ which can be derived from the joint posterior density via Laplacian integration, a powerful technique related to saddlepoint approximations. The validity of this technique is shown here on simulated examples by comparing the resulting approximate $\pi(\tau)$ to the one obtained by algebraic integration. This exact calculation is feasible in very specific cases only, whereas the saddlepoint approximation can be applied to situations where $\lambda_0(t)$ is arbitrary (Cox models) or parametric (eg, Weibull), where the frailty terms are correlated through a known relationship matrix, or in more general models with stratification and/or time-dependent covariates. The influence of the censoring rate and the data structure is also illustrated.
survival analysis / mixed model / variance component estimation / Bayesian analysis / proportional hazards model

* Correspondence and reprints: Station de génétique quantitative et appliquée, Institut national de la recherche agronomique, 78352 Jouy-en-Josas cedex, France.
1 On leave from the Station de génétique quantitative et appliquée, Institut national de la recherche agronomique.

INTRODUCTION
Traits associated with longer productive life of livestock are receiving increasing attention in the animal breeding field: it is recognized that decreasing culling due to involuntary causes (eg, related to disease, infertility, lameness, etc) by genetic or non-genetic means has a positive effect on economic performance, mainly through decreased replacement costs (van Arendonk, 1986; Strandberg, 1991, 1995; Strandberg and Sölkner, 1996).
Huge field data sets are usually available for comprehensive analyses of productive life, for example, as a by-product of the dairy recording schemes in dairy cattle. The obvious methodology of choice for such studies is survival analysis, in which proper techniques to deal with the unavoidable presence of censored data have been developed. However, statistical complexity and computational difficulties related to these methods have delayed the adoption of state-of-the-art methodology and different indirect approaches have been proposed (see Strandberg and Sölkner (1996) for a review). Some large-scale applications (Smith, 1983; Smith and Quaas, 1984; Ducrocq, 1987; Ducrocq et al, 1988a, b; Ruiz, 1991; Fournet, 1992; Egger-Danner, 1993; Ducrocq, 1994) as well as the availability of software specifically written with animal breeding applications in mind (Ducrocq and Sölkner, 1994) have demonstrated that the use of less appropriate approaches can be avoided.

The most popular class of survival models is the class of proportional hazards models (Cox, 1972; Kalbfleisch and Prentice, 1980; Lawless, 1982; Cox and Oakes, 1984).
The hazard of an animal (or in the animal breeding context, its risk of being culled) at time $t$ is described as the product of a baseline hazard function $\lambda_0(t)$, which is either left completely arbitrary (Cox model) or has a parametric form (eg, exponential, Weibull or gamma) and of a positive term which is an exponential function of a vector of covariates $\mathbf{w}'$ multiplied by a vector of regression parameters $\boldsymbol\theta$.

Proportional hazards models can be extended to include random (eg, genetic) effects, as in the regular mixed linear models that are used for genetic evaluations worldwide. Mixed survival models are classically referred to as 'frailty' models: the frailty term $v$ affects multiplicatively the hazard of individuals or groups of animals. When a term $v_m$ is defined for each animal $m$ ($\lambda_m(t, \mathbf{w}) = v_m\, \lambda(t, \mathbf{w})$), the frailty component extracts part of the unobserved variation between individuals (Vaupel et al, 1979; Hougaard, 1986a, b; Follmann and Goldberg, 1988; Aalen, 1994) and therefore allows for a correction of the possible discrepancy between the true variance of the observations and the one specified by the model. Such an extra variation is referred to as 'overdispersion' (Louis, 1991; Tempelman and Gianola, 1994).
When $v_q$ is defined for a group of individuals, eg, all daughters of a sire $q$, it describes the shared unobservable (genetic, in this case) characteristics which act on the hazard of each member of the group (Clayton and Cuzick, 1985; Anderson et al, 1992; Klein, 1992; Klein et al, 1992). In all cases, the simple transformation $s = \log v$ allows the inclusion of the frailty term in the linear term $\mathbf{w}'\boldsymbol\theta$.

Traditionally, a gamma distribution (Clayton and Cuzick, 1985; Ducrocq, 1987; Klein, 1992) has been attached to the frailty term $v$ because of its flexibility and mathematical convenience. Other distributions have also been proposed, eg, a positive stable distribution or an inverse Gaussian distribution (Hougaard, 1986a, b; Klein et al, 1992).
Unfortunately, in all cases, they do not have the theoretical appeal of the (multivariate) normal distribution commonly used in animal breeding when an infinitesimal polygenic model is assumed. However, it has been shown that the estimates obtained for the parameters of the gamma distribution of $v$ were relatively large, at least in dairy cattle, which means that $v$ had an approximate log-normal distribution, ie, $s$ was approximately normally distributed (Ducrocq, 1987; Ducrocq et al, 1988b; Ducrocq, 1994). Therefore, it has been suggested to account for the genetic relationship between animals by assuming a multivariate normal distribution for $\mathbf{s}$, the logarithm of the frailty term $v$ (Ducrocq, 1987; Korsgaard, 1996).
Several approaches have been used to estimate the parameters of the frailty distributions. Klein (1992) and Klein et al (1992) suggested the use of an EM algorithm (Dempster et al, 1977), with iterative estimation of $v$, $\boldsymbol\theta$ and the baseline cumulative hazard distribution for a Cox model, followed by the estimation of the frailty distribution given $\boldsymbol\theta$. When a Weibull model is combined with a gamma frailty term, Follmann and Goldberg (1988) showed that the frailty term can be algebraically integrated out from the likelihood function. The same property has been used in a Bayesian context (Ducrocq, 1987; Ducrocq et al, 1988b; Fournet, 1992; Ducrocq, 1994).
Monte-Carlo techniques have also been suggested in order to obtain the marginal posterior distributions of the hyperparameters (Clayton, 1991; Dellaportas and Smith, 1993; Korsgaard, 1996) but their use on large data sets with complex models (eg, with time-dependent covariates) may be very tedious.

The objective of this paper is to present a general Bayesian approach to the analysis of mixed survival models, with (but without being restricted to) typical animal breeding situations in mind. The framework will be presented for a simple Weibull model with two types of priors for the frailty term (gamma or log-normal). Straightforward generalization to other models (with stratification and time-dependent covariates, Cox models) will follow. A particular strategy for estimation of the hyperparameters suitable for large applications, complex models and situations where a relationship matrix is used will be presented and [...]

METHODS
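In the Weibull regression case, the baseline hazard function has the Weibull form $\lambda_0(t) = \lambda\rho(\lambda t)^{\rho-1}$. For the time being, we will assume that all covariates are time-independent and that only one baseline is defined (no stratification). The vector $\boldsymbol\theta$ includes fixed and random effects. For clarity, and unless specified otherwise, only one random effect in the model, eg, a sire effect $\mathbf{s}$, is considered here. Using the classical linear mixed-model notation:

$$\mathbf{w}_m'\boldsymbol\theta = \mathbf{x}_m'\boldsymbol\beta + \mathbf{z}_m'\mathbf{s} \qquad [1]$$

where $\boldsymbol\beta$ is the vector of fixed effects. The hazard function $\lambda(t)$ for animal $m$ is:

$$\lambda_m(t) = \lambda\rho(\lambda t)^{\rho-1} e^{\mathbf{w}_m'\boldsymbol\theta} = \rho\, t^{\rho-1} e^{\rho\log\lambda + \mathbf{w}_m'\boldsymbol\theta} \qquad [2]$$

and $\rho\log\lambda$ can be incorporated in a grand mean (or any factor) in $\mathbf{w}_m'\boldsymbol\theta$. For simplicity, we will write from now on:

$$\lambda_m(t) = \rho\, t^{\rho-1} e^{\mathbf{w}_m'\boldsymbol\theta} \qquad [3]$$

using the same notation but keeping in mind that a component of $\mathbf{w}_m'\boldsymbol\theta$ (representing an intercept) now includes $\rho\log\lambda$.

If the record comes from a daughter $m$ of sire $q$, with observed failure at $T_m$:

$$\lambda_m(T_m) = \rho\, T_m^{\rho-1} e^{\mathbf{x}_m'\boldsymbol\beta + s_q} = \rho\, T_m^{\rho-1} e^{\mathbf{x}_m'\boldsymbol\beta}\, v_q \qquad [4]$$

Here, $v_q = e^{s_q}$ is the frailty term. The usual relationship $f(t) = \lambda(t)S(t)$, where $S(t) = \exp\{-\int_0^t \lambda(u)\,du\}$, can be used to show that [3] is a particular case of a log-linear model of the form (Kalbfleisch and Prentice, 1980):

$$\rho \log T_m = -\mathbf{w}_m'\boldsymbol\theta + u_m \qquad [5]$$

where $u_m$ follows an extreme value distribution (Kalbfleisch and Prentice, 1980; Lawless, 1982) whose variance is equal to $\pi^2/6$. Note that here $u_m$ implicitly includes three-quarters of the additive genetic variance. With this presentation, a natural definition of the heritability of the survival trait on the logarithmic scale is:

$$h^2_{\log} = \frac{4\sigma_s^2}{\sigma_s^2 + \pi^2/6} \qquad [6]$$

Formula [6] solves the problem of a proper definition of heritability for survival traits indicated in Ducrocq (1987) and Ducrocq et al (1988b).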
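As a quick numerical illustration, the sketch below evaluates the heritability on the logarithmic scale, assuming [6] takes the usual sire-model form $4\sigma_s^2/(\sigma_s^2 + \pi^2/6)$ with $\pi^2/6$ as the residual (extreme value) variance. The function name is illustrative and not part of the original work.

```python
import math


def heritability_log_scale(sire_variance: float) -> float:
    """Sire-model heritability on the log scale (assumed form of [6]).

    The residual variance is taken to be that of the standard extreme
    value distribution, pi^2 / 6.
    """
    residual_variance = math.pi ** 2 / 6.0
    return 4.0 * sire_variance / (sire_variance + residual_variance)
```

Under this assumption a sire variance of about 0.105 gives a heritability close to 0.24, which is consistent with the value quoted later for $\gamma = 10$.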
Prior distributions

Gamma frailty model

Assume:

$$v_q \sim \text{gamma}(\gamma, \gamma) \qquad [7]$$

ie,

$$s_q \sim \text{(generalized) log-gamma}(\gamma, \gamma) \qquad [8]$$

The log-gamma distribution (Bartlett and Kendall, 1946, according to Lawless (1982), p 21) corresponds to the distribution of $\log x$ when $x$ follows a gamma distribution. Note however that the prefix 'log-' (eg, in 'log-normal') is often given to the distribution of $x$ when $\log x$ has a known form (eg, normal). Again, the choice of this prior distribution is mainly related to its flexibility and mathematical convenience (see also Klein, 1992, and Klein et al, 1992). Then:

$$p(s_q \mid \gamma) = \frac{\gamma^{\gamma}}{\Gamma(\gamma)} \exp\{\gamma s_q - \gamma e^{s_q}\}$$

Log-normal frailty model

In quantitative genetics, due to the infinitesimal polygenic model usually assumed, it is more natural to consider the following prior distribution for the frailty term:

$$s_q \sim N(0, \sigma_s^2)$$

and if sires are related:

$$\mathbf{s} \sim MVN(\mathbf{0}, \mathbf{A}\sigma_s^2)$$

where $\mathbf{A}$ is the relationship matrix between sires, we have:

$$p(\mathbf{s} \mid \sigma_s^2) \propto (\sigma_s^2)^{-N_q/2} \exp\left\{-\frac{\mathbf{s}'\mathbf{A}^{-1}\mathbf{s}}{2\sigma_s^2}\right\}$$

Hyperparameters

In order to simultaneously consider the two previous cases, we will denote the dispersion parameter of the random effect distribution by $\tau$ (with $\tau = \gamma$ or $\tau = \sigma_s^2$).
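To relate the two choices of $\tau$, it may help to express both as a variance of the sire effect $s_q$. The sketch below assumes the standard result that the logarithm of a gamma variable with shape $\gamma$ has variance $\psi'(\gamma)$ (the trigamma function); the trigamma routine is a plain recurrence plus asymptotic series written for this illustration, and both function names are hypothetical.

```python
import math


def trigamma(x: float) -> float:
    """Trigamma function psi'(x) for x > 0 (recurrence + asymptotic series)."""
    if x <= 0:
        raise ValueError("x must be positive")
    total = 0.0
    # Shift the argument upwards with psi'(x) = psi'(x + 1) + 1 / x^2.
    while x < 10.0:
        total += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    # 1/x + 1/(2x^2) + 1/(6x^3) - 1/(30x^5) + 1/(42x^7)
    series = inv + 0.5 * inv2 + inv * inv2 * (
        1.0 / 6.0 - inv2 * (1.0 / 30.0 - inv2 / 42.0)
    )
    return total + series


def sire_effect_variance(tau: float, prior: str) -> float:
    """Variance of s_q implied by the dispersion parameter tau."""
    if prior == "log-gamma":  # tau = gamma, with v_q ~ gamma(gamma, gamma)
        return trigamma(tau)
    if prior == "normal":     # tau = sigma_s^2
        return tau
    raise ValueError("prior must be 'log-gamma' or 'normal'")
```

With this convention, large values of $\gamma$ correspond to small sire variances, so the two parametrizations run in opposite directions.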
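Likelihood construction

Conditionally on $\boldsymbol\theta$ and $\rho$, the contribution to the likelihood of animal $m$ which fails ($\delta_m = 1$) or is censored ($\delta_m = 0$) at time $y_m$, is:

$$L_m(\boldsymbol\theta, \rho) = [\lambda_m(y_m)]^{\delta_m}\, S_m(y_m) \qquad [12]$$

where $S(t)$ is the survivor function at time $t$. For the Weibull model, these two components are:

$$\lambda_m(y_m) = \rho\, y_m^{\rho-1} e^{\mathbf{w}_m'\boldsymbol\theta}, \qquad S_m(y_m) = \exp\{-y_m^{\rho}\, e^{\mathbf{w}_m'\boldsymbol\theta}\}$$

Combining all these contributions (for $m = 1, \ldots, N$), which are conditionally independent, we obtain:

$$L(\boldsymbol\theta, \rho) = \prod_{m \in \{unc\}} \rho\, y_m^{\rho-1} e^{\mathbf{w}_m'\boldsymbol\theta} \exp\{-y_m^{\rho}\, e^{\mathbf{w}_m'\boldsymbol\theta}\} \prod_{m \in \{cens\}} \exp\{-y_m^{\rho}\, e^{\mathbf{w}_m'\boldsymbol\theta}\}$$

where $\{unc\}$ and $\{cens\}$ represent the sets of indices $m$ corresponding to uncensored and censored records, respectively.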
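The likelihood construction above can be sketched in a few lines. The following is a minimal illustration, assuming the Weibull hazard $\rho\, y^{\rho-1} e^{\mathbf{w}'\boldsymbol\theta}$ and survivor function $\exp\{-y^{\rho} e^{\mathbf{w}'\boldsymbol\theta}\}$ with precomputed linear predictors; the function and argument names are illustrative.

```python
import math
from typing import Sequence


def weibull_log_likelihood(
    times: Sequence[float],
    uncensored: Sequence[int],
    linear_predictors: Sequence[float],
    rho: float,
) -> float:
    """Sum over animals of delta_m * log lambda(y_m) + log S(y_m).

    times[m]             : y_m, failure or censoring time (> 0)
    uncensored[m]        : delta_m, 1 if the failure is observed, 0 if censored
    linear_predictors[m] : w_m' theta (the intercept includes rho * log lambda)
    """
    log_lik = 0.0
    for y, delta, eta in zip(times, uncensored, linear_predictors):
        if delta:
            # log hazard: log rho + (rho - 1) log y + w'theta
            log_lik += math.log(rho) + (rho - 1.0) * math.log(y) + eta
        # log survivor function: -y^rho * exp(w'theta), for all records
        log_lik -= (y ** rho) * math.exp(eta)
    return log_lik
```

Censored records contribute only through the survivor function, which is why the second term is accumulated for every animal.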
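Joint posterior density

Applying Bayes' theorem, we obtain:

$$p(\boldsymbol\theta, \rho, \tau \mid \mathbf{y}) \propto L(\boldsymbol\theta, \rho)\; p(\mathbf{s} \mid \tau)\; p(\boldsymbol\beta, \rho, \tau)$$

Inference on $\boldsymbol\theta$ and $\rho$

If we assume that $\tau$ is known, the logarithm of the joint posterior density of $\boldsymbol\theta$ and $\rho$ is:

$$\log p(\boldsymbol\theta, \rho \mid \mathbf{y}, \tau) = \text{const} + \log L(\boldsymbol\theta, \rho) + \log p(\mathbf{s} \mid \tau) + \log p(\boldsymbol\beta, \rho)$$

Using the same notation as in Tempelman and Gianola (1993), let $\hat{\boldsymbol\theta}_\tau$ be the mode of this joint posterior density:

$$\hat{\boldsymbol\theta}_\tau = \arg\max \log p(\boldsymbol\theta, \rho \mid \mathbf{y}, \tau)$$

At the mode, the gradient vector is null:

$$\nabla \log p(\hat{\boldsymbol\theta}_\tau) = \mathbf{0}$$

For later use, we also need to define the negative Hessian matrix:

$$\mathbf{H}_\tau = -\nabla\nabla' \log p(\boldsymbol\theta, \rho \mid \mathbf{y}, \tau)\,\big|_{\hat{\boldsymbol\theta}_\tau}$$

Joint inference on $\boldsymbol\beta$, $\rho$ and $\tau$

Consider here the particular case of the gamma frailty model, where the random effect $\mathbf{s}$ has a log-gamma distribution ($\tau = \gamma$; this implies that the genetic relationship between sires is ignored). Then the marginal posterior density of $\boldsymbol\beta$, $\rho$ and $\tau$ is obtained by integrating out $\mathbf{s}$ from the joint posterior density:

$$p(\boldsymbol\beta, \rho, \tau \mid \mathbf{y}) = p(\boldsymbol\beta, \rho, \gamma \mid \mathbf{y}) = \int p(\boldsymbol\beta, \mathbf{s}, \rho, \gamma \mid \mathbf{y})\, d\mathbf{s}$$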
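Grouping the contributions to the likelihood of all daughters of each sire $q$:

$$L(\boldsymbol\theta, \rho) = \prod_{q=1}^{N_q} \left[ \prod_{m \in \{unc,q\}} \rho\, y_m^{\rho-1} e^{\mathbf{w}_m'\boldsymbol\theta} \prod_{m \in \{unc,q\} \cup \{cens,q\}} \exp\{-y_m^{\rho}\, e^{\mathbf{w}_m'\boldsymbol\theta}\} \right]$$

where now $\{unc, q\}$ and $\{cens, q\}$ are the sets of indices $m$ of the $n_q$ uncensored and the censored daughters of sire $q$, respectively. Writing $e^{\mathbf{w}_m'\boldsymbol\theta} = e^{\mathbf{x}_m'\boldsymbol\beta}\, e^{s_q}$ for all daughters of sire $q$, one can factor out the terms which do not depend on $s_q$, which leads to:

$$p(\boldsymbol\beta, \rho, \gamma \mid \mathbf{y}) \propto p(\boldsymbol\beta, \rho, \gamma) \left[ \prod_{m \in \{unc\}} \rho\, y_m^{\rho-1} e^{\mathbf{x}_m'\boldsymbol\beta} \right] \prod_{q=1}^{N_q} I_q$$

with:

$$I_q = \int \exp\{n_q s_q - a_q e^{s_q}\}\; p(s_q \mid \gamma)\, ds_q$$

and:

$$a_q = \sum_{m \in \{unc,q\} \cup \{cens,q\}} y_m^{\rho}\, e^{\mathbf{x}_m'\boldsymbol\beta}$$

Each of these products, for $q = 1, \ldots, N_q$, is of the form:

$$I_q = \frac{\gamma^{\gamma}}{\Gamma(\gamma)} \int \exp\{(n_q + \gamma)\, s_q - (a_q + \gamma)\, e^{s_q}\}\, ds_q$$

The term under the integral can be recognized as the kernel of a log-gamma distribution with parameters $(n_q + \gamma)$ and $(a_q + \gamma)$. Therefore,

$$I_q = \frac{\gamma^{\gamma}\, \Gamma(n_q + \gamma)}{\Gamma(\gamma)\, (a_q + \gamma)^{n_q + \gamma}}$$

Hence, the integration of the random effects $s_q$ out of the joint posterior density can be done algebraically:

$$p(\boldsymbol\beta, \rho, \gamma \mid \mathbf{y}) \propto p(\boldsymbol\beta, \rho, \gamma) \left[ \prod_{m \in \{unc\}} \rho\, y_m^{\rho-1} e^{\mathbf{x}_m'\boldsymbol\beta} \right] \prod_{q=1}^{N_q} \frac{\gamma^{\gamma}\, \Gamma(n_q + \gamma)}{\Gamma(\gamma)\, (a_q + \gamma)^{n_q + \gamma}} \qquad [28]$$

or:

$$\log p(\boldsymbol\beta, \rho, \gamma \mid \mathbf{y}) = \text{const} + \log p(\boldsymbol\beta, \rho, \gamma) + \sum_{m \in \{unc\}} [\log\rho + (\rho - 1)\log y_m + \mathbf{x}_m'\boldsymbol\beta] + N_q [\gamma\log\gamma - \log\Gamma(\gamma)] + \sum_{q=1}^{N_q} [\log\Gamma(n_q + \gamma) - (n_q + \gamma)\log(a_q + \gamma)] \qquad [29]$$

Expressions [28] and [29] are essentially those used in Ducrocq (1987), Ducrocq et al (1988b) and Ducrocq (1994) for the estimation of the sire variance of the length of productive life of dairy cows. Follmann and Goldberg (1988) referred to [...] be estimated as the mode of this posterior distribution:

$$(\hat{\boldsymbol\beta}, \hat\rho, \hat\gamma) = \arg\max \log p(\boldsymbol\beta, \rho, \gamma \mid \mathbf{y})$$

with associated negative Hessian matrix $\mathbf{H}$.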
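As a sanity check on why the sire effects can be integrated out algebraically, assume the Weibull likelihood and the gamma($\gamma$, $\gamma$) frailty prior stated earlier. The per-sire integral $\int v^{n} e^{-a v}\, p(v \mid \gamma)\, dv$ then has the closed form $\gamma^{\gamma}\, \Gamma(n + \gamma) / [\Gamma(\gamma)\, (a + \gamma)^{n + \gamma}]$, where $n$ is the number of uncensored daughters and $a$ the sum of $y_m^{\rho} e^{\mathbf{x}_m'\boldsymbol\beta}$ over all daughters of the sire. The sketch below compares this closed form with Simpson's rule on the $s = \log v$ scale; all names and the integration bounds are choices made for this illustration.

```python
import math


def log_closed_form(n_unc: int, a_sum: float, gamma: float) -> float:
    """Log of gamma^gamma * Gamma(n + gamma) / (Gamma(gamma) * (a + gamma)^(n + gamma))."""
    return (
        gamma * math.log(gamma)
        + math.lgamma(n_unc + gamma)
        - math.lgamma(gamma)
        - (n_unc + gamma) * math.log(a_sum + gamma)
    )


def numerical_integral(n_unc: int, a_sum: float, gamma: float, n_steps: int = 4000) -> float:
    """Simpson's rule for the same integral, written in s = log(v).

    The fixed window around the mode is adequate when n + gamma is not
    small (roughly above 2); it is not a general-purpose integrator.
    """
    shape, rate = n_unc + gamma, a_sum + gamma
    log_const = gamma * math.log(gamma) - math.lgamma(gamma)
    mode = math.log(shape / rate)
    lo, hi = mode - 10.0, mode + 5.0
    h = (hi - lo) / n_steps  # n_steps must be even for Simpson's rule

    def integrand(s: float) -> float:
        # Log-gamma kernel: exp(shape * s - rate * exp(s)), times the prior constant.
        return math.exp(log_const + shape * s - rate * math.exp(s))

    total = integrand(lo) + integrand(hi)
    for i in range(1, n_steps):
        total += (4.0 if i % 2 else 2.0) * integrand(lo + i * h)
    return total * h / 3.0
```

Working with the logarithm of the closed form avoids overflow of the gamma functions when a sire has many daughters.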
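Inference on $\tau$

Inferences on the dispersion parameter $\tau$ should be based on its marginal posterior distribution, after integrating out the nuisance parameters $\boldsymbol\theta$ and $\rho$ (Berger, 1985; Robert, 1992):

$$p(\tau \mid \mathbf{y}) = \int\!\!\int p(\boldsymbol\theta, \rho, \tau \mid \mathbf{y})\, d\boldsymbol\theta\, d\rho$$

Except in trivial cases, this integration cannot be performed algebraically. To obtain the marginal posterior distribution of the dispersion parameter $\tau$, one can either simulate random samples from it (Clayton, 1991; Dellaportas and Smith, 1993; Korsgaard, 1996), compute the integral numerically (Smith et al, 1985) or find an approximation. We will choose the third alternative, using a technique known as Laplacian integration (Tierney and Kadane, 1986; Achcar and Bolfarine, 1986; Tierney et al, 1989; Tempelman and Gianola, 1993; Goutis and Casella, 1996). For any given value $\tau^*$ of $\tau$, we want to approximate:

$$p(\tau^* \mid \mathbf{y}) = \int\!\!\int p(\boldsymbol\theta, \rho, \tau^* \mid \mathbf{y})\, d\boldsymbol\theta\, d\rho$$

Intuitively, if $p(\boldsymbol\theta, \rho \mid \mathbf{y}, \tau^*) = p(\boldsymbol\theta_{\tau^*})$ is unimodal, the value of the integral will heavily depend on the value of the density at its mode $\hat{\boldsymbol\theta}_{\tau^*}$. Then, using the first terms of a Taylor series expansion of $\log p(\boldsymbol\theta_{\tau^*})$ around this mode and noticing that $\nabla \log p(\hat{\boldsymbol\theta}_{\tau^*}) = \mathbf{0}$, we have:

$$\int p(\boldsymbol\theta_{\tau^*})\, d\boldsymbol\theta_{\tau^*} \approx p(\hat{\boldsymbol\theta}_{\tau^*}) \int \exp\left\{-\tfrac{1}{2} (\boldsymbol\theta_{\tau^*} - \hat{\boldsymbol\theta}_{\tau^*})'\, \mathbf{H}_{\tau^*}\, (\boldsymbol\theta_{\tau^*} - \hat{\boldsymbol\theta}_{\tau^*})\right\} d\boldsymbol\theta_{\tau^*} = p(\hat{\boldsymbol\theta}_{\tau^*})\, (2\pi)^{k/2}\, |\mathbf{H}_{\tau^*}|^{-1/2}$$

where $\mathbf{H}_{\tau^*}$ is the negative Hessian matrix of $\log p(\boldsymbol\theta_{\tau^*})$ at the mode and $k$ is the number of parameters integrated out. The determinant part in the last equation is obtained by recognizing the kernel of a multivariate normal density. This results in an approximation of the marginal posterior density which is similar to what is described in the statistical literature as a saddlepoint approximation of this density (Daniels, 1954; Reid, 1988; Kolassa, 1994; Goutis and Casella, 1996).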
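The mechanics of Laplacian integration can be illustrated on a one-dimensional case where the exact answer is known. The sketch below, which is only an illustration and not the implementation used in the paper, applies the approximation $\int e^{h(u)}\,du \approx e^{h(\hat u)} \sqrt{2\pi / (-h''(\hat u))}$ to the log-gamma kernel $h(u) = a u - e^{u}$, whose exact integral is $\Gamma(a)$. In $k$ dimensions the square root is replaced by $(2\pi)^{k/2} |\mathbf{H}|^{-1/2}$.

```python
import math
from typing import Callable


def laplace_integral(log_f: Callable[[float], float], mode: float, neg_hessian: float) -> float:
    """One-dimensional Laplace approximation of the integral of exp(log_f)."""
    return math.exp(log_f(mode)) * math.sqrt(2.0 * math.pi / neg_hessian)


def log_kernel(u: float, shape: float) -> float:
    """Log of the log-gamma kernel exp(shape * u - exp(u))."""
    return shape * u - math.exp(u)


def laplace_gamma_function(shape: float) -> float:
    """Laplace approximation of Gamma(shape) via the log-gamma kernel."""
    mode = math.log(shape)   # solves shape - exp(u) = 0
    neg_hessian = shape      # minus the second derivative, exp(u), at the mode
    return laplace_integral(lambda u: log_kernel(u, shape), mode, neg_hessian)
```

The relative error shrinks roughly like $1/(12a)$ as the shape $a$ grows, which mirrors the observation that the approximation improves when more information is available per parameter.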
Taking the logarithm on both sides, we get the following approximation:

$$\log p(\tau^* \mid \mathbf{y}) \approx \text{const} + \log p(\hat{\boldsymbol\theta}_{\tau^*}) - \tfrac{1}{2} \log |\mathbf{H}_{\tau^*}| \qquad [34]$$

An obvious point estimate of $\tau$ is $\hat\tau$ at the mode of this approximate marginal posterior density:

$$\hat\tau = \arg\max_{\tau} \log p(\tau \mid \mathbf{y})$$

However, the use of [34] is not limited to the computation of its mode. Other point estimates or other types of inferences (credible sets or hypothesis testing, etc (Berger, 1985; Robert, 1992)) can be derived from the knowledge of the full marginal posterior density. Repeated computations of [34], and in particular of the negative Hessian matrix $\mathbf{H}$, for many different values of $\tau$ may quickly become too heavy, though. We propose to summarize the general characteristics of the distribution [34] through the computation of its first three moments by unidimensional numerical integration based on Gauss-Hermite quadrature. To obtain a more precise estimate of these moments after quadrature, the iterative strategy proposed by Smith et al (1985) is implemented. Using initial values of the mean and the variance of the distribution of $\log\tau$ (to force the integration domain to be $(-\infty, +\infty)$), the integration variable is standardized. New estimates are obtained by quadrature and the standardization is repeated. After a few iterations, this strategy ensures that the quadrature rules are applied in an appropriate region of the function to integrate. Details are given in the Appendix.
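The iterative standardization can be sketched as follows. This is a minimal illustration of the idea, not the Appendix algorithm: it assumes an unnormalized log-density defined on the whole real line (for $\tau$, this would be the density of $\log\tau$, Jacobian included) and uses the standard five-point Gauss-Hermite nodes and weights for the weight function $e^{-z^2}$. The function name and the fixed iteration count are choices made for this sketch.

```python
import math
from typing import Callable, Tuple

# Five-point Gauss-Hermite rule for the weight exp(-z^2): (node, weight) pairs.
GH5 = (
    (-2.0201828704560856, 0.01995324205904591),
    (-0.9585724646138185, 0.3936193231522412),
    (0.0, 0.9453087204829419),
    (0.9585724646138185, 0.3936193231522412),
    (2.0201828704560856, 0.01995324205904591),
)


def iterated_moments(
    log_density: Callable[[float], float],
    mean: float,
    sd: float,
    n_iter: int = 20,
) -> Tuple[float, float, float]:
    """Mean, standard deviation and third central moment of an unnormalized density.

    At each pass the integration variable is standardized with the current
    mean and sd, the moments are recomputed by quadrature, and the
    standardization is updated with the new values.
    """
    third = 0.0
    for _ in range(n_iter):
        scale = math.sqrt(2.0) * sd
        xs = [mean + scale * z for z, _ in GH5]
        # w * exp(z^2) undoes the Gauss-Hermite weight function.
        fs = [w * math.exp(z * z + log_density(x)) for (z, w), x in zip(GH5, xs)]
        m0 = sum(fs)
        new_mean = sum(f * x for f, x in zip(fs, xs)) / m0
        var = sum(f * (x - new_mean) ** 2 for f, x in zip(fs, xs)) / m0
        third = sum(f * (x - new_mean) ** 3 for f, x in zip(fs, xs)) / m0
        mean, sd = new_mean, math.sqrt(var)
    return mean, sd, third
```

For a target that is exactly normal, the rule becomes exact once the standardization matches the true mean and standard deviation, so the iterations settle quickly even from a poor starting point.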
The results can be used to obtain a second approximation of the marginal posterior density based on its first three moments. Using an expression known as the Gram-Charlier series expansion of a function $f(x)$ of a variable $x$ with moments $\mu$, $\sigma^2$ and $\kappa_3$, we have (McCullagh, 1987):

$$f(x) \approx \phi(z) \left[ 1 + \frac{\kappa_3}{6\sigma^3} (z^3 - 3z) \right]$$

where $\phi(z)$ is the density of a normal distribution with mean $\mu$ and variance $\sigma^2$ and $z = (x - \mu)/\sigma$.
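A density approximation from three moments can be sketched directly. The code below assumes the standard Gram-Charlier type A series truncated after the third-moment term, with the third central moment supplied by the caller; the function name is illustrative.

```python
import math


def gram_charlier_density(x: float, mean: float, sd: float, third_central_moment: float) -> float:
    """Gram-Charlier type A density truncated after the third-moment term."""
    z = (x - mean) / sd
    normal_pdf = math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))
    skewness = third_central_moment / sd ** 3
    # He_3(z) = z^3 - 3z is the third probabilists' Hermite polynomial.
    return normal_pdf * (1.0 + skewness / 6.0 * (z ** 3 - 3.0 * z))
```

Because the correction is a polynomial, the truncated series can become negative in the tails when the skewness is large, so it is best read as a local description of the shape of the density.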
Other situations

Cox model

The application of the saddlepoint approximation to obtain the marginal posterior density of the dispersion parameter of the random effect is not restricted to the Weibull regression model. It can be applied, at least in theory, to any joint posterior density. For example, in the case of a Cox mixed model, for which the baseline hazard function $\lambda_0(t)$ is assumed to be completely arbitrary, the same approach applies to $p(\boldsymbol\theta \mid \mathbf{y}, \tau^*)$, replacing the likelihood function in [16] by the partial likelihood function initially proposed by Cox (1972):

$$L_P(\boldsymbol\theta) = \prod_{i} \frac{e^{\mathbf{w}_{[i]}'\boldsymbol\theta}}{\sum_{j \in Risk(T_{[i]})} e^{\mathbf{w}_j'\boldsymbol\theta}}$$

where the $T_{[i]}$'s are the distinct observed failure times and $Risk(T_{[i]})$ is the set of individuals at risk at time $T_{[i]}$, ie, alive just prior to $T_{[i]}$.
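For concreteness, the partial likelihood can be evaluated with a direct double loop. The sketch below assumes no tied failure times and precomputed linear predictors $\mathbf{w}'\boldsymbol\theta$; it is a quadratic-time illustration with hypothetical names, not an efficient implementation for large data sets.

```python
import math
from typing import Sequence


def cox_log_partial_likelihood(
    times: Sequence[float],
    uncensored: Sequence[int],
    linear_predictors: Sequence[float],
) -> float:
    """Log partial likelihood, assuming no tied failure times.

    For each observed failure, the contribution is w'theta of the failing
    animal minus the log of the sum of exp(w'theta) over its risk set,
    ie, over animals whose time is at least the failure time.
    """
    log_pl = 0.0
    for t_i, delta_i, eta_i in zip(times, uncensored, linear_predictors):
        if not delta_i:
            continue  # censored records only enter through risk sets
        risk_sum = sum(
            math.exp(eta_j)
            for t_j, eta_j in zip(times, linear_predictors)
            if t_j >= t_i
        )
        log_pl += eta_i - math.log(risk_sum)
    return log_pl
```

Note that the baseline hazard and $\rho$ do not appear anywhere, which is why only $\boldsymbol\theta$ remains to be integrated out in the Cox case.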
Then, assuming that $\tau$ is known, the estimate of $\boldsymbol\theta$ to be used in [34] is obtained from the joint posterior density as:

$$\hat{\boldsymbol\theta}_\tau = \arg\max_{\boldsymbol\theta} \log p(\boldsymbol\theta \mid \mathbf{y}, \tau)$$

Stratification. Time-dependent covariates

Stratification and the use of time-dependent covariates are common approaches to accommodate situations for which the proportional hazards assumption is not valid for all effects or throughout the whole time range. As for the Cox model, the main changes with respect to the situation described so far occur in the computation of the likelihood and its derivatives and do not interfere with the validity of the saddlepoint approximation. For example, if the covariates in $\mathbf{w}_m$ are step-functions of time with changes at times $\varphi_{m,i}$, $i = 0, \ldots, I$, with $\varphi_{m,0} = 0$ and $\varphi_{m,I} = y_m$, then $\mathbf{w}_m$ is piecewise constant on intervals $(\varphi_{m,i}, \varphi_{m,i+1}]$ and the expressions to use in [12] are:

$$\lambda_m(y_m) = \rho\, y_m^{\rho-1} e^{\mathbf{w}_m(y_m)'\boldsymbol\theta}, \qquad S_m(y_m) = \exp\left\{ -\sum_{i=0}^{I-1} (\varphi_{m,i+1}^{\rho} - \varphi_{m,i}^{\rho})\, e^{\mathbf{w}_m(\varphi_{m,i+1})'\boldsymbol\theta} \right\}$$

In the case of stratification, the hazard function $\lambda(y_m)$ and the survivor function $S(y_m)$ include parameters $\rho$ and $\rho\log\lambda$ (the 'intercept' in $\mathbf{w}'\boldsymbol\theta$ in [1]) specific to the relevant stratum.
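With step-function covariates, the cumulative hazard is a sum of Weibull increments over the intervals on which the linear predictor is constant. The following sketch assumes the Weibull baseline $\rho\, t^{\rho-1}$ and takes the change points and per-interval linear predictors as given; the names are illustrative.

```python
import math
from typing import Sequence, Tuple


def piecewise_weibull_terms(
    change_times: Sequence[float],
    linear_predictors: Sequence[float],
    rho: float,
) -> Tuple[float, float]:
    """Log hazard at y_m and log survivor function for step covariates.

    change_times      : [phi_0 = 0, phi_1, ..., phi_I = y_m], increasing
    linear_predictors : w'theta on each interval (phi_i, phi_{i+1}], length I
    """
    y = change_times[-1]
    # The hazard at y_m uses the covariate values in force on the last interval.
    log_hazard = math.log(rho) + (rho - 1.0) * math.log(y) + linear_predictors[-1]
    cumulative_hazard = sum(
        (end ** rho - start ** rho) * math.exp(eta)
        for start, end, eta in zip(change_times[:-1], change_times[1:], linear_predictors)
    )
    return log_hazard, -cumulative_hazard
```

When the linear predictor is the same on every interval, the increments telescope and the time-independent expression $-y_m^{\rho} e^{\mathbf{w}'\boldsymbol\theta}$ is recovered.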
ILLUSTRATION

In order to illustrate the approach described above for the estimation of dispersion parameters of the random effects in frailty models, simulated data were generated based on a Weibull model with a random effect (that will be referred to as a sire effect) and mimicking the data structure that is often encountered in animal breeding situations. The objective was to assess the quality of the saddlepoint approximation by comparing the exact marginal posterior distribution of the variance parameter of the sire effect ([28] obtained via algebraic integration) with its approximation ([34] after Laplacian integration). This comparison was done under the following conditions: a log-gamma distribution [8] was considered as a prior for the sire effects; only one fixed effect ($\beta = \mu$, the grand mean) was included; and it was assumed that in [28], we have:

Preliminary examination of [43] showed that in all cases studied, the density [43] was virtually identical to the approximate density $p(\gamma \mid \mathbf{y})$ after integrating out $\mu$ and $\rho$ by Laplacian integration. In other words, what was actually compared here are two approximate densities obtained after Laplacian integration of $\mu$, $\rho$ and $N_q$ sire effects $s_q$ in one case, of $\mu$ and $\rho$ (with algebraic integration of the $s_q$'s) in the other case.

The general behavior of the saddlepoint approximation of the marginal posterior density of the sire variance was also examined under a variety of situations (different types of censoring, of unbalanced structure, with a multivariate normal prior, with relationships between sires, using a Cox model, etc).
Simulation strategy

In all situations (unless specified otherwise), 5 000 records were generated using the following Weibull hazard function:

$$\lambda_{jkq}(t) = \rho\, t^{\rho-1} \exp\{\mu + \beta_k + s_q\} \qquad [44]$$

where $\lambda_{jkq}(t)$ represents the hazard at time $t$ of the $j$th animal ($j = 1, \ldots, 5\,000/N_q$) under the influence of the $k$th level of a fixed effect, hereafter referred to as the 'herd' effect ($k = 1, \ldots, K$) and daughter of the $q$th sire ($q = 1, \ldots, N_q$). Values $\mu = -11$ and $\rho = 1.5$ were used in all cases described here, corresponding to an average failure time of about 1800. For the comparison between Laplacian and algebraic integrations, it was assumed that $K = 0$, ie, $\beta_k = 0$, and the sire effects $s_q$ were generated from a log-gamma distribution with parameter $\gamma = 50$. This corresponds to a variance of $s_q$ equal to $\psi^{(1)}(\gamma) \approx 0.02$, where $\psi^{(1)}(\gamma)$ is the trigamma function evaluated at $\gamma$. Using expression [6], we get:

$$h^2_{\log} = \frac{4 (0.02)}{0.02 + \pi^2/6} \approx 0.048$$

which is in the typical range of heritability values encountered for this kind of trait.

When a normal distribution was assumed, a sire variance of $\sigma_s^2 = 0.02$ was retained to generate the sire effects. When herd effects were used in model [44] ($K > 0$), these were arbitrarily generated from a uniform $[-2, 2]$ distribution.

Two different censoring schemes were simulated. In censoring type A, all generated records greater than a given value $C_A$ were considered as censored at $C_A$. The value of $C_A$ was chosen by trial and error in order to obtain a given proportion of censored records. Censoring type B tried to mimic an overlapping generations structure: the records of the daughters of a first batch of 10% of the sires were censored at $C_B$ when their simulated failure time was greater than $C_B$. The daughters of the following batch (also 10%) of sires were considered as censored when their failure time was greater than $2C_B$, and so on. The censoring time for the last 10% was $10C_B$. Therefore, the daughters of the first group of sires were heavily censored ('young daughters of young sires') while the proportion of censored records for the last group was small ('daughters of old sires'). Again, $C_B$ was determined by trial and error.
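The simulation strategy with censoring type A can be sketched with the standard library alone. The code below is an illustration under stated assumptions, not the program used in the study: it assumes a hazard of the form $\rho\, t^{\rho-1} \exp\{\mu + s_q\}$ with no herd effect, draws $e^{s_q}$ from a gamma distribution with shape $\gamma$ and mean 1, and generates failure times by inverting the survivor function. Default values and names are choices made for this sketch.

```python
import math
import random
from typing import List, Tuple


def simulate_records(
    n_sires: int = 100,
    n_daughters: int = 50,
    mu: float = -11.0,
    rho: float = 1.5,
    gamma: float = 50.0,
    censoring_time: float = 1200.0,
    seed: int = 1,
) -> List[Tuple[float, int, int]]:
    """Simulate (time, uncensored, sire) records with type A censoring.

    Assumed hazard: rho * t^(rho - 1) * exp(mu + s_q), where
    exp(s_q) ~ gamma(shape=gamma, rate=gamma); herd effects are omitted.
    """
    rng = random.Random(seed)
    records = []
    for q in range(n_sires):
        frailty = rng.gammavariate(gamma, 1.0 / gamma)  # v_q, with mean 1
        rate = math.exp(mu) * frailty                   # exp(mu + s_q)
        for _ in range(n_daughters):
            # Inverse transform: solve S(t) = exp(-t^rho * rate) = u for t.
            u = 1.0 - rng.random()                      # u lies in (0, 1]
            t = (-math.log(u) / rate) ** (1.0 / rho)
            if t > censoring_time:
                records.append((censoring_time, 0, q))  # censored at C_A
            else:
                records.append((t, 1, q))               # observed failure
    return records
```

With these default values roughly half of the records end up censored; changing `censoring_time` is the trial-and-error step described above.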
Different unbalanced situations were also simulated. In scheme U1, the daughters of 100 sires (with 50 daughters each) were distributed over 505 herds, five with 500 animals and 500 with five daughters. In scheme U2, half of the animals (2 500) were assumed to be daughters of five sires with 500 daughters each while the other half were daughters of 500 sires with five daughters each. These animals were randomly distributed over 100 herds. Finally, in scheme U3, the daughters of the 50 'best' sires (with 50 daughters each) were raised in the 'best' 50 herds (where 'best' means lowest relative culling rate) while the daughters of the 'worst' 50 sires were raised in the 'worst' herds.
To study the impact of the existence of genetic relationships between individuals, data were generated according to a model slightly different from [44]. First, the effects $s_g$ of ten grandsires ('sires of sires') were generated from a normal distribution with mean 0 and variance $\sigma_s^2/4$ (with $\sigma_s^2 = 0.02$). For each of them, ten sire effects $s_q$ were obtained by adding to $s_g$ a normally distributed random effect with variance $3\sigma_s^2/4$. Finally, 50 records of daughters of each of these sires were simulated according to the model:

$$\lambda_{jkq}(t) = \rho\, t^{\rho-1} \exp\{\mu + \beta_k + s_q + r_j\} \qquad [46]$$

where $r_j$ represents the remaining additive genetic effect for the $j$th animal and was generated from a normal distribution with mean 0 and variance $3\sigma_s^2$, leading to records with a global additive genetic variance equal to $\sigma_a^2 = 4\sigma_s^2$. These data were analyzed and the marginal posterior density of the sire variance component was obtained under three different genetic models: two sire models identical to [44] assuming no relationships between sires (case S1) or including the relationship matrix between sires (case S2), and an 'animal' model (case An), describing the individual additive genetic effect $a_j$ of each animal $j$ and including the complete relationship matrix between the 5 110 animals (5 000 with records + 100 sires + 10 grand-sires):

$$\lambda_{jk}(t) = \rho\, t^{\rho-1} \exp\{\mu + \beta_k + a_j\}$$
All computations were done using the 'Survival Kit', a set of Fortran programs developed by Ducrocq and Sölkner (1994). The 'Survival Kit' was specifically written to efficiently analyze the very large field data sets encountered by animal breeders and implements all the features described in this paper with Weibull and Cox models, possibly with strata, time-dependent covariates and random effects. In particular, the maximization of the expressions [18] or [29] is based on a limited memory quasi-Newton method (Liu and Nocedal, 1989) which only requires the computation of the vector of first derivatives of [18] or [29]. If required (for example, in [36] or when computing asymptotic standard errors), the negative Hessian is [...] 1994) are used to compute the determinant or the inverse of this negative Hessian in the Weibull case.
Results

Laplacian integration vs algebraic integration

Figure 1 represents the marginal posterior distribution obtained after integrating out the sire effects $s_q$ from the joint posterior distribution, either algebraically or using the Laplacian approximation. All records were uncensored. In the three samples presented here, the true value $\gamma = 50$ is obviously included in any reasonable HPD credible set. When there were few sires with many daughters each, the two computed forms of the marginal posterior distribution were virtually indistinguishable. When little information was available for each sire effect (ten daughters each in the 500 sires case), the marginal posterior distributions were rather flat, with a long tail towards large values of $\gamma$ (ie, small sire variances). The agreement between Laplacian and algebraic integration was not as good, although the modes of the two distributions were close. With even less information per sire (five daughters or less per sire), neither of the two marginalization techniques worked in most of the cases: the mode of the distribution or its first moments could [...]
Effect of censoring

Figure 2 presents again the result of the same two marginalization approaches, for 100 sires with 50 daughters each but under censoring schemes A and B, with in both cases a proportion of 50% censored records ($C_A = 1200$ and $C_B = 270$). Clearly, censoring had little effect on the quality of the approximation when the Laplacian integration was used. However, because the amount of information available to estimate a rather small sire variance was drastically reduced, it was not always possible to obtain a well-defined posterior density (see Breslow and Clayton (1993) for similar results in the context of generalized linear mixed models). For example, in figure 2, the posterior density in the case of censoring scheme A does not integrate to 1. The same phenomenon also occurred for some samples with censoring scheme B. Interestingly, when sire effects with a larger variance ($\gamma = 10$) were simulated, which corresponds to a heritability of 0.24, even extreme situations with more than 80% censored records (with $C_A = 520$) led to well-defined, very peaked posterior densities.
Normally distributed random effects

Having shown the validity of the saddlepoint approximation of the marginal [...] effects and with 100 (fixed) 'herd' effects. Figure 3 displays the marginal posterior density for ten such samples, with 100 sires and no censoring. The obtained distributions were not as skewed as in the case of a log-gamma distribution. At least in the examples studied, the true value 0.02 was always in any HPD credible set. Note however that the variances of these densities were quite large (standard deviations between 0.0049 and 0.0079 for a true parameter value of 0.02).
Effect of unbalancedness

When unbalancedness was induced by simulating both very large and small herds (case U1), the effect on the marginal posterior density appeared to be minimal (fig 4). When a large heterogeneity was created in the number of daughters per sire (case U2), the main consequence was a less precise estimation of the sire variance. The most negative impact was observed when the animals were not randomly distributed across herds (case U3). It seems that a part of the favorable influence of the best sires on the survival of their daughters was attributed to the herd effects, resulting in a sire variance strongly biased downwards.

Including a relationship matrix

The two marginal posterior densities obtained under a sire model with or without inclusion of the true relationship matrix between sires were very similar (fig 5). As may have been expected, the inclusion of the relationship matrix slightly [...] the records of related animals are more similar, hence globally less variable. In all the samples simulated, the animal model consistently led to a slight overestimation of the sire variance: the marginal posterior density in the case of the animal model was systematically to the right of those for the two sire models. This may be attributed, at least in part, to the fact that a much larger number of parameters have to be integrated out with an animal model than with a sire model. Such a problem has been pointed out for example by Mayer (1995) in the context of a threshold model. The Laplacian integration probably does not perform as well in such a case. Note, however, that this may be worsened by the fact that only a very simple pedigree structure was simulated here. In particular, no information at all was assumed to be available on the female side. The sire model used does not account for the overdispersion implicitly created by the effect $r_j$, which represents three-quarters of the total additive genetic variance. An attempt to fit a model similar to [46] assuming a log-gamma prior distribution for $r_j$ and performing the algebraic integration of $r_j$ led to a marginal posterior density of the sire variance similar to that obtained with the two sire models and a very large estimate ($\gamma > 400$ at the mode) for the gamma parameter, synonymous with a very small variance for the $r_j$'s. This is likely the result of the lack of information available for the estimation of $\gamma$ that was already illustrated in figure 1.

Cox model vs Weibull model
When a parametric (Weibull) or semi-parametric (Cox) model was used in the construction of the likelihood function, it was repeatedly observed that the resulting marginal posterior densities of $\sigma_s^2$ were very similar (fig 6), with often a slightly larger variance in the case of the Cox model. It is not known if similar results would have been obtained had the data been generated assuming a baseline hazard function different from the Weibull hazard.

Approximation of the marginal posterior density of $\tau$ based on its first three moments

The first three moments of the marginal posterior density of the parameter $\tau$ were computed by numerical integration of [34] using a five-point Gauss-Hermite quadrature formula and after standardization of the function to integrate. New standardization factors were obtained and the procedure was repeated until the computed moments stabilized, which usually occurred after only three iterations. Figure 7 illustrates the fact that the knowledge of these moments leads to a reasonable approximation of the marginal posterior density of $\tau$.

DISCUSSION AND CONCLUSION
Bayesian analysis offers a coherent framework for the otherwise unclear problem of variance components estimation in mixed nonlinear models (Ducrocq, 1990): all the elements for inferences on dispersion parameters are contained in the marginal posterior distribution of these parameters and the construction of the latter is based [...] proposed for categorical data (Foulley et al, 1987; Höschele et al, 1987; Foulley et al, 1989) and for Poisson mixed models (Tempelman and Gianola, 1993).
In this paper, a general approach for genetic evaluation and estimation of dispersion parameters for Weibull and Cox mixed models was described. Its main attractive features are its generality and its computational feasibility, even for very large applications. As an example of the latter, the largest analysis that we have carried out involved the estimation of the mode and the first three moments of the marginal posterior distribution of the sire variance component for the length of productive life of 633 516 Holstein cows, daughters of 3 613 related sires. The Weibull mixed model used was quite complex and included time-dependent effects such as a herd-year-season effect (with 82 713 levels, assumed to be randomly distributed with a log-gamma distribution), a lactation number × stage of lactation effect, a herd size effect and a year-to-year variation in herd size effect, as well as continuous linear and quadratic effects of covariates such as age at first calving, milk, fat and protein yield.
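The moment computation mentioned here (five-point Gauss-Hermite quadrature with repeated restandardization) can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `posterior_moments`, its arguments and the stopping rule are hypothetical, the density is a generic one-dimensional unnormalized function rather than expression [34], and the nodes and weights are the standard five-point values for the weight function exp(-x²).

```python
import math

# Five-point Gauss-Hermite nodes and weights for the weight function exp(-x^2).
GH_NODES = (-2.0201828704560856, -0.9585724646138185, 0.0,
            0.9585724646138185, 2.0201828704560856)
GH_WEIGHTS = (0.01995324205904591, 0.3936193231522412, 0.9453087204829419,
              0.3936193231522412, 0.01995324205904591)


def posterior_moments(density, center, scale, max_iter=50, tol=1e-10):
    """Return (mean, standard deviation, skewness) of an unnormalized density.

    `center` and `scale` are the initial standardization factors (assumed
    starting values); each pass replaces them by the mean and standard
    deviation just computed, until they stabilize.
    """
    skewness = 0.0
    for _ in range(max_iter):
        m0 = m1 = m2 = m3 = 0.0
        for x, w in zip(GH_NODES, GH_WEIGHTS):
            dev = scale * math.sqrt(2.0) * x        # deviation from the current center
            # w * exp(x^2) undoes the Gaussian weight built into the rule.
            g = w * math.exp(x * x) * density(center + dev)
            m0 += g
            m1 += g * dev
            m2 += g * dev ** 2
            m3 += g * dev ** 3
        if m0 <= 0.0:
            raise ValueError("density vanishes at all nodes: poor starting values")
        # The Jacobian scale * sqrt(2) cancels in these ratios.
        m1, m2, m3 = m1 / m0, m2 / m0, m3 / m0
        variance = m2 - m1 ** 2
        if variance <= 0.0:
            raise ValueError("non-positive variance: poor starting values")
        new_center = center + m1
        new_scale = math.sqrt(variance)
        skewness = (m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3) / new_scale ** 3
        converged = (abs(new_center - center) < tol
                     and abs(new_scale - scale) < tol)
        center, scale = new_center, new_scale
        if converged:
            break
    return center, scale, skewness
```

With only five nodes the result is reliable when the standardized density is close to a normal density times a low-degree polynomial. For a parameter restricted to positive values, one option (an assumption of this sketch, not a statement from the text) is to work on a transformed scale so that no node falls outside the admissible range.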
Popular extensions of proportional hazards models such as stratification or the use of time-dependent covariates complicate the actual likelihood computations but do not interfere with the marginalization procedures described here. The inclusion of genetic relationships between individuals is straightforward through the use of an appropriate prior distribution. Other prior distributions (including informative priors) or other parametric baseline hazard functions could have been incorporated. More complex genetic structures (eg, with maternal effects) can be fitted. When more than one random effect is considered in the model, the approximation described here leads to the joint marginal posterior of all the dispersion parameters for all random effects. Further marginalization can be performed numerically along the lines described in the Appendix for the calculation of the moments of the marginal posterior distribution, but this may be considered too costly. In the case of a Weibull mixed model with two random effects, one of them having a log-gamma distribution, the possibility of integrating out the latter algebraically avoids this difficulty.
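The algebraic integration of a log-gamma random effect can be illustrated on a deliberately reduced case. The sketch below assumes a single frailty term w = exp(v) with a gamma distribution of shape and rate both equal to γ (mean 1), all other parameters held fixed, and a contribution to the likelihood proportional to w^d exp(-wΛ), where d is the number of uncensored records sharing the frailty and Λ their summed cumulative hazard. This parameterization and the function names are assumptions made for illustration, not the expressions of the paper. In this reduced case the Laplace approximation over v differs from the exact integral only by the replacement of Γ(γ + d) with its Stirling approximation.

```python
import math


def log_exact_marginal(gamma_par, d, cum_hazard):
    """Log of the integral of w^d exp(-w*cum_hazard) against a
    gamma(shape=gamma_par, rate=gamma_par) density for the frailty w."""
    return (gamma_par * math.log(gamma_par) - math.lgamma(gamma_par)
            + math.lgamma(gamma_par + d)
            - (gamma_par + d) * math.log(gamma_par + cum_hazard))


def log_laplace_marginal(gamma_par, d, cum_hazard):
    """Laplace approximation of the same integral, taken over v = log(w)."""
    a = gamma_par + d
    b = gamma_par + cum_hazard
    v_hat = math.log(a / b)                  # mode of the integrand in v
    h_hat = a * v_hat - b * math.exp(v_hat)  # log integrand at the mode
    curvature = b * math.exp(v_hat)          # minus its second derivative (= a)
    return (gamma_par * math.log(gamma_par) - math.lgamma(gamma_par)
            + h_hat + 0.5 * math.log(2.0 * math.pi / curvature))


def log_error(gamma_par, d, cum_hazard):
    """Laplace minus exact, on the log scale."""
    return (log_laplace_marginal(gamma_par, d, cum_hazard)
            - log_exact_marginal(gamma_par, d, cum_hazard))
```

The log-scale error is close to -1/(12(γ + d)): negligible when many records share the frailty term, but not when d is 0 or 1, and then it changes with γ, which distorts the approximate marginal density of γ.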
Laplacian integration can be applied to other situations too. For example, Tierney and Kadane (1986) and Tierney et al (1989) suggested the direct computation of the mean of the marginal posterior density using second-order approximation formulae. These formulae were derived by applying Laplacian integration to both the numerator and the denominator of a ratio of integrals. However, this requires the maximization of the joint posterior density for the dispersion parameters, the fixed effects and the random effects. This approach failed when we attempted it, as the maximization procedure led to dispersion parameter estimates corresponding to random effects with null variance. The same phenomenon had been described previously in similar situations (Tempelman and Gianola, 1993).
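The ratio-of-integrals idea can be shown on a one-parameter toy case, far simpler than the mixed survival models considered here. The sketch assumes an unnormalized posterior θ^(a-1) exp(-bθ) with a > 1, for which the exact mean is a/b and both maximizations have closed forms. The function name and the example values are hypothetical.

```python
import math


def laplace_ratio_mean(a, b):
    """Approximate posterior mean of theta for a posterior proportional to
    theta^(a-1) * exp(-b*theta), a > 1, applying a Laplace approximation to
    both the numerator and the denominator of the ratio of integrals."""
    # Denominator: integrand theta^(a-1) exp(-b*theta).
    theta_hat = (a - 1.0) / b
    log_den = (a - 1.0) * math.log(theta_hat) - b * theta_hat
    sigma_den = math.sqrt(a - 1.0) / b   # (minus second derivative of the log)^(-1/2)
    # Numerator: integrand theta * theta^(a-1) exp(-b*theta).
    theta_star = a / b
    log_num = a * math.log(theta_star) - b * theta_star
    sigma_num = math.sqrt(a) / b
    # The sqrt(2*pi) factors of the two approximations cancel in the ratio.
    return (sigma_num / sigma_den) * math.exp(log_num - log_den)


approx_mean = laplace_ratio_mean(10.0, 2.0)   # exact mean is 10 / 2 = 5
mode = (10.0 - 1.0) / 2.0                     # posterior mode, for comparison
```

In this toy case the approximation is much closer to the exact mean than the posterior mode is. In a mixed model the two maximizations are carried out jointly over dispersion parameters, fixed effects and random effects, which is where the difficulty reported above arises.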
At least in theory, Laplacian integration could have been used to obtain the marginal posterior distribution of parameters other than the dispersion parameters. However, this may be considered far too demanding, because each application of the Laplace expansion requires the maximization of one particular function involving all parameters except the one of interest. This is in contrast with some Monte-Carlo methods, such as Gibbs sampling, where the marginal distributions for all [...] situations, the separate consideration of all marginal densities is often not required, because estimated breeding values are point estimates mainly used to rank animals: when little information is available for the genetic evaluation, an accurate ranking of the candidates for selection is unrealistic. In the opposite case (precise estimation), the rankings based on, say, the mode or the mean of either the marginal or the joint posterior distribution are likely to be very similar. Marginal posterior densities of nonlinear functions of parameters can also be calculated (Wong and Li, 1992).
Marginalization based on Laplacian integration has been shown to give excellent results in standard situations. For many nonlinear applications, assessing the quality of the saddlepoint approximation would have to rely on the comparison of the approximate marginal distribution of the dispersion parameters with the actual distribution obtained via Monte-Carlo simulations. The exceptional situation studied here, where an exact algebraic integration of a log-gamma random effect is possible, permits a more straightforward comparison. It was found that the designs for which the two marginal posterior distributions (exact and approximate) depart from each other correspond to situations where the quantity of information available for the estimation of genetic parameters is quite limited. This means, in particular, that the saddlepoint approximation is likely to be unsuccessful for the estimation of the parameters of a frailty term used to describe an extra variation (overdispersion). However, one can still use algebraic integration of the random effects in the case of a gamma frailty component in a Weibull model.
ACKNOWLEDGMENTS
The first author thanks the INRA for making possible his stay at Cornell University and expresses his appreciation to Genex Cooperative, Inc, Ithaca, New York, for its support.
REFERENCES
Aalen OO (1994) Effects of frailty in survival analysis. Statist Meth in Med Res 3, 227-243
Abramowitz M, Stegun IA (1964) Handbook of Mathematical Functions. US Department of Commerce, National Bureau of Standards, 1046 p
Achcar JA, Bolfarine H (1986) Use of accurate approximations for posterior densities in regression models with censored data. Rev Soc Chil Estad 3, 84-104
Anderson JE, Louis TA, Holm NV, Harvald B (1992) Time-dependent association measures for bivariate survival analysis. J Am Stat Ass 87, 641-650
Berger JO (1985) Statistical Decision Theory and Bayesian Analysis. Springer-Verlag, New York, NY
Breslow NE, Clayton DG (1993) Approximate inference in generalized linear mixed models. J Am Stat Ass 88, 9-25
Clayton DG (1991) A Monte-Carlo method for Bayesian inference in frailty models. Biometrics 47, 467-485
Clayton DG, Cuzick J (1985) Multivariate generalizations of the proportional hazards model. J R Stat Soc, Series A 148, 82-117
Cox DR (1972) Regression models and life tables (with discussion). J R Stat Soc, Series B 34, 187-220