A Bayesian analysis of mixed survival models

(1)

HAL Id: hal-00894151

https://hal.archives-ouvertes.fr/hal-00894151

Submitted on 1 Jan 1996

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

A Bayesian analysis of mixed survival models

V Ducrocq, G Casella

To cite this version:

(2)

Original

article

A

_{Bayesian analysis}

of mixed survival modelst

V

_Ducrocq

₁

G Casella

2

1

Department of

Animal

_Science,

Cornell

_University;

2

Biometrics

_Unit,

Cornell

_{University, Ithaca,}

NY

_14852,

USA

(Received

13

_{August 1996; accepted}

1 October

₁₉₉₆₎

Summary - In

proportional

hazards

models,

the hazard of an animal

_A(t),

_ie,

its

probability

of

_dying

or

_being

culled at time t

_given

it is alive

prior

to _t,is described

as

_{A(t) =}

’>"

o

(t)e

W

’

e

where

_(t)

_o

_A

is a ’baseline’ hazard function and

e

w

’B

_representsthe

effect of covariates w on

culling

rate. A distribution can be attached to _{elements sq in}

0,

identifying,

for

_{example, genetic}

effects and

_leading

to mixed survival

models,

also called

_{’frailty’}

models. To estimate the parameters Tof the distribution of

frailty

_terms,a

Bayesian analysis

is

_proposed.

Inferences are drawn from the

_{marginal posterior density}

x(T)

which can be derived from the

_{joint posterior}

_density

via

Laplacian integration,

a

powerful technique

related to

_{saddlepoint approximations.}

The

_validity

of this

technique

is shown here on simulated

_{examples by comparing}

the

_{resulting approximate}

_x(

_T

₎

to the one

obtained

_{by algebraic integration.}

This exact calculation is feasible _{in very}

_specific

cases

only,

whereas the

_{saddlepoint approximation}

can be

_applied

to situations where

_Ao(t)

is

arbitrary

(Cox models)

or

_parametric

(eg, Weibull),

where the

frailty

terms are correlated

through

a known

_relationship

_matrix,or in more

_general

models with stratification

and/or

time-dependent

covariates. The influence of the

_censoring

rate and the data structure is

also illustrated.

survival _analysis

_/

mixed

_{model / variance}

component estimation

_/

Bayesian

analy-sis

_/

_proportionalhazards model

Résumé - Une analyse bayésienne des modèles de survie mixtes. Dans le cas des modèles à

risques proportionnels,

la fonction

de

_risque

d’un animal

_a(t),

c’est-à-dire sa

probabilité

de mourir ou d’être

_réformé

au temps t sachant

_qu’il

est vivant

_juste

avant t, a

la forme

A(t)

=

>’

o

(t)e

W

’o

(t)

o

où A

_est

_{une fonction}

de

_risque

« de basé»

et eW’o

_représente

l’e,f,fet

des covariables w sur le taux de

_réforme.

Une distribution peut être associée avx termes _S_{q de}

_{9, identifiant,}

_par

_exemple,

des

_{effets génétiques}

et conduisant à des modèles

*

Correspondence

and

_reprints:

Station de

_{g6n6tique quantitative}

et

_appliqu6e,

Institut national de la recherche

_agronomique,

78352

Jouy-en-Josas cedex,

France.

1

On leave from the Station de

_{g6n6tique quantitative}

et

_appliqu6e,

Institut national de

la recherche

_agronomique.

(3)

de survie mixtes, aussi

_appelés

modèles

_{de fragilité.}

Pour l’estimation des

_paramètres

T de

la distribution des termes

_aléatoires,

une

_{analyse bayésienné}

est

_proposée.

Les

_inférences

statistiques sont

_faites

à partir de la densité

_marginale

a _posteriori

x(

T

)

_{qui peut}être

obtenue à

_partir

de la distribution

_conjointe

a _posteriori_par

_{intégration laplacienne,}

une

technique

liée aux approximations

point-selles.

La validité de cette

_technique

est démontrée

ici

_{à partir d’exemples simulés,}

en _comparantles résultats de

l’approximation

de

₇

_r(

_T

₎

avec ceux obtenus

après intégration algébrique.

Cette dernière

correspond

à un calcul exact

réalisable uniquement dans des cas très

particuliers,

alors _que

_{l’approximation point-selle}

peut être

_appliquée

dans des situations où

_À

_o

_(t)

est

_{complètement}

arbitraire

_(modèles

de

_Cox)

ou

_{paramétrique}

_(par

_exemple,

de _type

_Weibull),

où les termes aléatoires sont

corrélés à travers une matrice de

_parenté

connue, ou avec des modèles

_{plus généraux}

avec

stratification

et/ou

covariables

dépendantes

du _temps.

_{L’influence}

du taux de censure et

de la structure des données est aussi illustrée.

analyse de données de survie

_{/ modèles}

mixtes

_/

estimation des _composantesde

variance

_/

_{analyse bayésienne}

_{/ modèle}

à risques proportionnels

INTRODUCTION

Traits associated with

_{longer productive}

life of livestock are

_{receiving increasing}

attention in the animal

_breeding

field: it is

_recognized

that

_{decreasing culling}

due

to the

_involuntary

causes

_(eg,

related to

_disease,

_infertility,

_lameness,

_etc)

_by

_genetic

or

_non-genetic

means has a

_positive

effect on economic

_{performance, mainly through}

decreased

replacement

costs

_(van

_Arendonk,

1986;

Strandberg,

1991,

Strandberg,

1995,

Strandberg

and

S61kner,

1996).

Huge

field data sets are

usually

available

for

_{comprehensive}

_analyses

of

_productive

_life,

for

_example,

as a

_by-product

of

the

dairy recording

schemes in

_dairy

cattle. The obvious

_methodology

of choice

for such studies is survival

_analysis,

in which proper

techniques

to deal with the

unavoidable _presenceof censored data have been

_{developed. However,}

statistical

complexity

and

computational

difficulties related to these methods have

_delayed

the

_adoption

of state-of-the-art

_methodology

and different indirect

_approaches

have been

proposed

(see

Strandberg

and S61kner

₍₁₉₉₆₎

for a

_review).

Some

large-scale

applications

(Smith,

1983;

Smith and

Quaas, 1984; Ducrocq, 1987; Ducrocq

et

_{al, 1988a,}

_b;

_{Ruiz, 1991; Fournet, 1992; Egger-Danner, 1993; Ducrocq,}

₁₉₉₄₎

as well as the

availability

of a software

specifically

written with animal

breeding

applications

in mind

_(Ducrocq

and

_S61kner,

₁₉₉₄₎

have demonstrated that the use

of less

_{appropriate approaches}

can be avoided.

The most

_popular

class of survival models is the class of

_proportional

hazards

models

_(Cox,

_1972;

Kalbfleisch and

_{Prentice, 1980; Lawless, 1982;}

Cox and

_Oakes,

1984).

The hazard of an animal

_(or

in the animal

_breeding

_context,

its risk of

being

culled)

at time t is described as the

product

of a baseline hazard function

.!o(t),

which is either left

_completely

_arbitrary

_{(Cox model)}

or has a

_parametric

form

_(eg,

_exponential,

Weibull or

_gamma)

and of a

_positive

term which is an

exponential

function of a vector of covariates w’

_{multiplied by}

a vector of

_regression

parameters

0.

Proportional

hazard models can be extended to include random

_{(eg, genetic)}

effects,

as in the

_regular

mixed linear models that are used for

_genetic

evaluations

worldwide. Mixed survival models are

_classically

referred to as

_{’frailty’}

models

_by

(4)

affects

_{multiplicatively}

the hazard of individuals or _groupsof animals. When a term

v

m is defined for each animal ’I!!I

_(!,,L(t,w)

=

v

m

.

À

(t, w)),

the

frailty

_component

extracts

_part

of the unobserved variation between individuals

_(Vaupel

et

_{al, 1979;}

Hougaard,

1986a,b;

Follmann and

_Goldberg,

_1988;

_Aalen,

₁₉₉₄₎

and therefore

allows for a correction of the

_possible

_discrepancy

between the true variance of

the observations and the one

specified

_by

the model. Such an extra variation is

referred to as

_{’overdispersion’}

_(Louis,

_{1991; Tempelman}

and

_Gianola,

_1994).

When

vq is defined for a _groupof

_individuals,

_eg,all

_daughters

of a sire _q,it describes the

shared unobservable

_(genetic,

in this

_case)

characteristics which act on the hazard

of each member of the _group

_(Clayton

and

_Cuzick,

_1985;

Anderson et

_al,

_1992;

Klein,

1992;

Klein et

_al,

_1992).

In all cases, the

simple

transformation s =

log

v

allows the inclusion of the

_frailty

term in the linear term w’O.

Traditionally,

a _gamma

(Clayton

and

_Cuzick,

_1985;

_Ducrocq,

_1987;

_Klein,

₁₉₉₂₎

distribution has been attached to the

_frailty

term v because of its

_flexibility

and

mathematical convenience. Other distributions have also been

_proposed,

_eg,a

positive

stable distribution or an inverse Gaussian distribution

_(Hougaard,

1986a,b;

Klein et

_al,

_1992).

_{Unfortunately,}

in all cases,

they

do not have the theoretical

appeal

of the

_{(multivariate)}

normal distribution

_commonly

used in animal

_breeding

when a infinitesimal

_polygenic

model is assumed.

_However,

it has been shown that

the estimates obtained for the

parameters

of the gamma distribution of v were

relatively large,

at least in

_{dairy cattle,}

which means that v had an

_approximate

log-normal

_{distribution, ie,}

s was

_{approximately}

_normally

distributed

_(Ducrocq,

_1987;

Ducrocq

et

_{al, 1988b; Ducrocq,}

_1994).

_Therefore,

it has been

_suggested

to account

for the

_genetic

_relationship

between animals

_{by assuming}

a multivariate normal

distribution for _s,the

_logarithm

of the

_frailty

term v

(Ducrocq,

_{1987; Korsgaard,}

1996).

Several

approaches

have been used to estimate the

_parameters

of the

_frailty

distributions. Klein

₍₁₉₉₂₎

and Klein et al

₍₁₉₉₂₎

suggested

the use of an EM

algorithm

(Dempster

et

_al,

_1977),

with iterative estimation of _{v, 0}and the baseline

cumulative hazard distribution for a Cox

_model,

followed

_by

the estimation of the

frailty

distribution

_given

0. When a Weibull model is combined with a _gamma

frailty

term,

Follmann and

_Goldberg

₍₁₉₈₈₎

showed that the

_frailty

term can be

algebraically

integrated

out from the likelihood function. The same

_property

has

been used in a

_Bayesian

context

_(Ducrocq,

_{1987; Ducrocq}

et

_{al, 1988b; Fournet,}

1992; Ducrocq,

1994).

Monte-Carlo

_techniques

have also been

_suggested

in order to

obtain the

_marginal

_posterior

distributions of the

_{hyperparameters}

_(Clayton,

_1991;

Dellaportas

and

_{Smith, 1993; Korsgaard,}

₁₉₉₆₎

but their use on

_large

data sets with

complex

models

_(eg,

with

_{time-dependent}

_covariates)

_maybe _verytedious. The

_objective

_{of this paper is} to

_present

a

_{general Bayesian}

_approach

to

the

_analysis

of mixed survival

models,

with

_(but

without

_being

restricted

_to)

typical

animal

_breeding

situations in mind. The framework will be

_presented

for

a

_simple

Weibull model with two

_types

of

_priors

for the

_frailty

term

_(gamma

or

log-normal).

Straightforward generalization

to other models

_(with

stratification and

_{time-dependent covariates,}

Cox

_models)

will follow. A

_particular

_strategy

for estimation of the

_{hyperparameters}

suitable for

_{large applications,}

complex

models and situations where a

_relationship

matrix is used will be

_presented

and

(5)

METHODS

In the Weibull

_regression

_case,the baseline hazard function has the Weibull form

A

o

(t) =

A/9(A!!.

For the time

_being,

we will assume that all covariates are

time-independent

and that

_only

one baseline is defined

_{(no stratification).}

The vector 0

includes fixed and random effects. For

_clarity,

and unless

_{specified otherwise, only}

one random effect in the

_model,

eg, a sire effect s is considered here.

_Using

the classical linear mixed-model notation:

where

₁₃

is the vector of fixed effects.

The hazard function

_A(t)

for animal m is:

and _p

_{log A}

can be

_incorporated

in a

grand

mean

(or

_any

factor)

in

w!

0.

For

_simplicity,

we will write from now on:

using

the same notation but

_keeping

in mind that a

_component

of

_w£

0

(represent-in

g

an

_intercept)

now includes

_p

_lo

_g

_A

_.

m

If the record comes from a

_daughter

m of sire q, with observed failure at

T

m

:

Here, vq =

esq is the

_frailty

term. The usual

_relationship

_{f (t)}

=

A(t)S(t)

where

S(t)

=

J

0

A(u)

du can be used to show that

_[3]

is a

_particular

case of a

_log-linear

model of the form

_(Kalbfleisch

and

_Prentice,

_1980):

where um follows an extreme value distribution

_(Kalbfleisch

and

_{Prentice, 1980;}

Lawless,

1982)

whose variance is

_equal

to

_!r2/6.

Note that here um

implicitly

includes

_{three-quarters}

of the additive

_genetic

variance. With this

_{presentation,}

a natural definition of the

_heritability

of the survival trait on the

_logarithmic

scale

(6)

Formula

_[6]

solves the

problem

of a _properdefinition of

_heritability

for survival

traits indicated in

_Ducrocq

₍₁₉₈₇₎

and

_Ducrocq

et al

_(1988b).

Prior distributions

Gamma

_frailty

model

Assume:

vq

N

gamma(

T

,

7)

ie,

sq -

(generalized) log-gamma(-

/

,-y)

The

_log-gamma

distribution

_(Bartlett

_{and Kendall, 1946,}

_according

to Lawless

(1982),

p

21)

corresponds

to the distribution of

_logx

when x follows a _gamma

distribution. Note however that the suffix

_’log-’

_(eg,

in

_{’log-normal’)}

is often

_given

to the distribution of x when

_{log x}

has a known form

_(eg,

normal).

_Again,

the

choice of this

_prior

distribution is

_mainly

related to its

_flexibility

and mathematical

convenience

_(see

also

_Klein,

_1992,

and Klein et

_al,

_1992).

Then:

Log-normal

frailty

model

In

_{quantitative genetics,}

due to the infinitesimal

_polygenic

model

_{usually assumed,}

it is more natural to consider the

_following

prior

distribution for the

_frailty

term:

and if sires are related:

where A is the

_relationship

matrix between

sires,

we have

Hyperparameters

In order to

_{simultaneously}

consider the two

_previous

_cases,we will denote the

dispersion

parameter

of the random effect distribution

_by

T

(with

T=

y or T =

os

2 )

(7)

Likelihood construction

Conditionally

on 0 and _p,the contribution to the likelihood of animal m which fails

(8

m

=

1)

or is censored

_(6&dquo;,

=

0)

_attime y

m , is:

where

_S(t)

is the survivor function at time t. For the Weibull

_model,

these two

components

are:

Combining

all these contributions

_(for

m =

_{1, ... ,}

N)

which _are

conditionally

independent,

we obtain:

where

_{unc}

and

_{{cens} represent}

the sets of indices m

_{corresponding}

to uncensored

and censored

_{records, respectively.}

Joint

posterior

density

Applying Bayes’

theorem,

we obtain:

(8)

Inference on 0 and _p

If we assume that T is

_known,

the

_logarithm

of the

_{joint posterior density}

of

Using

the same notation as in

_Tempelman

and Gianola

_(1993),

let

₈

_r

be the

mode of this

_{joint posterior}

_density:

At the

_mode,

the

_gradient

vector is null:

For latter _use,we also need to define the

_negative

Hessian matrix:

Joint inference on

_(3,

_pand T

Consider here the

_particular

case of the gamma

frailty

model,

where the

ran-dom effect s has a

_log-gamma

distribution

(

T

=

&dquo;y; this

implies

that the

genetic

relationship

between sires is

_ignored).

Then the

_marginal

_posterior

_density

of

0,

p and T is obtained

by

integrating

out s from the

_{joint posterior}

_density

p(e,p,T

I Y) = P((), P, 7 1

y

):

Grouping

the contributions to the likelihood of all

_daughters

of each sire _q:

(9)

where now

_{func, q}}

and

_{{cens, q}}

are the sets of indices m of the _nquncensored

and the censored

_daughters

of sire q,

respectively.

Writing

e!’1°

=

e

x

;&dquo;{3

e

Sq

for all

daughters

of sire

q, one can factor out the terms

which do not

_depend

on _{sq, which}leads to:

with:

and:

Each of these

_products,

for q =

1, ...

N

9 ,

is of the form:

The term under the

_integral

can be

_recognized

as the kernel of a

_log-gamma

distribution with

_parameters

_(n

₉

+

_&dquo;

_Y

₎

and

_(Qq

+

_-!).

_Therefore,

Hence,

the

_integration

of the random effects _sqout of the

_{joint posterior density}

can be done

_{algebraically:}

or:

Expressions

[28]

and

_[29]

are

_essentially

those used in

Ducrocq

(1987),

Ducrocq

et al

_(1988b)

and

_Ducrocq

₍₁₉₉₄₎

for the estimation of the sire variance of the

length

of

_productive

life of

_dairy

cows. Follmann and

Goldberg

(1988)

referred to

(10)

be estimated as the mode of this

_posterior

distribution:

with associated

_negative

Hessian matrix

H.

Inference on T

Inferences on the

_dispersion

_parameter

Tshould be based on its

_marginal

_posterior

distribution,

after

_integrating

out the nuisance

parameters

0 and p

(Berger,

1985;

Robert,

1992):

or:

J J

Except

in trivial cases, this

integration

cannot be

_{performed algebraically.}

To

obtain the

_marginal

_posterior

distribution of the

_dispersion

_parameter

T, one can

either simulate random

samples

from it

_(Clayton,

_1991;

_Dellaportas

and

_Smith,

1993;

Korsgaard,

1996),

compute

the

_{integral numerically}

_(Smith

et

_al,

₁₉₈₅₎

or find an

_{approximation.}

We will choose the third

_alternative,

using

a

technique

known as

_Laplacian

_integration

(Tierney

and

Kardane,

_1986;

Achcar and

Bolfarine,

_1986;

Tierney

and

_Kardane,

_1986;

_Tierney

et

_{al, 1989;}

_Tempelman

and

_{Gianola, 1993;}

Goutis and

_Casella,

_1996).

For any

given

value T* of T, we want to

approximate:

Intuitively,

ifp(0,p ! y,r)

=

p(6

T

-

*

)

is

unimodal,

the value of the

integral

will

heavily depend

on the value of the

density

at its mode

6r* .

_Then,

_using

the first

terms of a

_Taylor

series

_expansion

of

_logp(6!*)

around this mode and

_noticing

that

, 7 p

,.

(%

r

* )

=

_0,

_wehave:

The determinant

part

in the last

equation

is obtained

_{by recognizing}

the kernel of

(11)

This results in an

_{approximation}

of the

_{marginal posterior}

_density

which is similar

to what is described in the statistical literature as a

_{saddlepoint approximation}

of

this

_density

_(Daniels,

_{1954; Reid, 1988; Kolassa, 1994;}

Goutis and

_Casella,

_1996).

Taking

the

_logarithm

on both

_sides,

we

_get

the

_following

_{approximation:}

An obvious

_point

estimate of T is T at the mode of this

_approximate

_marginal

posterior

density:

However,

the use of

[34]

is not limited to the

_computation

of its mode. Other

point

estimates or other

_types

of inferences

(credible

sets or

_hypothesis

_testing,

etc

(Berger,

1985; Robert,

1992))

can be derived from the

_knowledge

of the full

_marginal

posterior

density. Repeated

computations

of

_(34!,

and in

_particular

of the

_negative

Hessian matrix

H,

for many different values of T _may

_quickly

become too

_heavy,

though.

We propose to summarize the

_general

characteristics of the distribution

_[34]

through

the

_computation

of its first three moments

_by

unidimensional numerical

integration

based on Gauss-Hermite

_quadrature.

To obtain a more

_precise

estimate

of these moments after

_quadrature,

the iterative

_strategy

_{proposed by}

Smith et

al

₍₁₉₈₅₎

is

_{implemented. Using}

initial values of the mean and the variance of

the distribution of

_log

_T

_(to

force the

_integration

domain to be

_(—

₀₀

_,+

₀₀

_)),

the

integration

variable is standardized. New estimates are obtained

by quadrature

and the standardization is

_repeated.

After a few

_iterations,

this

_strategy

ensures

that the

_quadrature

rules are

_applied

in an

_{appropriate region}

of the function to

integrate.

Details are

_given

in the

_Appendix.

The results can be used to obtain

a second

_{approximation}

of the

_marginal

_posterior

_density

based on its first three

moments.

_Using

an

_expression

known as the Gram-Charlier series

_expansion

of a

function

_{f (!)}

of a variable x with moments _p,

2

0 &dquo;

and !c, we have

(McCullagh,

1987):

where

_§(z)

is the

_density

of a normal distribution with mean _!,and variance

Q

2

and z =

(x - p)lo,.

Other situations

Cox model

The

_application

of the

_{saddlepoint approximation}

to obtain the

_{marginal posterior}

density

of the

_dispersion

parameter

of the random effect is not restricted to the

Weibull

_regression

model. It can be

_applied,

at least in

_theory,

_{to any}

_{joint posterior}

density.

For

_example,

in the case of a Cox mixed

_model,

for which the baseline

hazard function

_{Ao (t)}

is assumed to be

_{completely arbitrary,}

_p(0,-*

_{!, y, T}

_*

₎

and the

(12)

likelihood function in

_[16]

_by

the

_partial

likelihood function

_{initially proposed by}

Cox

_(1972):

where the

_T!2!’s

are the distinct observed failure times and

_{Risk(T!Z! )}

is the set of

individuals

at risk at time

_T

_[i]

_,

_ie,

alive

_{just prior}

to

_7!.

_{Then, assuming}

that T

is

_known,

the estimate of 0 to be used in

_[34]

is obtained from the

_joint

_posterior

density

as:

Stratification.

_{Time-dependent}

covariates

Stratification and the use of

_{time-dependent}

covariates are common

_approaches

to accommodate situations for which the

_proportional

hazards is not valid for all

effects or

_throughout

the whole time range. As for the Cox

_model,

the main

_changes

with

_respect

to the situation described so far occur in the

computation

of the

likelihood and its derivatives and do not interfere with the

_validity

of the

saddlepoint

approximation.

For

_example,

if the covariates in

_{b f mw.&dquo;,,}

are

_{step-functions}

of time with

_changes

at times

_{cp,&dquo;,,,i,}

i =

_{0, ...I}

with _W_,,,_o = 0 and

<!m,7 =

Ym

, then wm is

piecewise

constant on intervals

(cp,,&dquo;,,

i

,

and

i+1

cp&dquo;,,,

the

_expressions

to use in

[12]

are:

In the case of

_{stratification,}

the hazard function

_A(y,,,)

and the survivor function

S(

Ym

)

include

_{parameters p}

and p

log A

(the

’intercept’

in

_w£0

in

_!1!)

_specific

to

the relevant stratum.

ILLUSTRATION

In order to illustrate the

approach

described above for the estimation of

_dispersion

parameters

of the random effects in

_{frailty models,}

simulated data were

generated

based on a Weibull model with a random effect

(that

will be referred to as a

sire

_effect)

and

_mimicking

the data structure that is often encountered in animal

breeding

situations. The

_objective

was to assess the

_quality

of the

_saddlepoint

approximation

by

comparing

the exact

_marginal

_posterior

distribution of the

variance

_parameter

of the sire effect

_{( !28!}

obtained via

_algebraic

_integration)

with its

approximation

(!34!

after

_Laplacian

_{integration).}

This

_comparison

was done under

the

_following

conditions: a

_log-gamma

distribution

[8]

was considered as a

_prior

for

(13)

fixed effect

₍₁₃

=

ftthe

grand

mean )

was

_included;

and it was assumed that in

_!28!,

we have:

Preliminary

examination of

_[43]

showed that in all cases

_studied,

the

_density

_[43]

was

_virtually

identical to the

_{approximate density}

_{p(-y y)}

after

_integrating

out _/t

and _p

_by

_{Laplacian integration.}

In other

_words,

what was

_actually

_compared

here are two

_approximate

densities obtained after

_Laplacian

_integration

of _/_{-t, p}and

_Nq

sire

_{effects s9}

in one _case,of p and p

(with

_algebraic

_integration

of the

sq’s)

in the

other case.

The

_general

behavior of the

_{saddlepoint approximation}

of the

_marginal

_posterior

density

of the sire variance was also examined under a

_variety

of situations

(different

types

of

_censoring,

of unbalanced

_structure,

with a multivariate normal

_prior,

with

relationships

between

sires, using

a Cox

_model,

_etc).

Simulation

_strategy

In all situations

_(unless

_specified

_otherwise),

5 000 records were

_generated

_using

the

following

Weibull hazard function:

where

_A

_jk

_{q (t)}

represents

the hazard at time t of the

_jth

animal

_(j

=

_{1, ...}

5 000/Nq)

under the influence of the kth level of a fixed

_effect,

hereafter referred to as the ’herd’

effect

_(k

=

1,...

K)

and

daughter

of the

qth

sire

(q

=

1,...

N

Q

).

Values

p, _ -11 1

and _p= 1.5 _wereused in all _casesdescribed

_{here, corresponding}

_to_an

average

failure time of about 1800. For the

_comparison

between

_Laplacian

and

_algebraic

integrations,

it was assumed that K =

0, ie,

!3!

= 0 and the sire effects

sq were

generated

from a

_log-gamma

distribution with

_parameter

_’Y= 50. This

corresponds

to a variance of _sq

_equal

to

_{1}i(1) C’Y) !}

_0.02,

where

is the

_trigamma

function evaluated at _y.

_{Using expression}

_!6!,

we

_get:

which is in the

_typical

_rangeof

_heritability

values encountered for this kind of trait.

When a normal distribution was

_assumed,

a sire variance of 0&dquo; = 0.02 was retained

to

_generate

the sire effects. When herd effects were used in model

[44]

(K

>

_0),

these were

_{arbitrarily generated}

from a uniform

!-2, 2!

distribution.

Two different

_censoring

schemes were simulated. In

_censoring

_type

A,

all

gen-erated records

_greater

than a

_given

value

_C

_A

were considered as censored at

_C

_A

_.

The value of

_C

_A

was chosen

_by

trial and error in order to obtain a

_{given proportion}

of censored records.

_Censoring

_type

B tried to mimic an

_overlapping

_generations

(14)

to

C

B

when their simulated failure time was

_greater

than

C

B

.

The

_daughters

of the

following

batch

_{(also 10%)}

of sires were considered as censored when their failure

time was

_greater

that

_2C

_B

_,

and so on. The

_censoring

time for the last

10%

was

lOC

B

.

Therefore,

the

_daughters

of the first group of sires were

_heavily

censored

(’young

daughters

of young

sires’)

while the

_proportion

of censored records for the last _groupwas small

(’daughters

of old

sires’).

Again,

C

B

was determined

by

trial

and error.

Different unbalanced situations were also simulated. In scheme

_U1,

the

_daughters

of 100 sires

_(with

50

_daughters

_each)

were distributed over 505

_herds,

five with 500

animals and 500 with five

_daughters.

In scheme

U2,

half of the animals

_{(2 500)}

were

assumed to be

_daughters

of five sires with 500

_daughters

each while the other half

were

daughters

of 500 sires with five

daughters

each. These animals were

randomly

distributed over 100 herds.

_Finally,

in scheme

_U3,

the

_daughters

of the 50 ’best’

sires

_(with

50

_daughters

_each)

were raised in the ’best’ 50 herds

(where

’best’ means

lowest relative

_culling

_rate)

while the

_daughters

of the ’worst’ 50 sires were raised

in the ’worst’ herds.

To

_study

the

_impact

of the existence of

_genetic

_{relationships}

between

_individuals,

data were

generated according

to a model

slightly

different from

[44].

First,

the effects sg, of ten

_grandsires

_(’sires

of

_sires’)

were

_generated

from a normal

distribution with mean 0 and variance

a 2/4

(with

0 &dquo;;

=

0.02).

For each of

them,

ten sire _{effects sq}were obtained

_{by adding}

to _sg,a

_normally

distributed random

effect with variance

_3o!/4.

_Finally,

50 records of

_daughters

of each of these sires

were simulated

according

to the model:

where _r_j

_represents

the

_remaining

additive

_genetic

effect for the

_jth

animal and

was

_generated

from a normal distribution with mean 0 and variance

3 Q9

,

leading

to records with a

_global

additive

_genetic

variance

_equal

to

Q

a

=

4cr!.

These

data

were

_analyzed

and the

_marginal

posterior

density

of the sire variance

component

was obtained under three different

_genetic

models: two sire models identical to

[44]

assuming

no

_{relationships}

between sires

_{(case Sl)}

or

_including

the

_relationship

matrix between sires

_{(case S2),}

and an ’animal’ model

(case An),

describing

the

individual additive

_genetic

effect _a_j of each

_{animal j}

_{and including}

the

_complete

relationship

matrix between the 5 110 animals

_{(5 000}

with records + 100 sires + 10

grand-sires):

All

_computations

were done

_using

the ’Survival

_Kit’,

a set of Fortran programs

developed by

Ducrocq

and S61kner

_(1994).

The ’Survival Kit’ was

_specifically

written to

_{efficiently analyze}

the very

large

field data sets encountered

_by

animal

breeders and

_implements

all the features described in this _paperwith Weibull and Cox

models,

possibly

with

strata,

time-dependent

covariates and random effects.

In

_particular,

the maximization of the

_expressions

_[18]

or

[29]

is based on a limited

memory

quasi-Newton

method

(Liu

and

Nocedal,

1989)

which

only

requires

the

computation

of the vector of first derivatives of

_[18]

or

!29!.

If

_required

(for

_example,

in

_[36]

or when

_{computing asymptotic}

standard

_errors),

the

_negative

Hessian is

(15)

1994)

are used to

_compute

the determinant or the inverse of this

_negative

Hessian

in the Weibull case.

Results

Laplacian

integration

vs

_Algebraic

_integration

Figure

1

_represents

the

_marginal

_posterior

distribution obtained after

_integrating

out the sire _{effects s}₉_fromthe

_{joint posterior}

_{distribution,}

either

_{algebraically}

or

_using

the

_Laplacian

_{approximation.}

All records were uncensored. In the three

samples presented here,

the true

_{value q}

= 50 is

obviously

included in any

reasonable HPD credible set. When there were few sires with many

daughters

each,

the two

_computed

forms of the

_{marginal posterior}

distribution were

_virtually

indistinguishable.

When little information was available for each sire effect

_(ten

daughters

each in the 500 sires

_case),

the

_marginal

posterior

distributions were

rather

_flat,

with a

long

tail towards

large

values

of q

(ie,

small sire

variances).

The

agreement

between

_Laplacian

and

_algebraic

_integration

was not as

_{good, although}

the modes of the two distributions were close. With even less information _per

sire

_(five

_daughters

or less _per

sire),

neither of the two

_{marginalization techniques}

worked in most of the cases: the mode of the distribution or its first moments could

(16)

Effect of

_censoring

Figure

2

_presents

_again

the result of the same two

_{marginalization approaches,}

for

100 sires with 50

_daughters

each but under

censoring

schemes A and

_B,

with in both

cases a

_proportion

of

50%

censored records

(C

A

= 1200 and

C

B

=

270).

Clearly,

censoring

had little effect on the

_quality

of the

_{approximation}

when the

Laplacian

integration

was used.

However,

because the amount of information available to

estimate a rather small sire variance was

drastically

reduced,

it was not

_always

possible

to obtain a well-defined

posterior

density

(see

Breslow and

Clayton

(1993)

for similar results in the context of

_generalized

linear mixed

_models).

For

example,

in

figure 2,

the

_posterior

_density

in the case of

_censoring

scheme A does not

_integrate

to

1. The same

phenomenon

also occurred for some

_samples

with

_censoring

scheme B.

Interestingly,

when sire effects with a

_larger

variance ₇ = 10 _were

simulated,

which

corresponds

to an

heritability

of

0.24,

even extreme situations with more than

80%

censored records

_(with

C

A

=

520)

led _to

well-defined,

very

peaked

posterior

densities.

Normally

distributed random

effects

Having

shown the

_validity

of the

_saddlepoint

_{approximation}

of the

_marginal

(17)

effects and with 100

_(fixed)

’herd’ effects.

_Figure

3

_displays

the

_marginal

posterior

density

for ten such

_samples,

with 100 sires and no

_censoring.

The obtained

distributions were not as skewed as in the case of a

log-gamma

distribution. At

least in the

_examples

studied,

the true value 0.02 was

_always

in _anyHPD credible

set. Note however that the variance of these densities were

_quite

_large

(standard

deviations between 0.0049 and 0.0079 for a true

_parameter

value of

_0.02).

Effect of unbalancedness

When unbalancedness was induced

_{by simulating}

both _very

large

and small herds

(case Ul),

the effect on the

_marginal

posterior

density appeared

to be minimal

(fig 4).

When a

_{large heterogeneity}

was created in the number of

daughters

per sire

(case

U2),

the main consequence was a less

precise

estimation of the sire variance.

The most

_{negative impact}

was observed when the animals were not

_randomly

distributed across herds

_{(case U3).}

It seems that a

_part

of the favorable influence of

the best sires on the survival of their

daughters

was attributed to the herd

_effects,

resulting

in a sire variance

_strongly

biased downwards.

Including

a

_relationship

matrix

The two

_marginal

_posterior

densities obtained under a sire model with or without

inclusion of the true

relationship

matrix between sires were _verysimilar

(fig 5).

As may have been

expected,

the inclusion of the

relationship

matrix

slightly

(18)

(19)

the records of related animals are more

_similar,

hence

globally

less variable. In all

the

_{samples simulated,}

the animal model

_consistently

led to a

_slight

overestimation

of the sire variance: the

_marginal

posterior

density

in the case of the animal model

was

systematically

to the

_right

of those for the two sire models. This may be

attributed,

at least in

_part,

to the fact that a much

_larger

number of

_parameters

have to be

_integrated

out with an animal model than with a sire model. Such a

problem

has been

_pointed

out for

_{example by Mayer}

₍₁₉₉₅₎

in the context of a

threshold model. The

Laplacian integration probably

does not

_perform

as well in

such a case.

Note, however,

that this _maybe worsened

by

the fact that

only

a

very

simple pedigree

structure was simulated here. In

particular,

no information at

all was assumed to be available on the female side. The sire model used does not

account for the

overdispersion

implicitly

created

_by

the effect _r_j_,which

_represents

three-quarters

of the total additive

_genetic

variance. An

_attempt

to fit a model

similar to

_[46]

_assuming

a

_log-gamma

_prior

distribution for _r_j and

performing

the

algebraic

integration

of rj led to a

marginal posterior density

of the sire variance

similar to that obtained with the two sire models and a _very

large

estimate

(q

> 400

at the

_mode)

for the gamma

parameter,

synonymous of a _verysmall variance for the

r

j

’s.

This is

_likely

the result of the lack of information available for the estimation

for q

that was

already

illustrated in

figure

1.

Cox model vs Weibull model

When a

_parametric

(Weibull)

or

semi-parametric

(Cox)

model was used in the

construction of the likelihood

_function,

it was

_repeatedly

observed that the

resulting

marginal posterior

densities of

_a

_were

very similar

(fig 6),

with often a

slightly

larger

variance in the case of the Cox model. It is not known if similar results

would have been obtained had the data been

_generated

_assuming

a baseline hazard

function different from the Weibull hazard.

Approximation

of the

_{marginal posterior density}

of T based on its first

three

moments

The first three moments of the

_marginal

posterior

density

of the

_parameter

T

were

_computed

_by

numerical

_integration

of

[34]

using

a

_five-point

Gauss-Hermite

quadrature

formula and after standardization of the function to

_integrate.

New

standardization factors were obtained and the

procedure

was

repeated

until the

computed

moments

stabilized,

which

_usually

occurred after

_only

three iterations.

Figure

7 illustrates the fact that the

knowledge

of these moments leads to a

reasonable

_{approximation}

of the

_marginal

_posterior

_density

of T

DISCUSSION AND CONCLUSION

Bayesian

analysis

offers a coherent framework for the otherwise unclear

problem

of

variance

_components

estimation in mixed nonlinear models

_{(Ducrocq, 1990):}

all the

elements for inferences on

dispersion

_parameters

are contained in the

marginal

pos-terior distribution of these

_parameters

and the construction of the latter is based

(20)

(21)

proposed

for

_categorical

data

_(Foulley

et

_{al, 1987;}

H6

5 chele

et

_{al, 1987;}

_Foulley

et

_al,

₁₉₈₉₎

and for Poisson mixed models

_(Tempelman

and

_Gianola,

_1993).

In

this _paper,a

general

approach

for

_genetic

evaluation and estimation of

dispersion

parameters

for Weibull and Cox mixed models was described. Its main attractive

features are its

_generality

and its

computational feasability,

even for _very

_large

applications.

As an

example

of the

latter,

the

largest analysis

that we have carried

out involved the estimation of the mode and the first three moments of the

_marginal

posterior

distribution of the sire variance

_component

for the

_length

of

productive

life of 633 516 Holstein cows,

daughters

of 3 613 related sires. The Weibull mixed

model used was

_{quite complex}

and included

_{time-dependent}

effects such as a

herd-year-season effect

(with

82 713

levels,

assumed to be

_randomly

distributed with a

log-gamma

distribution),

a lactation number x

_stage

of lactation

_effect,

a herd size

effect and a

_year-to-year

variation in herd size effect as well as continuous linear

and

_quadratic

effects of covariates such _ageat first

_calving,

_milk,

fat and

_protein

yield.

_proportional

hazards models such as stratification or the use of

_{time-dependent}

covariates

complicate

the actual likelihood

_computations

but

do not interfere with the

_{marginalization}

_procedures

described here. The inclusion

of

_{genetic relationships}

between individuals is

_{straightforward through}

the use of an

_{appropriate prior}

distribution. Other

_prior

distributions

(including

informative

priors)

or other

parametric

baseline hazard functions could have been

_{incorporated.}

More

_complex

_genetic

structures

_(eg,

with maternal

_effects)

can be fitted. When

more than one random effect is considered in the

model,

the

_{approximation}

described here leads to the

_{joint marginal posterior}

of all the

_dispersion

_parameters

for all random effects. Further

_{marginalization}

can be

performed numerically along

the lines described in the

_Appendix

for the calculation of the moments of the

marginal

posterior

distribution but this _maybe considered too

_costly.

In the case of a Weibull mixed model with two random

_effects,

one of them

_having

a

_log-gamma

distribution,

the

_possibility

of

_integrating

out the latter

_{algebraically}

avoids this

difficulty.

Laplacian

integration

can be

_applied

to other situations too. For

_example,

Tier-ney and Kadane

(1986)

and

Tierney

et al

(1989)

suggested

the direct

computation

of the mean of the

_{marginal posterior}

density

_using

second-order

_{approximation}

formulae. These formulae were derived

_applying

_{Laplacian integration}

to both the

numerator and the denominator of a ratio of

_integrals.

_However,

this

_requires

the

maximization of the

_{joint posterior}

_density

for the

_dispersion

parameters,

the fixed

effects and the random effects. This

_approach

failed when we

_attempted

it as the

maximization

_procedure

led to

_dispersion

parameters estimates

_{corresponding}

to

random effects with null variance. The same

_phenomenon

had been described

pre-viously

in similar situations

_(Tempelman

and

_Gianola,

_1993).

At least in

_theory,

_{Laplacian integration}

could have been used to obtain the

marginal posterior

distribution of

parameters

other than the

_dispersion

_parameters.

However,

this _maybe considered far too

_demanding,

because each

_application

of the

Laplace

expansion requires

the maximization of one

_particular

function

_involving

all

_{parameters except}

the one of interest. This is in contrast with some

Monte-Carlo

_methods,

such as Gibbs

_sampling,

where the

_marginal

distributions for all

(22)

situations,

the

_separate

consideration of all

_marginal

densities is often not

_required,

because estimated

_breeding

values are

_point

estimates

_mainly

used to rank animals:

when little information is available for the

_{genetic evaluation,}

an accurate

_ranking

of

the candidates to selection is unrealistic. In the

_opposite

case

_(precise

estimation),

the

_rankings

based on, say, the mode or the mean of either the

marginal

or the

_joint

posterior

distribution are

likely

to be very similar.

Marginal

posterior

densities of

nonlinear functions of

_parameters

can also be calculated

_(Wong

and

_Li,

_1992).

Marginalization

based on

_{Laplacian integration}

has been shown to

_give

excellent

results in standard situations. For many nonlinear

applications,

the

quality

of the

saddlepoint

approximation

would have to

_rely

on the

_comparison

of the

_approximate

marginal

distribution of the

_dispersion

_parameters

with the actual distribution

obtained via Monte-Carlo simulations. The

exceptional

situation studied here where

an exact

_{algebraic integration}

of a

_log-gamma

random effect is

_{possible permits}

a more

_{straightforward}

_comparison.

It was found that the

_designs

for which the

two

_marginal

_posterior

distributions

_(exact

and

_approximate)

_depart

from each

other

_correspond

to situations where the

_quantity

of information available for the

estimation of

_genetic

parameters

is

_quite

limited. This means, in

particular,

that

the

_{saddlepoint approximation}

is

_likely

to be unsuccessful for the estimation of the

parameters

of a

_frailty

term used to describe an extra variation

_{(overdispersion).}

However,

one can still use

_{algebraic integration}

of the random effects in the case of a _gamma

_frailty

_component

in a Weibull model.

ACKNOWLEDGMENTS

The first author thanks the INRA for

_{making possible}

his stay at Cornell

_University

and

expresses his

appreciation

to Genex

_{Cooperative, Inc, Ithaca,}

New York for its _support.

REFERENCES

Aalen 00

₍₁₉₉₄₎

Effects of

_frailty

in survival

_analysis.

Statist Meth in Med Res

_{3, 227-243}

Abramowitz

M, Stegun

IA

₍₁₉₆₄₎

Handbook

_of

Mathematical Furcctiorcs. US

_Department

of

Commerce,

National Bureau of

Standards,

1046 p

Achcar

_JA,

Bolfarine H

₍₁₉₈₆₎

Use of accurate

_{approximations}

for

_posterior

densities in

regression

models with censored data. Rev Soc Chil Estad

_3,

84-104

Anderson JE, Louis

_TA,

Holm NV, Harvald B

₍₁₉₉₂₎

_{Time-dependent}

association

mea-sures for bivariate survival

_analysis.

J Am Stat Ass

87,

641-650

Berger

JO

₍₁₉₈₅₎

Statistical Decision

_Theory

and

_{Bayesian Analysis. Springler-Verlag,}

New

_York,

NY

Breslow

_NE,

_Clayton

DG

₍₁₉₉₃₎

_Approximate

inference in

_generalized

linear mixed models. J Am Stat Ass 88, 9-25

Clayton

DG

₍₁₉₉₁₎

A Monte-Carlo method for

_Bayesian

inference in

_frailty

models. Biometrics

_47,

467-485

Clayton DG,

Cuzick J

₍₁₉₈₅₎

Multivariate

generalizations

of the

proportional

hazards model. J R Stat

_Soc,

Series A

148,

82-117

Cox D

₍₁₉₇₂₎

_Regression

models and life table

_{(with discussion).}

J R Stat

_Soc,

Series B

34, 187-220