HAL Id: hal-00894092
https://hal.archives-ouvertes.fr/hal-00894092
Submitted on 1 Jan 1995

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Bayesian inference in threshold models using Gibbs sampling

Original article

DA Sorensen 1, S Andersen 2, D Gianola 3, I Korsgaard 1

1 National Institute of Animal Science, Research Centre Foulum, PO Box 39, DK-8830 Tjele, Denmark;
2 National Committee for Pig Breeding, Health and Production, Axeltorv 3, Copenhagen V, Denmark;
3 University of Wisconsin-Madison, Department of Meat and Animal Sciences, Madison, WI 53706-1284, USA

(Received 17 June 1994; accepted 21 December 1994)
Summary - A Bayesian analysis of a threshold model with multiple ordered categories is presented. Marginalizations are achieved by means of the Gibbs sampler. It is shown that use of data augmentation leads to conditional posterior distributions which are easy to sample from. The conditional posterior distributions of thresholds and liabilities are independent uniforms and independent truncated normals, respectively. The remaining parameters of the model have conditional posterior distributions which are identical to those in the Gaussian linear model. The methodology is illustrated using a sire model, with an analysis of hip dysplasia in dogs, and the results are compared with those obtained in a previous study based on approximate maximum likelihood. Two independent Gibbs chains of length 620 000 each were run, and the Monte-Carlo sampling errors of moments of posterior densities were assessed using time series methods. Differences between results obtained from the two chains were within the range of the Monte-Carlo sampling error. With the exception of the sire variance and heritability, marginal posterior distributions seemed normal. Hence inferences using the present method were in good agreement with those based on approximate maximum likelihood. Threshold estimates were strongly autocorrelated in the Gibbs sequence, but this can be alleviated using an alternative parameterization.

threshold model / Bayesian analysis / Gibbs sampling / dog

INTRODUCTION
Many traits in animal and plant breeding that are postulated to be continuously inherited are categorically scored, such as survival and conformation scores, degree of calving difficulty, number of piglets born dead and resistance to disease. An appealing model for genetic analysis of categorical data is based on the threshold-liability concept, first used by Wright (1934) in studies of the number of digits in guinea pigs, and by Bliss (1935) in toxicology experiments. In the threshold model, it is postulated that there exists a latent or underlying variable (liability) which has a continuous distribution. A response in a given category is observed if the actual value of liability falls between the thresholds defining the appropriate category. The probability distribution of responses in a given population depends on the position of its mean liability with respect to the fixed thresholds. Applications of this model in animal breeding can be found in Robertson and Lerner (1949), Dempster and Lerner (1950) and Gianola (1982), and in Falconer (1965), Morton and McLean (1974) and Curnow and Smith (1975) in human genetics and susceptibility to disease.
Important issues in quantitative genetics and animal breeding include drawing inferences about (i) genetic and environmental variances and covariances in populations; (ii) liability values of groups of individuals and candidates for genetic selection; and (iii) prediction and evaluation of response to selection. Gianola and Foulley (1983) used Bayesian methods to derive estimating equations for (ii) above, assuming known variances. Harville and Mee (1984) proposed an approximate method for variance component estimation, and generalizations to several polygenic binary traits having a joint distribution were presented by Foulley et al (1987). In these methods, inferences about dispersion parameters were based on the mode of their joint posterior distribution, after integration of location parameters. This involved the use of a normal approximation which, seemingly, does not behave well in sparse contingency tables (Höschele et al, 1987). These authors encountered difficulties when the number of observations per combination of fixed and random levels in the model was smaller than 2, and suggested that this may be caused by inadequacy of the normal approximation. This problem can render the method less useful for situations where the number of rows in a contingency table is equal to the number of individuals. A data structure such as this often arises in animal breeding, and is referred to as the 'animal model' (Quaas and Pollak, 1980).
Anderson and Aitkin (1985) proposed a maximum likelihood estimator of variance components for a binary threshold model. In order to construct the likelihood, integration of the random effects was achieved using univariate Gaussian quadrature. This procedure cannot be used when the random effects are correlated, such as in genetics. Here, multiple integrals of high dimension would need to be calculated, which is unfeasible even in data sets with only 50 genetically related individuals. In animal breeding, a data set may contain thousands of individuals that are correlated to different degrees, and some of these may be inbred.
Recent reviews of statistical issues arising in the analysis of discrete data in animal breeding can be found in Foulley et al (1990) and Foulley and Manfredi (1991). Foulley (1993) gave approximate formulae for one-generation predictions of response to selection by truncation for binary traits based on a simple threshold model. However, there are no methods described in the literature for drawing inferences about genetic change due to selection for categorical traits in the context of threshold models. Phenotypic trends due to selection can be reported in terms of changes in the frequency of affected individuals. Unfortunately, due to the nonlinear relationship between phenotype and genotype, phenotypic changes do not translate directly into additive genetic changes, or, in other words, into response to selection. Here we point out that inferences about realized selection response for categorical traits can be drawn by extending results for the linear model described in Sorensen et al (1994).
With the advent of Monte-Carlo methods for numerical integration such as Gibbs sampling (Geman and Geman, 1984; Gelfand et al, 1990), analytical approximations to posterior distributions can be avoided, and a simulation-based approach to Bayesian inference about quantitative genetic parameters is now possible. In animal breeding, Bayesian methods using the Gibbs sampler were applied in Gaussian models by Wang et al (1993, 1994a) and Jensen et al (1994) for (co)variance component estimation, and by Sorensen et al (1994) and Wang et al (1994b) for assessing response to selection. Recently, a Gibbs sampler was implemented for binary data (Zeger and Karim, 1991), and an analysis of multiple threshold models was described by Albert and Chib (1993). Zeger and Karim (1991) constructed the Gibbs sampler using rejection sampling techniques (Ripley, 1987), while Albert and Chib (1993) used it in conjunction with data augmentation, which leads to a computationally simpler strategy.

The purpose of this paper is to describe a Gibbs sampler for inference in threshold models in a quantitative genetic context. First, the Bayesian threshold model is presented, and all conditional posterior distributions needed for running the Gibbs sampler are given in closed form. Secondly, a quantitative genetic analysis of hip dysplasia in German shepherds is presented as an illustration, and 2 different parameterizations of the model, leading to Gibbs samplers with different mixing behaviour, are described.
MODEL FOR BINARY RESPONSES
At the phenotypic level, a Bernoulli random variable Y_i is observed for each individual i (i = 1, 2, ..., n), taking values y_i = 1 or y_i = 0 (eg, alive or dead). The variable Y_i is the expression of an underlying continuous random variable U_i, the liability of individual i. When U_i exceeds an unknown fixed threshold t, then Y_i = 1, and Y_i = 0 otherwise. We assume that liability is normally distributed, with the mean value indexed by a parameter θ, and, without loss of generality, that it has unit variance (Curnow and Smith, 1975). Hence:

U_i | θ ~ N(w_i'θ, 1)    [1]

where θ' = (b', a') is a vector of parameters with p fixed effects (b) and q random additive genetic values (a), and w_i' is a row incidence vector linking θ to the ith observation. It is important to note that, conditionally on θ, the U_i are independent, so for the vector U = {U_i} given θ, we have as joint density:

p(U | θ) = ∏_{i=1}^{n} φ(U_i; w_i'θ, 1)    [2]

where φ(.) is a normal density with parameters as indicated in the argument. In [2], put Wθ = Xb + Za, where X and Z are known incidence matrices of order n by p and n by q, respectively, and, without loss of generality, X is assumed to have full column rank. Given the model, we have:

Pr(Y_i = 1 | θ) = Pr(U_i > t | θ) = Φ(w_i'θ - t)    [3]

where Φ(.) is the cumulative distribution function of a standardized normal variate. Without loss of generality, and provided that there is a constant term in the model, t can be set to 0, and [3] reduces to

Pr(Y_i = 1 | θ) = Φ(w_i'θ)    [4]

Conditionally on both θ and on Y_i = y_i, U_i follows a truncated normal distribution. That is, for Y_i = 1:

p(U_i | θ, Y_i = 1) = φ(U_i; w_i'θ, 1) I(U_i > 0) / Φ(w_i'θ)    [5]

where I(X ∈ A) is the indicator function that takes the value 1 if the random variable X is contained in the set A, and 0 otherwise. For Y_i = 0, the density is φ(U_i; w_i'θ, 1) I(U_i ≤ 0) / Φ(-w_i'θ).
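To make the notation concrete, the binary threshold model can be simulated directly. This is a hedged illustration only: the sizes, seed and parameter values below are our arbitrary assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

n, q = 1000, 10                    # records and sires (arbitrary toy sizes)
b = np.array([0.3])                # a single "fixed effect" (a general mean)
X = np.ones((n, 1))                # incidence matrix linking b to records
sire = rng.integers(0, q, size=n)
Z = np.zeros((n, q))               # incidence matrix linking sires to records
Z[np.arange(n), sire] = 1.0
a = rng.normal(0.0, 0.5, size=q)   # additive genetic values

# liability U | theta ~ N(Xb + Za, I); with the threshold t set to 0,
# the observed binary response is Y = I(U > 0)
U = X @ b + Z @ a + rng.normal(size=n)
y = (U > 0).astype(int)
```

Because the liability has unit residual variance and the model contains a constant term, fixing t = 0 costs no generality, exactly as in [3] and [4].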
Invoking an infinitesimal model (Bulmer, 1971), the conditional distribution of additive genetic values given the additive genetic variance in the conceptual base population is:

a | σ_a² ~ N(0, A σ_a²)    [6]

where A is a q by q matrix of additive genetic relationships. Note that a can include animals without phenotypic scores.

We discuss next the Bayesian inputs of the model. The vector of fixed effects b will be assumed to follow a priori the improper uniform distribution:

p(b) ∝ constant    [7]

For a description of uncertainty about the additive genetic variance σ_a², an inverted gamma distribution can be invoked, with density:

p(σ_a² | ν, S²) ∝ (σ_a²)^{-(ν/2 + 1)} exp[-νS²/(2σ_a²)]    [8]

where ν and S² are parameters. When ν = -2 and S² = 0, [8] reduces to the improper uniform prior distribution. A proper uniform prior distribution for σ_a² is:

p(σ_a²) = k_a, 0 < σ_a² ≤ σ²_{a,max}    [9]

where k_a is a constant and σ²_{a,max} is the maximum value which σ_a² can take a priori.
To facilitate the development of the Gibbs sampler, the unobserved liability U is included as an unknown parameter in the model. This approach, known as data augmentation (Tanner and Wong, 1987; Gelfand et al, 1992; Albert and Chib, 1993; Smith and Roberts, 1993), leads to identifiable conditional posterior distributions, as shown in the next section. Bayes theorem gives as joint posterior distribution of the parameters:

p(b, a, σ_a², U | y, ν, S²) ∝ p(b) p(σ_a² | ν, S²) p(a | σ_a²) p(U | b, a) p(y | U)    [10]

The last term is the conditional distribution of the data given the parameters. We notice that, for Y_i = 1, say, we have

Pr(Y_i = 1 | U_i) = I(U_i > 0)    [11]

For Y_i = 0, we have:

Pr(Y_i = 0 | U_i) = I(U_i ≤ 0)    [12]

The joint posterior distribution [10] can then be written as:

p(b, a, σ_a², U | y, ν, S²) ∝ p(σ_a² | ν, S²) p(a | σ_a²) p(U | b, a) ∏_{i=1}^{n} [I(U_i > 0) I(y_i = 1) + I(U_i ≤ 0) I(y_i = 0)]    [13]

where the conditioning on hyperparameters ν and S² is replaced by σ²_{a,max} when the uniform prior [9] for the additive genetic variance is employed.
Conditional posterior distributions

In order to implement the Gibbs sampler, all conditional posterior distributions of the parameters of the model are needed. The starting point is the full posterior distribution [13]. Among the 4 terms in [13], the third is the only one that is a function of b, and we therefore have for the fixed effects:

p(b | a, σ_a², U, y) ∝ p(U | b, a)    [14]

which is proportional to φ_U(Xb + Za, I). As shown in Wang et al (1994a), the scalar form of the Gibbs sampler for the ith fixed effect consists of sampling from:

b_i | b_{-i}, a, σ_a², U, y ~ N(b̂_i, (x_i'x_i)^{-1})    [15]

where x_i is the ith column of the matrix X, and b̂_i satisfies:

(x_i'x_i) b̂_i = x_i'(U - X_{-i}b_{-i} - Za)    [16]

In [16], X_{-i} is the matrix X with the column associated with i deleted, and b_{-i} is b with the ith element deleted. The conditional posterior distribution of the vector of breeding values is proportional to the product of the second and third terms in [13]:

p(a | b, σ_a², U, y) ∝ p(a | σ_a²) p(U | b, a)    [17]

which has the form φ(0, A σ_a²) φ_U(Xb + Za, I). Wang et al (1994a) showed that the scalar Gibbs sampler draws samples from:

a_i | b, a_{-i}, σ_a², U, y ~ N(â_i, (z_i'z_i + λ c^{ii})^{-1})    [18]

where z_i is the ith column of Z, c^{ii} is the element in the ith row and column of A^{-1}, λ = (σ_a²)^{-1}, and â_i satisfies:

(z_i'z_i + λ c^{ii}) â_i = z_i'(U - Xb - Z_{-i}a_{-i}) - λ c^{i,-i} a_{-i}    [19]

In [19], c^{i,-i} is the row of A^{-1} corresponding to the ith individual with the ith element excluded. We notice from [14] and [17] that, augmenting with the unobserved liabilities, the conditional posterior distributions of b and a have the same form as in the Gaussian linear model.

For the variance component, we have from [13]:

p(σ_a² | b, a, U, y) ∝ p(σ_a² | ν, S²) p(a | σ_a²)    [20]

Assuming that the prior for σ_a² is the inverted gamma given in [8], this becomes:

p(σ_a² | b, a, U, y) ∝ (σ_a²)^{-(q+ν)/2 - 1} exp[-(a'A^{-1}a + νS²)/(2σ_a²)]    [21a]

and, assuming the uniform prior [9], it becomes:

p(σ_a² | b, a, U, y) ∝ (σ_a²)^{-q/2} exp[-(a'A^{-1}a)/(2σ_a²)], 0 < σ_a² ≤ σ²_{a,max}    [21b]

Expression [21a] is in the form of a scaled inverted gamma density, and [21b] in the form of a truncated scaled inverted gamma density.
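A scaled inverted gamma conditional of this kind is easy to draw from: divide the scale by a chi-square deviate. The sketch below assumes the standard conjugate form, with scale a'A⁻¹a + νS² and q + ν degrees of freedom; the function name and the toy inputs are ours.

```python
import numpy as np

rng = np.random.default_rng(2)

def draw_sigma_a2(a, Ainv, nu, S2):
    """One draw from a scaled inverted gamma: the scale a'A^{-1}a + nu*S2
    divided by a chi-square deviate with q + nu degrees of freedom."""
    scale = a @ Ainv @ a + nu * S2
    return scale / rng.chisquare(a.size + nu)

# toy example: 5 unrelated individuals (A = I) and a weak prior
a = np.array([0.2, -0.1, 0.4, 0.0, -0.3])
s2a = draw_sigma_a2(a, np.eye(5), nu=1.0, S2=0.05)
```

Under the truncated form, one would simply redraw (or invert the CDF) until the draw falls below the prior upper bound.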
The conditional posterior distribution of the underlying variable U_i is proportional to the last 2 terms in [13]. This can be seen to be a truncated normal distribution, truncated on the left if Y_i = 1 and on the right otherwise. The density function of this truncated normal distribution is given in [5]. Thus, depending on the observed Y_i, we have:

p(U_i | b, a, Y_i = 1) = φ(U_i; w_i'θ, 1) I(U_i > 0) / Φ(w_i'θ)

or

p(U_i | b, a, Y_i = 0) = φ(U_i; w_i'θ, 1) I(U_i ≤ 0) / Φ(-w_i'θ)

Sampling from the truncated distribution can be done by generating from the untruncated distribution and retaining those values which fall in the constraint region. Alternatively, and more efficiently, suppose that U is truncated and defined in the interval [i, j] only, where i and j are the lower and upper bounds, respectively. Let the distribution function of U be F, and let v be a uniform [0, 1] variate. Then U = F^{-1}(F(i) + v(F(j) - F(i))) is a drawing from the truncated random variable (Devroye, 1986).
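The inverse-CDF recipe just described takes only a few lines; in this sketch `rtruncnorm` is our name, and scipy's standard normal supplies F and F⁻¹.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(3)

def rtruncnorm(mu, lo, hi):
    """Draw from N(mu, 1) truncated to (lo, hi) via
    U = F^{-1}(F(lo) + v (F(hi) - F(lo))), v ~ uniform[0, 1]."""
    Flo, Fhi = norm.cdf(lo - mu), norm.cdf(hi - mu)
    return mu + norm.ppf(Flo + rng.uniform() * (Fhi - Flo))

u1 = rtruncnorm(0.5, 0.0, np.inf)   # eg, a liability given Y = 1, t = 0
u2 = rtruncnorm(0.0, -1.0, 1.0)     # truncation to a finite interval
```

Unlike accept-reject sampling, this costs exactly one uniform draw per liability, regardless of how far the truncation region lies in the tail.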
Albert and Chib (1993) also constructed the mixed model in terms of a hierarchical model, but proposed a block sampling strategy for the parameters in the underlying scale instead. Essentially, they suggest sampling from the distributions (σ_a², a, b | U) as (σ_a² | U)(a, b | U, σ_a²), instead of from the full conditional posterior distributions [15], [18] and [21], and they assumed a uniform prior for log(σ_a²) in a finite interval. To facilitate sampling from p(σ_a² | U), they use an approximation which consists of placing all prior probabilities on a grid of σ_a² values, thus making the prior and the posterior discrete. The need for this approximation is questionable, since the full conditional posterior distribution of σ_a² has a simple form, as noted in [21] above. In addition, in animal breeding, the distribution (a, b | U, σ_a²) is a high dimensional multivariate normal, and it would not be simple computationally to sample from it.
MULTIPLE ORDERED CATEGORIES
Suppose now that the observed random variable Y can take values in one of C mutually exclusive ordered categories delimited by C + 1 thresholds. Let t_0 = -∞, t_C = +∞, with the remaining thresholds satisfying t_1 ≤ t_2 ≤ ... ≤ t_{C-1}. Generalizing [3]:

Pr(Y_i = j | θ, t) = Φ(t_j - w_i'θ) - Φ(t_{j-1} - w_i'θ)

Conditionally on θ, Y_i = j, t_{j-1} and t_j, the underlying variable associated with the ith observation follows a truncated normal distribution with density:

p(U_i | θ, t, Y_i = j) = φ(U_i; w_i'θ, 1) I(t_{j-1} < U_i ≤ t_j) / [Φ(t_j - w_i'θ) - Φ(t_{j-1} - w_i'θ)]    [24]

Assuming that σ_a², b and t are independently distributed a priori, the joint posterior density is written as:

p(b, a, σ_a², t, U | y) ∝ p(σ_a² | ν, S²) p(a | σ_a²) p(t) p(U | b, a, t) p(y | U, t)    [25]

where p(U | b, a, t) = p(U | b, a). Generalizing [12], the last term in [25] can be expressed as (Albert and Chib, 1993):

p(y | U, t) = ∏_{i=1}^{n} Σ_{j=1}^{C} I(t_{j-1} < U_i ≤ t_j) I(y_i = j)    [26]

All the conditional posterior distributions needed to implement the Gibbs sampler can be derived from [25]. It is clear that the conditional posterior distributions of b_i, a_i and σ_a² are the same as for the binary response model and given in [15], [18] and [21]. For the underlying variable associated with the ith observation we have from [25]:

p(U_i | b, a, t, Y_i = j) ∝ φ(U_i; w_i'θ, 1) I(t_{j-1} < U_i ≤ t_j)

This is a truncated normal, with density function as in [24].

The thresholds t = (t_1, t_2, ..., t_{C-1}) are clearly dependent a priori, since the model postulates that these are distributed as order statistics from a uniform distribution in the interval [t_min, t_max]. However, the full conditional posterior distributions of the thresholds are independent. That is:

p(t_j | t_{-j}, b, a, U, σ_a², y) = p(t_j | U, y)

where the prior support is T = {(t_1, t_2, ..., t_{C-1}) : t_min ≤ t_1 ≤ t_2 ≤ ... ≤ t_{C-1} ≤ t_max} (Mood et al, 1974). Note that the thresholds enter only in defining the support of p(t). The conditional posterior distribution of t_j is given by:

p(t_j | U, y) ∝ ∏_{i=1}^{n} Σ_{k=1}^{C} I(t_{k-1} < U_i ≤ t_k) I(y_i = k)

which has the same form as [26]. Regarded as a function of t_j, [26] shows that, given U and y, the upper bound of threshold t_j is min(U_i | Y_i = j + 1) and the lower bound is max(U_i | Y_i = j). The a priori condition t ∈ T is automatically fulfilled, and the bounds are unaffected by knowledge of the remaining thresholds. Thus t_j has a uniform distribution in this interval, given by:

t_j | U, y ~ Uniform[max(U_i | Y_i = j), min(U_i | Y_i = j + 1)]    [28]

This argument assumes that there are no categories with missing observations. To accommodate the possibility of missing observations in 1 or more categories, Albert and Chib (1993) define the upper and lower bounds of threshold j as min{min(U_i | Y_i = j + 1), t_{j+1}} and as max{max(U_i | Y_i = j), t_{j-1}}, respectively. In this case, the thresholds are not conditionally independent. The Gibbs sampler is implemented by sampling repeatedly from [15], [18], [21], [24] and [28].
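The full sampling cycle can be sketched on a toy sire model with 3 ordered categories and unrelated sires (so A = I, which makes the update [18] diagonal). This is a hedged illustration, not the paper's program: the data are simulated, and all sizes, seeds, hyperparameters and starting values below are our assumptions.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)

# toy data: sire model, 3 ordered categories, unrelated sires (A = I)
n, q = 300, 8
sire = rng.integers(0, q, size=n)
liab = 0.3 + rng.normal(0.0, 0.4, size=q)[sire] + rng.normal(size=n)
y = np.digitize(liab, [0.0, 1.0]) + 1        # categories 1, 2, 3; t1 fixed at 0

nu, S2 = 1.0, 0.05                           # hyperparameters of prior [8]
mu, a, s2a, t2 = 0.0, np.zeros(q), 0.5, 0.5  # starting values
keep = []
for it in range(200):
    # liabilities from [24]: normal truncated to the current category interval,
    # drawn by the inverse-CDF method
    mean = mu + a[sire]
    lo = np.array([-np.inf, 0.0, t2])[y - 1]
    hi = np.array([0.0, t2, np.inf])[y - 1]
    Flo, Fhi = norm.cdf(lo - mean), norm.cdf(hi - mean)
    U = mean + norm.ppf(Flo + rng.uniform(size=n) * (Fhi - Flo))
    # the general mean (the only "fixed effect" here), as in [15]
    mu = rng.normal((U - a[sire]).mean(), 1.0 / np.sqrt(n))
    # sire effects: single-site updates as in [18], with A = I
    lam = 1.0 / s2a
    for s in range(q):
        zs = sire == s
        prec = zs.sum() + lam
        a[s] = rng.normal((U[zs] - mu).sum() / prec, 1.0 / np.sqrt(prec))
    # sire variance: scaled inverted gamma draw, as in [21a]
    s2a = (a @ a + nu * S2) / rng.chisquare(q + nu)
    # threshold t2 from [28]: uniform between the neighbouring liabilities
    t2 = rng.uniform(U[y == 2].max(), U[y == 3].min())
    keep.append((mu, s2a, t2))
```

Every update is a draw from a standard distribution, which is the practical payoff of the data augmentation: no accept-reject step is needed anywhere in the cycle.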
Alternative parameterization of the multiple threshold model

The multiple threshold model can also be parameterized such that the conditional distribution of the underlying variable U, given θ, has unknown variance σ_e² instead of unit variance. The equivalence of the 2 parameterizations is shown in the Appendix. This parameterization requires that records fall in at least 3 mutually exclusive ordered categories; for C categories, only C - 3 thresholds are identifiable. In this new parameterization, one must sample from the conditional posterior distribution of σ_e². Under the priors [8] or [9], the conditional posterior distribution of σ_e² can be shown to be in the form of a scaled inverted gamma. The parameters of this distribution depend on the prior used for σ_e². If this is in the form [8], then

p(σ_e² | b, a, σ_a², t, U, y) ∝ (σ_e²)^{-(n+ν_e)/2 - 1} exp[-(SSE + ν_e S_e²)/(2σ_e²)]    [29]

where SSE = (U - Xb - Za)'(U - Xb - Za), and ν_e and S_e² are parameters of the prior distribution. If a uniform prior of the form [9] is assumed to describe the prior uncertainty about σ_e², the conditional posterior distribution is a truncated version of [29] (ie, [21b]), with ν_e = -2 and S_e² = 0. With exactly 3 categories, the Gibbs sampler involves no unknown thresholds, since none are identifiable.

EXAMPLE
the GibbsEXAMPLE
We illustrate the
methodology
with ananalysis
of data onhip
dysplasia
in Germanshepherd dogs.
Results of anearly
analysis
and a fulldescription
of the data can be found in Andersen et al(1988).
Briefly,
the records consisted ofradiographs
of 2 674offspring
from 82 sires. Theseradiographs
had been classifiedaccording
toguidelines approved by
FCI(Federation
Cynologique
Internationale,
1983),
eachoffspring
record was allocated to 1 of 7mutually
exclusive orderedcategories.
The model for theunderlying
variable was:.
where ai is the effect of sire
i (i
=1, 2, ... , 82; j
=1, 2,... , n
i
).
Theprior
distribution of /! was as in
[7]
and sire effects were assumed to follow the normal distribution:The
prior
distribution of the sire variance(
Q
a)
was in the formgiven
in!8!,
with v = 1 andS
2
= 0.05. Theprior
fort
2
, ... , t6
was chosen to be uniform on the ordered subset of[f,
=-1.365,
f
7
=+00!5
for whichti
<t
2
< ... <t
7
,
wheref
l
was the value at whicht
1
wasset,
andf
7
is the value of the 7th threshold. The value forf
l
was obtained from Andersen et al(1988),
in order to facilitatecomparisons
with thepresent
analysis.
Theanalysis
was also carried out under theparameterization
where the conditional distribution of Ugiven
0 has variancea 2
Here,
Q
e
was assumed to follow aprior
of the form of!8!,
with v = 1 andS
Z
= 0.05and
t
6
was set to 0.429. Results of the 2analyses
weresimilar,
soonly
those from the secondparameterization
arepresented
here.Gibbs
sampler
andpost
Gibbsanalysis
The Gibbs sampler was run as a single chain. Two independent chains of length 620 000 each were run, and in both cases the first 20 000 samples were discarded. Thereafter, samples were saved every 20 iterations, so that the total number of samples kept was 30 000 from each chain. Starting values for the parameters were, for chain 1: σ_e² = 2.0, σ_a² = 0.5, t_2 = -0.8, t_3 = -0.5, t_4 = -0.2, t_5 = 0.1. For chain 2, estimates from Andersen et al (1988) were used; these were σ_e² = 1.0, σ_a² = 0.1, t_2 = -1.05, t_3 = -0.92, t_4 = -0.62, t_5 = -0.34. In both runs, starting values for sire effects were set to zero.
Two important issues are the assessment of convergence of the Gibbs sampler, and the Monte-Carlo error of estimates of features of posterior distributions. Both issues are related to the question of whether the chain, or chains, have been run long enough. This is an area of active research in which some guidelines based on theoretical work (Roberts, 1992; Besag and Green, 1993; Smith and Roberts, 1993; Roberts and Polson, 1994) and on practical considerations (Gelfand et al, 1990; Gelman and Rubin, 1992; Geweke, 1992; Raftery and Lewis, 1992) have been proposed. Raftery and Lewis (1992) proposed a method based on 2-state Markov chains to calculate the number of iterations needed to estimate posterior quantiles.
Let X_1, ..., X_m be elements of a single Gibbs chain of length m. The m sampled values, which are generally correlated, can be used to compute features of the posterior distribution. In our model, the Xs could represent samples from the posterior distribution of a sire value, or of the sire variance, or of functions of these. Invoking the ergodic theorem, for large m, the posterior mean can be estimated by

μ̂_m = (1/m) Σ_{i=1}^{m} X_i

and the variance of the posterior distribution can be estimated by

σ̂²_m = (1/m) Σ_{i=1}^{m} (X_i - μ̂_m)²

(Cox and Miller, 1965; Geyer, 1992). Similarly, a marginal posterior distribution function, F(Z) = P(X ≤ Z), can be estimated by

F̂_m(Z) = (1/m) Σ_{i=1}^{m} I(X_i ≤ Z)

which means that F(Z) is estimated by the empirical distribution function. All these estimators are subject to Monte-Carlo sampling error, which is reduced by prolongation of the chain.

Consider the sequence g(X_1), ..., g(X_m), where g(.) is a suitable function, eg, g(X) = X or g(X) = I(X ≤ Z), and let E(g(X)) = μ. The lag-time autocovariance of the sequence is estimated as:

γ̂_m(t) = (1/m) Σ_{i=1}^{m-t} (g(X_i) - ḡ_m)(g(X_{i+t}) - ḡ_m)

where ḡ_m is the sample mean for a chain of length m, and t is the lag. The autocorrelation is then estimated as γ̂_m(t)/γ̂_m(0). The variance of the sample mean of the chain is given by

Var(ḡ_m) = (1/m) [γ(0) + 2 Σ_{t=1}^{m-1} (1 - t/m) γ(t)]

which will exceed γ(0)/m if γ(t) > 0 for all t, as is usual. Several estimators of the variance of the sample mean have been proposed (Priestley, 1981), but we chose one defined as follows. Let Γ_m(t) = γ̂_m(2t) + γ̂_m(2t + 1), t = 0, 1, .... The estimator can then be written as

Var(ḡ_m) = [-γ̂_m(0) + 2 Σ_{i=0}^{t} Γ_m(i)] / m    [34]

where t is chosen such that it is the largest integer satisfying Γ_m(i) > 0, i = 1, ..., t.
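A compact sketch of this truncated autocovariance-sum estimator, and of the 'effective number' of observations derived from it, is given below. It assumes a one-dimensional array of (possibly correlated) chain values; the function names are ours, not the paper's.

```python
import numpy as np

def mc_variance_of_mean(x):
    """Monte-Carlo variance of the chain mean as in [34]: sum the pairwise
    autocovariances Gamma(i) = gamma(2i) + gamma(2i+1) while positive."""
    x = np.asarray(x, dtype=float)
    m = x.size
    xc = x - x.mean()
    gamma = np.array([xc[: m - t] @ xc[t:] for t in range(m)]) / m
    total = -gamma[0]
    for i in range(m // 2):
        G = gamma[2 * i] + gamma[2 * i + 1]
        if G <= 0.0:            # truncate at the largest index with Gamma > 0
            break
        total += 2.0 * G
    return total / m

def effective_size(x):
    """'Effective number' of observations: gamma(0) / Var(chain mean)."""
    x = np.asarray(x, dtype=float)
    return np.var(x) / mc_variance_of_mean(x)

# for an independent chain the effective number should be close to m
iid = np.random.default_rng(0).normal(size=2000)
ess = effective_size(iid)
```

For a strongly autocorrelated chain, such as the threshold samples discussed in the Results, the same calculation returns an effective number far below m.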
The justification for this choice is that Γ(i) is a strictly positive, strictly decreasing function of i. If X_1, X_2, ..., X_m are independent, then Var(ḡ_m) = γ(0)/m. To obtain an indication of the effect of the correlation on Var(ḡ_m), an 'effective number' of independent observations can be assessed as λ_m = γ̂_m(0)/Var(ḡ_m). When the elements of the Gibbs chain are independent, λ_m = m.

RESULTS
Estimates of the empirical distribution function of various parameters of the model for each of the 2 chains of the Gibbs sampler are shown in figures 1-5. For [...] under 0.10. Similarly, figure 3 indicates that there is 90% posterior probability that heritability in the underlying scale (h² = 4σ_a²/(σ_a² + σ_e²)) lies between 0.24 and 0.49, and the median of the posterior distribution is 0.35. Although this distribution is slightly skewed, the estimate of the median agrees well with the ML-type estimate of heritability of 0.35 reported in Andersen et al (1988).
Figure 4 depicts estimates of distribution functions for the mean (μ) and for each of 3 sire effects (a_1, a_2, a_3). Figure 5 gives corresponding distributions for 2 threshold parameters (t_2 and t_3). The figures fall in 3 categories. The distribution functions obtained from chains 1 and 2 coincide for each of the variables σ_a², h², a_1, a_2 and a_3, where the sire effects a_1, a_2 and a_3 pertain to 3 males with 31, 5 and 158 offspring, respectively. A small deviation between chains 1 and 2 is observed for σ_e² and μ, and a larger deviation is observed for the threshold parameters (fig 5).
The Gibbs sequence for the threshold parameters showed very slow mixing properties. For example, for threshold 2, the autocorrelations between sampled values were 0.785, 0.663 and 0.315 for lags of 5, 10 and 50 samples, respectively. The reason is that the sampled value for a given threshold is bounded by the values of the neighbouring underlying variables U. If these are very close, the value of the threshold in subsequent samples is likely to change very slowly. Under the parameterization where 1 of the thresholds is substituted by the residual variance, the autocorrelations associated with the lags above, between samples from the marginal posterior distribution of σ_e², were 0.078, 0.064 and 0.032, respectively. Another scheme that may accelerate mixing is to sample jointly from the threshold and the liability. For sire effects, lag autocorrelations were close to zero.
A comparison between marginal posterior means (averages of the 30 000 samples) estimated from the 2 chains is shown in table I. The difference between chains 1 and 2 is in all cases within Monte-Carlo sampling error, which was estimated within chains using [34]. The 'effective number' of observations λ_m for the means of marginal posterior distributions is close to 30 000 for σ_a², a_1, a_2 and a_3, but is about 3 000 for σ_e² and μ, and between 200 and 400 for t_2, ..., t_5.
The marginal posterior distributions for μ, t_2, ..., t_5, a_1, a_2, a_3 are well approximated by normal distributions. The posterior means and standard deviations of these marginal distributions can be compared to estimates reported in Andersen et al (1988), who used a 2-stage procedure. The authors first estimated variances using an REML-type estimator (Harville and Mee, 1984). Secondly, assuming that the estimated variances were the true ones, fixed effects and sire effects were estimated as suggested by Gianola and Foulley (1983). For example, for the 3 sires with 158, 31 and 5 offspring, respectively, the 2-stage procedure yielded estimates of sire effects (approximate posterior standard deviations) of 0.30 (0.09), 0.17 (0.17) and -0.093 (0.26), respectively. The present Bayesian approach with the Gibbs sampler yielded estimates of marginal posterior means and standard deviations for these sires of 0.304 (0.088), 0.177 (0.166), and -0.092 (0.263), respectively (table I).
DISCUSSION
For the sire model and data considered here, likelihood inference with numerical integration is a competing alternative. For this model and data, marginal posterior distributions of sire effects are well approximated by normal distributions. On the other hand, with little information per random effect, eg, animal models, the normality of marginal posterior distributions when variances are not known is unlikely to hold. A strength of the Bayesian approach via Gibbs sampling is that inferences can be made from marginal posterior distributions in small sample problems, without resorting to asymptotic approximations. Further, the Gibbs sampler can accommodate multivariately distributed random effects, such as is the case with animal models, and this cannot be implemented with numerical integration techniques.
It seems important to investigate threshold models further, especially with sparse data structures consisting of many fixed effects with few observations per subclass. This case was studied by Moreno et al (manuscript submitted for publication) in the binary model, where they investigated frequentist properties of Bayesian point estimators in a simulation study. They showed that improper uniform prior distributions for the fixed effects lead to an average of the marginal posterior mean of heritability which was larger than the simulated value. They obtained better agreement when fixed effects were assigned proper normal prior distributions. The use of 'non-informative' improper prior distributions is discouraged on several grounds by, among others, Berger and Bernardo (1992), as this can lead to improper posterior distributions. It seems that the disagreement between simulated and estimated values in Moreno et al is due to lack of information and not to impropriety of posterior distributions. Thus, the bias persists, though smaller, when all parameters of the model are assigned proper prior distributions (Moreno, [...]). With sparse data structures, some fixed-effect subclasses contain data falling into only one of the 2 dichotomies. There is no information in the data (in the Fisherian sense) to estimate these fixed effects, and the likelihood is ill-conditioned. In the case of these sparse data structures, the choice of the prior distribution for the fixed effects may well be the most critical part of the problem.
This needs to be studied further.

Data augmentation in the Gibbs sampler led to conditional posterior distributions which are easy to sample from. This facilitates programming. We have noted, though, that threshold parameters have very slow mixing properties, and this is probably related to the data augmentation approach used in this study (Liu et al, 1994). With our data, the parameterization in terms of the residual variance resulted in smaller autocorrelations between samples of σ_e² than between samples of the thresholds. A scheme that is likely to accelerate mixing is to sample jointly from the threshold and liability. This step may necessitate other Monte-Carlo sampling techniques, such as a Metropolis algorithm (Tanner, 1993), since sampling is from a non-standard distribution. Alternative computational strategies and parameterizations of the model may be more critical with animal models. Here, there is typically little information on additive genetic effects, and these are correlated. These properties slow down convergence of the Gibbs chain.

The methods described in this paper can be adapted easily to draw inferences about genetic change when selection is for categorical data. Sorensen et al (1994) described how to make inferences about response to selection in the context of the Gaussian model. In the threshold model, the only difference is that observed data are replaced by the unobserved underlying variable U (liability). In order to make inferences about response to selection, the parameterization must be in terms of an animal model.

REFERENCES
Albert JH, Chib S (1993) Bayesian analysis of binary and polychotomous response data. J Am Stat Assoc 88, 669-679
Andersen S, Andresen E, Christensen K (1988) Hip dysplasia selection index exemplified by data from German shepherd dogs. J Anim Breed Genet 105, 112-119
Anderson DA, Aitkin M (1985) Variance component models with binary response: interviewer variability. J R Stat Soc B 47, 203-210
Berger JO, Bernardo JM (1992) On the development of reference priors. In: Bayesian Statistics IV (JM Bernardo, JO Berger, AP Dawid, AFM Smith, eds), Oxford University Press, UK, 35-60
Besag J, Green PJ (1993) Spatial statistics and Bayesian computation. J R Stat Soc B 55, 25-37
Bliss CI (1935) The calculation of the dosage-mortality curve. Ann Appl Biol 22, 134-167
Bulmer MG (1971) The effect of selection on genetic variability. Am Nat 105, 201-211
Cox DR, Miller HD (1965) The Theory of Stochastic Processes. Chapman and Hall, London, UK
Curnow R, Smith C (1975) Multifactorial models for familial diseases in man. J R Stat Soc A 138, 131-169
Dempster ER, Lerner IM (1950) Heritability of threshold characters. Genetics 35, 212-236
Falconer DS (1965) The inheritance of liability to certain diseases, estimated from the incidence among relatives. Ann Hum Genet 29, 51-76
Fédération Cynologique Internationale (1983) Hip dysplasia - international certificate and evaluation of radiographs (W Brass, S Paatsama, eds), Helsinki, Finland
Foulley JL (1993) Prediction of selection response for threshold dichotomous traits. Genetics 132, 1187-1194
Foulley JL, Manfredi E (1991) Approches statistiques de l'évaluation génétique des reproducteurs pour des caractères binaires à seuils. Genet Sel Evol 23, 309-338
Foulley JL, Im S, Gianola D, Höschele I (1987) Empirical Bayes estimation of parameters for n polygenic binary traits. Genet Sel Evol 19, 197-224
Foulley JL, Gianola D, Im S (1990) Genetic evaluation for discrete polygenic traits in animal breeding. In: Advances in Statistical Methods for Genetic Improvement of Livestock (D Gianola, K Hammond, eds), Springer-Verlag, NY, USA, 361-409
Gelfand AE, Hills SE, Racine-Poon A, Smith AFM (1990) Illustration of Bayesian inference in normal data models using Gibbs sampling. J Am Stat Assoc 85, 972-985
Gelfand A, Smith AFM, Lee TM (1992) Bayesian analysis of constrained parameter and truncated data problems using Gibbs sampling. J Am Stat Assoc 87, 523-532
Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions and Bayesian restoration of images. IEEE Trans Pattern Anal Machine Intelligence 6, 721-741
Gelman A, Rubin DB (1992) A single series from the Gibbs sampler provides a false sense of security. In: Bayesian Statistics IV (JM Bernardo, JO Berger, AP Dawid, AFM Smith, eds), Oxford University Press, UK, 625-631
Geweke J (1992) Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In: Bayesian Statistics IV (JM Bernardo, JO Berger, AP Dawid, AFM Smith, eds), Oxford University Press, UK, 169-193
Geyer CJ (1992) Practical Markov chain Monte Carlo (with discussion). Stat Sci 7, 467-511
Gianola D (1982) Theory and analysis of threshold characters. J Anim Sci 54, 1079-1096
Gianola D, Foulley JL (1983) Sire evaluation for ordered categorical data with a threshold model. Genet Sel Evol 15, 201-224
Harville DA, Mee RW (1984) A mixed-model procedure for analyzing ordered categorical data. Biometrics 40, 393-408
Höschele I, Gianola D, Foulley JL (1987) Estimation of variance components with quasi-continuous data using Bayesian methods. J Anim Breed Genet 104, 334-349
Jensen J, Wang CS, Sorensen DA, Gianola D (1994) Marginal inferences of variance and covariance components for traits influenced by maternal and direct genetic effects using the Gibbs sampler. Acta Agric Scand 44, 193-201
Liu JS, Wong WH, Kong A (1994) Covariance structure of the Gibbs sampler with applications to the comparisons of estimators and augmentation schemes. Biometrika 81, 27-40
Mood AM, Graybill FA, Boes DC (1974) Introduction to the Theory of Statistics. McGraw-Hill, NY, USA
Morton NE, McLean CJ (1974) Analysis of family resemblance. III. Complex segregation of quantitative traits. Am J Hum Genet 26, 489-503
Priestley MB (1981) Spectral Analysis and Time Series. Academic Press, London, UK
Quaas RL, Pollak EJ (1980) Mixed-model methodology for farm and ranch beef cattle testing programs. J Anim Sci 51, 1277-1287
Raftery AE, Lewis SM (1992) How many iterations in the Gibbs sampler? In: Bayesian Statistics IV (JM Bernardo, JO Berger, AP Dawid, AFM Smith, eds), Oxford University Press, UK, 763-773
Roberts GO (1992) Convergence diagnostics of the Gibbs sampler. In: Bayesian Statistics IV (JM Bernardo, JO Berger, AP Dawid, AFM Smith, eds), Oxford University Press, UK, 775-782
Roberts GO, Polson NG (1994) On the geometric convergence of the Gibbs sampler. J R Stat Soc B 56, 377-384
Robertson A, Lerner IM (1949) The heritability of all-or-none traits: viability of poultry. Genetics 34, 395-411
Smith AFM, Roberts GO (1993) Bayesian computation via the Gibbs sampler and related Markov chain Monte-Carlo methods. J R Stat Soc B 55, 3-23
Sorensen DA, Wang CS, Jensen J, Gianola D (1994) Bayesian analysis of genetic change due to selection using Gibbs sampling. Genet Sel Evol 26, 333-360
Tanner MA (1993) Tools for Statistical Inference. Springer-Verlag, NY, USA
Tanner MA, Wong WH (1987) The calculation of posterior distributions by data augmentation. J Am Stat Assoc 82, 528-540
Wang CS, Rutledge JJ, Gianola D (1993) Marginal inferences about variance components in a mixed linear model using Gibbs sampling. Genet Sel Evol 21, 41-62
Wang CS, Rutledge JJ, Gianola D (1994a) Bayesian analysis of mixed linear models via Gibbs sampling with an application to litter size in Iberian pigs. Genet Sel Evol 26, 91-115
Wang CS, Gianola D, Sorensen DA, Jensen J, Christensen A, Rutledge JJ (1994b) Response to selection for litter size in Danish Landrace pigs: a Bayesian analysis. Theor Appl Genet 88, 220-230
Wright S (1934) An analysis of variability in number of digits in an inbred strain of guinea pigs. Genetics 19, 506-536
Zeger SL, Karim MR (1991) Generalized linear models with random effects: a Gibbs sampling approach. J Am Stat Assoc 86, 79-86

APPENDIX
Here it is shown that there is a one-to-one relationship between 2 chosen parameterizations of the threshold model, and that they lead to the same probability distribution.

As a starting point, assume that the conditional distribution of liability, given θ, is normal with residual variance σ², where θ = (b_1, ..., b_p, a_1, ..., a_q), b_1 is an intercept common to all observations, and associated with the C categories there are thresholds t_i, i = 0, 1, ..., C, with t_0 = -∞ and t_C = ∞. We assume that the p fixed effects are estimable. In order to make the parameters identifiable, in what we call the standard parameterization, we fix the residual variance at 1 and set t_1 = f_1, where f_1 is a known constant; note that by setting f_1 = 0, then t_1 = 0. In terms of this parameterization there are p + q + C - 1 identifiable unknown parameters: the p + q elements of θ, the variance of the random effects, and the C - 2 free thresholds.

In the alternative parameterization,
one can define U_i* = σ U_i, so that the model has the same form with θ* = σθ, random-effect variance σ*² = σ²σ_a², and thresholds t_i* = σ t_i, with t_1* = f_1. The number of identifiable parameters is of course the same as before. In this paper we chose to set t_{C-1} = f_2, where f_2 > f_1 is an arbitrary known constant. The