HAL Id: hal-00894156
https://hal.archives-ouvertes.fr/hal-00894156
Submitted on 1 Jan 1997

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Gibbs sampling, adaptive rejection sampling and robustness to prior specification for a mixed linear model

CM Theobald, MZ Firat, R Thompson
Original article

Department of Mathematics and Statistics, University of Edinburgh, King's Buildings, Mayfield Road, Edinburgh EH9 3JZ, UK

(Received 4 October 1995; accepted 13 January 1997)
Summary - Markov chain Monte-Carlo methods are increasingly being applied to make inferences about the marginal posterior distributions of parameters in quantitative genetic models. This paper considers the application of one such method, Gibbs sampling, to Bayesian inferences about parameters in a normal mixed linear model when a restriction is imposed on the relative values of the variance components. Two prior distributions are proposed which incorporate this restriction. For one of them, the adaptive rejection sampling technique is used to sample from the conditional posterior distribution of a variance ratio. Simulated data from a balanced sire model are used to compare different implementations of the Gibbs sampler and also inferences based on the two prior distributions. Marginal posterior distributions of the parameters are illustrated. Numerical results suggest that it is not necessary to discard any iterates, and that similar inferences are made using the two prior specifications.

adaptive rejection sampling / Bayesian inference / Gibbs sampling / mixed linear model / sire model

Résumé - Gibbs sampling, adaptive rejection sampling and robustness to the specified prior for a mixed linear model. Markov chain Monte-Carlo methods are used more and more to make inferences about the marginal posterior distributions of the parameters of quantitative genetic models. This article concerns the application of one such method, Gibbs sampling, to Bayesian inferences about the parameters of a normal mixed linear model when a constraint is imposed on the relative values of the variance components. Two prior distributions incorporating this constraint are proposed. For one of them, the adaptive rejection sampling technique is employed to sample from a conditional posterior distribution of a variance ratio. Simulated data from a balanced sire model are used to compare different implementations of Gibbs sampling and also the inferences based on the two prior distributions. Marginal posterior distributions of the parameters are given as illustrations. The numerical examples suggest that it is not necessary to discard any iterates and that similar inferences are drawn using either of the two priors.

* Present address: Department of Animal Science, University of Çukurova, Faculty of Agriculture, 01330 Balcalı, Adana, Turkey.

INTRODUCTION
Parameters such as heritability, and predictors of individual breeding values, depend on the unknown variance components in quantitative genetic models. The wide array of methods available for estimating variance components includes ANOVA, likelihood-based methods and Bayesian methods. A difficulty with ANOVA methods which has concerned many authors is that ANOVA estimates may lie outside the parameter space, for example estimates of heritabilities which are negative or greater than one. The use of such estimates for genetic parameters can lead to very inefficient selection decisions.
Owing to rapid advances in computing technology, methods based on likelihood have become common in animal breeding; in particular REML (Patterson and Thompson, 1971) has been widely used. REML estimators are by definition always within the parameter space, and they are consistent, asymptotically normal and efficient (Harville, 1977). However, interval estimates for variances based on the asymptotic distribution of REML estimators can include negative values (Gianola and Foulley, 1990).
In recent years, Bayesian methods have been applied to variance component estimation in animal breeding (Harville, 1977; Gianola and Fernando, 1986; Gianola et al, 1986; Foulley et al, 1987; Cantet et al, 1992; Wang et al, 1993, 1994). For normal models, the joint posterior distribution of the dispersion parameters can be derived, but numerical integration techniques are required to obtain the marginal distributions of functions of interest.
Gibbs sampling is a Markov chain Monte-Carlo method which operates by generating samples from a sequence of conditional distributions; under suitable conditions, its equilibrium distribution is the joint posterior distribution. Gilks et al (1996) provide a thorough introduction to the Gibbs sampler. The method is applied to variance component estimation in a general univariate mixed linear model by Wang et al (1993). Also, Wang et al (1994) obtain marginal inferences about fixed and random effects, variance components and functions of them through a scalar version of the Gibbs sampling algorithm.
The objective of this paper is to broaden the application of Gibbs sampling in an animal breeding context by considering prior distributions which incorporate a constraint on the relative values of variance components: such a constraint might arise from restricting heritability to values less than one. The next section defines the model and two alternative prior specifications; Gibbs sampling can be applied to inferences about model parameters under the two specifications. Simulated data from a balanced univariate one-way sire model are used to demonstrate these methods, to compare three implementations of Gibbs sampling and to examine the robustness of the inferences to the choice of prior.
THE MODEL AND ALTERNATIVE PRIOR DISTRIBUTIONS

Model
We consider inferences about model parameters for a univariate mixed linear model of the form

y = Xβ + Zu + e    [1]

where y and e are n-vectors representing observations and residual errors, β is a p-vector of 'fixed effects', u is a q-vector of 'random effects', X is a known n × p matrix of full column rank and Z is a known n × q matrix. Here e is assumed to have the distribution N_n(0, σ²_e I), independently of β and u. Also u is taken to be N_q(0, σ²_u G) with G a known positive definite matrix.

Restrictions on variance ratios
In animal breeding applications, interest may be in making inferences about ratios of variances or functions thereof, rather than about individual variance components themselves. For example, if u and y represent vectors of sire effects and of phenotypic values recorded on the sires' offspring in unrelated paternal half-sib families, and γ denotes the variance ratio σ²_u/σ²_e, then the heritability, h², of the trait is an increasing function of γ given by h² = 4/(1 + γ⁻¹). The constraint that the heritability is between 0 and 1 restricts γ to lie in the interval (0, 1/3). A method of inference which ignores such restrictions may lead to ridiculous estimates of heritability. More generally, we might consider restricting γ to an interval (0, g), corresponding to the constraint that 0 < σ²_u < g σ²_e, where g is a known constant.

Bayesian formulation with variance parameters σ²_e and σ²_u

In the above model, the assumption made about the distribution of u must be supplemented by prior distributions for β, σ²_e and σ²_u.
In specifying their form, we might aim to reflect the animal breeder's prior knowledge while permitting convenient calculations on the resulting posterior distributions. The prior distribution of β is taken here to be uniform, corresponding to little being known about its value initially. Wang et al (1993, 1994) make the same prior assumption about β (in a model with further random effects), and consider taking σ²_e and σ²_u to be independent with scaled inverse-χ² distributions, χ⁻²(ν_e, (ν_e k_e)⁻¹) and χ⁻²(ν_u, (ν_u k_u)⁻¹); k_e and k_u can then be interpreted as prior expectations of σ²_e and σ²_u, respectively, and ν_e and ν_u as precision parameters analogous to degrees of freedom.

This prior assumption may be modified to take account of an upper bound on the variance ratio, so that σ²_e and σ²_u are not independent, but have a joint prior density of the form

p(σ²_e, σ²_u) ∝ (σ²_e)^(-(ν_e/2 + 1)) exp[-ν_e k_e/(2σ²_e)] (σ²_u)^(-(ν_u/2 + 1)) exp[-ν_u k_u/(2σ²_u)], 0 < σ²_u < g σ²_e    [2]

The conditional density of y given β, u and σ²_e for the model given in [1] is proportional to

(σ²_e)^(-n/2) exp[-(y - Xβ - Zu)′(y - Xβ - Zu)/(2σ²_e)]

so that the joint posterior density [3] is proportional to the product of this conditional density, the density of u and the joint prior density in [2].
It is shown in the Appendix that if ν_u is positive and n + ν_e exceeds p then the marginal posterior distribution of γ is proper and hence this joint distribution is also.
To implement the Gibbs sampling algorithm, we require the conditional posterior distributions of each of β, u, σ²_e and σ²_u given the remaining parameters, the so-called full conditional distributions, which are denoted [4]-[7] (using the notation [· | ·] for a conditional distribution), the distribution of σ²_u in [7] being truncated at g σ²_e. Apart from the use here of proper prior distributions for the variances, these are similar to the conditional distributions given in Wang et al (1993). Methods for sampling from these distributions are well known.

An alternative Bayesian formulation with prior independence of σ²_e and γ
As Wang et al (1993) show for a model with a slightly simpler prior for σ²_e and σ²_u, the prior specification in terms of these two parameters is a convenient one for applying the method of Gibbs sampling. It is also easily generalized when several traits are recorded. It may not, however, be the most useful specification for eliciting prior opinion: an animal breeder may prefer to think in terms of the heritability of a trait, equivalent for a paternal half-sib model to the variance ratio γ = σ²_u/σ²_e defined above. A second function of the two variances must also be considered in order to specify their joint prior distribution, and σ²_e may be an appropriate choice because inferences about σ²_e from previous data are likely to be much more precise than those about σ²_u.
To specify a joint prior distribution for the sire and residual variance components, it may therefore be convenient to assign priors to σ²_e and the variance ratio γ. We thus consider the alternative parameter vector (β, u, σ²_e, γ), but make the same prior assumptions for β and u as before, noting that σ²_u becomes γσ²_e. We can incorporate an upper bound g on the range of γ more naturally than with the previous formulation by using the family of Beta distributions on the interval (0, g): this is a fairly flexible family with probability density of the form

p(γ) ∝ γ^(a-1) (g - γ)^(b-1), 0 < γ < g    [8]

If γ is taken to be independent of β and σ²_e a priori, and σ²_e is assumed to be χ⁻²(ν_e, (ν_e k_e)⁻¹), then the joint posterior density [9] is proportional to the product of the conditional density of y, the density of u and these prior densities.
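Sampling from this prior is straightforward: a standard Beta(a, b) variate rescaled to (0, g) has the density in [8]. A small sketch of ours (the a and b values are those derived later in the paper's example):

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, g = 0.4038, 3.0678, 1.0 / 3.0   # a, b from the example in the Appendix

# A Beta(a, b) variate rescaled to (0, g) has density proportional to
# gamma**(a - 1) * (g - gamma)**(b - 1) on (0, g), as in [8].
gamma_draws = g * rng.beta(a, b, size=10_000)
print(gamma_draws.mean())             # near g * a / (a + b)
```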
The full conditional posterior distributions for β and u are the same as [4] and [5], while that for σ²_e is given by [10]. The full conditional density [11] for γ is proportional to

γ^(a-1-q/2) (g - γ)^(b-1) exp[-u′G⁻¹u/(2σ²_e γ)]

This family is not a well-known one, so it is not immediately clear how to sample from it. However, the adaptive rejection sampling technique of Gilks and Wild (1992) provides a reasonably efficient method for sampling from the distribution of ln γ.
Given these two methods of including the restriction on σ²_u/σ²_e, the question arises of how posterior inferences, especially about γ, are affected when the second prior specification is used instead of the first. To answer this, we consider alternative specifications in which the priors for the variance parameters are similar: to match these distributions we choose a and b in [8] to give γ the same upper and lower quartiles under the two priors.

CALCULATIONS ON THE POSTERIOR DISTRIBUTIONS
Gibbs sampling

Markov chain Monte-Carlo methods, particularly Gibbs sampling, are now widely used for making inferences from posterior distributions of parameters. Gelfand and Smith (1990) review the Gibbs sampler and other Monte-Carlo methods, and Gelfand et al (1990) illustrate the use of the Gibbs sampler on a range of statistical models, including variance components models. Wang et al (1993, 1994) give two versions of the Gibbs sampler for a univariate normal mixed linear model assuming prior independence of the variance components.
The Gibbs sampler algorithm generates samples iteratively from the full conditional distributions of disjoint subsets of the parameters given the current value of the complementary subset. The following version of the Gibbs sampler can be applied to sample from the posterior distribution defined in [3] using the full conditional distributions given in [4]-[7]. It incorporates the restriction that σ²_u/σ²_e cannot exceed an upper bound g by discarding pairs of variances which do not satisfy this condition.

(i) Choose an arbitrary starting value θ(0) = (β(0), u(0), σ²_e(0), σ²_u(0)) for the vector of parameters θ = (β, u, σ²_e, σ²_u);
(ii) draw a value β(1) from the distribution [β | u(0), σ²_e(0), σ²_u(0), y];
(iii) draw u(1) from [u | β(1), σ²_e(0), σ²_u(0), y];
(iv) draw σ²_u(1) from [σ²_u | β(1), u(1), σ²_e(0), y], ignoring the restriction on σ²_u/σ²_e;
(v) draw σ²_e(1) from [σ²_e | β(1), u(1), σ²_u(1), y], ignoring the restriction on σ²_u/σ²_e;
(vi) if σ²_u(1)/σ²_e(1) ≥ g then repeat (iv) and (v) until σ²_u(1)/σ²_e(1) < g.

Thus, subsets of the parameters are sampled in an arbitrary order and this cycle defines the transition from θ(0) to θ(1). Subsequent cycles replace θ(1) to give further vectors θ(2), θ(3), and so on.

Opinions differ on how the sequence of vectors used in subsequent analyses should be defined. The following three methods of implementation have been proposed, and are compared on simulated data in the next section.

(a) A single long chain: one generates a single run of the chain as practised by Geman and Geman (1984), ie,
(vii) repeat (ii)-(vi) m times using updated values and store all the values.
A common modification is to discard vectors from the start of the sequence to allow it to approach equilibrium. The remaining vectors θ(1), ..., θ(m) approximate simulated values from the posterior distribution of θ, but successive vectors may be correlated and the sequence may be strongly influenced by the starting values.
(b) Equally spaced samples: to reduce serial correlations, one can choose suitable integers k and m, perform a single run of length km, and then form a sample from every kth vector, ie, use θ(k), θ(2k), ..., θ(km).
(c) Multiple short chains: by contrast, Gelfand and Smith (1990) and Gelfand et al (1990) use many separate short runs of the chain, retaining only the final vector from each.

In support of implementation (c), Gelman and Rubin (1992) argue that, even if using one long run may be more efficient, it may still be important to use several different starting points.
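The cycle (i)-(vi) can be sketched in code for the balanced sire model used later in the paper. This is an illustrative reimplementation of ours using the standard conjugate full conditionals for that model, not the authors' program; the function names, seeds and starting values are our choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Balanced sire model as in the paper: s = 25 sires, d = 20 offspring each,
# beta = 0, sigma2_e = 0.975, sigma2_u = 0.025, bound g = 1/3 on the ratio.
s, d = 25, 20
n = s * d
u_true = rng.normal(0.0, np.sqrt(0.025), s)
y = u_true[:, None] + rng.normal(0.0, np.sqrt(0.975), (s, d))

nu_e, k_e, nu_u, k_u, g = 1.0, 0.975, 1.0, 0.025, 1.0 / 3.0

def scaled_inv_chi2(nu, ssq):
    """One draw from a scaled inverse-chi-square conditional: ssq / chi2_nu."""
    return ssq / rng.chisquare(nu)

def gibbs(n_iter=1000):
    beta_, u = 0.0, np.zeros(s)          # step (i): arbitrary starting values
    s2e, s2u = 1.0, 0.1
    out = []
    for _ in range(n_iter):
        # (ii) beta | u, variances, y: normal around mean(y - u)
        beta_ = rng.normal((y - u[:, None]).mean(), np.sqrt(s2e / n))
        # (iii) u | beta, variances, y: independent normals
        prec = d / s2e + 1.0 / s2u
        u = rng.normal((y.mean(axis=1) - beta_) * (d / s2e) / prec,
                       np.sqrt(1.0 / prec))
        while True:
            # (iv)-(v) variances, ignoring the restriction
            s2u = scaled_inv_chi2(nu_u + s, nu_u * k_u + u @ u)
            sse = ((y - beta_ - u[:, None]) ** 2).sum()
            s2e = scaled_inv_chi2(nu_e + n, nu_e * k_e + sse)
            # (vi) discard pairs violating sigma2_u / sigma2_e < g
            if s2u / s2e < g:
                break
        out.append((beta_, s2u, s2e))
    return np.array(out)

draws = gibbs()
h2 = 4.0 / (1.0 + draws[:, 2] / draws[:, 1])   # h2 = 4 / (1 + 1/gamma)
print(draws[:, 2].mean(), h2.mean())
```

With g = 1/3 and these data the rejection step in (vi) fires rarely, since the posterior mass of γ lies well below the bound.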
It has been common practice to discard a substantial number of initial iterations, and to use implementations (b) and (c) (especially (b), storing only every kth, usually 10th or 20th, iterate). The results of Raftery and Lewis (1992) suggest that in many cases this is rather profligate.
For any individual parameter, the collection of m values can be viewed as a simulated sample from the appropriate marginal distribution. This sample can be used to calculate summaries, such as the marginal posterior mean, or to estimate marginal posterior distributions of parameters or of functions of them using kernel density estimation. We use kernel density estimation below with normal kernels and the window width suggested by Silverman (1986).
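As a sketch of this density estimation step: the paper does not reproduce its exact window-width formula, so the snippet below assumes Silverman's widely used rule of thumb, h = 0.9 min(sd, IQR/1.34) n^(-1/5), with normal kernels; the stand-in sample and names are ours.

```python
import numpy as np

def kde_normal(samples, grid):
    """Kernel density estimate with normal kernels; window width from
    Silverman's rule of thumb: h = 0.9 * min(sd, IQR/1.34) * n**(-1/5)."""
    n = samples.size
    iqr = np.subtract(*np.percentile(samples, [75, 25]))
    h = 0.9 * min(samples.std(ddof=1), iqr / 1.34) * n ** (-0.2)
    z = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))

rng = np.random.default_rng(2)
draws = rng.normal(0.0, 1.0, 1000)          # stand-in for Gibbs output
grid = np.linspace(-4.0, 4.0, 201)
dens = kde_normal(draws, grid)
total = (dens * (grid[1] - grid[0])).sum()  # Riemann sum, close to 1
print(total)
```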
Adaptive rejection sampling

The adaptive rejection sampling technique introduced by Gilks and Wild (1992) is a method of rejection sampling intended for situations in which evaluation of a univariate density is computationally expensive. It uses a squeezing function (Devroye, 1986), and is adaptive in the sense that the rejection envelope and squeezing function (which are piecewise exponential) are revised as sampling proceeds, and approach the density function. In Gibbs sampling, the full conditional distribution for each parameter changes as the iteration proceeds, so that usually only one value is required from a particular density. Adaptive rejection sampling has the advantage that it does not require an optimization step to locate the mode of the density before a sample can be generated.
The method requires that the density is log-concave, that is that the derivative of the logarithm of the density is monotone decreasing. It is shown in the Appendix that the density of γ in expression [11] is guaranteed to be log-concave only if a is greater than ½q + 1, but that if we transform to δ = ln γ then the full conditional density of δ is log-concave if b is at least one. The condition on a is not a reasonable one, since a prior with a maximum at zero must have a no greater than one, for example. However, a value of b less than one implies that the prior density in [8] has a maximum at its upper bound, g (which might correspond to a heritability of one), and it is not unreasonable to exclude such values. Hence we use adaptive rejection sampling with ln γ.
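These log-concavity claims can be checked numerically. The kernels below follow the full conditional of γ described in the Appendix; the values of a, b, q and g are those of the paper's example, while c is a hypothetical stand-in for the data-dependent constant u′G⁻¹u/(2σ²_e).

```python
import numpy as np

# a, b from the paper's matched example; q = 25 sires; g = 1/3; c assumed.
a, b, q, g, c = 0.4038, 3.0678, 25, 1.0 / 3.0, 0.32

def log_f_gamma(x):
    # kernel of gamma's conditional: x^(a-1-q/2) (g-x)^(b-1) exp(-c/x)
    return (a - 1 - q / 2) * np.log(x) + (b - 1) * np.log(g - x) - c / x

def log_f_delta(delta):
    # after transforming to delta = ln(gamma), including the Jacobian
    x = np.exp(delta)
    return (a - q / 2) * delta + (b - 1) * np.log(g - x) - c / x

gamma_grid = np.linspace(0.01, g - 0.01, 400)          # equally spaced in gamma
delta_grid = np.linspace(np.log(0.01), np.log(g - 0.01), 400)  # ... in delta

d2_gamma = np.diff(log_f_gamma(gamma_grid), 2)   # second differences
d2_delta = np.diff(log_f_delta(delta_grid), 2)

print((d2_gamma > 0).any())   # True: gamma's conditional is not log-concave here
print((d2_delta < 0).all())   # True: the conditional of ln(gamma) is (b >= 1)
```

Since a < ½q + 1 here, the density of γ has a convex stretch, while every second difference on the δ scale is negative, in line with the Appendix argument.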
APPLICATION TO A ONE-WAY CLASSIFICATION

The methods described above were applied to a one-way sire model of the form y_ij = β + u_i + e_ij (i = 1, ..., s; j = 1, ..., d), in which p and q equal 1 and s, and X, Z and G become 1_sd (a vector of 1s with length sd), diag(1_d, ..., 1_d) and I_s. The full conditional distributions for β, u, σ²_e and σ²_u defined in expressions [4]-[7] then simplify to normal and scaled inverse-χ² distributions whose parameters involve ȳ_i, ȳ.. and ū., the vector of family means and the overall means of the y_ij and the u_i.

Four sets of data were generated using this model with parameter values β = 0, σ²_e = 0.975, σ²_u = 0.025, and hence h² = 0.1, with 25 families and 20 offspring per sire. ANOVA estimates are summarized in table I: the data sets are ordered by the estimates of heritability, which range from -0.073 to 0.306. The first ANOVA estimate of σ²_u is negative, making conventional inference about this parameter difficult.
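The ANOVA estimates referred to above follow the standard method-of-moments formulas for a balanced one-way classification; a sketch (our simulation and names, so the numbers will not match table I exactly):

```python
import numpy as np

def anova_sire(y):
    """Method-of-moments (ANOVA) estimates for a balanced one-way sire model."""
    s, d = y.shape
    fam_means = y.mean(axis=1)
    msb = d * ((fam_means - y.mean()) ** 2).sum() / (s - 1)      # between sires
    msw = ((y - fam_means[:, None]) ** 2).sum() / (s * (d - 1))  # within sires
    s2u = (msb - msw) / d          # can fall outside the parameter space
    s2e = msw
    h2 = 4.0 * s2u / (s2u + s2e)
    return s2u, s2e, h2

rng = np.random.default_rng(3)
s, d = 25, 20
u = rng.normal(0.0, np.sqrt(0.025), s)
y = u[:, None] + rng.normal(0.0, np.sqrt(0.975), (s, d))
s2u_hat, s2e_hat, h2_hat = anova_sire(y)
print(s2u_hat, s2e_hat, h2_hat)
```

Because (msb - msw)/d is an unbounded difference of mean squares, repeating this simulation readily produces negative estimates of σ²_u, and hence of h², as for the first data set in table I.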
Analysis using the prior for σ²_e and σ²_u

Bayesian analyses of the four data sets are given using the prior distribution in [2] with ν_e = ν_u = 1, k_e = 0.975, k_u = 0.025 and g = 1/3. The prior distribution for σ²_e and σ²_u is proper and leads to a proper joint posterior distribution (as shown in the Appendix), but the small values chosen for ν_e and ν_u correspond to a very dispersed distribution; this is intended to reveal any problems with convergence of the Gibbs sampler algorithm. The sample data provide little information about σ²_u as there are only 25 sires. The algorithm is carried out for the four sets of data with the following implementations, which each provide 1 000 Gibbs samples:
(a) a single run of 1 000 iterations;
(b) taking every (1) 10th, (2) 20th and (3) 30th value in single runs of length 10 000, 20 000 and 30 000, respectively, using only one starting value for the parameter vector;
(c) short runs of length (1) 10, (2) 20 and (3) 30, storing the last iterate from each of 1 000 separate runs.

The parameter values used to generate the data were also employed as starting values for all three implementations. No iterates were discarded since the true parameter values were known. The resulting marginal posterior means and standard deviations for β, σ²_e, σ²_u, γ and h² for the four data sets are shown in table II. This table reveals little difference between the implementations in the marginal posterior means, except for a slight tendency for standard deviations to be lower using (a) for β, σ²_u, γ and h².
Figures 1-4 show the estimated marginal posterior densities for β, σ²_e, σ²_u and h² using density estimators with normal kernels under the prior specification of this subsection (with implementation (a)) and that considered in the next subsection.

Analysis assuming prior independence of σ²_e and γ
We now compare the above analyses with those based on the second prior specification, which assumes prior independence of σ²_e and γ and the prior density [8] for γ; we use adaptive rejection sampling on the full conditional posterior distribution of δ = ln γ, as defined in the Appendix. The Appendix also describes how the values of a and b in the Beta prior for γ are determined from those of ν_e, ν_u, k_e and k_u.
At each Gibbs sampling iteration, adaptive rejection sampling requires at least two points for the construction of the rejection envelope and squeezing function (Gilks and Wild, 1992). The parameter space for δ is unbounded on the left, so Gilks and Wild (1992) advise that the smallest of these initial points be chosen where the derivative of the log posterior density is positive. The 15th and 85th percentiles of the sampling density from the previous iteration were tried as initial points, additional points being supplied if the derivatives at these values had the same sign. Three evaluations of the log posterior density of δ were required on average at each iteration of adaptive rejection sampling.
Convergence of the Gibbs sampler, assessed by monitoring summary statistics for each parameter based on every ten iterations, had clearly occurred before 1 000 iterations. The marginal posterior summaries shown in table II and the density estimates for β, σ²_e, σ²_u and h² in figures 1-4 are based on this number of iterations. The results obtained using the two prior specifications agree very closely, especially those for β and σ²_e.

DISCUSSION
Our results from using Gibbs sampling on a balanced one-way sire model with the prior density of expression [2] for σ²_e and σ²_u suggest that it is unnecessary to discard all but the 10th, 20th or 30th iterate; doing so is probably quite wasteful. A much larger proportion of the iterates should be used unless storage is an issue. Using 1 000 successive iterates seems adequate to make inferences about the parameters of interest, although the slightly lower standard deviations obtained with implementation (a) may reflect the need to discard 'burn-in' iterations. We encountered no difficulties with the convergence of the Gibbs sampling algorithms used. With real data, we would use REML estimates as starting values for the Gibbs iterations and monitor convergence diagnostics to decide how many initial iterations to discard.

The prior distribution which assumes σ²_e and γ to be independent provides a potentially useful alternative to the usual prior in which σ²_e and σ²_u are taken to be independent. The adaptive rejection sampling algorithm of Gilks and Wild (1992) is found to work efficiently with the resulting full conditional distribution for ln γ.
The sensitivity of a Bayesian analysis to the choice of prior hyperparameters and to different functional forms of prior can be investigated. For our simulated data, the use of an alternative form of prior distribution led to very similar marginal posterior inferences on each parameter.
The use of the alternative prior distribution for σ²_e and γ can be extended to models with further variance components, for example if [1] is generalized to

y = Xβ + Zu + Tv + e

where v is an r-vector of random effects, T is a known n × r matrix and v is taken to be N_r(0, σ²_v H) independently of β, u and e, with H a known matrix. If β, σ²_e and σ²_v are taken to be mutually independent a priori, with uniform, χ⁻²(ν_e, (ν_e k_e)⁻¹) and χ⁻²(ν_v, (ν_v k_v)⁻¹) distributions, respectively, and γ is independent of them with the Beta distribution defined in [8], then full conditional distributions analogous to [4]-[7] are obtained for β, u, v, σ²_e and σ²_v, involving the ratio ε = σ²_v/σ²_e, while that for γ is the same as [11].
ACKNOWLEDGEMENTS

The authors are grateful to W Gilks for providing a FORTRAN program to perform adaptive rejection sampling. MZ Firat was supported by the Turkish Government.

REFERENCES
Cantet RJC, Fernando RL, Gianola D (1992) Bayesian inference about dispersion parameters of univariate mixed models with maternal effects: theoretical considerations. Genet Sel Evol 24, 107-135
Devroye L (1986) Non-Uniform Random Variate Generation. Springer-Verlag, New York
Foulley JL, Im S, Gianola D, Hoschele I (1987) Empirical Bayes estimation of parameters
Gelfand AE, Smith AFM (1990) Sampling-based approaches to calculating marginal densities. J Am Stat Assoc 85, 398-409
Gelfand AE, Hills SE, Racine-Poon A, Smith AFM (1990) Illustration of Bayesian inference in normal data models using Gibbs sampling. J Am Stat Assoc 85, 972-985
Gelman A, Rubin DB (1992) A single series from the Gibbs sampler provides a false sense of security. In: Bayesian Statistics 4 (JM Bernardo, JO Berger, AP Dawid, AFM Smith, eds), Clarendon Press, Oxford, 625-631
Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence 6, 721-741
Gianola D, Fernando RL (1986) Bayesian methods in animal breeding theory. J Anim Sci 63, 217-244
Gianola D, Foulley JL (1990) Variance estimation from integrated likelihoods (VEIL). Genet Sel Evol 22, 403-417
Gianola D, Foulley JL, Fernando RL (1986) Prediction of breeding values when variances are not known. Genet Sel Evol 18, 485-498
Gilks WR, Richardson S, Spiegelhalter DJ (1996) Markov Chain Monte Carlo in Practice. Chapman and Hall, London
Gilks WR, Wild P (1992) Adaptive rejection sampling for Gibbs sampling. Appl Statist 41, 337-348
Harville DA (1977) Maximum likelihood approaches to variance component estimation and to related problems. J Am Stat Assoc 72, 320-338
Patterson HD, Thompson R (1971) Recovery of inter-block information when block sizes are unequal. Biometrika 58, 545-554
Raftery AE, Lewis SM (1992) How many iterations in the Gibbs sampler? In: Bayesian Statistics 4 (JM Bernardo, JO Berger, AP Dawid, AFM Smith, eds), Clarendon Press, Oxford, 763-773
Silverman BW (1986) Density Estimation for Statistics and Data Analysis. Chapman and Hall, London
Wang CS, Rutledge JJ, Gianola D (1993) Marginal inferences about variance components in a mixed linear model using Gibbs sampling. Genet Sel Evol 25, 41-62
Wang CS, Rutledge JJ, Gianola D (1994) Bayesian analysis of mixed linear models via Gibbs sampling with an application to litter size in Iberian pigs. Genet Sel Evol 26, 91-115

APPENDIX
Marginal posterior probability density function of γ using the prior for σ²_e and σ²_u

Integrating the posterior density for β, u, σ²_e and σ²_u given in [3] with respect to β and u gives the posterior density of σ²_e and σ²_u; transforming from (σ²_e, σ²_u) to (σ²_e, γ) and integrating over σ²_e then gives the marginal posterior density of γ, proportional to a function D(γ), say, provided that n + ν_e + ν_u - p is positive. Whether or not this is a proper density function depends on its behaviour near zero. For small γ, |G⁻¹ + γZ′QZ| is approximately constant, while the γ⁻¹ν_u k_u term is dominant in D(γ) so long as ν_u is positive. So the density of γ behaves like γ^((n+ν_e-p-2)/2) near zero, and has a finite integral if the exponent exceeds -1, that is if n + ν_e exceeds p.

Matching quartiles of γ in the two prior distributions
To compare inferences under the two prior distributions considered in this article, we determine values of the hyperparameters a and b defining the Beta distribution for γ in the second prior so that the two specifications are matched in some sense. The first specification for the variance parameters (in terms of σ²_e and σ²_u) implies that γ (which equals σ²_u/σ²_e) has a probability density function proportional to that of a truncated F distribution. Hence φ = γk_e/k_u has an F(ν_e, ν_u) distribution truncated at gk_e/k_u. We choose a and b to match the Beta distribution used for γ in the second prior to this by equating their quartiles. If G denotes the distribution function of F(ν_e, ν_u) then φ has lower and upper quartiles φ_L, φ_U which are solutions of

G(φ_L) = ¼ G(gk_e/k_u),  G(φ_U) = ¾ G(gk_e/k_u)

and γ has quartiles φ_L k_u/k_e and φ_U k_u/k_e. Thus if x_L and x_U denote the lower and upper quartiles of the Beta distribution B(a, b), we equate x_L and x_U to φ_L k_u/(gk_e) and φ_U k_u/(gk_e), respectively, by solving for a and b the two equations

F_B(φ_L k_u/(gk_e); a, b) = ¼,  F_B(φ_U k_u/(gk_e); a, b) = ¾

where F_B(·; a, b) denotes the distribution function of B(a, b).
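This matching can be carried out numerically. The sketch below is ours: it assumes ν_e = ν_u = 1, so that the F(1, 1) distribution function has the closed form G(x) = (2/π) arctan(√x), and uses SciPy root-finding with starting values and names of our choosing.

```python
from math import atan, pi, sqrt

from scipy.optimize import brentq, least_squares
from scipy.stats import beta

k_e, k_u, g = 0.975, 0.025, 1.0 / 3.0    # hyperparameters from the example

def F11_cdf(x):
    """CDF of F(1, 1): P(t^2 <= x) for t standard Cauchy."""
    return 2.0 / pi * atan(sqrt(x))

bound = g * k_e / k_u                    # truncation point g*k_e/k_u = 13
p = F11_cdf(bound)                       # G(13), about 0.8278

# Quartiles of the truncated F(1, 1) distribution
phi_L = brentq(lambda x: F11_cdf(x) - 0.25 * p, 1e-12, bound)
phi_U = brentq(lambda x: F11_cdf(x) - 0.75 * p, 1e-12, bound)

# Corresponding quartiles of the Beta distribution on (0, 1)
x_L = phi_L * k_u / (g * k_e)
x_U = phi_U * k_u / (g * k_e)

def quartile_gap(ab):
    a, b = ab
    return [beta.cdf(x_L, a, b) - 0.25, beta.cdf(x_U, a, b) - 0.75]

res = least_squares(quartile_gap, x0=[0.5, 3.0],
                    bounds=([1e-3, 1e-3], [10.0, 20.0]))
a_hat, b_hat = res.x
print(round(p, 4), round(a_hat, 4), round(b_hat, 4))
```

The solution should reproduce the values quoted in the example below, a ≈ 0.4038 and b ≈ 3.0678.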
Example: we use the prior specification for σ²_e and σ²_u with ν_e = ν_u = 1, k_e = 0.975, k_u = 0.025 and g = 1/3, so that γk_e/k_u has an F(1, 1) distribution truncated at gk_e/k_u = 13; G(13) = 0.8278, so φ_L and φ_U satisfy G(φ_L) = 0.2070 and G(φ_U) = 0.6209. The corresponding values of a and b are 0.4038 and 3.0678.

Log-concavity of the full conditional distributions of γ and ln γ
A density is defined to be log-concave if the derivative of its logarithm is monotone decreasing. For the density of γ in [11], the required derivative is

(a - 1 - ½q)/γ - (b - 1)/(g - γ) + u′G⁻¹u/(2σ²_e γ²)

The second and third terms are decreasing in γ if b exceeds 1, but the first is not unless a is greater than ½q + 1; log-concavity cannot be guaranteed for all σ²_u without this condition. If we transform to δ = ln γ (and include the Jacobian of the transformation), the logarithm of the density is, apart from an additive constant,

(a - ½q)δ + (b - 1) ln(g - e^δ) - u′G⁻¹u e^(-δ)/(2σ²_e)

and this has derivative

(a - ½q) - (b - 1) e^δ/(g - e^δ) + u′G⁻¹u e^(-δ)/(2σ²_e)

The first term is constant and the second and third are monotone decreasing in δ provided that b is at least one, so the full conditional density of δ is then log-concave.