HAL Id: hal-00893980
https://hal.archives-ouvertes.fr/hal-00893980
Submitted on 1 Jan 1993
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Marginal inferences about variance components in a mixed linear model using Gibbs sampling
CS Wang, JJ Rutledge, D Gianola
To cite this version:
CS Wang, JJ Rutledge, D Gianola. Marginal inferences about variance components in a mixed linear model using Gibbs sampling. Genetics Selection Evolution, BioMed Central, 1993, 25 (1), pp 41-62.
hal-00893980
Original article

Marginal inferences about variance components in a mixed linear model using Gibbs sampling

CS Wang*, JJ Rutledge, D Gianola

University of Wisconsin-Madison, Department of Meat and Animal Science, Madison, WI 53706-1284, USA

(Received 9 March 1992; accepted 7 October 1992)
Summary - Arguing from a Bayesian viewpoint, Gianola and Foulley (1990) derived a new method for estimation of variance components in a mixed linear model: variance estimation from integrated likelihoods (VEIL). Inference is based on the marginal posterior distribution of each of the variance components. Exact analysis requires numerical integration. In this paper, the Gibbs sampler, a numerical procedure for generating marginal distributions from conditional distributions, is employed to obtain marginal inferences about variance components in a general univariate mixed linear model. All needed conditional posterior distributions are derived. Examples based on simulated data sets containing varying amounts of information are presented for a one-way sire model. Estimates of the marginal densities of the variance components and of functions thereof are obtained, and the corresponding distributions are plotted. Numerical results with a balanced sire model suggest that convergence to the marginal posterior distributions is achieved with a Gibbs sequence length of 20, and that Gibbs sample sizes ranging from 300 to 3 000 may be needed to appropriately characterize the marginal distributions.
variance components / linear models / Bayesian methods / marginalization / Gibbs sampler
* Correspondence and reprints. Present address: Department of Animal Science, Cornell University, Ithaca, NY 14853, USA
INTRODUCTION
Variance components and functions thereof are important in quantitative genetics and other areas of statistical inquiry. Henderson's method 3 (Henderson, 1953) for estimating variance components was widely used until the late 1970's. With rapid advances in computing technology, likelihood based methods gained favor in animal breeding. Especially favored has been restricted maximum likelihood under normality, known as REML (Thompson, 1962; Patterson and Thompson, 1971). This method accounts for the degrees of freedom used in estimating fixed effects, which full maximum likelihood (ML) does not do. ML estimates are obtained by maximizing the full likelihood, including its location variant part, while REML estimation is based on maximizing the "restricted" likelihood, ie, that part of the likelihood function independent of fixed effects. From a Bayesian viewpoint, REML estimates are the elements of the mode of the joint posterior density of all variance components when flat priors are employed for all parameters in the model (Harville, 1974). In REML, fixed effects are viewed as nuisance parameters and are integrated out from the posterior density of fixed effects and variance components, which is proportional to the full likelihood in this case.

There are at least 2 potential shortcomings of REML (Gianola and Foulley, 1990). First, REML estimates are the elements of the modal vector of the joint posterior distribution of the variance components. From a decision theoretic point of view, the optimum Bayes decision rule under quadratic loss is the posterior mean rather than the posterior mode. The mode of the marginal distribution of each variance component should provide a better approximation to the mean than a component of the joint mode. Second, if inferences about a single variance component are desired, the marginal distribution of this component should be used instead of the joint distribution of all components.

Gianola and Foulley (1990) proposed a new method that attempts to satisfy these considerations from a Bayesian perspective. Given the prior distributions and the likelihood which generates the data, the joint posterior distribution of all parameters is constructed. The marginal distribution of an individual variance component is obtained by integrating out all other parameters contained in the model. Summary statistics, such as the mean, mode, median and variance, can then be obtained from the marginal posterior distribution. Probability statements about a parameter can be made, and Bayesian confidence intervals can be constructed, thus providing a full Bayesian solution to the variance component estimation problem. In practice, however, this integration cannot be done analytically, and one must resort to numerical methods. Approximations to the marginal distributions were proposed (Gianola and Foulley, 1990), but the conditions required are often not met in data sets of small to moderate size. Hence, exact inference by numerical means is highly desirable.
Gibbs sampling (Geman and Geman, 1984) is a numerical integration method. It is based on all possible conditional posterior distributions, ie, the posterior distribution of each parameter given the data and all other parameters in the model. The method generates random drawings from the marginal posterior distributions through iteratively sampling from the conditional posterior distributions. Gelfand and Smith (1990) studied properties of the Gibbs sampler, and revealed its potential in statistics as a general numerical integration tool. In a subsequent paper (Gelfand et al, 1990), a number of applications of the Gibbs sampler were described, including a variance component problem for a one-way random effects model.

The objective of this paper is to extend the Gibbs sampling scheme to variance component estimation in a more general univariate mixed linear model. We first specify the Gibbs sampler in this setting and then use a sire model to illustrate the method in detail, employing 7 simulated data sets that encompass a range of parameter values. We also provide estimates of the posterior densities of variance components and of functions thereof, such as intraclass correlations and variance ratios.

SETTING

Model
Details of the model and definitions are found in Macedo and Gianola (1987), Gianola et al (1990a, b) and Gianola and Foulley (1990); only a summary is given here. Consider the univariate mixed linear model:

y = Xβ + Σ_{i=1}^{c} Z_i u_i + e    [1]

where: y: data vector of order n × 1; X: known incidence matrix of order n × p; Z_i: known matrix of order n × q_i; β: p × 1 vector of uniquely defined "fixed effects" (so that X has full column rank); u_i: q_i × 1 "random" vector; and e: n × 1 vector of random residuals. The conditional distribution which generates the data is:

y | β, u_1, ..., u_c, σ_e² ~ N(Xβ + Σ_{i=1}^{c} Z_i u_i, Rσ_e²)    [2]

where R is an n × n known matrix, assumed to be an identity matrix here, and σ_e² is the variance of the random residuals.
Prior distributions

Prior distributions are needed to complete the Bayesian specification of the model. Usually, a "flat" prior distribution is assigned to β, so as to represent lack of prior knowledge about this vector, so:

p(β) ∝ constant    [3]

For the random vectors:

u_i | σ_ui² ~ N(0, G_i σ_ui²),  i = 1, 2, ..., c    [4]

where G_i is a known matrix and σ_ui² is the variance of the prior distribution of u_i. All u_i's are assumed to be mutually independent, a priori, as well as independent of β. Independent scaled inverted χ² distributions are used as priors for the variance components, so that:

p(σ_e² | ν_e, s_e²) ∝ (σ_e²)^{-(ν_e/2 + 1)} exp(-ν_e s_e²/2σ_e²)    [5]

p(σ_ui² | ν_ui, s_ui²) ∝ (σ_ui²)^{-(ν_ui/2 + 1)} exp(-ν_ui s_ui²/2σ_ui²)    [6]

Above, ν_e (ν_ui) is a "degree of belief" parameter, and s_e² (s_ui²) can be interpreted as a prior value of the appropriate variance. In this paper, as in Gelfand et al (1990), we assume the degree of belief parameters, ν_e and ν_ui, to be zero, to obtain the "naive" ignorance improper priors:

p(σ_e²) ∝ (σ_e²)^{-1};  p(σ_ui²) ∝ (σ_ui²)^{-1}    [7]
The joint prior density of β, u_i (i = 1, 2, ..., c), σ_ui² (i = 1, 2, ..., c) and σ_e² is the product of the densities associated with [3-7], realizing that there are c random effects, each with their respective variances.

The joint posterior distribution resulting from priors [7] is improper mathematically, in the sense that it does not integrate to 1. The improperty is due to [6], and it occurs at the tails. Numerical difficulties can arise when a variance component has a posterior distribution with appreciable density near 0. In this study, priors [7] were employed following Gelfand et al (1990), and difficulties were not encountered. However, informative or non informative priors other than [7] should be used in applications where it is postulated that at least one of the variance components is close to 0.

JOINT AND FULL CONDITIONAL POSTERIOR DISTRIBUTIONS

Denote u' = (u_1', ..., u_c') and v' = (σ_u1², ..., σ_uc²). Let u'_{-i} = (u_1', ..., u_{i-1}', u_{i+1}', ..., u_c') and v'_{-i} = (σ_u1², ..., σ_u,i-1², σ_u,i+1², ..., σ_uc²) be u' and v' with the ith element deleted from the set. The joint posterior distribution of the unknowns (β, u, v and σ_e²) is proportional to the product of the likelihood function and the joint prior distribution. As shown by Macedo and Gianola (1987) and Gianola et al (1990a, b), the joint posterior density is in the normal-gamma form:

p(β, u, v, σ_e² | y) ∝ (σ_e²)^{-(n/2 + 1)} exp{-(y - Xβ - Σ_{i=1}^{c} Z_i u_i)'(y - Xβ - Σ_{i=1}^{c} Z_i u_i)/2σ_e²} × Π_{i=1}^{c} (σ_ui²)^{-(q_i/2 + 1)} exp{-u_i'G_i^{-1}u_i/2σ_ui²}    [8]

The full conditional density of each of the unknowns is obtained by regarding all other parameters in [8] as known. We then have:

β | u, v, σ_e², y ~ N(β̂, (X'X)^{-1}σ_e²)    [9]

where β̂ = (X'X)^{-1}X'(y - Σ_{i=1}^{c} Z_i u_i). Note that this distribution does not depend on the σ_ui².

The full conditional distribution of each u_i (i = 1, 2, ..., c) is multivariate normal:

u_i | β, u_{-i}, v, σ_e², y ~ N(û_i, (Z_i'Z_i + G_i^{-1}α_i)^{-1}σ_e²),  α_i = σ_e²/σ_ui²    [10]

where û_i = (Z_i'Z_i + G_i^{-1}α_i)^{-1}Z_i'(y - Xβ - Σ_{j≠i} Z_j u_j).

The full conditional density of σ_e² is in the scaled inverted χ² form:

σ_e² | β, u, v, y ~ ν_e s_e² χ_{ν_e}^{-2}    [11]

with parameters ν_e = n and s_e² = (y - Xβ - Σ_{i=1}^{c} Z_i u_i)'(y - Xβ - Σ_{i=1}^{c} Z_i u_i)/n. Each full conditional density of σ_ui² also is in the scaled inverted χ² form:

σ_ui² | β, u, v_{-i}, σ_e², y ~ ν_ui s_ui² χ_{ν_ui}^{-2}    [12]

with parameters ν_ui = q_i and s_ui² = u_i'G_i^{-1}u_i/q_i.

The full conditional distributions [9-12] are essential for implementing the Gibbs sampling scheme.

OBTAINING THE MARGINAL DISTRIBUTIONS USING GIBBS SAMPLING

Gibbs sampling
In many Bayesian problems, marginal distributions are often needed to make appropriate inferences. However, due to the complexity of joint posterior distributions, obtaining a high degree of marginalization of the joint posterior density is difficult or impossible by analytical means. This is so for many practical problems, eg, inferences about variance components. Numerical integration techniques must be used to obtain the exact marginal distributions, from which functions of interest can be computed and inferences made.

A numerical integration scheme known as Gibbs sampling (Geman and Geman, 1984; Gelfand and Smith, 1990; Gelfand et al, 1990; Casella and George, 1990) circumvents the analytical problem. The Gibbs sampler generates a random sample from a marginal distribution by successively sampling from the full conditional distributions of the random variables involved in the model. The full conditional distributions presented in the previous sections are summarized below:

p(σ_e² | β, u, v, y), as in [11]    [13]
p(σ_ui² | β, u, v_{-i}, σ_e², y), i = 1, 2, ..., c, as in [12]    [14]
p(u_i | β, u_{-i}, v, σ_e², y), i = 1, 2, ..., c, as in [10]    [15]
p(β | u, v, σ_e², y), as in [9]    [16]

Although we are interested in the marginal distributions of σ_e² and σ_ui² only, all full conditional distributions are needed to implement the sampler. The ordering placed above is arbitrary. Gibbs sampling proceeds as follows:
(i) set arbitrary initial values for β, u, v, σ_e²;
(ii) generate σ_e² from [13], and update σ_e²;
(iii) generate σ_ui² from [14], and update σ_ui²;
(iv) generate u_i from [15], and update u_i;
(v) generate β from [16], and update β; and
(vi) repeat (ii-v) k times, using the updated values.

We call k the length of the Gibbs sequence. As k → ∞, the points from the kth iteration are sample points from the appropriate marginal distributions. The convergence of the samples from the above iteration scheme to drawings from the marginal distributions was established by Geman and Geman (1984) and restated by Gelfand and Smith (1990) and Tierney (1991). It should be noted that there are no approximations involved. Let the sample points be (σ_e²)^(k), (σ_ui²)^(k) (i = 1, 2, ..., c), (u_i)^(k) (i = 1, 2, ..., c) and (β)^(k), respectively, where superscript (k) denotes the kth iteration. Then:
(vii) repeat (i-vi) m times, to generate m Gibbs samples.

At this point we have m sample points from each marginal distribution: (σ_e²)_j^(k) and (σ_ui²)_j^(k), j = 1, 2, ..., m, and likewise for the u_i and β.

Because our interest is in making inferences about σ_e² and σ_ui², no attention will be paid hereafter to the u_i and β. However, it is clear that the marginal distributions of the u_i and β are also obtained as a byproduct of Gibbs sampling.
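As an illustration, steps (i)-(vii) can be sketched for the balanced one-way sire model analyzed later in the paper. This is a minimal sketch in Python, not the authors' FORTRAN/IMSL implementation; the toy data, seed, starting values, and all variable names are our own assumptions. The normal and scaled inverted χ² draws follow the full conditionals [9-12], the latter obtained as a sum of squares divided by a χ² deviate:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy balanced one-way layout (our own illustration): q sires, n progeny each.
q, n = 10, 5
true_beta, true_su2, true_se2 = 0.0, 1.0, 4.0
u_true = rng.normal(0.0, np.sqrt(true_su2), q)
y = true_beta + u_true[:, None] + rng.normal(0.0, np.sqrt(true_se2), (q, n))
N = q * n

def gibbs_chain(k, rng):
    """One Gibbs sequence of length k; returns (sigma_e2, sigma_u2) at step k."""
    beta, u = 0.0, rng.normal(0.0, 1.0, q)   # (i) arbitrary starting values
    se2, su2 = 1.0, 1.0
    for _ in range(k):
        # (ii) sigma_e2 | rest ~ e'e * inv-chi2 with N df  (scaled inverted chi-square)
        e = y - beta - u[:, None]
        se2 = (e * e).sum() / rng.chisquare(N)
        # (iii) sigma_u2 | rest ~ u'u * inv-chi2 with q df
        su2 = (u @ u) / rng.chisquare(q)
        # (iv) u_i | rest ~ N(n*(ybar_i - beta)/(n + a), se2/(n + a)), a = se2/su2
        a = se2 / su2
        u_hat = n * (y.mean(axis=1) - beta) / (n + a)
        u = rng.normal(u_hat, np.sqrt(se2 / (n + a)))
        # (v) beta | rest ~ N(mean of (y_ij - u_i), se2/N)
        beta = rng.normal((y - u[:, None]).mean(), np.sqrt(se2 / N))
    return se2, su2

# (vii) m independent Gibbs samples, each the end point of a sequence of length k.
m, k = 300, 20
samples = np.array([gibbs_chain(k, rng) for _ in range(m)])
print("sample means (se2, su2):", samples.mean(axis=0))
```

Each sequence is restarted independently, so the m retained pairs are independent draws at the cost of m × k rounds of conditional updates, which mirrors the short-sequence, many-restarts design used in the paper.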
Density estimation

After samples from the marginal distributions are generated, one can estimate the densities using these samples and the full conditional densities. As noted by Casella and George (1990) and Gelfand and Smith (1990), the marginal density of a random variable x can be written as:

p(x) = ∫ p(x | z) p(z) dz

where z denotes the remaining random variables. An estimator of p(x) is:

p̂(x) = (1/m) Σ_{j=1}^{m} p(x | z_j)

where z_j (j = 1, 2, ..., m) are draws from the marginal distribution of z. Thus, the estimator of the marginal density of σ_e² is:

p̂(σ_e² | y) = (1/m) Σ_{j=1}^{m} p(σ_e² | β_j, u_j, v_j, y)    [21]

The estimated values of the density are thus obtained by fixing σ_e² (at a number of points over its space), and then evaluating [21] at each point. Similarly, the estimator of the marginal density of σ_ui² is:

p̂(σ_ui² | y) = (1/m) Σ_{j=1}^{m} p(σ_ui² | β_j, u_j, v_{-i,j}, (σ_e²)_j, y)    [22]

Additional discussion about mixture density estimators is found in Gelfand and Smith (1990).
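The mixture estimators can be sketched numerically. In the code below (our own illustration; the "retained" sums of squares are simulated stand-ins rather than the output of an actual Gibbs run), each conditional is a scaled inverted χ² density, and the marginal estimate at a point is the average of the m conditional densities at that point. The change of variable to the intraclass correlation ρ = σ_u²/(σ_u² + σ_e²) uses the Jacobian σ_e²/(1 - ρ)² employed later in the paper:

```python
import numpy as np
from math import lgamma

def inv_chi2_pdf(x, df, ss):
    """Density at x of a scaled inverted chi-square variable ss / chi2_df."""
    return np.exp(0.5 * df * np.log(ss / 2.0) - lgamma(0.5 * df)
                  - (0.5 * df + 1.0) * np.log(x) - ss / (2.0 * x))

rng = np.random.default_rng(2)
m, N, q = 300, 50, 10                 # Gibbs sample size; records; sires (toy values)
# Stand-ins for retained Gibbs quantities (would be e'e and u'u at each sample).
sse = 4.0 * rng.chisquare(N, size=m)
ssu = 1.0 * rng.chisquare(q, size=m)

# Mixture (averaged-conditional) estimate of the density of sigma_e^2 on a grid.
grid = np.linspace(1.5, 10.0, 200)
dens_e = np.array([inv_chi2_pdf(x, N, sse).mean() for x in grid])
# Trapezoid check: the estimate integrates to ~1 over a wide enough grid.
area = float(((dens_e[1:] + dens_e[:-1]) / 2.0 * np.diff(grid)).sum())

# Density of rho = su2/(su2 + se2), holding se2 fixed, via the Jacobian
# d(su2)/d(rho) = se2/(1 - rho)^2.
se2_fix = 4.0
rho = np.linspace(0.005, 0.6, 120)
dens_rho = np.array([(se2_fix / (1.0 - r) ** 2)
                     * inv_chi2_pdf(r * se2_fix / (1.0 - r), q, ssu).mean()
                     for r in rho])
```

No further sampling is needed for the transformed densities; the same m conditional densities are simply re-evaluated at the transformed points.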
Estimation of the density of a function of the variance components is accomplished by applying the theory of transformations of random variables to the estimated densities, with minimal additional calculations. Examples of estimating the densities of variance ratios and of an intraclass correlation are given later.

APPLICATION OF GIBBS SAMPLING TO THE ONE-WAY CLASSIFICATION

Model
We consider the one-way linear model:

y_ij = β + u_i + e_ij

where β is a "fixed" effect common to all observations, u_i could be, for example, a sire effect, and e_ij is a residual associated with the record on the jth progeny of sire i. It is assumed that:

u_i | σ_u² ~ NID(0, σ_u²),  e_ij | σ_e² ~ NIID(0, σ_e²)

where NID and NIID stand for "normal, independently distributed" and "normal, independently and identically distributed", respectively.
Conditional distributions

For this model, the parameters of the normal conditional distribution of β | u, σ_u², σ_e², y in [9] are:

β̂ = Σ_{i=1}^{q} Σ_{j=1}^{n_i} (y_ij - u_i)/n,  Var(β | u, σ_u², σ_e², y) = σ_e²/n

where n = Σ_{i=1}^{q} n_i. Likewise, the parameters of the normal conditional distribution of u | β, σ_u², σ_e², y in [10] are:

û_i = n_i(ȳ_i. - β)/(n_i + α),  i = 1, 2, ..., q

with α = σ_e²/σ_u², and the covariance matrix is:

Var(u | β, σ_u², σ_e², y) = diag{c_ii σ_e²}

with c_ii = 1/(n_i + α). Because the covariance matrix is diagonal, each u_i can be generated independently as:

u_i | β, σ_u², σ_e², y ~ N(û_i, c_ii σ_e²)

The conditional density of σ_e² in [11] can be written as:

p(σ_e² | β, u, σ_u², y) ∝ (σ_e²)^{-(n/2 + 1)} exp(-e'e/2σ_e²)    [31]

where e = {y_ij - β - u_i}. Because e'e/σ_e² ~ χ_n², it follows that σ_e² ~ n s_e² χ_n^{-2} with s_e² = e'e/n, so [31] is the kernel of a multiple of an inverted χ² random variable. Finally, the conditional density of σ_u² in [12] is expressible as:

p(σ_u² | β, u, σ_e², y) ∝ (σ_u²)^{-(q/2 + 1)} exp(-u'u/2σ_u²)

Since q s_u²/σ_u² ~ χ_q² with s_u² = u'u/q, then σ_u² ~ q s_u² χ_q^{-2}, also a multiple of an inverted χ² variable.

Data sets and designs
Seven data sets (experiments) were simulated, so as to represent situations that differ in the amount of statistical information. Essentials of the experimental designs are in table I. Number of sire families (q) varied from 10 to 10 000, while number of progeny per family (n) ranged from 5 to 20. The smallest experiment was I, with 10 sires and 5 progeny per sire; the largest one was VII, with a total of 100 000 records. Only balanced designs (n_i = n, for i = 1, 2, ..., q) are reported here, as similar results were found with unbalanced layouts. Data were randomly generated using parameter values of 0 and 1 for β and σ_u², respectively. Parametric values for σ_e² were from 1 to 99, thus yielding intraclass correlations (ρ) ranging from 0.01 to 0.5. From a genetic point of view, an intraclass correlation of 0.5 (data set I) is not possible in a sire model, but it is reported here for completeness.
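Data sets of this kind are easy to regenerate. The sketch below (our own code, following the parameter pattern just described: β = 0, σ_u² = 1, and σ_e² set from the desired intraclass correlation) simulates one balanced layout:

```python
import numpy as np

def simulate_sire_data(q, n, rho, beta=0.0, su2=1.0, seed=0):
    """One balanced one-way (sire) data set: q families, n progeny each.
    se2 is chosen so that the intraclass correlation su2/(su2 + se2) equals rho."""
    rng = np.random.default_rng(seed)
    se2 = su2 * (1.0 - rho) / rho
    u = rng.normal(0.0, np.sqrt(su2), q)                  # sire effects
    y = beta + u[:, None] + rng.normal(0.0, np.sqrt(se2), (q, n))
    return y, se2

# Eg, a layout in the spirit of the middle designs: 50 sires, 20 progeny, rho = 0.2.
y, se2 = simulate_sire_data(q=50, n=20, rho=0.2)
```

With ρ = 0.2 and σ_u² = 1, this gives σ_e² = 4, matching the range of parametric values described above.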
The Gibbs sampler [13-16] was run at varying lengths of the Gibbs sequence (k = 10 to 100) and Gibbs sample sizes (m = 300 to 3 000), to assess the effects of k and m on the estimated marginal distributions. FORTRAN subroutines of the IMSL (IMSL Inc, 1989) were used to generate normal and inverted χ² random deviates. At the end of each run, the following quantities were retained:

{(σ_e²)_j^(k); j = 1, 2, ..., m}    [37]
{(σ_u²)_j^(k); j = 1, 2, ..., m}    [38]

Sample points in [37] and [38] are needed for density estimation, as noted below.

Marginal density estimation

The estimators of the marginal densities of σ_e² and σ_u² were:

p̂(σ_e² | y) = (1/m) Σ_{j=1}^{m} p(σ_e² | β_j, u_j, (σ_u²)_j, y)    [39]

p̂(σ_u² | y) = (1/m) Σ_{j=1}^{m} p(σ_u² | β_j, u_j, (σ_e²)_j, y)    [40]

Using the theory of transformation of random variables, we obtained the marginal density of the variance ratio γ = σ_u²/σ_e², by fixing σ_e² and using [40], as:

p̂(γ | y) = σ_e² p̂(γσ_e² | y)    [41]

where [40] is evaluated at σ_u² = γσ_e². If inferences are to be made about the intraclass correlation ρ = σ_u²/(σ_u² + σ_e²), the Jacobian of the transformation from σ_u² to ρ is J = σ_e²/(1 - ρ)², so from [40], considering σ_e² as fixed, the marginal density of the intraclass correlation is:

p̂(ρ | y) = [σ_e²/(1 - ρ)²] p̂(ρσ_e²/(1 - ρ) | y)    [42]

Densities of any functions of the variance components can be estimated in a similar manner. Density [39] can also be used to make the transformations. Note that the Gibbs sampler [13-16] does not need to be run again to obtain the densities of the functions of the variance components.

Plots were generated using densities [39-42] by selecting 50 to 100 equally spaced points in the "effective" range of the variables; the "effective" range of a variable covers at least 99.5% of the density mass. Summary statistics of the posterior distributions, such as mean, mode, median and variance, were calculated using the composite Simpson's rules of numerical integration, dividing the effective range of the variables into 100 to 200 equally spaced intervals. Where appropriate, summary statistics were calculated using the densities estimated at higher values of m or k.

RESULTS
The estimated marginal densities of variance components and of their functions (γ and ρ) are depicted in figures 1 to 7, in which solid curves correspond to densities estimated at higher values of m or k, and vice versa for the dotted ones. These figures correspond to the 7 designs given in table I.

Figure 1 represents a situation where 50% of the total variance is "due to sires", ie, a high intraclass correlation; as noted earlier, this is not possible genetically. All posterior distributions were unimodal, and convergence of the Gibbs sampling scheme to the appropriate marginals was achieved with k = 20 and m = 300, as can be ascertained from direct inspection of the curves. Because of the limited information contained in data set I (q = 10, n = 5), posterior densities were not symmetric, so the mean, mode and median differ. The median was closer to the true values of the parameters than the mean and the mode; this was true for all 4 distributions considered.

For data sets II-IV, with heritabilities ranging from 20-80%, and number of sires from 50 to 1 000, the posterior densities (figs 2-4) were nearly symmetric, so the 3 location statistics were very similar to each other. In figure 4, corresponding to σ_u² = 1, σ_e² = 10 and to a data set with 1 000 sires and a total of 20 000 records, the posterior coefficients of variation were approximately 1% for σ_e² and 9% for the remaining parameters. This illustrates the well known result that σ_e² is less difficult to estimate than σ_u² or functions thereof. The plots suggest convergence of the Gibbs sampler at values of k as low as 10-20.

Some difficulties were encountered with the Gibbs sampling schemes in designs V and VI (figs 5 and 6, respectively). These designs correspond to situations of low heritability and of mild information about parameters contained in the data. While there was no problem in general with the posterior distribution of σ_e², this was not so for the remaining 3 parameters. Typical problems were bi-modality or lack of smoothness in the left tail of the estimated densities. These were found to be related to insufficient Gibbs sample size, and were corrected by increasing the Gibbs sample size (m). Compare, for example, the dotted (m = 300) with the solid (m = 3 000) curves in figure 6. A more awkward distribution requires more samples to be characterized accurately. Recall that each of the density estimators [40]-[42] was obtained by averaging m conditional densities. In the case of badly behaved marginal distributions, which are associated with low heritability and small data sets, it is possible to obtain "outlier conditional densities". The outliers can be influential and distort the values of the estimated density unless m is large enough.

It is clear from the figures that smooth estimated posterior densities for awkward distributions, eg, for heritability as low as 4%, can be obtained by increasing Gibbs sample size, albeit at computational expense. At this level of heritability, when the number of sire families was increased to 10 000, posterior distributions were nearly symmetric (fig 7), and there was little variability about the parametric values.

DISCUSSION
Gianola and Foulley (1990) gave approximations to the marginal distributions of variance components, which should be adequate provided that the joint posterior density of the ratios between variance components is sharp or symmetric about its mode. As in any approximation, its goodness depends on the amount of information contained in the data, and on the number of parameters in the model. An important practical question is to what extent the approximations hold. From our experiments, the information in data sets I, II, V and VI is not sufficient to justify use of the approximation, even in a model with 2 variance components; this is so because the marginal distribution of the variance ratio was neither sharp nor symmetric. However, the approximation would hold in the remaining data sets. A more detailed comparative study between the exact and the approximate methods is needed to ascertain when the latter can be used confidently. In the meantime, Gibbs sampling provides a way to estimate the marginal distributions without using complicated numerical integration procedures.
The present work is an extension of that of Gelfand et al (1990), in which they illustrated the Gibbs sampler with several normal data models, including variance component estimation in a balanced one-way random effects layout. In our paper, formulae were presented to implement Gibbs sampling in a more general univariate mixed linear model. With these extensions, a full Bayesian solution to the problem of inference about variance components, and functions thereof, in such a mixed linear model is possible. Gibbs sampling turns an analytically intractable multidimensional integration problem into a feasible numerical one.

Gibbs sampling is relatively easy to implement. Given the likelihood function and the priors, one can always obtain the joint posterior density of the unknowns under consideration. From this density, at least in the normal linear model, one can directly get the full conditional distribution of a particular variable by fixing the rest of the variables in the joint posterior distribution. The set of all full conditional densities gives the expressions needed for implementing the Gibbs sampler. The full conditional densities in this case are in families, such as normal and inverted χ², where generating random numbers is not exceedingly complicated.
The limiting factor is the efficiency with which random numbers can be generated, because the method requires generating a large quantity of random numbers; m × k × r, where r is the number of parameters in the model; r = q + 3 in our case. It is difficult to specify the values of m and k a priori. Gelfand et al (1990) described a method for assessing convergence under different values of m and k, and suggested using k = 10-20 and m = 100 for a variance component problem in a balanced one-way model. However, they increased m to 1 000 when the variance ratio was under consideration. Our numerical results for the same model support their suggestion for the value of k, but indicate that Gibbs sample sizes of 2 000 to 3 000 may be needed for badly behaved marginal distributions (figs 6, 7). The most difficult situation encountered in our experiments was when the intraclass correlation and the sample size were both small. In general, the appropriate values of m and k depend on the number of variables in the model, the shapes of the marginal distributions and the accuracy required to estimate densities.

An appealing aspect of Gibbs sampling is its flexibility. For example, densities of functions of the original variables included in the posterior distribution can be estimated using standard theory of random variable transformation, with minimal additional calculations.

Gibbs sampling is iterative. In this respect, there are 2 issues of concern: convergence and uniqueness. However, Geman and Geman (1984) showed that under mild regularity conditions, the Gibbs sampler converges uniquely to the appropriate marginal distributions. Casella and George (1990) discuss numerical means to accelerate convergence. Another way to speed up convergence is to integrate out analytically some nuisance parameters from the joint posterior distribution before running the Gibbs sampler. In our case, inference is sought on variance components and on their functions. Here, u and β would be integrated out before running the Gibbs sampler. The necessary conditional distributions are given by Gianola and Foulley (1990).

The finite mixture density estimators of σ_e² and σ_u² in [39] and [40] can be thought of as averages of a finite number of inverted χ² densities. This ensures that "point estimates" of variance components, eg, mean, mode and median, will always be within the allowable parameter space. Likewise, interval estimates are also within the parameter space, in contrast to the asymptotic confidence intervals obtained from full or restricted maximum likelihood, which may include negative values.

Once the marginal densities are obtained, it is easy to calculate summary statistics from the posterior distributions. From a decision theoretical viewpoint, the optimum Bayes estimator under quadratic loss is the posterior mean; the posterior median is optimal if the loss function is proportional to the absolute value of the error of estimation; and the posterior mode is optimal if a step loss function is adopted.
Some caution is needed in variance component problems in genetics. For example, some genetic models dictate bounds for a particular variable. If one employs a "sire" model, the intraclass correlation (ρ) must lie inside the [0, 1/4] interval, because heritability is between 0 and 1. This implies that the variance ratio σ_u²/σ_e² is between 0 and 1/3, and that 0 ≤ σ_u² ≤ σ_e²/3. Hence, use of truncated inverted χ² densities in the Gibbs sampler would be more sensible in such a model. On the other hand, if one considers an "animal" model, the bounds for ρ and σ_u²/σ_e² are from 0 to 1, and from 0 to ∞, respectively, and the variance components are unbounded. Since any "sire" model is expressible as an "animal" model, this would solve the problem mentioned above, though at some computational expense.

Gibbs sampling is computer intensive, but in some simple models, such as the sire model employed here, large data sets can be handled (eg, data set VII and fig 7). The feasibility of Gibbs sampling in a large "animal" model is a subject for further research.
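A truncated draw of the kind suggested above for the sire model (rejecting values of σ_u² above σ_e²/3) can be sketched by simple rejection sampling. This is our own sketch of one possible implementation, not part of the original FORTRAN code; the fallback branch is also our own assumption:

```python
import numpy as np

def sample_su2_truncated(ssu, q, se2, rng, max_tries=10000):
    """Draw sigma_u^2 from its scaled inverted chi-square full conditional
    (ssu = u'u, q sires), truncated to the sire-model bound
    0 < sigma_u^2 <= se2/3, by rejection."""
    bound = se2 / 3.0
    for _ in range(max_tries):
        draw = ssu / rng.chisquare(q)        # untruncated: u'u * inv-chi2 with q df
        if draw <= bound:
            return draw
    # Fallback when the conditional puts almost no mass below the bound
    # (our own safeguard; an inverse-cdf method would avoid it entirely).
    return bound

rng = np.random.default_rng(3)
draws = [sample_su2_truncated(ssu=2.0, q=10, se2=6.0, rng=rng) for _ in range(200)]
```

Rejection is cheap when the conditional places most of its mass inside the bound; when it does not, an inverse-cdf draw from the truncated distribution would be the more efficient choice.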
ACKNOWLEDGMENTS
We thank the College of Agriculture and Life Sciences, University of Wisconsin-Madison,
for supporting this work. C Ritter of the Department of Statistics is thanked for useful suggestions. Computational resources were generously provided by the San Diego
Supercomputer Center, San Diego, CA. We thank two anonymous reviewers for comments on the paper.
REFERENCES
Casella G, George EI (1990) Gibbs for Kids (An Introduction to Gibbs Sampling). Paper BU-1098-M, Biometrics Unit, Cornell Univ
Gelfand AE, Smith AFM (1990) Sampling-based approaches to calculating marginal densities. J Am Stat Assoc 85, 398-409
Gelfand AE, Hills SE, Racine-Poon A, Smith AFM (1990) Illustration of Bayesian inference in normal data models using Gibbs sampling. J Am Stat Assoc 85, 972-985
Geman S, Geman D (1984) Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images. IEEE Trans Pattern Anal Machine Intelligence 6, 721-741
Gianola D, Foulley JL (1990) Variance estimation from integrated likelihoods (VEIL). Genet Sel Evol 22, 403-417
Gianola D, Im S, Fernando RL, Foulley JL (1990a) Mixed model methodology and the Box-Cox theory of transformations: a Bayesian approach. In: Advances in Statistical Methods for Genetic Improvement of Livestock (Gianola D, Hammond K, eds) Springer-Verlag, New York, 15-40
Gianola D, Im S, Macedo FWM (1990b) A framework for prediction of breeding value. In: Advances in Statistical Methods for Genetic Improvement of Livestock (Gianola D, Hammond K, eds) Springer-Verlag, New York, 210-238
Harville DA (1974) Bayesian inference for variance components using only error contrasts. Biometrika 61, 383-385
Henderson CR (1953) Estimation of variance and covariance components. Biometrics 9, 226-252
IMSL Inc (1989) IMSL Stat/Library: Fortran Subroutines for Statistical Analysis. Houston, TX
Macedo FWM, Gianola D (1987) Bayesian analysis of univariate mixed models with informative priors. In: Eur Assoc Anim Prod, 38th Annu Meet, Lisbon, Portugal, 35 p
Patterson HD, Thompson R (1971) Recovery of interblock information when block sizes are unequal. Biometrika 58, 545-554
Thompson WA (1962) The problem of negative estimates of variance components. Ann Math Stat