HAL Id: hal-00893947
https://hal.archives-ouvertes.fr/hal-00893947
Submitted on 1 Jan 1992
Original article

Bayesian inference about dispersion parameters of univariate mixed models with maternal effects: theoretical considerations

RJC Cantet, RL Fernando, D Gianola

University of Illinois, Department of Animal Sciences, Urbana, IL 61801, USA

(Received 14 January 1991; accepted 5 January 1992)
Summary - Mixed linear models for maternal effects include fixed and random elements, and dispersion parameters (variances and covariances). In this paper a Bayesian model for inference about such parameters is presented. The model includes a normal likelihood for the data, a "flat" prior for the fixed effects and a multivariate normal prior for the direct and maternal breeding values. The prior distribution for the genetic variance-covariance components is in the inverted Wishart form, and the environmental components follow inverted χ² prior distributions. The kernel of the joint posterior density of the dispersion parameters is derived in closed form. Additional numerical and analytical methods of interest that are suggested to complete a Bayesian analysis include Monte Carlo integration, maximum entropy fit, asymptotic approximations, and the Tierney-Kadane approach to marginalization.

maternal effect / Bayesian method / dispersion parameter
* Correspondence and reprints; present address: Facultad de Agronomía, Universidad de Buenos Aires, Departamento de Zootecnia, 1417 Buenos Aires, Argentina.
INTRODUCTION
Mixed linear models for the study of quantitative traits include, in addition to fixed and random effects, the necessary dispersion parameters. Suppose one is interested in making inferences about variance and covariance components. Except in trivial cases, it is impossible to derive the exact sampling distribution of estimators of these parameters (Searle, 1979) so, at best, one has to resort to asymptotic results. Theory (Cramer, 1986) indicates that the joint distribution of maximum likelihood estimators of several parameters is asymptotically normal, and therefore so are their marginal distributions. However, this may not provide an adequate description of the distribution of estimators with finite sample sizes. On the other hand, the Bayesian approach is capable of producing exact joint and marginal posterior distributions for any sample size (Zellner, 1971; Box and Tiao, 1973), which give a full description of the state of uncertainty posterior to data.

In recent years, Bayesian methods have been developed for variance component estimation in animal breeding (Gianola and Fernando, 1986; Foulley et al, 1987; Macedo and Gianola, 1987; Carriquiry, 1989; Gianola et al, 1990a, b). All these studies found analytically intractable joint posterior distributions of (co)variance components, as Broemeling (1985) has also observed. Further marginalization with respect to dispersion parameters seems difficult or impossible by analytical means. However, there are at least 3 other options for the study of marginal posterior distributions: 1) approximations; 2) integration by numerical means; and 3) numerical integration for computing moments, followed by a fit of the density using these numerically obtained expectations.

Recent advances in computing have encouraged the use of numerical methods in Bayesian inference. For example, after the pioneering work of Kloek and Van Dijk (1978), Monte Carlo integration (Hammersley and Handscomb, 1964; Rubinstein, 1981) has been employed in econometric models (Bauwens, 1984; Zellner et al, 1988), seemingly unrelated regressions (Richard and Steel, 1988) and binary responses (Zellner and Rossi, 1984).
Maternal effects are an important source of genetic and environmental variation in mammalian species (Falconer, 1981). Biometrical aspects of the associated theory were first developed by Dickerson (1947), and quantitative genetic models were proposed by Kempthorne (1955), Willham (1963, 1972) and Falconer (1965). Evolutionary biologists have also become interested in maternal effects (Cheverud, 1984). Although there are maternal sources of variation within and among breeds, we are concerned here only with the former sources.

The purpose of this expository paper is to present a Bayesian model for inference about variance and covariance components in a mixed linear model describing a trait affected by maternal effects. The formulation is general in the sense that it can be applied to the case where maternal effects are absent. The joint posterior distribution of the dispersion parameters is derived. Numerical methods for integration of dispersion parameters regarded as "nuisances" in specific settings are reviewed. Among these, Monte Carlo integration by "importance sampling" (Hammersley and Handscomb, 1964; Rubinstein, 1981) is discussed. Also, fitting a "maximum entropy" posterior distribution (Jaynes, 1957, 1979) using moments obtained by numerical means (Mead and Papanicolaou, 1984; Zellner and Highfield, 1988) is considered. Suggestions on some approximations to marginal posterior distributions of the (co)variance components are given. Asymptotic approximations using the Laplace method for integrals (Tierney and Kadane, 1986) are also described as a means for obtaining approximate posterior moments and marginal densities. Extension of the methods studied here to deal with multiple traits is possible, but the algebra is more involved.

THE BAYESIAN MODEL
Model and prior assumptions about location parameters
The maternal animal model (Henderson, 1988) considered is:

y = Xβ + Z_o a_o + Z_m a_m + E_m e_m + e_o    [1]

where y is an n × 1 vector of records and X, Z_o, Z_m and E_m are known, fixed, n × p, n × a, n × a and n × d matrices, respectively; without loss of generality, the matrix X is assumed to have full-column rank. The vectors β, a_o, a_m and e_m are unknown fixed effects, additive direct breeding values, additive maternal breeding values and maternal environmental deviations, respectively. The n × 1 vector e_o contains environmental deviations as well as any discrepancy between the "structure" of the model (Xβ + Z_o a_o + Z_m a_m + E_m e_m) and the data y. As in Gianola et al (1990b), the vectors β, a_o, a_m and e_m are formally viewed as location parameters of the conditional distribution y | β, a_o, a_m, e_m, but a distinction is made between β and the other 3 vectors depending on the state of uncertainty prior to observing data. It is assumed a priori that β follows a uniform distribution, so as to reflect vague prior knowledge on this vector. Polygenic inheritance is often assumed for a = [a_o', a_m']' (Falconer, 1981; Bulmer, 1985), so it is reasonable to postulate a priori that a follows the multivariate normal distribution:

a | G ~ N(0, G ⊗ A)    [2]

where G is a 2 × 2 matrix with diagonal elements σ²_Ao and σ²_Am, the variance components for additive direct and maternal genetic effects, respectively, and off-diagonal element σ_AoAm, the covariance between direct and maternal additive effects. The a × a positive-definite matrix A has elements equal to Wright's coefficients of additive relationship, or twice Malécot's coefficients of co-ancestry (Willham, 1963). Maternal environmental deviations, presumably caused by the joint action of many factors having relatively small effects, are also assumed to be normally, independently distributed (Quaas and Pollak, 1980; Henderson, 1988) as:

e_m | σ²_Em ~ N(0, I σ²_Em)    [3]

where σ²_Em is the maternal environmental variance. It is assumed that a priori β, a and e_m are mutually independent.
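Assumption [2] can be simulated directly via a Kronecker product. The following minimal numpy sketch uses toy values for G and A; all numbers are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dispersion values (assumptions for illustration only)
G = np.array([[0.40, -0.10],
              [-0.10, 0.20]])      # 2 x 2: direct/maternal additive (co)variances
A = np.array([[1.00, 0.50, 0.50],
              [0.50, 1.00, 0.25],
              [0.50, 0.25, 1.00]]) # toy additive relationship matrix (a = 3)

# Var(a) = G kron A for a = [a_o', a_m']', as in [2]
V = np.kron(G, A)                  # (2a) x (2a)
L = np.linalg.cholesky(V)
a_draw = L @ rng.standard_normal(V.shape[0])

a_o, a_m = a_draw[:3], a_draw[3:]  # direct and maternal breeding values
```

Any positive-definite G and relationship matrix A can be substituted; the Cholesky factor of G ⊗ A turns iid standard normals into a draw from [2].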
For the vector y, it will be assumed that:

y | β, a, e_m ~ N(Xβ + Z_o a_o + Z_m a_m + E_m e_m, I σ²_Eo)    [4]

where σ²_Eo is the variance of the direct environmental effects. It should be noted that [1-4] complete the specification of the classical mixed linear model (Henderson, 1984), but in the latter, distributions [2] and [3] have a frequentist interpretation. A simplifying assumption made in this model, for analytical reasons, is that the direct and maternal environmental effects are uncorrelated.

Prior assumptions about variance parameters
Variance and covariance components, the main focus of this study, appear in the distributions of a, e_m and e_o. Often these components are unknown. In the Bayesian approach, a joint prior distribution must be specified for these, so as to reflect uncertainty prior to observing y. "Flat" prior distributions, although leading to inferences that are equivalent to those obtained from likelihood in certain settings (Harville, 1974, 1977), can cause problems in others (Lindley and Smith, 1972; Thompson, 1980; Gianola et al, 1990b). In this study, informative priors of the type of proper conjugate distributions (Raiffa and Schlaiffer, 1961) are used. A prior distribution is said to be conjugate if the posterior distribution is also in the same family. For example, a normal prior combined with a normal likelihood produces a normal posterior (Zellner, 1971; Box and Tiao, 1973). However, as shown later for the variance-covariance structure under consideration, the posterior distribution of the dispersion parameters is not of the same type as their joint prior distribution. This was also found by Macedo and Gianola (1987) and by Gianola et al (1990b), who studied a mixed linear model with several variance components employing normal-gamma conjugate prior distributions.

An inverted-Wishart distribution (Zellner, 1971; Anderson, 1984; Foulley et al, 1987) will be used for G, with density:

p(G | G_h, n_g) ∝ |G|^-(n_g + 3)/2 exp{-(1/2) tr(G⁻¹ G*)}    [5]

where G* = n_g G_h. The 2 × 2 matrix G_h of "hyperparameters", interpretable as prior values of the dispersion parameters, has diagonal elements s²_Ao and s²_Am, and n_g is the corresponding degree of belief. Choosing these values may be difficult in many applications. Gianola et al (1990b) suggested fitting the distribution to past estimates of the (co)variance components by, eg, a method of moments fit. For traits such as birth and weaning weight in cattle, there is a considerable number of estimates of the necessary (co)variance components in the literature (Cantet et al, 1988). Clearly, the value of G_h influences posterior inferences unless the prior distribution is overwhelmed by the likelihood function (Box and Tiao, 1973). Similarly, as in Hoeschele et al (1987), the inverted χ² distribution (a particular case of the inverted Wishart distribution) is suggested for the environmental variance components, and the densities are:

p(σ²_Eo | n_o, s²_Eo) ∝ (σ²_Eo)^-(n_o/2 + 1) exp{-n_o s²_Eo / (2σ²_Eo)}    [6]

p(σ²_Em | n_m, s²_Em) ∝ (σ²_Em)^-(n_m/2 + 1) exp{-n_m s²_Em / (2σ²_Em)}    [7]

The prior variances s²_Em and s²_Eo are the scalar counterparts of G_h, and n_o and n_m are the corresponding degrees of belief. The marginal distribution of any diagonal element of a Wishart random matrix is χ² (Anderson, 1984). Likewise, the marginal distribution of the diagonal of an inverted-Wishart random matrix is inverted χ² (Zellner, 1971). Note that the 2 variances in [6] and [7] cannot be arranged in matrix form, similar to the additive (co)variance components in G, to obtain an inverted Wishart density unless n_o = n_m. Setting n_g, n_o and n_m to zero makes the prior distributions for all (co)variance components "uninformative", in the sense of Zellner (1971).
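Draws from priors of the type [5]-[7] need only chi-square and normal variates. A sketch using the Bartlett decomposition for the Wishart part follows; ν, S, n_o and s²_o are illustrative assumptions, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)

def rwishart(df, V, rng):
    """One draw W ~ Wishart(df, V) via the Bartlett decomposition."""
    p = V.shape[0]
    L = np.linalg.cholesky(V)
    A = np.zeros((p, p))
    for i in range(p):
        A[i, i] = np.sqrt(rng.chisquare(df - i))
        A[i, :i] = rng.standard_normal(i)
    X = L @ A
    return X @ X.T

# Inverted Wishart draws for G: if W ~ Wishart(nu, S^{-1}), then
# G = W^{-1} has an inverted Wishart distribution with E(G) = S/(nu - p - 1).
nu = 8.0
S = np.array([[2.0, 0.5],
              [0.5, 1.0]])         # toy prior scale matrix (assumption)
Sinv = np.linalg.inv(S)
G_draws = [np.linalg.inv(rwishart(nu, Sinv, rng)) for _ in range(20_000)]

# Scaled inverted chi-square draws for an environmental variance,
# with degrees of belief n_o and prior value s2_o (assumptions):
n_o, s2_o = 10.0, 1.5
sigma2_draws = n_o * s2_o / rng.chisquare(n_o, size=100_000)
# E(sigma2) = n_o * s2_o / (n_o - 2) = 1.875 with these values
```

Averaging many draws recovers the prior means, which gives a quick check that the parameterization chosen for the sampler matches the one intended in the density.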
POSTERIOR DENSITIES
Joint posterior density of all parameters
By Bayes theorem, the joint posterior density of all parameters, denoted [8], is proportional to the product of the likelihood [4] and the prior densities [2], [3], [5], [6] and [7].

To facilitate marginalization of [8], and as in Gianola et al (1990a), let W = [X | Z_o | Z_m | E_m], θ' = [β' | a' | e_m'], and define θ̂ such that:

(W'W + Σ) θ̂ = W'y

where the p + 2a + d square matrix Σ is given by:

Σ = diag{0, σ²_Eo (G⁻¹ ⊗ A⁻¹), (σ²_Eo/σ²_Em) I}

Using this, one can write:

(y − Wθ)'(y − Wθ) + θ'Σθ = y'y − θ̂'W'y + (θ − θ̂)'(W'W + Σ)(θ − θ̂)    [9]

Gianola et al (1990a) noted that y'y − θ̂'W'y in [9] can be interpreted as a "mixed model residual sum of squares". Using [9] in [8] gives the joint posterior density [10], in which θ appears only through the last quadratic form of [9].

Posterior density of the (co)variance components
To obtain the marginal posterior distribution of G, σ²_Em and σ²_Eo, θ must be integrated out of [10]. This can be accomplished by noting that the second exponential term in [10] is the kernel of the (p + 2a + d)-variate normal distribution, and that the variance-covariance matrix is non-singular because X has full-column rank. The remaining terms in [10] do not depend on θ. Therefore, with R_θ being the range of θ, the second exponential term can be integrated over R_θ in closed form using properties of the normal distribution. The marginal posterior distribution of all (co)variance components then is the density [11] so obtained. The structure of [11] makes it difficult or impossible to obtain by analytical means the marginal posterior distribution of G, σ²_Eo or σ²_Em. Therefore, in order to make marginal posterior inferences about the elements of G or the environmental variances, approximations or numerical integration must be used. The latter may give accurate estimates of posterior moments, but in multiparameter situations computations can be prohibitive.
There are 2 basic approaches to numerical integration in Bayesian analysis. The first one is based on classical methods such as quadrature (Naylor and Smith, 1982, 1988; Wright, 1986). Increased power of computers has made Monte Carlo numerical integration (MCI), the second approach, feasible for posterior inference in econometric models (Kloek and Van Dijk, 1978; Bauwens, 1984; Bauwens and Richard, 1985; Zellner et al, 1988) and in other models (Zellner and Rossi, 1984; Geweke, 1988; Richard and Steel, 1988). In MCI the error is inversely proportional to N^(1/2), where N is the number of points at which the integrand is evaluated (Hammersley and Handscomb, 1964; Rubinstein, 1981). Even though this "convergence" of the error to zero is not rapid, neither the dimensionality of the integration region nor the degree of smoothness of the function evaluated enters into this rate.

POSTERIOR MOMENTS VIA MONTE CARLO INTEGRATION
Consider finding moments of parameters having the joint posterior distribution with density [11]. Let r' = [σ²_Ao, σ²_Am, σ_AoAm, σ²_Eo, σ²_Em], and let g(r) be either a scalar, vector or matrix function of r whose posterior expectation we would like to compute. Also, let [11] be represented as p(r | y, H), where H stands for the hyperparameters. Then:

E[g(r) | y, H] = ∫ g(r) p(r | y, H) dr    [12]

assuming the integrals in [12] exist.

Different techniques can be used with MCI to achieve reasonable accuracy. An appealing one for computing posterior moments (Kloek and Van Dijk, 1978; Bauwens, 1984; Zellner and Rossi, 1984; Richard and Steel, 1988) is called "importance sampling" (Hammersley and Handscomb, 1964; Rubinstein, 1981). Let I(r) be a known probability density function defined on the space of r; I(r) is called the importance sampling function. Following Kloek and Van Dijk (1978), let M(r) be:

M(r) = g(r) p(r | y, H) / I(r)    [13]

with [13] defined in the region where I(r) > 0. Then [12] is expressible as:

E[g(r) | y, H] = E_I[M(r)]    [14]

where the expectation is taken with respect to the importance density I(r). Using a standard Monte Carlo procedure (Hammersley and Handscomb, 1964; Rubinstein, 1981), values of r are drawn at random from the distribution with density I(r). Then the function M(r) is evaluated for each drawn value of r, r_i (i = 1, ..., m), say. For sufficiently large m:

E[g(r) | y, H] ≈ (1/m) Σ M(r_i)    [15]

The critical point is the choice of the density function I(r). The closer I(r) is to p(r | y, H), the smaller are the variance of M(r) and the number of drawings needed to obtain a certain accuracy (Hammersley and Handscomb, 1964; Rubinstein, 1981). Another important requirement is that random drawings of r should be relatively simple to obtain from I(r) (Kloek and Van Dijk, 1978; Bauwens, 1984).
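As a toy illustration of [13]-[15] (a single variance rather than the full vector r; all numerical values are assumptions), the sketch below estimates a posterior mean from an unnormalized kernel with an inverted χ² importance density. Averaging the raw weights estimates the reciprocal of the normalizing constant, and the weighted average gives the moment:

```python
import math
import numpy as np

rng = np.random.default_rng(2)

# Unnormalized "posterior" kernel for a single variance r > 0:
# a scaled inverted chi-square shape with nu = 8, s2 = 2 (toy values);
# its exact mean is nu*s2/(nu - 2) = 16/6.
nu, s2 = 8.0, 2.0
def post_kernel(r):
    return r ** -(nu / 2.0 + 1.0) * np.exp(-nu * s2 / (2.0 * r))

# Importance density I(r): a scaled inverted chi-square with its own
# hyperparameters n0, s02 (assumptions), chosen because it is easy to sample.
n0, s02 = 4.0, 1.5
c = (n0 * s02 / 2.0) ** (n0 / 2.0) / math.gamma(n0 / 2.0)
def I_dens(r):
    return c * r ** -(n0 / 2.0 + 1.0) * np.exp(-n0 * s02 / (2.0 * r))

m = 200_000
r = n0 * s02 / rng.chisquare(n0, size=m)   # random drawings from I(r)
w = post_kernel(r) / I_dens(r)             # M(r) up to the unknown constant

post_mean = np.sum(r * w) / np.sum(w)      # self-normalized version of [15]
```

The exact mean here is 16/6 ≈ 2.67, so the estimate can be checked directly; in the actual model, evaluating the kernel [11] at each drawing requires solving the mixed model equations repeatedly.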
For location parameters, the multivariate normal, multivariate and matric-variate t and poly-t distributions have been used as importance functions (Kloek and Van Dijk, 1978; Bauwens, 1984; Bauwens and Richard, 1985; Richard and Steel, 1988; Zellner et al, 1988). For dispersion parameters, drawings can be made from the inverted Wishart distribution. There are several problems yet to be solved, and the procedure is still experimental (Richard and Steel, 1988). However, results obtained so far make MCI by importance sampling promising (Bauwens, 1984; Zellner and Rossi, 1984; Richard and Steel, 1988; Zellner et al, 1988).
Consider calculating the mean of G, σ²_Eo and σ²_Em with joint posterior density as given in [11]. From [13] and [14], with g(r) = r, the required weights are given by [16]. Let now I(r) be the product density [17], where:

I_1(r) = prior density of G ([5] times k_1, the integration constant),
I_2(r) = prior density of σ²_Eo ([6] times k_2, the integration constant),
I_3(r) = prior density of σ²_Em ([7] times k_3, the integration constant).

Then M(r) takes the form [18], where k_0 is the constant of integration of [11]. Evaluating E(r | y, H) then entails the following steps:

a) Draw at random the elements of r from the distributions with densities I_1(r) (inverted Wishart), I_2(r) (inverted χ²) and I_3(r) (inverted χ²). This can be done using, for example, the algorithm of Bauwens (1984).

b) Evaluate k_0 = {∫ [11] dr}⁻¹. Let M_0 be [18] without the factor r. Then k_0 can be evaluated by MCI by computing the average of M_0 and taking its reciprocal.

c) Once M_0 is evaluated, compute M(r) = r M_0.

In order to perform steps (b) and (c), the mixed model equations must be solved and the determinant of W'W + Σ evaluated, repeatedly, for each drawing. The mixed model equations can be solved iteratively, and diagonalization or sparse matrix factorization (Misztal, 1990) can be employed to advantage in the calculation of the determinant.

This procedure can be used to calculate any function of r. For example, the posterior variance-covariance matrix is given by the corresponding expectation [19].

MAXIMUM ENTROPY FIT OF MARGINAL POSTERIOR DENSITIES
A full Bayesian analysis requires finding the marginal posterior distribution of each of the (co)variance components. Probability statements and highest posterior density intervals are obtained from these distributions (Zellner, 1971; Box and Tiao, 1973). Marginal posterior densities can be obtained using the Monte Carlo method (Kloek and Van Dijk, 1978), but this is computationally expensive. An alternative is to compute by MCI some moments (for instance, the first 4) of each parameter, and then fit a function that approximates the necessary marginal distribution. A method that gives a reasonable fit, "maximum entropy" (ME), has been used by Mead and Papanicolaou (1984) and Zellner and Highfield (1988). Choosing the ME distribution means assuming the "least" possible (Jaynes, 1979), ie, using information one has but not using what one does not have. An ME fit based on the first 4 moments implies constructing a distribution that does not use information beyond that conveyed by these moments. Jaynes (1957) set the basis for what is known as the "ME formalism" and found a role for it to play in Bayesian statistics.

The entropy (W) of a continuous distribution with density p(x) is defined (Shannon, 1948; Jaynes, 1957, 1979) to be:

W = −∫ p(x) ln p(x) dx    [20]

The ME distribution is obtained from the density that maximizes [20] subject to the conditions:

∫ x^i p(x) dx = μ_i    (i = 0, 1, ..., 4)    [21]

where μ_0 = 1 (by definition of a proper density function) and μ_i (i = 1, ..., 4) are the first 4 moments of the distribution of x. Zellner and Highfield (1988) expressed the function to be maximized as the Lagrangian:

H = −∫ p(x) ln p(x) dx + Σ l_i [μ_i − ∫ x^i p(x) dx]    [22]

where the l_i (i = 0, ..., 4) are Lagrange multipliers and l = [l_0, l_1, l_2, l_3, l_4]'. Note that formula [22] is expressible as a single integral over x, [23]. Using Euler's equation (Hildebrand, 1972), the condition for a stationary point is [25]. Because H does not depend on p'(x), [25] holds only if ∂H/∂p(x) = 0, ie, if:

−ln p(x) − 1 − Σ l_i x^i = 0    [26]

plus the 5 constraints given in [21]. From [26], the density of the ME distribution of x has the form:

p(x) = exp(−l_0 − l_1 x − l_2 x² − l_3 x³ − l_4 x⁴)    [27]

To specify the ME distribution completely, l must be found. Zellner and Highfield (1988) suggested a numerical solution based on Newton's method. Using [27], the side conditions [21] can be written as:

G_i(l) = ∫ x^i exp(−l_0 − l_1 x − l_2 x² − l_3 x³ − l_4 x⁴) dx = μ_i    (i = 0, 1, ..., 4)    [28]

The derivatives of the G_i with respect to the l_j are simply moments (with negative sign) of the maximum entropy distribution:

∂G_i/∂l_j = −∫ x^(i+j) exp(−l_0 − l_1 x − l_2 x² − l_3 x³ − l_4 x⁴) dx    [29]

Expanding G_i to first order in [29] about the current trial value of l, and setting the expansion equal to [28], one obtains a linear system in the corrections to l; thus, an iteration is established. Observing that 0 ≤ i + j ≤ 8, the system can be written in matrix notation as [30]. This system is solved for δl to obtain l_new = l_old + δl, the vector of new trial values. Iteration continues until δl becomes appropriately small. Zellner and Highfield (1988) showed that the coefficient matrix in [30] is positive definite, so solutions are unique.

In summary, the method includes 3 types of computations. First, the moments μ_1 − μ_4 must be computed by some method such as MCI; this is done only once. Second, the G_i values (i = 0, 1, ..., 8) are computed at every round of iteration by carrying out unidimensional integrations, as indicated in [28]. Third, the 5 × 5 system [30] is solved. At convergence, the ME density [27] is employed to approximate the marginal posterior density of interest.
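The three computations can be sketched on a grid. In this toy run the target moments are those of a standard normal (μ = [1, 0, 1, 0, 3], an assumption for illustration), so Newton's method should drive the fitted exponent towards l_2 ≈ 1/2, l_0 ≈ ln√(2π) and the remaining multipliers towards 0:

```python
import numpy as np

dx = 0.004
x = np.arange(-7.0, 7.0, dx)               # grid for the unidimensional integrals
mu = np.array([1.0, 0.0, 1.0, 0.0, 3.0])   # mu_0..mu_4; here: N(0, 1) moments

powers = np.vstack([x ** i for i in range(9)])   # x^0 .. x^8 (0 <= i + j <= 8)
l = np.array([1.0, 0.0, 0.4, 0.0, 0.01])   # trial l_0..l_4, near a Gaussian shape

for _ in range(50):
    q = np.exp(-(l[:, None] * powers[:5]).sum(axis=0))    # ME density form [27]
    Gi = powers[:5] @ q * dx                              # G_i(l), as in [28]
    # dG_i/dl_j = -integral of x^(i+j) q(x): moments up to order 8, system [30]
    Amat = (powers[:5, None, :] * powers[None, :5, :] * q).sum(axis=2) * dx
    delta = np.linalg.solve(Amat, Gi - mu)                # Newton correction
    l = l + delta

# At convergence the side conditions [21] hold: G_i(l) = mu_i
```

Because the integrand decays rapidly, simple grid integration is accurate here; in practice the starting value of l matters, and a damped Newton step may be needed for moments far from Gaussian shape.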
SOME ANALYTICAL APPROXIMATIONS TO MARGINAL POSTERIOR DENSITIES
Because numerical integration can be computationally expensive and the accuracy of MCI in this type of problem is still unknown, we consider several approximations to marginal posterior distributions.

The mode of the posterior density [11] can be found by maximizing it jointly with respect to G, σ²_Em and σ²_Eo. Foulley et al (1987), Gianola et al (1990b) and Macedo and Gianola (1987) showed how this could be done with a simple algorithm based on first derivatives. Additional algorithms can be constructed using second derivatives, and the necessary expressions are given in the Appendix. The solutions can be viewed as weighted averages of REML "estimators" of the dispersion parameters and of the hyperparameters G_h, s²_Eo and s²_Em. Let the modal values so obtained be Ĝ, σ̂²_Em and σ̂²_Eo, or r̂ in compact notation.

Consider approximations to the marginal density of G, because this matrix contains the parameters of primary interest. One can write:

p(G | y, H) = ∫∫ p(G | σ²_Em, σ²_Eo, y, H) p(σ²_Em, σ²_Eo | y, H) dσ²_Em dσ²_Eo    [31]

where p(σ²_Em, σ²_Eo | y, H) is the posterior density of σ²_Em and σ²_Eo obtained after integrating G out of [11]. It seems impossible to carry out this integration analytically. Following ideas in Gianola and Fernando (1986), we propose as a first approximation:

p(G | y, H) ≈ p(G | σ̂²_Em, σ̂²_Eo, y, H)    [32]

It would be better to use the modal values of p(σ²_Em, σ²_Eo | y, H) rather than σ̂²_Em and σ̂²_Eo, but finding this distribution does not seem feasible. Using [32] in [11], one obtains the density [33]. It should be noted that now θ̂ = f(G, σ̂²_Em, σ̂²_Eo) and Σ = h(G, σ̂²_Em, σ̂²_Eo). Then, the MCI method can be used to compute moments of [33]. The additional degree of marginalization with respect to [11] achieved by this approximation may be small, but savings in computing accrue because drawing values of σ²_Em and σ²_Eo from I_2(r) and I_3(r), respectively, is no longer necessary.

In the preceding, evaluating the quadratic forms at the modal values and using these developments in [33], we write, after neglecting |W'W + Σ|^(-1/2), the density [34]. This density is in the inverted Wishart form, with parameters n_g + a and G*, provided G* is positive definite. If not, one can "bend" this matrix following the ideas of Hayes and Hill (1981). The computational advantage of [34] over [33] is that y'y − θ̂'W'y would be evaluated only once, at Ĝ, σ̂²_Em and σ̂²_Eo. Further, the inverted Wishart form of [34] yields an analytical solution for the (approximate) marginal posterior densities of σ²_Ao and σ²_Am, so approximate probability statements about elements of G can be made with relative ease. A third approximation would be writing [34] as [35], so we would have an inverted Wishart distribution with hyperparameters n_g + a and Ĝ. If Ĝ is obtained with an algorithm that guarantees positive semi-definiteness, such as EM (Dempster et al, 1977), this would circumvent the potential problem posed by G* in [34].

The fourth approximation involves the matrix of second derivatives (C, say) of the logarithm of [11] with respect to the unique elements of G, σ²_Em and σ²_Eo, and then evaluating C at Ĝ, σ̂²_Em and σ̂²_Eo. The second derivatives are given in the Appendix. Invoking the asymptotic normality property of posterior distributions (Zellner, 1971), one would approximately have:

r | y, H ~ N(r̂, −C⁻¹)    [36]

where it is assumed that the matrix −C = f(Ĝ, σ̂²_Em, σ̂²_Eo) has full rank.

THE TIERNEY-KADANE APPROXIMATIONS
The approximation in [36] produces reasonable results when the posterior distribution is unimodal, which holds for large enough samples. Tierney and Kadane (1986) described another approximation (based on Laplace's method for integrals), and this is reviewed in the following section.

Single parameter situation

Let g(r) = g be a positive function of the scalar parameter r. Then:

E(g | y) = c ∫ g(r) l(r) π(r) dr,  with c⁻¹ = ∫ l(r) π(r) dr    [37]

where l is the likelihood function, π is the prior density and c is the integration constant. With n being the sample size, let:

nL(r) = ln[l(r) π(r)],  nL*(r) = nL(r) + ln g(r)    [44]

Using [44] in [37]:

E(g | y) = ∫ exp[nL*(r)] dr / ∫ exp[nL(r)] dr    [45]

The method of Tierney and Kadane (1986) continues as follows. Let r_m be the posterior mode (which is also the maximizer of L), let L'(r) and L''(r) be the first and second derivatives of L with respect to r, and let σ² = −1/[n L''(r_m)]. Using a Taylor series expansion of nL(r) about r_m, noting that L'(r_m) = 0 and retaining terms up to second order, the expansion becomes:

nL(r) ≈ nL(r_m) − (r − r_m)²/(2σ²)    [46]

Using this, the denominator in [45] can be approximated as:

∫ exp[nL(r)] dr ≈ (2π)^(1/2) σ exp[nL(r_m)]    [47]

With r* the maximizer of L* and σ*² = −1/[n L*''(r*)], the numerator of [45] is approximated in the same way. Taking the ratio of the two approximations, as required in [45], then, approximately, we have:

E(g | y) ≈ (σ*/σ) exp{n[L*(r*) − L(r_m)]}    [48]

An interesting aspect of this approximation is that only first and second order derivatives are needed, which is less tedious than other approximations suggested by, eg, Mosteller and Wallace (1964) and Lindley (1980), requiring evaluation of third derivatives. The posterior variance can also be approximated by finding the posterior mean of g²; the only modification needed is to define L* with ln g² in place of ln g, giving the approximation [49] to the variance via Var(g | y) = E(g² | y) − [E(g | y)]².

The multiparameter case

When r is a vector, as in this paper, [48] generalizes to:

E(g | y) ≈ (|H*| / |H|)^(1/2) exp{n[L*(r*) − L(r_m)]}    [50]

where r* and r_m maximize L* and L, respectively, and H* and H are minus the inverse matrices of second derivatives of L* and L with respect to r, evaluated at r* and r_m, respectively.
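Approximation [48] can be checked on a toy scalar case with a known answer. Here the "posterior" is a Gamma(a, b) kernel (a, b and g(r) = r are illustrative assumptions), so the exact posterior mean is a/b; n is absorbed into L:

```python
import numpy as np

a, b = 10.0, 2.0            # toy Gamma(a, b) "posterior"; exact mean a/b = 5

def L(r):                   # nL(r): log posterior kernel, n absorbed into L
    return (a - 1.0) * np.log(r) - b * r

def Lstar(r):               # nL*(r) = nL(r) + ln g(r), with g(r) = r
    return L(r) + np.log(r)

def mode_and_sigma(f):
    # crude grid maximization, then curvature by a central difference
    grid = np.linspace(0.01, 20.0, 200_001)
    m = grid[np.argmax(f(grid))]
    h = 1e-3
    f2 = (f(m + h) - 2.0 * f(m) + f(m - h)) / h ** 2     # f''(mode) < 0
    return m, np.sqrt(-1.0 / f2)

r_m, sigma = mode_and_sigma(L)          # posterior mode, r_m = (a - 1)/b
r_s, sigma_s = mode_and_sigma(Lstar)    # maximizer of L*, r* = a/b

# Tierney-Kadane approximation [48] to E(g | y)
post_mean = (sigma_s / sigma) * np.exp(Lstar(r_s) - L(r_m))
```

With these values the approximation gives about 5.005 against the exact 5, an error of roughly 0.1%; only modes and second derivatives were needed.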
Marginal posterior densities

The method can also be used to approximate marginal posterior densities of individual parameters in r. Partition r' as [r_1, r_2']. If the order of r is p, say, then r_2 is of order p − 1 (4 in our case). The marginal posterior density of r_1 is:

p(r_1 | y) = ∫ π(r_1, r_2 | y) dr_2    [51]

where π(r_1, r_2 | y) is the joint posterior density of r. From preceding developments, the denominator of this density can be approximated by a Laplace expansion about r_m, where r_m is the mode of the posterior distribution of r and L''(r_m) is the matrix of second derivatives of L with respect to r; this gives [53], where l(r_m) is the log-likelihood evaluated at r_m. Hence, [53] becomes [54]. Consider now the numerator of [51], and write it as the integral over r_2 of the exponential of a function, [55], in which r_1 is fixed. Define r_2m(r_1) to be the (p − 1) × 1 vector that maximizes this function; this maximizer can be found employing the derivatives in the Appendix. Then, similarly to [53], we can write the approximation [56], where B''(r_1, r_2m(r_1)) is the (p − 1) × (p − 1) matrix of second derivatives of B with respect to r_2. Taking the ratio between [56] and [54], the posterior density of r_1 in [51] is approximately [57]. The moments of the posterior distribution of r_1 must then be found numerically.
Remarks

It has been shown that the method of Tierney and Kadane (1986) has less error than the usual normal approximation centered at the posterior mode, with the order of approximation being O(n⁻²). However, it also requires that the functions to be expanded be either unimodal or dominated by a single mode, so sample size must be sufficiently large for this to hold.

The requirement that g(r) be a positive function is restrictive. Tierney and Kadane (1986) pointed out that for the approximation to be accurate for a function g taking both positive and negative values, the posterior distribution of g must be concentrated almost entirely on one side of the origin. However, Tierney et al (1988) extended the method to apply to expectations and variances of non-positive functions. To obtain a second-order approximation to E[g(r)], they used the method of Tierney and Kadane (1986) to approximate the moment generating function of g(r), whose integrand is positive, and then differentiated the result. Another difficulty arises in the approximation [49] to the posterior variance of g(r). Unless computations are made with sufficient precision, [49] can have a large error or turn up negative. Similar problems can arise in the computation of posterior covariances, ie, a covariance matrix computed from [58] may not be positive semi-definite.

CONCLUSION
This paper presents theory and techniques for carrying out a Bayesian analysis of dispersion parameters in a univariate model for maternal effects. However, implementation of the methods suggested here poses difficulties to quantitative geneticists interested in the analysis of large data sets. The development of feasible computing techniques is a challenge to researchers in the area of application of numerical methods to animal breeding. Research is underway to identify more promising algorithms to approximate marginal moments of posterior distributions, a non-trivial problem as new techniques are developed and there is little indication on the choice to make for estimating (co)variance components under "non-exchangeability" of model [1]. Recently, Gelfand and Smith (1990) and Gelfand et al (1990) described the Gibbs sampler, a potential competitor of the methods presented here.

REFERENCES
Anderson TW (1984) An Introduction to Multivariate Statistical Analysis. J Wiley and Sons, New York, NY
Bauwens L, Richard JF (1985) A 1-1 poly-t random variable generator with application to Monte Carlo integration. J Econometrics 29, 19-46
Box GEP, Tiao GC (1973) Bayesian Inference in Statistical Analysis. Addison-Wesley Publishing Co, Reading, MA
Broemeling LD (1985) Bayesian Analysis of Linear Models. Marcel Dekker, New York, NY
Bulmer MG (1985) The Mathematical Theory of Quantitative Genetics. Clarendon Press, Oxford, UK, 2nd edn
Cantet RJC, Kress DD, Anderson DC, Doornbos DE, Burfening PJ, Blackwell RL (1988) Direct and maternal variances and covariances and maternal phenotypic effects on preweaning growth of beef cattle. J Anim Sci 66, 648-660
Carriquiry AL (1989) Bayesian prediction and its application to the genetic evaluation of livestock. PhD thesis, Iowa State University, Ames, IA
Chen CF (1979) Bayesian inference for a normal dispersion matrix and its application to stochastic multiple regression analysis. J R Stat Soc Ser B 41, 235-248
Cheverud JM (1984) Evolution by kin selection: a quantitative genetic model illustrated by maternal performance in mice. Evolution 38, 766-777
Cramer JS (1986) Econometric Applications of Maximum Likelihood Methods. Cambridge University Press, Cambridge, UK
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B 39, 1-38
Dickerson GE (1947) Composition of hog carcasses as influenced by heritable differences in rate and economy of gain. Iowa Agric Exp Stat, Ames, IA, Res Bull 354, 489-524
Falconer DS (1965) Maternal effects and selection response. In: Genetics Today. Proc XIth Int Congr Genetics. The Hague, The Netherlands, 763-774
Falconer DS (1981) Introduction to Quantitative Genetics. Longman, London, UK
Foulley JL, Im S, Gianola D, Hoeschele I (1987) Empirical Bayes estimation of parameters for n polygenic binary traits. Genet Sel Evol 19, 197-224
Foulley JL, Lefort G (1978) Méthodes d'estimation des effets directs et maternels en sélection animale. Ann Genet Sel Anim 10, 475-496
Gelfand AE, Smith AFM (1990) Sampling-based approaches to calculating marginal densities. J Am Stat Assoc 85, 398-409
Gelfand AE, Hills SE, Racine-Poon A, Smith AFM (1990) Illustration of Bayesian inference in normal data models using Gibbs sampling. J Am Stat Assoc 85, 972-985
Geweke J (1988) Antithetic acceleration of Monte Carlo integration in Bayesian inference. J Econometrics 38, 73-89
Gianola D, Fernando RL (1986) Bayesian methods in animal breeding. J Anim Sci 63, 217-244
Gianola D, Im S, Fernando RL, Foulley JL (1990a) Mixed model methodology and the Box-Cox theory of transformations: a Bayesian approach. In: Advances in Statistical Methods for the Genetic Improvement of Livestock (Gianola D, Hammond K, eds) Springer-Verlag, Berlin, 15-40
Gianola D, Im S, Macedo FW (1990b) A framework for prediction of breeding value. In: Advances in Statistical Methods for the Genetic Improvement of Livestock (Gianola D, Hammond K, eds) Springer-Verlag, Berlin
Haber S (1970) Numerical evaluation of multiple integrals. SIAM Rev 12, 481-526
Hammersley JM, Handscomb DC (1964) Monte Carlo Methods. Methuen, London, UK
Harville DA (1974) Bayesian inference for variance components using only error contrasts. Biometrika 61, 383-384
Harville DA (1977) Maximum likelihood approaches to variance component estimation and to related problems. J Am Stat Assoc 72, 320-340
Hayes JF, Hill WG (1981) Modification of estimates of parameters in the construction of genetic selection indices ("bending"). Biometrics 37, 483-493
Henderson CR (1984) Applications of Linear Models in Animal Breeding. Univ of Guelph, Guelph, Canada
Henderson CR (1988) Theoretical basis and computational methods for a number of different animal models. J Dairy Sci 71, 1-16
Hildebrand FB (1972) Advanced Calculus for Applications. Addison-Wesley Publishing Co, Reading, MA, 2nd edn
Hoeschele I, Gianola D, Foulley JL (1987) Estimation of variance components with quasi-continuous data using Bayesian methods. J Anim Breed Genet 104, 334-349
Jaynes ET (1957) Information theory and statistical mechanics. Phys Rev 106, 620-630
Jaynes ET (1979) Where do we stand on maximum entropy. In: The Maximum Entropy
Formalism(Levine
RD,
TribusM,
eds)
The MITPress,
Cambridge,
MAKempthorne
0(1955)
The correlation between relatives in randommating
popu-lations. Cold
Spring
HarborSymp Quant
Biol22,
60-78Kirkpatrick M,
Lande R(1969)
The evolution of maternal characters. Evolution43,
585-503Kloek
T,
VanDijk
HK(1978)
Bayesian
estimates ofequation
system parameters: anapplication
ofintegration by
Monte Carlo. Econometrica46,
1-19Lande
R,
Price T(1989)
Genetic correlations and maternal effect coeflicientsobtained from
offspring-parent
regression.
Genetics122,
915-922Lindley
DV(1980) Approximate
Bayesian
methods. In:Bayesian
Statistics(Bernardo
JM, Degroot MH, Lindley DV,
SmithAFM,
eds)
Valencia
University
Press,
Valencia,
Spain
Lindley
DV,
Smith AFM(1972)
Bayes
estimates for the linear model. J R Stat Soc Ser B34,
1-18Macedo
FW,
Gianola D(1987)
Bayesian analysis
of univariate mixed models with informativepriors.
Eur Assoc Animal Prod XVIII AnnuMeet, Lisbon,
Portugal
Mead
LR,
Papanicolaou
N(1984)
Maximum entropy in theproblem
of moments. J MathPhys 25, 2404-2417
Misztal I
(1990)
Restricted maximum likelihood estimation of variance components in animal modelusing
sparse matrix inversion and asupercomputer.
JDairy
Sci73, 163-172
Mosteller
F,
Wallace D(1964)
Inference
andDisputed
Authorship:
The Federalist.Addison-Wesley Publishing Co,
Reading,
MANaylor JC,
Smith AFM(1982)
Applications
of a method for the efficientcomputa-tion of
posterior
distributions.Appl
Stat31,
214-225(auaas RL,
Pollak EJ(1980)
Mixed modelmethodology
for farm and ranch beef cattletesting
program. J Anim Sci 51, 1277-1287Raiffa
H,
Schlaiffer R(1961)
Applied
Statistical DecisionTheory.
HarvardUniver-sity Press, Boston,
MARichard
JF,
Steel MFJ(1988)
Bayesian analysis
of systems ofseemingly
unrelatedregression equations
under a recursive extended naturalconjugate prior density.
J Econometrics
38,
7-37Riska
B,
Rutledge
JJ,
Atchley
WR(1985)
Covariance between direct and maternalgenetic
effects inmice,
with a model ofpersistent
environmental influences. Genet Res45,
287-297Rubinstein RY
(1981)
Simulation and the Monte Cardo Method. JWiley
&Sons,
New
York,
NYSearle SR
(1979)
Notes on variance component estimation. A detailed account ofmaximum likelihood and kindred
methodology.
Paper
BU-673-M,
BiometricsUnit,
Cornell
University, Ithaca,
NYShannon CE
(1948)
A mathematicaltheory
of communications. BellSyst
Tech J27, 379-423;
reprinted
In: The MathematicalTheory of
Communications(Shannon
CE,
WeaverW,
eds)
TheUniversity
of IllinoisPress, Urbana,
ILThompson
R(1980)
Maximum likelihood estimation of variance components. MathOperationforsch Stat,
11, 95-103
Tierney L,
Kadane JB(1986)
Accurateapproximations
forposterior
moments andmarginal
densities. J Am Stdt Assoc81, 82-86
Tierney
L,
KassRE,
Kadane JB(1986)
Fully exponential Laplace approximations
of
expectations
and variances ofnon-positive
functions. TechRep
418,
Dept
Stat,
Carnegie
1!-ZellonUniversity,
Pittsburgh,
PA,
USAWillham RL
(1963)
The covariance between relatives for characterscomposed
ofcomponents contributed
by
related individuals. Biometrics19,
18-27Willham RL
(1972)
The role of maternal effects in animalbreeding.
III. Biometricalaspects of maternal effects. J Anim Sci
35,
1288-1293Wright
DE(1986)
A note on the construction ofhighest posterior density
intervals.Appd
Stat35,
49-53Zellner A
(1971)
An Introduction toBayesian Inference
in Econometrics. JWiley
&Sons,
NewYork,
NYZellner
A,
BauwensL,
VanDijk
HK(1988)
Bayesian specification analysis
andestimation of simultaneous
equations
modelsusing
Monte Carlo methods. JEcono-metrics
38,
39-72Zellner
A,
Highfield
RA(1988)
Calculation of maximum entropy distributions andapproximation
ofmarginal
posterior
distributions. J Econometrics37,
195-209 ZellnerA,
Rossi P(1984)
Bayesian analysis
of dichotomousquantal
responseAPPENDIX
First and second derivatives of the log-posterior of all (co)variance components

The log of (11) is (A.1). Let M' = [0 | I_2a | 0] be a 2a x (p + 2a + d) matrix such that M'θ̂ = â. In the same way, let N' = [0 | 0 | I_d] be a d x (p + 2a + d) matrix such that N'θ̂ yields the d x 1 vector of environmental effects. The 0 represents a matrix of appropriate order with all elements equal to zero.

To simplify the derivation, we decompose (A.1) into components, take derivatives with respect to an element of G (g_ij, say), σ_km or σ²_Eo, and collect results to obtain the desired expressions.
Derivatives of (y'y - θ̂'W'y)

The term y'y does not depend on the dispersion parameters. The other term is θ̂'W'y = y'WCW'y, so that:

where E_ij is a 2 x 2 matrix with all elements equal to zero, with the exception of a one in position (i, j). Note that if e_i (e_j) is a 2 x 1 vector with a 1 in the i-th (j-th) position, then E_ij = e_i e_j'. The notation D(M_1, ..., M_s) stands for a block diagonal matrix with the s blocks being equal to M_i (i = 1, ..., s).
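As a small illustrative sketch (not code from the paper), the construction E_ij = e_i e_j' and the block-diagonal notation D(M_1, ..., M_s) defined above can be written directly in NumPy; indices here are 0-based, unlike the paper's 1-based i, j:

```python
import numpy as np

# E(i, j) builds E_ij = e_i e_j': a matrix of zeros with a single 1 in
# position (i, j); 0-based indices.
def E(i, j, dim=2):
    ei = np.zeros((dim, 1)); ei[i, 0] = 1.0
    ej = np.zeros((dim, 1)); ej[j, 0] = 1.0
    return ei @ ej.T  # outer product e_i e_j'

# D(M_1, ..., M_s): block-diagonal matrix with the given 2-D blocks.
def D(*blocks):
    rows = sum(b.shape[0] for b in blocks)
    cols = sum(b.shape[1] for b in blocks)
    out = np.zeros((rows, cols))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r += b.shape[0]
        c += b.shape[1]
    return out

# E_01 has its single 1 in row 0, column 1
assert np.array_equal(E(0, 1), np.array([[0.0, 1.0], [0.0, 0.0]]))
# D(E_01, E_01) stacks one 2 x 2 block per animal along the diagonal
assert D(E(0, 1), E(0, 1)).shape == (4, 4)
```

Repeating the same 2 x 2 block s times along the diagonal gives the pattern that arises when the same element g_ij of G is differentiated for each of the s animals.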
Since θ̂ = CW'y, and in a similar way,

Second derivatives are obtained from [A.3] to [A.5]. Additional second derivatives are:
Derivatives of log |W'W + Σ|

We use the result in Searle (1979) that, for a nonsingular matrix A depending on a scalar parameter x,

∂ log |A| / ∂x = tr(A⁻¹ ∂A/∂x)   [A.12]
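The log-determinant derivative cited from Searle (1979), d log|A(x)|/dx = tr(A⁻¹ dA/dx), can be verified numerically. The sketch below (not from the paper) uses an arbitrary one-parameter family A(x) = B + xD, with B symmetric positive definite, and compares a central finite difference against the trace formula:

```python
import numpy as np

# Arbitrary positive definite family A(x) = B + x*D, chosen only for illustration.
rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
B = M @ M.T + n * np.eye(n)            # symmetric positive definite
Dm = np.diag(np.arange(1.0, n + 1.0))  # dA/dx, constant in x

def A(x):
    return B + x * Dm

x0, h = 0.5, 1e-6
# Central finite difference of log|A(x)| at x0; slogdet returns (sign, log|det|)
fd = (np.linalg.slogdet(A(x0 + h))[1] - np.linalg.slogdet(A(x0 - h))[1]) / (2 * h)
# Analytic value from the identity: tr(A(x0)^{-1} dA/dx)
analytic = np.trace(np.linalg.solve(A(x0), Dm))
assert abs(fd - analytic) < 1e-5
```

The same identity, applied with A = W'W + Σ and x = g_ij, produces the derivatives used in the remainder of the appendix.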
Using [A.12], the derivative of log |W'W + Σ| with respect to g_ij is:

Other derivatives
We now consider the remaining derivatives; these are:

with [A.12] used to obtain the second term on the right of [A.22]. Likewise

Second derivatives obtained from