HAL Id: hal-00894159
https://hal.archives-ouvertes.fr/hal-00894159
Submitted on 1 Jan 1997
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Bayesian analysis of calving ease scores and birth
weights
Cs Wang, Rl Quaas, Ej Pollak
To cite this version:
Bayesian analysis
of
calving
ease scores
and birth
weights
CS Wang*
RL
Quaas
EJ
Pollak
Morrison
Hall, Department of
AnimalScience,
CornellUniversity,
Ithaca,
NY14853,
USA(Received
3January 1996; accepted
13February
1997)
Summary - In a
typical
two stageprocedure, breeding
valueprediction
forcalving
easein a threshold model is conditioned on estimated
genetic
and residual covariance matrices.These covariance matrices are
traditionally
estimatedusing analytical approximations.
AGibbs
sampler
formaking
fullBayesian
inferences about fixedeffects, breeding values,
thresholds andgenetic
and residual covariance matrices toanalyze jointly
a discretetrait with
multiple
orderedcategories
(calving
easescores)
and acontinuously
Gaussiandistributed trait
(birth
weights)
is described. The Gibbssampler
isimplemented by
drawing
from a set of densities -(truncated)
normal,
uniform and inverted Wishart-making implementation
of Gibbssampling straightforward.
The method should be useful forestimating genetic
parameters based on features of theirmarginal posterior
densitiestaking
into full account uncertainties inestimating
other parameters. Forroutine,
large-scale estimation of location parameters
(breeding
values),
Gibbssampling
isimpractical.
Thejoint posterior
modegiven
theposterior
mean estimates of thresholds anddispersion
parameters is
suggested.
Ananalysis
of simulatedcalving
ease scores and birthweights
is described.dystocia
/
beef cattle/
thresholdmodel / Bayesian
method/
Gibbs samplingRésumé - Analyse bayésienne des notes de difficultés de vêlage et des poids de naissance. Dans une
procédure
typique à deuxétapes,
l’évaluationgénétique
pour ladiff’-cculté
devêlage
dans un modèle à seuil est conditionnée par les matrices de covariancegénétiques
et résiduelles. Ces matrices de covariance sont habituellement estimées autravers
d’approximations analytiques.
On décritl’échantillonnage
de Gibbs permettantd’effectuer
desinférences bayésiennes complètes
à propos deseffets fixes,
des valeursgénétiques,
desseuils,
et des matrices de covariancegénétiques
etrésiduelles,
pouranalyser
conjointement un caractère discret à
catégories multiples
ordonnées(note
dedifficulté
devêlage)
et un caractère continu gaussien(poids
denaissance).
L’échantillonnage
de Gibbsest assez
simple
à partir de densités de divers types : normale(tronquée),
uniforme
etWishart inverse. La méthode est utile pour estimer les
paramètres génétiques
à partirde leurs distributions
marginales
a posteriori,après
prise en compte des incertitudes*
Correspondence
and reprints: Pfizer CentralResearch, T201,
Eastern PointRd, Groton,
concernant les autres
paramètres. L’échantillonnage
de Gibbs n’est pasfaisable
en routine pour estimer les valeursgénétiques.
Onsuggère
le mode de la distributionconjointe
aposteriori,
pour des valeurs des seuils et desparamètres
dedispersion correspondant
à leurs moyennes aposteriori.
On décrit uneanalyse
de notes dedifficulté
devêlage
et depoids
de naissance simulés.dystocie
/
bovins à viande/
modèle à seuil/
méthode bayésienne/
échantillonnagede Gibbs
INTRODUCTION
Calving
ease is considered a calf trait and recordedsubjectively
as one of severalexclusive ordered
categories.
Forexample,
for American Simmentalcattle,
calving
ease is scored as 1
(natural
calving,
noassistance),
2(easy
pull),
3(hard pull)
or 4
(mechanical
force orCesarean).
Calf size(birth weight)
affects ease of birth: thebigger
the calfis,
the morelikely
the birth will be difficult(Koger
etal,
1967;
Pollak,
1975).
In this paper, we consider
joint modeling
ofcalving
ease scores and birthweights
using
the threshold modelconcept
ofWright
(1934).
In a thresholdmodel,
anunderlying
continuous variable ispostulated
forcalving
ease. A set of thresholdsdivides this continuous variable into the discrete
calving
ease scoresactually
recorded. Gianola
(1982)
and Gianola andFoulley
(1983)
consideredBayesian
analysis
ofsingle
trait threshold modelsassuming
knowngenetic
variance. Harville and Mee(1984)
andFoulley
et al(1987) gave
approximate
methods for variancecomponent
estimation.Foulley
et al(1983)
developed
a method to deal with abinary
trait and two continuous traits withoutallowing
formissing
data,
while Janss andFoulley
(1993)
extended the method to handle data withmissing
patterns.
In 1990 at CornellUniversity,
asystem
for routine sire evaluation ofcalving
ease scores and birthweights jointly allowing
for allpossible missing
data framework wasimplemented.
Thissystem
assumed asire-mgs
(maternal grandsire)
linearmodel for the
underlying scale;
itpredicts
thefrequency
of unassisted births for American Simmental cattle(Pollak
etal,
1995 perscomm).
This evaluationsystem
also assumed that
genetic
and residual covariance matrices and thresholds wereknown. Variance
components
were estimated(Dong
etal,
1991)
by
extension ofFoulley
et al(1987).
Hoeschele et al(1995)
described further extensions ofFoulley
et al
(1983)
and Janss andFoulley
(1993)
to a situation of onemultiple
orderedcategorical
trait and several continuous traits.A
difficulty
inestimating
parameters
under threshold models is that the like-lihood ormarginal posterior
distributions do not have closed forms and approx-imations are used. With thehelp
of Monte Carlomethods,
inparticular
Gibbssampling
(Geman
andGeman, 1984;
Gelfand etal,
1990),
theseapproximations
are no
longer
needed.Wang
et al(1993, 1994a,b)
describedmaking Bayesian
in-ferences in a univariate linear model in an animalbreeding
contextusing
Gibbssampling.
Sorensen et al(1994)
demonstrated how inference about response tose-lection in a linear model can be made.
Berger
et al(1995)
applied
the methods of Sorensen et al(1994)
andWang
et al(1994b)
toanalyze
a selectionexperiment of
the scope to multitrait linear models.
Bayesian analysis
of univariate thresholds via Gibbssampling
in an animalbreeding
context wasrecently
describedby
Sorensenet al
(1995)
by extending
Albert and Chib(1993).
For abinary
trait,
Hoeschele andTier
(1995)
compared frequency
properties
of three variancecomponent
estimators: mode ofapproximate
marginal
likelihood(Foulley
etal,
1987),
marginal
posterior
mode and mean via Gibbssampling.
Jensen(1994)
analyzed
simulated data ofone
binary
trait and one continuous trait via Gibbssampling
under aBayesian
framework.
Wang
et al(1995)
gave aBayesian
method toanalyze
onemultiple
or-dered
categorical
trait and one continuous trait with Gibbssampling.
Van Tassellet al
(1996)
presented
Bayesian
analysis
oftwinning
and ovulation rateusing
Gibbssampling.
The purpose of this paper is to extend the work of Sorensen et al
(1995)
andWang
et al(1995)
to onemultiple
orderedcategorical
trait(calving ease)
and onecontinuous trait
(birth weight)
with allpossible missing
patterns
of data under aBayesian setting
via Gibbssampling.
A set of full conditionalposterior
densities will be derived in closed formfacilitating straightforward
implementation
of Gibbssampling.
Simulated data areanalyzed
to illustrate themethodology.
MODEL
Let
Y
lo
be a vector of birthweights
(BW),
with odenoting
observedrecord,
andY
20
becalving
ease scores(CE,
recorded as one of four scores(1
= noassistance,
2 =
easy
pull,
3 = hardpull
and 4 = mechanical force orCesarean),
U
lo
=Y
lo
and
U
2o
be thecorresponding
underlying
variable for observedcalving
ease scores,which is also known as
augmented
’data’(Tanner, 1993).
Asir!mgs
model usedfor the American Simmental
calving
ease sire evaluation(Quaas, 1994)
is assumed on theunderlying
scale(an
animal model can be assumedby appropriately defining
terms):
where ai is sire effect
(direct),
and mmgs is maternalgrandsire
effect(1/4
directBV
plus
1/2
maternalBV)
and elo and e20 are residual effects.AgeDam
is age ofdam effect and CG is
contemporary
group effect. Note that maternal effects for BWare not modeled for the Simmental
population
because the maternal contribution to the totalgenetic
variance was found to benegligible
(Garrick
etal,
1989).
In matrix notation:For reason of easy identification of conditional
posterior
distribution of the residualcovariance matrix
later, augmented
data are furtherexpanded
to include residuals associated withmissing
data. DenoteU’
=[U!o e!m]’ U’
=!U2o e!m]’ el
=le
l
missing
BW andCE,
with mdenoting
missing records, respectively.
Then[1]
canbe written as
where U contains
U
l
andU
z
,
W iscomposed
of thedesign
matrices -X
i
s,
Z
i
s
and rows of zeros associated withmissing data,
e contains locationparameters
13
1
,
13
2
,
aI, a2and , zm and e, the residuals el and ez, with all the elementsproperly
ordered and matched. ForU,
we assumewhere R is a block
diagonal
matrixcontaining n
submatrices of residual covariances(R
o
)
for therecord(s)
of aparticular
animal with dimension2, ie,
R =R
o
®1,,
if the data are sorted
by
animal andtrait,
and n is the number of animals with at least one trait recorded.A uniform
prior
distribution isassigned
to[3
i
s,
such that(eg,
Gianola etal,
1990;
Wang
et al1994a)
which is similar to
treating 0
as fixed in a traditional sense. We assume for the bulleffects
(genetic):
where a contains al, a2and m,
q is
the number of bulls(sires
andmgs),
G =G
o
l8iA
with
Go
=Igij 1,
i, j =
1, 2, 3,
the covariance matrix among threegenetic
effects fora
particular
animal and A is the numeratorrelationship
matrix among sires and mgs.To describe
prior
uncertainty
aboutGo,
an inverted Wishart distribution(John-son and
Kotz, 1972;
Jensen etal, 1994;
Van Tassell and VanVleck,
1996)
isassigned
where
Sg
is the locationparameter matrix;
vg is the scalarshape
parameter
(degrees
ofbelief); Sg
=E(G
o
iSg, Vg).
Alarge
value of vg indicates relativecertainty
thatGo
is similar toSg;
a smallvalue,
uncertainty, ie,
arelatively
flat distribution.(The
subscript
3 of IW indicates the order of the covariancematrix.)
Similarly
forR
o
,
The final
parameters
are the thresholds: t =(t
l
,
,
2
t
t
3
),
withto
= -00 andt
4
= oo. These are assumed to be distributed as order statistics from a uniform distribution in the interval!tm;n, t
max]
(Sorensen
etal,
1995):
where
I(.)
is an indicator function andc = 4 in our case, the number of
categories.
Applying
Bayesian theorem,
thejoint posterior
density
of all theparameters
including
theaugmented
data(8,
t,
Go,
,
o
R
elm,U
2
)
given
the observed data(Y’ = !Yio!’io!)
andprior
parameters,
assuming prior independence
oft,
Go
andR
o
is:if both BW and CE observed
if
only
BW observedif
only
CE observedwith wli
and/or
W2i the incidence vectors associated with8
1
and/or
0
2
for the ith animal’srecord(s),
respectively.
The first term in
[8],
p(Y
2oI
U
2o
, t),
has adegenerate density
(Albert
andChib,
1993;
Sorensen etal,
1995):
where
I(.)
is an indicator function. Forexample,
for aparticular
CE score(=
k),
we haveOne way to ensure
identifiability
ofparameters
is to set constraints.Assuming
full column rank for the incidence matrixW,
two constraints need bespecified.
Usually,
one threshold and the residual variance for CE on theunderlying
scale are set to 0 and1,
respectively
(Harville
andMee,
1984).
Anequivalent
parame-terization
(Sorensen
etal,
1995)
is to fix two thresholds and to estimate the CE residual variance. We followed the latter because it allows easyspecification
of theconditional
density
ofR
o
.
The twoparameterizations, though equivalent,
may notyield
the samejoint posterior density owing
to different sets ofpriors specified.
Inference about location anddispersion
parameters
will be based on thejoint
posterior
density
[9],
or on theirrespective
marginal
posterior
densities. ForSimilarly,
inference aboutGo
is based on:These densities cannot be derived
analytically.
Monte Carlomethods,
such as Gibbssampling,
drawsamples
from!9!.
Suchsamples,
if consideredjointly,
are from thejoint posterior
distribution or, viewedmarginally,
from anappropriate
marginal
posterior
distribution. Inferences can be based on these drawnsamples.
Inferencesabout functions of
parameters,
such as heritabilities andgenetic correlations,
canbe made based on transformed
samples.
Fully
conditionalposterior
densities(Gibbs
sampler)
The Gibbs
sampler
consists of a set offully
conditionalposterior
densities of unknownparameters
in themodel,
ie,
the conditionaldensity
of aparameter
given
all other
parameters
and the data. These can be derived from thejoint posterior
density
[8]
or!9! .
For location
parameters
(0),
wekeep
terms in[8]
that are functions of 0 suchthat:
where S2 =
L ! G_1 A
-1
’
with blocks of Oscorresponding
toj3
(Gianola
etal,
lo
Gol(&A-11,
l
1990).
This is a normaldensity,
sowhere
6
satisfies Henderson’s mixed modelequations
(MME)
(Henderson,
1973,
1984):
where C =
{CZ! },
i, j = 1, 2, ... , N,
is the coefficient matrix of theMME,
withC
jj
s
as blocks of
C,
and b ={b
i
},
i = 1, 2,..., N is
thecorresponding right-hand.
Theconditional
posterior
distribution for the locationparameters
is:v
i
-
Ciil,
e
i
andO
j
aresubvectors,
possibly
scalars,
of 0 and0-
i
is e withO
i
deleted. If
O
i
is ascalar,
[15]
is the scalar version of thesampler
for the locationparameters
(Wang
etal,
1994a).
Note thesimilarity
of[15.1]
to anupdate
in(block)
Gauss-Seidel iteration. It may be
advantageous
tosample
a subvectorjointly
tospeed
up convergence of Gibbs chain. Forexample, sampling
allgenetic
effects foran animal may reduce serial correlations among Gibbs
samples
(Van
Tassell,
1994;
Garcia-Cortes and
Sorensen,
1996).
From
!9!,
the full conditionalposterior
density
of thegenetic
covariancematrix,
asin Gaussian linear models
(Jensen
etal,
1994;
Van Tassell and VanVleck,
1996),
isSimilarly,
thefully
conditionalposterior
density
of the residual covariance matrix isfor
SS
e
in!17!.
From[8],
ingeneral,
These distributions
depend
on which combination of records is observed for acalf: BW
only,
CEonly
or both BW and CE. For aparticular
calf,
if a BW isobserved
(U
lo
,i
=Y
l
.,i)
but CE isnot,
we needonly
tosample
e2m,i forCE,
thedistribution is
only
involved withp(UI8, Ro),
which follows a univariate normal distribution withdensity:
where
0(.)
is a normaldensity function;
/
-t
=,i
lo
be
=b(u
lo
,
i
-
w!i8¡);
w12 is theincidence vector associated with
A
1
;
Q
2
= - 22rrî2/rll;
b =r
12/
r
l
{r2!}
=Ro.
If both BW
(U
li
=Y
ii
)
and CE(Y
2i
=k))
areobserved,
thenonly
U
2o
,
i
needsto be
sampled,
This is in a form of univariate truncated normal
(TN)
distribution such thatwith tk-1 <
U20,i !
tk
,
where
p
=w!i82
+-
b(U1o
i
,
w!0i), !
as in[18]
and w21 is the incidence vector associated with8
2
.
If
only
a CE(Y
2i
=k)
isobserved,
then bothe lm ,
i and
U
20
,
1
need to besampled
Finally,
the conditionalposterior
distribution of a threshold is uniform(Albert
and
Chib, 1993;
Sorensen etal,
1995),
if CE is notmissing:
if CE is
missing.
As mentioned
previously,
for t =(t
l
,
,
2
t
t
3
),
there isonly
one estimablethreshold,
which to estimate is
arbitrary.
We took:Note that
t
l
<t
2
.
Ifonly
threecategories
of CE scores areavailable,
there is no need to estimate thresholds under thisparameterization.
If the fourthcategory
was rare, it would betempting
to combine scores into threecategories
to avoidestimating
thresholds.GIBBS SAMPLING AND POST-GIBBS ANALYSIS
Densities
[15]-[18]
(or
[19]
or[20]),
and[21]
(or
[21.1])
constitute the Gibbssampler
for our model. Gibbssampling repeatedly
drawssamples
from this set of full conditionalposterior
distributions. Afterburning-in,
such drawn numbersare random
samples, though dependent,
from thejoint posterior
density
!9!.
Letthe Gibbs
samples
oflength
m for aparticular
parameter,
say for the directgenetic
variance
component
gll forBW,
be x ={xi},
i =1, 2 ... , m.
An estimate of themean of the
marginal
posterior
density,
p(g
nI
Y),
is:and the
posterior
variance can be estimatedby:
Modes and medians can also be used to estimate location
parameter
of aposterior
density
(Wang
etal,
1993),
though
usually
requiring
more Gibbssamples
becauseMonte Carlo errors. Because the Gibbs
samples
arecorrelated,
one way to estimate Monte Carlo errors is toadopt
standard time seriesanalysis techniques
assuggested
by Geyer
(1992)
and usedby
Sorensen et al(1995).
ROUTINE GENETIC EVALUATIONThe
preceding
sections describe aBayesian
analysis
via Gibbssampling
forinfer-ences about all the
parameters
in the modelincluding
fixedeffects,
(functions of)
breeding values,
genetic
and residual covariancematrices,
and thresholds basedon
marginal posterior
densities. This is sensible because all uncertainties inesti-mating
otherparameters
are taken into account when inference is made about aparticular
parameter
ofinterest,
say for thebreeding
value of a sire.However,
it iscomputationally
expensive
to carry outlarge
scaleanalyses
routinely.
Apractical
compromise
is to estimate covariance matrices and thresholdsusing
a fullBayesian
analysis
via Gibbssampling
once andsubsequently
to estimate locationparameters
based on conditional densities of location
parameters
given
the estimateddispersion
and thresholdparameters.
As dataaccumulate,
covariance matrices and thresholdsare reestimated.
Explicitly,
wesuggest
atwo-stage
procedure
thatmight
be useful for alarge
scale routinegenetic
evaluation program:1)
EstimateGo,
R
o
and tusing
mean(mode
ormedian),
based on theirrespective
marginal
posterior
densities,
p(Go!Y), p(R
oI
Y)
andp(t!Y),
via Gibbssampling, dropping
prior
parameters
Sg,
S
e
,
V9 and Ve in notation forconvenience;
and
2)
Estimate locationparameters
based on thefollowing
conditionaldensity:
which is an
approximation
to thecorresponding
marginal
density:
if the
marginal density,
p(t,
Go,
Ro !Y),
issymmetric
orpeaked
(Box
andTiao, 1973;
Gianola and
Fernando,
1986).
In otherwords,
if there is sufficient information in the data to estimateG
o
, R
o
and twell,
then[25]
is agood approximation
to!26!.
This would be the case if the data set is one of the national databases,
forexample,
the American Simmental data set. Note that the
underlying
variable for CE(U
20
)
and the residual vector associated with themissing
data(e
im
ande
2m
)
have beenintegrated
out of thejoint
posterior
density
!9!;
that W matrix nolonger
containsblocks of Os
corresponding
to themissing data, however,
the same notation iskept
below. Note also that o and mdenoting
observed andmissing
data aredropped
from the notation because
missing
data nolonger
play
a role.The
joint posterior
mode of[25]
can be considered aspoint
estimates fore,
Foulley,
1983).
There are at least two ways tocompute
the MAP estimates:expectation-maximization
(EM)
(Zhao,
1987;
Quaas, 1994,
1996)
andNewton-Raphson
(Gianola
andFoulley, 1983; Foulley et al,
1983).
The EM iteration
equation
(Quaas, 1996)
is:where the coefficient matrix is
exactly
the same as that in the usual MME with S2containing
Go
1
®
A-’
and blocks of Oscorresponding
to the fixedeffects,
thesuperscript
in[27]
indicates iterationnumber,
andU’ _
1 ut 1 fit 21 -
For aparticular
record,
withY
2i
=k,
if both BW and CE observed if
only
CE observedif both BW and CE observed
if
only
CE observed with b as in[18],
The
Newton-Raphson
iterationequation
(Janss
andFoulley,
1993;
Quaas,
1994,
1996;
Hoeschele etal,
1995),
following closely
the notation ofQuaas
(1996),
is:where,
for aparticular record,
with p
as in[28.1],
Q
as in
!28.2!,
P
k
as in[28.3!,
andStructurally,
R
in[29]
is the same as R in[2]
butconsisting
ofR
o
matrices aswith b as in
[18]
and rn the residual variance of BW. It is clear thatR
o
depends
on0, t, R
o
andY;
thus it is recordspecific.
If an animal ismissing
BW orCE,
f
l
o
1is a scalar: -y or
rill,
respectively.
A
Bayesian analog
of the BeefImprovement
Federation measure of’accuracy’
ofa bull’s
genetic
prediction
(BIF, 1990)
is:If information contained in the data conflicts with
prior
belief,
thenposterior
variance could be
larger
thanprior
varianceresulting
in anegative
accuracy. Thisis
peculiar
to afrequentist, particularly
to aproducer, why
aftercollecting
data onhis
animal,
theuncertainty
about his animal’s BV has increased! Posterior variances ofbreeding
values areusually
approximated,
based onlarge
sample
theory, by
theinverse of Fisher’s
(expected)
information matrix orby
the inverse ofnegative
Hessian matrix. The latter
approximation
is:where l =
log{[25]}.
This is the inverse of the coefficient matrix of[29].
Note that[27],
the inverse of the coefficient matrix used forEM,
is not alarge
sample
approximation
to theposterior
variance matrix of(25!,
or, atleast,
not a verygood
one. We shall return to thispoint
in the numericalexample
section below.NUMERICAL EXAMPLE
A data set
representing
continuous BW and discrete CE scores with orderedcategories, 1-4,
of ’Simmental calves’ was simulated andanalyzed
to illustrate themethodology.
ModelThe records were simulated under the sire-maternal
grandsire
model[1].
Modelequations
for the kth calf with ith bull as sire andjth
bull as maternalgrandsire
werefor BW with
only
direct effects and CEunderlying
variable with both direct and maternaleffects, respectively.
The random vectors[a
lk
,
a2k ,m
2k]
were drawn froma
N(0, Go)
distribution,
ie,
bulls were unrelated.Similarly
!2lijk
e
2zjk]
-
N(0,
Ro).
Parameter values were similar toprevious
estimates from the Simmental data(Dong
were scaled to obtain
R
o
andGo.
Assuming
thephenotypic
SD for BW was nineunits and
hb
w
= 0.4 !afl!
=1/4
x 0.4 x 81. The CEcomponents,
aa2 and am2’were assumed
equal
andcomputed by
solving
h!e
=x a e2 2
2a x 2 4
+
a2
fora; =
!a2 2
=
a;’ . 2
2Q! -I- Uez
The residual SD which establishes the scale of
U
2
was set to ten to becomparable
withBW;
hfl!
= 0.2. Thedispersion
matrices wereAs found in
previous
analyses
of Simmental data(Zhao,
1986, 1987; Quaas
etal,
1988),
thresholds wereequally spaced
at 1 SD intervals: ti =0,
t
2
= 10 and t3 = 20.Corresponding
to!65%
unassisted births of Simmental calves out of2-year-old
dams(unpublished data):
p
2 =
-!’’(0.65) =
-3.8532.Thus,
P
k
=«){tk-/-l2)/ae2}-«){tk-l -/-l2)/ae2} =
0.6500, 0.2670,
0.0744 and0.0085,
for k =1, 2, 3, 4,
respectively.
ForBW,
!,1 = 85.Calving
ease scores wereassigned
such thatY
2zj£
= k if1
k
_
t
<U
2
ijk <
t
k
.
Design
There were 75 bulls in three batches of 25. Bulls in the first batch were MGS of the calves sired
by
bulls in the secondbatch;
bulls 26-50 had two progeny fromdaughters
of each of the first 25 bulls.Likewise,
bulls 51-75(third batch)
had two progeny fromdaughters
of each of the second 25 bulls(second batch).
Thus bulls 1-50 were each MGS of 50calves;
bulls 26-75 were each sires of 50calves;
only
bulls 26-50
(second
batch)
were both sires and MGS. There were a total of 2500calves,
each with a BW and a CE score.Gibbs
sampling
Priors for 1-1, t and
R
o
were uniform while that forGo
was the inverted Wishartin
[5]
withSg
adiagonal
matrix of thegenetic
variances used for simulation and vg = 5. The latter’slightly
informative’prior
wasadopted
becauseduring
initialtesting
of the programs, anearly singular
SSg
would cause thesampler
to crash. The lastGo
would look reasonable but would have aneffectively
zeroeigenvalue.
The informative
prior
wasincorporated
to ensure(SSg
+Sg)
waspositive
definite.Subsequently,
theproblem
was found in the simulation program where bothh
2
were almost zero;
Go
wasnearly
singular!
Thesafeguard,
however,
was left in theprogram.
The initial
dispersion
matrices werediagonal
matrices of the variances used forsimulation,
bull effects were null and the average BW wasassigned
to !1. Thresholdsand p2 were
computed
from the observedfrequencies:
p2 = 10 x!-!-1(Cl)!
andt
From these
starting
values theparameters
wererepeatedly sampled
in thefollowing
order:U
2ijk
from[19],
t
3
from(21!, (
1
p
-L
J
2
)
jointly
from[15],
(a
12
a2im
2
)
jointly
from(15],
Go
fromIW
3
(SSg+Sg,
75+5), (16),
andR
o
fromIW
2
(SS
e
,
2500-3),
(17!.
Each bull’s three effects weresampled jointly
as were the two ps. Most ofthe results to be
presented
came from asingle
chain of 10 500samples;
all summarystatistics were
computed
from the last 10 000. We feel asingle
chain of thislength
is sufficient to demonstrate the
procedures
but wouldprobably
not suffice for a realapplication.
A few references will be made to results fromreplicated
Gibbs chains andreplicated
data. Our purposehere, however,
was not tostudy
Monte Carloerror,
single long
chain versusmultiple
short chains nor the smallsample properties
of
posterior
means; nosystematic study
was carried out on thesereplications.
Comparisons
offrequentist properties
of estimators of certainparameters
are foundin Jensen
(1994)
and Hoeschele and Tier(1995).
The most time
consuming
step
wassampling
the CEunderlying
variables. This was duepartly
to the number of these but also because of the way the truncatednormals in
[19]
weregenerated,
ie,
normals weresampled
until one was obtainedwithin the range determined
by
the CE score. The time for acomplete cycle
couldalmost
triple;
thisvariability
wasentirely
due togenerating
the truncated normals. A more efficient truncated normalgenerator
might
be necessary for alarge
data set in which theprobability
of somecategories
is verysmall,
eg, a score of four for aheifer calf out of a mature dam. In several million
records,
there will be a handfulof these and
they
will causeproblems.
Joint conditional
posterior
modesJoint
posterior
modesof p
and the bull effectsgiven
t, R
o
andGo
[25]
werecomputed by
both the EM[27]
andNewton-Raphson
[29]
algorithms.
The values oft,
R
o
,
andGo
were fixed at the estimates of theirposterior
means from theGibbs
sampler.
Starting
values were the same as in the Gibbssampler.
Iteration was continued untillog
io
of the maximum absolutechange
of any bull effect was< -10 at which
point
the modal estimatesof p
and bull effectscomputed by
the alternativealgorithms
differedby
<10-
l0
.
NUMERICAL RESULTS
Visual
inspection
ofplots
ofparameters
(or
functions ofparameters)
against sample
number
suggests
that a burn-in of 500cycles
wasprobably
unnecessary(eg,
fig
1).
The
parameters
infigure
1 were chosen to illustrate themarkedly
differentpatterns
observed as a result of correlations amongsequential
Gibbssamples.
There weremarked differences among
parameters
in these serial correlations(table II)
whichwere
particularly high
for thethreshold,
t
3
,
and also for the CE residualvariance,
r 22
. In
contrast,
mixing
for the other rij, all 9ij and the bull effects was much morerapid. However,
overall convergence of a Gibbs chain is tied toparameters
with slowmixing
properties.
In anotherwords,
one cannot declare that a Gibbs chain isconverged
for someparameters
but not others.Estimates of
posterior
means and SDs arepresented
in table I. Theparameters
simulate the data. This was not the case,
however,
for theparameters
affecting
CE. Thedispersion
parameters,
especially,
weremarkedly
smaller than the ’true’ values.Too much cannot be made of this from a
single
sample
and we cannot conclude that such apattern
is a consequence ofusing
themarginal
posterior
mean as anneither t
k
nor aú is identifiable but(t
k
-
tk_1)/Q!e
is identifiable. The ’true’ thresholds wereequally spaced
1 residual SDapart;
analogous
functions oft
l
, t
2
(fixed)
and theposterior
means oft
3
and r22 are 0.996 and0.954, quite
close to the ’true’ values.Likewise,
estimated heritabilities andgenetic correlations,
tableIII,
differed
considerably
less from the ’true’ values than did the(co)variances.
For this small
example
precise
estimates of themarginal
posterior
densities werenot
attempted.
Coarsehistograms
wereexamined;
most wereslightly
skewed awayfrom zero; in
figure
2 aretypical
examples.
None were’peaked’
in the eyes of theauthors.
Though
the EM andNewton-Raphson
algorithms
both gave the same values forthe
joint
conditional modes of[25],
there was a marked difference in thepattern
of convergence: linear versusquadratic
(fig 3).
The EMalgorithm
took six times asmany
rounds,
butelapsed
times were close: EM < two timeslonger.
For alarge,
field data
application
it isquestionable
as to whichalgorithm
will befaster;
we donot
expect
alarge
difference. While EM is easier tocode, approximations
of SDs are not aby-product
asthey
are if the Hessian iscomputed.
to the
marginal
posterior
means of[26]
despite
theposterior
densities fort
3
and
dispersion
parameters
being
neithersymmetric
norparticularly
peaked.
Thesimilarity
ofmarginal
mean andjoint
conditional mode isgraphically
illustrated forthe CE-D effect in
figure
4. Theregression
of mode on mean isslightly
greater
thanunity, 1.08,
reflecting
thegreater
spread
of modes when there is nouncertainty
in thresholds nordispersion
parameters.
This was seen for all three bull effects.Correlations,
within bullbatch,
were alsocomputed
between true and estimated(mode
ormean)
bull effects. These correlations caused debate among the authors.The
Bayesian
amongst
us was not sure how tointerpret
them whereas thelapsed
frequentists
had nodifficulty
whatsoever: abig
number isgood;
a small numberis not so
good.
Except
for CE-D for batch 1 bulls and CE-MGS for batch 3 bullswhere the estimation came
entirely
from correlatedinformation,
these correlations seemsatisfyingly
large.
Of some interest was the observation that the correlationsbetween true and estimated direct
effects,
both BW andCE-D,
were a bithigher
forthe batch 3 bulls than for the batch 2 bulls. Both had 50 progeny but the latter also had 50
grandprogeny.
What seems to beadding
information fromgrandprogeny,
apparently,
is noise. This could be due tosampling, though
the samepattern
was seen in threeindependent
samples
of simulated data.The
posterior
SD were also examined. These wereapproximated
in three ways:from the coefficient matrix of the MME used in EM
[27!,
from the Hessian used inNewton-Raphson
[29]
andby
Gibbssampling.
Inapplications
these are sometimesused
interchangeably but
they
arenot,
of course, the same. The first is theposterior
SD of
p(elU 1,
Û2,
G_o, R_o)
or
thefrequentist’s
standard error ofprediction
(SEP)
for BLUP
p(elU 1,
-
f J
2
,
Go,
R
o
)
note that this assumes the CEunderlying
variableCarlo error. These are
plotted
for the bulls’ BW effects infigure
5. It isimmediately
obvious that the first behavesquite
differently
to either of the latter two. The formerdepends only
onGo, R
o
and the information that isavailable,
hence a differentvalue for each of the three batches of bulls. The second also
depends
onGo
andR
o
(and
3
,)
t
but also on, â
2
l
j!, â
and m2minimally,
hence the values vary a bit butnoticeably
soonly
for the bulls with progeny.(Considerably
more variation was seenfor the CE effects and the SDs were
larger,
ingeneral,
reflecting
the fact thatU
2
wasnot
actually
observed.)
In markedcontrast,
themarginal
posterior
SD are all overthe
place.
This is notjust
due to Monte Carlo error, seefigure
6 where results fromtwo
independent
Gibbssamples
areplotted.
It is due to thedependence
of the SD onthe
posterior
means. This is showngraphically
infigure
7.Presumably
thispattern
is due
primarily
to theuncertainty
of thedispersion
parameters.
Heuristically,
abull with average progeny BW will have zero
â
1k
regardless
of the value of(or
uncertainty
about)
hb
w
whereas theuncertainty
about(SD)
a
lk
for a bull whoseprogeny BW
depart
markedly
from average willdepend
onuncertainty
abouthb
w
.
DISCUSSION AND CONCLUSIONS
Following
closely
the work of Sorensen et al(1995),
we have extended the Gibbssampling
scheme to make fullBayesian
inferences aboutlocation, dispersion,
and thresholds frommodeling
onemultiple
orderedcategorical
trait(CE)
and oneabout
genetic
and residual covariance matrices and thresholds are based on theirrespective
marginal
posterior
densities in a unified fashion withoutanalytical
ap-proximations,
in contrast to traditional methods based onapproximations
(Foulley
et
al, 1987;
Harville andMee,
1984).
Our model can be
expanded
to includeheterogeneous
variances in a similar way as for linear models(Gianola
etal,
1992).
Foulley
and Gianola(1996)
expanded
a
log-linear
structural model to describeheterogeneous
variances(Foulley
etal,
1990;
San Cristobal etal,
1993)
to asingle-trait
threshold model. This idea can beextended to model
heterogeneous
variances in our situation as well.It is
advantageous
to estimate locationparameters
including
fixed effects andbreeding
valuesjointly
withdispersion
parameters
because uncertainties in esti-mation ofdispersion
parameters
can be taken intoaccount,
particularly
in smallpopulations. However,
in realgenetic
evaluationsystems
with data sets of millions ofrecords,
joint modeling
may be neitherpossible
nor necessary. Withgood
esti-mates fordispersion
parameters
and thresholds from thejoint posterior
density
[9]
fromlarge
datasets,
we argue that the conditionalposterior
density
of locationparameters
[25]
with such estimateddispersion
parameters
as fixed values is agood
approximation
to themarginal posterior
density
of locationparameters
!26!.
In ournumerical
example
thejoint
mode of[25]
approximated
well themarginal
posterior
The
posterior
SD of[26],
however,
was not wellapproximated by
the inverse of the coefficient matrix of[27]
or[29].
The latter arelarge
sample approximations;
ourdata were not numerous.
A main purpose of the data
augmentation
used in the paper was to result infully
conditionalposterior
distributions ofparameters
in standardrecognizable
forms such that
samples
wereeasily
drawn. We chose toaugment
theunderlying
variables of observed CE(U
2o
)
and the residuals associated withmissing
BW andCE
(e
lm
ande
2m
)
l
the wholeaugmented
data vector was(e!m U!oe!m)’
Anotherway is to
augment
theunderlying
variables of observed CE(U
2o
)
and that ofmissing
CE(U
2m
),
and continuous datacorresponding
tomissing
BW(Y
l
,
r
,);
the wholeaugmented
data vector is(Y!m!2o!2m)-
Sorensen(1996)
gave aparallel
treatment of the
problem
under the latter dataaugmentation.
Ingeneral,
the wholedesign
matrix W is considered to be fixed. In otherwords,
thedesign
vector wi associated with an observed or a
missing
(either
BW orCE)
calf record isaugmentation
was,implicitly, equivalent
totreating
wis associated withmissing
BW and CE as random and toassigning
uniformpriors
tothem,
andintegrating
them out of thejoint posterior density,
asopposed
to Sorensen(1996)
in which thejoint posterior
density
was conditioned on those wis ofmissing
BW and CE. Weconjecture
that ourapproach
hascomputational
advantages
while notsacrificing
theoreticalsimplicity,
particularly
when themissing
data rate ishigh.
Forexample,
for American
Simmental,
about1/3
of the records have either BW or CEmissing.
It is clear also that other combinations of data
augmentation
arepossible,
such as(
e
lm
U
2
o
U
2m)
(Y!m U!oe!m)
or(ei
m
e2
o
ez
m
).
If no dataaugmentation
isused,
fully
conditional distributions of certain
parameters
may not be inrecognizable
forms and alternativesampling procedures
such asrejection
sampling
need to be used.Zeger
and Karim(1991)
presented algorithms
for asingle
trait threshold modelwithout data
augmentation.
There is a
danger
inusing improper priors
in aBayesian
analysis.
In a linearmodel
setting,
someimproper priors
induceimproper joint posterior
densities eventhough
thefully
conditionalposterior
densities are well defined(Hobert
andCasella,
1996).
We do not know whether or not the uniformpriors
we used in the numericalexample
for p, t andR
o
will induce a properjoint posterior
density
!9!.
The fact that no difficulties were encountered inanalysis
does notnecessarily
mean that[9]
was suitable. The safe way may be to use a noninformative but properprior
in anACKNOWLEDGMENTS
We thank the American Simmental
Association,
BozemanMT,
forpartially supporting
this work. We thank D Sorensen forsending
us a copy of his lecture notes(Sorensen,
1996)
in which an alternative dataaugmentation
isdetailed,
and are indebted to himalong
with an anonymous reviewer for numerous constructive criticisms whichimproved
thepresentation
of the paperconsiderably.
REFERENCES
Albert
JH,
Chib S(1993)
Bayesian analysis
ofbinary
andpolychotomous
response data. J Am Stat Assoc88,
669-679Berger PJ,
LinEC,
Van ArendonkJ,
Janss L(1995)
Properties
ofgenetic
parameter estimates from selection experimentsby
Gibbssampling.
JDairy
Sci 78(Suppl 1),
246 BIF(1990)
BIFGuidelines for Uniform Beef Improvement Programs.
BeefImprovement
Federation, Athens,
6th edBox
GEP,
Tiao GC(1973)
Bayesian Inference
in StatisticalAnalysis. Wiley,
New YorkDong MC, Quaas RL,
Pollak EJ(1991)
Estimation ofgenetic
parameters ofcalving
easeand birth
weight by
a threshold model. J Anim Sci 69(Suppl 1),
204Foulley JL,
GianolaD, Thompson
R(1983)
Prediction ofgenetic
merit from data onbinary
andquantitative
variates with anapplication
tocalving difficulty,
birthweight
and
pelvic opening.
Genet Sel Evol 15, 407-424Foulley JL,
ImS,
GianolaD,
Hoeschele I(1987)
Empirical Bayes
estimation of parameters for npolygenic binary
traits. Genet Sel Evol 19, 197-224Foulley JL,
GianolaD,
San CristobalM,
Im S(1990)
A method forassessing
extent andsources of
heterogeneity
of residual variances in mixed linear models. JDairy
Sci73,
1612-1624Foulley JL,
Gianola D(1996)
Statisticalanalysis
of orderedcategorical
data via astructural heteroskedastic threshold model. Genet Sel Evol
28,
249-273Garcia-Cortes
LA,
Sorensen DA(1996)
On a multivariateimplementation
of the Gibbssampler.
Genet Sel Evol 28, 121-126Garrick
DJ,
PollakEJ, Quaas RL,
Van Vleck LD(1989)
Varianceheterogeneity
in direct and maternalweight
traitsby
sex and percentpurebred
for Simmental-sired calves. JAnim Sci
67,
2515-2528Gelfand
AE,
HillsSE,
Racine-PoonA,
Smith AFM(1990)
Illustration ofBayesian
inference in normal data modelsusing
Gibbssampling.
J Am Stat Assoc 85, 972-985 GemanS,
Geman D(1984)
Stochasticrelaxation,
Gibbs distributions and theBayesian
restoration of
images.
IEEE Trans PatternAnalysis
and MachineIntelligence
6, 721-741Geyer
CJ(1992)
Practical Markov chain Monte Carlo(with
discussion).
Stat Sci 7, 467-511Gianola D
(1982)
Theory
andanalysis
of threshold characters. J Anim Sci 54, 1079-1096 GianolaD, Foulley
JL(1983)
Sire evaluation for orderedcategorical
data with a thresholdmodel. Genet Sel
Evol 15,
201-224Gianola
G,
Fernando RL(1986)
Bayesian
methods in animalbreeding theory.
J Anim Sci63, 217-244