HAL Id: hal-00893805
https://hal.archives-ouvertes.fr/hal-00893805
Submitted on 1 Jan 1989
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Estimation of covariance components between one
continuous and one binary trait
H. Simianer, L.R. Schaeffer
To cite this version:
Original
article
Estimation of
covariance
components
between
one
continuous
and
one
binary
trait
H.
Simianer
L.R. Schaeffer
2
Justus
Liebig University, Department of
AnimalBreeding
andGenetics,
Bismarckstr.16 D-6300
Giessen,
FRG2
University of Guelph, Department of Animal
andPoultry Science, Guelph,
Ontario N1G 2W1 Canada(received
17 December1987, accepted
14 October1988)
Summary - A method is described to estimate variance and covariance components in a
multiple
trait situation with one continuous and onebinary
trait. Anunderlying
bivariate normal distribution is assumed with one variable dichotomized on the observable scalethrough
a fixed threshold. A mixed linear model isapplied
to theunderlying scale,
andBayesian
arguments areemployed
to derive estimationprocedures
for both location anddispersion
parameters. This leads to a nonlinear system ofequations
similar to the mixed modelequations
for observations that have been transformedby
aCholesky
decomposition
of the residual variance-covariance matrix so that the residual covariance between the two
transformed traits is zero,
thereby simplifying
construction of themultiple
trait mixed modelequations.
Theprocedures
forestimating
genetic
variances and covariances and the residual variance for the continuous trait areequivalent
to restricted maximum likelihoodin the multivariate normal case. The residual correlation is estimated
using
a maximum likelihoodapproach.
Suitablecomputing strategies
are indicated and a simulationstudy
isgiven
to illustrate the use of the method. Theimpact
of small subclass size on the estimatesis seen to be a serious drawback to the
proposed
method. Possiblegeneralizations
of the method andpotential problems
in itspractical application
are discussed.variance - covariance - estimation - binary trait methods
Résumé - Estimation des composantes de la covariance entre un caractère continu et
un caractère binaire. On décrit une méthode pour estimer les composantes de variance et de
covariance,
dans le cas de deu!caractères,
l’un continu et l’autre binaire. On supposel’existence d’une distribution
sous-jacente binormale,
où l’une des variables neprésente
que deux états observables en
fonction
de sa valeur par rapport à un seuilfixe.
On écritun modèle linéaire mixte pour les variables continues
sous-jacentes,
et l’ons’appuie
sur uneapproche bayésienne
pour construire des estimateurs desparamètres
deposition
et dedispersion.
On obtient unsystème d’équations
non linéaires semblable auxéquations
d’unmodèle
mixte;
unesimplification
deséquations
est obtenue si la covariance résiduelle entreles deux caractères est
nulle,
cequi
s’obtient en réalisant aupréalable
unedécomposition
decorrélation résiduelle est estimée selon le
principe
du maximum de vraisemblance. Onindique
desstratégies adaptées
aucalcul,
et une simulation illustre l’utilisation de laméthode. Cette méthode s’avère sensible à l’existence de
petits
nombres d’observations dans certaines cellules. On discute lesgénéralisations possibles
de laméthode,
ainsi queles
problèmes potentiels
de sonapplication
pratique.composantes de la variance - composantes de la covariance - estimation - caractère binaire
INTRODUCTION
Discrete traits are
predominant
in certain areas of animalproduction,
e.g.,fertility,
prolificacy, viability
and animal health. These traits are ofincreasing
importance
since many
producers
are underquota systems
and a means ofincreasing
incomeis to
improve
efficiency through
thesenon-production
traits.However,
traditionaltraits like milk
yield,
milkcomposition,
growth
and feedefficiency
are stillimpor-tant, and so are the
relationships
between the discrete and the traditionaltraits,
most of which are continuous. Non-linear
procedures
based on the thresholdcon-cept
(Dempster
&Lerner,
1950)
andBayesian
arguments
have become available toanalyze categorical
observations(Gianola
&Foulley,
1983;
Harville &Mee,
1984).
Bayesian
methodology
has been established as ageneral
framework for theanalysis
of any
type
of observationarising
in animalproduction
(Foulley
&Gianola,
1984;
Gianola etal., 1986;
H6schele etal., 1986, 1987;
Foulley
etal., 1987a,
1987b).
Observation structures that include both continuous and
binary
traits havebeen discussed as far as the estimation of
genetic
and environmental effects isconcerned
(Foulley
etad.,
1983).
Theobjective
of this paper is topresent
a methodfor the estimation of
dispersion
parameters
of thejoint
distribution of continuousand
binary
traits. A simulationstudy
was conducted tostudy
the effects of smallsubclass size on
parameter
estimates.MODEL AND DATA STRUCTURE
Each animal is recorded for two different traits. Let y.il be the continuous trait 1
(e.g.,
milkyield)
for the i-thanimal,
and let ci be the response forbinary
trait 2(e.g.,
mastitis),
which is either 1 or 0.Categories
of response for thebinary
traitare assumed to be
mutually
exclusive and exhaustive. Observations for trait 1 formthe s x 1 vector yi, and the observations for the
binary
trait can berepresented
as an s x 1 vector, c’=(c
l
,
c2, ...,c,)’,
where ci = 0 if no disease isobserved,
otherwise ci = 1.
A non-observable
underlying
continuousvariate,
Yi2, isassumed,
and Yil and y22 follow a multivariate normal distribution.
According
to the thresholdconcept
(Dempster
&Lerner,
1950),
y22 can bethought
of asliability
to contract the disease.Only
if the value of y22exceeds a fixed threshold on theunderlying
scale,
does animali show the disease.
where :
p
j
= p x 1 vector of fixed effects for traitj,
u j
= q x 1 vector of additive
genetic
effects for traitj,
ej= s x
1 vector of residuals for traitj,
X = s x p incidence matrixcorresponding
toP
j
,
Z = s x
q incidence
matrixcorresponding
to uj,p = the number of levels of all fixed
effects,
andq = the number of additive
genetic effects,
which could begreater
than orequal
to s.
The
expectations
and variance-covariance matrices of the random variables are:where:
G = 2 x 2 additive
genetic
variance-covariancematrix,
A =
q x q additivegenetic relationship
matrix,
R = 2 x 2 residual variance-covariance
matrix,
I = s xs identity
matrix,
*
= direct
(Kronecker-)
product,
gjj= additive
genetic
variance of traitj,
r!! = residual variance of traitj,
912
= additive
genetic
covariance between traits 1 and2,
andp e
= residual correlation between traits 1 and 2. The unknown
parameters
are the locationparameters,
and the
dispersion
parameters
The residual variance structure is a function of
only
twoparameters,
because Yi2 isa
conceptual
variableand, therefore,
anarbitrary
value for r22 can be chosen.Note that this model assumes that every animal is observed for both traits and
that the same model
applies
to each trait.METHODS OF INFERENCE
Assume
that P
has a flatprior
distribution(i.e.,
nothing
is known apriori
about theelements
of !),
and that u has a multivariate normal distribution withexpectation
null and variance-covariance matrix
equal
to G * A. Thepriori
assumed to beindependent.
Thejoint posterior
distribution of all unknownscan be written as:
where
f (
Y
)
is thejoint prior
density
of thedispersion
parameters
(Foulley
etal.,
1987a).
Estimation of location
parameters
Integrating
T
out
of(2)
wouldhardly
bepossible,
as waspointed
outby
Gianolaet al.
(1986).
An alternative is toreplace
!yby
somevalue,
-y!t!,
so thatas elaborated
by Foulley
et al.(1983).
Ignoring
thesuperscript
[t]
andassuming
that there is
only
one continuous and onebinary
trait and that both traits aredescribed
by
the samemodel,
thenfollowing
thearguments
of Foulley
et al.(1983),
the residual variance of the
conceptual
variable is chosen to be:and therefore:
For
lp, <
1.
R ispositive
definite and a lowertriangular
matrix Texists,
so thatTT’ = R.
T-
1
is used toperform
aCholesky
transformation to remove the residualcovariance between transformed variables where
Let the tilde
symbol,
-, above a variable indicate that same variable on thetransformed
scale,
thenfor i = 1 to
s, and
similarly
for elementsofp,
u, and e. On the transformedscale,
variances and covariances are
Foulley
et al.(1983)
show that under multivariatenormality
and the residuals on the transformed scale are
uncorrelated,
one canproceed
asin a
single binary
traitanalysis,
asimplication
of(Gianola
&Foulley,
1983)
fromseveral
categories
totwo,
computing
where
0(.)
and4)(.)
are thedensity
and distribution function of a standard normalvariate, respectively.
Let:then from (4),
The mode of the
posterior
density
(3)
can be calculatedby applying
theNewton-Raphson algorithm
to thelog
of(3),
which leads to the nonlinearsystem
ofequations
for the[k
+l]th
round of iteration(Foulley
etal.,
1983)
where
an s x s
diagonal
matrix with wi in thediagonal
andNote,
that W and Y2 in round[k
+1]
arecomputed
based on the solutions fromthe
preceding
round!k!.
Let
and :
be transformed back to the
original
scale(the
superscript
[t]
indicating
the set ofdispersion
parameters
used):
Estimation
of genetic
variances and covariancesFoulley
et al.(1983)
discussbriefly
the case of the variance-covariance structurebeing
unknown andsuggest
theapplication
of the restricted maximum likelihood(RE1VIL)
algorithm
describedby
Schaeffer et al.(1978)
to estimate the entiregenetic
variance-covariance structure as well as the residual variance of the continuous traitonly,
and the residual covariance between continuous andbinary
traits.Methods to estimate
dispersion
parameters
have beendeveloped
from aBayesian
background
for different data structures. Procedures forsingle polychotomous
traits have been describedby
Harville & Mee(1984)
andby
H6schele et al.(1987).
Methods for multivariate
binary
traits have beensuggested by Foulley
et al.(1987a),
and for multivariate continuous traitsby
Gianola et al.(1986).
Essentially,
all authors recommend analgorithm analogous
to theexpectation-maximization
(EM)
algorithm
(Dempster
etal.,
1977)
for REML and its multitraitextension,
respectively.
Let
which
requires integration
of(2)
withrespect
to 0.Foulley
et al.(1987a)
show that(7)
can be written aswhere
E
e
is theexpectation
taken withrespect
tof (u ! Y,
yl,T
).
Furthermore,
Foulley
et al.(1987a)
show that whenever a flatprior
forg is
used,
thejoint
posterior
distribution
of
T
can be maximized withrespect
tog by maximizing
E!{ln f( u I g) I
at each iterate. Theresulting
iterative scheme for the[t
+1]th
round iswhere tr
(.)
is the traceoperator
andThese terms cannot be derived
explicitly,
butapproximations
have beensuggested
(Harville
&Mee, 1984;
Stiratelli etal.,
1984).
Making
use of theseapproximations
and
C
ii
,
by
theU
t
x,
i
u
part
of the inverted left hand side of(6).
The new estimates are on the transformed scale and need to be transformed back:Foulley
et al.(1987a)
have shown that if the initial G ispositive definite,
then thesubsequent
estimates of G are alsopositive
definite,
which holds in combinationwith the
applied
Cholesky
transformation.Estimation of the residual variance of the continuous trait
Maximum a
posteriori
predictors
tend to converge to trivial results when flatpriors
for thedispersion
parameters
are used(Lindley
&Smith,
1972).
As an alternativeGianola et al.
(1986)
suggest
estimatorsarising
from anapproximate
integration
of the variances. For g these are
equivalent
to thealgorithms
described in section B aslong
as flatpriors
for the variances are used. For theresidual
variance of thecontinuous trait estimator is
which can be shown to lead to the same results as
iterating
onassuming
full column rank of X. Estimators on theoriginal
scale are obtainedby:
which
yields
Estimation of the residual correlation between traits
Foulley
et al.(1983)
suggest
the use of restricted maximum likelihood estimationas described
by
Schaeffer et al.(1978)
for multivariate normal data.However,
thisrequires
that the residual correlations between continuous andbinary
traits be known.Foulley
et al.(1983)
show thatproportionately
except
for a constant.The
problem
is to findy
[t]
and 8(which
is a function ofT
14
)
which maximize thislog
posterior density.
Estimates for0,
g, and rll can be obtained under thepretense
that pe =
pe
similar to the one dimensionalgrid
searchtechnique proposed by
Smith
&
Graser(1986).
In this case thestrategy
would be:-
-
2)
for ith value of pe in theset,
iterate on(6)
using
thedispersion
parameters
(g,
rll,Pe
i)
until convergence of0,
then do onestep
of(9)
and(10)
and continueto iterate until convergence of
(g
andr
ll
)
is achieved. Thencompute
the value of(11)
defined as h(p
ei
).
Repeat
for all p,i in theset;
-
3)
to find the p,ithat maximizes thelog likelihood,
fit a seconddegree polynomial
through
thepoints
(p
ei
,
h)).
e2
(p
Add the mode of this curve to the set ofpossible
values of pe and
repeat
the secondstep
with this new value. Continue this iterativeprocess until convergence of pe is achieved.
This
strategy
iscomputationally
verydemanding
as convergence is very slow. Analternative
empirical approach
using
a maximum likelihoodprocedure
to estimatep
e was
suggested by
equation
4.4 of Tate(1955)
for the biserial correlationproblem.
The[t
+l]th
improved
estimate of pe iscomputed
as:where
v = s x 1 vector with elements vi, i =
1,
s,1 = s x 1 vector of
I’s,
Yi
(
y
i-
1 *)
I
!
*l/(
d
tl)
l
/
2
,
!
I
= phenotypic
mean of y21 for allanimals,
and
OW
]
is
the estimated value of the threshold which can be calculated from!lk’]
analogously
to a least squares mean. Note that thisprocedure yields
a new estimate of pe on theoriginal scale,
andtherefore,
no backtransformation is needed.Computing algorithm
Combining
all the differentsteps
described,
an obviouscomputing
strategy
would be:-
1)
choose a set ofstarting
valuesyl’],
set(t]
=1;
-
2)
iterate on(6)
until convergence toget
9 ! Y
=T
14
;
-
3)
do onestep
each of(9), (10)
and(12)
toget
a new set ofdispersion
parameters
T
l’
+
’],
set[t]
=[t
+
1]
and continue with theprecedent
step.
Iterate on the above scheme until the estimates
of y are converged. Empirically,
(10)
and(12)
were found to converge morequickly
than(9),
and the estimates in r seemed to berelatively independent
of those in g. Thissuggests
the use of adifferent
strategy:
-
1)
choose a set ofstarting
valuesg
l]
and],
rI
l
set[t]
= 1 and[s]
=1;
-
2)
do one round of(6)
toget
animproved
estimate of0
g =
gI
t
],
r =r1
4
;
-
3)
do onestep
each of(10)
and(12)
toget
a new set ofparameters
r
Is+1]
,
set[s]
=[s
+1]
and continue with theprecedent
stage
-
4)
if convergence in the secondstep
isreached,
do onestep
of(9)
toget
a newestimate of
gI
t+1]
,
set[t]
=[t
+1]
and continue with the secondstep.
Iteration is
stopped
when the estimates in g areconverged.
Thisalgorithm
waset al.
(1987a)
for themultiple
binary
trait case.Iterating
on ronly
for some roundswith every new value of g
improved
theprocedure
evenfurther,
as the estimates ofr converge
quite
rapidly.
SIMULATION STUDY
Model and methods
A simulation
study
was done to assess thesampling
properties
of theproposed
estimators. The assumed true model was:
where:
-
Y ij
kl =
phenotypic
record(i
=1)
orunderlying liability
value(i
=2)
of animal Iin
herd j
with sirek,
-
pz = mean of trait
i,
-hij
= effect ofherd j on
traiti,
-
sz! = effect of sire k on trait
i,
and-
e2!! = residual term.
Pairs of
herd,
sire and residual effects weregenerated
for each animal as randomsamples
from a bivariate normal distribution. Trait 2 was thendichotomized,
applying
a threshold at 1.282 standarddeviations,
whichassigns
90 and 10 per cent of the observations to thephenotypic
classes 0 and1,
respectively.
Recordswere
generated
for10,000
individuals perreplicate,
which wererandomly assigned
to sires
(of
which there were50)
and to herds(of
which there were either 100 or1000).
Withonly
100 herds there were13.5% empty
sireby
herdsubclasses,
and with 1 000 herds there were81% empty
sireby
herd subclasses. Theobjective
ofhaving
two data sets was tostudy
the effect of small sireby
herd subclass size onthe
parameter
estimates from the above methods. The truedispersion
parameters
were:The model of
analysis
wasequal
to(13),
except
that herd effects were treated asfixed and the non observable y2 was
replaced by
thephenotypic
observation in thebinary
trait. Thestarting
values of iteration wereand the
stopping
criterion wasconsidering
the relativechange
to account for the differentmagnitude
of the trueResults
The mean estimates and
95%
confidence intervals aregiven
in Table I. Fisher’sx-transformation for correlations was
applied
tohi, h’,
r9 and pe, and a t-test wasapplied
to examine differences between true and estimatedparameters.
Estimatesfrom data set I with 100 herds
agreed
quite
closely
with the trueparameters.
Theonly significant
bias was observed for the residualcorrelation,
which ispartly
dueto the small confidence interval for this estimate. The
genetic correlation
had a verylarge
confidence intervalcompared
with the other estimates.The
results for data set II(1000 herds)
were similar with theexception
of a50%
overestimation of the
heritability
of thebinary
trait. H6schele et al.(1987)
reported
similar results in a simulationstudy considering
estimation ofheritability
forbinary
traits. The bias was observed when the average subclass size was below
2,
whichagrees with the
present
results,
as the mean subclass sizes wereapproximately
2.3 and 1.1 in data set I and
II, respectively.
H8schele et al.(1987)
consideredthe
approximation
of the conditional distribution in the derivation of(9)
as the main cause of thebias,
as thisapproximation
is based on anormality
assumption
DISCUSSION
A method has been
presented
for the estimation of variances and covariances for thesimplest
case of one continuous and onebinary
trait.Theoretically,
generalization
tomore
complex
observation structures isstraightforward.
For the case of n continuousand one
binary
trait,
the method to estimate locationparameters
is asgiven by
Foulley
et al.(1983),
and theprocedures
to estimatedispersion
parameters
can bereadily
extended, using
results of Hannan & Tate(1965)
for the maximum likelihood estimation of residual correlations. Furthergeneralizations, including
the use ofinformative
priors,
may be derivedthrough Bayesian
arguments
asemployed by
Gianola et al.
(1986)
andFoulley
et al.(1987a).
In
practice, however,
ananalysis
of even thesimplest
case poses a formidablecomputational
task with alarge body
ofdata,
in which case theoptimization
ofcomputing strategies
is of crucialimportance.
Actually,
themethodology presented
was
developed
for ajoint
analysis
of more than200,000
records onproduction
anddisease data in
dairy
cattle.Observations on both traits for all animals were assumed
complete
throughout
the
data,
which seldom is the case inpractice,
where selection and hencemissing
data due to selection
play
animportant
role. This has been addressedby
Henderson(1975)
for continuous traits andby
Foulley
& Gianola(1986)
forcategorical
traits. Themultiple
traitmethodology
allows forsequential
selection where the decision whether an animal isgiven
theopportunity
to be observed for the discrete trait is made on the basis of itsperformance
for the continuoustrait,
but not vice versa,as was indicated
by
Foulley
et al.(1983).
Generalapproaches
to the morecomplex
types
of selection areneeded,
as most of the reasons for inevitable natural selection(diseases, fertility)
are ofcategorical
nature.The simulation
study
was an illustrationof
themethodology
for twospecial
cases.A
general
assessment of thesampling
properties
of the estimates wouldrequire
a series of simulations over a wide range of
parameter
combinations and morerealistic
population
structures as was doneby Hbschele
et al.(1987),
for thesingle
categorical
trait case.Since
thegeneral
framework
of the methods is the same, asimilar behaviour of the estimates can be
expected,
which was shown for the biasedestimate of
heritability
for thebinary
trait when the average size of the smallestsubclass was less than two. This is a serious
problem
inpractical applications,
e.g. whendairy
progeny test data areanalyzed by
a model in which sire and herd xyear x season effects are cross-classified.
The
complexity
of theBayesian approach
in the context of thegiven
model and observation structure makes it necessary to use certainassumptions
and restrictions as discussedby
Gianola et al.(1986)
for the multivariate normal case.Although
the
approach
ofbasing
the inference about 0 on the conditional distributionACKNOWLEDGEMENT
This
study
was carried out while H. Simianer wasvisiting
theUniversity
ofGuelph,
supported by
the DeutscheForschungsgemeinschaft.
REFERENCES
Dempster
E.R. & Lerner LM.(1950)
Heritability
of threshold characters. Genetics35,
212-235Dempster
A.P.,
Laird N.M. & Rubin D.B.(1977)
Maximum likelihood fromincomplete
data via the EMalgorithm.
J.R. Statist. Soc. B39,
1-38Foulley
J.L.,
Gianola D. &Thompson
R.(1983)
Prediction ofgenetic
merit fromdata on
binary
andquantitative
variates with anapplication
tocalving difficulty,
birthweight
andpelvic opening.
Genet. Sel. Evol.15,
401-424Foulley
J.L. & Gianola D.(1984)
Estimation ofgenetic
merit from bivariate &dquo;all ornone&dquo; responses. Genet. Sel. Evol.
16,
285-306Foulley
J.L. & Gianola D.(1986)
Sire evaluation formultiple
binary
responses wheninformation is
missing
on some traits. J.Dairy
Sci.69,
2681-2695Foulley
J.L.,
ImS.,
Gianola D. & H6schele I.(1987a)
Empirical
Bayes
estimation ofparameters
for npolygenic binary
traits. Genet. Sel. Evol.19,
197-224Foulley
J.L.,
Gianola D. & Im S.(1987b)
Genetic evaluation for traits distributedas Poisson-binomial with reference to
reproductive
traits. Theor.Aw,vl.
Genet.73,
870-877Gianola D. &
Foulley
J.L.(1983)
Sire evaluation for orderedcategorical
data witha threshold model. Genet. Sel. Evol.
15,
201-224Gianola
D., Foulley
J.L. & Fernando R.L.(1986)
Prediction ofbreeding
values when variances are not known. Genet. Sel. Evol.18,
485-498Hannan J.F. & Tate R.F.
(1965)
Estimation of theparameters
for a multivariatenormal distribution when one variable is dichotomized. Biometrika
52,
664-668 Harville D.A. & Mee R.W.(1984)
A mixed modelprocedure
foranalyzing
orderedcategorical
data. Biometrics40,
393-408Henderson C.R.
(1975)
Best linear unbiased estimation andprediction
under aselection model. Biometrics
31,
423-447H6schele
L, Foulley J.L.,
Colleau J.J. & Gianola D.(1986)
Genetic evaluation formultiple binary
responses. Genet. Sel. Evol.18,
299-320H6schele
L,
Gianola D. &Foulley
J.L.(1987)
Estimation of variancecomponents
with
quasi-continuous
datausing Bayesian
methods. Z. Tierz.Z!uechtungsbiol.
104,
334-349
Lindley
D.V. & Smith A.F.M.(1972)
Bayes
estimates’ for the linear model. J.R. Statist. Soc.B, 34,
1-41Simianer H.
(1986)
Restricted maximum likelihood estimation of variances and covariances from selected data. Third WorldCongress
on GeneticsApplied
toLivestock
Production, Lincoln,
Nebraska,
16-22July 1986, University
ofNebraska,
Lincoln, 12,
421-426Smith S.P. & Graser H.U.
(1986)
Estimating
variancecomponents
in a class ofmixed models
by
restricted maximum likelihood. J.Dairy
Sci.69,
1156-1165 StiratelliR.,
Laird N.M. & Ware J.H.(1984)
Random-effects model for serial observations withbinary
response. Biometrics40,
961-971Tate R.F.