HAL Id: hal-00893811
https://hal.archives-ouvertes.fr/hal-00893811
Submitted on 1 Jan 1989
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Likelihood inferences in animal breeding under selection:
a missing-data theory viewpoint
S. Im, R.L. Fernando, D. Gianola
Original article

S. Im 1, R.L. Fernando 2, D. Gianola 2

1 Institut National de la Recherche Agronomique, laboratoire de biométrie, BP 27, 31326 Castanet-Tolosan, France; 2 University of Illinois at Urbana-Champaign, 126 Animal Sciences Laboratory, 1207 West Gregory Drive, Urbana, Illinois 61801, USA

(received 28 October 1988; accepted 20 June 1989)
The Editorial Board here introduces a new kind of scientific report in the Journal, whereby a current field of research and debate is given emphasis, being the subject of an open discussion within these columns. As a first essay, we propose a discussion about a difficult and somewhat troublesome question in applied animal genetics: how to take proper account of the observed data being selected data? Several attempts have been carried out in the past 15 years, without any clear and unanimous solution. In the following, Im, Fernando and Gianola propose a general approach that should make it possible to deal with every problem. In addition to the interest of an original article, we hope that their own discussion and response to the comments given by Henderson and Thompson will provide the reader with a sound insight into this complex topic.

This paper is dedicated to the memory of Professor Henderson, who gave us here one of his latest contributions.

The Editorial Board
Summary - Data available in animal breeding are often subject to selection. Such data can be viewed as data with missing values. In this paper, inferences based on likelihoods derived from statistical models for missing data are applied to production records subject to selection. Conditions for ignoring the selection process are discussed.

animal genetics - selected data - missing data - likelihood inference
Résumé - Likelihood inference methods in animal genetics: accounting for data arising from selection by means of missing-data theory. Data available in animal genetics are often the outcome of a prior process of selection. The records removed by selection can therefore be regarded as missing. Likelihood-based inference methods are developed in which the selection process that induces the missing data is made explicit in the computation of the likelihood. Conditions are discussed under which selection can be ignored, so that only the likelihood of the data actually recorded need be considered.

animal genetics - selection - missing data - likelihood
INTRODUCTION

Data available in animal breeding often come from populations undergoing selection. Several authors have considered methods for the proper treatment of data subject to selection in animal breeding. Examples are Henderson et al. (1959), Curnow (1961), Thompson (1973), Henderson (1975), Rothschild et al. (1979), Goffinet (1983), Meyer and Thompson (1984), Fernando and Gianola (1989), and Schaeffer (1987).

Data subject to selection can be viewed as data with missing values, selection being the process that causes missing data. The statistical literature discusses missing data that arise intentionally. Rubin (1976) has given a mathematically precise treatment which encompasses frequentist approaches that are not based on likelihoods, as well as inferences from likelihoods (including maximum likelihood and Bayesian approaches). Whether it is appropriate to ignore the process that causes the missing data depends on the method of inference and on the process that causes the missing values. Rubin (1976) suggested that in many practical problems, inferences based on likelihoods are less sensitive than sampling distribution inferences to the process that causes missing data. Goffinet (1987) gave alternative conditions to those of Rubin (1976) for ignoring the process that causes missing data when making sampling distribution inferences, with an application to animal breeding.

The objective of this paper is to consider inferences based on likelihoods derived from statistical models for the data and the missing-data process, in analysis of data from populations undergoing selection. As in Little and Rubin (1987), we consider inferences based on likelihoods, in the sense described above, because of their flexibility and avoidance of ad-hoc methods. Assumptions underlying the resulting methods can be displayed and evaluated, and large sample estimates of variances based on second derivatives of the log-likelihood, taking into account the missing-data process, can be obtained.

MODELING THE MISSING-DATA PROCESS
Ideas described by Little and Rubin (1987) are employed in subsequent developments. Let y, the realized value of a random vector Y, denote the data that would occur in the absence of missing values, or complete data. The vector y is partitioned into observed values, yobs, and missing values, ymis. Let f(y | θ) be the probability density function of the joint distribution of Y = (Yobs, Ymis), and θ be an unknown parameter vector. We define for each component of Y an indicator variable, Ri (with realized value ri), taking the value 1 if the component is observed and 0 otherwise. Patterns of missing data are described in table 1.

Consider 2 correlated traits measured on n unrelated individuals; for example, first and second lactation yields of n cows. The 'complete' data are y = (yij), where yij is the realized value of trait j in individual i (j = 1, 2; i = 1 ... n). Suppose that selection acts on the first trait (case (a) in Table I).
As a result, a subset of y, yobs, becomes available for analysis. The pattern of the available data is a random variable. For example, if the better of two cows (n = 2) is selected to have a second lactation, the complete data would be y = (y11, y12, y21, y22). Then, when y11 > y21, yobs = (y11, y12, y21) and r = (1, 1, 1, 0); and when y11 < y21, yobs = (y11, y21, y22) and r = (1, 0, 1, 1). Thus, in analysis of selected data, the pattern of records available for analysis, characterized by the value of r, should be considered as part of the data. If this is not done, there will be a loss of information.

To treat R = (Ri) as a random variable, we need to specify the conditional probability that R = r given Y = y, say f(r | y, ψ), where ψ is a parameter of this conditional distribution. The density of the joint distribution of Y and R is

f(y, r | θ, ψ) = f(y | θ) f(r | y, ψ)   (equ. (1))

The likelihood ignoring the missing-data process, or marginal density of yobs in the absence of selection, is obtained by integrating out the missing data ymis:

f(yobs | θ) = ∫ f(y | θ) dymis   (equ. (3))

The problem with using f(yobs | θ) as a basis for inferences is that it does not take into account the selection process. The information about R, a random variable whose value r is also observed, is ignored. The actual likelihood is

f(yobs, r | θ, ψ) = ∫ f(y | θ) f(r | y, ψ) dymis   (equ. (4))

The question now arises as to when inferences on θ should be based on the joint likelihood (equ. (4)), and when they can be based on equ. (3), which ignores the missing-data process. Rubin (1976) has studied conditions under which inferences from equ. (3) are equivalent to those obtained from equ. (4). If these hold, one can say that the missing-data process can be ignored. The conditions given by Rubin (1976) are: 1) the missing data are missing at random, ie, f(r | yobs, ymis, ψ) = f(r | yobs, ψ) for all ψ and ymis, evaluated at the observed values r and yobs; and 2) the parameters θ and ψ are distinct, in the sense that the joint parameter space of (θ, ψ) is the product of the parameter space of θ and the parameter space of ψ.
Within the context of Bayesian inference, the missing-data process is ignorable when 1) the missing data are missing at random, and 2) the prior density of θ and ψ is the product of the marginal prior density of θ and the marginal prior density of ψ.
IGNORABLE OR NON-IGNORABLE SELECTION

Without loss of generality, we examine ignorability of selection when making likelihood inferences about θ for each of the three examples given in Table I. Suppose individuals 1, 2 ... m (< n) are selected.

Case (a)

Selection is based on observations on the first trait, which are a part of the observed data, and all the data used to make selection decisions are available. The likelihood for the observed data, ignoring selection, is

f(yobs | θ) = ∫ f(y | θ) dymis   (equ. (5))

which factorizes as

f(yobs | θ) = ∏(i=1,...,n) f(yi1 | θ) ∏(i=1,...,m) f(yi2 | yi1, θ)   (equ. (6))

Because selection is based on the observed data only, the conditional probability of R = r given Y = y satisfies f(r | y, ψ) = f(r | yobs, ψ), so the actual likelihood is

f(yobs, r | θ, ψ) = f(yobs | θ) f(r | yobs, ψ)   (equ. (7))

It follows that maximization of equ. (7) with respect to θ will give the same estimates of this parameter as maximization of equ. (6). Thus, knowledge of the selection process is not required, i.e., selection is ignorable. Note that with or without normality, f(yobs | θ) can always be written as equ. (5) or (6). Under normality of the joint distribution of Yi1 and Yi2, Kempthorne and Von Krosigk (Henderson et al., 1959) and Curnow (1961) expressed the likelihood as equ. (6). These authors, however, did not justify clearly why the missing-data process could be ignored.
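Case (a) ignorability can be checked numerically. The sketch below (an illustration, not part of the original paper; all parameter values are arbitrary) simulates pairs of cows with correlated lactation yields, lets the better first lactation decide which cow gets a second record, and then estimates the second-trait mean two ways: a naive average of the observed second records, which is biased by selection, and the estimator implied by the factorization f(yi1 | θ) f(yi2 | yi1, θ) with the remaining parameters taken as known, which corrects through the regression of trait 2 on trait 1.

```python
import math
import random

random.seed(1)

# Assumed (hypothetical) bivariate-normal parameters for the two traits.
MU1, MU2 = 4.0, 5.0        # means of first and second lactation yield
SD1, SD2, RHO = 1.0, 1.0, 0.5

n_pairs = 20000
naive_sum, corr_sum, m = 0.0, 0.0, 0

for _ in range(n_pairs):
    # first-lactation yields of the two cows in a pair
    y11 = random.gauss(MU1, SD1)
    y21 = random.gauss(MU1, SD1)
    winner = y11 if y11 > y21 else y21   # better cow gets a 2nd lactation
    # winner's second-lactation yield, drawn from f(y2 | y1)
    cond_mean = MU2 + RHO * SD2 / SD1 * (winner - MU1)
    cond_sd = SD2 * math.sqrt(1.0 - RHO ** 2)
    y2 = random.gauss(cond_mean, cond_sd)
    naive_sum += y2
    # ML for MU2 under f(y1) f(y2 | y1): subtract the regression on trait 1
    corr_sum += y2 - RHO * SD2 / SD1 * (winner - MU1)
    m += 1

naive_mu2 = naive_sum / m    # ignores how records came to be observed
ml_mu2 = corr_sum / m        # uses the observed-data likelihood
print(naive_mu2, ml_mu2)
```

With selection on the better of two first records, the naive average overshoots MU2 (here by roughly ρ σ2 E[max of 2 standard normals] ≈ 0.28), while the likelihood-based estimate recovers MU2: once the likelihood of the observed data is used, selection on observed data is ignorable.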
In order to illustrate the meaning of the parameter ψ of the conditional probability of R = r given Y = y, we consider a 'stochastic' form of selection: individual i is selected with probability g(ψ0 + ψ1 yi1), so ψ = (ψ0, ψ1). This type of selection can be regarded as selection based on survival, which depends on the first trait via the function g(ψ0 + ψ1 yi1). We have, for the data in Table I,

f(r | y, ψ) = ∏(i=1,...,m) g(ψ0 + ψ1 yi1) ∏(i=m+1,...,n) [1 − g(ψ0 + ψ1 yi1)]

The actual likelihood for the observed data yobs and r is

f(yobs, r | θ, ψ) = f(yobs | θ) f(r | yobs, ψ)   (equ. (8))

It follows that when ψ and θ are distinct, inference about θ based on the actual likelihood, f(yobs, r | θ, ψ), will be equivalent to that based on the likelihood ignoring selection, f(yobs | θ). As shown in equ. (8), the two likelihoods differ by a multiplicative constant which does not depend on θ.

It should be noted that in general, although the conditional distribution of Ri2 given y does not depend on θ, this is not so with the marginal distribution. For example, when Yi1 is normal with mean μi and variance σi², and g is the standard normal distribution function (Φ), we have

Pr(Ri2 = 1 | θ, ψ) = Φ[(ψ0 + ψ1 μi)/(1 + ψ1² σi²)^(1/2)]
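The marginal selection probability above can be verified by simulation. A minimal check (the values of ψ0, ψ1, μi, σi are illustrative, not from the paper):

```python
import math
import random

random.seed(0)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# assumed illustrative values
psi0, psi1 = 0.2, 0.5      # selection parameters (psi)
mu, sigma = 1.0, 1.5       # mean and sd of the first trait Y_i1

n = 200_000
selected = 0
for _ in range(n):
    y1 = random.gauss(mu, sigma)
    if random.random() < Phi(psi0 + psi1 * y1):   # stochastic selection
        selected += 1

empirical = selected / n
theoretical = Phi((psi0 + psi1 * mu) / math.sqrt(1.0 + psi1**2 * sigma**2))
print(empirical, theoretical)
```

The empirical frequency agrees with the formula to within Monte Carlo error; and, as the formula shows, the marginal probability involves the trait parameters (μi, σi), so the marginal distribution of Ri2 depends on θ even though the conditional distribution given y does not.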
The condition (b) in Goffinet (1987) for ignoring the process that causes missing data is not satisfied in this situation.

Case (b)

Data are available only on selected individuals, because observations are missing in the unselected ones. In what follows, we will consider truncation selection: individual i is selected when yi1 > t, where t is a known threshold. The likelihood for the observed data, ignoring selection, is

f(yobs | θ) = ∏(i=1,...,m) f(yi1, yi2 | θ)   (equ. (9))

The conditional probability that R = r given Y = y depends on the observed and on the missing data. We have

f(r | y) = ∏(i=1,...,m) 1(t,∞)(yi1) ∏(i=m+1,...,n) [1 − 1(t,∞)(yi1)]

where 1(t,∞)(yi1) = 1 if yi1 > t, and 0 if yi1 < t. The actual likelihood, accounting for selection, is

f(yobs, r | θ) = ∏(i=1,...,m) f(yi1, yi2 | θ) ∏(i=m+1,...,n) Pr(Yi1 ≤ t | θ)   (equ. (10))

Comparison of equs. (9) and (10) indicates that one should make inferences about θ using equ. (10), which takes selection into account. If equ. (9) is used, the information about θ contained in the second term in equ. (10) would be neglected.
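The contrast can be made concrete with a one-trait sketch (an illustration with assumed values, keeping only the first trait and taking σ = 1 as known): under truncation at a known threshold t, the sample mean of the selected records estimates the truncated-normal mean, while maximizing a likelihood of the equ. (10) type, which also carries a Pr(Y ≤ t | μ) factor for each culled individual, recovers the pre-selection mean μ.

```python
import math
import random

random.seed(2)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# assumed illustrative values: one normal trait with known sd = 1
MU_TRUE, T = 0.0, 0.0      # pre-selection mean and known truncation point
n = 50_000

ys = [random.gauss(MU_TRUE, 1.0) for _ in range(n)]
sel = [y for y in ys if y > T]          # records surviving culling
m = len(sel)
s1 = sum(sel)
s2 = sum(y * y for y in sel)

def loglik(mu):
    # equ.(10)-style likelihood: normal density for the m selected records
    # plus Pr(Y <= t | mu) for each of the n - m culled individuals
    normal_part = -0.5 * (s2 - 2.0 * mu * s1 + m * mu * mu)
    culled_part = (n - m) * math.log(Phi(T - mu))
    return normal_part + culled_part

# equ.(9)-style estimate: ML ignoring selection = mean of selected records
naive_mu = s1 / m

# grid-search ML over mu, accounting for selection
grid = [i / 1000.0 - 1.0 for i in range(2001)]
ml_mu = max(grid, key=loglik)
print(naive_mu, ml_mu)
```

For a standard normal truncated at 0, the naive mean sits near E[Y | Y > 0] ≈ 0.80, while the selection-aware likelihood pulls the estimate back to μ ≈ 0.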
Clearly, selection is not ignorable in this situation.

Case (c)
Often selection is based on an unknown trait correlated with the trait for which data are available (Thompson, 1979). As in case (c) in Table I, suppose the data are available for the second trait on selected individuals only, following selection, e.g. by truncation, on the first trait. The likelihood ignoring selection is

f(yobs | θ) = ∏(i=1,...,m) f(yi2 | θ)   (equ. (11))

We have, as in case (b),

f(r | y) = ∏(i=1,...,m) 1(t,∞)(yi1) ∏(i=m+1,...,n) [1 − 1(t,∞)(yi1)]

The likelihood of the observed data, yobs and r, is

f(yobs, r | θ) = ∏(i=1,...,m) ∫(t,∞) f(yi1, yi2 | θ) dyi1 ∏(i=m+1,...,n) Pr(Yi1 ≤ t | θ)   (equ. (12))

Under certain conditions one could use f(yobs | θ) to make inferences about parameters of the marginal distribution of the second trait after selection. Suppose the marginal distribution of the second trait depends only on parameters θ2, and that the marginal and conditional (given the second trait) distributions of the first trait do not depend on θ2. In this case, likelihood inferences on θ2 from equs. (11) and (12) will be the same.

In summary, the results obtained for the 3 cases discussed indicate that when selection is based only on the observed data it is ignorable, and knowledge of the selection process is not required for making correct inferences about parameters of the data. When the selection process depends on observed and also on missing data, selection is generally not ignorable. Here, making correct inferences about parameters of the data requires knowledge of the selection process to appropriately construct the likelihood.
A GENERAL TYPE OF SELECTION

Selection based on data

In this section, we consider the more general type of selection described by Goffinet (1983) and Fernando and Gianola (1989). The data y0 are observed in a 'base population' and used to make selection decisions which lead to observation of one set of data, y1obs, among n1 possible sets of values y11, y12, ..., y1n1. Each y1k (k = 1, ..., n1) is a vector of measurements corresponding to a selection decision. The observed data at the first stage, y1obs, are themselves used (jointly with y0) to make selection decisions at a second stage, and so forth. At stage j (j = 1, ..., J), let yj be the vector of all elements from yj1, ..., yjnj, without duplication. The vector yj can be partitioned as yj = (yjobs, yjmis), where yjobs and yjmis are the observed and the missing data, respectively. For the J stages, the data can be partitioned as y = (yobs, ymis), where yobs collects y0 and the yjobs, and ymis collects the yjmis; these are the observed and missing parts, respectively, of the complete data set. The complete data set y is a realized value of a random variable Y. When the selection process is based only on the observed data, yobs, the observed missing-data pattern r is entirely determined by yobs. Thus, f(r | y, ψ) = f(r | yobs, ψ), and selection is ignorable.
Selection based on data plus 'externalities'

Suppose that external variables, represented by a random vector E, and the observed data yobs are jointly used to make selection decisions. Let f(y, e | θ, ξ) be the joint density of the complete data Y and E, with an additional parameter ξ such that θ and ξ are distinct. The actual likelihood, density of the joint distribution of Yobs and R, is

f(yobs, r | θ, ξ, ψ) = ∫∫ f(y, e | θ, ξ) f(r | yobs, ymis, e, ψ) de dymis   (equ. (13))

where f(r | yobs, ymis, e, ψ) is the distribution of the missing-data process (selection process). In general, inferences about θ based on f(yobs, r | θ, ξ, ψ) are not equivalent to those based on f(yobs | θ). However, if for the observed data, yobs,

f(r | yobs, ymis, e, ψ) = f(r | yobs, ψ)

for all ymis and e, then equ. (13) can be written as

f(yobs, r | θ, ξ, ψ) = f(r | yobs, ψ) f(yobs | θ)   (equ. (14))

Thus, under the above condition, which is satisfied when Y and E are independent, inferences about θ based on the actual likelihood f(yobs, r | θ, ξ, ψ) and those based on f(yobs | θ) are equivalent. Consequently, the selection process is ignorable. Note that the condition f(r | yobs, ymis, e, ψ) = f(r | yobs, ψ) for all ymis and e does not require independence between Y and E, because it holds only for the observed data yobs and not for all values of the random variable Yobs.
The results can be summarized as follows: 1) the selection process is ignorable when it is based only on the observed data, or on observed data and independent externalities; 2) the selection process is not ignorable when it is based on the observed data plus dependent externalities. In the latter case, knowledge of the selection process is required for making correct inferences.

DISCUSSION
Maximum likelihood (ML) is a widely used estimation procedure in animal breeding applications and has been suggested as the method of choice (Thompson, 1973) when selection occurs. Simulation studies (Rothschild et al., 1979; Meyer and Thompson, 1984) have indicated that there is essentially no bias in ML estimates of variance and covariance components under some forms of selection, e.g., data-based selection. Henderson's (1975) treatment requires that selection be carried out on a linear, translation invariant function. This requirement does not appear in our treatment because we argue from a likelihood viewpoint.
In this paper, the likelihood was defined as the density of the joint distribution of the observed data and the missing-data pattern. In Henderson's (1975) treatment of prediction, the pattern of missing data is fixed, rather than random, and this results in a loss of information about parameters (Cox and Hinkley, 1974). It is possible to use the conditional distribution of the observed data given the missing-data pattern. Gianola et al. (submitted) studied this problem from a conditional likelihood viewpoint and found conditions for ignorability of selection even more restrictive than those of Henderson (1975). Schaeffer (1987) arrived at similar conclusions, but this author worked with quadratic forms, rather than with likelihood. The fact that these quadratic forms appear in an algorithm to maximize likelihood is not sufficient to guarantee that the conditions apply to the method per se.

If the conditions for ignorability of selection discussed in this study are met, the consequence is that the likelihood to be maximized is that of the observed data, i.e., the missing-data process can be completely ignored. Further, if selection is ignorable, f(yobs, r | θ, ψ) ∝ f(yobs | θ), so the two likelihoods lead to the same inferences about θ.

Efron and Hinkley (1978) suggested using observed rather than expected information to obtain the asymptotic variance-covariance matrix of the maximum likelihood estimates. Because the observed data are generally not independent or identically distributed, simple results that imply asymptotic normality of the maximum likelihood estimates do not immediately apply. For further discussion see Rubin (1976).
We have emphasized likelihoods and little has been said on Bayesian inference. It is worth noticing that likelihoods constitute the 'main' part of posterior distributions, which are the basis of Bayesian inference. The results also hold for Bayesian inference provided the parameters are distinct, i.e., their prior distributions are independent. For data-based selection, our results agree with those of Gianola and Fernando (1986) and Fernando and Gianola (1989), who used Bayesian arguments. In general, inferences based on likelihoods or posterior distributions have been found more attractive by animal breeders working with data subject to selection than those based on other methods. This choice is confirmed and strengthened by application of Rubin's (1976) results to this type of problem.
REFERENCES

Cox D.R. & Hinkley D.V. (1974) Theoretical Statistics. Chapman and Hall, London
Curnow R.N. (1961) The estimation of repeatability and heritability from records subject to culling. Biometrics 17, 553-566
Fernando R.L. & Gianola D. (1989) Statistical inferences in populations undergoing selection and non-random mating. In: Advances in Statistical Methods for Genetic Improvement of Livestock. Springer-Verlag, in press
Gianola D. & Fernando R.L. (1986) Bayesian methods in animal breeding theory. J. Anim. Sci. 63, 217-244
Gianola D., Im S., Fernando R.L. & Foulley J.L. (1989) Maximum likelihood estimation of genetic parameters under a "Pearsonian" selection model. J. Dairy Sci. (submitted)
Goffinet B. (1983) Selection on selected records. Genet. Sel. Evol. 15, 91-98
Goffinet B. (1987) Alternative conditions for ignoring the process that causes missing data. Biometrika 71, 437-439
Henderson C.R. (1975) Best linear unbiased estimation and prediction under a selection model. Biometrics 31, 423-439
Henderson C.R., Kempthorne O., Searle S.R. & Von Krosigk C.M. (1959) The estimation of environmental and genetic trends from records subject to culling. Biometrics 15, 192-218
Little R.J.A. & Rubin D.B. (1987) Statistical Analysis with Missing Data. Wiley, New York
Meyer K. & Thompson R. (1984) Bias in variance and covariance component estimators due to selection on a correlated trait. Z. Tierz. Zuchtungsbiol. 101, 33-50
Rothschild M.F., Henderson C.R. & Quaas R.L. (1979) Effects of selection on variances and covariances of simulated first and second lactations. J. Dairy Sci. 62, 996-1002
Rubin D.B. (1976) Inference and missing data. Biometrika 63, 581-592
Schaeffer L.R. (1987) Estimation of variance components under a selection model. J. Dairy Sci. 70, 661-671
Thompson R. (1973) The estimation of variance and covariance components when records are subject to culling. Biometrics 29, 527-550

COMMENT
C.R. Henderson
The paper by Im, Fernando and Gianola provides an interesting and invaluable contribution to estimation and prediction in an almost universal situation in animal breeding. Very few data are available for parameter estimation or prediction of breeding values that have not arisen from either selection experiments or from field data in herds that have undergone selection.

For several years after the adoption of BLUP, a mixed linear model was assumed, and the usual description of the model was that E(U) and E(e) are both null, and in an additive genetic model Var(U) = A σa². The assumption of E(U) = 0 is clearly untenable, because if selection has been effective, the expectations of u subvectors for successive generations are increasing.
A serious attempt to model for selection was made in my 1975 Biometrics paper cited by Im et al. It must be emphasized, as has been done in the paper under review, that my model is different from the model of the present paper. Consequently, the solution to prediction and estimation differs. I do not disagree with the authors' conclusions from their model, and I think, based on long discussions with Gianola and Fernando, that they do not disagree with my conclusions based on my model. The really critical question is, "What is the best model for describing selection?" I have no intention of addressing this issue because I neither have strong convictions about my model nor about any others.

Our models differ in that mine is considerably more restrictive, requiring as it does a fixed incidence matrix with conceptual repeated sampling. This of course is the traditional approach taken by classical statisticians. The problem is more difficult, however, with selection problems, as compared to nicely designed experimental situations. No attempt was made in the 1975 paper to solve the problem of estimation of variances and covariances. Rather, I solved the problem of BLUE of estimable functions of β and BLUP of random variables, given multivariate normality and with variances and covariances known to proportionality. I pointed out that, in contrast to no selection models, the estimators and predictors are biased if incorrect ratios are employed. Thus, it is critical to obtain the best possible estimates of these parameters. Im et al. address this problem.
Several workers have speculated that REML applied to a selection model estimates the variances and covariances that existed prior to selection and which may have been altered by selection. In contrast to most of these speculations, I suggested that when selection is on observed records, the linear selection functions should be translation invariant. I think this is true under my selection model but may well not be true for other selection models.
Im et al. strongly emphasize the desirability of likelihood methods. I agree with them, and in many meetings and papers have recommended these methods over some of my own, such as Method 3. I doubt the accuracy of the last sentence of the paper under review, which states that animal breeders find likelihood methods more attractive. A study of animal breeding literature of the past 5 years would probably disclose that animal breeders have used Method 3 much more often than REML or ML. If this is true, I certainly agree with the minority and with Im et al. The fact is that BLUE and BLUP under my selection model are ML estimators of β and of the conditional mean of U.

I should now like to discuss how results compare under my selection model and
under the model of the present authors. We agree partially regarding estimation when selection is on observable records. The authors' model clearly shows ignorability of selection in this case. Under my model, linear selection functions of Y must either be translation invariant or it must be true that E(L'Ys) = E(L'Yu), where Ys and Yu refer to selection and to no selection, respectively. This difference is simply a consequence of different models.

We agree that if selection is on unobserved random
variables, selection is not ignorable. A special case of this has been of interest to me. Base population animals have been selected on translation invariant linear functions of data, but these are not available for analysis. Assuming that such selection results in E(Ub) ≠ 0, a simple modification of the regular mixed model equations leads to BLUE and BLUP, and presumably these modified equations could be used to derive REML estimation of the variances and covariances, Henderson (1988).
I believe that this final question is justified, namely, "What are the operating characteristics of the authors' estimators?" Likelihood methods for variance estimation have known desirable properties only in large samples. We need studies for various methods of bias, MSE, and maximization of selection progress using BLUP with estimated variances and covariances. Probably this can be done only through extensive simulation for a wide range of parameter values, selection intensity, etc.

The authors have made a valuable contribution to the problem of estimation in selection models. This paper should motivate further studies on this problem.

ADDITIONAL REFERENCE

Henderson C.R. (1988) A simple method to account for selected base populations. J. Dairy Sci. 71, 3399-3404

COMMENT

R. Thompson

This paper considers ways of
constructing likelihoods for selected data, specifically taking account of the presence and absence of data, following a method developed by Rubin (1976) and explained in the recent book by Little and Rubin (1987). Some of the likelihoods have been given previously without any formal recourse to ideas of missingness, for example extensions of case (a) by Henderson et al. (1959), Curnow (1961) and Thompson (1973). These authors used a sequential approach to build up likelihoods that I find appealing. Using this approach it is easy to see that r is a function of y and so does not contribute any extra information on θ. To derive the same likelihood by differing routes is reassuring.
It is valuable to know when selection is ignorable. I have always found it confusing that in extensions of case (a) a likelihood approach would say that selection on y21 is always ignorable, but Henderson (1975) suggests that selection is only ignorable if selection is on a culling variate (w) that is translation invariant.

In an interesting paper, the same 3 authors (Gianola et al., 1988) have constructed the joint density of the data and random effects conditional on the culling variate (Dc). Inferences based on Dc suggest that selection can be ignored only if it is based on functions of the data that do not depend on the fixed effects. It would have been instructive to relate Dc to terms used in the present paper, as presumably r can be related to the culling variate and might help to answer 3 comments I have on the use of Dc.
First, Gianola et al. (1988) condition on w, the culling variate, by integrating over y and the random effects. I am not sure of the need to integrate over the random effects. One might sometimes want to consider repeated samples over (or conditioning on) all possible genetic material, and sometimes repeated sampling over only the same genetic material. Henderson (1988) has recently suggested a procedure that involves no integration over y or the random effects, i.e. conditioning on the observed value of w. What should one do?

Secondly,
Gianola et al. (1988) highlighted differences between using Dc and Henderson's (1975) approach when selection is on random effects or residuals (w = L'u or L'e). This case is artificial in the sense that random effects and residuals will never be known exactly. But if selection is on known random effects, it scarcely seems necessary to predict them using only the data. It might be more interesting to compare the 2 predictions. Similarly, if w = L'e is known, this known value could improve estimation and prediction of the other parameters.
Thirdly, if selection is on w = L'y, but is not translation invariant, presumably the authors' technique, which is non-linear, should be more efficient than Henderson's approach. I wonder if the authors have quantitative information on this.

Finally, there are cases when one wants to estimate parameters associated with equ. (4) and ψ (Robertson (1966)). There is discussion of this area in Little and Rubin (1987), and techniques developed by Foulley, Gianola and Thompson (1983) for quantitative and binary traits can sometimes be used.

ADDITIONAL REFERENCES
Foulley J.L., Gianola D. & Thompson R. (1983) Prediction of genetic merit from data on binary and quantitative variates with application to calving difficulty, birth weight and pelvic opening. Génét. Sél. Evol. 15, 401-424
Gianola D., Im S. & Fernando R.L. (1988) Prediction of breeding value under Henderson's selection model: a revisitation. J. Dairy Sci. 71, 2790-2798
Henderson C.R. (1988) Simple method to compute biases and mean squared error of linear estimators and predictors in a selection model assuming normality. J. Dairy Sci. 71, 3135-3142

REJOINDER

S. Im, R.L. Fernando and D. Gianola

We thank the
participants in this discussion and, in particular, Professor Henderson, whose comments were received by us a few weeks before his unexpected death, for their contributions to the theory of parameter estimation under selection. As Thompson points out in his comments, there is a controversy regarding selection based on observed data. Accordingly, we begin our rejoinder with a discussion of this problem. Then we address the issues raised by the discussants.

Different methods of
parameter estimation or prediction of breeding values developed to account for selection lead to different results on ignorability of selection based on observed data, as stated by Thompson. The repeated sampling developments of Henderson (1975) are made using the conditional distribution of Yobs given R (the observed pattern of missing data) and require, as indicated by missing-data theory (Rubin, 1976), stronger conditions for ignorability than likelihood based inferences. However, it should be noted that the latter inferences are based on the joint distribution of Yobs and R instead of the conditional distribution mentioned above. Some papers (Gianola et al., 1988; Goffinet, 1988) have considered selection from the conditional likelihood viewpoint and given conditions for ignoring it, and these are very restrictive and similar to those of Henderson (1975). Goffinet (1988) advocated the use of conditional likelihood for selection on observed data and found that it is ignorable only if the marginal distribution of R does not depend on the parameter θ. The crucial question to be answered is: should inferences be based on the conditional distribution of Yobs given R?

In
repeated sampling inferences, the statistical quality of an estimator is usually measured in terms of quantities (bias, variance) evaluated by averaging over all possible samples, according to the randomness generated by the sampling process. According to this principle, inferences should be made unconditionally on the observed value of R. This is done in survey sampling theory, where selection schemes do not depend on the response variable and the selection probabilities are known; see, for example, Gourieroux (1981). However, these conditions are not satisfied in animal breeding situations and, as noticed by Henderson (1975), the unconditional approach is rather intractable. Consequently a conditional analysis, while not fully efficient, may be useful.
From the likelihood viewpoint, the unconditional method should be preferred over the conditional method because the former leads to better estimates than the latter, in the sense of having smaller asymptotic variances of estimators. Conditional likelihood is usually considered as a device for obtaining a consistent estimate of the parameter of interest in the presence of infinitely many nuisance parameters (Kalbfleisch and Sprott, 1970). According to Andersen (1970), the conditional maximum likelihood estimator is consistent and asymptotically normally distributed but, in contrast to the maximum likelihood estimators, it will not in general be efficient even under regularity conditions.

The distribution of y1, sampled at random, is affected by selection. Im (1989) highlighted difficulties that arise when applying BLUP under selection in this case, and considered estimation and prediction based on the unconditional likelihood. The conditional likelihood approach, which is less efficient, requires knowledge of the selection process and more complicated calculation, unless the marginal distribution of R does not depend on the parameters being estimated.

For selection
problems
dealt with in this paper, as well as in most animalbreeding
literature,
the correct likelihood isgiven
by
thejoint
distribution ofY
obs
and R. This may not bealways
the case.Consider,
forexample,
situation(b)
in Table I. Wesupposed
that the unselected individuals were available foranalysis,
and used the information thatthey
were not selected whenderiving
the likelihood. Ifthey
were notavailable,
the actual likelihood would be a conditional one. In any selectionproblem,
one should construct the correct likelihood and use it to make inferences.

We agree with Henderson that likelihood methods have known desirable properties only in large samples, and that little is known about their small-sample behavior. The simulation studies he indicated could be useful. We disagree that BLUE and BLUP under his selection model are ML estimators of β and of the conditional mean of U, because the normality requirement is not met under selection, unless selection is translation invariant.

Thompson's comments are mostly concerned with another paper, Gianola et al. (1988), who considered prediction of breeding values by maximizing the joint distribution of the data and the random effects, using the conditional selection scheme proposed by Henderson. For known variances and θ = (β, u), D_c would be the joint density of (Y_obs, U) given R, j(y_obs | r, β, u) j(u).
In Gianola et al. (1988), the selection process is defined as the modification of the joint density of (Y, U) into another density, due to a restriction in the sample space of W. Integration over y and u is needed for obtaining the joint density of (Y, U) conditional on W ∈ R_s from that of (Y, U, W). The procedure suggested by Henderson (1988a) does not involve integration explicitly, because it is developed using conditional means, variances and covariances. However, integration is required when calculating these conditional quantities from the joint density of (Y, U, W). It seems to us that Henderson's (1988a) procedure is conditional not on the observed value of W but, rather, on the event W ∈ R_s. If it were conditional on the observed value, then H_s = Var(W_s) = Var(W | W) would be 0; but, in his example of cow culling, he simulated with H_s ≠ 0 (p. 3139).
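The distinction matters because the two kinds of conditioning give different variances: conditioning on the observed value w forces Var(W | W = w) = 0, while conditioning on a selection event such as W > t leaves the well-known nonzero truncated-normal variance. A minimal numerical check (our illustration, not part of either paper; the truncation point t is arbitrary):

```python
import math
import random

def truncated_normal_variance(t):
    # Var(W | W > t) for W ~ N(0, 1): 1 - lam * (lam - t),
    # where lam = phi(t) / (1 - Phi(t)) is the selection intensity.
    phi = math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    lam = phi / (1.0 - Phi)
    return 1.0 - lam * (lam - t)

random.seed(1)
t = 1.0  # an animal is "selected" when its record w exceeds t
draws = [random.gauss(0.0, 1.0) for _ in range(200_000)]
kept = [w for w in draws if w > t]

mean = sum(kept) / len(kept)
var = sum((w - mean) ** 2 for w in kept) / len(kept)

print(f"empirical Var(W | W > {t}): {var:.3f}")
print(f"theoretical value:          {truncated_normal_variance(t):.3f}")
```

The empirical variance of the selected records agrees with the classical truncated-normal formula and is clearly different from zero, which is consistent with H_s ≠ 0 in Henderson's simulation.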
Selection on random effects or residuals (Henderson, 1975) is indeed artificial and, consequently, has no real practical interest. Gianola et al. (1988) studied this in order to compare the results with those of Henderson (1975). We are not sure of the need for further developing this type of selection. In his comments, Henderson gave a new and more realistic definition of selection on random effects: namely, selection based on records correlated with U but not available for analysis. It might be interesting to compare different methods under this scheme.

We have no quantitative information on the efficiency of Henderson's approach when selection is on L'y but is not translation invariant. This question deserves further study.
possible to handle situations in which animals are selected in an unspecified manner (Henderson, 1988b).
To end our rejoinder, we should like to introduce a practical and important question: is it possible to relax the condition of normality required in Henderson's developments? This question should motivate some further studies on the selection problem.
ADDITIONAL REFERENCES
Andersen E.B. (1970) Asymptotic properties of conditional maximum likelihood estimators. J. R. Statist. Soc. B 32, 283-301
Goffinet B. (1988) À propos de l'estimation des paramètres en présence de sélection. Biom. Praxim. 28, 49-60
Gourieroux C. (1981) Théorie des sondages. Economica, Paris
Henderson C.R. (1988a) Simple method to compute biases and mean squared error of linear estimators and predictors in a selection model assuming normality. J. Dairy Sci. 71, 3135-3142
Henderson C.R. (1988b) A simple method to account for selected base populations. J. Dairy Sci. 71, 3399-3404
Im S. (1989) On a mixed linear model when the data are subject to selection. Biom. J. (in press)
Kalbfleisch J.D. &