HAL Id: hal-00893811
https://hal.archives-ouvertes.fr/hal-00893811
Submitted on 1 Jan 1989
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Likelihood inferences in animal breeding under selection:
a missing-data theory viewpoint
S. Im, R.L. Fernando, D. Gianola
Original article

S. Im 1, R.L. Fernando 2, D. Gianola 2

1 Institut National de la Recherche Agronomique, laboratoire de biométrie, BP 27, 31326 Castanet-Tolosan, France; 2 University of Illinois at Urbana-Champaign, 126 Animal Sciences Laboratory, 1207 West Gregory Drive, Urbana, Illinois 61801, USA

(received 28 October 1988; accepted 20 June 1989)
The Editorial Board here introduces a new kind of scientific report in the Journal, whereby a current field of research and debate is given emphasis, being the subject of an open discussion within these columns. As a first essay, we propose a discussion about a difficult and somewhat troublesome question in applied animal genetics: how to take proper account of the observed data being selected data? Several attempts have been carried out in the past 15 years, without any clear and unanimous solution. In the following, Im, Fernando and Gianola propose a general approach that should make it possible to deal with every problem. In addition to the interest of an original article, we hope that their own discussion and response to the comments given by Henderson and Thompson will provide the reader with a sound insight into this complex topic.

This paper is dedicated to the memory of Professor Henderson, who gave us here one of his latest contributions.

The Editorial Board
Summary - Data available in animal breeding are often subject to selection. Such data can be viewed as data with missing values. In this paper, inferences based on likelihoods derived from statistical models for missing data are applied to production records subject to selection. Conditions for ignoring the selection process are discussed.

animal genetics - selected data - missing data - likelihood inference
Résumé - Likelihood inference methods in animal genetics: accounting for data arising from selection by means of missing-data theory. Data available in animal genetics are often the outcome of a prior process of selection. The records removed by selection can therefore be regarded as missing. Likelihood-based inference methods are developed in which the selection process that induces the missing data is made explicit in the computation of the likelihood. Conditions are discussed under which selection can be ignored, so that only the likelihood of the data actually recorded need be considered.

animal genetics - selection - missing data - likelihood
INTRODUCTION

Data available in animal breeding often come from populations undergoing selection. Several authors have considered methods for the proper treatment of data subject to selection in animal breeding. Examples are Henderson et al. (1959), Curnow (1961), Thompson (1973), Henderson (1975), Rothschild et al. (1979), Goffinet (1983), Meyer and Thompson (1984), Fernando and Gianola (1989), and Schaeffer (1987).

Data subject to selection can be viewed as data with missing values, selection being the process that causes missing data. The statistical literature discusses missing data that arise intentionally. Rubin (1976) has given a mathematically precise treatment which encompasses frequentist approaches that are not based on likelihoods, as well as inferences from likelihoods (including maximum likelihood and Bayesian approaches). Whether it is appropriate to ignore the process that causes the missing data depends on the method of inference and on the process that causes the missing values. Rubin (1976) suggested that in many practical problems, inferences based on likelihoods are less sensitive than sampling distribution inferences to the process that causes missing data. Goffinet (1987) gave alternative conditions to those of Rubin (1976) for ignoring the process that causes missing data when making sampling distribution inferences, with an application to animal breeding.

The objective of this paper is to consider inferences based on likelihoods derived from statistical models for the data and the missing-data process, in analysis of data from populations undergoing selection. As in Little and Rubin (1987), we consider inferences based on likelihoods, in the sense described above, because of their flexibility and avoidance of ad-hoc methods. Assumptions underlying the resulting methods can be displayed and evaluated, and large sample estimates of variances based on second derivatives of the log-likelihood, taking into account the missing-data process, can be obtained.

MODELING THE MISSING-DATA PROCESS
Ideas described by Little and Rubin (1987) are employed in subsequent developments. Let y, the realized value of a random vector Y, denote the data that would occur in the absence of missing values, or complete data. The vector y is partitioned into observed values, yobs, and missing values, ymis. Let f(y | θ) be the probability density function of the joint distribution of Y = (Yobs, Ymis), and θ be an unknown parameter vector. We define for each component of Y an indicator variable, Ri (with realized value ri), taking the value 1 if the component is observed and 0 otherwise. Patterns of missing data are described in table 1.

Consider 2 correlated traits measured on n unrelated individuals; for example, first and second lactation yields of n cows. The 'complete' data are y = (yij), where yij is the realized value of trait j in individual i (j = 1, 2; i = 1 ... n). Suppose that selection acts on the first trait (case (a) in Table I).
As a result, a subset of y, yobs, becomes available for analysis. The pattern of the available data is a random variable. For example, if the better of two cows (n = 2) is selected to have a second lactation, the complete data would be y = (y11, y12, y21, y22). Then, when y11 > y21, yobs = (y11, y12, y21) and r = (1, 1, 1, 0); and when y11 < y21, yobs = (y11, y21, y22) and r = (1, 0, 1, 1). Thus, in analysis of selected data, the pattern of records available for analysis, characterized by the value of r, should be considered as part of the data. If this is not done, there will be a loss of information.

To treat R = (Ri) as a random variable, we need to specify the conditional probability that R = r given Y = y, say f(r | y, ψ), where ψ is a parameter of this conditional distribution. The density of the joint distribution of Y and R is

f(y, r | θ, ψ) = f(y | θ) f(r | y, ψ)   (equ. (1))

The likelihood ignoring the missing-data process, or marginal density of yobs in the absence of selection, is obtained by integrating out the missing data ymis:

f(yobs | θ) = ∫ f(y | θ) dymis   (equ. (3))

The problem with using f(yobs | θ) as a basis for inferences is that it does not take into account the selection process. The information about R, a random variable whose value r is also observed, is ignored. The actual likelihood is

f(yobs, r | θ, ψ) = ∫ f(y | θ) f(r | y, ψ) dymis   (equ. (4))

The question now arises as to when inferences on θ should be based on the joint likelihood (equ. (4)), and when they can be based on equ. (3), which ignores the missing-data process. Rubin (1976) has studied conditions under which inferences from equ. (3) are equivalent to those obtained from equ. (4). If these hold, one can say that the missing-data process can be ignored. The conditions given by Rubin (1976) are: 1) the missing data are missing at random, ie, f(r | yobs, ymis, ψ) = f(r | yobs, ψ) for all ψ and ymis, evaluated at the observed values r and yobs; and 2) the parameters θ and ψ are distinct, in the sense that the joint parameter space of (θ, ψ) is the product of the parameter space of θ and the parameter space of ψ.
Within the context of Bayesian inference, the missing-data process is ignorable when 1) the missing data are missing at random, and 2) the prior density of θ and ψ is the product of the marginal prior density of θ and the marginal prior density of ψ.
IGNORABLE OR NON-IGNORABLE SELECTION

Without loss of generality, we examine ignorability of selection when making likelihood inferences about θ for each of the three examples given in Table I. Suppose individuals 1, 2 ... m (< n) are selected.

Case (a)

Selection is based on observations on the first trait, which are a part of the observed data, and all the data used to make selection decisions are available. The likelihood for the observed data, ignoring selection, is

f(yobs | θ) = ∫ f(y | θ) dymis   (equ. (5))

which factorizes as

f(yobs | θ) = ∏(i=1,...,n) f(yi1 | θ) ∏(i=1,...,m) f(yi2 | yi1, θ)   (equ. (6))

Because selection is based on the observed data only, the conditional probability of R = r given Y = y satisfies f(r | y, ψ) = f(r | yobs, ψ), so the actual likelihood is

f(yobs, r | θ, ψ) = f(yobs | θ) f(r | yobs, ψ)   (equ. (7))

It follows that maximization of equ. (7) with respect to θ will give the same estimates of this parameter as maximization of equ. (6). Thus, knowledge of the selection process is not required, i.e., selection is ignorable. Note that with or without normality, f(yobs | θ) can always be written as equ. (5) or (6). Under normality of the joint distribution of Yi1 and Yi2, Kempthorne and Von Krosigk (Henderson et al., 1959) and Curnow (1961) expressed the likelihood as equ. (6). These authors, however, did not justify clearly why the missing-data process could be ignored.
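Case (a) ignorability can be checked numerically. The sketch below (an illustration, not part of the original paper; all parameter values are arbitrary) simulates pairs of cows with correlated lactation yields, lets the better first lactation decide which cow gets a second record, and then estimates the second-trait mean two ways: a naive average of the observed second records, which is biased by selection, and the estimator implied by the factorization f(yi1 | θ) f(yi2 | yi1, θ) with the remaining parameters taken as known, which corrects through the regression of trait 2 on trait 1.

```python
import math
import random

random.seed(1)

# Assumed (hypothetical) bivariate-normal parameters for the two traits.
MU1, MU2 = 4.0, 5.0        # means of first and second lactation yield
SD1, SD2, RHO = 1.0, 1.0, 0.5

n_pairs = 20000
naive_sum, corr_sum, m = 0.0, 0.0, 0

for _ in range(n_pairs):
    # first-lactation yields of the two cows in a pair
    y11 = random.gauss(MU1, SD1)
    y21 = random.gauss(MU1, SD1)
    winner = y11 if y11 > y21 else y21   # better cow gets a 2nd lactation
    # winner's second-lactation yield, drawn from f(y2 | y1)
    cond_mean = MU2 + RHO * SD2 / SD1 * (winner - MU1)
    cond_sd = SD2 * math.sqrt(1.0 - RHO ** 2)
    y2 = random.gauss(cond_mean, cond_sd)
    naive_sum += y2
    # ML for MU2 under f(y1) f(y2 | y1): subtract the regression on trait 1
    corr_sum += y2 - RHO * SD2 / SD1 * (winner - MU1)
    m += 1

naive_mu2 = naive_sum / m    # ignores how records came to be observed
ml_mu2 = corr_sum / m        # uses the observed-data likelihood
print(naive_mu2, ml_mu2)
```

With selection on the better of two first records, the naive average overshoots MU2 (here by roughly ρ σ2 E[max of 2 standard normals] ≈ 0.28), while the likelihood-based estimate recovers MU2: once the likelihood of the observed data is used, selection on observed data is ignorable.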
In order to illustrate the meaning of the parameter ψ of the conditional probability of R = r given Y = y, we consider a 'stochastic' form of selection: individual i is selected with probability g(ψ0 + ψ1 yi1), so ψ = (ψ0, ψ1). This type of selection can be regarded as selection based on survival, which depends on the first trait via the function g(ψ0 + ψ1 yi1). We have, for the data in Table I,

f(r | y, ψ) = ∏(i=1,...,m) g(ψ0 + ψ1 yi1) ∏(i=m+1,...,n) [1 − g(ψ0 + ψ1 yi1)]

The actual likelihood for the observed data yobs and r is

f(yobs, r | θ, ψ) = f(yobs | θ) f(r | yobs, ψ)   (equ. (8))

It follows that when ψ and θ are distinct, inference about θ based on the actual likelihood, f(yobs, r | θ, ψ), will be equivalent to that based on the likelihood ignoring selection, f(yobs | θ). As shown in equ. (8), the two likelihoods differ by a multiplicative constant which does not depend on θ.

It should be noted that in general, although the conditional distribution of Ri2 given y does not depend on θ, this is not so with the marginal distribution. For example, when Yi1 is normal with mean μi and variance σi², and g is the standard normal distribution function (Φ), we have

Pr(Ri2 = 1 | θ, ψ) = Φ[(ψ0 + ψ1 μi)/(1 + ψ1² σi²)^(1/2)]
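The marginal selection probability above can be verified by simulation. A minimal check (the values of ψ0, ψ1, μi, σi are illustrative, not from the paper):

```python
import math
import random

random.seed(0)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# assumed illustrative values
psi0, psi1 = 0.2, 0.5      # selection parameters (psi)
mu, sigma = 1.0, 1.5       # mean and sd of the first trait Y_i1

n = 200_000
selected = 0
for _ in range(n):
    y1 = random.gauss(mu, sigma)
    if random.random() < Phi(psi0 + psi1 * y1):   # stochastic selection
        selected += 1

empirical = selected / n
theoretical = Phi((psi0 + psi1 * mu) / math.sqrt(1.0 + psi1**2 * sigma**2))
print(empirical, theoretical)
```

The empirical frequency agrees with the formula to within Monte Carlo error; and, as the formula shows, the marginal probability involves the trait parameters (μi, σi), so the marginal distribution of Ri2 depends on θ even though the conditional distribution given y does not.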
The condition (b) in Goffinet (1987) for ignoring the process that causes missing data is not satisfied in this situation.

Case (b)

Data are available only on selected individuals, because observations are missing in the unselected ones. In what follows, we will consider truncation selection: individual i is selected when yi1 > t, where t is a known threshold. The likelihood for the observed data, ignoring selection, is

f(yobs | θ) = ∏(i=1,...,m) f(yi1, yi2 | θ)   (equ. (9))

The conditional probability that R = r given Y = y depends on the observed and on the missing data. We have

f(r | y) = ∏(i=1,...,m) 1(t,∞)(yi1) ∏(i=m+1,...,n) [1 − 1(t,∞)(yi1)]

where 1(t,∞)(yi1) = 1 if yi1 > t, and 0 if yi1 < t. The actual likelihood, accounting for selection, is

f(yobs, r | θ) = ∏(i=1,...,m) f(yi1, yi2 | θ) ∏(i=m+1,...,n) Pr(Yi1 ≤ t | θ)   (equ. (10))

Comparison of equs. (9) and (10) indicates that one should make inferences about θ using equ. (10), which takes selection into account. If equ. (9) is used, the information about θ contained in the second term in equ. (10) would be neglected.
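The contrast can be made concrete with a one-trait sketch (an illustration with assumed values, keeping only the first trait and taking σ = 1 as known): under truncation at a known threshold t, the sample mean of the selected records estimates the truncated-normal mean, while maximizing a likelihood of the equ. (10) type, which also carries a Pr(Y ≤ t | μ) factor for each culled individual, recovers the pre-selection mean μ.

```python
import math
import random

random.seed(2)

def Phi(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# assumed illustrative values: one normal trait with known sd = 1
MU_TRUE, T = 0.0, 0.0      # pre-selection mean and known truncation point
n = 50_000

ys = [random.gauss(MU_TRUE, 1.0) for _ in range(n)]
sel = [y for y in ys if y > T]          # records surviving culling
m = len(sel)
s1 = sum(sel)
s2 = sum(y * y for y in sel)

def loglik(mu):
    # equ.(10)-style likelihood: normal density for the m selected records
    # plus Pr(Y <= t | mu) for each of the n - m culled individuals
    normal_part = -0.5 * (s2 - 2.0 * mu * s1 + m * mu * mu)
    culled_part = (n - m) * math.log(Phi(T - mu))
    return normal_part + culled_part

# equ.(9)-style estimate: ML ignoring selection = mean of selected records
naive_mu = s1 / m

# grid-search ML over mu, accounting for selection
grid = [i / 1000.0 - 1.0 for i in range(2001)]
ml_mu = max(grid, key=loglik)
print(naive_mu, ml_mu)
```

For a standard normal truncated at 0, the naive mean sits near E[Y | Y > 0] ≈ 0.80, while the selection-aware likelihood pulls the estimate back to μ ≈ 0.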
Clearly, selection is not ignorable in this situation.

Case (c)
Often selection is based on an unknown trait correlated with the trait for which data are available (Thompson, 1979). As in case (c) in Table I, suppose the data are available for the second trait on selected individuals only, following selection, e.g. by truncation, on the first trait. The likelihood ignoring selection is

f(yobs | θ) = ∏(i=1,...,m) f(yi2 | θ)   (equ. (11))

We have, as in case (b),

f(r | y) = ∏(i=1,...,m) 1(t,∞)(yi1) ∏(i=m+1,...,n) [1 − 1(t,∞)(yi1)]

The likelihood of the observed data, yobs and r, is

f(yobs, r | θ) = ∏(i=1,...,m) ∫(t,∞) f(yi1, yi2 | θ) dyi1 ∏(i=m+1,...,n) Pr(Yi1 ≤ t | θ)   (equ. (12))

Under certain conditions one could use f(yobs | θ) to make inferences about parameters of the marginal distribution of the second trait after selection. Suppose the marginal distribution of the second trait depends only on parameters θ2, and that the marginal and conditional (given the second trait) distributions of the first trait do not depend on θ2. In this case, likelihood inferences on θ2 from equs. (11) and (12) will be the same.

In summary, the results obtained for the 3 cases discussed indicate that when selection is based only on the observed data it is ignorable, and knowledge of the selection process is not required for making correct inferences about parameters of the data. When the selection process depends on observed and also on missing data, selection is generally not ignorable. Here, making correct inferences about parameters of the data requires knowledge of the selection process to appropriately construct the likelihood.
A GENERAL TYPE OF SELECTION

Selection based on data

In this section, we consider the more general type of selection described by Goffinet (1983) and Fernando and Gianola (1989). The data y0 are observed in a 'base population' and used to make selection decisions which lead to observation of one set of data, y1obs, among n1 possible sets of values y11, y12, ..., y1n1. Each y1k (k = 1, ..., n1) is a vector of measurements corresponding to a selection decision. The observed data at the first stage, y1obs, are themselves used (jointly with y0) to make selection decisions at a second stage, and so forth. At stage j (j = 1, ..., J), let yj be the vector of all elements from yj1, ..., yjnj, without duplication. The vector yj can be partitioned as yj = (yjobs, yjmis), where yjobs and yjmis are the observed and the missing data, respectively. For the J stages, the data can be partitioned as y = (yobs, ymis), where yobs collects y0 and the yjobs, and ymis collects the yjmis; these are the observed and missing parts, respectively, of the complete data set. The complete data set y is a realized value of a random variable Y. When the selection process is based only on the observed data, yobs, the observed missing-data pattern r is entirely determined by yobs. Thus, f(r | y, ψ) = f(r | yobs, ψ), and selection is ignorable.
Selection based on data plus 'externalities'

Suppose that external variables, represented by a random vector E, and the observed data yobs are jointly used to make selection decisions. Let f(y, e | θ, ξ) be the joint density of the complete data Y and E, with an additional parameter ξ such that θ and ξ are distinct. The actual likelihood, density of the joint distribution of Yobs and R, is

f(yobs, r | θ, ξ, ψ) = ∫∫ f(y, e | θ, ξ) f(r | yobs, ymis, e, ψ) de dymis   (equ. (13))

where f(r | yobs, ymis, e, ψ) is the distribution of the missing-data process (selection process). In general, inferences about θ based on f(yobs, r | θ, ξ, ψ) are not equivalent to those based on f(yobs | θ). However, if for the observed data, yobs,

f(r | yobs, ymis, e, ψ) = f(r | yobs, ψ)

for all ymis and e, then equ. (13) can be written as

f(yobs, r | θ, ξ, ψ) = f(r | yobs, ψ) f(yobs | θ)   (equ. (14))

Thus, under the above condition, which is satisfied when Y and E are independent, inferences about θ based on the actual likelihood f(yobs, r | θ, ξ, ψ) and those based on f(yobs | θ) are equivalent. Consequently, the selection process is ignorable. Note that the condition f(r | yobs, ymis, e, ψ) = f(r | yobs, ψ) for all ymis and e does not require independence between Y and E, because it holds only for the observed data yobs and not for all values of the random variable Yobs.
The results can be summarized as follows: 1) the selection process is ignorable when it is based only on the observed data, or on observed data and independent externalities; 2) the selection process is not ignorable when it is based on the observed data plus dependent externalities. In the latter case, knowledge of the selection process is required for making correct inferences.

DISCUSSION
Maximum likelihood (ML) is a widely used estimation procedure in animal breeding applications and has been suggested as the method of choice (Thompson, 1973) when selection occurs. Simulation studies (Rothschild et al., 1979; Meyer and Thompson, 1984) have indicated that there is essentially no bias in ML estimates of variance and covariance components under some forms of selection, e.g., data-based selection. Henderson's (1975) treatment requires that selection be carried out on a linear, translation invariant function. This requirement does not appear in our treatment because we argue from a likelihood viewpoint.
In this paper, the likelihood was defined as the density of the joint distribution of the observed data and the missing-data pattern. In Henderson's (1975) treatment of prediction, the pattern of missing data is fixed, rather than random, and this results in a loss of information about parameters (Cox and Hinkley, 1974). It is possible to use the conditional distribution of the observed data given the missing-data pattern. Gianola et al. (submitted) studied this problem from a conditional likelihood viewpoint and found conditions for ignorability of selection even more restrictive than those of Henderson (1975). Schaeffer (1987) arrived at similar conclusions, but this author worked with quadratic forms, rather than with likelihood. The fact that these quadratic forms appear in an algorithm to maximize likelihood is not sufficient to guarantee that the conditions apply to the method per se.

If the conditions for ignorability of selection discussed in this study are met, the consequence is that the likelihood to be maximized is that of the observed data, i.e., the missing-data process can be completely ignored. Further, if selection is ignorable, f(yobs, r | θ, ψ) ∝ f(yobs | θ), so the two likelihoods lead to the same inferences about θ.

Efron and Hinkley (1978) suggested using observed rather than expected information to obtain the asymptotic variance-covariance matrix of the maximum likelihood estimates. Because the observed data are generally not independent or identically distributed, simple results that imply asymptotic normality of the maximum likelihood estimates do not immediately apply. For further discussion see Rubin (1976).
We have emphasized likelihoods and little has been said on Bayesian inference. It is worth noticing that likelihoods constitute the 'main' part of posterior distributions, which are the basis of Bayesian inference. The results also hold for Bayesian inference provided the parameters are distinct, i.e., their prior distributions are independent. For data-based selection, our results agree with those of Gianola and Fernando (1986) and Fernando and Gianola (1989), who used Bayesian arguments. In general, inferences based on likelihoods or posterior distributions have been found more attractive by animal breeders working with data subject to selection than those based on other methods. This choice is confirmed and strengthened by application of Rubin's (1976) results to this type of problem.
REFERENCES

Cox D.R. & Hinkley D.V. (1974) Theoretical Statistics. Chapman and Hall, London
Curnow R.N. (1961) The estimation of repeatability and heritability from records subject to culling. Biometrics 17, 553-566
Fernando R.L. & Gianola D. (1989) Statistical inferences in populations undergoing selection and non-random mating. In: Advances in Statistical Methods for Genetic Improvement of Livestock. Springer-Verlag, in press
Gianola D. & Fernando R.L. (1986) Bayesian methods in animal breeding theory. J. Anim. Sci. 63, 217-244
Gianola D., Im S., Fernando R.L. & Foulley J.L. (1989) Maximum likelihood estimation of genetic parameters under a "Pearsonian" selection model. J. Dairy Sci. (submitted)
Goffinet B. (1983) Selection on selected records. Genet. Sel. Evol. 15, 91-98
Goffinet B. (1987) Alternative conditions for ignoring the process that causes missing data. Biometrika 71, 437-439
Henderson C.R. (1975) Best linear unbiased estimation and prediction under a selection model. Biometrics 31, 423-439
Henderson C.R., Kempthorne O., Searle S.R. & Von Krosigk C.M. (1959) The estimation of environmental and genetic trends from records subject to culling. Biometrics 15, 192-218
Little R.J.A. & Rubin D.B. (1987) Statistical Analysis with Missing Data. Wiley, New York
Meyer K. & Thompson R. (1984) Bias in variance and covariance component estimators due to selection on a correlated trait. Z. Tierz. Zuchtungsbiol. 101, 33-50
Rothschild M.F., Henderson C.R. & Quaas R.L. (1979) Effects of selection on variances and covariances of simulated first and second lactations. J. Dairy Sci. 62, 996-1002
Rubin D.B. (1976) Inference and missing data. Biometrika 63, 581-592
Schaeffer L.R. (1987) Estimation of variance components under a selection model. J. Dairy Sci. 70, 661-671
Thompson R. (1973) The estimation of variance and covariance components when records are subject to culling. Biometrics 29, 527-550

COMMENT
C.R. Henderson
The paper by Im, Fernando and Gianola provides an interesting and invaluable contribution to estimation and prediction in an almost universal situation in animal breeding. Very few data are available for parameter estimation or prediction of breeding values that have not arisen from either selection experiments or from field data in herds that have undergone selection.

For several years after the adoption of BLUP, a mixed linear model was assumed, and the usual description of the model was that E(U) and E(e) are both null, and in an additive genetic model Var(U) = A σa². The assumption of E(U) = 0 is clearly untenable, because if selection has been effective, the expectations of u subvectors for successive generations are increasing.
A serious attempt to model for selection was made in my 1975 Biometrics paper cited by Im et al. It must be emphasized, as has been done in the paper under review, that my model is different from the model of the present paper. Consequently, the solution to prediction and estimation differs. I do not disagree with the authors' conclusions from their model, and I think, based on long discussions with Gianola and Fernando, that they do not disagree with my conclusions based on my model. The really critical question is, "What is the best model for describing selection?" I have no intention of addressing this issue because I neither have strong convictions about my model nor about any others.

Our models differ in that mine is considerably more restrictive, requiring as it does a fixed incidence matrix with conceptual repeated sampling. This of course is the traditional approach taken by classical statisticians. The problem is more difficult, however, with selection problems, as compared to nicely designed experimental situations. No attempt was made in the 1975 paper to solve the problem of estimation of variances and covariances. Rather, I solved the problem of BLUE of estimable functions of β and BLUP of random variables, given multivariate normality and with variances and covariances known to proportionality. I pointed out that, in contrast to no selection models, the estimators and predictors are biased if incorrect ratios are employed. Thus, it is critical to obtain the best possible estimates of these parameters. Im et al. address this problem.
Several workers have speculated that REML applied to a selection model estimates the variances and covariances that existed prior to selection and which may have been altered by selection. In contrast to most of these speculations, I suggested that when selection is on observed records, the linear selection functions should be translation invariant. I think this is true under my selection model but may well not be true for other selection models.
Im et al. strongly emphasize the desirability of likelihood methods. I agree with them, and in many meetings and papers have recommended these methods over some of my own, such as Method 3. I doubt the accuracy of the last sentence of the paper under review, which states that animal breeders find likelihood methods more attractive. A study of animal breeding literature of the past 5 years would probably disclose that animal breeders have used Method 3 much more often than REML or ML. If this is true, I certainly agree with the minority and with Im et al. The fact is that BLUE and BLUP under my selection model are ML estimators of β and of the conditional mean of U.

I should now like to discuss how results compare under my selection model and
under the model of the present authors. We agree partially regarding estimation when selection is on observable records. The authors' model clearly shows ignorability of selection in this case. Under my model, linear selection functions of Y must either be translation invariant or it must be true that E(L'Ys) = E(L'Yu), where Ys and Yu refer to selection and to no selection, respectively. This difference is simply a consequence of different models.

We agree that if selection is on unobserved random
variables, selection is not ignorable. A special case of this has been of interest to me. Base population animals have been selected on translation invariant linear functions of data, but these are not available for analysis. Assuming that such selection results in E(Ub) ≠ 0, a simple modification of the regular mixed model equations leads to BLUE and BLUP, and presumably these modified equations could be used to derive REML estimation of the variances and covariances, Henderson (1988).
I believe that this final question is justified, namely, "What are the operating characteristics of the authors' estimators?" Likelihood methods for variance estimation have known desirable properties only in large samples. We need studies for various methods of bias, MSE, and maximization of selection progress using BLUP with estimated variances and covariances. Probably this can be done only through extensive simulation for a wide range of parameter values, selection intensity, etc.

The authors have made a valuable contribution to the problem of estimation in selection models. This paper should motivate further studies on this problem.

ADDITIONAL REFERENCE

Henderson C.R. (1988) A simple method to account for selected base populations. J. Dairy Sci. 71, 3399-3404

COMMENT

R. Thompson

This paper considers ways of
constructing likelihoods for selected data, specifically taking account of the presence and absence of data, following a method developed by Rubin (1976) and explained in the recent book by Little and Rubin (1987). Some of the likelihoods have been given previously without any formal recourse to ideas of missingness, for example extensions of case (a) by Henderson et al. (1959), Curnow (1961) and Thompson (1973). These authors used a sequential approach to build up likelihoods that I find appealing. Using this approach it is easy to see that r is a function of y and so does not contribute any extra information on θ. To derive the same likelihood by differing routes is reassuring.
It is valuable to know when selection is ignorable. I have always found it confusing that in extensions of case (a) a likelihood approach would say that selection on y21 is always ignorable, but Henderson (1975) suggests that selection is only ignorable if selection is on a culling variate (w) that is translation invariant.

In an interesting paper, the same 3 authors (Gianola et al., 1988) have constructed the joint density of the data and random effects conditional on the culling variate (Dc). Inferences based on Dc suggest that selection can be ignored only if it is based on functions of the data that do not depend on the fixed effects. It would have been instructive to relate Dc to terms used in the present paper, as presumably r can be related to the culling variate and might help to answer 3 comments I have on the use of Dc.
First, Gianola et al. (1988) condition on w, the culling variate, by integrating over y and the random effects. I am not sure of the need to integrate over the random effects. One might sometimes want to consider repeated samples over (or conditioning on) all possible genetic material, and sometimes repeated sampling over only the same genetic material. Henderson (1988) has recently suggested a procedure that involves no integration over y or the random effects, i.e. conditioning on the observed value of w. What should one do?

Secondly,
Gianola et al. (1988) highlighted differences between using Dc and Henderson's (1975) approach when selection is on random effects or residuals (w = L'u or L'e). This case is artificial in the sense that random effects and residuals will never be known exactly. But if selection is on known random effects, it scarcely seems necessary to predict them using only the data. It might be more interesting to compare the 2 predictions. Similarly, if w = L'e is known, this known value could improve estimation and prediction of the other parameters.
Thirdly, if selection is on w = L'y, but is not translation invariant, presumably the authors' technique, which is non-linear, should be more efficient than Henderson's approach. I wonder if the authors have quantitative information on this.

Finally, there are cases when one wants to estimate parameters associated with equ. (4) and ψ (Robertson (1966)). There is discussion of this area in Little and Rubin (1987), and techniques developed by Foulley, Gianola and Thompson (1983) for quantitative and binary traits can sometimes be used.

ADDITIONAL REFERENCES
Foulley J.L., Gianola D. & Thompson R. (1983) Prediction of genetic merit from data on binary and quantitative variates with application to calving difficulty, birth weight and pelvic opening. Génét. Sél. Evol. 15, 401-424
Gianola D., Im S. & Fernando R.L. (1988) Prediction of breeding value under Henderson's selection model: a revisitation. J. Dairy Sci. 71, 2790-2798
Henderson C.R. (1988) Simple method to compute biases and mean squared error of linear estimators and predictors in a selection model assuming normality. J. Dairy Sci. 71, 3135-3142

REJOINDER

S. Im, R.L. Fernando and D. Gianola

We thank the
participants in this discussion and, in particular, Professor Henderson, whose comments were received by us a few weeks before his unexpected death, for their contributions to the theory of parameter estimation under selection. As Thompson points out in his comments, there is a controversy regarding selection based on observed data. Accordingly, we begin our rejoinder with a discussion of this problem. Then we address the issues raised by the discussants.

Different methods of
parameter estimation or prediction of breeding values developed to account for selection lead to different results on ignorability of selection based on observed data, as stated by Thompson. The repeated sampling developments of Henderson (1975) are made using the conditional distribution of Yobs given R (the observed pattern of missing data) and require, as indicated by missing-data theory (Rubin, 1976), stronger conditions for ignorability than likelihood based inferences. However, it should be noted that the latter inferences are based on the joint distribution of Yobs and R instead of the conditional distribution mentioned above. Some papers (Gianola et al., 1988; Goffinet, 1988) have considered selection from the conditional likelihood viewpoint and given conditions for ignoring it, and these are very restrictive and similar to those of Henderson (1975). Goffinet (1988) advocated the use of conditional likelihood for selection on observed data and found that it is ignorable only if the marginal distribution of R does not depend on the parameter θ. The crucial question to be answered is: should inferences be based on the conditional distribution of Yobs given R?

In
repeated sampling inferences, the statistical quality of an estimator is usually measured in terms of quantities (bias, variance) evaluated by averaging over all possible samples, according to the randomness generated by the sampling process. According to this principle, inferences should be made unconditionally on the observed value of R. This is done in survey sampling theory, where selection schemes do not depend on the response variable and the selection probabilities are known; see, for example, Gourieroux (1981). However, these conditions are not satisfied in animal breeding situations and, as noticed by Henderson (1975), the unconditional approach is rather intractable. Consequently a conditional analysis, while not fully efficient, may be useful.
From the likelihood viewpoint, the unconditional method should be preferred over the conditional method because the former leads to better estimates than the latter, in the sense of having smaller asymptotic variances of estimators. Conditional likelihood is usually considered as a device for obtaining a consistent estimate of the parameter of interest in the presence of infinitely many nuisance parameters (Kalbfleisch and Sprott, 1970). According to Andersen (1970), the conditional maximum likelihood estimator is consistent and asymptotically normally distributed but, in contrast to the maximum likelihood estimators, it will not in general be efficient even under regularity conditions.

The distribution of y1, sampled at random, is affected by selection. Im (1989) highlighted difficulties that arise when applying BLUP under selection in this case, and considered estimation and prediction based on the unconditional likelihood. The conditional likelihood approach, which is less efficient, requires knowledge of the selection process and more complicated calculation, unless the marginal distribution of R does not depend on the parameters being estimated.

For selection
problems
dealt with in this paper, as well as in most animalbreeding
literature,
the correct likelihood isgiven
by
thejoint
distribution ofY
obs
and R. This may not bealways
the case.Consider,
forexample,
situation(b)
in Table I. Wesupposed
that the unselected individuals were available foranalysis,
and used the information thatthey
were not selected whenderiving
the likelihood. Ifthey
were notavailable,
the actual likelihood would be a conditional one. In any selectionproblem,
one should construct the correct likelihood and use it to make inferences.

We agree with Henderson that likelihood methods have known desirable properties only in large samples, and that little is known about their small-sample behavior. The simulation studies he indicated could be useful. We disagree that BLUE and BLUP under his selection model are ML estimators of β and of the conditional mean of U, because the normality requirement is not met under selection, unless selection is translation invariant.

Thompson's comments are mostly concerned with another paper, Gianola et al. (1988), who considered prediction of breeding values by maximizing the joint distribution of the data and the random effects, using the conditional selection scheme proposed by Henderson. For known variances and θ = (β, u), D_c would be the joint density of (Y_obs, U) given R, j(y_obs | r, β, u) j(u).
In Gianola et al. (1988), the selection process is defined as the modification of the joint density of (Y, U) into another density, due to a restriction in the sample space of W. Integration over y and u is needed for obtaining the joint density of (Y, U) conditional on W ∈ R_s from that of (Y, U, W). The procedure suggested by Henderson (1988a) does not involve integration explicitly, because it is developed using conditional means, variances and covariances. However, integration is required when calculating these conditional quantities from the joint density of (Y, U, W). It seems to us that Henderson's (1988a) procedure is conditional not on the observed value of W but, rather, on the event W ∈ R_s. If it were conditional on the observed value, then H_s = Var(W_s) = Var(W | W) would be 0; but, in his example of cow culling, he simulated with H_s ≠ 0 (p. 3139).
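The distinction matters because the two kinds of conditioning give different variances: conditioning on the observed value w forces Var(W | W = w) = 0, while conditioning on a selection event such as W > t leaves the well-known nonzero truncated-normal variance. A minimal numerical check (our illustration, not part of either paper; the truncation point t is arbitrary):

```python
import math
import random

def truncated_normal_variance(t):
    # Var(W | W > t) for W ~ N(0, 1): 1 - lam * (lam - t),
    # where lam = phi(t) / (1 - Phi(t)) is the selection intensity.
    phi = math.exp(-t * t / 2.0) / math.sqrt(2.0 * math.pi)
    Phi = 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))
    lam = phi / (1.0 - Phi)
    return 1.0 - lam * (lam - t)

random.seed(1)
t = 1.0  # an animal is "selected" when its record w exceeds t
draws = [random.gauss(0.0, 1.0) for _ in range(200_000)]
kept = [w for w in draws if w > t]

mean = sum(kept) / len(kept)
var = sum((w - mean) ** 2 for w in kept) / len(kept)

print(f"empirical Var(W | W > {t}): {var:.3f}")
print(f"theoretical value:          {truncated_normal_variance(t):.3f}")
```

The empirical variance of the selected records agrees with the classical truncated-normal formula and is clearly different from zero, which is consistent with H_s ≠ 0 in Henderson's simulation.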
Selection on random effects or residuals (Henderson, 1975) is indeed artificial and, consequently, has no real practical interest. Gianola et al. (1988) studied this in order to compare the results with those of Henderson (1975). We are not sure of the need for further developing this type of selection. In his comments, Henderson gave a new and more realistic definition of selection on random effects: namely, selection based on records correlated with U but not available for analysis. It might be interesting to compare different methods under this scheme.

We have no quantitative information on the efficiency of Henderson's approach when selection is on L'y but is not translation invariant. This question deserves further study.
possible to handle situations in which animals are selected in an unspecified manner (Henderson, 1988b).
To end our rejoinder, we should like to introduce a practical and important question: is it possible to relax the condition of normality required in Henderson's developments? This question should motivate some further studies on the selection problem.
ADDITIONAL REFERENCES
Andersen E.B. (1970) Asymptotic properties of conditional maximum likelihood estimators. J. R. Statist. Soc. B 32, 283-301
Goffinet B. (1988) À propos de l'estimation des paramètres en présence de sélection. Biom. Praxim. 28, 49-60
Gourieroux C. (1981) Théorie des sondages. Economica, Paris
Henderson C.R. (1988a) Simple method to compute biases and mean squared error of linear estimators and predictors in a selection model assuming normality. J. Dairy Sci. 71, 3135-3142
Henderson C.R. (1988b) A simple method to account for selected base populations. J. Dairy Sci. 71, 3399-3404
Im S. (1989) On a mixed linear model when the data are subject to selection. Biom. J. (in press)
Kalbfleisch J.D. &