Univariate and multivariate parameter estimates for milk production traits using an animal model. II. Efficiency of selection when using simplified covariance structures


HAL Id: hal-00893968

https://hal.archives-ouvertes.fr/hal-00893968

Submitted on 1 Jan 1992


PM Visscher, WG Hill, R Thompson


Original article


1 University of Edinburgh, Institute of Cell, Animal and Population Biology, West Mains Road, Edinburgh EH9 3JT;
2 AFRC Institute of Animal Physiology and Genetics Research, Roslin EH25 9PS, Midlothian, UK

(Received 5 August 1991; accepted 3 June 1992)

Summary - The effect of simplifying covariance structures for milk, fat and protein yield in lactations 1-3 on accuracy of selection for lifetime yield was investigated using selection index theory. Previously estimated (co)variances were assumed to be the true population parameters. A modified repeatability model, ie assuming a genetic correlation of unity between performances across lactations but allowing for different heritabilities and variances in different lactations, was used to investigate the efficiency of selecting sires on their progeny performance. For most combinations of heritabilities in first and later lactations, selecting sires using the repeatability model was nearly 100% efficient in terms of accuracy of selection compared to a general multivariate model. The predicted response to selection using a modified repeatability model was approximately 10% too high. It was found that applying to milk, fat and protein yield in later lactations the transformation that makes the new traits in lactation 1 uncorrelated at the phenotypic and genetic level, and assuming that 3 new uncorrelated variates were formed, was highly efficient in terms of accuracy of selection when compared to the accuracy of a general multivariate model. This transformation is recommended for a national BLUP evaluation, since it may take account of selection to a larger extent than when performing separate analyses for milk, fat and protein yield.

dairy cattle / milk production / selection index / genetic parameters / repeatability / canonical transformation

Résumé - Univariate and multivariate parameter estimates for milk production traits using an animal model. II. Efficiency of selection when using a simplified covariance structure. The effect of simplifying the covariance structure for milk, fat and protein production was analysed using selection index theory. Earlier estimates of (co)variances were assumed to be the true population parameters. A modified "repeatability" model, that is, one assuming a genetic correlation of 1 between performances in different lactations but allowing different heritabilities and variances in each lactation, was used to analyse the efficiency of selecting bulls on the performance of their progeny. For most combinations of heritability in first and later lactations, selecting bulls with the repeatability model gave almost the same accuracy of selection as the general multivariate model. The response to selection predicted by the modified repeatability model was overestimated by about 10%. A transformation that makes milk, fat and protein production independent in first lactation, at both the phenotypic and the genetic level, was applied to later lactations, and the 3 new variates were assumed to remain independent. This procedure was highly efficient in terms of accuracy of selection compared with the accuracy of a general multivariate model. This transformation is recommended for a national BLUP evaluation, since it can account for selection better than separate analyses of milk, fat and protein production.

dairy cattle / milk production / selection index / genetic parameter / repeatability / canonical transformation

INTRODUCTION

In a previous paper (Visscher and Thompson, 1992), genetic and environmental parameters were presented for milk yield (M), fat yield (F) and protein yield (P) in lactations 1-3. If the breeding goal for dairy cattle breeding is some (linear) combination of these production traits in all lactations, an optimal way to combine all available information to predict breeding values is a multivariate (MV) BLUP analysis.

For a national animal model (AM) breeding value prediction, however, a general MV BLUP analysis is not yet computationally feasible. In practice, therefore, simplifying assumptions are made when predicting breeding values for large populations using an AM. In dairy cattle AM prediction, milk, fat and protein yield are usually evaluated separately using a repeatability model with some scaling for observations in later lactations to account for heterogeneity of variance across lactations (Wiggans et al, 1988a, b; Ducrocq et al, 1990; Jones and Goddard, 1990).

Selection index theory is used to investigate the loss in accuracy of selection when simplified covariance structures are used to predict breeding values. A second aim is to investigate how to reduce the dimensionality of the above MV prediction [...]

MATERIAL

As reported previously (Visscher and Thompson, 1992), the 9 x 9 genetic covariance matrix of M, F and P in lactations 1-3 was found to be not positive definite. To create a (semi) positive definite matrix the single negative eigenvalue was set to 10-’ [exponent illegible], and covariance matrices were recalculated. These matrices were then used for subsequent (index) calculations. Without loss of generality, phenotypic variances for M, F and P in lactation one were set to 1.0. The parameters are summarised in table I. An alternative way to summarise the covariance matrices is to calculate eigenvalues and eigenvectors of the matrix $P^{-1}G$. Eigenvectors then represent linear combinations of the original traits which are independent of each other at the genetic and phenotypic level (Hayes and Hill, 1980; Meyer, 1985). The corresponding eigenvalues of the newly created traits, called canonical variates (eg Anderson, 1958; ch 12), have been termed canonical heritabilities (Hayes and Hill, 1980; Meyer, 1985).

Table II shows the eigenvalues and eigenvectors of matrix $P^{-1}G$, where P and G are the 9 x 9 phenotypic and genetic covariance matrices for milk, fat and protein yield in lactations 1-3 ($M_1, F_1, P_1, M_2, F_2, P_2, M_3, F_3, P_3$), calculated from parameters in table I. A number following M, F or P indicates the lactation number. The smallest eigenvalue from the original $P^{-1}G$ was -0.03, and the corresponding eigenvector was: [...]

Therefore, one canonical trait, $\{-0.04 M_1 + \ldots + 0.15 F_3 + 0.09 P_3\}$, was found to have a heritability of -0.03. The negative eigenvalue resulted mainly from the [...] row of table II. This was expected, given the very high genetic correlations for yield traits in lactations 2 and 3 (see table I).
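The eigenvalue adjustment described in this section can be illustrated on a small case. The following is a minimal sketch, assuming a hypothetical 2 x 2 symmetric matrix rather than the 9 x 9 matrix of the study; the function names `eig_sym_2x2` and `bend`, the example matrix and the floor value of 1e-6 are illustrative assumptions and are not taken from the paper.

```python
import math

def eig_sym_2x2(m):
    """Return eigenvalues and orthonormal eigenvectors of a symmetric 2 x 2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    if b == 0.0:
        # Already diagonal: the coordinate axes are the eigenvectors.
        return [a, d], [(1.0, 0.0), (0.0, 1.0)]
    mean = (a + d) / 2.0
    half_gap = math.hypot((a - d) / 2.0, b)
    values = [mean + half_gap, mean - half_gap]
    vectors = []
    for lam in values:
        # (b, lam - a) solves (a - lam) * v1 + b * v2 = 0.
        norm = math.hypot(b, lam - a)
        vectors.append((b / norm, (lam - a) / norm))
    return values, vectors

def bend(m, floor=1e-6):
    """Raise eigenvalues below `floor` to `floor` and rebuild the matrix."""
    values, vectors = eig_sym_2x2(m)
    fixed = [max(lam, floor) for lam in values]
    return [[sum(lam * v[i] * v[j] for lam, v in zip(fixed, vectors))
             for j in range(2)] for i in range(2)]

# Hypothetical genetic covariance matrix implying a correlation of 1.05.
g = [[1.0, 1.05], [1.05, 1.0]]
g_bent = bend(g)
print("eigenvalues before:", eig_sym_2x2(g)[0])
print("eigenvalues after: ", eig_sym_2x2(g_bent)[0])
```

Only the offending eigenvalue is changed, so the rebuilt matrix stays close to the original while becoming usable in index calculations that require a positive definite matrix.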

ANALYSIS

Selection index theory

Let:

u = q x 1 vector of breeding values of q traits;
a = q x 1 vector of (marginal) economic values for q traits;
H = u'a = aggregate breeding value;
x = p x 1 vector of sources of information on an individual (for example phenotypic observations, daughter averages, predicted breeding values);
b = p x 1 vector of index weights;
I = b'x = index value used to predict H;
P = var(x);
G = cov(x, u');

and the symbol ^ added to a scalar or matrix indicates an estimate thereof.

For index calculations standard results were used (see, for example, Sales and Hill, 1976). Equations for the responses are, using $b = P^{-1}Ga$ for the optimal index and $\hat{b} = \hat{P}^{-1}\hat{G}a$ for an index using estimates of P and G:

$$R = \sqrt{a'G'P^{-1}Ga} \qquad (1)$$

$$\hat{R} = \sqrt{a'\hat{G}'\hat{P}^{-1}\hat{G}a} \qquad (2)$$

$$R^{*} = \hat{b}'Ga \,/\, \sqrt{\hat{b}'P\hat{b}} \qquad (3)$$

where $R$, $\hat{R}$ and $R^{*}$ are the optimal, predicted and achieved response to selection in the aggregate breeding value (= H) respectively, expressed as a ratio of the selection [...]

If new traits are created which are linear combinations of the observations, y = W'x, ie variables in vector y are a linear combination of the variables in vector x, then

$$R_y = \sqrt{a'G'W(W'PW)^{-1}W'Ga} \qquad (4)$$

It was assumed that the marginal economic value for any of the production traits in later lactations was the product of the relative expression of that trait and the phenotypic standard deviation, thus reflecting survival to later lactations and the economic importance of a larger standard deviation (and mean) in later lactations: $a_i = \varepsilon_i \sigma_i$, where $a_i$, $\varepsilon_i$ and $\sigma_i$ are relative economic value, relative expression and standard deviation for lactation i. Relative expression was assumed to follow a geometric series, $\varepsilon_i = (0.8)^{i-1}$, assuming a relative survival of 80% from one lactation to the next and setting the expression in lactation one to 1.0. Phenotypic standard deviations were assumed to be 1.0, 1.20 and 1.25 for lactations 1-3, and 1.25 for all subsequent lactations (from table I). If it is further assumed that the covariance of any observation with the breeding value in lactation 3 is equal to the covariance of that observation with breeding values in later lactations, ie the corresponding rows of matrix G are identical, then the economic values for lactations 1-3 are [1.0 1.0 4.0], since the sum of economic values for third and subsequent lactations is 4.0 ($\sum_i (1.25)(0.8)^i = 4.0$, for i = 2, 3, ...). Similarly, for the case where only 2 lactations are considered, and assuming that breeding values for second and later lactations have equal covariances with observed phenotypes, a' = [1.0 5.0]. Economic values for traits within a lactation were varied to reflect different breeding goals.
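The arithmetic behind these economic values can be checked directly. The sketch below assumes the relative expression is $0.8^{i-1}$ and that the last listed standard deviation applies to every later lactation, as stated in the text; the function name `economic_values` and its arguments are hypothetical, and the function expects `n_goal` to be no larger than the number of standard deviations supplied.

```python
def economic_values(sds, survival=0.8, n_goal=3):
    """Relative economic values a_i = survival**(i - 1) * sd_i.

    `sds` lists phenotypic standard deviations for lactations 1, 2, ...; the
    last entry is reused for all later lactations. Lactations `n_goal` and
    beyond are pooled into the final element of the result.
    """
    per_lactation = [survival ** i * sd for i, sd in enumerate(sds)]
    # Lactations beyond len(sds) keep the last sd: closed-form geometric sum.
    tail = sds[-1] * survival ** len(sds) / (1.0 - survival)
    head = per_lactation[: n_goal - 1]
    pooled = sum(per_lactation[n_goal - 1:]) + tail
    return head + [pooled]

three_lactations = economic_values([1.0, 1.20, 1.25], n_goal=3)
two_lactations = economic_values([1.0, 1.20, 1.25], n_goal=2)
print([round(v, 2) for v in three_lactations])
print([round(v, 2) for v in two_lactations])
```

Under these assumptions the unrounded values are 1.0, 0.96 and 4.0 for three lactations and 1.0 and 4.96 for two, which is consistent with the rounded vectors [1.0 1.0 4.0] and [1.0 5.0] used in the text.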

For all subsequent calculations it is assumed that observations in later lactations are included in the genetic evaluation only if observations on previous lactations are available.

Single trait multiple lactations considerations

Meyer (1983) investigated the potential gain in response to selection from including multiple lactation information on progeny of sires for sire evaluation. The accuracy of selection was increased directly through more (genetic) information about the trait(s) of interest, since the genetic correlation between performances across lactations ($r_g$) is less than one, and indirectly through a better data structure (better "connectedness"). Assuming a' = [1 1 4] for either milk, fat or protein yield in lactations 1-3, and using the relevant parameters for any of these traits from table I, it was found (using standard selection theory) that for sires the increase in accuracy through including second (and third) lactation daughter information in the selection index was approximately 6-10%. The number of progeny per sire for first and second lactations was varied from 25 to 50 and 5 to 35 respectively. See Meyer (1983) for more examples.
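The direct gain from adding second lactation daughter records can be sketched with a small sire index. This is only an illustration under stated assumptions: daughters are taken to be paternal half sibs (covariance of one quarter of the genetic (co)variance between different daughters, no common environment), the $n_2$ daughters with a second record are a subset of the $n_1$ daughters with a first record, and the parameter values are those quoted later in the paper for the two-lactation example (a' = [1 5], $r_g$ = 0.85, $r_p$ = 0.55), not the three-lactation case with a' = [1 1 4]. All function names are hypothetical.

```python
import math

def cov_matrix(variances, corr):
    """2 x 2 covariance matrix from two variances and a correlation."""
    off = corr * math.sqrt(variances[0] * variances[1])
    return [[variances[0], off], [off, variances[1]]]

def solve_2x2(m, v):
    """Solve m * b = v for a 2 x 2 matrix m."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(m[1][1] * v[0] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - m[1][0] * v[0]) / det]

def sire_index_response(n1, n2, p, g, a):
    """Response per unit selection intensity for an index on daughter means."""
    g_a = [g[0][0] * a[0] + g[0][1] * a[1], g[1][0] * a[0] + g[1][1] * a[1]]
    cov_xh = [0.5 * t for t in g_a]  # a daughter carries half the sire's breeding value
    var1 = (p[0][0] + (n1 - 1) * 0.25 * g[0][0]) / n1
    if n2 == 0:
        return cov_xh[0] / math.sqrt(var1)
    var2 = (p[1][1] + (n2 - 1) * 0.25 * g[1][1]) / n2
    cov12 = (p[0][1] + (n1 - 1) * 0.25 * g[0][1]) / n1
    b = solve_2x2([[var1, cov12], [cov12, var2]], cov_xh)
    return math.sqrt(b[0] * cov_xh[0] + b[1] * cov_xh[1])

phen = cov_matrix([1.0, 1.45], 0.55)
gen = cov_matrix([0.40 * 1.0, 0.30 * 1.45], 0.85)
a = [1.0, 5.0]
first_only = sire_index_response(50, 0, phen, gen, a)
both = sire_index_response(50, 35, phen, gen, a)
gain = both / first_only
print("relative accuracy with second lactation records:", round(gain, 3))
```

With these inputs the gain is of the same order as the 6-10% range reported in the text, but the sketch is not a reproduction of that calculation.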

Perhaps a more interesting question regarding the use of multiple lactation information on a single trait is how much accuracy is lost when a modified repeatability model (eg assuming $r_g$ = 1) is assumed for breeding value prediction instead of the "true" MV covariance structure. This was investigated for 3 selection indices:

$I_1$ = phenotypic index, ie sources of information are phenotypic observations on individuals;
$I_2$ = sire index; sources of information are daughter averages of sires in different lactations;
$I_3$ = cow index; sources of information are the predicted breeding value (index) of the cow's sire and dam and the cow's own records.

The largest reduction in response to selection is expected when selection is across age classes, eg across cohorts with different amounts of information, since an improper weighting of later lactations then would have the largest impact. In the following examples, only 2 lactations and 2 cohorts were considered, but the results are thought to be similar for more lactations (given the very high genetic correlation between second and later lactation yields) and more age groups. The genetic means for the cohorts were assumed to be zero, hence the consequences of the error in predicting genetic trend were ignored. (A thorough study of long-term losses in response through incorrect estimation of genetic trend, thus creating a suboptimal ranking of young vs old animals, was outside the scope of this study.)

For each index there were different amounts of information on the 2 cohorts:

$I_1$:
Cohort 1: phenotypic observation in lactation 1
Cohort 2: phenotypic observations in lactations 1 and 2

$I_2$:
Cohort 1: first lactation daughter average based on $n_1$ daughters
Cohort 2: $n_1$ first lactation daughter records and $n_2$ second lactation records ($n_2 < n_1$)

$I_3$:
Cohort 1: sire index based on $n_1$ first lactation progeny, dam index based on sire index of dam and dam's first lactation record, cow's first lactation record
Cohort 2: sire index based on $n_1 + n_2$ progeny records, dam index based on sire index of dam and dam's records in first and second lactation, cow's first and second lactation records.

Parameters used for the example with 2 lactations and 2 cohorts were (from table I): a' = [1 5]; $r_g$ = 0.85; $r_p$ = 0.55; phenotypic variances were 1.0 and 1.45 and heritabilities were 0.40 and 0.30 for first and second lactations respectively; $n_1$ = 50; $n_2$ = 35. The "estimated" (assumed) parameters were: $r_g$ = 1.0 (repeatability model); the "true" phenotypic covariance matrix (from table I) was used and heritabilities for lactation 1 ($h_1^2$) and for lactation 2 ($h_2^2$) were varied. A proportion of 10% of the total number of animals available was selected. The definition of this repeatability model differs from that usually used because heritabilities and phenotypic variances are not necessarily equal in different lactations. Responses to selection were calculated using equations (1), (2) and (3). Given any set of parameters the optimal proportion of animals to be selected from each cohort was determined using an algorithm from Ducrocq and Quaas (1988), assuming the parameters used were the true population parameters. Results are presented in table III.
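The comparison of optimal, predicted and achieved responses can be sketched for a single animal with records in both lactations (cohort 2 of the phenotypic index). The sketch assumes that the responses take the standard selection index forms $R = \sqrt{a'G'P^{-1}Ga}$, $\hat{R} = \sqrt{a'\hat{G}'P^{-1}\hat{G}a}$ and $R^{*} = \hat{b}'Ga/\sqrt{\hat{b}'P\hat{b}}$, uses the parameter values quoted above with the correct heritabilities and $r_g$ set to 1 in the assumed model, and does not implement the two-cohort truncation of Ducrocq and Quaas (1988), so it does not reproduce table III. Function names are hypothetical.

```python
import math

def cov_matrix(variances, corr):
    """2 x 2 covariance matrix from two variances and a correlation."""
    off = corr * math.sqrt(variances[0] * variances[1])
    return [[variances[0], off], [off, variances[1]]]

def mat_vec(m, v):
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def solve_2x2(m, v):
    """Solve m * b = v for a 2 x 2 matrix m."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(m[1][1] * v[0] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - m[1][0] * v[0]) / det]

def responses(p, g_true, g_used, a):
    """Optimal, predicted and achieved response per unit selection intensity."""
    b_opt = solve_2x2(p, mat_vec(g_true, a))   # weights with true parameters
    b_hat = solve_2x2(p, mat_vec(g_used, a))   # weights with assumed parameters
    optimal = math.sqrt(dot(b_opt, mat_vec(g_true, a)))
    predicted = math.sqrt(dot(b_hat, mat_vec(g_used, a)))
    achieved = dot(b_hat, mat_vec(g_true, a)) / math.sqrt(dot(b_hat, mat_vec(p, b_hat)))
    return optimal, predicted, achieved

phen = cov_matrix([1.0, 1.45], 0.55)
gen_var = [0.40 * 1.0, 0.30 * 1.45]
g_true = cov_matrix(gen_var, 0.85)   # "true" multivariate structure
g_rep = cov_matrix(gen_var, 1.0)     # modified repeatability model
optimal, predicted, achieved = responses(phen, g_true, g_rep, [1.0, 5.0])
print("achieved / optimal:  ", round(achieved / optimal, 3))
print("achieved / predicted:", round(achieved / predicted, 3))
```

In this single-cohort case the achieved response stays within about 1% of the optimum while the predicted response overstates it by roughly 10%, the same pattern the text describes for the repeatability model.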

These results may be expected, since the "true" genetic correlation (= 0.85) between performance in lactations 1 and 2 was high and a later lactation observation is always conditional on the presence of a first lactation observation. The ratio of achieved to predicted response was less robust to changes in parameters. Even when the correct heritabilities (0.40 and 0.30) were used the achieved response (accuracy) was approximately 10% below the maximum expected response. This may be seen as a very simple illustration that one should be cautious when using predicted values (whether from selection indices or BLUP) to estimate genetic trend when the parameters used in the prediction are subject to large sampling errors or when they are a priori incorrect (as in the case of a repeatability model when it is known that $r_g$ < 1).

Since results were similar for the 3 indices used, subsequent calculations were only performed for the case of mass selection.

Multiple trait multiple lactation considerations

Suppose the breeding goal is a linear combination of 9 traits ($M_1, F_1, P_1, M_2, F_2, P_2, M_3, F_3, P_3$), which is thought to be a good indicator of lifetime economic production since third and later lactation performances are assumed to be highly correlated. Then, choosing a set of economic values and using parameters from table I, the relative accuracy of selection for different indices which use different amounts of information can be investigated. For 3 different sets of economic values these relative accuracies were calculated, and results are presented in table IV. The [...] to first lactation economic weightings used in practical selection indices in Europe (eg Wilmink, 1988). $H_3$ reflects a more "progressive" breeding goal with selection only on protein production.

Results from table IV show that approximately 20% accuracy is lost when only one observation is used to predict the aggregate breeding value. Results for $H_2$ and $H_3$ were similar since breeding values for these composite traits were highly correlated. If only accuracy is considered, using milk and fat yield in a selection index does not contribute substantially to increase response to selection for lifetime protein yield ($H_3$). For all 3 breeding goals, the gain in accuracy for sire selection when adding progeny means on multiple traits is expected to be smaller than for mass selection. Although these calculations are an oversimplification of breeding value prediction and selection in practice, they are useful when comparing the accuracies from table IV with accuracies when simplified assumptions are made regarding the covariance [...]

Proportionality considerations

One possible way to reduce the dimensionality of a MV prediction problem is to investigate whether some traits may be approximately expressed as linear combinations of other traits, or if some linear combination of the traits explains most of the variation in the aggregate breeding value. To reduce computations (further) it would be of interest to find a minimum number of independent traits which would provide all the necessary information (see for example Lin and Smith, 1990). In particular, it would be convenient if one linear transformation could be found that reduced the prediction problem of 9 highly correlated traits (milk, fat and protein yield in lactations 1-3) to that of 3 independent new traits, since in that case 3 separate analyses would contain all the information and would (therefore) account for selection. One approach is to investigate if submatrices of the 9 x 9 covariance matrix are proportional to each other, since that would indicate that a suitable transformation may exist to reduce the dimensionality of the prediction problem.

With observations on p traits in l lactations, this suggests testing if the covariance matrix of lp traits, V, may be written as $V = K \otimes V_h$, where $V_h$ is a submatrix (or a transformation thereof) describing a (co)variance block of p traits within or between lactations, K is an l x l symmetric matrix of proportionality or scaling constants, and $\otimes$ is the direct product operator (Searle, 1966). For the case of M, F and P in lactations 1-3, the hypothesis is that the complete covariance matrix may be expressed as a proportionality (or scaling) matrix multiplied by a transformation matrix. In the appendix a general framework is presented to test for proportionality of covariance matrices using likelihood ratio tests and to estimate constants of proportionality, using multivariate normal theory applied to observed moment matrices and their expectations.
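The structure $V = K \otimes V_h$ can be made concrete with a small direct product. The sketch below uses hypothetical matrices for 2 lactations and 2 traits; `kron`, `block` and `scaling_constant` are illustrative helper names, and the averaged element-wise ratio is a crude check of proportionality, not the likelihood-based estimator of the appendix.

```python
def kron(k, vh):
    """Direct (Kronecker) product of two matrices given as lists of rows."""
    rows_k, cols_k = len(k), len(k[0])
    rows_v, cols_v = len(vh), len(vh[0])
    return [[k[i][j] * vh[r][c] for j in range(cols_k) for c in range(cols_v)]
            for i in range(rows_k) for r in range(rows_v)]

def block(v, i, j, p):
    """Extract the p x p block for lactations (i, j), counting from zero."""
    return [row[j * p:(j + 1) * p] for row in v[i * p:(i + 1) * p]]

def scaling_constant(blk, vh):
    """Average element-wise ratio of a block to V_h."""
    ratios = [blk[r][c] / vh[r][c]
              for r in range(len(vh)) for c in range(len(vh[0]))
              if vh[r][c] != 0.0]
    return sum(ratios) / len(ratios)

# Hypothetical scaling matrix (2 lactations) and within-lactation block (2 traits).
k = [[1.0, 0.6], [0.6, 1.4]]
vh = [[1.0, 0.8], [0.8, 1.0]]
v = kron(k, vh)
for row in v:
    print([round(x, 2) for x in row])
print("recovered K[0][1]:", round(scaling_constant(block(v, 0, 1, 2), vh), 3))
```

When every block of V is an exact multiple of $V_h$, as here, any transformation that diagonalises $V_h$ diagonalises every block at once, which is what makes the reduction to independent analyses possible.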

Unfortunately, the data from table I were found unsuitable for a likelihood ratio test. Obviously the additive genetic covariance matrix (A) and the environmental covariance matrix (E) from table I are not independent moment matrices; A and E are highly correlated and the determinant of A is zero. An alternative is to transform A and E into a between and within sire covariance matrix (B and W), assuming these matrices are from a balanced half-sib design based on s sires and n progeny per sire. However, there was insufficient information about the sampling variances of the estimated E and A matrices to determine the appropriate degrees of freedom. Furthermore, the exact distribution of the likelihood ratio test statistic based on empirically derived degrees of freedom and using animal model estimates may differ substantially from a Chi-square distribution. Therefore, significance testing for proportionality was not pursued.

Using parameter estimates from table I, however, some inference with respect to proportionality may be drawn. One (obvious) choice for the transformation matrix $V_h$ is a canonical transformation on $M_1$, $F_1$ and $P_1$. This transformation was calculated (see Hayes and Hill, 1980; and Meyer, 1985, for computations) and the [...] the 3 x 3 genetic ($V_{g11}$) and phenotypic ($V_{p11}$) covariance matrices of milk, fat and protein yield in lactation 1. Then the transformation matrix of milk, fat and protein yield in lactation one, $Q_1$, was chosen such that:

$$Q_1 V_{p11} Q_1' = I \quad \text{and} \quad Q_1 V_{g11} Q_1' = D$$

with elements of diagonal matrix D equal to eigenvalues of the matrix ($V_{p11}^{-1} V_{g11}$). Using $Q_1$, the vector of observations, x, was transformed using:

$$y_c = (I_3 \otimes Q_1)\,x$$

The eigenvectors for the 3 canonical variates in lactation 1 were [2.96 -0.72 -2.09], [-0.85 -1.85 2.61] and [-0.08 0.42 0.69] respectively, which form the rows of matrix $Q_1$. The 9 x 9 correlation matrices and the heritabilities of the 9 new traits ($y_c$) are shown in table V. Off-diagonals in all 3 x 3 blocks were small, indicating that one transformation matrix created 3 nearly independent variates, each with highly correlated "observations" in later lactations.
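A canonical transformation of this kind can be sketched for two traits. The code below is a minimal illustration, assuming hypothetical phenotypic and genetic covariance matrices and a Cholesky-based route to the eigenvectors; it is one standard way to obtain a matrix Q with $QPQ' = I$ and $QGQ' = D$, and is not claimed to be the computation used by the authors. The function names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def canonical_transformation(p, g):
    """Return Q (rows = canonical variates) and canonical heritabilities, 2 traits."""
    # Cholesky factor L of P (P = L L') and its inverse.
    l11 = math.sqrt(p[0][0])
    l21 = p[1][0] / l11
    l22 = math.sqrt(p[1][1] - l21 * l21)
    l_inv = [[1.0 / l11, 0.0], [-l21 / (l11 * l22), 1.0 / l22]]
    # M = L^-1 G L^-T is symmetric and has the same eigenvalues as P^-1 G.
    m = matmul(matmul(l_inv, g), transpose(l_inv))
    a, b, d = m[0][0], m[0][1], m[1][1]
    if b == 0.0:
        values, u_rows = [a, d], [[1.0, 0.0], [0.0, 1.0]]
    else:
        mean, half_gap = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
        values = [mean + half_gap, mean - half_gap]
        u_rows = []
        for lam in values:
            norm = math.hypot(b, lam - a)
            u_rows.append([b / norm, (lam - a) / norm])
    # Rows of u_rows are eigenvectors of M, so Q = U' L^-1.
    return matmul(u_rows, l_inv), values

# Hypothetical parameters for two correlated yield traits.
p = [[1.0, 0.8], [0.8, 1.0]]
g = [[0.35, 0.25], [0.25, 0.40]]
q, h2 = canonical_transformation(p, g)
qpq = matmul(matmul(q, p), transpose(q))
qgq = matmul(matmul(q, g), transpose(q))
print("canonical heritabilities:", [round(x, 3) for x in h2])
```

Applying the same Q to the records of every lactation corresponds to the $I_3 \otimes Q_1$ structure: the first lactation variates become exactly uncorrelated, while later lactations are only approximately so unless proportionality holds.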

Using the covariance matrix of milk, fat and protein yield in lactation 1 as $V_h$, proportionality matrices for additive genetic and environmental effects were calculated from equation (A7). This assumed the observed covariance matrices E and A were moment matrices, but degrees of freedom did not need to be specified. For A and E, the estimates of proportionality matrix K, $K_a$ and $K_e$ respectively, were: [...]

For example, the estimate of the average scaling factor for genetic variances for M, F and P on the transformed scale (transformation such that new variates are uncorrelated) in lactation 3 was 1.74, and the estimate of the average scaling factor for environmental covariances between the transformed variates in lactations 1 and 3 was 0.58. If proportionality is assumed, the 9 x 9 MV prediction problem may be reduced to 3 independent 3 x 3 multivariate predictions or to 3 independent evaluations with a repeatability model.

Using the breeding goals defined previously the efficiency of the reduction in dimensionality was calculated for mass selection, conditional on the parameters in table I being the true population parameters. Thus the parameters from table V were used for index calculations with all off-diagonals of all 3 x 3 covariance blocks set to zero. In the case of a repeatability model on the canonical variates, genetic correlations between canonical variates across lactations were set to unity. Results of selection index calculations for phenotypic selection are [...] is slightly lower than the corresponding accuracy using the original first 3 variates ($M_1$, $F_1$ and $P_1$) from table III because the genetic covariance structure between the canonical traits in lactation one and transformed variates in later lactations was simplified (off-diagonals of 3 x 3 blocks in matrix G were set to zero). Clearly little accuracy is lost assuming proportionality of the covariance structure for milk, fat and protein yield across lactations. Simplification to a repeatability model on 3 canonical variates was approximately 97% as efficient as a multivariate analysis on 9 traits. When using the canonical variates there was no advantage of a MV analysis over an analysis with a repeatability model.

Analysing linear combinations of the observations

A final reduction in dimensionality is achieved by analysing a reduced set of traits which are linear combinations of the available observations. One suggestion is to create new traits for each lactation which are the sum of the phenotypic observations weighted by the corresponding economic values in the aggregate breeding value. Using the notation from equation (4), y = a'x, where variables in x are, for example, observations for $M_1$, $F_1$ and $P_1$. Relative accuracies were calculated using equation (4), fitting first lactation yield traits, first and second lactation yield traits, and all yield traits in vector x, respectively. Results are presented in table VII.
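The cost of collapsing the observations into a single economically weighted trait can be sketched in the simplest case. The example assumes the response to selection on a scalar y = w'x is $w'Ga/\sqrt{w'Pw}$, which is the scalar case of an index on y = W'x, and compares it with the optimal index $\sqrt{a'G'P^{-1}Ga}$. For illustration it reuses the single trait, two-lactation parameters quoted earlier in the paper (a' = [1 5]) rather than the 9-trait structure of table VII, so the resulting figure is not a reproduction of that table. Function names are hypothetical.

```python
import math

def cov_matrix(variances, corr):
    """2 x 2 covariance matrix from two variances and a correlation."""
    off = corr * math.sqrt(variances[0] * variances[1])
    return [[variances[0], off], [off, variances[1]]]

def quad(u, m, v):
    """Bilinear form u' M v."""
    return sum(u[i] * m[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))

def solve_2x2(m, v):
    """Solve m * b = v for a 2 x 2 matrix m."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(m[1][1] * v[0] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - m[1][0] * v[0]) / det]

phen = cov_matrix([1.0, 1.45], 0.55)
gen = cov_matrix([0.40 * 1.0, 0.30 * 1.45], 0.85)
a = [1.0, 5.0]

# Optimal index on both records: b = P^-1 G a, response sqrt(b' G a).
g_a = [gen[0][0] * a[0] + gen[0][1] * a[1], gen[1][0] * a[0] + gen[1][1] * a[1]]
b_opt = solve_2x2(phen, g_a)
optimal = math.sqrt(b_opt[0] * g_a[0] + b_opt[1] * g_a[1])

# Single composite trait y = a'x: response cov(y, H) / sd(y).
composite = quad(a, gen, a) / math.sqrt(quad(a, phen, a))
efficiency = composite / optimal
print("relative efficiency of y = a'x:", round(efficiency, 3))
```

Weighting records by economic values ignores differences in heritability and in the correlations among records, which is why the composite trait cannot exceed the optimal index and usually falls somewhat short of it.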

Another suggestion is to use linear combinations of the yield traits within a lactation as new traits and to perform an analysis on those new traits. For example, if $y_1 = w_1'x_1$, for $x_1' = [M_1\ F_1\ P_1]$, and $y_2 = w_2'x_2$, for $x_2' = [M_2\ F_2\ P_2]$, then in the selection index framework this would be fitting y = W'x as used for equation (4). Using $x_1$ and $x_2$ as above, and $x_3' = [M_3\ F_3\ P_3]$, 3 new traits were created using the economic values for each trait in the aggregate breeding value as elements for matrix W. Accuracies for fitting combinations of these new traits are presented in table VII.

Finally, 3 separate selection indices could be calculated from using ($M_1\ M_2\ M_3$), (F [...] way selection indices for dairy cattle are usually calculated: breeding values are calculated for milk, fat and protein yield separately, and the 3 estimated breeding values are subsequently combined in a selection index (eg Dommerholt and Wilmink, 1986; Wilmink, 1988). Separate selection indices for milk yield, fat yield and protein yield were calculated using either the "true" (multivariate) covariance structure or the modified repeatability model ($r_g$ across lactations set to unity), and the accuracy of using the 3 indices to predict the aggregate breeding value was calculated. For the construction of each of the 3 indices it was assumed that the traits in the index were identical to the traits in the aggregate breeding values. For example, the index using $M_1$, $M_2$ and $M_3$ was calculated as $P_m^{-1}G_m a_m$, where $P_m$ and $G_m$ are the 3 x 3 phenotypic and genetic covariance matrices for milk yield in lactations 1-3, and $a_m$ is a vector of economic values for these traits. The relative accuracies for this approach are shown in table VII.

Comparing results from tables IV, VI and VII shows that little efficiency was lost when analysing linear combinations of the observations using economic values as weights. For the case of just using observations for $M_1$, $F_1$ and $P_1$ this is not surprising, since these traits were so highly correlated and had similar heritabilities, hence their index values resembled the economic values. For breeding goals $H_2$ and $H_3$ approximately 8% accuracy was lost when using all observations weighted by their economic values as a single trait, and approximately 4% accuracy was lost when analysing 3 new traits, each trait being a linear combination of observations and economic values within a lactation (table VII).

The last 2 rows of table VII indicate what accuracy was lost by ignoring [...] separately. In effect only average correlations between the traits within lactations are taken into account when combining separate indices for milk yield, fat yield and protein yield. The loss in accuracy was small and very similar to losses when using a repeatability model on the canonical variates (see table VI).

DISCUSSION

Only one aspect of efficiency of selection, namely accuracy of predicting some aggregate breeding value assuming fixed effects were known, was considered in this study. Meyer (1983) found that for BLUP prediction of breeding values the increase in accuracy from including later lactation observations was largely through an improved data structure. Results from including information from relatives in the calculations, and including comparisons between young and old animals should have [...]

Use of a repeatability model for milk production traits across lactations seemed to have little effect on accuracy of selection, although the predicted gain/accuracy may be approximately 10% too high. More research is needed to investigate long-term losses in response to selection when incorrect models are used to predict breeding values.

More information on the sampling variance of the parameter estimates is needed for testing the proportionality hypothesis. Ideally, one MV REML analysis on the 9 traits should be carried out, with an algorithm that would produce (second) derivatives. Still, calculations would then involve a 90 x 90 (45 genetic and 45 environmental) sampling variance matrix which would probably be subject to large sampling errors itself.

As pointed out by Meyer (1985), the canonical variates from creating independent variates in lactation 1 may have a biological explanation. The eigenvectors show that canonical variate 1 corresponds approximately to percentage protein (and fat content to a lesser extent) and canonical variate 2 to the difference between fat and protein content. Canonical variate 3 seems just to be the sum of fat and protein yield. Heritabilities for canonical variates were consistent with heritabilities found for fat and protein content previously reported.

Canonical variates from diagonalising the complete 9 x 9 $\mathbf{P}^{-1}\mathbf{G}$ matrix have similar biological explanations to the canonical variates from lactation 1, but now including comparisons between lactations (see table II).
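A canonical transformation of this kind can be sketched numerically. The code below is only an illustration under assumptions: the function name `canonical_transformation` is ours, and the 3 x 3 matrices `P` and `G` are hypothetical values, not the estimates of table I. It finds a matrix `T` such that the transformed variates are phenotypically and genetically uncorrelated with unit phenotypic variance, so that the eigenvalues of $\mathbf{P}^{-1}\mathbf{G}$ are the heritabilities of the canonical variates.

```python
import numpy as np

def canonical_transformation(P, G):
    """Return (T, h2) with T P T' = I and T G T' = diag(h2), h2 in descending order."""
    L = np.linalg.cholesky(P)           # P = L L'
    L_inv = np.linalg.inv(L)
    S = L_inv @ G @ L_inv.T             # symmetric; same eigenvalues as P^{-1} G
    eigenvalues, U = np.linalg.eigh(S)  # eigh returns ascending eigenvalues
    order = np.argsort(eigenvalues)[::-1]
    h2 = eigenvalues[order]
    T = U[:, order].T @ L_inv           # each row holds the weights of one canonical variate
    return T, h2

# Hypothetical phenotypic and genetic covariance matrices for 3 yield traits.
P = np.array([[1.00, 0.80, 0.90],
              [0.80, 1.00, 0.85],
              [0.90, 0.85, 1.00]])
G = np.array([[0.35, 0.25, 0.30],
              [0.25, 0.35, 0.28],
              [0.30, 0.28, 0.32]])

T, h2 = canonical_transformation(P, G)
print("heritabilities of canonical variates:", np.round(h2, 3))
print("weights (rows):")
print(np.round(T, 3))
```

Working through the Cholesky factor of `P` keeps the eigenproblem symmetric, which avoids complex round-off that a direct decomposition of the non-symmetric product could produce.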

Given that parameter estimates from table I are subject to sampling error, matrices describing covariances between M, F, and P within and between lactations were remarkably proportional to each other. Calculations for mass selection confirmed that little information is lost if proportionality is assumed. A repeatability model on canonical variates from lactation 1 should account for selection bias and only loses approximately 3% in accuracy compared to a general multivariate prediction of breeding values of milk, fat and protein yield in lactations 1-3. The loss in accuracy is very similar to the loss when information between traits within each lactation separately is ignored, as is common practice in dairy cattle selection indices. For practical (national) animal model BLUP evaluations, the proposed transformation is easy to implement.

Reducing the dimensionality of the prediction problem by analysing linear combinations of observations and economic values of corresponding breeding values was found to be very efficient. However, no information from relatives was included in the calculations, and for the traits considered heritabilities and phenotypic and genetic correlations were similar between pairs of traits. When using traits

Testing for proportionality of covariance matrices

Notation:

$\mathbf{M}$ = moment matrix; a symmetric positive definite (PD) matrix of order $lp$, with mean squares and mean cross-products based on $df$ degrees of freedom

$l$ = number of lactations, $p$ = number of traits per lactation

$\mathbf{V} = E(\mathbf{M})$; unknown PD covariance matrix of $lp$ traits

$\mathbf{K}$ = symmetric matrix of proportionality constants of order $l$

$\otimes$ = direct product operator (see eg Searle, 1966), tr = trace operator

$L$ = natural logarithm of likelihood.

Using standard multivariate theory (eg Anderson, 1958; ch 10),

with $\lambda_j$ = eigenvalue of $\mathbf{V}$, $\gamma_i$ = eigenvalue of $\mathbf{M}\mathbf{V}^{-1}$.

The maximum likelihood (ML) is obtained for $\mathbf{V} = \mathbf{M}$,

Suppose a moment matrix $\mathbf{M}_0$ is observed, and the null hypothesis is $H_0$: $\mathbf{V}_0 = E(\mathbf{M}_0) = k\mathbf{V}$, with $\mathbf{V}$ specified and $k$ a constant. Then,

and the ML estimate of $k$, $\hat{k} = \left(\sum_i \gamma_i\right)/lp$. Hence

For the trivial case of $\mathbf{M}_0 = k\mathbf{M}$, where $\mathbf{M}$ is the ML estimate of $\mathbf{V}$, all eigenvalues of $\mathbf{M}_0\mathbf{V}^{-1} = \mathbf{M}_0\mathbf{M}^{-1}$ are constant and equal to the proportionality constant ($= k$). The likelihood ratio (LR) test statistic, $t = 2(ML - ML_0)$, asymptotically has a $\chi^2$ distribution with degrees of freedom $[\tfrac{1}{2}lp(lp + 1) - 1]$.
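The test for a single proportionality constant can be sketched numerically. The code below is a minimal illustration under an assumption that is ours: the log-likelihood is taken to have the usual Wishart form, $L = \text{constant} - \tfrac{1}{2}df[\ln|\mathbf{V}| + \mathrm{tr}(\mathbf{M}\mathbf{V}^{-1})]$. Under that assumption the statistic reduces to $t = df[lp \ln\hat{k} - \sum_i \ln\gamma_i]$, with $\gamma_i$ the eigenvalues of $\mathbf{M}_0\mathbf{V}^{-1}$. The function name `proportionality_lr_test` and the example matrices are hypothetical.

```python
import numpy as np

def proportionality_lr_test(M0, V, df):
    """LR test of H0: E(M0) = k V, with V specified (Wishart-type likelihood assumed).

    Returns the ML estimate of k, the test statistic t and its degrees of freedom.
    """
    lp = M0.shape[0]
    # V^{-1} M0 has the same eigenvalues as M0 V^{-1}; they are real and positive
    # for PD matrices, so any tiny imaginary round-off is discarded.
    gamma = np.linalg.eigvals(np.linalg.solve(V, M0)).real
    k_hat = gamma.sum() / lp
    t = df * (lp * np.log(k_hat) - np.log(gamma).sum())
    test_df = lp * (lp + 1) // 2 - 1
    return k_hat, t, test_df

# Hypothetical specified matrix V of order lp = 3.
V = np.array([[1.00, 0.80, 0.90],
              [0.80, 1.00, 0.85],
              [0.90, 0.85, 1.00]])

# Trivial case: M0 exactly proportional to V, so t should be (numerically) zero.
k_exact, t_exact, test_df = proportionality_lr_test(2.5 * V, V, df=100)

# A matrix that is not proportional to V gives a positive statistic.
M0 = V + np.diag([0.1, 0.0, 0.0])
k_hat, t, _ = proportionality_lr_test(M0, V, df=100)
print(k_exact, t_exact, test_df)
print(k_hat, t)
```

The statistic would then be compared with a $\chi^2$ distribution on `test_df` degrees of freedom.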

With observations on $p$ traits in $l$ lactations, one suggestion is to test $\mathbf{V}_0 = \mathbf{K} \otimes \mathbf{V}_h$, where $\mathbf{V}_h$ is a (transformation of a) submatrix describing a (co)variance block of $p$ traits within or between lactations. $\mathbf{V}_0$ may be written as,

the trace in [A5] may be written as:

for $\mathbf{P}$ a permutation matrix and $\mathbf{M}^{*}_{ii}$ an $l \times l$ diagonal block of $\mathbf{M}^{*}$. Thus [A5] becomes:

Using [A6], the ML estimate of $\mathbf{K}$ is:
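A numerical sketch of such an estimate is given below, under assumptions that are ours and are not taken from the expressions of this appendix: traits are ordered within lactations (so that block $(i, j)$ of $\mathbf{M}_0$ is the $p \times p$ matrix for lactations $i$ and $j$), $\mathbf{V}_h$ is treated as fully specified, and the likelihood has the Wishart form. Maximising that likelihood over $\mathbf{K}$ then gives $\hat{K}_{ij} = \mathrm{tr}(\mathbf{M}_{0,ij}\mathbf{V}_h^{-1})/p$. The function name `estimate_K` and all numbers are hypothetical.

```python
import numpy as np

def estimate_K(M0, Vh, l):
    """Estimate K in V0 = K (x) Vh from a moment matrix M0 of order l*p.

    Assumes Vh (p x p) is specified and M0 is ordered by lactation, then trait.
    """
    p = Vh.shape[0]
    Vh_inv = np.linalg.inv(Vh)
    K_hat = np.empty((l, l))
    for i in range(l):
        for j in range(l):
            block = M0[i * p:(i + 1) * p, j * p:(j + 1) * p]
            K_hat[i, j] = np.trace(block @ Vh_inv) / p
    return K_hat

# Hypothetical example with l = 2 lactations and p = 2 traits.
K_true = np.array([[1.0, 0.8],
                   [0.8, 1.3]])
Vh = np.array([[1.0, 0.6],
               [0.6, 0.9]])
M0 = np.kron(K_true, Vh)  # moment matrix that satisfies the hypothesis exactly

K_hat = estimate_K(M0, Vh, l=2)
print(np.round(K_hat, 6))
```

When the moment matrix satisfies the hypothesis exactly, as in this constructed example, the estimate recovers the matrix of proportionality constants used to build it.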

ACKNOWLEDGMENT

PM Visscher acknowledges financial support from the Milk Marketing Boards in the UK.

REFERENCES

Anderson TW (1958) An Introduction to Multivariate Statistical Analysis. John Wiley and Sons, NY, 374 pp

Dommerholt J, Wilmink JBM (1986) Optimal selection response under varying milk prices and margins for milk production. Livest Prod Sci 14, 109-121

Ducrocq V, Quaas RL (1988) Prediction of genetic response to truncation selection across generations. J Dairy Sci 71, 2543-2553

Ducrocq V, Boichard D, Bonaiti B, Barbat A, Briend M (1990) A pseudo-absorption strategy for solving animal model equations for large data files. J Dairy Sci 73, 1945-1955

Hayes JF, Hill WG (1980) A reparameterisation of a genetic index to locate its sampling properties. Biometrics 36, 237-248

Jones LP, Goddard ME (1990) Five years experience with the animal model for dairy cattle evaluations in Australia. In: Proc 4th World Congr Genet Appl Livest Prod, Edinburgh, Scotland, July 23-27, 13, 382-385

Lin CY, Smith SP (1990) Transformation of multitrait to unitrait mixed model analysis of data with multiple random effects. J Dairy Sci 73, 2494-2502

Meyer K (1983) Scope for evaluating dairy sires using first and second lactation records. Livest Prod Sci 10, 531-553

Meyer K (1985) Maximum likelihood estimation of variance components for a

Searle SR (1966) Matrix Algebra for the Biological Sciences. John Wiley and Sons, NY

Visscher PM, Thompson R (1992) Univariate and multivariate parameter estimates for milk production traits using an animal model. I: Description and results of REML analyses. Genet Sel Evol 24, 415-430

Wiggans GR, Misztal I, Van Vleck LD (1988a) Implementation of an animal model for genetic evaluation of dairy cattle in the United States. J Dairy Sci 71 (suppl 2), 54-69

Wiggans GR, Misztal I, Van Vleck LD (1988b) Animal model evaluation of Ayrshire milk yield with all lactations, herd-sire interaction, and groups based on unknown parents. J Dairy Sci 71, 1319-1329