Genetic evaluation of horses based on ranks in competitions

(1)

HAL Id: hal-00893872

https://hal.archives-ouvertes.fr/hal-00893872

Submitted on 1 Jan 1991

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Genetic evaluation of horses based on ranks in

competitions

A Tavernier

To cite this version:

(2)

Original

article

Genetic evaluation of horses based

on

ranks in

competitions

A Tavernier

Institut National de la Recherche

_Agronomique,

Station de

_{Génétique Quantitative}

et

_Appliquée,

Centre de Recherches de

_{Jouy-en-Josas,}

7835!

Jouy-en-Josas Cedex,

France

(Received

14 October

_{1988; accepted}

9

_January

₁₉₉₁₎

Summary - A method is presented for

analysing

horse

performance

recorded as a series

of ranks obtained in races or _{competitions.}The model is based on the assumption of the existence of an

_underlying

normal variable. Then the rank of an animal is

_merely

the

phenotypic expression

of the value of this

_underlying

variable relative to that of the other horses

_entering

the same

_competition.

The

_breeding

values of the animals are estimated

as the mode of the a

_{posteriori density}

of the data in a

_Bayesian

context. Calculation of this mode entails

_solving

a non-linear system

by

iteration. An

_{example involving}

the results of races of

_{2 .yr-old}

French trotters in 1986 is

_given.

Practical

_computing

methods

are

_presented

and discussed.

horse

_/

_ranking

_/

order statistics

_/

Bayesian methods

Résumé - _{Évaluation génétique}des chevaux à _partirde leurs classements en

compéti-tion. Cet article

_présente

une méthode

_d’analyse

de

_{performances enregistrées}

sous la

forme

de classements obtenus dans des

_{confrontations}

restreintes et variables

_(courses

ou

concours).

Le modèle

postule

d’existence d’une variable normale

sous-jacente.

Le

classe-ment d’un cheval est alors

_{simplement d’expression phénotypique}

de la valeur de cette

va-riable

_sous-jacente

relativement à celles des autres animaux

participant

à la même

épreuve.

Les valeurs

_génétiques

des animaux sont estimées à

_partir

du mode de la densité a

poste-riori des données dans un contexte

_bayésien.

Le calcul de ce mode amène ic la résolution

d’un

_système

non linéaire par itérations. Un

exemple

d’application

est réalisé sur les

résul-tats des courses des chevaux Trotteurs

_Français

de 2 ans en 1986. Des méthodes de calculs

pratiques

sont

proposées

et discutées.

cheval

_/

classement

_/

_statistiquesd’ordre

_/

méthodes bayésiennes INTRODUCTION

Choosing

a

good

selection criterion is one of the

major

problems

in

genetic

evaluation of horses. The

_breeding

_objective

is the

_ability

to succeed in

_riding

competitions

(jumping,

dressage,

3-day-event)

or in races

_(trot

and

_gallop).

But

how should success be measured?

The "career" of a horse is made _upof a series of ranks obtained in races or

competitions.

A

_"physical"

measure of

performance

is not

_always

available. Such a

(3)

These data are not

_always

collected

and, furthermore,

they

may

give

a _poor

indication of the real level of the

_performance:

a

_racing

horse must be fast but it

_must,

above

_{all, adapt}

to

_particular

conditions

_prevailing

in each event. This may

explain

the

relatively

low

heritability

of time

performance

of

thoroughbreds

(Hintz,

1980;

Langlois,

1980a).

In the case of

_{riding horses,}

it is difficult to assess

the technical level of a

_jumping

event. It

_depends

not

_only

on the

_height

of the

obstacles

_but,

to a

_greater

_extent,on the difficulties encountered when

_approaching

the obstacles and on the distance between obstacles. None of these variables can be

easily

quantified.

Therefore,

information

_{provided by}

the

_ranking

of horses in each event deserves attention.

_Ranking

allows horses

_entering

the same event to be

_compared

to the others.

However,

the level of the event has to be determined too. The most

frequently

used criterion related to

_ranking

is transformed

earnings.

Each horse that is

_{&dquo;placed&dquo;}

in an event,

ie,

ranked among the first ones, receives a certain

amount of money.

Prize-money

in a race is allocated in an

exponential

_way:for

instance,

the second horse earns half the amount

_given

to the

_first,

the third half of that

_given

to the second and so on... If the rate of decrease is not

_50%,

it often

equals

a fixed

_percentage,

for instance

75%

in horse shows. The

earnings

of a horse in a race can then be

expressed

as G =

ax( k-

l

) D

with _a

being

the

_proportion

of

the total endowment

_given

to the winner

_(constant),

x

_being

the rate of decrease of

earning

with rank

_(constant),

k the rank of the horse in the race and D the total

endowment of the race. The constants a and x must

_satisfy

_{(axK-1-!+(1-a)}

=

0)

with K the total number of horses

_{&dquo;placed&dquo;. So,}

a

_logarithmic

transformation

_gives

Log(G)

=

Log(a)

₊

Log(D)

₊

(k - 1) Log(x).

This is _alinear function of the rank

of the horse. To use it as a function of the

_ability

of the

_horse,

Log(D)

should be

assumed to be a linear function of the level of the race. The total amount of _money

given

in a race or a

_competition

should

_depend

on the technical

_difficulty

or the

level of the

_{competitors. Hence,}

with

_{adequate competition}

_programmes

_(Langlois,

1983),

the

_logarithm

of

_earnings

of a horse may be a

_good

scale for

_measuring

horse

_performance

and it has been

_widely

used

_(Langlois,

_{1980b, 1989;}

Meinardus and

_{Bruns, 1987; Tavernier, 1988, 1989;}

Arnason et

_{al, 1989; Klemetsdal, 1989;}

Minkema,

1989).

However,

this criterion

_strongly

depends

on the _{way money is} distributed. The choice of the amount of money

given

in

_jumping

_competitions

does not follow strict technical rules in France and does not

_{directly depend}

on

the scale of technical difficulties but on the choice of the

_organizing

committee.

Therefore,

it _appearsthat ranks should be taken into account without reference to

earnings.

The purpose of this article is to

_present

a method for

_estimating

the

_breeding

value of an animal

_using

a series of ranks obtained in events where it

_competed

against

a

sample

of the

population.

In order to

_interpret

these

data,

the notion of

underlying

variable will be used as in Gianola and

_Foulley

(1983)

for estimation

of

_breeding

value with

_{categorical data,}

and in

_Henery

₍₁₉₈₁₎

for

_constructing

the likelihood of outcomes of a race. The horse’s &dquo;real&dquo;

_performance,

which cannot be

measured,

is viewed as a normal

_variable;

this is a reasonable

_assumption

for traits

with

_polygenic

determination.

_Only

the location or

_ranking

of this

_performance

(4)

is recorded instead of a

performance.

Practical

computational

aspects

as well as an

application

to trotters are

presented.

METHOD

Data

The data

_(Y)

consist of the ranks of all the animals in all the events. The total number of observations is therefore

_equal

to the sum of the number of animals

per event. It is assumed that the ranks are related to an

underlying

unobserved

continuous variable. The rank

_depends

on the realized value of this

underlying

unobserved variable

_{(&dquo;real&dquo;}

animal

_performance)

relative to that of the other animals

_entering

the same event. The

_genetic

model is the same as for usual traits

with

_polygenic

determinism. The

_{underlying performance}

_y_jk follows a normal

distribution with residual standard deviation (Fe and

expected

value !,2!. The model is:

where:

- y2!! _ &dquo;real&dquo;

performance

of

horse j under

environmental conditions i in the kth race of

_j;

-

b

i

= environmental effect

_{i (eg}

_age,sex,

rider...);

- u

j = additive

breeding

value of horse

j;

-

p

j = environmental effect common to the different

_performances

of horse

_j,

as

it may

participate

in several

events;

- eij

k = residual effect in kth race.

The vector of

parameters

to be estimated is 0 =

(b’,

u’,

p’)

where b =

{b

i

},

u =

(uj )

and

p =

{

Pj

}.

Inference is based on

_Bayes

theorem. Since the

marginal

density

of Y does not _varywith 0:

where

_pee)

is the

_prior

_density

of

_0,

_g(Y/6)

is the likelihood function and

_{f (9/Y)}

is the

posterior

density

of the

_parameters.

Prior

_density

The vectors

_b,

u, p and e are assumed to be

mutually independent

and to follow

the normal distributions:

_N(13,

_{V), N(O, G),}

_{N(O, H), N(O, R),}

respectively.

Prior information about b is assumed to be vague, which

implies

that the

diagonals

of V tend to +00.

Then,

the

_prior

_density

of b is uniform and the

_posterior

_density

of e does not

_depend

on

_{!3 !}

G =

Ao,’

where

A is the

relationship

matrix and

0 -; is

the additive

_genetic

variance. H is a

_diagonal

matrix with

diagonal

elements

equal

to the variance of

_p

_{(u p 2).}

The variances

₀

_-;

_and

_a

_P2

are assumed to be

known,

0 -; is

(5)

Likelihood function

Given ai, the

performances

y2!! are

_{conditionally independent.}

Let _y(_l_),

_!(2),...,

Yen)

be the ordered

underlying performances

of the n horses which

competed

in an event

_(for

_notation,

see for

_example

David, 1981,

p

4).

Then,

the likelihood

of

_obtaining

the observed

_ranking

in that event can be written as

(Henery,

_1981;

Dansie,

1986):

where:

-

yis the standard normal

density.

-

J1(t)

is the location

parameter

of the horse ranked &dquo;t&dquo; in that event.

This

_probability

can be

_interpreted

in the

_following

way: the

performance

of the

last animal may

vary

between -oo and +00, the

_performance

of the next to last varies from that of the last to +oo and so on.

_Thus,

the

_performance

of a horse varies

from that of the horse ranked

_just

behind it to +00, hence

_leading

to the bounds of each

_integral

in

P

k

.

Each

_integration

variable

_(t)

follows a normal distribution

with mean

_J1(

_t

₎

and standard deviation ue = 1.

Given 1L

(

t

),

these distributions are

independent

for all animals in the same

_competition.

This

_probability

may be

expressed

in terms of a multivariate normal

_integral

with thresholds

_independent

of

_integration

variables

_(Godwin,

_{1949; David,}

_1981):

where the distribution of

_{(xl, ... , !t, ... , !n-1 )}

is normal with mean

₍

_/1

₍

₁

₎

-!(2!, ... , ,!(t) - /1(t+1) , ... ,/1(n-1) -

/t(

n

))

and variance V =

{v

ml

}

with

Vmm =

_2,

Vm,m-1 = v m , m+1

= -1 and all other Vml = 0. Then:

Results of races are

_likely

to be correlated.

_However,

if the model is

_appropriate,

this correlation would

depend only

on

_genetic

or environmental effects ie

_given

the

J

.

L

ij’

S

,

the races are

_independent.

The likelihood function is

equal

to

the

_product

of the

_{probabilities}

of each event:

(6)

Estimation of

_parameters

The

_posterior

_density

of the

_parameters

is:

The best selection criterion is known to be the mean of the

posterior

distribution

(Fernando

and

_Gianola,

_1984;

Gof&net and

Elsen,

1984).

As

_expressing

it

analyti-cally

is not

_possible

for the model used

here,

we will take as estimator of 0 the

mode of the

_{posterior distribution,}

which can be viewed as an

_{approximation}

to the

optimum

selection criterion.

_Finding

this mode is

_{computationaly equivalent}

to the maximisation of a

_{joint probability}

mass

density

function as calculated

_by

Harville and Mee

₍₁₉₈₄₎

for

_categorical

data

_(Foulley,

_1987).

It is more convenient to use

the

_logarithm

of the

_{posterior density:}

/C=1

where m is the number of events.

The

system

which satisfies the first-order condition is not linear and must

be solved

_iteratively,

for

_{example using}

a

Newton-Raphson

_type

algorithm.

This

algorithm

iterates with:

where

9

is the solution for 0 at the

_qth

round of iteration and

AM

= 9!q!-e!q 1!.

Iterations are

_stopped

when a _convergence

_criterion,

a function of

0,

is less than an

arbitrarily

small number.

The first and second derivatives of

_L(O)

with

_respect

to

_b,

u, p are

reported

in

Appendix

1.

The

_system

can be written in the

following

way:

m

where

A, B, C,

D are sub-matrices of minus the second derivatives of

_L

_Log(P

_k

₎

k=l

m

with

_respect

to 0 and w, z are the vectors of first derivatives of

_E

_Log(P,!)

with

k=l

(7)

The numerical solution of

_system

_(I)

raises the

_problem

of the calculation of the

_{corresponding integrals.}

Multivariate normal

_integrals

_maybe calculated with numerical methods such as’that of Dutt

_(1973),

described and

_{programmed by}

Ducrocq

and Colleau

_(1986).

A second method consists of

_using

a

Taylor’s

series

expansion

about zero which seems to

_{give good}

results

_(Henery,

_{1981; Dansie,}

1986; Pettitt,

1982).

This

requires

that animals

_{participating}

in a

_given

event have

relatively

close means _I_t_ij_,which is a reasonable

_assumption

in the

_present

context

of horse

_{competitions.}

This

expansion

involves moments of normal order

_statistics,

as

explained

in

Appendix

2.

Example

In order to illustrate these

_{computations,}

a

_simple

_example

was constructed. This

example

involves 5 unrelated horses. There are no fixed

_effects,

_{hence a}

=

(u

₊

p)

is estimated. The variance-covariance matrix

_{of p}

is

_diagonal

with each term

_being

9/11.

Two races with 4 runners are considered. _{The first gave the}

_{following ranking:}

No

_1,

No

_2,

No

_3,

No 4 and the second: No

3,

No

_2,

No

_5,

No 4. The

_starting

value for all

_A

_’s

was 0. The

system

to be solved at the first iteration of the

_{Newton-Raphson}

algorithms

as well as the

corresponding

solution are the

_following:

The

_{algorithm converged}

at the 5th iteration:

_{(A’ A )°.}

₅

= ₆_x

10-

17 .

The

correspon-ding

values as well as the solutions and the coefficient of determination

(CD)

with

CD

=

(1 —

ciilo, u 2)

where _c_2i is the

diagonal

element of the inverse of the matrix of

second derivatives of the

_logarithm

of

_posterior

_density

are:

(8)

---solution:

_[

_Al

_p₂_P3 _!4

_P5]

₌

_[0.621

0.237 0.271 - 0.902 -

_0.226]

accuracy:

[0.242

0.434 0.404 0.348

0.293]

It should be

noted that

the value of the first derivative for a horse in a

_given

race is

equal

to the

_expectation

of the normal order statistic

_{(normal score)}

_{corresponding}

to its rank.

_Similarly,

second derivatives for a

_given

race are functions of the variance

of,

and covariances

_between,

normal order statistics. This is the

_logical

_consequence of the choice of 0 for _JL as

_starting

value: all distributions of

_performances

are the same with a mean of 0 and all

_{integrals correspond}

to

_expectations

of normal order statistics. The accumulated values for all races are the sum of these.

At _convergence,these values have

_changed

and the final solution differs from the estimates obtained from the

_expectation

of normal order statistics. The

interpreta-tion of a rank

_depends

not

_only

on the number of

_competitors,

which is taken into

account

_through

the normal order

_statistics,

but also on the level of the

competi-tion.

At

convergence, the first derivative of the

_log

of a

_posteriori

_density

is set to

0.

_So,

estimates of horses are

_equal

to the first derivatives of the

_log

of likelihood function divided

_by

the variance term. These derivatives are different for the same

rank in different races.

_{They depend}

on the level of the race estimated a

_posteriori

by

the estimates of the horses

_{participating}

this

particular

race,

taking

into account

all races. In the

example,

for the winners of the 2 races, the first derivatives of the likelihood function were much lower than the

_expected

values of order statistics.

This is because the

_competitors

of these races have much lower estimates than the

winners:

0.237, 0.271,

-0.902 for horses No

_2,

No 3 and No 4

_against

0.621 for horse No 1 winner of the first race and

_{0.237, 0.226,}

-0.902 for horses No

2,

No 5 and No 4

_against

0.271 for horse No 3 winner of the second race.

_Therefore,

the first race

for No 1 and the second race for No 3 was easier than if

they

had

_{competed against}

3 horses of

_{equal ability}

to

_themselves,

ie with the same _u_i_,as

_implied

with the normal order statistics. The values of the first derivatives were 0.7589 and

_0.8475,

respectively,

compared

to 1.0294 for the

expectation

of the normal order statistics of the first out of 4. In the same _way,in the first _race,horse No 3

(0.27)

was beaten

by

a horse of lesser

_ability

_(No

2

_(0.24)),

_and,

therefore was more

_penalized

than if

it had been defeated

_by

a horse of

_equal

_ability.

The first derivative was

_-0.5165,

compared

to -0.2970 for the

_expectation

of the normal order statistics of the third

out of 4.

APPLICATION Data

(9)

horses in these races were recorded in the file. Ten races

(38

horses)

were discarded

because

_they

involved

_only

horses that did not

_compete

more than _once,and

which,

therefore,

were

_totally

disconnected from the rest of the file. We had to limit the

analysis

to

_{&dquo;placed&dquo;}

horses in each _race,

_ie,

horses ranked _amongthe best 4 or

_5,

because the

_ranking

of other

_participants

were not available. This does not

_prevent

us from

_testing

and

_comparing

our method to usual

_earning

criteria

_assuming

that these races involved

_only

4 or 5 horses.

_Indeed,

this is _neccessaryfor a fair

comparison

since

_earnings

also involve

_{only &dquo;placed&dquo;}

horses. With our

_approach,

&dquo;non

_placed&dquo;

horses

_could,

of _course,be treated as the others

_provided

that

_they

are filed.

The data set was made up of 251 races

(211

with 4 horses ranked and 40 with

5 horses

_ranked),

_involving

490 different horses. The total number of

_performances

was 1044

_places,

ie 2.1 _perhorse on _average,with a maximum of 9 and a minimum

of

1. A

horse

_competed

_against

3.3 horses on _average.The model used was:

where: -

y!! _ &dquo;real&dquo;

performance

of

horse j

in the kth race of

_j;

- u j

= additive

breeding

value of horse

j;

-

p! = environmental effect common to the different

_performances

of horse

_j;

-

e jk

= residual effect in kth race about

_{&dquo;expected&dquo; performance}

_lLj_.

No fixed effect was considered because

particular

conditions of each race

(dis-tance, type

of

_{ground, season...)}

are the same for all horses in the race and so have no effect on the result and because trainer and driver effects cannot be used on a

small data set

_(only

one horse for the

majority

of trainers or

_drivers).

The

expectations

and variance-covariance matrices are:

where h

2

=

0 ,2/

U2

is the

_heritability

and r =

(

U2

₊

a;)/a;

is the

repeatability

of the trait. Values of

h

2

= 0.25 and r = 0.45 _werechosen _as

they correspond

_tousual

estimates of these

_parameters

obtained from

_{competitions.}

RESULTS

The elements of

_system

_(I)

were recalculated at each

Newton-Raphson

iteration with Dutt’s

!1973)

method for

_{integrals. Convergence}

was reached after 5 iterations

(with

(ð.’ ð.) .

5 /490

= 2 x

10-

15 ).

The accuracies of these solutions were measured

by

coefficient of determination

_(CD).

If cii is a

_diagonal

element of the matrix of second

_derivatives,

CD =

(1 -

c

ii/

o

u

).

Breeding

value estimates had a mean of

_0,

a standard deviation of

_0.30,

with a

maximum of 0.94 and a minimum of -0.82. The mean _accuracywas

_0.23,

with a

standard deviation of

_0.08,

a maximum of 0.43 and a minimum of 0.12.

These values were

_compared

to criteria

_{usually employed}

in trotters

_(Thery,

(10)

0.73 with

_{Log(yearly earning),}

0.88 with

_Log(yearly

_earning

per

&dquo;place&dquo;),

0.79

with

_Log(yearly

_earning

per

start).

The correlation with a selection index

_using

as

_performance

the mean of the

_logarithm

of

_earnings

in each race

(with

_parameter

values

h

2

= 0.25 and r =

_0.45)

was 0.94. Correlations with criteria related to

racing

time were

_lower,

as were correlations between

_earnings

and

_racing

time. The correlation was -0.43 between our estimate and the best time per kilometer and - 0.47 between our criterion and a selection index

_using

as

_performance

_{the average}

racing

time

_(with

_parameter

values

h

2

= 0.25 and r =

0.45).

These

figures

also

suggest

that the best

_racing

time is not a

_good

measure of success in a race for

2-yr-old

horses.

This

_application

_suggests

some

_{peculiarities}

of our method. The first one relates

to the

_spread

of _{accuracy values. These}

_depend

not

_only

on the number of

&dquo;places&dquo;

but also on the

_{&dquo;place&dquo;}

of the horse in the race. Accuracies

_ranged

from 0.25 to 0.33 and from 0.20 to 0.28 for horses

_having

3 and 2

_{&dquo;places&dquo;, respectively.}

The minimal

accuracy

corresponding

to a

_single

_{&dquo;place&dquo;}

(0.12)

was smaller than the

_heritability

(0.25).

This is the result of the loss of information because ranks are used instead of continuous

_{performances.}

_{The average &dquo;loss&dquo; of}_accuracy

_ranged

from 0.10

_points

for horses ranked once to 0.05 for those ranked more than 7 times.

The second

_point

of interest is the relative

_importance

of the number of horses

per event and the level of the horses

_{participating}

in the event. At _convergence, the first derivative of the

_logarithm

of

_{posterior density}

is

_equal

to

_0,

so estimates

are

_equal

to the

_part

of the first derivative without variance terms divided

_by

these variance terms

_(see

_Appendix

_I).

When all horses

_{participating}

in an event are of

the same level

_(ie,

have the same real

_racing

ability)

this derivative is

equal

to

expectations

of normal statistics. These

_expectations

_depend

only

on the number of animals per event. In our method the first derivative also

depends

on the real

racing

abilities of the

_competitors.

So the same rank in different events does not

give

the same derivative.

_Figure

1 shows the distribution of the derivatives in all

the races with 5 horses

&dquo;placed&dquo;

for the different ranks. For a

_{given rank,}

these

derivatives are different in each race and so,

being

first in a race sometimes

_gives

a

lower estimate than

_being

second in a race of a

_higher

level.

Our method can be used as a tool to

_improve

the

_{correspondence}

between the level of the race and the

_prize

_moneyto be distributed. The average

competitive

&dquo;level&dquo; of the race can be

approximated

as the mean of the estimates of real

producing ability

(

Jij

)

of each horse. In

_practice,

the correlation between such a measure and the

_logarithm

of total endowment of the race was 0.30 for races with 4

horses

&dquo;place&dquo;,

and 0.65 for races with 5

_{&dquo;placed&dquo;.}

Races with 5 horses

&dquo;placed&dquo;

have

the

_greatest

_prize-money,

and endowment seemed to be a

_good

indicator of the value

of

_{participating}

horses. It is also

_possible

to calculate a

_posteriori

the

probabilities

of

_obtaining

the observed

_ranking

in each race - or even of fictitious races -

_using

the estimates for each horse. These

_{probabilities}

were

_directly

calculated from the

formula for

_P

_k

and do not take into account the _accuracyof the estimates. The average

probability

of

obtaining

the observed ranks was

11%

and

3%

in races with

4 and 5

_horses,

_{respectively.}

If all horses had the same real

_producing

_ability,

this

probability

would be

4%

in races with 4 horses

(24 possibilities)

and

0.8%

in races

(11)

DISCUSSION

In the

_light

of the results obtained with

_2-yr-old

trotters,

the

_proposed

method seemed

_{satisfactory:}

the estimated values are consistent with other criteria.

In

practice,

solving

a much

_larger

_system

of

_equations

_presents

difficulties. Two

numerical

_problems

arise,

namely

the calculation of the

_integrals

P!

and their

derivatives

and the dimensions of the whole

system.

Two methods for

_computing

the necessary

integrals

have been

suggested,

the first

being

a numerical calculation

of multivariate normal

_integrals

and the second an

_{approximation}

by Taylor’s

series.

Beyond

certain

_dimensions,

it takes a _very

long

time to

_compute

multiple integrals

of the normal distribution. For each iteration of

_{Newton-Raphson}

and for each race

of n

horses,

it is _necessaryto calculate one

integral

of order

(n - 1),

n

integrals

of order

_{(n - 2)}

and

_[n(n

+

_1)/2!

_integrals

of order

_{(n -}

_3).

_Therefore,

the time needed to

_accomplish

this becomes

_prohibitive

for a number of horses per race

> 5 or 6. On the other

hand,

our _purposeis to be able to

_apply

this

technique

to all

_types

of horse

competitions

(for

example

show

_jumping)

that sometimes involve more than 100

_{participants. Then,}

it is necessary to turn to

_{approximations}

like those

_{proposed by Henery}

₍₁₉₈₁₎

using

Taylor’s

series. The _accuracyof these

approximations

is difficult to test. In

_{particular, approximate}

formulae for the

moments of order statistics

_superior

to 2

_(Pearson

and

_Hartley,

_1972;

David and

Johnson,

1954)

need to be tested and

_compared

to

_integral

calculations of

_high

order.

Such

an

_{approximation}

reduces calculation times

considerably.

The moments

of order statistics not

_given

in tables can be calculated once and for all.

Then,

each

derivative

_only

consists of a linear combination of the

producing

abilities of the

(12)

The overall dimension of the

system

constitutes a second

problem. Using

an

animal

model with

repeated records,

this dimension is

_equal

to the number of horses to be evaluated

_plus

the number of

_performing

horses and fixed effects. At the

_{present time,}

in

_France,

100 000 horses are evaluated in

_jumping

with an

animal model

_(BLUP

_method)

based on

yearly earnings

and 70 000 are evaluated

in

_{trotting-races}

_(Tavernier,

_198(,)b,

_1990).

For each

Newton-Raphson

iteration of the

_proposed

method,

an iterative solution such as Gauss-Seidel will be needed.

This method has been

_developed

to include all horses in every race

_including

&dquo;non-placed&dquo;

horses.

However,

they

will have to be treated in a

slightly

different manner: the _purposeis to consider the horses

_{&dquo;placed&dquo;}

as better than the

&dquo;non-placed&dquo;,

but detailed

_ranking

of

&dquo;non-placed&dquo;

horses is of little interest. The

competitor

which no

_longer

has a chance of

finishing &dquo;placed&dquo;

is not

_going

to

_try

to

improve

its rank

_{and, therefore,}

its rank relative to the other

_{&dquo;non-placed&dquo;}

horses does not

_accurately

reflect its real

_ability.

Therefore the

_{&dquo;non-placed&dquo;}

should be treated as

_having

a

performance

below that of the last

&dquo;placed&dquo;. Then,

the likelihood

of the outcome of a race can be written as:

where there are n’ horses in the race and n horses

&dquo;placed&dquo;.

This

integral

can be

used in this form or

_equivalently

as the sum of all the

integrals

over all

_possible

rank combinations between

_{&dquo;non-placed&dquo;}

horses,

which allows a

simplified

application

of

the calculation

_{by Taylor’s}

approximation.

Another

_difficulty

is the estimation of the

_genetic

parameters.

The estimation of variance _componentscould

_probably

be made

_using

a

_marginal

maximum likelihood

approach

which

_requires

the inversion of the matrix of second

derivatives,

as

discussed

_by

Gianola et al

₍₁₉₈₆₎

and

_applied

_by

_Foulley

et al

_{(1987a, b).}

In

_practice,

this method can be

_{applied only}

on a reduced data file or with a &dquo;sire&dquo; model.

The

_heritability

of a

_single

_performance

is lower than that of

yearly

earning

criteria.

_Yearly

criteria are

_compound

functions of the number of events and of success in each event. For

_instance,

for

_{single performance,}

Meinardus and Bruns

(1987)

reported h

2

= 0.18 and r = 0.48 for the

logarithm

of

earnings

in

jumping

shows,

Klemetsdal

₍₁₉₈₉₎

_reported

h

2

= 0.18 and r = 0.65 for time in

trotting-races,

Thery

(1981)

found

h

2

= 0.23 and r = 0.52 for the _samecriterion and

h

z

= 0.07

and r = 0.13 for the

logarithm

of

earnings. However,

the number of

elementary

performances

during

the lifetime of a horse is sufficient to

_expect

_good

accuracies of estimations.

_Taking

the

_previous

_examples

and a number of

yearly

starts

_equal

to 12

_(the

_{average number of}

_yearly

starts for an adult horse is 12 in

_{trotting-races}

and 14 in

_{riding competitions),}

the accuracies of

_breeding

value estimation

_ranged

from 0.27 to 0.41 after one _yearof

_performance.

With a loss of 0.10

_point

due to

the use of

ranks,

accuracies of evaluations based on ranks would _rangefrom 0.17

to

_0.31,

which is reasonable.

This model

_requires

a

sufficiently large

amount of

_comparisons

between horses

to allow a proper classification. The presence of isolated events which do not

(13)

the

_necessity

of

_good

connections between races, which is the

_only

_guarantee

of a

reliable result.

CONCLUSION

This article describes a method of evaluation of the

_breeding

value of an animal

from its rank relative to those of other

_competitors

in a

_given

_event,without

_using

a

direct measure of

_performance.

It is

_interesting

that the method

_suggests

a solution based on a conventional

_genetic

model. It can be

_applied

to an &dquo;individual animal&dquo;

model as well as to a &dquo;sire&dquo; model. It takes into account the level of the

_competition

which is the main factor

_influencing

a rank’s

value,

_together

with the number of

participants

in the event.

_Although

use of ranks _mayseem to lead to a loss of information

_compared

to a

physical

_measure,it is sometimes more reliable. In the case of horse races,

ranking

is

_absolutely

_necessaryas a real

_physical

measure is

not identifiable. It _mayalso be useful in the case of a distorted scale of measure or

when the usual

_physical

measure is

_nothing

but the

_{transcription}

of a rank.

REFERENCES

Arnason

_T,

Bendroth

_{M, Phillipsson J,}

Henriksson

K,

Darenius A

₍₁₉₈₉₎

Genetic evaluation of Swedish trotters. In: State

_{of Breeding}

Evaluation in Trotters. EAAP Publ No

_{42, Pudoc, Wageningen,}

106-130

Dansie BR

₍₁₉₈₆₎

Normal order statistics as

_permutation

probability models. Appl

Statist

_35,

269-275

David

_FN,

Johnson NL

₍₁₉₅₄₎

Statistical treatment of censored data. I.

Fundamen-tal formulae. Biometrika

_41,

228-240

David HA

₍₁₉₈₁₎

Order Statistics.

_Wiley,

NY,

2nd

edn,

p 360

Ducrocq V,

Colleau JJ

₍₁₉₈₆₎

Interest in

quantitative

genetics

of Dutt’s and Deak’s methods for numerical

_computation

of multivariate normal

probability

integrals.

Genet Sel

_{Evol 18,}

447-474

Dutt JE

₍₁₉₇₃₎

A

_{representation}

of multivariate

_{probability integrals by integral}

transforms. Biometrika

_60,

637-645

Fernando

_RL,

Gianola D

₍₁₉₈₄₎

_Optimal

_properties

of the conditional mean as a

selection criterion. J Anim Sci 59

_(suppl),

177

_(abstr)

Foulley

JL

₍₁₉₈₇₎

Methodes

d’Evaluation

des

_{Reproducteurs}

_pourdes Caractères Discrets a Determinisme

_Podygenique

en Selection Animale. These

_d’6tat,

Univer-site de

_Paris-Sud,

Centre

_d’Orsay,

_pp320

Foulley

JL,

Gianola

_D,

Planchenault D

_(1987a)

Sire evaluation with uncertain

paternity.

Genet Sel

_{Evod 19,}

83-102

Foulley

JL,

Im

_S,

Gianola

D,

Hoschele I

_(1987b)

Empirical

Bayes

estimation of

parameters

for n

polygenic

_binary

traits. Genet Sel

Evod 19,

197-204

Gianola

D,

Foulley

JL

₍₁₉₈₃₎

Sire evaluation for ordered

_categorical

data with a

threshold model. Genet Sel

_{Evol 15,}

201-224

(14)

Godwin HJ

₍₁₉₄₉₎

Some low moments of order statistics. Ann Math Statist

_20,

279-285

Goffinet

_B,

Elsen JM

₍₁₉₈₄₎

Critbre

_optimal

de selection:

_quelques

r6sultats

g6n6raux.

G6n6t Sel

_{Evol 16,}

307-318

Harville

_DA,

Mee RW

₍₁₉₈₄₎

A mixed model

_procedure

for

_analyzing

ordered

categorical

data. Biometrics

_40,

393-408

Henery

RJ

₍₁₉₈₁₎

Permutation

_{probabilities}

as models for horse races. JR Statist

Soc

43,

86-91

Hintz RL

₍₁₉₈₀₎

Genetics of

performance

in the horse. J Anirrc Sci

_51,

582-594 Klemetsdal G

₍₁₉₈₉₎

_Norwegian

trotter

_breeding

and estimation of

_breeding

values. In: State

_{of Breeding}

Evaluation in Trotters. EAAP Publication No

_{42, Pudoc,}

Wageningen,

95-105

Langlois

B

_(1980a)

_Heritability

of

_{racing ability}

in

_{thoroughbreds.}

A review. Livest Prod Sci

_7,

591-605

Langlois

B

_(1980b)

Estimation de la valeur

_g6n6tique

des chevaux de

_sport

_d’apr6s

les sommes

_gagn6es

dans les

_{competitions 6questres frangaises.}

Ann Ggn!t Sel Anim

12,

15-31

Langlois

B

₍₁₉₈₃₎

_Quelques

reflexions au

_sujet

de l’utilisation des

_gains

_pour

appr6cier

les

_performances

des chevaux trotteurs. 34th Ann Meet

_{EAAP, Madrid,}

Spain,

October

_{3-6, 1983,}

_Study

Commission on Horse Production

Langlois

B

₍₁₉₈₄₎

H6ritabilit6 et correlations

_g6n6tiques

des

_temps

records et des

gains

établis par les Trotteurs

Fran!ais

de 2 a 6 ans. 35th Ann Meet

EAAP,

The

Hague,

The

_Netherlands,

_August

_{6-9, 1984,}

_Study

Langlois

B

₍₁₉₈₉₎

_Breeding

evaluation of French trotters

_according

to their race

earnings.

I. Present situation. In: State

_of

_Breeding

Evaluation in Trotters. EAAP. Publication No

42, Pudoc,

Wageningen,

27-40

Meinardus

H,

Bruns E

₍₁₉₈₇₎

BLUP

_procedure

in

_riding

horses based on

compe-tition results. 38th Ann Meet

_{EAAP, Lisbon,}

_{Portugal, September}

28-October

_1,

1987,

Study

Minkema D

₍₁₉₈₉₎

_Breeding

value estimation of trotters in the Netherlands. In: State

_{of Breeding}

_42,

_Pudoc,

Wageningen,

82-94.

Pearson

_ES,

_Hartley

HO

₍₁₉₇₂₎

Biometrika Tables

_for

Statisticians.

_Cambridge

University

Press,

vol

2,

27-35

Pettitt AN

₍₁₉₈₂₎

Inference for the linear model

_using

a likelihood based on ranks.

JR Statist Soc

_44,

234-243

Tavernier A

_{(1988) Advantages}

of BLUP animal model for

_breeding

value estima-tion in horses. Livest Prod Sci

_20,

149-160

Tavernier A

₍₁₉₈₉₎

_Breeding

evaluation of French trotters

_according

to their

race

_earnings.

II.

_Prospects.

In: State

_{of Breeding}

_42,

_{Pudoc, Wageningen,}

41-54

Tavernier A

_(1989b)

Caract6risation de la

_population

Trotteur

_{Frangais d’apr6s}

leur estimation

_g6n6tique

_parun BLUP modèle animal. Ann Zootech

38,

145-155 Tavernier A

₍₁₉₉₀₎

Caract6risation des chevaux de concours

_hippique

_fran!ais

d’apr6s

leur estimation

_g6n6tique

_parun BLUP modèle animal. Ann Zootech

_39,

(15)

Thery

C

₍₁₉₈₁₎

_{Analyse g6n6tique}

et

_statistique

des

performances

des Trotteurs

Fran!ais

en courses en 1979 et 1980. M6moire de

Dipl6me

d’Etudes

_Approfondies

de

_{G6n6tique Quantitative}

et

_Appliqu6e,

Universite Paris

_XI,

pp 80

APPENDIX 1

Calculation of the first and second derivatives of the

_logarithm

of the a

posteriori density

Let

_y(

_l

_{), Y(2),...,Y(n)}

be the ordered

_{underlying performances}

of the n horses

which have

_participated

in race k

(see,

for

_{example, David, 1981,}

_p

_4).

_Further,

let

Q(t),k, R(t)(t),k, R(t)(z),k

be:

We have:

where: -

(k, (t))

E j

indicates the set of events k in which the

_{animal j competed}

and obtained the rank

t;

-

(16)

where: -

(t)

E i indicates the horses ranked at the

_place

t in the event k and with associated fixed effect i

and,

if the

_{horses j and}

have

_participated

in the same event:

I I _ --

!-The second derivatives with

_respect

to u and _p,or p and p are built in the same

way. The

only

value that

changes

is the covariance which is

equal

to 0 between u

p and is

equal

to

_1/(J!

on the

_diagonal

of the second derivatives with

respect

to p

and _p. p

where: -

!i(t)

= 0 if fixed effect i does _not_{influence the horse ranked}_t

-

!i(t) =

1 if fixed effect i influences the horse ranked t

APPENDIX 2

Approximation

of first and second derivatives of

_log

of the a

posteriori

density

using

Taylor’s

series

_expansion

These

_expansions

are drawn from those used

_by

_Henery

₍₁₉₈₁₎

and Dansie

₍₁₉₈₆₎

who

_approximate

the

_probability

_P

_k

_.

An

_example

of these

_{decompositions}

is

_given

for

_{(Q!t!,K/P!;):}

where,

for n

independent

normal distributions:

-

e t:n

:

expectation

of the tth order statistic -

o-tt:

, : variance of the tth order statistic -

<!tp:n : covariance between the tth order statistic and the

pth

order statistic -

Pt p z:n

Genetic evaluation of horses based on ranks in competitions

HAL Id: hal-00893872

https://hal.archives-ouvertes.fr/hal-00893872

Submitted on 1 Jan 1991

HAL is a multi-disciplinary open access

archive for the deposit and dissemination of

sci-entific research documents, whether they are

pub-lished or not. The documents may come from

teaching and research institutions in France or

abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est

destinée au dépôt et à la diffusion de documents

scientifiques de niveau recherche, publiés ou non,

émanant des établissements d’enseignement et de

recherche français ou étrangers, des laboratoires

publics ou privés.

Genetic evaluation of horses based on ranks in

competitions

A Tavernier

To cite this version:

Original

article

Genetic evaluation of horses based

on

ranks in

competitions

A Tavernier

Agronomique,

Génétique Quantitative

Appliquée,

Jouy-en-Josas,

Jouy-en-Josas Cedex,

(Received

1988; accepted

January

1991)

analysing

performance

underlying

merely

phenotypic expression

underlying

entering

competition.

breeding

posteriori density

Bayesian

solving

by

example involving

2 .yr-old

given.

computing

presented

/

/

/

présente

d’analyse

performances enregistrées

forme

confrontations

(courses

concours).

postule

sous-jacente.

simplement d’expression phénotypique

sous-jacente

participant

épreuve.

génétiques

partir

bayésien.

système

exemple

d’application

Français

pratiques

proposées

/

_Agronomique,

_{Génétique Quantitative}

_Appliquée,

_{Jouy-en-Josas,}

_{1988; accepted}

_January

₁₉₉₁₎

_underlying

_merely

_underlying

_entering

_competition.

_breeding

_{posteriori density}

_Bayesian

_solving

_{example involving}

_{2 .yr-old}

_given.

_computing

_presented

_/

_/

_/

_présente

_d’analyse

_{performances enregistrées}

_{confrontations}

_(courses

_{simplement d’expression phénotypique}

_sous-jacente

_génétiques

_partir

_bayésien.

_système

_Français

_/

_/

_/

_breeding

_objective

_ability

_riding

_(trot

_gallop).

_"physical"

_always

_always

_performance:

_racing

_must,

_{all, adapt}

_particular

_prevailing

_{riding horses,}

_jumping

_depends

_only

_height

_but,

_greater

_approaching

_{provided by}

_ranking

_Ranking

_entering

_compared

_ranking

_{&dquo;placed&dquo;}

_given

_first,

_given

_50%,