Breeding value estimation with incomplete marker data


HAL Id: hal-00894194

https://hal.archives-ouvertes.fr/hal-00894194

Submitted on 1 Jan 1998

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


Breeding value estimation with incomplete marker data

Marco C.A.M. Bink, Johan A.M. van Arendonk, Richard L. Quaas

To cite this version:


Original article

Breeding value estimation with incomplete marker data

Marco C.A.M. Bink (a), Johan A.M. Van Arendonk (a), Richard L. Quaas (b)

(a) Animal Breeding and Genetics Group, Wageningen Institute of Animal Sciences, Wageningen Agricultural University, PO Box 338, 6700 AH Wageningen, the Netherlands
(b) Department of Animal Science, Cornell University, Ithaca, NY 14853, USA

(Received 20 January 1997; accepted 17 November 1997)

Abstract - Incomplete marker data prevent application of marker-assisted breeding value estimation using animal model BLUP. We describe a Gibbs sampling approach for Bayesian estimation of breeding values, allowing incomplete information on a single marker that is linked to a quantitative trait locus. Derivation of sampling densities for marker genotypes is emphasized, because reconsideration of the gametic relationship matrix structure for a marked quantitative trait locus leads to simple conditional densities. A small numerical example is used to validate estimates obtained from Gibbs sampling. Extension and application of the presented approach in livestock populations is discussed. © Inra/Elsevier, Paris

breeding values / quantitative trait locus / incomplete marker data / Gibbs sampling

Résumé - Estimation of breeding values with incomplete marker information. Incomplete marker typing prevents BLUP-type estimation of breeding values that uses marker information. We describe a Gibbs sampling procedure for Bayesian estimation of breeding values that allows incomplete information on a single marker linked to a quantitative trait locus. The derivation of the probability densities of the marker genotypes is developed, because reconsidering the structure of the gametic correlation matrix for a marked quantitative trait locus leads to simple conditional densities. A small numerical example is given to validate the estimates obtained by Gibbs sampling. Application of the approach to livestock populations is discussed. © Inra/Elsevier, Paris

breeding value / quantitative trait locus / incomplete markers / Gibbs sampling


1. INTRODUCTION

Identification of a genetic marker closely linked to a gene (or a cluster of genes) affecting a quantitative trait allows more accurate selection for that trait [5]. The possible advantages of marker-assisted genetic evaluation have been described extensively (e.g. [13, 16, 17]). Fernando and Grossman [1] demonstrated how best linear unbiased prediction (BLUP) can be performed when data are available on a single marker linked to a quantitative trait locus (QTL). The method of Fernando and Grossman has been modified to include multiple unlinked marked QTL [23], a different method of assigning QTL effects within animals [26], and marker brackets [5]. These methods are efficient when marker data are complete. However, in practice, incompleteness of marker data is very likely because it is expensive and often impossible (when no DNA is available) to obtain marker genotypes for all animals in a pedigree. For every unmarked animal, several marker genotypes can be fitted, each resulting in a different marker genotype configuration. When the proportion or number of unmarked animals increases, identification of each possible marker genotype configuration becomes tedious and analytical computation of the likelihood of occurrence of these configurations becomes impossible.

Gibbs sampling [3] is a numerical integration method which provides opportunities to solve analytically intractable problems. Applications of this technique have recently been published in statistics (e.g. [2, 3]) as well as animal breeding (e.g. [18, 25]). Janss et al. [10] successfully applied Gibbs sampling to sample genotypes for a bi-allelic major gene, in the absence of markers. Sampling genotypes for multiallelic loci, e.g. genetic markers, may lead to reducible Gibbs chains [15, 20]. Thompson [21] summarizes approaches to resolve this potential reducibility and concludes that a sampler can be constructed that efficiently samples multiallelic genotypes on a large pedigree.

The objective of this paper is to describe the Gibbs sampler for marker-assisted breeding value estimation for situations where genotypes for a single marker locus are unknown for some individuals in the pedigree. Derivation of the conditional, discrete, sampling distributions for genotypes at the marker is emphasized. A small numerical example is used to compare estimates from Gibbs sampling to true posterior mean estimates. Extension and application of our method are discussed.

2. METHODOLOGY

2.1. Model and priors

We consider inferences about model parameters for a mixed inheritance model of the form

$$y = X\beta + Zu + Wv + e \qquad (1)$$

where y and e are n-vectors representing observations and residual errors, $\beta$ is a p-vector of 'fixed effects', u and v are q- and 2q-vectors of random polygenic and marked QTL effects, respectively, and X, Z and W are known incidence matrices. For each animal we consider three random genetic effects, i.e. two additive effects at a marked QTL ($v_i^1$ and $v_i^2$, see figure 1) and a residual polygenic effect ($u_i$). Here e is assumed to have the distribution $N_n(0, I\sigma_e^2)$, independently of $\beta$, u and v. Also u is taken to be $N_q(0, A\sigma_u^2)$, where A is the well-known numerator relationship matrix. Finally, v is taken to be $N_{2q}(0, G\sigma_v^2)$, where G is the gametic relationship matrix ($2q \times 2q$) computed from pedigrees, a full set of marker genotypes and the known map distance between marker and QTL [26].

In case of incomplete marker data, we augment genotypes for ungenotyped individuals. We then denote $m_{(k)}$ and $G_{(k)}$ as marker genotype configuration k and the corresponding gametic relationship matrix. Further, $\beta$, u, v and missing marker genotypes are assumed to be independent a priori. We assume complete knowledge of the variance components and of the map distance between marker and QTL.

2.2. Joint posterior density and full conditional distributions for location parameters

The conditional density of y given $\beta$, u and v for the model given in equation (1) is proportional to

$$\exp\left\{-\frac{1}{2\sigma_e^2}(y - X\beta - Zu - Wv)'(y - X\beta - Zu - Wv)\right\}.$$

The joint posterior density includes a summation over all $n_c$ consistent marker genotype configurations ($m_{(k)}$). In the derivation of the sampling densities for marked QTL effects, however, one particular marker genotype configuration, $m_{(k)}$, is fixed. The summation needs to be considered only when the sampling of marker genotypes is concerned.

To implement the Gibbs sampling algorithm, we require the conditional posterior distributions of each of $\beta$, u and v given the remaining parameters, the so-called full conditional distributions, which are as follows [...] and gametic covariances in the pedigree, respectively. Note that the means of the distributions (3), (4) and (5) correspond to the updates obtained when mixed model equations are solved by Gauss-Seidel iteration. Methods for sampling from these distributions are well known (e.g. [24, 25]).
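The link between Gauss-Seidel iteration and the full conditionals can be illustrated with a minimal sketch. It assumes the usual single-site form in which each location parameter is normal with the Gauss-Seidel update as mean and variance $\sigma_e^2 / c_{ii}$, where $c_{ii}$ is the diagonal of the mixed model coefficient matrix. The function name `gibbs_sweep` and the 2 x 2 system are hypothetical and are not taken from the paper.

```python
import math
import random

def gibbs_sweep(C, rhs, theta, sigma_e2, rng):
    """One pass over all location parameters, updating theta in place.

    C and rhs are the coefficient matrix and right-hand side of the mixed
    model equations. Each element is drawn from a normal whose mean is the
    Gauss-Seidel update and whose variance is sigma_e2 / C[i][i].
    """
    n = len(theta)
    for i in range(n):
        # Gauss-Seidel mean: adjust the right-hand side for all other effects.
        adjusted = rhs[i] - sum(C[i][j] * theta[j] for j in range(n) if j != i)
        mean = adjusted / C[i][i]
        sd = math.sqrt(sigma_e2 / C[i][i])
        theta[i] = rng.gauss(mean, sd)
    return theta

# Toy 2 x 2 system (hypothetical numbers, not from the paper).
C = [[4.0, 1.0], [1.0, 3.0]]
rhs = [1.0, 2.0]
rng = random.Random(1)
theta = [0.0, 0.0]
total = [0.0, 0.0]
n_cycles = 20000
for _ in range(n_cycles):
    gibbs_sweep(C, rhs, theta, sigma_e2=1.0, rng=rng)
    total = [t + x for t, x in zip(total, theta)]
posterior_mean = [t / n_cycles for t in total]
```

With `sigma_e2=0.0` the sweep degenerates to plain Gauss-Seidel iteration, which is one way to check the sampler against the solution of the equations.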

2.3. Sampling densities for marker genotypes

Suppose m is the current vector of marker genotypes, some of which were observed and some of which were augmented (e.g. sampled by the Gibbs sampler). Let $m_{-i}$ denote the current marker genotypes of all individuals except individual i, and let $g_m$ denote a particular genotype for the marker locus. Then the posterior distribution of genotype $g_m$ is the product of two factors (equations (6) and (7)), where $G^{-1}$ corresponds to the marker genotype set $(m_{-i}, m_i = g_m)$.
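The displayed forms of equations (6) and (7) are not legible in this copy. A plausible reading, offered only as a sketch that follows from the prior $v \sim N_{2q}(0, G\sigma_v^2)$ and from the description of a product of two factors, is:

```latex
% Assumed form, reconstructed from the surrounding definitions.
\begin{align}
p(m_i = g_m, m_{-i} \mid v, \sigma_v^2, m_{obs}, r)
  &\propto p(m_i = g_m, m_{-i} \mid m_{obs}) \;
           p(v \mid m_{-i}, m_i = g_m, \sigma_v^2, r), \\
p(v \mid m_{-i}, m_i = g_m, \sigma_v^2, r)
  &\propto |G|^{-1/2}
     \exp\left\{ -\frac{1}{2\sigma_v^2}\, v' G^{-1} v \right\}.
\end{align}
```

Under this reading, the first factor carries pedigree and marker information, and the second factor carries the phenotypic information through v, with G evaluated at the candidate genotype set $(m_{-i}, m_i = g_m)$.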

Thus, equation (7) shows that the phenotypic information needed for sampling new genotypes for the marker is present in the vector of QTL effects (v). Now, it suffices to compute equation (6) for all possible values of $g_m$, and then randomly select one from that multinomial distribution [20]. In practice, considering only those $g_m$ that are consistent with $m_{-i}$ and Mendelian inheritance minimizes the computations. Furthermore, computations can be simplified because "transmission of genes from parents to offspring are conditionally independent given the genotypes of the parents" [15].
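The step of evaluating equation (6) for each candidate genotype and then drawing one of them can be sketched as follows. This is a minimal illustration that assumes the unnormalized log values have already been computed for the consistent genotypes only; the function name `sample_genotype` and the example weights are hypothetical.

```python
import math
import random

def sample_genotype(log_weights, rng):
    """Draw one genotype label from unnormalized log probabilities.

    log_weights maps each consistent candidate genotype to the log of its
    unnormalized posterior value. Returns the draw and the normalized
    probabilities.
    """
    labels = list(log_weights)
    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_weights[g] for g in labels)
    weights = [math.exp(log_weights[g] - top) for g in labels]
    total = sum(weights)
    probs = {g: w / total for g, w in zip(labels, weights)}
    draw = rng.random()
    cumulative = 0.0
    for g in labels:
        cumulative += probs[g]
        if draw < cumulative:
            return g, probs
    return labels[-1], probs  # guard against rounding in the last step

# Hypothetical unnormalized values for two candidate genotypes.
candidates = {"AC": math.log(1.0), "BC": math.log(3.0)}
genotype, probabilities = sample_genotype(candidates, random.Random(0))
```

Working on the log scale is a design choice for numerical stability: the quadratic form in v can make the raw densities very small for large pedigrees.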

Adapting notation from Sheehan and Thomas [15], let $S_i$ denote the set of mates (spouses) of individual i and $O_{i,j}$ be the set of offspring of the pair i and j. Furthermore, the parents of individual i are denoted by s (sire) and d (dam). Then, equation (6) can be written more specifically as

$$p(m_i = g_m, m_{-i} \mid v, \sigma_v^2, m_{obs}, r) \propto p(m_i = g_m \mid m_s, m_d)\; p\{v_i \mid v_s, v_d, m_i, m_s, m_d, r\} \prod_{j \in S_i} \prod_{l \in O_{i,j}} p(m_l \mid m_i = g_m, m_j)\; p\{v_l \mid v_i, v_j, m_i, m_j, m_l, r\} \qquad (8)$$

When the parents of individual i are not known, the first two terms on the right-hand side of equation (8) are replaced by $\pi(m_i)$, which represents frequencies of marker genotypes in a population. The probability $p(m_i = g_m \mid m_s, m_d)$ corresponds to Mendelian inheritance rules for obtaining marker genotype $g_m$ given parental genotypes $m_s$ and $m_d$, and similarly for $p(m_l \mid m_i = g_m, m_j)$.

computation of

p{v

i

lv

d

,m¡,m

s

,m

d

,r}

(and

,r})

1 ,m

j

,m

i

,m

Vj

Iv¡,

1 p{v

can

efficiently

be

performed

by

utilizing special

characteristics of the matrix

G-

1 .

Let

_Q

_i

denote a

_gametic

contribution matrix

_relating

the

_QTL

effects of individual i to the

_QTL

effects of its

_parents.

The matrix

_Q

_i

is

_{2(i —}

₁₎

x 2. For founder

animals,

matrix

_Qi

is

_simply

zero. The recursive

algorithm

to

_{compute G-}

1

of

_Wang

et al.

_(1995,

_equation

_{[18] )}

can be rewritten as

where

_D¡1

=

_{(C; -}

_Q;G¡-

_1Q

_¡)-

₁

_(which

reduces to

D¡

1 =

_(C

_i

_-

_QfG

_i

_-,Q

_i

_)-’

with no

inbreeding),

_O

_i

is a

2(q—i)

x 2 null matrix. The

_{off-diagonals}

in

_{C; equal}

the

(7)

Henderson’s rules for

A-

1 _[6].

The nonzero elements of

G-

1 pertaining

to an animal

arise from its own contribution

_plus

those of its

_offspring.

So,

when

_sampling

the ith animal’s marker

_genotype,

_only

those contribution matrices need be considered that contain elements

_pertaining

to animal i. These are the individual’s own

contributions and those of its progeny when i appears as a

_parent.

where $v_k$ is the vector of animal k's two marked QTL effects, and $Q_p$ denotes the rows of $Q_k$ pertaining to p, one of k's parents. Again, we recognize that each term in the sum is the kernel of a (bivariate) normal density, which is $p\{v_i \mid v_s, v_d, m_i, m_s, m_d, r\}$ or $p\{v_l \mid v_i, v_j, m_i, m_j, m_l, r\}$.
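The displayed recursion and the resulting sum are not legible in this copy. As a sketch under the stated definitions ($Q_i$ of order $2(i-1) \times 2$, $D_i = C_i - Q_i' G_{i-1} Q_i$, animals ordered so that parents precede offspring), the standard partitioned inverse gives the following. The notation $v_{1:k-1}$ for the stacked QTL effects of animals 1 to k-1 is an assumption introduced here.

```latex
% Assumed reconstruction via the partitioned-inverse identity.
\begin{align}
G_i &= \begin{bmatrix} G_{i-1} & G_{i-1} Q_i \\ Q_i' G_{i-1} & C_i \end{bmatrix},
\qquad
G_i^{-1} = \begin{bmatrix} G_{i-1}^{-1} & 0 \\ 0 & 0 \end{bmatrix}
  + \begin{bmatrix} -Q_i \\ I_2 \end{bmatrix} D_i^{-1}
    \begin{bmatrix} -Q_i' & I_2 \end{bmatrix}, \\
v' G^{-1} v &= \sum_{k=1}^{q}
  \left( v_k - Q_k' v_{1:k-1} \right)' D_k^{-1}
  \left( v_k - Q_k' v_{1:k-1} \right).
\end{align}
```

Because $Q_k$ has nonzero rows only for the parents of k, each term depends on $v_k$ and the parental QTL effects alone, which is why only the contributions of animal i and of its offspring change when the marker genotype of animal i is resampled.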

2.4. Running the Gibbs sampler

The Gibbs sampler is used to obtain a sample of a parameter from the posterior distribution and can be seen as a chained data augmentation algorithm [19]. So, one augments data (y and $m_{obs}$) with parameters ($\theta$) to obtain, for example, $p(\theta_1 \mid \theta_2, \ldots, \theta_d, y)$. For the purpose of breeding value estimation, Gibbs sampling works as follows:

1) set arbitrary initial values for $\theta^{[0]}$; we use zeros for fixed and genetic effects and, for each unmarked animal, we augment a genotype that is consistent with pedigree, Mendelian inheritance and observed marker data;

2) sample $\theta_i^{[T+1]}$ from
[3], i = 1, 2, .., p, for fixed effects,
[4], i = p + 1, p + 2, .., p + q, for polygenic effects,
[5], i = p + q + 1, p + q + 2, .., p + q + 2q, for marked QTL effects, or
[6], i = p + 3q + 1, p + 3q + 2, .., p + 3q + t, for marker genotypes,
and replace $\theta_i^{[T]}$ with $\theta_i^{[T+1]}$;

3) repeat 2) N (length of chain) times.
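The bookkeeping of steps 1) to 3) can be sketched with a generic driver. This is only an outline: `update` stands for one full cycle over all parameters (step 2), and the burn-in and thinning arguments are assumptions added here to match the settings used later in the numerical example.

```python
def run_chain(state, update, n_samples, thin, burn_in):
    """Run `update` repeatedly and keep every `thin`-th state after burn-in."""
    for _ in range(burn_in):
        state = update(state)
    kept = []
    for _ in range(n_samples):
        for _ in range(thin):
            state = update(state)
        kept.append(state)
    return kept

# Toy update that just counts cycles, to make the bookkeeping visible.
kept = run_chain(state=0, update=lambda s: s + 1, n_samples=3, thin=50, burn_in=100)
print(kept)  # [150, 200, 250]
```

In a real sampler the state would hold the fixed effects, polygenic effects, marked QTL effects and augmented marker genotypes, and `update` would return a new state rather than a counter.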

For any individual parameter, the collection of N values can be viewed as a simulated sample from the appropriate marginal distribution. This sample can be used to calculate a marginal posterior mean or to estimate the marginal posterior distribution. For small pedigrees with only a few animals missing observed marker [...] where $\theta^*$ is a fixed, polygenic or marked QTL effect. This provides a criterion to compare the estimates obtained from Gibbs sampling.

3. NUMERICAL EXAMPLE

A small numerical example is used to verify the use of the Gibbs sampler to obtain posterior mean estimates and to illustrate the effect of the data on the estimates obtained from two different estimators, i.e. a posterior mean and the well-known BLUP estimator (by solving the mixed model equations (MME) given in the Appendix). Pedigree and data of the example are in figure 2. Both sire (01) and dam (02) have observed marker genotypes, AB and CD, respectively, but do not have phenotypes observed. Three full sibs have a marker genotype BC and a phenotype +20 (denoted FS 03, 04, 05); three other full sibs have a marker genotype AD and a phenotype -20 (denoted FS 06, 07, 08). Animals 09 and 10 have no marker genotypes but have a phenotype of +20 and -20, respectively. Complete knowledge was assumed on variance components and recombination rate between marker and MQTL (table I). The thinning factor in the Gibbs sampling chain was 50 cycles, the burn-in period was twice the thinning factor, and 20 000 thinned samples were used for analysis.
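As a worked check of the chain length implied by these settings, assuming that one thinned sample is saved every 50 cycles after the burn-in:

```latex
N_{\text{cycles}}
  = \underbrace{2 \times 50}_{\text{burn-in}}
  + \underbrace{20\,000 \times 50}_{\text{thinned samples}}
  = 100 + 1\,000\,000
  = 1\,000\,100 .
```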

3.1. Estimates for genetic effects

The posterior estimates obtained from Gibbs sampling were similar to the TRUE posterior estimates, as shown in table II. The posterior estimates of MQTL effects of animals 09 and 10 (±0.70) were much less divergent than those of their full sibs that had their marker genotypes observed (±2.48). These less divergent values reflect the uncertainty on marker genotypes of animals 09 and 10. The TRUE and GIBBS posterior densities for an MQTL effect of animal 09 were also very similar (figure 3). The posterior variance was 52.3, which was larger than the prior variance ($\sigma_v^2 = 50$) and reveals that the data are not decreasing the prior uncertainty on MQTL effects for animals 09 and 10 in this situation. For the other full sibs, the posterior variance was 47.02, which was lower than the prior variance because segregation of MQTL effects was known with higher certainty, i.e. marker genotypes were known. The BLUP estimates for MQTL effects of animals 09 and 10 were equal to 1/6 of the [...]

3.2. Marker genotype probabilities

In the following, marker genotype AB represents both AB and BA. In the latter case, alleles for both marker and MQTL are reordered, maintaining linkage between marker and MQTL alleles within an animal. So, four marker genotypes were possible for animals 09 and 10 (table III). Based solely on pedigree and marker data, each of these four genotypes was equally likely (prior probability = 0.25). After including phenotypic data, the (posterior) probabilities changed: marker genotypes BC and AD for animal 09 became more and less probable, respectively. The reverse holds for animal 10. The estimates from the Gibbs sampler were again very similar to the TRUE posterior probabilities. Complete phenotypic and marker information on six full sibs gave the MQTL effects linked to marker alleles B and C positive values and those linked to marker alleles A and D negative values. Note that the probabilities (TRUE) for marker genotypes AC and BD also (slightly) changed after considering the phenotypic data.

4. DISCUSSION

Marker-assisted breeding value estimation in livestock has been hampered by incomplete marker data. Previously described methods [1, 23, 26] can accommodate ungenotyped individuals which do not have offspring themselves, as was shown by Hoeschele [7]. However, they do not provide the flexibility to incorporate parents with unknown genotypes, which results in the loss of information for estimating marker-linked effects. The described Gibbs sampling algorithm now provides this required flexibility.

The innovative step in our approach is the sampling of genotypes for a marker locus that is closely linked to a QTL with normally distributed allelic effects. Normality of QTL effects is a robust assumption to allow segregation of many alleles throughout a population and to allow changes in allelic effects over generations, e.g. due to mutations and interactions with environments [...] phenotypes of animals in the pedigree are used. Jansen et al. [9] indicate that, as a result of the use of phenotypic information, unbiased estimates of effects at the QTL can be obtained in situations where animals have been selectively genotyped.

In this paper we have concentrated on the use of information from a single marker locus. Using information from multiple linked markers can increase the accuracy of predicting genetic effects at the QTL. The principles applied here have been extended to situations where genotypes for all the linked markers are known for all individuals [5, 22]. In order to incorporate individuals with unknown genotypes, the method presented in this paper needs to be extended to a multiple marker situation.

In extending the method to multiple markers, the problem of reducibility deserves special attention. Reducibility of Gibbs chains can arise when sampling genotypes for a polymorphic locus with more than two alleles [20]. The reducibility problems will become more severe when sampling genotypes for multiple linked markers. Thompson [21] suggested several workable approaches to guarantee irreducibility of the Gibbs chain. These approaches make use of Metropolis-coupled samplers [11], importance sampling with 0/1 weights [15], and 'heating' in the Metropolis-Hastings steps [12]. Alternatively, Jansen et al. [9] sampled identity-by-descent (IBD) values for all marker loci, indicating parental origin of alleles, instead of actual alleles to avoid the reducibility problem. In extending the method to multiple linked markers, attention also needs to be paid to an efficient updating scheme for haplotypes or genotypes at the linked loci. Updating of genotypes at closely linked loci will be more efficient when genotypes at the linked loci are updated together ('in blocks') in order to reduce auto-correlation in the Gibbs sampler [10].

For posterior inferences on the breeding value of an animal, a minimum of 100 effective samples is needed. In the numerical example this minimum would correspond to a chain of 5 000 cycles, which required 8 s of CPU on an HP9000 K260 server. It has been found that computing requirements increase more or less linearly with the number of animals [10].
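The figures above imply roughly one effective sample per 50 cycles. One common way to estimate the number of effective samples from a chain is sketched below. The paper does not state which estimator was used, so the truncated autocorrelation sum here is an assumption, and the AR(1) toy chain is hypothetical.

```python
import random

def effective_sample_size(chain):
    """n / (1 + 2 * sum of autocorrelations), truncated at the first
    non-positive autocorrelation estimate."""
    n = len(chain)
    mean = sum(chain) / n
    centred = [x - mean for x in chain]
    var = sum(c * c for c in centred) / n
    rho_sum = 0.0
    for lag in range(1, n):
        cov = sum(centred[t] * centred[t + lag] for t in range(n - lag)) / n
        rho = cov / var
        if rho <= 0.0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)

rng = random.Random(7)
# AR(1) toy chain with lag-1 correlation 0.9 (not the paper's chain).
chain, x = [], 0.0
for _ in range(2000):
    x = 0.9 * x + rng.gauss(0.0, 1.0)
    chain.append(x)
ess = effective_sample_size(chain)
```

A strongly autocorrelated chain gives an effective size well below its length, which is the motivation for thinning and for block updating of closely linked loci.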

The presented method can be applied to data originating from nucleus herds, which comprise the relatively small number of genetically superior animals from the population. In a marker-assisted selection scheme, marker genotypes will be collected largely on these animals, with sufficient animals having marker genotypes observed to improve selection of superior individuals. Straightforward application in large commercial populations with thousands of marker genotypes missing is not a valid option because of the computational requirements of Markov chain Monte Carlo (MCMC) algorithms such as Gibbs sampling. Hybrid schemes will need to be developed to incorporate information from the commercial population into the marker-assisted prediction of breeding values of nucleus animals. Similar schemes have been implemented to incorporate foreign information into national evaluations in dairy cattle.

Our Bayesian approach can also be considered as a first step towards an MCMC algorithm, not necessarily Gibbs sampling, that can also estimate hyperparameters, which were held constant in this study. The next step, therefore, comprises [...] MCMC algorithm can then be used for the analysis in QTL mapping experiments with complex pedigree structures, such as (grand-)daughter designs, in outbred populations.

ACKNOWLEDGEMENTS

Valuable suggestions by S. van der Beek and anonymous reviewers are gratefully acknowledged. The financial support of Holland Genetics is highly appreciated.

REFERENCES

[1] Fernando R.L., Grossman M., Marker-assisted selection using best linear unbiased prediction, Genet. Sel. Evol. 21 (1989) 467-477.

[2] Gelfand A.E., Smith A.F.M., Sampling-based approaches to calculating marginal densities, J. Am. Stat. Assoc. 85 (1990) 398-409.

[3] Geman S., Geman D., Stochastic relaxation, Gibbs distributions and the Bayesian restoration of images, IEEE Trans. Pattern Anal. Machine Intelligence 6 (1984) 721-741.

[4] Geyer C.J., A practical guide to Markov chain Monte Carlo, Stat. Sci. 72 (1992) 320-339.

[5] Goddard M.E., A mixed model for analysis of data on multiple genetic markers, Theor. Appl. Genet. 83 (1992) 878-886.

[6] Henderson C.R., A simple method for computing the inverse of a numerator relationship matrix used in prediction of breeding values, Biometrics 32 (1976) 69-83.

[7] Hoeschele I., Elimination of quantitative trait loci equations in an animal model incorporating genetic marker data, J. Dairy Sci. 76 (1993) 1693-1713.

[8] Jansen R.C., Complex plant traits: time for polygenic analysis, Trends Plant Sci. 3 (1996) 73-103.

[9] Jansen R.C., Johnson D.L., Van Arendonk J.A.M., A mixture model approach to the mapping of quantitative trait loci in complex populations with an application to multiple cattle families, Genetics (1997) (in press).

[10] Janss L.L.G., Thompson R., Van Arendonk J.A.M., Application of Gibbs sampling for inference in a mixed major gene-polygenic inheritance model in animal populations, Theor. Appl. Genet. 91 (1995) 1137-1147.

[11] Lin S., Markov chain Monte Carlo estimates of probabilities on complex structures, Ph.D. dissertation, University of Washington.

[12] Lin S., Thompson E.A., Wijsman E., Achieving irreducibility of the Markov chain Monte Carlo method applied to pedigree data, IMA J. Math. Appl. Med. Biol. 10 (1993) 1-17.

[13] Meuwissen T.H.E., Van Arendonk J.A.M., Potential improvements in rate of genetic gain from marker-assisted selection in dairy cattle breeding schemes, J. Dairy Sci. 75 (1992) 1651-1659.

[14] Schaeffer L.R., Kennedy B.W., Computing strategies for solving mixed model equations, J. Dairy Sci. 69 (1986) 575-579.

[15] Sheehan N., Thomas A., On the irreducibility of a Markov chain defined on a space of genotype configurations by a sampling scheme, Biometrics 49 (1993) 163-175.

[16] Smith C., Simpson S.P., The use of genetic polymorphisms in livestock [...] (1986).

[17] Soller M., Beckmann J.S., Restricted fragment length polymorphisms and genetic improvement, in: Proc. 2nd World Congress Genet. Appl. Livest. Prod., Madrid, Editorial Garsi, Madrid, Spain, vol. 6, 1982, pp. 396-404.

[18] Sorensen D.A., Wang C.S., Jensen J., Gianola D., Bayesian analysis of genetic change due to selection using Gibbs sampling, Genet. Sel. Evol. 26 (1994) 333-360.

[19] Tanner M.A., Tools for Statistical Inference, Springer-Verlag, New York, NY, 1993.

[20] Thomas D.C., Cortessis V., A Gibbs sampling approach to linkage analysis, Hum. Hered. 42 (1992) 63-76.

[21] Thompson E.A., Monte Carlo likelihood in genetic mapping, Stat. Sci. 9 (1994) 355-366.

[22] Uimari P., Thaller G., Hoeschele I., The use of multiple markers in a Bayesian method for mapping quantitative trait loci, Genetics 143 (1996) 1831-1842.

[23] Van Arendonk J.A.M., Tier B., Kinghorn B.P., Use of multiple genetic markers in prediction of breeding values, Genetics 137 (1994) 319-329.

[24] Van Tassell C.P., Casella G., Pollak E.J., Effects of selection on estimates of variance components using Gibbs sampling and restricted maximum likelihood, J. Dairy Sci. 78 (1995) 678-692.

[25] Wang C.S., Rutledge J.J., Gianola D., Marginal inferences about variance components in a mixed linear model using Gibbs sampling, Genet. Sel. Evol. 25 (1993) 41-62.

[26] Wang T., Fernando R.L., van der Beek S., Grossman M., Van Arendonk J.A.M., Covariance between relatives for a marked quantitative trait locus, Genet. Sel. Evol. 27 (1995) 251-272.

A1. APPENDIX

A1.1. Computation of average G with incomplete marker data

Wang et al. [26] suggested computing an average G, here denoted $\bar{G}$, as

$$\bar{G} = \sum_{k=1}^{n_c} G_{(k)} \, p(m_{(k)} \mid m_{obs})$$

where $G_{(k)}$ is the gametic relationship matrix given a particular marker genotype configuration ($m_{(k)}$), and $p(m_{(k)} \mid m_{obs})$ is the probability of $m_{(k)}$ given $m_{obs}$. This equation is not conditioned on phenotypic information.
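The averaging can be sketched in a few lines, assuming each configuration's matrix and probability are already available. The function name and the two 2 x 2 matrices are hypothetical and serve only to show the weighting.

```python
def average_gametic_matrix(matrices, probs):
    """Probability-weighted average of equally sized square matrices."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("configuration probabilities must sum to 1")
    size = len(matrices[0])
    avg = [[0.0] * size for _ in range(size)]
    for G_k, p_k in zip(matrices, probs):
        for r in range(size):
            for c in range(size):
                avg[r][c] += p_k * G_k[r][c]
    return avg

# Two hypothetical 2 x 2 configurations with probabilities 0.25 and 0.75.
G_1 = [[1.0, 0.9], [0.9, 1.0]]
G_2 = [[1.0, 0.1], [0.1, 1.0]]
G_bar = average_gametic_matrix([G_1, G_2], [0.25, 0.75])
```

The off-diagonal of the result is 0.25 x 0.9 + 0.75 x 0.1 = 0.3. Enumerating all configurations quickly becomes infeasible as the number of ungenotyped animals grows, which is the motivation for sampling instead.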

A1.2. Marker-assisted best linear unbiased prediction of breeding values

where $\alpha_u = \sigma_e^2 / \sigma_u^2$, $\alpha_v = \sigma_e^2 / \sigma_v^2$ and G are all known. Solutions can be obtained by iteration on the data [14].
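The displayed mixed model equations are not legible in this copy. For the model in equation (1) with the stated priors and variance ratios, the standard form would be as follows. This is offered as an assumed reconstruction, not as the authors' exact display.

```latex
% Assumed standard form of the MME for y = X\beta + Zu + Wv + e.
\begin{bmatrix}
X'X & X'Z & X'W \\
Z'X & Z'Z + A^{-1}\alpha_u & Z'W \\
W'X & W'Z & W'W + G^{-1}\alpha_v
\end{bmatrix}
\begin{bmatrix} \hat{\beta} \\ \hat{u} \\ \hat{v} \end{bmatrix}
=
\begin{bmatrix} X'y \\ Z'y \\ W'y \end{bmatrix}
```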

These equations can be used in three situations. First, G is unique (complete marker data). Second, with missing markers, a linear estimator is obtained by taking $G = \bar{G}$. Third, with $G = G_{(k)}$, they are used to [...]
