HAL Id: hal-00894194
https://hal.archives-ouvertes.fr/hal-00894194
Submitted on 1 Jan 1998
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Breeding value estimation with incomplete marker data
Marco C.A.M. Bink, Johan A.M. van Arendonk, Richard L. Quaas
To cite this version:
Original
article
Breeding
value estimation with
incomplete
marker data
Marco
C.A.M.
Bink
Johan
A.M.
Van
Arendonk
a
Richard
L.
Quaas
a
Animal
Breeding
and GeneticsGroup, Wageningen
Institute of AnimalSciences,
Wageningen
Agricultural University,
PO Box 338, 6700 AHWageningen,
the Netherlands bDepartment
of AnimalScience,
CornellUniversity, Ithaca,
NY 14853, USA(Received
20January 1997; accepted
17 November1997)
Abstract -
Incomplete
marker data preventapplication
of marker-assistedbreeding
value estimationusing
animal model BLUP. We describe a Gibbssampling approach
for
Bayesian
estimation ofbreeding values, allowing incomplete
information on asingle
marker that is linked to a
quantitative
trait locus. Derivation ofsampling
densities formarker genotypes is
emphasized,
because reconsideration of thegametic relationship
matrix structure for a marked
quantitative
trait locus leads tosimple
conditional densities. A small numericalexample
is used to validate estimates obtained from Gibbssampling.
Extension and
application
of thepresented approach
in livestockpopulations
is discussed.©
Inra/Elsevier,
Parisbreeding values
/
quantitative trait locus/
incomplete marker data/
Gibbs samplingRésumé - Estimation des valeurs génétiques avec information incomplète sur les marqueurs. Un typage
incomplet
pour les marqueursempêche
l’estimation des valeursgénétiques
de type BLUP utilisant l’information sur les marqueurs. On décrit uneprocédure d’échantillonnage
de Gibbs pour l’estimationbayésienne
des valeursgénétiques
permettant une informationincomplète
pour un marqueurunique
lié à un locusquantitatif.
On
développe
le calcul des densités deprobabilités
desgénotypes
au marqueur parceque la reconsidération de la structure de la matrice des corrélations
gamétiques
pourun locus
quantitatif marqué
conduit à des densités conditionnellessimples.
Unpetit
exemple numérique
est donné pour valider les estimées obtenues paréchantillonnage
de Gibbs.L’application
del’approche
auxpopulations
d’animauxdomestiques
est discutée.©
Inra/Elsevier,
Parisvaleur génétique
/
locus quantitatif/
marqueurs incomplets/
échantillonnage deGibbs
*
1. INTRODUCTION
Identification of a
genetic
markerclosely
linked to a gene(or
a cluster ofgenes)
affecting
aquantitative
trait,
allows more accurate selection for that trait[5].
Thepossible
advantages
of marker-assistedgenetic
evaluation have been describedextensively
(e.g. [13,
16,
17]).
Fernando and Grossman
[1]
demonstrated how best linear unbiasedprediction
(BLUP)
can beperformed
when data are available on asingle
marker linked toquantitative
trait locus((aTL).
The method of Fernando and Grossman has been modified forincluding
multiple
unlinked markedQTL
[23],
a different method ofassigning
QTL
effects within animals[26];
and marker brackets[5].
These methodsare efficient when marker data are
complete. However,
inpractice,
incompleteness
of marker data is very
likely
because it isexpensive
and oftenimpossible
(when
no DNA is
available)
to obtain markergenotypes
for all animals in apedigree.
For every unmarked
animal,
several markergenotypes
can befitted,
eachresulting
in a different markergenotype
configuration.
When theproportion
or number ofunmarked animals
increases,
identification of eachpossible
markergenotype
con-figuration
becomes tedious andanalytical computation
of likelihood of occurrenceof these
configurations
becomesimpossible.
Gibbs
sampling
[3]
is a numericalintegration
method whichprovides
opportuni-ties to solve
analytically
intractableproblems.
Applications
of thistechnique
haverecently
beenpublished
in statistics(e.g.
[2, 3])
as well as animalbreeding
(e.g.
[18,
25]).
Janss et al.[10]
successfully applied
Gibbssampling
tosample
genotypes
for abi-allelic
major
gene, in the absence of markers.Sampling
genotypes
for multiallelicloci,
e.g.genetic markers,
may lead to reducible Gibbs chains[15, 20].
Thompson
[21]
summarizesapproaches
to resolve thispotential
reducibility
and concludes thata
sampler
can be constructed thatefficiently
samples
multiallelicgenotypes
on alarge pedigree.
The
objective
of this paper is to describe the Gibbssampler
for marker-assistedbreeding
value estimation for situations wheregenotypes
for asingle
marker locusare unknown for some individuals in the
pedigree.
Derivation of theconditional,
discrete,
sampling
distributions forgenotypes
at the marker isemphasized.
A small numericalexample
is used to compare estimates from Gibbssampling
to trueposterior
mean estimates. Extension andapplication
of our method are discussed.2. METHODOLOGY
2.1. Model and
priors
We consider inferences about model
parameters
for a mixed inheritance modelof the form
where y and e are n-vectors
representing
observations and residual errors,(
3
is ap-vector
of ’fixedeffects’,
u and v are q and2q-vectors
of randompolygenic
andconsider three random
genetic
effects,
i.e. two additive effects at a markedQTL
(v!
andv2,
seefigure
1)
and a residualpolygenic
effect(u;).
Here e is assumed to have the distributionN
n
(O, 1
0
&dquo;;),
independently
of(3,
u and v. Also u is taken tobe
Nq(0, A
O
,
2
),
where A is the well-known numeratorrelationship
matrix.Finally,
v is taken to beN2q(OGQ!),
where G is thegametic relationship
matrix(2q
x2q)
computed
frompedigrees,
a full set of markergenotypes
and the known map distance between marker andQTL
[26].
In case ofincomplete
markerdata,
weaugment genotypes
forungenotyped
individuals. We then denotef
fi(
k
)
andG(
k
)
as the markergenotype
configuration
k and as thecorresponding
gametic
relationship
matrix.Further, /3,
u, v, andmissing
markergenotypes
are assumedto be
independent,
apriori.
We assumecomplete knowledge
on variancecomponents
and map distance between marker andQTL.
2.2. Joint
posterior density
and full conditional distributions for locationparameters
The conditional
density
of ygiven /3,
u, and v for the modelgiven
inequation
(1)
isproportional
toexp{ -1/2a;
2
(y -
X,3 -
Zu -Wv)’(y -
X/3 -
Zu -Wv},
The
joint posterior
density
includes a summation(n
c
)
over all consistent markergenotype
configurations
(
M
(
k
))-
In the derivation of thesampling
densities for markedQTL effects,
however,
oneparticular
markergenotype
configuration,
m(
k
),
is fixed. The summation needs to be considered
only
when thesampling
of markergenotypes
is concerned.To
implement
the Gibbssampling algorithm,
werequire
the conditionalposterior
distributions of each of
(3,
u, and vgiven
theremaining
parameters,
the so-called full conditionaldistributions,
which are as followsand
gametic
covariances in thepedigree, respectively.
Note that the means of thedistributions
(3), (4)
and(5)
correspond
to theupdates
obtained when mixed modelequations
are solvedby
Gauss-Seidel iteration. Methods forsampling
from these distributions are well known(e.g. [24,
25]).
2.3.
Sampling
densities for markergenotypes
Suppose
m is the current vector of markergenotypes,
some observed and someof which were
augmented
(e.g.
sampled
by
the Gibbssampler).
Let m-i denotea
particular
genotype
for the marker locus. Then theposterior
distribution ofgenotype
gmis
theproduct
of two factorswith,
where
1
G-
corresponds
to markergenotype
set,
Mi
M
i
-
I
=g
m
).
Thus, equation
(7)
shows thatphenotypic
information needed forsampling
newgenotypes
for themarker is
present
in the vector ofQTL
effects(v).
Now,
it suffices tocompute
equation
(6)
for allpossible
values of gm, and thenrandomly
select one from that multinomial distribution[20].
Inpractice
consid-ering
only
those gm that are consistent with m-i and Mendelian inheritance canminimize
the, computations. Furthermore, computations
can besimplified
because&dquo;transmission of genes from
parents
tooffspring
areconditionally independent
given
the
genotypes
of theparents&dquo;
[15].
Adapting
notation from Sheehan and Thomas[15],
letS
j
denote the set of mates(spouses)
of individual i and0;,!
be the set ofoffspring
of thepair
i andj. Furthermore,
theparents
of individual i are denotedby
s(sire)
and d(dam).
Then, equation
(6)
can be morespecifically
written asp(mi = gm, m-i IV, oV 2 ,Mobs, r)
When
parents
of individual i are notknown,
then the first two terms on theright-hand
side ofequation
(8)
arereplaced by
x(m;),
whichrepresents
frequen-cies of marker
genotypes
in apopulation.
Theprobability
p(m;
=9
.
1
M
.,
Md
).
cor-responds
to Mendelian inheritance rules forobtaining
markergenotype
gigiven
parental
genotypes
ms and md, similar forp(m
1
Im¡
=gm, m!).
Thecomputation of
p{v
i
lv
d
,m¡,m
s
,m
d
,r}
(and
,r})
1
,m
j
,m
i
,m
Vj
Iv¡,
1
p{v
canefficiently
beperformed
by
utilizing special
characteristics of the matrixG-
1
.
Let
Q
i
denote agametic
contribution matrixrelating
theQTL
effects of individual i to theQTL
effects of itsparents.
The matrixQ
i
is2(i —
1)
x 2. For founderanimals,
matrixQi
issimply
zero. The recursivealgorithm
tocompute G-
1
of
Wang
et al.(1995,
equation
[18] )
can be rewritten aswhere
D¡1
=(C; -
Q;G¡-
1Q
¡)-
1
(which
reduces toD¡
1
=
(C
i
-
QfG
i
-,Q
i
)-’
with no
inbreeding),
O
i
is a2(q—i)
x 2 null matrix. Theoff-diagonals
inC; equal
theHenderson’s rules for
A-
1
[6].
The nonzero elements ofG-
1
pertaining
to an animalarise from its own contribution
plus
those of itsoffspring.
So,
whensampling
the ith animal’s markergenotype,
only
those contribution matrices need be considered that contain elementspertaining
to animal i. These are the individual’s owncontributions and those of its progeny when i appears as a
parent.
where Vk is the vector of animal k’s two marked
QTL effects,
andQp
denotes
therows of
Q
k
pertaining
toP,
one of k’sparents.
Again,
werecognize
each term inthe sum is the kernel of a
(bivariate)
normal which ispfv
i
Iv
s
,
, vdm¡, ms, md,r}
orp{v1Iv¡, Vj, m¡, mj,m1, r}.
2.4.
Running
the Gibbssampling
The Gibbs
sampler
is used to obtain asample
of aparameter
from theposterior
distribution and can be seen as a chained data
augmentation
algorithm
[19].
So,
oneaugments
data(y
andmobs)
withparameters
(0)
toobtain,
forexample,
p(e
1
Ie
2
, ... ,
O
d
,
y).
For the purpose ofbreeding
valueestimation,
Gibbssampling
works as follows:1)
setarbitrary
initial values for9!°!,
we use zeros for fixedand
genetic
effects and for each unmarkedanimal,
weaugment
agenotype
that is consistent withpedigree,
Mendelianinheritance,
and observed markerdata;
2)
sample
01’
+ll
from[3],
i =1, 2, ..,
p; for fixed
effects,
[4],
i =p +
1,
p +2, ..,
p + q; forpolygenic effects,
[5],
i =p + q +
1,
p + q +2, ..,
p + q +2q;
for markedQTL effects,
or[6],
i =p +
3q
+1,
p +3q
+2,..,
p +3q
+ t;
for markergenotypes,
and
replace
6!T!
withei
T+1
]
;
.
3)
repeat
2)
N(length
ofchain)
times.For any individual
parameter,
the collection of n values can be viewed as asimulated
sample
from theappropriate
marginal
distribution. Thissample
can beused to calculate a
marginal
posterior
mean or to estimate themarginal posterior
distribution. For small
pedigrees
withonly
a few animalsmissing
observed markerwhere B* is a
fixed,
polygenic
or markedQTL
effect. Thisprovides
a criterion tocompare the estimates obtained from Gibbs
sampling.
3. NUMERICAL EXAMPLE
A small numerical
example
is used toverify
the use of the Gibbssampler
to obtainposterior
mean estimates and illustrate the effect of the data on the estimatesobtained from two different
estimators,
i.e. aposterior
mean and the well-knownBLUP estimator
(by
solving
the MMEgiven
in theAppendix).
Pedigree
anddata of the
example
are infigure
2. Both sire(01)
and dam(02)
have observedmarker
genotypes,
AB andCD, respectively,
but do not havephenotypes
observed.Three full sibs have a marker
genotype
BC and aphenotype
+20(denoted
FS03,
04,
05);
three other full sibs have a markergenotype
AD and aphenotype
-20(denoted
FS06, 07,
08).
Both animals 09 and 10 have no markergenotypes
but have aphenotype
+20 and-20,
respectively. Complete knowledge
was assumed onvariance
components
and recombination rate between marker andMQTL
(table I).
Thethinning
factor in Gibbssampling
chain was 50cycles
and the burn inperiod
was twice thethinning factor,
and 20 000 thinnedsamples
were used foranalysis.
3.1. Estimates for
genetic
effectsThe
posterior
estimates obtained from Gibbssampling
were similar to the TRUEposterior
estimates,
as shown in table 11. Theposterior
estimates ofMQTL
effects ofanimals 09 and 10
(f0.70)
were much lessdivergent
than those of their full sibs thathad their marker
genotypes
observed(f2.48).
These lessdivergent
values reflectthe
uncertainty
on markergenotypes
of animals 09 and 10. The TRUE and GIBBSposterior
densities for anMQTL
effect of animal 09 were also very similar(figure 3).
The
posterior
variance was52.3,
which waslarger
than theprior
variance(ufl =
50)
and reveals that the data are not
decreasing
theprior uncertainty
onMQTL
effectsfor animals 09 and 10 in this situation. For the other full
sibs,
theposterior
variancewas
47.02,
which was lower than theprior
variance becausesegregation
ofMQTL
effects was known withhigher certainty,
i.e. markergenotypes
were known. TheBLUP estimates for
MQTL
effects of animal 09 and 10 wereequal
to1/6
of the3.2. Marker
genotype
probabilities
In the
following
markergenotype
ABrepresents
both AB and BA. In the latter case, alleles for both marker andMQTL
arereordered, maintaining linkage
betweenmarker and
MQTL
alleles within an animal.So,
four markergenotypes
werepossible
for animals 09 and 10
(table III).
Based onpedigree
and marker datasolely,
each ofthese four
genotypes
wasequally likely
(prior
probability
=0.25).
Afterincluding
phenotypic
data,
(posterior)
probabilities changed:
markergenotype
BC and AD for animal 09 became more and lessprobable, respectively.
The reverse holds foranimal 10. The estimates from the Gibbs
sampler
wereagain
very similar to the TRUEposterior probabilities.
Complete phenotypic
and marker information on six full sibs gave theMQTL
effects linked to marker alleles B and Cpositive
values and marker alleles A and Dnegative
values. Note thatprobabilities
(TRUE)
for markergenotypes
AC and BD also(slightly)
changed
afterconsidering
thephenotypic
data.4. DISCUSSION
Marker-assisted
breeding
value estimation in livestock has beenhampered by
incomplete
marker data.Previously
described methods[1,
23,
26]
can accommodateungenotyped
individuals which do not haveoffspring
themselves as was shownby
Hoeschele[7].
However,
they
do notprovide
theflexibility
toincorporate
parents
with unknowngenotypes
which results in the loss of information forestimating
marker linked effects. The described Gibbssampling algorithm
nowprovides
thisrequired
flexibility.
The innovativestep
in ourapproach
is thesampling
of
genotypes
for a marker locus that isclosely
linked toQTL
withnormally
distributed allelic effects.
Normality
ofQTL
effects is a robustassumption
to allowsegregation
of many allelesthroughout
apopulation
and allowchanges
in alleliceffects over
generations,
e.g. due to mutations and interactions with environmentsphenotypes
of animals in thepedigree
are used. Jansen et al.[9]
indicatethat,
as a result of the use ofphenotypic information,
unbiased estimates of effects at theQTL
can be obtained in situations where animals have beenselectively
genotyped.
In this paper we have concentrated on the use of information from asingle
marker locus.
Using
information frommultiple
linked markers can increase accuracy ofpredicting
genetic
effects at theQTL.
Theprinciples applied
here have been extended to situations wheregenotypes
for all the linked markers are known for allindividuals
[5, 22].
In order toincorporate
individuals with unknowngenotypes,
the methodpresented
in this paper needs to be extended to amultiple
marker situation.In
extending
the method tomultiple
markers,
theproblem
ofreducibility
deservesspecial
attention.Reducibility
of Gibbs chains can arise whensampling
genotypes
for apolymor-phic
locus with more than two alleles[20].
Thereducibility problems
will becomemore severe when
sampling
genotypes
formultiple
linked markers.Thompson
[21]
suggested
several, workable,
approaches
toguarantee
irreducibility
of the Gibbschain. These
approaches
make use ofMetropolis-coupled
samplers
[11],
importance
sampling,
with0/1
weights
[15],
and’heating’
in theMetropolis-Hastings
steps
[12].
Alternatively,
Jansen et al.[9]
sampled
IBD values for all marker lociindicating
parental origin
of alleles instead of actual alleles to avoid thereducibility problem.
Inextending
the method tomultiple
linkedmarkers,
attention also needs to bepaid
to an efficient scheme forhaplotypes
orgenotypes
at the linked loci.Updating
ofgenotypes
atclosely
linked loci will be more efficient whengenotypes
at the linked loci areupdated
together
(’in blocks’)
in order to reduce auto-correlation in the Gibbssampler
[10].
For
posterior
inferences on thebreeding
value of an animal a minimum of100 effective
samples
is needed. In the numericalexample
this minimum wouldcorrespond
to a chain of 5 000cycles
whichrequired
8 s of CPU at a HP9000K260 server. It has been found that
computing requirements
increase more orless
linearly
with the number of animals[10].
Thepresented
method can beapplied
to dataoriginating
from nucleus herds whichcomprise
therelatively
small number ofgenetically
superior
animals from thepopulation.
In a marker-assistedselection scheme marker
genotypes
will be collectedlargely
on theseanimals,
withsufficient animals
having
markergenotypes
observed toimprove
selection ofsuperior
individuals.Straightforward application
inlarge
commercialpopulations
with thousands of markergenotypes
missing,
is not a validoption
because ofcomputational
requirements
of Markov chain Monte Carlo(MCMC)
algorithms
such as Gibbssampling. Hybrid
schemes will need to bedeveloped
toincorporate
information from the commercialpopulation
into the marker-assistedprediction
ofbreeding
values of nucleus animals. Similar schemes have been
implemented
toincorporate
foreign
information into national evaluations indairy
cattle.Our
Bayesian approach
can also be considered as a firststep
towards a MCMCalgorithm,
notnecessarily
Gibbssampling,
that can also estimatehyper
parame-ters,
which were held constant in thisstudy.
The nextstep,
therefore, comprises
MCMC
algorithm
can then be used for theanalysis
inQTL mapping experiments
with
complex pedigree
structures,
such as(grand-)
daughter designs,
in outbredpopulations.
ACKNOWLEDGEMENTS
Valuable
suggestions
by
S. van der Beek and anonymous reviewers aregratefully
acknowledged.
The financial support of Holland Genetics ishighly appreciated.
REFERENCES
[1]
FernandoR.L.,
GrossmanM.,
Marker-assisted selectionusing
best linear unbiasedprediction,
Genet. Sel. Evol. 21(1989)
467-477.[2]
GelfandA.E.,
SmithA.F.M., Sampling-based approaches
tocalculating marginal
densities,
J. Am. Stat. Assoc. 85(1990)
398-409.[3]
GemanS.,
GemanD.,
Stochasticrelaxation,
Gibbs distributions and theBayesian
restoration of
images,
IEEE Trans Pattern Anal MachineIntelligence
6(1984)
721-741.[4]
Geyer C.J.,
Apractical guide
to Markov chain MonteCarlo,
Stat. Sci. 72(1992)
320-339.[5]
GoddardM.E.,
A mixed model foranalysis
of data onmultiple genetic markers,
Theor.
Appl.
Genet. 83(1992)
878-886.[6]
HendersonC.R.,
Asimple
method forcomputing
the inverse of a numeratorrela-tionship
matrix used inprediction
ofbreeding values,
Biometrics 32(1976)
69-83.[7]
HoescheleI.,
Elimination ofquantitative
trait lociequations
in an animal modelincorporating genetic
markerdata,
J.Dairy
Sci. 76(1993)
1693-1713.[8]
JansenR.C., Complex plant
traits: time forpolygenic analysis,
Trends Plant Sci. 3(1996)
73-103.[9]
JansenR.C.,
JohnsonD.L.,
Van ArendonkJ.A.M.,
A mixture modelapproach
to themapping
ifquantitative
trait loci incomplex populations
with anapplication
tomultiple
cattlefamilies,
Genetics(1997)
(in press).
[10]
JanssL.L.G., Thompson R.,
Van ArendonkJ.A.M.,
(1995)
Application
of Gibbssampling
for inference in a mixedmajor gene-polygenic
inheritance model in animalpopulations,
Theor.Appl.
Genet. 91(1995)
1137-1147.[11]
LinS.,
Markov chain Monte Carlo estimates ofprobabilities
oncomplex
structures,Ph.D.
dissertation, University
ofWashington.
[12]
LinS., Thompson E.A., Wijsman E.,
Achieving irreducibility
of the Markov chain Monte Carlo methodapplied
topedigree data,
IMA J. Math.Appl.
Med. Biol. 10(1993)
1-17.[13]
MeuwissenT.H.E.,
VanArendonkJ.A.M.,
Potentialimprovements
in rate ofgenetic
gain
from marker-assisted selection indairy
cattlebreeding schemes,
J.Dairy
Sci. 75(1992)
1651-1659.[14]
SchaefferL.R., Kennedy B.W., Computing strategies
forsolving
mixed modelequations,
J.Dairy
Sci. 69(1986)
575-579.[15]
SheehanN.,
ThomasA.,
On theirreducibility
of a Markov chain defined on a spaceof genotype
configurations by
asampling scheme,
Biometrics 49(1993)
163-175.[16]
SmithC., Simpson S.P.,
(1986)
The use ofgenetic
polymorphisms
in livestock[17]
SollerM.,
BeckmannJ.S.,
Restrictedfragment length polymorphisms
andgenetic
improvement,
in: Proc. 2nd WorldCongress
Genet.Appl.
Livest.Prod., Madrid,
Editorial
Garsi, Madrid, Spain,
vol. 6, 1982, pp. 396-404.[18]
SorensenD.A., Wang C.S.,
JensenJ.,
GianolaD., Bayesian
analysis
ofgenetic
change
due to selectionusing
Gibbssampling,
Genet. Sel. Evol. 26(1994)
333-360.[19]
TannerM.A.,
Tools for StatisticalInference, Springer-Verlag,
NewYork, NY,
1993.[20]
ThomasD.C.,
CortessisV.,
A Gibbssampling approach
tolinkage analysis,
Hum.Hered. 42
(1992)
63-76.[21]
Thompson
E.A.,
Monte Carlo likelihood ingenetic mapping.
Stat. Sci. 9(1994)
355-366.[22]
UimariP.,
ThallerG.,
HoescheleI.,
The use ofmultiple
markers in aBayesian
methodfor
mapping quantitative
traitloci,
Genetics 143(1996)
1831-1842.[23]
VanArendonkJ.A.M.,
TierB.,
Kinghorn B.P.,
Use ofmultiple genetic
markers inprediction
ofbreeding values,
Genetics 137(1994)
319-329.[24]
VanTassellC.P.,
CasellaG.,
PollakE.J.,
Effects of selection on estimates of variancecomponents
using
Gibbssampling
and restricted maximumlikelihood,
J.Dairy
Sci. 78(1995)
678-692.[25]
Wang C.S., Rutledge J.J.,
GianolaD.,
(1993)
Marginal
inferences about variancecomponents in a mixed linear model
using
Gibbssampling,
Genet.Sel.
Evol. 25(1993)
41-62.[26]
Wang T.,
FernandoR.L.,
van der BeekS.,
GrossmanM.,
Van ArendonkJ.A.M.,
Covariance between relatives for a markedquantitative
traitlocus,
Genet. Sel. Evol.27
(1995)
251-272.Al. APPENDIX
A1.1.
Computation
of average G withincomplete
marker dataWang
et al.[26]
suggested computing
an averageG,
here denotedG,
aswhere
G(
k
)
is thegametic
relationship
matrixgiven
aparticular
markergenotype
configuration
k
);
m(
and)lM
,)
ob
k
(
M
p(
is theprobability
ofm(
k
)
given
mobs. Thisequation
is not conditioned onphenotypic
information.Al.2. Marker-assisted best linear unbiased
prediction
ofbreeding
valueswhere a&dquo; =
Qe !Qu,
a&dquo; =Qe !Q!
and G are all known. Solutions can be obtainedby
iteration on the data
[14].
Theseequations
can be used in three situations.First,
Gis
unique
(complete
markerdata).
Second,
withmissing markers,
a linear estimatoris obtained