HAL Id: hal-00894117
https://hal.archives-ouvertes.fr/hal-00894117
Submitted on 1 Jan 1995
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Computing approximate monogenic model likelihoods in
large pedigrees with loops
Llg Janss, Jam van Arendonk, Jhj van der Werf
To cite this version:
Original
article
Computing approximate monogenic
model likelihoods in
large
pedigrees
with
loops
LLG Janss
JAM
Van
Arendonk,
JHJ Van der Werf
Wageningen Agricultural University, Department of
AnimalBreeding,
PO Box338,
6700 AH
Wageningen,
The Netherlands(Received
30January
1995;accepted
11September
1995)
Summary - In this
study
’iterativepeeling’
isintroduced,
a methodequivalent
tothe traditional recursive
peeling
method forcomputing
exact likelihoods innonlooped
pedigrees,
but which can also be used to obtainapproximate
likelihoods inlooped
pedigrees.
Iterativepeeling
is aninteresting
tool for animalbreeding,
where exact recursivepeeling
isgenerally
unfeasible due to the abundant number ofloops
in animalpedigrees.
Insimulations,
hypothesis testing
and parameter estimation werecompared
based onapproximate
likelihoods inlooped pedigrees
and exact likelihoods innonlooped
pedigrees,
showing
no biases introducedby
theapproximation
inlooped pedigrees.
likelihood/
pedigree peeling/
major gene/
looped pedigreeRésumé - Calcul approximatif de vraisemblance pour un modèle monogénique dans de grands pedigrees à boucles. Dans cette étude on introduit une
procédure
itérativede condensation de
l’information
contenue dans unpedigree, appelée « épluchage »,
quiest
équivalente
àl’épluchage récursif pour
le calcul des vraisemblances exactes dans despedigrees
sansboucles,
mais qui estégalement
utilisable pour le calcul de vraisemblancesap-proximatives dans les
pedigrees
à boucles.L’épluchage itératif
est une méthode intéressante engénétique
animale où la méthode récursive exacte estgénéralement inapplicable
à cause dugrand
nombre de boucles dans lespedigrees
animaux.À
l’aide desimulations,
on acomparé
des testsd’hypothèse
et l’estimation deparamètres
basés sur des vraisemblances approximatives dans despedigrees
à boucles et des vraisemblances exactes dans despedi-grees sans
boucles,
montrantqu’il n’y
a pas de biais introduit par le calculapproximatif
dans des
pedigrees
à boucles.INTRODUCTION
Research into the use of
major
gene models in animalbreeding
has been aimedmainly
atapproximations
to a mixed inheritancemodel,
including polygenes,
inone
generation
half-sib structures(Hoeschele,
1988;
LeRoy
etal, 1989;
Knott etal,
1992).
Because of thepedigree loops
that arise in animalbreeding situations,
ex-tension to
multigeneration
pedigrees
is difficult. Apedigree loop
arises when 2 individuals are connectedby
more than onepath
of descent ormarriage
relation-ships.
Lange
and Elston(1975)
described varioustypes
ofloops,
among which areinbreeding
loops, marriage rings
andmarriage loops.
In animalbreeding
pedigrees
these kinds ofloops
are very common. Inparticular, multiple matings
which aregenerally
applied
to males and often tofemales,
result in manymarriage
loops
andmarriage rings.
For
genotype
probability
and likelihoodcomputation, loops
canonly
be dealtwith in an exact manner in
pedigrees
with a fewsimple non-overlapping loops using
the traditional recursive
peeling
method(Elston
andStewart, 1971;
Cannings
etal,
1976;
Cannings
etal,
1978).
However,
inhighly looped
pedigrees,
common in animalbreeding,
exact recursivepeeling
is toodemanding
computationally
and recursivepeeling
is not flexibleenough
to allow forapproximate
computations.
In this
study
we introduce ’iterativepeeling’.
Iterativepeeling
isdeveloped
asan exact method for
application
innonlooped pedigrees, equivalent
to recursivepeeling,
butwhich,
unlike theoriginal
recursivevariant,
can be used withoutmodifications in
looped
pedigrees
to obtainapproximate
likelihoods. The mainobjective
of this paper is to introduce iterativepeeling
for suchapproximations
inlooped
pedigrees,
allowing
for a moregeneral application
ofmajor
gene models in animalbreeding.
Using simulations,
the usefulness of theapproximation
for likelihood-basedhypothesis testing
andparameter
estimation inlooped
pedigrees
isinvestigated.
Amonogenic
model will beconsidered,
which can be extended to amixed inheritance
model,
as will be discussed.RECURSIVE AND ITERATIVE PEELING
In the first
section,
recursivepeeling
is described forobtaining monogenic
model likelihoods innonlooped pedigrees.
In the secondsection,
’iterativepeeling’
is introduced as anequivalent
method for exactcomputations
innonlooped
pedigrees.
The
equivalent
exact method innonlooped
pedigrees
can be used as anapproximate
method in
looped pedigrees.
Recursive
peeling
Probability
and likelihoodcomputations
innonlooped
pedigrees
can be doneby
recursive
peeling
(Elston
andStewart, 1971;
Cannings
etal,
1976;
Cannings
etal,
1978)
using
2 basicpeeling operations
of’peeling up’
and’peeling
down’.Roughly,
considering
asingle family,
apeel-up
operation
represents
the information in afamily
inprobabilities
for thegenotype
G
i
of aparent
i,
and apeel-down
operation
of the
peel-up operation
is denotedby
prog(G
i
)
and the result of thepeel-down
operation
is denotedby
prior(G
k
).
Thecorresponding
notation inCannings
et al(1976, 1978)
is theR
*
(..;G
i
)
function forpeeling
up and theR+(..;G!)
function forpeeling
down.Peeling operations
are usedrecursively,
eg, thecomputation
of a prog term fora
parent
based on progeny data may includepreviously computed
prog terms ofthose progeny,
representing
information fromgrand-progeny.
The aim ofpeeling
isto condense all information from a
pedigree
into aprior
and prog term for asingle
individual
1, obtaining
the likelihood L for all data in thepedigree
as:where
f (y, I G
i
)
is thepenetrance
function,
which is theprobability
for the observed data yl onindividual l, given
that it hasgenotype G
i
.
The individualmay
be anindividual from the base
population,
in which case thebase-population
genotype
frequency
P(G!)
is used inplace
ofprior(G
i
).
Individuall may
also have no owndata or no progeny, in which case the
corresponding
penetrance
term or prog termis removed.
Computationally
this isimplemented using
apenetrance
or prog termcontaining
l’s.Peeling equations
A
peeling
equation
for an individual is obtainedby
considering
the collection ofpossible base-population
genotype
frequencies,
genotype
transmissionprobabilities,
penetrance
probabilities
and otherpeeling
termspertaining
to the individuals in itsfamily
andsumming
over allpossible
genotypes
of thefamily
members. The termsthus
entering
apeeling equation
are difficult togive
ingeneral. Here, equations
will begiven
to usepeeling
in apedigree
structure with dams nested within sires. In this structure afamily
is a half-sibfamily
of one sire with severalmates,
containing
groups of full sibs which are, across groups,
paternal
half-subs. Three differentpeeling equations
are considered: 2 forpeeling
up,dependent
on whether this isdone for a sire or a
dam,
and 1 forpeeling
down. In thepeeling
equations,
prior,
prog and
penetrance
functions onfamily
members arespecified
in allplaces
wherethey
can enter. When these are notrelevant,
eg, when a progeny does not haveprogeny of its own, these are removed or,
computationally,
termscontaining
l’s are used. Prior terms for individuals in the basepopulations
are substituted withbase-population
genotype
frequencies.
To condense all information in a prog term for a sire i the
following
expression
is used:
prog(Gi) = r
l
jrgj
pr2or(Gj) fly7lC-T7)Hk!Gk
Pl!’klGi,
Gj)
f(
y
kl
-G
k
)
!r!9(CTk) [2]
where j
=
1 to ni are mates ofi,
each matehaving
k = 1 ton
ij progeny, and
P(Gk !Gi, G! )
is thegenotype
transmissionprobability
of sire i and adam j
to
offspring
k. To condense all the information from a half-sibfamily
into a prog termwhere i is the sire of the
family,
prog-j.)
i
(G
is like inequation
[2],
butexcluding
damj
*
and k =1,
n2!* are progeny of dam
j
*
.
To condense all the information ina
prior
term for 1particular
progeny with dam *kj
*
,
thefollowing
expression
is used:where i is the sire of the
family,
phs(G
i
)
is a term that includes information onthe
paternal
half-sibs ofk
*
,
which is a function of thegenotype
of its sire i and iscomputed
as:Iterative
peeling
Iterative
peeling
isequivalent
to recursivepeeling
used innonlooped pedigrees.
Iterative
peeling
is based onalgebraic partitioning
of the likelihood and onrepeated
computation
ofpeeling
equations,
based on the idea of iterativecomputation
ofgenotype
probabilities
(Van
Arendonk etal,
1989).
Partitioning
of likelihoodThe aim of
obtaining
the likelihood of all datausing equation
[1]
requires
familiesto be handled in a certain order and
requires
peeling,
within eachfamily,
to bein a certain direction.
Peeling operations
can be used topartition
the likelihoodpertaining
toparts
of thepedigree.
Thispartitioning
is continued untilparts
areobtained
pertaining
tosingle
families. This allows afamily-wise
evaluation of thelikelihood,
and therequirement
ofpeeling
to have a direction within eachfamily
becomes obsolete.
Consider the
pedigree
with 5 individuals infigure
1. In thispedigree
2 familiesare
present,
onefamily
with individuals1,
2 and3,
and a second with individuals3,
4 and 5.
Here,
onepartitioning
above and below individual 3 divides thepedigree
in2
families,
with individual 3being
in both families. Individual 3 is called alinking
individual. The likelihood for a
monogenic model, assuming
data is available on all5
individuals,
iscomputed
as:Now,
L ismultiplied
and dividedby Li =!1!2!03
P(Gi) P(G
2
)
P(G
3
) Gi , G
2
)
!(
Y1
IG
1
where the
part !01!02
P(G
i
)
P(G
2
)
P(G
3
) Gi ,
)
G
2
!(
Y1
IG
d
f(y
2
) G2 )
has beeniso-lated. This
part
isprior(G
3
).
The term defined asL
1
can be rewritten asE
G3
I;
G1
I;c
2
P(G
1
) P(G
2
) P(G
3I
G
1
, G
2
) !(
Y1I
G
1
) !(
Y2I
G
2
),
which isI;c3 prior(G3).
This
simplifies
L to:where
prio&dquo;
sC
( G
3
)
stands for ascaled,
ornormalised, prior
term. Now the likelihood can be written as L =L
1
L
2
,
orln(L)
=ln(L
i
)
+ln(L
2
),
with one likelihood termper
family.
This is apartitioning using
aprior
term for thelinking
individual. It shows that for thistype
partitioning
(i)
in thefamily
where thelinking
individual isa progeny, after the
partitioning,
information on thelinking individual,
ie own data and progenydata,
isignored;
and(ii)
in thefamily
where thelinking
individual isa
parent,
a scaledprior
term is used for thelinking
individual. This term is usedin a manner like a
base-population
genotype
frequency
for base individuals. Thescaled
prior
term for alinking
individual1,
iscomputed
ingeneral
as:Although
thepartioning
is shownonly
for 1example,
thepartitioning
is verygeneral.
The termL
1
above is ingeneral
the sum of theprior
term for alinking
individual
1,
which is the collection of allprobability
termspertaining
to anterior individuals of and the transmissionprobability
tol,
summed over allpossible
genotypes
of l and of its anterior individuals. At the same time this termrepresents
the likelihood of the entire anterior
part
of thepedigree
andl, excluding
data onl. The
remaining
part
after thepartitioning,
L
2
in theexample,
is the likelihood of theposterior
part
of thepedigree
ofl, including
l with a scaledprior
term. Inlarger pedigrees
thispartitioning
isrepeated
toyield
parts
corresponding
tosingle
families. When
repeating
thepartitionings,
results of earlierpartitionings
must be taken intoaccount,
eg, the result that after apartitioning
information on alinking
The likelihood of a
pedigree
can bepartitioned entirely using prior
terms.However,
the iterativecomputation,
as will be introducedhereafter,
can bespeeded
up
by
alsousing
apartitioning
of the likelihoodusing
a prog term.Showing
this based on the
example,
the likelihood L ismultiplied
and dividedby
a termrepresenting
the likelihood offamily
2,
ignoring
data on individual3,
L2
=E
G3
E
G4
E
G5
P
(Ga) P(G
5I
G
3
,
Ga) !
(
Y3I
G
3
)
/(!!G4),
which leads to:2
Here a term
E
G4
E
G5
P(G
4
)
P(G
5l
G
3
,
)
G
4
!(
Y4I
G
4
)
!(
Y5I
G
5
)
has beenisolated,
which isprog(G
3
).
The divisionby
L2
scales
thisterm,
L2
being
E
G3
!rog(G3).
Hence,
L is written as:where
prot
C
( G
3
)
denotes the scaled or normalised prog term. For apartitioning
using
a prog term it is seen that(i)
in thefamily
where thelinking
individual isa progeny, a
prog
s
’
term is added as information for theindividual;
and(ii)
in thefamily
where thelinking
individual is aparent,
all information from observationsand from
prior
terms isignored.
The scaled prog term for alinking
individual l isgenerally
computed
as:Partitioning
in a nesteddesign
In a nested
design, partitionings
are carriedthrough
untilparts
are obtainedcorresponding
to sire families. In suchfamilies,
several femaleparents
can bepresent.
Thelinking
individuals are all the sires and dams of thefamilies,
except
when
they
are in the basepopulation.
In thisdesign
we consider apartitioning
using
a prog term for each male and aprior
term for each female that is alinking
individual. When all
parents
of afamily
are in the basepopulation,
thepart
of thelikelihood
pertaining
to such afamily
iscomputed
as:where i indicates the sire of
family
s, j
sums over the dams of thefamily,
k indicatesmale progeny that are
linking
individuals,
1 indicates female progeny that arelinking
individuals and m indicates all other progeny. When the sire of the
family
is not inthe base
population,
the termP(G
i
)
f (y2!Gi)
on the first line of[5]
is removed andof
[5]
isreplaced
withpriof’c (G j)
’
The consideredpartitionings using
prog termsfor all male
linking
individuals lead to this removal of information from sires on thefirst line of
[5]
when sires are not in the basepopulation
and lead to the inclusion of theprog
sc
for males on the third line ofequation
!5!.
The consideredpartitionings
using prior
terms for all femalelinking individuals,
lead to the inclusion of apriors!
term on the second line of
[5]
when dams are not in the basepopulation
and the removal of all information of females on the fourth line ofequation
!5!.
Based onthe results from the
previous
paragraph,
the likelihood of the entirepedigree
after thepartitionings
is:Repeated computation
ofpeeling
equations
Iterative
peeling
usesrepeated
computation
ofpeeling equations.
Therepeated
computation
is a method to establish the order in whichequations
should be handled.Therefore,
iterativepeeling
does notrequire knowledge
of such an orderbeforehand,
as isrequired
for recursivepeeling.
For each individual a
prior
and a prog term iscomputed
and remains stored because results ofpeeling
terms can berequired
asinput
for thecomputation
ofother
peeling
terms. Iterativepeeling
computes
a series of solutionspriorlo, rtorlil,
etc,
for these terms.Starting
values are taken for individual i aspr!or!(Gt) =
P(G
i
),
thegenotype
frequencies
in the basepopulation
andI
l
°
prog
(G
i
)
equals
1 for allG
i
.
Iterativecomputation
startsby computing
prior
[l
] (G
i
)
for each individuali,
in order ofdescending
age. Evaluation of theseprior
terms is based onprior!l!
termsof
parents,
which are available because older individuals areupdated
before youngerindividuals,
and onprog
lol
terms of sibs.Subsequently,
proglo (Gi)
iscomputed
for each individual
i,
in order ofascending
age. Evaluation of these prog terms isbased on
prior!l!
terms ofmates,
onprog[l]
terms of progeny, which are availablebecause now younger individuals are
updated
before olderindividuals,
and forfemale
parents,
on aprog
iol
orprog
[l]
term of their male mate. Whether this lastterm is
already updated
asprog
[l]
depends
on the order in which prog terms arecomputed.
Aftercomputation
of allprior!l!
andprog
[l]
terms iscompleted,
a newiteration starts
computing
prior
[2
]
andprog!2!,
etc.Starting
values are such thatprior
lol
terms are correct for all individuals in the basepopulations,
andprog
lol
terms are correct for all individuals without progeny.Terms that can be correct after the first
cycle
ofcomputations
are, forinstance,
prior
l
1]
terms of individualsdescending
from 2 base individuals andprog
[l]
termsof
parents
withoutgrandprogeny.
Correctcomputation
of a term is shown when inthe next
cycle recomputed
terms areequal
to old terms. Once it is found that aterm is
correctly computed, recomputation
can be omitted infollowing
iterationsof the
algorithm.
The order in which terms are found correctgives
information onthe order in which recursive
peeling
could be used.Generally,
in eachiteration,
reasonably large
groups of terms appear correct,keeping
the number ofcycles
required
tocompute
all termscorrectly
reasonably small, typically
about the number ofgenerations
in the data set. When all terms are foundcorrectly
computed,
Application
inlooped
pedigrees
The series of solutions
prior
f
°
]
,
prio!ll,
etc,
obtained with iterativepeeling
can beconsidered as
temporary
solutions for therequired
terms,
corresponding
to solutions based on a notyet
fully
determinedpeeling
order.’Temporary’
likelihoods canalso be
computed using
[5]
and[6]
based on a notyet
fully
determined order. Innonlooped
pedigrees,
apeeling
order caneventually
be found andtemporary
solutions become exact. In
looped
pedigrees,
apeeling
order for recursivepeeling
cannot be determined. In the iterative
peeling algorithm
theimpossibility
offinding
a
peeling
order inlooped pedigrees
is shownby
continuing
changes
inpeeling
terms. In
looped pedigrees,
thesechanges
were found to decrease in sizequickly
andtemporary
likelihoods were found tostabilise, supplying
anapproximation.
Becausein iterative
peeling
everyfollowing
update
of terms includes information from50%
less related
individuals,
ageometric
rate of convergence isplausible.
As astopping
rule to use the
approximation
inlooped
pedigrees,
we used the average absolutedifference between
subsequent
normalisedheterozygote probabilities,
based oncomputed
peeling
terms. Forconvenience,
only
theheterozygote
probability,
whichchanged
themost,
was monitored.SIMULATION STUDY
Application
of iterativepeeling
to obtainapproximate
likelihoods inlooped
pedi-grees was the aim of thisstudy.
Simulations were thereforeperformed
toinvestigate
the usefulness of this
approximation.
Because exactcomputations
are unfeasible inlarge looped
pedigrees,
approximate
likelihoods could not becompared
with exactones.
Hence,
an indirect way tostudy
theapproximation
was foundby studying
the distribution of test statistics and ofparameter
estimates over a number ofreplicated
analyses
inlooped
as well as innonlooped pedigrees.
Innonlooped pedigrees
exactlikelihoods could be
computed, serving
as a reference. Simulations andanalysis
arebased on a biallelic autosomal locus and a normal
penetrance
function. Simulated dataData sets had a nested structure each
generation,
with full-sibs nested withinpaternal
half-sibs. Three different data structures were used(table I),
1 structurewithout
loops
and 2 structures withloops.
The data structures weredesigned
tocontain
approximately
the same number ofobservations,
the same number of baseindividuals
(structure
1 vs2)
and the samefamily
sizes(1
vs3).
In structures 2 and3,
the thirdgeneration
wasproduced by taking
1 son from each sire and 1daughter
from eachdam, maintaining
the samebreeding
structure acrossgenerations.
No directional selection waspractised,
andbreeding
females for a male were eachtaken from a different
sire-family.
Half- and full-sibmatings
wereavoided,
so thatinbreeding
was absent within the 3generations
considered. The additional thirdgeneration
in structures 2 and 3 caused manypedigree loops
in the form ofmarriage
loops.
All individuals used forbreeding
the lastgeneration,
ie 120 for structure 2and 60 individuals for structure
3,
were involved in 1 or more suchloops,
oftenGenotype
G
i
of an individualequals
1,
2 or 3corresponding
togenotypes
A
1
A
1’
A
l
A
2
andA
2
A
2
at an autosomal locus.Genotypes
for individuals in the basepopulation
wererandomly sampled
using
genotype
frequencies according
toHardy-Weinberg
proportions,
after whichgenotypes
of other individuals wererandomly
sampled
based on realisedparental
genotypes
assuming
Mendelian transmissionprobabilities.
For each individual a randomnormally
distributed environmentalcomponent
wassampled
and added to apre-determined
effect of eachgenotype
toobtain a
phenotypic
observation. Random numbers weregenerated
using
GGUBFSand
GGNQF
(IMSL, 1984).
Details on theparameters
used for these simulationsare
given
in thefollowing
sections. Model and modelfitting
The statistical model can be
specified by
theprobability
terms in!2!,
[3]
and[4]
which areP(G
i
),
thegenotype
frequency
in the basepopulation
for individuali,
P(GiIG
s
,
G
d
),
the transmissionprobability
for individual igiven
thegenotypes
of its sire s and damd,
and thepenetrance
function/(<
/
t!G,),
theprobability
forthe data y2 on individual i
given
thegenotype
G
i
of individual i. From these 3terms,
transmissionprobabilities
are assumed known to be Mendelian.Genotype
frequencies
in the basepopulation depend
on the unknownfrequency f
of theA
1
allele, assuming Hardy-Weinberg proportions
ofgenotypes.
Thepenetrance
function for an individual i is taken as:This
penetrance
function is a normalprobability density
function with variancea-2
around
the mean JiGi forgenotype G
i
.
No dominance is assumed. Foranalysis,
means attributed to thegenotypes
areexpressed
as Ji1 =p -
1/2t,
!2 = tc and A’3 =J
i +
1/2t,
where t is the difference betweenhomozygotes,
referred to as thegene effect. The unknown
parameters
in the model are thenf,
p.,t,
andQ2
.
Likelihoods were
computed using
iterativepeeling.
For structure1,
withoutloops,
computations
were doneexactly
by
repeating
thecomputations
until nofurther
changes
occurred,
having
found the order for recursivecomputation.
For thelooped
pedigrees
of structures 2 and3,
iterativepeeling
was used to obtainapproximate
likelihoods. Thestopping
rule was achange
less than10-
8
for the average absoluteheterozygote probabilities
of all individuals. The maximum of thelikelihood was
sought using
the downhillsimplex algorithm
(Nelder
andMead,
1965),
using
as convergence criteria the variance of likelihood values ofpoints
inComparisons
Looped
andnonlooped
pedigrees
werecompared
inhypothesis
tests andparameter
estimation. Inhypothesis
testing,
a nullhypothesis postulating
the absence of amajor
gene isused,
describedby
a model withparameters it
anda
2
,
and analternative
hypothesis postulating
the presence of amajor
gene isused,
describedby
a model withparameters
f,
/1,
t anda2.
Tests are based on the likelihood ratio(LR)
teststatistic,
which is twice the naturallogarithm
of the ratio of maximum likelihoods under eachhypothesis. Type
I error and power, thecomplement
oftype
II error, were
investigated
at their nominallevel,
ieassuming
theexpected
classicalasymptotic
X2
distribution
for the LR test statistic under the nullhypothesis
(Wilks, 1938).
Using
the classicalrules, rejection
thresholds were obtained from aX2
2distribution with 2
degrees
offreedom,
ie the difference in number ofparameters
between the null and alternativehypothesis.
It should be noted that fortesting
mixtures,
these classical rules do not leadexactly
to the nominaltype
I errors(Titterington
etal,
1985),
but this is not ofimportance
for thecomparisons
betweenlooped
andnonlooped
pedigrees
to be made here. The likelihoodL
o
for the nullhypothesis
iscomputed
as:where y
i
are observations with i =1, ... , N,
the total number ofobservations,
assumed
normally
andindependently
distributed. Under the nullhypothesis,
the maximum likelihood estimate for the mean isíi
=&dquo;
BYi/
N
and for the variance is(;
2
=
&dquo;B(
Yi
-
íiO)
2
/N.
Type
I error of the test for amajor
gene wasinvestigated
by
simulating
1 000data sets of each structure
(table I),
generating
for each individualonly
arandomly
distributed error term with
U
2
= 100 asphenotype.
Likelihoods for the nullhypothesis
and the alternativehypothesis
werecomputed
in each of thesereplicated
datasets,
and the likelihood ratio test statistic was obtained. The number ofsignificant
tests in these 1 000 data sets was countedusing rejection
thresholdsof 4.605 and
5.991,
corresponding
to nominaltype
I errors of 10 and5%.
Power todetect a
major
gene wasinvestigated by
simulating
100 data sets of each structure(table I)
for 3 different gene effects t =5,
t = 7.5 and t = 10 andusing
allelefrequency f
= 0.5 and residual varianceU
2
= 100.Hence,
relative gene effectst
l
a
were
0.5,
0.75 and 1. Power was based on a nominaltype
I error of5%,
using
arejection
threshold of 5.991. Parameter estimates werecompared using
the 100 datasets of each structure
(table
I)
used toinvestigate
power with t = 10.RESULTS
Type
I errors weresignificantly
lower than theirnominal,
ieasymptotically
ex-pected, level,
butcomparison
oftype
I errors betweenlooped
andnonlooped
struc-tures did not showsignificant
differences(table
II).
This indicates that absolute values ofapproximate
likelihoods obtained are on average close toexpected
and that the distribution of the test statistic over a number ofreplicates
is notestimates for gene effect under the alternative
hypothesis
are biased ingeneral,
but estimates for gene effects as well as allelefrequency
do not differ betweenlooped
andnonlooped
structures(table IV).
This indicates that location of the maximumis,
on average overreplicates,
not altered forapproximate
likelihoods.DISCUSSION AND CONCLUSIONS
An alternative
peeling
algorithm,
called iterativepeeling,
has beenpresented.
The iterativepeeling algorithm
includes analgorithm
to find an order forevaluating
peeling
equations.
When an order cannot befound,
as inlooped
pedigrees,
anapproximate
likelihood issupplied.
In this case, use of apartitioned computation
of the likelihood is also crucial. Traditional recursive
peeling
does not involve suchapproximations,
because this methodonly
computes
the exact likelihood once apeeling
order is found andcomputes
the likelihoodby representing
allpedigree
information in terms for asingle
individual. Usefulness of iterativepeeling
asan
approximate
method inlooped
pedigrees
wasinvestigated by
simulations. At anaggregate
level,
iecompared
on average over a number ofreplicated
datasets,
no differences were found betweenlooped
andnonlooped pedigrees.
Exactcomputations
were unfeasible due to thelarge
number ofloops
in thetypical
animalbreeding pedigrees
weconsidered,
andproperties
of iterativepeeling
could not be studiedcomparing
exact andapproximated
likelihoods in individual data sets.The iterative
peeling
method may be of interest forapplication
in animalbreeding.
In humanpopulations,
pedigrees
aregenerally
small andloops
are notabundant so that exact
computations
can be consideredusing
morecomplicated
forms of
peeling
(see
Cannings
etal,
1978).
These morecomplicated
forms ofpeeling
considergenotypes
on sets of individualsjointly.
Larger pedigrees
andmore abundant
looping
in animalbreeding, however,
makes the sets ofgenotypes
consideredjointly
toolarge
for exactcomputations
to be feasible.Therefore,
approximate
methods arerequired
forapplication
in animalbreeding.
Iterativepeeling
seems verysuited,
being
exact withoutloops,
andautomatically supplying
approximate
likelihoods whenloops
arepresent.
Notethat,
due to thepartitioned
computation
oflikelihood,
iterativepeeling
alsoautomatically
handlespedigrees
consisting
ofindependent families,
ie datatraditionally
handled with sire orsire-and-dam models. The
equations
andpartitionings given
here could be extended toallow for more
general pedigrees.
Inparticular,
allowance could be made for femalesbeing
mated with several males. In this case,partitionings
should accommodate for’linking
individuals’being
parents
in severalfamilies,
rather thanjust
one. Themonogenic
model used could also be extended to a mixed inheritancemodel,
themodel
usually required
foranalysis
of animalbreeding
data. In iterativepeeling only
uni- and bivariate functions ofgenotypes
are considered onsingle
families. This canbe combined with for instance a Hermitian
integration
(Le
Roy
etal, 1989;
Knottet
al,
1992)
to include apolygenic
component.
ACKNOWLEDGMENTS
This research was
supported financially by
the Dutch Product Board for Livestock andMeat, the Dutch
Pig
HerdbookSociety,
BovarBV,
VOC Nieuw-Dalland BV, Euribrid BVREFERENCES
Cannings C, Thompson EA,
Skolnick MH(1976)
The recursive derivation of likelihoodson
complex pedigrees.
AdvAppl
Prob 8, 622-625Cannings C, Thompson EA,
Skolnick MH(1978)
Probability
functions oncomplex
pedigrees.
AdvAppl
Prob10,
26-61Elston
RC,
Stewart J(1971)
Ageneral
model for thegenetic analysis
ofpedigree
data. Hum Hered 21, 523-542Hoeschele I
(1988)
Genetic evaluation with datapresenting
evidence of mixedmajor
geneand
polygenic
inheritance. TheorAppl
Genet76,
81-92IMSL
(1984)
Library Reference Manual,
Edition 9.2, International and StatisticalLi-braries, Houston,
TXKnott
SA, Haley CS, Thompson
R(1992)
Methods ofsegregation analysis
for animalbreeding
data: acomparison
of power.Heredity
68, 299-311 1Lange K,
Elston RC(1975)
Extensions topedigree analysis
I. Likelihood calculations forsimple
andcomplex pedigrees.
Hum Hered25,
95-105Le
Roy P,
Elsen JM, Knott SA(1989)
Comparison
of four statistical methods for detectionof a
major
gene in a progeny testdesign.
Genet Sel Evol 21, 341-357Nelder
JA,
Mead R(1965)
Asimplex
method for function minimization.Comp
J 7,147-151
Titterington
DM, SmithAFM,
Makov EU(1985)
StatisticalAnalysis of
Finite Mixture Distributions.Wiley
andSons,
NewYork,
NYVan Arendonk