HAL Id: hal-00893889
https://hal.archives-ouvertes.fr/hal-00893889
Submitted on 1 Jan 1991
HAL is a multi-disciplinary open access
archive for the deposit and dissemination of
sci-entific research documents, whether they are
pub-lished or not. The documents may come from
teaching and research institutions in France or
abroad, or from public or private research centers.
L’archive ouverte pluridisciplinaire HAL, est
destinée au dépôt et à la diffusion de documents
scientifiques de niveau recherche, publiés ou non,
émanant des établissements d’enseignement et de
recherche français ou étrangers, des laboratoires
publics ou privés.
Estimation of relatedness in natural populations using
highly polymorphic genetic markers
P Capy, Jfy Brookfield
To cite this version:
Original
article
Estimation
of
relatedness
in natural
populations
using highly polymorphic
genetic
markers
P Capy
JFY Brookfield
2
1Centre National de la Recherche
Scientifique
Laboratoire deBiologie
etGénétique
Evolutives91198
Gifsur
YvetteCedex, France;
2
University of Nottingham, Department of Genetics,
Queen’s
MedicalCenter, Nottingham
NG72UH,
UK(Received
7Nlay 1990; accepted
2August 1991)
Summary - This report addresses 3 important
questions
inpopulation biology: 1),
Is itpossible
to determine the actualkinship
between individuals taken at random froma natural
population?
2),
Is itpossible
to estimate an averagedegree
of kinship in apopulation
in terms of theprobability
that 2 individuals drawn at random are related?3),
Is itpossible
to estimate apopulation’s family
structure in terms of the number and the relative size of the different families? To answer thesequestions
the estimation ofkinship
between 2 individuals is first considered. To do this, identityprobabilities,
basedupon 2 sets of assumptions
concerning
thegenetic
markersused,
were derived for differentcases of
kinship.
The use of VNTRs(variable
number of tandemrepeats)
shows that formultilocus
probes,
all distributions ofidentity broadly overlap
even when the number ofloci is about 20. Therefore
by
VNTRs alone, it is difficult to define the truekinship
between 2 individuals whenonly
their DNAfingerprints
arecompared.
More accurate estimationscan be achieved with monolocus
probes.
However, to estimate apopulation’s
structure orthe average
degree
ofkinship
betweenindividuals,
it is not necessary toidentify precisely
each individualsampled,
but rather,only
to determine whether individuals are related ornot. For this, it is necessary to define a threshold
identity
value whichdepends
on thecommon patterns that can be observed between unrelated individuals. Below this
value,
individuals are considered to be unrelated
and,
aboveit, they
are considered to be related.Finally,
asequential sampling procedure
isproposed.
natural populations
/
relatedness/
genetic marker/
multilocus probes/
monolocusprobes
*
Résumé — Estimation de la parenté au sein des populations naturelles à l’aide de marqueurs génétiques hautement polymorphes. Peut-on déterminer les liens de
parenté
entre 2 individus pris au hasard dans unepopulation
naturelle ? Peut-on estimer laparenté
moyenne, c’est-à-dire laprobabilité
de tirer au hasard 2 individusapparentés,
au sein d’unepopulation
naturelle ? Ou bien encore peut-on déterminer la structure d’unepopulation,
à savoir le nombre et la taille relative desdifférentes familles qui
lacomposent ?
#Pour
répondre
à cesquestions,
l’estimation de laparenté
entre 2 individus a été tout d’abordenvisagée.
Apartir
de 2 sériesd’hypothèses
relatives aux marqueursgénétiques
utilisés,
lesprobabilités
d’identité entre 2 individus ont étédéfinies
pour des liens deparenté simples. L’application
de ces 2 modèles aux VNTR montre que pour les sondesmultilocus, les distributions des
probabilités
d’identité se recouvrent trèslargement,
mêmelorsqu’une
vingtaine de locus sont détectés. Parconséquent,
il estdifficile,
voireimpossible,
de déterminer
précisément
laparenté
entre 2 individus en se basant exclusivement sur ce type de données. Par contre, l’utilisation simultanée deplusieurs
sondes monolocuspermet d’obtenir des estimations
plus précises.
Pour estimer la structure d’unepopulation
ou laparenté
moyenne entre individus, il n’est pas nécessaired’identifier précisément
chaque
individu, mais uniquement de déterminer si 2 individus sontapparentés
ou non. Pourcela,
un seuil d’identité estdéfini en fonction
des valeurs d’identité observées entreindividus non
apparentés.
Endeçà
de cette valeurseuil,
les individus ne sont pas considérés commeapparentés
etau-delà,
il est admisqu’ils
lesont. Enfin,
uneprocédure séquentielle
d’échantillonnage
estproposée.
population naturelle
/
relation de parenté / marqueur génétique/
sonde multilocus/
sonde monolocusINTRODUCTION
In
population
genetics
manyproblems
of naturalpopulations
cannot be solved without a betterknowledge
of thekinship
structure atpresent
and in a smallnumber of
generations
in the recentpast.
The effective size of thepopulation,
itsnumber of founders and the
possible
existence of groups of related individuals maybe of
great
importance,
but it isusually
very difficult to obtain such data or evento make accurate estimates.
For
instance,
inDrosophila melanogaster, analyses
of enzymepolymorphism
often show a deficit inheterozygotes
in naturalpopulations.
TheWright
fixation index(Fis)
can reach 0.6-0.7(Danielli
andCosta, 1977;
David etal, 1989;
Vouidibioet
al,
1989).
Severalhypotheses
arefrequently proposed
toexplain
such results: selectionagainst
heterozygotes,
inbreeding,
and/or
themixing
ofpopulations
with different allelic
frequencies
(Wahlund effect).
However,
it remains difficult todetermine the relative
importance
of each process.Indeed,
inDrosophila
species,
it is almostimpossible
to estimate thesize,
thegeographical
limits and thekinship
structure
(number
of groups of related individuals orfamilies)
of apopulation.
During
the last few years, newtechniques
have beendeveloped
for estimatesof relatedness between two individuals chosen from a natural
population.
Thesetechniques
rest upon the detection ofhighly polymorphic
DNA sequences, suchthe main
problem
lies infinding
ahighly polymorphic
system
or a combination ofsystems.
Theprincipal
characteristic of thesesystems
must allow thedefinition,
foreach
individual,
of a&dquo;genetic identity card&dquo;,
or afingerprint,
sufficiently
accurateto avoid 2 unrelated individuals
possessing
the samepattern.
Such
genetic
systems exist in numerous vertebrates. Oneexample
is themajor
histocompatibility complex
(Dausset,
1958;
Vaiman, 1970; Klein,
1987)
which determinestransplant rejection.
Thissystem
consists of 4loci,
having
an average of10-20 alleles.
However,
in several naturalpopulations,
strong
linkage disequilibria
are found
(Dausset
andSvejgaard,
1977).
Thus,
theprobability
that unrelatedindividuals possess the same
haplotype
can behigh.
For
invertebrates, only enzymatic
data arepresently
available.However,
thesetechniques
do not detect many alleles. Forinstance,
inDrosophila melanogaster,
theAmylase
locus hasapproximately
13 described alleles(Dainou
etal,
1987)
and is among the mosthighly
polymorphic
loci. For other enzymes such as Esterase-6 andXanthine
dehydrogenase,
it is oftenpossible
to detect many morealleles,
ie between20 and 30
alleles,
whenelectrophoresis
conditions like bufferpH
orgel
concentration are modified(Coyne,
1976;
Singh
etal,
1976;
Modiano etal,
1979;
Ramshaw etal, 1979;
Singh,
1979; Keith,
1983).
However,
thegeographical
distribution of the alleles is nothomogeneous
and it is rare for all the alleles to exist in asingle
region.
In other
words,
at agiven place,
unrelated individuals may have similargenotypes.
Moreover,
thisdisadvantage
is reinforcedby
the factthat,
in agiven population,
the allele
frequencies
are far from uniform withgenerally
1 or 2frequent
alleles andseveral alleles at low
frequencies.
Such
problems
can bepartially
avoided when severalenzymatic
loci areconsid-ered
together.
This solution hasalready
beenproposed
forpaternity
determination(Chakraborty
etal,
1988),
for estimates of relatedness between colonies of social insects(Pamilo
andCrozier,
1982; Pamilo, 1984;
Queller
etal, 1988; Queller
andGoodnight,
1989)
and between individuals in vertebrates(Schartz
andArmitage,
1983;
Wilkinson andMcCraken,
1985).
However,
theseprocedures
are notalways
suitable when the social structures of
species
are unknown or not accessible.Recently,
severalgenetic
systems,
such astransposable
elements or minisatellitesand more
generally
RFLPs(Restriction
Fragment Length
Polymorphisms)
haveprovided
new ways ofestimating
thekinship
between individuals and ofanalysing
the structure of relatedness
(number
of groups of relatedindividuals)
in naturalpopulations.
However,
suchsystems
as minisatellites may still not be accurateenough,
and several authors havealready
stressed the limits of theseapproaches
for theanalysis
of naturalpopulations
(Lynch,
1988; Brookfield, 1989; Lewin,
1989).
The first aim of the
present
work is to evaluate the difficulties inestimating
the kinrelationship
between 2 individualsaccurately
when differentparameters
of a naturalpopulation,
such as the social structure, themating
system,
theage-classes,
thegeneration
turnover, and the existence ofoverlapping generations
amongothers,
are unknown. After a briefpresentation
of the basic model and a means ofmeasuring
thedegree
ofidentity
between 2individuals,
the distributions ofidentity
probabilities
between 2 individuals(using
two sets ofassumptions
concerning
thegenetic
systemsused)
will bepresented
for different kinrelationship. Then,
theirestimation of
kinship
structure,ie,
the number and the size of groups of relatedindividuals,
and on the estimation of an averagekinship level,
ie theprobability
that 2 individuals drawn at random are
related,
in apopulation
of unknownkinship
structure. A
sampling procedure
based upon the modelproposed
by
Rouault andCapy
(1986)
andby Capy
and Rouault(1987)
will beproposed.
MATERIALS AND METHODS
Basic model and
identity
between 2 individualsEach individual is defined
by
a set of bands obtained afterdigestion
by
a restrictionendonuclease(s)
of totalDNA, hybridisation
with a marked nucleic acidprobe
and
autoradiography.
Theresulting
set of bandscorresponds
to the individual’sfingerprint
and thesegregation
of each band is Mendelian.Identity
between 2 individuals can be calculated from the number of sharedbands;
these bandsbeing
identicalby
state orby
descent(Lynch, 1988).
Theexpression proposed by
Nei and Li(1979)
will be used. Inthis,
theidentity
betweena and b is:
where na and nbare the number of bands of individuals a and
b,
and nab the numberof bands shared
by
a and b. Thisexpression,
whichcorresponds
to theproportion
of bands shared between 2
individuals,
varies from 0(if
a and b have no commonbands)
to 1(if
a and b share all theirbands).
Identity
and relatednessIn the
previous
definition,
the value ofidentity
increases with the relatedness of individuals. Table Igives
some values ofidentity
for commonkinship.
For allsituations
given
in thistable,
it is assumed thatparents
in Go do not share any band and areheterozygous
at all their loci. In theseconditions,
for asingle locus,
the
comparison
between full sibs leads to the definition of 3 classes ofidentity 0,
1/2
and 1 with therespective
probabilities
4/16, 8/16
and4/16.
For thecomparison
betweenoffspring
of a bacl:cross,
4 classes ofidentity
exist0,
1/2, 2/3
and 1 with therespective
probabilities
2/16,
6/16, 4/16
and4/16.
From theseexamples,
it is clear that for agiven
averageidentity,
several kinrelationships
may exist. Forinstance,
the
expected
values ofidentity
betweenparent/offspring
and between full-sibs areidentical
(I
=50%).
The samephenomenon
is observed for theexpected
identitiesbetween F2 individuals
(offspring
ofFlxF1)
or betweenoffspring
of a backcross(I = 60.42%).
This result is more conclusive when the distributions ofidentity
areRESULTS
Expressions
and distributionsof identity probabilities
Two
simple
models will beconsidered,
each of themcorresponding
to 2 differentgenetic
markers and 2 levels ofpolymorphism
detection. As discussion will be interms of the
application
toVNTI3s,
model I is related to a monolocus system and model II to a multilocus system. In both cases, tosimplify
thepresentation,
theexistence of an
identity by
state will beneglected. Expressions
for theprobabilities
and distributions ofidentity
will begiven
for 4kinships
ieparent/offspring,
full-sibs,
half-sibs and unrelated individuals.Furthermore,
the distribution ofidentity
between Fl individuals of a
population,
foundedby
4 unrelated individuals(2
malesand 2
females),
will be calculated.Finally,
in the secondmodel,
to illustratethe
problem posed by overlapping generations,
identities for 4 otherkinships
(grandparent/grandchildren, uncle/nephew,
cousins and doublecousins)
will be defined.Model I
This model
corresponds
to an idealized situation. It is assumed that:1),
all locipresent
in a genome, for agiven probe,
aredetected;
2),
all individuals have the same number of loci(
T
i)
and all loci areheterozygous
(so
that all individuals have2n
bands); 3),
2 unrelated individuals do not share any bands.Under this
model,
theprobability
that 2 individuals share i bandsaccording
totheir
kinship,
is:Parent/offspring
(po):
where
CL
is the number of combinations of i bands among 2nbands;
Half-sibs
(hs):
-Unrelated individuals
(nr):
The
probability
ofsharing
i bands if the 2 individuals(a
andb)
compared
arederived from the first
generation
of apopulation
foundedby
F females and Mmales,
isgiven by:
where
P0,
P1 and P2 are theprobabilities
ofdrawing
2 individuals that are,respectively, unrelated,
half-sibs and full-sibs from thepopulation. Assuming
that all females and all males have the sameexpected
number ofoffspring,
the values of theseprobabilities
are :In these
expressions
it is assumed that agiven
female can be inseminatedby
several males and a
given
male can inseminate several females. WhenF/M
matesper males
exist,
ie monogamy when F =M,
theseprobabilities
become:According
to thismodel,
therelationship
betweenidentity
(I)
and the number of shared bands(i)
is:&dquo;&dquo;
Model II
In this second
model,
it is assumed that:1),
the number of bands per individual isnot constant;
2),
not all loci aredetected;
3),
only
one band per locus isdetected,
ie there are no allelic bands in the
fingerprint
of agiven individual;
4),
all loci areheterozygous; 5)
2 unrelated individuals do not share anybands;
6),
the number of bands per individual follows a Poisson distribution with a mean of n.Under these
conditions,
theprobability
that 2 individuals share i bandsaccording
to their
kin-relationship,
is:where
P!!!
is theprobability
that aparent
hasexactly
i bands,
e isexponential,
and wherej
max
is thehighest
possible
valueof j,
ie,
the maximum number of bands for an individual. Theprobability
P(
j
)
isgiven by:
Full-sibs
(fs):
Half-sibs
(hs):
Grandparent-grandchildren
(pc), uncle/nephew (un),
double-cousins(dc):
Cousins
(co):
Unrelated individuals
(n
T
):
Finally,
if 2 individuals are taken at random in the F1generation
of apopulation
founded
by
F females and Allmales,
theprobability
thatthey
share i bands isgiven
by expression
(4). Otherwise, according
to thismodel,
therelationship
betweenidentity
(I)
and the number of shared bands(i)
is:Figure
1gives
the theoretical distributions of identities for the 2 models and forthe first 4
kinship
relations described here. It has been assumed thatexactly
10 loci(ie
exactly
20 bands per individualaccording
to the modelI)
or an average of 10 loci(ie
about 10 bands per individual in the modelII)
can be detected. It can be seenfirstly,
that the distributions of full-sibs and of half-sibs aresymmetrical
in modeldifficult to discriminate between the distributions of half-sibs and full-sibs in the Fl progeny of a
simple population
(see
fig
ID).
When successive
generations overlap,
it becomes more and more difficult toestimate the true
kinship
between 2 individuals.Indeed,
the distributions ofdouble-cousins must all be considered. Several of these distributions have the same average
identity.
An illustration of this lastproblem
isgiven by
theanalysis
of asimple
hypo-thetical
genealogy
of 3 successivegenerations
(fig 3).
In this case, 6 unrelatedpairs
ofgrandparents
represent the firstgeneration.
Thesepairs
eachproduce
between 1and 4 children. These children
(a
total of 15individuals)
form the secondgenera-tion. The third
generation
iscomposed
of theoffspring
(a
total of 16individuals)
of thecouples
in the secondgeneration.
In thisgenealogy,
8 kinds orrelationship
exist and their relativeproportions
aregiven
in table II.Finally,
figure
4presents
thedistributions of identities
according
to model II. Most ot the distributionsoverlap,
making
it difficult to determine the exact kinrelationship
between 2 individuals. Forinstance,
for anidentity
of0.25,
the 2 individualscompared
can be: full sibs(3.12%),
half sibs(2.25%),
uncle/nephew
(35%),
parent/offspring
(3.75%),
grand-parent/grand
children(43.75%),
first cousins(8.75%),
double cousins(3.38%).
Application
to VNTR locipointing
out the difficulties inestimating
the relatedness between 2 individuals taken at random in apopulation
of unknown structure.The 2
systems
ofprobes
allow one to detecthighly polymorphic
loci for which the mutation rate can be close to1/100
pergeneration
and pergamete
(Burke, 1989).
Thus,
thepolymorphism
(number of alleles)
at agiven
locus should be muchgreater
than that
generally
observed for anenzymatic
locus. Inspite
of thisproperty,
theestimation of the true
genetic
relationship
between 2 individuals remains hazardous with multilocusprobes,
but seems more accurate with monolocusprobes.
Theprimary
advantages
of monolocusprobes
are that:1),
the number of loci isknown;
and
2),
thehomozygous
andheterozygous
states at a locus can be defined for agiven probe
(see
forexample
Nakamura etal,
1987).
As
regards
theseadvantages,
it appears that modelI,
which was not realistic withrespect
to multilocusprobes,
becomes more valid for monolocusprobes. Indeed,
inthis context, if n monolocus
probes
are usedsimultaneously,
each individual will bedefined
by
a number of bandslying
between n and2n,
and at least50%
of these bands will be transmitted to itsoffspring
(table III).
To
improve
modelI,
hypothesis
2 can bechanged,
insofar as it is not necessary to consider that all loci areheterozygous.
This isparticularly
important
in smallThus,
for n monolocusprobes,
agiven
individual(a)
willpresent
na bands withn < na < 2n. The number of
homozygous
loci will be HO = 2n - na. In theseconditions,
theexpressions
ofidentity probabilities
are identical to thosegiven
inheterozygous
loci.Thus,
if HO represents the average number ofhomozygous
lociper individual in a
given population, expressions
2 and 3 become:Full-sibs
( f s):
Half-sibs
(hs):
In these
conditions,
the total number of shared bands HO + i will be associatedwith the above
probabilities
Pfs!i!
orP
hs
(
i
)
’
Theoverlapping
proportion,
betweenthe
identity
distributions of these 2 kinrelationships,
will be related to the number ofheterozygous
loci in theirparents.
Thegreater
thisnumber,
the more the 2distributions will
overlap.
Estimation of the average
degree of kinship
andof kinship
structureThe
previous
models aresimple
cases with some unrealisticassumptions.
Oneassumption
is that 2 unrelated individuals do not share any bands.Indeed,
Wetton et al(1987)
and DT Parkin(personal communication)
haveshown, using
minisatellite sequences, that unrelated birds may share between 10 and25%
of theirbands,
which areprobably
identical in state and notby
descent. For minisatelliteprofiles,
thisidentity
can be due toelectrophoretic comigration,
especially
in theupper
part
of thegel
(Lynch, 1988).
Two other unrealisticassumptions
are that allloci detected are
heterozygous
and that in afingerprint
there are no allelic bands.For
instance,
several allelic bands were found in thefingerprint
analysis
of humanfamilies
(Jeffreys
etal,
1985)
indogs
and cats(Jeffreys
andMorton,
1987),
and inbirds
(Burke
andBruford,
1987).
Therefore,
a more realistic model should consider:1),
the number of bands variesfrom one individual to
another;
2),
there arehomozygous
loci andpairs
of allelicbands in the
fingerprint
of anindividual;
3),
2 unrelated individuals may share similar bands identicalby
state. Under theseassumptions,
it is obvious that anaccurate estimate of
kinship
between 2 individuals will be even more difficult. This results from the increase in theoverlapping proportion
of the different distributions ofidentity, mainly
due toidentity by
state.However,
with monolocusprobes
itseems
possible
to choose asample
ofprobes
which avoid or minimize these obstacles.In
population genetics,
andespecially
in theanalysis
of naturalpopulation
structure,
the aim is notalways
toget
accurate estimates ofkinship
betweendifferent individuals
(Gilbert
etal, 1990;
Kuhnlein etal,
1990).
In most cases, thepurpose is the estimation of the
kinship
structure.Therefore,
it isonly
necessary todetermine whether individuals
belong
to the samefamily
or not. On the otherhand,
anidentity
in state mayexist,
meaning
that 2 unrelated individuals may share someof their bands. In this
situation,
it becomes necessary to define a threshold valueor not. Below this
value,
it will beimpossible
to determine if two individuals aredirectly
related or share a recent common ancestor, and sothey
will be consideredto be
unrelated;
above thisvalue,
it will be considered that akinship
relationexists between these individuals. Of course, the definition of TV
depends
upon thepolymorphism
of thegenetic
system
used and upon thepopulation
understudy.
The more
polymorphic
thegenetic
system
and thepopulation,
the lower the TVwill be.
Estimates of the TV can be obtained
by comparing
known unrelated individuals.For
instance,
in the work of Wetton et al(1987)
onbirds,
the TV could be chosenbetween 0.044 and 0.247
(see
tableI,
p147).
However,
whennothing
is known aboutthe
kinship
structure of thepopulation,
the TV can be defined from theidentity
ofindividuals
belonging
to differentpopulations.
If
only
a fixed TV isdefined,
errors can be made when identities are very closeto the TV. For
instance,
it will bepossible
toclassify
as unrelated some relatedindividuals and to
classify
as related some unrelated individuals.Thus,
it will be more correct to define a zone ofuncertainty
around the TV in which it will be notpossible
to determine whether 2 individuals are related or not. Of course, the TVand the
uncertainty
zone will be definedaccording
to the distribution ofidentity
between unrelated individuals.Moreover,
with thisprocedure, only
individuals whoare
directly
related(ie
parent-offspring,
full-sibs,
grandparent/grandchildren, etc)
will be classed in the same
family;
andaccording
to theTV,
firstcousins,
for whom theexpected identity
is12.5%
could be considered as unrelated.Thus, employing
anappropriate
TVvalue, identity
can indeed be usedjust
to determine whether individuals are related or not. From an
identity matrix,
itis then
possible
to estimate theproportion
ofpairs
of related individuals. Thiscorresponds
to theprobability, Pr,
ofdrawing
at random 2 individuals who sharea common ancestor in the recent
past,
ie in theprevious
1 or 2generations,
or whoare
directly
related.Moreover,
from the sameidentity
matrix,
it is alsopossible
todefine different groups of related individuals or families in order to estimate the
population
structure,
ie the number of families and theirrespective
size.To
get
accurate estimates of Pr and ofpopulation
structure,
asampling
procedure
similar to thatproposed
by
Rouault andCapy
(1986)
andby Capy
and Rouault(1987)
can be used. This is asequential
procedure
based on therelationship
between thesample
size,
theparameter
estimated and confidence intervals ofproportions
and/or
asampling
error. In the first case, theproportion
of
pairs
of related individuals must be estimated. Theprobability
ofobserving
np
pairs
of related individuals in asample
of n individuals follows a binomial.Since a
proportion
(Pr)
must beestimated,
thesampling procedure
will bestopped
when the confidence interval of Pr will beequal
to or below agiven
value fixed apriori
beforesampling.
In the second case, thepopulation
structure will be definedby
the number and the size of the different groups of related individuals.Thus,
the
probability
ofdrawing
ni members of eachfamily
i follows a multinomialdistribution. In this latter case, the
sampling procedure
should bestopped
when theprobability
of thesample
and the confidence interval of eachproportion
(here,
the relativeproportion
of eachfamily)
isequal
to or below theparameters
definedDISCUSSION AND CONCLUSIONS
The above results
complement
those ofLynch
(1988)
and Brookfield(1989)
and indicate the limits of the use ofgenetic
systems
such as minisatellites for theanalysis
of relatedness in natural
populations
(see
alsoLewin,
1989).
Nevertheless,
as has been shown forbirds,
suchsystems
mayprovide
new information tocomplete
orconfirm that obtained
by
othertechniques
(Wetton
etal,
1987;
Burke:1989;
Burkeet
al,
1989).
Withoutpreliminary
data on the structure of thepopulation
(size,
geographical
limit,
age-classes,
etc)
and on the sexualand/or
family
behavior ofindividuals,
it isquite impossible
to estimate the exactkinship
relation between different individuals.However,
ifonly
the relatedness(without
accurate estimatesof the true
kinship
relation)
between individuals isconsidered,
it ispossible
toenvisage
the estimation of an average rate ofkinship
or of apopulation
structure.However,
withgenetic
systems
which show ahigh
mutation rate and for which it isimpossible
to detect thekinship
between individualshaving
anidentity
of10-15%,
the
only
individuals which can be shown to be related will beparent/offspring,
brother-sister,
individuals involved in a backcross or, moregenerally,
individuals ofinbred strains or families. The main
advantage
of the modelproposed
here is thatit is not necessary to
identify
the different alleles and their relativefrequencies.
However,
this can be done for monolocusprobes,
and in this case a method similarto that
proposed by Queller
andGoodnight
(1989)
could be used for estimation of relatedness.In the
present
work,
only
2 kinds ofhypothetical
genetic
systems
have been considered.Among
the differentsystems
already described,
several could be used for such ananalysis.
The main characteristics of a suitable system would be thefollowing:
(1)
each individual has agreat
number of bands(from
10 to30);
(2)
heterozygosity
must behigh;
(3)
the number of bands sharedby
unrelated individuals must be as low aspossible.
With
regard
to the multilocusprobes available,
most of them do not fulfill all these conditions. The number of bands may vary from 2-3 to more than20;
theheterozygosity
and the mutation rate seem to be variable but veryhigh
(in
somecases, ;:
97%
for theheterozygosity
and 0.003 pergamete
for the mutationrate;
Jeffreys
etal,
1988);
but the number of bands shared between unrelated individualsmay be
large
(
z
14%
inbirds;
Wetton etal,
1987).
With the
development
of monolocusprobes,
many inconveniences could be avoided or reduced. Severalprobes
could be usedsimultaneously,
as differentenzymatic
loci,
with theadvantage
that most loci possess ahigh
mutation rateand
probably
a uniform distribution of theirrespective
alleles in agiven population
as well.Moreover,
with suchprobes
it becomespossible
to minimize the level ofidentity
in state between bands of 2 individuals.ACKNOWLEDGMENTS
We thank D
Iirane,
DLHartl,
ALarson,
M Hedin and 2 anonymous referees forhelpful
REFERENCES
Burke T
(1989)
DNAfingerprinting
and other methods for thestudy
ofmating
success. Trends Ecol Evol
5,
139-144Burke
T,
Bruford MW(1987)
DNAfingerprinting
in birds. Nature(Lond)
327,
149-152Burke
T,
DaviesNB,
BrufordMW,
Hatchwell BJ(1989)
Parental care andmating
behaviour
of polyandrous
dunnocks Prunella modularis related topaternity
by
DNAfingerprinting.
Nature(Lond)
338,
249-251Brookfield JF1’
(1989)
Analysis
of DNAfingerprinting
data in cases ofdisputed
paternity.
IMA J MathAppl
MedBiol 6,
11-131Capy
P.
Rouault J(1987)
Estimation of allele number in a naturalpopulation by
the isofemale line method. Genetics
117,
795-801Chakraborty
R, Meagher TR,
Smouse PE(1988)
Parentage analysis
withgenetic
markers in naturalpopulations.
I. Theexpected
proportion
ofoffspring
withunambiguous
paternity.
Genetics118,
527-536Co3-ne
JA(1976)
Lack ofgenetic similarity
between twosibling
species
as revealedby
variedtechniques.
Genetics84,
593-607Dainou
0,
CariouML,
DavidJR,
Hickey
D(1987)
Amylase
geneduplication:
anancestral trait in the
Drosophila melanogaster species
subgroup.
Heredity 59,
245-251Danielli
GA,
Costa R(1977)
Transientequilibrium
at the Est-6 locus in wildpopulation
ofDrosophila melanogaster.
Genetica47,
37-41David
JR,
Alonso-Morraga
A,
Capy
P,
Munoz-SerranoA,
Vouidibio J(1989)
Short range variations and alcohol resources in
Drosophila melanogaster.
Symp
Evolutionary Biology of
Transient UnstablePopulations
(Fontevilla
A,
ed)
Springer-Verlag
Berlin,
130-143Dausset J
(1958)
Iso-leuco-anticorps.
ActaHematol 20,
156Dausset
J,
Svejgaard
A(1977)
HLA and Diseases.Munksgaard,
Copenhagen
Gilbert
DA,
LehmanN,
O’BrienSJ, Wayne
RK(1990)
Geneticfingerprinting
reflectspopulation
differentiation in the California Channel Island fox. Nature(Lond)
344,
764-767Jeffreys AJ,
WilsonV,
Thein SL(1985)
Hypervariable
&dquo;minisatellite&dquo;regions
inhuman DNA. Nature
(Lond)
314, 67-73
Jeffreys
AJ,
Morton DB(1987)
DNAfingerprints
ofdogs
and cats. Anim Genet18,
1-15
Jeffreys
AJ, Royle NJ,
WilsonV, Wong
Z(1988)
Spontaneous
mutation rates tonew
length
alleles attandem-repetitive
hypervariable
loci in human DNA. Nature(Lond)
332,
278-281Keith TP
(1983)
Frequency
distribution of esterase-5 alleles in twopopulations
ofDrosophila pseudoobscura.
Genetics105,
135-155Klein J
(1987)
NaturalHistory of
theMajor
Histocompatibility
Complex.
JohnWiley
and
Sons,
New YorkKuhnlein
U, Zadworny D,
DaweY,
FairfullRW,
Gavora JS(1990)
Assessment ofinbreeding by
DNAfingerprinting:
development
of a calibration curveusing
definedLewin R
(1989)
Limits to DNAfingerprinting.
Science243,
1549-1551Lynch
M(1988)
Estimation of relatednessby
DNAfingerprinting.
Mol BiolEvol 5,
584-599
Modiano
G,
BattistuzziG,
EsanGJF,
TestaV,
Lazzato L(1979)
Genetichet-erogeneity
of &dquo;normal&dquo; humanerythrocyte
glucose-6-phosphate
dehydrogenase:
anelectrophoretic polymorphism.
Proc Natl Acad Sci USA76,
852-856Nakamura
Y, Leppert M,
O’ConnellP,
WolffR,
HolmT,
CulverM,
MartinC,
Fujimoto E,
HoffM,
KumlinE,
White R(1987)
Variable number of tandemrepeat
(VNTR)
markers for human genemapping.
Science235,
1616-1622lVei
M,
Li WH(1979)
Mathematical model forstudying genetic
variation in termsof restriction endonucleases. Proc Natl Acad Sci
USA,
76,
5269-5273Pamilo P
(1984)
Genotypic
correlation andregression
in social groups:multiple
alleles, multiple
loci and subdividedpopulations.
Genetics107,
307-320Pamilo
P,
Crozier RH(1982)
Measuring genetic
relatedness in naturalpopulations:
methodology.
TfieorPopul
Biol 21,
171-193Queller DC,
StrassmannJE,
Hughes
CR(1988)
Genetic relatedness in colonies oftropical
wasps withmultiple
queens. Science242,
1155-1157Queller
DC,
Goodnight
KF(1989)
Estimating
relatednessusing genetic
markers. Evolution43,
258-275Ramshaw
JAM,
Coyne
JA,
Lewontin RC(1979)
Thesensitivity
ofgel
electrophore-sis as a detector of
genetic
variation. Genetics93,
1019-1037Rouault
J,
Capy
P(1986)
Comment 6valuer un nombre decategories
par6chantillonnage.
Ann EconStat 4,
111-124Singh
RS(1979)
Genicheterogeneity
withinelectrophoresis
&dquo;alleles&dquo; and thepattern of variation among loci in
Drosophila
pseudoobscura.
Genetics93,
997-1018Singh RS,
LewontinRC,
Felton AA(1976)
Genicheterogeneity
withinelec-trophoretic
&dquo;alleles&dquo; of xanthinedehydrogenase
inDrosophila pseudoobscura.
Ge-netics
84,
609-629Schwartz
OA,
Armitage
hB(1983)
Problems in the use ofgenetic similarity
toshow relatedness. Evolution
37,
417-420Vaiman
M,
RenardC,
Lafage
P,
AmeteauJ,
Nizzu P(1970)
Evidence for ahistocompatibility
system
inpig. Transplantation
10,
155-164Vouidibio
J,
Capy P, Defaye D,
PlaE,
SandrinJ,
CsinkA,
David JR(1989)
Shortrange
genetic
structure ofDrosophila
melanogaster
populations
in anAfrotropical
urban area, and its
significance.
Proc Natl Acad Sci USA86,
8442-8446Wetton
JH,
CarterRE,
ParkinDT,
Walters D(1987)
Demographic
study
of a wildhouse sparrow
population by
DNAfingerprinting.
Nature(Lond)
327,
147-149
Wilkinson