ANNALES DE L'I. H. P., SECTION B
DAVID ALDOUS
JIM PITMAN
Tree-valued Markov chains derived from Galton-Watson processes
Annales de l'I. H. P., section B, tome 34, n° 5 (1998), p. 637-686
<http://www.numdam.org/item?id=AIHPB_1998__34_5_637_0>
© Gauthier-Villars, 1998, all rights reserved.
Access to the archives of the journal « Annales de l'I. H. P., section B » (http://www.elsevier.com/locate/anihpb) implies agreement with the general conditions of use (http://www.numdam.org/conditions). Any commercial use or systematic printing constitutes a criminal offence. Any copy or printout of this file must contain this copyright notice.
Article digitized as part of the programme Numérisation de documents anciens mathématiques
http://www.numdam.org/
David ALDOUS and Jim PITMAN
Department of Statistics, University of California, 367 Evans Hall # 3860, Berkeley, CA 94720-3860, U.S.A.
E-mail: aldous@stat.berkeley.edu, pitman@stat.berkeley.edu
ABSTRACT. - Let $\mathcal{G}$ be a Galton-Watson tree, and for $0 \le u \le 1$ let $\mathcal{G}_u$ be the subtree of $\mathcal{G}$ obtained by retaining each edge with probability $u$. We study the tree-valued Markov process $(\mathcal{G}_u, 0 \le u \le 1)$ and an analogous process $(\mathcal{G}^*_u, 0 \le u \le 1)$ in which $\mathcal{G}^*_1$ is a critical or subcritical Galton-Watson tree conditioned to be infinite. Results simplify and are further developed in the special case of Poisson offspring distribution. © Elsevier, Paris

Key words: Borel distribution, branching process, conditioning, Galton-Watson process, generalized Poisson distribution, h-transform, pruning, random tree, size-biasing, spinal decomposition, thinning.
RÉSUMÉ. - Let $\mathcal{G}$ be a Galton-Watson tree, and for $0 \le u \le 1$ let $\mathcal{G}_u$ be the tree containing the root obtained from $\mathcal{G}$ by retaining each branch with probability $u$. We study the tree-valued Markov process $(\mathcal{G}_u, 0 \le u \le 1)$ and an analogous process $(\mathcal{G}^*_u, 0 \le u \le 1)$ in which $\mathcal{G}^*_1$ is a critical or subcritical Galton-Watson tree conditioned on the event that it is infinite. The results simplify and are developed in more detail for the special case of a Poisson offspring distribution. © Elsevier, Paris

AMS Subject classifications: 05C05, 60C05, 60J27, 60J80
Research supported in part by N.S.F. Grants DMS9404345 and 9622859
Annales de l'Institut Henri Poincaré - Probabilités et Statistiques - 0246-0203
Contents
1. Introduction
1.1. Related topics
2. Background and technical set-up
2.1. Notation and terminology for trees
2.2. Galton-Watson trees
2.3. Poisson-Galton-Watson trees
2.4. Uniform Random Trees
2.5. Conditioning on non-extinction
3. Pruning Random Trees
3.1. Transition rates
3.2. Pruning a Galton-Watson tree
3.3. Pruning a GW tree conditioned on non-extinction
3.4. The supercritical case
4. The PGW pruning process
4.1. The joint law of ...
4.2. Transition rates and the ascension process
4.3. The distribution ...
4.4. The process $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$
4.5. A representation of the ascension process
4.6. The spinal decomposition of $\mathcal{G}^*_\mu$
4.7. Some distributional identities
4.8. Size-Modified PGW-trees
1. INTRODUCTION

This paper develops some theory for Galton-Watson trees $\mathcal{G}$ (i.e. family trees associated with Galton-Watson branching processes), starting from the following two known facts.

(i) [Lemma 10] For fixed $0 \le u \le 1$ let $\mathcal{G}_u$ be the "pruned" tree obtained by cutting edges of $\mathcal{G}$ (and discarding the attached branch) independently with probability $1 - u$. Then $\mathcal{G}_u$ is another Galton-Watson tree.

(ii) [Proposition 2] For critical or subcritical $\mathcal{G}$ one can define a tree $\mathcal{G}^\infty$ interpretable as $\mathcal{G}$ conditioned on non-extinction. Qualitatively, $\mathcal{G}^\infty$ consists of a single infinite "spine" to which finite subtrees are attached.

We interpret (i) as defining a pruning process $(\mathcal{G}_u, 0 \le u \le 1)$, which is a tree-valued continuous-time inhomogeneous Markov chain such that $\mathcal{G}_0$ is the trivial tree consisting only of the root vertex, and $\mathcal{G}_1 = \mathcal{G}$. An analogous pruning process $(\mathcal{G}^*_u, 0 \le u \le 1)$ with $\mathcal{G}^*_1 = \mathcal{G}^\infty$ is constructed from the conditioned tree $\mathcal{G}^\infty$ of (ii). Section 3 gives a careful description of the transition rates and transition probabilities for these processes. The two processes are qualitatively different, in the following sense. If $\mathcal{G}$ is supercritical then on the event that $\mathcal{G}$ is infinite there is a random ascension time $A$ such that $\mathcal{G}_{A-}$ is finite but $\mathcal{G}_A$ is infinite: the chain "jumps to infinite size" at time $A$. In contrast, the process $(\mathcal{G}^*_u)$ "grows to infinity" at time 1, meaning that $\mathcal{G}^*_u$ is finite for $u < 1$ but $\mathcal{G}^*_1 = \mathcal{G}^*_{1-}$ is infinite. A connection between the two processes is made (Section 3.4) by conditioning $(\mathcal{G}_u, 0 \le u < A)$ on the event that $A$ equals the critical time, i.e. the $\alpha$ for which $\mathcal{G}_\alpha$ has mean offspring equal to 1. By rescaling the time parameter we may take $\alpha = 1$, and the conditioned process is then identified with $(\mathcal{G}^*_u, 0 \le u < 1)$.
These results simplify, and further connections appear, in the special case of Poisson offspring distribution, which is the subject of Section 4. There we consider $(\mathcal{G}_\mu, 0 \le \mu < \infty)$, where $\mathcal{G}_\mu$ is the family tree of the Galton-Watson branching process with Poisson($\mu$) offspring, and the associated pruned conditioned process $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$. To highlight four properties:

• The distribution of $\mathcal{G}^*_\mu$ has several different interpretations as a limit (Section 4.3).

• For fixed $\mu < 1$, the distribution of $\mathcal{G}^*_\mu$ is the distribution of $\mathcal{G}_\mu$, size-biased by the total size of $\mathcal{G}_\mu$ (Section 4.4).

• The process $(\mathcal{G}_\mu)$ run until its ascension time $A > 1$ has a representation in terms of $(\mathcal{G}^*_\mu)$ as (Section 4.5)
$$(\mathcal{G}_\mu, 0 \le \mu < A) \stackrel{d}{=} (\mathcal{G}^*_{\mu U}, 0 \le \mu < U^{-1})$$
where $U$ is uniform(0,1), independent of $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$.

• In constructing $\mathcal{G}^*_\mu$ by pruning $\mathcal{G}^*_1$, a certain vertex becomes distinguished, i.e. the highest vertex of the spine of $\mathcal{G}^*_1$ retained in $\mathcal{G}^*_\mu$. This vertex turns out to be distributed uniformly on $\mathcal{G}^*_\mu$, and a simple spinal decomposition of $\mathcal{G}^*_\mu$ into independent tree components is obtained by cutting the edges of $\mathcal{G}^*_\mu$ along the path from the root to the distinguished vertex (Section 4.6).
Other topics include consequent distributional identities relating Borel and size-biased Borel distributions (Sections 4.5 and 4.7) and the interpretation of trees conditioned to be infinite as explicit Doob h-transforms, with the related identification of the Martin boundary of $(\mu, \mathcal{G}_\mu)$ (Section 4.4).

None of the individual results is especially hard; the length of the paper is due partly to our development of a precise formalism for writing rigorous proofs of such results. Section 2 contains this formalism and discussion of known results.

1.1. Related topics
Of course, branching processes form a classical part of probability theory. Various "probability on trees" topics of contemporary interest are treated in the forthcoming monograph by Lyons and Peres [31], which explores several aspects of Galton-Watson trees but touches only tangentially on the specific topics of this paper.

Our motivation came from the following considerations, which will be elaborated in a more wide-ranging but less detailed companion paper [1]. Suppose that for each $N$ there is a Markov chain taking values in the set of forests on $N$ vertices. Looking at the tree containing a given vertex gives a tree-valued process, and taking $N \to \infty$ limits may give a tree-valued Markov chain. The prototype example (not exactly forest-valued, of course) is the random graph process $G(N, P(\mathrm{edge}) = \mu/N)$, for which the limit tree-valued Markov chain is our pruned Poisson-Galton-Watson process $(\mathcal{G}_\mu, 0 \le \mu < \infty)$ [2]. The pruned conditioned Poisson-Galton-Watson process $(\mathcal{G}^*_\mu, 0 \le \mu \le 1)$ arises in a more subtle way as a limit of the Marcus-Lushnikov (discrete coalescent) process with additive kernel (see [7] for background on the general Marcus-Lushnikov process and [42], [43], [44], [38] for recent results on the additive case). More exotic variations of $(\mathcal{G}_\mu)$, e.g. a stationary Markov process in which branches grow and are cut down upon becoming infinite, arise as other $N \to \infty$ limits and are studied in [1]. Finally, we remark that the unconditioned and conditioned critical Poisson-Galton-Watson distributions arise as $N \to \infty$ limits in several other contexts (as "fringes" in random tree models [4], in particular in random spanning trees [3], [37]; in the Wright-Fisher model) where there is no natural pruning structure.

2. BACKGROUND AND TECHNICAL SET-UP

Here we set up our general notation for random trees, and present some background material about Galton-Watson trees.
2.1. Notation and terminology for trees

Except where otherwise indicated, by a tree $t$ we mean a rooted labeled tree, that is a set $V = \mathrm{verts}(t)$, called the set of vertices or labels of $t$, equipped with a directed edge relation $\stackrel{t}{\to}$ such that for some (obviously unique) element $\mathrm{root}(t) \in V$ there is for each vertex $v \in V$ a unique path from the root to $v$, that is a finite sequence of vertices $(v_0 = \mathrm{root}(t), v_1, \ldots, v_h = v)$ such that $v_{i-1} \stackrel{t}{\to} v_i$ for each $1 \le i \le h$. Then $h = h(v, t)$ is the height of vertex $v$ in the tree $t$. Formally, $t$ is identified by its vertex set $V$ and its set of directed edges, that is the set $\{(v, w) \in V \times V : v \stackrel{t}{\to} w\}$. If a subset $S$ of $\mathrm{verts}(t)$ is such that the restriction of the relation $\stackrel{t}{\to}$ to $S \times S$ defines a tree $s$ with $\mathrm{verts}(s) = S$, then either $S$ or $s$ may be called a subtree of $t$. Let $\#V \in \{0, 1, 2, \ldots, \infty\}$ be the number of elements of a set $V$, and for a tree $t$ let $\#t = \#\mathrm{verts}(t)$. The number of edges of a tree $t$ is $\#t - 1$. For a tree $t$ and $v \in \mathrm{verts}(t)$ let
$$\mathrm{children}(v, t) := \{w \in \mathrm{verts}(t) : v \stackrel{t}{\to} w\}$$
denote the set of children of $v$ in $t$, and let $c_v t := \#\mathrm{children}(v, t)$, the number of children of $v$ in $t$. Each non-root vertex $w$ of $t$ is a child of some unique vertex $v$ of $t$, say $v = \mathrm{parent}(w, t)$.

Let $s$ and $t$ be two trees. Call $s$ a relabeling of $t$ if there exists a bijection $\ell : \mathrm{verts}(s) \to \mathrm{verts}(t)$ such that $v \stackrel{s}{\to} w$ if and only if $\ell(v) \stackrel{t}{\to} \ell(w)$. Then $\mathrm{root}(t) = \ell(\mathrm{root}(s))$ and $h(v, s) = h(\ell(v), t)$.
In the discussion above there was no notion of "birth order" for children. To incorporate this notion, for $n \in \mathbb{N} := \{1, 2, \ldots\}$ let $\mathbf{T}_n$ denote the set of family trees (also called rooted ordered trees or planted plane trees [14], [46]) with $n$ vertices. Figure 1 illustrates the 5 trees comprising $\mathbf{T}_4$. We interpret an element $t$ of $\mathbf{T} := \cup_n \mathbf{T}_n$ as a finite family tree with the root representing a single progenitor, and each vertex of the tree representing an individual of the family. Then for $g = 0, 1, 2, \ldots$ each vertex at height $g$ corresponds to an individual in the $g$th generation of the family. While the graphical representation of $t$ in Figure 1 involves no explicit labeling of the vertices $v$ of $t$, we identify an individual in the $g$th generation of $t$ as a sequence of $g$ integers, for instance $(2, 7, 4)$ to indicate a third generation individual who is the 4th child of the 7th child of the 2nd child of the progenitor. Thus, following Harris [22], §VI.2 and Kesten [25], we identify each $t \in \mathbf{T}^{(\infty)}$ as a rooted labeled tree with $\mathrm{verts}(t) \subseteq \mathbf{V}$, where $\mathbf{V} := \cup_{g=1}^{\infty} \mathbb{N}^g \cup \{\emptyset\}$ is the set of all finite sequences of positive integers, together with a root element denoted $\emptyset$. Regard $\mathbf{V}$ as a rooted tree, with $v \to w$ if and only if $w = (v, j)$ for some $j \in \mathbb{N}$, where $(v, j) \in \mathbb{N}^{g+1}$ is defined by appending $j$ to $v \in \mathbb{N}^g$ for $g \ge 1$, and where $(\emptyset, j) = j \in \mathbb{N}$. If $w = (v, j)$ call $w$ the $j$th child of $v$, call $j$ the rank of $w$, and write $j = \mathrm{rank}(w)$. So for each non-root $w \in \mathbf{V}$, the positive integer $\mathrm{rank}(w) \in \mathbb{N}$ is the last component of the finite sequence $w$.

A (finite or infinite) family tree is a subtree $t$ of $\mathbf{V}$ such that $\emptyset \in \mathrm{verts}(t)$, and for each $v \in \mathrm{verts}(t)$ the set of children of $v$ is the set $\{(v, j) : 1 \le j \le c_v t\}$ where $c_v t$ is required to be finite. Let $\mathbf{T}^{(\infty)}$ denote the set of all such family trees $t$. Each $t \in \mathbf{T}^{(\infty)}$ is a subtree of $\mathbf{V}$, whose edge relation $\stackrel{t}{\to}$ on $\mathrm{verts}(t)$ is defined by restriction of the edge relation $\to$ on $\mathbf{V}$. A family tree $t$ is therefore uniquely identified by its vertex set $\mathrm{verts}(t) \subseteq \mathbf{V}$, and it is convenient to identify $t$ with $\mathrm{verts}(t)$. Thus $\mathbf{T}^{(\infty)}$ is identified as a collection of subsets of $\mathbf{V}$ subject to certain constraints indicated above, and we may write for instance $v \in t$ instead of $v \in \mathrm{verts}(t)$ to indicate that $v$ is a vertex of $t$. From the definitions above
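The height of a finite tree is the maximum height of all vertices in the tree. There is a natural restriction map $r_h : \mathbf{T}^{(\infty)} \to \mathbf{T}^{(h)}$ where $\mathbf{T}^{(h)}$ is the set of finite family trees of height at most $h$. For $t$ identified as a subset of $\mathbf{V}$, $r_h t$ is the tree formed by all vertices of $t$ of height at most $h$. A tree $t \in \mathbf{T}^{(\infty)}$ is identified by the sequence $(r_h t, h \ge 0)$. Note that the $r_h t \in \mathbf{T}^{(h)}$ are subject only to the consistency condition that $r_h t = r_h(r_{h+1} t)$. The set $\mathbf{T}^{(\infty)}$ is now identified as a subset of an infinite product of countable sets
$$\mathbf{T}^{(\infty)} \subseteq \mathbf{T}^{(0)} \times \mathbf{T}^{(1)} \times \mathbf{T}^{(2)} \times \cdots$$
We give $\mathbf{T}^{(\infty)}$ the topology derived by this identification from the product of discrete topologies on the $\mathbf{T}^{(h)}$. So a sequence of trees $t_n$ has a limit $\lim_n t_n = t \in \mathbf{T}^{(\infty)}$ iff for every $h$ there exist $t^{(h)} \in \mathbf{T}^{(h)}$ and $n(h)$ such that $r_h t_n = t^{(h)}$ for all $n \ge n(h)$; the limit is then the unique $t \in \mathbf{T}^{(\infty)}$ with $r_h t = t^{(h)}$. In particular, for each $t \in \mathbf{T}^{(\infty)}$ the sequence $r_n t$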
has limit $t$ as $n \to \infty$.

Let $s$ be a tree whose vertex set $V$ is a subset of some set $S$ equipped with a total ordering. In particular, we have in mind the cases $S = \mathbb{N}$ with the usual ordering, and $S = \mathbf{V}$ with lexicographical ordering and $\emptyset$ as least element. Suppose each $v \in \mathrm{verts}(s)$ has only a finite number of children. Then there is a natural relabeling $\ell$ of the vertices of $s$ by $\mathbf{V}$ which defines a family tree $t$ associated with $s$, say $t = \mathrm{fam}(s)$. The relabeling $\ell : \mathrm{verts}(s) \to \mathbf{V}$ is defined as follows. Let $\ell(\mathrm{root}(s)) = \emptyset$. For each non-root vertex $v$ of $s$ let $\mathrm{rank}(v)$ be the rank of $v$ amongst its siblings, that is the number of $w \in \mathrm{verts}(s)$ such that $w$ has the same parent as $v$ and $w \le v$. For $v$ with height $h \ge 1$ in $s$ let $(\mathrm{root}(s), v_1, \ldots, v_h = v)$ be the path from $\mathrm{root}(s)$ to $v$ in $s$, and define
$$\ell(v) := (\mathrm{rank}(v_1), \ldots, \mathrm{rank}(v_h)).$$
The tree $t = \mathrm{fam}(s) \in \mathbf{T}^{(\infty)}$ is the subtree of $\mathbf{V}$ whose vertex set is $\ell(\mathrm{verts}(s))$, the range of the relabeling map $\ell : \mathrm{verts}(s) \to \mathbf{V}$.

For $t \in \mathbf{T}^{(\infty)}$ and $g \ge 0$, let $\mathrm{gen}(g, t)$ be the $g$th generation of individuals in $t$, in other words the set of vertices of $t$ of height $g$. To illustrate notation, there are the following identities between subsets of the set $\mathbf{V}$: for all $h = 0, 1, \ldots$:

Let $Z_h t := \#\mathrm{gen}(h, t)$, the size of generation $h$ of $t$. The above identities imply

The abbreviation $Z t := Z_1 t$ makes a convenient notation for the number of individuals in the first generation of $t$, that is the number of children of the root $\emptyset$ of $t$. Denote the trivial family tree with the single vertex $\emptyset$ by $\bullet$. Starting from $r_0 t = \bullet$, a family tree $t$ is conveniently specified as the unique tree $t$ such that $r_h t = t^{(h)}$ for all $h$ for some sequence of trees $t^{(h)} \in \mathbf{T}^{(h)}$ determined recursively as follows. Given that $t^{(h)} \in \mathbf{T}^{(h)}$ has been defined, the set of vertices $\mathrm{gen}(h, t) = \mathrm{gen}(h, t^{(h)}) = r_h t - r_{h-1} t$ is determined, hence so is the size $Z_h t = Z_h t^{(h)}$ of this set; for each possible choice of $Z_h t$ non-negative integers $(c_v, v \in \mathrm{gen}(h, t))$, there is a unique $t^{(h+1)} \in \mathbf{T}^{(h+1)}$ such that $r_h t^{(h+1)} = t^{(h)}$ and $c_v t^{(h+1)} = c_v$ for all $v \in \mathrm{gen}(h, t)$. So a unique tree $t \in \mathbf{T}^{(\infty)}$ is determined by specifying for each $h \ge 0$ the way in which these $Z_h t$ non-negative integers are chosen given that $r_h t = t^{(h)}$ for some $t^{(h)} \in \mathbf{T}^{(h)}$.
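The identification of a family tree with a set of finite integer sequences translates directly into code. The following is a minimal sketch, assuming a family tree is stored as a Python set of tuples with the empty tuple as the root; the function names are illustrative and not taken from the paper.

```python
# Family trees encoded as sets of tuples of positive integers.
# The root is the empty tuple (), and (2, 7, 4) is the 4th child of the
# 7th child of the 2nd child of the root. All names are illustrative.

def children_count(t, v):
    """c_v t: the number of children of vertex v in the family tree t."""
    return sum(1 for w in t if len(w) == len(v) + 1 and w[:-1] == v)

def is_family_tree(t):
    """Check that t contains the root and that the children of each
    vertex v are exactly (v, 1), ..., (v, c_v t)."""
    if () not in t:
        return False
    for w in t:
        if w == ():
            continue
        parent, rank = w[:-1], w[-1]
        if parent not in t or rank < 1:
            return False
        if rank > 1 and parent + (rank - 1,) not in t:
            return False
    return True

def restrict(t, h):
    """r_h t: the vertices of t of height at most h."""
    return {v for v in t if len(v) <= h}

def gen(t, g):
    """gen(g, t): the vertices of t of height g."""
    return {v for v in t if len(v) == g}

def fam(parent):
    """fam(s) for a finite rooted tree s given as a dict child -> parent
    (the root maps to None); siblings are ranked by their labels."""
    kids, root = {}, None
    for v, p in parent.items():
        if p is None:
            root = v
        else:
            kids.setdefault(p, []).append(v)
    relabel = {root: ()}
    stack = [root]
    while stack:
        v = stack.pop()
        for rank, w in enumerate(sorted(kids.get(v, [])), start=1):
            relabel[w] = relabel[v] + (rank,)
            stack.append(w)
    return set(relabel.values())
```

For instance, the rooted tree on labels {3, 5, 7, 9} with root 5, children 9 and 3 of the root, and 7 a child of 9, is relabeled by `fam` to the family tree `{(), (1,), (2,), (2, 1)}`, because 3 precedes 9 among the root's children.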
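A random family tree is a random element of $\mathbf{T}^{(\infty)}$, formally specified by its sequence of height restrictions, say $\mathcal{T} = (r_h \mathcal{T}, h = 0, 1, \ldots)$, where each $r_h \mathcal{T}$ is a random variable with values in the countable set $\mathbf{T}^{(h)}$, and $r_h \mathcal{T} = r_h(r_{h+1} \mathcal{T})$ for all $h$. The distribution of $\mathcal{T}$, denoted $\mathrm{dist}(\mathcal{T})$, is then determined by the sequence of distributions of $r_h \mathcal{T}$ for $h \ge 0$. Such a distribution is determined by a specification for each $h \ge 0$ of the joint conditional distribution given $r_h \mathcal{T}$ of the numbers of children $c_v \mathcal{T}$ as $v$ ranges over $\mathrm{gen}(h, r_h \mathcal{T})$.

Define convergence of distributions on $\mathbf{T}^{(\infty)}$ by weak convergence relative to the product of discrete topologies on $\mathbf{T}^{(h)}$. That is, for random family trees $\mathcal{T}_n$, $n = 1, 2, \ldots$ and $\mathcal{T}$, we say that $\mathcal{T}_n$ converges in distribution to $\mathcal{T}$, and write either $\mathcal{T}_n \stackrel{d}{\to} \mathcal{T}$, or $\mathrm{dist}(\mathcal{T}_n) \to \mathrm{dist}(\mathcal{T})$, or $\lim_n \mathrm{dist}(\mathcal{T}_n) = \mathrm{dist}(\mathcal{T})$ if
$$\lim_n P(r_h \mathcal{T}_n = t) = P(r_h \mathcal{T} = t) \quad \text{for all } h \ge 0 \text{ and } t \in \mathbf{T}^{(h)}.$$

2.2. Galton-Watson trees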
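Let $p(\cdot) = (p(0), p(1), \ldots)$ be a probability distribution on the non-negative integers with $p(1) < 1$. Following [22], [25], [34], [35], call a random family tree $\mathcal{G}$ a Galton-Watson (GW) tree with offspring distribution $p(\cdot)$ if the number of children $Z\mathcal{G}$ of the root has distribution $p(\cdot)$:
$$P(Z\mathcal{G} = n) = p(n), \quad n = 0, 1, 2, \ldots$$
and for each $h = 1, 2, \ldots$, conditionally given $r_h \mathcal{G} = t^{(h)}$, the numbers of children $c_v \mathcal{G}$, $v \in \mathrm{gen}(h, t^{(h)})$, are i.i.d. according to $p(\cdot)$. Equivalently, for each $h \ge 1$ the distribution of $r_h \mathcal{G}$ is determined by the formula
$$P(r_h \mathcal{G} = t) = \prod_{v \in r_{h-1} t} p(c_v t), \quad t \in \mathbf{T}^{(h)} \qquad (2)$$
where the product is over all vertices $v$ of $t$ of height at most $h - 1$. The restriction of the distribution of $\mathcal{G}$ on $\mathbf{T}^{(\infty)}$ to the set $\mathbf{T}$ of finite family trees is then given by the formula
$$P(\mathcal{G} = t) = \prod_{v \in t} p(c_v t), \quad t \in \mathbf{T} \qquad (3)$$
where the product is over all vertices $v$ of $t$.

Denote the mean of the offspring distribution of $\mathcal{G}$ by $\mu$:
$$\mu := \sum_n n\, p(n).$$
It is well known that $P(\#\mathcal{G} < \infty) = 1$, or equivalently $P(Z_h \mathcal{G} > 0) \to 0$ as $h \to \infty$, if and only if $\mu \le 1$. So for $\mu \le 1$ the distribution of $\mathcal{G}$ is completely determined by formula (3). Let
$$S_n := X_1 + \cdots + X_n \qquad (4)$$
where the $X_i$ are independent with distribution $p(\cdot)$. Whatever $p(\cdot)$, it is known [15] that the distribution of the total progeny $\#\mathcal{G}$ on the event $(\#\mathcal{G} < \infty)$ is given by the formula
$$P(\#\mathcal{G} = n) = \frac{1}{n} P(S_n = n - 1), \quad n = 1, 2, \ldots \qquad (5)$$
Let $k \ge 1$. Given $Z\mathcal{G} = k$, for $1 \le i \le k$ let $\mathcal{G}_i$ be the subtree of $\mathcal{G}$ formed by the $i$th child of the root and all its descendants, and observe that the associated family trees $\mathrm{fam}(\mathcal{G}_i)$ for $1 \le i \le k$ are i.i.d. copies of $\mathcal{G}$. So
$$(\#\mathcal{G} \mid Z\mathcal{G} = k) \stackrel{d}{=} 1 + \#_1 + \cdots + \#_k \qquad (6)$$
where $\#_1, \#_2, \ldots$ are i.i.d. copies of $\#\mathcal{G}$. For all $k \ge 1$ and $n \ge k$
$$P(\#\mathcal{G} = n + 1 \mid Z\mathcal{G} = k) = P(\#_1 + \cdots + \#_k = n) = \frac{k}{n} P(S_n = n - k) \qquad (7)$$
for $S_n$ as in (4), where the first equality spells out (6), and the second equality generalizes (5). See [15], [27], [39], [47], [48] for derivations of this formula and other interpretations of the distribution of $\#\mathcal{G}$ involving random walks and queues.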
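Formula (5) can be checked exactly for a small offspring distribution. The sketch below is an illustration rather than part of the paper's argument: it assumes an offspring law given as a finite list of `Fraction` values, computes the law of the total progeny from the standard functional equation $T(s) = s\,g(T(s))$ for its generating function, and compares the coefficients with $\frac{1}{n}P(S_n = n-1)$. All function names are hypothetical.

```python
from fractions import Fraction

def poly_mul(a, b, N):
    """Product of two polynomials (coefficient lists), truncated at degree N."""
    out = [Fraction(0)] * (N + 1)
    for i, x in enumerate(a):
        if x == 0:
            continue
        for j, y in enumerate(b):
            if i + j > N:
                break
            out[i + j] += x * y
    return out

def total_progeny_law(p, N):
    """Coefficients of T(s) = E s^{#G} up to s^N, from T = s * g(T).
    Each pass of the outer loop fixes one more coefficient."""
    T = [Fraction(0)] * (N + 1)
    for _ in range(N):
        gT = [Fraction(0)] * (N + 1)                 # g(T) = sum_k p(k) T^k
        power = [Fraction(1)] + [Fraction(0)] * N    # T^0
        for pk in p:
            gT = [x + pk * y for x, y in zip(gT, power)]
            power = poly_mul(power, T, N)
        T = [Fraction(0)] + gT[:N]                   # multiply by s
    return T

def dwass(p, n):
    """(1/n) P(S_n = n - 1), with S_n a sum of n i.i.d. draws from p."""
    law = [Fraction(1)]                              # law of S_0
    for _ in range(n):
        law = poly_mul(law, p, n)
    return law[n - 1] / n

# A subcritical example: p(0) = 1/2, p(1) = 1/4, p(2) = 1/4, mean 3/4.
p = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
T = total_progeny_law(p, 8)
```

With this choice of `p`, the two computations agree term by term for $n = 1, \ldots, 8$; for example both give $P(\#\mathcal{G} = 2) = p(1)p(0) = 1/8$.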
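2.3. Poisson-Galton-Watson trees

For $\mu > 0$ let $\mathcal{G}_\mu$ be a GW tree with the Poisson($\mu$) offspring distribution $p_\mu(n) := e^{-\mu} \mu^n / n!$. Denote the distribution of $\mathcal{G}_\mu$ on $\mathbf{T}^{(\infty)}$ by PGW($\mu$). From (3) and (2)
$$P(\mathcal{G}_\mu = t) = e^{-\mu \#t} \mu^{\#t - 1} \prod_{v \in t} \frac{1}{(c_v t)!}, \quad t \in \mathbf{T},$$
$$P(r_h \mathcal{G}_\mu = t) = e^{-\mu \# r_{h-1} t} \mu^{\#t - 1} \prod_{v \in r_{h-1} t} \frac{1}{(c_v t)!}, \quad t \in \mathbf{T}^{(h)}.$$
In this case, $S_n$ in (4) and (5) has Poisson($n\mu$) distribution. So from (5) the total progeny of a PGW($\mu$) tree has the probability distribution on $\{1, 2, \ldots, \infty\}$ which is known as the Borel($\mu$) distribution [12], [35], [50]:
$$P(\#\mathcal{G}_\mu = n) = \frac{e^{-\mu n} (\mu n)^{n-1}}{n!}, \quad n = 1, 2, \ldots \qquad (10)$$
From (7), the sum of $k$ independent random variables $N_\mu(i)$, each with the Borel($\mu$) distribution (10), has distribution on $\{1, 2, \ldots, \infty\}$ specified by
$$P\Big(\sum_{i=1}^{k} N_\mu(i) = n\Big) = \frac{k}{n} \frac{e^{-\mu n} (\mu n)^{n-k}}{(n-k)!}, \quad n = k, k+1, \ldots \qquad (11)$$
This is the Borel-Tanner distribution [21], [49], [50] with parameters $k$ and $\mu$.

2.4. Uniform Random Trees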
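Let $\mathbf{R}_{[n]}$ be the set of all rooted trees labeled by $[n] := \{1, 2, \ldots, n\}$. For a rooted labeled tree $t$ with $n$ vertices, let $\tilde{t}$ denote the corresponding rooted unlabeled tree. Formally, define $\tilde{t}$ to be the set of all trees $s \in \mathbf{R}_{[n]}$ obtained by some relabeling of the vertices of $t$ by $[n]$. So $\tilde{t} \in \tilde{\mathbf{R}}_{[n]}$ where $\tilde{\mathbf{R}}_{[n]}$ is a set of equivalence classes of elements of $\mathbf{R}_{[n]}$. Let $\mathcal{U}_n$ be a random tree with uniform distribution on $\mathbf{R}_{[n]}$. Aldous [5] observed that for a tree $\tilde{t} \in \tilde{\mathbf{R}}_{[n]}$
$$P(\tilde{\mathcal{G}}_\mu = \tilde{t} \mid \#\mathcal{G}_\mu = n) = P(\tilde{\mathcal{U}}_n = \tilde{t}). \qquad (12)$$
That is, ($\mathcal{G}_\mu$, given $\#\mathcal{G}_\mu = n$) and $\mathcal{U}_n$ induce identical distributions on $\tilde{\mathbf{R}}_{[n]}$ when unlabeled. To put this another way, fix $\mu > 0$ and generate a PGW($\mu$) family tree $\mathcal{G}_\mu$. Given that $\mathrm{verts}(\mathcal{G}_\mu) = V$ for some set $V$ with $\#V = n$, let $\mathcal{G}^{\sigma}_\mu \in \mathbf{R}_{[n]}$ be $\mathcal{G}_\mu$ relabeled by a uniform random permutation $\sigma : V \to [n]$. Then (12) amounts to:
$$(\mathcal{G}^{\sigma}_\mu \mid \#\mathcal{G}_\mu = n) \stackrel{d}{=} \mathcal{U}_n. \qquad (13)$$
Call a function $\Psi$ of a rooted labeled tree $t$ an invariant if $\Psi(t) = \Psi(s)$ whenever $s$ is a relabeling of $t$. For example, the number $Z_h t$ of vertices of $t$ at height $h$ is an invariant. So is the matrix $M(t) = (M_{h,c}(t), h \ge 0, c \ge 0)$ where $M_{h,c}(t)$ is the number of individuals in generation $h$ of $t$ that have $c$ children. The identity (12) can be restated as
$$(\Psi(\mathcal{G}_\mu) \mid \#\mathcal{G}_\mu = n) \stackrel{d}{=} \Psi(\mathcal{U}_n) \quad \text{for every invariant } \Psi. \qquad (14)$$
This identity for $\Psi = M$ and $\Psi = (Z_1, \ldots, Z_n)$ was discovered earlier and exploited by Kolchin [26], [27]. The following proposition records a sharper result, implicit in the discussion of [6] and explicit in [39], which obviously implies all of these identities (12)-(14). We call formula (15) the $\mathcal{U}_n$-representation of $\mathrm{dist}(\mathcal{G}_\mu \mid \#\mathcal{G}_\mu = n)$.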
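The fact that conditioning a PGW($\mu$) tree on its size removes the dependence on $\mu$ can be seen directly from the product formula (3) for small $n$. The following sketch is a hedged illustration, assuming a family tree is represented as the tuple of its subtrees in birth order (a representation chosen here for convenience, not the paper's). It enumerates $\mathbf{T}_n$ and computes the weights $\prod_v 1/(c_v t)!$; the factor $e^{-\mu n}\mu^{n-1}$ is common to all trees of size $n$ and cancels on conditioning.

```python
from fractions import Fraction
from functools import lru_cache
from math import factorial

@lru_cache(maxsize=None)
def forests(n, k):
    """All ordered forests of k family trees with n vertices in total.
    A tree is represented as the tuple of its subtrees, in birth order."""
    if k == 0:
        return ((),) if n == 0 else ()
    out = []
    for m in range(1, n - k + 2):            # size of the first tree
        for first in trees(m):
            for rest in forests(n - m, k - 1):
                out.append((first,) + rest)
    return tuple(out)

@lru_cache(maxsize=None)
def trees(n):
    """All family trees with n vertices (the set T_n)."""
    out = []
    for k in range(0, n):                    # number of children of the root
        out.extend(forests(n - 1, k))
    return tuple(out)

def weight(t):
    """Product over the vertices v of t of 1 / (c_v t)!"""
    w = Fraction(1, factorial(len(t)))
    for sub in t:
        w *= weight(sub)
    return w

def conditional_law(n):
    """Law of a PGW(mu) tree given that it has n vertices. The factor
    exp(-mu n) mu^(n-1) is the same for every tree of size n, so it
    cancels and mu drops out."""
    ws = {t: weight(t) for t in trees(n)}
    total = sum(ws.values())
    return {t: w / total for t, w in ws.items()}
```

The weights over $\mathbf{T}_n$ sum to $n^{n-1}/n!$, which is consistent with the Borel distribution (10), so each tree of size $n$ receives conditional probability $(n!/n^{n-1}) \prod_v 1/(c_v t)!$.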
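PROPOSITION 1. - For each $n = 1, 2, \ldots$ the conditional distribution of a PGW($\mu$) tree $\mathcal{G}_\mu$, given $\#\mathcal{G}_\mu = n$, is the same for all $\mu > 0$, and identical to the distribution of $\mathrm{fam}(\mathcal{U}_n)$. This common distribution is given by
$$P(\mathcal{G}_\mu = t \mid \#\mathcal{G}_\mu = n) = P(\mathrm{fam}(\mathcal{U}_n) = t) = \frac{n!}{n^{n-1}} \prod_{v \in t} \frac{1}{(c_v t)!} \qquad (15)$$
for all $t \in \mathbf{T}$ with $\#t = n$.

2.5. Conditioning on non-extinction

An infinite random tree $\mathcal{G}^\infty$ which we call $\mathcal{G}$ conditioned on non-extinction is derived in the following proposition from a critical or subcritical GW tree $\mathcal{G}$. The probabilistic description of $\mathcal{G}^\infty$ involves the size-biased distribution $p^*$ associated with a probability distribution $p(\cdot)$ on the non-negative integers with mean $\mu \in (0, \infty)$:
$$p^*(n) := \mu^{-1} n\, p(n), \quad n = 0, 1, 2, \ldots \qquad (16)$$
Here is an exact statement of "known result (ii)" in the Introduction.

PROPOSITION 2 (Kesten [25]). - Let $Z_n \mathcal{G} := \#\mathrm{gen}(n, \mathcal{G})$ be the number of individuals in the $n$th generation of a GW tree $\mathcal{G}$ with offspring distribution $p(\cdot)$ such that $p(0) < 1$ and $\mu \le 1$. Then

(i)
$$\mathrm{dist}(\mathcal{G} \mid Z_n \mathcal{G} > 0) \to \mathrm{dist}(\mathcal{G}^\infty) \text{ as } n \to \infty \qquad (17)$$
where $\mathrm{dist}(\mathcal{G}^\infty)$ is the distribution of a random family tree $\mathcal{G}^\infty$ specified by
$$P(r_h \mathcal{G}^\infty = t) = \mu^{-h} (Z_h t)\, P(r_h \mathcal{G} = t) \quad \text{for all } t \in \mathbf{T}^{(h)},\ h \ge 0. \qquad (18)$$

(ii) Almost surely $\mathcal{G}^\infty$ contains a unique infinite path $(\emptyset = V_0, V_1, V_2, \ldots)$ such that $V_{h+1}$ is a child of $V_h$ for every $h = 0, 1, 2, \ldots$.

(iii) For each $h$ the joint distribution of $r_h \mathcal{G}^\infty$ and $V_h$ is given by
$$P(r_h \mathcal{G}^\infty = t, V_h = v) = \mu^{-h} P(r_h \mathcal{G} = t) \quad \text{for } t \in \mathbf{T}^{(h)},\ v \in \mathrm{gen}(h, t). \qquad (19)$$

(iv) The joint distribution of $(V_0, V_1, V_2, \ldots)$ and $\mathcal{G}^\infty$ is determined recursively as follows: for each $h = 0, 1, 2, \ldots$, given $(V_0, V_1, \ldots, V_h)$ and $r_h \mathcal{G}^\infty$, the numbers of children $c_v \mathcal{G}^\infty$ are independent as $v$ ranges over $\mathrm{gen}(h, \mathcal{G}^\infty)$, with distribution $p(\cdot)$ for $v \ne V_h$, and with the size-biased distribution $p^*(\cdot)$ for $v = V_h$; given also the numbers of children $c_v \mathcal{G}^\infty$ for $v \in \mathrm{gen}(h, \mathcal{G}^\infty)$, the vertex $V_{h+1}$ has uniform distribution on the set of children of $V_h$.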
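The recursion in part (iv) is easy to simulate. Below is a minimal sketch, assuming an offspring law with finite support given as a list of floats and vertices encoded as tuples of child ranks with root `()`; the function names and the use of Python's `random` module are illustrative choices, not part of the paper.

```python
import random

def size_biased(p):
    """p*(n) = n p(n) / mu for an offspring law p given as a list."""
    mu = sum(n * pn for n, pn in enumerate(p))
    return [n * pn / mu for n, pn in enumerate(p)]

def sample_conditioned_tree(p, h, rng):
    """Sample (r_h G^infty, (V_0, ..., V_h)) by the recursion in (iv).
    Vertices are tuples of child ranks; the root is ()."""
    p_star = size_biased(p)
    support = range(len(p))
    tree, spine = {()}, [()]
    generation = [()]
    for _ in range(h):
        next_generation = []
        next_spine = None
        for v in generation:
            on_spine = (v == spine[-1])
            law = p_star if on_spine else p
            c = rng.choices(support, weights=law)[0]
            kids = [v + (j,) for j in range(1, c + 1)]
            if on_spine:
                # p*(0) = 0, so the spine vertex always has a child;
                # the next spine vertex is a uniform child of V_h
                next_spine = rng.choice(kids)
            next_generation.extend(kids)
        tree.update(next_generation)
        spine.append(next_spine)
        generation = next_generation
    return tree, spine

p = [0.5, 0.25, 0.25]                  # subcritical example, mean 3/4
tree, spine = sample_conditioned_tree(p, 6, random.Random(0))
```

Because the size-biased law puts no mass at 0, every generation up to height $h$ is non-empty, which is the finite-height shadow of the infinite spine in part (ii).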
That (18) defines a probability distribution for an infinite family tree $\mathcal{G}^\infty$ follows from the well known fact that $(\mu^{-n} Z_n \mathcal{G}, n = 0, 1, \ldots)$ is a non-negative martingale with expectation 1. The sequence of height restrictions $(r_n \mathcal{G}^\infty, n = 0, 1, \ldots)$ which determines $\mathcal{G}^\infty$ is a Markov chain with state space $\mathbf{T}$ obtained as the Doob h-transform of the Markov chain $(r_n \mathcal{G}, n = 0, 1, \ldots)$ via the space-time harmonic function $h(n, t) := \mu^{-n} Z_n t$. See [41] for background and other applications of h-transforms, and [30] for an elegant treatment of the recursive construction (iv) of $\mathcal{G}^\infty$ and the infinite path $(V_h)$, which we call the spine of $\mathcal{G}^\infty$. As observed in [30], the construction (iv) of $V_h$ and $\mathcal{G}^\infty$ such that (18) and (19) hold can also be carried out in the supercritical case $1 < \mu < \infty$. While our focus in this paper will be on the case $0 < \mu \le 1$, we note in passing that for $\mu > 1$ the path $(V_h)$ is almost surely not the only infinite path from the root in $\mathcal{G}^\infty$; rather, there are uncountably many such paths almost surely. Also, the conditional limit theorem (17) does not hold for $\mu > 1$ with $\mathcal{G}^\infty$ constructed via (18). Rather, there is the elementary result that
$$\mathrm{dist}(\mathcal{G} \mid Z_n \mathcal{G} > 0) \to \mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} = \infty) \text{ as } n \to \infty$$
where the right side does not have the same distribution as $\mathcal{G}^\infty$ defined by (18), except when $p(\cdot)$ is degenerate.
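In particular, Proposition 2 contains the classical result ([9] sec. 1.8) that for the usual integer-valued GW process $(Z_h \mathcal{G}, h = 0, 1, \ldots)$ started at $Z_0 = 1$, with $\mu \le 1$ there is the conditioned limit theorem
$$\mathrm{dist}(Z_h \mathcal{G}, h \ge 0 \mid Z_n \mathcal{G} > 0) \to \mathrm{dist}(Z_h \mathcal{G}^\infty, h \ge 0) \text{ as } n \to \infty \qquad (20)$$
where $(Z_h \mathcal{G}^\infty, h = 0, 1, \ldots)$ is a Markov chain with state space $\mathbb{N}$ and homogeneous transition probabilities defined as follows. Given $Z_h \mathcal{G}^\infty = m$ say, the number of individuals in the next generation of $\mathcal{G}^\infty$ is distributed as the sum of $m$ independent random variables, with $m - 1$ of these variables distributed according to the offspring distribution $p(\cdot)$ of $\mathcal{G}$, and one variable distributed according to the size-biased offspring distribution $p^*(\cdot)$. In other words, the process $(Z_h \mathcal{G}^\infty - 1, h = 0, 1, \ldots)$ is a branching process with immigration, starting from an initial population of zero, with offspring distribution $p(\cdot)$ and immigration distribution $p^{*-}(\cdot)$, where $p^{*-}(\cdot)$ is the distribution of $Z^* - 1$ for $Z^*$ with the size-biased offspring distribution $p^*(\cdot)$, that is
$$p^{*-}(n) := p^*(n + 1) = \mu^{-1} (n + 1)\, p(n + 1), \quad n = 0, 1, 2, \ldots$$
It is elementary and well known that
$$p^{*-}(\cdot) = p(\cdot) \text{ if and only if } p(\cdot) \text{ is Poisson}(\mu).$$
The following corollary is an easy consequence of this fact combined with the previous Proposition.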
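COROLLARY 3 (Spinal Decomposition of $\mathcal{G}^\infty$). - In the setting of Proposition 2, with $(V_h)$ the infinite spine of $\mathcal{G}^\infty$ derived by conditioning $\mathcal{G}$ on non-extinction, for $i = 0, 1, \ldots$ let $\mathcal{G}^{(i)}$ be the family tree derived from the subtree of $\mathcal{G}^\infty$ with root $V_i$ in the random forest obtained from $\mathcal{G}^\infty$ by deleting each edge along the spine, and let $V_{i+1}$ be the $J_i$th child of $V_i$. Then

(i) the $\mathcal{G}^{(i)}$, $i = 0, 1, \ldots$ are independent and almost surely finite, with identical distribution
$$P(\mathcal{G}^{(i)} = t) = p^{*-}(Z t) \prod_{v \in t,\ v \ne \emptyset} p(c_v t), \quad t \in \mathbf{T}; \qquad (22)$$

(ii) the trees $\mathcal{G}^{(i)}$ have the same distribution as $\mathcal{G}$ iff $p(\cdot)$ is Poisson($\mu$);

(iii) conditionally given $(\mathcal{G}^{(i)}, i = 0, 1, \ldots)$ the ranks $J_i$ are independent and $J_i$ has uniform$[Z\mathcal{G}^{(i)} + 1]$ distribution, where $Z\mathcal{G}^{(i)} = Z_1 \mathcal{G}^{(i)}$.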
The common distribution of the $\mathcal{G}^{(i)}$ described by (22) is that of a modified GW tree, in which the number of first generation individuals has distribution $p^{*-}(\cdot)$, while these and all subsequent individuals have offspring distribution $p(\cdot)$. Note that $V_h = (J_0, \ldots, J_{h-1}) \in \mathbb{N}^h$ for all $h \ge 1$. So the spinal decomposition specifies the joint distribution of the spine of $\mathcal{G}^\infty$ and the sequence of finite family trees $\mathcal{G}^{(i)}$ derived by cutting all edges of the spine. This determines the distribution of $\mathcal{G}^\infty$, for it is clear that $\mathcal{G}^\infty$ is a measurable function of $(V_h)$ and $(\mathcal{G}^{(i)})$. We record now for later use the following consequence of Proposition 2.

LEMMA 4. - In the setting of Proposition 2, with $(V_h)$ the infinite spine of $\mathcal{G}^\infty$ derived by conditioning $\mathcal{G}$ on non-extinction, let $H$ be a non-negative integer random variable independent of $\mathcal{G}^\infty$, and let $\mathcal{G}(H)$ be the family tree derived from the finite subtree of $\mathcal{G}^\infty$ that contains the root after $\mathcal{G}^\infty$ is cut into two subtrees by deletion of the edge $(V_H, V_{H+1})$. Then for each $t \in \mathbf{T}$ and each vertex $v$ of $t$ at height $h$

where

for $p(\cdot)$ the offspring distribution of $\mathcal{G}$ and $\mu := \sum_n n\, p(n) \le 1$.

Proof. - By conditioning on $H$ it suffices to prove (23) for a constant $H$, say $H = h$. Let $t_h = r_h t$. Then for each $v$ at height $h$ in $t$ the left side of (23) equals

But for $H = h$ fixed, $r_h \mathcal{G}(h) = r_h \mathcal{G}^\infty$ by construction, so (19) gives
$$P(r_h \mathcal{G}^\infty = t_h, V_h = v) = \mu^{-h} P(r_h \mathcal{G} = t_h). \qquad (26)$$
Also, given $r_h \mathcal{G}^\infty = t_h$ and $V_h = v$, according to part (iv) of Proposition 2, $\mathcal{G}(h)$ develops over generations $h + 1, h + 2, \ldots$ much like $\mathcal{G}$, with individuals having independent numbers of offspring, except that in $\mathcal{G}(h)$ each individual except $v$ has offspring distribution $p(\cdot)$, whereas $v$ has offspring distribution $p^{*-}(\cdot)$. By consideration of the product formulae (2) and (3), it follows that for any particular tree $t$ in which $v$ has $n$ offspring,

for $p^{*-}(n)$ as in (24). Combine (26) and (27) in (25) to obtain (23) for a fixed $H = h$. □
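Conditioning on the total progeny

Kennedy [24] obtained an analog of the conditioned limit theorem (20) as $n \to \infty$ with conditioning on $\#\mathcal{G} = n$ instead of $Z_n \mathcal{G} > 0$, where $\#\mathcal{G} = \sum_n Z_n \mathcal{G}$ is the total progeny. His assumption on the offspring distribution $p(\cdot)$ is that the generating function $g(s) := \sum_n p(n) s^n$ satisfies
$$\text{there exists } \alpha > 0 \text{ such that } g(\alpha) < \infty,\ \alpha g'(\alpha) = g(\alpha) \text{ and } g''(\alpha) < \infty. \qquad (28)$$
Reinterpreting his argument in terms of family trees gives the convergence assertion in (29): the equality follows easily from the product formula (3).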
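PROPOSITION 5. - Let $\mathcal{G}$ be a GW tree whose offspring generating function $g$ satisfies (28). Let $\hat{\mathcal{G}}$ be the critical GW tree whose offspring generating function is $\hat{g}(s) := g(\alpha s)/g(\alpha)$ for $\alpha$ as in (28). Then
$$\mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} = n) = \mathrm{dist}(\hat{\mathcal{G}} \mid \#\hat{\mathcal{G}} = n) \to \mathrm{dist}(\hat{\mathcal{G}}^\infty) \text{ as } n \to \infty \qquad (29)$$
where $\hat{\mathcal{G}}^\infty$ is $\hat{\mathcal{G}}$ conditioned on non-extinction.

In terms of the offspring mean $\mu$, condition (28) is always satisfied if $\mu > 1$ and $p(\cdot)$ is nondegenerate, with $\alpha < 1$. If $\mu = 1$ then (28) holds if and only if $g''(1) < \infty$, in which case $\alpha = 1$; then $\hat{\mathcal{G}} = \mathcal{G}$ and $\hat{\mathcal{G}}^\infty = \mathcal{G}^\infty$. If $\mu < 1$ then (28) requires $p(n)$ to decay exponentially, and $\alpha > 1$; then the distributions $\mathrm{dist}(\mathcal{G}^\infty)$ and $\mathrm{dist}(\hat{\mathcal{G}}^\infty)$ are different.

Assume for this paragraph that (28) holds, and consider what happens if we condition on $(\#\mathcal{G} \ge n)$ instead of $(\#\mathcal{G} = n)$ and then let $n \to \infty$:

The first case is elementary, and the second two cases follow easily from Proposition 5. Note the paradoxical fact that while

and both intersections involve decreasing sequences of events,

For $\mu > 1$ both limits in (30) are the naively defined $\mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} = \infty)$. For $\mu = 1$ both limits yield $\mathrm{dist}(\mathcal{G}^\infty)$, which suggests the intuitive interpretation of $\mathrm{dist}(\mathcal{G}^\infty)$ as $\mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} = \infty)$. However, such interpretations are potentially slippery, as shown by the fact that for $0 < \mu < 1$
$$\lim_n \mathrm{dist}(\mathcal{G} \mid Z_n \mathcal{G} > 0) = \mathrm{dist}(\mathcal{G}^\infty) \ne \mathrm{dist}(\hat{\mathcal{G}}^\infty) = \lim_n \mathrm{dist}(\mathcal{G} \mid \#\mathcal{G} \ge n). \qquad (31)$$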
3. PRUNING RANDOM TREES
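Let $\mathcal{T}$ be a random family tree. Call a $\mathbf{T}^{(\infty)}$-valued process $(\mathcal{T}_u, 0 \le u \le 1)$ a uniform pruning of $\mathcal{T}$ if $\mathcal{T}_1 = \mathcal{T}$ almost surely and
$$(\mathcal{T}_u, 0 \le u \le 1) \stackrel{d}{=} (\mathcal{T}(u), 0 \le u \le 1) \qquad (32)$$
where $(\mathcal{T}(u), 0 \le u \le 1)$ is constructed as follows from some $\mathcal{T}(1)$ with $\mathcal{T}(1) \stackrel{d}{=} \mathcal{T}$. Here $\stackrel{d}{=}$ denotes equality in distribution, meaning equality of finite dimensional distributions in a display such as (32). Suppose that given $\mathcal{T}(1)$ there are independent uniform(0,1) random variables $\xi_e$ attached to the edges $e$ of $\mathcal{T}(1)$; let $\mathcal{T}^{\dagger}(u)$ be the component containing the root in the subgraph of $\mathcal{T}(1)$ consisting of those edges $e$ with $\xi_e \le u$, and let $\mathcal{T}(u) = \mathrm{fam}(\mathcal{T}^{\dagger}(u))$ be $\mathcal{T}^{\dagger}(u)$ relabeled as a family tree.

Let $I$ be an interval of the form either $[0, \zeta]$ for some $0 < \zeta < \infty$ or $[0, \zeta)$ for some $0 < \zeta \le \infty$. Call a process $(\mathcal{T}_t, t \in I)$ a uniform pruning process if
$$(\mathcal{T}_{ut}, 0 \le u \le 1) \text{ is a uniform pruning of } \mathcal{T}_t \text{ for all } t \in I. \qquad (33)$$
If $(\mathcal{T}_u, 0 \le u \le 1)$ is a uniform pruning of $\mathcal{T}_1$ then $(\mathcal{T}_u, 0 \le u \le 1)$ is a uniform pruning process, and almost surely
$$\mathcal{T}_0 = \bullet, \text{ the single-vertex tree, and } \lim_{u \uparrow 1} \mathcal{T}_u = \mathcal{T}_1. \qquad (34)$$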
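The coupling behind (32) can be sketched in a few lines for a finite tree. This is an illustrative sketch only: it assumes a finite family tree stored as a set of tuples with root `()`, identifies each edge with its child vertex, and returns the root components before relabeling (the relabeling by fam is omitted to keep the example short). The function name is hypothetical.

```python
import random

def root_components(t, us, rng):
    """Attach an independent uniform(0,1) mark to each edge of the finite
    family tree t (a set of tuples with root (); an edge is identified
    with its child vertex) and return, for each u in us, the component
    containing the root in the subgraph of edges with mark <= u."""
    xi = {v: rng.random() for v in t if v != ()}
    return [{v for v in t
             if all(xi[v[:k]] <= u for k in range(1, len(v) + 1))}
            for u in us]

t = {(), (1,), (2,), (2, 1), (2, 2), (3,)}
us = [0.0, 0.25, 0.5, 0.75, 1.0]
comps = root_components(t, us, random.Random(7))
```

Because the same marks are used for every $u$, the components are nested in $u$, start from the single root vertex at $u = 0$, and recover the whole tree at $u = 1$, in line with (34).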
The second equality means that for almost every $\omega$ in the basic probability space, for each $h$ there exists a $u(h, \omega) < 1$ such that $r_h \mathcal{T}_u(\omega) = r_h \mathcal{T}_1(\omega)$ for all $u(h, \omega) \le u \le 1$. It is easily seen that every uniform pruning process is an inhomogeneous Markov process. All uniform pruning processes share the same co-transition probabilities, which have an obvious invariance under scaling: for $0 \le u \le 1$ and $z \in I$ with $z > 0$,
$$P(\mathcal{T}_{uz} = r \mid \mathcal{T}_z = t) = P(\mathcal{T}^{t}_u = r) \qquad (35)$$
where $(\mathcal{T}^{t}_u, 0 \le u \le 1)$ is a uniform pruning of the fixed tree $t$. For a finite tree $t \in \mathbf{T}$ these transition probabilities are described by the following formula, which is derived by conditioning on which subtree $s$ of $t$ remains containing the root after cutting each edge of $t$ with probability $1 - u$:
$$P(\mathcal{T}^{t}_u = r) = \sum_{s} u^{\#s - 1} (1 - u)^{n(s, t)} \qquad (36)$$
where the sum ranges over all subtrees $s$ of $t$ with $\emptyset \in s$ and $\mathrm{fam}(s) = r$, and $n(s, t)$ is the number of edges $(v, w)$ of $t$ such that $v \in s$ and $w \in t - s$. This formula determines the co-transition probabilities of every uniform pruning process, for it is easily seen that a $\mathbf{T}^{(\infty)}$-valued process $(\mathcal{T}_u)$ is a uniform pruning process iff for each $h \ge 0$ the height restricted $\mathbf{T}$-valued process $(r_h \mathcal{T}_u)$ is a uniform pruning process.

If $(\mathcal{T}_u, 0 \le u \le 1)$ is derived from $\mathcal{T}_1$ by uniform pruning, the size $Z\mathcal{T}_u$ of the first generation of $\mathcal{T}_u$ is distributed as the sum of $Z\mathcal{T}_1$ independent Bernoulli($u$) variables. In terms of probability generating functions [16]
$$E\, s^{Z\mathcal{T}_u} = E\, (1 - u + us)^{Z\mathcal{T}_1}. \qquad (37)$$
In particular, if $\mathrm{dist}(Z\mathcal{T}_1) = $ Poisson($\mu$) for some $\mu > 0$ then $\mathrm{dist}(Z\mathcal{T}_u) = $ Poisson($u\mu$).
following
two lemmas record some more technicalproperties
ofuniform
pruning
processes for ease of later reference.LEMMA 6. - Let u E
In )
be a sequenceof uniform pruning
processes and lettn
EIn
be such thattn
-~ 1 and~~n~
as n - 00for
somerandom
family
tree T. Then(I) for
each E > 0where 0
u ~ 1 )
is auniform pruning of T1
with7í 4 T;
(ii) if tn
> 1for
all n then(38)
holds alsofor
E = 0.Proof. -
The convergence in(38)
should be understood as convergence of finite dimensional distributions ofheight
restricted processes for each finiteheight
h. Theprobability
that aheight
restricted uniformpruning
process passes
through
some sequence of treesti,1
i 1~ at times0 ui ... uk is the