Restricted exchangeable partitions and embedding of associated hierarchies in continuum random trees

(1)

www.imstat.org/aihp 2013, Vol. 49, No. 3, 839–872

DOI:10.1214/12-AIHP533

Restricted exchangeable partitions and embedding of associated hierarchies in continuum random trees

Bo Chen and Matthias Winkel

University of Oxford, 1 South Parks Road, Oxford OX1 3TG, UK. E-mail:[email protected];[email protected] Received 31 August 2010; revised 1 August 2012; accepted 20 October 2012

Abstract. We introduce the notion of a restricted exchangeable partition ofN. We obtain integral representations, consider associated fragmentations, embeddings into continuum random trees and convergence to such limit trees. In particular, we deduce from the general theory developed here a limit result conjectured previously for Ford’s alpha model and its extension, the alpha-gamma model, where restricted exchangeability arises naturally.

Résumé. Nous introduisons la notion d’une partition restreinte échangeable deN. Nous obtenons des représentations intégrales, nous considérons les fragmentations associées, des plongements dans des arbres aléatoires continus et la convergence vers de tels arbres limites. En particulier, nous déduisons de la théorie générale développée içi un résultat limite formulé en conjecture dans un travail précédent. Ce résultat particulier concerne les arbres alpha de Ford et leurs généralisations, les arbres alpha-gamma, deux exemples où l’échangeabilité restreinte arrive de manière naturelle.

MSC:60G09; 60J80

Keywords:Exchangeability; Hierarchy; Coalescent; Fragmentation; Continuum random tree; Renewal theory

1. Introduction

This paper introduces the concept of restricted exchangeability, which captures a weak form of exchangeability that occurs naturally in models such as the alpha-gamma tree model of [10].

1.1. Motivating example:Alpha-gamma trees as random hierarchies

An important motivation for this paper is the study of the limiting behaviour of the alpha-gamma tree-growth model [10], which is based on a simple stochastic growth rule to build a treeTn+1from a treeTnby adding a leaf (degree-1 vertex) labelledn+1. Let us specify this rule in a framework of hierarchies (also called total partitions or fragmentations in the literature).

Following [20,24,29,30], we callhierarchy onB⊆Nany subsettB of the power set ofB such thatB∈tB and {j} ∈tBfor allj∈B, and so that for everyA, A∈tB, eitherA⊆AorA⊆AorA∩A=∅. To avoid trivialities, we also require∅∈tB. We say that a strict subsetA∈tBofA∈tBisa maximal subset ofAintBif for allA∈tBwith A⊆A⊆AeitherA=AorA=A. For finiteB⊂Nwith #B≥2, the maximal subsetsA1, . . . , Ak ofB intB

form a partition ofBand therestrictionstA_i=tB∩A_i= {A∩A_i:A∈tB}are hierarchies onA_i,i∈ [k] := {1, . . . , k}; a hierarchytBfully encodes arooted tree, i.e. a connected acyclic graph, with vertex settBand edge relation linking each set to its maximal non-empty subsets, withroot∅related toB; hierarchiestAi are thesubtreesoftBabove the first branchpointBoftB. See Fig.1. We callA∈tB branchpointorinternal vertexif #A≥2. Denote byTnthe set of all hierarchies on[n],n≥1. We say thattn∈Tnandtn+1∈Tn+1areconsistentiftn=tn+1∩ [n].

(2)

Fig. 1. Two hierarchies onB= {1,2,3,4,5,6,7,8,9}illustrated as rooted trees.

Fig. 2. Alpha-gamma growth rule: displayed is one internal vertex,Bsay, ofTnwith degreek+1, hence vertex weight(k−1)α−γ, withk−r leavesL_r+1, . . . , L_k∈ [n]andrbigger subtreesS₁, . . . , Sr; all edges also carry weights, weight 1−αandγare displayed here for theleaf edge below{L_k}and theinner edgebelowBonly; the three associated possibilities forT_n+1are displayed.

The alpha-gamma model [10] is a consistent family(T_n, n≥1)of random hierarchies on[n], for which the con- ditional distributions ofTn+1 given Tn are particularly simple. In terms of trees, passing from Tn toTn+1 means identifying the random place inTnwhere{n+1}connects toTn: as illustrated in Fig.2, for parameters 0≤γ≤α≤1 and forn≥1, vertex{n+1}connects to

• a new vertex{j, n+1}inserted (in the edge) below{j} ∈Tnwith probability(1−α)/(n−α);

• a new vertexB∪ {n+1}inserted below branchpointB∈Tnwith probabilityγ /(n−α);

• an existing branchpointB∈Tnwith probability((k−1)α−γ )/(n−α), wherek+1 is the degree of vertexB in the treeT_n, or equivalentlykis the number of blocks of the partition into maximal subsetsA₁, . . . , A_kofBin the hierarchyT_n;

nowT_n₊₁is built fromT_nby addingn+1 to all vertices on the path between{n+1}and∅.

A random hierarchyT_B onBis calledexchangeable[20] if for every bijectionβ:B→B, the hierarchyβ(T_B)= {{β(j ): j∈A}, A∈T_B}obtained by permuting labels byβis distributed likeT_B. An alpha-gamma treeT_nforn≥3 is exchangeable iffγ=1−α; note for instance that

P

⎛

⎝T₃=

⎞

⎠=P

⎛

⎝T₃=

⎞

⎠=1−α

2−α while γ 2−α=P

⎛

⎝T₃=

⎞

⎠.

However, forγ =1−αthere is still some exchangeability. To capture this, we introduce the partitionΠnofTninto maximal strict subsets of[n]and refer to its distributionPn on the setPnof partitions of[n]as asplitting rule. We say that(Tn, n≥1)is alabelled Markov branching modelif conditionally givenΠn= {A1, . . . , Ak}, the hierarchies

(3)

T_n∩A_i,i∈ [k], are independent and distributed asβ_i(T_#A_i), whereβ_i is the unique increasing bijection from[#Ai] toA_i. Then(P_n, n≥2)determines the distributions ofT_n,n≥1. We will show in Section6that the alpha-gamma model is a labelled Markov branching model with splitting rulesPn^α,γ satisfying

P_n^α,γ(π )=P_n^α,γ β(π )

for all bijectionsβ:[n] → [n]withπ∩ {1,2} =β(π )∩ {1,2},

whereβ(π )= {{β(j ): j ∈A}, A∈π}. Equivalently,P_n^α,γ satisfiesP_n^α,γ(π )=P_n^α,γ(π)ifπ∩ {1,2} =π∩ {1,2}

and ifπ= {A₁, . . . , A_k}andπ= {A₁, . . . , A_k}have the same multiset of block sizes #Ai,i∈ [k], and #A_j,j∈ [k].

Alpha-gamma trees(T_n, n≥1)give rise to a random hierarchyH= {A⊂N: A∩ [n] ∈T_nfor alln≥1}onN. We studied the limiting behaviour ofT_nand identified a scaling limit in [10], but only obtained convergence in distribution.

The crucial tool to strengthen to convergence in probability is restricted exchangeability, which we will use to embed Hand more general hierarchies of (restricted exchangeable) Markov branching models into suitable limit trees.

1.2. Restricted exchangeable partitions and integral representations

For a partitionπ= {π_i, i∈N}ofB⊆Nwith disjointπ_i,i∈N, each non-emptyπ_i⊆Bis called ablockofπ. When π has only finitely many blocks, we often omit∅fromπ. To be definite, we arrange the blocks ofπ in the order of least element, i.e. minπi <minπj for everyi < j, followed by∅with the convention min∅= ∞. For finiteπi, we consider theblock size#πi. We denote the set of all partitions ofB byPB. Recall[n] = {1, . . . , n}forn∈N. Note that forΓ ∈P=P_N, the restrictionsΓ|n=Γ ∩ [n] = {Γ_i∩ [n], i∈N}are partitions of[n],n∈N. OnP, consider the metricd(Γ, Γ)=2⁻^inf^{ⁿ^≥^1:Γ^|ⁿ⁼^Γ^|ⁿ^}and the associated Borelσ-algebra.

Following de Finetti and Kingman, we call a Borel measure on the spacePBof partitions ofB⊆Nexchangeable, if it is invariant under the natural action on PB of the symmetric group on B; and a random partition is called exchangeable if its distribution is exchangeable. Then a measureμonP is exchangeable if and only if the discrete measuresμnonPn=P[n], given by

μ_n {π}

=μ P^π

, π∈Pn, whereP^π= {Γ ∈P: Γ|n=π}, (1)

are exchangeable for all n≥1. Furthermore, a measureμn onPn is exchangeable if μn({π})=μn({π})for all π, π∈Pnwith the same multiset of block sizes.

Several weaker forms of exchangeability have been studied in the literature, notably Pitman’s partial exchangeability [26] and Gnedin’s constrained exchangeability [13]. We introduce here a new weak form of exchangeability and discuss in Section3.1how these notions interact.

Definition 1. Forπ∈Pn,we call a measureμ onP^π exchangeable onP^π ifμ(P^π)=μ(P^π)for allπ, π∈

m≥1Pn+mwith the same multiset of block sizes and withπ∩ [n] =π∩ [n] =π. A measureμonP is calledrestricted exchangeable (RE)if there isC⊂K:=

n≥1Pns.th.

• noπ∈Cis the restriction of anotherπ∈C,

• the measureμis carried by

π∈CP^π,i.e.μ(P\

π∈CP^π)=0,

• and for eachπ∈C,the restriction ofμtoP^π is finite and exchangeable onP^π.

Remark 2. A measure onP^π is exchangeable onP^π if and only ifμ(P^π)=μ(P^β(π⁾)for allπ∈Pn+mand all bijectionsβ:[n+m] → [n+m]withπ∩ [n] =β(π)∩ [n] =π,m≥1.

Note that the set of admissible bijections β depends on π,and while β(j )=j, j ∈ [n],makes β admissible, there are many other admissible bijections.The point is that the specific blocks containingπ_i inπandβ(π)may have different sizes(while the multisets of all block sizes of π and β(π)coincide).This is an important feature of our definition of restricted exchangeability.The apparently more natural but strictly weaker concept obtained by restricting the admissible bijections to the subgroup of those withβ(j )=j,j∈ [n],is less convenient to work with, since integral representations of such measures – which we might callweaklyRE – no longer just involve measures on decreasing sequences,cf.Theorem3.

(4)

LetS^↓= {s=(s_i, i≥1): s₁≥s₂≥ · · · ≥0, _i_≥₁s_i≤1}. Fors∈S^↓,Kingman’s paintbox[22] is obtained from independent random variables(ξr, r≥1)with respective distributions

P(ξr=i)=si, i≥1, P(ξr= −r)=s0:=1−

i≥1

si,

as the distributionκ_sonPof the exchangeable partitionΠ= {{n≥1:ξ_n=i}, i∈Z}, which puts any twom, n∈N into the same block if and only ifξm=ξn. By the Strong Law of Large Numbers, the vector of block sizes #(Π∩[n])^↓ in decreasing order of size hasasymptotic frequencies

|Π|^↓= lim

n→∞

1 n#

Π∩ [n]_↓

=(s_i, i≥1)=s.

It is well-known [1,21,22] that exchangeable measures onP admit integral representationsμ=

S^↓κ_sν(ds). To establish integral representations for RE measures here, we introducemodified paintboxesκ_s^π,π∈K=

n≥1Pn, by conditioningκson the cylinder setP^π= {Γ ∈P: Γ|n=π}ofπ inP, but note that this conditioning is degenerate in some cases; see Section2for details.

Theorem 3 (Integral representation). Letμbe a measure onP.Thenμis RE if and only if there are a subsetC⊂K such that noπ∈Cis the restriction of anotherπ∈C,and for eachπ∈Ca finite measureν_π onS^↓such that

μ=

π∈C

S^↓

κ_s^πνπ(ds).

Note that a RE measureμcan be infinite, ifCis infinite. However, asC⊂Kis countable, such infinite measures will still beσ-finite, because they are finite onP^π,π∈C.

Examples 4.

(i) ForB⊆N,let1Bbe the trivial partition of a single blockB. Dislocation measuresare measures onP carried by P\ {1_N}, finite on P^π, π ∈K\ {1_[n], n≥1}. We set C= {{[j],{j +1}}, j ≥1} to naturally decompose P\ {1_N} =

π∈CP^π.Bertoin’s[6]possibly infiniteexchangeabledislocation measures,in the sense of(1),are exchangeable and finite onP^π,π∈C,so they satisfy Definition1.See Sections1.3and3.2.

(ii) We can associate dislocation measures with Ford’s alpha model[12]and the alpha-gamma Markov branching model[10],defined in Section1.1,so thatμ_α,γ(P^π)=λ^α,γ_n P_n^α,γ(π ),π∈Pn\ {1_[n]},for consistent ratesλ^α,γ_n , n≥2.These dislocation measuresμ_α,γ are RE,but not exchangeable,as we illustrated in terms of splitting rules Pn^α,γ at the end of Section1.1.See Section3.2for an exploration of the relationship between splitting rules and dislocation measures in a general RE framework.

From Theorem3we deduce an integral representation for restricted exchangeable dislocation measures. For sim- plicity we only allow as decomposition ofP in Definition1the most relevant and naturalC= {{[j],{j+1}}, j ≥1}.

Corollary 5. Letκbe a RE measure withC= {{[j],{j+1}}, j≥1}.Then for eachj ≥1,there are constantsc_j≥0 andk_j≥0,and a measureν_j onS^↓with

ν_j

(0,0, . . .)

=ν_j

(1,0, . . .)

=0 and

S^↓

s₀1_{j=1}+

i≥1

s_i^j(1−s_i)

ν_j(ds) <∞,

such that,forε^{(j )}= {{j},N\ {j}}andω^[^j^]= {[j],{j+1},{j+2}, . . .},j≥1, κ=c₁δ_ε(1)+

j≥1

c_jδ_ε(j+1)+k_jδ_ω[j]+

S^↓

κ_s

· ∩P^j ν_j(ds)

, whereP^j=P^{[^j^]^,^{^j⁺¹^}}.

In the exchangeable case, we have(cj, kj, νj)=(c,0, ν),j ≥1, as was shown by Bertoin [6].

(5)

1.3. RE hierarchies and continuum random trees

In the context of our motivating example, the alpha-gamma model, we demonstrated how consistent Markov branching trees give rise to a random hierarchyHofN. Let us investigate this in the context of Bertoin’s systematic studies [8]

of exchangeable homogeneous and exchangeable self-similar P-valued fragmentation processes (F^∗(t), t ≥0)in continuous time, and of Haas and Miermont’s [17] associated self-similar continuum random trees (CRTs).

Bertoin described exchangeable homogeneous fragmentation processes in terms of an exchangeable dislocation measureκ= j≥1cδ_ε(j )+

S^↓κsν(ds)onP. Informally, blocks fragment independently; for eachπ∈Pn\ {1[n]}, there is a competing rateκ(P^π)at which a given block F_i^∗(t)undergoes a split whose effect on the firstn block members is a partition according toπ. For anα-self-similar fragmentation process, this rate is increased (in the case α >0) by a factor|F_i^∗(t)|⁻^α depending on the asymptotic frequency|F_i^∗(t)|of the block. The rate increase is such that singleton blocks and indeed the all-singleton state0_Nare obtained in finite time.

Under some regularity conditions, [17] constructed self-similar CRTs(T(α,ν), μ)with characteristic pair(α, ν), i.e.

random path-connected compact metric spaces(T(α,ν),d)equipped with a rootρ∈T(α,ν)and a probability measureμ onT(α,ν), and with the tree property that there are no cyclic paths. Self-similarity here means that conditionally given the tree up to heightt above the root and given subtree massesμ(Si(t))=mi(t)above heightt, the subtreesSi(t), i≥1, above height t, are like independent copies ofT(α,ν), with masses rescaled by m_i(t)and distances rescaled by (m_i(t))^α. These CRTs can be considered as genealogical trees of Bertoin’s fragmentation processes; for a μ- distributed i.i.d. sampleΣ_n^∗,n≥1, inT(α,ν), we obtain anα-self-similar fragmentation process by considering the partition-valued process that has{n≥1:Σ_n^∗∈S_i(t)},i≥1, as non-singleton blocks and all other integers in singleton blocks at timet,t≥0.

To any exchangeableP-valued fragmentation process we associate the exchangeable hierarchyH^∗= {F_i^∗(t), i≥ 1, t≥0}of all blocks ever visited, equivalentlyH^∗= {L^∗(T^v): v∈T(α,ν)}, whereL^∗(T^v)= {n∈N: Σ_n^∗∈T^v}and T^vis the subtree ofT(α,ν)abovev∈T(α,ν). We say that the hierarchyH^∗isembedded in the CRTT(α,ν)by the sample Σ_n^∗∈T(α,ν),n≥1.

We now associate with any RE dislocation measureκaRE fragmentation processF, in which each block fragments independently, with ratesκ(P^π), π∈Pn, affecting then smallest block members by partitioning according toπ. We callH= {F_i(t): i≥1, t ≥0}the associatedRE hierarchy. Alternatively (see Section3.2),RE splitting rules P_n(π )=κ(P^π)/κ(P\P¹^[ⁿ^]),π∈Pn\ {1_[n]}, give rise to consistentRE labelled Markov branching trees(T_n, n≥1) with splitting rules(P_n, n≥2)that induce a RE hierarchy{A⊂N: A∩ [n] ∈T_nfor alln≥1}. Embedding a non- exchangeable hierarchyHinto a CRTT means findingΣn∈T,n≥1, with a non-trivial dependence structure, such thatHis embedded inT byΣn,n≥1.

Theorem 6. Letα >0,and letκbe a RE dislocation measure of the form

κ=

j≥1

S^↓

κs

· ∩P^j

νj(ds), withν(ds):=

j≥1

i≥1

s_i^j(1−si)

νj(ds) (2)

satisfying

S^↓(1−s₁)ν(ds) <∞ and ν(s₀>0)=0. Then we can construct (T(α,ν), (Σ_i, i≥1)) such that H= {L(T_(α,ν)^v ): v∈T(α,ν)}is a RE hierarchy with dislocation measure κ,embedded in a self-similar CRT T(α,ν) with characteristic pair(α, ν),whereL(T_(α,ν)^v )= {i∈N: Σ_i∈T_(α,ν)^v }.

Our proof of Theorem6in Section4gives an explicit sampling procedure for leavesΣi∈T(α,ν),i≥1, based on the self-similarity ofT(α,ν)and recursive spinal decompositions of subtrees.

Theorem6partly generalises Theorem 4 of [28]. However, apart from the alpha model (the alpha-gamma model withγ=α, which produces only binary trees), that theorem treats models that are not RE in the sense of Corollary5 nor for other decompositions ofP.

It requires no extra work to also construct hierarchies associated with RE dislocation measuresκbased on different decompositions ofP. However, in those more general cases, a RE measure still qualifies as a dislocation measure if and only if it is finite on P^{[^j^]^,^{^j⁺¹^}}, j ≥1, and this is necessary for hierarchies to be well-defined. Hence, the decomposition in Corollary5is the most natural decomposition in the context of fragmentation processes.

(6)

Exchangeable hierarchiesH^∗derived from fragmentation processes (or from Markov branching trees) have been used to construct CRTs as scaling limits [18]. We carry out a similar programme here for RE hierarchiesH, starting from a RE dislocation measure of the form identified in Corollary5. We can delabel treesT_n=H∩ [n], but retain the root, to obtain rooted combinatorial treesT_n^◦, i.e. connected acyclic graphs with no degree-2 vertex, but some degree-1 vertices, only one of which is distinguished, as the root. We can regardT_n^◦as a metric space with unit distance between adjacent vertices and with adjacent vertices connected by unit length line segments. We use notationT_n^◦/a to scale the length of the line segments and to obtain a metric space with all connecting line segments of length 1/a, where a∈(0,∞).

In the exchangeable case, [18] obtain CRT convergence under a regular variation condition

ν(s₁≤1−)=⁻^α(1/) as↓0; for someα∈(0,1)and slowly varying (3) and a log-moment condition

S^↓

i≥2

s_ilog(si)ν(ds) <∞ for some >0. (4)

Theorem 7. If in the setting of Theorem6,the measureνsatisfies(3)and(4),and ifνj=νmfor somem≥1and all j≥m,then

T_n^◦

n^α(n)Γ (1−α)→T(α,ν) in probability,in the Gromov–Hausdorff sense.

Returning to the alpha-gamma model, we can now show that Theorem7applies to give a scaling limit in probability.

The identification ofν_j,j ≥1, in the parameterisation of Corollary5finally sheds some light on the peculiar splitting rules andν-measures in Ford’s alpha model and the alpha-gamma model [10,12,18,28]. To do this, we follow [19,24, 25] and introduce Poisson–Dirichlet dislocation measures PD^∗_α,θ(ds)asσ-finite measures onS^↓given by

E

σ₁^θ;σ₁⁻¹σ_[_0,1_]∈ds

, θ >−2α, α∈(0,1),

on the interior of the parameter range, where(σ_t, t≥0)is a stable subordinator with Laplace transformE[e⁻^λσ^t] = e⁻^{t λ}^α and whereσ_[_0,1_]is the decreasing rearrangements of the jumpsσ_t=σ_t−σ_t₋,t∈ [0,1]. Forθ= −2α, the binary case, PD^∗_α,₋_2α(ds)is defined as the ranked beta measure on{(x,1−x,0, . . .), x∈(1/2,1)} ⊂S^↓with density x⁻^α⁻¹(1−x)⁻^α⁻¹1(1/2,1)(x); the associated Markov branching model is Aldous’s [4] beta-splitting model, forα <1.

As the references demonstrate, Poisson–Dirichlet dislocation measures give rise to some of the nicest and best- studied parametric families of exchangeable fragmentation processes, while alpha and alpha-gamma models have as their dislocation measure what we have previously written as linear combinations of Poisson–Dirichlet measures of different parameters [10]. With the notion of restricted exchangeability, we can now obtain a stronger and more satisfactory connection.

Proposition 8. The alpha-gamma model forα∈(0,1)andγ∈ [0, α]is a RE Markov branching model with disloca- tion measure of the form identified in Corollary5withν₁=(1−α)PD^∗_α,₋_α₋_γ andν_j=γPD^∗_α,₋_α₋_γ,j≥2.

The boundary caseα=1 degenerates [10] and leads to RE Markov branching models with

• forγ=0 star trees corresponding to(ν1, c1, k1)=(0,0,1)and(ν2, c2, k2)=(0,0,0);

• forγ=1 comb trees corresponding to(ν₁, c₁, k₁)=(0,0,0)and(ν₂, c₂, k₂)=(0,1,0);

• forγ∈(0,1)bushy combs corresponding to(ν1, c1, k1)=(0,0,1),(c2, k2)=(0,0)and ν₂(s₂>0)=0 and ν₂(s₁∈dx)=γ x⁻²(1−x)⁻¹⁻^γ1(0,1)(x)dx.

(7)

1.4. Sampling consistency and the skewed Poisson–Dirichlet model

Proposition8suggests to introduce a three-parameter family of restricted exchangeable fragmentation trees that we call theskewed Poisson–Dirichlet model, by setting

ν₁=λPD^∗_α,θ, ν_j=(1−λ)PD^∗_α,θ, j≥2,

forα∈ [0,1],θ≥ −2αandλ∈ [0,1]. Whenλ=(1−α)/(1−θ−2α)andθ= −α−γ, this is the alpha-gamma model; when λ=1/2, this is the exchangeable Poisson–Dirichlet model studied in [19,24]. We will use parameter- isations by(α, θ, λ)and(α, γ , λ), whereγ = −α−θ. We can apply Theorem7 to obtain a convergence result in probability:

Corollary 9. Let (T_n, n≥1)be a consistent family of skewed Poisson–Dirichlet trees for parameters0< α <1, 0< γ = −α−θ≤αand0≤λ <1.Then

T_n^◦

n^γ →T(γ ,ν) in probability,in the Gromov–Hausdorff sense, whereT(γ ,ν)is aγ-self-similar CRT associated with measure

ν(ds)= γ Γ (1−α) (1−λ)αΓ (1−γ /α)

λ+(1−2λ)

i≥1

s_i²

PD^∗_α,θ(ds)

forγ < α,while in the binary caseγ=α(i.e.θ= −2α),we haveν(s1+s2<1)=0and ν(s₁∈dx)= α

(1−λ)Γ (1−α)

(1−λ)+(4λ−2)x(1−x)

x⁻^α⁻¹(1−x)⁻^α⁻¹dx.

Regarding the alpha model,α∈(0,1),θ= −2α,λ=1−α, this confirms in part a conjecture formulated in [28];

specifically, the setting of the conjecture was the two-parameter(α, θ )-model that contains the alpha model as a special case, and the conjecture claims almost sure convergence, while we only obtain convergence in probability here.

Another interesting feature of the skewed Poisson–Dirichlet model relates to sampling consistency. Here we say that a family of unlabelled random trees (T_n^◦, n≥1)is sampling consistent if the treeT_n^◦ with a uniformly cho- sen leaf removed is distributed as T_n^◦₋₁. For consistent trees with exchangeable labels such as the exchangeable Poisson–Dirichlet model this is trivially so, but also and non-trivially for the alpha-gamma model that includes non- exchangeable trees [10]. Geometrically, this gives sampling consistency for two two-dimensional subsets of the three- dimensional parameter space (intersecting in the one-parameter family of stable trees [25] forγ=1−α), but some- what surprisingly, sampling consistency does not extend any further:

Proposition 10. The skewed Poisson–Dirichlet model is sampling consistent only for parameters that reduce it to the exchangeable Poisson–Dirichlet model or to the alpha-gamma model.

This shows that while Theorem 6 and7 always refer to Markov branching trees T_n^◦ in the sense of [18], they typically do not, however, satisfy the sampling consistency property of [18], so that the theory developed in [18] does not even yield convergence in distribution for these trees, where we here establish convergence in probability.

1.5. Structure of this paper

In addition to proofs of main results already formulated, the content of this paper is as follows.

• Section2proves Theorem3and Corollary5by combining approaches of Vershik and Kerov, and of Aldous, both in the exchangeable case.

• Section3includes a discussion of the relationship between restricted exchangeability, partial exchangeability and constrained exchangeability, and a discussion of RE dislocation measures, RE splitting rules, RE hierarchies and RE fragmentations.

(8)

• In Section4, we develop a new technique to sample leaves in general self-similar CRTs. We make explicit the embedding that we use to prove Theorem6, and we obtain decomposition results along subtrees spanned by the firstksampled leaves (Corollary22).

• In Section5we prove Theorem7. Our approach is similar in spirit to [18], but with added technical difficulties. We analyse the RE embedding of Theorem6in detail. While in [18] consideration of a singleΣ^∗∈T(α,ν)gives relevant estimates for all Σ_n^∗,n≥1, we here need individual estimates for each Σ_n,n≥1. Methods include Gnedin’s constrained paintboxes and renewal theory. We also establish almost sure convergences of rescaled subtrees ofTn

spanned bykleaves, as firstn→ ∞in Proposition28and then alsok→ ∞in (22).

• Section6provides proofs for Propositions8and10.

• AnAppendixcontains the proof of a technical lemma.

2. Integral representations, proof of Theorem3and Corollary5

Our first aim is to understand exchangeability on subsets of the formP^π⊆P, for someπ∈K. Let us formally define modified paintboxes. Fors∈S^↓letm≥0 such thats_m> s_m₊₁=0 (orm= ∞ifs_i>0 for alli≥1), supposeπ∈K haskblocksπ_j=∅, 1≤j≤k, of whichwith #πj≥2. For the paintboxκ_sassociated withs, we haveκ_s(P^π) >0 iff eithers₀>0 and≤m, ors₀=0 andk≤m. In these cases, setκ_s^π=κ_s(·|P^π). Thenκ_s^π is a modified paintbox:

1. Randomly assign “colours”c(π )=(c(π₁), . . . , c(π_k))to the blocksπ₁, . . . , π_k using the following rule (withZ^π_s as normalisation constant)

P

c(π )=(i₁, . . . , i_k)

= 1 Z_s^π

1≤j≤k

s_i^#π^j

j , (5)

wherei_j is allowed to be equal to 0 iff #πj=1, and thei_j withi_j≥1 are pairwise distinct.

2. Letnbe such thatπ∈Pn. Conditionally givenc(π )=(i₁, . . . , i_k), set for 1≤r≤nandr∈π_j, ξ_r=i_j ifi_j≥1, and ξ_r= −minπ_j ifi_j=0,

and forr≥n+1, consider independentξ_rwithP(ξ_r=i)=s_i,i≥1,P(ξ_r= −r)=s₀. Thenκ_s^π is the distribution of the partition Π= {{n≥1: ξn=i}, i∈Z}, which puts any twon, n∈Ninto the same block if and only if ξ_n=ξ_n.

In the degenerate case whenκs(P^π)=0, the numerator of (5) always vanishes. Roughly speaking, we use all colours 1, . . . , mfor the largest blocks ofπ. Formally, we replace 1. by 1.:

1. Randomly assign “colours” using the following rule (withZ_s^πas normalisation constant):

P

c(π )=(i₁, . . . , i_k)

= 1 Z_s^π

1≤j≤k:i_j=0

s_i^#π^j

j ,

if{i1, . . . , i_k} = {0, . . . , m}, thei_j≥1 are pairwise distinct and ^k_j₌₁#πj1_{ij=0}is maximal.

Step 2. is applied as before to constructΠ and henceκ_s^π. Note that Π_j =π_j if i_j =0, whileΠ_j⊃π_j will have limiting frequencys_i_j>0 ifi_j≥1.

Nowκ_s^π(P\P^π)=0 and, forπ=(π₁, . . . , π_k)∈K^π:= {π∈K: π∩ [n] =π}, κ_s^π

P^π

= 1 Z^π_s

(i1,...,i_k)admissible for(π,π,s)

s^#^{^J

sπ≤j≤k:ij=0} 0

1≤j≤k:ij=0

s^#π

j1_{_ij_≥1}

ij ,

whereJ_s^π =k+1 in the degenerate case,J_s^π=1 otherwise, and where(i1, . . . , i_k)is admissible for (π, π,s)if (i₁, . . . , i_k)is as in 1. or 1. above, respectively, and if fork+1≤j ≤k, we allowi_j equal to 0 iff #π_j=1, and the i_j withi_j≥1, 1≤j≤k, and pairwise distinct.

Forπ= {{1}}, this is a well-known formula for Kingman’s paintboxκ_s=κ_s^π, withZ_s^π=1. It is easy to show that, in the general case, the modified paintboxesκ_s^πare exchangeable onP^π.

(9)

Proposition 11. For anyn≥1andπ∈Pn,the modified paintboxκ_s^πcan be expressed in terms of anyΓ ∈P^π with asymptotic frequenciess,provided that any blocks ofΓ with zero asymptotic frequency are either subsets of[n]or singletons,as

κ_s^π P^π

= lim

r→∞

#{π∈K^π: π≈Γ|r}

#{π∈K^π: π≈Γ|r} for allπ∈K^π=

π∈K: π∩ [n] =π , where we writeπ≈πifπandπhave the same multiset of block sizes.

Proof. This proof is a refinement of the relevant part of the proof of Theorem 3.1 of [21], Kerov’s proof of Kingman’s paintbox representation of exchangeable partitions inP, where we need to take into account the restriction toP^π. We evaluate the right-hand side. Numerator and denominator are easily calculated, e.g. forπ=(π₁, . . . , π_k)∈P_n^π :=

K^π∩Pn as

#

π∈Pr^π: π≈Γ|r

= r−n

#Γi1|r−#π₁, . . . ,#Γi_k|r−π_k,#Γothers|r

1

j≥1p_j!,

where is over indices(i₁, . . . , i_k)such that #Γij|r−#π_j ≥0 for allj ∈ [k],Γ_others|r is the vector of allΓ_i|r, i≥1, exceptΓ_i₁|r, . . . , Γ_i_k|r, andp_j is the number of blocks of Γ|r withj elements,j ≥1. First assumes₀= 1− _i_≥₁s_i=0, then the limit exists and isZ_s^π,π/Z_s^π,π, where

Z^π,π_s = lim

r→∞

_r₋_n

#Γi1|r−#π₁,...,#Γ_ik|r−#π_k ,#Γothers|r

r^d _r

#Γ₁|r,#Γ₂|r,...

=

(i₁,...,i_k)admissible for(π,π,s)

j:ij=0

s^#π

j

i_j ,

withd the minimal ^k_j₌₁#π_j1_{ij=0}, so thatd >0 only in the degenerate case; this powerd is such that terms with higher than the minimal sum vanish asr→ ∞, and we identifyκ_s^π(P^π).

Ifs₀>0, blocks of zero limiting frequency need to be treated differently, because their unionΓ₀now has a limiting frequency, and a unionπ₀ of blocks ofπcan indeed be associated withΓ0. Specifically, we calculate a first factor as

rlim→∞

r−n

#Γ₀−π₀,#Γi1|r−#π₁,...,#Γik|r−#π_k ,#Γ_others|r

_r

#Γ₀|r,#Γ₁|r,#Γ₂|r,...

=

(i1,...,i_k)admissible for(π ,π,s)

s₀^#^π⁰

k

j=1

s^#^π

j

i_j ,

but then need to also count the further partitions of the block of size #Γ0|r. This yields forπ₀ =π_j

1 ∪ · · · ∪π_j

b a positive limit factor ifd=#π₀−bis minimal, which we then calculate as

rlim→∞

#Γ0|r−#π₀

#Γ_i₁|r−#π_j

1,...,#Γ_ib|r−#π

jb,1,...,1

r^d

(#Γ₀|r)! =s₀⁻^d;

the number of available indices is asymptotically equivalent to #Γ₀|r ∼s₀r, so that the sum contains ∼(#Γ₀|r)^b terms, and this contributes to the asymptotics of the numerator. Finally we sum over the different choices ofπ₀ with

#π₀−b=d to identifyκ_s^π(P^π).

With these representations of the modified paintboxes, we now obtain the integral representation of general measures that are exchangeable onP^π for someπ∈Pn.

Proposition 12. Letμbe a finite measure,exchangeable onP^π for someπ∈Pn.Then there is a finite measureνon S^↓such thatμ=

S^↓κ_s^πν(ds).