Trees within trees II: Nested Fragmentations


HAL Id: hal-01842036

https://hal.archives-ouvertes.fr/hal-01842036

Preprint submitted on 17 Jul 2018


Jean-Jil Duchamps

To cite this version:

Jean-Jil Duchamps. Trees within trees II: Nested Fragmentations. 2018. ⟨hal-01842036⟩


Jean-Jil Duchamps Sorbonne Université

July 16, 2018

Abstract

As in [4], where nested coalescent processes are studied, we generalize the definition of partition-valued homogeneous Markov fragmentation processes to the setting of nested partitions, i.e. pairs of partitions $(\zeta,\xi)$ where $\zeta$ is finer than $\xi$. As in the classical univariate setting, under exchangeability and branching assumptions, we characterize the jump measure of nested fragmentation processes in terms of erosion coefficients and dislocation measures. Among the possible jumps of a nested fragmentation, three forms of erosion and two forms of dislocation are identified, one of which is specific to the nested setting and relates to a bivariate paintbox process.

Contents

1 Introduction

2 Definitions, notation

3 Projective Markov property and strong exchangeability
3.1 Projective Markov process
3.2 Strongly exchangeable Markov process
3.3 Univariate results, mass partitions

4 Outer branching property
4.1 $M$-invariant measures
4.2 Poissonian construction

5 Inner branching property, simple fragmentations
5.1 Some examples
5.2 Characterization of simple nested fragmentations
5.3 Bivariate mass partitions
5.4 A paintbox construction for nested partitions
5.5 Erosion and dislocation for nested partitions

6 Application to binary branching

References

Keywords and phrases. fragmentations; exchangeable; partition; random tree; coalescent; population genetics; gene tree; species tree; phylogenetics; evolution.

MSC 2010 Classification. 60G09, 60G57, 60J25, 60J35, 60J75, 92D15.


1 Introduction

Evolutionary biology aims at tracing back the history of species, by identifying and dating the relationships of ancestry between past lineages of extant individuals. This information is usually represented by a tree or phylogeny, species corresponding to leaves of the tree and speciation events (points in time where several species descend from a single one) corresponding to internal nodes [16,23].

Modern methods consist in analyzing and comparing genetic data from samples of individuals to statistically infer their phylogenetic tree. Probabilistic tree models have been well developed in the last decades, either from individual-based population models like the classical Wright-Fisher model [2,10,15,23], or from time-forward branching processes, where the branching particles are species (see for instance Aldous’s Markov branching models [1] and the surrounding literature [6,7,11,13]), allowing for inference from genetic data. A challenge is that trees inferred from different parts of the genome generally fail to coincide, each of them being understood as an alteration of a “true” underlying phylogeny (which we call the species tree).

To understand the relation between gene trees and the species tree, our goal is to identify a class of Markovian models coupling the evolution of both trees, making the assumption that in general, several gene lineages coexist within the same species, and at speciation events one or several gene lineages diverge from their neighbors to form a new species, i.e. we model the problem as a tree within a tree [9,18–20]. See Figure 1 for an instance of a simple nested genealogy where discrepancies arise between the resulting gene tree and species tree.

Figure 1: Example of a nested tree where the gene tree (in black) does not coincide with the species tree (in gray).

Recent research aims at defining mathematical processes giving rise to such nested trees, generalizing several well-studied univariate (we will sometimes use this term as opposed to “nested”) processes. Some work in progress involves a nested version [5,17] of the Kingman coalescent [14] (considered the neutral model for evolution, appearing as a scaling limit of many individual-based population models). In [4] we study a nested generalization of $\Lambda$-coalescent processes [3,21,22] and characterize their distribution. Our present goal is to generalize the forward-time branching models originating from Aldous [1]. His assumptions (which will be formally defined for our context in Section 3) are basically that the random process of evolution is homogeneous in time and that the law of the process is invariant under both relabeling and resampling of individuals (we then say the process is exchangeable and sampling consistent). We are interested in the partition-valued processes satisfying these assumptions, i.e. the so-called fragmentation processes [3,13], and in this article we generalize their definition to nested partition-valued processes to model jointly a gene tree within a species tree.

Crane [7] also generalizes Aldous’s Markov branching models to study the gene tree/species tree problem, but uses a different approach from the one we use here. Indeed, in his model the entire species tree $t$ is first drawn according to some probability, and then the gene tree $t'$ is constructed thanks to a generalized Markov branching model that depends on $t$. By contrast, our goal is to characterize the class of models in which there is a joint Markov branching construction of both the gene tree and the species tree, under the assumptions of exchangeability and sampling consistency.

In particular our main result, Theorem 17, which will be formally stated in Section 5, consists in showing that nested fragmentation processes satisfying natural branching properties are uniquely characterized by

• three erosion parameters $c_{\mathrm{out}}$, $c_{\mathrm{in},1}$ and $c_{\mathrm{in},2}$ (rates at which a unique lineage can fragment out of its mother block, in three different situations);

• two dislocation measures $\nu_{\mathrm{out}}$ and $\nu_{\mathrm{in}}$ that are Poissonian intensities of how blocks instantaneously fragment into several new blocks with macroscopic frequencies.

The article is organized as follows. Section 2 briefly introduces some definitions and notation used throughout the paper. In Section 3 we define our exchangeability and sampling consistency properties (or projective Markov property), and show their equivalence to a “strong exchangeability” property in a fairly general setting. We also recall some results in the univariate case which we seek to generalize to the nested case. In Section 4 we formulate some branching property assumptions, showing how they lead to simplifications in the representation of semi-groups of fragmentations, and giving a natural Poissonian construction of such processes. Under an additional branching property assumption, Section 5 is devoted to the full characterization of the semi-group of simple nested fragmentation processes, in terms of erosion and dislocation measures. It is shown that dislocations, as in the univariate case, can be understood as (bivariate) paintbox processes. Finally, Section 6 briefly shows how our main result, Theorem 17, translates into simpler terms when we make the classical biological assumption that all splits are binary.

2 Definitions, notation

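For a set $S$, write $\mathcal{P}_S$ for the set of partitions of $S$:

$$\mathcal{P}_S := \Big\{\pi \subset \mathcal{P}(S) \setminus \{\emptyset\},\ \forall A \neq B \in \pi,\ A \cap B = \emptyset \ \text{and}\ \bigcup_{A \in \pi} A = S\Big\},$$

where $\mathcal{P}(S)$ denotes the power set of $S$.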

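For $S$, $S'$ two sets, $\pi \in \mathcal{P}_S$ and $\sigma : S' \to S$ an injection, we write

$$\pi^\sigma := \{\sigma^{-1}(A),\ A \in \pi\} \setminus \{\emptyset\},$$

and if $\mu$ is a measure on $\mathcal{P}_S$ then we write $\mu^\sigma$ for the push-forward of $\mu$ by the map $\pi \mapsto \pi^\sigma$. Note that if $S'' \xrightarrow{\tau} S' \xrightarrow{\sigma} S$ are injections, then we have $\pi^{\sigma\circ\tau} = (\pi^\sigma)^\tau$, and $\mu^{\sigma\circ\tau} = (\mu^\sigma)^\tau$. For $S' \subset S$, there is a natural surjective function $r_{S,S'} : \mathcal{P}_S \to \mathcal{P}_{S'}$ called the restriction, defined by

$$r_{S,S'}(\pi) = \pi|_{S'} := \{A \cap S',\ A \in \pi\} \setminus \{\emptyset\}.$$

Note that $\pi|_{S'} = \pi^\sigma$ for $\sigma : S' \to S,\ x \mapsto x$ the canonical injection.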
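As a purely illustrative aside, the maps $\pi \mapsto \pi^\sigma$ and $\pi \mapsto \pi|_{S'}$ can be sketched in Python for finite sets. The representation of a partition as a frozenset of frozensets and the names `make_partition`, `pullback` and `restrict` are assumptions made for this sketch; they are not notation from the paper.

```python
def make_partition(blocks):
    """Build a partition (frozenset of frozensets), dropping empty blocks."""
    return frozenset(frozenset(b) for b in blocks if b)


def pullback(pi, sigma, domain):
    """Return pi^sigma: the preimages under sigma of the blocks of pi.

    `sigma` is assumed to be an injection from `domain` into the set
    partitioned by `pi`; empty preimages are discarded.
    """
    domain = list(domain)
    return make_partition(
        {x for x in domain if sigma(x) in block} for block in pi
    )


def restrict(pi, subset):
    """Return the restriction of pi to `subset` (pullback by the inclusion)."""
    return pullback(pi, lambda x: x, subset)


# Hypothetical example: S = {1,...,6}, S' = {1,...,4}, S'' = {1, 2}.
pi = make_partition([{1, 2, 5}, {3}, {4, 6}])
sigma = lambda x: x + 2      # injection S' -> S
tau = lambda x: 2 * x        # injection S'' -> S'

pi_sigma = pullback(pi, sigma, range(1, 5))
lhs = pullback(pi, lambda x: sigma(tau(x)), [1, 2])   # pi^(sigma o tau)
rhs = pullback(pi_sigma, tau, [1, 2])                 # (pi^sigma)^tau
```

On this example both `lhs` and `rhs` reduce to the single block $\{1,2\}$, in line with the identity $\pi^{\sigma\circ\tau} = (\pi^\sigma)^\tau$.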

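There is always a partial order on $\mathcal{P}_S$, denoted $\preceq$ and defined as:

$$\pi \preceq \pi' \quad\text{if}\quad \forall (A,B) \in \pi \times \pi',\ A \cap B \neq \emptyset \Rightarrow A \subset B,$$

that is, $\pi \preceq \pi'$ if $\pi$ is finer than $\pi'$. We will work on the space consisting of two nested partitions, which we denote by $\mathcal{P}_S^{2,\preceq}$:

$$\mathcal{P}_S^{2,\preceq} := \{(\zeta,\xi) \in \mathcal{P}_S^2,\ \zeta \preceq \xi\}.$$

We equip the space $\mathcal{P}_S^{2,\preceq}$ with a partial order $\preceq$ defined naturally as $(\zeta,\xi) \preceq (\zeta',\xi')$ if $\zeta \preceq \zeta'$ and $\xi \preceq \xi'$.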
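The refinement order and the notion of nested pair can be checked mechanically on finite examples. The following minimal Python sketch assumes partitions are given as sets of frozensets; the function names `is_finer`, `is_nested` and `pair_leq` are hypothetical and chosen only for illustration.

```python
def is_finer(pi, pi_prime):
    """True if pi is finer than pi_prime: any block A of pi that meets
    a block B of pi_prime is contained in B."""
    return all(a <= b for a in pi for b in pi_prime if a & b)


def is_nested(zeta, xi):
    """True if (zeta, xi) is a nested pair, i.e. zeta is finer than xi."""
    return is_finer(zeta, xi)


def pair_leq(pair, other):
    """Componentwise order on nested pairs."""
    (zeta, xi), (zeta2, xi2) = pair, other
    return is_finer(zeta, zeta2) and is_finer(xi, xi2)


# Hypothetical example on S = {1, 2, 3, 4}.
zeta = {frozenset({1}), frozenset({2, 3}), frozenset({4})}
xi = {frozenset({1, 2, 3}), frozenset({4})}
singletons = {frozenset({i}) for i in range(1, 5)}
```

Here `(zeta, xi)` is a nested pair while `(xi, zeta)` is not, and the pair `(singletons, zeta)` lies below `(zeta, xi)` for the componentwise order.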

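Let us now define, for $n \in \mathbb{N}$, $[n] := \{1, \ldots, n\}$ and $[\infty] := \mathbb{N}$, and for $n \in \mathbb{N} \cup \{\infty\}$:

$$\mathcal{P}_n := \mathcal{P}_{[n]} = \{\zeta \text{ partition of } [n]\}.$$

We will generally label the blocks of a partition $\pi = \{\pi_1, \pi_2, \ldots\}$ in the unique way such that

$$\min \pi_1 < \min \pi_2 < \cdots$$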

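The space $\mathcal{P}_\infty^{2,\preceq}$ is endowed with a distance $d$ which makes it compact, defined as follows:

$$d(\pi,\pi') = \big(\sup\{n \in \mathbb{N},\ \pi|_{[n]} = \pi'|_{[n]}\}\big)^{-1},$$

with the convention $(\sup \mathbb{N})^{-1} = 0$.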
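To make the distance concrete, here is a small Python sketch for single partitions that are only known through their restriction to $[N]$. Treating agreement on all of $[N]$ as distance $0$ is an assumption of the sketch (it mimics the convention $(\sup \mathbb{N})^{-1} = 0$), and the names `restrict_to` and `distance` are hypothetical. For a nested pair one would compare both coordinates at each level.

```python
def restrict_to(pi, n):
    """Restriction of a partition of [N] to [n] = {1, ..., n}."""
    blocks = frozenset(frozenset(x for x in b if x <= n) for b in pi)
    return blocks - {frozenset()}


def distance(pi, pi_prime, N):
    """Inverse of the largest n <= N on which the restrictions agree.

    Both arguments are assumed to be partitions of [N] with N >= 1, so the
    restrictions to [1] always agree. Agreement on all of [N] is treated
    as distance 0.
    """
    agree = 0
    for n in range(1, N + 1):
        if restrict_to(pi, n) == restrict_to(pi_prime, n):
            agree = n
        else:
            break  # restrictions that differ on [n] also differ on [n + 1]
    return 0.0 if agree == N else 1.0 / agree


# Hypothetical example on [4]: the two partitions agree on [3] but not on [4].
pi = {frozenset({1, 2}), frozenset({3, 4})}
pi_prime = {frozenset({1, 2, 4}), frozenset({3})}
d = distance(pi, pi_prime, 4)
```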

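For $k \le n \le \infty$, $\sigma : [k] \to [n]$ an injection and $\pi = (\zeta,\xi) \in \mathcal{P}_n^{2,\preceq}$, we write $\pi^\sigma := (\zeta^\sigma, \xi^\sigma) \in \mathcal{P}_k^{2,\preceq}$. Also, we write $\pi|_{[k]} := (\zeta|_{[k]}, \xi|_{[k]}) \in \mathcal{P}_k^{2,\preceq}$.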

A measure $\mu$ on $\mathcal{P}_n$ or on $\mathcal{P}_n^{2,\preceq}$ is said to be exchangeable if for any permutation $\sigma : [n] \to [n]$, we have

$$\mu^\sigma = \mu.$$

A random variable $\Pi$ taking values in $\mathcal{P}_n$ or in $\mathcal{P}_n^{2,\preceq}$ is said to be exchangeable if for any permutation $\sigma : [n] \to [n]$, we have

$$\Pi^\sigma \overset{(d)}{=} \Pi,$$

that is, if its distribution is exchangeable. Similarly, a random process $(\Pi(t), t \ge 0)$ taking values in $\mathcal{P}_n$ or in $\mathcal{P}_n^{2,\preceq}$ is said to be exchangeable if for any initial state $\pi_0$ and any permutation $\sigma : [n] \to [n]$, we have

$$(\Pi(t)^\sigma, t \ge 0) \text{ under } \mathbb{P}_{\pi_0} \ \overset{(d)}{=}\ (\Pi(t), t \ge 0) \text{ under } \mathbb{P}_{\pi_0^\sigma},$$

where $\mathbb{P}_\pi$ is the distribution of the process started from $\pi$.

Finally, a measure or a random process with values in $\mathcal{P}_\infty$ or $\mathcal{P}_\infty^{2,\preceq}$ will be called strongly exchangeable if its distribution is invariant under the action of injections. Note that while for processes this is a strictly stronger assumption than being exchangeable (see Section 3.2), for measures the two properties are equivalent.

In the following we only consider time-homogeneous Markov processes.

3 Projective Markov property and strong exchangeability

3.1 Projective Markov process

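For each $n \in \mathbb{N}$, let $A_n$ be a finite non-empty set. Assume there are surjective maps $r_{m,n} : A_m \to A_n$ for each $m \ge n$ which satisfy

$$\forall p \ge m \ge n \ge 1,\quad r_{m,n} \circ r_{p,m} = r_{p,n},$$

$$\forall n \in \mathbb{N},\quad r_{n,n} = \mathrm{id}_{A_n}.$$

The family $(A_n, r_{m,n},\ m \ge n \ge 1)$ is called a finite inverse system, and we can define the inverse limit

$$A = \varprojlim A_n := \Big\{(a_n, n \ge 1) \in \prod_{n \in \mathbb{N}} A_n,\ \forall m \ge n,\ r_{m,n}(a_m) = a_n\Big\},$$

along with the canonical projection maps $r_n : A \to A_n$, $(a_n, n \ge 1) \mapsto a_n$. A natural distance $d$ can be defined on the space $A$ by

$$d(a,b) := \big(1/2 + \sup\{n \ge 1,\ a_n = b_n\}\big)^{-1},$$

where we use the conventions $\sup \emptyset = 0$ and $(1/2 + \sup \mathbb{N})^{-1} = 0$. Note that its topology is then generated by the sets

$$r_n^{-1}(\{a\}),\quad n \ge 1,\ a \in A_n,$$

which are the balls of radius $1/n$ and center any $c \in r_n^{-1}(a)$. The assumption that the sets $A_n$ are finite makes the space $(A,d)$ compact, so we can consider stochastic processes with values in $A$.

Remark 1. $\mathcal{P}_\infty = \varprojlim \mathcal{P}_n$ and $\mathcal{P}_\infty^{2,\preceq} = \varprojlim \mathcal{P}_n^{2,\preceq}$ are both inverse limits of finite inverse systems, where the restriction maps are $r_{m,n} : \mathcal{P}_m \to \mathcal{P}_n$, $\pi \mapsto \pi|_{[n]}$.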

Proposition 2. Let $X = (X(t), t \ge 0)$ be a stochastic process with values in $A$, the inverse limit of a finite inverse system. Assume that the following projective Markov property holds:

For all $n \ge 1$, the process $X^n := (r_n(X(t)), t \ge 0)$ is a continuous-time Markov chain in the finite state space $A_n$, whose distribution under $\mathbb{P}_a$ depends only on $r_n(a)$.

Then $X$ is a Markov process, whose distribution is characterized by a transition kernel $K$ from $A$ to $A$ (i.e. $K_a(\cdot)$ is a nonnegative measure on $A$ for all $a \in A$ and $a \mapsto K_a(B)$ is measurable for any Borel set $B$ of $A$) such that

• for all $a \in A$, we have $K_a(\{a\}) = 0$,

• for all $a \in A$ and $a' \in A_n \setminus \{r_n(a)\}$, the Markov chain $X^n$ has a transition rate from $r_n(a)$ to $a'$ equal to

$$q^n_{a,a'} = K_a\big(r_n^{-1}(\{a'\})\big).$$

Proof. $X^n$ is a Markov chain, therefore there exist transition rates

$$q^n_{a,a'} = \lim_{t \to 0} \frac{1}{t}\, \mathbb{P}_a(X^n(t) = a')$$

for all $a \in A$, $a' \in A_n \setminus \{r_n(a)\}$. Now since for $n < m$, $X^m$ and $X^n = r_{m,n}(X^m)$ are both Markov chains, necessarily we have

$$q^n_{a,a'} = \sum_{a'' \in r_{m,n}^{-1}(a')} q^m_{a,a''}.$$

Fix $a^\star \in A$ and $n \ge 1$ and consider the map

$$f_n : a \in A_n \setminus \{r_n(a^\star)\} \longmapsto q^n_{r_n(a^\star),a}.$$

Then these maps $(f_n, n \ge 1)$ satisfy

$$\forall m \ge n \ge 1,\ \forall a \in A_n \setminus \{r_n(a^\star)\},\quad f_n(a) = \sum_{a' \in r_{m,n}^{-1}(\{a\})} f_m(a').$$

It is then easy to check that Carathéodory’s extension theorem allows us to build a measure $K_{a^\star}$ on $A \setminus \{a^\star\}$ (which we see as a measure on $A$ such that $K_{a^\star}(\{a^\star\}) = 0$) for which

$$\forall n \ge 1,\ \forall a \in A_n \setminus \{r_n(a^\star)\},\quad K_{a^\star}\big(r_n^{-1}(\{a\})\big) = f_n(a) = q^n_{r_n(a^\star),a}.$$

Let us check that $K$ is a kernel, i.e. that $a \mapsto K_a(B)$ is measurable for any Borel set $B$. For $B$ of the form $r_n^{-1}(a')$, we have $K_a(B) = q^n_{r_n(a),a'}$, so $a \mapsto K_a(B)$ is clearly measurable. It is readily checked that the sets $r_n^{-1}(a')$ form a $\pi$-system and that the sets $B$ such that $a \mapsto K_a(B)$ is measurable form a monotone class. The monotone class theorem then implies that this property holds for any Borel set $B \subset A$.

Let us now show that $K$ uniquely characterizes the distribution of $X$. Clearly, $K$ characterizes the distribution of $X^n$ for all $n \in \mathbb{N}$, since all the transition rates of the Markov chain $X^n$ can be recovered as a function of $K$. By assumption, those distributions are consistent, in the sense that for any $m \ge n$, we have $r_{m,n}(X^m) \overset{(d)}{=} X^n$, where $\overset{(d)}{=}$ denotes equality in distribution. Then, by Kolmogorov’s extension theorem, there is a unique distribution for $X$ which satisfies $r_n(X) \overset{(d)}{=} X^n$ for all $n \in \mathbb{N}$.

Let us now write $r_n(a) = a_n$ for any $a \in A$ to ease the notation. Note that the infinitesimal generator $G_n$ of the continuous-time finite-space Markov chain $X^n$ is then given by

$$G_n f(a_n) = \sum_{b_n \in A_n \setminus \{a_n\}} q^n_{a,b}\,\big(f(b_n) - f(a_n)\big) = \int_A K_a(\mathrm{d}b)\,\big(f(b_n) - f(a_n)\big),$$

for any function $f : A_n \to \mathbb{R}$ and $a \in A$. Let us see that this result holds in the limit $n \to \infty$, at least for a class of continuous functions $f : A \to \mathbb{R}$. Whether the preceding result holds for a continuous function $f$ will depend on its modulus of continuity $\omega_f : [0,\infty) \to [0,\infty)$, defined for $\varepsilon > 0$ by

$$\omega_f(\varepsilon) := \sup\{|f(a) - f(a')|,\ a, a' \in A,\ d(a,a') \le \varepsilon\},$$

which is always finite since $A$ is compact.

Proposition 3. Let $X$ be a projective Markov process defined on the compact space $(A,d)$, inverse limit of a finite inverse system $(A_n, n \in \mathbb{N})$, and consider its characteristic kernel $K$ as given by Proposition 2.

Let $k_n := \max_{a \in A} K_a\big(A \setminus r_n^{-1}(\{a_n\})\big)$ denote the maximum jump rate of the Markov chain $X^n$. Consider a function $f : A \to \mathbb{R}$ with a modulus of continuity denoted by $\omega_f$, and suppose $\omega_f(1/n)\, k_{n+1}^2 \to 0$ as $n \to \infty$.

Then for every $a \in A$, the function $b \mapsto (f(b) - f(a))$ is $K_a$-integrable and the infinitesimal generator $G$ of the Markov process $X$ is well-defined on $f$ and satisfies

$$G f(a) = \lim_{t \to 0} \frac{\mathbb{E}_a f(X_t) - f(a)}{t} = \int_A K_a(\mathrm{d}b)\,\big(f(b) - f(a)\big). \qquad (1)$$

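Proof. First, note that if $k_n = 0$ for all $n$, then $K_a = 0$ for all $a \in A$ and the process $X$ is almost surely constant, so (1) is correct. We now assume that $k_n > 0$ for $n$ large enough.

Fix $a \in A$. Let us first check that $b \mapsto (f(b) - f(a))$ is $K_a$-integrable. Let $B_0 := A \setminus r_1^{-1}(\{a_1\})$ and for $n \ge 1$, $B_n := r_n^{-1}(\{a_n\}) \setminus r_{n+1}^{-1}(\{a_{n+1}\})$, and notice that

$$\int_A K_a(\mathrm{d}b)\,|f(b) - f(a)| \le K_a(B_0)\,\omega_f(2) + \sum_{n=1}^{\infty} \int_{B_n} K_a(\mathrm{d}b)\,\omega_f(1/n) = k_1\,\omega_f(2) + \sum_{n=1}^{\infty} (k_{n+1} - k_n)\,\omega_f(1/n). \qquad (2)$$

By assumption, $\omega_f(1/n)\, k_{n+1}^2 \to 0$, so we have $\omega_f(1/n) = o\big(k_{n+1}^{-2}\big)$, and since $(k_n)_n$ is a positive, nondecreasing sequence,

$$\sum_{n=N}^{\infty} \frac{k_{n+1} - k_n}{k_{n+1}^2} \le \sum_{n=N}^{\infty} \frac{k_{n+1} - k_n}{k_{n+1} k_n} = \sum_{n=N}^{\infty} \Big(\frac{1}{k_n} - \frac{1}{k_{n+1}}\Big) \le \frac{1}{k_N},$$

which is finite for $N$ such that $k_N > 0$. It follows that the sum in (2) is finite, so the function $b \mapsto (f(b) - f(a))$ is $K_a$-integrable.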

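Now for each $n \in \mathbb{N}$, consider a family $(a^1, a^2, \ldots, a^p) \in A^p$ such that $A_n = \{a_n, a^1_n, a^2_n, \ldots, a^p_n\}$ with no repetition, i.e. such that $p + 1 = |A_n|$. Now let us define for all $b \in A$, $f_n(b) := f(a^i)$ if and only if $b_n = a^i_n$ (with $a^0 := a$). Notice that $f_n$ is an approximation of $f$, in the sense that the error function $g_n : b \mapsto (f(b) - f_n(b))$ necessarily satisfies $|g_n(b)| \le \omega_f(1/n)$. Note also that by definition, $f_n(a) = f(a)$.

Let us first treat the case when there exists $n \ge 1$ such that $\omega_f(1/n) = 0$. By the preceding remark, we have $f_n = f$, in other words there exists a map $\tilde f_n : A_n \to \mathbb{R}$ such that $f(b) = \tilde f_n(b_n) = \tilde f_n(r_n(b))$. So $\mathbb{E}_a f(X_t) = \mathbb{E}_a \tilde f_n(r_n(X_t))$, and since $(r_n(X_t), t \ge 0)$ is a finite-state-space continuous-time Markov chain, it is immediate that

$$\mathbb{E}_a f(X_t) = f(a) + t \sum_{i=1}^{p} q^n_{a,a^i}\,\big(f(a^i) - f(a)\big) + O\big((t k_n)^2 \|f\|_\infty\big),$$

where $\|f\|_\infty := \sup_{b \in A} |f(b)|$, and where the constant in the term $O\big((t k_n)^2 \|f\|_\infty\big)$ does not depend on $t$, $K$ or $f$. From this it is clear that

$$\frac{\mathbb{E}_a f(X_t) - f(a)}{t} \xrightarrow[t \to 0]{} \sum_{i=1}^{p} q^n_{a,a^i}\,\big(f(a^i) - f(a)\big) = \int_A K_a(\mathrm{d}b)\,\big(f(b) - f(a)\big).$$

Now let us assume that for all $n \ge 1$, $\omega_f(1/n) > 0$. Since $f_n(b)$ depends only on $b_n$, we can write

$$\mathbb{E}_a f_n(X_t) = f(a) + t \int_A K_a(\mathrm{d}b)\,\big(f_n(b) - f(a)\big) + O\big((t k_n)^2 \|f\|_\infty\big)$$

$$= f(a) + t \int_{A \setminus r_n^{-1}(\{a_n\})} K_a(\mathrm{d}b)\,\big(f(b) - f(a)\big) + O\big(t\,\omega_f(1/n)\,k_n\big) + O\big((t k_n)^2 \|f\|_\infty\big).$$

Notice also that

$$\frac{|\mathbb{E}_a f(X_t) - \mathbb{E}_a f_n(X_t)|}{t} \le \frac{\omega_f(1/n)}{t},$$

so that putting everything together, we have

$$\frac{\mathbb{E}_a f(X_t) - f(a)}{t} = \int_{A \setminus r_n^{-1}(\{a_n\})} K_a(\mathrm{d}b)\,\big(f(b) - f(a)\big) + O\Big(\omega_f(1/n)\,k_n + \frac{\omega_f(1/n)}{t} + t\,k_n^2\Big). \qquad (3)$$

If one can find $n = n(t)$ such that $n \to \infty$, $\omega_f(1/n)/t \to 0$ and $t\,k_n^2 \to 0$ as $t \to 0$, then passing to the limit in (3), by using the dominated convergence theorem for the integral, yields (1).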

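Now let us define for all $m \ge 1$, $t_m := \sqrt{\omega_f(1/m)}/k_m$ and $t'_m := \sqrt{\omega_f(1/m)}/k_{m+1}$. Notice that

$$t_m \ge t'_m \ge t_{m+1} \xrightarrow[m \to \infty]{} 0,$$

so for each $t \in (0, t_1]$, there is an $m \ge 1$ such that $t \in [t_{m+1}, t_m]$. Then,

• if $t \ge t'_m$, let $n(t) := m$, and we check

$$\omega_f(1/n)/t \le \omega_f(1/n)/t'_n = \sqrt{\omega_f(1/n)}\,k_{n+1}, \quad\text{and}\quad t\,k_n^2 \le t_n\,k_n^2 = \sqrt{\omega_f(1/n)}\,k_n;$$

• if $t \le t'_m$, let $n(t) := m+1$, and we check

$$\omega_f(1/n)/t \le \omega_f(1/n)/t_n = \sqrt{\omega_f(1/n)}\,k_n, \quad\text{and}\quad t\,k_n^2 \le t'_{n-1}\,k_n^2 = \sqrt{\omega_f(1/(n-1))}\,k_n.$$

Since we assumed that $\omega_f(1/n) > 0$ for all $n$, we have $t_m > 0$ for all $m$, which implies that necessarily $n(t) \to \infty$ as $t \to 0$. Finally, the assumption that $\omega_f(1/n)\,k_{n+1}^2 \to 0$ as $n \to \infty$ ensures that both $\omega_f(1/n)/t$ and $t\,k_n^2$ tend to $0$ as $t \to 0$, which concludes the proof.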

We are now interested in exchangeable projective Markov processes with values in the space of nested partitions $\mathcal{P}_\infty^{2,\preceq}$, as an extension of univariate fragmentation processes (with values in $\mathcal{P}_\infty$).

3.2 Strongly exchangeable Markov process

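In the following, we write $\mathbf{P}_\infty$ for either $\mathcal{P}_\infty$ or $\mathcal{P}_\infty^{2,\preceq}$, when our assertions are valid for both spaces. We will also write $\mathbf{P}_n$ for $\mathcal{P}_n$ or $\mathcal{P}_n^{2,\preceq}$. A key property of those spaces is the following.

For any $n \in \mathbb{N}$, and any $\pi \in \mathbf{P}_n$, there is a $\pi^\star \in \mathbf{P}_\infty$ satisfying:

• $\pi^\star|_{[n]} = \pi$,

• for any $\pi' \in \mathbf{P}_\infty$ such that $\pi'|_{[n]} = \pi$, there is an injection $\sigma : \mathbb{N} \to \mathbb{N}$ which satisfies $\sigma|_{[n]} = \mathrm{id}_{[n]}$ and $(\pi^\star)^\sigma = \pi'$.

Indeed, for instance in $\mathbf{P}_\infty = \mathcal{P}_\infty$, it is easy to choose a $\pi^\star$ with infinitely many infinite blocks and no finite blocks, and such that $\pi^\star|_{[n]} = \pi$. This partition immediately satisfies the required property. We will call any such $\pi^\star$ a universal element of $\mathbf{P}_\infty$ with initial part $\pi$ whenever we need to use one.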

Proposition 4. Let $\Pi = (\Pi(t), t \ge 0)$ be an exchangeable Markov process taking values in $\mathbf{P}_\infty$ with càdlàg sample paths. The following propositions are equivalent:

(i) $\Pi$ is strongly exchangeable.

(ii) $\Pi$ has the projective Markov property, i.e. $\Pi^n := (\Pi(t)|_{[n]}, t \ge 0)$ is a Markov chain for all $n \in \mathbb{N}$.

Remark 5. Crane and Towsner [8, Theorem 4.26] show that the projective Markov property is equivalent to the Feller property for exchangeable Markov processes taking values in a Fraïssé space (i.e. a space satisfying general “stability and universality” assumptions [see 8, Definitions 4.4 to 4.11]). In particular the space of partitions and the space of nested partitions are Fraïssé spaces (the argument essentially being the existence of so-called universal elements $\pi^\star$), so for the processes we consider, strong exchangeability is equivalent to the Feller property.

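Proof. (i) ⇒ (ii): Let $n \in \mathbb{N}$ and $\pi \in \mathbf{P}_n$. Fix a universal $\pi^\star \in \mathbf{P}_\infty$ with initial part $\pi$. Now take any $\pi' \in \mathbf{P}_\infty$ such that $(\pi')|_{[n]} = \pi$, and an injection $\sigma : \mathbb{N} \to \mathbb{N}$ such that $\sigma|_{[n]} = \mathrm{id}_{[n]}$ and $(\pi^\star)^\sigma = \pi'$. Now we have

$$\mathbb{P}_{\pi'}(\Pi^n \in \cdot\,) = \mathbb{P}_{\pi^\star}((\Pi^\sigma)^n \in \cdot\,) = \mathbb{P}_{\pi^\star}(\Pi^n \in \cdot\,),$$

so this distribution depends only on $\pi$, which proves that $\Pi^n$ is a Markov process. Now the assumption that $\Pi$ has càdlàg sample paths ensures that the process $\Pi^n$ stays some positive time in each visited state a.s. Therefore $\Pi^n$ is a continuous-time Markov chain.

(ii) ⇒ (i): Let $\sigma : \mathbb{N} \to \mathbb{N}$ be an injection. For $n \in \mathbb{N}$, let $\tau$ be a permutation of $\mathbb{N}$ such that $\tau|_{[n]} = \sigma|_{[n]}$. This property implies $(\pi^\tau)|_{[n]} = (\pi^\sigma)|_{[n]}$ for any $\pi \in \mathbf{P}_\infty$. We deduce

$$\mathbb{P}_\pi((\Pi^\sigma)^n \in \cdot\,) = \mathbb{P}_\pi((\Pi^\tau)^n \in \cdot\,) = \mathbb{P}_{\pi^\tau}(\Pi^n \in \cdot\,) = \mathbb{P}_{\pi^\sigma}(\Pi^n \in \cdot\,),$$

where the last equality is a consequence of the projective Markov property (the distribution of $\Pi^n$ under $\mathbb{P}_\pi$ depends only on the initial segment $\pi|_{[n]}$). Since it is true for all $n$, we have

$$\mathbb{P}_\pi(\Pi^\sigma \in \cdot\,) = \mathbb{P}_{\pi^\sigma}(\Pi \in \cdot\,),$$

which proves the property of strong exchangeability.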

Remark 6. Being strongly exchangeable is strictly stronger than being exchangeable. To see this, define the Markov process $\Pi = (\Pi(t), t \ge 0)$ taking values in $\mathcal{P}_\infty$ by:

• If $\pi \in \mathcal{P}_\infty$ has an infinite number of blocks, then let $\Pi$ under $\mathbb{P}_\pi$ be almost surely the constant function equal to $\pi$.

• If $\pi \in \mathcal{P}_\infty$ has a finite number of blocks, let $T$ be an Exponential(1) random variable, and let the distribution of $\Pi$ under $\mathbb{P}_\pi$ be that of the random function

$$t \mapsto \begin{cases} \pi & \text{if } t < T, \\ \mathbf{0} & \text{if } t \ge T. \end{cases}$$

Then $\Pi$ is clearly exchangeable but not strongly exchangeable.

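Proposition 7. Let $\Pi = (\Pi(t), t \ge 0)$ be a strongly exchangeable Markov process in $\mathbf{P}_\infty$. Then there is a unique kernel $K$ from $\mathbf{P}_\infty$ to $\mathbf{P}_\infty$ such that

• for all $\pi_0 \in \mathbf{P}_\infty$, we have $K_{\pi_0}(\{\pi_0\}) = 0$,

• for all $\pi_1 \in \mathbf{P}_n$, for all $\pi_2 \in \mathbf{P}_n \setminus \{\pi_1\}$, the Markov chain $\Pi^n$ has a transition rate from $\pi_1$ to $\pi_2$ equal to

$$K_{\pi_0}\big(\pi|_{[n]} = \pi_2\big),$$

where $\pi_0$ is any element of $\mathbf{P}_\infty$ such that $(\pi_0)|_{[n]} = \pi_1$.

Furthermore this kernel is strongly exchangeable, i.e. for any $\pi_0 \in \mathbf{P}_\infty$ and any injection $\sigma : \mathbb{N} \to \mathbb{N}$, we have

$$K^\sigma_{\pi_0} = K_{\pi_0^\sigma}.$$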

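Proof. The first part of the proposition is an immediate consequence of Proposition 2. It remains only to prove that $K$ is strongly exchangeable. Consider $\pi_0 \in \mathbf{P}_\infty$, $n \in \mathbb{N}$, $\pi' \in \mathbf{P}_n \setminus \{(\pi_0)|_{[n]}\}$ and an injection $\sigma : \mathbb{N} \to \mathbb{N}$. We have

$$\frac{1}{t}\,\mathbb{P}_{\pi_0}\big((\Pi(t)^\sigma)|_{[n]} = \pi'\big) = \frac{1}{t}\,\mathbb{P}_{\pi_0^\sigma}\big(\Pi(t)|_{[n]} = \pi'\big)$$

because of the strong exchangeability of $\Pi$, and taking limits we find

$$K_{\pi_0}\big((\pi^\sigma)|_{[n]} = \pi'\big) = K_{\pi_0^\sigma}\big(\pi|_{[n]} = \pi'\big).$$

So the two $\sigma$-finite measures $K^\sigma_{\pi_0}$ and $K_{\pi_0^\sigma}$ coincide on the sets of the form $\{\pi|_{[n]} = \pi'\}$, which constitute a $\pi$-system generating the Borel sets of $\mathbf{P}_\infty$. Therefore they are equal, which concludes the proof.

Remark 8. Consider a universal element $\pi^\star \in \mathbf{P}_\infty$ such that for any $\pi \in \mathbf{P}_\infty$, there is an injection $\sigma$ such that $\pi = (\pi^\star)^\sigma$. The exchangeability property of the kernel $K$ then implies that $K_\pi = K^\sigma_{\pi^\star}$, therefore $K$ is entirely determined by the single measure $K_{\pi^\star}$.

3.3 Univariate results, mass partitions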

Random exchangeable partitions $\pi \in \mathcal{P}_\infty$ and their relation to random mass partitions are well known [see 3, Chapter 2]. Let us recall briefly some definitions and results, which we will then extend to the nested case. We define the space of mass partitions

$$\mathcal{P}_{\mathrm{m}} := \Big\{\mathbf{s} = (s_1, s_2, \ldots) \in [0,1]^{\mathbb{N}},\ s_1 \ge s_2 \ge \cdots,\ \sum_k s_k \le 1\Big\}. \qquad (4)$$

For $\mathbf{s} \in \mathcal{P}_{\mathrm{m}}$, one defines an exchangeable distribution $\varrho_{\mathbf{s}}$ on $\mathcal{P}_\infty$ by the following so-called paintbox construction:

• for $k \ge 0$, define $t_k = \sum_{k'=1}^{k} s_{k'}$, with $t_0 = 0$ by convention;

• let $(U_i, i \ge 1)$ be an i.i.d. sequence of uniform random variables in $[0,1]$;

• define the random partition $\pi \in \mathcal{P}_\infty$ by setting

$$i \sim_\pi j \iff i = j \ \text{or}\ \exists k \ge 1,\ U_i, U_j \in [t_{k-1}, t_k).$$

Note that the set $\pi_0 := \{[t_{k-1}, t_k),\ k \ge 1\} \cup \{\{t\},\ \sum_{k \ge 1} s_k \le t \le 1\}$ is a partition of $[0,1]$, and that we have $\pi = \pi_0^\sigma$, where $\sigma : \mathbb{N} \to [0,1]$ is the random injection defined by $\sigma : i \mapsto U_i$. Also, note that by definition some blocks are singletons (blocks $\{i\}$ such that $U_i \in [\sum_{k \ge 1} s_k, 1]$), and by construction we have

$$\frac{\#\{i \in [n],\ \{i\} \in \pi\}}{n} \xrightarrow[n \to \infty]{} s_0 := 1 - \sum_{k \ge 1} s_k.$$

These integers that are singleton blocks are called the dust of the random partition $\pi$, and the last display tells us there is a frequency $s_0$ of dust.
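The three steps of the paintbox construction can be sketched as a short simulation. This is a minimal illustration restricted to $[n]$ and to a mass partition with finitely many nonzero terms; the function name `paintbox` and its parameters are assumptions of the sketch, not part of the paper.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def paintbox(s, n, rng):
    """Sample the restriction to [n] of a paintbox partition.

    `s` is a finite list of masses (assumed nonincreasing, with sum <= 1);
    the remaining mass 1 - sum(s) is treated as dust.
    """
    t = list(accumulate(s))           # t[k-1] = s_1 + ... + s_k
    blocks = {}                       # interval index -> integers painted by it
    dust = []                         # singleton blocks
    for i in range(1, n + 1):
        u = rng.random()              # U_i, uniform on [0, 1)
        k = bisect_right(t, u)        # u lies in [t_{k}, t_{k+1}) with 0-based k
        if k < len(t):
            blocks.setdefault(k, []).append(i)
        else:
            dust.append([i])          # u >= sum(s): i is a singleton
    partition = list(blocks.values()) + dust
    partition.sort(key=min)           # label blocks by increasing least element
    return partition


# Hypothetical example: masses (0.5, 0.3), hence a dust frequency of 0.2.
rng = random.Random(0)
partition = paintbox([0.5, 0.3], 2000, rng)
dust_fraction = sum(1 for b in partition if len(b) == 1) / 2000
```

For large `n`, `dust_fraction` should be close to $s_0 = 0.2$, which is the convergence stated in the last display. Floating-point rounding of the cumulative sums can misplace a point lying exactly at an interval boundary; this is ignored here.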

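Conversely, any random exchangeable partition $\pi$ has a distribution that can be expressed with these paintbox constructions $\varrho_{\mathbf{s}}$. Indeed, $\pi$ has asymptotic frequencies, i.e.

$$|B| := \lim_{n \to \infty} \frac{\#(B \cap [n])}{n} \quad\text{exists a.s. for all } B \in \pi.$$

Let us write $|\pi|^{\downarrow} \in \mathcal{P}_{\mathrm{m}}$ for the decreasing reordering of $(|B|, B \in \pi)$, ignoring the zero terms coming from the dust. Now it is known [14, Theorem 2] that the conditional distribution of $\pi$ given $|\pi|^{\downarrow} = \mathbf{s}$ is $\varrho_{\mathbf{s}}$, so we have

$$\mathbb{P}(\pi \in \cdot\,) = \int \mathbb{P}(|\pi|^{\downarrow} \in \mathrm{d}\mathbf{s})\,\varrho_{\mathbf{s}}(\,\cdot\,).$$

This means that any exchangeable probability measure on $\mathcal{P}_\infty$ is of the form $\varrho_\nu$ where $\nu$ is a probability measure on $\mathcal{P}_{\mathrm{m}}$, and

$$\varrho_\nu(\,\cdot\,) := \int_{\mathcal{P}_{\mathrm{m}}} \varrho_{\mathbf{s}}(\,\cdot\,)\,\nu(\mathrm{d}\mathbf{s}).$$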

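Furthermore, Bertoin [3, Theorem 3.1] shows that any exchangeable measure $\mu$ on $\mathcal{P}_\infty$ such that

$$\forall n \ge 1,\quad \mu\big(\pi|_{[n]} \neq \mathbf{1}_{[n]}\big) < \infty \qquad (5)$$

can be written $\mu = c\,\mathfrak{e} + \varrho_\nu$, where $c \ge 0$, $\nu$ is a measure on $\mathcal{P}_{\mathrm{m}}$ satisfying

$$\int_{\mathcal{P}_{\mathrm{m}}} (1 - s_1)\,\nu(\mathrm{d}\mathbf{s}) < \infty, \qquad (6)$$

and $\mathfrak{e}$ is the so-called erosion measure, defined by

$$\mathfrak{e} := \sum_{i \in \mathbb{N}} \delta_{\{\{i\},\,\mathbb{N} \setminus \{i\}\}}.$$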

As a result, each fragmentation process with values in $\mathcal{P}_\infty$ is characterized by its erosion coefficient $c$ and characteristic measure $\nu$, in such a way that its rates can be described as follows:

A block of size $n$ fragments, independently of the other blocks, into a partition with $k$ different blocks of sizes $n_1, n_2, \ldots, n_k$ with rate

$$c\,\mathbb{1}\{k = 2, \text{ and } n_1 = 1 \text{ or } n_2 = 1\} + \int_{\mathcal{P}_{\mathrm{m}}} \nu(\mathrm{d}\mathbf{s}) \sum_{\mathbf{i}} s_{i_1}^{n_1} \cdot s_{i_2}^{n_2} \cdots s_{i_k}^{n_k},$$

where $s_0$ is defined to be $1 - \sum_{i \ge 1} s_i$, and the sum is over the vectors $\mathbf{i} = (i_1, \ldots, i_k) \in \{0, 1, \ldots\}^k$ such that $i_j$ may be $0$ only if $n_j = 1$, and if $j \neq j'$ and $i_j \neq 0$, then $i_{j'} \neq i_j$.
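The rate formula above can be evaluated by brute force when $\nu$ is a finite combination of Dirac masses at mass partitions with finitely many nonzero terms. The sketch below makes exactly that assumption; the function name `fragmentation_rate` and the encoding of $\nu$ as a list of `(weight, masses)` pairs are hypothetical choices made for illustration, and the enumeration is only practical for small $k$.

```python
from itertools import product
from math import prod


def fragmentation_rate(sizes, c, nu_atoms):
    """Rate at which a block splits into k >= 2 blocks of the given sizes.

    `sizes` is (n_1, ..., n_k), `c` is the erosion coefficient and
    `nu_atoms` is a list of (weight, s) pairs encoding the measure
    nu = sum of weight * delta_s, each s being a finite tuple of masses.
    """
    k = len(sizes)
    # Erosion term: only splits of a single element from the rest.
    rate = c if (k == 2 and 1 in sizes) else 0.0
    for weight, s in nu_atoms:
        masses = [1.0 - sum(s)] + list(s)      # masses[0] is the dust mass s_0
        total = 0.0
        for idx in product(range(len(masses)), repeat=k):
            nonzero = [i for i in idx if i != 0]
            if len(set(nonzero)) != len(nonzero):
                continue                       # nonzero indices must be distinct
            if any(i == 0 and n != 1 for i, n in zip(idx, sizes)):
                continue                       # index 0 only for blocks of size 1
            total += prod(masses[i] ** n for i, n in zip(idx, sizes))
        rate += weight * total
    return rate


# Hypothetical example: nu = delta_(1/2, 1/2), no erosion, a block of size 2
# splitting into two singletons.
rate_pair = fragmentation_rate((1, 1), 0.0, [(1.0, (0.5, 0.5))])
```

In this example the rate is $1/2$, the probability that two independent uniform variables fall in different halves of $[0,1]$.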

We aim at showing a similar result concerning fragmentations of nested partitions.

4 Outer branching property

From now on, to be able to give a more precise characterization of nested fragmentation processes, we will exclude from the study those processes which exhibit simultaneous fragmentations in separate blocks. That is, we will assume a branching property: two different blocks at a given time undergo two independent fragmentations in the future. In the univariate case, Bertoin [3, Definition 3.2] expresses the branching property thanks to the introduction of a mapping $\mathrm{Frag} : \mathcal{P}_\infty \times \mathcal{P}_\infty^{\mathbb{N}} \to \mathcal{P}_\infty$. While a similar definition could be made in the nested case, the analog of the Frag mapping would be too lengthy to introduce and we found it simpler to assume an equivalent fact, which is all we will use in later proofs: distinct blocks fragment at distinct times.

We also need to distinguish two branching properties in the case of nested fragmentations, each concerning either the outer or the inner blocks (branching property for $\xi$ or for $\zeta$).

Definition 9. Let $\Pi = (\Pi(t), t \ge 0) = ((\zeta(t), \xi(t)), t \ge 0)$ be a strongly exchangeable Markov process with values in $\mathcal{P}_\infty^{2,\preceq}$ and decreasing càdlàg sample paths. We say that $\Pi$ satisfies the outer branching property if

Almost surely for all $t$ such that $\Pi(t-) \neq \Pi(t)$, there is a unique block $B \in \xi(t-)$ such that $\Pi(t-)|_B \neq \Pi(t)|_B$.

Moreover, we say that $\Pi$ satisfies the inner branching property if

Almost surely for all $t$ such that $\zeta(t-) \neq \zeta(t)$, there is a unique block $B \in \zeta(t-)$ such that $\zeta(t-)|_B \neq \zeta(t)|_B$.

Nested fragmentation processes satisfying both branching properties will be called simple.

The rest of the paper is dedicated to characterizing simple nested fragmentation processes as simply and precisely as possible.

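Proposition 10. Let $\Pi = (\Pi(t), t \ge 0) = ((\zeta(t), \xi(t)), t \ge 0)$ be a strongly exchangeable Markov process with values in $\mathcal{P}_\infty^{2,\preceq}$ and decreasing càdlàg sample paths. Write $K$ for its exchangeable characteristic kernel.

If $\Pi$ satisfies the outer branching property, then the characteristic kernel $K$ is characterized by a simpler kernel $\kappa$ from $\mathcal{P}_\infty$ to $\mathcal{P}_\infty^{2,\preceq}$ which is defined as

$$\kappa_\zeta(\,\cdot\,) := K_{(\zeta,\mathbf{1})}(\,\cdot\,),$$

where $\mathbf{1}$ denotes the partition of $\mathbb{N}$ with only one block. The simpler kernel is also strongly exchangeable.

The kernel $K$ is determined by $\kappa$ in the following way: fix $\pi_0 = (\zeta,\xi) \in \mathcal{P}_\infty^{2,\preceq}$ and for simplicity suppose that all the blocks of $\xi$ are infinite. For all $B \in \xi$, define an injection $\sigma_B : \mathbb{N} \to \mathbb{N}$ whose image is $B$, and $\tau_B : B \to \mathbb{N}$ such that $\sigma_B \circ \tau_B = \mathrm{id}_B$. By definition, $(\pi_0)^{\sigma_B}$ is of the form $(\zeta_B, \mathbf{1})$, with $\zeta_B = \zeta^{\sigma_B}$. Now define $f_B$ as the map which sends $\pi \in \mathcal{P}_\infty^{2,\preceq}$ to the unique $\omega \in \mathcal{P}_\infty^{2,\preceq}$ such that

$$\omega \preceq \big(\{B, \mathbb{N} \setminus B\}, \{B, \mathbb{N} \setminus B\}\big),\qquad \omega|_B = \pi^{\tau_B} \quad\text{and}\quad \omega|_{\mathbb{N} \setminus B} = (\pi_0)|_{\mathbb{N} \setminus B}.$$

Then for any Borel set $A \subset \mathcal{P}_\infty^{2,\preceq}$, we have

$$K_{\pi_0}(A) = \sum_{B \in \xi} \kappa_{\zeta_B}\big(\{f_B(\pi) \in A\} \cap \{\pi \neq (\pi_0)^{\sigma_B}\}\big).$$

Remark 11. This proposition shows how $K_{\pi_0}$ is expressed in terms of the kernel $\kappa$ only for $\pi_0 = (\zeta,\xi)$ such that all the blocks of $\xi$ are infinite. In fact this is enough to characterize $K$ entirely since if $\pi_0$ does not satisfy this property, there exists a nested partition $\pi'_0 = (\zeta', \xi')$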
