
École Normale Supérieure de Lyon

August 23, 2017 Internship

Adaptation of asexual populations in a source-sink system

by Florian Lavigne

Directors: Lionel ROQUES1, BioSP - INRA PACA (Avignon, France);

Guillaume MARTIN2, ISEM (Montpellier, France).

Host laboratories: BioSP, INRA, ISEM, ENS Lyon.

1lionel.roques@inra.fr

2guillaume.martin@umontpellier.fr


Acknowledgements

I thank the BioSP, INRA PACA and ISEM laboratories for hosting me during my stay and for placing their trust in me. I appreciated the opportunities they gave me to learn more about statistics through their seminars.

I thank Lionel ROQUES and Guillaume MARTIN for supervising my internship. Thanks to them, I have improved my knowledge of biological models, especially Fisher's Geometric Model, and my wish to pursue a PhD in this field has been reinforced.


1 Introduction

Evolutionary adaptation is a central process in ecology. Because of selection, particular genotypes reproduce more easily than others: the contribution of each genotype to the next generation is measured by its fitness. In addition, genetic drift changes the genetic composition of the population through chance events, because population sizes are finite. The other process governing adaptation is mutation: permanent and heritable changes can occur in genes at each generation.

Before going further, let us give a precise meaning to the concept of fitness, which is central in this manuscript: the (Darwinian) fitness of a genotype is the expected number of offspring of an individual carrying this genotype, and its Malthusian fitness $m$ is the corresponding exponential growth rate. Using a continuous-time viewpoint, if we neglect mutation and genetic drift, the number $N_m(t)$ of individuals with fitness $m$ changes according to the equation $\partial_t N_m(t) = m\, N_m(t)$. Denoting by $N(t)$ the total population size and by $p(t, m) = N_m(t)/N(t)$ the proportion of genotypes with fitness $m$, we observe that $N'(t) = \overline{m}(t)\, N(t)$ and $\partial_t p(t, m) = p(t, m)\,(m - \overline{m}(t))$, where $\overline{m}(t)$ is the mean fitness in the population at time $t$. In other words, the proportion of genotypes with fitness $m$ increases if $m$ is larger than the average fitness in the population, and decreases otherwise: this corresponds to Darwinian evolution.
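The selection equation above can be checked numerically. The following minimal sketch assumes three fitness classes with illustrative (hypothetical) fitnesses and initial sizes, lets each class grow as $N_m(t) = N_m(0)\,e^{mt}$, and compares a finite-difference estimate of $\partial_t p(t,m)$ with $p(t,m)\,(m - \overline{m}(t))$.

```python
import math

def frequencies(n0, fitnesses, t):
    """Class frequencies p(t, m) when each class grows as N_m(0) * exp(m * t)."""
    sizes = [n * math.exp(m * t) for n, m in zip(n0, fitnesses)]
    total = sum(sizes)
    return [s / total for s in sizes]

def mean_fitness(p, fitnesses):
    """Within-population mean fitness: sum of p(t, m) * m over classes."""
    return sum(pi * m for pi, m in zip(p, fitnesses))

# Illustrative values (assumptions, not taken from the report)
fitnesses = [-0.5, 0.0, 0.3]
n0 = [100.0, 100.0, 100.0]

t, h = 1.0, 1e-6
p = frequencies(n0, fitnesses, t)
m_bar = mean_fitness(p, fitnesses)

# Left-hand side: centred finite difference of p(t, m) in time
p_plus = frequencies(n0, fitnesses, t + h)
p_minus = frequencies(n0, fitnesses, t - h)
lhs = [(a - b) / (2 * h) for a, b in zip(p_plus, p_minus)]

# Right-hand side: p(t, m) * (m - mean fitness)
rhs = [pi * (m - m_bar) for pi, m in zip(p, fitnesses)]
```

In this setting the two lists agree up to discretisation error, and the mean fitness increases with time, as expected under selection alone.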

Typically, in a population with a constant size N, adaptation is modelled with stochastic Individual-Based Models (IBMs). The Wright-Fisher Model (see Figure 1) assumes that:

(Selection step) generation $t+1$ is obtained by sampling $N$ individuals (with replacement) from generation $t$, according to their fitnesses $m_t = (m_t^1, \dots, m_t^N)$. This is modelled by a multinomially distributed random variable with parameters $N$ and $\exp(m_t)$, the Darwinian fitness, i.e. the discrete-time counterpart of the Malthusian fitness defined above;

(Mutation step) individuals are affected by a Poisson number of mutations, with rate U.

[Figure 1 diagram. Generation $t$: $N$ individuals of fitness $m_t = (m_t^1, \dots, m_t^N)$. Selection: $N$ individuals sampled with replacement, $\mathrm{Multi}(N, \exp(m_t))$. Mutation: number $k \sim \mathcal{P}(U)$, effect $m \to m + s$.]

Figure 1: Wright-Fisher Model.
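The two steps of Figure 1 can be sketched as follows. This is a minimal illustration, not the simulation code used in this work: the function names, the `draw_effect` callback (which returns a mutation effect $s$ given the parental fitness) and the small Poisson sampler are assumptions introduced here.

```python
import math
import random

def poisson(rng, lam):
    """Knuth's Poisson sampler; adequate only for small rates lam."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def wright_fisher_step(fitnesses, U, draw_effect, rng):
    """One generation: multinomial selection, then Poisson(U) mutations."""
    n = len(fitnesses)
    # Selection step: sample N parents with replacement, weights exp(m_i)
    weights = [math.exp(m) for m in fitnesses]
    parents = rng.choices(fitnesses, weights=weights, k=n)
    # Mutation step: each offspring receives k ~ Poisson(U) mutations,
    # each one changing its fitness from m to m + s
    offspring = []
    for m in parents:
        for _ in range(poisson(rng, U)):
            m += draw_effect(rng, m)
        offspring.append(m)
    return offspring

# Usage with a hypothetical context-independent Gaussian kernel
rng = random.Random(0)
population = [-0.2] * 50 + [0.0] * 50
next_generation = wright_fisher_step(
    population, U=0.1, draw_effect=lambda r, m: r.gauss(-0.05, 0.1), rng=rng
)
```

Passing the parental fitness to `draw_effect` leaves room for a context-dependent kernel $J_m$; a context-independent kernel simply ignores that argument.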

The distribution of fitness effects (DFE) of mutations may or may not include epistasis. In other words, the effects of mutations on fitness may be additive (non-epistatic case) or non-additive (epistatic case). If the fitness of the parent is $m$, the fitness of its offspring is $m+s$, where $s$ is a random variable with distribution $J$.

If the model is non-epistatic, then $J$ is independent of $m$: we say that the model is context-independent. If the model is epistatic, then $J = J_m$ depends on the value of $m$ (see Figure 2), and we say that the model is context-dependent.

Figure 2: Mutation kernel for an epistatic model. We assume here a fitness optimum arbitrarily fixed at $m = 0$. The '+' sign indicates beneficial mutations, and the '−' sign indicates deleterious mutations.

1.1 Fisher’s Geometric Model (FGM)

The most common model to describe adaptation (e.g. Tenaillon, 2014) is Fisher's Geometric Model (FGM) with a single optimum. It assumes that each genotype is characterized by a phenotype made of $n$ traits, namely a vector $g \in \mathbb{R}^n$. An optimal phenotype corresponds to the maximal fitness $m_{\max}$ and sets the origin of phenotype space ($g = 0$). Fitness decreases away from this optimum. We consider here a quadratic fitness function: in continuous-time models, Malthusian fitness is a quadratic function of the breeding value, $m(g) = m_{\max} - \|g\|^2/2$, and in discrete-time versions, Darwinian fitness is a Gaussian function of $g$, $W(g) = e^{m(g)} = W_{\max} \exp(-\|g\|^2/2)$, with $W_{\max} = e^{m_{\max}}$, where $m_{\max}$ is the Malthusian fitness at the optimum. Tenaillon (2014) takes the optimum at $g = 0$, which can be done without loss of generality by translation of the phenotypic space.

Mutations occur at rate $U$ and create independent and identically distributed (iid) random variations $dg$ around the parent phenotype, for each trait. In the classical "Gaussian FGM", Martin (2014) describes the phenotypic effects of mutations as multivariate normal: $dg \sim \mathcal{N}(0, \lambda I_n)$, where $\lambda$ is the mutational variance at each trait and $I_n$ is the identity matrix in $n$ dimensions. Figure 3 explains why these assumptions induce epistasis and a distribution of fitness effects of mutations of the form presented in Figure 2.

For finite haploid asexual populations, Martin and Roques (2016) proposed a PDE framework to study the dynamics of the distribution of the Malthusian fitness $m$. In continuous time, this is the expected exponential growth rate of the population. In that paper, the authors analysed the context-independent and context-dependent cases using the Moment Generating Function (MGF) and the Cumulant Generating Function (CGF) associated with the distribution of fitness in the population. This approach is mainly based on the derivation and analysis of the partial differential equations (PDEs) satisfied by these functions.

Figure 3: Schematic description of a 2D phenotype-fitness landscape with a single optimum, as in Fisher's Geometric Model. The green points represent phenotypes $g_1$, $g_2$ and $g_3$, and the green lines are isofitness lines. The gold point is the optimal phenotype. Red zones correspond to beneficial mutations, and blue zones to deleterious mutations. We observe that phenotypes close to the optimum lead to fewer beneficial mutations.

1.2 Evolutionary Rescue and Source-Sink systems

Evolutionary Rescue (ER) occurs when a population, initially declining because of exposure to an environment outside of its ecological niche, avoids extinction via genetic adaptation that restores population growth. This phenomenon underlies a range of biological contexts of fundamental importance (colonization attempts, human-mediated introductions or reintroductions, climate change) and of applied importance (range expansions or contractions, host shifts in pathogens, and the emergence of resistance to herbicides, pesticides, fungicides, antibiotics or chemotherapy in cancer). Better strategies for managing the emergence of resistance would benefit from an empirically established and general framework to understand and predict ER, in more or less complex situations.

Modelling ER relies on eco-evolutionary coupling, i.e. coupling a model of population dynamics for the population size $N(t)$ with a model for the dynamics of adaptation. The evolutionary dynamics of mean fitness (growth rate) directly influence the population size dynamics: this is the core of all ER models. This coupling must be handled in non-equilibrium models to predict the probability of ER in a given population facing a given stress. Matters become more complicated when considering multiple habitats connected by migration, or within-habitat variation in stress level, at the timescale of ER.

We will analyse a sink, i.e. an initially empty (or sufficiently sparsely populated) stressful environment where the population decays, with constant immigration from a stable source in which the population thrives. The population in the source is assumed to be at mutation-selection balance. As the migrants are initially not adapted to the sink, their fitness in the sink, denoted $m^*$, differs from their initial fitness in the source. We will consider two cases: in the context-independent case, the distribution of $m^*$ will be arbitrary; in the FGM case, it will be obtained by assuming a translation of the optimum in the phenotypic fitness space (see Figure 3). Invasion of the sink relies on ER occurring there. Source-sink systems rely on simplistic assumptions but constitute a first step towards spatial eco-evolutionary modelling.

[Figure 4 diagram. Source: stable population obtained with the IBM of Figure 1. Sink, generation $t$: $N(t)$ individuals of fitness $m_t = (m_t^1, \dots, m_t^{N(t)})$. Immigration: sample of $D \sim \mathcal{P}(d)$ migrants, giving $\tilde N(t) = N(t) + D$ and $\tilde m_t = (m_t^1, \dots, m_t^{N(t)}, m^*_1, \dots, m^*_D)$. Selection (+ demography): $N(t+1) \sim \mathcal{P}(\overline{e^{m_t}}\, \tilde N(t))$ individuals sampled w.r.t. $\mathrm{Multi}(N(t+1), \exp(\tilde m_t))$. Mutation: number $k \sim \mathcal{P}(U)$, effect $m \to m + s$.]

Figure 4: IBM including demography ($N(t)$ is not constant) for a population under the effects of mutation, selection and migration.

The corresponding stochastic Individual-Based Model is detailed in Figure 4: compared to the standard Wright-Fisher Model, this approach includes immigration events and demography. Let us explain how the demography term is constructed. Assume that, after the immigration step, there are $\tilde N(t)$ individuals in the sink at generation $t$, each with (Darwinian) fitness $e^{m_t^i}$ ($i = 1, \dots, \tilde N(t)$). By definition of fitness, the expected number of offspring of individual $i$ is $e^{m_t^i}$. Assuming that the number of offspring of each individual follows a Poisson distribution, the total population size at generation $t+1$ satisfies

$$N(t+1) \sim \mathcal{P}\left( \sum_{i=1}^{\tilde N(t)} e^{m_t^i} \right).$$

In other words, $N(t+1) \sim \mathcal{P}\big( \tilde N(t)\, \overline{e^{m_t}} \big)$, where $\overline{e^{m_t}}$ is the mean Darwinian fitness at generation $t$.
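One generation of the sink IBM can be sketched as below. This is an illustrative sketch under stated assumptions: `draw_migrant` and `draw_effect` are hypothetical callbacks for the distributions of $m^*$ and of mutation effects, and the Poisson sampler is only suitable for small rates. Giving each individual an independent Poisson number of offspring with mean $e^{m_i}$ is equivalent to drawing a Poisson total with mean $\sum_i e^{m_i}$ and then sampling parents multinomially with weights $e^{m_i}$.

```python
import math
import random

def poisson(rng, lam):
    """Knuth's Poisson sampler; adequate only for small rates lam."""
    threshold, k, prod = math.exp(-lam), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sink_step(fitnesses, d, U, draw_migrant, draw_effect, rng):
    """Immigration, selection with demography, then mutation."""
    # Immigration: D ~ Poisson(d) migrants with fitness m* in the sink
    pool = list(fitnesses)
    pool += [draw_migrant(rng) for _ in range(poisson(rng, d))]
    offspring = []
    for m in pool:
        # Selection + demography: Poisson(exp(m)) offspring per individual
        for _ in range(poisson(rng, math.exp(m))):
            child = m
            # Mutation: Poisson(U) mutations, each adding an effect s
            for _ in range(poisson(rng, U)):
                child += draw_effect(rng, child)
            offspring.append(child)
    return offspring

# Usage: initially empty sink, clonal migrants with m* = -0.3 (illustrative)
rng = random.Random(1)
population = []
for _ in range(20):
    population = sink_step(
        population, d=2.0, U=0.05,
        draw_migrant=lambda r: -0.3,
        draw_effect=lambda r, m: r.gauss(0.0, 0.1),
        rng=rng,
    )
```

The population size is simply `len(population)`, which now fluctuates from one generation to the next.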

Gomulkiewicz and Holt (1995) studied the time needed for a population to recover from a stress, or the time after which the population reaches critically low densities (and so becomes extinct). Here, we are interested in the time needed before the invasion of the sink: ER always occurs after sufficient time because of continuous immigration. This characteristic time $t_0$ corresponds to the time at which the population overcomes its initial maladaptation, that is, when the mean fitness in the sink population becomes non-negative.

1.3 Aim of this work

The aim of this work is to build a mathematically tractable framework to analyse the source-sink problem, with a particular focus on the characteristic time. To this end, we extend the PDE approach of Martin and Roques (2016) to describe the dynamics of the fitness distribution in the sink, accounting for migration (from source to sink only) and for a variable population size $N(t)$ (due to migration and birth/death events). We assume the same type of mutation effects on fitness in the source and in the sink, either context-independent or context-dependent.

In Section 2, we first build the PDEs satisfied by the Cumulant Generating Function of the distribution of fitness in the sink. Then, in Section 3, we derive analytic solutions of these PDEs in the context-independent case, which allows us to study the characteristic time (finiteness and formulae). In Section 4, we derive some results in the context-dependent case when the mutation rate is low. In Section 5, we validate the theory of Sections 2-4. In Section 6, we derive an integro-differential framework for the source-sink problem, and justify the use of Generating Functions. Lastly, we discuss our results in Section 7. All proofs are provided in Appendix C.

All of the notations of this paper are presented in Table 1.

2 Derivation of the model

We will assume that during a small time step $\delta t$, a single immigration event occurs with probability $d\, \delta t$, where $d$ is the immigration rate.

2.1 Distributions of fitnesses $m^*$ of newly-arriving migrants

We will consider two main cases, depending on whether epistasis is present. We denote by $p^*$ the distribution of $m^*$. Furthermore, we assume that the migrants are, on average, maladapted to the sink, i.e. $E[m^*] < 0$.

Context-independent case

The distribution of $m^*$ is arbitrary. We consider the following distributions:

clonal case: all individuals in the source have the same genotype. Their fitness in the sink is denoted $m^*$, so that $p^* = \delta_{m^*}$. As they are maladapted to the sink, the fitness $m^*$ is negative;

Gaussian case: we assume that the distribution of phenotypes in the source induces a Gaussian $\mathcal{N}(\mu, \sigma^2)$ distribution of fitness, with $\mu < 0$.

Table 1: Notation.

Notation | Description | Formula
$m$ | Malthusian fitness |
$\{m_i\}$ | Fitness classes within a population |
$p(t, m_i)$ | Frequency of the fitness class $m_i$ at time $t$ |
$N$, $N(t)$ | Population size, at time $t$ |
$\zeta(t, m)$ | Population size of fitness $m$, at time $t$ | $\zeta(t, m) = N(t)\, p(t, m)$
$K(t)$ | Number of fitness classes at time $t$ |
$\langle m \rangle(t)$ | Mean fitness at time $t$ | $\sum_{i=1}^{K(t)} p(t, m_i)\, m_i$
$V(t)$ | Variance in fitness at time $t$ | $\sum_{i=1}^{K(t)} p(t, m_i)\, m_i^2 - \overline{m}(t)^2$
$\overline{X}$ | Mean value of any variable $X(m)$, averaged over the current distribution of genotypes within a population | $\sum_{i=1}^{K(t)} p(t, m_i)\, X(m_i)$
$\langle X \rangle$ | "Ensemble expectation" of any random variable $X$, averaged over replicate (finite) populations |
$M(t, z)$ | "Empirical" MGF of $p$ in the sink, at time $t$ | $\sum_{i=1}^{K(t)} p(t, m_i)\, e^{m_i z}$
$C(t, z)$ | "Empirical" CGF of $p$ in the sink, at time $t$ | $\log M(t, z)$
$M(t, z)$ | Expected MGF under the deterministic approximation | $M(t, z) \approx \langle M(t, z) \rangle$
$C(t, z)$ | Expected CGF under the deterministic approximation | $C(t, z) \approx \langle C(t, z) \rangle$
$s$ | Selection coefficient of a mutation relative to its parent |
DFE | Distribution of fitness effects of mutations |
$J(s|m)$ | Probability distribution function of $s$ in background $m$ |
$\mu_J$ | Mean effect of mutations on fitness in any background in non-epistatic models | $\int_{\mathbb{R}} s\, J(s)\, ds$
$M_S(z, m)$ | MGF of the DFE | $\int_{\mathbb{R}} J(s|m)\, e^{sz}\, ds$
$M_*(z)$ | MGF of the DFE in the background with fitness $m = 0$ | $M_S(z, 0)$
$\omega(z)$ | Linear effect of $m$ on the CGF of the DFE | $\partial_m \log M_S(z, 0)$
$U$ | Genomic mutation rate |
$d$ | Immigration rate |
$m^*$ | Migrant fitness in the sink |
$p^*$ | Distribution of $m^*$ |
FGM | Fisher's Geometric Model |
$n$ | Dimension of the phenotypic space |
$\lambda$ | Mutational variance at each trait |

Context-dependent case

We study a model in which adaptation in the source and in the sink are described with a FGM with a single optimum (see Section 1.1). Here, we assume that the distribution of the phenotypes in the source is at mutation-selection balance. We then compute the distribution of the fitness of the migrants in the sink, due to a translation of the optimum in the phenotypic-fitness space.

Assume that the phenotypic optima in the sink (say $g = 0$) and in the source are separated by a vector $g_0 \in \mathbb{R}^n$. For a given distribution of phenotypes $g \in \mathbb{R}^n$, the corresponding distribution of fitnesses in the source is given by the relation

$$m_{\text{source}} = m_{\max} - \frac{\|g - g_0\|^2}{2},$$

and the distribution of fitnesses in the sink by $m^* = m_{\max} - \|g\|^2/2$, where $m_{\max}$ is the optimal fitness. The results of Martin and Lenormand (2015) and Martin and Roques (2016) imply that, for low mutation rates $U_S$ in the source, the approximate distribution of $m^*$ in the sink can be described through its Moment Generating Function:

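$$E[e^{m^* z}] \approx e^{m_{\max} z} \left( e^{m_D z}\, e^{-U_S/s_H} + \left(1 - e^{-U_S/s_H}\right) (1 + \lambda z)^{-(\theta - 1)} \right), \tag{1}$$

where $m_D < 0$ is the fitness in the sink of the source optimum and $\theta = n/2$ (see Appendix A for more details).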

2.2 Generating Functions

We denote by N(t) the population size in the sink at time t. Each individual i in the population has a fitness mi(t) in the sink. To analyze the dynamics of fitness distribution, we use the Moment Generating Function (MGF) of the fitness distribution:

$$M(t, z) = \overline{e^{m_i(t) z}} = \frac{1}{N(t)} \sum_{i=1}^{N(t)} e^{m_i(t) z} = \sum_{i=1}^{K(t)} p(t, m_i(t))\, e^{m_i(t) z}, \quad z \ge 0,$$

where $K(t)$ is the number of fitness classes in the population at time $t$ and $p(t, m_i(t))$ is the frequency of the fitness class $m_i(t)$ at time $t$. For mathematical convenience, we mostly focus on the Cumulant Generating Function (CGF) of the distribution:

$$C(t, z) = \ln(M(t, z)).$$

In particular, the mean of the fitness distribution, denoted by $\overline{m}(t)$, and its variance $V(t)$ satisfy

$$\overline{m}(t) = \partial_z C(t, 0) \quad \text{and} \quad V(t) = \partial_{zz} C(t, 0).$$
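As an illustration of these two identities, the sketch below computes the empirical CGF of a small list of individual fitnesses (illustrative values, not data from this work) and recovers the mean and variance from finite differences of $C$ at $z = 0$. The function names are assumptions introduced here; the centred differences evaluate $C$ at small negative $z$, which is harmless for a finite sample.

```python
import math

def empirical_cgf(fitnesses, z):
    """C(z) = log( (1/N) * sum_i exp(m_i * z) )."""
    n = len(fitnesses)
    return math.log(sum(math.exp(m * z) for m in fitnesses) / n)

def mean_and_variance_from_cgf(fitnesses, h=1e-4):
    """Centred finite differences of the CGF at z = 0."""
    c_plus = empirical_cgf(fitnesses, h)
    c_minus = empirical_cgf(fitnesses, -h)
    c_zero = empirical_cgf(fitnesses, 0.0)  # equals 0 up to rounding
    mean = (c_plus - c_minus) / (2 * h)                  # first derivative
    variance = (c_plus - 2 * c_zero + c_minus) / h ** 2  # second derivative
    return mean, variance

fitnesses = [-0.3, -0.1, 0.0, 0.2]
mean, variance = mean_and_variance_from_cgf(fitnesses)

# Direct computation for comparison (variance normalised by N)
direct_mean = sum(fitnesses) / len(fitnesses)
direct_var = sum(m * m for m in fitnesses) / len(fitnesses) - direct_mean ** 2
```

Both routes give the same values up to discretisation error, which is why the PDE for $C$ carries all the information needed on mean fitness and its variance.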


2.3 PDE for the expected CGF of the fitness distribution in the sink

We first recall some equations already obtained in Martin and Roques (2016) for the dynamics of the expected CGF under selection and mutation effects, and we show how these equations are connected with a system of Stochastic Differential Equations (SDEs).

Drift and selection effects

Accounting for genetic drift, the CGF $C(t, z)$ and MGF $M(t, z)$ are random variables, generated by the random process governing the vector of frequencies of the $K(t)$ different genotypes ($p_i = p(t, m_i)$) present at time $t$. To simplify notation, we denote this vector by $p$ and set $K = K(t)$. When the population is large enough, $p$ is approximately described by a $K$-type Wright-Fisher diffusion with selection and drift, characterized by the infinitesimal generator $A$:

$$A f(p) = \sum_{i=1}^{K} p_i (m_i - \overline{m}(t))\, \partial_i f(p) + \frac{1}{2N_e} \sum_{i,j=1}^{K} p_i (\delta_{i,j} - p_j)\, \partial_{i,j} f(p),$$

where $\overline{m}(t) = \sum_{i=1}^{K} p_i m_i$ and $N_e$ is a given variance effective size. We show here (we have not found any reference where the computation is carried out) that this infinitesimal generator corresponds to the system of Stochastic Differential Equations (SDEs):

$$\forall i, \quad dp_i = p_i (m_i - \overline{m}(t))\, dt + \sum_{j \ne i} \sqrt{\frac{p_i p_j}{N_e}}\, dB_{ij},$$

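with real white noises satisfying $dB_{ij} = -dB_{ji}$ for all $i \ne j$. As in Appendix B, we have to write this system as a matrix equation. For $J \in [1, n]$, we define the matrices

$$T_J = \begin{pmatrix} O_{n-J,J} \\ I_J \end{pmatrix} \in M_{n,J}(\mathbb{R}),$$

where $O_{n-J,J}$ is the matrix of size $(n-J) \times J$ with all entries equal to zero, and $I_J$ is the identity matrix in $M_J(\mathbb{R})$. Let $E^{I,J} \in M_{n,n+1-J}(\mathbb{R})$ be defined by

$$\forall i, j, \quad E^{I,J}_{i,j} = \delta_{i=J}\, \delta_{j=I-J+1}.$$

Then, we can check that

$$dp = \big( p_i (m_i - \overline{m}(t)) \big)_{1 \le i \le n}\, dt + \sigma\, dB,$$

with $dB$ a multi-dimensional white noise and

$$\sigma = \begin{pmatrix} L_1 & & (0) \\ & \ddots & \\ (0) & & L_n \end{pmatrix} \begin{pmatrix} T_n & & & (0) \\ -E^{2,1} & T_{n-1} & & \\ \vdots & \ddots & \ddots & \\ -E^{n,1} & \cdots & -E^{n,n-1} & T_1 \end{pmatrix} \tag{2}$$

and

$$L_j = \left( \sqrt{\frac{p_j p_1}{N_e}}, \dots, \sqrt{\frac{p_j p_{j-1}}{N_e}},\ 0,\ \sqrt{\frac{p_j p_{j+1}}{N_e}}, \dots, \sqrt{\frac{p_j p_n}{N_e}} \right).$$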

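By Appendix B, we know that the infinitesimal generator of $p$ satisfies:

$$A f(p) = \sum_{i=1}^{K} p_i (m_i - \overline{m}(t))\, \partial_i f(p) + \frac{1}{2} \sum_{i,j=1}^{K} (\sigma \sigma^T)_{ij}\, \partial_{i,j} f(p).$$

It only remains to compute $(\sigma \sigma^T)_{ij}$. For all $i$, we have:

$$(\sigma \sigma^T)_{ii} = \sum_{k<i} L_i E^{i,k} (E^{i,k})^T L_i^T + L_i T_{n+1-i} T_{n+1-i}^T L_i^T = \sum_{k<i} (L_i)_k^2 + L_i \begin{pmatrix} (0) & (0) \\ (0) & I_{n+1-i} \end{pmatrix} L_i^T = L_i L_i^T = \frac{1}{N_e} \sum_{j \ne i} p_i p_j = \frac{p_i (1 - p_i)}{N_e}.$$

Additionally, for all $i < j$, we have:

$$(\sigma \sigma^T)_{ij} = \sum_{k<i} L_i E^{i,k} (E^{j,k})^T L_j^T - L_i T_{n+1-i} (E^{j,i})^T L_j^T = \sum_{k<i} 0 - (L_i)_j (L_j)_i = -\frac{p_i p_j}{N_e}.$$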

The CGF is a particular function of the genotypic frequencies, $f(p) = \log\left( \sum_i p_i e^{m_i z} \right)$. Thus, we get:

$$A\, C(t, z) = \partial_z C(t, z) - \partial_z C(t, 0) + \frac{1 - e^{C(t, 2z) - 2C(t, z)}}{2 N_e}.$$

We denote by $\langle \cdot \rangle$ the expectation taken over replicate populations, which is different from the average $\overline{\,\cdot\,}$ within a population. Taking this expectation starting from the same initial distribution $p_0$, and using Kolmogorov's backward equation (see Appendix B), we get

$$\partial_t \langle C(t, z) \rangle = A \langle C(t, z) \rangle = \langle A\, C(t, z) \rangle = \langle \partial_z C(t, z) \rangle - \langle \partial_z C(t, 0) \rangle + \frac{1 - \langle e^{C(t, 2z) - 2C(t, z)} \rangle}{2 N_e}.$$

Let us set:

$$\delta(t, z) = \frac{1 - \langle e^{C(t, 2z) - 2C(t, z)} \rangle}{2 N_e}.$$

In the sequel, we make a deterministic ("infinite population") approximation, which consists in neglecting the stochastic term in $A$, or equivalently assuming

$$\delta(t, z) = 0, \tag{3}$$

which yields:

$$\partial_t \langle C(t, z) \rangle = \langle \partial_z C(t, z) \rangle - \langle \partial_z C(t, 0) \rangle.$$

Under the deterministic approximation, the expected CGF, denoted from now on by $C$, satisfies the following equation:

$$\partial_t C(t, z) = \partial_z C(t, z) - \partial_z C(t, 0). \tag{4}$$

Mutation effects

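We denote by $J(s|m)$ the probability distribution function of the fitness effects of mutations, given the fitness $m$ of the parent. We also define $M_S(z, m) = \int_{\mathbb{R}} e^{sz} J(s|m)\, ds$ and $C_S(z, m) = \log M_S(z, m)$. We assume that these quantities are well defined for any $m$ and for $z$ in some interval $[0, z_{\max}]$.

During a time interval $\Delta t$, a single mutation occurs with probability $N U \Delta t$ (Poisson process). Thus the conditional change of the MGF is

$$\Delta M(t, z|s, m) = N U \Delta t\, \frac{e^{(s+m)z} - e^{mz}}{N} = U \Delta t\, (e^{sz} - 1)\, e^{mz}.$$

Taking the expectation over the distribution of mutation fitness effects $s$ for a parent with a given fitness $m$, we get

$$\Delta M(t, z|m) = U \Delta t\, (M_S(z, m) - 1)\, e^{mz}.$$

Then, taking the expectation over the fitness $m$ of the parent yields

$$\Delta M(t, z) = \sum_{i=1}^{K(t)} p(t, m_i)\, \Delta M(t, z|m_i) = U \Delta t \sum_{i=1}^{K(t)} p(t, m_i)\, (M_S(z, m_i) - 1)\, e^{m_i z} = U \Delta t \left( \overline{e^{mz} M_S(z, m)} - \overline{e^{mz}} \right).$$

As $\Delta C(t, z) = \Delta M(t, z)/M(t, z)$, we have

$$\Delta C(t, z) = U \Delta t \left( \frac{\overline{e^{mz} M_S(z, m)}}{\overline{e^{mz}}} - 1 \right),$$

which yields the relation:

$$\partial_t C(t, z) = U \left( \frac{\overline{e^{mz} M_S(z, m)}}{\overline{e^{mz}}} - 1 \right). \tag{5}$$

Taking the expectation over replicate populations in (5) yields:

$$\partial_t C(t, z) \approx U \left( \left\langle \frac{\overline{e^{mz} M_S(z, m)}}{\overline{e^{mz}}} \right\rangle - 1 \right). \tag{6}$$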


Immigration effects

Immigration induces a new term in the PDE satisfied by C, compared to the work of Martin and Roques (2016).

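Let $\Delta t$ be small enough that a single immigration event occurs during $(t, t + \Delta t)$ with probability $d\, \Delta t$. Denoting by $m^*$ the fitness of the migrant in the sink, the expected change in a given MGF $M(t, z)$, conditional on $m^*$, during $\Delta t$ is:

$$\Delta M(t, z|m^*) = d\, \Delta t \left[ \frac{1}{N(t) + 1} \left( \sum_{i=1}^{N(t)} e^{m_i(t) z} + e^{m^* z} \right) - \frac{1}{N(t)} \sum_{i=1}^{N(t)} e^{m_i(t) z} \right] = d\, \Delta t \left[ \frac{e^{m^* z}}{N(t) + 1} - \frac{1}{N(t) + 1} M(t, z) \right].$$

Taking the expectation over the distribution of $m^*$ (see Section 2.1 for more details on the distribution of $m^*$), we get

$$\Delta M(t, z) = \frac{d\, \Delta t}{N(t) + 1} \left( E[e^{m^* z}] - M(t, z) \right).$$

As $\Delta C(t, z) = \Delta M(t, z)/M(t, z)$, we have

$$\Delta C(t, z) = \frac{d\, \Delta t}{N(t) + 1} \left( \frac{E[e^{m^* z}]}{M(t, z)} - 1 \right).$$

Finally, replacing $N(t) + 1$ by $N(t)$, we get:

$$\partial_t C(t, z) \approx \frac{d}{N(t)} \left( E[e^{m^* z}]\, e^{-C(t, z)} - 1 \right). \tag{7}$$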

2.4 ODE for population dynamics

In the absence of immigration, by definition of the Malthusian fitness, the size $N_i(t)$ of the fraction of the population with fitness $m_i(t)$ satisfies the ODE

$$N_i'(t) = N_i(t)\, m_i(t).$$

Thus the total population size satisfies:

$$N'(t) = \sum_{i=1}^{K(t)} N_i(t)\, m_i(t) = N(t) \sum_{i=1}^{K(t)} p(t, m_i(t))\, m_i(t) = N(t)\, \overline{m}(t).$$

If we take into account the immigration from the external source, with constant rate $d$, the equation becomes:

$$N'(t) = N(t)\, \overline{m}(t) + d.$$

In the sequel, we denote by $N$ the expected population size among replicates, and we assume that:

$$N'(t) = N(t)\, \langle m \rangle(t) + d, \tag{8}$$

where $\langle m \rangle$ is the expected mean fitness over replicates.
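To illustrate the population-size ODE, the sketch below integrates $N'(t) = N(t)\, \langle m \rangle(t) + d$ with an explicit Euler scheme and compares it with the closed form available in the special case of a constant negative mean fitness. That special case is an assumption made only for this illustration (in the model, $\langle m \rangle(t)$ evolves in time), and the function names and parameter values are hypothetical.

```python
import math

def population_size_euler(n0, d, mean_fitness, t_end, steps=100_000):
    """Explicit Euler scheme for N'(t) = N(t) * <m>(t) + d."""
    dt = t_end / steps
    n = n0
    for k in range(steps):
        n += dt * (n * mean_fitness(k * dt) + d)
    return n

def population_size_constant_fitness(n0, d, m, t):
    """Closed form when <m>(t) = m is constant and nonzero."""
    return (n0 + d / m) * math.exp(m * t) - d / m

# Illustrative values: empty sink, d = 10 migrants per unit time, <m> = -0.5
n_euler = population_size_euler(0.0, 10.0, lambda t: -0.5, 20.0)
n_exact = population_size_constant_fitness(0.0, 10.0, -0.5, 20.0)
```

With a constant negative mean fitness $m$, the population size converges to the migration-decay balance $-d/m$ (here 20 individuals); growth beyond that level requires the mean fitness to increase.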


2.5 Conclusion

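Combining the effects of selection (4), mutation (6) and immigration (7), and coupling the corresponding PDE for $C(t, z)$ with the ODE (8) that describes the population dynamics, we obtain the system:

$$\begin{cases} \partial_t C(t, z) = \partial_z C(t, z) - \partial_z C(t, 0) + U \left( \left\langle \dfrac{\overline{e^{mz} M_S(z, m)}}{\overline{e^{mz}}} \right\rangle - 1 \right) + \dfrac{d}{N(t)} \left( E[e^{m^* z}]\, e^{-C(t, z)} - 1 \right), & \forall z \ge 0,\ \forall t > 0, \\ N'(t) = N(t)\, \langle m \rangle(t) + d, & \forall t > 0, \end{cases} \tag{9}$$

with $\langle m \rangle(t) = \partial_z C(t, 0)$.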

Context-independent Model

We will first analyse the context-independent mutation model: the effects of mutations on fitness do not depend on the fitness of the parent. Thus we have $M_S(z, m) = M_*(z)$, and we get the system:

$$\begin{cases} \partial_t C(t, z) = \partial_z C(t, z) - \partial_z C(t, 0) + U (M_*(z) - 1) + \dfrac{d}{N(t)} \left( E[e^{m^* z}]\, e^{-C(t, z)} - 1 \right), & \forall z \ge 0,\ \forall t > 0, \\ N'(t) = N(t)\, \langle m \rangle(t) + d, & \forall t > 0, \end{cases} \tag{10}$$

where $\langle m \rangle(t) = \partial_z C(t, 0)$, $M_*(z) = \int_{\mathbb{R}} J(m)\, e^{mz}\, dm$ and $J$ is the mutation kernel in the sink population. To simplify the arguments, we denote by $\beta$ the function:

$$\beta(z) := U (M_*(z) - 1) = U \left( \int_{\mathbb{R}} J(m)\, e^{mz}\, dm - 1 \right). \tag{11}$$

In this case, we will analyse the cases in which $m^*$ is a negative random variable (in particular, when it follows a Dirac distribution) and the case in which $m^*$ is normally distributed (with a negative mean).

Fisher’s Geometric Model

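In Martin and Roques (2016), it is shown that for Fisher's Geometric Model (FGM) we have:

$$M_S(z, m) = M_*(z)\, e^{\omega(z)(m - m_{\max})}, \quad \text{where } \omega(z) = -\frac{\lambda z^2}{1 + \lambda z} \text{ and } M_*(z) = (1 + \lambda z)^{-n/2}. \tag{12}$$

Thus we get:

$$\frac{\overline{e^{mz} M_S(z, m)}}{\overline{e^{mz}}} = M_*(z)\, e^{-\omega(z) m_{\max}}\, \frac{\overline{e^{m(z + \omega(z))}}}{\overline{e^{mz}}} = M_*(z)\, e^{C(t, z + \omega(z)) - C(t, z) - \omega(z) m_{\max}}.$$

We approximate $\langle e^{C(t, z + \omega(z)) - C(t, z)} \rangle$ by $e^{\langle C \rangle(t, z + \omega(z)) - \langle C \rangle(t, z)}$, thanks to the approximation:

$$\langle C(t, z) \rangle \approx \log \langle M(t, z) \rangle.$$

Hence, writing again $C$ for the expected CGF, we get the problem:

$$\begin{cases} \partial_t C(t, z) = \partial_z C(t, z) - \partial_z C(t, 0) + U \left( M_*(z)\, e^{C(t, z + \omega(z)) - C(t, z) - \omega(z) m_{\max}} - 1 \right) + \dfrac{d}{N(t)} \left( E[e^{m^* z}]\, e^{-C(t, z)} - 1 \right), & \forall z \ge 0,\ \forall t > 0, \\ N'(t) = N(t)\, \langle m \rangle(t) + d, & \forall t > 0, \end{cases} \tag{13}$$

with $\langle m \rangle(t) = \partial_z C(t, 0)$.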

3 Mathematical analysis of the context-independent model (10)

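The mutation kernel $J \in L^1(\mathbb{R})$ is assumed to satisfy

$$\begin{cases} \int_{\mathbb{R}} J(m)\, dm = 1, \\ J(m) \ge 0 \text{ for a.e. } m \in \mathbb{R}, \\ \int_{\mathbb{R}} J(m)\, e^{x|m|}\, dm < \infty, \quad \forall x > 0. \end{cases} \tag{14}$$

The last hypothesis means that $J(m)$ decays faster than any exponential function as $|m| \to \infty$.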

3.1 Existence and uniqueness of the solution

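In this part, we study the existence and uniqueness of the solution of the problem:

$$\begin{cases} \partial_t C(t, z) = \beta(z) + \partial_z C(t, z) - \partial_z C(t, 0) + \dfrac{d}{N(t)} \left( E[e^{z m^*}]\, e^{-C(t, z)} - 1 \right), & \forall t, z > 0, \\ N'(t) = N(t)\, \langle m \rangle(t) + d, & \forall t > 0, \\ C(t, 0) = 0, & \forall t \ge 0, \\ C(0, z) = C_0(z), & \forall z \ge 0, \\ N(0) = N_0, \end{cases} \tag{15}$$

where $d \ge 0$, $N_0 \ge 0$ and the functions $\beta$ and $C_0$ are smooth, with $\beta(0) = 0$.

In our case, we have:

$$\beta(z) = U (M_*(z) - 1) = U \left( \int_{\mathbb{R}} J(m)\, e^{mz}\, dm - 1 \right).$$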

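Proposition 1. The problem (15) admits a unique solution $C(t, z)$ defined for all $t \ge 0$ and $z \ge 0$, if $N_0 > 0$. Furthermore, for every $t \ge 0$, the expected mean fitness is equal to

$$\langle m \rangle(t) = \frac{N_0\, (C_0'(t) + \beta(t))\, e^{C_0(t)} + d\, E[e^{m^* t}] - d\, e^{-\int_0^t \beta(s)\, ds}}{N_0\, e^{C_0(t)} + d \int_0^t E[e^{\tau m^*}]\, e^{-\int_\tau^t \beta(s)\, ds}\, d\tau}, \tag{16}$$

and the expected population size to

$$N(t) = \left( N_0 + d \int_0^t e^{-\int_0^\tau \langle m \rangle(s)\, ds}\, d\tau \right) e^{\int_0^t \langle m \rangle(s)\, ds}. \tag{17}$$

The solution $C(t, z)$ of (15) is given by the expression:

$$C(t, z) = \int_0^t \beta(z + \tau)\, d\tau - \log\left( N_0 + d \int_0^t e^{-\int_0^\tau \langle m \rangle(s)\, ds}\, d\tau \right) - \int_0^t \langle m \rangle(\tau)\, d\tau + \log\left( N_0\, e^{C_0(z + t)} + d \int_0^t E[e^{(z + \tau) m^*}] \exp\left( -\int_\tau^t \beta(z + s)\, ds \right) d\tau \right). \tag{18}$$

Proof. See Appendix C.1.

Remark. The formulae of Proposition 1 remain true if $N_0 = 0$. In this case, the uniqueness of $C(t, z)$ is admitted without proof.

Remark. Taking $d = 0$, we recover the formula given in Gil et al. (2017):

$$\langle m \rangle(t) = C_0'(t) + \beta(t).$$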

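Using the properties of the CGF (Section 2.2), we can compute the expected variance in fitness within the sink population.

Proposition 2. For every $t \ge 0$, the expected variance in fitness is equal to

$$V(t) = \beta'(t) - \beta'(0) + \frac{F_V(t)}{G(t)^2}, \tag{19}$$

where the functions $G$ and $F_V$ are given by:

$$G(t) = N_0\, e^{C_0(t)} + d \int_0^t E[e^{\tau m^*}]\, e^{-\int_\tau^t \beta(s)\, ds}\, d\tau,$$

$$\begin{aligned} F_V(t) ={}& N_0^2\, C_0''(t)\, e^{2 C_0(t)} + N_0\, d\, (C_0''(t) + C_0'(t)^2)\, e^{C_0(t)} \int_0^t E[e^{\tau m^*}]\, e^{-\int_\tau^t \beta(s)\, ds}\, d\tau \\ &+ d\, G(t) \int_0^t E\left[ e^{\tau m^*} \left( (m^* + \beta(\tau) - \beta(t))^2 + \beta'(\tau) - \beta'(t) \right) \right] e^{-\int_\tau^t \beta(s)\, ds}\, d\tau \\ &- d \left( N_0\, C_0'(t)\, e^{C_0(t)} + F_m(t) \right) \int_0^t E\left[ e^{\tau m^*} (m^* + \beta(\tau) - \beta(t)) \right] e^{-\int_\tau^t \beta(s)\, ds}\, d\tau, \end{aligned}$$

and

$$F_m(t) = N_0\, C_0'(t)\, e^{C_0(t)} + d \int_0^t E\left[ e^{\tau m^*} (m^* + \beta(\tau) - \beta(t)) \right] e^{-\int_\tau^t \beta(s)\, ds}\, d\tau.$$

Proof. See Appendix C.1.

Remark. Taking $d = 0$, we recover the formula given in Gil et al. (2017):

$$V(t) = C_0''(t) + \beta'(t) - \beta'(0).$$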

In Fisher's fundamental theorem, without mutation or immigration (Fisher, 1930; Frank and Slatkin, 1992), the expected rate of change of mean fitness $\langle m \rangle'(t)$ and the expected variance in fitness are connected by the simple formula

$$\langle m \rangle'(t) = V(t).$$

In the presence of non-epistatic mutations, this formula (Good and Desai, 2013) extends to:

$$\langle m \rangle'(t) = V(t) + U \mu_J.$$

Here, we extend Fisher's fundamental theorem to take into account non-epistatic mutations and immigration events.

Proposition 3. Let $\mu_J = \int_{\mathbb{R}} m\, J(m)\, dm$. Then we have:

$$\langle m \rangle'(t) = V(t) + U \mu_J + d\, \frac{E[m^*] - \langle m \rangle(t)}{N(t)}. \tag{20}$$

Proof. See Appendix C.1.

3.2 Finiteness of the characteristic time $t_0$

We first give a precise meaning to the notion of "characteristic time":

$$t_0 = \sup\{ t \ge 0 : \langle m \rangle \le 0 \text{ on } [0, t] \}.$$

In the sequel, we assume that all individuals initially present in the sink population are maladapted, i.e.

$$m_0 = \sup(\mathrm{supp}(p_0)) < 0.$$

This implies that $\langle m \rangle(0) < 0$ and so $t_0 > 0$. We recall that $E[m^*] < 0$ (Section 2.1). We study here the finiteness of the characteristic time $t_0$ depending on the type of mutation kernel $J$ (purely deleterious or including beneficial mutations) and on the distribution of the fitness $m^*$ of the migrants.

Theorem 4. (Finiteness of the characteristic time, context-independent case)

1. If the mutation kernel $J$ includes some beneficial mutations, then the characteristic time $t_0$ is finite.

2. If the mutation kernel $J$ is purely deleterious, i.e. $\mathrm{supp}\, J \cap \mathbb{R}_+ = \emptyset$, and if $E[e^{m^* t}]\, e^{-Ut}$ tends to $+\infty$ as $t$ tends to $+\infty$, then the characteristic time $t_0$ is finite.

3. If $\mathrm{supp}(p^*) \cap \mathbb{R}_+ = \emptyset$ and if $J$ is purely deleterious, then the characteristic time is infinite.

Proof. See Appendix C.1.

Remark. In case 3, the population can never adapt to the new environment: all immigrants are maladapted and mutations cannot improve the fitness of the offspring. However, the population size stabilizes: for $t$ large enough, $N'(t)$ is approximately equal to $\langle m \rangle(+\infty)\, N(t) + d$, and so

$$\lim_{t \to +\infty} N(t) = d\, E[(U - m^*)^{-1}].$$

Remark. Assume that $m^*$ is Gaussian distributed, $m^* \sim \mathcal{N}(\mu, \sigma^2)$. Then the quantity appearing in case 2 of Theorem 4 is

$$E[e^{m^* t}]\, e^{-Ut} = \exp\left( (\mu - U)\, t + \frac{\sigma^2 t^2}{2} \right),$$

which diverges to $+\infty$ as $t \to +\infty$. Hence the characteristic time is finite even when $J$ is purely deleterious: although all mutations are deleterious, selection is strong enough to drive $\langle m \rangle(t)$ towards positive values.

3.3 Explicit formulae for the characteristic time

We will assume that the hypotheses of Theorem 4 are satisfied, so that the characteristic time is finite. For the sake of simplicity, we assume that the sink population is initially empty ($N_0 = 0$) before the first immigration event, which occurs at $t = 0$.

Setting $\langle m \rangle(t_0) = 0$ in equation (16) with $N_0 = 0$, we get:

$$E[e^{m^* t_0}] = e^{-\int_0^{t_0} \beta(s)\, ds}, \tag{21}$$

with

$$\beta(z) = U \left( \int_{\mathbb{R}} J(m)\, e^{zm}\, dm - 1 \right).$$

To analyse the dependence of $t_0$ on the model parameters $d$, $U$, $p^*$ and $J$, we view it as a function of four variables:

$$t_0 = t_0(d, U, p^*, J).$$

For the sake of clarity, we will not write the arguments of the function $t_0$.

Remark. The function $\beta$ is strictly convex.

For kernels $J$ including beneficial mutations and with a positive mean effect on fitness, we have the following upper bound for $t_0$:

Proposition 5. If $\mu_J = \int_{\mathbb{R}} m\, J(m)\, dm > 0$, then we have

$$t_0 \le -\frac{2\, E[m^*]}{U \mu_J}.$$

Proof. See Appendix C.1.

For kernels J with negative mean, we obtain the following bounds:

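Proposition 6. Assume that $\mu_J = \int_{\mathbb{R}} m\, J(m)\, dm < 0$. Then there exists a unique $\tau > 0$ such that $\beta(\tau) = 0$. Moreover, we have $\beta'(\tau) > 0$ and

$$\tau < t_0 < \tau - \frac{E[m^*]}{\beta'(\tau)} + \sqrt{ \left( \frac{E[m^*]}{\beta'(\tau)} - \tau \right)^2 - \tau^2\, \frac{\beta'(0)}{\beta'(\tau)} }.$$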

Proof. See Appendix C.1.

3.3.1 Dirac distribution of m?

Now, we assume that $m^*$ has a Dirac distribution $\delta_{m^*}$. Using equation (21), we immediately get:

Proposition 7. (Clonal migrants, general mutation kernel)

$$m^* t_0 = -\int_0^{t_0} \beta(s)\, ds, \tag{22}$$

where $\beta(z) = U \left( \int_{\mathbb{R}} J(m)\, e^{zm}\, dm - 1 \right)$. So we have

$$\int_0^{t_0} \beta(s)\, ds \ge 0; \quad \beta(t_0) > 0; \quad \beta'(t_0) > 0.$$

Using expression (22), we can easily describe the dependence of the characteristic time on the model parameters.

Proposition 8. The function $t_0(d, U, m^*, J)$ does not depend on $d$. Moreover, it satisfies:

$$\frac{\partial t_0}{\partial m^*} = -\frac{t_0}{\beta(t_0) + m^*} < 0 \tag{23}$$

and

$$\frac{\partial t_0}{\partial U} = \frac{1}{U}\, \frac{m^* t_0}{\beta(t_0) + m^*} < 0. \tag{24}$$

Proof. See Appendix C.1.

These results show that the higher the mutation rate, the more mutations occur, and so the less time the population needs to adapt. Similarly, if $m^*$ is close to zero, the population needs less time to adapt than if $|m^*|$ is large.

• Approximations of the characteristic time

We will see that an explicit formula for $t_0$ cannot always be found. We therefore develop here a few results to approximate $t_0$.

Proposition 9. For all $U > 0$, we have $\beta(t_0(U)) \approx -2 m^*$, with:

$$\left| \beta(t_0(U)) + 2 m^* \right| \le \frac{t_0(U)^3}{12}\, U \int_{\mathbb{R}} J(m)\, m^2\, e^{t_0(U) m}\, dm.$$

Furthermore, by continuity and convexity, we can define $\beta^{-1}(-2 m^*)$ and we have $t_0(U) \approx \beta^{-1}(-2 m^*)$.

Proof. The idea is to use the trapezoidal rule to compute $\int_0^{t_0(U)} \beta(s)\, ds$, together with the error estimate given in Süli and Mayers (2003).

Proposition 10. For all $U > 0$, we have $\beta(t_0(U)) + 4 \beta\left( \frac{t_0(U)}{2} \right) \approx -6 m^*$, with:

$$\left| \beta(t_0(U)) + 4 \beta\left( \frac{t_0(U)}{2} \right) + 6 m^* \right| \le \frac{t_0(U)^5}{90 \times 2^5}\, U \int_{\mathbb{R}} J(m)\, m^4\, e^{t_0(U) m}\, dm.$$

Proof. The idea is to use Simpson's rule to compute $\int_0^{t_0(U)} \beta(s)\, ds$, together with the error estimate given in Süli and Mayers (2003).

We now derive exact and approximate formulae for three different types of mutation kernels.

• Dirac mutation kernel

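Here, we assume $J = \delta_{\mu_J}$, with $\mu_J > 0$ (otherwise, $t_0 = +\infty$, cf. Theorem 4). Thus, we have $\beta(t) = U (e^{\mu_J t} - 1)$, and so, thanks to (22), the characteristic time satisfies

$$e^{\mu_J t_0} + \frac{\mu_J}{U} (m^* - U)\, t_0 - 1 = 0.$$

The solutions of this equation are given by

$$t_0 = \frac{U}{\mu_J (m^* - U)} - \frac{1}{\mu_J}\, W\left( \frac{U}{m^* - U} \exp\left( \frac{U}{m^* - U} \right) \right),$$

where $W$ is the Lambert $W$ function (see Appendix D). However, the principal branch $W_0$ gives $t_0 = 0$, which is impossible. Hence, we get:

$$t_0 = \frac{U}{\mu_J (m^* - U)} - \frac{1}{\mu_J}\, W_{-1}\left( \frac{U}{m^* - U} \exp\left( \frac{U}{m^* - U} \right) \right). \tag{25}$$

The approximation of Proposition 9 yields:

$$t_0 \approx \frac{1}{\mu_J} \log\left( 1 - \frac{2 m^*}{U} \right). \tag{26}$$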

Figure 5: Dynamics of $t_0$ (vertical axis) with respect to $m^*/U$ (horizontal axis, from $-20$ to $0$). We assume here a Dirac distribution of migrant fitnesses ($p^* = \delta_{m^*}$) and a Dirac mutation kernel $J = \delta_{\mu_J}$ ($\mu_J = 1$). The dark curve represents the exact solution (25), the dashed red curve is the approximation (27), and the dashed pink curve is the approximation (26).

By Proposition 10, $t_0$ approximately satisfies:
$$e^{\mu_J t_0} + 4\,e^{\mu_J t_0/2} + \frac{6m^*}{U} - 5 = 0.$$
Thus, we get
$$t_0 \approx \frac{2}{\mu_J}\log\!\left(-2 + \sqrt{9 - \frac{6m^*}{U}}\right). \tag{27}$$
We can see in Figure 5 that the approximation (27) is more precise than (26). This reflects the respective accuracies of the Simpson and trapezoidal rules for approximating integrals.
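The comparison between (25), (26) and (27) can be reproduced numerically. The sketch below is an illustration under stated assumptions rather than the code behind Figure 5: the branch $W_{-1}$ is evaluated by a simple bisection (an implementation choice made here to avoid external libraries), and the values $m^* = -5$, $U = 1$, $\mu_J = 1$ are picked arbitrarily inside the range of the figure.

```python
import math

def lambert_w_minus1(x, iterations=200):
    """Branch W_{-1} of the Lambert W function for -1/e < x < 0, by bisection.

    On (-inf, -1] the map w -> w * exp(w) decreases from 0 to -1/e,
    so w * exp(w) = x has exactly one root there.
    """
    if not (-1.0 / math.e < x < 0.0):
        raise ValueError("W_{-1} is evaluated here only on (-1/e, 0)")
    lo, hi = -2.0, -1.0
    while lo * math.exp(lo) < x:    # move lo left until lo * exp(lo) > x
        lo *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) > x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def t0_exact(m_star, U, mu_J):
    """Formula (25): exact characteristic time for a Dirac mutation kernel."""
    c = U / (m_star - U)            # lies in (-1, 0) when m_star < 0 < U
    return (c - lambert_w_minus1(c * math.exp(c))) / mu_J

def t0_trapezoid(m_star, U, mu_J):
    """Formula (26), obtained from the trapezoidal rule."""
    return math.log(1.0 - 2.0 * m_star / U) / mu_J

def t0_simpson(m_star, U, mu_J):
    """Formula (27), obtained from the Simpson rule."""
    return 2.0 / mu_J * math.log(-2.0 + math.sqrt(9.0 - 6.0 * m_star / U))

# Assumed illustrative values (mu_J = 1 as in Figure 5).
m_star, U, mu_J = -5.0, 1.0, 1.0
exact = t0_exact(m_star, U, mu_J)
trapezoid = t0_trapezoid(m_star, U, mu_J)
simpson = t0_simpson(m_star, U, mu_J)
```

For these values, both approximations fall below the exact time, and the Simpson-based value is the closer of the two, in line with the behaviour reported for Figure 5.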

• Gaussian mutation kernel

Here, we assume that the distribution of mutation effects on fitness, $J(m)$, is Gaussian with mean $\mu_J$ and variance $\sigma_J^2$:
$$J(m) = \frac{1}{\sqrt{2\pi\sigma_J^2}}\exp\!\left(-\frac{(m - \mu_J)^2}{2\sigma_J^2}\right).$$
Then, we have:
$$\beta(t) = U\left(e^{\mu_J t + \sigma_J^2 t^2/2} - 1\right).$$


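Equation (22) yields:
$$-m^* t_0 = \int_0^{t_0} \beta(s)\,ds = U\sqrt{\frac{\pi}{2\sigma_J^2}}\;e^{-\frac{\mu_J^2}{2\sigma_J^2}}\left[\mathrm{Erfi}\!\left(\frac{\mu_J + \sigma_J^2 t_0}{\sqrt{2\sigma_J^2}}\right) - \mathrm{Erfi}\!\left(\frac{\mu_J}{\sqrt{2\sigma_J^2}}\right)\right] - U t_0,$$
where Erfi is the imaginary error function
$$\mathrm{Erfi}(x) = \frac{\mathrm{Erf}(ix)}{i} = \frac{2}{\sqrt{\pi}}\int_0^x e^{t^2}\,dt.$$
Using Propositions 9 and 10, we can get approximate formulae for $t_0$. Proposition 9 yields
$$t_0 \approx -\frac{\mu_J}{\sigma_J^2} + \sqrt{\frac{\mu_J^2}{\sigma_J^4} + \frac{2}{\sigma_J^2}\log\!\left(1 - \frac{2m^*}{U}\right)}.$$
The approximation found by the Simpson rule (Proposition 10) is equivalent here to:
$$e^{\mu_J t_0 + \sigma_J^2 t_0^2/2} + 4\,e^{\mu_J t_0/2 + \sigma_J^2 t_0^2/8} \approx 5 - \frac{6m^*}{U},$$
which does not provide an explicit expression for $t_0$.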
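For the Gaussian kernel, the closed form derived from Proposition 9 can be compared with a direct numerical solution of (22). The sketch below is only an illustration: the integral is approximated by a composite Simpson rule and the root is found by bisection, and the parameter values ($m^* = -0.18$, $U = 0.1$, $\mu_J = 0.1$, $\sigma_J^2 = 0.01$) are assumptions borrowed from the numerical settings of Section 5.1.

```python
import math

# Assumed illustrative values (Gaussian-kernel settings of Section 5.1).
m_star, U, mu_J, sigma_J2 = -0.18, 0.1, 0.1, 0.01

def beta(t):
    """beta(t) = U (exp(mu_J t + sigma_J^2 t^2 / 2) - 1) for the Gaussian kernel."""
    return U * (math.exp(mu_J * t + 0.5 * sigma_J2 * t * t) - 1.0)

def integral_beta(t, n=2000):
    """Composite Simpson approximation of int_0^t beta(s) ds (n even)."""
    h = t / n
    total = beta(0.0) + beta(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * beta(i * h)
    return total * h / 3.0

def residual(t):
    """m_star * t + int_0^t beta(s) ds, which vanishes at t0 by equation (22)."""
    return m_star * t + integral_beta(t)

def t0_numeric(iterations=100):
    """Positive root of the residual, by bracket expansion then bisection."""
    lo, hi = 0.0, 1.0
    while residual(hi) <= 0.0:
        lo, hi = hi, 2.0 * hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def t0_trapezoid():
    """Closed form obtained from Proposition 9 for the Gaussian kernel."""
    log_term = math.log(1.0 - 2.0 * m_star / U)
    return -mu_J / sigma_J2 + math.sqrt(mu_J**2 / sigma_J2**2 + 2.0 * log_term / sigma_J2)

exact, approx = t0_numeric(), t0_trapezoid()
```

Because $\beta$ is convex, the trapezoidal rule overestimates the integral in (22), so the closed form is expected to underestimate the numerically computed $t_0$.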

• Symmetrized gamma mutation kernel

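Let $k \in \mathbb{N} \setminus \{1\}$, and let $\theta$ and $s$ be two positive real numbers. We now assume that, for all $m \in \mathbb{R}$,
$$J(m) = \begin{cases} \dfrac{(s - m)^{k-1}}{\Gamma(k)\,\theta^k}\,e^{-\frac{s - m}{\theta}} & \text{if } m < s, \\[1ex] 0 & \text{otherwise.} \end{cases}$$
This means that the kernel $J$ contains beneficial mutations, with a maximum effect $s$ on fitness (see Figure 6). Thus, we get:
$$\beta(t) = U\left(\frac{e^{st}}{(1 + t\theta)^k} - 1\right).$$
For example, for $k = 2$, we have: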

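$$\begin{aligned}
\int_0^{t_0} \frac{e^{sx}}{(1 + x\theta)^2}\,dx
&= \left[-\frac{e^{sx}}{\theta\,(1 + \theta x)}\right]_0^{t_0} + \frac{s}{\theta}\int_0^{t_0} \frac{e^{sx}}{1 + x\theta}\,dx \\
&= \frac{1}{\theta}\left(1 - \frac{e^{st_0}}{1 + \theta t_0}\right) + \frac{s}{\theta^2}\,e^{-s/\theta}\int_{s/\theta}^{s(\theta t_0 + 1)/\theta} \frac{e^{x}}{x}\,dx \\
&= \frac{1}{\theta}\left(1 - \frac{e^{st_0}}{1 + \theta t_0}\right) + \frac{s}{\theta^2}\,e^{-s/\theta}\left[\mathrm{Ei}\!\left(\frac{s(\theta t_0 + 1)}{\theta}\right) - \mathrm{Ei}\!\left(\frac{s}{\theta}\right)\right],
\end{aligned}$$
where Ei is the exponential integral
$$\mathrm{Ei}(x) = \int_{-\infty}^{x} \frac{e^{t}}{t}\,dt.$$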

Figure 6: Curve of a gamma mutation kernel $J$ (with $s = 3$, $k = 2$ and $\theta = 4$); horizontal axis: $m$, vertical axis: $J(m)$.
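As a sanity check on the kernel of Figure 6, one can verify numerically that it integrates to one and that its moment generating function is $e^{st}/(1 + t\theta)^k$, which is the term entering $\beta(t)$. The sketch below is an illustration only: it assumes that the exponential factor of $J$ is $e^{-(s-m)/\theta}$ (the decaying form that makes $J$ a probability density), truncates the integral at an arbitrary lower bound, and uses a composite Simpson rule.

```python
import math

s, k, theta = 3.0, 2, 4.0   # values of Figure 6

def J(m):
    """Symmetrized gamma kernel: density of s - X, with X ~ Gamma(k, theta)."""
    if m >= s:
        return 0.0
    x = s - m
    return x ** (k - 1) * math.exp(-x / theta) / (math.gamma(k) * theta ** k)

def simpson(f, a, b, n):
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def mgf_numeric(t, lower=-400.0, n=40000):
    """Numerical value of int J(m) exp(t m) dm, truncated at `lower` (assumed cutoff)."""
    return simpson(lambda m: J(m) * math.exp(t * m), lower, s, n)

def mgf_closed(t):
    """Closed form exp(s t) / (1 + t theta)^k, valid for t > -1/theta."""
    return math.exp(s * t) / (1.0 + t * theta) ** k

total_mass = mgf_numeric(0.0)          # should be close to 1
gap = abs(mgf_numeric(0.5) - mgf_closed(0.5))
```

The cutoff at $m = -400$ is harmless for these parameter values because the gamma tail is negligible there; other values of $\theta$ may require a wider interval.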

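Again, we can compute an approximation of $t_0$. Indeed, Proposition 9 yields
$$e^{st_0} \approx \left(1 - \frac{2m^*}{U}\right)(1 + t_0\theta)^2,$$
which is equivalent, via Appendix D, to
$$t_0 \approx -\frac{1}{\theta} - \frac{2}{s}\,W\!\left(\frac{\varepsilon s}{2\theta\sqrt{1 - \frac{2m^*}{U}}}\exp\!\left(-\frac{s}{2\theta}\right)\right),$$
with $\varepsilon = \pm 1$. As $t_0 > 0$, we get:
$$t_0 \approx -\frac{1}{\theta} - \frac{2}{s}\,W_{-1}\!\left[-\frac{s}{2\theta}\sqrt{\frac{U}{U - 2m^*}}\exp\!\left(-\frac{s}{2\theta}\right)\right].$$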

3.3.2 Gaussian distribution of m?

In this section, we study the special case in which the fitness $m^*$ of the migrants follows a Gaussian distribution $\mathcal{N}(\mu, \sigma^2)$. We recall that we have made the assumption $\mu = E[m^*] < 0$. In this case, we have:
$$E[e^{m^* z}] = \exp\!\left(\mu z + \frac{\sigma^2 z^2}{2}\right).$$
Thanks to Theorem 4, we can define the characteristic time as the time $0 < t_0 < \infty$ such that $\langle m \rangle(t_0) = 0$. By (21), this time satisfies:
$$\mu t_0 + \frac{t_0^2 \sigma^2}{2} = -\int_0^{t_0} \beta(s)\,ds. \tag{28}$$

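We will see $t_0$ as a function of four variables, $t_0(d, U, \mu, \sigma)$.

Proposition 11. The function $t_0(d, U, \mu, \sigma)$ does not depend on $d$. Moreover, it satisfies:
$$\frac{\partial t_0}{\partial U} = \frac{t_0}{2U} \cdot \frac{2\mu + t_0\sigma^2}{\mu + t_0\sigma^2 + \beta(t_0)}, \tag{29}$$
$$\frac{\partial t_0}{\partial \mu} = -\frac{t_0}{\mu + t_0\sigma^2 + \beta(t_0)} \tag{30}$$
and
$$\frac{\partial t_0}{\partial \sigma} = -\frac{\sigma t_0^2}{\mu + t_0\sigma^2 + \beta(t_0)}. \tag{31}$$

Proposition 12. We have $\beta(t_0) + t_0\sigma^2 \approx -2\mu$, with
$$\left|\beta(t_0) + t_0\sigma^2 + 2\mu\right| \le \frac{t_0^3}{12}\,U \int_{\mathbb{R}} J(m)\,m^2\,e^{t_0 m}\,dm.$$
By continuity and strict convexity of the map $g : t \mapsto \beta(t) + \sigma^2 t$, we have the approximation:
$$t_0 \approx g^{-1}(-2\mu).$$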

We now derive an explicit formula for $t_0$ for a particular type of mutation kernel.

• Dirac mutation kernel

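Here, we assume $J = \delta_{\mu_J}$. Thus, we have $\beta(t) = U\left(e^{\mu_J t} - 1\right)$, and so, by (28), the characteristic time satisfies
$$e^{\mu_J t_0} + \frac{\sigma^2 \mu_J}{2U}\,t_0^2 - \left(1 - \frac{E[m^*]}{U}\right)\mu_J t_0 - 1 = 0.$$
Thanks to Proposition 12 and Appendix D, we have
$$t_0 \approx \frac{U - 2E[m^*]}{\sigma^2} - \frac{1}{\mu_J}\,W_0\!\left(\frac{U\mu_J}{\sigma^2}\exp\!\left(\frac{(U - 2E[m^*])\,\mu_J}{\sigma^2}\right)\right),$$
where $W_0$ is the principal branch of the Lambert W function: for $\mu_J > 0$ the argument is positive, so this is the only real branch.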
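The approximation $t_0 \approx g^{-1}(-2\mu)$ of Proposition 12 can also be evaluated without the Lambert function, by inverting $g$ numerically. The sketch below does this by bisection and compares the result with the root of equation (28) for a Dirac mutation kernel. It is an illustration only; the parameter values ($U = 0.1$, $\mu_J = 0.1$, $\mu = -0.18$, $\sigma^2 = 0.01$) and the helper names are assumptions.

```python
import math

# Assumed illustrative values.
U, mu_J = 0.1, 0.1           # mutation rate and location of the Dirac kernel
mu, sigma2 = -0.18, 0.01     # mean and variance of the migrant fitness

def beta(t):
    """beta(t) = U (exp(mu_J t) - 1) for the Dirac mutation kernel."""
    return U * (math.exp(mu_J * t) - 1.0)

def integral_beta(t):
    """Closed-form integral of beta over [0, t]."""
    return U * ((math.exp(mu_J * t) - 1.0) / mu_J - t)

def bisect_root(f, lo=0.0, hi=1.0, iterations=200):
    """Root of f, assuming f <= 0 just after lo and f > 0 for large arguments."""
    while f(hi) <= 0.0:          # expand the bracket until the sign changes
        lo, hi = hi, 2.0 * hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def equation_28(t):
    """mu t + sigma^2 t^2 / 2 + int_0^t beta(s) ds, which vanishes at t0."""
    return mu * t + 0.5 * sigma2 * t * t + integral_beta(t)

def g_shifted(t):
    """g(t) + 2 mu, with g(t) = beta(t) + sigma^2 t as in Proposition 12."""
    return beta(t) + sigma2 * t + 2.0 * mu

t0_exact = bisect_root(equation_28)
t0_approx = bisect_root(g_shifted)
```

As in the clonal case, convexity of $\beta$ implies that the trapezoidal approximation lies below the exact characteristic time.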

4 Mathematical analysis of the context-dependent model: small mutation rate

In this part, we focus on the context-dependent case. The distribution of $m^*$ is given by a mutation-selection balance in Fisher's Geometric Model (FGM, see Section 2.1). Under the assumption of a small mutation rate $U_S$, the Moment Generating Function of $m^*$ is given by (1):
$$E[e^{m^* z}] = e^{(m_{\max} + m_D)\,z - U_S/s_H} + \left(1 - e^{-U_S/s_H}\right)e^{m_{\max} z}\,(1 + \lambda z)^{-(\theta - 1)}, \tag{32}$$
where $\theta = n/2$, $s_H = (\theta - 1)\lambda$, $n$ is the phenotypic dimension and $\lambda$ is the mutational variance (in the source). If mutation and selection in the sink are also described by the FGM, the dynamics of the CGF are given by (13).

Formula (32) was derived under the assumption of a small mutation rate in the source. As the sink population is studied over a short period, we assume that the population in the sink does not undergo mutation events, and so we take $U = 0$. Furthermore, as before, we consider the case of an empty sink: $N_0 = 0$.

Problem (13) then becomes equivalent to problem (10) with $U = 0$, so we have the same formula as in Section 3:
$$\langle m \rangle(t) = \frac{E[e^{t m^*}] - 1}{\int_0^t E[e^{\tau m^*}]\,d\tau}.$$
Using (32), we have:
$$\langle m \rangle(t) = \frac{e^{(m_{\max} + m_D)\,t - U_S/s_H} + \left(1 - e^{-U_S/s_H}\right)e^{m_{\max} t}\,(1 + \lambda t)^{-(\theta - 1)} - 1}{\int_0^t E[e^{\tau m^*}]\,d\tau}.$$
As the numerator tends to $+\infty$ as $t$ tends to $+\infty$, we have the theorem below.

Theorem 13. (Finiteness of the characteristic time, context-dependent case)
If the mutation rate is low in the source ($U_S \ll 1$) and vanishes in the sink ($U = 0$), the characteristic time $t_0$ is finite.
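Theorem 13 can be illustrated numerically: since the denominator of $\langle m \rangle(t)$ is positive, $t_0$ is the positive time at which $E[e^{t m^*}]$ returns to 1. The sketch below evaluates (32) and locates this time by bisection. All parameter values are hypothetical and chosen only so that $E[m^*] < 0$ and $m_{\max} > 0$ (so that the numerator eventually grows); they are not taken from this report.

```python
import math

# Hypothetical parameter values, for illustration only.
n = 6                        # phenotypic dimension
lam = 0.05                   # mutational variance lambda
theta = n / 2
s_H = (theta - 1) * lam
U_S = 0.01                   # mutation rate in the source
m_max, m_D = 0.05, -0.2

def mgf(z):
    """Moment generating function (32) of the migrant fitness m*."""
    w = math.exp(-U_S / s_H)
    return (w * math.exp((m_max + m_D) * z)
            + (1.0 - w) * math.exp(m_max * z) * (1.0 + lam * z) ** (-(theta - 1)))

def mean_fitness(t, n_steps=2000):
    """<m>(t) = (E[e^{t m*}] - 1) / int_0^t E[e^{tau m*}] dtau, for t > 0 and U = 0."""
    h = t / n_steps
    total = mgf(0.0) + mgf(t)
    for i in range(1, n_steps):          # composite Simpson rule
        total += (4 if i % 2 else 2) * mgf(i * h)
    return (mgf(t) - 1.0) / (total * h / 3.0)

def characteristic_time(iterations=200):
    """Positive root of mgf(t) = 1, by bracket expansion then bisection."""
    lo, hi = 0.0, 1.0
    while mgf(hi) <= 1.0:
        lo, hi = hi, 2.0 * hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mgf(mid) > 1.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

t0 = characteristic_time()
```

The root is unique here because the moment generating function is convex, equals 1 at $t = 0$ and has a negative derivative there when $E[m^*] < 0$.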

5 Comparison with the (stochastic) Individual-Based Model (IBM)

The goal of this part is to validate the model of Section 2 and the theoretical results of Sections 3 and 4. To this end, we compare these results with individual-based simulations of the stochastic Wright-Fisher model presented in Figure 4. In particular, we compare expected population sizes, expected mean fitnesses and characteristic times. In the context-independent case, we also study the dependence of the characteristic time on the model parameters.

5.1 Context-independent models

We assume that the distribution of the migrants $p^*$ is a Dirac mass at $m^* = -0.18$. We take the migration rate $d$ equal to 90 (so that $d = -500\,m^*$) and the mean of the mutation kernel $\mu_J$ equal to 0.1. We focus on two cases: a Dirac mutation kernel and a Gaussian mutation kernel with variance $\sigma_J^2 = 0.01$. To study weak and strong mutation regimes, we take the mutation rate equal to either $0.1 \times \mu_J$ ($= 0.01$) or $\mu_J$.

Our PDE framework (10) gives accurate predictions of both $\langle m \rangle(t)$ and $N(t)$ (respectively (16) and (17)) at small times (Figures 7 and 8). Then, the accuracy of the predictions of our PDE framework at later times depends on the
