• Aucun résultat trouvé

Inference for epidemic data using diffusion processes with small diffusion coefficient

N/A
N/A
Protected

Academic year: 2021

Partager "Inference for epidemic data using diffusion processes with small diffusion coefficient"

Copied!
21
0
0

Texte intégral

(1)

HAL Id: hal-02803499

https://hal.inrae.fr/hal-02803499

Submitted on 5 Jun 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Inference for epidemic data using diffusion processes with small diffusion coefficient

Romain Guy, Catherine Laredo, Elisabeta Vergu

To cite this version:

Romain Guy, Catherine Laredo, Elisabeta Vergu. Inference for epidemic data using diffusion processes

with small diffusion coefficient. Rencontres des jeunes statisticiens Français 2011, Société Française

de Statistique (SFdS). FRA., Aug 2011, Aussois, France. pp.1. �hal-02803499�

(2)

Back to the epidemics and simulations results

Inference for epidemic data using diusion processes with small diusion coecient

Romain GUY1,2 with C. Laredo1,2and E. Vergu2

1Laboratoire de Probabilités et modèles aléatoires (Paris Diderot) 2Unité Mathématiques et Informatique Appliquées, INRA, Jouy-en-Josas

4èmes Rencontres Des Jeunes Statisticiens (Aussois) 7 septembre 2011

(3)

Back to the epidemics and simulations results

Introduction

A work guided by data

Estimation of key parameters of the epidemic dynamics Often low frequency time series

Often aggregated and incomplete datasets.

Objective

Provide estimators with good properties for the diusion model approximating the epidemic dynamics

(4)

Back to the epidemics and simulations results

Outline

1 Classical SIR epidemic models

2 Parametric inference for discretely observed diusion processes

3 Back to the epidemics and simulations results

(5)

Back to the epidemics and simulations results

Notations and model assumptions

Notations

N : population size ; m : initial infectives

λ: transmission rate ;γ : recovery rate (=1/mean holding time in infective state)

R0=λ/γ: basic reproduction number (=mean number of secondary infections generated by an infective in an entirely susceptible population)

S(t),I(t): numbers of susceptibles, infectives ; s(t) = SN(t),i(t) = I(Nt) : proportion of susceptibles, infectives

Assumptions

Homogenous mixing in closed population

Discrete observations of S and I on a xed interval[0,T], with sampling interval∆(T =n∆) (acceptable assumption as a rst attempt)

S λI/N I ɣ R

(6)

Back to the epidemics and simulations results

Markov pure jump model and Inference

Let Xt= (St,It)and X0= (N−m,m). Transitions

(S,I)−→λNSI (S−1,I+1)and(S,I)−→γI (S,I−1) Exponential holding times

Observations

All jumps are observed

Maximum Likelihood Estimators (MLE) and asymptotic normality (Andersson and Britton 2000)

ˆλMLE =N N−m−S(T) RT

0 S(t)I(t)dt,ˆγMLE =N−S(T)−I(T) RT

0 I(t)dt

√N

ˆλMLE−λ0 ˆ

γMLE−γ0

N−→→∞N

0,

var(λ0) 0 0 var(γ0)

with var(λ0),var(γ0)being explicit

(7)

Back to the epidemics and simulations results

ODE model and Inference (as a rst approximation)

Let xλ,γ(t) = (s(t),i(t))and(s(0),i(0)) = (1−mN,mN).

Classical ODE system (for N large) ds

dt =−λsi and didt =λsi−γi, Observations

Discrete observations Xtk =xλ,γ(tk) +k at times tk=k∆(k=0, ...,n), with k

iidN2(0,Σ)

Least Square Estimator (LSE) and asymptotic normality LSE(λ, γ) = 1n

n

X

k=0

(Xtk−xλ,γ(tk))2,(ˆλLSE,γˆLSE) =argmin

(λ,γ)∈Θ

LSE(λ, γ)

√nλˆLSE−λ0 ˆ γLSE−γ0

n−→→∞N(0,Σ)

(8)

Back to the epidemics and simulations results

Diusion approximation model

Let Xt= (st,it),(s0,i0) = (1−mN,mN)and B1,B2, two independent Brownians motions.

Stochastic Dierential Equation dst = −λstitdt+1

N

√λstitdB1(t)

dit = (λstit−γit)dt−1

N

√λstitdB1(t) +1N

γitdB2(t)

Remarks

Classical approximation : before passing to the limit (N→ ∞) in the normalized system of the Markov jump process

MLE untractable when the path is discretely observed Framework

Multidimensionnal diusion processes Small noise∼1N in large population

Parameters(λ, γ)both in drift and diusion coecients

(9)

Back to the epidemics and simulations results

General diusion model and existing results

Let Xtbe the unique strong solution of the SDE dXt=b(α,Xt)dt+σ(β,Xt)dBt, X0=x0∈Rp

We observe Xtat times tk=k∆on a xed interval[0,T](T =n∆) σ(β,x)∈Mp(R),b(α,x)∈Rp,Σ(β,x) =σ(β,x)tσ(β,x)invertible.

Existing estimation results for high-frequency data (Gloter and Sorensen (2009))

Under the condition∃ρ >0,n1ρ bounded, for a class of contrast processes, associated Minimum Contrast Estimators (MCEs) are consistent and

1( ˆα,n−α0)

√n( ˆβ,n−β0)

n→∞,→−→ 0N

0,Ib10, β0) 0 0 Iσ10, β0)

with Ib10, β0)explicit and optimal (= asymptotic variance for continuous observations on[0,T]).

For epidemics :1=√ N

(10)

Back to the epidemics and simulations results

Main idea of our inference approach (extension of Genon-Catalot(90))

Use of Taylor's stochastic expansion formula (Azencott (82)) Xt=xα(t) +gα,β(t) +2Rα,β (t)

where xα(t)is the deterministic solution dxαdt(t)=b(α,xα(t)), x(0) =x0∈Rp, dgα,β(t) = ∂b

∂x(α,xα(t))gα,β(t)dt+σ(β,xα(t))dBt, gα,β(0) =0Rp

and Rα,β satises

t∈[sup0,T]

{kRα,β (t)k} −→

P,→00.

LetΦαbe the invertible matrix solution of ddtΦα(t,t0) = ∂bx(α,xα(t))Φα(t,t0), withΦα(t0,t0) =Ip.

Properties of gα,β

gα,β is a Gaussian process for which we can obtain the analytic expression.

gα,β(tk) = Φα(tk,tk1)gα,β(tk1) +√

∆Zkα,β Zkα,β are independentN

0,Skα,β

variables.

(11)

Back to the epidemics and simulations results

Main idea of our inference approach (extension of Genon-Catalot(90))

Contrast process derived from(Zkα,β)k-likelihood U∆,(α, β) = 2

n

X

k=1

logh det

Skα,βi + 1

n

X

k=1

tNk(α)(Skα,β)1Nk(α) with Nk(α) = Xtk−xα(tk)−Φα(tk,tk1)h

Xtk−1−xα(tk1)i . ( ˆα,∆,βˆ,∆) =argmin

(α,β)∈ΘU∆,(α, β)

(12)

Back to the epidemics and simulations results

Results for low frequency data ( ∆ and n xed)

No asymptotic results forβˆ,∆

βknown

We only considerαˆ,∆0) =argmin

α∈Θa U∆,(α, β0)and then 1( ˆα,∆0)−α0)−→

0N(0,I10, β0)),with I0, β0) −→

∆→0Ib0, β0)

βunknown

We modify the contrast process in a conditional least square contrast U˜,∆ α,(Xtk)k∈{1,..,n}

= 1

n

X

k=1

tNk(α)Nk(α).

Thenαˆ=argmin

α∈Θa

,∆(α) satises1( ˆα−α0)−→

0N(0,˜I10, β0)).

For epidemics :α= (λ, γ) =β⇒Special case, results for knownβhold if we replace eachβoccurence withα.

(13)

Back to the epidemics and simulations results

Results for high frequency data ( ∆ → 0)

Contrast process

UsingkSkα00−Σ(β0,Xtk1)k −→

,∆→00, we consider : U∆,(α, β)) = 2

n

X

k=1

logh det

Σ(β,Xtk−1)i + 1

n

X

k=1

tNk(α)Σ1(β,Xtk1)Nk(α)

Asymptotic Normality Under the condition2n −→

,∆→00 1( ˆα,∆−α0)

√n( ˆβ,∆−β0)

n→∞,→−→ 0N

0,

Ib10, β0) 0 0 Iσ10, β0)

.

(14)

Back to the epidemics and simulations results

Simulation study (using Matlab)

Algorithm

1 Exact simulation of an epidemic with Markov pure jump process (Gillespie algorithm with choice of N,m, λ, γ)

2 Calculation ofˆλMLE,γˆMLE (observation of the whole path of the process)

3 Discrete observations on a xed interval

4 Estimation phase for LSE, Gloter and Sorensen (2009) contrast, our MCE forα=β, and for unknownβ(conditionnaly least square contrast).

Numerical optimisation using fminsearch.

Results

Mean of the point estimation on 1000 runs, theoretical condence intervals for MCEs, empirical condence interval for LSE, for each scenario and for each of the four estimators (+ MLE as a reference)

Simulated values

N∈[1000;10000],∆∈[T/10;1;T/100],γ=1/3, R0∈[1.2;2]

(15)

Back to the epidemics and simulations results

Example of trajectories : proportion of infectives over time (N = 1000, R

0

= 2)

Figure:Trajectories of the three processes for N=1000 R0=2,γ=1/3, T=40 and 1% of initial infectives

(16)

Back to the epidemics and simulations results

Simulation results for N = 1000, R

0

= 2, γ = 1 / 3, T = 40 and n = 10 , 40 , 100

Notations

green arrows = best ponctual estimator for one scenario

For each scenario, and each parameter in order : LSE, GS09(α), MCE(α=β), MCE(βunknown)

(17)

Back to the epidemics and simulations results

Simulation results for N = 10000, R

0

= 2, γ = 1 / 3, T = 40 and

n = 10 , 40 , 100

(18)

Back to the epidemics and simulations results

Another Case : R

0

= 1 . 2

Figure:Trajectories of the three processes for N=1000, R0=1.2,γ=1/3, T=65 and 1% of initial infectives

(19)

Back to the epidemics and simulations results

Simulation results for N = 1000 , 10000, T = 65 and n = 10 , 65 , 100

N=1000

N=10000

(20)

Back to the epidemics and simulations results

Limits and perspectives

To be rened

1 Limits of the SIR model : acceptable as a rst attempt, possible to be rened to integrate more realistic assumptions (e.g. increase the state space dimension)

2 Idealized statistical framework : here the two system coordinates(st,it)are assumed observed (instead of a function of it, a more realistic assumption)

Next directions (work in progress)

1 Complexify the model : no diculty if the new system is still an autonomous system, regardless to its size

2 Modify the statistical framework : observe integrated diusion

(21)

Back to the epidemics and simulations results

References

References

H. Andersson and T. Britton.

Stochastic epidemic models and their statistical analysis.

Springer, 2000.

S.N. Ethier and T.G. Kurtz.

Markov processes : characterization and convergence.

Wiley, 1986.

V. Genon-Catalot.

Maximum contrast estimation for diusion processes from discrete observations.

Statistics, 1990.

A. Gloter and M. Sorensen.

Estimation for stochastic dierential equations with a small diusion coecient.

Stochastic Processes and their Applications, 2009.

Thank you for your attention !

Références

Documents relatifs

3 Statistical inference for discretely observed diffusions with small diffusion coefficient, theoretical and Simulation results.. 4 Ongoing Work : statistical inference for

After proving consistency and asymptotic normality of the estimator, leading to asymptotic confidence intervals, we provide a thorough numerical study, which compares many

where the first bound is the Agmon estimate of theorem 12 on eigenfunctions coming from small eigenvalues, and the second bound comes from the fact that (H i r 0 (ε) − a) −1 is

It shows that if every member of a hereditary family of graphs admits a small separator, then the number of edges of each graph in this family is at most linear in the number

The submis- sions of each team were evaluated by an independent panel of scientists and the results of the Species Translation Challenge were presented, and best performers

In order to investigate further and better constrain the mass growth and evolution in the number of satellite galaxies per halo over a larger redshift range, one needs to have

Progressive invagination due to the first bottle cell differentiation in gastrulation process: experimental data (top left, from http://www.molbio1.princeton.edu/wieschaus/);

We consider here a parametric framework with distributions leading to explicit approximate likelihood functions and study the asymptotic behaviour of the associated estimators under