HAL Id: hal-02806254
https://hal.inrae.fr/hal-02806254
Submitted on 6 Jun 2020
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Inference for epidemic models using partially observed diffusions with small diffusion coefficient
Romain Guy, Catherine Laredo, Elisabeta Vergu
To cite this version:
Romain Guy, Catherine Laredo, Elisabeta Vergu. Inference for epidemic models using partially observed diffusions with small diffusion coefficient. Séminaire de Statistique Parisien, May 2013, Paris, France. 35p. hal-02806254
Inference for epidemic models using partially observed diffusions with small diffusion coefficient
Romain GUY (1,2)
Joint work with C. Larédo (1,2) and E. Vergu (1)
(1) UR 341, MIA, INRA, Jouy-en-Josas
(2) UMR 7599, LPMA, Université Paris Diderot
13 May 2013
Romain GUY (1,2), joint work with C. Larédo (1,2) and E. Vergu (1). Inference for epidemic models. 13 May 2013.
Outline
1 Context for transmissible diseases (data, models and statistical inference)
2 Formalism of the mechanistic model: diffusions with small diffusion coefficient ($\varepsilon = 1/\sqrt{N_{pop}}$)
3 Statistical inference for discretely observed diffusions with small diffusion coefficient: theoretical and simulation results
4 Ongoing work: statistical inference for partially and discretely observed diffusions with small diffusion coefficient. Theoretical results and application to Sentinelles ILI data
Context of transmissible diseases (individual infection)
Scheme characterizing the modeling approach
Compartmental approach with natural health states: Susceptible (S), Exposed (E), Infected (I), Removed (R)
Infection difficult to detect (One mechanism leading to partial observations)
Context of transmissible diseases (several individuals)
⇒ Additional uncertainties (Temporally aggregated data,...)
Goal : assessing key epidemic parameters
Different dynamics (single and recurrent epidemics)
Influenza-like illness (ILI) cases from the Sentinelles surveillance network
The mechanistic model (Single outbreak)
Defined by the number of health states, the possible transitions and their rates
SIR model
Notations:
- $N_{pop}$: population size; $\lambda$: transmission rate; $\gamma$: recovery rate
- $S$, $I$, $R$: numbers of susceptible, infected and removed individuals
Assumptions:
- Closed population of size $N_{pop}$ (for each $t$, $S + I + R = N_{pop}$)
- Well-mixed population: $(S, I) \to (S - 1, I + 1)$ at rate $\lambda S I / N_{pop}$
Key epidemic parameters:
- Basic reproduction number $R_0 = \lambda / \gamma$
- Mean infectious period $d = 1/\gamma$
The mechanistic model (Multiple outbreaks)
SIRS model with seasonal forcing
Additional notations:
- $\delta$: waning immunity rate (years$^{-1}$); $\mu$: demographic renewal rate (decades$^{-1}$)
- $\lambda(t) = \lambda_0 (1 + \lambda_1 \sin(2\pi t / T_{per}))$, with $\lambda_0$ the baseline transmission rate and $\lambda_1$ the seasonal effect
Important: $\lambda_1 = 0$ ⇒ damped oscillations ⇒ temporal forcing is required
A model appropriate for very large populations
Key epidemic parameters: $R_0 = \lambda / (\gamma + \mu)$, $d = 1/\gamma$, mean immunity period $1/(\delta T_{per})$
Mathematical formalisms and statistical inference
$p$: number of health states
Deterministic modeling with noise (ODE solutions on $\mathbb{R}^p$)
- Statistical inference mainly based on least squares methods
- Generally used when $N$ is very large
Stochastic modeling with pure jump processes (state space $\mathbb{N}^p$)
- Exponential holding times ⇒ Markov jump processes
Statistical inference for stochastic models
- MLE: observation of all jumps required ⇒ infection and recovery times observed for all individuals
- Incomplete observations: data augmentation methods (e.g. Breto et al (09) for particle-based likelihood filtering, Toni et al (09) for the ABC SMC approach) ⇒ computer-intensive simulations (too long for large $N$)
- Not addressed: parameter identifiability given the data
Diffusion processes: a good alternative
Different mathematical formalisms for a given mechanistic model
⇒ Convenient summary of the mechanistic information: the coefficients $\alpha_L$
- Let $X \in \mathbb{N}^p$ be the number of individuals in each health state
- Let $L \in \mathbb{Z}^p$ be a fixed transition between two health states: $X \to X + L$
- $\alpha_L(X)$: the transition rate associated with transition $L$ given the state $X$
- $\alpha_L(t, X(t))$ for time-dependent mechanistic models
Example: the SIR model in a closed population ($p = 2$, $X = (S, I)$)
- $(S, I) \to (S - 1, I + 1) = (S, I) + (-1, 1)$ at rate $\alpha_{(-1,1)}(S, I) = \lambda S I / N$
- $(S, I) \to (S, I - 1) = (S, I) + (0, -1)$ at rate $\alpha_{(0,-1)}(S, I) = \gamma I$
Different mathematical formalisms from the summary $\alpha_L$
Markov jump process $Z_t$: direct (transition rates $q_{x,x+L} = \alpha_L(x)$)
For the other models: normalized version of $\alpha_L$ ⇒ $\beta_L$
- For $x \in [0,1]^p$, $\beta_L(x) = \lim_{N \to \infty} \frac{1}{N} \alpha_L(\lfloor N x \rfloor)$
- Example SIR: $\alpha_{(-1,1)}(S, I) = \lambda S I / N$ ⇒ $\beta_{(-1,1)}(s, i) = \lambda s i$
ODE solution $x(t)$ on $[0,1]^p$
- $x(t)$ solves $\frac{dx(t)}{dt} = b(x(t))$ with $b(x) = \sum_L \beta_L(x) L$
Diffusion $X_t$ on $[0,1]^p$ ($\varepsilon = 1/\sqrt{N_{pop}}$)
- $X_t$ solves $dX_t = b(X_t)\,dt + \frac{1}{\sqrt{N_{pop}}} \sigma(X_t)\,dB_t$ with $\Sigma(x) = \sum_L \beta_L(x) L L^T$ and $\Sigma = \sigma \sigma^T$
Gaussian process $g(t)$ ($Y_t = x(t) + \frac{1}{\sqrt{N}} g(t)$ in $\mathbb{R}^p$)
- Definition: $\Phi(t,s)$ solves $\frac{d\Phi(t,s)}{dt} = \frac{\partial b}{\partial x}(x(t))\,\Phi(t,s)$, $\Phi(s,s) = I_d$
- $g(t)$ is a centered Gaussian process with covariance $\mathrm{cov}(g(t), g(r)) = \int_0^{t \wedge r} \Phi(t,s)\,\Sigma(x(s))\,\Phi(r,s)^T\,ds$
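The construction of $b$ and $\Sigma$ from the list of transitions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`sir_transitions`, `drift`, `diffusion_matrix`) and the numerical values of $\lambda$, $\gamma$ and $x$ are hypothetical and not taken from the talk; only the formulas $b(x) = \sum_L \beta_L(x) L$ and $\Sigma(x) = \sum_L \beta_L(x) L L^T$ come from the slide.

```python
# Sketch: build b(x) and Sigma(x) from a list of (jump vector L, normalized rate beta_L).
# Function names and numerical values are illustrative assumptions.

def sir_transitions(lam, gamma):
    """Transitions of the SIR model for the normalized state x = (s, i)."""
    return [
        ((-1, 1), lambda x: lam * x[0] * x[1]),  # infection: beta = lambda * s * i
        ((0, -1), lambda x: gamma * x[1]),       # recovery:  beta = gamma * i
    ]

def drift(transitions, x):
    """b(x) = sum over L of beta_L(x) * L."""
    p = len(x)
    b = [0.0] * p
    for jump, beta in transitions:
        rate = beta(x)
        for j in range(p):
            b[j] += rate * jump[j]
    return b

def diffusion_matrix(transitions, x):
    """Sigma(x) = sum over L of beta_L(x) * L L^T (outer product)."""
    p = len(x)
    sigma = [[0.0] * p for _ in range(p)]
    for jump, beta in transitions:
        rate = beta(x)
        for a in range(p):
            for c in range(p):
                sigma[a][c] += rate * jump[a] * jump[c]
    return sigma

trans = sir_transitions(lam=0.5, gamma=0.25)  # hypothetical parameter values
x = (0.8, 0.1)                                # hypothetical state (s, i)
b = drift(trans, x)
Sigma = diffusion_matrix(trans, x)
print("b =", b)
print("Sigma =", Sigma)
```

Any other compartmental model can be plugged in by changing only the list of transitions, which is the point of summarizing the mechanistic information through the coefficients.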
Links between the processes (homogeneous case: Ethier & Kurtz (05))
Starting from the Markov jump process $Z_t$:
- LLN: $\frac{Z_t}{N_{pop}} \to x(t)$ as $N_{pop} \to \infty$
- CLT: $\sqrt{N_{pop}} \left( \frac{Z_t}{N_{pop}} - x(t) \right) \to g(t)$ as $N_{pop} \to \infty$
Taylor's stochastic expansion ($\varepsilon = 1/\sqrt{N}$): $X_t = x(t) + \varepsilon g(t) + O_P(\varepsilon^2)$
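Application: diffusion approximation of the SIR epidemic
$b(x) = \sum_L \beta_L(x) L$, $\Sigma(x) = \sum_L \beta_L(x) L L^T$, $\Sigma = \sigma \sigma^T$
The functions $b$ and $\Sigma$ depend on the two parameters $(\lambda, \gamma)$ ⇒ $b_{(\lambda,\gamma)}(s,i)$, $\Sigma_{(\lambda,\gamma)}(s,i)$:
$$b_{(\lambda,\gamma)}(s,i) = \lambda s i \begin{pmatrix} -1 \\ 1 \end{pmatrix} + \gamma i \begin{pmatrix} 0 \\ -1 \end{pmatrix} = \begin{pmatrix} -\lambda s i \\ \lambda s i - \gamma i \end{pmatrix}$$
$$\Sigma_{(\lambda,\gamma)}(s,i) = \lambda s i \begin{pmatrix} -1 \\ 1 \end{pmatrix} \begin{pmatrix} -1 & 1 \end{pmatrix} + \gamma i \begin{pmatrix} 0 \\ -1 \end{pmatrix} \begin{pmatrix} 0 & -1 \end{pmatrix} = \begin{pmatrix} \lambda s i & -\lambda s i \\ -\lambda s i & \lambda s i + \gamma i \end{pmatrix}$$
$$\begin{cases} dS_t = -\lambda S_t I_t\,dt + \frac{1}{\sqrt{N}} \sqrt{\lambda S_t I_t}\,dB_1(t) \\ dI_t = (\lambda S_t I_t - \gamma I_t)\,dt - \frac{1}{\sqrt{N}} \sqrt{\lambda S_t I_t}\,dB_1(t) + \frac{1}{\sqrt{N}} \sqrt{\gamma I_t}\,dB_2(t) \end{cases}$$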
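A minimal Euler-Maruyama sketch of the SIR diffusion above may help fix ideas. It is an illustration, not the authors' code: the function name, the time step, the clipping of the state to $[0,1]$ and the parameter values are all assumptions made for the example.

```python
import math
import random

def simulate_sir_diffusion(lam, gamma, n_pop, s0, i0, t_end, dt, seed=0):
    """Euler-Maruyama scheme for the SIR diffusion with eps = 1/sqrt(n_pop).

    Returns a list of (t, s, i). Clipping to [0, 1] is an ad hoc safeguard
    (an assumption of this sketch), not part of the model.
    """
    rng = random.Random(seed)
    eps = 1.0 / math.sqrt(n_pop)
    s, i = s0, i0
    path = [(0.0, s, i)]
    n_steps = int(round(t_end / dt))
    for k in range(1, n_steps + 1):
        infection = lam * s * i      # beta_(-1,1)(s, i)
        recovery = gamma * i         # beta_(0,-1)(s, i)
        db1 = rng.gauss(0.0, math.sqrt(dt))  # increment of B_1
        db2 = rng.gauss(0.0, math.sqrt(dt))  # increment of B_2
        ds = -infection * dt + eps * math.sqrt(infection) * db1
        di = ((infection - recovery) * dt
              - eps * math.sqrt(infection) * db1
              + eps * math.sqrt(recovery) * db2)
        s = min(max(s + ds, 0.0), 1.0)
        i = min(max(i + di, 0.0), 1.0)
        path.append((k * dt, s, i))
    return path

# Hypothetical scenario: R0 = lam / gamma = 1.5, d = 1 / gamma = 3 days, T = 50 days.
path = simulate_sir_diffusion(lam=0.5, gamma=1.0 / 3.0, n_pop=10**6,
                              s0=0.99, i0=0.01, t_end=50.0, dt=0.1, seed=1)
print("final (t, s, i):", path[-1])
```

The same Brownian increment `db1` enters both coordinates with opposite signs, which reproduces the off-diagonal term $-\lambda s i$ of $\Sigma$.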
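Generalization to time-dependent processes (Guy et al. (13)): case of the SIRS model
Description of $\beta_L(t, x)$:
- $\beta_{(-1,+1)}(t,s,i) = \lambda(t) s (i + \eta)$, with $\lambda(t) = \lambda_0 (1 + \lambda_1 \sin(2\pi t / T_{per}))$
- $\beta_{(0,-1)}(s,i) = (\gamma + \mu) i$, $\beta_{(-1,0)}(s,i) = \mu s$, $\beta_{(1,0)}(s,i) = \mu + \delta(1 - s - i)$
$b$ and $\Sigma$ ($\theta = (\lambda_0, \lambda_1, \gamma, \delta, \eta, \mu)$):
$$b_\theta(t,(s,i)) = \begin{pmatrix} -\lambda(t) s (i + \eta) + \delta(1 - s - i) + \mu(1 - s) \\ \lambda(t) s (i + \eta) - (\gamma + \mu) i \end{pmatrix}$$
$$\Sigma_\theta(t,(s,i)) = \begin{pmatrix} \lambda(t) s (i + \eta) + \delta(1 - s - i) + \mu(1 + s) & -\lambda(t) s (i + \eta) \\ -\lambda(t) s (i + \eta) & \lambda(t) s (i + \eta) + (\gamma + \mu) i \end{pmatrix}$$
(the second row of $\Sigma_\theta$ follows from $\Sigma = \sum_L \beta_L L L^T$ applied to the four transitions above)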
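The deterministic limit of this SIRS model is the solution of $dx/dt = b_\theta(t, x)$. The sketch below integrates it with a classical fourth-order Runge-Kutta scheme. The function names, the step size and the conversion of all rates to day$^{-1}$ are assumptions of the example; the numerical values are borrowed from the scenarios quoted elsewhere in the talk ($R_0 = 1.5$, $1/\gamma = 3$ days, $1/\delta = 2$ years, $\mu = 1/50$ years$^{-1}$, $\eta = 10^{-6}$, $\lambda_1 = 0.15$), and $\lambda_0$ is set to $R_0(\gamma + \mu)$, which assumes that $R_0$ is defined through the baseline rate.

```python
import math

def sirs_drift(t, s, i, lam0, lam1, gamma, delta, eta, mu, t_per):
    """Drift b_theta(t, (s, i)) of the seasonally forced SIRS model."""
    lam_t = lam0 * (1.0 + lam1 * math.sin(2.0 * math.pi * t / t_per))
    force = lam_t * s * (i + eta)
    ds = -force + delta * (1.0 - s - i) + mu * (1.0 - s)
    di = force - (gamma + mu) * i
    return ds, di

def integrate_sirs(params, s0, i0, t_end, h):
    """Classical RK4 integration of dx/dt = b_theta(t, x); returns [(t, s, i)]."""
    s, i, t = s0, i0, 0.0
    traj = [(t, s, i)]
    for step in range(int(round(t_end / h))):
        k1 = sirs_drift(t, s, i, **params)
        k2 = sirs_drift(t + h / 2, s + h / 2 * k1[0], i + h / 2 * k1[1], **params)
        k3 = sirs_drift(t + h / 2, s + h / 2 * k2[0], i + h / 2 * k2[1], **params)
        k4 = sirs_drift(t + h, s + h * k3[0], i + h * k3[1], **params)
        s += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        i += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t = (step + 1) * h
        traj.append((t, s, i))
    return traj

# All rates in day^-1 (unit conversion is an assumption of this sketch).
gamma = 1.0 / 3.0
mu = 1.0 / (50 * 365.0)
params = dict(lam0=1.5 * (gamma + mu), lam1=0.15, gamma=gamma,
              delta=1.0 / (2 * 365.0), eta=1e-6, mu=mu, t_per=365.0)
traj = integrate_sirs(params, s0=0.72, i0=1e-4, t_end=20 * 365.0, h=0.5)
print("peak proportion of infected:", max(i for _, _, i in traj))
```

Plotting `i` against `t` for several values of `lam1` is a simple way to explore how the seasonal amplitude changes the shape of the recurrent outbreaks.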
Specificities for diffusions with small diffusion coefficient
Model: $(X_t)$ multidimensional diffusion process on $\mathbb{R}^p$,
$dX_t = b(\theta_1, X_t)\,dt + \varepsilon \sigma(\theta_2, X_t)\,dB_t$, $X_0 = x_0 \in \mathbb{R}^p$, $\theta = (\theta_1, \theta_2)$
Different rates of convergence for the parameters in the drift and in the diffusion coefficient ⇒ splitting the parameters into $\theta_1$, $\theta_2$ is required
Continuous observation of the diffusion on $[0, T]$ (Kutoyants (80))
- MLE: $\varepsilon^{-1} \left( \hat\theta_1^{MLE} - \theta_1^0 \right) \to \mathcal{N}\left(0, I_{opt}^{-1}(\theta_1^0, \theta_2^0)\right)$
Discrete observations of the diffusion on $[0, T]$
- Minimum contrast approaches: estimation of $\theta_2$ at rate $\sqrt{n}$ (with $n$ the number of observations ⇒ sampling interval $\Delta = \Delta_n \to 0$)
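The case of a discretely observed multidimensional diffusion
$dX_t = b(\theta_1, X_t)\,dt + \varepsilon \sigma(\theta_2, X_t)\,dB_t$
Observations: discrete observations with sampling interval $\Delta$ on $[0, T]$: $X_{t_k}$ for $t_k = k\Delta$, $k \in \{0, .., n\}$, $t_k \in [0, T]$ ($n\Delta = T$), $T$ fixed
Asymptotics: $\varepsilon, \Delta \to 0$
Likelihood not available ⇒ contrast-based approaches
$$U_{\varepsilon,\Delta}((\theta_1, \theta_2), (X_{t_k})_k) = \sum_{k=1}^{n} \left[ \log\det V_k(\theta_2) + \frac{1}{\varepsilon^2 \Delta} B_k(\theta_1)^T V_k^{-1}(\theta_2) B_k(\theta_1) \right]$$
$$(\hat\theta_1, \hat\theta_2) = \operatorname{argmin}_{\theta_1, \theta_2} U_{\varepsilon,\Delta}$$
Existing results: Sorensen & Uchida (03), with
$V_k(\theta_2) = \Sigma(\theta_2, X_{t_{k-1}})$ and $B_k(\theta_1) = X_{t_k} - X_{t_{k-1}} - \Delta\, b(\theta_1, X_{t_{k-1}})$:
$$\begin{pmatrix} \varepsilon^{-1} (\hat\theta_1 - \theta_1^0) \\ \sqrt{n} (\hat\theta_2 - \theta_2^0) \end{pmatrix} \xrightarrow[n \to \infty,\ \varepsilon \to 0]{} \mathcal{N}\left( 0, \begin{pmatrix} I_{opt}^{-1}(\theta_1^0, \theta_2^0) & 0 \\ 0 & I_\sigma^{-1}(\theta_1^0, \theta_2^0) \end{pmatrix} \right)$$
under the condition $\sqrt{\Delta} / \varepsilon \to 0$
⇒ Generalized by Gloter & Sorensen (09): $\exists \rho > 0$, $\Delta^\rho / \varepsilon$ bounded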
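As an illustration of the high-frequency contrast with $V_k = \Sigma(\theta_2, X_{t_{k-1}})$ and $B_k = X_{t_k} - X_{t_{k-1}} - \Delta\, b(\theta_1, X_{t_{k-1}})$, here is a sketch for the SIR diffusion in the epidemic setting $\theta_1 = \theta_2 = (\lambda, \gamma)$. The function names, the synthetic data (generated by an Euler scheme with the same step) and all numerical values are assumptions of the example.

```python
import math
import random

def euler_sir(lam, gamma, eps, x0, delta, n, seed=0):
    """Synthetic data: Euler scheme for the SIR diffusion (illustrative only)."""
    rng = random.Random(seed)
    s, i = x0
    obs = [(s, i)]
    for _ in range(n):
        inf, rec = lam * s * i, gamma * i
        w1 = rng.gauss(0.0, math.sqrt(delta))
        w2 = rng.gauss(0.0, math.sqrt(delta))
        s, i = (s - inf * delta + eps * math.sqrt(inf) * w1,
                i + (inf - rec) * delta
                - eps * math.sqrt(inf) * w1 + eps * math.sqrt(rec) * w2)
        obs.append((s, i))
    return obs

def high_frequency_contrast(theta, obs, delta, eps):
    """Sum over k of log det V_k + B_k^T V_k^{-1} B_k / (eps^2 * delta)."""
    lam, gamma = theta
    total = 0.0
    for k in range(1, len(obs)):
        s, i = obs[k - 1]
        inf, rec = lam * s * i, gamma * i
        b1 = obs[k][0] - s + delta * inf           # first coordinate of B_k
        b2 = obs[k][1] - i - delta * (inf - rec)   # second coordinate of B_k
        det = inf * rec                            # det of [[inf, -inf], [-inf, inf + rec]]
        # The inverse of V_k is (1 / det) * [[inf + rec, inf], [inf, inf]].
        quad = ((inf + rec) * b1 * b1 + 2.0 * inf * b1 * b2 + inf * b2 * b2) / det
        total += math.log(det) + quad / (eps * eps * delta)
    return total

eps = 1e-3  # corresponds to N_pop = 10**6
obs = euler_sir(0.5, 1.0 / 3.0, eps, (0.99, 0.01), delta=0.1, n=300, seed=2)
u_true = high_frequency_contrast((0.5, 1.0 / 3.0), obs, 0.1, eps)
u_wrong = high_frequency_contrast((0.7, 1.0 / 3.0), obs, 0.1, eps)
print(u_true, u_wrong)
```

In practice the contrast would be passed to a numerical optimizer; the comparison of two parameter values is kept here only to show that the true value gives the smaller contrast on these synthetic data.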
Statistical framework for epidemic diffusion models
$dX_t = b(\theta_1, X_t)\,dt + \varepsilon \sigma(\theta_2, X_t)\,dB_t$
Specificities of epidemics:
- $\theta_1 = \theta_2$
- $\varepsilon = 1/\sqrt{N_{pop}}$. In general: $N_{pop} \gg n$ and $\Delta \ge 1$ day
- $I_{opt}(\theta_1^0, \theta_2^0)$ (information matrix on $\theta_1$) is also the Fisher information matrix of the Markov jump process MLE
⇒ Need for a general approach valid for any value of $\Delta$
⇒ Consider the asymptotics $\varepsilon \to 0$, $\Delta$ fixed
⇒ Focus on the estimation of $\theta_1$
Problem: impossible to estimate $\theta_2$ in this asymptotic regime ⇒ use the epidemic specificities
Idea: contrast process based on the Gaussian process $g(t)$ (generalization of Genon-Catalot (90))
Taylor's stochastic expansion: $X_t = x(t) + \varepsilon g(t) + O_P(\varepsilon^2)$
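Main idea
Important property of $g_\theta$: $g_\theta(t_k) = \Phi_{\theta_1}(t_k, t_{k-1})\, g_\theta(t_{k-1}) + \sqrt{\Delta}\, Z_k^\theta$
$(Z_k^\theta)_{k \in \{1,..,n\}}$: independent Gaussian vectors with covariance matrix $S_k^\theta$,
$$S_k^\theta = \frac{1}{\Delta} \int_{t_{k-1}}^{t_k} \Phi_{\theta_1}(t_k, s)\, \Sigma(\theta_2, x_{\theta_1}(s))\, \Phi_{\theta_1}(t_k, s)^T\, ds$$
Choice of the functions $B_k$ and $V_k$. Recall
$$U_{\varepsilon,\Delta}((\theta_1, \theta_2), (X_{t_k})_k) = \sum_{k=1}^{n} \left[ \log\det V_k(\theta_2) + \frac{1}{\varepsilon^2 \Delta} B_k(\theta_1)^T V_k^{-1}(\theta_2) B_k(\theta_1) \right]$$
⇒ $B_k(\theta_1) = X_{t_k} - x_{\theta_1}(t_k) - \Phi_{\theta_1}(t_k, t_{k-1}) \left[ X_{t_{k-1}} - x_{\theta_1}(t_{k-1}) \right]$ ($\approx \varepsilon \sqrt{\Delta}\, Z_k^\theta$)
Choice of the function $V_k$? (The natural choice $S_k^\theta$ depends on both parameters.)
- If $\varepsilon, \Delta \to 0$: $S_k^\theta \sim \Sigma(\theta_2, X_{t_{k-1}})$ ⇒ $V_k(\theta_2) = \Sigma(\theta_2, X_{t_{k-1}})$
- If $\varepsilon \to 0$, $\Delta$ fixed ($\hat\theta_2$ has no good properties), two possibilities: $V_k = I_d$ (general case) or $V_k(\theta_1) = S_k^{\theta_1, \theta_1}$ (epidemic framework)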
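The low-frequency contrast with $V_k = I_d$ can be sketched as follows for the SIR model. The ODE $x_{\theta_1}$ and the resolvent $\Phi_{\theta_1}(t_k, t_{k-1})$ are integrated jointly with a Runge-Kutta scheme, restarting $\Phi$ at the identity on each sampling interval. Function names, step sizes and parameter values are assumptions of this example, and the constant factor $1/(\varepsilon^2 \Delta)$ is dropped because it does not change the minimizer.

```python
def rk4_step(f, y, h):
    """One classical RK4 step for the autonomous system y' = f(y)."""
    k1 = f(y)
    k2 = f([a + 0.5 * h * b for a, b in zip(y, k1)])
    k3 = f([a + 0.5 * h * b for a, b in zip(y, k2)])
    k4 = f([a + h * b for a, b in zip(y, k3)])
    return [a + h / 6.0 * (p + 2 * q + 2 * r + w)
            for a, p, q, r, w in zip(y, k1, k2, k3, k4)]

def sir_with_resolvent(theta):
    """Right-hand side for y = (s, i, Phi11, Phi12, Phi21, Phi22)."""
    lam, gamma = theta
    def f(y):
        s, i, p11, p12, p21, p22 = y
        # Jacobian of b(s, i) = (-lam*s*i, lam*s*i - gamma*i)
        j11, j12, j21, j22 = -lam * i, -lam * s, lam * i, lam * s - gamma
        return [-lam * s * i, lam * s * i - gamma * i,
                j11 * p11 + j12 * p21, j11 * p12 + j12 * p22,
                j21 * p11 + j22 * p21, j21 * p12 + j22 * p22]
    return f

def ode_and_resolvent(theta, x0, delta, n, substeps=20):
    """Return x(t_k), k = 0..n, and Phi(t_k, t_{k-1}), k = 1..n."""
    f = sir_with_resolvent(theta)
    xs, phis = [list(x0)], []
    h = delta / substeps
    for _ in range(n):
        y = xs[-1] + [1.0, 0.0, 0.0, 1.0]  # Phi(t_{k-1}, t_{k-1}) = identity
        for _ in range(substeps):
            y = rk4_step(f, y, h)
        xs.append(y[:2])
        phis.append(((y[2], y[3]), (y[4], y[5])))
    return xs, phis

def contrast_identity(theta, obs, x0, delta):
    """Sum over k of |B_k(theta)|^2, i.e. the contrast with V_k = I_d up to a constant."""
    n = len(obs) - 1
    xs, phis = ode_and_resolvent(theta, x0, delta, n)
    total = 0.0
    for k in range(1, n + 1):
        d_prev = [obs[k - 1][j] - xs[k - 1][j] for j in range(2)]
        d_cur = [obs[k][j] - xs[k][j] for j in range(2)]
        phi = phis[k - 1]
        b = [d_cur[a] - (phi[a][0] * d_prev[0] + phi[a][1] * d_prev[1])
             for a in range(2)]
        total += b[0] ** 2 + b[1] ** 2
    return total

# Noise-free check (hypothetical values): data equal to the ODE at theta0.
theta0, x0, delta = (0.5, 1.0 / 3.0), (0.99, 0.01), 1.0
obs, _ = ode_and_resolvent(theta0, x0, delta, 30)
u_true = contrast_identity(theta0, obs, x0, delta)
u_wrong = contrast_identity((0.6, 1.0 / 3.0), obs, x0, delta)
print(u_true, u_wrong)
```

The epidemic-framework choice $V_k = S_k^{\theta_1,\theta_1}$ would additionally require integrating the covariance integral on each interval, which is omitted here to keep the sketch short.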
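Asymptotic results (Guy et al. (12)): consistency and CLT
$B_k(\theta_1) = X_{t_k} - x_{\theta_1}(t_k) - \Phi_{\theta_1}(t_k, t_{k-1}) \left[ X_{t_{k-1}} - x_{\theta_1}(t_{k-1}) \right]$
Identifiability assumptions: based on deterministic identifiability for $\theta_1$
1. Low frequency, $\theta_2$ unknown: $\varepsilon \to 0$, $\Delta$ fixed; $V_k = I_d$
$$\varepsilon^{-1} \left( \bar\theta_1 - \theta_1^0 \right) \xrightarrow[\varepsilon \to 0]{} \mathcal{N}\left(0, J_\Delta^{-1}(\theta_1^0, \theta_2^0)\right)$$
2. Low frequency, $\theta_2 = \theta_1$ (epidemics): $\varepsilon \to 0$, $\Delta$ fixed; $V_k = S_k^{\theta_1, \theta_1}$
$$\varepsilon^{-1} \left( \tilde\theta_1 - \theta_1^0 \right) \xrightarrow[\varepsilon \to 0]{} \mathcal{N}\left(0, I_\Delta^{-1}(\theta_1^0)\right)$$
3. High frequency: $\varepsilon, \Delta \to 0$; $V_k = \Sigma(\theta_2, X_{t_{k-1}})$
$$\begin{pmatrix} \varepsilon^{-1} (\check\theta_1 - \theta_1^0) \\ \sqrt{n} (\check\theta_2 - \theta_2^0) \end{pmatrix} \xrightarrow[n \to \infty,\ \varepsilon \to 0]{} \mathcal{N}\left( 0, \begin{pmatrix} I_{opt}^{-1}(\theta_1^0, \theta_2^0) & 0 \\ 0 & I_\sigma^{-1}(\theta_1^0, \theta_2^0) \end{pmatrix} \right)$$
"Optimality" of $I_\Delta$: as $\Delta \to 0$, $I_\Delta(\theta_1^0) \to I_{opt}(\theta_1^0)$
Results valid for time-dependent diffusions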
Assessing the performance of the method on simulations of the Markov jump process
Simulation scheme:
- Choose an epidemic scenario (mechanistic model, population size, epidemic parameters)
- Run 1000 exact simulations of the Markov jump process for the chosen scenario
- For each run:
  - Compute the MLE of the Markov jump process from all the jumps
  - Discretize the simulation at sampling interval $\Delta$ (with realistic values $\Delta \ge 1$)
  - Compute our estimators from the discretized data
Remark: only the results for $\tilde\theta_1$ are presented here (low frequency with $\theta_1 = \theta_2$); the other estimators perform similarly
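The first two steps of each run can be sketched as follows: an exact (Gillespie-type) simulation of the SIR Markov jump process, the MLE computed from all the jumps, and the discretization at sampling interval $\Delta$. The MLE used is the standard one for a fully observed Markov jump process (number of events of each type divided by the corresponding integrated exposure). Function names and the scenario values below are assumptions of this sketch, and a single run is shown instead of 1000.

```python
import random

def gillespie_sir(lam, gamma, n_pop, i0, t_end, seed=0):
    """Exact simulation of the SIR jump process on [0, t_end].

    Returns the path [(t, S, I)] and the complete-data MLE (lam_hat, gamma_hat).
    Assumes i0 > 0 and n_pop > i0 so that both exposures are positive.
    """
    rng = random.Random(seed)
    s, i, t = n_pop - i0, i0, 0.0
    path = [(t, s, i)]
    n_inf = n_rec = 0
    exposure_inf = 0.0  # integral of S * I / N over time
    exposure_rec = 0.0  # integral of I over time
    while i > 0:
        rate_inf = lam * s * i / n_pop
        rate_rec = gamma * i
        total = rate_inf + rate_rec
        wait = rng.expovariate(total)
        if t + wait >= t_end:  # no further jump before the end of observation
            exposure_inf += s * i / n_pop * (t_end - t)
            exposure_rec += i * (t_end - t)
            break
        exposure_inf += s * i / n_pop * wait
        exposure_rec += i * wait
        t += wait
        if rng.random() * total < rate_inf:
            s, i, n_inf = s - 1, i + 1, n_inf + 1   # infection
        else:
            i, n_rec = i - 1, n_rec + 1             # recovery
        path.append((t, s, i))
    return path, (n_inf / exposure_inf, n_rec / exposure_rec)

def discretize(path, delta, t_end):
    """State (S, I) of the piecewise-constant path at times k * delta."""
    out, j = [], 0
    for k in range(int(round(t_end / delta)) + 1):
        while j + 1 < len(path) and path[j + 1][0] <= k * delta:
            j += 1
        out.append((path[j][1], path[j][2]))
    return out

# Hypothetical scenario: R0 = 1.5, d = 3 days, T = 50 days, Delta = 1 day.
path, (lam_hat, gamma_hat) = gillespie_sir(0.5, 1.0 / 3.0, 10000, 100, 50.0, seed=1)
obs = discretize(path, 1.0, 50.0)
print("MLE:", lam_hat, gamma_hat, "number of observations:", len(obs))
```

The discretized counts divided by the population size are the data on which the contrast estimators would then be computed.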
Good results even for small population sizes (several $N_{pop}$ and $\Delta$)
[Figure: theoretical confidence ellipsoids with mean point estimates (MLE and $\tilde\theta_1$)]
Scenario: SIR, $(R_0, d) = (1.5, 3 \text{ days})$, $T = 50$ days, $\Delta = 1, 5, 10$ days
Shape of the ellipsoids (several $(R_0, d)$, $\Delta$)
[Figure: SIR with $R_0 = 1.5, 5$; $d = 3, 7$; $\Delta = 1, T/10$; $N = 1000$]
Good results even for the time-dependent case (several $\Delta$)
[Figure: theoretical confidence ellipsoids with mean point estimates ($\tilde\theta_1$)]
Scenario: SIRS, $(R_0, d, \delta) = (1.5, 3 \text{ (d)}, 2 \text{ (y)})$, $\lambda_1 = 0.15$, $\Delta = 1, 7$ (d), $T = 20$ (y), $\mu = 1/50$ (y), $N = 10^6$
Available observations
Observations of the new infecteds only between $t_k$ and $t_{k+1}$:
$$\frac{1}{\Delta} \int_{t_k}^{t_{k+1}} \lambda S_t I_t\, dt$$
For $d$ small, to a first approximation, new recovered only between $t_k$ and $t_{k+1}$:
$$\frac{1}{\Delta} \int_{t_k}^{t_{k+1}} \gamma I_t\, dt$$
⇒ Discretized and integrated observations of one coordinate of a multidimensional diffusion process.
Discrete observations of one coordinate of a diffusion
Remark: for the SIR model, $\int_{t_k}^{t_{k+1}} \gamma I_t\, dt = R_{t_{k+1}} - R_{t_k}$ ⇒ discretized observations of one coordinate of a two-dimensional diffusion
For SIRS: approximation as $\Delta \to 0$
$$\frac{1}{\Delta} \int_{t_k}^{t_{k+1}} \lambda S_t I_t\, dt \approx \lambda S_{t_k} I_{t_k}, \qquad \frac{1}{\Delta} \int_{t_k}^{t_{k+1}} \gamma I_t\, dt \approx \gamma I_{t_k}$$
Statistical framework for the process $X_t$: $X_t$ solves $dX_t = b(t, \theta_1, X_t)\,dt + \varepsilon \sigma(t, \theta_2, X_t)\,dB_t$, $X_0 = x_0 \in \mathbb{R}^p$
Observations: $X^1_{t_k}$ for $t_k = k\Delta \in [0, T]$, $k \in \{1, .., n\}$ ($T = n\Delta$)
Asymptotics: $\varepsilon, \Delta \to 0$ (or $\varepsilon \to 0$, $\Delta$ fixed)
Few existing results for $\varepsilon \to 0$
(Kutoyants (94)):
$dX_t = f_t(\theta) Y_t\,dt + \varepsilon\, dB_t^1$, $X_0 = 0$;
$dY_t = b_t(\theta) Y_t\,dt + \varepsilon \sigma_t(\theta)\, dB_t^2$, $Y_0 = y_0$;
Continuous observation of $X_t$, functions $f_t$, $b_t$, $\sigma_t$ known
- Consistency and asymptotic normality (at rate $\varepsilon^{-1}$) for the MLE of the parameter $\theta$
- Generalization by linearization of the drift function around its deterministic limit
- Identifiability assumption based on the identifiability of the deterministic limit of $f_t(\theta)\, E[Y_t \mid X_s, s \le t]$
(James & Le Gland (95)): consistency of the MLE in the general case of a two-dimensional process $(X_t, Y_t)$ with only the coordinate $X_t$ observed
- Identifiability of the parameters based on the identifiability of $x(t)$, the deterministic limit of $X_t$
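Main idea: adapt our previous contrast approach
Recall
$$U_{\varepsilon,\Delta}((\theta_1, \theta_2), (X_{t_k})_k) = \sum_{k=1}^{n} \left[ \log\det V_k(\theta_2) + \frac{1}{\varepsilon^2 \Delta} B_k(\theta_1)^T V_k^{-1}(\theta_2) B_k(\theta_1) \right]$$
$B_k(\theta_1) = X_{t_k} - x_{\theta_1}(t_k) - \Phi_{\theta_1}(t_k, t_{k-1}) \left[ X_{t_{k-1}} - x_{\theta_1}(t_{k-1}) \right]$, $V_k \in \{ I_d,\ S_k^{\theta_1,\theta_1},\ \Sigma(\theta_2, X_{t_{k-1}}) \}$
⇒ Elaborate
$$U^1_{\varepsilon,\Delta}(\theta_1, \theta_2) = \sum_{k=1}^{n} \left[ \log V_k^1 + \frac{1}{\varepsilon^2 \Delta} \frac{(B_k^1)^2}{V_k^1} \right]$$
Problem 1: since $\Phi \in M_d(\mathbb{R})$, each coordinate of $B_k$ depends on all the coordinates of $X_{t_{k-1}}$ (not observed)
Remark: $\Phi(t_k, t_{k-1}) \approx I_d$ when $\Delta \to 0$
⇒ Solution 1: consider $B_k^1 = \left( X^1_{t_k} - x^1_{\theta_1}(t_k) \right) - \left( X^1_{t_{k-1}} - x^1_{\theta_1}(t_{k-1}) \right)$
Problem 2: the same holds for $V_k = S_k^{\theta_1,\theta_1}$ and $V_k = \Sigma(\theta_2, X_{t_{k-1}})$
⇒ Solution 2: choose $V_k^1 = 1$ and focus on estimating $\theta_1$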
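Assumptions and asymptotic results for this contrast approach
$$U^1_{\varepsilon,\Delta}(\theta_1) = \frac{1}{\varepsilon^2 \Delta} \sum_{k=1}^{n} \left[ \left( X^1_{t_k} - x^1_{\theta_1}(t_k) \right) - \left( X^1_{t_{k-1}} - x^1_{\theta_1}(t_{k-1}) \right) \right]^2, \qquad \hat\theta_1 = \operatorname{argmin}_{\theta_1} U^1_{\varepsilon,\Delta}$$
$B_k^1$ is a computable function only if $x_0 \in \mathbb{R}^p$ is known
⇒ Assume $x_0 \in \mathbb{R}^p$ known, or include $x_0^2, .., x_0^p$ in the parameter $\theta_1$
Identifiability condition on $\theta_1$, based on deterministic identifiability: $\theta_1 \ne \theta_1^0 \Rightarrow x^1_{\theta_1}(\cdot) \ne x^1_{\theta_1^0}(\cdot)$
Asymptotic properties:
1. $\varepsilon \to 0$, $\Delta$ fixed:
$$\varepsilon^{-1} \left( \hat\theta_1 - \theta_1^0 \right) \xrightarrow[\varepsilon \to 0]{} \mathcal{N}\left(0, J^{part}_\Delta(\theta_1^0, \theta_2^0)^{-1}\right)$$
2. $\varepsilon, \Delta \to 0$:
$$\varepsilon^{-1} \left( \hat\theta_1 - \theta_1^0 \right) \xrightarrow[\varepsilon, \Delta \to 0]{} \mathcal{N}\left(0, J^{part}_0(\theta_1^0, \theta_2^0)^{-1}\right)$$
with $J^{part}_0(\theta_1, \theta_2) = \lim_{\Delta \to 0} J^{part}_\Delta(\theta_1, \theta_2)$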
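The partial-observation contrast only needs the observed coordinate and the corresponding coordinate of the ODE solution. The sketch below uses the SIR model, takes the first coordinate $s$ as the observed one, keeps $\gamma$ fixed and minimizes over a small grid of values of $\lambda$. All of these choices (observed coordinate, fixed $\gamma$, grid, noise-free data, function names) are assumptions made to keep the illustration short, and the constant factor $1/(\varepsilon^2 \Delta)$ is dropped since it does not affect the minimizer.

```python
def sir_ode_path(lam, gamma, x0, delta, n, substeps=20):
    """RK4 solution of the SIR ODE at times k * delta, k = 0..n."""
    def f(s, i):
        return (-lam * s * i, lam * s * i - gamma * i)
    s, i = x0
    h = delta / substeps
    out = [(s, i)]
    for _ in range(n):
        for _ in range(substeps):
            k1 = f(s, i)
            k2 = f(s + 0.5 * h * k1[0], i + 0.5 * h * k1[1])
            k3 = f(s + 0.5 * h * k2[0], i + 0.5 * h * k2[1])
            k4 = f(s + h * k3[0], i + h * k3[1])
            s += h / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
            i += h / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        out.append((s, i))
    return out

def partial_contrast(theta, obs1, x0, delta):
    """Sum over k of (B_k^1)^2 with V_k^1 = 1, using only the first coordinate."""
    lam, gamma = theta
    n = len(obs1) - 1
    x = sir_ode_path(lam, gamma, x0, delta, n)
    total = 0.0
    for k in range(1, n + 1):
        b1 = (obs1[k] - x[k][0]) - (obs1[k - 1] - x[k - 1][0])
        total += b1 * b1
    return total

# Noise-free illustration (hypothetical values): x0 known, gamma fixed.
x0, delta, gamma0 = (0.99, 0.01), 1.0, 1.0 / 3.0
obs1 = [x[0] for x in sir_ode_path(0.5, gamma0, x0, delta, 30)]
grid = [0.3, 0.4, 0.5, 0.6, 0.7]
best_lam = min(grid, key=lambda lam: partial_contrast((lam, gamma0), obs1, x0, delta))
print("grid minimizer:", best_lam)
```

Because $x_0$ enters through the ODE solution, the sketch makes visible why the initial condition must be known or added to the parameter.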
Application to real data: Sentinelles surveillance network ILI
$\lambda(t) = \lambda_0 (1 + \lambda_1 \cos(2\pi t / T_{per}))$
Observations: $X^1_{t_k} = \rho \gamma I_{t_k}$, where $\rho$ is the reporting rate
Identifiability: $\theta_1 = (\rho, \lambda_0, \lambda_1, \gamma, \delta)$ for $\eta$, $\mu$, $T_{per}$, $s_0$, $i_0$ fixed
$s$, $i$ fixed at $t = -10$ (y) ($s = 0.72$, $i = 10^{-4}$)
Non-autonomous case: time-forced model (SIRS)
Resonance problem (see Keeling & Rohani (08))
Different behaviors depending on the parameters ($\lambda_1$ is a bifurcation parameter)
[Figure: proportion of infected over time; $N = 10^7$, $R_0 = 1.5$, $1/\gamma = 3$ days, $T_{per} = 365$ days, $1/\delta = 2$ years, $\eta = 10^{-6}$ and $\lambda_1 \in \{0.05, 0.15\}$, with $\lambda(t) = \lambda_0 (1 + \lambda_1 \cos(2\pi t / T))$]
Deterministic trajectories associated with the estimated values

$\theta_1$       | $R_0$ | $d$ (days) | $1/\delta$ (years) | $\lambda_1$ | $\rho$ (%)
$\hat\theta_1$ | 1.43  | 2.73       | 1.8404             | 0.1304      | 27.84
References
(Guy et al (12)) Parametric inference for discretely observed multidimensional diffusions with small diffusion coefficient. To appear in Stoch. Proc. Appl.
(Guy et al (13)) Approximation of epidemic models by diffusion processes and their statistical inference. Submitted.
(Ethier & Kurtz (05)) Markov Processes: Characterization and Convergence
(Kutoyants (94)) Identification of Dynamical Systems with Small Noise
(James & Le Gland (95)) Consistent parameter estimation for partially observed diffusions with small noise
(Sorensen & Uchida (03)) Small-diffusion asymptotics for discretely sampled stochastic differential equations
(Gloter & Sorensen (09)) Estimation for stochastic differential equations with a small diffusion coefficient
(Genon-Catalot (90)) Maximum contrast estimation for diffusion processes from discrete observations
(Breto et al (09)) Time series analysis via mechanistic models
(Toni et al (09)) Approximate Bayesian computation scheme for parameter inference and model selection in dynamical systems