Optimal stopping for change-point detection of Piecewise Deterministic Markov Processes
Alice Cleynen, Benoîte de Saporta
Outline
Motivation
Change-point detection problem
Numerical approximation
Numerical results
Motivation
Piecewise deterministic Markov processes
Davis (1980s)
General class of non-diffusion dynamic stochastic hybrid models:
deterministic motion punctuated by random jumps.
Applications of PDMPs
Engineering systems, operations research, management science,
economics, internet traffic, dependability and safety, neurosciences,
biology, ...
- mode: nominal, failures, breakdown, environment, number of
individuals, response to a treatment, ...
- Euclidean variable: pressure, temperature, time, size, ...
Motivation
Impulse control problem
Impulse control
Select
- intervention dates
- new starting point for the process at interventions
to minimize a cost function:
- repair a component before breakdown
- change treatment before relapse
- ...
Motivation
If the jump times are not observed?
- [BdSD 12] Optimal stopping
  - jump times observed
  - post-jump locations observed through noise
  - numerical approximation of the value function and ε-optimal stopping time
- [BL 17] Continuous control
  - jump times and post-jump locations observed through noise
  - optimality equation, existence of optimal policies
Motivation
Change-point detection
Simplest special case
- only one jump of the mode variable
- discrete noisy observations of the continuous variable on a
regular time grid
Optimal stopping = change-point detection
Aim: numerical approximation to detect the change-point at best (not too early/late)
Change-point detection problem
Simple PDMP model
- State space E × R = {0, 1, ..., d} × R × R: mode, position, time
- Starting point X0 = (0, x, 0), flow Φ0
- Time-dependent jump intensity λ0(x, u) = λ(u)
- Jump kernel: position and time continuous, switch to mode i with probability p_i
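For concreteness, the single-jump dynamics above can be simulated by thinning; the flows and intensity passed in are placeholders, and the intensity bound `lam_max` is an assumption of this sketch, not part of the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_pdmp(phi, lam, p, x0=1.0, horizon=6.0, lam_max=2.0):
    """Simulate one trajectory of the single-jump PDMP sketched above.

    phi[m](x, t) is the flow in mode m, lam(u) the time-dependent jump
    intensity, p[i] the probability of switching to mode i+1.
    The jump time is drawn by thinning (rejection) with bound lam_max.
    All names here are illustrative, not from the talk.
    """
    # draw the jump time T by thinning a Poisson process of rate lam_max
    t = 0.0
    while True:
        t += rng.exponential(1.0 / lam_max)
        if t > horizon or rng.random() < lam(t) / lam_max:
            break
    T = t
    mode = 1 + rng.choice(len(p), p=p) if T <= horizon else 0
    def trajectory(u):
        # position at time u: mode 0 before T, then flow of the new mode
        if u < T or mode == 0:
            return phi[0](x0, u)
        return phi[mode](phi[0](x0, T), u - T)
    return T, mode, trajectory
```

With the flows of Example 1 below (Φ0(x, t) = x, Φm(x, t) = x·exp(a_m t)), this produces a trajectory that is constant before the change point and exponential after it.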
Change-point detection problem
Observations
- Observation times t_n = δn
- Noisy observations of the positions Y_n = F(x_{t_n}) + ε_n
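A minimal sketch of this observation scheme on the regular grid t_n = δn; the Gaussian noise law, the observation function F and the horizon are placeholders (the slides only specify the additive-noise form):

```python
import numpy as np

rng = np.random.default_rng(1)

def observe(trajectory, delta=1/6, N=36, F=lambda x: x, sigma=1.0):
    """Noisy observations Y_n = F(x_{t_n}) + eps_n on the grid t_n = delta*n.

    `trajectory` maps a time to the process position; eps_n is taken
    i.i.d. Gaussian here (an assumption of this sketch)."""
    times = delta * np.arange(N + 1)
    x = np.array([trajectory(t) for t in times])
    return times, F(x) + sigma * rng.normal(size=times.shape)
```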
Change-point detection problem
Partially observed optimal stopping problem
- Finite horizon δN
- Admissible stopping times τ: F^Y-measurable
- Admissible decisions A: {0, 1, ..., d}-valued, F^Y_τ-measurable
- Cost per stage before stopping
  - c(0, x, y) = 0 (rightfully not stopped)
  - c(m ≠ 0, x, y) = βδ (lateness penalty)
- Terminal cost at stopping
  - C(m, x, y, 0) = c(m, x, y) (no stopping before the horizon)
  - C(0, x, y, a ≠ 0) = α (early stopping penalty)
  - C(m ≠ 0, x, y, a = m) = 0 (good mode selection)
  - C(m ≠ 0, x, y, a ∉ {0, m}) = γ (wrong mode penalty)
Cost of admissible strategy (τ, A)
Change-point detection problem
Fully observed optimal stopping problem
- Filter process Θ_n(A × B) = P_(0,x,y)(X_{δn} ∈ A × B | F^Y_n)
- (Θ_n, Y_n) time-inhomogeneous Markov chain with explicit
transition kernels R'_n on P(E) × R
- Cost functions c'(θ, y) = ∫ c(m, x, y) dθ(m, x),
C'(θ, y, a) = ∫ C(m, x, y, a) dθ(m, x)
Fully observed optimal stopping problem
Minimize over all admissible strategies (τ, A)
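Since the filter is measure-valued, any computation must work with a finite approximation. Below is one Bayes update for a discrete (grid) approximation of Θ_n, assuming Gaussian observation noise; this is a generic grid filter, not the talk's exact kernel R'_n:

```python
import numpy as np

def gaussian_lik(y, mean, sigma):
    """Unnormalized Gaussian likelihood of observing y around `mean`."""
    return np.exp(-0.5 * ((y - mean) / sigma) ** 2)

def filter_update(weights, positions, y, sigma=1.0):
    """One Bayes update of a discrete filter approximation.

    weights[k]   current probability of hypothesis k (e.g. a pair
                 (mode, jump time) on a finite grid)
    positions[k] position that hypothesis k predicts at the current
                 observation time
    Returns the reweighted, renormalized posterior.
    """
    w = weights * gaussian_lik(y, positions, sigma)
    return w / w.sum()
```

Repeating this update at each observation time yields a finite-dimensional stand-in for the sequence (Θ_n).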
Change-point detection problem
Aim of the talk
- numerical approximation of the value function
- computable strategy
Difficulties
- measure-valued filter process
Numerical approximation
Dynamic programming
Value function
V'(θ, y) = inf_{(τ,A)} J'(τ, A, (θ, y)) = inf_{(τ,A)} E_{(θ,y)}[ Σ_{n=0}^{(τ−1)∧N} c'(Θ_n, Y_n) + C'(Θ_{τ∧N}, Y_{τ∧N}, A) ]
Dynamic programming
v'_N(θ, y) = min_{0≤a≤d} C'(θ, y, a)
v'_k(θ, y) = min{ min_{1≤a≤d} C'(θ, y, a) ; c'(θ, y) + R'_k v'_{k+1}(θ, y) }
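The backward recursion can be sketched on a finite (e.g. quantized) state grid; the shapes and names below are illustrative of the recursion above, not the authors' code:

```python
import numpy as np

def backward_dp(C, c, R):
    """Backward dynamic programming on a finite state grid.

    C[k] : (n_states, d+1) stopping costs C'(theta, y, a) at step k
    c[k] : (n_states,) running costs c'(theta, y) at step k
    R[k] : (n_states, n_states) transition matrix from step k to k+1
    Returns the list of value functions v[0..N].
    """
    N = len(R)                                    # horizon
    v = [None] * (N + 1)
    v[N] = C[N].min(axis=1)                       # v_N = min over all a
    for k in range(N - 1, -1, -1):
        stop = C[k][:, 1:].min(axis=1)            # stop now, best mode a >= 1
        go = c[k] + R[k] @ v[k + 1]               # pay one stage, continue
        v[k] = np.minimum(stop, go)
    return v
```

The strategy read off this recursion stops at the first step where `stop` is no larger than `go`.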
Numerical approximation
Quantization
[P 98], [PPP 04], [PRS05], . . .
Quantization of a random variable X ∈ L²(R^q)
Approximate X by X̂ taking finitely many values such that
||X − X̂||₂ is minimal:
- find a finite weighted grid Γ with |Γ| = N_Γ
- set X̂ = p_Γ(X), the closest-neighbor projection
Asymptotic properties (Zador)
If E[|X|^{2+η}] < +∞ for some η > 0, then
lim_{N_Γ→∞} N_Γ^{1/q} min_{|Γ|≤N_Γ} ||X − p_Γ(X)||₂ exists and is finite.
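A grid Γ and its weights can be built from simulated samples. The sketch below uses Lloyd's fixed-point iteration, a standard alternative to the CLVQ algorithms of the references, in dimension 1 for simplicity:

```python
import numpy as np

def lloyd_grid(samples, n_points, n_iter=20, seed=0):
    """Build a quantization grid for the law of X from i.i.d. samples.

    Alternates the closest-neighbor projection p_Gamma (cell assignment)
    with a centroid step, then estimates the law of the quantized
    variable by the empirical cell frequencies."""
    rng = np.random.default_rng(seed)
    grid = np.sort(rng.choice(samples, n_points, replace=False))
    for _ in range(n_iter):
        # closest-neighbor projection: assign each sample to a cell
        cells = np.abs(samples[:, None] - grid[None, :]).argmin(axis=1)
        # centroid step: move every grid point to the mean of its cell
        for i in range(n_points):
            if np.any(cells == i):
                grid[i] = samples[cells == i].mean()
    cells = np.abs(samples[:, None] - grid[None, :]).argmin(axis=1)
    weights = np.bincount(cells, minlength=n_points) / len(samples)
    return grid, weights
```

This matches the asset listed below: a simulator of the target law is all the construction needs.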
Numerical approximation
Algorithms
There exist algorithms providing
- the grid Γ
- the law of X̂
- transition probabilities for quantization of Markov chains
Numerical approximation
Grids construction
Model → simulator of trajectories → grids
Numerical approximation
Assets and drawbacks of quantization
Assets
- a simulator of the target law is enough to build the grids
- automatic construction of grids
- convergence rate for E[|f(X) − f(X̂)|] if f is Lipschitz
- empirical error measure by Monte Carlo
Drawbacks
- computation time
- curse of dimensionality
Numerical approximation
Candidate computable strategy
Dynamic programming
- v̂'_N(θ̂, ŷ) = min_{0≤a≤d} C'(θ̂, ŷ, a)
Numerical approximation
Candidate computable strategy
Initialization (observation y0): n ← 0; y ← y0; θ̄ ← δ_(0,x0); r ← r_0(θ̄, y)
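The candidate strategy (filter update, projection on the grid, stop-or-continue test) can be written generically; every callable below is a placeholder for the corresponding object of the talk (filter kernel, projection p_Γ, costs C', value v̂), whose exact forms are model-dependent:

```python
import numpy as np

def run_strategy(observations, theta0, update, project, stop_cost, cont_value):
    """Online candidate strategy: stop as soon as the best stopping cost
    beats the (approximate) continuation value.

    update(theta, y)    one filter step given observation y
    project(theta)      closest quantization-grid point
    stop_cost(n, q, y)  array of costs of declaring mode a = 1..d
    cont_value(n, q, y) approximate value of continuing one more step
    """
    theta = theta0
    for n, y in enumerate(observations):
        theta = update(theta, y)                # filter step
        q = project(theta)                      # projection on the grid
        costs = stop_cost(n, q, y)
        a = int(np.argmin(costs)) + 1           # best stopping decision
        if costs[a - 1] <= cont_value(n, q, y):
            return n, a                         # stop at step n, declare mode a
    return len(observations), 0                 # horizon reached, never stopped
```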
Numerical results
Example 1
- d = 3, p_i = 1/3, x0 = 1
- Φ0(x, t) = x, Φ1(x, t) = x e^{0.1t}, Φ2(x, t) = x e^{0.5t},
Φ3(x, t) = x e^{t}
- β = 1 (late detection), γ = 1.5 (wrong mode), δ = 1/6
[Figure: simulated observations (true mode = 2) and detection by moving average, Kalman filter, and PDMP strategy; all three select mode 2.]
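For reference, the moving-average baseline of the comparison can be sketched as follows; the window length is an assumption of this sketch, only the threshold (= 2) appears in the slides:

```python
import numpy as np

def moving_average_detector(y, window=6, threshold=2.0):
    """Flag a change when the moving average of the observations drifts
    more than `threshold` away from its initial level; returns the index
    of the last observation in the flagging window, or None."""
    y = np.asarray(y, dtype=float)
    base = y[:window].mean()
    for n in range(window, len(y) + 1):
        if abs(y[n - window:n].mean() - base) > threshold:
            return n - 1
    return None
```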
[Table: detection performance of moving average (threshold = 2), Kalman filter, and PDMP strategies for varying α and noise variance σ².]
Numerical results
Example 2
- d = 1, x0 = (0, 0)
- Φ0((x, u), t) = (sin(3π(u + t)), u + t), Φ1((x, u), t) = (sin(5π(u + t)), u + t)
- δ = 1/6, noise variance 1
[Figure: sample trajectories and detections for (α, β) = (3, 1.5), (4, 2), (5, 1), (5, 0.5).]
Numerical results
Example 3
- d = 2, x0 = (0, 0)
- Φ0((x, u), t) = (sin(3π(u + t)), u + t), Φ1((x, u), t) = (sin(3π(u + t)) + 0.5t, u + t), Φ2((x, u), t) = (sin(3π(u + t)) + 1.5t, u + t)
- δ = 1/6, noise variance 1
[Figure: sample trajectories (true mode = 2) and detections for (α, β) = (5, 2), (4, 1), (6, 1.5), (3, 1.5).]
Conclusion and perspectives
- Change-point detection method for continuous-time jump
dynamics, able to detect a jump and select the post-jump
mode
- For general flows, but dimension 1 (+ time)
To be done
- Real data applications
- Theoretical validity of the stopping rule
- Allow stopping between observations
- Several jumps
Conclusion and perspectives
References
[BL 17] N. Bäuerle, D. Lange, Optimal control of partially observed PDMPs
[BdSD 12] A. Brandejsky, B. de Saporta, F. Dufour, Optimal stopping for partially observed PDMPs
[CD 89] O. Costa, M. Davis, Impulse control of piecewise-deterministic processes
[Davis 93] M. Davis, Markov models and optimization
[dSDZ 14] B. de Saporta, F. Dufour, H. Zhang, Numerical methods for simulation and optimization of PDMPs: application to reliability
[P 98] G. Pagès, A space quantization method for numerical integration
[PPP 04] G. Pagès, H. Pham, J. Printems, An optimal Markovian quantization algorithm for multi-dimensional stochastic control problems
[PRS 05] H. Pham, W. Runggaldier, A. Sellami, Approximation by quantization of the filter process and applications to optimal stopping problems under partial observation