• Aucun résultat trouvé

LEARNING DYNAMIC SYSTEMS :

N/A
N/A
Protected

Academic year: 2022

Partager "LEARNING DYNAMIC SYSTEMS :"

Copied!
30
0
0

Texte intégral

(1)

LEARNING DYNAMIC SYSTEMS : MARKOV MODELS

Markov Process and Markov Chains Hidden Markov Models

Bayesian Filtering and Kalman Filter

(2)

Bayesian, Machine Learning, Frederic Pennerath

Example of application:

Navigation Systems and Data Fusion

GPS:

☺ No drift

 Medium accuracy

 Not responsive Altimeter:

☺ No drift

 Altitude only

 Not accurate

Inertial System:

☺ Responsive

 Long term drift Navigation System:

Position (L, l, z)

Angles (yaw, pitch, roll) Speed

Angular speed

Tracking example using camshift & Kalman filtering

(3)

The whole picture:

Measure Y m

Estimated state 𝑋 ෠ Control U

Sensor

Controller

State estimator U Real state 𝑋

Input U Real system Output Y

Environment Calculator DAC

State 𝑋 is a hidden variable

(4)

Bayesian, Machine Learning, Frederic Pennerath

Controllable Markov models

Update Markov models with a command 𝑼 𝒕 :

𝑃 𝑋 𝑡 𝑋 𝑡−1 , 𝑈 1 , … 𝑈 𝑡−1 , 𝑌 1 , … 𝑌 𝑡−1 = 𝑃 𝑋 𝑡 𝑋 𝑡−1 , 𝑈 𝑡−1 and

𝑃 𝑌 𝑡 𝑋 𝑡 , 𝑈 1 , … 𝑈 𝑡−1 , 𝑌 1 , … 𝑌 𝑡−1 = 𝑃 𝑌 𝑡 𝑋 𝑡

𝑋

0

𝑈

0

𝑋

1

𝑌

1

𝑈

1

𝑋

2

𝑌

2

𝑈

𝑡−1

𝑋

𝑡

𝑌

𝑡

𝑃 𝑋 2 𝑋 1 , 𝑈 0 , 𝑈 1 , 𝑌 1 = 𝑃 𝑋 2 𝑋 1 , 𝑈 1

𝑃 𝑌 2 𝑋 2 , 𝑈 0 , 𝑈 1 , 𝑌 1 = 𝑃 𝑌 2 𝑋 2

(5)

Bayesian Filtering:

Prediction and update steps

… 𝑋

0|0

𝑈

0

𝑋

1|0

𝑋

1|1

𝑈

1

𝑋

2|1

𝑋

2|2

𝑈

𝑡−1

𝑋

𝑡|𝑡−1

𝑋

𝑡|𝑡

Prediction step:

• Predicts future state given commands and observations:

X t|t−1 ≝ X 𝑡 |𝑈 0 … 𝑈 𝑡−1 𝑌 0 … 𝑌 𝑡−1

• Model integration

• Increase of uncertainty (usually)

Update step:

• Adjust current state given new observation:

X t|t ≝ X t |𝑈 0 … 𝑈 𝑡−1 𝑌 0 … 𝑌 𝑡−1 𝑌 𝑡

• Model correction

• Decrease of uncertainty

(6)

Bayesian, Machine Learning, Frederic Pennerath

Bayesian Filtering:

State Space Representation

g t (X t , U 𝑡 ) State 𝑋 𝑡

Input U 𝑡 Output Y t

f 𝑡 (X t , U t )

State 𝑋 𝑡+𝜏 Delay 𝜏

Continuous system (𝑓 𝑡 , 𝑔 𝑡 ):

൞ 𝑑𝑋 𝑡

𝑑𝑡 = 𝑓 𝑡 𝑋 𝑡 , 𝑈 𝑡 𝑌 𝑡 = 𝑔 𝑡 𝑋 𝑡 , 𝑈 𝑡

Discrete system (𝑓 𝑡 , 𝑔 𝑡 ):

ቊ 𝑋 𝑡+𝜏 = 𝑓 𝑡 𝑋 𝑡 , 𝑈 𝑡 𝑌 𝑡 = 𝑔 𝑡 𝑋 𝑡 , 𝑈 𝑡

Stationary system:

𝑓 𝑡 = 𝑓, 𝑔 𝑡 = 𝑔

Non deterministic system:

൝ 𝑋 𝑡+𝜏 = 𝑓 𝑡 𝑋 𝑡 , 𝑈 𝑡 + ℰ 𝑡 𝑋 𝑌 𝑡 = 𝑔 𝑡 𝑋 𝑡 , 𝑈 𝑡 + ℰ 𝑡 𝑌

Linear system (A t , B t , C t , D t ):

ቊ 𝑋 𝑡+𝜏 = 𝐴 𝑡 𝑋 𝑡 + 𝐵 𝑡 𝑈 𝑡 𝑌 𝑡 = 𝐶 𝑡 𝑋 𝑡 + 𝐷 𝑡 𝑈 𝑡

State space representation ensures

Markov property.

(7)

A simple example:

localization of a wheeled vehicle

𝑋 𝑡 = 𝑥 𝑡 𝑦 𝑡 𝜃 𝑡 𝑣 𝑡

, 𝑈 𝑡 = 𝒞 𝑡

𝛼 𝑡 , 𝑌 𝑡 𝑜𝑑𝑜 = 𝑣 𝑡 𝑜𝑑𝑜 , 𝑌 𝑡 𝑔𝑝𝑠 = 𝑥 𝑡 𝑔𝑝𝑠 𝑦 𝑡 𝑔𝑝𝑠 𝜃 𝑡

𝒞 𝑡 = 1

𝑅 𝑡

(8)

Bayesian, Machine Learning, Frederic Pennerath

A simple non-linear example:

localization of a wheeled vehicle

𝑋 𝑡 = 𝑥 𝑡 𝑦 𝑡 𝜃 𝑡 𝑣 𝑡

, 𝑈 𝑡 = 𝒞 𝑡 𝛼 𝑡

𝜃

𝑡

𝒞

𝑡

= 1

𝑅

𝑡

Continuous model: Discretized model:

𝑑𝑋 𝑡

𝑑𝑡 = 𝑑 𝑑𝑡

𝑥 𝑡 𝑦 𝑡 𝜃 𝑡 𝑣 𝑡

=

𝑣 𝑡 cos 𝜃 𝑡 𝑣 𝑡 s𝑖𝑛 𝜃 𝑡

𝑣 𝑡 𝒞 𝑡 1

𝑀 𝛼 𝑡 − 𝑓 𝑀 𝑣 𝑡

⇒ 𝑋 𝑡+𝜏 = 𝑋 𝑡 +

𝑣 𝑡 sin 𝜃 𝑡 𝑣 𝑡 cos 𝜃 𝑡

𝑣 𝑡 𝒞 𝑡 1

𝑀 𝛼 𝑡 − 𝑓 𝑀 𝑣 𝑡

⋅ 𝜏

(9)

A simple example:

the problem of open loop control

𝑋 𝑡 = 𝑥 𝑡 𝑦 𝑡 𝜃 𝑡 𝑣 𝑡

, 𝑈 𝑡 = 𝒞 𝑡 𝛼 𝑡

𝜃

𝑡

𝒞

𝑡

= 1

𝑅

𝑡

𝑑𝑋 𝑡

𝑑𝑡 = 𝑑 𝑑𝑡

𝑥 𝑡 𝑦 𝑡 𝜃 𝑡 𝑣 𝑡

=

𝑣 𝑡 × cos 𝜃 𝑡 + 𝜀 𝑡 𝑥 𝑣 𝑡 × sin 𝜃 𝑡 + 𝜀 𝑡 𝑦

𝑣 𝑡 × 𝒞 𝑡 + 𝜀 𝑡 𝜃 1

𝑀 𝛼 𝑡 − 𝑓

𝑀 𝑣 𝑡 + 𝜀 𝑡 𝑣

Wind, slope, collision, load, etc Slippy road, skid

Non linearities Uncertainties

(10)

Bayesian, Machine Learning, Frederic Pennerath

0 10 20 30 40 50 60 70 80 90 100

0 50 100

t (s)

speed (km/h)

Pure integration

real estimated

0 10 20 30 40 50 60 70 80 90 100

0 1 2 3

t (s)

distance (km)

A simpler linear example:

Problem: measuring speed 𝑣(𝑡) and travelled distance l 𝑡 of a vehicle.

Silly solution: pure model integration

𝑑𝑙

𝑑𝑡 𝑡 = 𝑣 𝑡 + 𝜀 𝑡 𝑙 𝑚 𝑑𝑣

𝑑𝑡 𝑡 = −𝑓 0 sign 𝑣 𝑡 − 𝑘 𝑓 𝑣 𝑡 + 𝑎 𝑡 + 𝜀 𝑡 𝑣

Better solution: using sensor

Using odometer counting wheel turn pulses

𝑦 𝑡 = 𝑙 𝑡 + 𝜀 𝑡 𝑦 How to improve it?

• What happens if car slips?

• What happens if odometer breaks?

• How to integrate GPS?

0 10 20 30 40 50 60 70 80 90 100

0 50 100

t (s)

speed (km/h)

Using odometer

real estimated

0 10 20 30 40 50 60 70 80 90 100

0 0.5 1 1.5

t (s)

distance (km)

Simulations with varying

slope and wind

(11)

Linear State Space Representation

State integration:

𝑋 𝑡 = 𝑣 𝑡

𝑙 𝑡 , 𝑈 𝑡 = 1

𝑎 𝑡 , ℇ 𝑡 𝑋 = 𝜀 𝑡 𝑣 𝜀 𝑡 𝑙

𝑋 𝑡+1 = 𝑋 𝑡 + 1

𝑚 −𝑓 0 − 𝑘 𝑓 𝑣 𝑡 + 𝑎 𝑡 + 𝜀 𝑡 𝑣 𝑣 𝑡 + 𝜀 𝑡 𝑙

𝜏 ⇔ 𝑋 𝑡+1 = 𝐴 𝑋 𝑡 + 𝐵 𝑈 𝑡 + ℇ 𝑡 𝑋

with 𝐴 = 1 −

𝑘

𝑓

𝑚 𝜏 0

𝜏 1

, 𝐵 = 𝜏

𝑚

−𝑓 0 1 0 0 , Output equation:

𝑌 𝑡 = 𝐶𝑋 𝑡 + 𝐷𝑈 𝑡 + ℇ 𝑡 𝑌

with C = 0 1 , D = 0 0 , ℇ 𝑡 𝑌 = 𝜀 𝑡 𝑜𝑑𝑜

(12)

Bayesian, Machine Learning, Frederic Pennerath

Bayesian Filtering:

Kalman filter

Hypothesis of Kalman filters:

• The state representation is linear.

• Every “input” of the model (X 0 , ℇ 𝑡 𝑋 , ℇ 𝑡 𝑌 , 𝑈 𝑡 ) is normally distributed Consequences:

• Every state X t |𝑋 0 , 𝒰 𝑡 , 𝒴 𝑡 , and output Y t |𝑋 0 , 𝒪 𝑡 is normally distributed

• State estimation problem is defined by:

– Model parameters: 𝐴

𝑡

, 𝐵

𝑡

, 𝐶

𝑡

, 𝐷

𝑡

– Initial state distribution: 𝑋 ෠

0

= 𝐸 𝑋

0

𝑃

0

= 𝑐𝑜𝑣 𝑋

0

– State noise distribution: 𝑅

𝑡

= 𝑐𝑜𝑣 ℇ

𝑡𝑋

𝐸 ℇ

𝑡𝑋

= 0 – Output noise distribution: 𝑄

𝑡

= 𝑐𝑜𝑣 ℇ

𝑡𝑌

𝐸 ℇ

𝑡𝑌

= 0

– Output samples: 𝑦

𝑡

– Input samples: 𝑢

𝑡

(13)

-1 0 1 2 3 4 5 6

Multivariate normal distribution:

definition

Generalization to ℝ 𝐦 :

𝑋 𝑚 ~𝒩 𝜇 𝑚 , Σ 𝑚𝑚 ⟺ 𝑓 𝑋 𝑚 𝑋 = 1

2𝜋 𝑚 Σ 𝑚𝑚 𝑒 1 2 𝑋−𝜇 𝑚 𝑇 Σ 𝑚𝑚 −1 𝑋−𝜇 𝑚 Basic properties:

• 𝐸 𝑋 𝑚 = 𝜇 𝑚

• 𝑐𝑜𝑣 𝑋 𝑚 = 𝐸 𝑋 − 𝜇 𝑚 𝑇 𝑋 − 𝜇 𝑚 = Σ 𝑚𝑚 Example:

– 𝜇 2 = 2

1 , Σ 22 = 2 −1

−1 1

-1 0 1 2 3 4 5

-4 -2 0 2 4 6

0 0.05 0.1 0.15 0.2

x y

(14)

Bayesian, Machine Learning, Frederic Pennerath

Multivariate normal distribution:

fundamental properties

Closed under linear transformation:

𝑋~𝒩 μ, Σ , μ ∈ ℝ m , Σ ∈ 𝑀 𝑚𝑚 , A ∈ M nm , B ∈ M n1

𝐴𝑋 + 𝐵~𝒩 𝐴𝜇 + 𝐵, 𝐴Σ𝐴 𝑇 Particular cases:

Given 𝑋 1

𝑋 2 ~𝒩 𝜇 1

𝜇 2 , Σ 11 Σ 12 Σ 12 𝑇 Σ 22

• Addition: 𝑋 1 + 𝑋 2 ~𝒩 𝜇 1 + 𝜇 2 , Σ 11 + Σ 22 + Σ 12 + Σ 12 𝑇

• Marginalization: 𝑋 1 ~𝒩 𝜇 1 , Σ 11

(15)

Multivariate normal distribution:

fundamental properties

Closed under conditioning:

𝑋 1

𝑋 2 ~𝒩 𝜇 1

𝜇 2 , Σ 11 Σ 12

Σ 12 𝑇 Σ 22

𝑃(𝑋 1 |𝑋 2 = Ԧ 𝑥)~𝒩 𝜇 1 + Σ 12 Σ 22 −1 𝑥 − 𝜇 Ԧ 2 , Σ 11 − Σ 12 Σ 22 −1 Σ 12 𝑇 Closed under (pdf) multiplication:

𝒩 𝜇 𝐴 , Σ 𝐴 × 𝒩 𝜇 𝐵 , Σ 𝐵 ≡ 𝒩 Σ A −1 + Σ B −1 −1 Σ A −1 𝜇 𝐴 + Σ B −1 𝜇 𝐵 , Σ A −1 + Σ B −1 −1

Particular case: conjugate prior 𝜇~𝒩 𝜇 0 , Σ 0 of 𝑋~𝒩 𝜇, Σ :

𝜇|𝑋 = Ԧ 𝑥~𝒩 Σ 0 −1 + Σ −1 −1 Σ 0 −1 𝜇 0 + Σ −1 𝑥 , Σ Ԧ 0 −1 + Σ −1 −1

(16)

Bayesian, Machine Learning, Frederic Pennerath

Kalman Filter:

Prediction step

Hypothesis:

• 𝑋 𝑡|𝑡 is assumed to be 𝒩 ෠ 𝑋 𝑡|𝑡 , 𝑃 𝑡|𝑡

• ℇ 𝑡 𝑋 ~𝒩 0, 𝑄 𝑡 is a white noise

Compute 𝐏 𝑿 𝒕+𝟏|𝒕 ≝ 𝑷 𝑿 𝒕+𝟏 𝑼 𝟎 … 𝑼 𝒕 𝒀 𝟏 … 𝒀 𝒕 = ׬ 𝑷 𝑿 𝒕+𝟏 𝑼 𝒕 , 𝑿 𝒕 = 𝒙 𝐏 𝑿 𝒕|𝒕 = 𝒙 𝒅𝒙 𝑋 𝑡+1 = 𝑀 𝑋 𝑡

𝑡 𝑋 + 𝑉 with ቊ 𝑀 = 𝐴 𝑡 𝐼 𝑉 = 𝐵 𝑡 𝑈 𝑡

and 𝑋 𝑡|𝑡

𝑡 𝑋 ~𝒩 μ, Σ with μ = 𝑋 ෠ 𝑡|𝑡

0 𝛴 = 𝑃 𝑡|𝑡 0 0 𝑄 𝑡 ⇒ 𝑋 𝑡+1|𝑡 ~𝒩 𝑀𝜇 + 𝑉, M𝛴𝑀 𝑇

⇒ 𝑋 𝑡+1|𝑡 ~𝒩 ෠ 𝑋 𝑡+1|𝑡 , 𝑃 𝑡+1|𝑡 with ቐ 𝑋 ෠ 𝑡+1|𝑡 = 𝐴 𝑡 𝑋 ෠ 𝑡|𝑡 + 𝐵 𝑡 𝑈 𝑡 𝑃 𝑡+1|𝑡 = 𝐴 𝑡 𝑃 𝑡|𝑡 𝐴 𝑡 𝑇 + 𝑄 𝑡

𝑋

𝑡

𝑈

𝑡

𝑋

𝑡+1

(17)

Kalman Filter:

Output prediction (intermediate step)

Hypothesis:

• 𝑋 𝑡|𝑡−1 is assumed to be 𝒩 ෠ 𝑋 𝑡|𝑡−1 , 𝑃 𝑡|𝑡−1

• ℇ 𝑡 𝑌 ~𝒩 0, 𝑅 𝑡 is a white noise

Given 𝒀 𝒕|𝒕−𝟏 ≝ 𝒀 𝒕 |𝑼 𝟎 … 𝑼 𝒕−𝟏 𝒀 𝟏 … 𝒀 𝒕−𝟏 ~𝒩 𝑌 ෠ 𝑡|𝑡−1 , 𝑆 𝑡|𝑡−1

Compute 𝑷 𝑿 𝒕|𝒕−𝟏 , 𝒀 𝒕|𝒕−𝟏 ≝ 𝑷(𝑿 𝒕 , 𝒀 𝒕 |𝑼 𝟎 … 𝑼 𝒕−𝟏 𝒀 𝟏 … 𝒀 𝒕−𝟏 ) = 𝑷 𝒀 𝒕 𝑿 𝒕 , 𝑼 𝒕−𝟏 𝐏 𝑿 𝒕|𝒕−𝟏 𝑋 𝑡

𝑌 𝑡 = 𝑀 𝑋 𝑡

𝑡 𝑌 + 𝑉 with 𝑀 = 𝐼 0

𝐶 𝑡 𝐼 , 𝑉 = 0 𝐷 𝑡 𝑈 𝑡

and 𝑋 𝑡|𝑡−1

𝑡 𝑌 ~𝒩 μ, Σ with μ = 𝑋 ෠ 𝑡|𝑡−1

0 𝛴 = 𝑃 𝑡|𝑡−1 0 0 𝑅 𝑡 ⇒ 𝑋 𝑡|𝑡−1

𝑌 𝑡 ~𝒩 𝑀𝜇 + 𝑉, M𝛴𝑀 𝑇

𝑋 𝑋 ෠ 𝑃 𝑃 𝐶 𝑇 𝑌 ෠ = 𝐶 𝑋 ෠ + 𝐷 𝑈

𝑋

𝑡−1

𝑋

𝑡

𝑌

𝑡

𝑈

𝑡−1

(18)

Bayesian, Machine Learning, Frederic Pennerath

Kalman Filter:

Update step

Hypothesis:

• 𝑋 𝑡|𝑡−1

𝑌 𝑡|𝑡−1 ~𝒩 𝑋 ෠ 𝑡|𝑡−1

𝑌 ෠ 𝑡|𝑡−1 , 𝑃 𝑡|𝑡−1 𝑃 𝑡|𝑡−1 𝐶 𝑡 𝑇 𝐶 𝑡 𝑃 𝑡|𝑡−1 𝑆 𝑡|𝑡−1

• Observe 𝑌 𝑡 = 𝑦 𝑡

Compute 𝐏 𝑿 𝐭|𝐭 ≝ 𝑷 𝑿 𝐭 𝑼 𝟎 … 𝑼 𝒕−𝟏 𝒀 𝟏 … 𝒀 𝒕−𝟏 , 𝒀 𝒕 = 𝒚 𝒕 𝑋 1

𝑋 2 ~𝒩 𝜇 1

𝜇 2 , Σ 11 Σ 12

Σ 12 𝑇 Σ 22

𝑃 𝑋 1 𝑋 2 = Ԧ 𝑥 ~𝒩 𝜇 1 + Σ 12 Σ 22 −1 𝑥 − 𝜇 Ԧ 2 , Σ 11 − Σ 12 Σ 22 −1 Σ 12 𝑇

⇒ 𝑋 𝑡|𝑡 ~𝒩 ෠ 𝑋 𝑡|𝑡 , 𝑃 𝑡|𝑡 with ቐ 𝑋 ෠ 𝑡|𝑡 = ෠ 𝑋 𝑡|𝑡−1 + 𝑃 𝑡|𝑡−1 𝐶 𝑡 𝑇 𝑆 𝑡|𝑡−1 −1 (𝑦 𝑡 − ෠ 𝑌 𝑡|𝑡−1 ) 𝑃 𝑡|𝑡 = 𝑃 𝑡|𝑡−1 − 𝑃 𝑡|𝑡−1 𝐶 𝑡 𝑇 𝑆 𝑡|𝑡−1 −1 𝐶 𝑡 𝑃 𝑡|𝑡−1

𝑋

𝑡−1

𝑋

𝑡

𝑌

𝑡

(19)

Kalman Filter:

Summary of equations & implementation

Initialisation step: 𝑋 ෠ 0|0 ← ෠ 𝑋 0 , 𝑃 0|0 ← 𝑃 0

Prediction step: run when system time is increased (𝑡 ← 𝑡 + 1)

1. Predicted state: 𝑋 ෠ 𝑡+1|𝑡 ← 𝐴 𝑡 𝑋 ෠ 𝑡|𝑡 + 𝐵 𝑡 𝑈 𝑡 (model integration) 2. Prediction covariance: 𝑃 𝑡+1|𝑡 ← 𝐴 𝑡 𝑃 𝑡|𝑡 𝐴 𝑡 𝑇 + 𝑄 𝑡 (uncertainty increase) Estimation step: run when observations are received

1. Output prediction: 𝑌 ෠ 𝑡|𝑡−1 ← 𝐶 𝑡 𝑋 ෠ 𝑡|𝑡−1 + 𝐷 𝑡 𝑈 𝑡 (mean of output posterior) 2. Output variance: 𝑆 𝑡|𝑡−1 ← 𝐶 𝑡 𝑃 𝑡|𝑡−1 𝐶 𝑡 𝑇 + 𝑅 𝑡 (add observation noise) 3. Innovation: 𝑌 ෨ 𝑡 ← 𝑦 𝑡 − ෠ 𝑌 𝑡|𝑡−1 (error on output prediction) 4. Kalman filter gain 𝐾 𝑡 ← 𝑃 𝑡|𝑡−1 𝐶 𝑡 𝑇 𝑆 𝑡|𝑡−1 −1 (compromise of uncertainty) 5. Posterior state: 𝑋 ෠ 𝑡|𝑡 ← ෠ 𝑋 𝑡|𝑡−1 + 𝐾 𝑡 𝑌 ෨ 𝑡 (state correction)

6. Posterior covariance: 𝑃 ← 𝐼 − 𝐾 𝐶 𝑃 (uncertainty reduction)

(20)

Bayesian, Machine Learning, Frederic Pennerath

Kalman filter tuning

Initial state distribution:

X 0 = 𝑣 0

𝑙 0 ~𝒩 0

0 , 𝜎 𝑣 2 0 0

0 0 with 𝜎 𝑣 0 = 1 𝑚/𝑠 State integration noise:

Force uncertainty: 𝜀 𝑡 𝑣 ~𝒩 0, 𝜎 𝑣 2 with 𝜎 𝑣 = 𝑚 ⋅ 1 ⋅ 9.8 𝑚/𝑠 2 Slip uncertainty: 𝜀 𝑡 𝑙 ~𝒩 0, 𝜎 𝑙 2 with 𝜎 𝑙 = 1 𝑚/𝑠

𝜀 𝑡 𝑣 ⊥ 𝜀 𝑡 𝑙 ⇒ ℇ 𝑡 𝑋 ~𝒩 0

0 , 𝑄 avec Q t = 𝜏 2

𝜎 𝑣 2

𝑚 2 0 0 𝜎 𝑙 2 Measurement noise:

𝜀 𝑡 𝑦 ~𝒩 0, 𝜎 𝑦 2 with 𝜎 𝑦 = 𝑞 2

12 , 𝑞 = 50 𝑐𝑚

⇒ R t = 𝜎 𝑦 2

(21)

Kalman Filter:

Without measures

0 10 20 30 40 50 60 70 80 90 100

-50 0 50 100 150

t (s)

speed (km/h)

Kalman

margin real estimated

0 10 20 30 40 50 60 70 80 90 100

-2000 -1000 0 1000 2000

t (s)

distance (km)

(22)

Bayesian, Machine Learning, Frederic Pennerath

Kalman Filter:

With measures

0 10 20 30 40 50 60 70 80 90 100

-50 0 50 100

t (s)

speed (km/h)

Kalman

margin real estimated

0 10 20 30 40 50 60 70 80 90 100

-0.5 0 0.5 1 1.5

t (s)

distance (km)

(23)

Kalman Filter:

Slowing down measurement time rate

0 2 4 6 8 10 12 14 16 18

0 20 40 60 80

t (s)

speed (km/h)

Kalman

margin real estimated

2 4 6 8 10 12 14 16 18

-2 0 2 4 6

distance (km)

(24)

Bayesian, Machine Learning, Frederic Pennerath

Kalman Filter example:

Video surveillance of pedestrians

Problem:

Multiple cameras observe image position of pedestrians:

• State is 𝑿 = 𝑥, 𝑦, 𝑣 𝑥 , 𝑣 𝑦

• Observation is image position 𝒚 𝒊

• Linear integration and observation:

𝒚 𝒊 = 𝑪 𝒊 𝑿 + 𝒅 𝒊 Advantages:

• Data fusion: multiple cameras, other sensors, etc

• Robust to camera occlusion

See code at “Demo/Bayes/8-Continuous Markov Models/Kalman”

𝑿

Screen 𝑖 focus

pedestrian

𝒚

𝒊

(25)

Kalman Filter example:

Video surveillance of pedestrians

(26)

Bayesian, Machine Learning, Frederic Pennerath

Kalman Filter Extensions for non linear models:

Extended Kalman Filter (EKF)

Linearize non linear state space representation (𝑓 𝑡 , 𝑔 𝑡 ) around estimated state 𝑋 ෠ 𝑡 to update covariance matrix 𝑃 𝑡 :

𝑋 ෠ 𝑡+1|𝑡 = 𝑓 𝑡 𝑋 ෠ 𝑡|𝑡 , 𝑈 𝑡 , ෠ 𝑌 𝑡|𝑡 = 𝑔 𝑡 𝑋 ෠ 𝑡|𝑡 , 𝑈 𝑡 , 𝐴 𝑡 = 𝜕𝑓

𝜕𝑋 𝑋 ෠ 𝑡|𝑡 , 𝐶 𝑡 = 𝜕𝑔

𝜕𝑋 𝑋 ෠ 𝑡|𝑡

𝜃𝑡 𝒞𝑡= 1

𝑅𝑡

𝑑𝑋

𝑡

𝑑𝑡 = 𝑑

𝑑𝑡 𝑥

𝑡

𝑦

𝑡

𝜃

𝑡

𝑣

𝑡

=

𝑣

𝑡

× cos 𝜃

𝑡

+ 𝜀

𝑡𝑥

𝑣

𝑡

× sin 𝜃

𝑡

+ 𝜀

𝑡𝑦

𝑣

𝑡

× 𝒞

𝑡

+ 𝜀

𝑡𝜃

1

𝑀 𝛼

𝑡

− 𝑓

𝑀 𝑣

𝑡

+ 𝜀

𝑡𝑣

𝑋 ෠

𝑡+1|𝑡

= 𝑋 ෠

𝑡|𝑡

+

𝑣

𝑡

× cos 𝜃

𝑡

𝑣

𝑡

× sin 𝜃

𝑡

𝑣

𝑡

× 𝒞

𝑡

1

𝑀 𝛼

𝑡

− 𝑓 𝑀 𝑣

𝑡

× 𝜏

𝐴

𝑡

= 𝜕𝑓

𝜕𝑋 𝑋 ෠

𝑡|𝑡

=

1 0 −𝑣

𝑡

× sin 𝜃

𝑡

× 𝜏 cos 𝜃

𝑡

× 𝜏 0 1 𝑣

𝑡

× cos 𝜃

𝑡

× 𝜏 sin 𝜃

𝑡

× 𝜏

0 0 1 𝒞

𝑡

× 𝜏

0 0 0 1 − 𝑓

𝑀 𝑣

𝑡

𝜏

(27)

Kalman Filter Extensions for non linear models:

Unscented Kalman Filter (UKF)

Take into account curvature of non linear functions(𝑓 𝑡 , 𝑔 𝑡 ) by observing displacement of few samples called sigma points 𝜒 Ƹ 𝑖

Example for state integration:

1 . Define 2𝑑 + 1 sigma points 𝜒 Ƹ

𝑖

centered on 𝑋 ෠

𝑡|𝑡

and scattered on lines passing through 𝑋 ෠

𝑡|𝑡

and aligned with eigenvectors of 𝑃:

Ƹ

𝜒

𝑖 𝑖∈ 0…2𝑑

= ෠ 𝑋

𝑡|𝑡

∪ ෠ 𝑋

𝑡|𝑡

± k 𝜎

𝑖

𝑢 ො

𝑖 𝑖∈ 1…𝑑

k = 𝑑 + 𝜆 2. Compute their images: 𝑓( Ƹ 𝜒

𝑖

)

𝑖∈ 0…2𝑑

3. Compute new mean state and covariance as weighted barycenters (weights depend on 𝑑 & 𝜆):

𝑋 ෠

𝑡+1|𝑡

= ෍

𝑖=0 2𝑑+1

𝑊

𝑖𝑠

𝑓( Ƹ 𝜒

𝑖

) 𝑃

𝑡+1|𝑡

= ෍

𝑖=0 2𝑑+1

𝑊

𝑖𝑃

𝑓 Ƹ 𝜒

𝑖

− ෠ 𝑋

𝑡+1|𝑡

× 𝑓 Ƹ 𝜒

𝑖

− ෠ 𝑋

𝑡+1|𝑡 T

𝑋 ෠

𝑡|𝑡

𝑓

𝑡

𝜒 Ƹ

0

𝑓

𝑡

𝜒 Ƹ

2

𝑓

𝑡

𝜒 Ƹ

3

𝑓

𝑡

𝜒 Ƹ

4

𝑃 𝑡|𝑡

Ƹ

𝜒

1

Ƹ

𝜒

3

Ƹ

𝜒

0

=

𝑋 ෠ 𝑡+1|𝑡

(28)

Bayesian, Machine Learning, Frederic Pennerath

EKF & UKF example:

Cachalot tracking, again & again

Problem: localize cachalot:

• State is 𝑿 = 𝑥, 𝑦, 𝑧, 𝑣 𝑥 , 𝑣 𝑦 , 𝑣 𝑧

• Observations are:

– Depth gauge: 𝑧

𝑔

= 𝑧 + 𝜀

𝑔

– Inclinometer: tan 𝜃

𝑖

=

𝑣𝑧

𝑣𝑥2+𝑣𝑦2

+ 𝜀

𝑖

– Compass: tan 𝜙

𝑐

=

𝑣𝑥

𝑣𝑦

+ 𝜀

𝑐

– Pitot tube: 𝑣

𝑝

= 𝑣

𝑥2

+ 𝑣

𝑦2

+ 𝑣

𝑧2

+ 𝜀

𝑝

– GPS: 𝑿

𝑔𝑝𝑠

= (𝑥, 𝑦, 𝑧) + 𝜺

𝐺𝑃𝑆

only when 𝑧 > −1

• Non linear observation Advantages:

• Data fusion: multiple heterogeneous sensors

• Robust to sensor failure

See code at “Demo/Bayes/8-Continuous Markov Models/EKF and UKF”

𝑽

𝜽

𝒛

(29)

Kalman Filter Extensions:

to go further

Algorithms:

– State smoothing

– State space representation learning Continuous time (Kalman-Bucy filter):

Make 𝜏 → 0: 𝑑 ෠ 𝑋

|𝑡𝑜

𝑑𝑡 𝑡 = 𝐴 𝑡 𝑋 ෠ |𝑡

0

𝑡 + 𝐵 𝑡 𝑈 𝑡 , … Hybrid filters:

Combine continuous time integration with discrete time updates Switching Kalman Filters:

Combine HMM with Kalman filters

𝑃 𝑆 0 … , 𝑆 𝑇 , 𝑋 0 … , 𝑋 𝑇 , 𝑌 1 … 𝑌 𝑇 = 𝑃 𝑆 0 𝑃(𝑋 0 |𝑆 0 ) ෑ

𝑡=1 𝑇

𝑃 𝑆 𝑡 𝑆 𝑡−1 𝑃 𝑋 𝑡 𝑋 𝑡−1 , 𝑆 𝑡 ) 𝑃 𝑌 𝑡 𝑆 𝑡 , 𝑋 𝑡

Extensions to non linear models:

– Extended Kalman Filter

– Unscented Kalman Filter

(30)

Bayesian, Machine Learning, Frederic Pennerath

Kalman Filter:

A summary

• Bayesian filtering updates dynamic state using bayesian inference

– Relies on Markov property

– Find best balance between uncertainty of model and sensors

• Kalman filter is a special case of bayesian filtering

– Relies on a state space representation

– All input distribution of the models are normal (model & observation noises, initial state)

• Wide application scope:

– Hybrid continuous/discrete systems – Multiple sensors (data fusion)

– Smoothing

– Learning parameters

Références

Documents relatifs

We can mention, for instance, the Back and Forth Nudging proposed by Auroux and Blum [1], the Time Reversal Focusing by Phung and Zhang [10], the algorithm proposed by Ito, Ramdani

Niculescu, “Survey on recent results in the stability and control of time-delay systems,” Journal of dynamic systems, measurement, and control, vol.. Niculescu, “Global asymptotic

In this paper, we show under which conditions generalized finite mixture with underlying normal distribution are identifiable in the sense that a given dataset leads to a

Figure 2 represents a traditional network interconnecting the same 100 edge nodes using 10 electronic transit routers to concentrate traffic onto high capacity point-to-point

The sequence logo of the monotopic PGT family reveals highly conserved residues that would not have been apparent from alignments of any of the families

Bouaud, Sauquet, Giral, Julien, Cornet, Falcoff, S´ eroussi IC 2010, Nˆımes 20/23.. Contexte et objectifs Mat´ eriel et m´ ethode R´ esultats Discussion

Le choix de l’ontologie de fondement à réutiliser est fondé sur trois critères : (i) la cohérence de la catégorisation des concepts de processus, événement, état

“Structural and Strongly Structural Input and State Observability of Linear Network Systems.” In: IEEE Transactions on Control of Network Systems (2017). Appeared