
6.2 Adaptive Kalman Filtering based Tracking

Even though the Kalman filter is the workhorse of position tracking, its use requires knowledge of various parameters that describe the measurement error variances and the acceleration model. This aspect is often overlooked. In this section we review some known and some less well-known methods to adapt these state-space model parameters.

6.2.1 Adaptive Kalman Filtering Approaches

We shall first consider the topic of adaptive Kalman filtering for a general linear state-space model.


Basic Kalman Filter

The KF considers the estimation of a first-order vector autoregressive (Markov) process from linear measurements in white noise. The KF performs this estimation recursively by alternating between filtering (measurement update) and (one step ahead) prediction (time update). An alternative viewpoint is that the Kalman filter recursively generates the innovations of the measurement signal (by a structured Gram-Schmidt approach that decorrelates the consecutive measurements). The KF corresponds to optimal (Minimum Mean Squared Error (MMSE) or Maximum A Posteriori (MAP)) Bayesian estimation of the state sequence if all random sources involved (measurement noise, state noise and state initial conditions) are Gaussian. In the non-Gaussian case, the KF performs Linear MMSE (LMMSE) estimation.

The signal model can be written as

$$\begin{aligned}
\text{state update equation:}\quad & x_{k+1} = F_k x_k + G_k w_k\\
\text{measurement equation:}\quad & y_k = H_k x_k + v_k
\end{aligned}\tag{6.1}$$

for $k = 1, 2, \dots$, where the initial state $x_0 \sim N(\hat{x}_0, P_0)$, the measurement noise $v_k \sim N(0, R_k)$, and the state noise $w_k \sim N(0, Q_k)$, and all these random quantities are mutually uncorrelated. In the case of time-varying system matrices $F_k$ etc., the form of the equations as they appear in (6.1) is the most logical, with the state update corresponding to a prediction of the state on the basis of the quantities available at time $k$. In the position tracking application to be considered here, the system matrices are essentially time-invariant, or at most slowly time-varying. In that case the time index $k$ in e.g. $F_k$ refers to the estimate of $F$ available at time $k$ in an adaptive approach. In a logical ordering, a measurement is performed first and then the state gets predicted.

In the following, we introduce the notation $y_{1:k} = \{y_1, \dots, y_k\}$. The KF performs Gram-Schmidt orthogonalization (decorrelation) of the measurement variables $y_k$. This is done by computing the LMMSE predictor $\hat{y}_{k|k-1}$ of $y_k$ on the basis of $y_{1:k-1}$, leading to the orthogonalized prediction error (or innovation) $\tilde{y}_k = \tilde{y}_{k|k-1} = y_k - \hat{y}_{k|k-1}$. We introduce the correlation matrix notation $R_{xy} = \mathrm{E}\, x y^T$ (correlation matrices will usually also be covariance matrices here since the processes $y_k$ and $x_k$ have zero mean and also various estimation errors will have (conditional) zero mean). We denote the covariance matrix $R_{\tilde{y}_k \tilde{y}_k} = S_k$. The idea of the innovations approach is that (linear) estimation in terms of $y_{1:k}$ is equivalent to estimation in terms of $\tilde{y}_{1:k}$, since one set is obtained from the other by an invertible linear transformation. Now, since the $\tilde{y}_k$ are decorrelated, estimation in terms of $\tilde{y}_{1:k}$ simplifies to a sum of one-shot projections, e.g. $\hat{x}_{k|k} = \sum_{i=1}^{k} R_{x_k \tilde{y}_i} S_i^{-1} \tilde{y}_i$.

This will be used to obtain predicted estimates $\hat{x}_{k|k-1}$ with estimation error $\tilde{x}_{k|k-1} = x_k - \hat{x}_{k|k-1}$ and covariance matrix $P_{k|k-1} = R_{\tilde{x}_{k|k-1}\tilde{x}_{k|k-1}}$, and also filtered estimates $\hat{x}_{k|k}$ with estimation error $\tilde{x}_{k|k} = x_k - \hat{x}_{k|k}$ and covariance matrix $P_{k|k} = R_{\tilde{x}_{k|k}\tilde{x}_{k|k}}$.

Now exploiting the correlation structure in the signal model, this leads to the following two-step recursive procedure to go from |k−1 to |k:

Measurement Update
$$\begin{aligned}
\hat{y}_{k|k-1} &= H_k \hat{x}_{k|k-1}\\
\tilde{y}_k &= y_k - \hat{y}_{k|k-1}\\
S_k &= H_k P_{k|k-1} H_k^T + R_k\\
K_k &= P_{k|k-1} H_k^T S_k^{-1}\\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \tilde{y}_k\\
P_{k|k} &= P_{k|k-1} - K_k H_k P_{k|k-1}
\end{aligned}$$

Time Update
$$\begin{aligned}
\hat{x}_{k+1|k} &= F_k \hat{x}_{k|k}\\
P_{k+1|k} &= F_k P_{k|k} F_k^T + G_k Q_k G_k^T
\end{aligned}\tag{6.2}$$

There are various other ways to formulate these update equations, including performing both steps in one.

The choice of the initial conditions crucially affects the initial convergence (transient behavior). In the usual case of total absence of prior information on the initial state, one can choose $\hat{x}_0 = 0$, $P_0 = p_0 I$ with $p_0$ a (very) large number. This leads to $P_{1|0} = F_0 P_0 F_0^T + G_0 Q_0 G_0^T$, $\hat{x}_{1|0} = F_0 \hat{x}_0 = 0$.

For numerical stability (in the presence of roundoff errors), it is crucial that the symmetry of the covariance matrices $P_{k|k-1}$, $P_{k|k}$ is maintained throughout the updates (which is not going to be the case with the update of $P_{k|k}$ the way it appears in (6.2)). The easiest way to ensure this is to periodically (e.g. every sample, to ease programming) force the symmetry of e.g. $P_{k|k}$ by computing $P_{k|k} := \frac{1}{2}(P_{k|k} + P_{k|k}^T)$. Another solution that should work is to update $P_{k|k}$ as $P_{k|k} = P_{k|k-1} - K_k H_k P_{k|k-1}^T = P_{k|k-1} - K_k S_k K_k^T$.
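To make the above concrete, here is a minimal NumPy sketch of one KF recursion (measurement update followed by time update), including the symmetry forcing just discussed; the function and variable names are ours, chosen for illustration:

```python
import numpy as np

def kf_step(x_pred, P_pred, y, F, G, H, Q, R):
    """One KF recursion: measurement update (6.2), then time update.
    x_pred, P_pred are the predicted moments x_{k|k-1}, P_{k|k-1}."""
    y_tilde = y - H @ x_pred                  # innovation
    S = H @ P_pred @ H.T + R                  # innovation covariance S_k
    K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain K_k
    x_filt = x_pred + K @ y_tilde             # filtered state x_{k|k}
    P_filt = P_pred - K @ H @ P_pred          # filtered covariance P_{k|k}
    P_filt = 0.5 * (P_filt + P_filt.T)        # force symmetry for numerical stability
    x_next = F @ x_filt                       # predicted state x_{k+1|k}
    P_next = F @ P_filt @ F.T + G @ Q @ G.T   # predicted covariance P_{k+1|k}
    return x_filt, P_filt, x_next, P_next
```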

Extended Kalman Filter (EKF)

For the case of a nonlinear state-space model, the idea of the EKF is to apply the KF to a linearized version of the state-space model, via a first-order Taylor series expansion. So we get

state update equation: $x_{k+1} = f(x_k) + G_k w_k \approx F_k x_k + G_k w_k$, with $F_k = \frac{\partial f(x)}{\partial x^T}\big|_{x = \hat{x}_{k|k}}$

measurement equation: $y_k = h(x_k) + v_k \approx H_k x_k + v_k$, with $H_k = \frac{\partial h(x)}{\partial x^T}\big|_{x = \hat{x}_{k|k-1}}$

where the constant terms of the Taylor expansions are known quantities that get absorbed into the prediction and innovation computations.

So, at this point, the basic KF can be applied to the thus obtained approximate linear state-space model.

The EKF approach can be used to adapt some parameters in an otherwise linear state-space model $x_{k+1} = F x_k + G w_k$. For instance, consider the case in which one wants to adapt parameters appearing (e.g.) linearly in the matrix $F = F(\theta)$. One can jointly estimate the unknown constant parameter vector $\theta$ by considering the following state update for them: $\theta_{k+1} = \theta_k$. Then one can introduce the augmented state $\bar{x}_k = [x_k^T\ \theta_k^T]^T$ and augmented system matrices

$$\bar{F}_k = \begin{bmatrix} F(\theta) & \dfrac{\partial F(\theta)\, x_k}{\partial \theta^T} \\ 0 & I \end{bmatrix}, \qquad \bar{G} = \begin{bmatrix} G \\ 0 \end{bmatrix}.$$

When running the EKF, the state-dependent system matrices have to be filled with the latest state estimates, so in this case the block $\frac{\partial F(\theta) x_k}{\partial \theta^T}$ gets evaluated at $x_k = \hat{x}_{k|k}$ and $\theta = \hat{\theta}_{k|k}$.

The basic EKF can be improved by correct computation of the covariance matrix $P_{k+1|k}$, accounting for the replacement of $x_k$ by $\hat{x}_{k|k}$ (which in fact leads to equivalent augmented process noise). The other issue is that the parameters $\theta$ are often not really constant and hence need to be tracked adaptively. This can be done either by introducing some process noise in $\theta_{k+1} = \theta_k$ (random walk time evolution) or by introducing exponential weighting (at least for the $\theta$ portion) into the KF updates [42].
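As an illustration of the augmented-state construction, here is a sketch for a single scalar parameter, assuming (our choice, purely for illustration) that $F$ depends linearly on $\theta$ as $F(\theta) = F_0 + \theta F_1$:

```python
import numpy as np

def augmented_system(x_hat, theta_hat, F0, F1, G):
    """Augmented system matrices for joint estimation of x and a scalar
    theta, with F(theta) = F0 + theta * F1 (illustrative parameterization)
    and the random-walk model theta_{k+1} = theta_k."""
    n = x_hat.size
    F_theta = F0 + theta_hat * F1
    dFx_dtheta = (F1 @ x_hat).reshape(n, 1)   # d(F(theta) x)/d theta at latest estimates
    F_aug = np.block([[F_theta,          dFx_dtheta],
                      [np.zeros((1, n)), np.ones((1, 1))]])
    G_aug = np.vstack([G, np.zeros((1, G.shape[1]))])
    return F_aug, G_aug
```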

This EKF approach allows fairly straightforwardly to estimate parameters in $F_k$, $H_k$, or $G_k$, but much less so in $Q_k$, $R_k$. For adapting (parameters in) $Q$ and $R$, one needs to consider the innovations representation $\hat{x}_{k+1|k} = F_k \hat{x}_{k|k-1} + F_k K_k \tilde{y}_k$ and consider gradients of the Kalman gain $K_k$ w.r.t. these matrices.

Recursive Prediction Error Method (RPEM-KF)

The RPEM [43], [44] is an adaptive implementation of Maximum Likelihood (ML) parameter estimation. The negative log-likelihood becomes a least-squares criterion in the prediction errors (innovations) and RPEM performs one iteration per sample. Applied to the KF, the RPEM can be seen as a more rigorous version of the EKF that computes gradients more precisely [45]. Indeed, for the case of a state transition matrix $F_k = F_k(\theta)$, the EKF would consider the gradient

$$\frac{\partial x_{k+1}}{\partial \theta^T} = \frac{\partial F_k(\theta)\, x_k}{\partial \theta^T} \tag{6.8}$$

where only the explicit dependence of $F$ on $\theta$ would be considered, whereas the RPEM would consider more correctly

$$\frac{\partial x_{k+1}}{\partial \theta^T} = \frac{\partial F_k(\theta)\, x_k}{\partial \theta^T} + F_k(\theta) \frac{\partial x_k}{\partial \theta^T}. \tag{6.9}$$

The RPEM for the KF can be found in the references above, but will not be pursued further here. One characteristic of the RPEM is its higher complexity.
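Still, the essential difference between (6.8) and (6.9) is easy to render in code: the RPEM propagates a state sensitivity recursively. A sketch for a scalar $\theta$, again under the illustrative assumption $F(\theta) = F_0 + \theta F_1$:

```python
import numpy as np

def rpem_sensitivity_step(x, dx_dtheta, theta, F0, F1):
    """Propagate the state and its sensitivity w.r.t. a scalar theta for
    F(theta) = F0 + theta * F1. The EKF keeps only the first (explicit)
    term of dx_next, cf. (6.8); the RPEM also keeps the second, cf. (6.9)."""
    F = F0 + theta * F1
    x_next = F @ x
    dx_next = F1 @ x + F @ dx_dtheta          # explicit + implicit dependence
    return x_next, dx_next
```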

Expectation-Maximization (EM-KF)

In EM [46], the parameters are estimated by minimizing expected values of negative log-likelihoods; see e.g. [47] for an application involving the KF. For the state update, since $G_k$ is typically a tall matrix, $G_k w_k$ has a singular covariance matrix. The state update equation can be rewritten as

$$G_k^+ x_{k+1} = G_k^+ F_k x_k + w_k \tag{6.10}$$

where $G_k^+ = (G_k^T G_k)^{-1} G_k^T$ is the pseudo-inverse of $G_k$. Hence, for the parameters involved in the state update equation, the following negative log-likelihood is applicable:

$$\sum_k \left\{ \ln\det(Q_k) + (G_k^+(x_{k+1} - F_k x_k))^T Q_k^{-1} (G_k^+(x_{k+1} - F_k x_k)) \right\}. \tag{6.11}$$

For the parameters involved in the measurement equation, the appropriate negative log-likelihood is

$$\sum_k \left\{ \ln\det(R_k) + (y_k - H_k x_k)^T R_k^{-1} (y_k - H_k x_k) \right\}. \tag{6.12}$$

Now the expectation is taken, in principle with the conditional distribution given all data: $\mathrm{E}_{|n}$, involving all data $y_k$ up to the last sample $n$. This leads to an iterative algorithm with, in each iteration, a whole fixed-interval smoothing operation. An adaptive version [47], [48] can be obtained by replacing fixed-interval smoothing by fixed-lag smoothing and performing one iteration per time sample. Since the state update equation corresponds to a vector AR(1) model, one may expect (as in [48]) that a lag of 1 should be enough (to guarantee convergence). In [47], complexity is reduced further by suggesting that filtering might be enough. In that case, the (presumably) slowly varying $\hat{Q}_{k+1}$, $\hat{F}_{k+1}$ (for use in the KF at time $k+1$) get determined by minimizing

$$\sum_{i=1}^{k} \lambda^{k-i}\, \mathrm{E}_{|i}\left\{ \ln\det(\hat{Q}) + (G_i^+(x_{i+1} - \hat{F} x_i))^T \hat{Q}^{-1} (G_i^+(x_{i+1} - \hat{F} x_i)) \right\} \tag{6.13}$$

w.r.t. $\hat{Q}$, $\hat{F}$, where we introduced an exponential forgetting factor $\lambda < 1$.

(6.13) is equivalent to

$$\gamma_k^{-1} \ln\det(\hat{Q}) + \sum_{i=1}^{k} \lambda^{k-i}\, \mathrm{tr}\left\{ G_i^{+T} \hat{Q}^{-1} G_i^+ \,\mathrm{E}_{|i}(x_{i+1} - \hat{F} x_i)(x_{i+1} - \hat{F} x_i)^T \right\} \tag{6.14}$$

where we introduced $\gamma_k^{-1} = \sum_{i=1}^{k} \lambda^{k-i} = \lambda \gamma_{k-1}^{-1} + 1$. $\gamma_k$ behaves initially as $1/k$ but saturates eventually at $\gamma_\infty = 1 - \lambda$. We shall need

$$\begin{aligned}
\mathrm{E}_{|i}\, x_i x_i^T &= \hat{x}_{i|i} \hat{x}_{i|i}^T + P_{i|i} \\
\mathrm{E}_{|i}\, x_{i+1} x_i^T &= F_i \hat{x}_{i|i} \hat{x}_{i|i}^T + F_i P_{i|i} \\
\mathrm{E}_{|i}\, x_i x_{i+1}^T &= \hat{x}_{i|i} \hat{x}_{i|i}^T F_i^T + P_{i|i} F_i^T \\
\mathrm{E}_{|i}\, x_{i+1} x_{i+1}^T &= F_i \hat{x}_{i|i} \hat{x}_{i|i}^T F_i^T + P_{i+1|i} = F_i (\hat{x}_{i|i} \hat{x}_{i|i}^T + P_{i|i}) F_i^T + G_i Q_i G_i^T.
\end{aligned}\tag{6.15}$$

In case of time-invariant $G_k \equiv G$, we can rewrite (6.14) as

$$\ln\det(\hat{Q}) + \mathrm{tr}\left\{ G^{+T} \hat{Q}^{-1} G^+ \left(M_k^{11} - \hat{F} M_k^{01} - M_k^{10} \hat{F}^T + \hat{F} M_k^{00} \hat{F}^T\right) \right\} \tag{6.16}$$

where

$$\begin{aligned}
M_k^{00} &= (1-\gamma_k) M_{k-1}^{00} + \gamma_k (\hat{x}_{k|k} \hat{x}_{k|k}^T + P_{k|k}) \\
M_k^{10} &= (1-\gamma_k) M_{k-1}^{10} + \gamma_k F_k (\hat{x}_{k|k} \hat{x}_{k|k}^T + P_{k|k}) \\
M_k^{01} &= (1-\gamma_k) M_{k-1}^{01} + \gamma_k (\hat{x}_{k|k} \hat{x}_{k|k}^T + P_{k|k}) F_k^T \\
M_k^{11} &= (1-\gamma_k) M_{k-1}^{11} + \gamma_k (\hat{x}_{k+1|k} \hat{x}_{k+1|k}^T + P_{k+1|k}).
\end{aligned}\tag{6.17}$$

In case of furthermore time-invariant $F_k \equiv F$, $Q_k \equiv Q$, then

$$\begin{aligned}
M_k^{10} &= F M_k^{00} \\
M_k^{01} &= M_k^{00} F^T \\
M_k^{11} &= F M_k^{00} F^T + G Q G^T.
\end{aligned}\tag{6.18}$$

As a result, (6.16) can be rewritten as

$$\ln\det(\hat{Q}) + \mathrm{tr}\{\hat{Q}^{-1} Q\} + \mathrm{tr}\left\{ G^{+T} \hat{Q}^{-1} G^+ (F - \hat{F}) M_k^{00} (F - \hat{F})^T \right\}. \tag{6.19}$$

The optimization of (6.19) now clearly leads to $\hat{F} = F$, $\hat{Q} = Q$, so we just get back the quantities that we use in the KF, without any additional information. Hence, just Kalman filtering in the EM-KF is not enough to adapt the state update parameters. Indeed, it just leads to a snake biting its own tail.

Fixed-Lag Smoothing

Using the innovations approach, we have

$$\hat{x}_{k-1|k} = \hat{x}_{k-1|k-1} + R_{x_{k-1} \tilde{y}_k} S_k^{-1} \tilde{y}_k. \tag{6.20}$$

After a few steps, we get the following lag-1 smoothing equations that need to be added to the basic Kalman filter equations (to be inserted between the Measurement Update and the Time Update):

$$\begin{aligned}
K_{k;1} &= P_{k-1|k-1} F_{k-1}^T H_k^T \\
\hat{x}_{k-1|k} &= \hat{x}_{k-1|k-1} + K_{k;1} S_k^{-1} \tilde{y}_k \\
P_{k-1|k} &= P_{k-1|k-1} - K_{k;1} S_k^{-1} K_{k;1}^T.
\end{aligned}\tag{6.21}$$
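A sketch of this added computation in NumPy (names ours), meant to be called with the innovation quantities $\tilde{y}_k$, $S_k$ of the current measurement update:

```python
import numpy as np

def lag1_smooth(x_prev_filt, P_prev_filt, F_prev, H, S, y_tilde):
    """Lag-1 smoothing (6.21): refine x_{k-1|k-1} into x_{k-1|k} using the
    current innovation y_tilde and its covariance S."""
    K1 = P_prev_filt @ F_prev.T @ H.T               # K_{k;1}
    S_inv = np.linalg.inv(S)
    x_smooth = x_prev_filt + K1 @ S_inv @ y_tilde   # x_{k-1|k}
    P_smooth = P_prev_filt - K1 @ S_inv @ K1.T      # P_{k-1|k}
    return x_smooth, P_smooth
```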

Adaptive EM-KF with Fixed-Lag Smoothing

Consider now the case in which the state-space model is essentially time-invariant (or slowly time-varying). In that case the time index of the system matrices $F_k$ etc. just reflects at which time the (unknown) system matrices have been adapted. The resulting KF equations with lag-1 smoothing become:

$$\begin{aligned}
\hat{y}_{k|k-1} &= H_{k-1} \hat{x}_{k|k-1} \\
\tilde{y}_k &= y_k - \hat{y}_{k|k-1} \\
S_k &= H_{k-1} P_{k|k-1} H_{k-1}^T + R_{k-1} \\
K_{k;1} &= P_{k-1|k-1} F_{k-1}^T H_{k-1}^T \\
\hat{x}_{k-1|k} &= \hat{x}_{k-1|k-1} + K_{k;1} S_k^{-1} \tilde{y}_k \\
P_{k-1|k} &= P_{k-1|k-1} - K_{k;1} S_k^{-1} K_{k;1}^T \\
K_k &= P_{k|k-1} H_{k-1}^T S_k^{-1} \\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \tilde{y}_k \\
P_{k|k} &= P_{k|k-1} - K_k H_{k-1} P_{k|k-1} \\
&\quad \text{parameter update} \\
\hat{x}_{k+1|k} &= F_k \hat{x}_{k|k} \\
P_{k+1|k} &= F_k P_{k|k} F_k^T + G_k Q_k G_k^T
\end{aligned}\tag{6.22}$$

So, the system matrices $(F, G, Q)$ should be adapted after the smoothing and filtering steps and before the prediction step. We now adapt the system matrices $F$, $G$, $Q$ from

$$\begin{aligned}
&\ln\det(\hat{Q}) + \mathrm{tr}\Big\{ \hat{G}^{+T} \hat{Q}^{-1} \hat{G}^+ \, \gamma_k \sum_{i=1}^{k} \lambda^{k-i}\, \mathrm{E}_{|i}(x_i - \hat{F} x_{i-1})(x_i - \hat{F} x_{i-1})^T \Big\} \\
&= \ln\det(\hat{Q}) + \mathrm{tr}\Big\{ \hat{G}^{+T} \hat{Q}^{-1} \hat{G}^+ \left(M_k^{11} - \hat{F} M_k^{01} - M_k^{10} \hat{F}^T + \hat{F} M_k^{00} \hat{F}^T\right) \Big\}
\end{aligned}\tag{6.23}$$

where the matrix definitions become this time

$$\begin{aligned}
M_k^{00} &= (1-\gamma_k) M_{k-1}^{00} + \gamma_k (\hat{x}_{k-1|k} \hat{x}_{k-1|k}^T + P_{k-1|k}) \\
M_k^{10} &= (1-\gamma_k) M_{k-1}^{10} + \gamma_k (\hat{x}_{k|k} \hat{x}_{k-1|k}^T + F_{k-1} P_{k-1|k} - G_{k-1} Q_{k-1} G_{k-1}^T H_{k-1}^T S_k^{-1} K_{k,1}^T) \\
M_k^{01} &= (M_k^{10})^T \\
M_k^{11} &= (1-\gamma_k) M_{k-1}^{11} + \gamma_k (\hat{x}_{k|k} \hat{x}_{k|k}^T + P_{k|k}).
\end{aligned}\tag{6.24}$$

If for example $G$ would be fixed and invertible, minimization of (6.23) w.r.t. $\hat{F}$, $\hat{Q}$ would lead to the following minimizers

$$\begin{aligned}
F_k &= M_k^{10} (M_k^{00})^{-1} \\
Q_k &= G^+ \left(M_k^{11} - M_k^{10} (M_k^{00})^{-1} M_k^{01}\right) G^{+T}.
\end{aligned}\tag{6.25}$$

For adapting the parameters in the measurement equation on the other hand, Kalman filtering should be sufficient. So consider

$$\begin{aligned}
&\gamma_k \sum_{i=1}^{k} \lambda^{k-i}\, \mathrm{E}_{|i}\left\{ \ln\det(\hat{R}) + (y_i - \hat{H} x_i)^T \hat{R}^{-1} (y_i - \hat{H} x_i) \right\} \\
&= \ln\det(\hat{R}) + \mathrm{tr}\Big\{ \hat{R}^{-1}\, \gamma_k \sum_{i=1}^{k} \lambda^{k-i}\, \mathrm{E}_{|i}(y_i - \hat{H} x_i)(y_i - \hat{H} x_i)^T \Big\} \\
&= \ln\det(\hat{R}) + \mathrm{tr}\Big\{ \hat{R}^{-1} \left(\hat{R}_{yy,k} - \hat{H} \hat{R}_{xy,k} - \hat{R}_{yx,k} \hat{H}^T + \hat{H} \hat{R}_{xx,k} \hat{H}^T\right) \Big\}
\end{aligned}\tag{6.26}$$

where

$$\begin{aligned}
\hat{R}_{yy,k} &= \gamma_k \sum_{i=1}^{k} \lambda^{k-i}\, y_i y_i^T = (1-\gamma_k) \hat{R}_{yy,k-1} + \gamma_k y_k y_k^T \\
\hat{R}_{xy,k} &= \gamma_k \sum_{i=1}^{k} \lambda^{k-i}\, \hat{x}_{i|i} y_i^T = (1-\gamma_k) \hat{R}_{xy,k-1} + \gamma_k \hat{x}_{k|k} y_k^T \\
\hat{R}_{yx,k} &= \hat{R}_{xy,k}^T \\
\hat{R}_{xx,k} &= \gamma_k \sum_{i=1}^{k} \lambda^{k-i}\, \mathrm{E}_{|i}\, x_i x_i^T = (1-\gamma_k) \hat{R}_{xx,k-1} + \gamma_k (\hat{x}_{k|k} \hat{x}_{k|k}^T + P_{k|k}).
\end{aligned}\tag{6.27}$$

Minimization of (6.26) w.r.t. $\hat{H}$, $\hat{R}$ yields

$$\begin{aligned}
H_k &= \hat{R}_{yx,k} \hat{R}_{xx,k}^{-1} \\
R_k &= \hat{R}_{yy,k} - \hat{R}_{yx,k} \hat{R}_{xx,k}^{-1} \hat{R}_{xy,k}.
\end{aligned}\tag{6.28}$$

For the initialization, in the absence of any side information, one can take $M_0^{00} = \frac{1}{p_0} I$, $M_0^{10} = 0$, $M_0^{11} = 0$, $\hat{R}_{xx,0} = \frac{1}{p_0} I$, $\hat{R}_{xy,0} = 0$, $\hat{R}_{yy,0} = 0$, where again $p_0$ is a very large number.
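A sketch of this measurement-equation adaptation as a running update (names ours); `gam` is $\gamma_k$, obtained from the recursion $\gamma_k^{-1} = \lambda \gamma_{k-1}^{-1} + 1$:

```python
import numpy as np

def update_HR(Ryy, Rxy, Rxx, gam, y, x_filt, P_filt):
    """Update the exponentially weighted correlations (6.27) and
    re-estimate H and R by (6.28)."""
    Ryy = (1 - gam) * Ryy + gam * np.outer(y, y)
    Rxy = (1 - gam) * Rxy + gam * np.outer(x_filt, y)
    Rxx = (1 - gam) * Rxx + gam * (np.outer(x_filt, x_filt) + P_filt)
    Rxx_inv = np.linalg.inv(Rxx)
    H_hat = Rxy.T @ Rxx_inv                     # H_k = R_yx R_xx^{-1}
    R_hat = Ryy - Rxy.T @ Rxx_inv @ Rxy         # R_k, cf. (6.28)
    return Ryy, Rxy, Rxx, H_hat, R_hat
```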

Hybrid EM-EKF

The idea here is to update $F_k$ via the EKF and $Q_k$, $R_k$ via EM.

6.2.2 State-Space Models for Position Tracking

White Noise Acceleration

Consider positioning in $n$D, where $n = 1$, 2 or 3 dimensions. Let $p_k$ be the position at sampling instant $k$, $v_k$ the velocity (not to be confused with the measurement noise) and $a_k$ the acceleration. In the case of e.g. 3D positioning, $p_k$ is of the form $p = [x\ y\ z]^T$. By simple discretization of the differential equations of motion, we get

$$\begin{aligned}
p_{k+1} &= p_k + v_k \\
v_{k+1} &= v_k + a_k \\
a_k &= w_k.
\end{aligned}\tag{6.29}$$

In the case of modeling the acceleration as (temporally) white noise, the acceleration is the process noise. To simplify the equations, we assume here that the unit of time for velocity and acceleration is the sampling period. The physical speed and acceleration are then $v_k/t_s$ and $a_k/t_s^2$, where $t_s$ is the sampling period expressed in seconds, assuming $p_k$ is expressed in meters.

We get for the state-space model

$$x_k = \begin{bmatrix} p_k \\ v_k \end{bmatrix}, \qquad F = \begin{bmatrix} I_n & I_n \\ 0 & I_n \end{bmatrix}, \qquad G = \begin{bmatrix} 0 \\ I_n \end{bmatrix}, \qquad H = \begin{bmatrix} I_n & 0 \end{bmatrix}$$

so that $G^+ = [0\ \ I_n]$ and $w_k = a_k = G^+ x_{k+1} - G^+ x_k$ (note that $G^+ F = G^+$). The only unknown system parameter in this case is the acceleration covariance matrix $Q$.
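For concreteness, these matrices can be assembled as follows (a sketch; position-only measurements assumed, consistent with the note $HG = 0$ used later):

```python
import numpy as np

def white_noise_accel_model(n):
    """State x = [p; v] in n spatial dimensions, with the sampling period
    as the unit of time: p_{k+1} = p_k + v_k, v_{k+1} = v_k + w_k."""
    I, Z = np.eye(n), np.zeros((n, n))
    F = np.block([[I, I], [Z, I]])
    G = np.vstack([Z, I])            # process noise enters through the velocity
    H = np.hstack([I, Z])            # position measurements; note H @ G == 0
    return F, G, H
```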

AR(1) (Markov) Acceleration

In this case we assume a first-order autoregressive model for the acceleration, $a_{k+1} = A\, a_k + w_k$, where now $A$ and $Q$ are unknown. Note that $G^+ F = A\, G^+$. We get for the state-space model

$$x_k = \begin{bmatrix} p_k \\ v_k \\ a_k \end{bmatrix}, \qquad F = \begin{bmatrix} I_n & I_n & 0 \\ 0 & I_n & I_n \\ 0 & 0 & A \end{bmatrix}, \qquad G = \begin{bmatrix} 0 \\ 0 \\ I_n \end{bmatrix}, \qquad H = \begin{bmatrix} I_n & 0 & 0 \end{bmatrix}.$$

6.2.3 Adaptive EM-KF with Fixed-Lag Smoothing for Position Tracking

White Noise Acceleration

The EM-KF becomes (note that $HG = 0$, which makes the $G_{k-1} Q_{k-1} G_{k-1}^T H_{k-1}^T S_k^{-1} K_{k,1}^T$ term of (6.24) vanish):

$$\begin{aligned}
\hat{y}_{k|k-1} &= H \hat{x}_{k|k-1} \\
\tilde{y}_k &= y_k - \hat{y}_{k|k-1} \\
S_k &= H P_{k|k-1} H^T + R_{k-1} \\
K_{k;1} &= P_{k-1|k-1} F_{k-1}^T H^T \\
\hat{x}_{k-1|k} &= \hat{x}_{k-1|k-1} + K_{k;1} S_k^{-1} \tilde{y}_k \\
P_{k-1|k} &= P_{k-1|k-1} - K_{k;1} S_k^{-1} K_{k;1}^T \\
K_k &= P_{k|k-1} H^T S_k^{-1} \\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k \tilde{y}_k \\
P_{k|k} &= P_{k|k-1} - K_k H P_{k|k-1} \\
M_k^{00} &= (1-\gamma_k) M_{k-1}^{00} + \gamma_k G^+ (\hat{x}_{k-1|k} \hat{x}_{k-1|k}^T + P_{k-1|k}) G^{+T} \\
M_k^{10} &= (1-\gamma_k) M_{k-1}^{10} + \gamma_k G^+ (\hat{x}_{k|k} \hat{x}_{k-1|k}^T + F_{k-1} P_{k-1|k}) G^{+T} \\
M_k^{11} &= (1-\gamma_k) M_{k-1}^{11} + \gamma_k G^+ (\hat{x}_{k|k} \hat{x}_{k|k}^T + P_{k|k}) G^{+T} \\
Q_k &= M_k^{11} - M_k^{10} - M_k^{10\,T} + M_k^{00} \\
\big( Q_k &= \tfrac{1}{n}\, \mathrm{tr}\{Q_k\}\, I_n \big) \\
R_k &= (1-\gamma_k) R_{k-1} + \gamma_k \big( (y_k - H \hat{x}_{k|k})(y_k - H \hat{x}_{k|k})^T + H P_{k|k} H^T \big) \\
\hat{x}_{k+1|k} &= F_k \hat{x}_{k|k} \\
P_{k+1|k} &= F_k P_{k|k} F_k^T + G Q_k G^T
\end{aligned}\tag{6.32}$$

The step $Q_k = \frac{1}{n} \mathrm{tr}\{Q_k\} I_n$ gets added in case we want to model the acceleration as also spatially white.
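A sketch of one recursion of (6.32) in NumPy (names ours; the spatially white reassignment of $Q_k$ is included):

```python
import numpy as np

def em_kf_step(y, x_pred, P_pred, x_pf, P_pf, F, G, H, R, M00, M10, M11, gam):
    """One recursion of the adaptive EM-KF with lag-1 smoothing, cf. (6.32).
    x_pred, P_pred: x_{k|k-1}, P_{k|k-1}; x_pf, P_pf: x_{k-1|k-1}, P_{k-1|k-1}."""
    Gp = np.linalg.pinv(G)                              # G^+
    y_t = y - H @ x_pred                                # innovation
    S = H @ P_pred @ H.T + R
    Si = np.linalg.inv(S)
    K1 = P_pf @ F.T @ H.T                               # lag-1 smoothing
    x_sm = x_pf + K1 @ Si @ y_t                         # x_{k-1|k}
    P_sm = P_pf - K1 @ Si @ K1.T                        # P_{k-1|k}
    K = P_pred @ H.T @ Si                               # filtering
    x_f = x_pred + K @ y_t                              # x_{k|k}
    P_f = P_pred - K @ H @ P_pred                       # P_{k|k}
    # EM parameter update on the exponentially weighted statistics
    M00 = (1 - gam) * M00 + gam * Gp @ (np.outer(x_sm, x_sm) + P_sm) @ Gp.T
    M10 = (1 - gam) * M10 + gam * Gp @ (np.outer(x_f, x_sm) + F @ P_sm) @ Gp.T
    M11 = (1 - gam) * M11 + gam * Gp @ (np.outer(x_f, x_f) + P_f) @ Gp.T
    Q = M11 - M10 - M10.T + M00
    Q = (np.trace(Q) / Q.shape[0]) * np.eye(Q.shape[0]) # spatially white option
    r = y - H @ x_f
    R = (1 - gam) * R + gam * (np.outer(r, r) + H @ P_f @ H.T)
    x_next = F @ x_f                                    # prediction
    P_next = F @ P_f @ F.T + G @ Q @ G.T
    return x_next, P_next, x_f, P_f, Q, R, M00, M10, M11
```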

AR(1) (Markov) Acceleration

The only change for the EM-KF in (6.32) is that the update for $Q_k$ gets replaced by

$$\begin{aligned}
A_k &= M_k^{10} (M_k^{00})^{-1} \\
Q_k &= M_k^{11} - A_k (M_k^{10})^T.
\end{aligned}\tag{6.33}$$

6.2.4 Non-Linear Measurements

In case the measurement equation is of the form $y_k = h(x_k) + v_k$, the EKF needs to be applied as explained earlier. We get

$$y_k = h(x_k) + v_k \approx H_k x_k + v_k \tag{6.34}$$

where $H_k = \frac{\partial h(x)}{\partial x^T}\big|_{x = \hat{x}_{k|k-1}}$ and the known constant term $h(\hat{x}_{k|k-1}) - H_k \hat{x}_{k|k-1}$ gets accounted for in the innovation computation.
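As a concrete example of such an $h(\cdot)$, consider range (e.g. time-of-arrival) measurements to known anchor positions; this choice is ours, for illustration. The rows of the Jacobian are then unit vectors pointing from the anchors to the mobile:

```python
import numpy as np

def range_jacobian(x_pred, anchors, n):
    """H_k = dh(x)/dx^T at x = x_{k|k-1} for h_j(x) = ||p - anchor_j||,
    with state x = [p; v] (white-noise acceleration model).
    anchors: (m, n) array of known anchor positions."""
    p = x_pred[:n]
    diffs = p - anchors                                   # (m, n)
    ranges = np.linalg.norm(diffs, axis=1, keepdims=True)
    H_pos = diffs / ranges                                # unit row vectors
    H_rest = np.zeros((anchors.shape[0], x_pred.size - n))
    return np.hstack([H_pos, H_rest])
```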

6.3 Fitting of State Space Mobility Models to M3