1.Introduction DirectmethodforYauﬁlteringsystemwithnonlinearobservations

(1)

http://dx.doi.org/./..

Direct method for Yau filtering system with nonlinear observations

Ji Shi, Zhiyu Yang and Stephen S. -T. Yau

Department of Mathematical Sciences, Tsinghua University, Beijing, China

ARTICLE HISTORY Received  June 

Accepted  January 

KEYWORDS

Duncan-Mortensen-Zakai (DMZ) equation; Kolmogorov equation; nonlinear drift;

nonlinear observation;

Gaussian approximation ABSTRACT

For all known finite dimensional filters, one always assumes that the observation terms be degree one polynomials. However, in practice, the observation terms may be nonlinear, e.g. tracking problems. In this paper, we consider the Yau filtering system (^∂_∂^{f j}_xi−_∂^∂_{x j}^fi=c_{i j}is constant for alli,j) with nonlinear obser- vation terms and arbitrary initial condition. The novelty of the paper lies in (i) the real time computation of the solution of the Duncan-Mortensen-Zakai (DMZ) equation is reduced to the computation of Kolmogorov equation. Based on Gaussian approximation of the initial condition, the Kolmogorov equation can be solved in terms of ordinary differential equations; (ii) For a given probability density function, we give a new and original approach to do Gaussian approximation which is very effective and simple. The direct method developed here can be easily implemented in a real time and memoryless way. Besides, we do not need the controllability and observability assumption. Compared to the extended Kalman filter, our method is much stable and has theoretical proof. The numerical experiments show that the proposed Gaussian approximation method is very effective and our method can track the states very well.

1. Introduction

In the early 1960s, Kalman and Bucy (1961) proposed the continuous version of Kalman filter which has been widely used in various fields of industry. However, Kalman filter is restricted to linear systems with Gaus- sian initial distribution. Actually, most of the filtering systems in practice is nonlinear with non-Gaussian initial distribution. The study of nonlinear filtering (NLF) is aimed at determining the conditional density ρ(t,x) of the statex(t) given the observation history {y(s): 0 s t}. In the late 1960s, Duncan (1967), Mortensen (1966), Zakai (1969) independently derived the so-called Duncan-Mortensen-Zakai (DMZ) equation for the NLF problem. Then ρ(t,x) can be obtained by normalising the solution σ(t, x) of the DMZ equation. Since the DMZ equation is a stochastic partial differential equation (PDE), there is no easy way to solve it.

Motivated by the Wei−Norman approach Wei and Norman (1964) of using Lie algebraic method to solve the linear time varying differential equation, Brockett and Clark (1980), Brockett (1981), and Mitter (1979) proposed the idea of using estimation algebra to con- struct finite dimensional nonlinear filter. However, in the Wei–Norman approach, one has to know explicitly the basis of the estimation algebra in order to reduce the DMZ equation to a finite system of ordinary differential

CONTACT Stephen S. -T. Yau [email protected]

equations (ODE)s, a Kolmogorov equation and first- order linear PDEs. In Chiou and Yau (1994), Tam, Wong, and Yau (1990), Yau (1994) and Yau and Hu (2005), all finite dimensional estimation algebra of maximal rank had been completely classified. Particularly, for a NLF system satisfying ^∂_∂x^f^j

i −_∂x^∂^fⁱ_j =c_{i j} is constant for all i, j is called the Yau filtering system in Chen (1994), which contains the Kalman filter and Benés (1981) filter as its special cases. However, one knows explicitly the basis of finite dimensional estimation algebra only for a few cases and one assumes that the linear system is controllable and observable.

In fact, for all known finite dimensional filters, one always needs the condition that the observation terms h_i(x), 1 i m are degree one polynomials. How- ever, the observation terms may be nonlinear in many situation. Yau and Hu (2001, 2005) first purposed the direct method to solve the DMZ equation. Later Yau and Lai (2003) gave the solution of Kolmogorov equation under certain conditions with Gaussian initial distribution in terms of ODEs. In Yau and Yau (2004a,2004b), the Yau filtering system with linear observation and linear filtering system with nonlinear observation terms are considered, respectively. The direct method has sev- eral advantages. First, the method is easy to implement, and the derivation no longer needs controllability and

(2)

observability. Second, compared to the wildly used extended Kalman filter (EKF), the direct method is much stable and has theoretic convergence proof. Moreover, the necessity of integratingn first-order linear PDEs in the estimation algebra method is eliminated. Recently, in Luo and Yau (2013a,2013b), a real-time algorithm for a general class of NLF problems was developed. Their method directly computes the Kolmogorov equation in advance and uses hermite orthogonal polynomials to approximate the initial condition at every time step.

In this paper, compared to the NLF systems considered in Yau and Yau (2004a,2004b), we consider a more general situation, i.e. both the drift and the observations can be nonlinear . It is well known that any non-Gaussian density function can be well approximated by finite linear combination of Gaussian distributions and the most wildly used technique is expectation maximisation (EM) algorithm Dempster (1997). However, typically the EM algorithm uses a set of sample points to determine the Gaussian mixture parameters, which is not the case in this paper. A new and original way to do Gaussian approximation is proposed in this paper which is very effective as verified by the numerical experiments. Besides, the method is very simple and fast. Based on our Gaussian approximation algorithm, the conditional density func- tionσ(t,x) is explicitly given via solutions of ODEs.

The paper is organised as follows. In Section2, we shall recall the basic filtering problem and some preliminary results. In Section3, we first reduce the solution of the robust DMZ equation to the solution of Kolmogorov equation, followed by the Gaussian approximation algorithm. Numerical experiments are given in Section 4.

Finally, the conclusions are presented in Section5.

2. Basic concepts and preliminary results

In this paper, we consider the following signal observation model

dx(t)= f(x(t))dt+g(x(t))dv(t), x(0)=x₀, dy(t)=h(x(t))dt+dw(t) y(0)=0,

(1)

in whichx(t),v(t),y(t) andw(t) are, respectively,Rⁿ,R^p, R^mandR^mvalued processes andv(t) andw(t) have com- ponents that are independent, standard Brownian processes. We further assume thatn = p, f andh are C smooth vector-valued functions, and thatgis an orthogonal matrix function.

Letσ(t,x) denote the unnormalised conditional prob- ability density function of the state given the observation

{y(s): 0st}, which satisfies the following DMZ equation:

dσ (t,x)=L₀σ (t,x)dt+_i=1ⁿ L_iσ (t,x)dyi(t),

σ (0,x)=σ0, (2)

where L₀= 1

2 n

i=1

∂²

∂x²_i − n

i=1

f_i ∂

∂xi − n

i=1

∂f_i

∂xi −1 2

m i=1

h²_i (3) and for i = 1, … , m,L_i is the zero degree differential operator of multiplication by h_i. σ0 is the probability density of the initial value x₀. In real application, we are interested in considering robust state estimator from observed sample paths with some properties of robust- ness. In Davis (1980), Davis considered this problem and proposed some robust algorithms. In our case, his basic idea reduced to define a new unnormalised density

u(t,x)=exp

− m

i=1

h_i(x)y_i(t)

σ (t,x), (4)

thenu(t,x) satisfies the following equation:

⎧⎨

⎩

∂u∂t(t,x)=L0u(t,x)+ ^m_i=1yi(t)[L0,Li]u(t,x) +¹₂ ^m_i,_j=1y_i(t)yj(t)[[L0,L_i],L_j]u(t,x), u(0,x)= σ0(x),

(5)

where [·,·] is the Lie bracket. Equation (5) is called robust DMZ equation and we can rewrite it in the following equivalent form

⎧⎪

⎪⎪

⎨

⎪⎪

⎪⎩

∂u∂t(t,x)= ¹₂ ⁿ_i=1^∂_∂x²^u2

i(t,x)+ ⁿ_i=1{−f_i(x)

+ ^m_j=1y_j(t)^∂h_∂x^j_i(x)}_∂x^∂u_i(t,x)

−{ ^m_i=1 ⁿ_j=1y_i(t)f_j(x)_∂x^∂hⁱ_j(x)

−¹₂ ^m_i=1 ⁿ_j=1y_i(t)^∂_∂x²^h2ⁱ j(x)

−¹₂ ⁿ_k=1 ^m_i,_j=1yi(t)yj(t)_∂x^∂hⁱ

k(x)^∂h_∂x^j

k(x)

+ ⁿ_i=1 ^∂_∂x^fⁱ

i(x)+¹₂ ^m_i=1h²_i(x)}u(t,x), u(0,x)= σ0(x),

(6)

In 1990, Yau (1990) first studied the filtering system (1) with the following conditions:

C₁) ∂f_j

∂x_i − ∂f_i

∂x_j =c_{i j},1≤i,j≤n,

wherec_ijis constant. The filtering system with condition C₁)is called the Yau filtering system in Chen (1994). Yau filtering systems include the Kalman–Bucy filtering systems and Benés filtering systems as special cases.

(3)

Theorem 2.1 (Theorem 1, Yau (1994)): The condition (C₁)holds if and only if

(f1, . . . ,fn)=(l1, . . . ,ln)+ ∂F

∂x₁, . . . , ∂F

∂x_n

, where l₁, … , l_nare polynomials of degree one and F is a C function.

Define η(x)=

n i=1

f_i²(x)+ n

i=1

∂fi

∂xi(x)+ m

i=1

h²_i(x). (7)

In this paper, we consider system (1) with the following three conditions:

(C₁) f_i(x)=l_i(x)+_∂x^∂F_i(x), 1≤i≤n;

(C₂) ^m_i=1h²_i(x)= ⁿ_i,j=1q_{i j}x_ix_j+ ⁿ_i=1q_ix_i+q₀; (C₃) η(x)= ⁿ_i,_j=1ηi jx_ix_j+ ⁿ_i=1ηix_i+η0;

where l_i(x)= ⁿ_j=1d_{i j}x_j+d_iandd_{i j},d_i,q_{i j} =q_ji,q_i, q0, ηi j, ηi, η0,1≤i,j≤n are constants. We remark that nonlinear observationh_i’s are allowed in this paper and the Kalman–Bucy filtering systems satisfies (C₃) and so does in Benés (1981).

3. Solution of the robust DMZ equation

In this section, we first reduce the robust DMZ equation to a Kolmogorov equation. Then the Kolmogorov equation can be solved in terms of ODEs based on an original Gaussian approximation method.

Let P_N = {0=τ0 < τ1 < τ2 < ···< τN =T} be a partition of [0,T]. For each time intervalτk−1 tτk, letu_k(t,x) be the solution of the following PDE (8), which is obtained from (6) by freezing the observation termy(t) toy(τk−1)

⎧⎪

⎪⎪

⎨

⎪⎪

⎪⎩

∂uk

∂t (t,x)= ¹₂ ⁿ_i=1 ^∂_∂x²^u2^k

i (t,x)+ ⁿ_i=1{−fi(x)

+ ^m_j=1y_j(τk−1)^∂h_∂x^j_i(x)}^∂u_∂x^k_i(t,x)

−{ ^m_i=1 ⁿ_j=1y_i(τk−1)f_j(x)_∂x^∂hⁱ_j(x)

−¹₂ ^m_i=1 ⁿ_j=1y_i(τk−1)^∂_∂x²^h2ⁱ

j(x)

−¹₂ ⁿ_k=1 ^m_i,j=1yi(τk−1)yj(τk−1)

∂hi

∂xk(x)^∂h_∂x_k^j(x)

+ ⁿ_i=1^∂_∂x^fⁱ_i(x)+¹₂ ^m_i=1h²_i(x)}u_k(t,x), u_k(τk−1,x)=u_k−1(τk−1,x).

(8) Define the norm of the partition P_N by |P_N| = sup_1≤i≤N(τi−τi−1), it has been proved in Yau and Yau (2000, 2005) that in both pointwise sense and

L²-sense

u(τ,x)= lim

|PN|→0u_i(τ,x). (9) 3.1 Reduction of the robust DMZ equation to

Kolmogorov equation

Our reduction of the robust DMZ equation to Kol- mogorov equation is based on the following important proposition.

Proposition 3.1 (Proposition3.1, Yau and Yau(2004b)):

For eachτk−1 tτk,1kn,u˜k(t,x)satisfies the following parabolic equation:

∂u˜_k

∂t (t,x)= 1

2u˜_k(t,x)− n

i=1

fi(x)∂u˜_k

∂x_i(t,x)

− _n

i=1

∂f_i

∂xi(x)+ 1 2

m i=1

h²_i

˜

u_k(t,x)(10)

forτk−1tτk, if and only if

u_k(t,x)=exp

− m

i=1

y_i(τk−1)h_i(x)

˜

u_k(t,x) (11)

satisfies(8).

The initial condition for (10) onτk−1tτkis

˜

u_k(τk−1,x)

=

⎧⎨

⎩

σ0(x)exp(− ^m_j=1y_j(0)h_j(x))=σ0(x),k=1 exp{ ^m_j=1(y_j(τk−1)−y_j(τk−2))h_j(x)}

˜

u_k−1(τk−1,x), k≥2

(12) From (4), (9) and (11), we have

σ (t,x)= lim

|Pk|→0u˜_k(τk,x). (13) Hence, to compute the unnormalised densityσ(t,x), we only need to find the solutionu˜_k(t,x)of the Kolmogorov Equation (10).

Lemma 3.1: For each k,τk−1<t< τk,(10)is equivalent to the following equation

∂u˜_k

∂t (t,x)= 1

2u˜_k(t,x)+ n

i=1

θi(x)∂u˜_k

∂x_i(t,x) +θ(x)u˜_k(t,x), (14) where θi(x)= −f_i(x) and θ(x)= ¹₂( ⁿ_i=1θ_i²(x)+

ni=1 ∂θi

∂xi(x)−η(x)).

(4)

Proof: Recall from (7) that

η(x)= n

i=1

f_i²(x)+ n

i=1

∂f_i

∂x_i(x)+ m

i=1

h²_i(x).

Defineθi(x)= −f_i(x), the coefficient ofu˜_k(t,x)in (10) is given by

− _n

i=1

∂fi

∂xi(x)+ 1 2

m i=1

h²_i

= n

i=1

∂θi

∂xi(x)− 1

2(η(x)− n

i=1

f_i²(x)− n

i=1

∂fi

∂xi(x))

= 1 2

n i=1

θ_i²(x)−1 2η(x)+

n i=1

∂θi

∂xi(x)+ 1 2

n i=1

∂fi

∂xi(x)

= 1 2

_n

i=1

θ_i²(x)+ n

i=1

∂θi

∂x_i(x)−η(x)

=:θ(x).

In order to solve the Kolmogorov Equation (14) in terms of ODEs, we first introduce a new transformation in the following theorem.

Theorem 3.1: For each k,τk−1tτk, supposeu˜_k(t,x) is a solution of(14)and let

ˆ

u_k(t,x)=e^(x)u˜_k(t,x), (15)

thenuˆ_k(t,x)is the solution of the following Kolmogorov equation

⎧⎨

⎩

∂uˆk

∂t (t,x) = ¹₂uˆk(t,x)− ⁿ_i=1Hi(x)^∂_∂x^u^ˆ^k_i(t,x)

−P(x)uˆ_k(t,x) ˆ

u_k(τk−1,x)=e^(x)· ˜u_k(τk−1,x)

(16)

if we can choose H_i(x), P(x)and(x)such that the follow- ing equations:

−∂

∂x_i(x)+H_i(x)+θi(x)≡0, 1≤i≤n, (17)

−1

2η(x)+ 1 2

n i=1

H_i²(x)− 1 2

n i=1

∂Hi

∂xi(x)+P(x)≡0 (18)

hold.

Proof: Differentiatinguˆ_k(t,x)with respect totandx, we have the following equations:

∂uˆk(t,x)

∂t =e^(x)∂u˜k(t,x)

∂t , (19)

∂uˆ_k(t,x)

∂x_i =e^(x)

∂(x)

∂x_i · ˜u_k+ ∂u˜_k(t,x)

∂x_i

, (20)

∂²uˆ_k(t,x)

∂x_i² =e^(x) ∂²(x)

∂x²_i +

∂(x)

∂xi

2

˜ u_k(t,x)

+2∂(x)

∂xi ·∂u˜k(t,x)

∂xi + ∂²u˜k(t,x)

∂x²_i

. (21)

Putting (19)–(21) into (16), we have

∂uˆ_k(t,x)

∂t

=e^(x)· 1

2u˜_k(t,x)+

n i=1

∂(x)

∂x_i −Hi

·∂u˜_k(t,x)

∂xi

+ 1

2(x)+1 2

n i=1

∂(x)

∂x_i ₂

− n

i=1

Hi·∂(x)

∂x_i −P(x)

· ˜uk(t,x)

(22)

Using (14), (19) and (22), we have n

i=1

θi(x)∂u˜_k

∂x_i(t,x)+θ(x)u˜_k(t,x)

= n

i=1

∂(x)

∂xi

−H_i

· ∂u˜_k(t,x)

∂x_i +

1

2(x)+1 2

n i=1

∂(x)

∂x_i ₂

− n

i=1

H_i·∂(x)

∂x_i −P(x)

· ˜u_k(t,x). (23)

Observing the coefficients of ^∂^u^˜^k_∂x^(t,x)

i and u˜_k(t,x), we have

θi(x)= ∂(x)

∂x_i −H_i, θ(x)= 1

2(x)+1 2

n i=1

∂(x)

∂x_i 2

− n

i=1

H_i· ∂(x)

∂x_i −P(x).

(5)

Recall that θ(x)= ¹₂( ⁿ_i=1θ_i²(x)+ ⁿ_i=1 ^∂θ_∂xⁱ

i(x)−

η(x)), then (17) and (18) follows.

Noting the special structure of the driftf, i.e. condition (C₁), we have a special choice of(x) in (15).

Theorem 3.2: Consider the NLF system(1)with condi- tions(C₁)−(C3). Then for each k,τk−1tτk, the solu- tionu˜_k(t,x)for(10)is reduced to the solutionuˆ_k(t,x)for the following Kolmogorov equation

⎧⎨

⎩

∂uˆk

∂t (t,x) = ¹₂uˆ_k(t,x)− ⁿ_i=1H_i(x)^∂_∂x^u^ˆ^k_i(t,x)

−P(x)uˆ_k(t,x) ˆ

u_k(τk−1,x)=e^G(x)−F(x)u˜_k(τk−1,x)

(24)

where

ˆ

u_k(t,x)=e^G(x)−F(x)u˜k(t,x), (25) if we can choose H(x), G(x)and P(x)such that

1 2

n i=1

H_i²(x)− 1 2

n i=1

∂H_i

∂x_i(x)−1

2η(x)+P(x)≡0 (26) where

H_i(x)− ∂G

∂x_i(x)=l_i(x). (27) Proof: Here we useTheorem 3.1with(x)=G(x)−F(x), whereF(x) is the one inTheorem 2.1. Then

∂

∂x_i(x)= ∂G

∂x_i(x)− ∂F

∂x_i(x). (28) Putting (28) into (17) and note thatθi(x)= −f_i(x), we haveHi(x)−^∂G_∂x_i(x)=li(x).

In the following corollary, we chooseH(x),G(x),P(x) which satisfy the conditions inTheorem 3.2such that the coefficients ofuˆ_k(t,x)and^∂_∂x^u^ˆ^k

i(t,x)are degree one polynomial and degree two polynomial, respectively. By this way, we can solve the transformed Kolmogorov Equation (24) in terms of ODEs if the initial condition is Gaussian.

Corollary 3.1: Choose G(x) 0, P(x)= ¹₂η(x)−

1 2

ni=1l_i²(x)+ ¹₂ ⁿ_i=1 _∂x^∂lⁱ_i(x)and H_i(x)=l_i(x),1i n, then(26)and(27)hold in this case. For each k,τk−1t τk, the corresponding Kolmogorov Equation(24)is given

by

∂uˆ_k

∂t (t,x)= 1

2uˆ_k(t,x)− n

i=1

l_i(x)∂uˆ_k

∂x_i(t,x) +1

2( n

i=1

l_i²(x)− n

i=1

∂l_i

∂x_i(x)

−η(x))uˆ_k(t,x) (29)

with ˆ

u_k(τk−1,x)

=

⎧⎨

⎩

e^−F(x)σ0(x), k=1, exp{ ^m_j=1(y_j(τk−1)−y_j(τk−2))h_j(x)}·

ˆ

u_k−1(τk−1,x), k≥2.

(30) Then forτk−1tτk,

˜

u_k(t,x)=e^F(x)uˆ_k(t,x). (31) Proof: Recall from (12) that

˜

u_k(τk−1,x)

=

⎧⎨

⎩

σ0(x), k=1,

exp{ ^m_j=1(y_j(τk−1)−y_j(τk−2))h_j(x)}

˜

u_k−1(τk−1,x), k≥2.

Then, fork=1,uˆ₁(0,x)=e^−F(x)σ0(x). Fork2, ˆ

u_k(τk−1,x)=e^−F(x)u˜_k(τk−1,x)

=e^−F(x)exp m

j=1

(y_j(τk−1)−y_j(τk−2))h_j(x)

˜

u_k−1(τk−1,x)

=exp m

j=1

(yj(τk−1)−y_j(τk−2))hj(x)

ˆ

u_k−1(τk−1,x)

Supposeuˆ_k(τk−1,x)is well approximated by a sum of finite number of Gaussian distributions, it follows that a well approximated solution of (29) is obtained by linear combination of solutions of (29) with Gaussian initial condition since (29) is a linear PDE. The following theorem give the solution of (29) with Gaussian initial distribution in terms of ODEs .

Theorem 3.3 (Theorem 3.2, Yau & Lai,2003):Consider the following Kolmogorov equation with Gaussian initial

(6)

condition

⎧⎪

⎪⎪

⎨

⎪⎪

⎪⎩

∂uˆ

∂t(t,x)= ¹₂uˆ− ⁿ_i=1li(x)_∂x^∂^u^ˆ_i(t,x) +¹₂ _n

i=1l_i²(x)− ⁿ_i=1^∂l_∂xⁱ^(x)

−η(x) i

u(t,ˆ x) ˆ

u(t₀,x) =e^x^T^A(t⁰^)x+B^T^(t⁰^)x+C(t⁰⁾

(32)

where A(t0) = (Aij(t0))is an n × n symmetric matrix, B^T(t₀)=(B₁(t₀), … , B_n(t₀)), x^T=(x₁, … , x_n)are row vec- tors and C(t0)is a scalar, t00.

Let

q(x)= 1 2

_n

i=1

l_i²(x)− n

i=1

∂li

∂xi

(x)−η(x)

=x^TQx+p^Tx+r (33) where l_i(x)= ⁿ_j=1d_{i j}x_j+d_i, Q=(q_ij)is an n×n sym- metric matrix, p^T=(p₁, … , p_n)is a row vector and r is a scalar. Then the solution of(32)is of the following form

ˆ

u(t,x)=e^x^T^Ax+B^T^x+C (34) where A(t) =(Aij(t))is an n×n symmetric matrix val- ued function of t, B^T(t)=(B₁(t), … , B_n(t))is a row vec- tor valued function of t, and C(t)is a scalar function of t.

Moreover, A(t), B(t)and C(t)satisfy the following system of nonlinear ODEs:

dA(t)

dt =2A²(t)−[A(t)D+D^TA(t)]+Q, (35) dB^T(t)

dt =2B^T(t)A(t)−B^T(t)D−2d^TA(t)+p^T, (36) dC(t)

dt =trA(t)+1

2B^TB(t)−d^TB(t)+r, (37) where D=(d_ij)is a n×n matrix and d^T=(d₁, … , d_n)is a1×n matrix.

3.2 Algorithms

In this section, we first give a new way to do Gaussian approximation in Algorithm 1. Then the computation of

˜

u_k(t,x)is summarised as an application in Algorithm 2.

The idea of our Gaussian approximation is the following: given a probability densityφ(x), we fit it with Gaus- sian distributions using the peaks ofφ(x) as the mean.

The procedure is repeated until the peaks ofφ(x)−g(x) is no larger than some threshold E, whereg(x) is the sum of Gaussians from previous fitting steps. In Section 4.1, the numerical experiments show that our Gaussian approximation method works very well.

Algorithm 1Gaussian approximation

1: Let f(x)=φ(x)and the threshold E=αmaxφ(x), whereαis a given small number.

2: Fitting the peaks of f(x) which are larger than E with Gaussian distributions. Specifically, for a peak P_i(x_i,y_i)of f(x)withy_i≥E, we use the functiong_i(x)= yiexp(−^(x−x_2σ2ⁱ⁾²

i )to fitPi(xi,yi)with points in a neighbor- hood ofP_i(x_i,y_i)where no other peaks exists, and the best fitting parameterσiis obtained by fitting. Suppose the sum of Gaussian distributionsg_i(x)in this step isg(x). 3: Let f1(x)= f(x)−g(x). If f1(x)has no peaks whose values larger than E, then go to step 4. Otherwise, let

f(x)= f₁(x)and go to step 2.

4: Let f₂(x)= −f₁(x). If f₂(x)has no peaks which are larger thanE, then done. Otherwise, let f(x)= f₂(x)and go to step 2.

Using the above Gaussian approximation procedure, we can decompose (30) into a finite number of Gaussian distributions. ByTheorem 3.3, the Kolmogorov equation (29) with Gaussian initial condition is solved in terms of ODEs. The algorithm to compute u˜_k(t,x) is list in Algorithm 2.

Algorithm 2Computeu˜_k(t,x)

1: Choose the total computing time T,tand the param- eterαin Algorithm 1. LetN = _t^T, and partition the time interval [0,T] by{0=τ0< τ1< τ2<· · ·< τN=T}.

2:fork=1 :Ndo

3: Using Algorithm 1, supposeuˆ_k(τk−1,x)is decomposed into ^N(k)_i=1 c_k,iG(μk,i, σk,i).

4: For each Gaussian distributionG(μk,i, σk,i), suppose the solution of (29) with initial conditionG(μk,i, σk,i)is

ˆ

u_k,i(τk,x). Solving (35)–(37), we obtainuˆ_k,i(τk,x). Then ˆ

uk(τk,x)= ^N(k)_i=1 c_k,iuˆ_k,i(τk,x).

5: From (31), we haveu˜_k(τk,x)=e^−F(x)uˆ_k(τk,x).

6: By (30), we obtainuˆ_k+1(τk,x). 7:end for

4. Numerical experiments

In this section, we first use two examples to show the effectiveness of the above Gaussian approximation algorithm, then two concrete NLF models are considered to verify the effectiveness of the direct method. The computing platform we used have Intel(R) Xeon(R) CPU E5- 2670 v3 @ 2.30GHz. We use the mean-squared error (MSE) as the performance metric. The average MSE over 100 times simulations is provided.

(7)

-10 -8 -6 -4 -2 0 2 4 6 8 10 x

-0.05 0 0.05 0.1 0.15 0.2 0.25 0.3 0.35

σ1(x) gaussappro error

-10 -8 -6 -4 -2 0 2 4 6 8 10

x -0.2

0 0.2 0.4 0.6 0.8 1 1.2

σ2(x) gaussappro error

(a) (b)

Figure .Gaussian approximation of a given probability density function, (a) Gaussian approximation ofσ_(x), (b) Gaussian approximation ofσ(x).

4.1 Gaussian approximation

We approximate a given probability density function by

Ni=1c_ie

−(x−μi)2 2σ2

i by using Algorithm 1. Letα=0.01 and we consider the following two examples.

Example 4.1: Let the probability density function be σ1(x)= _71.2186¹ e^−x^sin^x−¹²^x^cos^x−x²^+3x+2 . The performance of our Gaussian approximation algorithm is given in Figure 1(a). The MSE is 2.5680 × 10⁻⁷. The average elapsed time is 0.2971 seconds. The Gaussian distributions we used is given inTable 1.

Example 4.2: Let the probability density function be Rayleigh distribution: σ2(x)= _b^x2e⁻^2b^x²² with b = 0.585.

The performance of our Gaussian approximation algorithm is given inFigure 1(b). The MSE is 3.2128×10⁻⁶. The average elapsed time is 0.3110 seconds. The Gaussian distributions we used is given inTable 2.

From the above examples, we can see that the proposed Gaussian approximation algorithm is very effective and fast.

4.2 Direct method

In this section, the performance of the direct method and EKF is compared. We consider the following NLF system which satisfies condition (C₁)–(C₃)

⎧⎨

⎩

dx_t = f(xt)dwt

dy₁(t)=x_tsin(x_t)dt+dv1(t) dy2(t)=xtcos(xt)dt+dv2(t)

(38)

Here w(t), v1(t),v2(t) are scalar independent standard Brownian motions. The initial distribution is taken asσ0(x)= _71.2186¹ e^−x^sin^x−¹²^x^cos^x−x²^+3x+2 . The initial values for EKF are xˆ₀ andP₀. In Gaussian approximation algorithm, we choose the parameterα=0.01. The total simulation time isTand the time step ist.

Example 4.3: In this example, we take the drift f(x)= tanh (x) in (38).

(1) WithT=30 seconds,t=0.1 seconds andxˆ0= 1. For P₀ = 1, the performance of our method and EKF is given inFigure 2(a). One can see that the EKF completely fails at aboutt=13 seconds.

The average MSE for 100 experiments by direct method and EKF are given inTable 3.

Table .Gaussian distributions used to approximateσ(x).

i          

c_i . . . −. −. −. −. −. . .

μ_i . . . −. . . −. . . .

σ_i . . . . . . . . . .

(8)

Table .Gaussian distributions used to approximateσ(x).

i        

c_i . . −. −. −. . . −.

μ_i . .  . −. −. . 

σ_i . . . . . . . .

Table .Average MSE of direct method and EKF.

P MSE-direct method^a Time-direct metehod^b(s) MSE-EKF^c

 . . –

 . . –

 . . –

aThe average MSE of direct method for  experiments.

bThe average elapsed time of direct method for  experiments.

cThe average MSE of EKF for  experiments.

Table .Performance of direct method with diﬀerentt.

t . . .

MSE . . .

(2) In this case we takeT = 10 seconds andt = 0.01 seconds. For xˆ₀=1 and P₀ = 5, the performance of the direct method and EKF is given inFigure 2(b). The average MSE for 100 experiments of the direct method and EKF is 0.1826 and 6.1364, respectively.

(3) WithT=10 seconds and different time step, the corresponding MSE of our method is given in Table 4, from which we can see that with smaller time step, the higher the estimation precision is.

0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2

time(s) 0

2 4 6 8 10 12 14 16 18 20

state

estimate-direct method

Figure .State estimation by direct method withT=,t=.

seconds.

Example 4.4: In this example, we takeT =2 seconds, t=0.001 seconds. The drift f(x)=x+1+^dF_dx, where

F(x)= _x

−∞{e^−(z−¹²⁾² ^z

−∞e^−(y−¹²⁾²dy−3/2}dz.

For xˆ₀ =1,P₀=3, the EKF just blow up, the performance of our method is given inFigure 3. The MSE of our method is 0.2016.

0 5 10 15 20 25 30

time(s) 0

5 10 15 20 25 30 35 40

state

estimate-direct method estimate-EKF

0 1 2 3 4 5 6 7 8 9 10

time(s) 0

2 4 6 8 10 12 14 16 18 20

state

estimate-direct method estimate-EKF

(a) (b)

Figure .State estimation by direct method and EKF, (a)T=,t=. seconds, (b)T=,t=. seconds.

(9)

From the above two examples, we can see that our method can track the state very well and it can be implemented in a real time manner. Besides, compared to EKF, our method is much more stable.

5. Conclusion

In this paper, we consider a general class of NLF sytems, called Yau filtering system with arbitrary initial condition.

The observations can also be nonlinear which occurs in many practical examples. We show that the solution of the robust DMZ equation is reduced to the solution of Kol- mogorov equation. Based on Gaussian approximation, the Kolmogorov equation can be solved in terms of ODEs.

On the one hand, the direct method can be implemented in real time and memoryless way. On the other hand, it is easy to implement and do not need the controllability and observability assumption. Compared to EKF, it is much stable and has theoretical convergence proof. Besides, we give a new approach to decompose a given probability density function into a finite number of Gaussian distributions. The proposed Gaussian approximation method is very simple and fast. The numerical experiments show that our Gaussian approximation method is very effective and the direct method can track the states very well.

Disclosure statement

No potential conflict of interest was reported by the authors.

Funding

This work was supported by the National Natural Science Foun- dation of China [grant number 11471184] and the start-up fund from Tsinghua University.

References

Benés, V. (1981). Exact finite dimensional filters for certain diffusions with nonlinear drift.Stochastics, 5, 65–92. doi:

10.1080/17442508108833174.

Brockett, R.W. (1981). Nonlinear systems and nonlinear estimation theory. InM. Hazewinkel J.S. Willems (Eds.)The mathematics of filering and identification and applications (pp. 441–477). Berlin, Germany: Springer Netherlands.

Brockett, R.W., & Clark J.M.C,. (1980). The geometry of the conditional density equation. In R.W. Brokett J.M.C. Clark (Eds.)International Conference on Analysis and optimisa- tion of stochastic systems(pp. 299–309). New York, NY: Aca- demic Press.

Chen, J. (1994, June). On ubiquity of Yau filters.Proceedings of the American Control Conference(pp. 252–254). Baltimore, MD: IEEE. doi: 10.1109/ACC.1994.751736.

Chiou, W.L., & Yau, S.S.T. (1994). Finite-dimensional filters with nonlinear drift II: Brockett’s problem on classification of finite-dimensional estimation algebras.SIAM Journal of Control Optimization, 32, 297–310.

Davis, M.H.A. (1980). On a multiplicative functional transformation arising in nonlinear filtering theory. Z. Wahrsch.

Verw Gebiete, 54, 125–139.

Dempster, A.P., Laird, N.M., & Rubin, D.B. (1977). Maxi- mum likelihood from incomplete data via the EM algorithm.Journal of the Royal Statistical Society, Series B39(1), 1–38.

Duncan, T.E. (1967).Probability densities for diffusion processes with applications to nonlinear filtering system(Ph D dissertation). Stanford University.

Kalman, R.E., & Bucy, R.S. (1961). New results in linear filtering and prediction theory.Transactions of the ASME-Journal of Basic Engineering, 83, 95–108.

Luo, X., & Yau, S.S.T. (2013a). Hermite spectral method to 1-D forward Kolmogorov equation and its application to nonliner filtering problems.IEEE Transaction on Automatic Control, 58(10), 2495–2507.

Luo, X., & Yau, S.S.T. (2013b). Complete real time solution of the general nonlinear filtering problem without memory.IEEE Transaction on Automatic Control, 58(10), 2563–

2578.

Mortensen, R.E. (1966). Optimal control of continuous time stochastic systems(Ph D dissertation). University of Cali- fornia, Berkeley.

Mitter, S.K. (1979). On the analogy between mathematical problems of nonlinear filtering and quantum physics.

Ricerche Automat, 10, 163–216.

Tam, L.F., Wong, W.S., & Yau, S.S.T. (1990). On a necessary and sufficient condition for finite dimensionality of estimation algebras.SIAM Journal on Control and Optimization, 28(1), 173–185.

Wei, J., & Norman, E. (1964). On the global representation of the solutions of linear differential equations as a prod- uct of exponentials.Proceedings of the American Mathemat- ical Society, 15, 327–334. doi: 10.1090/S0002-9939-1964- 0160009-0.

Yau, S.S.T. (1990, December). Recent results on linear filtering:

New class of finite dimensional filter.Proceeding of the 29th Conference Decision and Control(pp. 231–233). Honolulu, HI: IEEE. doi: 10.1109/CDC.1990.203587.

Yau, S.S.T. (1994). Finite-dimensional filters with nonlinear drift I: A class of filters including both Kalman-Bucy filters and Benes filters.Journal of Mathematical Systems, Estima- tion, and Control, 4, 181–203.

Yau, S.S.T., & Hu, G.Q. (2001). Finite-dimensional filters with nonlinear drift X: Explicit solution of DMZ equation. IEEE Transactions on Automatic Control, 46, 142–

148.

Yau, S.S.T., & Hu, G.Q. (2005). Classification of finite- dimensional estimation algebras of maximal rank with arbitrary state-space dimension and mitter conjecture.Interna- tional Journal of Control, 78, 689–705.

Yau, S.S.T., & Lai, Y.T. (2003). Explicit solution of DMZ equation in nonlinear filtering via solution of ODE.IEEE Trans- actions on Automatic Control, 48, 505–508.

Yau, S.S.T., & Yau, S.T. (1994, December). New direct method for Kalman-Bucy filtering system with arbitrary initial condition.Proceeding of the 33rd Conference on Decision and Control(pp. 1221–1225). Lake Buena Vista, FL. doi:

10.1109/CDC.1994.411164.

Yau, S.S.T., & Yau, S.T. (1997). Finite dimensional filters with nonlinear drift III: Duncan-Mortensen-Zakai

(10)

equation with arbitrary initial condition for linear filtering system and the Benés filtering system.IEEE Transactions on Aerospace and Electronic Systems, 33, 1277–1294.

Yau, S.T., & Yau, S.S.T. (2000). Real time solution of nonlinear filtering problem withour memory.Mathematics Research Letters, 7, 671–693.

Yau, S.T., & Yau, S.S.T. (2004a). Nonlinear filtering and time varying Schrödinger eqution I.IEEE Transactions on Aerospace and Electronic Systems, 40, 284–292.

Yau, S.S.T., & Yau, S.T. (2004b, December). Linear filtering with nonlinear observations. In43rd IEEE Conference on Deci- sion and Control(pp. 2112–2117). Atlantis, Paradise island.

doi: 10.1109/CDC.2004.1430360.

Yau, S.S.T., & Yau, S.T. (2005). Solution of filtering problems with nonlinear observations.SIAM Journal on Control Opti- mization, 44, 1019–1039.

Zakai, M., & Wahrsch, Z. (1969). On the optimal filtering of diffusion processes.Gujarat Electricity Board, 11, 230–243.