BOUNDS FOR THE NONLINEAR FILTERING PROBLEM
BY
VICTOR FRANKLIN KLEBANOFF A.B., UNIVERSITY OF CALIFORNIA,
BERKELEY, 1972
Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
May, 1976

Signature of Author: Department of Mathematics, May 7, 1976
Certified by: Thesis Supervisor
Accepted by: Chairman, Departmental Committee
ABSTRACT
Title of Thesis: Bounds for the Nonlinear Filtering Problem Author: Victor Franklin Klebanoff
Submitted to the Department of Mathematics on May 7,1976, in partial fulfillment of the requirements for the degree of Master of Science.
Known bounds for the optimal filtering error of the general nonlinear filtering problem are discussed. The example of the phase-tracking problem is used as an
illustration. For the special case of the phase-tracking problem, new arguments are used to derive an upper bound for the optimal filter error.
Thesis Supervisor: Alan S. Willsky
ACKNOWLEDGEMENTS
It is a pleasure to acknowledge the help of Professor Alan Willsky, who supervised this thesis, as well as to thank him for his support and friendship.
In addition, I wish to thank my friends in the
Experimental Studies Group for the delightful opportunity to teach in the program for the past two years.
Thanks are also due to Jiri Dadok, Emden Gansner, and Alfred Magnus for many interesting and helpful discussions.
CONTENTS

1. Background and Orientation
   1.1 The nonlinear filtering problem
   1.2 A specific class of nonlinear filtering problems
2. The Phase-tracking Problem: Introduction
3. Bounds on Estimator Error: General Results Applied to the Phase-tracking Problem
4. Bounds on an Error Criterion for the Optimal Filter for the Phase-tracking Problem
   4.1 Study of the error criterion E(1 - cos(θ-θ̂))
   4.2 A stochastic differential equation for |c_n(t)|²
   4.3 Asymptotic behavior of the optimal filter error
   4.4 Bounds for the phase-tracking problem
   4.5 A conjecture and generalizations; a partial result; a counterexample
   4.6 Strengthening the bounds
   4.7 Comparison of results
5. Conclusions
References
Figures 1-7
1. Background and Orientation
1.1 The nonlinear filtering problem
The general nonlinear filtering problem has the following form: there is a signal process x_t which satisfies the Ito stochastic differential equation

dx_t = A(t,x) dt + B(t,x) dw_t

and a noisy observation process

dy_t = C(t,x) dt + D(t,x) dv_t

where x ∈ R^n, y ∈ R^m, and w_t and v_t are independent standard Brownian motion processes. The goal is to obtain an estimate of x_t based on the observations y_s, for 0 ≤ s ≤ t. The least squares estimate of x_t is the F_t-measurable random variable x̂_t which minimizes the expectation

E( [x_t - x̂_t]^T [x_t - x̂_t] ),

where F_t is the σ-field generated by the observations y_s, 0 ≤ s ≤ t. It is well known that

x̂_t = E( x_t | F_t ).
Moreover, x_t has the conditional probability density function p(x,t), which satisfies the Kushner-Stratonovich equation (a stochastic integro-differential equation). For results concerning the solutions of this equation, the reader may consult, for example, the works of Fujisaki, Kallianpur, and Kunita [3] and Levieux [7].
Problems associated with the Kushner-Stratonovich equation are (1) the impossibility of obtaining closed-form solutions in the general case and (2) the excessive computing time required for the application of numerical methods (despite the proof by Levieux [7] that, under certain conditions, the solutions are sufficiently regular that numerical methods may be used).
By recalling the structure of the Kalman filter, we note another difficulty of the nonlinear filtering problem. For the linear problem, the optimal estimate is the solution of a linear stochastic differential equation

dx̂_t = A x̂_t dt + B dν_t,

where the matrices A, B, and the Brownian motion process ν_t depend only on the signal and observation processes. Furthermore, the error covariance, that is, the matrix-valued function

Σ(t) = E( [x_t - x̂_t][x_t - x̂_t]^T ),
solves a deterministic Riccati equation. This contrasts with the general case, for which a similar stochastic differential equation for the conditional expectation

x̂_t = ∫_{R^n} x p(x,t) dx

cannot be given. Rather, on writing the stochastic differential equation for x̂_t, it happens that the right-hand side involves second order conditional moments, which are nondeterministic. Similarly, the stochastic differential equation for an nth order conditional moment will have, on the right-hand side, terms involving (n+1)st order conditional moments.
The difficulties inherent in the nonlinear filtering problem suggest using filters which are suboptimal but finite dimensional. This makes it important to know the accuracy of such filters. For the nonlinear situation discussed above, where a least squares estimate is desired, the error

E( [x_t - x̃_t]^T [x_t - x̃_t] )

for a suboptimal estimate x̃_t trivially satisfies

E( [x_t - x̃_t]^T [x_t - x̃_t] ) ≥ E( [x_t - x̂_t]^T [x_t - x̂_t] ).

We would naturally like to know more.

To attack this problem, it is necessary, in principle, to compute

E( [x_t - x̂_t]^T [x_t - x̂_t] ).

This is, at best, very difficult. An alternative approach, considered by several authors, notably Bobrovsky, Zakai, and Ziv (working as collaborators) [1, 14] and Gilman and Rhodes [4, 5], has been to bound this number analytically. We shall show later, for a specific elementary but highly nonlinear problem, that these bounds are not necessarily useful.
Simulations have been used in attempts to measure the estimator error (see Bucy [2] for results concerning the phase-lock loop, for example). Of course, in the absence of
any analysis, it cannot be known how precise these measurements are.
1.2 A specific class of nonlinear filtering problems

As has been mentioned above, we are interested in nonlinear systems. A subclass of these which has received attention, from workers including Willsky and Marcus [8, 13], is that of bilinear systems with linear observations. These are systems of the form

dx_t = A x_t dt + B x_t dw_t,
dy_t = C x_t dt + dv_t.
For several specific problems, the structure of optimal estimators has been analyzed and various approximations, which may serve as suboptimal filters, have been proposed.
The reader may consult, for example, Willsky [12] for a discussion of these issues.
In what follows, we will discuss our efforts to understand the performance of the optimal filter for the following phase-tracking problem, which is perhaps the simplest bilinear problem.
Example: A particle on the unit circle in R2 undergoes
Brownian motion, having begun at time t=0 at the point (1,0). The x and y coordinates of the particle are observed during
the passage of time, but the observations are not precise. Rather, they consist of the true coordinates, corrupted by additive noise, which is taken to be white noise. The Ito stochastic differential equations describing this situation are
dθ = Q^{1/2} dv_t
dx = (cos θ) dt + R^{1/2} dw_1
dy = (sin θ) dt + R^{1/2} dw_2

where v, w_1, and w_2 are independent standard Brownian motion processes and we impose the initial conditions θ(0)=0, x(0)=1, and y(0)=0.
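The three Ito equations above lend themselves to direct simulation. The following Euler-Maruyama sketch is not part of the thesis; the function name and step counts are illustrative choices.

```python
import math, random

def simulate_phase_tracking(Q=1.0, R=2.0, T=5.0, n_steps=5000, seed=0):
    """Euler-Maruyama simulation of the phase-tracking SDEs:
    dtheta = Q^{1/2} dv,  dx = cos(theta) dt + R^{1/2} dw1,
    dy = sin(theta) dt + R^{1/2} dw2, with theta(0)=0, x(0)=1, y(0)=0."""
    rng = random.Random(seed)
    dt = T / n_steps
    sdt = math.sqrt(dt)
    theta, x, y = 0.0, 1.0, 0.0
    path = [(0.0, theta, x, y)]
    for k in range(n_steps):
        # independent Brownian increments for v, w1, w2
        dv = sdt * rng.gauss(0.0, 1.0)
        dw1 = sdt * rng.gauss(0.0, 1.0)
        dw2 = sdt * rng.gauss(0.0, 1.0)
        # observations use the current (non-anticipating) phase
        x += math.cos(theta) * dt + math.sqrt(R) * dw1
        y += math.sin(theta) * dt + math.sqrt(R) * dw2
        theta += math.sqrt(Q) * dv
        path.append(((k + 1) * dt, theta, x, y))
    return path
```

Such sample paths are what any candidate filter, optimal or suboptimal, must process.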
The filtering problem consists of finding a random variable θ̂(t), which is an estimate of θ(t), based on the observations x(s) and y(s), for 0 ≤ s ≤ t. Technically speaking, we require θ̂(t) to be F_t-measurable, where F_t is the σ-field generated by the random variables x(s) and y(s), for 0 ≤ s ≤ t. This estimate will typically be chosen to minimize some error criterion, for example the conditional expectation

E( |θ(t) - θ̂(t)|² | F_t ).

(Note that since θ(t) and θ̂(t) are points on the unit circle, we identify θ(t) and θ(t)+2π, and similarly for θ̂(t).)
Remark: This is a problem of the class mentioned in section 1.2, with a bilinear system and linear observations. It is, in fact, the same as the problem of obtaining estimates α̂_t and β̂_t for the system

d[α, β]^T = -(Q/2) [α, β]^T dt + Q^{1/2} J [α, β]^T dv_t,   J = [0 -1; 1 0],

with linear observations

dx = α dt + R^{1/2} dw_1
and
dy = β dt + R^{1/2} dw_2.

(Note: the system equations are verified upon setting α = cos θ, β = sin θ, and using the Ito differential rule to obtain the stochastic differential equations for α and β.)
2. The Phase-tracking Problem: Introduction
An optimal solution to the phase-tracking problem

We consider the Kushner-Stratonovich equation--the stochastic integro-differential equation for the conditional probability density of θ (conditioned on x_s and y_s, for 0 ≤ s ≤ t)--for the phase-tracking problem of the example of section 1.2. It is

(KS)  dp_t = (Q/2)(∂²p/∂θ²) dt + p [ (sin θ - ŝ)/R^{1/2} ] dν_1 + p [ (cos θ - ĉ)/R^{1/2} ] dν_2,

where

dν_1 = [ dy - ŝ dt ] / R^{1/2},   dν_2 = [ dx - ĉ dt ] / R^{1/2},

ŝ and ĉ denote the conditional expectations of sin θ and cos θ, and, for any random variable u, û = E( u | F_t ). In (KS), p(θ,t) is the conditional probability density function of θ(t). We will study solutions of (KS) by using Fourier series. We expand

p(θ,t) = Σ_{n=-∞}^{∞} c_n(t) e^{inθ}.
The c_n are F_t-measurable random variables. Upon substituting the series expansion into (KS), we obtain the infinite coupled system of stochastic ordinary differential equations

dc_n = -(Qn²/2) c_n dt + (1/R^{1/2}) [ g_1 dν_1 + g_2 dν_2 ],

where

g_1 = (c_{n-1} - c_{n+1})/(2i) + c_n Im(c_1),
g_2 = (c_{n-1} + c_{n+1})/2 - c_n Re(c_1).

These equations suggest certain suboptimal estimators; we obtain, for example, an approximate filter upon ignoring all but a finite number of these equations (that is, setting c_n = 0 for |n| sufficiently large).
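To make the truncation concrete, the sketch below performs Euler-Maruyama updates of the truncated coefficient system. It is not from the thesis: the function name, the bookkeeping, and in particular the identifications ĉ = Re c_1 and ŝ = -Im c_1 are assumptions tied to the reconstruction of (KS) above.

```python
import math

def truncated_filter_step(c, dx, dy, dt, Q, R):
    """One Euler-Maruyama step of the truncated Fourier-coefficient system.
    `c` maps n -> c_n(t) for |n| <= N; coefficients outside the truncation
    are treated as zero.  `dx`, `dy` are the observation increments."""
    c1 = complex(c.get(1, 0.0))
    # Assumed identifications of the conditional means with c_1:
    c_hat, s_hat = c1.real, -c1.imag
    # Innovations increments dnu_1, dnu_2 driven by the observations
    dnu1 = (dy - s_hat * dt) / math.sqrt(R)
    dnu2 = (dx - c_hat * dt) / math.sqrt(R)
    new_c = {}
    for n in c:
        cm = complex(c.get(n - 1, 0.0))
        cp = complex(c.get(n + 1, 0.0))
        g1 = (cm - cp) / 2j + c[n] * c1.imag
        g2 = (cm + cp) / 2.0 - c[n] * c1.real
        new_c[n] = c[n] - (Q * n * n / 2.0) * c[n] * dt \
                   + (g1 * dnu1 + g2 * dnu2) / math.sqrt(R)
    return new_c
```

Starting from the point mass at θ = 0 (all retained c_n equal to 1) and feeding in simulated observation increments gives an approximate evolution of the retained coefficients.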
3. Bounds on Estimator Error: General Results Applied to the Phase-tracking Problem
Survey of available results
In [14], Zakai and Ziv consider the problem of bounding the mean square error of the optimal estimator for a process of the form

dx_j = x_{j+1} dt,   j = 1, 2, ..., n-1,
dx_n = m(x(t)) dt + B(x(t)) dw_t,

with the real-valued observation

dy_t = g(x_k(t)) dt + R^{1/2} dv_t.

For the case when x(t) is Gaussian, they obtain the lower bound

E( [x_j - x̂_j]² ) ≥ E( σ_j(t)² ) exp[ -(1/R) ∫_0^t E( σ(s)² ) ds ],

where

σ(t)² = E_{x(0)}{ [ g(x_k(t),t) - E_{x(0)} g(x_k(t),t) ]² }

and σ_j(t)² is the conditional (upon x(0)) variance of x_j.

Example: if n=1, m(x)=0, B(x)=Q^{1/2}, and g(x) = sin x, we have the phase-tracking problem, with only one observation. For large t, σ(t)² is approximately equal to 1/2, since the probability density function of θ tends to a uniform distribution as t → ∞. It follows that the lower bound is (asymptotically) Q t exp[ -t/(2R) ], evidently a useless result.
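The shape of such a bound is easy to examine numerically. In the sketch below (illustrative only), the decay constant c stands in for the time constant of the exponential; any bound of the form Q t e^{-t/c} peaks at t = c and then decays to zero.

```python
import math

def growing_then_decaying_bound(t, Q, c):
    """A lower bound of the form Q*t*exp(-t/c): linear growth is
    eventually killed by the exponential, so the bound is vacuous
    for large t."""
    return Q * t * math.exp(-t / c)
```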
Moreover, the upper bound derived in the same paper is of no help, in this case, as it is valid only for the case
g(x)=x, j=k=n.
A different type of result is due to Gilman and Rhodes [4, 5]. They consider the system

dx = f(x,t) dt + D(t) dv,
dz = g(x,t) dt + G(t) dw,

where v and w are independent standard Brownian motions and the functions f and g satisfy the so-called "cone-boundedness condition"

| f(x+δ,t) - f(x,t) - A(t)δ | ≤ a(t)|δ|,
| g(x+δ,t) - g(x,t) - C(t)δ | ≤ c(t)|δ|.

The functions A(t) and C(t) parameterize an associated nominal linear system

dx = A(t)x dt + D(t) dv,
dz = C(t)x dt + G(t) dw.

The first result obtained by Gilman and Rhodes is an upper bound on the error covariance of a suboptimal filter. Their upper bound is

E( [x(t) - x̃(t)][x(t) - x̃(t)]^T ) ≤ P(t),

where P(t) solves the differential equation

dP/dt = V + KWK^T + (A-KC)P + P(A-KC)^T + (a+c)P + (aI + cKK^T) tr P.

Here W = GG^T and V = DD^T are assumed to be positive definite. Moreover, K may be chosen to minimize P(t). The second result, enunciated in [5], is the lower bound

E( [x(t) - x̂(t)][x(t) - x̂(t)]^T ) ≥ P̄(t)[ 1 - r(t) ],

where P̄(t) is the error covariance of the Kalman filter for the associated nominal linear system. That is, it solves

dP̄/dt = V + AP̄ + P̄A^T - P̄C^T W^{-1} C P̄,

and r(t) ≥ 0 solves (1-r)e^r = e^{-d}, where

d(t) = ∫_0^t [ a(s)² |V^{-1}(s)| + c(s)² |W^{-1}(s)| ] tr P̄(s) ds.
Example: For the phase-tracking problem,

f(x,t) = 0,   D = Q^{1/2},   g(x,t) = [cos x; sin x],   G = R^{1/2} I.

Hence a = 0 and c = 1, and the nominal linear system has A = 0 and C = 0.

Note: the fact that c = 1 follows from the fact that

| cos(x+δ) - cos x |² + | sin(x+δ) - sin x |² = 2[ 1 - cos δ ] ≤ δ².

For the upper bound, we have, letting K = [k_1 k_2],

dP/dt = Q + RKK^T + P.

This has the solution

P(t) = P(0)e^t + (Q + RKK^T)(e^t - 1).

But K may be chosen to minimize P(t). This is done by K = 0, so that

P(t) = P(0)e^t + Q(e^t - 1).

In particular, the upper bound tends to infinity as t → ∞.

In addition, the lower bound which Gilman and Rhodes derive is not useful in this case:

dP̄/dt = Q,   so   P̄(t) = Qt + P̄(0),

and

d(t) = (1/R) ∫_0^t P̄(s) ds ~ Qt²/(2R),   as t → ∞.

Hence

1 - r(t) ~ exp( -1 - Qt²/(2R) ),

and the lower bound is asymptotically

Qt exp[ -1 - Qt²/(2R) ],

which is not useful for large values of t.
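The divergence of the upper bound can be confirmed numerically; the sketch below (illustrative, not from the thesis) integrates the K = 0 equation dP/dt = Q + P by Euler's method and compares it with the closed form.

```python
import math

def upper_bound_euler(t, P0, Q, dt=1e-4):
    """Euler integration of dP/dt = Q + P, the K = 0 case above."""
    P = P0
    for _ in range(int(round(t / dt))):
        P += (Q + P) * dt
    return P

def upper_bound_closed(t, P0, Q):
    """Closed form P(t) = P0*e^t + Q*(e^t - 1); grows without bound."""
    return P0 * math.exp(t) + Q * (math.exp(t) - 1.0)
```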
Another general result, due to Zakai and Bobrovsky [1], shows that the estimation error for a given nonlinear problem is bounded below by the estimation error for an associated linear filtering problem. For the special case of estimating a Gaussian process from nonlinear measurements, their results coincide with those of Snyder and Rhodes [9].
Snyder and Rhodes consider the system

dx = F(t)x dt + V^{1/2} dv,
y = H(t)x,

where y_t is the Gaussian process to be estimated from the observation process

dz = h(t,y) dt + W^{1/2} dw.

The result is that the optimal filter error satisfies

E( [y(t) - ŷ(t)][y(t) - ŷ(t)]^T ) ≥ H(t) Σ(t) H(t)^T,

where Σ(t) solves the Riccati equation

dΣ/dt = FΣ + ΣF^T + V - ΣH^T P̄ H Σ,

and P̄(t) = E( J^T W^{-1} J ), J = the Jacobian of h(t,y) (with respect to y).

Example: We illustrate the Snyder and Rhodes result by applying it to the phase-tracking problem. We have F = 0, V = Q, H = 1, h(t,y) = sin(ωt + y), and W = R. Hence J = cos(ωt + y), so that

P̄(t) = (1/R) E( cos²(ωt + y) );

hence, as t → ∞,

P̄(t) → 1/(2R).

Letting Σ_∞ denote the limit of Σ as t → ∞, Σ_∞ solves the algebraic Riccati equation

0 = Q - Σ_∞²/(2R),

whence the error for the optimal nonlinear filter for the phase-tracking problem is bounded below by √(2QR).
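As a check on the algebra, one can integrate the scalar Riccati equation dΣ/dt = Q - Σ²/(2R) and watch the solution approach √(2QR). The sketch below is illustrative only.

```python
def riccati_limit(Q, R, dt=1e-3, T=200.0):
    """Integrate dS/dt = Q - S^2/(2R) from S(0) = 0 by Euler's method;
    S(T) should be near the equilibrium sqrt(2*Q*R) of the algebraic
    Riccati equation 0 = Q - S^2/(2R)."""
    S = 0.0
    for _ in range(int(T / dt)):
        S += (Q - S * S / (2.0 * R)) * dt
    return S
```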
For completeness, solely in connection with the phase-tracking problem, we discuss the classical phase-lock loop. As discussed in Van Trees [10], this system may be used to attack the problem of estimating a signal θ from a noisy observation

s(t) = sin(ωt + θ) + dw/dt,

where θ and w are independent Brownian motion processes with covariances Qt and Rt, respectively. Note that θ_t is restricted to the unit circle, so θ and θ+2π are identified.
The structure of the phase-lock loop is shown in Figure 1. The voltage controlled oscillator (VCO) is a device whose output, given input dθ̂/dt, is cos(ωt + θ̂(t)).

In [10, 11, 12], it is shown that the system of Figure 2 may be used to model the phase-lock loop. Heuristically, this is because the signal fed into the low-pass filter is

[ sin(ωt + θ) + dw/dt ] cos(ωt + θ̂)
   = (1/2) sin(2ωt + θ + θ̂) + (1/2) sin(θ - θ̂) + [ cos(ωt + θ̂) ] dw/dt.

The low-pass filter removes the double-frequency (that is, the 2ωt) terms, and the last term is modelled as a white noise process dw̃/dt, where w̃ has covariance Rt/2. It follows that the system of Figure 2 models the phase-lock loop, provided that the noise process in Figure 2 has strength 2R.
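The heuristic rests on a product-to-sum identity; the following check (illustrative, not from the thesis) confirms the identity numerically and isolates the baseband term (1/2) sin(θ - θ̂) kept by the low-pass filter.

```python
import math, random

def baseband_term(theta, theta_hat):
    """The low-frequency part of sin(wt+theta)*cos(wt+theta_hat)
    after the 2wt term is removed: (1/2)*sin(theta - theta_hat)."""
    return 0.5 * math.sin(theta - theta_hat)

# verify sin(wt+a)cos(wt+b) = (1/2)sin(2wt+a+b) + (1/2)sin(a-b) at random points
random.seed(1)
for _ in range(1000):
    wt = random.uniform(-10.0, 10.0)
    th = random.uniform(-10.0, 10.0)
    thh = random.uniform(-10.0, 10.0)
    lhs = math.sin(wt + th) * math.cos(wt + thh)
    rhs = 0.5 * math.sin(2.0 * wt + th + thh) + baseband_term(th, thh)
    assert abs(lhs - rhs) < 1e-12
```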
We now discuss the mean square error for the phase-lock loop. We observe that if the operator "sine" is replaced by the identity, and if the gain function F(s) equals the Kalman gain for the problem of estimating θ given the observation process z_t (where dz_t = θ dt + dw), then the modified system is merely the Kalman filter for this problem. This value of F(s) is used in the phase-lock loop.
Remark: A constant is the simplest choice of transfer function; with a gain of the form F(s) = a + (b/s), the filter is called a second order loop.
It is possible to evaluate the asymptotic error covariance, that is,

lim_{t→∞} E( {θ(t) - θ̂(t)}² ),

for the phase-lock loop, where, for an angle φ,

{φ} = min_{k ∈ Z} | φ + 2kπ |.

This is discussed in [10] and illustrated in Figure 3. In the graph of Figure 3, the straight line is the asymptotic error for the Kalman filter for the linear problem obtained by replacing the sine operator with the identity, as above. The curve is the asymptotic error for the phase-lock loop.
[Figure 1: structure of the phase-lock loop (VCO, low-pass filter, gain F(s)). Figure 2: equivalent baseband model of the loop. Figure 3: asymptotic error of the phase-lock loop (curve) and of the Kalman filter for the linearized problem (straight line).]
4. Bounds on an Error Criterion for the Optimal Filter for the Phase-tracking Problem.
We discuss, in this section, our efforts to find bounds for the error criterion

(1)  E( 1 - cos(θ - θ̂) )

for the phase-tracking problem introduced in section 1. Section 4 is organized as follows:
(a) We show that (1) depends only on the Fourier coefficient c_1(t) of the series expansion of the conditional probability density p(θ,t), which is the solution of the Kushner-Stratonovich equation for the phase-tracking problem (see section 2).

(b) We derive stochastic differential equations for |c_n(t)|², for all n.

(c) We study the asymptotic behavior of E(|c_n(t)|²) and of related random variables.

(d) We analyze the (algebraic) equations for the limits

I(|c_n|²) = lim_{T→∞} (1/T) ∫_0^T E( |c_n(t)|² ) dt.

(e) We state a conjecture and a generalization, which might strengthen these bounds. Unfortunately, the conjecture and generalization together are generally false; we give a counterexample.

(f) We give correct arguments, motivated by the erroneous conjecture, to strengthen the bounds obtained by the analysis described in (d).

(g) We compare these results with those obtained by other workers using different methods for a related problem.
4.1 Study of the error criterion E( 1 - cos(θ - θ̂) )

In this subsection, we suppose that the random variable θ, with values on the unit circle, has the probability density function p(θ). The angle a is to be chosen to minimize

(2)  E( 1 - cos(θ - a) );

we show that the minimal value of (2) is

1 - |c_1|,

where p(θ) has the Fourier expansion

p(θ) = Σ_{n=-∞}^{∞} c_n e^{inθ}.

Thus, if we seek, for the phase-tracking problem, a random variable θ̂ which

(a) is F_t-measurable, and
(b) minimizes the conditional expectation

(3)  E( 1 - cos(θ - θ̂) | F_t ),

then it suffices to work with the conditional (with respect to F_t) probability density function of θ (the Fourier coefficients are F_t-measurable random variables). Therefore, the optimal choice of θ̂ will give

E( 1 - cos(θ - θ̂) ) = 1 - E( |c_1| ).

To verify that the minimal value of (2) is 1 - |c_1|, write

1 - cos(θ - a) = 1 - (1/2)[ e^{i(θ-a)} + e^{-i(θ-a)} ].

Taking expectations,

E( 1 - cos(θ - a) ) = 1 - (1/2)[ c_{-1} e^{-ia} + c_1 e^{ia} ].

But p(θ) is real-valued, so c_{-1} = c̄_1. Hence (2) equals 1 - Re( c_1 e^{ia} ), which is minimized when a is chosen so that

Re( c_1 e^{ia} ) = |c_1|,

as was to be shown.
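A discretized numerical check of this minimization (illustrative; the density is an arbitrary positive trigonometric polynomial, and c_1 is computed as E(e^{-iθ}), the normalization implicit in the argument above):

```python
import cmath, math

M = 4000
thetas = [2.0 * math.pi * k / M for k in range(M)]
weights = [1.0 + 0.5 * math.cos(t) + 0.2 * math.sin(2.0 * t) for t in thetas]
total = sum(weights)
probs = [w / total for w in weights]   # a probability mass function on the circle

c1 = sum(p * cmath.exp(-1j * t) for p, t in zip(probs, thetas))

def expected_error(a):
    """E( 1 - cos(theta - a) ) under the discretized density."""
    return sum(p * (1.0 - math.cos(t - a)) for p, t in zip(probs, thetas))

# minimize over a fine grid of angles a; the minimum should be 1 - |c1|
best = min(expected_error(2.0 * math.pi * k / 720) for k in range(720))
```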
4.2 A stochastic differential equation for |c_n(t)|²

In this subsection, we will show that the following stochastic differential equation for |c_n(t)|² holds:

d|c_n|² = [ -Qn² |c_n|² + (1/R)( |g_1|² + |g_2|² ) ] dt + white noise,

where

g_1 = (c_{n-1} - c_{n+1})/(2i) + c_n Im(c_1),
g_2 = (c_{n-1} + c_{n+1})/2 - c_n Re(c_1).

Justification: from section 2, we have

dc_n = -(Qn²/2) c_n dt + (1/R^{1/2}) [ g_1 dν_1 + g_2 dν_2 ],

where ν_1 and ν_2 were defined in section 2. Hence, taking complex conjugates,

dc̄_n = -(Qn²/2) c̄_n dt + (1/R^{1/2}) [ ḡ_1 dν_1 + ḡ_2 dν_2 ].

By the Ito calculus (ν_1 and ν_2 being independent standard Brownian motions), we have

d|c_n|² = c̄_n dc_n + c_n dc̄_n + (1/R)( g_1 ḡ_1 + g_2 ḡ_2 ) dt,

and the desired equation follows.
We now simplify the equation. The term |g_1|² + |g_2|² is of the form

(4)  | (A - C)/(2i) + BI |² + | (A + C)/2 - BH |².

But, for complex z and ζ,

| z + ζ |² = |z|² + |ζ|² + 2 Re( z ζ̄ );

hence (4) equals

( |A|² + |C|² )/2 + |B|²( H² + I² ) - H Re( A B̄ ) + I Im( A B̄ ) - H Re( C B̄ ) - I Im( C B̄ ).

On setting A = c_{n-1}, B = c_n, C = c_{n+1}, H = Re(c_1), I = Im(c_1), and substituting into the stochastic differential equation for |c_n|², we obtain

(5)  d|c_n|² = [ -Qn² |c_n|² + (1/R)( ( |c_{n-1}|² + |c_{n+1}|² )/2 + |c_1|² |c_n|² - Re( c_1 c_{n-1} c̄_n + c_1 c_n c̄_{n+1} ) ) ] dt + white noise.
4.3 Asymptotic behavior of the optimal filter error

In this section, we obtain the result

(6)  QRn² I(|c_n|²) = (1/2)[ I(|c_{n-1}|²) + I(|c_{n+1}|²) ] + I( |c_1|² |c_n|² ) - Re( I( c_1 c_{n-1} c̄_n + c_1 c_n c̄_{n+1} ) ),

where, for a polynomial q in the Fourier coefficients c_n of the conditional probability density function p(θ,t), the limit

I( q ) = lim_{T→∞} (1/T) ∫_0^T E( q(t) ) dt

exists by a theorem of Kunita.

It is clear that (6) follows from (5) upon taking expectations, integrating with respect to t from 0 to T, dividing by T, and taking limits, provided that the limits exist. In the remainder of this section we justify taking the limits.
To explain Kunita's result, we require an abstract formulation of the filtering problem. We motivate the formulation with the example of the phase-tracking problem. The state space of the signal process is S¹, the circle. The conditional probability density function p(·,t) may be viewed as a stochastic process taking values (for each time t and each point ω in the underlying probability space Ω) in C(S¹), the space of continuous, real-valued functions on S¹. Moreover, for each ω and each t, p(·,t) is a probability density function, so it may be viewed as a probability measure on S¹. This is the desired abstract formulation: the filtering process is a stochastic process taking values in M(S¹), the space of probability measures on S¹. We use Π_t to denote the filtering process, so, for our example, Π_t = p(·,t), where we suppress reference to Ω.

We now discuss the abstract problem. The signal process x_t has state space S, a compact, separable Hausdorff space. Let M(S) be the set of all probability measures on S, endowed with the weak* topology (as the dual space of C(S)), under which it is a compact, separable Hausdorff space. The filtering process Π_t is an M(S)-valued stochastic process with transition probabilities H_t(ν,·). Kunita's results [6] (Theorems 3.1 and 3.3) give conditions under which there exists a unique invariant measure for the set of transition probabilities H_t(ν,·). Recall that an invariant measure for the H_t(ν,·) is a probability distribution μ on M(S) satisfying

μ(Γ) = ∫_{M(S)} H_t(ν,Γ) μ(dν)   for all t and all Borel sets Γ.
(Intuitively: the probability that, at time t, the value of the filter process is in Γ equals the probability that it began in Γ at time zero.)
The phase-tracking problem satisfies the hypotheses of Kunita's theorems. Thus, there exists a unique invariant measure μ for the filter process, given by

μ(·) = lim_{T→∞} (1/T) ∫_0^T H_t(ν,·) dt.

For the phase-tracking problem, S is the circle S¹. Suppose that f: M(S¹) → R is continuous. (As an example of such an f, take any polynomial in the Fourier coefficients of the measure ρ, for ρ ∈ M(S¹); in particular, if the measure has the density function p(θ) = Σ_{n=-∞}^{∞} c_n e^{inθ}, let f be any polynomial in the c_n.)

For such a function f, we have, by definition,

E( f(Π_t) | Π_0 = ν ) = ∫_{M(S¹)} f(α) H_t(ν,dα),

so that

I(f) = lim_{T→∞} (1/T) ∫_0^T E( f(Π_t) | Π_0 = ν ) dt
     = lim_{T→∞} (1/T) ∫_0^T [ ∫_{M(S¹)} f(α) H_t(ν,dα) ] dt
     = ∫_{M(S¹)} f(α) μ(dα),

the limit defining μ being taken in the weak* sense.

Remark: This shows that the limit I(f) exists for all functions f of interest for the phase-tracking problem. Moreover, it shows that I is a bona fide integral, namely integration against the invariant measure μ.
4.4 Bounds for the phase-tracking problem

We begin this section by deriving, from equation (6) for n=1, several inequalities among the numbers I(|c_1|²), I(|c_2|²), and I(|c_1|⁴). We conclude the section by deducing, from these inequalities, a lower bound for I(|c_1|). This will give an upper bound for

lim_{T→∞} (1/T) ∫_0^T [ 1 - E( cos(θ(t) - θ̂(t)) ) ] dt

for the phase-tracking problem.

Equation (6) for the case n=1 is

(7)  (1+QR) I(|c_1|²) = (1/2)[ 1 + I(|c_2|²) ] + I(|c_1|⁴) - I( Re(c_1² c̄_2) ).

We now obtain lower bounds on I(|c_1|²) and I(|c_1|⁴), and deduce lower bounds for I(|c_1|).

We remark that, using the Holder inequality and the fact that |c_n| ≤ 1 for all n,

I( Re(c_1² c̄_2) ) ≤ I( |c_1|² |c_2| ) ≤ I( |c_1|⁴ )^{1/2} I( |c_2|² )^{1/2},

so that, from (7), we obtain

(8)  (1+QR) I(|c_1|²) ≥ (1/2)[ 1 + I(|c_2|²) ] + I(|c_1|⁴) - I(|c_1|⁴)^{1/2} I(|c_2|²)^{1/2}.
33
so that, from (8), we have
(9) (1+QR) I(Ic114) l>[l+I(1c21 2)] + I(Ic4
- Ic1
j1Tv
21Note that (9) is an inequality of the form ax2+bx+c < 0, with a >0, where x= I(1c.14)
Hence, we conclude that x e [r1 r9, where r and r2 are
the roots of the quadratic equation. In this way, we deduce that
(10) I(cO >w(X) -) 4 l[ (X+l+QR) - JQR+2QR( XY)--(-X) 1 2
4 where, henceforth, X - I(jc2f2).
The following fact will be useful later: (11) X _> 1 + QR - AQR(2+QR)
We derive it from (9), which, after setting x =
Ic11j),
is
(l+QR)x . 1+2 + x - Ax
2
whence
0 < (A-x) 2 < -[ 1 + x2- 2(l+QR)x ] = P(x)
Figure 4 illustrates the situation. Equation (11) is merely the observation that
A
is greater than the smaller root of P(x). Also, (8) may be rewritten, in terms of A, as34
In the remainder of this section, we will use some of the inequalities obtained above to deduce a lower bound for I(|c_1|). The analysis takes the following course. For each value of λ, we show that the set of possible values of the pair ( I(|c_1|⁴), I(|c_1|²) ) is a restricted region of R². From this, we deduce, assuming a particular value of λ, a lower bound for I(|c_1|). Then, by minimizing this lower bound over all λ ∈ [0,1], we obtain a lower bound for I(|c_1|).

More precisely, for each fixed value of λ, (12) gives a lower bound for I(|c_1|²) as a function of I(|c_1|⁴). In addition, we know

(13)  I(|c_1|⁴) ≤ I(|c_1|²) ≤ I(|c_1|),

because |c_1| ≤ 1, and

(14)  I(|c_1|²) ≥ [ I(|c_1|) ]²,   by the Holder inequality.

These facts are illustrated in Figure 5, in which values of I(|c_1|⁴) are on the x-axis and values of I(|c_1|²) are on the y-axis. The shaded region represents the set of values of the pair ( I(|c_1|⁴), I(|c_1|²) ) allowed by (10), (12), (13), and (14). The numbers z(λ) and w(λ) are defined in the figure. The value of w(λ) was given in (10); the value of z(λ), determined by the intersection of the curve y = F(x) with the line y = x, is

z(λ) = [ (1+λ) / ( ( λ(1+2QR) + 2QR )^{1/2} + √λ ) ]².

The minimum value of z(λ) is attained for λ = 1/(1+2QR), and equals 1/(1+2QR).
Because we do not know the true value of I(|c_1|⁴), we analyze, for fixed λ, the possibilities

(a)  w(λ) ≤ I(|c_1|⁴) ≤ z(λ);
(b)  z(λ) ≤ I(|c_1|⁴).

If (b) holds, we apply (13) and (b) to deduce that

(15)  I(|c_1|) ≥ 1/(1+2QR).

If (a) holds, we use a convexity argument to lower bound I(|c_1|). (The minimum of these two lower bounds is then a lower bound for the given λ.) Finally, by minimizing over all λ, we obtain the lower bound (18).
The convexity argument for case (a) is the well-known fact that, given a function f, the function F(r) defined by

F(r) = log[ ∫ |f(x)|^r dx ]

is convex. Hence

(1/3)[ F(4) - F(1) ] ≥ F(2) - F(1).

It follows from this, by substituting c_1 for f and using the integral I in place of the integral above, that

(16)  [ I(|c_1|²) ]^{3/2} ≤ [ I(|c_1|) ] [ I(|c_1|⁴) ]^{1/2}.

Setting x = I(|c_1|⁴), y = I(|c_1|²), and v = (1+λ)/2, we see that our goal is to lower bound the function y^{3/2}/x^{1/2} over that part of the shaded region of Figure 5 for which x ≤ z(λ); this will give, for each value of λ, a bound for I(|c_1|), by (16). Letting

G(x,λ) = [ v + x - (λx)^{1/2} ]^{3/2} / [ (1+QR)^{3/2} x^{1/2} ],

(16) and (12) give

I(|c_1|) ≥ y^{3/2}/x^{1/2} ≥ G(x,λ).

We give a lower bound for the function G(x,λ) by first minimizing G with respect to x, for fixed λ, and then minimizing with respect to λ.
In fact, G(x,λ) is minimized with respect to x, for fixed λ, at the point

x(λ) = (1/64)[ √λ + ( 16 + 17λ )^{1/2} ]².

It follows that G(x(λ),λ) is minimized, over all λ, when λ = 1/2, at which point x(1/2) = 1/2. We conclude, therefore, that if w(λ) ≤ x ≤ z(λ), then

(17)  I(|c_1|) ≥ (3√3)/(4√2) · (1+QR)^{-3/2}.

Now, (15) holds if x ≥ z(λ) and (17) holds if w(λ) ≤ x ≤ z(λ). We conclude that, whatever the true values of x and λ,

(18)  I(|c_1|) ≥ min{ 1/(1+2QR),  (3√3)/(4√2) · (1+QR)^{-3/2} }.
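The moment inequalities (13), (14), and (16) hold for time averages of any [0,1]-valued quantity; the spot-check below (illustrative, with sample means standing in for I(·)) exercises all three.

```python
import random

# Random samples in [0,1] stand in for |c_1(t)|; sample means stand in for I(.).
random.seed(7)
for _ in range(100):
    samples = [random.random() for _ in range(500)]
    I1 = sum(samples) / len(samples)                  # I(|c_1|)
    I2 = sum(s * s for s in samples) / len(samples)   # I(|c_1|^2)
    I4 = sum(s ** 4 for s in samples) / len(samples)  # I(|c_1|^4)
    assert I4 <= I2 <= I1                 # (13): s^4 <= s^2 <= s on [0,1]
    assert I2 >= I1 * I1                  # (14): Cauchy-Schwarz
    assert I2 ** 1.5 <= I1 * I4 ** 0.5 + 1e-12   # (16): log-convexity of r -> log E|s|^r
```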
Remark: In section 4.6, we will improve this lower bound by deriving additional restrictions on the limits I(·).

[Figure 4: graph of the parabola P(s), showing that √λ exceeds its smaller root. Figure 5: the region of pairs (I(|c_1|⁴), I(|c_1|²)) permitted by (10), (12), (13), and (14), with w(λ) and z(λ) marked on the x-axis.]
4.5 A conjecture and generalizations; a partial result; a counterexample

In this section, we do three things. We begin with the conjecture that Re( E(c_1² c̄_2) ) ≥ 0 and suggest the generalized conjecture that

Re( I( c_1 c_n c̄_{n+1} ) ) ≥ 0,   for all n = 1, 2, 3, ....

Then, we support the conjecture by proving that

Re( c_1² c̄_2 ) ≥ 2|c_1|⁴ - |c_1|² ≥ -1/8.

We conclude this section with a counterexample to the generalized conjecture.
We conjecture that Re( E(c_1² c̄_2) ) ≥ 0. This has the following intuitive interpretation. The random variable c_2 may be used to obtain an angle estimate θ̃ which maximizes E( Re(e^{2i(θ-θ̃)}) | F_t ). (This is analogous to θ̂, which maximizes E( Re(e^{i(θ-θ̂)}) | F_t ).) We observe that

c_1² c̄_2 = [ e^{-iθ̂} E( e^{-i(θ-θ̂)} | F_t ) ]² conj[ e^{-2iθ̃} E( e^{-2i(θ-θ̃)} | F_t ) ]
         = e^{2i(θ̃-θ̂)} |c_1|² P,   for P = |c_2| ≥ 0,

by the assumption that θ̂ and θ̃ maximize the conditional expectations. These equations suggest that Re( E(c_1² c̄_2) ) measures the difference of the two estimates θ̂ and θ̃: if both estimates are good, then e^{2i(θ̃-θ̂)} ≈ 1, whence Re( E(c_1² c̄_2) ) ≥ 0.

The generalization of this conjecture is

Re( I( c_1 c_n c̄_{n+1} ) ) ≥ 0,   for all n.
A similar intuitive argument suggests that this would reflect the accuracy of estimates based on the higher order Fourier coefficients. We will show, however, that the generalized conjecture cannot be true if QR=2.
Although the generalized conjecture is generally false, and we cannot prove the conjecture, we now prove that

(19)  Re( c_1² c̄_2 ) ≥ 2|c_1|⁴ - |c_1|².

In fact,

(1 - |c_1|)² = [ E( 1 - cos(θ-θ̂) | F_t ) ]² ≤ E( [1 - cos(θ-θ̂)]² | F_t )   (Holder).

But

(1 - cos φ)² = 3/2 + Re( (1/2) e^{-2iφ} - 2 e^{-iφ} ),

so that

E( [1 - cos(θ-θ̂)]² | F_t ) = 3/2 + Re[ (1/2) c_2 e^{2iθ̂} - 2 c_1 e^{iθ̂} ]
                            = 3/2 + Re( c̄_1² c_2 )/(2|c_1|²) - 2|c_1|,

because |c_1| = c_1 e^{iθ̂}. Expanding (1 - |c_1|)² and rearranging gives

Re( c_1² c̄_2 ) ≥ 2|c_1|⁴ - |c_1|².

But, for all real x, 2x² - x ≥ -1/8; the conclusion follows.
Counterexample: We now give a counterexample which shows that the generalized conjecture fails if QR = 2.

If Re( I( c_1 c_n c̄_{n+1} ) ) ≥ 0 for all n, then by (6),

QR I(|c_1|²) ≤ (1/2)[ 1 + I(|c_2|²) ]

and

QRn² I(|c_n|²) ≤ (1/2)[ I(|c_{n-1}|²) + I(|c_{n+1}|²) ] + I(|c_n|²),   for n > 1.

Letting x_n = I(|c_n|²), to simplify the formulas,

x_1 ≤ (1/(2QR)) [ 1 + x_2 ],
x_n ≤ (1/(2(QRn² - 1))) [ x_{n-1} + x_{n+1} ],   for n > 1.

We analyze this system of inequalities by considering the finite system

x_1 ≤ a_1 + b_1 x_2,
x_n ≤ a_n x_{n-1} + b_n x_{n+1},   n = 2, ..., N-1,
x_N ≤ a_N x_{N-1} + b_N,

subject to the condition that all a_n and b_n are non-negative. By induction, each x_n is bounded by a continued fraction in the a's and b's; for example,

x_1 ≤ g_1 + a_1 / ( 1 - (a_2 b_1)/(1 - (a_3 b_2)/(1 - (a_4 b_3)/(1 - a_5 b_4 ··· ))) ),

where the remainder term g_1 involves the product b_1 b_2 ··· b_N. For the phase-tracking problem,

a_1 = b_1 = 1/(2QR),
a_n = b_n = 1/(2(QRn² - 1)),   n > 1.

If QR = 2, then

a_1 = b_1 = 1/4,
a_n = b_n = 1/(2(2n² - 1)),   n > 1.

It follows that g_n ≈ 0 for all n, and that x_2 ≤ 0.135. But, by (11), x_2 = λ ≥ 3 - √8 > 0.17, a contradiction. This concludes the counterexample.
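The contradiction can be verified numerically by iterating the system of inequalities, starting from the trivial bounds x_n ≤ 1; the sketch below (illustrative) drives the bound on x_2 well below the lower bound (11).

```python
# Iterate the inequalities of the counterexample for QR = 2:
#   x_1 <= (1 + x_2)/(2QR),   x_n <= (x_{n-1} + x_{n+1})/(2(QR n^2 - 1)).
# Starting from x_n <= 1 and repeatedly applying the right-hand sides
# yields ever-smaller upper bounds on x_2 = I(|c_2|^2).
QR = 2.0
N = 30
x = [1.0] * (N + 2)          # x[n] bounds I(|c_n|^2); the tail x[N+1] stays at 1
for _ in range(200):
    new = x[:]
    new[1] = min(x[1], (1.0 + x[2]) / (2.0 * QR))
    for n in range(2, N + 1):
        new[n] = min(x[n], (x[n - 1] + x[n + 1]) / (2.0 * (QR * n * n - 1.0)))
    x = new

lower_bound = 1.0 + QR - (QR * (2.0 + QR)) ** 0.5   # inequality (11), about 0.172
# x[2] ends far below lower_bound: the generalized conjecture fails at QR = 2.
```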
4.6 Strengthening the bounds

We have just shown that the conjecture and the generalization must be generally false. Let us summarize the information which we do have. For x, y, λ, and v as in section 4.4:

(11)  λ ≥ QR + 1 - ( QR(2+QR) )^{1/2};

(20)  v ≥ x + QR y, because (1+QR)y = v + x - I( Re(c_1² c̄_2) ) by (7), and I( Re(c_1² c̄_2) ) ≥ 2x - y by (19);

(12)  y ≥ F(x) = [ v + x - (λx)^{1/2} ] / (1+QR);

(10)  x ≥ w(λ).

These inequalities are illustrated in Figure 6. The shaded region represents the values of the pair (x,y) which are permitted by these inequalities.

If the pair (x_0, y_0), where F(x_0) = y_0 = (v - x_0)/QR, lies above the line y = x, we may invoke the convexity argument introduced in section 4.4 to lower bound I(|c_1|):

I(|c_1|) ≥ min f(x,y)  over permissible pairs (x,y),   where f(x,y) = y^{3/2}/x^{1/2}.
It is easy to see that the minimum is attained at (x_0, y_0) whenever the gradient ∇f(x_0,y_0) satisfies ∇f(x_0,y_0) · v ≥ 0 for every vector

v = (x_1 - x_0) i + (y_1 - y_0) j,

where (x_1, y_1) is a permissible pair. Note that

x_0 = [ ( QR√λ + ( λ(2 + 4QR + Q²R²) + 2(1+2QR) )^{1/2} ) / ( 2(1+2QR) ) ]²,   y_0 = (v - x_0)/QR.

This improves the bound obtained in section 4.4, because the region over which the function f(x,y) need be minimized has been made smaller.

In section 4.7, we give examples of the use of these bounds.
[Figure 6: the region of pairs (x,y) permitted by (10), (11), (12), and (20), with w(λ), z(λ), and the point (x_0,y_0) indicated.]
4.7 Comparison of results

In this section we compare our results with estimates of error bounds obtained by simulations for the following related phase-tracking problem. The signal process satisfies

dθ = ω dt + Q̃^{1/2} dw

and there is a single observation process

dx = (sin θ) dt + R̃^{1/2} dv.

The goal is to find an estimate θ̂ of θ which minimizes the conditional expectation

E( 1 - cos(θ - θ̂) | F_t ).

There are simulation results available for two estimators: (a) the classical phase-lock loop, and (b) a suboptimal filter proposed in [12].

We may compare our results with simulation results because we shall show how the above problem may be modified to give the one which we have studied. We argue non-rigorously; for details, the reader is referred to Van Trees [10] or Viterbi [11].

The observation process solves the differential equation

dx/dt = sin( ωt + Q̃^{1/2} w(t) ) + R̃^{1/2} dv/dt.
48
(sin wt)dx 1[1 - cos 2at] cos (Ql/2w (t))
dt 2 1 -1/2 + -[sin 2wt]sin(Q w(t)) 2 + R (sin wt)d dt
This signal is now passed through a low-pass filter to remove the double-frequency (2\omega t) terms. Moreover, we claim that the process

    \tilde{R}^{1/2} (\sin \omega t)\,\frac{dv}{dt}

may be replaced by

    \Bigl(\tfrac{\tilde{R}}{2}\Bigr)^{1/2} \frac{dv_1}{dt}

for the standard Brownian motion process v_1(t). Thus, we have an observation

    s_1(t) = \tfrac{1}{2}\cos\bigl(\tilde{Q}^{1/2} w(t)\bigr) + \Bigl(\tfrac{\tilde{R}}{2}\Bigr)^{1/2} \frac{dv_1}{dt}.
Setting z_1(t) = 2 \int_0^t s_1(\tau)\,d\tau gives

    dz_1 = \cos\bigl(\tilde{Q}^{1/2} w(t)\bigr)\,dt + (2\tilde{R})^{1/2}\,dv_1.
If we multiply the original observation process by \cos \omega t and argue in a similar way, we obtain

    dz_2 = \sin\bigl(\tilde{Q}^{1/2} w(t)\bigr)\,dt + (2\tilde{R})^{1/2}\,dv_2.
The problem with a single observation may be compared with the one which we have studied, provided that we set Q = \tilde{Q} and R = 2\tilde{R}, where \tilde{Q} and \tilde{R} are the noise strengths for the single-observation problem, and Q and R are the strengths for the problem considered in this thesis.
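The replacement of the modulated noise \tilde{R}^{1/2}(\sin \omega t)\,dv/dt by (\tilde{R}/2)^{1/2}\,dv_1/dt in this derivation rests on the Itô isometry: \mathrm{Var}\int_0^T \sin(\omega s)\,dv(s) = \int_0^T \sin^2(\omega s)\,ds \approx T/2 for large \omega T. A Monte Carlo check of this variance (the function and its defaults are illustrative):

```python
import math
import random

def modulated_noise_variance(omega=25.0, T=20.0, n_steps=2000, n_paths=1000, seed=0):
    """Estimate Var[ integral_0^T sin(omega s) dv(s) ] by Monte Carlo.

    By the Ito isometry the variance equals integral_0^T sin^2(omega s) ds,
    which is close to T/2 when omega*T >> 1 -- the basis for replacing
    (sin omega t) dv/dt by the white noise (1/2)^{1/2} dv1/dt."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x = 0.0
        for k in range(n_steps):
            # stochastic integral as a sum of modulated Brownian increments
            x += math.sin(omega * k * dt) * rng.gauss(0.0, 1.0) * sqrt_dt
        total += x * x
    return total / n_paths

print(modulated_noise_variance())  # close to T/2 = 10, up to Monte Carlo error
```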
We tabulate results in Figure 7. The first two columns are taken from Willsky [12]. They are estimates of the filter error based on simulation of two techniques: the classical phase-lock loop and a filter proposed in [12].
The third column contains the lower bound of inequality (18). The fourth column contains, for the cases QR = 1.0 and QR = 1.7, a bound obtained by applying the results of section 4.6.
In applying the result of section 4.6, one issue is to obtain tight bounds for \lambda = I(|c_2|^2)^{1/2}. To obtain a lower bound for \lambda, we use (11). For the upper bound, we use the infinite system of equations (6) to derive the infinite system of inequalities
    (QR\,n^2 - 1)\, I(|c_n|^2) \le \tfrac{1}{2}\bigl[ I(|c_{n-1}|^2) + I(|c_{n+1}|^2) \bigr] + I(|c_n|^2)^{1/2} \bigl[ I(|c_{n-1}|^2)^{1/2} + I(|c_{n+1}|^2)^{1/2} \bigr].
These follow from the Hölder inequality and the fact that

    (18)    I(|c_1|^2) \le \frac{1 + I(|c_2|^2)}{2QR}.
By starting with the knowledge that I(|c_n|^2) \le 1 for all n, these inequalities may be used to give smaller bounds. These may, in turn, be used to give still smaller ones. By continuing this iteration, it is possible to show, for example, that if QR = 1.7, then I(|c_1|^2) < 0.311 and I(|c_2|^2) < 0.056, so that \lambda < 0.237. Hence, we have \lambda \in [0.19, 0.24]. By using the argument of section 4.6 to limit the region of permissible values of x and y, and by invoking the convexity argument, we may show that I(|c_1|) > 0.26. This gives the bound for the phase-tracking problem of
    \lim_{T\to\infty} \frac{1}{T} \int_0^T E\bigl[1 - \cos\bigl(\theta(t) - \hat{\theta}(t)\bigr)\bigr]\,dt \le 0.74,
which compares very unfavorably with the estimates obtained by simulation.
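The last step can be made explicit. If, as recalled from section 4.1, the optimal estimate satisfies E[\cos(\theta - \hat{\theta}) \mid Y_t] = |c_1(t)| (we state this relation as an assumption carried over from earlier in the thesis), then:

```latex
% From E[cos(theta - \hat\theta) | Y_t] = |c_1(t)| for the optimal estimate,
% taking expectations and time averages:
\lim_{T\to\infty} \frac{1}{T}\int_0^T
  E\bigl[\,1-\cos\bigl(\theta(t)-\hat{\theta}(t)\bigr)\bigr]\,dt
  \;=\; 1 - I(|c_1|)
  \;\le\; 1 - 0.26 \;=\; 0.74 .
```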
It should be noted that the iteration procedure for upper bounding is useful only for large values of QR. In the case of QR = 1, it seems difficult to improve significantly the bound \lambda \in [0.26, 0.59].
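The iterative tightening described above can be sketched in code. The inequality form below is our reconstruction of the system derived from (6) and should be treated as an assumption (with this form the n = 1 inequality happens not to bind, so the numbers differ from those quoted in the text); the sketch only illustrates the mechanics of starting from I(|c_n|^2) \le 1 and sweeping until the bounds stop shrinking:

```python
import math

def iterate_bounds(QR, N=12, sweeps=60):
    """Iteratively tighten upper bounds I[n] >= I(|c_n|^2), starting from
    the trivial bound 1, using inequalities of the (assumed) form

        (QR n^2 - 1) I_n <= (I_{n-1} + I_{n+1}) / 2
                            + sqrt(I_n) (sqrt(I_{n-1}) + sqrt(I_{n+1})),

    each solved for I_n as a quadratic in s = sqrt(I_n)."""
    I = [1.0] * (N + 2)  # I[0] is kept at 1; I[N+1] is a conservative tail
    for _ in range(sweeps):
        for n in range(1, N + 1):
            a = QR * n * n - 1.0
            if a <= 0.0:
                continue  # the inequality carries no information here
            b = math.sqrt(I[n - 1]) + math.sqrt(I[n + 1])
            c = 0.5 * (I[n - 1] + I[n + 1])
            # a s^2 - b s - c <= 0  implies  s <= (b + sqrt(b^2 + 4ac)) / (2a)
            s = (b + math.sqrt(b * b + 4.0 * a * c)) / (2.0 * a)
            I[n] = min(I[n], s * s)
    return I
```

Each sweep can only shrink the bounds, so the iteration converges monotonically.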
FIGURE 7

    QR      | Phase-lock loop | Suboptimal filter | Lower bound      | Upper bound for 1 - I(|c_1|)
            | (simulation)    | (simulation)      | from (18)        | (sections 4.4, 4.6)
    --------+-----------------+-------------------+------------------+-----------------------------
    0.01691 | 0.0581          | 0.0608            | 0.10             |
    0.0506  | 0.1120          | 0.1171            | 0.15             |
    0.16    | 0.2253          | 0.2447            | 0.26             |
    0.49    | 0.3687          | 0.4098            | 0.49             |
    1.0     | 0.5002          | 0.5312            | 0.68             | 0.67
    1.69    | 0.5692          | 0.6158            | 0.79             | 0.74
5. Conclusions
It is our belief, upon observing the weakness of the error bounds obtained by our technique, that there must exist better ways to obtain bounds for the nonlinear filtering problem. The weakness of the method used here must lie in the convexity argument: it cannot, in general, give a bound tight enough to be useful. The only way the technique suggests for improving the bound is to obtain more information which further constrains the set of permissible values of the random variables and numbers which we have used in our arguments (for example, I(|c_1|^2)). One way to do this might be with the conjecture of section 4.5. Of course, this should not be attempted before it is understood how the bounds would be improved by applying the conjecture.
REFERENCES
[1] B.Z. Bobrovsky and M. Zakai, "A Lower Bound for the Estimation Error for Certain Diffusion Processes," Memorandum No. ERL-M490, Elec. Res. Lab., University of California, Berkeley, October 1974.

[2] R.S. Bucy and A.J. Mallinckrodt, "An Optimal Phase Demodulator," Stochastics, Vol. 1, 1973, pp. 3-23.

[3] M. Fujisaki, G. Kallianpur, and H. Kunita, "Stochastic Differential Equations for the Nonlinear Filtering Problem," Osaka J. Math., Vol. 9, 1972, pp. 19-40.
[4] A.S. Gilman and I.B. Rhodes, "Cone-Bounded Nonlinearities and Mean Square Bounds--Estimation Upper Bound," IEEE Trans. on Automatic Control, Vol. AC-18, No. 3, June 1973, pp. 260-265.
[5] I.B. Rhodes and A.S. Gilman, "Cone-Bounded Nonlinearities and Mean Square Bounds--Estimation Lower Bound," IEEE Trans. on Automatic Control, Vol. AC-20, No. 5, October 1975, pp. 632-642.
[6] H. Kunita, "Asymptotic Behavior of the Nonlinear Filtering Errors of Markov Processes," J. Multivariate Analysis, Vol. 1, 1971, pp. 365-393.

[7] F. Levieux, Filtrage Non-lineaire et Analyse Fonctionnelle, Rapport de Recherche No. 57, IRIA, February 1974.
[8] S.I. Marcus, "Estimation and Analysis of Nonlinear Stochastic Systems," Ph.D. dissertation, Dept. of Electrical Engineering, M.I.T., Cambridge, Mass., May 1975.
[9] D.L. Snyder and I.B. Rhodes, "Filtering and Control Performance Bounds with Implications for Asymptotic Separation," Automatica, Vol. 8, No. 6, November 1972, pp. 747-753.
[10] H.L. Van Trees, Detection, Estimation, and
Modulation Theory--Part II: Nonlinear Modulation Theory. New York: Wiley, 1971.
[11] A.J. Viterbi, "Phase-Locked Loop Dynamics in the Presence of Noise by Fokker-Planck Techniques," Proc. IEEE, Vol. 51, 1963, pp. 1737-1753.
[12] A.S. Willsky, "Fourier Series and Estimation on the Circle with Applications to Synchronous Communication--Part II: Implementation," IEEE Trans. on Information Theory, Vol. IT-20, No. 5, September 1974,
pp. 584-590.
[13] A.S. Willsky and S.I. Marcus, "Analysis of Bilinear Noise Models in Circuits and Devices," Monograph of the Colloquium on the Application of Lie Group Theory to Nonlinear Network Problems, April 1974, IEEE Pub. No. 74 CHO 917-5 CAS.
[14] M. Zakai and J. Ziv, "Lower and Upper Bounds on the Optimal Filtering Error of Certain Diffusion Processes," IEEE Trans. on Information Theory, Vol. IT-18, No. 3, May 1972, pp. 325-331.