BOUNDS FOR THE NONLINEAR FILTERING PROBLEM
BY
VICTOR FRANKLIN KLEBANOFF A.B., UNIVERSITY OF CALIFORNIA,
BERKELEY, 1972
Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
May, 1976

Signature of Author: Department of Mathematics, May 7, 1976
Certified by: Thesis Supervisor
Accepted by: Chairman, Departmental Committee
ABSTRACT
Title of Thesis: Bounds for the Nonlinear Filtering Problem Author: Victor Franklin Klebanoff
Submitted to the Department of Mathematics on May 7,1976, in partial fulfillment of the requirements for the degree of Master of Science.
Known bounds for the optimal filtering error of the general nonlinear filtering problem are discussed. The example of the phase-tracking problem is used as an
illustration. For the special case of the phase-tracking problem, new arguments are used to derive an upper bound for the optimal filter error.
Thesis Supervisor: Alan S. Willsky
ACKNOWLEDGEMENTS
It is a pleasure to acknowledge the help of Professor Alan Willsky, who supervised this thesis, as well as to thank him for his support and friendship.
In addition, I wish to thank my friends in the
Experimental Studies Group for the delightful opportunity to teach in the program for the past two years.
Thanks are also due to Jiri Dadok, Emden Gansner, and Alfred Magnus for many interesting and helpful discussions.
CONTENTS

1. Background and Orientation
   1.1 The nonlinear filtering problem
   1.2 A specific class of nonlinear filtering problems
2. The Phase-tracking Problem: Introduction
3. Bounds on Estimator Error: General Results Applied to the Phase-tracking Problem
4. Bounds on an Error Criterion for the Optimal Filter for the Phase-tracking Problem
   4.1 Study of the error criterion E(1 - cos(θ-θ̂))
   4.2 A stochastic differential equation for |c_n(t)|²
   4.3 Asymptotic behavior of the optimal filter error
   4.4 Bounds for the phase-tracking problem
   4.5 A conjecture and generalizations; a partial result; a counterexample
   4.6 Strengthening the bounds
   4.7 Comparison of results
5. Conclusions
References
Figures 1-7
1. Background and Orientation
1.1 The nonlinear filtering problem
The general nonlinear filtering problem has the following form: there is a signal process x_t which satisfies the Ito stochastic differential equation

dx_t = A(t,x) dt + B(t,x) dw_t

and a noisy observation process

dy_t = C(t,x) dt + D(t,x) dv_t

where x ∈ R^n, y ∈ R^m, and w_t and v_t are independent standard Brownian motion processes. The goal is to obtain an estimate of x_t based on the observations y_s, for 0 ≤ s ≤ t. The least squares estimate of x_t is the F_t-measurable random variable x̂_t which minimizes the expectation

E( [x_t - x̂_t]^T [x_t - x̂_t] ),

where F_t is the σ-field generated by the observations y_s, 0 ≤ s ≤ t. It is well known that

x̂_t = E( x_t | F_t ).
Moreover, x_t has the conditional probability density function p(x,t), which satisfies the Kushner-Stratonovich equation (a stochastic integro-differential equation). For results concerning the solutions of this equation, the reader may consult, for example, the works of Fujisaki, Kallianpur, and Kunita [3] and Levieux [7].
Problems associated with the Kushner-Stratonovich equation are (1) the impossibility of obtaining closed-form solutions in the general case and (2) the excessive computing time required for the application of numerical methods (despite the proof by Levieux [7] that, under certain conditions, the solutions are sufficiently regular that numerical methods may be used).
By recalling the structure of the Kalman filter, we note another difficulty of the nonlinear filtering problem. For the linear problem, the optimal estimate is the solution of a linear stochastic differential equation

dx̂_t = A x̂_t dt + B dν_t,

where the matrices A, B, and the Brownian motion process ν_t depend only on the signal and observation processes. Furthermore, the error covariance, that is, the matrix-valued function

Σ(t) = E( [x_t - x̂_t][x_t - x̂_t]^T ),
solves a deterministic Riccati equation. This contrasts with the general case, for which a similar stochastic differential equation for the conditional expectation

x̂_t = ∫_{R^n} x p(x,t) dx

cannot be given. Rather, on writing the stochastic differential equation for x̂_t, it happens that the right-hand side involves second order conditional moments, which are nondeterministic. Similarly, the stochastic differential equation for an nth order conditional moment will have, on the right-hand side, terms involving (n+1)st order conditional moments.
The difficulties inherent in the nonlinear filtering problem suggest using filters which are suboptimal but finite dimensional. This makes it important to know the accuracy of such filters. For the nonlinear situation discussed above, where a least squares estimate is desired, the error

E( [x_t - x̃_t]^T [x_t - x̃_t] )

for a suboptimal estimate x̃_t trivially satisfies

E( [x_t - x̃_t]^T [x_t - x̃_t] ) ≥ E( [x_t - x̂_t]^T [x_t - x̂_t] ).

We would naturally like to know more.

To attack this problem, it is necessary, in principle, to compute

E( [x_t - x̂_t]^T [x_t - x̂_t] ).

This is, at best, very difficult. An alternative approach, considered by several authors, notably Bobrovsky, Zakai, and Ziv (working as collaborators) [1, 14] and Gilman and Rhodes [4, 5], has been to bound this number analytically. We shall show later, for a specific elementary but highly nonlinear problem, that these bounds are not necessarily useful.
Simulations have been used in attempts to measure the estimator error (see Bucy [2] for results concerning the phase-lock loop, for example). Of course, in the absence of
any analysis, it cannot be known how precise these measurements are.
1.2 A specific class of nonlinear filtering problems

As has been mentioned above, we are interested in nonlinear systems. A subclass of these which has received attention, from workers including Willsky and Marcus [8, 13], is that of bilinear systems with linear observations. These are systems of the form

dx_t = A x_t dt + B x_t dw_t,
dy_t = C x_t dt + dv_t.
For several specific problems, the structure of optimal estimators has been analyzed and various approximations, which may serve as suboptimal filters, have been proposed.
The reader may consult, for example, Willsky [12] for a discussion of these issues.
In what follows, we will discuss our efforts to understand the performance of the optimal filter for the following phase-tracking problem, which is perhaps the simplest bilinear problem.
Example: A particle on the unit circle in R2 undergoes
Brownian motion, having begun at time t=0 at the point (1,0). The x and y coordinates of the particle are observed during
the passage of time, but the observations are not precise. Rather, they consist of the true coordinates, corrupted by additive noise, which is taken to be white noise. The Ito stochastic differential equations describing this situation are
dθ = Q^{1/2} dv_t
dx = (cos θ) dt + R^{1/2} dw_1
dy = (sin θ) dt + R^{1/2} dw_2

where v, w_1, and w_2 are independent standard Brownian motion processes and we impose the initial conditions θ(0)=0, x(0)=1, and y(0)=0.
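The three Ito equations above lend themselves to direct simulation. The following Euler-Maruyama sketch is not part of the thesis; the function name and step counts are illustrative choices.

```python
import math, random

def simulate_phase_tracking(Q=1.0, R=2.0, T=5.0, n_steps=5000, seed=0):
    """Euler-Maruyama simulation of the phase-tracking SDEs:
    dtheta = Q^{1/2} dv,  dx = cos(theta) dt + R^{1/2} dw1,
    dy = sin(theta) dt + R^{1/2} dw2, with theta(0)=0, x(0)=1, y(0)=0."""
    rng = random.Random(seed)
    dt = T / n_steps
    sdt = math.sqrt(dt)
    theta, x, y = 0.0, 1.0, 0.0
    path = [(0.0, theta, x, y)]
    for k in range(n_steps):
        # independent Brownian increments for v, w1, w2
        dv = sdt * rng.gauss(0.0, 1.0)
        dw1 = sdt * rng.gauss(0.0, 1.0)
        dw2 = sdt * rng.gauss(0.0, 1.0)
        # observations use the current (non-anticipating) phase
        x += math.cos(theta) * dt + math.sqrt(R) * dw1
        y += math.sin(theta) * dt + math.sqrt(R) * dw2
        theta += math.sqrt(Q) * dv
        path.append(((k + 1) * dt, theta, x, y))
    return path
```

Such sample paths are what any candidate filter, optimal or suboptimal, must process.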
The filtering problem consists of finding a random variable θ̂(t), which is an estimate of θ(t), based on the observations x(s) and y(s), for 0 ≤ s ≤ t. Technically speaking, we require θ̂(t) to be F_t-measurable, where F_t is the σ-field generated by the random variables x(s) and y(s), for 0 ≤ s ≤ t. This estimate will typically be chosen to minimize some error criterion, for example the conditional expectation

E( |θ(t) - θ̂(t)|² | F_t ).

(Note that since θ(t) and θ̂(t) are points on the unit circle, we identify θ(t) and θ(t)+2π, and similarly for θ̂(t).)
Remark: This is a problem of the class mentioned in section 1.2, with a bilinear system and linear observations. It is, in fact, the same as the problem of obtaining estimates α̂_t and β̂_t for the system

d[α, β]^T = -(Q/2) [α, β]^T dt + Q^{1/2} J [α, β]^T dv_t,   J = [0 -1; 1 0],

with linear observations

dx = α dt + R^{1/2} dw_1
and
dy = β dt + R^{1/2} dw_2.

(Note: the system equations are verified upon setting α = cos θ, β = sin θ, and using the Ito differential rule to obtain the stochastic differential equations for α and β.)
2. The Phase-tracking Problem: Introduction
An optimal solution to the phase-tracking problem

We consider the Kushner-Stratonovich equation--the stochastic integro-differential equation for the conditional probability density of θ (conditioned on x_s and y_s, for 0 ≤ s ≤ t)--for the phase-tracking problem of the example of section 1.2. It is

(KS)  dp_t = (Q/2)(∂²p/∂θ²) dt + p [ (sin θ - ŝ)/R^{1/2} ] dν_1 + p [ (cos θ - ĉ)/R^{1/2} ] dν_2,

where

dν_1 = [ dy - ŝ dt ] / R^{1/2},   dν_2 = [ dx - ĉ dt ] / R^{1/2},

ŝ and ĉ denote the conditional expectations of sin θ and cos θ, and, for any random variable u, û = E( u | F_t ). In (KS), p(θ,t) is the conditional probability density function of θ(t). We will study solutions of (KS) by using Fourier series. We expand

p(θ,t) = Σ_{n=-∞}^{∞} c_n(t) e^{inθ}.
The c_n are F_t-measurable random variables. Upon substituting the series expansion into (KS), we obtain the infinite coupled system of stochastic ordinary differential equations

dc_n = -(Qn²/2) c_n dt + (1/R^{1/2}) [ g_1 dν_1 + g_2 dν_2 ],

where

g_1 = (c_{n-1} - c_{n+1})/(2i) + c_n Im(c_1),
g_2 = (c_{n-1} + c_{n+1})/2 - c_n Re(c_1).

These equations suggest certain suboptimal estimators; we obtain, for example, an approximate filter upon ignoring all but a finite number of these equations (that is, setting c_n = 0 for |n| sufficiently large).
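To make the truncation concrete, the sketch below performs Euler-Maruyama updates of the truncated coefficient system. It is not from the thesis: the function name, the bookkeeping, and in particular the identifications ĉ = Re c_1 and ŝ = -Im c_1 are assumptions tied to the reconstruction of (KS) above.

```python
import math

def truncated_filter_step(c, dx, dy, dt, Q, R):
    """One Euler-Maruyama step of the truncated Fourier-coefficient system.
    `c` maps n -> c_n(t) for |n| <= N; coefficients outside the truncation
    are treated as zero.  `dx`, `dy` are the observation increments."""
    c1 = complex(c.get(1, 0.0))
    # Assumed identifications of the conditional means with c_1:
    c_hat, s_hat = c1.real, -c1.imag
    # Innovations increments dnu_1, dnu_2 driven by the observations
    dnu1 = (dy - s_hat * dt) / math.sqrt(R)
    dnu2 = (dx - c_hat * dt) / math.sqrt(R)
    new_c = {}
    for n in c:
        cm = complex(c.get(n - 1, 0.0))
        cp = complex(c.get(n + 1, 0.0))
        g1 = (cm - cp) / 2j + c[n] * c1.imag
        g2 = (cm + cp) / 2.0 - c[n] * c1.real
        new_c[n] = c[n] - (Q * n * n / 2.0) * c[n] * dt \
                   + (g1 * dnu1 + g2 * dnu2) / math.sqrt(R)
    return new_c
```

Starting from the point mass at θ = 0 (all retained c_n equal to 1) and feeding in simulated observation increments gives an approximate evolution of the retained coefficients.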
3. Bounds on Estimator Error: General Results Applied to the Phase-tracking Problem
Survey of available results
In [14], Zakai and Ziv consider the problem of bounding the mean square error of the optimal estimator for a process of the form

dx_j = x_{j+1} dt,   j = 1, 2, ..., n-1,
dx_n = m(x(t)) dt + B(x(t)) dw_t,

with the real-valued observation

dy_t = g(x_k(t)) dt + R^{1/2} dv_t.

For the case when x(t) is Gaussian, they obtain the lower bound

E( [x_j - x̂_j]² ) ≥ E( σ_j(t)² ) exp[ -(1/R) ∫_0^t E( σ(s)² ) ds ],

where

σ(t)² = E_{x(0)}{ [ g(x_k(t),t) - E_{x(0)} g(x_k(t),t) ]² }

and σ_j(t)² is the conditional (upon x(0)) variance of x_j.

Example: if n=1, m(x)=0, B(x)=Q^{1/2}, and g(x) = sin x, we have the phase-tracking problem, with only one observation. For large t, σ(t)² is approximately equal to 1/2, since the probability density function of θ tends to a uniform distribution as t → ∞. It follows that the lower bound is (asymptotically) Q t exp[ -t/(2R) ], evidently a useless result.
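The shape of such a bound is easy to examine numerically. In the sketch below (illustrative only), the decay constant c stands in for the time constant of the exponential; any bound of the form Q t e^{-t/c} peaks at t = c and then decays to zero.

```python
import math

def growing_then_decaying_bound(t, Q, c):
    """A lower bound of the form Q*t*exp(-t/c): linear growth is
    eventually killed by the exponential, so the bound is vacuous
    for large t."""
    return Q * t * math.exp(-t / c)
```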
Moreover, the upper bound derived in the same paper is of no help, in this case, as it is valid only for the case
g(x)=x, j=k=n.
A different type of result is due to Gilman and Rhodes [4, 5]. They consider the system

dx = f(x,t) dt + D(t) dv,
dz = g(x,t) dt + G(t) dw,

where v and w are independent standard Brownian motions and the functions f and g satisfy the so-called "cone-boundedness condition"

| f(x+δ,t) - f(x,t) - A(t)δ | ≤ a(t)|δ|,
| g(x+δ,t) - g(x,t) - C(t)δ | ≤ c(t)|δ|.

The functions A(t) and C(t) parameterize an associated nominal linear system

dx = A(t)x dt + D(t) dv,
dz = C(t)x dt + G(t) dw.

The first result obtained by Gilman and Rhodes is an upper bound on the error covariance of a suboptimal filter. Their upper bound is

E( [x(t) - x̃(t)][x(t) - x̃(t)]^T ) ≤ P(t),

where P(t) solves the differential equation

dP/dt = V + KWK^T + (A-KC)P + P(A-KC)^T + (a+c)P + (aI + cKK^T) tr P.

Here W = GG^T and V = DD^T are assumed to be positive definite. Moreover, K may be chosen to minimize P(t). The second result, enunciated in [5], is the lower bound

E( [x(t) - x̂(t)][x(t) - x̂(t)]^T ) ≥ P̄(t)[ 1 - r(t) ],

where P̄(t) is the error covariance of the Kalman filter for the associated nominal linear system. That is, it solves

dP̄/dt = V + AP̄ + P̄A^T - P̄C^T W^{-1} C P̄,

and r(t) ≥ 0 solves (1-r)e^r = e^{-d}, where

d(t) = ∫_0^t [ a(s)² |V^{-1}(s)| + c(s)² |W^{-1}(s)| ] tr P̄(s) ds.
Example: For the phase-tracking problem,

f(x,t) = 0,   D = Q^{1/2},   g(x,t) = [cos x; sin x],   G = R^{1/2} I.

Hence a = 0 and c = 1, and the nominal linear system has A = 0 and C = 0.

Note: the fact that c = 1 follows from the fact that

| cos(x+δ) - cos x |² + | sin(x+δ) - sin x |² = 2[ 1 - cos δ ] ≤ δ².

For the upper bound, we have, letting K = [k_1 k_2],

dP/dt = Q + RKK^T + P.

This has the solution

P(t) = P(0)e^t + (Q + RKK^T)(e^t - 1).

But K may be chosen to minimize P(t). This is done by K = 0, so that

P(t) = P(0)e^t + Q(e^t - 1).

In particular, the upper bound tends to infinity as t → ∞.

In addition, the lower bound which Gilman and Rhodes derive is not useful in this case:

dP̄/dt = Q,   so   P̄(t) = Qt + P̄(0),

and

d(t) = (1/R) ∫_0^t P̄(s) ds ~ Qt²/(2R),   as t → ∞.

Hence

1 - r(t) ~ exp( -1 - Qt²/(2R) ),

and the lower bound is asymptotically

Qt exp[ -1 - Qt²/(2R) ],

which is not useful for large values of t.
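The divergence of the upper bound can be confirmed numerically; the sketch below (illustrative, not from the thesis) integrates the K = 0 equation dP/dt = Q + P by Euler's method and compares it with the closed form.

```python
import math

def upper_bound_euler(t, P0, Q, dt=1e-4):
    """Euler integration of dP/dt = Q + P, the K = 0 case above."""
    P = P0
    for _ in range(int(round(t / dt))):
        P += (Q + P) * dt
    return P

def upper_bound_closed(t, P0, Q):
    """Closed form P(t) = P0*e^t + Q*(e^t - 1); grows without bound."""
    return P0 * math.exp(t) + Q * (math.exp(t) - 1.0)
```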
Another general result, due to Zakai and Bobrovsky [1], shows that the estimation error for a given nonlinear problem is bounded below by the estimation error for an associated linear filtering problem. For the special case of estimating a Gaussian process from nonlinear measurements, their results coincide with those of Snyder and Rhodes [9].
Snyder and Rhodes consider the system

dx = F(t)x dt + V^{1/2} dv,
y = H(t)x,

where y_t is the Gaussian process to be estimated from the observation process

dz = h(t,y) dt + W^{1/2} dw.

The result is that the optimal filter error satisfies

E( [y(t) - ŷ(t)][y(t) - ŷ(t)]^T ) ≥ H(t) Σ(t) H(t)^T,

where Σ(t) solves the Riccati equation

dΣ/dt = FΣ + ΣF^T + V - ΣH^T P̄ H Σ,

and P̄(t) = E( J^T W^{-1} J ), J = the Jacobian of h(t,y) (with respect to y).

Example: We illustrate the Snyder and Rhodes result by applying it to the phase-tracking problem. We have F = 0, V = Q, H = 1, h(t,y) = sin(ωt + y), and W = R. Hence J = cos(ωt + y), so that

P̄(t) = (1/R) E( cos²(ωt + y) );

hence, as t → ∞,

P̄(t) → 1/(2R).

Letting Σ_∞ denote the limit of Σ as t → ∞, Σ_∞ solves the algebraic Riccati equation

0 = Q - Σ_∞²/(2R),

whence the error for the optimal nonlinear filter for the phase-tracking problem is bounded below by √(2QR).
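As a check on the algebra, one can integrate the scalar Riccati equation dΣ/dt = Q - Σ²/(2R) and watch the solution approach √(2QR). The sketch below is illustrative only.

```python
def riccati_limit(Q, R, dt=1e-3, T=200.0):
    """Integrate dS/dt = Q - S^2/(2R) from S(0) = 0 by Euler's method;
    S(T) should be near the equilibrium sqrt(2*Q*R) of the algebraic
    Riccati equation 0 = Q - S^2/(2R)."""
    S = 0.0
    for _ in range(int(T / dt)):
        S += (Q - S * S / (2.0 * R)) * dt
    return S
```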
For completeness, solely in connection with the phase-tracking problem, we discuss the classical phase-lock loop. As discussed in Van Trees [10], this system may be used to attack the problem of estimating a signal θ from a noisy observation

s(t) = sin(ωt + θ) + dw/dt,

where θ and w are independent Brownian motion processes with covariances Qt and Rt, respectively. Note that θ_t is restricted to the unit circle, so θ and θ+2π are identified.
The structure of the phase-lock loop is shown in Figure 1. The voltage controlled oscillator (VCO) is a device whose output, given input dθ̂/dt, is cos(ωt + θ̂(t)).

In [10, 11, 12], it is shown that the system of Figure 2 may be used to model the phase-lock loop. Heuristically, this is because the signal fed into the low-pass filter is

[ sin(ωt + θ) + dw/dt ] cos(ωt + θ̂)
   = (1/2) sin(2ωt + θ + θ̂) + (1/2) sin(θ - θ̂) + [ cos(ωt + θ̂) ] dw/dt.

The low-pass filter removes the double-frequency (that is, the 2ωt) terms, and the last term is modelled as a white noise process dw̃/dt, where w̃ has covariance Rt/2. It follows that the system of Figure 2 models the phase-lock loop, provided that the noise process in Figure 2 has strength 2R.
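The heuristic rests on a product-to-sum identity; the following check (illustrative, not from the thesis) confirms the identity numerically and isolates the baseband term (1/2) sin(θ - θ̂) kept by the low-pass filter.

```python
import math, random

def baseband_term(theta, theta_hat):
    """The low-frequency part of sin(wt+theta)*cos(wt+theta_hat)
    after the 2wt term is removed: (1/2)*sin(theta - theta_hat)."""
    return 0.5 * math.sin(theta - theta_hat)

# verify sin(wt+a)cos(wt+b) = (1/2)sin(2wt+a+b) + (1/2)sin(a-b) at random points
random.seed(1)
for _ in range(1000):
    wt = random.uniform(-10.0, 10.0)
    th = random.uniform(-10.0, 10.0)
    thh = random.uniform(-10.0, 10.0)
    lhs = math.sin(wt + th) * math.cos(wt + thh)
    rhs = 0.5 * math.sin(2.0 * wt + th + thh) + baseband_term(th, thh)
    assert abs(lhs - rhs) < 1e-12
```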
We now discuss the mean square error for the phase-lock loop. We observe that if the operator "sine" is replaced by the identity, and if the gain function F(s) equals the Kalman gain for the problem of estimating θ given the observation process z_t (where dz_t = θ dt + dw), then the modified system is merely the Kalman filter for this problem. This value of F(s) is used in the phase-lock loop.
Remark: A constant is the simplest choice of transfer function; with a gain of the form F(s) = a + (b/s), the filter is called a second order loop.
It is possible to evaluate the asymptotic error covariance, that is,

lim_{t→∞} E( {θ(t) - θ̂(t)}² ),

for the phase-lock loop, where, for an angle φ,

{φ} = min_{k ∈ Z} | φ + 2kπ |.

This is discussed in [10] and illustrated in Figure 3. In the graph of Figure 3, the straight line is the asymptotic error for the Kalman filter for the linear problem obtained by replacing the sine operator with the identity, as above. The curve is the asymptotic error for the phase-lock loop.
[Figure 1: structure of the phase-lock loop (VCO, low-pass filter, gain F(s)). Figure 2: equivalent baseband model of the loop. Figure 3: asymptotic error of the phase-lock loop (curve) and of the Kalman filter for the linearized problem (straight line).]
4. Bounds on an Error Criterion for the Optimal Filter for the Phase-tracking Problem.
We discuss, in this section, our efforts to find bounds for the error criterion

(1)  E( 1 - cos(θ - θ̂) )

for the phase-tracking problem introduced in section 1. Section 4 is organized as follows:
(a) We show that (1) depends only on the Fourier coefficient c_1(t) of the series expansion of the conditional probability density p(θ,t), which is the solution of the Kushner-Stratonovich equation for the phase-tracking problem (see section 2).

(b) We derive stochastic differential equations for |c_n(t)|², for all n.

(c) We study the asymptotic behavior of E(|c_n(t)|²) and of related random variables.

(d) We analyze the (algebraic) equations for the limits

I(|c_n|²) = lim_{T→∞} (1/T) ∫_0^T E( |c_n(t)|² ) dt.

(e) We state a conjecture and a generalization, which might strengthen these bounds. Unfortunately, the conjecture and generalization together are generally false; we give a counterexample.

(f) We give correct arguments, motivated by the erroneous conjecture, to strengthen the bounds obtained by the analysis described in (d).

(g) We compare these results with those obtained by other workers using different methods for a related problem.
4.1 Study of the error criterion E( 1 - cos(θ - θ̂) )

In this subsection, we suppose that the random variable θ, with values on the unit circle, has the probability density function p(θ). The angle a is to be chosen to minimize

(2)  E( 1 - cos(θ - a) );

we show that the minimal value of (2) is

1 - |c_1|,

where p(θ) has the Fourier expansion

p(θ) = Σ_{n=-∞}^{∞} c_n e^{inθ}.

Thus, if we seek, for the phase-tracking problem, a random variable θ̂ which

(a) is F_t-measurable, and
(b) minimizes the conditional expectation

(3)  E( 1 - cos(θ - θ̂) | F_t ),

then it suffices to work with the conditional (with respect to F_t) probability density function of θ (the Fourier coefficients are F_t-measurable random variables). Therefore, the optimal choice of θ̂ will give

E( 1 - cos(θ - θ̂) ) = 1 - E( |c_1| ).

To verify that the minimal value of (2) is 1 - |c_1|, write

1 - cos(θ - a) = 1 - (1/2)[ e^{i(θ-a)} + e^{-i(θ-a)} ].

Taking expectations,

E( 1 - cos(θ - a) ) = 1 - (1/2)[ c_{-1} e^{-ia} + c_1 e^{ia} ].

But p(θ) is real-valued, so c_{-1} = c̄_1. Hence (2) equals 1 - Re( c_1 e^{ia} ), which is minimized when a is chosen so that

Re( c_1 e^{ia} ) = |c_1|,

as was to be shown.
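A discretized numerical check of this minimization (illustrative; the density is an arbitrary positive trigonometric polynomial, and c_1 is computed as E(e^{-iθ}), the normalization implicit in the argument above):

```python
import cmath, math

M = 4000
thetas = [2.0 * math.pi * k / M for k in range(M)]
weights = [1.0 + 0.5 * math.cos(t) + 0.2 * math.sin(2.0 * t) for t in thetas]
total = sum(weights)
probs = [w / total for w in weights]   # a probability mass function on the circle

c1 = sum(p * cmath.exp(-1j * t) for p, t in zip(probs, thetas))

def expected_error(a):
    """E( 1 - cos(theta - a) ) under the discretized density."""
    return sum(p * (1.0 - math.cos(t - a)) for p, t in zip(probs, thetas))

# minimize over a fine grid of angles a; the minimum should be 1 - |c1|
best = min(expected_error(2.0 * math.pi * k / 720) for k in range(720))
```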
4.2 A stochastic differential equation for |c_n(t)|²

In this subsection, we will show that the following stochastic differential equation for |c_n(t)|² holds:

d|c_n|² = [ -Qn² |c_n|² + (1/R)( |g_1|² + |g_2|² ) ] dt + white noise,

where

g_1 = (c_{n-1} - c_{n+1})/(2i) + c_n Im(c_1),
g_2 = (c_{n-1} + c_{n+1})/2 - c_n Re(c_1).

Justification: from section 2, we have

dc_n = -(Qn²/2) c_n dt + (1/R^{1/2}) [ g_1 dν_1 + g_2 dν_2 ],

where ν_1 and ν_2 were defined in section 2. Hence, taking complex conjugates,

dc̄_n = -(Qn²/2) c̄_n dt + (1/R^{1/2}) [ ḡ_1 dν_1 + ḡ_2 dν_2 ].

By the Ito calculus (ν_1 and ν_2 being independent standard Brownian motions), we have

d|c_n|² = c̄_n dc_n + c_n dc̄_n + (1/R)( g_1 ḡ_1 + g_2 ḡ_2 ) dt,

and the desired equation follows.
We now simplify the equation. The term |g_1|² + |g_2|² is of the form

(4)  | (A - C)/(2i) + BI |² + | (A + C)/2 - BH |².

But, for complex z and ζ,

| z + ζ |² = |z|² + |ζ|² + 2 Re( z ζ̄ );

hence (4) equals

( |A|² + |C|² )/2 + |B|²( H² + I² ) - H Re( A B̄ ) + I Im( A B̄ ) - H Re( C B̄ ) - I Im( C B̄ ).

On setting A = c_{n-1}, B = c_n, C = c_{n+1}, H = Re(c_1), I = Im(c_1), and substituting into the stochastic differential equation for |c_n|², we obtain

(5)  d|c_n|² = [ -Qn² |c_n|² + (1/R)( ( |c_{n-1}|² + |c_{n+1}|² )/2 + |c_1|² |c_n|² - Re( c_1 c_{n-1} c̄_n + c_1 c_n c̄_{n+1} ) ) ] dt + white noise.
4.3 Asymptotic behavior of the optimal filter error

In this section, we obtain the result

(6)  QRn² I(|c_n|²) = (1/2)[ I(|c_{n-1}|²) + I(|c_{n+1}|²) ] + I( |c_1|² |c_n|² ) - Re( I( c_1 c_{n-1} c̄_n + c_1 c_n c̄_{n+1} ) ),

where, for a polynomial q in the Fourier coefficients c_n of the conditional probability density function p(θ,t), the limit

I( q ) = lim_{T→∞} (1/T) ∫_0^T E( q(t) ) dt

exists by a theorem of Kunita.

It is clear that (6) follows from (5) upon taking expectations, integrating with respect to t from 0 to T, dividing by T, and taking limits, provided that the limits exist. In the remainder of this section we justify taking the limits.
To explain Kunita's result, we require an abstract formulation of the filtering problem. We motivate the formulation with the example of the phase-tracking problem. The state space of the signal process is S¹, the circle. The conditional probability density function p(·,t) may be viewed as a stochastic process taking values (for each time t and each point ω in the underlying probability space Ω) in C(S¹), the space of continuous, real-valued functions on S¹. Moreover, for each ω and each t, p(·,t) is a probability density function, so it may be viewed as a probability measure on S¹. This is the desired abstract formulation: the filtering process is a stochastic process taking values in M(S¹), the space of probability measures on S¹. We use Π_t to denote the filtering process, so, for our example, Π_t = p(·,t), where we suppress reference to Ω.

We now discuss the abstract problem. The signal process x_t has state space S, a compact, separable Hausdorff space. Let M(S) be the set of all probability measures on S, endowed with the weak* topology (as the dual space of C(S)), under which it is a compact, separable Hausdorff space. The filtering process Π_t is an M(S)-valued stochastic process with transition probabilities H_t(ν,·). Kunita's results [6] (Theorems 3.1 and 3.3) give conditions under which there exists a unique invariant measure for the set of transition probabilities H_t(ν,·). Recall that an invariant measure for the H_t(ν,·) is a probability distribution μ on M(S) satisfying

μ(Γ) = ∫_{M(S)} H_t(ν,Γ) μ(dν)   for all t and all Borel sets Γ.
(Intuitively: the probability that, at time t, the value of the filter process is in Γ equals the probability that it began in Γ at time zero.)
The phase-tracking problem satisfies the hypotheses of Kunita's theorems. Thus, there exists a unique invariant measure μ for the filter process, given by

μ(·) = lim_{T→∞} (1/T) ∫_0^T H_t(ν,·) dt.

For the phase-tracking problem, S is the circle S¹. Suppose that f: M(S¹) → R is continuous. (As an example of such an f, take any polynomial in the Fourier coefficients of the measure ρ, for ρ ∈ M(S¹); in particular, if the measure has the density function p(θ) = Σ_{n=-∞}^{∞} c_n e^{inθ}, let f be any polynomial in the c_n.)

For such a function f, we have, by definition,

E( f(Π_t) | Π_0 = ν ) = ∫_{M(S¹)} f(α) H_t(ν,dα),

so that

I(f) = lim_{T→∞} (1/T) ∫_0^T E( f(Π_t) | Π_0 = ν ) dt
     = lim_{T→∞} (1/T) ∫_0^T [ ∫_{M(S¹)} f(α) H_t(ν,dα) ] dt
     = ∫_{M(S¹)} f(α) μ(dα),

the limit defining μ being taken in the weak* sense.

Remark: This shows that the limit I(f) exists for all functions f of interest for the phase-tracking problem. Moreover, it shows that I is a bona fide integral, namely integration against the invariant measure μ.
4.4 Bounds for the phase-tracking problem

We begin this section by deriving, from equation (6) for n=1, several inequalities among the numbers I(|c_1|²), I(|c_2|²), and I(|c_1|⁴). We conclude the section by deducing, from these inequalities, a lower bound for I(|c_1|). This will give an upper bound for

lim_{T→∞} (1/T) ∫_0^T [ 1 - E( cos(θ(t) - θ̂(t)) ) ] dt

for the phase-tracking problem.

Equation (6) for the case n=1 is

(7)  (1+QR) I(|c_1|²) = (1/2)[ 1 + I(|c_2|²) ] + I(|c_1|⁴) - I( Re(c_1² c̄_2) ).

We now obtain lower bounds on I(|c_1|²) and I(|c_1|⁴), and deduce lower bounds for I(|c_1|).

We remark that, using the Holder inequality and the fact that |c_n| ≤ 1 for all n,

I( Re(c_1² c̄_2) ) ≤ I( |c_1|² |c_2| ) ≤ I( |c_1|⁴ )^{1/2} I( |c_2|² )^{1/2},

so that, from (7), we obtain

(8)  (1+QR) I(|c_1|²) ≥ (1/2)[ 1 + I(|c_2|²) ] + I(|c_1|⁴) - I(|c_1|⁴)^{1/2} I(|c_2|²)^{1/2}.
33
so that, from (8), we have
(9) (1+QR) I(Ic114) l>[l+I(1c21 2)] + I(Ic4
- Ic1
j1Tv
21Note that (9) is an inequality of the form ax2+bx+c < 0, with a >0, where x= I(1c.14)
Hence, we conclude that x e [r1 r9, where r and r2 are
the roots of the quadratic equation. In this way, we deduce that
(10) I(cO >w(X) -) 4 l[ (X+l+QR) - JQR+2QR( XY)--(-X) 1 2
4 where, henceforth, X - I(jc2f2).
The following fact will be useful later: (11) X _> 1 + QR - AQR(2+QR)
We derive it from (9), which, after setting x =
Ic11j),
is
(l+QR)x . 1+2 + x - Ax
2
whence
0 < (A-x) 2 < -[ 1 + x2- 2(l+QR)x ] = P(x)
Figure 4 illustrates the situation. Equation (11) is merely the observation that
A
is greater than the smaller root of P(x). Also, (8) may be rewritten, in terms of A, as34
In the remainder of this section, we will use some of the inequalities obtained above to deduce a lower bound for I(|c_1|). The analysis takes the following course. For each value of λ, we show that the set of possible values of the pair ( I(|c_1|⁴), I(|c_1|²) ) is a restricted region of R². From this, we deduce, assuming a particular value of λ, a lower bound for I(|c_1|). Then, by minimizing this lower bound over all λ ∈ [0,1], we obtain a lower bound for I(|c_1|).

More precisely, for each fixed value of λ, (12) gives a lower bound for I(|c_1|²) as a function of I(|c_1|⁴). In addition, we know

(13)  I(|c_1|⁴) ≤ I(|c_1|²) ≤ I(|c_1|),

because |c_1| ≤ 1, and

(14)  I(|c_1|²) ≥ [ I(|c_1|) ]²,   by the Holder inequality.

These facts are illustrated in Figure 5, in which values of I(|c_1|⁴) are on the x-axis and values of I(|c_1|²) are on the y-axis. The shaded region represents the set of values of the pair ( I(|c_1|⁴), I(|c_1|²) ) allowed by (10), (12), (13), and (14). The numbers z(λ) and w(λ) are defined in the figure. The value of w(λ) was given in (10); the value of z(λ), determined by the intersection of the curve y = F(x) with the line y = x, is

z(λ) = [ (1+λ) / ( ( λ(1+2QR) + 2QR )^{1/2} + √λ ) ]².

The minimum value of z(λ) is attained for λ = 1/(1+2QR), and equals 1/(1+2QR).
Because we do not know the true value of I(|c_1|⁴), we analyze, for fixed λ, the possibilities

(a)  w(λ) ≤ I(|c_1|⁴) ≤ z(λ);
(b)  z(λ) ≤ I(|c_1|⁴).

If (b) holds, we apply (13) and (b) to deduce that

(15)  I(|c_1|) ≥ 1/(1+2QR).

If (a) holds, we use a convexity argument to lower bound I(|c_1|). (The minimum of these two lower bounds is then a lower bound for the given λ.) Finally, by minimizing over all λ, we obtain the lower bound (18).
The convexity argument for case (a) is the well-known fact that, given a function f, the function F(r) defined by

F(r) = log[ ∫ |f(x)|^r dx ]

is convex. Hence

(1/3)[ F(4) - F(1) ] ≥ F(2) - F(1).

It follows from this, by substituting c_1 for f and using the integral I in place of the integral above, that

(16)  [ I(|c_1|²) ]^{3/2} ≤ [ I(|c_1|) ] [ I(|c_1|⁴) ]^{1/2}.

Setting x = I(|c_1|⁴), y = I(|c_1|²), and v = (1+λ)/2, we see that our goal is to lower bound the function y^{3/2}/x^{1/2} over that part of the shaded region of Figure 5 for which x ≤ z(λ); this will give, for each value of λ, a bound for I(|c_1|), by (16). Letting

G(x,λ) = [ v + x - (λx)^{1/2} ]^{3/2} / [ (1+QR)^{3/2} x^{1/2} ],

(16) and (12) give

I(|c_1|) ≥ y^{3/2}/x^{1/2} ≥ G(x,λ).

We give a lower bound for the function G(x,λ) by first minimizing G with respect to x, for fixed λ, and then minimizing with respect to λ.
In fact, G(x,λ) is minimized with respect to x, for fixed λ, at the point

x(λ) = (1/64)[ √λ + ( 16 + 17λ )^{1/2} ]².

It follows that G(x(λ),λ) is minimized, over all λ, when λ = 1/2, at which point x(1/2) = 1/2. We conclude, therefore, that if w(λ) ≤ x ≤ z(λ), then

(17)  I(|c_1|) ≥ (3√3)/(4√2) · (1+QR)^{-3/2}.

Now, (15) holds if x ≥ z(λ) and (17) holds if w(λ) ≤ x ≤ z(λ). We conclude that, whatever the true values of x and λ,

(18)  I(|c_1|) ≥ min{ 1/(1+2QR),  (3√3)/(4√2) · (1+QR)^{-3/2} }.
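The moment inequalities (13), (14), and (16) hold for time averages of any [0,1]-valued quantity; the spot-check below (illustrative, with sample means standing in for I(·)) exercises all three.

```python
import random

# Random samples in [0,1] stand in for |c_1(t)|; sample means stand in for I(.).
random.seed(7)
for _ in range(100):
    samples = [random.random() for _ in range(500)]
    I1 = sum(samples) / len(samples)                  # I(|c_1|)
    I2 = sum(s * s for s in samples) / len(samples)   # I(|c_1|^2)
    I4 = sum(s ** 4 for s in samples) / len(samples)  # I(|c_1|^4)
    assert I4 <= I2 <= I1                 # (13): s^4 <= s^2 <= s on [0,1]
    assert I2 >= I1 * I1                  # (14): Cauchy-Schwarz
    assert I2 ** 1.5 <= I1 * I4 ** 0.5 + 1e-12   # (16): log-convexity of r -> log E|s|^r
```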
Remark: In section 4.6, we will improve this lower bound by deriving additional restrictions on the limits I(·).

[Figure 4: graph of the parabola P(s), showing that √λ exceeds its smaller root. Figure 5: the region of pairs (I(|c_1|⁴), I(|c_1|²)) permitted by (10), (12), (13), and (14), with w(λ) and z(λ) marked on the x-axis.]
4.5 A conjecture and generalizations; a partial result; a counterexample

In this section, we do three things. We begin with the conjecture that Re( E(c_1² c̄_2) ) ≥ 0 and suggest the generalized conjecture that

Re( I( c_1 c_n c̄_{n+1} ) ) ≥ 0,   for all n = 1, 2, 3, ....

Then, we support the conjecture by proving that

Re( c_1² c̄_2 ) ≥ 2|c_1|⁴ - |c_1|² ≥ -1/8.

We conclude this section with a counterexample to the generalized conjecture.
We conjecture that Re( E(c_1² c̄_2) ) ≥ 0. This has the following intuitive interpretation. The random variable c_2 may be used to obtain an angle estimate θ̃ which maximizes E( Re(e^{2i(θ-θ̃)}) | F_t ). (This is analogous to θ̂, which maximizes E( Re(e^{i(θ-θ̂)}) | F_t ).) We observe that

c_1² c̄_2 = [ e^{-iθ̂} E( e^{-i(θ-θ̂)} | F_t ) ]² conj[ e^{-2iθ̃} E( e^{-2i(θ-θ̃)} | F_t ) ]
         = e^{2i(θ̃-θ̂)} |c_1|² P,   for P = |c_2| ≥ 0,

by the assumption that θ̂ and θ̃ maximize the conditional expectations. These equations suggest that Re( E(c_1² c̄_2) ) measures the difference of the two estimates θ̂ and θ̃: if both estimates are good, then e^{2i(θ̃-θ̂)} ≈ 1, whence Re( E(c_1² c̄_2) ) ≥ 0.

The generalization of this conjecture is

Re( I( c_1 c_n c̄_{n+1} ) ) ≥ 0,   for all n.
A similar intuitive argument suggests that this would reflect the accuracy of estimates based on the higher order Fourier coefficients. We will show, however, that the generalized conjecture cannot be true if QR=2.
Although the generalized conjecture is generally false, and we cannot prove the conjecture, we now prove that

(19)  Re( c_1² c̄_2 ) ≥ 2|c_1|⁴ - |c_1|².

In fact,

(1 - |c_1|)² = [ E( 1 - cos(θ-θ̂) | F_t ) ]² ≤ E( [1 - cos(θ-θ̂)]² | F_t )   (Holder).

But

(1 - cos φ)² = 3/2 + Re( (1/2) e^{-2iφ} - 2 e^{-iφ} ),

so that

E( [1 - cos(θ-θ̂)]² | F_t ) = 3/2 + Re[ (1/2) c_2 e^{2iθ̂} - 2 c_1 e^{iθ̂} ]
                            = 3/2 + Re( c̄_1² c_2 )/(2|c_1|²) - 2|c_1|,

because |c_1| = c_1 e^{iθ̂}. Expanding (1 - |c_1|)² and rearranging gives

Re( c_1² c̄_2 ) ≥ 2|c_1|⁴ - |c_1|².

But, for all real x, 2x² - x ≥ -1/8; the conclusion follows.
Counterexample: We now give a counterexample which shows that the generalized conjecture fails if QR = 2.

If Re( I( c_1 c_n c̄_{n+1} ) ) ≥ 0 for all n, then by (6),

QR I(|c_1|²) ≤ (1/2)[ 1 + I(|c_2|²) ]

and

QRn² I(|c_n|²) ≤ (1/2)[ I(|c_{n-1}|²) + I(|c_{n+1}|²) ] + I(|c_n|²),   for n > 1.

Letting x_n = I(|c_n|²), to simplify the formulas,

x_1 ≤ (1/(2QR)) [ 1 + x_2 ],
x_n ≤ (1/(2(QRn² - 1))) [ x_{n-1} + x_{n+1} ],   for n > 1.

We analyze this system of inequalities by considering the finite system

x_1 ≤ a_1 + b_1 x_2,
x_n ≤ a_n x_{n-1} + b_n x_{n+1},   n = 2, ..., N-1,
x_N ≤ a_N x_{N-1} + b_N,

subject to the condition that all a_n and b_n are non-negative. By induction, each x_n is bounded by a continued fraction in the a's and b's; for example,

x_1 ≤ g_1 + a_1 / ( 1 - (a_2 b_1)/(1 - (a_3 b_2)/(1 - (a_4 b_3)/(1 - a_5 b_4 ··· ))) ),

where the remainder term g_1 involves the product b_1 b_2 ··· b_N. For the phase-tracking problem,

a_1 = b_1 = 1/(2QR),
a_n = b_n = 1/(2(QRn² - 1)),   n > 1.

If QR = 2, then

a_1 = b_1 = 1/4,
a_n = b_n = 1/(2(2n² - 1)),   n > 1.

It follows that g_n ≈ 0 for all n, and that x_2 ≤ 0.135. But, by (11), x_2 = λ ≥ 3 - √8 > 0.17, a contradiction. This concludes the counterexample.
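The contradiction can be verified numerically by iterating the system of inequalities, starting from the trivial bounds x_n ≤ 1; the sketch below (illustrative) drives the bound on x_2 well below the lower bound (11).

```python
# Iterate the inequalities of the counterexample for QR = 2:
#   x_1 <= (1 + x_2)/(2QR),   x_n <= (x_{n-1} + x_{n+1})/(2(QR n^2 - 1)).
# Starting from x_n <= 1 and repeatedly applying the right-hand sides
# yields ever-smaller upper bounds on x_2 = I(|c_2|^2).
QR = 2.0
N = 30
x = [1.0] * (N + 2)          # x[n] bounds I(|c_n|^2); the tail x[N+1] stays at 1
for _ in range(200):
    new = x[:]
    new[1] = min(x[1], (1.0 + x[2]) / (2.0 * QR))
    for n in range(2, N + 1):
        new[n] = min(x[n], (x[n - 1] + x[n + 1]) / (2.0 * (QR * n * n - 1.0)))
    x = new

lower_bound = 1.0 + QR - (QR * (2.0 + QR)) ** 0.5   # inequality (11), about 0.172
# x[2] ends far below lower_bound: the generalized conjecture fails at QR = 2.
```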
4.6 Strengthening the bounds

We have just shown that the conjecture and the generalization must be generally false. Let us summarize the information which we do have. For x, y, λ, and v as in section 4.4:

(11)  λ ≥ QR + 1 - ( QR(2+QR) )^{1/2};

(20)  v ≥ x + QR y, because (1+QR)y = v + x - I( Re(c_1² c̄_2) ) by (7), and I( Re(c_1² c̄_2) ) ≥ 2x - y by (19);

(12)  y ≥ F(x) = [ v + x - (λx)^{1/2} ] / (1+QR);

(10)  x ≥ w(λ).

These inequalities are illustrated in Figure 6. The shaded region represents the values of the pair (x,y) which are permitted by these inequalities.

If the pair (x_0, y_0), where F(x_0) = y_0 = (v - x_0)/QR, lies above the line y = x, we may invoke the convexity argument introduced in section 4.4 to lower bound I(|c_1|):

I(|c_1|) ≥ min f(x,y)  over permissible pairs (x,y),   where f(x,y) = y^{3/2}/x^{1/2}.
It is easy to see that the minimum is attained at (x_0, y_0) whenever the gradient ∇f(x_0,y_0) satisfies ∇f(x_0,y_0) · v ≥ 0 for every vector

v = (x_1 - x_0) i + (y_1 - y_0) j,

where (x_1, y_1) is a permissible pair. Note that

x_0 = [ ( QR√λ + ( λ(2 + 4QR + Q²R²) + 2(1+2QR) )^{1/2} ) / ( 2(1+2QR) ) ]²,   y_0 = (v - x_0)/QR.

This improves the bound obtained in section 4.4, because the region over which the function f(x,y) need be minimized has been made smaller.

In section 4.7, we give examples of the use of these bounds.
[Figure 6: the region of pairs (x,y) permitted by (10), (11), (12), and (20), with w(λ), z(λ), and the point (x_0,y_0) indicated.]
4.7 Comparison of results

In this section we compare our results with estimates of error bounds obtained by simulations for the following related phase-tracking problem. The signal process satisfies

dθ = ω dt + Q̃^{1/2} dw

and there is a single observation process

dx = (sin θ) dt + R̃^{1/2} dv.

The goal is to find an estimate θ̂ of θ which minimizes the conditional expectation

E( 1 - cos(θ - θ̂) | F_t ).

There are simulation results available for two estimators: (a) the classical phase-lock loop, and (b) a suboptimal filter proposed in [12].

We may compare our results with simulation results because we shall show how the above problem may be modified to give the one which we have studied. We argue non-rigorously; for details, the reader is referred to Van Trees [10] or Viterbi [11].

The observation process solves the differential equation

dx/dt = sin( ωt + Q̃^{1/2} w(t) ) + R̃^{1/2} dv/dt.
48
(sin wt)dx 1[1 - cos 2at] cos (Ql/2w (t))
dt 2 1 -1/2 + -[sin 2wt]sin(Q w(t)) 2 + R (sin wt)d dt
This signal is now passed through a low-pass filter to remove the double-frequency (2\omega t) terms. Moreover, we claim that the process

    \tilde{R}^{1/2} (\sin \omega t)\,\frac{dv}{dt}

may be replaced by

    \Bigl(\tfrac{\tilde{R}}{2}\Bigr)^{1/2} \frac{dv_1}{dt}

for the standard Brownian motion process v_1(t). Thus, we have an observation

    s_1(t) = \tfrac{1}{2}\cos\bigl(\tilde{Q}^{1/2} w(t)\bigr) + \Bigl(\tfrac{\tilde{R}}{2}\Bigr)^{1/2} \frac{dv_1}{dt}.
Setting z_1(t) = 2 \int_0^t s_1(\tau)\,d\tau gives

    dz_1 = \cos\bigl(\tilde{Q}^{1/2} w(t)\bigr)\,dt + (2\tilde{R})^{1/2}\,dv_1.
If we multiply the original observation process by \cos \omega t and argue in a similar way, we obtain

    dz_2 = \sin\bigl(\tilde{Q}^{1/2} w(t)\bigr)\,dt + (2\tilde{R})^{1/2}\,dv_2.
The problem with a single observation may be compared with the one which we have studied, provided that we set Q = \tilde{Q} and R = 2\tilde{R}, where \tilde{Q} and \tilde{R} are the noise strengths for the single-observation problem, and Q and R are the strengths for the problem considered in this thesis.
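The replacement of the modulated noise \tilde{R}^{1/2}(\sin \omega t)\,dv/dt by (\tilde{R}/2)^{1/2}\,dv_1/dt in this derivation rests on the Itô isometry: \mathrm{Var}\int_0^T \sin(\omega s)\,dv(s) = \int_0^T \sin^2(\omega s)\,ds \approx T/2 for large \omega T. A Monte Carlo check of this variance (the function and its defaults are illustrative):

```python
import math
import random

def modulated_noise_variance(omega=25.0, T=20.0, n_steps=2000, n_paths=1000, seed=0):
    """Estimate Var[ integral_0^T sin(omega s) dv(s) ] by Monte Carlo.

    By the Ito isometry the variance equals integral_0^T sin^2(omega s) ds,
    which is close to T/2 when omega*T >> 1 -- the basis for replacing
    (sin omega t) dv/dt by the white noise (1/2)^{1/2} dv1/dt."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    total = 0.0
    for _ in range(n_paths):
        x = 0.0
        for k in range(n_steps):
            # stochastic integral as a sum of modulated Brownian increments
            x += math.sin(omega * k * dt) * rng.gauss(0.0, 1.0) * sqrt_dt
        total += x * x
    return total / n_paths

print(modulated_noise_variance())  # close to T/2 = 10, up to Monte Carlo error
```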
We tabulate results in Figure 7. The first two columns are taken from Willsky [12]. They are estimates of the filter error based on simulation of two techniques: the classical phase-lock loop and a filter proposed in [12].
The third column contains the lower bound of inequality (18). The fourth column contains, for the cases QR = 1.0 and QR = 1.7, a bound obtained by applying the results of section 4.6.
In applying the result of section 4.6, one issue is to obtain tight bounds for \lambda = I(|c_2|^2)^{1/2}. To obtain a lower bound for \lambda, we use (11). For the upper bound, we use the infinite system of equations (6) to derive the infinite system of inequalities
    (QR\,n^2 - 1)\, I(|c_n|^2) \le \tfrac{1}{2}\bigl[ I(|c_{n-1}|^2) + I(|c_{n+1}|^2) \bigr] + I(|c_n|^2)^{1/2} \bigl[ I(|c_{n-1}|^2)^{1/2} + I(|c_{n+1}|^2)^{1/2} \bigr].
These follow from the Hölder inequality and the fact that

    (18)    I(|c_1|^2) \le \frac{1 + I(|c_2|^2)}{2QR}.
By starting with the knowledge that I(|c_n|^2) \le 1 for all n, these inequalities may be used to give smaller bounds. These may, in turn, be used to give still smaller ones. By continuing this iteration, it is possible to show, for example, that if QR = 1.7, then I(|c_1|^2) < 0.311 and I(|c_2|^2) < 0.056, so that \lambda < 0.237. Hence, we have \lambda \in [0.19, 0.24]. By using the argument of section 4.6 to limit the region of permissible values of x and y, and by invoking the convexity argument, we may show that I(|c_1|) > 0.26. This gives the bound for the phase-tracking problem of
    \lim_{T\to\infty} \frac{1}{T} \int_0^T E\bigl[1 - \cos\bigl(\theta(t) - \hat{\theta}(t)\bigr)\bigr]\,dt \le 0.74,
which compares very unfavorably with the estimates obtained by simulation.
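The last step can be made explicit. If, as recalled from section 4.1, the optimal estimate satisfies E[\cos(\theta - \hat{\theta}) \mid Y_t] = |c_1(t)| (we state this relation as an assumption carried over from earlier in the thesis), then:

```latex
% From E[cos(theta - \hat\theta) | Y_t] = |c_1(t)| for the optimal estimate,
% taking expectations and time averages:
\lim_{T\to\infty} \frac{1}{T}\int_0^T
  E\bigl[\,1-\cos\bigl(\theta(t)-\hat{\theta}(t)\bigr)\bigr]\,dt
  \;=\; 1 - I(|c_1|)
  \;\le\; 1 - 0.26 \;=\; 0.74 .
```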
It should be noted that the iteration procedure for upper bounding is useful only for large values of QR. In the case of QR = 1, it seems difficult to improve significantly the bound \lambda \in [0.26, 0.59].
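The iterative tightening described above can be sketched in code. The inequality form below is our reconstruction of the system derived from (6) and should be treated as an assumption (with this form the n = 1 inequality happens not to bind, so the numbers differ from those quoted in the text); the sketch only illustrates the mechanics of starting from I(|c_n|^2) \le 1 and sweeping until the bounds stop shrinking:

```python
import math

def iterate_bounds(QR, N=12, sweeps=60):
    """Iteratively tighten upper bounds I[n] >= I(|c_n|^2), starting from
    the trivial bound 1, using inequalities of the (assumed) form

        (QR n^2 - 1) I_n <= (I_{n-1} + I_{n+1}) / 2
                            + sqrt(I_n) (sqrt(I_{n-1}) + sqrt(I_{n+1})),

    each solved for I_n as a quadratic in s = sqrt(I_n)."""
    I = [1.0] * (N + 2)  # I[0] is kept at 1; I[N+1] is a conservative tail
    for _ in range(sweeps):
        for n in range(1, N + 1):
            a = QR * n * n - 1.0
            if a <= 0.0:
                continue  # the inequality carries no information here
            b = math.sqrt(I[n - 1]) + math.sqrt(I[n + 1])
            c = 0.5 * (I[n - 1] + I[n + 1])
            # a s^2 - b s - c <= 0  implies  s <= (b + sqrt(b^2 + 4ac)) / (2a)
            s = (b + math.sqrt(b * b + 4.0 * a * c)) / (2.0 * a)
            I[n] = min(I[n], s * s)
    return I
```

Each sweep can only shrink the bounds, so the iteration converges monotonically.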
FIGURE 7

    QR      | Phase-lock loop | Suboptimal filter | Lower bound      | Upper bound for 1 - I(|c_1|)
            | (simulation)    | (simulation)      | from (18)        | (sections 4.4, 4.6)
    --------+-----------------+-------------------+------------------+-----------------------------
    0.01691 | 0.0581          | 0.0608            | 0.10             |
    0.0506  | 0.1120          | 0.1171            | 0.15             |
    0.16    | 0.2253          | 0.2447            | 0.26             |
    0.49    | 0.3687          | 0.4098            | 0.49             |
    1.0     | 0.5002          | 0.5312            | 0.68             | 0.67
    1.69    | 0.5692          | 0.6158            | 0.79             | 0.74
5. Conclusions
It is our belief, upon observing the weakness of the error bounds obtained by our technique, that there must exist better ways to obtain bounds for the nonlinear filtering problem. The weakness of the method used here must lie in the convexity argument: it cannot, in general, give a bound tight enough to be useful. The only way the technique suggests for improving the bound is to obtain more information which further constrains the set of permissible values of the random variables and numbers which we have used in our arguments (for example, I(|c_1|^2)). One way to do this might be with the conjecture of section 4.5. Of course, this should not be attempted before it is understood how the bounds would be improved by applying the conjecture.
REFERENCES
[1] B.Z. Bobrovsky and M. Zakai, "A Lower Bound for the Estimation Error for Certain Diffusion Processes," Memorandum No. ERL-M490, Elec. Res. Lab., University of California, Berkeley, October 1974.

[2] R.S. Bucy and A.J. Mallinckrodt, "An Optimal Phase Demodulator," Stochastics, Vol. 1, 1973, pp. 3-23.

[3] M. Fujisaki, G. Kallianpur, and H. Kunita, "Stochastic Differential Equations for the Nonlinear Filtering Problem," Osaka J. Math., Vol. 9, 1972, pp. 19-40.
[4] A.S. Gilman and I.B. Rhodes, "Cone-Bounded Nonlinearities and Mean Square Bounds--Estimation Upper Bound," IEEE Trans. on Automatic Control, Vol. AC-18, No. 3, June 1973, pp. 260-265.
[5] I.B. Rhodes and A.S. Gilman, "Cone-Bounded Nonlinearities and Mean Square Bounds--Estimation Lower Bound," IEEE Trans. on Automatic Control, Vol. AC-20, No. 5, October 1975, pp. 632-642.
[6] H. Kunita, "Asymptotic Behavior of the Nonlinear Filtering Errors of Markov Processes," J. Multivariate Analysis, Vol. 1, 1971, pp. 365-393.

[7] F. Levieux, Filtrage Non-lineaire et Analyse Fonctionnelle, Rapport de Recherche No. 57, IRIA, February 1974.
[8] S.I. Marcus, "Estimation and Analysis of Nonlinear Stochastic Systems," Ph.D. dissertation, Dept. of Electrical Engineering, M.I.T., Cambridge, Mass., May 1975.
[9] D.L. Snyder and I.B. Rhodes, "Filtering and Control Performance Bounds with Implications for Asymptotic Separation," Automatica, Vol. 8, No. 6, November 1972, pp. 747-753.
[10] H.L. Van Trees, Detection, Estimation, and
Modulation Theory--Part II: Nonlinear Modulation Theory. New York: Wiley, 1971.
[11] A.J. Viterbi, "Phase-Locked Loop Dynamics in the Presence of Noise by Fokker-Planck Techniques," Proc. IEEE, Vol. 51, 1963, pp. 1737-1753.
[12] A.S. Willsky, "Fourier Series and Estimation on the Circle with Applications to Synchronous Communication--Part II: Implementation," IEEE Trans. on Information Theory, Vol. IT-20, No. 5, September 1974,
pp. 584-590.
[13] A.S. Willsky and S.I. Marcus, "Analysis of Bilinear Noise Models in Circuits and Devices," Monograph of the Colloquium on the Application of Lie Group Theory to Nonlinear Network Problems, April 1974, IEEE Pub. No. 74 CHO 917-5 CAS.
[14] M. Zakai and J. Ziv, "Lower and Upper Bounds on the Optimal Filtering Error of Certain Diffusion Processes," IEEE Trans. on Information Theory, Vol. IT-18, No. 3, May 1972, pp. 325-331.