Série Mathématiques

B. SAGALOVSKY
Maximum likelihood estimation for discrete-time processes with finite state space; a linear case
Annales scientifiques de l'Université de Clermont-Ferrand 2, tome 69, série Mathématiques, no 19 (1981), p. 229-236
<http://www.numdam.org/item?id=ASCFM_1981__69_19_229_0>
© Université de Clermont-Ferrand 2, 1981, all rights reserved.
MAXIMUM LIKELIHOOD ESTIMATION FOR DISCRETE-TIME PROCESSES
WITH FINITE STATE SPACE
A LINEAR CASE

B. SAGALOVSKY
Universidad Simón Bolívar, Caracas, Venezuela

PRELIMINARY REMARK: This report, presented at the École d'Été de Calcul des Probabilités, Saint-Flour, July 1980, is abstracted from reference [1]. The reader should go to that reference for formal proofs of the results presented and for some additional results.

THE PROBLEM:
Consider a process X_0, X_1, X_2, ..., which takes values in a finite state space S = {1, 2, ..., s}. Denote by F_n the past of the process up to and including the n-th transition, i.e., F_n = σ{X_0, X_1, ..., X_n}. Given F_n, we can consider the probability of the process going to state j at time (n+1), which we can write as

    p_n^j = P(X_{n+1} = j | F_n).

In what follows we assume that p_n^j is known as a function of an unknown parameter a. Write it as p_n^j(a) and look at the behavior of the MLE (maximum likelihood estimate) of a. We make the following restrictive assumptions:
A1: a is real, and the true value a° is known to belong to a bounded interval I = [a̲, a̅].

A2: p_n^j(a) is linear in a, that is,

    p_n^j(a) = r_n^j + a q_n^j,

with coefficients r_n^j, q_n^j determined by the past of the process.

A3: The effect of a is restricted by the following condition: there exists K > 0 such that for each n, for each possible evolution of the process up to time n, we have that, for every j ∈ S, either

    p_n^j(a) = 0 for all a ∈ I,   or   p_n^j(a) ≥ K for all a ∈ I.

We can interpret A3 as stating that, at time n, we know which states might be occupied at time (n+1), and we know that the probability of occupancy of such states is at least K. Even though current work seems to indicate that assumptions A1 and A2 can be relaxed, A3 seems to be essential in the developments that follow.

AN EXAMPLE:
The example that motivated this study has already been analyzed in references [2] and [3] by different methods, and was suggested to this author by one of the authors of [3], Professor P. Varaiya.

Consider a Markov Chain with state space S = {1, 2, ..., s} whose transition probabilities {p_ij, i, j ∈ S} depend both on an unknown parameter a and on a control action u that we can apply on each transition. That is, p_ij = p_ij(a, u). After each transition we can estimate a from our observations on the chain. Call α_n this estimate. We say that u is an adaptive control if u_n = u(X_n, α_n), that is, if the control we exert at time n depends both on the state we occupy at that time and on the estimate we have of what the true value of a may be.
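As a toy illustration of such an adaptive scheme (the two-state chain, the linear parametrization, and the control rule below are all invented for this sketch; they are not the example analyzed in [2], [3]), one can simulate the loop "observe a transition, recompute the MLE α_n, apply u_n = u(X_n, α_n)":

```python
import math
import random

random.seed(0)

A_GRID = [k / 50 for k in range(51)]           # candidate values of a in I = [0, 1]

def p1(a, u):
    """Hypothetical P(X_{n+1} = 1 | X_n, u): linear in a (assumption A2)."""
    return (0.5 if u == 0 else 0.6) + 0.2 * a  # stays inside (0, 1) on I

def log_lik(a, history):
    """Log-likelihood of a given observed (control, next_state) pairs."""
    return sum(math.log(p1(a, u) if j == 1 else 1.0 - p1(a, u))
               for (u, j) in history)

def mle(history):
    """Smallest maximizer of the log-likelihood over the grid (the MLE alpha_n)."""
    if not history:
        return A_GRID[0]
    scores = [log_lik(a, history) for a in A_GRID]
    return A_GRID[scores.index(max(scores))]   # index() returns the first (smallest) tie

def control(a_hat, state):
    """Adaptive control u_n = u(X_n, alpha_n): an arbitrary made-up rule."""
    return 1 if (a_hat > 0.5) == (state == 1) else 0

a_true, x = 0.3, 0
history = []
for n in range(150):
    u = control(mle(history), x)               # control based on the current estimate
    x_next = 1 if random.random() < p1(a_true, u) else 0
    history.append((u, x_next))
    x = x_next

estimate = mle(history)
print(estimate)
```

The point is only the shape of the loop: the control in force at each step is a function of the state and of the running estimate, so the resulting process is no longer Markov, as noted below.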
As the control depends on the whole past of the chain (through α_n), the resulting process is no longer Markov. We can imagine more complex ways of selecting u_n based, for example, both on α_n and the estimated variance of α_n, etc.

Assumptions A1 and A2 keep almost the same form in this special case, and A3 takes the simpler form: there exists K > 0 such that, for every i, j ∈ S and every admissible (a, u), either p_ij(a, u) = 0 or p_ij(a, u) ≥ K.

NOTATION AND MAIN RESULTS:
We define the likelihood of a, given a ∈ I, at time n as

    Λ_n(a) = Π_{m=0}^{n-1} p_m(a)

and the log-likelihood

    L_n(a) = Σ_{m=0}^{n-1} log p_m(a),

where we have denoted by p_m(a) the probability, as a function of a, of the transition actually observed at time m, i.e., p_m(a) = p_m^{X_{m+1}}(a). Observe that p_m(a) ≥ K, by A3. Assumptions A2, A3 allow us to consider the derivatives of L_n(a), which take a simple form:

    L'_n(a) = Σ_{m=0}^{n-1} p'_m / p_m(a),    L''_n(a) = − Σ_{m=0}^{n-1} (p'_m)² / p_m(a)² ≤ 0,

where p'_m, the slope of the linear function p_m(·), does not depend on a; in particular L_n is concave on I. We can now define the MLE of a° at time n, α_n, to be the smallest element of I such that

    L_n(α_n) = max_{a ∈ I} L_n(a).

To study the asymptotic behavior of α_n it now suffices to look at the asymptotic behavior of L'_n, as the following picture suggests. Define

    Y_m(a) = p'_m / p_m(a),    E_m(a) = E[Y_m(a) | F_m],

so that L'_n(a) = Σ_{m=0}^{n-1} Y_m(a).
To simplify notation we assume, without loss of generality, that a° = 0, and also that −1 ∈ I and 1 ∈ I. Now, using the fact that Σ_{j ∈ S} p_m^j(a) = 1 for every a (so that the slopes satisfy Σ_j (p_m^j)' = 0), it can be easily shown that

    E_m(0) = 0 for all m

and, furthermore, that E_m(·) is nonincreasing on I for all m.

Take now a < 0 fixed, the argument being symmetric for a > 0. Then E_m(a) ≥ 0 and L'_n(a) turns out to be a submartingale. We can decompose this submartingale by considering, for each n, the martingale

    M_n = Σ_{m=0}^{n-1} (Y_m − E_m)

and the increasing process

    A_n = Σ_{m=0}^{n-1} E_m

(where we have dropped the a's in Y_m(a), etc.).
After some manipulations, it is easy to show that

    E_m(a) = −a Σ_j ((p_m^j)')² / p_m^j(a),

the sum running over the states j reachable at time m. Therefore, the asymptotic behavior of A_n is easily determined by the asymptotic behavior of the sums Σ_{m=0}^{n-1} Σ_j ((p_m^j)')² as n goes to ∞.
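The decomposition just described can be checked numerically. The two-state chain below is a made-up stand-in (P(next state = 1) = 0.5 + 0.2a, true value a° = 0, score evaluated at a fixed a < 0); the sketch only verifies that L'_n(a) = M_n + A_n holds term by term, and that the increasing part A_n grows linearly, as expected when E_m(a) > 0:

```python
import random

random.seed(1)

def p1(a):
    """Hypothetical P(X_{n+1} = 1): linear in a, true value a° = 0."""
    return 0.5 + 0.2 * a

a = -0.5                      # fixed a < a° = 0
n_steps = 1000

Lp = M = A = 0.0              # L'_n(a), martingale part M_n, increasing part A_n
for _ in range(n_steps):
    x_next = 1 if random.random() < p1(0.0) else 0        # true dynamics (a° = 0)
    # Y_m(a) = p'_m / p_m(a) for the transition actually observed
    y = 0.2 / p1(a) if x_next == 1 else -0.2 / (1.0 - p1(a))
    # E_m(a) = E[Y_m(a) | F_m], expectation taken under the true parameter a° = 0
    e = p1(0.0) * (0.2 / p1(a)) + (1.0 - p1(0.0)) * (-0.2 / (1.0 - p1(a)))
    Lp += y
    M += y - e
    A += e

assert abs(Lp - (M + A)) < 1e-9       # L'_n = M_n + A_n exactly
print(A / n_steps)                    # per-step drift; here E_m(a) = 1/12 > 0
```

Since E_m(a) is a positive constant in this toy chain, A_n grows like n, which is the regime discussed next.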
Suppose A_n had the same asymptotic behavior as n. Then we would have

    L'_n(a) → +∞ ?

and our a < 0 could not be the MLE for n large (see the first or third drawings in our previous picture). To have L'_n of the same order as A_n we would need

    M_n = o(A_n).

To see under what conditions this would happen, consider the conditional variances

    V_m = E[(Y_m − E_m)² | F_m].

After some algebra, one can show that

    V_m ≤ C E_m

for a constant C depending only on K and a.
Let's call B_n = Σ_{m=0}^{n-1} V_m the process of "variations" of M_n. This process plays an important role in defining the behavior of M_n, since it is, somehow, the natural time scale for M_n. In particular, we can find in Neveu [4] that, almost surely, for each realization, either B_n converges and M_n then converges to a finite limit M_∞, or B_n → ∞ and then M_n/B_n → 0. By the bound we got on V_m we have that B_n ≤ C A_n, so that we conclude that, for almost all realizations,

    M_n / A_n → 0 on the set where A_n → ∞ and B_n → ∞.
If A_n → ∞ and B_n remains bounded, then M_n converges and again L'_n(a) = M_n + A_n → +∞. In either case M_n is negligible with respect to A_n, whence L'_n(a) behaves like A_n, and we can now write, with no question mark,

    L'_n(a) → +∞ a.s. on the set where A_n → ∞

(an analogous result states that L'_n(a) → −∞ a.s. if a > 0).
We already suggested an argument for showing that, if L'_n(a) → +∞, then a cannot be the MLE for a°.
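That argument is just the concavity of L_n at work: since each p_m is linear in a, L''_n ≤ 0, so wherever L'_n is strictly positive the maximum over I must lie strictly to the right. A minimal numeric check, with an arbitrary concave quadratic standing in for L_n:

```python
# Arbitrary concave stand-in for L_n on I = [-1, 1], maximized at 0.4.
def L(x):
    return -(x - 0.4) ** 2

def dL(x):
    return -2.0 * (x - 0.4)

grid = [k / 1000 for k in range(-1000, 1001)]
best = max(L(x) for x in grid)
argmax = min(x for x in grid if L(x) == best)   # smallest maximizer, as for the MLE

a = -0.5
assert dL(a) > 0       # positive slope at a < 0 ...
assert argmax > a      # ... so the maximizer lies strictly to the right of a
print(argmax)          # 0.4
```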
This suggests the following theorem, whose proof can be found in [1].

THEOREM: Except for a negligible set of realizations, if the sequence of MLE's has an accumulation point a* ≠ a° = 0, then A_n(a*) remains bounded as n → ∞.

Corollary: Under the conditions of the theorem, α_n → a° almost surely on the set of realizations where A_n(a) → ∞ for every a ≠ a° in I.
Some other results can be proven using our knowledge of the limit behaviour of L'_n. The most important seems to be the following.

Proposition: α_n → a* almost surely, where a* may depend on the realization.
APPLICATION:

We can now apply the results we obtained to the example that motivated this work, a Markov Chain with transition probabilities p_ij(a, u). Let's assume

A4: u_n, the control in force after X_n has been observed, is of the adaptive form u_n = φ(α_n, X_n), and p_ij(a, φ(a, i)) is continuous in a, for every i, j ∈ S.

Proposition: Under A1 to A4, and except for a negligible set of realizations, if the sequence of MLE's has an accumulation point a* ≠ a° = 0, then

    p_ij(a*, φ(a*, i)) = p_ij(a°, φ(a*, i))

for every state i that is reached infinitely often and for every j ∈ S.

A5: The chain is
irreducible for all pairs (a, u), in the sense that for all i, j ∈ S there exist states i = i_0, i_1, ..., i_r = j such that p_{i_l i_{l+1}}(a, u) > 0 for l = 0, ..., r−1.

For a chain satisfying A5, Feller [5] shows that all states are reached infinitely often. This result carries over to our situation (using A3), and we obtain our final

Proposition: Under A1 to A5, and except for a negligible set of realizations, α_n → a*, where a* satisfies p_ij(a*, φ(a*, i)) = p_ij(a°, φ(a*, i)) for every i, j ∈ S.
This last result can be phrased as saying that a* is such that, if we were to take the control U(i) = φ(a*, i), depending only on the present state of the chain, then a° and a* would be indistinguishable, since they would produce equal values of all the p_ij's. This characterizes the set of all possible limit points a* of the sequence of maximum likelihood estimators.
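To make this characterization concrete, here is a hypothetical two-state instance (the parametrization p_i1(a, u) = 0.5 + a·q(i, u) and the control map φ are invented for the sketch). Scanning a grid for the values a* that produce the same transition probabilities as a° under the control φ(a*, ·) shows the typical outcome: besides a° itself, a whole range of "false" values survives, because the control those values induce switches off the informative transitions:

```python
def q(i, u):
    """Hypothetical sensitivity of p_i1 to a; the control u = 1 switches it off."""
    return 0.2 if u == 0 else 0.0

def p_i1(a, u, i):
    """Hypothetical P(next = 1 | state i, control u), linear in a."""
    return 0.5 + a * q(i, u)

def phi(a, i):
    """Hypothetical control map: use u = 1 whenever the estimate exceeds 0.5."""
    return 1 if a > 0.5 else 0

a_true = 0.0
grid = [k / 100 for k in range(101)]            # candidate a* in [0, 1]

# a* is a possible limit point iff p_ij(a*, phi(a*, i)) = p_ij(a°, phi(a*, i))
# for all i, j; with two states it suffices to compare j = 1.
candidates = [a for a in grid
              if all(p_i1(a, phi(a, i), i) == p_i1(a_true, phi(a, i), i)
                     for i in (0, 1))]

print(candidates[0], candidates[1], candidates[-1], len(candidates))
```

Here only a* = a° = 0 and every a* > 0.5 survive: under the control induced by a large estimate, the chain no longer reveals a, which is exactly the indistinguishability described above.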
REFERENCES:

[1] B. Sagalovsky, "Adaptive Control and Parameter Estimation in Markov Chains: a Linear Case", Internal Report USB/79/12, Univ. Simón Bolívar, Caracas, 1979. To appear, IEEE Trans. on Aut. Control, 1981.

[2] P. Mandl, "Estimation and Control in Markov Chains", Adv. Appl. Prob., 6, 40-60, 1974.

[3] V. Borkar, P. Varaiya, "Adaptive Control of Markov Chains, I: Finite Parameter Set", IEEE Trans. on Aut. Control, AC-24, 953-958, 1979.

[4] J. Neveu, "Discrete-Parameter Martingales", North-Holland Pub. Co., 1975.

[5] W. Feller, "An Introduction to Probability Theory and its Applications", 3rd edition, Vol. 1, Wiley, 1968.

ACKNOWLEDGEMENT:
Besides Professor P. Varaiya, who suggested the problem, and the Statistics Group at Universidad Simón Bolívar, which provided suggestions for alternative