R ENDICONTI
del
S EMINARIO M ATEMATICO
della
U NIVERSITÀ DI P ADOVA
P AOLO D AI P RA
A minimum entropy problem for stationary reversible stochastic spin systems on the infinite lattice
Rendiconti del Seminario Matematico della Università di Padova, tome 92 (1994), p. 179-194
<http://www.numdam.org/item?id=RSMUP_1994__92__179_0>
© Rendiconti del Seminario Matematico della Università di Padova, 1994, tous droits réservés.
L’accès aux archives de la revue « Rendiconti del Seminario Matematico della Università di Padova » (http://rendiconti.math.unipd.it/) implique l’accord avec les conditions générales d’utilisation (http://www.numdam.org/conditions).
Toute utilisation commerciale ou impression systématique est constitutive d’une infraction pénale. Toute copie ou impression de ce fichier doit conte- nir la présente mention de copyright.
Article numérisé dans le cadre du programme Numérisation de documents anciens mathématiques
http://www.numdam.org/
for Stationary Reversible Stochastic Spin Systems
on
the Infinite Lattice.
PAOLO DAI
PRA
1. Introduction.
In this paper we formulate and solve a variational
principle
that be-longs
to a wide and very much studied class ofproblems that, through-
out this
work,
will be called minimumentropy problems for
stochasticprocesses. We start
by giving
a basicexample
of aproblem
in suchclass.
Suppose
we aregiven
a stochastic process X =[0, 1]}
tak-ing
values in a measurable space E. Thepath space 6)
= is pro- vided with the natural a-fieldgenerated by cylinder
sets. The process X induces on 6D aprobability
measure that we denoteby
P. In the restof the paper we often
identify
E-valued stochastic processes andproba- bility
measures on W. The collection ofprobability
measures on 6) is de-noted
by
Now let
Q
be another element ofM(6D).
The relativeentropy of Q
with
respect
to P is defined to bewhere
h(Q P)
is assumed to be + 00 if P orL 1 ( Q ).
Noticethat, by
Jensen’sinequality,
0 andh(Q I P)
= 0 if andonly
ifQ
= P. Now let A be a subset ofX(6D),
that will be called the con-straint set. A minimum
entropy problem
consists inminimizing
under the constraint
set Q
E Aand,
ifpossible,
infinding
aQ
E A thatrealizes the minimum.
(*) Indirizzo dell’A: Universita di Padova,
Dipartimento
di Matematica, ViaBelzoni 7, 35100 Padova,
Italy
andRutgers University, Departement
of Mathe- matics, New Brunswick, NJ 08903, U.S.A.A minimum
entropy problem
is therefore identified once a stochas- tic process and a constraint set aregiven.
Thetype
of constraint set which is most often studied in the literature is the one thatgives
rise tothe so called
Schroedinger problem ([1, 5, 7, 8,10,11],
and it consists infixing
the initial and finalmarginal
ofQ.
To be moreprecise,
let us de-note
by t E [ 0, 1 ],
theprojection
ofQ
at timet,
i.e. for G E E measurableMoreover let g, v be two
probability
measures on E. In theSchroedinger problem
the constraint set is of the formRoughly speaking,
to solve theSchroedinger problem
means to con-struct a stochastic
«bridge» from [A
to v which is «as close aspossible»
tothe
given
process P. The «closeness» we aretalking
about in this state- ment is wellexplained by
thelarge
deviationinterpretation
of an en-tropy problem [5],
that we nowbriefly
sketch.Consider the
product space 0 and
suppose weprovide
it withthe
product
measure P In other words we aregiven
a collec-tion
X(i) =
E[0, 1]},
iE N,
ofindependent copies
of X. Forn E N and oi E SZ we define the
empirical
measureL~ ~
E3K(m) by
where is the Dirac measure concentrated at
w(i)
EBy
the Er-godic
Theoremfor P-almost every oi, where the limit is taken with
respect
to somesuitable
topology
on(for
instance the one inducedby duality by
the bounded measurable functions on
(D).
Now let C cX(6D)
be such thatP ~
C. A refinement of(3)
isgiven by
thefollowing large
deviationprinciple,
which holds under someregularity assumptions
on C:In
particular,
forQ
EX(6D)
andNQ
a «small» setcontaining Q (small
with
respect
to the variations ofh(’IP»
onehas, roughly,
The
asymptotic
estimate(4) yields
thefollowing interpretation
of thesolution of an
entropy problem
with constraint A: thesolution Q
of theproblem is,
among the elements ofA,
the evolution which is the mostlikely
to be«observed»,
in the sensethat,
in the limit maxi-mizes the
probability
thatLn, (JJ
is close toQ.
The
entropy problems
we have described so far make sense for anarbitrary
process X. Thepicture
is somewhat different when we knowa
prior
that the process X =IX(t): t e.R},
now defined on the wholereal
line,
isstationary,
i.e. theprobability
measure P it induces on thepath space 6D
=E R
is invariant under themaps ( 0t : t where,
forx E D,
The
stationarity
of the processsuggests
a different way ofcomputing
its statistics. If we are
given
anarbitrary
process, its statistics can becomputed by producing
severalindependent copies
of the process and thenby evaluating
theempirical
averages defined in(2).
Forergodic stationary
processes this is not necessary:by letting
the process run fora
long enough time,
all the statistics can be determined. That is to saythat, by defining
we have
for P-a.e. X E (JJ. Under some
assumptions
onP,
theexponential
rate ofconvergence in
(6)
is controlledby
asuitably
definedentropy function, i.e.,
forA c JTL( (JJ),
We do not
specify
here the conditions on P that areusually assumed,
both because
they
are not relevant for thespecific
model we aregoing
to
consider,
and because(7)
is believed to be true in muchgreater
gen-erality.
The function is defined as follows. Let ~- be the a-field in Wgenerated by (rt :
t ~0},
and denoteby Q~ (resp. P (ù)
theregular
conditional
probability
distribution(r.c.p.d.) of Q (resp. P)
with re-spect
to ~- . Moreover we letff1
to be the a-fieldgenerated by (rt :
0 ~~ t ~
1}.
Then we definestationary,
and andAs we did for
(4),
thelarge
deviation estimate(7) gives
rise to an-other class of
entropy problems,
where theentropy
function is now H instead of h.Thus, given
a constraint A we want to minimizefor Q varying
in A. Aminimizer Q
* willbe,
among the ele- ments ofA,
the one which is morelikely
to be close to in the limitn ~ ~ . The
type
of constraint we will be interested in is a very naturalone for
stationary
processes, and consists infixing
the one-time distri- bution. In otherwords, for g
aprobability
measure onE,
we letThe minimum function = inf has a further
interesting
QeA
interpretation.
Let us define theempirical
measurewhere
3K(E)
is the set ofprobability
measures on E. Then it can be de- rivedby (7) that,
for D c3K(E) sufficiently regular,
In other words is the rate function that controls the
large
devia-tions of the
empirical
measureLn,
x.The purpose of this paper is to solve a minimum
entropy problem
for a class of
stationary
Markov processes which is of relevance in Sta- tistical Mechanics. As for theexamples
we have discussedbefore,
theentropy
function we will introduce comes from alarge
deviationprob- lem,
that will be stated in Section 3. The main feature of our model is the infinitedimensionality
of the state space; for this reason when wedefine the
empirical
averages, we do notonly
average over the time but also over the differentcomponents
of the process. This will lead to the definition of anentropy
function that controls thelarge
devia-tions of
space-time empirical
averages. We then ask for the minimum ofH(Q I P)
under the constraint Vt e R. Under someassumptions
on the
probability
measure li we will be able to find theminimizer, by
showing
that it is Markovian andgiving
its Markovgenerator
ex-plicitely.
In Section 2 we introduce our
model,
and in Section 3 we summarizethe known
large
deviations results. The related minimumentropy problem
is stated and solved in Sections 4 and 5.2. Stochastic
spin systems.
In this section we introduce the class of stochastic processes we will be
dealing
with in this paper. We letX = { -1, + 1 ~
to be the set ofspin
values. The Markov processes we aregoing
to define take value onxz d,
i.e. for any site i of the d-dimensional latticeZd
there is an associ- atedspin
value. Theupdating
mechanism isspecified by assigning
anonnegative
functionc(i, a),
defined for iE XZd.
Theprobability
of
changing
thesign
to thespin
at the site iduring
a time interval oflength dt,
conditioned to theknowledge
of the wholeconfiguration
attime t,
isgiven by
Moreover
spins
at different sites areupdated independently.
Inpartic-
ular the
probability
ofchanging
thespin
at two different sites in thesame interval
[t, t
+/M]
is The): i
are usu-ally
calledflip
rates.The informal definition we have
just given
can be maderigorous
asfollows. First of all we
provide X2
with theproduct
of the discretetopology
on X. Thecorresponding
space of real continuous functions is denotedby
it becomes a Banach space with the usual sup-norm.We say that a function
f:
R is local if itsdependence
on a EXZd
isonly
where A is some finite subset ofZd.
We de-note
by D
the set of local functions. we can define thefollowing operator:
where
It is
proved
in[6] that,
under theassumption
the closure in
generates
a Markovsemigroup.
Moreoverthe
corresponding
Markov process is a Feller process(i.e.
the elementsUt, t
> 0 of thesemigroup
map bounded measurable functions into con- tinuousfunctions).
Notice that condition(8) essentially
says thatc( i, o-)
does not
depend
too much on thespin
of sites that are far from i. We no-tice that
(8)
is satisfied when theflip
rates are translation invariant(i.e. c(i, a)
=c( o,
where =di- + j ))
and local(i.e. c( o, a)
de-pends only
i E where A is a finite subsetof Zd). Only
thesetype
of models will be considered in the rest of the paper.3.
Large
deviations.In this section some of the
large
deviations results obtained in[2,3]
are summarized. We assume
c : ~ -1, 1 ~~Zd -~ I~ +
is a local andstrictly positive
function. As we have seen in Section 2 theoperator
is the
generator
of a Fellersemigroup.
We denote~e{2013l,l}~}
thecorresponding family
of conditionalprobability
measures. In
particular,
for c= 1,
we writePo, ~
inplace
of Noticethat
Po ç
’ issimply
theproduct
measure11 Po,
’ ’~~i>,Po,
~(i)being
theMarkov
family
of aPoisson-spin
process withintensity
one. For obvi-ous reasons the process
generated by
L ~ with c = 1 is called non-inter-acting spin system.
It is well known that thetrajectories
of such proc-esses are, with
probability
one, elements of ~2 =D(R, ~ -1, 1 ~Zd ),
theset of
right
continuous with left limit functions from Rto { -1,
1~Zd,
where the
topology on { -1,
is theproduct
of the discretetopolo-
gy. We also
provide O
with the Skorohodtopology (see [4])
and the cor-responding
Borelo-field,
where the measures we will beconsidering
are defined. On S~ we define the
family
ofspace-time
shift mapsfOt, n:
definedby
DEFINITION 3.1. A
probability
measureQ
said to be sta-tionary if it is
invariantfor
all the maps n-We denote
by
the set ofstationary
measures,provided
withthe weak
topology.
The Borel sets for thistopology provide 0
with astructure of measurable space.
Given oi E 12 we define its
n-periodic
version aswhere
For T C R x
Zd
we letFT
to be the o-field of subsets of S2generated by
the
i)
where Sometimes wewill use, for notation
i )
In thefollowing
defini-tion we denote
by
the set of bounded measurable functions Q - R.DEFINITION 3.2. Let oi E Q
and ~
E Then th empirical
pro-cess
Rn, w is
the elementof
whoseexpectations
aredefined
asfoLLows:
Notice
that,
in order tomake Rn, Ct) stationary,
it is essential to usethe
n th-periodic
version of m in(9).
We also remark that the mapis
lfio,
n] xv -measurable.
In the rest of the paper the a-field n] x vnwill be
simply
denotedby ~n.
Some more notations are now needed. We introduce on
Zd
the lexi-cographic
totalorder,
and denoteby
thecorresponding
order rela- tion. Consider the setFor
Q
E(S2 )
we letQ~
to denote theregular
conditionalprobability
distribution
(r.c.p.d.) of Q
withrespect
to T- . In thefollowing
defini-tion denotes the
Radon-Nikodym
derivative of the in- dicated measures restricted to the 7-field~1.
DEFINITION.
Let Q
E(Q).
The relativeentropy of Q
withrespect
to the Markov
farrcily
isdefined by
where
H(Q) =
+ 00if
theRadon-Nikodym
derivative in( 10 )
is not de-fined
or itslogarithm
is not inL 1 ( Q ).
In what
follows,
for m EQ,
i EZd ,
t ER,
we letNotice that
w t (i) = (
The main resultof [2]
is thefollowing Large
DeviationPrinciple.
THEOREM 1. Let A be a Borel measurable subset
of
~s(0),
and de-note with A dand
A
its interior and its closurerespectively. Then, for every ~
EXz’
where
In a
large
deviationprinciple
it isparticularly
relevant to determinethe zeroes of the rate function
T~(’).
Thefollowing
is the main result contained in[3].
We need to use the a-field ffP = t ~0} and,
for a
given Q
e we letQjj
denote itsr.c.p.d.
withrespect
to ffP.THEOREM 2. For have
~(Q)~0
and= 0
if and only
=Q-a,s..
In other words = 0if
andonly if Q is
astationary
Markovian measuregenerated by
L ~The
following proposition,
alsoproved
in[3],
establishes some prop- erties of the rate functionH ~ ( ~ )
that will be used later in this paper.THEOREM 3. The rate
function 31(8 (Q)
~ R+ is lower semi- continuous and hascompact
levelsets,
i. e.for
every l > 0 the set~ ~
E 31(8(Q): 11
iscompact
in the weaktopology.
Moreover4. The I function and its basic
properties.
Now we are
ready
to state the minimumentropy problem
we wantto solve.
Let g
be an element of3K~ (XZd ),
i.e. aprobability
measure onXZd
which is invariant under the shift maps0j , I E Zd.
Ourgoal
is tominimize the function
HC(Q)
under the constraint for every t E R. In this section wegive
the basicproperties
and thelarge
devia-tion
interpretation
of the minimum functioninf IH’(Q):
==
[Al.
In the nextsection,
under someassumption
on the model and on ~, we will be able toactually compute
and aminhnizer Q* (i.e.
ny O* = u anf Hc(Q*)
The
large
deviationtheory
we have summarized in theprevious
sec-tion allows to determine the
asymptotic
behavior ofquantities
of theform -
for any bounded
measurable,
Rm-valued function F on thepath
spaceS~,
and any Borel set A c
Rm,
m > 0. A function F of thetype
describedabove is often called an observable.
A class of observable that
play a significant
role are those for which there exists a functionf:
such thatIn other words F
depends only
on the value of thetrajectory N
at agiven
time. Noticethat,
for any bounded measurablef:
S~ we candefine
where F is defined
by (12).
It is easy to checkthat,
for n EN, oi
E,~,
can be identified with a measure on
XZd
which is invariant under the shift maps(0j :
i.e.L e Therefore,
afterhaving
provided
with the weaktopology
one can define afamily
ofmeasures f r,,,:
n EN, ç E XZd I
on3K~ (XZd ) by
and ask about the
large
deviationproperties
of The first fact westate is an immediate consequence of the contraction
principle (see [9]).
THEOREM 4. For
any ~
EX , r,, , obeys
thefoLlowing large
devia-tion
principle: for
every A C ~s(XZd)
where the rate
function I~(~)
isdefined by
being
themarginal of Q
at any time t.In
(15)
the function is defined as aninfimum,
but it isactually
a
minimum,
as shown in thefollowing proposition.
PROPOSITION 4.1. Let p. be such that 00. Then there such that
~c( ~ ) _
p. andH c (Q)
=PROOF.
By definition,
there exists a sequenceQn
E3K~(Q)
suchthat
~(~n ) _
g and HC - Ic(,u).
Since HC hascompact
level sets thesequence
Qn
has a limitpoint Q,
andclearly 7r(Q) =
IA.By
the lowersemicontinuity
of HC theequality HC(Q)
=easily
follows. mWe conclude this section
by giving
the mainproperty
ofTHEOREM 5. For any p. E
(Xd),
~ 0 = 0if
andonly if
it is an invariant measurefor
thesemigroup generated by
L c.PROOF. It is an obvious consequence of
Proposition
4.1 and Theo-rem 3.
5. The case of reversible
systems.
In this section we attack the
problem
ofminimizing
H~ under theconstraint
~(~) _ ~u, where g
E In order for ourargument
towork we
impose
aquite
restrictiveassumption
on thegenerator
L c.Moreover we will be able to find
only for g belonging
to a densesubset of 3K~
(XZ ).
The formula weget suggests
a naturalconjecture
on what should be for a
general g
E but we have notbeen able to prove it.
For t4
E we denoteby
the conditionalprobability
with
respect
to the a-fieldgenerated by
thespins
i.These conditional
probabilities
are often called the localspecifications
of the measure g. It is well known that the local
specifications
do notnecessarily determine g uniquely;
whenuniqueness
fails we say that aphase
transition occurs.DEFINITION 5.1. The
operator
Lc is said to be reversible with re-spect
to theprobability
measure IA EmLs (XZd) if, for
every i EZ,
a E XZd
By
translationinvariance, equality (16)
for i = 0implies
all the oth-ers. Since
c( ~ )
isstrictly positive
and local it followseasily
from(16)
that is also
strictly positive
and local. We therefore introducethe defined
by
It is
proved by using
standardargument ([6]) that, if li E S
thenwhere
are such that
Jv +
i =Jv
for everyv c Z,
i EZ, Jv
= 0 if thediameter of V is
large enough,
and is a normalization factor inde-pendent
of~( o ).
The is called aninteraction., and g
is saidto be a Gibbs measure for that interaction. Notice that the fact that
Jv = 0
fordiam(kJ large enough
isequivalent
to thelocality
of,u(~( o) ( ~).
This isusually expressed by saying
that the interaction hasfinite
racnge.It follows from the
argument
above that if Leis reversible with re-spect
to then g is Gibbsian for a finite range interaction. Sometimeswe say that g is a reversible measure for the
system,
or that thesystem
is reversible with
respect
to The fundamentalproperty
of reversiblemeasures is
given
in thefollowing proposition,
whoseproof
can befound in
[6].
PROPOSITION 5.2.
If
Leis reversible withrespect
to then ~u is in- variantfor
thesemigroup e - tL c.
Moreover v E:1[8 (xzd)
is also invariantif
andonly if, for
every a EXzd,
In the
remaining part
of this section we want to solve thefollowing problem: given
ve §,
find the minimum ofH (Q)
under the constraint7r(Q)
= v. In other words we want tocompute for g
ES.
In what follows we assume that L ~ is reversible
for g e §.
For agiven
vE S
we defineClearly
c Y isstrictly positive
and local.PROPOSITION 5.3. The
operactor LeY
is reversible withrespect
to V.PROOF. It is
just
anelementary computation:
It follows from
Propositions
5.2 and 5.3 that there is a Markovianmeasure Q
which is an element of its Markovfamily
of condi-tional distributions and
7r(Q")
= v.Now let us fix c~’ E 12 and n E N. In the
following
formula anyoccurrence of
~(z), replaced by w’(i).
DefineNow we state two lemmas that will be used later. Their
proof
is not
given
since it is astraightforward adaptation
of theproofs
of Lemma 4.3 in
[3]
and Lemma 7.1 in[2].
LEMMA 5.4. For every such that
LEMMA 5.5. be the set
of
all mapsfrom 0
to Q.Then, for
every
Q
E such thatHC(Q)
00Now we are
ready
to prove our main result.THEOREM 6. The measure
Qv
Y has thefollowing property:
Moreover
PROOF.
By
Theorem3,
Lemmas 5.4 and 5.5 we havefor every w’ E 0. In
particular
we can choose m ’ to betime-indepen-
dent. Now define
where
~(a)
does notdepend
ona(0).
MoreoverWe also notice that
Now,
if we definethen,
for i =0, ... ,
n - 1and,
moreover,for some constant C > 0. In an
analogous
way we defineh’(,7)
andh’(a-).
For (o’time-independent
we haveTherefore it follows from
(19)
thatWe notice that the
argument
we have beenfollowing
proves a littlemore than
that, namely
that forQ
E withHC(Q)
o0Now we
let Q
be such thatx(Q)
= v and 00. Inparticular
. Moreover . Therefore
which
completes
theproof.
It is
immediately
noticed that formula(18)
makes sense for everyv and it is natural to
conjecture
that itactually
holds forevery v E 3K~
(XZd ).
However we do not know how to prove itsince,
al-though ~
is dense in a lower semicontinuous function is notuniquely
determinedby
its restriction to a dense set.Nevertheless, density
andsemicontinuity, together
with Theorem 6immediately give
the
following.
COROLLARY 5.6. For every
REFERENCES
[1] P. DAI PRA, A stochastic control
approach
toreciprocal diffusion
processes,Appl.
Math.Optimiz.
23 (1991), pp. 313-329.[2] P. DAI PRA,
Space-time large
deviationsfor interacting particle
systems, Comm. PureAppl.
Math., 46 (1993), pp. 387-422.[3] P. DAI PRA,
Large
deviations andequilibrium
measuresfor
stochasticspin
systems, Stochastic Processes and TheirApplications,
48 (1993), pp.9-30.
[4] S. N. ETHIER and T. G. KURTZ, Markov processes, characterization and convergence, John
Wiley
& Sons (1986).[5] H. FOLLMER, Random
fields
anddiffusion
processes, in Ecole d’Eté de Saint Flour, XV-XVII (1985-87), p. 101-203, Lecture Notes in Mathemat-ics, 1362,
Springer-Verlag
(1988).[6] T. M. LIGGETT,
Interacting
ParticleSystems, Springer-Verlag
(1988).[7] M. NAGASAWA,
Transformation of diffusion
andSchrödinger
processes, Prob. Th. Rel. Fields, 82 (1989), pp. 109-136.[8] M. PAVON - A. WAKOLBINGER, On
free
energy, stochastic control andSchrödinger
processes, in Proc.Workshop
onModeling
and Controlof
Uncertain
Systems,
editedby
G. B. DI MASI, A. GOMBANI and A.KURZHANSKI, Birkhauser (1991).
[9] S. R. VARADHAN,
Large
Deviations andApplications,
CBMS-NSFRegion-
al
Conference
series inApplied
Mathematics, Vol. 46,Society
for Industrial andApplied
Mathematics,Philadelphia
(1984).[10] A. WAKOLBINGER, A
simplified
variational characterizationof Schrödinger
processes, J. Math.Phys,
30 (12) (1989), pp. 2943-2946.[11] J. C. ZAMBRINI, Stochastic mechanics
according
toSchrödinger, Phys.
Rev. A, 33 (3) (1986), pp. 1532-1548.
Manoscritto pervenuto in redazione il 26 novembre 1992 e, in forma definitiva, il 18 febbraio 1993.