• Aucun résultat trouvé

On conditional independence and log-convexity

N/A
N/A
Protected

Academic year: 2022

Partager "On conditional independence and log-convexity"

Copied!
11
0
0

Texte intégral

(1)

www.imstat.org/aihp 2012, Vol. 48, No. 4, 1137–1147

DOI:10.1214/11-AIHP431

© Association des Publications de l’Institut Henri Poincaré, 2012

On conditional independence and log-convexity 1

František Matúš

Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Pod vodárenskou vˇeží 4, 182 08 Prague, Czech Republic. E-mail:[email protected]

Received 20 November 2010; revised 21 April 2011; accepted 26 April 2011

Abstract. If conditional independence constraints define a family of positive distributions that is log-convex then this family turns out to be a Markov model over an undirected graph. This is proved for the distributions on products of finite sets and for the regular Gaussian ones. As a consequence, the assertion known as Brook factorization theorem, Hammersley–Clifford theorem or Gibbs–Markov equivalence is obtained.

Résumé. Si des contraintes d’indépendance conditionnelle définissent une famille de distributions positives qui est log-convexe, alors cette famille doit être un modèle de Markov sur un graphe non-dirigé. Ceci est démontré pour les distributions sur le pro- duits d’ensembles finis et pour les distributions gaussiennes régulières. Par conséquent, l’assertion connue comme le théorème de factorisation de Brook, le théorème de Hammersley–Clifford ou l’équivalence de Gibbs–Markov est obtenue.

MSC:Primary 62H05; 62M40; secondary 62H17; 62J10; 05C50; 11C20; 15A15

Keywords:Conditional independence; Markov properties; Factorizable distributions; Graphical Markov models; Log-convexity; Gibbs–Markov equivalence; Markov fields; Hammersley–Clifford theorem; Contingency tables; Gibbs potentials; Multivariate Gaussian distributions; Positive definite matrices; Covariance selection model

1. Main result

LetN be a finite set,Xi,iN, finite nonempty state spaces,XI the product ofXi overiI, andπI the coordinate projection ofXN toXI,IN. The marginal of a probability measure (p.m.)P onXN toI is the p.m.πIP onXI given by

πIP (πIx)=

yX

P (y)δx,yI , xXN,

whereδx,yI equals one ifπIx=πIy and zero otherwise.

A p.m.P onXNsatisfies aconditional independence(CI-) constraint if

πij KP (πij Kx)·πKP (πKx)=πiKP (πiKx)·πj KP (πj Kx), xXN, (1) wherei, jN are different andKN\ij. By convention, an elementiN is not distinguished from the singleton {i}and the sign∪for unions of subsets ofN is omitted.

1Supported in part by Grant Agency of Academy of Sciences of the Czech Republic, Grant IAA 100750603, and by Grant Agency of the Czech Republic, Grant 201/08/0539.

(2)

The family of all ordered couples(ij|K)withi,j andKas above is denoted byR. ForLRletPLdenote the family of p.m.’s onXNthat satisfy theCI-constraint given by each(ij|K)LandPL+=PLP+. Here,P+is the family of p.m.’s onXN that are positive in the senseP (x) >0 for allxXN. Given a p.m.P, the projectionsπi, iN, can be interpreted as the random variables jointly distributed according toP. Their distribution belongs toPL if and only ifπi andπjare stochastically independent givenk)kK for all(ij|K)L.

A nonempty subfamily ofP+islog-convexif it contains together with p.m.’sP andQalso the p.m. proportional toxPt(x)Q1t(x),xXN, for all 0< t <1. The general definition of log-convexity is recalled and discussed in Section5.3.

A hypergraph(N,A)consists of the vertex setN and a nonempty familyAof subsets ofN, called hyperedges.

A p.m.P onXN isfactorizablew.r.t. the hypergraph if for eachIAthere exists a real-valued functionψI onXI such that

P (x)=

I∈A

ψIIx), xXN. (2)

The family of all p.m.’s on XN that factorize w.r.t. the hypergraph is denoted by FA and FA+=FAP+. An undirected graphG=(N, E)has a vertex setN and an edge setE, contained in the familyN

2

of all two-element subsets ofN. A setLof vertices is a clique ofGifL

2

E. IfAis the family of cliques ofG, the notationFG+is preferred toFA+and one speaks about the factorization w.r.t.G.

Theorem 1. IfLRandPL+is log-convex then this family coincides withFG+whereG=(N, E)is the graph with ijEif and only if(ij|K) /Lfor allKN\ij.

In other words, given a familyLthe graphGis constructed onN by removing fromN

2

allijsuch that(ij|K)L for someKN\ij; Theorem1asserts that the log-convexity ofPL+impliesPL+=FG+.

IfLconsists exclusively of the couples(ij|K)havingK=N\ij then it is log-convex. This follows easily from a well-known equivalent definition ofCI-constraints, see (6) below. In this special case Theorem 1implies what is called by statisticians Brook factorization theorem [5,16] or Hammersley–Clifford theorem [3,7] and by physicists Gibbs–Markov equivalence [19,26].

Corollary 1. IfG=(N, E)is an undirected graph andLconsists of all couples(ij|N\ij )such thatij is not an edge ofGthenPL+=FG+.

A proof of Theorem 1 is deferred to Section4. It is based on properties of interaction spaces, summarized in Section2, and new observations on the behaviour of the familyPLaround the uniform p.m., presented in Section3.

Discussion and remarks are collected in Section 5 that contains also a short independent proof of Corollary 1 of geometric flavor. Section6is devoted to a version of Theorem1involving regular multidimensional Gaussian p.m.’s, formulated as Theorem2.

2. Interaction spaces and factorizability

The factorization (2) of positive p.m.’s w.r.t. a hypergraph can be equivalently described on the exponential scale by a linear space. The space decomposes orthogonally to interaction spaces. The material of this section is standard, see [10] and [20], Appendix B.2. The aim is to prepare arguments used in Sections3and4, and supply self-contained proofs.

ForINletVI denote the linear space of those functionsvofxXNthat depend onxonly throughπIx, thus v(x)=v(y)onceπIx=πIy,x, yXN. ForvVI

πIv(πIx)= |XN\I|v(x), xXN, (3)

where the marginalization of functions onXN is defined analogously to that of p.m.’s, thusπIv(πIx)is the sum of v(y)overyXN satisfyingπIx=πIy.

(3)

Lemma 1. A functionuonXNis orthogonal toVJ,JN,if and only ifπJu=0.

Proof. A functionvbelongs toVJ if and only if it can be written as the compositionv=vJπJ wherevJ:XJ→R. Since

u, v =

xXN

u(x)·v(x)=

xJXJ

πJu(xJ)·vJ(xJ)= πJu, vJ

andvJ is arbitrary the assertion follows.

ForIN letUI denote the orthogonal complement inVI to the sum ofVJ overJI, and be referred to as an interaction space.

Lemma 2. IfJN does not containINanduUI,thenπJu=0.

Proof. Since UIVI the marginalπJu of uUI to J depends onxJXJ only throughπIJJxJ whereπIJJ denotes the coordinate projection ofXJ toXIJ. It follows from (3), applied toπJuandIJ in the roles ofv andI, thatπIJJπJu(πIJJxJ)equals|XJ\I|πJu(xJ)for xJXJ. Therefore, it suffices to prove that the marginal πIJJπJu=πIJuofuUIvanishes. By Lemma1, this is equivalent to the orthogonality ofuandVIJ which holds

by the definition ofUIandIJ=I.

Lemma 3. The spacesUI,IN,are pairwise orthogonal.

Proof. IfI, JN are different then, up to symmetry,J does not containI. By Lemma2, ifuUI thenπJu=0.

Lemma1implies thatuis orthogonal toVJ. Thus,UIandUJ are orthogonal.

For a hypergraph(N,A), letWAdenote the sum ofVI overIA.

Lemma 4. For a hypergraph(N,A)the spaceWAequals the orthogonal sum ofUIover allIN that are covered by some hyperedge fromA.

Proof. By Lemma3, the assertion follows from its special instance disregarding orthogonality. On account of the definition ofWA, it suffices to restrict to the hypergraphs withA= {I},IN, in which caseWA=VI. Induction on the cardinality ofI is employed to prove thatVIis the sum ofUJ overJI. IfI=∅thenVandUcoincide with the space of constant functions onXNand the assertion is trivial. Otherwise,VIdecomposes orthogonally intoUI

and the sum ofVJ overJI. By the induction assumption, this sum equals the sum ofUJ overJI whence the

induction step is completed.

Corollary 2. The Euclidean spaceVN=RXN decomposes orthogonally toUI,IN.

Lemma 5. For every hypergraph(N,A)the familyFA+ consists of the p.m.’s that are proportional toew wherew belongs to the orthogonal sum ofUIover allINthat are covered by some hyperedge fromA.

Proof. Ifwbelongs to the above sum ofUI then it is in the sum ofVIoverIA. Writingwas the sum of functions vIπIwherevI:XI→R, a p.m. proportional to ewis given byxtexp[I∈AvIIx)],xXN, with some constant t >0. Such a p.m. belongs toFA+, choosingψIin (2) proportional to evI.

If PFA+ then, by (2), positive functionsψI exist such thatP (x)=ew(x),xXN, wherew is the sum over IAof the functionsx→lnψIIx). Thus,P is proportional to ew, with the proportionality constant equal to 1.

By definition,wbelongs toWA, equal to the above sum ofUIby Lemma4.

(4)

Corresponding to the interaction spacesUI, a special baseαy,yXN, of the spaceRXN is constructed. To this end, it is assumed that eachXihas a distinguished element denoted by0i and0=(0i)iN. The functionαyis defined atxXNby

αy(x)=

(−1)|s(y)s(x)|, xy,

0, otherwise,

wheres(y)denotes the support{iN: yi =0i}ofy andxy abbreviates the equality of projections ofx andy ontos(x)s(y).

Lemma 6. IfIN then{αy: s(y)I}is a base ofVI and{αy: s(y)=I}a base ofUI.

Proof. Fory, zXNthe scalar product ofαyandαz equals the sum of(−1)|[s(y)s(z)]∩s(x)|overxXNsuch that xy andxz. Foris(y)\s(z)the range of summation is partitioned into the pairs that differ only in theith coordinate, belonging to{0i, yi}. By the summations over the pairs, the scalar product vanishes. Therefore,αyandαz are orthogonal whens(y)=s(z). Ifyandzhave the same support thenαy(z)=(−1)|s(y)|δNy,z, and thus the functions αywiths(y)=I are independent. It follows that{αy: s(y)I}is an independent set. This set is a base ofVIbecause αyVs(y)VIand dimVI= |XI|. Then,{αy: s(y)=I}is a base ofUIby Lemma4.

In particular, the spaceUI has a positive dimension if and only if|Xi|>1 for alliI. IfXi= {0i,1i},iN, then eachUI is spanned by the single functionαy wherey=(yi)iNhasyi equal to1i foriI and0i otherwise. These functions form an orthogonal base ofRXN by Corollary2.

The following technical lemma is prepared for the discussion in Section5.2.

Lemma 7. For any hypergraph(N,A)the familyWAis the direct sum of the spaces V0,I =

vVI: v(x)=0oncexi=0ifor someiI overINthat can be covered by someJA.

The spaceV0,I consists of the functionsvIπI wherevI:XI→Rvanishes on eachxIXI having a coordinate xi=0i. Such a functionvI is said to beadaptedto0.

Proof of Lemma7. ForINletρImapxXNtoyXNgiven byπIy=πIxandπN\Iy=πN\I0. Letw∈RXN be decomposed as the sum ofvIπI overINwhere everyvIis adapted to0. IfxXNandI=s(x)then

LI

(−1)|I\L|w(ρLx)=

LI

(−1)|I\L|

KL

vKKρLx)

=

KI

vKKx)

KLI

(−1)|I\L|=vIIx). (4)

Therefore, the functionsvI are unique and, in turn, the sum of allV0,I,IN, is direct.

Ifvis a function onXNthen

IN

(−1)|N\I|IV0,N.

In fact, ifxXN has a coordinatexi=0ithenρIx=ρI\ixforiIN, and grouping the summands into the pairs I,I\ithe sum vanishes. Sincev=NandIVI it follows thatVN=RXNequalsV0,N plus the sum ofVIover IN. By induction on the cardinality ofN,VNis the sum ofV0,I overIN, which implies the assertion.

(5)

3. Conditional independence and interactions

In this section, theCI-constraints are related to interaction spaces. This is done via the relation(ij|K)Lbetween (ij|K)RandLNdefined by the statement ‘i /∈Lorj /LorLij K.’ In other words,(ij|K) /Lif and only ifijLij K, see Section5.3. From now on,mI denotes|XI|1,IN.

Lemma 8. If (ij|K)Land uUL then the measure given byP (x)=mN[1+u(x)],xXN,satisfies the CI- constraint given by(ij|K).

Proof. By Lemma2, ifij Kdoes not containLthen the marginal ofP toij Kis uniform. Thus, theCI-constraint (1) reduces to mij KmK=miKmj K. Otherwise,Lij K, and it follows from(ij|K)Lthat i /Lor j /L. By the symmetry between iandj, it suffices to restrict to the casei /L. Then,Lj K which implies thatubelongs to Vj KVij K. By the double use of (3) withij Kandj Kin the role ofI, the constraint (1) rewrites to

mij K 1+u(x)

·πKP (πKx)=πiKP (πiKx)·mj K 1+u(x)

, xXN. (5)

IfjLthenKandiK do not containL. By Lemma2, the marginals ofP toK andiK are uniform whence (5) holds. If j /L then LK, and thus uVKViK. By the double use of (3) withK andiK in the role of I, πKP (πKx)=mK[1+u(x)]andπiKP (πiKx)=miK[1+u(x)]. Hence, (5) is always satisfied.

ForLRletLdenote the family ofLN such that(ij|K)Lfor all(ij|K)L.

Corollary 3. IfLRandLLthen any p.m.onXNthat differs from the uniform one by a vector fromUL,as in Lemma8,belongs toPL.

An example of the constructionLL arises from a graphG=(N, E)andKG= {(ij|N\ij ): ijN

2

\E}, arriving at the familyKGof cliques ofG.

For a later reference the following simple assertion is needed.

Lemma 9. For anyLR,ifMN and every differenti, jMcan be covered by someLijL contained inM thenML.

Proof. If a couple(ij|K)Lhasi, jMthen, by the assumption,ijLijMfor someLijL. The definition ofLimplies that(ij|K)Lij, and thusLijij KbecauseijLij. Hence,Mij K. It follows that(ij|K)M.

In turn,ML.

In the remaining part of this section, theCI-constraints (1) are analyzed via the mappingsΨij|Kgiven by Ψij|K(w)=

πij Kw(πij Kx)·πKw(πKx)πiKw(πiKx)·πj Kw(πj Kx)

xXN.

Here,(ij|K)Randwis a function onXN. A p.m. can play the role ofwas well.

Lemma 10. The Jacobian ofΨij|Kat the uniform p.m.onXNis equal to mKδx,yij K+mij KδKx,ymj Kδx,yiKmiKδj Kx,y

x,yXN.

Proof. The coordinate function ofΨij|K indexed byxXN equals

zXN

zXN

w(z)w(z) δx,zij KδKx,zδiKx,zδx,zj K .

(6)

Differentiating w.r.t.w(y),yXN,

zXN

zXN

δNy,zw(z)+w(z)δz,yN δij Kx,zδx,zKδx,ziKδj Kx,z .

Whenw(z)=mN,zXN, this equals mN

zXN

δx,yij Kδx,zKδx,yiKδj Kx,z +mN

zXN

δx,zij Kδx,yKδx,ziKδj Kx,y

and the assertion follows.

Let Ker(ij|K)denote the kernel of the Jacobian ofΨij|Kat the uniform p.m.

Lemma 11. ForLRthe intersection ofKer(ij|K)over(ij|K)Lis equal to the sum ofULoverLL. Proof. IfL /LthenijLij Kfor some(ij|K)L. ForuULandv∈Ker(ij|K)

xXN

u(x)

yXN

mKδx,yij K+mij KδKx,ymj Kδx,yiKmiKδj Kx,y

v(y)=0

because the inner sums equal zero due to Lemma10. Since none of the setsK,iK andj KcontainsLthe marginals ofuto these sets vanish by Lemma2and the above equation rewrites to

xXN

yXN

u(x)v(y)δx,yij K=0.

SinceLij Kthe functionubelongs toVij K, and thusu(x)δij Kx,y =u(y)δx,yij K. Therefore,uandvare orthogonal. In turn, Ker(ij|K)is orthogonal toUL.

By Corollary2, the intersection of Ker(ij|K) over(ij|K)L is contained in the sum ofUL over LL. The

opposite inclusion is a consequence of Corollary3.

4. Log-convexity and conditional independence

A nonempty family of positive p.m.’s onXN islog-linearif it contains together with p.m.’sP andQalso the p.m.

proportional toxPt(x)Qs(x),xXN, for all realt ands. The family islog-affineif this is required only with s=1−t. The log-convexity assumes the additional restriction 0< t <1. The log-affine families correspond to the full exponential families [20].

In this section,LR.

Lemma 12. IfPL+is log-convex then it is log-linear.

Proof. LetP , QPL+, 0< t <1 andRt(x)=Pt(x)Q1t(x),xX. By the log-convexity, for(ij|K)LEqs (1) hold withP replaced byRt. Thus, the coordinate functions oftΨij|K(Rt)vanish whent ranges between 0 and 1.

Since they are holomorphic they vanish identically. It follows thatPL+is closed to the log-affine combinations. The assertion follows because it is not difficult to see that a log-affine family that contains the uniform p.m. onXN is

log-linear.

Lemma 13. IfPL+ is log-convex and a p.m.P onXN is proportional toew for somew∈RXN thenPPL+if and only ifwbelongs to the sum ofULoverLL.

(7)

Proof. The log-convex combinations ofP with the uniform p.m.

Pt(x)=et w(x)

yXN

et w(y), xXN,

are viewed as a curve parameterized byt. Its tangent vector at the uniform p.m.P0equals mNwm2N

yXN

w(y).

If PPL+ then the log-convexity of PL+ implies that the curve ranges in PL+. Hence, the tangent belongs to Ker(ij|K)for(ij|K)L. By Lemma11, the tangent is in the sum ofULoverLL. Since∅∈LandUconsists of the constant functions the sum containsw.

Letwbelong to the sum ofULoverLL. By Lemma12,PL+is log-linear. Therefore, to prove thatPPL+ it suffices to confine to the case wULfor someLL. The assertion is trivial ifL=∅, havingwconstant and P uniform. Otherwise,L=∅andw is orthogonal toU by Lemma3. Then,xmN[1+εw(x)]defines a p.m.

forεsufficiently close to 0. By Corollary3, this p.m. belongs toPL. It is proportional to eln[1+εw]. The log-affinity ofPL+implies that the p.m.Pεproportional to ewε withwε=1εln[1+εw]belongs toPL+. Limiting withεto 0 the functionswεconverge towandPεconverges toPP+. Hence,PPL+.

TheCI-constraint (1) given by(ij|K)can be equivalently expressed as

Q(xixjxK)·Q(yiyjxK)=Q(xiyjxK)·Q(yixjxK), xi, yiXi, xj, yjXj, xKXK, (6) whereQdenotes the marginal ofP toij K,xixjxKis the element ofXij Kthat projects toxi,xj andxK, andyiyjxK, xiyjxKandyixjxK have analogous meaning. IfK=N\ij thenQ=P and it is easy to see that if two p.m.’s satisfy the equations in (6) then the equations also hold for the log-linear combinations of the p.m.’s.

Under the additional natural assumption|Xi|>1 for alliN, the following lemma and the proof of Theorem1 can be slightly simplified but when not excluding|Xi| =1 only additional minor technicalities are needed.

Lemma 14. IfPL+is log-convex,LLand|X|>1for eachLthen all subsets ofLbelong toL.

Proof. It suffices to prove thatL\Lfor allL. This is accomplished when the violation of(ij|K)(L\)for some(ij|K)implies that the couple is not inL. Thus, assumeijL\ij K. By the assumption on cardinalities, there exists yXN with s(y)=L. Let zXN haves(z)=and the same th coordinate asy,y=0. Since L, LLemma6impliesαyUL andαzU. By the log-convexity and Lemma13, the p.m.P proportional to eαy+αzis inPL+.

There existst >0 such thatt·πN\P (xN\)equals eαy(0xN\)+αz(0xN\)+eαy(yxN\)+αz(yxN\)+

xX\{0,y}

eαy(xxN\)+αz(xxN\)

=eαy(0xN\)+1+eαy(yxN\)1+ |X| −2, xN\XN\,

whereαyandαzvanish atxxN\. CombiningijL\ij KandLL, the setij Kcannot contain. Therefore, πij KP =Qis a marginal ofπN\P. SinceπN\P depends onxN\ only throughπij KN\xN\it follows from (3) that Q(πij KN\xN\)is equal to|XN\ij K|πN\P (xN\). Therefore,

Q(0i0j0K)Q(yiyj0K)Q(0iyj0K)Q(yi0j0K)

is a positive multiple of[e2+e2+ |X| −2]2− |X|2. Using (6), this implies(ij|K) /L.

Proof of Theorem1. LetRN,Xconsist of the couples(ij|N\ij )with|Xi| =1 or|Xj| =1. By (1),PL=PL∪RN,X

for allLR. LetGL=(N, E)be the graph withij /Eif and only if (ij|K)Lfor someKN\ij. By (2),

(8)

a p.m. factorizes w.r.t.GLif and only if it does w.r.t.GL∪RN,X. It follows that it suffices to prove the assertion of Theorem1under the additional assumptionRN,XL. Hence, anyLL with at least two elements has|X|>1 for allL. By Lemma14, all subsets of anyLL belong toL; thusL is hereditary. Using Lemma9, a set LN with at least two elements belongs toL if and only ifijL for everyi, jL; thusL is conformal. It follows thatLis the family of cliques ofGL. Hence, the assertion of Theorem1obtains from Lemmas5and13.

5. Discussion and remarks

5.1. The main result and its corollary

For an undirected graphG=(N, E)letKGconsist of the couples(ij|N\ij )havingij /E. The p.m.’s fromPKG are calledpairwise Markov w.r.t. G. The families PK+G parameterized by undirected graphs Ghave been known in the statistical literature as theMarkov models over undirected graphs[9,20], over the contingency tables. They are log-convex because each familyP{+(ij|N\ij )}is log-linear by the argument following (6) and intersections of log- linear families are log-linear. Hence, Theorem1directly implies the assertion of Corollary1,PK+G=FG+. Rephrased verbally, given an undirected graph, a positive p.m. is pairwise Markov if and only if it factorizes. The assumption of positivity matters, see [26], p. 22, and [20], Example 3.10.

Alternatively,PK+G=FG+is a simple consequence of the lemmas on the interaction spaces from Section2. In fact, it is rather straightforward thatPP+satisfies theCI-constraint(ij|N\ij )if and only if lnP:x→lnP (x)belongs to the sum ofVN\i andVN\j, see, e.g., [20], (3.6). This is the sum ofUI whereI does not containij, on account of Lemma4. Therefore, by Corollary2,PP+is pairwise Markov w.r.t.Gif and only if lnP belongs to the sum ofUI whereI contains noij /E, thusI is a clique ofG. To concludePK+G=FG+it suffices to evoke Lemma5.

It seems that this short geometric proof of Corollary1via the intersections of sums of the interaction spaces is new.

The presented argumentation is coordinate-free; no projectors, Moebius transform, algebras, and a special choice of 0XN are employed. The only comparable proof of Corollary1 is the inductive one by Brook [5], see also [16], Theorem 7.1, which however has no geometric content.

ForLR, Theorem1and log-convexity of the Markov models imply thatPL+is log-convex if and only if it is Markov over an undirected graph. Thus, the class of these Markov models can be equivalently defined by means of theCI-constraints and log-convexity, without any reference to graphs. Here, both the conditional independence [11]

and log-convexity [6], including the geometrical viewpoint of [1], have been recognized for decades as basic building stones in statistics. Theorem1seems to be a new bridge between them.

5.2. Gibbs probability measures

Given a distinguished element0 =(0i)iN ofXN and a hereditary hypergraph(N,A), a p.m.PP+isGibbsianif there exist real-valued functionsvIonXI such that

lnP (x)=

I∈A,Is(x)

vIIx), xX. (7)

The family of such probability measures is denoted byG0+,A. IfxIXI and for someiI theith coordinate ofxI

equals0ithen the valuevI(xI)does not occur in (7). Therefore, it can be equivalently assumed in the above definition that all such values equal zero, thusvIare adapted to0. In this situation, the summation in (7) can run equivalently over IA. Thus, the Gibbs p.m.’s factorize in a special way,G0+,AFA+. By Lemmas5and7, ifPFA+thenP =ew forwWAin the form

I∈AvIπI with allvI adapted to0. Therefore,FA+G0,+A. Thus, over a hypergraph, the notion of Gibbs p.m.’s does not depend on the choice of0 and coincides with the factorizability of positive p.m.’s. In literature, e.g., in [7], Theorem 2, [26], Theorem 2, typically (7) is stated to be equivalent to

vKKx)=

IK

(−1)|K\I|lnP (ρIx), KA, xXN,s(x)=K,

which is a reinterpretation of the computation in (4), based on Moebius transform.

(9)

The identityPK+G =FG+, disguised in Gibbs p.m.’s and potentials, has been revealed and proved independently several times. Over point lattices it goes back to [2,30]. Two early unpublished manuscripts [14,17]2have been fre- quently cited. Other proofs are in [3], p. 198, [7], p. 22, [15,27,28], and three proofs in [26], Theorem 2. For personal remarks see also Hammersley’s discussion in [3], p. 230.

Under topological assumptions on the state spaces, Hammersley–Clifford theorem from [12,20] asserts that over a graph the pairwise Markovness is equivalent to the factorization, for the p.m.’s with continuous and positive densities w.r.t. a product measure.

5.3. Miscellany

WhenP andQare p.m.’s on a measurable space that are not mutually singular,p andq are their densities with respect to a dominating measureR and 0< t <1, the log-convex combination ofP andQis defined as the p.m.

with theR-density proportional toptq1t. The definition is not dependent on the choice ofR. A family of p.m.’s is called log-convex if it is closed to the log-convex combinations of pairs of not mutually singular p.m.’s. Log-convex families of mutually absolutely continuous p.m.’s are called ‘geodesically convex’ in [6]. Examples of log-convex sets comprise the exponential families with convex sets of canonical parameters and their extensions [8].

A crucial role in the proof of Theorem1is played by the binary relationbetweenRand the power set ofN. It appeared previously prior to [23], Theorem 4. In addition to the mappingLL, it gives rise to

AA=

(ij|K)R: (ij|K)Lfor allLA ,

whereAis a family of subsets ofN. The pair of mappings forms a Galois connection [4], Chapter V, Sections 7 and 8.

By the remark preceding [23], Theorem 4, the connection gives rise to an antiisomorphism betweenDCI-relations and Cw-families, similarly to [23], Theorem 1.

The implication of Lemma9is valid for the family of connected sets in any topological space in the role ofL; it goes back at least to [18].

General conditional independence statements can be reduced to families of theCI-constraints (1) by [22], Lemma 3.

The familiesPLandPL+have been, in spite of considerable effort, far from being understood in general [12,24,31].

6. Gaussian probability measures

In this sectionN= {1,2, . . . , n}. A p.m. onRnis regular Gaussian (rG) if its density with respect to the Lebesgue measure has the form

x(2π)n/2(detA)1/2exp −12(xμ)TA1(xμ)

, x∈Rn,

whereμis a (column) vector fromRnandA=(ai,j)i,jNa real positive definite matrix. The standard notation detA is used for the determinant ofAandT for the transposition.

AnrGp.m. satisfies theCI-constraint given by(ij|K)Rif and only if detAiK,j K vanishes whereAiK,j K is the submatrix ofAwhose rows are indexed byiK and columns byj K, see [21]. Hence, the p.m. is pairwise Markov w.r.t. an undirected graphG=(N, E)if and only if detAN\j,N\i=0 wheneverij /E; this equality is equivalent to vanishing of the element ofA1indexed byi, j. It follows that the p.m. factorizes w.r.t. the graph [20], Section 5.1.4, which implies Markovness [20], Proposition 3.8. Thus, as well-known, the pairwise Markovness of anyrGp.m. is equivalent to a factorization, over an undirected graph. For the generalCI-constraints on Gaussian variables see [12, 13,21,25,29].

Log-convex combinations of tworGp.m.’s parameterized byμ, Aandν, B are therGp.m.’s parameterized by someλμ,ν,t∈Rnand[t A1+(1t )B1]1, 0< t <1.

Theorem 2. If the set of rG p.m.’s on Rn that satisfy all CI-constraints from LR is log-convex then this set coincides with the set of all rG p.m.’s that factorize w.r.t.the graph(N, E)that hasijEif and only if(ij|K) /L for allKN\ij.

2The former work is not available to the author.

Références

Documents relatifs

[r]

In fact, the main contribution of our work is that we motivated an analysis of XML (Bleachery), which we used to demonstrate that the seminal pseudorandom algorithm for the

This paper develops a new contrast process for parametric inference of general hidden Markov models, when the hidden chain has a non-compact state space.. This contrast is based on

Using the general form of the (H.W.I) inequality it is quite easy to adapt the previous proof in order to show the following result :.. Let µ satisfying (H.C.K) for some K

We present the results in the elementary framework of symmetric Markov chains on a finite space, and then indicate how they can be extended to more general Markov processes such as

We investigate the relations between the Poissonnian loop ensem- bles, their occupation fields, non ramified Galois coverings of a graph, the associated gauge fields, and

L’algorithme de Viterbi permet de trouver, étant donné une séquence d’observation, le chemin le plus probable dans un HMM (la succession la plus probable d’états)

Every based loop in G defines an element of Γ x 0 and an element of the monodromy group M x 0 at the base point whose conjugacy class is in- dependent of the geodesic linking X 0 to