
HAL Id: hal-00795132

https://hal.archives-ouvertes.fr/hal-00795132

Submitted on 27 Feb 2013


Exact marginals and normalizing constant for Gibbs distributions

Cécile Hardouin, Xavier Guyon

To cite this version:
Cécile Hardouin, Xavier Guyon. Exact marginals and normalizing constant for Gibbs distributions. C. R. Acad. Sci. Paris, Ser. I, 2010, 348, pp. 199-201. ⟨hal-00795132⟩


Statistics

Exact marginals and normalizing constant for Gibbs distributions

C. Hardouin^{a,1}, X. Guyon^{a}

^{a} CES/SAMOS-MATISSE, Université de Paris 1

Abstract

We present a recursive algorithm for the calculation of the marginals of a Gibbs distribution $\pi$. A direct consequence is the calculation of the normalizing constant of $\pi$.

Résumé

Recursions and normalizing constant for Gibbs models. We propose in this work a recursion on the marginal distributions of a Gibbs distribution $\pi$. A direct consequence is the exact computation of the normalizing constant of $\pi$.

1. Introduction

Usually, obtaining the marginals and/or the normalizing constant $C$ of a discrete probability distribution $\pi$ involves high-dimensional summation: for example, for the binary Ising model on a simple $10 \times 10$ grid, the calculation of $C$ involves $2^{100}$ terms. One way to avoid this problem is to replace the distribution of interest with an alternative; for example, in spatial statistics, the likelihood is replaced by the conditional pseudo-likelihood ([1]). Another solution consists of estimating the normalizing constant; see for example Pettitt et al. ([8]) and Møller et al. ([7]) for efficient Monte Carlo methods, Bartolucci and Besag ([2]) for a recursive algorithm computing the exact likelihood of a Markov random field, and Reeves and Pettitt ([9]) for an efficient computation of the normalizing constant of a factorisable model.

We present specific results for a Gibbs distribution $\pi$. We build on the results of Khaled ([5,6]), who gives an original linear recursion on the marginals of $\pi$, the law of $Z = (Z_1, Z_2, \cdots, Z_T) \in E^T$; this result eases the calculation of $\pi$'s normalizing constant. We generalize Khaled's results by noticing that if $\pi$ is a Gibbs distribution on $\mathcal{T} = \{1, 2, \cdots, T\}$, then $\pi$ is a Markov field on $\mathcal{T}$, so it is easy to manipulate its conditional distributions, which are the basic tools of our forward recursions.

Email addresses: Cecile.Hardouin@univ-paris1.fr (C. Hardouin), guyon@univ-paris1.fr (X. Guyon).
^{1} Corresponding address: C. Hardouin, SAMOS-MATISSE, Université de Paris 1, 90 rue de Tolbiac, 75634 Paris

2. Markov representations of a Gibbs field

Let $T > 0$ be a fixed positive integer, $E = \{e_1, e_2, \cdots, e_N\}$ a finite state space, and $Z = (Z_1, Z_2, \cdots, Z_T) \in E^T$ a temporal sequence with distribution $\pi$. Let us denote $z(t) = (z_1, z_2, \cdots, z_t)$. We assume that $\pi$ is a Gibbs distribution with energy and potentials:
$$\pi(z(T)) = C \exp U_T(z(T)) \quad \text{with} \quad C^{-1} = \sum_{z(T) \in E^T} \exp U_T(z(T)), \tag{1}$$
where
$$U_t(z(t)) = \sum_{s=1}^{t} \theta_s(z_s) + \sum_{s=2}^{t} \Psi_s(z_{s-1}, z_s) \ \text{ for } 2 \le t \le T, \quad \text{and} \quad U_1(z_1) = \theta_1(z_1).$$
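As a concrete reading of (1), here is a minimal sketch (assuming time-invariant potentials $\theta$ and $\Psi$ passed as plain functions; `energy` and `brute_force_C` are our illustrative names) that evaluates $U_T$ and the constant $C$ by direct enumeration. The $N^T$-term sum makes it usable only for very small $T$.

```python
import itertools
import math

def energy(z, theta, psi):
    """U_T(z) = sum_s theta(z_s) + sum_s Psi(z_{s-1}, z_s), cf. (1)."""
    u = sum(theta(zs) for zs in z)
    u += sum(psi(z[s - 1], z[s]) for s in range(1, len(z)))
    return u

def brute_force_C(E, T, theta, psi):
    """C^{-1} = sum_{z in E^T} exp U_T(z): N^T terms, exponential in T."""
    inv_c = sum(math.exp(energy(z, theta, psi))
                for z in itertools.product(E, repeat=T))
    return 1.0 / inv_c

# Binary chain with theta(z) = alpha*z and Psi(u, v) = beta*u*v
alpha, beta = 0.5, -0.3
C = brute_force_C([0, 1], T=10,
                  theta=lambda z: alpha * z,
                  psi=lambda u, v: beta * u * v)
```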

So, $\pi$ is a bilateral two-nearest-neighbour Markov field ([4,3]):
$$\pi(z_t \mid z_s,\ 1 \le s \le T,\ s \neq t) = \pi(z_t \mid z_{t-1}, z_{t+1}), \tag{2}$$

but $Z$ is also a Markov chain:
$$\pi(z_t \mid z_s,\ s \le t-1) = \pi(z_t \mid z_{t-1}) \quad \text{if } 1 < t \le T. \tag{3}$$

An important difference appears between formulas (3) and (2): indeed, (2) is computationally feasible, since the conditional distribution involves only the local potentials at $t$ and a normalization over the $N$ states of $E$, whereas (3) is not, since its normalization requires summing over whole future trajectories.
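For interior indices $1 < t < T$, (2) reads $\pi(z_t \mid z_{t-1}, z_{t+1}) \propto \exp\{\theta_t(z_t) + \Psi_t(z_{t-1}, z_t) + \Psi_{t+1}(z_t, z_{t+1})\}$; the edge cases $t = 1$ and $t = T$ simply drop the corresponding term. A minimal sketch of this $O(N)$ computation (the function name is ours; potentials are passed as callables):

```python
import math

def conditional_at_t(z_prev, z_next, theta_t, psi_t, psi_next, E):
    """pi(z_t | z_{t-1}, z_{t+1}) via (2): only the potentials touching
    z_t enter, so normalization is a sum over the N states of E."""
    w = [math.exp(theta_t(z) + psi_t(z_prev, z) + psi_next(z, z_next))
         for z in E]
    total = sum(w)
    return [wi / total for wi in w]

# Binary model: pi(z_t | z_{t-1}=1, z_{t+1}=0), theta(z)=0.5*z, Psi(u,v)=-0.3*u*v
p = conditional_at_t(1, 0, lambda z: 0.5 * z,
                     lambda u, v: -0.3 * u * v,
                     lambda u, v: -0.3 * u * v, E=[0, 1])
```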

3. Recursion over marginal distributions

3.1. Future-conditional contribution $\Gamma_t(z(t))$

For $t \le T - 1$, the distribution $\pi(z_1, z_2, \cdots, z_t \mid z_{t+1}, z_{t+2}, \cdots, z_T)$, conditional on the future, depends only on $z_{t+1}$:
$$\pi(z_1, z_2, \cdots, z_t \mid z_{t+1}, z_{t+2}, \cdots, z_T) = \frac{\pi(z_1, z_2, \cdots, z_T)}{\sum_{u_1^t \in E^t} \pi(u_1^t, z_{t+1}, \ldots, z_T)} = \pi(z_1, z_2, \cdots, z_t \mid z_{t+1}).$$
We can also write $\pi(z_1, z_2, \cdots, z_t \mid z_{t+1}) = C_t(z_{t+1}) \exp U_t^*(z_1, z_2, \cdots, z_t; z_{t+1})$, where $U_t^*$ is the future-conditional energy:
$$U_t^*(z_1, z_2, \cdots, z_t; z_{t+1}) = U_t(z_1, z_2, \cdots, z_t) + \Psi_{t+1}(z_t, z_{t+1}), \tag{4}$$
and $C_t(z_{t+1})^{-1} = \sum_{u_1^t \in E^t} \exp\{U_t^*(u_1, \ldots, u_t; z_{t+1})\}$. Then, for $i = 1, \ldots, N$:
$$\pi(z_1, z_2, \cdots, z_t \mid z_{t+1} = e_i) = C_t(e_i)\, \gamma_t(z_1, z_2, \cdots, z_t; e_i), \quad \text{where } \gamma_t(z(t); e_i) = \exp U_t^*(z(t); e_i).$$

With the convention $\Psi_{T+1} \equiv 0$, we define for $t \le T$ the vector $\Gamma_t(z(t)) \in \mathbb{R}^N$ of the future-conditional contributions by
$$(\Gamma_t(z(t)))_i = \gamma_t(z(t); e_i), \quad 1 \le i \le N,$$
and the recursion matrix $A_t$ by
$$A_t(i, j) = \exp\{\theta_t(e_j) + \Psi_{t+1}(e_j, e_i)\}, \quad i, j = 1, \ldots, N. \tag{5}$$

Then we get the following fundamental recurrence.


Proposition 3.1 For all $2 \le t \le T$, $z(t) = (z_1, z_2, \cdots, z_t) \in E^t$ and $e_i, e_j \in E$, we have:
$$\gamma_t(z(t-1), e_j; e_i) = A_t(i, j) \times \gamma_{t-1}(z(t-1); e_j), \tag{6}$$
and
$$\sum_{z_t \in E} \Gamma_t(z(t-1), z_t) = A_t\, \Gamma_{t-1}(z(t-1)). \tag{7}$$

3.2. Forward recursions on marginals and the normalizing constant

Let us define the following $1 \times N$ row vectors: $E_1 = B_T = (1, 0, \cdots, 0)$, and $(B_t)_{t=T, \ldots, 2}$ defined by the recursion $B_{t-1} = B_t A_t$ for $t \le T$; we also denote $K_1 = \sum_{z_1 \in E} \Gamma_1(z_1) \in \mathbb{R}^N$. We give below the main result of this work.

Proposition 3.2 Marginal distributions $\pi_t$ and calculation of the normalizing constant $C$.
(1) For $1 \le t \le T$:
$$\pi_t(z(t)) = C \times B_t\, \Gamma_t(z(t)). \tag{8}$$
(2) The normalizing constant $C$ of the joint distribution $\pi$ satisfies:
$$C^{-1} = E_1 A_T A_{T-1} \cdots A_2 K_1. \tag{9}$$
For time-invariant potentials, formula (9) reduces to $C^{-1} = E_1 A_T A^{T-2} K_1$.

As a basic example, let us consider $E = \{0, 1\}$, $\theta_t(z_t) = \alpha z_t$, and $\Psi_{t+1}(z_t, z_{t+1}) = \beta z_t z_{t+1}$; the analytic expressions of $A$ and $K_1$ are trivially derived. We computed $C^{-1} = E_1 A_T A^{T-2} K_1$ for increasing values of $T$; the computing time is always negligible for $T \le 700$, whereas computing $C^{-1}$ by direct summation needs 750 seconds for $T = 20$ and 6 hours for $T = 25$, the method becoming infeasible for $T > 25$.

4. Extensions to general Gibbs fields

There are various generalisations of the preceeding results. 4.1. Temporal Gibbs model

Let us give the following example as an illustration of possible extensions. Coming back to the previous model (1), we add the interaction potentials $\Psi_{2,s}(z_{s-2}, z_s)$. Then $\pi$ is a four-nearest-neighbour Markov field but also a Markov chain of order 2. Conditionally on the future, we get
$$\pi(z_1, z_2, \cdots, z_t \mid z_{t+1}, z_{t+2}, \cdots, z_T) = \pi(z(t) \mid z_{t+1}, z_{t+2}) = C_t(z_{t+1}, z_{t+2}) \exp U_t^*(z(t); z_{t+1}, z_{t+2}),$$
with
$$U_t^*(z(t); z_{t+1}, z_{t+2}) = U_t(z(t)) + \Psi_{1,t+1}(z_t, z_{t+1}) + \Psi_{2,t+1}(z_{t-1}, z_{t+1}) + \Psi_{2,t+2}(z_t, z_{t+2}).$$
Then, for $a$, $b$ and $c \in E$,
$$U_t^*(z(t-1), a; (b, c)) = U_{t-1}^*(z(t-1); (a, b)) + \theta_t(a) + \Psi_{1,t+1}(a, b) + \Psi_{2,t+2}(a, c);$$
analogously to the previous example, we define the future-conditional contributions
$$\gamma_t(z(t); (z_{t+1}, z_{t+2})) = \exp U_t^*(z(t); (z_{t+1}, z_{t+2}))$$
and the $N^2 \times N^2$ matrices $A_t$.


Similarly to Section 3.1, we get the following recursion:
$$\gamma_t(z(t-1), e_k; (e_i, e_j)) = A_t((i, j), (k, i)) \times \gamma_{t-1}(z(t-1); (e_k, e_i)).$$
We thus obtain a recurrence (7) on the contributions $\Gamma_t(z(t))$ and results analogous to (8) and (9) for the bivariate Markov chain $(Z_{t-1}, Z_t)$, $t = 1, \ldots, T$.
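The lifting to the pair chain can be made concrete. Below is a sketch (our own illustrative construction; `lifted_matrix` and its arguments are hypothetical names) building the $N^2 \times N^2$ matrix above for time-invariant potentials $\theta$, $\Psi_1$, $\Psi_2$: rows are indexed by pairs $(i, j)$, columns by $(k, l)$, and only the columns with $l = i$ are active, as the recursion requires.

```python
import numpy as np

def lifted_matrix(theta, psi1, psi2, E):
    """N^2 x N^2 matrix A((i,j),(k,l)): nonzero only when l == i, with
    A((i,j),(k,i)) = exp(theta(e_k) + psi1(e_k, e_i) + psi2(e_k, e_j)),
    matching the increment in the order-2 recursion above."""
    N = len(E)
    A = np.zeros((N * N, N * N))
    for i in range(N):
        for j in range(N):
            for k in range(N):
                A[i * N + j, k * N + i] = np.exp(
                    theta(E[k]) + psi1(E[k], E[i]) + psi2(E[k], E[j]))
    return A
```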

4.2. Spatial Gibbs fields

For $t \in \mathcal{T} = \{1, 2, \ldots, T\}$, let us consider $Z_t = (Z_{(t,i)}, i \in I)$, where $I = \{1, 2, \cdots, m\}$ and $Z_{(t,i)} \in F$. Then $Z = (Z_s, s = (t, i) \in S)$ is a spatial field on $S = \mathcal{T} \times I$. We write again $z_t = (z_{(t,i)}, i \in I)$, $z(t) = (z_1, \ldots, z_t)$, $z = z(T)$, and we suppose that the distribution $\pi$ of $Z$ is a Gibbs distribution with translation-invariant potentials $\Phi_{A_k}(\cdot)$, $k = 1, \ldots, K$, associated with a family of subsets $\{A_k, k = 1, \ldots, K\}$ of $S$. For $A \subseteq S$, let us define $H(A) = \sup\{|u - v| : \exists (u, i) \text{ and } (v, j) \in A\}$ and $H = \sup\{H(A_k), k = 1, \ldots, K\}$. With this notation, we write the Gibbs energy
$$U(z) = \sum_{h=0}^{H} \sum_{t=h+1}^{T} \Psi(z_{t-h}, \cdots, z_t) \quad \text{with} \quad \Psi(z_{t-h}, \cdots, z_t) = \sum_{k: H(A_k) = h}\ \sum_{s \in S_t(k)} \Phi_{A_k + s}(z),$$
where $S_t(k) = \{s = (u, i) : A_k + s \subseteq S \text{ and } t - H(A_k) \le u \le t\}$. Then $(Z_t)$ is a Markov process of order $H$, and $Y_t = (Z_{t-H}, Z_{t-H+1}, \cdots, Z_t)$, $t > H$, is a Markov chain on $E^{H+1}$ for which we get the results (8) and (9).

We applied the result to the calculation of the normalizing constant for an Ising model. For $m = 10$ and $T = 100$, the computing time is less than 20 seconds.
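As an illustration of this spatial case, here is a minimal sketch (our own construction, assuming a hypothetical nearest-neighbour Ising parametrisation $U(z) = \alpha \sum_s z_s + \beta \sum_{\langle s, s' \rangle} z_s z_{s'}$ with free boundaries, so that $H = 1$ and each of the $2^m$ row configurations is one state of the chain recursion):

```python
import itertools
import numpy as np

def ising_log_inverse_constant(alpha, beta, m, T):
    """log C^{-1} for an m x T nearest-neighbour Ising lattice, applying the
    chain recursion (9) with one state per row, E = {0,1}^m (N = 2^m).

    theta(x) = alpha*sum_i x_i + beta*sum_i x_i x_{i+1}  (within-row terms)
    Psi(x, y) = beta*sum_i x_i y_i                       (between rows)
    Each product is rescaled to avoid floating-point overflow."""
    rows = np.array(list(itertools.product([0.0, 1.0], repeat=m)))  # (2^m, m)
    theta = alpha * rows.sum(axis=1) \
          + beta * (rows[:, :-1] * rows[:, 1:]).sum(axis=1)
    psi = beta * rows @ rows.T               # psi[i, j] = Psi(row_j, row_i)
    A = np.exp(theta[None, :] + psi)         # matrix (5) with row states
    v = A.sum(axis=1)                        # K_1 = sum over first rows
    log_c_inv = 0.0
    for _ in range(T - 2):                   # apply A_{T-1} ... A_2
        v = A @ v
        s = v.max()
        v /= s                               # rescale, accumulate the log
        log_c_inv += np.log(s)
    # Last step: A_T has identical rows exp(theta) since Psi_{T+1} == 0,
    # so E_1 simply picks one component of A_T v.
    return log_c_inv + np.log(np.exp(theta) @ v)
```

The dominant cost is the $T - 2$ multiplications by a $2^m \times 2^m$ matrix; for $m = 10$ these are $1024 \times 1024$ products, consistent with the timing reported above.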

References

[1] J. Besag, 1974. Spatial interaction and the statistical analysis of lattice systems. J. Roy. Statist. Soc. B 36, 192-236.
[2] F. Bartolucci, J. Besag, 2002. A recursive algorithm for Markov random fields. Biometrika 89 (3), 724-730.
[3] X. Guyon, 1995. Random Fields on a Network: Modeling, Statistics, and Applications. Springer-Verlag, New York.
[4] R. Kindermann, J.L. Snell, 1980. Markov Random Fields and Their Applications. Contemporary Mathematics, AMS, Providence.
[5] M. Khaled, 2008. A multivariate generalization of a Markov switching model. Working paper, C.E.S., Université Paris 1. http://mkhaled.chez-alice.fr/khaled.html
[6] M. Khaled, 2008. Estimation bayésienne de modèles espace-état non linéaires. Ph.D. thesis, Université Paris 1. http://mkhaled.chez-alice.fr/khaled.html
[7] J. Moeller, A.N. Pettitt, R. Reeves, K.K. Berthelsen, 2006. An efficient Markov chain Monte Carlo method for distributions with intractable normalising constants. Biometrika 93 (2), 451-458.
[8] A.N. Pettitt, N. Friel, R. Reeves, 2003. Efficient calculation of the normalizing constant of the autologistic and related models on the cylinder and lattice. J. Roy. Statist. Soc. B 65 (1), 235-246.
[9] R. Reeves, A.N. Pettitt, 2004. Efficient recursions for general factorisable models. Biometrika 91 (3), 751-757.
