
Parallel and interacting Markov chains Monte Carlo method


HAL Id: inria-00103871

https://hal.inria.fr/inria-00103871v2

Submitted on 2 Nov 2006


Fabien Campillo, Vivien Rossi

To cite this version:

Fabien Campillo, Vivien Rossi. Parallel and interacting Markov chains Monte Carlo method. [Research Report] RR-6008, INRIA. 2006. ⟨inria-00103871v2⟩


Thème NUM (Numerical systems)

Project: Aspi

Research report no. 6008, October 2006, 27 pages

Abstract: In many situations it is important to be able to propose $N$ independent realizations of a given distribution law. We propose a strategy for making $N$ parallel Monte Carlo Markov Chains (MCMC) interact in order to get an approximation of an independent $N$-sample of a given target law. In this method each individual chain proposes candidates for all other chains. We prove that the set of interacting chains is itself an MCMC method for the product of $N$ target measures. Compared to independent parallel chains this method is more time consuming, but we show through concrete examples that it possesses many advantages: it can speed up convergence toward the target law as well as handle the multi-modal case.

Key-words: Markov chain Monte Carlo method, Metropolis-Hastings, interacting chains, particle approximation

Fabien Campillo: INRIA/IRISA, Rennes, Fabien.Campillo@inria.fr

Vivien Rossi: IURC, University of Montpellier I, Viven.Rossi@iurc.montp.inserm.fr

The research of the second author was done during a postdoctoral stay at the INRIA/IRISA, Rennes.


Contents

1 Introduction

2 Parallel/interacting MH algorithm
  2.1 The algorithm
  2.2 Description of the MH kernel
  2.3 Invariance property

3 Parallel/interacting MwG algorithm
  3.1 The algorithm
  3.2 Description of the MH kernel
  3.3 Invariance property

4 Numerical tests
  4.1 A multi-modal example
  4.2 A hidden Markov model

5 Conclusion

1 Introduction

Markov chain Monte Carlo (MCMC) algorithms [19, 12, 18] allow us to draw samples from a probability distribution $\pi(x)\,dx$ known up to a multiplicative constant. They consist in sequentially simulating a single Markov chain whose limit distribution is $\pi(x)\,dx$. There exist many techniques to speed up the convergence toward the target distribution by improving the mixing properties of the chain [13]. Moreover, special attention should be given to the convergence diagnosis of this method [1, 6, 15].

An alternative is to run many Markov chains in parallel. The simplest multiple-chain algorithm is to make use of parallel independent chains [9]. The recommendations concerning this idea seem contradictory in the literature (cf. the "many short runs vs. one long run" debate described in [10]). We can note with [11] and [18, §6.5] that independent parallel chains could be a poor idea: among these chains some may not converge, so one long chain could be preferable to many short ones. Moreover, many parallel independent chains can artificially exhibit a more robust behavior which does not correspond to a real convergence of the algorithm.

In practice, however, one makes use of several chains in parallel. It is then tempting to exchange information between these chains to improve the mixing properties of the MCMC samplers [4, 5, 16, 3, 7, 8]. A general framework of "Population Monte Carlo" has been proposed in this context [14, 17, 2]. In this paper we propose an interacting method between parallel chains which provides an independent sample from the target distribution. Contrary to the papers previously cited, the proposal law in our work is given and does not adapt itself to the previous simulations. Hence, the problem of the choice of this law still remains.

The Metropolis-Hastings (MH) algorithm and its theoretical properties are presented in Section 2. The corresponding Metropolis within Gibbs (MwG) algorithm and its theoretical properties are presented in Section 3. In Section 4, two simple numerical examples illustrate how the introduction of interactions can speed up the convergence and handle multi-modal cases.

2 Parallel/interacting Metropolis-Hastings (MH) algorithm

Consider a target density law $\pi(x)$ defined on $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$ and a proposal kernel density $\pi^{\mathrm{prop}}(y|x)$. We propose a method for sampling $N$ independent values $X^1, \dots, X^N \in \mathbb{R}^n$ of the law $\pi(x)\,dx$.

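Notations: Let

$$X = X^{1:N} = X_{1:n} \in \mathbb{R}^{n\times N},$$

so that $X_\ell \in \mathbb{R}^N$ and $X^i \in \mathbb{R}^n$ (the same for $Y$ and $Z$); $x \in \mathbb{R}^n$ so that $x_\ell \in \mathbb{R}$ (the same for $y$ and $z$); $\xi, \xi' \in \mathbb{R}$. Here $X^{1:N} = (X^1, \dots, X^N)$ and $X_{1:n} = (X_1, \dots, X_n)$. We also define $\neg\ell = \{1, \dots, n\} \setminus \{\ell\}$. Note that the structure of the matrix $X$ is:

$$X = \begin{pmatrix}
X^1_1 & \cdots & X^i_1 & \cdots & X^N_1 \\
\vdots & & \vdots & & \vdots \\
X^1_\ell & \cdots & X^i_\ell & \cdots & X^N_\ell \\
\vdots & & \vdots & & \vdots \\
X^1_n & \cdots & X^i_n & \cdots & X^N_n
\end{pmatrix}$$

where the $i$-th column is the vector $X^i$ and the $\ell$-th row is the vector $X_\ell$.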
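The matrix layout above can be mirrored in an array. The following is a minimal sketch, assuming NumPy and illustrative sizes $n = 3$, $N = 4$; the variable names are hypothetical and indices are 0-based, unlike in the text.

```python
import numpy as np

n, N = 3, 4                                   # n: state dimension, N: number of chains (illustrative)
X = np.arange(n * N, dtype=float).reshape(n, N)

chain_i = X[:, 1]        # X^i for i = 2 (0-based column 1): a vector of R^n
component_l = X[0, :]    # X_l for l = 1 (0-based row 0): a vector of R^N

ell = 1                                       # 0-based index of the excluded component
not_ell = [m for m in range(n) if m != ell]   # the index set "not ell"
```

Storing one chain per column keeps the sequential update of Section 2.1 a simple column assignment.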

2.1 The algorithm

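We describe the Markov chain $\{X^{(k)}\}_{k\ge 0}$ over $\mathbb{R}^{n\times N}$ corresponding to the MH algorithm. It consists in $N$ mutually dependent realizations $X^{i,(k)}$ ($i = 1, \dots, N$) of the state variable and its limit distribution will be

$$\Pi(dX) \stackrel{\mathrm{def}}{=} \pi(X^1)\,dX^1 \cdots \pi(X^N)\,dX^N.$$

We detail an iteration $X^{(k)} = X \to X^{(k+1)} = Z$ of the MH algorithm. The $N$ vectors are updated sequentially:

$$[X^{1:N}] \to [Z^1 X^{2:N}] \to [Z^{1:2} X^{3:N}] \to \cdots \to [Z^{1:N-1} X^N] \to [Z^{1:N}].$$

At sub-iteration $i$, that is $[Z^{1:i-1} X^{i:N}] \to [Z^{1:i} X^{i+1:N}]$, we simulate $Z^i$ in two steps: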

Proposal step: independently one from the other, each chain $j = 1, \dots, N$ proposes a candidate $Y^j \in \mathbb{R}^n$ according to the proposal kernel starting from its current position, i.e.

$$Y^j \sim \pi^{\mathrm{prop}}_{i,j}(y \,|\, Z^{1:i-1}, X^i, X^{i+1:N})\,dy.$$

Note that the candidates $Y^j$ depend also on $i$. We will use a lighter notation:

$$\pi^{\mathrm{prop}}_{i,j}(y|X^i) = \pi^{\mathrm{prop}}_{i,j}(y \,|\, Z^{1:i-1}, X^i, X^{i+1:N}). \tag{1}$$

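Selection step: We can choose among these $N$ candidates $Y^{1:N}$ or stay at $X^i$ according to the multinomial law:

$$Z^i \leftarrow \begin{cases}
Y^1 & \text{with probability } \frac{1}{N}\,\alpha_{i,1}(X^i, Y^1), \\
\ \vdots \\
Y^N & \text{with probability } \frac{1}{N}\,\alpha_{i,N}(X^i, Y^N), \\
X^i & \text{with probability } \tilde\rho_i(X^i, Y)
\end{cases}$$

where the acceptance probabilities are

$$\alpha_{i,j}(x, y) \stackrel{\mathrm{def}}{=} \frac{\pi(y)}{\pi(x)}\,\frac{\pi^{\mathrm{prop}}_{i,j}(x|y)}{\pi^{\mathrm{prop}}_{i,j}(y|x)} \wedge 1,$$

$$\tilde\rho_i(X^i, Y) \stackrel{\mathrm{def}}{=} 1 - \frac{1}{N}\sum_{j=1}^{N} \alpha_{i,j}(X^i, Y^j).$$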
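As a small worked illustration of the selection law, take $N = 2$ and suppose the two acceptance probabilities happen to be $0.5$ and $1$. These numerical values are hypothetical and chosen only to make the arithmetic easy to follow.

```latex
\alpha_{i,1}(X^i, Y^1) = 0.5, \qquad \alpha_{i,2}(X^i, Y^2) = 1
\quad\Longrightarrow\quad
\begin{cases}
\mathbb{P}(Z^i = Y^1) = \tfrac{1}{2}\times 0.5 = 0.25, \\
\mathbb{P}(Z^i = Y^2) = \tfrac{1}{2}\times 1 = 0.5, \\
\mathbb{P}(Z^i = X^i) = \tilde\rho_i(X^i, Y) = 1 - \tfrac{1}{2}(0.5 + 1) = 0.25.
\end{cases}
```

The three probabilities sum to one, and the chain can stay at $X^i$ even when one candidate has acceptance probability $1$, because each candidate is first picked with weight $1/N$.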

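The final algorithm is depicted in Algorithm 1.

choose $X \in \mathbb{R}^{n\times N}$
for $k = 1, 2, \dots$ do
    for $i = 1:N$ do
        for $j = 1:N$ do
            $Y^j \sim \pi^{\mathrm{prop}}_{i,j}(y|X^i)\,dy$
            $\alpha_j \leftarrow [\pi(Y^j)\,\pi^{\mathrm{prop}}_{i,j}(X^i|Y^j)] \,/\, [\pi(X^i)\,\pi^{\mathrm{prop}}_{i,j}(Y^j|X^i)] \wedge 1$
        end for
        $\tilde\rho \leftarrow 1 - \frac{1}{N}\sum_{j=1}^{N} \alpha_j$
        $X^i \leftarrow Y^1$ with probability $\alpha_1/N$, ..., $Y^N$ with probability $\alpha_N/N$, or $X^i$ with probability $\tilde\rho$
    end for
end for

Algorithm 1: Parallel/interacting MH algorithm.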
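Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the bimodal target, the Gaussian proposals, and all function and variable names are hypothetical choices. Chain $i$ proposes for itself by a symmetric random walk, and each other chain $j$ proposes a candidate centered at its own current position, independently of $X^i$.

```python
import numpy as np

def log_target(x):
    # Assumed unnormalized target on R^n: equal mixture of N(-3, I) and N(+3, I).
    return np.logaddexp(-0.5 * np.sum((x + 3.0) ** 2), -0.5 * np.sum((x - 3.0) ** 2))

def log_gauss(y, center, sigma):
    # Log density of N(center, sigma^2 I) at y, up to a constant that cancels in ratios.
    return -0.5 * np.sum((y - center) ** 2) / sigma ** 2

def interacting_mh(X0, n_iter, sigma, rng):
    X = np.array(X0, dtype=float)            # shape (n, N); column i is chain i
    n, N = X.shape
    for _ in range(n_iter):
        for i in range(N):                   # sub-iteration i updates chain i only
            x = X[:, i].copy()
            centers = X.copy()               # columns j < i already hold Z^j
            Y = centers + sigma * rng.standard_normal((n, N))
            alpha = np.empty(N)
            for j in range(N):
                y, c = Y[:, j], centers[:, j]
                if j == i:
                    # Symmetric random walk from x: the proposal ratio equals 1.
                    log_r = log_target(y) - log_target(x)
                else:
                    # Proposal N(c, sigma^2 I) does not depend on x.
                    log_r = (log_target(y) + log_gauss(x, c, sigma)
                             - log_target(x) - log_gauss(y, c, sigma))
                alpha[j] = np.exp(min(0.0, log_r))
            stay = max(0.0, 1.0 - alpha.sum() / N)      # rho tilde
            choice = rng.choice(N + 1, p=np.append(alpha / N, stay))
            if choice < N:
                X[:, i] = Y[:, choice]       # accept candidate Y^choice
    return X

rng = np.random.default_rng(0)
samples = interacting_mh(rng.normal(size=(1, 10)), n_iter=100, sigma=1.0, rng=rng)
```

Each sub-iteration evaluates $N$ candidates for a single chain, so one sweep costs on the order of $N^2$ target evaluations, compared with $N$ for independent chains.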

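2.2 Description of the MH kernel

Lemma 2.1 The Markov kernel associated with the MH procedure described in Section 2.1 is

$$P(X; dZ) \stackrel{\mathrm{def}}{=} P_1(X^{1:N}; dZ^1)\,P_2(Z^1, X^{2:N}; dZ^2) \cdots P_N(Z^{1:N-1}, X^N; dZ^N) \tag{2}$$

where

$$P_i(Z^{1:i-1}, X^{i:N}; dz) \stackrel{\mathrm{def}}{=} \frac{1}{N}\sum_{j=1}^{N} \alpha_{i,j}(X^i, z)\,\pi^{\mathrm{prop}}_{i,j}(z|X^i)\,dz + \rho_i(X^i)\,\delta_{X^i}(dz). \tag{3}$$

The acceptance probability is

$$\alpha_{i,j}(x, z) \stackrel{\mathrm{def}}{=} \begin{cases} r_{i,j}(x, z) \wedge 1 & \text{if } (x, z) \in \mathsf{R}_{i,j}, \\ 0 & \text{otherwise,} \end{cases} \tag{4}$$

$$r_{i,j}(x, z) \stackrel{\mathrm{def}}{=} \frac{\pi(z)}{\pi(x)}\,\frac{\pi^{\mathrm{prop}}_{i,j}(x|z)}{\pi^{\mathrm{prop}}_{i,j}(z|x)}, \tag{5}$$

$$\rho_i(x) \stackrel{\mathrm{def}}{=} 1 - \frac{1}{N}\sum_{j=1}^{N} \int_{\mathbb{R}^n} \alpha_{i,j}(x, z)\,\pi^{\mathrm{prop}}_{i,j}(z|x)\,dz. \tag{6}$$

The set $\mathsf{R}_{i,j}$ is defined by:

$$\mathsf{R}_{i,j} \stackrel{\mathrm{def}}{=} \big\{ (x, z) \in \mathbb{R}^n \times \mathbb{R}^n \,;\ \pi(z)\,\pi^{\mathrm{prop}}_{i,j}(x|z) > 0 \ \text{and}\ \pi(x)\,\pi^{\mathrm{prop}}_{i,j}(z|x) > 0 \big\}.$$

Note that the functions $\alpha_{i,j}(x, z)$, $\rho_i(x)$, $r_{i,j}(x, z)$ and the set $\mathsf{R}_{i,j}$ depend on $Z^{1:i-1}$ and $X^{i:N}$.

The measures

$$\nu(dx \times dz) = \pi(z)\,\pi^{\mathrm{prop}}_{i,j}(x|z)\,dz\,dx, \qquad \nu^T(dx \times dz) = \pi(x)\,\pi^{\mathrm{prop}}_{i,j}(z|x)\,dz\,dx$$

are mutually absolutely continuous over $\mathsf{R}_{i,j}$ and mutually singular on the complementary set $[\mathsf{R}_{i,j}]^c$. The set $\mathsf{R}_{i,j}$ is unique, up to the $\nu$ and $\nu^T$ negligible sets, and symmetric, i.e. $(x, z) \in \mathsf{R}_{i,j} \Rightarrow (z, x) \in \mathsf{R}_{i,j}$.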

Proof This construction follows the general setup proposed by Luke Tierney in [20]. We now derive the probability kernel associated with the iteration described in Subsection 2.1. The kernel $P_i(Z^{1:i-1}, X^{i:N}; dz)$ is the composition of a proposal kernel and of a selection kernel:

$$P_i(Z^{1:i-1}, X^{i:N}; dz) = \int_{Y^{1:N}} S_i(Z^{1:i-1}, X^{i:N}, Y^{1:N}; dz)\,Q_i(Z^{1:i-1}, X^{i:N}; dY^{1:N})$$

which consists in proposing independently $N$ candidates $Y^{1:N}$ sampled from the proposal density, i.e.

$$Q_i(Z^{1:i-1}, X^{i:N}; dY^{1:N}) \stackrel{\mathrm{def}}{=} \prod_{k=1}^{N} \pi^{\mathrm{prop}}_{i,k}(Y^k|X^i)\,dY^k$$

and then in selecting among these candidates or staying at $X^i$ with the MH acceptance probability, i.e.

$$S_i(Z^{1:i-1}, X^{i:N}, Y^{1:N}; dz) \stackrel{\mathrm{def}}{=} \frac{1}{N}\sum_{j=1}^{N} \alpha_{i,j}(X^i, Y^j)\,\delta_{Y^j}(dz) + \tilde\rho_i(X^i, Y)\,\delta_{X^i}(dz).$$

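Hence:

$$\begin{aligned}
P_i(Z^{1:i-1}, X^{i:N}; dz)
&= \frac{1}{N}\sum_{j=1}^{N} \int_{Y^{1:N}} \alpha_{i,j}(X^i, Y^j)\,\delta_{Y^j}(dz)\,\Big\{ \prod_{k=1}^{N} \pi^{\mathrm{prop}}_{i,k}(Y^k|X^i)\,dY^k \Big\} \\
&\quad + \int_{Y^{1:N}} \tilde\rho_i(X^i, Y)\,\delta_{X^i}(dz)\,\Big\{ \prod_{k=1}^{N} \pi^{\mathrm{prop}}_{i,k}(Y^k|X^i)\,dY^k \Big\} \\
&= A_1 + A_2
\end{aligned}$$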

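and

$$\begin{aligned}
A_1 &= \frac{1}{N}\sum_{j=1}^{N} \int_{Y^j} \alpha_{i,j}(X^i, Y^j)\,\delta_{Y^j}(dz)\,\pi^{\mathrm{prop}}_{i,j}(Y^j|X^i)\,\underbrace{\int_{Y^{\neg j}} \Big\{ \prod_{k\neq j} \pi^{\mathrm{prop}}_{i,k}(Y^k|X^i)\,dY^k \Big\}}_{=1}\,dY^j \\
&= \frac{1}{N}\sum_{j=1}^{N} \alpha_{i,j}(X^i, z)\,\pi^{\mathrm{prop}}_{i,j}(z|X^i)\,dz
\end{aligned}$$

because $\int_{Y^j} \delta_{Y^j}(dz)\,dY^j = dz$. The second term $A_2$ reads: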

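$$\begin{aligned}
A_2 &= \int_{Y^{1:N}} \tilde\rho_i(X^i, Y)\,\delta_{X^i}(dz)\,\Big\{ \prod_{k=1}^{N} \pi^{\mathrm{prop}}_{i,k}(Y^k|X^i)\,dY^k \Big\} \\
&= \delta_{X^i}(dz) \int_{Y^{1:N}} \Big\{ 1 - \frac{1}{N}\sum_{j=1}^{N} \alpha_{i,j}(X^i, Y^j) \Big\} \Big\{ \prod_{k=1}^{N} \pi^{\mathrm{prop}}_{i,k}(Y^k|X^i)\,dY^k \Big\} \\
&= \delta_{X^i}(dz)\,\Big\{ 1 - \frac{1}{N}\sum_{j=1}^{N} \int_{Y^{1:N}} \alpha_{i,j}(X^i, Y^j) \prod_{k=1}^{N} \pi^{\mathrm{prop}}_{i,k}(Y^k|X^i)\,dY^k \Big\} \\
&= \delta_{X^i}(dz)\,\Big\{ 1 - \frac{1}{N}\sum_{j=1}^{N} \int_{Y^j} \alpha_{i,j}(X^i, Y^j)\,\pi^{\mathrm{prop}}_{i,j}(Y^j|X^i)\,dY^j \Big\},
\end{aligned}$$

that is, $A_2 = \rho_i(X^i)\,\delta_{X^i}(dz)$ by (6). Summing up $A_1$ and $A_2$ proves the Lemma. □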

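2.3 Invariance property

Lemma 2.2 For all $(x, z) \in \mathbb{R}^n \times \mathbb{R}^n$ a.e. we have:

$$\alpha_{i,j}(x, z)\,\pi(x)\,\pi^{\mathrm{prop}}_{i,j}(z|x) = \alpha_{i,j}(z, x)\,\pi(z)\,\pi^{\mathrm{prop}}_{i,j}(x|z).$$

Proof For $(x, z) \notin \mathsf{R}_{i,j}$ the result is obvious. For $(x, z) \in \mathsf{R}_{i,j}$ we have:

$$\begin{aligned}
(r_{i,j}(x, z) \wedge 1)\,\pi(x)\,\pi^{\mathrm{prop}}_{i,j}(z|x)
&= \min\big\{ \pi(z)\,\pi^{\mathrm{prop}}_{i,j}(x|z),\ \pi(x)\,\pi^{\mathrm{prop}}_{i,j}(z|x) \big\} \\
&= (r_{i,j}(z, x) \wedge 1)\,\pi(z)\,\pi^{\mathrm{prop}}_{i,j}(x|z).
\end{aligned}$$

□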
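The identity of Lemma 2.2 can be checked numerically on random points. The sketch below assumes a one-dimensional Gaussian-shaped target and a deliberately asymmetric Gaussian proposal; both densities and all names are hypothetical choices made for illustration only.

```python
import numpy as np

def target(x):
    # Assumed unnormalized target density on R.
    return np.exp(-0.5 * x ** 2)

def prop(y, x):
    # Assumed asymmetric proposal density of y given x: N(0.5 * x, 1).
    return np.exp(-0.5 * (y - 0.5 * x) ** 2) / np.sqrt(2.0 * np.pi)

def alpha(x, z):
    # Acceptance probability r(x, z) ∧ 1, with r as in (5).
    r = target(z) * prop(x, z) / (target(x) * prop(z, x))
    return np.minimum(r, 1.0)

rng = np.random.default_rng(1)
x, z = rng.normal(size=1000), rng.normal(size=1000)

lhs = alpha(x, z) * target(x) * prop(z, x)     # left-hand side of the lemma
rhs = alpha(z, x) * target(z) * prop(x, z)     # right-hand side of the lemma
both = np.minimum(target(z) * prop(x, z), target(x) * prop(z, x))
```

Both sides agree with the common minimum `both` up to floating-point rounding, which is the middle expression in the proof.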

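Lemma 2.3 (conditional detailed balance) The following equality of measures defined on $\mathbb{R}^n \times \mathbb{R}^n$

$$P_i(Z^{1:i-1}, X^{i:N}; dZ^i)\,\pi(X^i)\,dX^i = P_i(Z^{1:i}, X^{i+1:N}; dX^i)\,\pi(Z^i)\,dZ^i \tag{7}$$

holds true for any $i = 1, \dots, N$, $Z^{1:i-1} \in \mathbb{R}^{n\times(i-1)}$, and $X^{i+1:N} \in \mathbb{R}^{n\times(N-i)}$.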

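Proof The left-hand side of (7) is a measure, say $\nu(dZ^i \times dX^i)$, on $(\mathbb{R}^n \times \mathbb{R}^n, \mathcal{B}(\mathbb{R}^n \times \mathbb{R}^n))$. For all $A_1, A_2 \in \mathcal{B}(\mathbb{R}^n)$, we want to prove that $\nu(A_1 \times A_2) = \nu(A_2 \times A_1)$. We have:

$$\nu(A_1 \times A_2) = \int P_i(Z^{1:i-1}, X^{i:N}; A_1)\,\mathbf{1}_{A_2}(X^i)\,\pi(X^i)\,dX^i$$

and

$$P_i(Z^{1:i-1}, X^{i:N}; A_1) = \frac{1}{N}\sum_{j=1}^{N} \int \mathbf{1}_{A_1}(Z^i)\,\alpha_{i,j}(X^i, Z^i)\,\pi^{\mathrm{prop}}_{i,j}(Z^i|X^i)\,dZ^i + \rho_i(X^i)\,\mathbf{1}_{A_1}(X^i)$$

so that

$$\begin{aligned}
\nu(A_1 \times A_2)
&= \frac{1}{N}\sum_{j=1}^{N} \iint \mathbf{1}_{A_1}(Z^i)\,\mathbf{1}_{A_2}(X^i)\,\alpha_{i,j}(X^i, Z^i)\,\pi(X^i)\,\pi^{\mathrm{prop}}_{i,j}(Z^i|X^i)\,dX^i\,dZ^i \\
&\quad + \int \rho_i(X^i)\,\mathbf{1}_{A_1}(X^i)\,\mathbf{1}_{A_2}(X^i)\,\pi(X^i)\,dX^i.
\end{aligned} \tag{8}$$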

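And from Lemma 2.2, we get:

$$\begin{aligned}
\nu(A_1 \times A_2)
&= \frac{1}{N}\sum_{j=1}^{N} \iint \mathbf{1}_{A_1}(Z^i)\,\mathbf{1}_{A_2}(X^i)\,\alpha_{i,j}(Z^i, X^i)\,\pi(Z^i)\,\pi^{\mathrm{prop}}_{i,j}(X^i|Z^i)\,dZ^i\,dX^i \\
&\quad + \int \rho_i(X^i)\,\mathbf{1}_{A_1}(X^i)\,\mathbf{1}_{A_2}(X^i)\,\pi(X^i)\,dX^i.
\end{aligned}$$

Exchanging the names of the variables $X^i \leftrightarrow Z^i$ in the first term of the right-hand side of the previous equality leads to the same expression as (8) where $A_1$ and $A_2$ were interchanged, in other words $\nu(A_1 \times A_2) = \nu(A_2 \times A_1)$. □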
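A finite-state analogue makes the conditional detailed balance concrete. The sketch below is an illustration under assumptions: the state space has $m$ points instead of $\mathbb{R}^n$, densities become probability vectors, the other chains are frozen, and the $N$ proposal matrices are random strictly positive kernels. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
m, N = 5, 3                                  # m states, N proposal kernels (illustrative sizes)
pi = rng.random(m) + 0.1
pi /= pi.sum()                               # target probabilities pi(x) > 0

# Q[j, x, z] plays the role of pi^prop_{i,j}(z | x) with the other chains frozen.
Q = rng.random((N, m, m)) + 0.1
Q /= Q.sum(axis=2, keepdims=True)            # each row is a probability vector

# r[j, x, z] = pi(z) Q_j(z, x) / (pi(x) Q_j(x, z)), as in (5)
num = pi[None, None, :] * np.transpose(Q, (0, 2, 1))
den = pi[None, :, None] * Q
alpha = np.minimum(num / den, 1.0)

A = (alpha * Q).mean(axis=0)                 # (1/N) sum_j alpha_j(x, z) Q_j(x, z)
rho = 1.0 - A.sum(axis=1)                    # probability of staying put, as in (6)
P = A + np.diag(rho)                         # kernel (3) as an m x m matrix

flow = pi[:, None] * P                       # flow[x, z] = pi(x) P(x, z)
```

Here `flow` is symmetric, which is the discrete counterpart of (7), and every row of `P` sums to one, so the mixture of $N$ accept/reject moves is a genuine Markov kernel.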

Proposition 2.4 (invariance) The probability measure

$$\Pi(dX) = \pi(X^1)\,dX^1 \cdots \pi(X^N)\,dX^N$$

is an invariant distribution of the Markov kernel $P$, i.e. $\Pi P = \Pi$, that is:

$$\int_X P(X, dZ)\,\Big\{ \prod_{i=1}^{N} \pi(X^i)\,dX^i \Big\} = \prod_{i=1}^{N} \pi(Z^i)\,dZ^i. \tag{9}$$

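Proof

$$\begin{aligned}
\int_X P(X, dZ)\,\Big\{ \prod_{i=1}^{N} \pi(X^i)\,dX^i \Big\}
&= \int_X P_1(X^{1:N}; dZ^1)\,P_2(Z^1, X^{2:N}; dZ^2) \cdots P_N(Z^{1:N-1}, X^N; dZ^N)\,\Big\{ \prod_{i=1}^{N} \pi(X^i)\,dX^i \Big\} \\
&= \int_X P_1(X^{1:N}; dZ^1)\,\pi(X^1)\,dX^1\,P_2(Z^1, X^{2:N}; dZ^2) \cdots P_N(Z^{1:N-1}, X^N; dZ^N)\,\Big\{ \prod_{i=2}^{N} \pi(X^i)\,dX^i \Big\}.
\end{aligned}$$

Using (7) with $i = 1$ gives:

$$\int_X P(X, dZ)\,\Big\{ \prod_{i=1}^{N} \pi(X^i)\,dX^i \Big\} = \int_X P_1(Z^1, X^{2:N}; dX^1)\,\pi(Z^1)\,dZ^1\,P_2(Z^1, X^{2:N}; dZ^2) \cdots P_N(Z^{1:N-1}, X^N; dZ^N)\,\Big\{ \prod_{i=2}^{N} \pi(X^i)\,dX^i \Big\}.$$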
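The invariance statement (9) can also be checked numerically in a toy setting. The sketch below assumes $N = 2$ chains on a finite state space with $m$ points, with random strictly positive proposal tables that depend on the state of the other chain; these choices and all names are hypothetical and serve only as an illustration.

```python
import numpy as np

def mh_kernel(pi, Q):
    # Q[j, x, z]: proposal probabilities; returns kernel (3) as an m x m matrix.
    num = pi[None, None, :] * np.transpose(Q, (0, 2, 1))
    den = pi[None, :, None] * Q
    A = (np.minimum(num / den, 1.0) * Q).mean(axis=0)
    return A + np.diag(1.0 - A.sum(axis=1))

rng = np.random.default_rng(3)
m = 4
pi = rng.random(m) + 0.1
pi /= pi.sum()

# prop[i, other, j, x, z]: at sub-iteration i, with the other chain in state
# `other`, proposer j moves the updated chain from x to z with this probability.
prop = rng.random((2, m, 2, m, m)) + 0.1
prop /= prop.sum(axis=-1, keepdims=True)

K1 = np.stack([mh_kernel(pi, prop[0, o]) for o in range(m)])   # K1[x2, x1, z1]
K2 = np.stack([mh_kernel(pi, prop[1, o]) for o in range(m)])   # K2[z1, x2, z2]

# P[x1, x2, z1, z2] = P_1(x1, x2; z1) * P_2(z1, x2; z2), as in (2)
P = np.einsum('bac,cbd->abcd', K1, K2)
Pi = np.outer(pi, pi)                        # product target on the pair of chains
Pi_next = np.einsum('ab,abcd->cd', Pi, P)    # one step of the joint kernel applied to Pi
```

In this sketch `Pi_next` coincides with `Pi` up to rounding, even though the second sub-iteration uses the already updated first chain.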
