Monte Carlo methods in molecular dynamics
Part 1: Sampling the canonical distribution and computing free energy differences.
T. Lelièvre
CERMICS - Ecole des Ponts ParisTech & MicMac project-team - INRIA
IPAM tutorial, MTDUT September 2012
Introduction
The aim of molecular dynamics simulations is to understand the relationships between themacroscopic propertiesof a molecular system and itsatomisticfeatures. In particular, one would like to to evaluate numerically macroscopic quantities from models at the microscopic scale.
Some examples of macroscopic quantities:
(i) Thermodynamics quantities (average of some observable wrt an equilibrium measure): stress, heat capacity,free energy,...
Eµ(ϕ(X)) = Z
Rd
ϕ(x)µ(dx).
(ii) Dynamical quantities(average over trajectories at equilibrium):
diffusion coefficients, viscosity, transition rates,...
E(F((Xt)t≥0)) = Z
C0(R+,Rd)F((xt)t≥0))W(d((xt)t≥0)).
Introduction
Many applications in various fields: biology, physics, chemistry, materials science. Molecular dynamics computations consume today a lot of CPU time.
A molecular dynamics model amounts essentially in choosing a potentialV which associates to a configuration
(x1, ...,xN) =x ∈R3N an energy V(x1, ...,xN).
In the canonical (NVT) ensemble, configurations are distributed according to the Boltzmann-Gibbs probability measure:
dµ(x) =Z−1exp(−βV(x))dx, whereZ =R
exp(−βV(x))dx is the partition function and β= (kBT)−1 is proportional to the inverse of the temperature.
Introduction
Typically,V is a sum of potentials modelling interaction between two particles, three particles and four particles:
V =X
i<j
V1(xi,xj) + X
i<j<k
V2(xi,xj,xk) + X
i<j<k<l
V3(xi,xj,xk,xl).
For example,V1(xi,xj) =VLJ(|xi −xj|) where VLJ(r) =4ǫ
σ r
12
− σr6
is the Lennard-Jones potential.
Difficulties: (i) high-dimensional problem (N ≫1) ; (ii)µis a multimodal measure.
Introduction
To sampleµ, ergodic dynamics wrt to µare used. A typical example is theover-damped Langevin(or gradient) dynamics:
(GD) dXt =−∇V(Xt)dt+p
2β−1dWt.
It is the limit (when the mass goes to zero or the damping parameter to infinity) of theLangevin dynamics:
(
dXt=M−1Ptdt,
dPt=−∇V(Xt)dt−γM−1Ptdt+p
2γβ−1dWt, whereMis the mass tensor andγis the friction coefficient.
To compute dynamical quantities, these are also typically the dynamics of interest. Thus,
Eµ(ϕ(X))≃ 1 T
Z T 0
ϕ(Xt)dt andE(F((Xt)t≥0))≃ 1 N
N
X
m=1
F((Xmt )t≥0).
In the following, we mainly consider theover-damped Langevin dynamics.
Introduction
Probabilistic insert (1): discretization of SDEs.
The discretization of (GD) by the Euler scheme is (for a fixed timestep∆t):
Xn+1 =Xn− ∇V(Xn) ∆t+p
2β−1∆tGn
where(Gni)1≤i≤3N,n≥0 are i.i.d. random variables with lawN(0,1).
Indeed,
(W(n+1)∆t−Wn∆t)n≥0 =L √
∆t(Gn)n≥0.
In practice, a sequence of i.i.d. random variables with lawN(0,1) may be obtained from a sequence of i.i.d. random variables with lawU((0,1)).
Introduction
µin invariant for (GD)Proof: One needs to show that if the law of X0 is µ, then the law of Xt is also µ. Let us denote Xxt the solution to (GD) such thatX0=x. Let us consider the function u(t,x) solution to:
∂tu(t,x) =−∇V(x)· ∇u(t,x) +β−1∆u(t,x), u(0,x) =φ(x)(+ assumptions on decay at infinity),
then,u(t,x) =E(φ(Xxt)). Thus, the measure µis invariant:
d dt
Z
E(φ(Xxt))dµ(x) =Z−1 Z
∂tu(t,x)exp(−βV(x))dx
=Z−1 Z
−∇V · ∇u+β−1∆u
exp(−βV)=0.
Therefore, Z
E(φ(Xxt))dµ(x) = Z
φ(x)dµ(x).
Introduction
Probabilistic insert (2): Feynman-Kac formula.
Whyu(t,x) =E(φ(Xxt))? For 0<s <t, we have (characteristic method):
d u(t−s,Xxs) =−∂tu(t−s,Xxs)ds+∇u(t−s,Xxs)·dXxs +β−1∆u(t−s,Xxs)ds,
=
−∂tu(t−s,Xxs)− ∇V(Xxs)· ∇u(t−s,Xxs) +β−1∆u(t−s,Xxs)
ds+p
2β−1∇u(t−s,Xxs)·dWs. Thus, integrating overs ∈(0,t) and taking the expectation:
E(u(0,Xxt))−E(u(t,Xx0)) =p 2β−1E
Z t
0 ∇u(t−s,Xxs)·dWs
=0.
Introduction
Probabilistic insert (3): Itô’s calculus. (in 1d.)
Where does the term∆u come from ? Starting from the discretization:
Xn+1=Xn−V′(Xn) ∆t+p
2β−1∆tGn, we have (for a time-independent functionu):
u(Xn+1) =u
Xn−V′(Xn) ∆t+p
2β−1∆tGn
,
=u(Xn)−u′(Xn)V′(Xn) ∆t+p
2β−1∆tu′(Xn)Gn
+β−1(Gn)2u′′(Xn)∆t+o(∆t).
Thus, summing overn ∈[0...t/∆t] and taking the limit∆t →0, u(Xt) =u(X0)−
Z t 0
V′(Xs)u′(Xs)ds+p 2β−1
Z t 0
u′(Xs)dWs
+β−1 Z t
0
u′′(Xs)ds.
Introduction: metastability
Difficulty: In practice,Xt is a metastable process, so that the convergence to equilibrium is very slow.
A 2d schematic picture: Xt1 is a slow variable(a metastable dof) of the system.
x1
x2
V(x1,x2)
Xt1
t
Introduction: metastability
A more realistic example (Dellago, Geissler): Influence of the solvation on a dimer conformation.
00000000 00000000 00000000 0000
11111111 11111111 11111111 1111
000000 000000 000000 111111 111111 111111
000000 000000 000000 111111 111111 111111 00000000 00000000 00000000 11111111 11111111 11111111
00000000 00000000 00000000 0000
11111111 11111111 11111111 1111
000000 000000 000000 000
111111 111111 111111 111
00000000 00000000 00000000 11111111 11111111 11111111
000000 000000 000000 111111 111111 111111 000000000000000000000000
11111111 11111111 11111111 000000 000000 000000 111111 111111 111111 000000 000000 000000 111111 111111 111111
000000 000000 000000 000
111111 111111 111111 111
000000 000000 000000 111111 111111 111111
000000 000000 000000 000
111111 111111 111111 111
00000000 00000000 00000000 0000
11111111 11111111 11111111 1111
000000 000000 000000 000
111111 111111 111111 111
000000 000000 000000 000
111111 111111 111111 111
00000000 00000000 00000000 0000
11111111 11111111 11111111 1111
000000 000000 000000 111111 111111 111111 000000 000000 000000 111111 111111 111111
000000 000000 000000 000
111111 111111 111111 111
00000000 00000000 00000000 11111111 11111111 11111111
000000 000000 000000 111111 111111 111111
00000000 00000000 00000000 11111111 11111111 11111111 000000 000000 000000 000
111111 111111 111111 111 00000000 00000000 00000000 11111111 11111111 11111111
Compact state. Stretched state.
The particles interact through a pair potential: truncated LJ for all particles except the two monomers (black particles) which interact through a double-well potential. A slow variable is the distance between the two monomers.
Introduction: metastability
A “real” example: ions canal in a cell membrane. (C. Chipot).
A second “real” example: Diffusion of adatoms on a surface.
(A. Voter).
Los Alamos
Introduction: metastability
One central numerical difficulty is thusmetastability. How to quantify this bad behaviour ?
1. Escape time from a potential well.
2. Asymptotice variance of the estimator.
3. “Decorrelation time”.
4. Rate of convergence of the law of Xt to µ.
In the following we use the fourth criterion.
Introduction: metastability
The PDE point of view: convergence of the pdfψ(t,x) of Xt to ψ∞(x) =Z−1e−βV(x). ψ satisfies the Fokker-Planck equation
∂tψ=div(∇Vψ+β−1∇ψ), which can be rewritten as∂tψ=β−1div
ψ∞∇
ψ ψ∞
. Let us introducethe entropy
E(t) =H(ψ(t,·)|ψ∞) = Z
ln ψ
ψ∞
ψ.
We have (Csiszár-Kullback inequality):
kψ(t,·)−ψ∞kL1 ≤p 2E(t).
Introduction: metastability
dE dt =
Z ln
ψ ψ∞
∂tψ
=β−1 Z
ln ψ
ψ∞
div
ψ∞∇ ψ
ψ∞
=−β−1 Z
∇ln
ψ ψ∞
2
ψ=:−β−1I(ψ(t,·)|ψ∞).
IfV is such that the following Logarithmic Sobolev inequality (LSI(R)) holds: ∀ψ pdf,
H(ψ|ψ∞)≤ 1
2RI(ψ|ψ∞)
thenE(t)≤E(0)exp(−2β−1Rt) and thusψ converges toψ∞ exponentially fast with rateβ−1R (and the converse is true).
Metastability ⇐⇒ smallR
Introduction: reaction coordinate and free energy
Metastability: How to attack this problem ?
We suppose in the following that the slow variable isof dimension 1 andknown: ξ(Xt), where ξ :Rn→T.
For example, in the 2d simple system,ξ(x1,x2) =x1.
x1
x2
V(x1,x2)
Xt1
t
We will use this knowledge to build efficient sampling techniques.
Two ideas:
• Conditioning (constrained dynamics),
• Importance sampling (biased dynamics).
Free energywill play a central role.
Introduction: reaction coordinate and free energy
Let us introduce two probability measures associated toµandξ:
• The image of the measure µby ξ:
ξ∗µ(dz) =exp(−βA(z))dz where the free energy Ais defined by:
A(z) =−β−1ln Z
Σ(z)
e−βVδξ(x)−z(dx)
! ,
with Σ(z) ={x, ξ(x) =z} is a (smooth) submanifold of Rd, andδξ(x)−z(dx)dz =dx.
• The probability measure µconditioned to ξ(x) =z:
µΣ(z)(dx) = exp(−βV(x))δξ(x)−z(dx) exp(−βA(z)) .
Introduction: reaction coordinate and free energy
In the simple caseξ(x1,x2) =x1, we have:
• The image of the measure µby ξ:
ξ∗µ(dx1) =exp(−βA(x1))dx1
where the free energy Ais defined by:
A(x1) =−β−1ln Z
e−βV(x1,x2)dx2
,
andΣ(x1) ={(x1,x2),x2∈R}.
• The probability measure µconditioned to ξ(x1,x2) =x1: µΣ(x1)(dx2) = exp(−βV(x1,x2))dx2
exp(−βA(x1)) .
Introduction: reaction coordinate and free energy
The measureδξ(x)−z is related to the Lebesgue measure onΣ(z) through:
δξ(x)−z =|∇ξ|−1dσΣ(z). This is theco-area formula. We thus have:
A(z) =−β−1ln Z
Σ(z)
e−βV|∇ξ|−1dσΣ(z)
! .
General statement: LetX be a random variable with lawψ(x)dx inRn, then ξ(X) has lawR
Σ(z)ψ|∇ξ|−1dσΣ(z)dz,and the law of X conditioned to a fixed valuez ofξ(X)is
dµΣ(z)= R ψ|∇ξ|−1dσΣ(z)
Σ(z)ψ|∇ξ|−1dσΣ(z).
Indeed, for any two bounded functionsf andg, E(f(ξ(X))g(X)) =
Z
Rnf(ξ(x))g(x)ψ(x)dx,
= Z
Rp Z
Σ(z)
f◦ξgψ|∇ξ|−1dσΣ(z)dz,
= Z
Rpf(z) R
Σ(z)gψ|∇ξ|−1dσΣ(z) R
Σ(z)ψ|∇ξ|−1dσΣ(z) Z
Σ(z)
ψ|∇ξ|−1dσΣ(z)dz.
Introduction: reaction coordinate and free energy
Remarks:
-A is thefree energy associated with thereaction coordinateor collective variableξ (angle, length, ...). Ais defined up to an additive constant, so that it is enough to compute free energy differences, or the derivative ofA(the mean force).
-A(z) =−β−1lnZΣ(z) andZΣ(z) is the partition function associated with the conditioned probability measures:
µΣ(z)=ZΣ(z)−1 e−βV|∇ξ|−1dσΣ(z). - IfU =
Z
Σ(z)
V ZΣ(z)−1 e−βVδξ(x)−z(dx) and S =−kB
Z
Σ(z)
ln
ZΣ(z)−1 e−βV
ZΣ(z)−1 e−βVδξ(x)−z(dx)(and β−1=kBT), thenA=U−TS.
Introduction: reaction coordinate and free energy
What is free energy ? The simple example of the solvation of a dimer. (Profiles computed using thermodynamic integration.)
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
−0.6
−0.4
−0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8
Parameter z
Free energy difference ∆ F(z)
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
−0.6
−0.4
−0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8
Parameter z
Free energy difference ∆ F(z)
The density of the solvent molecules is lower on the left than on the right. At high (resp. low) density, the compact state is more (resp. less) likely. The “free energy barrier” is higher at high density than at low density. Related question: interpretation of the free energy barrier in terms of dynamics ?
Introduction: reaction coordinate and free energy
There are manyfree energy calculation techniques:
(a) Thermodynamic integration. (b) Histogram method.
(c)Non equilibrium dynamics. (d)Adaptive dynamics.
Thermodynamic integration
Thermodynamic integration
Thermodynamic integrationis based on two remarks:
(1) The derivativeA′(z)can be obtained by sampling the conditioned probability measureµΣ(z) (Sprik, Ciccotti, Kapral, Vanden-Eijnden, E, den Otter, ...)
A′(z) =ZΣ(z)−1 Z
∇V · ∇ξ
|∇ξ|2 −β−1div
∇ξ
|∇ξ|2
e−βV|∇ξ|−1dσΣ(z),
= Z
f dµΣ(z),
where f = ∇V|∇ξ|·∇ξ2 −β−1div
∇ξ
|∇ξ|2
. Another expression:
A′(z) =ZΣ(z)−1
Z ∇ξ
|∇ξ|2 ·
∇V˜ +β−1H
exp(−βV˜)dσΣ(z), whereV˜ =V +β−1ln|∇ξ|andH=−∇ ·
∇ξ
|∇ξ|
∇ξ
|∇ξ| is the mean curvature vector.
Thermodynamic integration
In the simple caseξ(x1,x2) =x1, remember that A(x1) =−β−1ln
Z
e−βV(x1,x2)dx2
, so that
A′(x1) = Z
∂x1V e−βV(x1,x2)dx2
Z
e−βV(x1,x2)dx2
= Z
∂x1V dµΣ(x1).
Thermodynamic integration
Proof in the general case : (based on the co-area formula)
Z Z
exp(−βV˜)dσΣ(z) ′
φ(z)dz =− Z Z
exp(−βV˜)dσΣ(z)φ′dz,
=− Z Z
exp(−βV˜)φ′◦ξdσΣ(z)dz,
=− Z
exp(−βV˜)φ′◦ξ|∇ξ|dx,
=− Z
exp(−βV˜)∇(φ◦ξ)· ∇ξ
|∇ξ|2|∇ξ|dx,
= Z
∇ ·
exp(−βV˜) ∇ξ
|∇ξ|
φ◦ξdx,
= Z Z
−β∇V˜ · ∇ξ
|∇ξ|2 +|∇ξ|−1∇ · ∇ξ
|∇ξ| !
exp(−βV˜)dσΣ(z)φ(z)dz.
Thermodynamic integration
(2) It is possible to sample the conditioned probability measure µΣ(z)=ZΣ(z)−1 exp(−βV˜)dσΣ(z) by considering the following constrained dynamics:
(RCD)
dXt =−∇V˜(Xt)dt+p
2β−1dWt+∇ξ(Xt)dΛt, dΛt such thatξ(Xt) =z.
Thus,A′(z) =limT→∞ T1 RT
0 f(Xt)dt.
The free energy profile is then obtained bythermodynamic integration:
A(z)−A(0) = Z z
0
A′(z)dz ≃
K
X
i=0
ωiA′(zi).
Thermodynamic integration
Notice that there is actually no need to computef in practice since the mean force may be obtained by averaging the Lagrange
multipliers.
Indeed, we havedΛt =dΛmt +dΛft,with dΛmt =−p
2β−1|∇ξ|∇ξ2(Xt)·dWt and dΛft=|∇ξ|∇ξ2 ·
∇V˜ +β−1H
(Xt)dt =f(Xt)dt so that A′(z) = lim
T→∞
1 T
Z T
0
dΛt = lim
T→∞
1 T
Z T
0
dΛft.
Of course, this comes at a price: essentially, we are using the fact that M→∞lim lim
∆t→0 1 M∆t
M X
m=1 h
ξ q+√
∆t Gm
−2ξ(q) +ξ q−√
∆t Gmi
= ∆ξ(q),
and this estimator has a non zero variance.
Thermodynamic integration
More explicitly, the rigidly constrained dynamics writes:
(RCD) dXt=P(Xt)
−∇V˜(Xt)dt+p
2β−1dWt
+β−1H(Xt)dt,
whereP(x) is the orthogonal projection operator:
P(x) =Id−n(x)⊗n(x), withn the unit normal vector: n(x) = ∇ξ
|∇ξ|(x).
(RCD) can also be written using theStratonovitch product:
dXt=−P(Xt)∇V˜(Xt)dt+p
2β−1P(Xt)◦dWt.
It is easy to check thatξ(Xt) =ξ(X0) =z for Xt solution to (RCD).
Thermodynamic integration
[G. Ciccotti, TL, E. Vanden-Einjden, 2008]Assume wlg that z =0. The probabilityµΣ(0) is theunique invariant measure with support in Σ(0) for (RCD).
Proposition: LetXt be the solution to (RCD) such that the law of X0 isµΣ(0). Then, for all smooth functionφand for all timet >0,
E(φ(Xt)) = Z
φ(x)dµΣ(0)(x).
Proof:Introduce the infinitesimal generator and applythe divergence theorem on submanifolds:∀φ∈ C1(R3N,R3N),
Z
divΣ(0)(φ)dσΣ(0)=− Z
H·φdσΣ(0), wheredivΣ(0)(φ) =tr(P∇φ).
Thermodynamic integration
Discretization: These two schemes are consistent with (RCD):
(S1)
Xn+1 =Xn− ∇V˜(Xn)∆t+p
2β−1∆Wn+λn∇ξ(Xn+1), with λn ∈Rsuch thatξ(Xn+1) =0,
(S2)
Xn+1=Xn− ∇V˜(Xn)∆t+p
2β−1∆Wn+λn∇ξ(Xn), withλn∈R such thatξ(Xn+1) =0,
where∆Wn=W(n+1)∆t−Wn∆t. The constraint is exactly satisfied (important for longtime computations). The discretization ofA′(0) =limT→∞ T1 RT
0 dΛt is:
T→∞lim lim
∆t→0
1 T
T/∆t
X
n=1
λn=A′(0).
Thermodynamic integration
In practice, the followingvariance reduction schememay be used:
Xn+1=Xn− ∇V˜(Xn)∆t+p
2β−1∆Wn+λ∇ξ(Xn+1), with λ∈Rsuch thatξ(Xn+1) =0,
X∗ =Xn− ∇V˜(Xn)∆t−p
2β−1∆Wn+λ∗∇ξ(X∗), with λ∗ ∈R such thatξ(X∗) =0,
andλn= (λ+λ∗)/2.
The martingale partdΛmt (i.e. the most fluctuating part) of the Lagrange multiplier is removed.
Thermodynamic integration
Error analysis[Faou,TL, Mathematics of Computation, 2010]: Using classical technics (Talay-Tubaro like proof), one can check that the ergodic measureµ∆tΣ(0) sampled by the Markov chain(Xn) is an
approximation of order one ofµΣ(0): for all smooth functions g : Σ(0)→R,
Z
Σ(0)
gdµ∆tΣ(0)− Z
Σ(0)
gdµΣ(0)
≤C∆t.
Metastability issue: Using TI, we sample the conditional measures µΣ(z) rather than the original Gibbs measure µ. The long-time behaviour of the constrained dynamics (RCD) will be essentially limited bythe LSI contantρ(z) of the conditional measures µΣ(z) (to be compared with the LSI constantR of the original measure µ). For well-chosen ξ,ρ(z)≫R, which explains the efficiency of the whole procedure.
Thermodynamic integration
Remarks:
- There are many ways to constrain the dynamics (GD). We chose one which is simple to discretize. We may also have used, for example (forz =0)
dXηt =−∇V(Xηt)dt − 1
2η∇(ξ2)(Xηt)dt+p
2β−1dWt, where the constraint is penalized. One can show that
limη→0Xηt =Xt (inL∞t∈[0,T](L2ω)-norm) whereXt satisfies (RCD). Notice that we usedV and notV˜ in the penalized dynamics.
The statistics associated with the dynamics where the constraints are rigidly imposed and the dynamics where the constraints are softly imposed through penalizationare different: “a stiff spring6=a rigid rod” (van Kampen, Hinch,...).
Thermodynamic integration
- TI yields a way to computeR
φ(x)dµ(x):
Z
φ(x)dµ(x) =Z−1 Z
φ(x)e−βV(x)dx,
=Z−1 Z
z
Z
Σ(z)
φe−βV|∇ξ|−1dσΣ(z)dz, (co-area formula)
=Z−1 Z
z
R
Σ(z)φe−βV|∇ξ|−1dσΣ(z) R
Σ(z)e−βV|∇ξ|−1dσΣ(z) Z
Σ(z)
e−βV|∇ξ|−1dσΣ(z)dz,
= Z
z
e−βA(z)dz −1Z
z
Z
Σ(z)
φdµΣ(z)
!
e−βA(z)dz.
withΣ(z) ={x, ξ(x) =z}, A(z) =−β−1ln
R
Σ(z)e−βV|∇ξ|−1dσΣ(z) and µΣ(z)=e−βV|∇ξ|−1dσΣ(z)/R
Σ(z)e−βV|∇ξ|−1dσΣ(z).
Thermodynamic integration
-[C. Le Bris, TL, E. Vanden-Einjden, CRAS 2008] For a general SDE (with a non isotropic diffusion), the following diagramdoes not commute:
Pcont
Pdisc Projected discretized process
? Discretized process
Projected continuous process
Continuous process
Discretized projected continuous process
∆t
∆t
Thermodynamic integration
Generalization to Langevin dynamics. Interests: (i) Newton’s equations of motion are more “natural”; (ii) leads to numerical schemes which sample the constrained measure without time discretization error; (iii) seems to be more robust wrt the timestep choice.
dqt =M−1ptdt,
dpt =−∇V(qt)dt−γM−1ptdt +p
2γβ−1dWt+∇ξ(qt)dλt, ξ(qt) =z.
The probability measure sampled by this dynamics is
µT∗Σ(z)(dqdp) =Z−1exp(−βH(q,p))σT∗Σ(z)(dqdp), whereH(q,p) =V(q) +12pTM−1p.
Thermodynamic integration
The marginal ofµT∗Σ(z)(dqdp)in q writes:
νΣ(z)M = 1
Z exp(−βV(q))σMΣ(z)(dq)6= 1
Z exp(−βV(q))δξ(q)−z(dq).
Thus, the “free energy” which is naturally computed by this dynamics is
AM(z) =−β−1ln Z
Σ(z)
exp(−βV(q))σΣ(z)M (dq)
! .
The original free energy may be recovered from the relation: for GM =∇ξTM−1∇ξ,
A(z)−AM(z) =−β−1ln Z
Σ(z)
det(GM)−1/2dνΣ(z)M
! .
Thermodynamic integration
Moreover, one can check that:
T→∞lim 1 T
Z T 0
dλt = (AM)′(z).
Discretization: A natural numerical scheme is to use a splitting:
• 1/2 midpoint Euler on the fluctuation-dissipation part,
• 1 Verlet step on the Hamiltonian part (RATTLE scheme) and
• 1/2 midpoint Euler on the fluctuation-dissipation part.
Thermodynamic integration
pn+1/4=pn−∆t
4 γM−1(pn+pn+1/4) + r∆t
2 σGn+∇ξ(qn)λn+1/4,
∇ξ(qn)TM−1pn+1/4=0,
pn+1/2=pn+1/4−∆t
2 ∇V(qn) +∇ξ(qn)λn+1/2, qn+1=qn+ ∆t M−1pn+1/2,
ξ(qn+1) =z,
pn+3/4=pn+1/2−∆t
2 ∇V(qn+1) +∇ξ(qn+1)λn+3/4,
∇ξ(qn+1)TM−1pn+3/4=0,
pn+1=pn+3/4−∆t
4 γM−1(pn+3/4+pn+1) + r∆t
2 σGn+1/2 +∇ξ(qn+1)λn+1,
∇ξ(qn+1)TM−1pn+1=0, andlimT→∞lim∆t→0T1 PT/∆t
n=1
λn+1/2+λn+3/4
= (AM)′(z).
Thermodynamic integration
Using the symmetry of the Verlet step, it is easy toadd a Metropolization stepto the previous numerical scheme, thus removing the time discretization error. For this modified scheme, it is easy to prove that
∆t→0lim lim
T→∞
1 T
T/∆t
X
n=1
λn+1/2+λn+3/4
= (AM)′(z).
By choosingM = ∆tγ/4=Id, this leads to an original sampling scheme in the configuration space (generalized Hybrid Monte Carlo scheme).
Notice that it is not clear how to use such a Metropolization step for the constrained dynamics (RCD) since the proposal kernel is not symmetric, and does not admit any simple analytical expression.
Thermodynamic integration
Algorithm: Let us introduce R∆t which is such that, if
(qn,pn)∈T∗Σ(z), and |pn|2 ≤R∆t, one step of the RATTLE scheme is well defined(i.e.there exists a unique solution to the constrained problem). Then the GHMC scheme writes (M = ∆tγ/4=Id):
Thermodynamic integration
Consider an initial configurationq0∈Σ(z). Iterate onn≥0,
1. Sample a random vector in the tangent spaceTqnΣ(z)(∇ξ(qn)Tpn=0):
pn=β−1/2P(qn)Gn,
where(Gn)n≥0are i.i.d. standard random Gaussian variables, and compute the energyEn= 1
2|pn|2+V(qn)of the configuration(qn,pn);
2. If|pn|2>R∆t, setEn+1= +∞and go to(3); otherwise perform one integration step of the RATTLE scheme:
pn+1/2=pn−∆t
2 ∇V(qn) +∇ξ(qn)λn+1/2, e
qn+1=qn+ ∆t pn+1/2, ξ(eqn+1) =z,
e
pn+1=pn+1/2−∆t
2 ∇V(eqn+1) +∇ξ(eqn+1)λn+1,
∇ξ(eqn+1)Tepn+1=0;
Thermodynamic integration
3. If|epn+1|2>R∆t, setEn+1= +∞; otherwise compute the energy En+1=1
2|epn+1|2+V(eqn+1)of the new phase-space configuration.
Accept the proposal and setqn+1=qen+1 with probability min
exp(−β(En+1−En)),1
; otherwise, reject and setqn+1=qn.
Proposition: The probability measure νΣ(z)M = 1
Z exp(−βV(q))σMΣ(z)(dq) is invariant for the Markov Chain(qn)n≥1.
Non-equilibrium dynamics
Non-equilibrium dynamics
Let us consider a stochastic process such thatX0 ∼µΣz(0) and
dXt =−P(Xt)∇V˜(Xt)dt+p
2β−1P(Xt)◦dWt +∇ξ(Xt)dΛextt ,
dΛextt = z′(t)
|∇ξ(Xt)|2dt,
wherez : [0,T]→[0,1]is a fixed deterministic evolution of the reaction coordinateξ, such that z(0) =0 andz(T) =1.
One can check that
ξ(Xt) =z(t).
Non-equilibrium dynamics
The dynamics can also be written using a Lagrange multiplier:
dXt=−∇V˜(Xt)dt+p
2β−1dWt+∇ξ(Xt)dΛt, ξ(Xt) =z(t).
And we have
dΛt=dΛmt +dΛft+dΛextt , wheredΛmt =−p
2β−1|∇ξ|∇ξ2(Xt)·dWt ,dΛft =f(Xt)dt and dΛextt = |∇ξ(Xz′(t)
t)|2dt.
Non-equilibrium dynamics
How to get equilibrium quantities (like the free energy) through non-equilibrium simulations ?
The idea is to associate to each trajectoryXt a weight W(t) =
Z t 0
f(Xs)z′(s)ds= Z t
0
z′(s)dΛfs.
and to compute free energy differences by a Feynman-Kac formula (Jarzynski identity):
A(z(t))−A(z(0)) =−β−1ln(E(exp(−βW(t)))).
Non-equilibrium dynamics
[TL, M. Rousset, G. Stoltz, 2007]. The proof consists in introducing the semi-group associated with the dynamics
u(s,x) =E
ϕ(Xs,xt )exp
−β Z t
s
f(Xs,xr )z′(r)dr
and to show that dsd R
u(s, .)exp(−βV˜)dσΣz(s) =0 using the divergence theorem on submanifolds. Then
Z
u(t, .)exp(−βV˜)dσΣz(t) = Z
u(0, .)exp(−βV˜)dσΣz(0) is equivalent to
Z
ϕexp(−βV˜)dσΣz(t)
=exp(−βA(z(0)))E Z
ϕ(Xt)exp
−β Z t
0
f(Xr)z′(r)dr
.
Non-equilibrium dynamics
A more general relation is the so-calledCrooks identitywhich is a more general formula relating the free energy to the work of forward and backward switchedprocesses. Letqtf andqbt satisfy:
q0f ∼µΣ(z(0)),q0b∼µΣ(z(T)), ( dqtf =−∇V˜(qft)dt+p
2β−1dWtf +∇ξ(qft)dΛft, ξ(qtf) =z(t),
( dqtb′ =−∇V˜(qtb′)dt′+p
2β−1dWtb′ +∇ξ(qbt′)dΛbt′, ξ(qtb′) =z(T −t′).
Non-equilibrium dynamics
Then, for anyθ∈[0,1], for any path functional φ, exp
−β(A(z(T))−A(z(0)) E
φ({qbT−s}0≤s≤T)exp(−βθWb(T))
=E
φ({qsf}0≤s≤T)exp(−β(1−θ)Wf(T)) ,
whereWf(T) =Rt
0 f(qsf)z′(s)ds and Wb(T) =−Rt
0 f(qsb)z′(T −s)ds.
This identity can be used tocombine forward and backward processes to get better estimates of the free energy difference, see for examplebridge sampling methods[Bennett, Meng and Wong, Shirts].
Non-equilibrium dynamics
The discretization of the constrained process is (as before):
(S1)
Xn+1 =Xn− ∇V˜(Xn)∆t+p
2β−1∆Wn+λn∇ξ(Xn+1), with λn such thatξ(Xn+1) =z(tn+1),
(S2)
Xn+1=Xn− ∇V˜(Xn)∆t+p
2β−1∆Wn+λn∇ξ(Xn), withλn such thatξ(Xn+1) =z(tn+1).
To extractλfn fromλn, one can e.g. compute:
λfn=λn−z(tn+1)−z(tn)
|∇ξ(Xn)|2 +p
2β−1 ∇ξ
|∇ξ|2(Xn)·∆Wn.
Non-equilibrium dynamics
Another method to computeλfn consists in:
XRn+1 =Xn− ∇V˜(Xn)∆t−p
2β−1∆Wn+λRn∇ξ(XRn+1), with λRn such that 12 ξ(XRn+1) +ξ(Xn+1)
=ξ(Xn).
We then haveλfn= 12 λn+λRn . The weight is then approximated by
( W0=0,
Wn+1 =Wn+z(tn+1t )−z(tn)
n+1−tn λfn, and a (biased) estimator of the free energy difference A(z(T))−A(z(0))is−β−1ln
1 M
PM
m=1exp
−βWT/∆tm .
Non-equilibrium dynamics
In practice, the efficiency of this numerical method is not clearly demonstrated. If the transition is too fast, the variance of the estimator is very large. If the transition is slow, we are back to thermodynamic integration...
Ideas: (i) combine forward and backward trajectories, (ii) add selection mechanisms[M. Rousset, G. Stoltz, 2006] or (iii) use importance sampling to help the transition (escorting)[Vaikuntanathan, Jarzynski, 2008]. All this can be generalized to Langevin (phase-space) dynamics, with the additional difficulty that generalized free energies with constraints on both positions and momenta are obtained.
Adaptive biasing techniques
Adaptive biasing techniques
Recall that the original gradient dynamics (GD) dXt =−∇V(Xt)dt+p
2β−1dWt
is metastable. We want to use the information thatξ(Xt) is a slow variable, whereξ:Rn→Tto have better sampling.
We will use the free energy A(z) =−β−1ln
Z
Σ(z)
e−βVδξ(x)−z(dx)
! ,
to bias the dynamics.
Adaptive biasing techniques
The bottom line of adaptive methods is the following: for “well chosen” ξ the potentialV −A◦ξ is less rugged thanV. Indeed, by constructionξ∗exp(−β(V −A◦ξ)) =1T.
Problem: Ais unknown ! Idea: use a time dependent potential of the form
Vt(x) =V(x)−At(ξ(x))
whereAt is an approximation at timet of A, given the configurations visited so far.
Hopes:
• build a dynamics which goes quickly to equilibrium,
• compute free energy profiles.
Wang-Landau, ABF, metadynamics:Darve, Pohorille, Hénin, Chipot, Laio, Parrinello, Wang, Landau,...
Adaptive biasing techniques
x coordinate
y coordinate
−1.5 −1 −0.5 0 0.5 1 1.5
−1.5
−1
−0.5 0 0.5 1 1.5
0.0 2000 4000 6000 8000 10000
−1.5
−1.0
−0.5 0.0 0.5 1.0 1.5
Time
x coordinate
x coordinate
y coordinate
−1.5 −1 −0.5 0 0.5 1 1.5
−1.5
−1
−0.5 0 0.5 1 1.5
0.0 2000 4000 6000 8000 10000
−1.5
−1.0
−0.5 0.0 0.5 1.0 1.5
Time
x coordinate
A 2d example of a free energy biased trajectory: energetic barrier.
Adaptive biasing techniques
x coordinate
y coordinate
0.0 5000 10000 15000 20000
−8
−4 0 4 8
Time
x coordinate
−8 −6 −2 2 6 8
0.
3.
x coordinate
y coordinate
0 5000 10000 15000 20000
−8
−4 0 4 8
Time
x coordinate
A 2d example of a free energy biased trajectory: entropic barrier.