Part 1: Sampling the canonical distribution and computing free energy differences.

(1)

Monte Carlo methods in molecular dynamics

Part 1: Sampling the canonical distribution and computing free energy differences.

T. Lelièvre

CERMICS - Ecole des Ponts ParisTech & MicMac project-team - INRIA

IPAM tutorial, MTDUT September 2012

(2)

Introduction

The aim of molecular dynamics simulations is to understand the relationships between themacroscopic propertiesof a molecular system and itsatomisticfeatures. In particular, one would like to to evaluate numerically macroscopic quantities from models at the microscopic scale.

Some examples of macroscopic quantities:

(i) Thermodynamics quantities (average of some observable wrt an equilibrium measure): stress, heat capacity,free energy,...

Eµ(ϕ(X)) = Z

R^d

ϕ(x)µ(dx).

(ii) Dynamical quantities(average over trajectories at equilibrium):

diffusion coefficients, viscosity, transition rates,...

E(F((Xt)_t≥0)) = Z

C⁰(R+,R^d)F((xt)_t≥0))W(d((xt)_t≥0)).

(3)

Introduction

Many applications in various fields: biology, physics, chemistry, materials science. Molecular dynamics computations consume today a lot of CPU time.

A molecular dynamics model amounts essentially in choosing a potentialV which associates to a configuration

(x1, ...,x_N) =x ∈R^3N an energy V(x1, ...,x_N).

In the canonical (NVT) ensemble, configurations are distributed according to the Boltzmann-Gibbs probability measure:

dµ(x) =Z⁻¹exp(−βV(x))dx, whereZ =R

exp(−βV(x))dx is the partition function and β= (k_BT)⁻¹ is proportional to the inverse of the temperature.

(4)

Introduction

Typically,V is a sum of potentials modelling interaction between two particles, three particles and four particles:

V =X

i<j

V1(xi,xj) + X

i<j<k

V2(xi,xj,x_k) + X

i<j<k<l

V3(xi,xj,x_k,x_l).

For example,V₁(x_i,x_j) =V_LJ(|x_i −x_j|) where VLJ(r) =4ǫ

σ r

12

− ^σ_r6

is the Lennard-Jones potential.

Difficulties: (i) high-dimensional problem (N ≫1) ; (ii)µis a multimodal measure.

(5)

Introduction

To sampleµ, ergodic dynamics wrt to µare used. A typical example is theover-damped Langevin(or gradient) dynamics:

(GD) dXt =−∇V(Xt)dt+p

2β⁻¹dWt.

It is the limit (when the mass goes to zero or the damping parameter to infinity) of theLangevin dynamics:

(

dXt=M⁻¹Ptdt,

dPt=−∇V(Xt)dt−γM⁻¹Ptdt+p

2γβ⁻¹dWt, whereMis the mass tensor andγis the friction coefficient.

To compute dynamical quantities, these are also typically the dynamics of interest. Thus,

Eµ(ϕ(X))≃ 1 T

Z T 0

ϕ(Xt)dt andE(F((Xt)t≥0))≃ 1 N

N

X

m=1

F((X^m_t )t≥0).

In the following, we mainly consider theover-damped Langevin dynamics.

(6)

Introduction

Probabilistic insert (1): discretization of SDEs.

The discretization of (GD) by the Euler scheme is (for a fixed timestep∆t):

X_n+1 =X_n− ∇V(Xn) ∆t+p

2β⁻¹∆tGn

where(G_nⁱ)_{1≤i≤3N,n≥0} are i.i.d. random variables with lawN(0,1).

Indeed,

(W_(n+1)∆t−W_n∆t)_n≥0 =^L √

∆t(G_n)_n≥0.

In practice, a sequence of i.i.d. random variables with lawN(0,1) may be obtained from a sequence of i.i.d. random variables with lawU((0,1)).

(7)

Introduction

µin invariant for (GD)Proof: One needs to show that if the law of X0 is µ, then the law of Xt is also µ. Let us denote X^x_t the solution to (GD) such thatX0=x. Let us consider the function u(t,x) solution to:

∂tu(t,x) =−∇V(x)· ∇u(t,x) +β⁻¹∆u(t,x), u(0,x) =φ(x)(+ assumptions on decay at infinity),

then,u(t,x) =E(φ(X^x_t)). Thus, the measure µis invariant:

d dt

Z

E(φ(X^x_t))dµ(x) =Z⁻¹ Z

∂tu(t,x)exp(−βV(x))dx

=Z⁻¹ Z

−∇V · ∇u+β⁻¹∆u

exp(−βV)=0.

Therefore, Z

E(φ(X^x_t))dµ(x) = Z

φ(x)dµ(x).

(8)

Introduction

Probabilistic insert (2): Feynman-Kac formula.

Whyu(t,x) =E(φ(X^x_t))? For 0<s <t, we have (characteristic method):

d u(t−s,X^x_s) =−∂tu(t−s,X^x_s)ds+∇u(t−s,X^x_s)·dX^x_s +β⁻¹∆u(t−s,X^x_s)ds,

=

−∂_tu(t−s,X^x_s)− ∇V(X^x_s)· ∇u(t−s,X^x_s) +β⁻¹∆u(t−s,X^x_s)

ds+p

2β⁻¹∇u(t−s,X^x_s)·dW_s. Thus, integrating overs ∈(0,t) and taking the expectation:

E(u(0,X^x_t))−E(u(t,X^x₀)) =p 2β⁻¹E

Z t

0 ∇u(t−s,X^x_s)·dWs

=0.

(9)

Introduction

Probabilistic insert (3): Itô’s calculus. (in 1d.)

Where does the term∆u come from ? Starting from the discretization:

X_n+1=Xn−V^′(Xn) ∆t+p

2β⁻¹∆tGn, we have (for a time-independent functionu):

u(X_n+1) =u

Xn−V^′(Xn) ∆t+p

2β⁻¹∆tGn

,

=u(Xn)−u^′(Xn)V^′(Xn) ∆t+p

2β⁻¹∆tu^′(Xn)Gn

+β⁻¹(Gn)²u^′′(Xn)∆t+o(∆t).

Thus, summing overn ∈[0...t/∆t] and taking the limit∆t →0, u(Xt) =u(X0)−

Z t 0

V^′(Xs)u^′(Xs)ds+p 2β⁻¹

Z t 0

u^′(Xs)dWs

+β⁻¹ Z t

0

u^′′(Xs)ds.

(10)

Introduction: metastability

Difficulty: In practice,X_t is a metastable process, so that the convergence to equilibrium is very slow.

A 2d schematic picture: X_t¹ is a slow variable(a metastable dof) of the system.

x1

x2

V(x1,x2)

X_t¹

t

(11)

Introduction: metastability

A more realistic example (Dellago, Geissler): Influence of the solvation on a dimer conformation.

00000000 00000000 00000000 0000

11111111 11111111 11111111 1111

000000 000000 000000 111111 111111 111111

000000 000000 000000 111111 111111 111111 00000000 00000000 00000000 11111111 11111111 11111111

00000000 00000000 00000000 0000

11111111 11111111 11111111 1111

000000 000000 000000 000

111111 111111 111111 111

00000000 00000000 00000000 11111111 11111111 11111111

000000 000000 000000 111111 111111 111111 000000000000000000000000

11111111 11111111 11111111 000000 000000 000000 111111 111111 111111 000000 000000 000000 111111 111111 111111

000000 000000 000000 000

111111 111111 111111 111

000000 000000 000000 111111 111111 111111

000000 000000 000000 000

111111 111111 111111 111

00000000 00000000 00000000 0000

11111111 11111111 11111111 1111

000000 000000 000000 000

111111 111111 111111 111

000000 000000 000000 000

111111 111111 111111 111

00000000 00000000 00000000 0000

11111111 11111111 11111111 1111

000000 000000 000000 111111 111111 111111 000000 000000 000000 111111 111111 111111

000000 000000 000000 000

111111 111111 111111 111

00000000 00000000 00000000 11111111 11111111 11111111

000000 000000 000000 111111 111111 111111

00000000 00000000 00000000 11111111 11111111 11111111 000000 000000 000000 000

111111 111111 111111 111 00000000 00000000 00000000 11111111 11111111 11111111

Compact state. Stretched state.

The particles interact through a pair potential: truncated LJ for all particles except the two monomers (black particles) which interact through a double-well potential. A slow variable is the distance between the two monomers.

(12)

Introduction: metastability

A “real” example: ions canal in a cell membrane. (C. Chipot).

(13)

A second “real” example: Diffusion of adatoms on a surface.

(A. Voter).

Los Alamos

(14)

Introduction: metastability

One central numerical difficulty is thusmetastability. How to quantify this bad behaviour ?

1. Escape time from a potential well.

2. Asymptotice variance of the estimator.

3. “Decorrelation time”.

4. Rate of convergence of the law of X_t to µ.

In the following we use the fourth criterion.

(15)

Introduction: metastability

The PDE point of view: convergence of the pdfψ(t,x) of Xt to ψ∞(x) =Z⁻¹e^−βV^(x). ψ satisfies the Fokker-Planck equation

∂tψ=div(∇Vψ+β⁻¹∇ψ), which can be rewritten as∂tψ=β⁻¹div

ψ_∞∇

ψ ψ_∞

. Let us introducethe entropy

E(t) =H(ψ(t,·)|ψ_∞) = Z

ln ψ

ψ_∞

ψ.

We have (Csiszár-Kullback inequality):

kψ(t,·)−ψ∞kL¹ ≤p 2E(t).

(16)

Introduction: metastability

dE dt =

Z ln

ψ ψ_∞

∂tψ

=β⁻¹ Z

ln ψ

ψ_∞

div

ψ_∞∇ ψ

ψ_∞

=−β⁻¹ Z

∇ln

ψ ψ_∞

2

ψ=:−β⁻¹I(ψ(t,·)|ψ_∞).

IfV is such that the following Logarithmic Sobolev inequality (LSI(R)) holds: ∀ψ pdf,

H(ψ|ψ_∞)≤ 1

2RI(ψ|ψ_∞)

thenE(t)≤E(0)exp(−2β⁻¹Rt) and thusψ converges toψ_∞ exponentially fast with rateβ⁻¹R (and the converse is true).

Metastability ⇐⇒ smallR

(17)

Introduction: reaction coordinate and free energy

Metastability: How to attack this problem ?

We suppose in the following that the slow variable isof dimension 1 andknown: ξ(Xt), where ξ :Rⁿ→T.

For example, in the 2d simple system,ξ(x₁,x₂) =x₁.

x1

x2

V(x1,x2)

X_t¹

t

We will use this knowledge to build efficient sampling techniques.

Two ideas:

• Conditioning (constrained dynamics),

• Importance sampling (biased dynamics).

Free energywill play a central role.

(18)

Introduction: reaction coordinate and free energy

Let us introduce two probability measures associated toµandξ:

• The image of the measure µby ξ:

ξ∗µ(dz) =exp(−βA(z))dz where the free energy Ais defined by:

A(z) =−β⁻¹ln Z

Σ(z)

e^−βVδ_ξ(x)−z(dx)

! ,

with Σ(z) ={x, ξ(x) =z} is a (smooth) submanifold of R^d, andδ_ξ(x_)−z(dx)dz =dx.

• The probability measure µconditioned to ξ(x) =z:

µ_Σ(z)(dx) = exp(−βV(x))δ_ξ(x_)−z(dx) exp(−βA(z)) .

(19)

Introduction: reaction coordinate and free energy

In the simple caseξ(x1,x2) =x1, we have:

• The image of the measure µby ξ:

ξ∗µ(dx1) =exp(−βA(x1))dx1

where the free energy Ais defined by:

A(x1) =−β⁻¹ln Z

e^−βV^(x¹^,x²⁾dx2

,

andΣ(x1) ={(x1,x2),x2∈R}.

• The probability measure µconditioned to ξ(x1,x2) =x1: µ_Σ(x₁₎(dx₂) = exp(−βV(x1,x2))dx2

exp(−βA(x1)) .

(20)

Introduction: reaction coordinate and free energy

The measureδ_ξ(x)−z is related to the Lebesgue measure onΣ(z) through:

δ_ξ(x)−z =|∇ξ|⁻¹dσ_Σ(z). This is theco-area formula. We thus have:

A(z) =−β⁻¹ln Z

Σ(z)

e^−βV|∇ξ|⁻¹dσ_Σ(z)

! .

General statement: LetX be a random variable with lawψ(x)dx inRⁿ, then ξ(X) has lawR

Σ(z)ψ|∇ξ|⁻¹dσ_Σ(z)dz,and the law of X conditioned to a fixed valuez ofξ(X)is

dµ_Σ(z)= ^R ^ψ^|∇ξ|⁻¹^dσ^Σ(z)

Σ(z)ψ|∇ξ|⁻¹dσ_Σ(z).

Indeed, for any two bounded functionsf andg, E(f(ξ(X))g(X)) =

Z

Rnf(ξ(x))g(x)ψ(x)dx,

= Z

Rp Z

Σ(z)

f◦ξgψ|∇ξ|⁻¹dσ_Σ(z)dz,

= Z

Rpf(z) R

Σ(z)gψ|∇ξ|⁻¹dσ_Σ(z) R

Σ(z)ψ|∇ξ|⁻¹dσ_Σ(z) Z

Σ(z)

ψ|∇ξ|⁻¹dσ_Σ(z)dz.

(21)

Introduction: reaction coordinate and free energy

Remarks:

-A is thefree energy associated with thereaction coordinateor collective variableξ (angle, length, ...). Ais defined up to an additive constant, so that it is enough to compute free energy differences, or the derivative ofA(the mean force).

-A(z) =−β⁻¹lnZ_Σ(z) andZ_Σ(z) is the partition function associated with the conditioned probability measures:

µ_Σ(z)=Z_Σ(z)⁻¹ e^−βV|∇ξ|⁻¹dσ_Σ(z). - IfU =

Z

Σ(z)

V Z_Σ(z)⁻¹ e^−βVδ_ξ(x)−z(dx) and S =−kB

Z

Σ(z)

ln

Z_Σ(z)⁻¹ e^−βV

Z_Σ(z)⁻¹ e^−βVδ_ξ(x)−z(dx)(and β⁻¹=kBT), thenA=U−TS.

(22)

Introduction: reaction coordinate and free energy

What is free energy ? The simple example of the solvation of a dimer. (Profiles computed using thermodynamic integration.)

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

−0.6

−0.4

−0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8

Parameter z

Free energy difference ∆ F(z)

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

−0.6

−0.4

−0.2 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8

Parameter z

Free energy difference ∆ F(z)

The density of the solvent molecules is lower on the left than on the right. At high (resp. low) density, the compact state is more (resp. less) likely. The “free energy barrier” is higher at high density than at low density. Related question: interpretation of the free energy barrier in terms of dynamics ?

(23)

Introduction: reaction coordinate and free energy

There are manyfree energy calculation techniques:

(a) Thermodynamic integration. (b) Histogram method.

(c)Non equilibrium dynamics. (d)Adaptive dynamics.

(24)

Thermodynamic integration

(25)

Thermodynamic integration

Thermodynamic integrationis based on two remarks:

(1) The derivativeA^′(z)can be obtained by sampling the conditioned probability measureµ_Σ(z) (Sprik, Ciccotti, Kapral, Vanden-Eijnden, E, den Otter, ...)

A^′(z) =Z_Σ(z)⁻¹ Z

∇V · ∇ξ

|∇ξ|² −β⁻¹div

∇ξ

|∇ξ|²

e^−βV|∇ξ|⁻¹dσ_Σ(z),

= Z

f dµ_Σ(z),

where f = ^∇V_|∇ξ|^·∇ξ₂ −β⁻¹div

∇ξ

|∇ξ|²

. Another expression:

A^′(z) =Z_Σ(z)⁻¹

Z ∇ξ

|∇ξ|² ·

∇V˜ +β⁻¹H

exp(−βV˜)dσ_Σ(z), whereV˜ =V +β⁻¹ln|∇ξ|andH=−∇ ·

∇ξ

|∇ξ|

∇ξ

|∇ξ| is the mean curvature vector.

(26)

Thermodynamic integration

In the simple caseξ(x₁,x₂) =x₁, remember that A(x1) =−β⁻¹ln

Z

, so that

A^′(x1) = Z

∂x₁V e^−βV^(x¹^,x²⁾dx2

Z

= Z

∂x₁V dµ_Σ(x₁₎.

(27)

Thermodynamic integration

Proof in the general case : (based on the co-area formula)

Z Z

exp(−βV˜)dσ_Σ(z) ′

φ(z)dz =− Z Z

exp(−βV˜)dσ_Σ(z)φ^′dz,

=− Z Z

exp(−βV˜)φ^′◦ξdσ_Σ(z)dz,

=− Z

exp(−βV˜)φ^′◦ξ|∇ξ|dx,

=− Z

exp(−βV˜)∇(φ◦ξ)· ∇ξ

|∇ξ|²|∇ξ|dx,

= Z

∇ ·

exp(−βV˜) ∇ξ

|∇ξ|

φ◦ξdx,

= Z Z

−β∇V˜ · ∇ξ

|∇ξ|² +|∇ξ|⁻¹∇ · ∇ξ

|∇ξ| !

exp(−βV˜)dσ_Σ(z)φ(z)dz.

(28)

Thermodynamic integration

(2) It is possible to sample the conditioned probability measure µ_Σ(z)=Z_Σ(z)⁻¹ exp(−βV˜)dσ_Σ(z) by considering the following constrained dynamics:

(RCD)

dX_t =−∇V˜(Xt)dt+p

2β⁻¹dW_t+∇ξ(Xt)dΛt, dΛt such thatξ(Xt) =z.

Thus,A^′(z) =lim_T→∞ _T¹ RT

0 f(Xt)dt.

The free energy profile is then obtained bythermodynamic integration:

A(z)−A(0) = Z z

0

A^′(z)dz ≃

K

X

i=0

ωiA^′(zi).

(29)

Thermodynamic integration

Notice that there is actually no need to computef in practice since the mean force may be obtained by averaging the Lagrange

multipliers.

Indeed, we havedΛ_t =dΛ^m_t +dΛ^f_t,with dΛ^m_t =−p

2β⁻¹_|∇ξ|^∇ξ2(Xt)·dW_t and dΛ^f_t=_|∇ξ|^∇ξ2 ·

∇V˜ +β⁻¹H

(Xt)dt =f(Xt)dt so that A^′(z) = lim

T→∞

1 T

Z T

0

dΛt = lim

T→∞

1 T

Z T

0

dΛ^f_t.

Of course, this comes at a price: essentially, we are using the fact that M→∞lim lim

∆t→0 1 M∆t

M X

m=1 h

ξ q+√

∆t G^m

−2ξ(q) +ξ q−√

∆t G^mi

= ∆ξ(q),

and this estimator has a non zero variance.

(30)

Thermodynamic integration

More explicitly, the rigidly constrained dynamics writes:

(RCD) dX_t=P(Xt)

−∇V˜(Xt)dt+p

2β⁻¹dW_t

+β⁻¹H(Xt)dt,

whereP(x) is the orthogonal projection operator:

P(x) =Id−n(x)⊗n(x), withn the unit normal vector: n(x) = ∇ξ

|∇ξ|(x).

(RCD) can also be written using theStratonovitch product:

dXt=−P(Xt)∇V˜(Xt)dt+p

2β⁻¹P(Xt)◦dWt.

It is easy to check thatξ(X_t) =ξ(X₀) =z for X_t solution to (RCD).

(31)

Thermodynamic integration

[G. Ciccotti, TL, E. Vanden-Einjden, 2008]Assume wlg that z =0. The probabilityµ_Σ(0) is theunique invariant measure with support in Σ(0) for (RCD).

Proposition: LetXt be the solution to (RCD) such that the law of X₀ isµ_Σ(0). Then, for all smooth functionφand for all timet >0,

E(φ(Xt)) = Z

φ(x)dµ_Σ(0)(x).

Proof:Introduce the infinitesimal generator and applythe divergence theorem on submanifolds:∀φ∈ C¹(R^3N,R^3N),

Z

div_Σ(0)(φ)dσ_Σ(0)=− Z

H·φdσ_Σ(0), wherediv_Σ(0)(φ) =tr(P∇φ).

(32)

Thermodynamic integration

Discretization: These two schemes are consistent with (RCD):

(S1)

X_n+1 =Xn− ∇V˜(Xn)∆t+p

2β⁻¹∆Wn+λn∇ξ(X_n+1), with λ_n ∈Rsuch thatξ(X_n+1) =0,

(S2)

X_n+1=Xn− ∇V˜(Xn)∆t+p

2β⁻¹∆Wn+λn∇ξ(Xn), withλn∈R such thatξ(Xn+1) =0,

where∆Wn=W_(n+1)∆t−W_n∆t. The constraint is exactly satisfied (important for longtime computations). The discretization ofA^′(0) =lim_T→∞ _T¹ RT

0 dΛt is:

T→∞lim lim

∆t→0

1 T

T/∆t

X

n=1

λn=A^′(0).

(33)

Thermodynamic integration

In practice, the followingvariance reduction schememay be used:

X_n+1=X_n− ∇V˜(Xn)∆t+p

2β⁻¹∆Wn+λ∇ξ(Xn+1), with λ∈Rsuch thatξ(X_n+1) =0,

X_∗ =X_n− ∇V˜(Xn)∆t−p

2β⁻¹∆Wn+λ_∗∇ξ(X_∗), with λ_∗ ∈R such thatξ(X_∗) =0,

andλn= (λ+λ∗)/2.

The martingale partdΛ^m_t (i.e. the most fluctuating part) of the Lagrange multiplier is removed.

(34)

Thermodynamic integration

Error analysis[Faou,TL, Mathematics of Computation, 2010]: Using classical technics (Talay-Tubaro like proof), one can check that the ergodic measureµ^∆t_Σ(0) sampled by the Markov chain(Xn) is an

approximation of order one ofµ_Σ(0): for all smooth functions g : Σ(0)→R,

Z

Σ(0)

gdµ^∆t_Σ(0)− Z

Σ(0)

gdµ_Σ(0)

≤C∆t.

Metastability issue: Using TI, we sample the conditional measures µ_Σ(z) rather than the original Gibbs measure µ. The long-time behaviour of the constrained dynamics (RCD) will be essentially limited bythe LSI contantρ(z) of the conditional measures µ_Σ(z) (to be compared with the LSI constantR of the original measure µ). For well-chosen ξ,ρ(z)≫R, which explains the efficiency of the whole procedure.

(35)

Thermodynamic integration

Remarks:

- There are many ways to constrain the dynamics (GD). We chose one which is simple to discretize. We may also have used, for example (forz =0)

dX^η_t =−∇V(X^η_t)dt − 1

2η∇(ξ²)(X^η_t)dt+p

2β⁻¹dWt, where the constraint is penalized. One can show that

limη→0X^η_t =X_t (inL^∞_t∈[0,T](L²_ω)-norm) whereX_t satisfies (RCD). Notice that we usedV and notV˜ in the penalized dynamics.

The statistics associated with the dynamics where the constraints are rigidly imposed and the dynamics where the constraints are softly imposed through penalizationare different: “a stiff spring6=a rigid rod” (van Kampen, Hinch,...).

(36)

Thermodynamic integration

- TI yields a way to computeR

φ(x)dµ(x):

Z

φ(x)dµ(x) =Z⁻¹ Z

φ(x)e^−βV^(x)dx,

=Z⁻¹ Z

z

Z

Σ(z)

φe^−βV|∇ξ|⁻¹dσ_Σ(z)dz, (co-area formula)

=Z⁻¹ Z

z

R

Σ(z)φe^−βV|∇ξ|⁻¹dσ_Σ(z) R

Σ(z)e^−βV|∇ξ|⁻¹dσ_Σ(z) Z

Σ(z)

e^−βV|∇ξ|⁻¹dσ_Σ(z)dz,

= Z

z

e^−βA(z)dz −1Z

z

Z

Σ(z)

φdµ_Σ(z)

!

e^−βA(z)dz.

withΣ(z) ={x, ξ(x) =z}, A(z) =−β⁻¹ln

R

Σ(z)e^−βV|∇ξ|⁻¹dσ_Σ(z) and µ_Σ(z)=e^−βV|∇ξ|⁻¹dσ_Σ(z)/R

Σ(z)e^−βV|∇ξ|⁻¹dσ_Σ(z).

(37)

Thermodynamic integration

-[C. Le Bris, TL, E. Vanden-Einjden, CRAS 2008] For a general SDE (with a non isotropic diffusion), the following diagramdoes not commute:

P_cont

P_disc Projected discretized process

? Discretized process

Projected continuous process

Continuous process

Discretized projected continuous process

∆t

(38)

Thermodynamic integration

Generalization to Langevin dynamics. Interests: (i) Newton’s equations of motion are more “natural”; (ii) leads to numerical schemes which sample the constrained measure without time discretization error; (iii) seems to be more robust wrt the timestep choice.







dqt =M⁻¹ptdt,

dpt =−∇V(qt)dt−γM⁻¹ptdt +p

2γβ⁻¹dWt+∇ξ(qt)dλt, ξ(qt) =z.

The probability measure sampled by this dynamics is

µ_T∗Σ(z)(dqdp) =Z⁻¹exp(−βH(q,p))σ_T∗Σ(z)(dqdp), whereH(q,p) =V(q) +¹₂p^TM⁻¹p.

(39)

Thermodynamic integration

The marginal ofµ_T∗Σ(z)(dqdp)in q writes:

ν_Σ(z)^M = 1

Z exp(−βV(q))σ^M_Σ(z)(dq)6= 1

Z exp(−βV(q))δ_ξ(q)−z(dq).

Thus, the “free energy” which is naturally computed by this dynamics is

A^M(z) =−β⁻¹ln Z

Σ(z)

exp(−βV(q))σ_Σ(z)^M (dq)

! .

The original free energy may be recovered from the relation: for GM =∇ξ^TM⁻¹∇ξ,

A(z)−A^M(z) =−β⁻¹ln Z

Σ(z)

det(G_M)^−1/2dν_Σ(z)^M

! .

(40)

Thermodynamic integration

Moreover, one can check that:

T→∞lim 1 T

Z T 0

dλt = (A^M)^′(z).

Discretization: A natural numerical scheme is to use a splitting:

• 1/2 midpoint Euler on the fluctuation-dissipation part,

• 1 Verlet step on the Hamiltonian part (RATTLE scheme) and

• 1/2 midpoint Euler on the fluctuation-dissipation part.

(41)

Thermodynamic integration







p^n+1/4=pⁿ−∆t

4 γM⁻¹(pⁿ+p^n+1/4) + r∆t

2 σGⁿ+∇ξ(qⁿ)λ^n+1/4,

∇ξ(qⁿ)^TM⁻¹p^n+1/4=0,











p^n+1/2=p^n+1/4−∆t

2 ∇V(qⁿ) +∇ξ(qⁿ)λ^n+1/2, qⁿ⁺¹=qⁿ+ ∆t M⁻¹p^n+1/2,

ξ(qⁿ⁺¹) =z,

p^n+3/4=p^n+1/2−∆t

2 ∇V(qⁿ⁺¹) +∇ξ(qⁿ⁺¹)λ^n+3/4,

∇ξ(qⁿ⁺¹)^TM⁻¹p^n+3/4=0,











pⁿ⁺¹=p^n+3/4−∆t

4 γM⁻¹(p^n+3/4+pⁿ⁺¹) + r∆t

2 σG^n+1/2 +∇ξ(qⁿ⁺¹)λⁿ⁺¹,

∇ξ(qⁿ⁺¹)^TM⁻¹pⁿ⁺¹=0, andlim_T→∞lim_∆t→0_T¹ PT/∆t

n=1

λ^n+1/2+λ^n+3/4

= (A^M)^′(z).

(42)

Thermodynamic integration

Using the symmetry of the Verlet step, it is easy toadd a Metropolization stepto the previous numerical scheme, thus removing the time discretization error. For this modified scheme, it is easy to prove that

∆t→0lim lim

T→∞

1 T

T/∆t

X

n=1

λ^n+1/2+λ^n+3/4

= (A^M)^′(z).

By choosingM = ∆tγ/4=Id, this leads to an original sampling scheme in the configuration space (generalized Hybrid Monte Carlo scheme).

Notice that it is not clear how to use such a Metropolization step for the constrained dynamics (RCD) since the proposal kernel is not symmetric, and does not admit any simple analytical expression.

(43)

Thermodynamic integration

Algorithm: Let us introduce R_∆t which is such that, if

(qⁿ,pⁿ)∈T^∗Σ(z), and |pⁿ|² ≤R_∆t, one step of the RATTLE scheme is well defined(i.e.there exists a unique solution to the constrained problem). Then the GHMC scheme writes (M = ∆tγ/4=Id):

(44)

Thermodynamic integration

Consider an initial configurationq⁰∈Σ(z). Iterate onn≥0,

1. Sample a random vector in the tangent spaceTqⁿΣ(z)(∇ξ(qⁿ)^Tpⁿ=0):

pⁿ=β^−1/2P(qⁿ)Gⁿ,

where(Gⁿ)n≥0are i.i.d. standard random Gaussian variables, and compute the energyEⁿ= 1

2|pⁿ|²+V(qⁿ)of the configuration(qⁿ,pⁿ);

2. If|pⁿ|²>R∆t, setEⁿ⁺¹= +∞and go to(3); otherwise perform one integration step of the RATTLE scheme:











p^n+1/2=pⁿ−∆t

2 ∇V(qⁿ) +∇ξ(qⁿ)λ^n+1/2, e

qⁿ⁺¹=qⁿ+ ∆t p^n+1/2, ξ(eqⁿ⁺¹) =z,

e

pⁿ⁺¹=p^n+1/2−∆t

2 ∇V(eqⁿ⁺¹) +∇ξ(eqⁿ⁺¹)λⁿ⁺¹,

∇ξ(eqⁿ⁺¹)^Tepⁿ⁺¹=0;

(45)

Thermodynamic integration

3. If|epⁿ⁺¹|²>R∆t, setEⁿ⁺¹= +∞; otherwise compute the energy Eⁿ⁺¹=1

2|epⁿ⁺¹|²+V(eqⁿ⁺¹)of the new phase-space configuration.

Accept the proposal and setqⁿ⁺¹=qeⁿ⁺¹ with probability min

exp(−β(Eⁿ⁺¹−Eⁿ)),1

; otherwise, reject and setqⁿ⁺¹=qⁿ.

Proposition: The probability measure ν_Σ(z)^M = 1

Z exp(−βV(q))σ^M_Σ(z)(dq) is invariant for the Markov Chain(qⁿ)_n≥1.

(46)

Non-equilibrium dynamics

(47)

Non-equilibrium dynamics

Let us consider a stochastic process such thatX0 ∼µ_Σ_z(0) and











dX_t =−P(X_t)∇V˜(X_t)dt+p

2β⁻¹P(X_t)◦dW_t +∇ξ(Xt)dΛ^ext_t ,

dΛ^ext_t = z^′(t)

|∇ξ(Xt)|²dt,

wherez : [0,T]→[0,1]is a fixed deterministic evolution of the reaction coordinateξ, such that z(0) =0 andz(T) =1.

One can check that

ξ(X_t) =z(t).

(48)

Non-equilibrium dynamics

The dynamics can also be written using a Lagrange multiplier:

dXt=−∇V˜(Xt)dt+p

2β⁻¹dWt+∇ξ(Xt)dΛt, ξ(Xt) =z(t).

And we have

dΛt=dΛ^m_t +dΛ^f_t+dΛ^ext_t , wheredΛ^m_t =−p

2β⁻¹_|∇ξ|^∇ξ2(Xt)·dW_t ,dΛ^f_t =f(Xt)dt and dΛ^ext_t = _|∇ξ(X^z^′^(t)

t)|²dt.

(49)

Non-equilibrium dynamics

How to get equilibrium quantities (like the free energy) through non-equilibrium simulations ?

The idea is to associate to each trajectoryX_t a weight W(t) =

Z t 0

f(Xs)z^′(s)ds= Z t

0

z^′(s)dΛ^f_s.

and to compute free energy differences by a Feynman-Kac formula (Jarzynski identity):

A(z(t))−A(z(0)) =−β⁻¹ln(E(exp(−βW(t)))).

(50)

Non-equilibrium dynamics

[TL, M. Rousset, G. Stoltz, 2007]. The proof consists in introducing the semi-group associated with the dynamics

u(s,x) =E

ϕ(X^s,x_t )exp

−β Z t

s

f(X^s,x_r )z^′(r)dr

and to show that _ds^d R

u(s, .)exp(−βV˜)dσ_Σ_z(s) =0 using the divergence theorem on submanifolds. Then

Z

u(t, .)exp(−βV˜)dσ_Σ_z(t) = Z

u(0, .)exp(−βV˜)dσ_Σ_z(0) is equivalent to

Z

ϕexp(−βV˜)dσ_Σ_z(t)

=exp(−βA(z(0)))E Z

ϕ(Xt)exp

−β Z t

0

f(Xr)z^′(r)dr

.

(51)

Non-equilibrium dynamics

A more general relation is the so-calledCrooks identitywhich is a more general formula relating the free energy to the work of forward and backward switchedprocesses. Letq_t^f andq^b_t satisfy:

q₀^f ∼µ_Σ(z(0)),q₀^b∼µ_Σ(z(T)), ( dq_t^f =−∇V˜(q^f_t)dt+p

2β⁻¹dW_t^f +∇ξ(q^f_t)dΛ^f_t, ξ(q_t^f) =z(t),

( dq_t^b′ =−∇V˜(q_t^b′)dt^′+p

2β⁻¹dW_t^b′ +∇ξ(q^b_t′)dΛ^b_t′, ξ(q_t^b′) =z(T −t^′).

(52)

Non-equilibrium dynamics

Then, for anyθ∈[0,1], for any path functional φ, exp

−β(A(z(T))−A(z(0)) E

φ({q^b_T−s}0≤s≤T)exp(−βθW^b(T))

=E

φ({q_s^f}0≤s≤T)exp(−β(1−θ)W^f(T)) ,

whereW^f(T) =Rt

0 f(q_s^f)z^′(s)ds and W^b(T) =−Rt

0 f(q_s^b)z^′(T −s)ds.

This identity can be used tocombine forward and backward processes to get better estimates of the free energy difference, see for examplebridge sampling methods[Bennett, Meng and Wong, Shirts].

(53)

Non-equilibrium dynamics

The discretization of the constrained process is (as before):

(S1)

X_n+1 =X_n− ∇V˜(Xn)∆t+p

2β⁻¹∆Wn+λn∇ξ(Xn+1), with λn such thatξ(X_n+1) =z(t_n+1),

(S2)

X_n+1=X_n− ∇V˜(Xn)∆t+p

2β⁻¹∆Wn+λn∇ξ(Xn), withλn such thatξ(X_n+1) =z(t_n+1).

To extractλ^f_n fromλn, one can e.g. compute:

λ^f_n=λn−z(t_n+1)−z(tn)

|∇ξ(Xn)|² +p

2β⁻¹ ∇ξ

|∇ξ|²(Xn)·∆Wn.

(54)

Non-equilibrium dynamics

Another method to computeλ^f_n consists in:

X^R_n+1 =Xn− ∇V˜(Xn)∆t−p

2β⁻¹∆Wn+λ^R_n∇ξ(X^R_n+1), with λ^R_n such that ¹₂ ξ(X^R_n+1) +ξ(X_n+1)

=ξ(Xn).

We then haveλ^f_n= ¹₂ λn+λ^R_n . The weight is then approximated by

( W0=0,

Wn+1 =Wⁿ+^z(tⁿ⁺¹_t ^)−z(tⁿ⁾

n+1−tn λ^f_n, and a (biased) estimator of the free energy difference A(z(T))−A(z(0))is−β⁻¹ln

1 M

PM

m=1exp

−βW_T/∆t^m .

(55)

Non-equilibrium dynamics

In practice, the efficiency of this numerical method is not clearly demonstrated. If the transition is too fast, the variance of the estimator is very large. If the transition is slow, we are back to thermodynamic integration...

Ideas: (i) combine forward and backward trajectories, (ii) add selection mechanisms[M. Rousset, G. Stoltz, 2006] or (iii) use importance sampling to help the transition (escorting)[Vaikuntanathan, Jarzynski, 2008]. All this can be generalized to Langevin (phase-space) dynamics, with the additional difficulty that generalized free energies with constraints on both positions and momenta are obtained.

(56)

Adaptive biasing techniques

(57)

Adaptive biasing techniques

Recall that the original gradient dynamics (GD) dX_t =−∇V(Xt)dt+p

2β⁻¹dW_t

is metastable. We want to use the information thatξ(Xt) is a slow variable, whereξ:Rⁿ→Tto have better sampling.

We will use the free energy A(z) =−β⁻¹ln

Z

Σ(z)

e^−βVδ_ξ(x)−z(dx)

! ,

to bias the dynamics.

(58)

Adaptive biasing techniques

The bottom line of adaptive methods is the following: for “well chosen” ξ the potentialV −A◦ξ is less rugged thanV. Indeed, by constructionξ∗exp(−β(V −A◦ξ)) =1_T.

Problem: Ais unknown ! Idea: use a time dependent potential of the form

V^t(x) =V(x)−At(ξ(x))

whereAt is an approximation at timet of A, given the configurations visited so far.

Hopes:

• build a dynamics which goes quickly to equilibrium,

• compute free energy profiles.

Wang-Landau, ABF, metadynamics:Darve, Pohorille, Hénin, Chipot, Laio, Parrinello, Wang, Landau,...

(59)

Adaptive biasing techniques

x coordinate

y coordinate

−1.5 −1 −0.5 0 0.5 1 1.5

−1.5

−1

−0.5 0 0.5 1 1.5

0.0 2000 4000 6000 8000 10000

−1.5

−1.0

−0.5 0.0 0.5 1.0 1.5

Time

x coordinate

y coordinate

−1.5 −1 −0.5 0 0.5 1 1.5

−1.5

−1

−0.5 0 0.5 1 1.5

0.0 2000 4000 6000 8000 10000

−1.5

−1.0

−0.5 0.0 0.5 1.0 1.5

Time

x coordinate

A 2d example of a free energy biased trajectory: energetic barrier.

(60)

Adaptive biasing techniques

x coordinate

y coordinate

0.0 5000 10000 15000 20000

−8

−4 0 4 8

Time

x coordinate

−8 −6 −2 2 6 8

0.

3.

x coordinate

y coordinate

0 5000 10000 15000 20000

−8

−4 0 4 8

Time

x coordinate

A 2d example of a free energy biased trajectory: entropic barrier.