• Aucun résultat trouvé

Measuring the Irreversibility of Numerical Schemes for Reversible Stochastic Differential Equations

N/A
N/A
Protected

Academic year: 2022

Partager "Measuring the Irreversibility of Numerical Schemes for Reversible Stochastic Differential Equations"

Copied!
29
0
0

Texte intégral

(1)

DOI:10.1051/m2an/2013142 www.esaim-m2an.org

MEASURING THE IRREVERSIBILITY OF NUMERICAL SCHEMES FOR REVERSIBLE STOCHASTIC DIFFERENTIAL EQUATIONS

Markos Katsoulakis

1

, Yannis Pantazis

1

and Luc Rey-Bellet

1

Abstract. For a stationary Markov process the detailed balance condition is equivalent to the time- reversibility of the process. For stochastic differential equations (SDE’s), the time discretization of numerical schemes usually destroys the time-reversibility property. Despite an extensive literature on the numerical analysis for SDE’s, their stability properties, strong and/or weak error estimates, large deviations and infinite-time estimates, no quantitative results are known on the lack of reversibility of discrete-time approximation processes. In this paper we provide such quantitative estimates by using the concept of entropy production rate, inspired by ideas from non-equilibrium statistical mechanics.

The entropy production rate for a stochastic process is defined as the relative entropy (per unit time) of the path measure of the process with respect to the path measure of the time-reversed process. By construction the entropy production rate is nonnegative and it vanishes if and only if the process is reversible. Crucially, from a numerical point of view, the entropy production rate is an a posteriori quantity, hence it can be computed in the course of a simulation as the ergodic average of a cer- tain functional of the process (the so-called Gallavotti−Cohen (GC) action functional). We compute the entropy production for various numerical schemes such as explicit EulerMaruyama and explicit Milstein’s for reversible SDEs with additive or multiplicative noise. In addition we analyze the entropy production for the BBK integrator for the Langevin equation. The order (in the time-discretization stepΔt) of the entropy production rate provides a tool to classify numerical schemes in terms of their (discretization-induced) irreversibility. Our results show that the type of the noise critically affects the behavior of the entropy production rate. As a striking example of our results we show that the Euler scheme for multiplicative noiseis not an adequate scheme from a reversibility point of view since its entropy production ratedoes not decrease withΔt.

Mathematics Subject Classification. 65C30, 82C3, 60H10.

Received July 23, 2012. Revised September 1, 2013.

Published online August 13, 2014.

1. Introduction

In molecular dynamics algorithms arising in the simulation of systems in materials science, chemical engi- neering, evolutionary games, computational statistical mechanics,etc.the steady-state statistics obtained from numerical simulations is of great importance [8,27,33]. For instance, the free energy of the system or free en- ergy differences as well dynamical transitions between metastable states are quantities which are sampled in

Keywords and phrases. Stochastic differential equations, detailed balance, reversibility, relative entropy, entropy production, numerical integration, (overdamped) Langevin process.

We thanks Natesh Pillai for useful comments and suggestions. M.A.K. and Y.P. are partially supported by NSF-CMMI 0835673 and L.R.-B. is partially supported by NSF -DMS-1109316.

1 Department of Mathematics and Statistics, University of Massachusetts, Amherst, MA, USA.luc@math.umass.edu

Article published by EDP Sciences c EDP Sciences, SMAI 2014

(2)

M. KATSOULAKISET AL.

the stationary regime. At the microscopic level, physical processes are often modeled as interactions between particles described by a system of stochastic differential equations (SDE’s) [8,15]. To perform steady-state sim- ulations for the sampling of desirable observables, the solution of the system of SDE’s must possess a (unique) ergodic invariant measure. The uniqueness of the invariant measure follows from the ellipticity or hypoellipticity of the generator of the process together with irreducibility, which means that the process can reach at some positive time any open set of the state space with positive probability [21,25]. Under such conditions, the point distribution of the process converges to the invariant measure (ergodicity) while the process is stationary,i.e.

the distribution of the path of the process is invariant under time-shift, when the initial data are drawn from the invariant measure. Moreover, many processes of physical origin, such as diffusion and adsorption/desorption of interacting particles, satisfy the condition of detailed balance (DB), or equivalently, time-reversibility,i.e., the distribution of the path of the process is invariant under time-reversal. It is easy to see that time-reversibility implies stationarity but this a strictly stronger condition in general. The condition of detailed balance often arises from a gradient-like behavior of the dynamics or from Hamiltonian dynamics if the time-reversal includes reversal of the velocities.

However, the numerical simulation of SDE’s necessitates the use of numerical discretization schemes. Dis- cretization procedures, except in very special cases, results in the destruction of the DB condition. This affects the approximation process in at least two ways. First, the invariant measure of the approximation process, if it exists at all, is not known explicitly and, second, the time reversibility of the process is lost. Several recent results prove the existence of the invariant measure for the discrete-time approximation and provide error es- timates [2,3,19,20] but, to the best of our knowledge, there is no quantitative assessment of the irreversibility of the approximation process. Of course there exist Metropolized numerical schemes such as MALA [26] and variations thereof [4,15] which do satisfy the DB condition but they are numerically more expensive, especially in high-dimensional systems, as they require an accept/reject step. Thus, a quantitative understanding of the lack of reversibility for simpler discretization schemes can provide new insights for selecting which schemes are closer to satisfying the DB condition.

The implications of irreversibility are only partially understood, both from the physical and mathematical point of view. These issues have emerged as a main theme in non-equilibrium statistical mechanics and it is well-known that irreversibility introduces a stationary current (net flow) to the system [11,18,23,28] but it is still under investigation in statistical mechanics community how this current affects the long-time properties (i.e., the dynamics and large deviations) of the process such as exit times, correlation times and phase transitions of metastable states. Nevertheless, a straightforward implication of irreversibility is the breaking of the symmetry of fundamental quantities and observables: for example, if a process is reversible theautocorrelation function R(s, t) of a given observable is symmetric, i.e. R(s, t) = R(t, s). Reversibility is a natural and fundamental property of physical systems and thus, if numerical approximation results in the destruction of reversibility, one should carefully quantify the irreversibility of the approximation process. More generally, structure preserving numerical schemes have proved to be vastly superior (in stability) in deterministic ODE’s, e.g Stormer–Verlet for Hamiltonian systems, [10] as well as in PDE, e.g. hyperbolic conservation laws and infinite dimensional Hamiltonian systems, and there is no reason to believe it should be different in the numerical analysis of stochastic processes. From a physical point of view the reason for detailed balance is the time-reversibility of an underlying microscopic dynamics over which an (effective) stochastic model is presumably built. Hence, violating such a condition reduces the physical interpretation of our stochastic model. The detailed balance condition has many implications (Kubo-formula for the linear response and Onsager relations for the response coefficients for example which are related to the symmetry of the autocorrelations functions) and as such it should be preserved as much as possible in computational algorithms. In this paper we quantify irreversibility in numerical schemes using the entropy production rate, which allows for a quantitative analysis of the path space statistics, including the long-time, stationary time-series regime. The entropy production rate, defined as the relative entropy (per unit time) of the path measure of the process with respect to the path measure of the reversed process, is widely used in statistical mechanics for the study of non-equilibrium steady states of irreversible systems [7,11,14,18].

(3)

MEASURING THE IRREVERSIBILITY OF NUMERICAL SCHEMES

A fundamental result on the structure of non-equilibrium steady states is the Gallavotti−Cohen fluctuation theorem that describes the fluctuations (of large deviations type) of the entropy production [7,11,14,18] and this result can be viewed as a generalization of the Kubo-formula and Onsager relations far from equilibrium.

Specifically for diffusion processes, entropy production governs the long time statistics of the empirical measure (also known as occupation statistics) as well as the current statistics through the Donsker–Varadhan rate functional, [16,17]. For our purposes, it is important to note that the entropy production rate is zero when the process is reversible and positive otherwise making entropy production rate a sensible quantitative measure of irreversibility. Furthermore, if we assume ergodicity of the approximation process, the entropy production rate equals the time-average of the Gallavotti−Cohen (GC) action functional which is defined as the logarithm of the Radon-Nikodym derivative between the path measure of the process and the path measure of the reversed process. A key observation of this paper is that GC action functional is ana posterioriquantity, hence, it is easily computable during the simulation makingthe numerical computation of entropy production rate tractable. We show that entropy production is a computable observable that distinguishes between different numerical schemes in terms of their discretization-induced irreversibility and as such could allow us to adjust the discretization in the course of the simulation.

We use entropy production to assess the irreversibility of various numerical schemes for reversible continuous- time processes. A simple class of reversible processes, yet of great interest, is the overdamped Langevin process with gradient-type drift [8,9,15]. The discretization of the process is performed using the explicit Euler−Maruyama (EM) scheme and we distinguish between two different cases depending on the kind of the noise. In the case of additive noise, under the assumption of ergodicity of the approximation process [2,3,19,20]

we prove that the entropy production rate is of orderO(Δt2) whereΔtis the time step of the numerical scheme.

In the case of multiplicative noise, the results are strikingly different. Indeed, under ergodicity assumption, the entropy production rate for the explicit EM scheme is proved to have a lower positive bound which is indepen- dent ofΔt. Thus irreversibility is notreduced by adjustingΔt, as the approximation process converges to the continuous-time process. The different behavior of entropy production depending on the kind of noise is one of the prominent findings of this paper. As a further step in our study, we analyze the explicit Milstein’s scheme with multiplicative noise (it is the next higher-order numerical scheme). We prove that the entropy production rate of Milstein’s scheme decreases as time step decreases with orderO(Δt).

Finally, we compute both analytically and numerically the entropy production rate for a discretization scheme for Langevin systems which is another important and widely-used class of reversible models [8,15]. The Langevin equation is time-reversible if in addition to reversing time, one reverses the sign of the velocity of all particles.

The noise is degenerate but the process is hypo-elliptic and under mild conditions the Langevin equation is ergodic [20,24,31]. Our discretization scheme is a quasi-symplectic splitting scheme also known as BBK integrator [5,15]. We rigorously prove, under ergodicity assumption of the approximation process, that the entropy rate produced by the numerical scheme for the Langevin process with additive noise is of orderO(Δt), hence, in terms of irreversibility it is an acceptable integration scheme.

The paper is organized in four sections. In Section 2 we recall some basic facts about reversible processes and define the entropy production. Moreover we give the basic assumptions necessary for our proofs, namely, the ergodicity of both continuous-time and discrete-time approximation process. In Section3 we compute the entropy production rate for reversible overdamped Langevin processes. The section is split into three subsections for the additive and multiplicative noise for the Euler and Milstein schemes. In Section4we compute the entropy production rate for the reversible (up to momenta flip) Langevin process using the BBK integrator. Conclusions and future extensions of the current work are summarized in the fourth and final section.

2. Reversibility, Gallavotti Cohen action functional, and entropy production

Let us consider ad-dimensional system of SDE’s written as

dXt=a(Xt)dt+b(Xt)dBt (2.1)

(4)

M. KATSOULAKISET AL.

where XtRd is a diffusion Markov process,a:Rd Rd is the drift vector,b :Rd Rd×m is the diffusion matrix, and BtRm is a standardm-dimensional Brownian motion. We will always assume that aandb are sufficiently smooth and satisfy suitable growth conditions and/or dissipativity conditions at infinity to ensure the existence of global solutions. The generator of the diffusion process is defined by

Lf = d i=1

ai∂f

∂xi +1 2

d i,j=1

(bbT)i,j 2f

∂xi∂xj· (2.2)

for smooth test functionsf. We assume that the processXt has a (unique) invariant measureμ(dx), and that it satisfies the Detailed Balance (DB) condition,i.e., its generator is symmetric in the Hilbert space L2(μ):

Lf, gL2(μ)=f,LgL2(μ) (2.3) for suitable smooth test functionsf, g.

A Markov processXtis said to be time-reversible if for anynand sequence of times t1< . . . < tn the finite dimensional distributions of (Xt1, . . . , Xtn) and of (Xtn, . . . , Xt1) are identical. More formally, letPρ[0,t] denote the path measure of the processXton the time-interval [0, t] withX0∼ρ. LetΘ denote the time reversal,i.e.

Θ acts on a path{Xs}0≤s≤thas

(ΘX)s = Xt−s (2.4)

Then reversibility is equivalent to Pμ[0,t] = Pμ[0,t] ◦Θ and it is well-known that a stationary2 process which satisfies the DB condition is time-reversible.

The condition of reversibility can be also expressed in terms of relative entropy as follows. Recall that for two probability measureπ1, π2 on some measurable space, the relative entropy ofπ1 with respect toπ2is given byR(π12)

1log1

2 ifπ1 is absolutely continuous with respect toπ2 and +∞otherwise. The relative entropy is nonnegative,R(π12)0 andR(π12) = 0 if and only ifπ1=π2. The entropy production rate of a Markov processXt is defined by

EPcont:= lim

t→∞

1 tR

Pρ[0,t]|Pρ[0,t]◦Θ

= lim

t→∞

1 t

dPρ[0,t]log dPρ[0,t]

dPρ[0,t]◦Θ (2.5) If Xt satisfies DB and X0 μ then R(Pμ[0,t]|Pμ[0,t] ◦Θ) is identically 0 for all t and the entropy production rate is 0. Note that if X0 ρ then R(Pρ[0,t]|Pρ[0,t]◦Θ) is a boundary term, in the sense that it is O(1) and so the entropy rate vanishes in this case in the large time limit (under suitable ergodicity assumptions).

Conversely whenEPcont= 0 the process is truly irreversible. The entropy production rate for Markov processes and stochastic differential equations is discussed in more detail in [14,18].

Let us consider a numerical integration scheme for the SDE (2.1) which has the general form

xi+1=F(xi, Δt, ΔWi) i= 1,2, . . . (2.6) Herexi Rd is a discrete-time continuous state-space Markov process,Δtis the time-step andΔWiRm, i= 1,2, . . .are i.i.d. Gaussian random variables with mean 0 and varianceΔtIm. We will assume that the Markov processxi has transition probabilities which are absolutely continuous with respect to Lebesgue measure with everywhere positive densitiesΠ(xi, xi+1) :=ΠF(x,Δt,ΔW)(xi+1|xi) and we also assume that xi has a invariant measure which we denote ¯μ(dx) and which is then unique and has a density with respect to Lebesgue measure.

In general the invariant measure for Xtandxi differ,μ= ¯μandxi does not satisfy a DB condition. Note also that the very existence of ¯μ is not guaranteed in general. Results on the existence of ¯μ do exist however and typically require that the SDE is elliptic or hypoellitptic and that the state space ofXtis compact or that a global Lipschitz condition on the drift holds [2,3,19,20].

2Stationarity is equivalent to starting the processXt from its invariant measure,i.e.,X0µ.

(5)

MEASURING THE IRREVERSIBILITY OF NUMERICAL SCHEMES

Proceeding as in the continuous case we introduce an entropy production rate for the Markov processxi. Let us assume that the process starts from some distribution ρ(x)dx, then the finite dimensional distribution on the time window [0, t] wheret=nΔtis given by

P¯[0,t](dx0, . . . ,dxn) =ρ(x0)Π(x0, x1). . . Π(xn−1, xn)dx0. . .dxn. (2.7) For the time reversed pathΘ(x0, . . . xn) = (xn, . . . , x0) we have then

P¯[0,t]◦Θ(dx0, . . . ,dxn) =ρ(xn)Π(xn, xn−1). . . Π(x1, x0)dx0. . .dxn (2.8) and the Radon-Nikodym derivative takes the form

d ¯P[0,t]

d ¯P[0,t]◦Θ = exp(W(t))ρ(x0)

ρ(xn) (2.9)

whereW(t) is the Gallavotti−Cohen (GC) action functional given by W(t) =W(n;Δt) :=

n−1

i=0

logΠ(xi, xi+1)

Π(xi+1, xi)· (2.10)

Note that W(t) is an additive functional of the paths and thus if xi is ergodic, by the ergodic theorem the following limit exists

EP(Δt) = lim

t→∞

1

tW(t) = lim

n→∞

1

nΔtW(n;Δt) P¯a.s. (2.11)

We call the quantityEP(Δt) the entropy production rate associated to the numerical scheme. Note that we have, almost surely,

EP(Δt) = 1 Δt lim

n→∞

1 n

n−1

i=0

logΠ(xi, xi+1) Π(xi+1, xi) = 1

Δt μ(x)Π¯ (x, y) logΠ(x, y)

Π(y, x)dxdy (2.12) and for concrete numerical schemes we will compute fairly explicitly the entropy production in the next sec- tions. Since we are interested in the ergodic average we will systematically omit boundary terms which do not contribute to ergodic averages and we will use the notation

W1(t) ˙=W2(t) if lim

t→∞

1

t(W1(t)−W2(t)) = 0. (2.13) For example we have

W(t) ˙= log d ¯P[0,t]

d ¯P[0,t]◦Θ· (2.14)

Note also that using (2.11) and (2.10), entropy production rate is tractable numerically and it can be easily calculated “on-the-fly” once the transition probability density functionΠ(·,·) is provided.

In the following sections we investigate the behavior of the entropy production rate for different discretization schemes of various reversible processes in the stationary regime. However, before proceeding with our analysis, let us state formally the basic assumptions necessary for our results to apply.

Assumption 2.1. We have

The driftaand the diffusionbin (2.1) as well as the vectorF in (2.6) areCand all their derivatives have at most polynomial growth at infinity.

(6)

M. KATSOULAKISET AL.

The generatorLis elliptic or hypo-elliptic, in particular the transition probabilities and the invariant measure (if it exists) are absolutely continuous with respect to Lebesgue with smooth densities. We assume thatxt is ergodic, i.e. every open set can be reached with positive probability starting from any point. For the discretized scheme we assume thatxi has smooth everywhere positive transition probabilities.

Both the continuous-time processXtand discrete-time processxiare ergodic with unique invariant measures μand ¯μ, respectively. Furthermore for sufficiently smallΔtwe have

|Eμ[f]Eμ¯[f]|=O(Δt) (2.15) for functionsf which areC with at most polynomial growth at infinity.

Notice that inequality (2.15) is an error estimate for the invariant measures of the processesXt andxi. The rate of convergence in terms of Δtdepends on the particular numerical scheme [19,30]. Ergodicity results for (numerical) SDEs can be found in [2,3,12,19,20,26,30–32]. For instance, if both drift terma(x) and diffusion termb(x) have bounded derivatives of any order, the covariance matrix (bbT)(x) is elliptic for allx∈Rd and there is a compact set outside of which holdsxTa(x)<−C|x|2for allx∈Rd(Lyapunov exponent) then it was shown in [30] that the continuous-time process as well both Euler and Milstein numerical schemes are ergodic and error estimate (2.15) holds. Another less restrictive example where ergodicity properties were proved is for SDE systems with degenerate noise and particularly for Langevin processes [20,31]. Again, a Lyapunov functional is the key assumption in order to handle the stochastic process at the infinity. More recently, Mattinglyet al.[19]

showed ergodicity for SDE-driven processes restricted on a torus as well their discretizations utilizing only the assumptions of ellipticity or hypoellipticity and the assumption of local Lipschitz continuity for both drift and diffusion terms.

3. Entropy production for overdamped langevin processes

The overdamped Langevin process,XtRd, is the solution of the following system of SDE’s dXt=1

2Σ(Xt)∇V(Xt)dt+1

2∇Σ(Xt)dt+σ(Xt)dBt (3.1)

whereV :RdRis a smooth potential function,σ:RdRd×mis the diffusion matrix,Σ:=σσT :RdRd×d is the covariance matrix and Bt is a standardm-dimensional Brownian motion. We assume from now on that Σ(x) is invertible for any xso that the process is elliptic. It is straightforward to show that the generator of the processXt satisfies the DB condition (2.3) with invariant measure

μ(dx) = 1

Zexp(−V(x))dx (3.2)

whereZ =

Rdexp(−V(x))dxis the normalization constant and thus ifX0∼μthen the Markov processXtis reversible.

The explicit Euler−Maruyama (EM) scheme for numerical integration of (3.1) is given by xi+1=xi1

2Σ(xi)∇V(xi)Δt+1

2∇Σ(xi)Δt+σ(xi)ΔWi (3.3)

with ΔWi ∼N(0, ΔtIm), i = 1,2, . . . are m-dimensional iid Gaussian random variables. The process xi is a discrete-time Markov process with transition probability density given by

Π(xi, xi+1) = 1 Z(xi)exp

1 2Δt

Δxi+1

2Σ(xi)∇V(xi)Δt1

2∇Σ(xi)Δt

T

×Σ−1(xi)

Δxi+1

2Σ(xi)∇V(xi)Δt1

2∇Σ(xi)Δt

(3.4)

(7)

MEASURING THE IRREVERSIBILITY OF NUMERICAL SCHEMES

where Δxi = xi+1 −xi and Z(xi) = (2π)m/2|detΣ(xi)|1/2 is the normalization constant for the multidi- mensional Gaussian distribution. The following lemma provides the GC action functional for the explicit EM time-discretization scheme of the overdamped Langevin process.

Lemma 3.1. Assume that detΣ(x) = 0 ∀x Rd. Then the GC action functional of the process xi solv- ing(3.3)is

W(n;Δt) ˙= 1 2

n−1

i=0

ΔxTi [∇V(xi+1) +∇V(xi)] +1 2

n−1

i=0

ΔxTi−1(xi+1)∇Σ(xi+1) +Σ−1(xi)∇Σ(xi)]

+ 1 2Δt

n−1

i=0

ΔxTi

Σ−1(xi+1)−Σ−1(xi) Δxi

(3.5)

wheremeans equality up to boundary terms, as defined in (2.13).

Proof. The assumption for non-zero determinant is imposed so that the transition probabilities and hence the GC action functional are non-singular. The proof is then a straightforward computation using (3.4) and (2.10).

W(n;Δt) :=

n−1

i=0

[logΠ(xi, xi+1)logΠ(xi+1, xi)] =

n−1

i=0

[logZ(xi+1)logZ(xi)]

1 2Δt

n−1

i=0

Δxi+1

2Σ(xi)∇V(xi)Δt1

2∇Σ(xi)Δt

T

×Σ−1(xi)

Δxi+1

2Σ(xi)∇V(xi)Δt1

2∇Σ(xi)Δt

−Δxi+1

2Σ(xi+1)∇V(xi+1)Δt1

2∇Σ(xi+1)Δt

T

×Σ−1(xi+1)

−Δxi+1

2Σ(xi+1)∇V(xi+1)Δt1

2∇Σ(xi+1)Δt

˙

= 1 2Δt

n−1

i=0

ΔxTiΣ−1(xi)Δxi+1

4∇V(xi)TΣ(xi)∇V(xi)Δt2+1

4∇Σ(xi)TΣ−1(xi)∇Σ(xi)Δt2 +ΔxTi ∇V(xi)Δt−ΔxTi Σ−1(xi)∇Σ(xi)Δt1

2∇V(xi)T∇Σ(xi)Δt2

−ΔxTiΣ−1(xi+1)Δxi1

4∇V(xi+1)TΣ(xi+1)∇V(xi+1)Δt21

4∇Σ(xi+1)TΣ−1(xi+1)∇Σ(xi+1)Δt2 +ΔxTi ∇V(xi+1)Δt−ΔxTi Σ−1(xi+1)∇Σ(xi+1)Δt+1

2∇V(xi+1)T∇Σ(xi+1)Δt2

˙

= 1 2Δt

n−1

i=0

ΔxTi

Σ−1(xi)−Σ−1(xi+1)

Δxi1 2

n−1

i=0

ΔxTi[∇V(xi+1) +∇V(xi)]

+1 2

n−1

i=0

ΔxTi−1(xi+1)∇Σ(xi+1) +Σ−1(xi)∇Σ(xi)]

where all the terms of the general formG(xi)−G(xi+1) in the sums were cancelled out since they form telescopic

sums which become boundary terms.

Three important remarks can readily be made from the above computation.

(8)

M. KATSOULAKISET AL.

Remark 3.2. The numerical computation of entropy production rate as the time-average of the GC action functional on the path space (i.e., based on (2.9)) at first sight seems computationally intractable due to the large dimension of the path space. However, due to ergodicity, the numerical computation of the entropy production can be performed as a time-average based on (2.11) and (3.5) for largen. Additionally, this computation can be done for free and “on-the-fly” since the quantities involved are already computed in the simulation of the process. The numerical entropy production rate shown in the following figures is computed using this approach.

Remark 3.3. It was shown in [18] that the GC action functional of thecontinuous-timeprocess driven by (3.1) equals the Stratonovich integral

Wcont(t) = t

0 ∇V(Xs)dXs=V(x0)−V(xt) (3.6) which reduces to a boundary term as expected. This functional has the discretization

Wcont(t)1 2

n−1

i=0

ΔxTi [∇V(xi+1) +∇V(xi)] (3.7)

and this is exactly the first term in the GC action functionalW(n;Δt) for the explicit EM approximation process (see (3.5)). However, the discretization scheme introduces two additional terms to the GC action functional which may greatly affect the asymptotic behavior of entropy production asΔtgoes to zero, as we demonstrate in Section3.2. Notice that when the noise is additive,i.e., when the diffusion matrix is constant, then these two additional terms vanish and taking the limitΔt→0, the GC action functionalW(n;Δt), if exists, becomes the Stratonovich integralWcont(t) which is a boundary term.

Remark 3.4. The GC action functionalW(n;Δt) consists of three terms (see (3.5)), each of which stems from a particular term in the SDE. Thus, each term in the SDE contributes to the entropy production functional a component which is totally decoupled from the other terms. The reason for this decomposition lies in the particular form of the transition probabilities for the explicit EM scheme which are exponentials with quadratic argument. This feature can be exploited for the study of entropy production of numerical schemes for processes with irreversible dynamics. Indeed, if a non-gradient term of the forma(Xt)dtis added to the drift of (3.1), the process is irreversible and its GC action functional is not anymore a boundary term and is given by [18]

Wcont(t) ˙= t

0

Σ−1(Xt)a(Xt)dXt1 2

n−1

i=0

ΔxTi

Σ−1(xi)a(xi) +Σ−1(xi+1)a(xi+1)

(3.8) On the other hand, due to the separation property of the explicit EM scheme, the GC action functional of the discrete-time approximation processW(n;Δt) has the additional term

1 2

n−1

i=0

ΔxTi−1(xi)a(xi) +Σ−1(xi+1)a(xi+1)]. (3.9) Evidently, the discretization of Wcont(t) equals the additional term of the GC functionalW(n;Δt). Thus, the GC action functional W(n;Δt) is decomposed into two components, one stemming from the irreversibility of the continuous-time process and another one stemming from the irreversibility of the discretization procedure.

3.1. Entropy production for the additive noise case

An important special case of (3.1) is the case of additive noise, i.e., when the covariance matrix does not depend in the process,Σ(x)≡Σ. In this case, the SDE system becomes

dXt=1

2Σ∇V(Xt)dt+σdBt X0∼μ

(3.10)

(9)

MEASURING THE IRREVERSIBILITY OF NUMERICAL SCHEMES

and the GC action functional is simply given by

W(n;Δt) ˙=1 2

n−1

i=0

ΔxTi[∇V(xi+1) +∇V(xi)] (3.11)

In this section we prove an upper bound for the entropy production of the explicit EM scheme. The proof uses several lemmas stated and proved in Appendix A.

Theorem 3.5. Let Assumption 2.1 hold. Assume also that the potential function V has bounded fifth-order derivative and that the covariance matrix Σ is invertible. Then, for sufficiently small Δt, there exists C = C(V, Σ)>0 such that

EP(Δt)≤CΔt2 (3.12)

Proof. Utilizing the generalized trapezoidal rule (A.1) fork= 3, the GC action function is rewritten as

W(n;Δt) ˙= 1 2

n−1

i=0

ΔxTi [∇V(xi+1) +∇V(xi)]

=

n−1

i=0

⎧⎨

−(V(xi+1)−V(xi)) +

|α|=3

Cα[DαV(xi+1) +DαV(xi)]Δxαi

+

|α|=1,3,5

|β|=5−|α|

Bβ[Rβα(xi, xi+1) +Rβα(xi+1, xi)]Δxα+βi

⎫⎬

˙

=

n−1

i=0

|α|=3

Cα[DαV(xi+1) +DαV(xi)]Δxαi

+

n−1

i=0

|α|=1,3,5

|β|=5−|α|

Bβ[Rβα(xi, xi+1) +Rβα(xi+1, xi)]Δxα+βi .

(3.13)

Applying, once again, Taylor series expansion toDαV(xi+1), the GC action functional becomes

W(n;Δt) ˙=

n−1

i=0

⎧⎨

|α|=3

2CαDαV(xi)Δxαi +

|α|=3

Cα

|β|=1

Dα+βV(xi)Δxα+βi

⎫⎬

⎭ +

n−1

i=0

|α|=1,3,5

|β|=5−|α|

R¯βα(xi, xi+1)Δxα+βi

(3.14)

where ¯Rβα(xi, xi+1) =Bβ[Rβα(xi, xi+1) +Rβα(xi+1, xi)] +½|α|=3Rαβ(xi, xi+1). Moreover, expandingΔxαi using the multi-binomial formula

Δxαi =

1

2Σ∇V(xi)Δt+σΔWi

α

=

ν≤α

α ν

1

2Σ∇V(xi)Δt

ν

(σΔWi)α−ν. (3.15)

(10)

M. KATSOULAKISET AL.

Then, the GC action functional becomes

W(n;Δt) ˙= 2

n−1

i=0

|α|=3

ν≤α

Cα α

ν DαV(xi)

1

2Σ∇V(xi)Δt

ν

(σΔWi)α−ν

+

n−1

i=0

|α|=3

|β|=1

ν≤α+β

Cα α+β

ν Dα+βV(xi)

1

2Σ∇V(xi)Δt

ν

(σΔWi)α+β−ν

+

n−1

i=0

|α|=1,3,5

|β|=5−|α|

ν≤α+β

α+β

ν R¯βα(xi, xi+1)

1

2Σ∇V(xi)Δt

ν

(σΔWi)α+β−ν.

(3.16)

From (2.11), the entropy production rate is the time-averaged GC action functional asn→ ∞. Thus, EP(Δt) = lim

n→∞

W(n;Δt) nΔt

= 2 Δt

|α|=3

ν≤α

Cα α

ν lim

n→∞

1 n

n−1

i=0

DαV(xi)

1

2Σ∇V(xi)Δt

ν

(σΔWi)α−ν

+ 1 Δt

|α|=3

|β|=1

ν≤α+β

Cα α+β

ν lim

n→∞

1 n

n−1

i=0

Dα+βV(xi)

1

2Σ∇V(xi)Δt

ν

(σΔWi)α+β−ν

+ 1 Δt

|α|=1,3,5

|β|=5−|α|

ν≤α+β

α+β ν lim

n→∞

1 n

n−1

i=0

R¯βα(xi, xi+1)

1

2Σ∇V(xi)Δt

ν

(σΔWi)α+β−ν. (3.17) The ergodicity ofxias well the Gaussianity ofΔWiguarantees that the first two limits in the entropy production formula exist. Additionally, the residual terms, ¯Rβα(xi, xi+1), are bounded due to the assumption on bounded fifth-order derivative ofV, hence, the third limit also exists. Note here that this assumption could be changed by assuming boundedness of a higher order derivative and performing a higher-order Taylor expansion. Appendix A gives rigorous proofs of these ergodicity statements. Hence,

EP(Δt) = 2 Δt

|α|=3

ν≤α

Cα α

ν E¯μ

DαV(x)

1

2Σ∇V(x)Δt

ν Eρ

(σy)α−ν

+ 1 Δt

|α|=3

|β|=1

ν≤α+β

Cα α+β

ν Eμ¯

Dα+βV(x)

1

2Σ∇V(x)Δt

ν Eρ

(σy)α+β−ν

+ 1 Δt

|α|=1,3,5

|β|=5−|α|

ν≤α+β

α+β ν Eμ×ρ¯

R¯βα(x, y)

1

2Σ∇V(x)Δt

ν

Eρ[(σy)α+β−ν]

(3.18)

where ¯μ is the equilibrium measure for xi while ρ is the Gaussian measure of ΔWi. Using the Isserlis−Wick formula we can compute the higher moments of multivariate Gaussian random variable from the second-order moments. Indeed, we have

E[yν] =E[yν11. . . ydνd] =E

z1z2. . . z|ν|

=

0 if |ν| odd

E[zizj] if |ν| even (3.19) where means summing over all distinct ways of partitioningz1, . . . , z|ν| into pairs. Moreover, E[zizj] = ΣijΔt, hence, applying (3.19) into (3.18) and changing the multi-index notation to the usual notation, the

(11)

MEASURING THE IRREVERSIBILITY OF NUMERICAL SCHEMES

entropy production rate becomes EP(Δt) = 2

Δt d k1=1

d k2=1

d k3=1

Ck1k2k3

Eμ¯

3V

∂xk1∂xk2∂xk3

1 2Σ∇V

k1

Σk2k3Δt2

+E¯μ

3V

∂xk1∂xk2∂xk3

1 2Σ∇V

k2

Σk1k3Δt2

+E¯μ

3V

∂xk1∂xk2∂xk3

1 2Σ∇V

k3

Σk1k2Δt2+O Δt3

+ 1 Δt

d k1=1

d k2=1

d k3=1

d k4=1

Ck1k2k3

Eμ¯

4V

∂xk1. . . ∂xk4

×k1k2Σk3k4+Σk1k3Σk2k4+Σk1k4Σk2k3]Δt2+O Δt3

+ 1 ΔtO

Δt3 .

(3.20)

Using that (−12Σ∇V)ki=21d

k4=1Σkik4∂x∂V

k4, entropy production is rewritten as EP(Δt) =

d k1=1

d k2=1

d k3=1

d k4=1

Ck1k2k3

Σk1k2Σk3k4

−Eμ¯

3V

∂xk1∂xk3∂xk4

∂V

∂xk2

+Eμ¯

4V

∂xk1. . . ∂xk4 +Σk1k3Σk2k4

−Eμ¯

3V

∂xk1∂xk2∂xk4

∂V

∂xk3

+Eμ¯

4V

∂xk1. . . ∂xk4k1k4Σk2k3

−Eμ¯

3V

∂xk1∂xk2∂xk3

∂V

∂xk4

+Eμ¯

4V

∂xk1. . . ∂xk4 Δt+O Δt2

.

(3.21) By a simple integration by parts, we observe that for any combinationk1, . . . , k4= 1, . . . , d

Eμ

3V

∂xk1∂xk2∂xk3

∂V

∂xk4

=Eμ

4V

∂xk1. . . ∂xk4

(3.22) where the expectation is taken with respect ofμwhich is the invariant measure of the continuous-time process.

However, in (3.21) the expectation is w.r.t. the invariant measure of the discrete-time process (i.e., ¯μinstead of μ). Nevertheless, Assumption2.1guarantees that the alternation of the measure fromμto ¯μcosts an error of orderO(Δt). Hence, for any coefficient in (3.21), we obtain that

Eμ¯

3V

∂xk1∂xk2∂xk3

∂V

∂xk4

Eμ¯

4V

∂xk1. . . ∂xk4

2KΔt (3.23)

since the potentialV as well its derivatives are sufficiently smooth. Hence, we overall showed that EP(Δt) =O

Δt2

(3.24)

which completes the proof.

Remark 3.6. Depending on the potential function the entropy production could be even smaller. For instance, when the potentialV is a quadratic function (i.e. the continuous-time process is an OrnsteinUhlenbeck pro- cess), then, it is easily checked by a trivial calculation of (3.11) that the GC action function is a boundary term, thus, the entropy production of the explicit EM scheme is zero. However, for a generic potential V we expect that the entropy production rate decays quadratically as a function ofΔtbut not faster.

(12)

M. KATSOULAKISET AL.

0 1 2 3 4 5 6 7 8 9 10

x 104

−15

−10

−5 0 5 10 15

Time

GC Action Functional

0 1 2 3 4 5 6 7 8 9 10

x 104

−10

−5 0 5x 10−3

Time

Entropy Production

Figure 1.Upper panel: the GC action functional as a function of time for fixedΔt= 0.05. Its variance is large necessitating the use of many samples in order to obtain statistically confident quantities.Lower Panel: the entropy production rate as a function of time for the sameΔt. It converges to a positive value as expected.

3.1.1. Fourth-order potential on a torus

Lets now proceed with an important example where the potential is a forth-order polynomial while the process takes values on a torus. Assumed= 2 while potentialV =Vβ is given by

Vβ(x) =β |x|4

4 −|x|2

2 (3.25)

where β is a positive real number which in statistical mechanics has the meaning of the inverse temperature.

The diffusion matrix is set toσ=

−1Id. Based on [20], Assumption 2.1 is satisfied because the domain is restricted to a torus, the potential is locally Lipschitz continuous and the covariance matrix is elliptic. Figure1 presents both the GC action functional (upper panel) and the entropy production rate (lower panel) as a function of time for fixedΔt= 0.05. Both quantities are numerically computed while the inverse temperature is set to β = 10. Even though the variance of the GC action functional is large, entropy production which is the cumulative sum of the GC functional converges due to the law of large numbers to a (positive) value after relatively long time. Additionally, due to the ergodicity assumption, it converges to the correct value. Figure2 shows the loglog plot of the numerical entropy production rate as a function ofΔt forβ = 20, 40, 60. Final time was set to t = 2×106 while initial point was set to one of the attraction points of the deterministic counterpart. For reader’s convenience, the thick black line denotes theO(Δt2) rate of convergence. This plot is in agreement with the theorem’s estimate (3.12). Notice also that, for smallΔt, entropy production rate is very close to 0 and even larger final time is needed in order to obtain a statistically confident numerical estimate for the entropy production. Moreover, as it is evident from the figure and the GC action functional in (3.11), the dependence of the entropy production w.r.t. the inverse temperature is inverse proportional. Thus, from a statistical mechanics point of view, the larger is the temperature the larger -in a linear manner- is the entropy production rate of the numerical scheme.

Références

Documents relatifs

Mais elle a également un rôle dans l’évaluation pré thérapeutique, puisque les troubles du sommeil, et surtout les troubles du comportement qui leur sont associés, sont une

We will fur- thermore show that under certain circumstances approximate approximation phenomena occur, in the sense that, in case of non-convergence of the scheme to the true

It is proved in [5] that all of them are gradient discretisation methods for which general properties of discrete spaces and operators are allowing common convergence and

Here we are concerned with reachability: the exis- tence of zigzag trajectories between any points in the state space such that, for a given potential function U , the trajectories

In Section 3, we show that all methods in the following list are gradient schemes and satisfy four of the five core properties (coercivity, consistency, limit-conformity,

An original feature of the proof of Theorem 2.1 below is to consider the solutions of two Poisson equations (one related to the average behavior, one related to the

(Left) A cathepsin B-green fluorescent protein chimera contain- ing the first 65 amino acids of truncated procathepsin B starting with position 52 (aa numbering of preprocathepsin

The first one is a re- interpretation of the classical proof of an implicit functions theorem in an ILB setting, for which our setting enables us to state an implicit functions