Post-processing of the planewave approximation of Schrödinger equations. Part I: linear operators

(1)

HAL Id: hal-01908039

https://hal.archives-ouvertes.fr/hal-01908039v4

Submitted on 17 Mar 2020

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci- entific research documents, whether they are pub- lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

Post-processing of the planewave approximation of Schrödinger equations. Part I: linear operators

Eric Cancès, Geneviève Dusson, Yvon Maday, Benjamin Stamm, Martin Vohralík

To cite this version:

Eric Cancès, Geneviève Dusson, Yvon Maday, Benjamin Stamm, Martin Vohralík. Post-processing of the planewave approximation of Schrödinger equations. Part I: linear operators. IMA Journal of Numerical Analysis, Oxford University Press (OUP), 2021, 41 (4), pp.2423-2455.

�10.1093/imanum/draa044�. �hal-01908039v4�

(2)

Post-processing of the planewave approximation of Schr¨ odinger equations. Part I: linear operators

Eric Canc` es

CERMICS, Ecole des Ponts, Universit´ e Paris-Est, 6 & 8 Av. Pascal, 77455 Marne-la-Vall´ ee, France and

Inria, 2 Rue Simone Iff, 75589 Paris, France Genevi` eve Dusson

^∗

Universit´ e Bourgogne Franche-Comt´ e, Laboratoire de Math´ ematiques de Besan¸con, UMR CNRS 6623, Besan¸con, France

Yvon Maday

Sorbonne Universit´ e, Universit´ e Paris-Diderot SPC, CNRS, Laboratoire Jacques-Louis Lions (LJLL), F-75005 Paris, France, and

Institut Universitaire de France, 75005 Paris, France Benjamin Stamm

Center for Computational Engineering Science, RWTH Aachen University, Aachen, Germany

and

Martin Vohral´ık

Inria, 2 Rue Simone Iff, 75589 Paris, France and

CERMICS, Ecole des Ponts, Universit´ e Paris-Est, 6 & 8 Av. Pascal, 77455 Marne-la-Vall´ ee, France

Abstract

In this article, we prove a priori error estimates for the perturbation-based post-processing of the plane-wave approximation of Schr¨odinger equations introduced and tested numerically in previous works [6, 7]. We consider here a Schr¨odinger operatorH =−¹₂∆ +VonL²(Ω), where Ω is a cubic box with periodic boundary conditions, and whereVis a multiplicative operator by a regular-enough functionV. The quantities of interest are, on the one hand, the ground-state energy defined as the sum of the lowestN eigenvalues ofH, and, on the other hand, the ground-state density matrix, that is the spectral projector on the vector space spanned by the associated eigenvectors. Such a problem is central in first-principle molecular simulation, since it corresponds to the so-called linear subproblem in Kohn–Sham density functional theory (DFT). Interpreting the exact eigenpairs ofH as perturbations of the numerical eigenpairs obtained by a variational approximation in a plane-wave (i.e. Fourier) basis, we compute first-order corrections for the eigenfunctions, which are turned into corrections on the ground-state density matrix.

This allows us to increase the accuracy by a factor proportional to the inverse of the kinetic energy cutoff Ec−1

of both the ground-state energy and the ground-state density matrix in Hilbert–Schmidt norm at

∗Corresponding author. Email: [email protected]

(3)

a low computational extra-cost. Indeed, the computation of the corrections only requires the computation of the residual of the solution in a larger plane-wave basis and two Fast Fourier Transforms per eigenvalue. Schr¨odinger operator, perturbation method, planewave approximation, eigenvalue problem, post-processing.

1 Introduction

First-principle molecular simulation is a major tool to predict the properties of matter from the atomic to the macroscopic scales. It is widely used in different fields such as chemistry, condensed matter physics, or materials science. As a main advantage, it requires no empirical parameter except a few fundamental constants of physics (the reduced Planck constant ~, the electron mass me, the elementary charge e, the dielectric permittivity of the vacuum ε0, the Boltzmann constant kB), as well as the masses and atomic numbers of the nuclei contained in the system under consideration.

At this scale, matter is described as a system of M nuclei andN electrons. In the Born–Oppenheimer approximation [1], the nuclei, which are thousands times heavier than electrons, are considered as point-like classical particles, and the electrons are supposed to be, at each timet, in their ground-state. The resulting many-body Schr¨odinger equation describing the electronic ground state, a partial differential equation on a 3N-dimensional space, is way too costly to be solved in practice, hence approximations have to be resorted to, starting with considering a simpler model.

Many different approaches have been proposed to compute the electronic ground-state, among which wavefunction methods (see [12] and [4] for a mathematical introduction), density functional theory consisting of orbital-free and Kohn–Sham models [9, 16] and quantum Monte Carlo methods [17, 18].

The Kohn–Sham models are among the most widely used in physics and chemistry, as they provide a good compromise between accuracy and computational cost, being posed only in a 3-dimensional space. In condensed matter physics and materials science, most Kohn–Sham calculations are performed in a rectangular box Ω, called a supercell, with periodic boundary conditions (Born–von Karman PBC). The most common method to discretize the Kohn–Sham equations then is to use a variational approximation in a plane-wave (Fourier) basis. For very large systems, unfortunately, using a fine discretization basis is too expensive, while using a coarse discretization basis leads to insufficiently accurate results.

In order to limit the computational cost of the method while preserving the quality of the numerical results, several post-processing methods have been proposed. Usually, the approach is to perform a full computation in a coarse basis, for which the computational cost is not excessively high, and then to make some not-too-expensive computation in a finer basis leading to a substantial improvement in accuracy.

In general, our approach falls into the category of two-grid methods, which have been applied to the case of a linear eigenvalue problem originally in [22], and nonlinear eigenvalue problems in [2, 13, 19]. In [22], the proposed method contains two steps: first, a linear eigenvalue problem is solved on a coarse finite element basis, and second, the computed eigenvectors are improved through a post-processing step, consisting of solving a boundary problem on a fine basis. The latter allows to increase the convergence rate of the eigenvectors in energy norm by at most the square of the discretization parameter of the coarse basis. For the nonlinear eigenvalue problems treated in [2, 13, 19], the two steps are similar: first a nonlinear eigenvalue problem is solved on a coarse basis, then a boundary problem is solved on a fine basis. In [2], an alternative approach for the post-processing step consisting of solving a linear eigenvalue problem is also proposed. Also, a preprocessing step is added in [13] in order to better choose the coarse basis. In terms of improvement, the convergence rate of the post-processed eigenvectors in energy norm is doubled compared to the coarse quantities in [19] for the Hartree–Fock problem. It is improved by the square of the coarse discretization parameter in [2], and by its cube in [13], thanks to the proposed preprocessing step.

In [6, 7], we introduced a new post-processing method for periodic Kohn–Sham calculations in a plane- wave basis, leading to a significant gain in accuracy at a very limited computational extra-cost. This approach is based on the Rayleigh–Schr¨odinger perturbation method, considering the exact Kohn–Sham ground-state as a perturbation of the approximate ground-state computed in a finite basis set. Theoretical estimates in the asymptotic regime for the energy were announced and illustrated by numerical simulations. These

(4)

simulations showed that the accuracy on the ground-state energy could be improved by a factor of 10 to 100 at a very limited extra-cost (about 3%).

Compared to the existing post-processing methods [2, 13, 19], we avoid here the resolution of a boundary value problem or an eigenvalue problem in the fine basis, because the considered operator to invert is diagonal.

The gain in accuracy is proportional to the square of the plane-wave discretization parameter for the energy norm of the eigenvectors, which is comparable to what is obtained in [2] and [22]. Note that the proposed method is specific to a plane-wave discretisation. A theoretical and numerical comparison of these different approaches for nonlinear eigenvalue problems will be presented in the following article [11].

In this contribution, we focus on the linear subproblem of the Kohn–Sham model, and we present the proofs of the improved accuracy for the post-processing of the plane-wave approximation of Schr¨odinger equations stated in [7]. Corresponding proofs for the nonlinear Kohn–Sham model will be dealt with in a forthcoming paper [10].

The linear Kohn–Sham subproblem considered in this article consists in computing the rank–N ground- state density matrixγ0of a linear Schr¨odinger operator H =−¹2∆ +V, acting on the space L²_#(Ω) of real- valued square-integrable periodic functions onR³with Ω as a periodic cell. ForV∈L²_#(Ω), the Hamiltonian H is diagonalizable in an orthonormal basis and its eigenvaluesλ⁰₁≤λ⁰₂≤ · · · (counting multiplicities) form a non-decreasing sequence of real numbers that tends to +∞. Denoting by (φ⁰_i)i≥1 an orthonormal basis of associated eigenvectors, and assuming that there is a gapg:=λ⁰_N₊₁−λ⁰_N >0 between theN^th and the (N+1)^steigenvalue ofH, the rank–Nground-state density matrix is defined, using Dirac’s bra-ket notation, as

γ0=

N

X

i=1

|φ⁰_iihφ⁰_i|.

It is therefore the orthogonal projector on the N-dimensional vector subspace of L²_#(Ω) spanned by the eigenvectors associated with the lowestN eigenvalues ofH. The ground-state energy is then defined as the scalar quantity

E0=

N

X

i=1

λ⁰_i.

Note thatE0 is not equal to the Kohn–Sham ground-state energy whenH is the Kohn–Sham Hamiltonian due to nonlinear effects, called double-counting in the chemistry and physics literatures.

Our main result is summarized in Theorem 4.1. We show that, in the asymptotic regime where the discretization space is large enough, the convergence rates of both the post-processed ground-state density matrix (built from the post-processed eigenvectors defined in [6]) measured in Hilbert–Schmidt norm and the post-processed ground-state energy are improved by a factor proportional to N_c⁻² where Nc is the discretization parameter. Since the kinetic energy cutoffEc is itself proportional toN_c², this improvement factor corresponds toEc−1, i.e. the inverse of the kinetic energy cutoff.

Note that we do not make any non-degeneracy assumption on the lowestN eigenvalues ofH; only the presence of a positive gap between the N^th and the (N + 1)^st eigenvalues of H is required. As in [6, 7], our approach strongly relies on the fact that, in plane-wave approximations, the kinetic energy operator

−¹2∆, which is the leading term in the Hamiltonian H, commutes with the orthogonal projector on the discretization space. The constant C appearing in the estimations is generic and may depend onN but is always independent of the discretization parameterNc.

This article is organized as follows. In Section 1, we summarize the main notation used in the article. In Section 2.1, we present in detail the linear subproblem of the Kohn–Sham model, as well as the characteriza- tion ofγ0 as the unique solution to some constrained optimization problem, and some other useful classical results. In Section 2.3, we describe the plane-wave discretization of this optimization problem. In Section 2.4, we translate thea priorierror analysis results of [3] into the density matrix formalism. Our post-processing method based on Rayleigh–Schr¨odinger perturbation theory is described in Section 3. In Section 4.1, we present the main results of this paper,i.e. an improved convergence rate on the post-processed ground-state

(5)

density matrix and energy. The proofs are given in Section 4.2. Some numerical simulations are presented in Section 5.

List of notation

To help the reader navigate through the paper, we summarize below the principal notation, and refer to the definitions when needed.

To start with, N denotes the number of computed eigenvalues. The quantities related to the choice of discretisation (Section 2.3) are

• Ec: kinetic energy cutoff,

• Nc =q

Ec

2 L

π: discretisation parameter,

• XNc: discretisation space.

The quantities related to energies are

• E0: exact ground state energy (Section 2.1),

• E0,Nc: variational approximation of the ground state energy (2.19),

• E]0,Nc: perturbed ground state energy (3.10).

The different eigenfunctions, corresponding eigenvalues and Lagrange multiplier matrices defined in this article are

• Φ⁰ = (φ⁰₁,· · · , φ⁰_N)^T and (λ⁰₁,· · · , λ⁰_N): lowest N eigenfunctions and corresponding eigenvalues of the Hamiltonian. The matrix of Lagrange multipliers Λ⁰ is diagonal Λ⁰= diag(λ⁰₁,· · ·, λ⁰_N).

• ΦNc := (φ1,Nc, . . . , φN,Nc)^T and (λ1,Nc, . . . , λN,Nc): lowest eigenfunctions diagonalizing the Hamilto- nian on the discretisation space and corresponding eigenvalues (Section 2.3). The corresponding matrix of Lagrange multiplier is diagonal as well.

• Φ⁰_N_c = (φ⁰_1,N_c, . . . , φ⁰_N,N_c)^T: eigenfunctions given by a unitary transform of ΦNc such that Φ⁰_N_c is as much aligned with Φ⁰ as possible. The corresponding matrix of Lagrange multipliers Λ⁰_N_c = (λ⁰_ij,N_c)1≤i,j≤N := (hφ⁰_i,N_c|H|φ⁰_j,N_ci)1≤i,j≤N ∈R^N^×^N is not diagonal.

• (φ]1,Nc, . . . ,φ^N,Nc) and (λ]1,Nc, . . . ,λ^N,Nc): perturbed eigenfunctions and perturbed eigenvalues (Defi- nition 3.1).

The different density matrices involved in the following are:

• γ0: exact ground state density matrix (Section 2.1),

• γ0,Nc: approximate density matrix (2.21),

• γ]0,Nc: perturbed density matrix (Definition 3.1). Note that this density matrix is not an orthogonal projector, as mentioned in Remark 3.1.

(6)

2 Post-processing for the Kohn–Sham linear subproblem

In order to simplify the notation, we consider a cubic latticeR=LZ³(L >0) corresponding to the supercell Ω = [0, L)³, but all our arguments straightforwardly apply to the general case of a lattice with lower or no point symmetry. For 1≤p≤ ∞ands∈R+, we denote by

L^p_#(Ω) :=

u∈L^p_loc(R³,R)|uisR-periodic , H_#^s(Ω) :=

u∈H_loc^s (R³,R)|uisR-periodic ,

the spaces of real-valuedR-periodic L^p and H^s functions, and byL(L²_#) the vector space of the bounded linear operators onL²_#(Ω).

2.1 Problem setting

LetN ∈N^∗ andV∈L²_#(Ω). In Kohn–Sham models, N is the number of electrons (or of electron pairs in closed-shell models) in the simulation cell, andVis an approximation of the Kohn–Sham effective potential.

LetH be the operator onL²_#(Ω) with domainH_#²(Ω) defined by

∀u∈H_#²(Ω), Hu=−1

2∆u+Vu. (2.1)

Since the operator H is a perturbation of the laplace operator on a torus, it is self-adjoint, bounded below, with compact resolvent [21]. It can therefore be diagonalized in an orthonormal basis: there exists a non-decreasing sequence (λ⁰_i)i≥1 of real numbers and an orthonormal basis (φ⁰_i)i≥1 ofL²_#(Ω) consisting of functions ofH_#²(Ω) such that

∀i≥1, Hφ⁰_i =λ⁰_iφ⁰_i. (2.2)

We denote by Φ⁰= (φ⁰₁,· · · , φ⁰_N)^T and Λ⁰= diag(λ⁰₁,· · ·, λ⁰_N).

A key assumption for our analysis is the following:

Assumption2.1. There is a gap between theN^th and the(N+ 1)^st eigenvalues ofH, i.e.

g:=λ⁰_N₊₁−λ⁰_N >0.

We denote by F:= ^λ⁰^N^+λ₂⁰^N+1 the Fermi level. Note that, in this setting, any real number in the range (λ⁰_N, λ⁰_N₊₁) is an admissible Fermi level.

As already mentioned in the introduction, the purpose of the linear subproblem is to compute two quantities of interest:

1. the ground-state density matrix

γ0:=1(−∞,F](H) =

N

X

i=1

|φ⁰_iihφ⁰_i|;

2. the ground-state energy

E0:= Tr (Hγ0) =

N

X

i=1

λ⁰_i,

where Tr denotes the trace, and will be properly introduced in Section 2.2.

The linear subproblem can be formulated as a variational problem in several ways. First, introducing the quadratic form

H_#¹(Ω)3ψ7→ hψ|H|ψi:=1 2

Z

Ω|∇ψ|²+ Z

Ω

V|ψ|²∈R

(7)

associated withH, the energy functionalEdefined by

∀Ψ = (ψ1, . . . , ψN)^T ∈

H_#¹(Ω)N

, E(Ψ) :=

N

X

i=1

hψi|H|ψii=

N

X

i=1

1 2

Z

Ω|∇ψi|²+ Z

Ω

V|ψi|²

, (2.3) and the (infinite-dimensional) Stiefel manifold

M=

Ψ = (ψ1, . . . , ψN)^T ∈

H_#¹(Ω)N

∀i, j= 1, . . . , N, Z

Ω

ψiψj =δij

, (2.4)

we have

E0= inf{E(Ψ), Ψ∈M}. (2.5)

Besides, Φ⁰ = (φ⁰₁, . . . , φ⁰_N)^T is a minimizer of (2.5). Note that Φ⁰ is not the unique minimizer of (2.5).

Indeed, denoting by O(N) :=

U ∈R^N^×^N |U^TU = 1N the orthogonal group in dimension N (1N is the identity matrix of rankN), we have

∀Ψ∈M, ∀U ∈O(N), UΨ∈M, and E(UΨ) =E(Ψ). (2.6) Therefore,UΦ⁰is a minimizer of (2.5) for allU ∈O(N). In fact, under Assumption 2.1, the set of minimizers of (2.5) is exactly equal toO(N)Φ⁰. For the sake of completeness, let us recall the proof of this elementary, but key, property. Let Ψ = (ψ1,· · · , ψN)^T be a critical point of (2.5). The first-order optimality conditions satisfied by Ψ read

∀i, j= 1, . . . , N, Hψi=

N

X

j=1

λijψj, Z

Ω

ψiψi=δij.

TheN×Nsymmetric matrix Λ = (λij)i,j=1,...,N is the Lagrange multiplier of the matrix constraintR

Ωψiψj= δij. It is not diagonal in general. On the other hand, since it is symmetric, there existsU∈O(N) such that UΛU^T = diag(λ1,· · ·, λN) with λ1 ≤λ2 ≤ · · · ≤ λN. Then, Φ = (φ1,· · ·, φN)^T :=UΨ also is a critical point of (2.5) with the same energy as Ψ, and we have

∀1≤i, j≤N, Hφi=λiφi, Z

Ω

φiφj=δij and E(Φ) =E(Ψ) =

N

X

i=1

λi.

For Ψ to be a minimizer of (2.5), we must haveλi=λ⁰_i for all 1≤i≤N. Under Assumption 2.1, we have in addition Span(ψ1,· · ·, ψN) = Span(φ1,· · · , φN) = Span(φ⁰₁,· · · , φ⁰_N) = Ran(γ0). Therefore, there exists U ∈O(N) such that Ψ =UΦ0.

To get rid of the gauge invariance (2.6), it is possible to identify the elements of the Stiefel manifold (2.4) spanning the same subspace, introducing the Grassmann manifold as the quotient of the Stiefel manifold M/∼with respect to the equivalence relation Ψ∼Ψ if ˜˜ Ψ =UΨ for someU ∈O(N). Alternatively, as there exists a one-to-one correspondence between the subspace spanned by Ψ∈Mand the orthogonal projector onto this space, it is possible to define the Grassmann manifold in terms of the orthogonal projectors called density matrices, and to reformulate problem (2.5) in terms of these density matrices. To this end, we define the (infinite-dimensional) Grassmann manifold

Υ =

γ∈L(L²#)

γ^∗=γ, γ²=γ, Tr (γ) =N, Tr (−∆γ)<∞ , (2.7) its convex hull

K=

γ∈L(L²_#)

γ^∗=γ, 0γ1, Tr (γ) =N, Tr (−∆γ)<∞ , (2.8) and the energy functionalE defined onKby

∀γ∈K, E(γ) = Tr (Hγ). (2.9)

(8)

There holds

E0= inf{E(γ), γ∈Υ}, (2.10)

and

E0= inf{E(γ), γ∈K}. (2.11)

Besides, under Assumption 2.1, γ0 is the unique minimizer of both (2.10) and (2.11). Hereγ^∗ denotes the adjoint of γ, 0 ≤ γ ≤ 1 means ∀u ∈ L²_#(Ω), 0 ≤ hu|γ ui ≤ kuk²L²_#. The precise meaning of the terms Tr (−∆γ) and Tr (Hγ), as well as the proof of the fact thatγ0 is the unique minimizer of (2.10) and (2.11), will be given in the next section.

2.2 Functional setting

We denote byk · kthe operator norm onL(L²#), the space of bounded linear operators onL²_#(Ω). We also need to introduce the Banach spaceS1(L²_#) of trace-class operators onL²_#(Ω) and the Hilbert spaceS2(L²_#) of Hilbert–Schmidt operators on L²_#(Ω), respectively endowed with the norm defined by kAkS1(L²_#) :=

Tr (|A|) = Tr (√

A^∗A) and the inner product defined by (A, B)_S₂_(L²

#):= Tr (A^∗B). We refer to [20, Chapter VI] for an introduction to trace-class and Hilbert–Schmidt operators. Let us just recall here the properties which will be used in the sequel:

• for any orthonormal basis (en)n∈NofL²_#(Ω), we have

∀A∈S1(L²_#), Tr (A) =X

n∈N

hen|Aeni,

∀A∈S₂(L²_#), kAk^S2(L²_#)= Tr (A^∗A)^1/2= X

n∈N

kAenk²L²_#

!1/2

.

IfA∈L(L²_#) is apositiveoperator, that is if for allu∈L²_#(Ω), there holds thathu|Aui ≥0, then the value of the sum

Tr (A) :=X

n∈N

hen|Aeni ∈R+∪ {+∞}

is independent of the choice of the orthonormal basis (en)n∈N. IfA∈L(L²#) ispositive and self-adjoint, thenA∈S₁(L²_#) if and only if Tr (A)<∞;

• S₁(L²_#)⊂S₂(L²_#)⊂L(L²_#) and for allA∈S₁(L²_#),

kAk ≤ kAkS2(L²_#)≤ kAkS1(L²_#); (2.12)

• for allA∈S₁(L²_#) andB∈L(L²_#), we haveAB∈S₁(L²_#),BA∈S₁(L²_#),

Tr (AB) = Tr (BA), kABk^S1(L²_#)≤ kBk kAk^S1(L²_#), kBAk^S1(L²_#)≤ kBk kAk^S1(L²_#); (2.13)

• for allA∈S₁(L²_#), there exists a unique functionρA∈L¹_#(Ω), called the density associated with the operator A, such that for allV ∈L^∞_#(Ω),

Tr (AV) = Z

Ω

ρAV,

where on the left–hand side of the above equality, V ∈L(L²_#) denotes the multiplication operator by the function V;

(9)

• for allA∈S2(L²_#) andB∈L(L²#), we haveAB∈S2(L²_#),BA∈S2(L²_#),

kABkS2(L²_#)≤ kBk kAkS2(L²_#), kBAkS2(L²_#)≤ kBk kAkS2(L²_#); (2.14)

• for allA∈S₂(L²_#) andB∈S₂(L²_#),AB∈S₁(L²_#),BA∈S₁(L²_#),

Tr (AB) = Tr (BA)≤ kAkS2(L²_#)kBkS2(L²_#). (2.15) We set

∀Ψ = (ψ1,· · ·, ψN)^T ∈

L²_#(Ω)N

, kΨkL²_# :=

N

X

i=1

kψik²L²_#

!^1/2 ,

∀Ψ = (ψ1,· · ·, ψN)^T ∈

H_#¹(Ω)N

, kΨkH_#¹ :=k(1−∆)^1/2ΨkL²_# =

N

X

i=1

kψik²H_#¹

!^1/2 ,

and more generally, for any operatorAonL²_#(Ω) with domainD(A),

∀Ψ∈[D(A)]^N, kAΨkL²_# :=

N

X

i=1

kAψik²L²_#

!^1/2 .

Let R^∗ = ^2π_LZ³ be the dual lattice of the periodic lattice R = LZ³. For k ∈ R^∗, we denote by ek the plane-wave with wavevectork, defined by

ek: R³→C

x7→ |Ω|^−1/2e^ik^·^x,

where|Ω|=L³. The family (ek)k∈R^∗ forms an orthonormal basis of the complex Hilbert space L²_#(Ω,C) :=

u∈L²_loc(R³,C)|uisR-periodic , endowed with the scalar product

∀u, v∈L²_#(Ω,C), hu|vi= Z

Ω

u(r)v(r)dr,

whereu(r) denotes the complex conjugate ofu(r), and for allv∈L²_#(Ω,C), v(r) = X

k∈R^∗

vbkek(r) with vbk=hek|vi=|Ω|⁻^1/2 Z

Ω

v(r)e⁻^ik^·^rdr.

Recall that the periodic Sobolev spaces H_#^s(Ω) can be characterized in a simple way using Fourier series:

fors∈R, we have H_#^s(Ω) :=

(

v= X

k∈R^∗

bvkek

∀k, bv₋k=bvk, kvk²H_#^s := X

k∈R^∗

(1 +|k|²)^s|bvk|²<∞ )

,

where theH_#^s inner product is defined by

∀u, v∈H_#^s(Ω), (u, v)H_#^s := X

k∈R^∗

(1 +|k|²)^sukvk.

(10)

Let us now clarify the meaning of the terms Tr (−∆γ) and Tr (Hγ) appearing in (2.7)–(2.9). Letγ∈L(L²#) be self-adjoint and positive. Since|∇|(i.e. the multiplication operator by|k|in Fourier representation) is a bounded linear operator fromH_#^s(Ω) to H_#^s−1(Ω),|∇|γ|∇|defines a bounded linear operator from H_#¹(Ω) toH_#⁻¹(Ω). If in addition, Ran(|∇|γ|∇|)⊂L²_#(Ω) and

∃C∈R+ such that ∀u∈H_#¹(Ω), k|∇|γ|∇|ukL²_# ≤CkukL²_#,

then|∇|γ|∇|can be uniquely extended to a bounded, self-adjoint, positive operator onL²_#(Ω), also denoted by|∇|γ|∇|for simplicity. In this case, Tr (|∇|γ|∇|) is well-defined inR+∪ {+∞}. In view of the fact that

−∆ =|∇|², the notation

Tr (−∆γ) := Tr (|∇|γ|∇|)

is commonly used in the mathematical physics literature. Let us emphasize that Tr (−∆γ)<∞only means that Tr (|∇|γ|∇|)<∞; in particular, it doesnotimply that the operator −∆γis in S₁(L²_#).

It follows from the Hoffmann–Ostenhof inequality [14] that for allγ∈K, ργ ≥0 and√ργ ∈H_#¹(Ω),→ L⁶_#(Ω). The real number Tr (Hγ) can therefore be defined for all γ∈Kas

Tr (Hγ) := 1

2Tr (−∆γ) + Z

Ω

ργV.

It is known in addition (see e.g. [5]) that, under Assumption 2.1, there exist 0< c≤C <∞such that

c(1−∆)≤ |H −F| ≤C(1−∆), (2.16)

where the operator|H −F| is defined as|H −F|=−γ0(H −F)γ0+ (1−γ0)(H −F)(1−γ0), and can be written using the eigendecomposition ofH as

|H −F|=

+∞

X

i=1

|λ⁰_i −F||φ⁰_iihφ⁰_i|,

where (φ⁰_i, λ⁰_i)i≥1 are eigenpairs ofH defined in (2.2). There also holds

∀γ∈K, Tr (Hγ)−Tr (Hγ0) =k|H−F|^1/2(γ−γ0)k²S2(L²_#). (2.17) We deduce from (2.16) and (2.17) that there exist two constants 0< c≤C <∞such that

∀γ∈K, ck(1−∆)^1/2(γ−γ0)k²S2(L²_#)≤E(γ)−E0≤Ck(1−∆)^1/2(γ−γ0)k²S2(L²_#). (2.18) This implies in particular thatγ0 is the unique minimizer of (2.10) and (2.11).

Note that for allγ∈Υ,asγ²=γand by the cyclicity of the trace, there also holds Tr (Hγ) = Tr (γH) = Tr (γHγ).

2.3 Discretization

In order to solve problem (2.5) numerically, we use a plane-wave discretization. For each k ∈ R^∗, the kinetic energy of the plane-waveek is given by ¹₂|k|², where| · | denotes the Euclidean norm. To construct a discretization space, we introduce some energy cut-offEc >0 and consider all plane-waves whose kinetic energy is smaller thanEc,i.e. |k| ≤√

2Ec. For each cut-off energyEc, we set Nc=q

Ec

2 L π and XNc:=







X

k∈R^∗,|k|≤^2πLNc

bvkek

vb₋k=bvk, ∀k







⊂ \

s∈R

H_#^s(Ω).

(11)

For alls∈R, for allr≤s, and for eachv∈H_#^s(Ω), the best approximation ofv inXNc in anyH_#^r-norm is ΠNcv= X

k∈R^∗,|k|≤^2π_LNc

bvkek.

We denote by Π^⊥_N_c = (1−ΠNc) the orthogonal projector on X_N^⊥_c, the orthogonal ofXNc. The variational approximation to the ground-state energy inXNc is defined as

E0,Nc= inf

E(ΨNc), ΨNc ∈M∩[XNc]^N , (2.19) withEandMdefined in (2.3) and (2.4). Letλ1,Nc ≤λ2,Nc≤ · · · ≤λdim^(XNc),Ncbe the dim(XNc) eigenvalues (counting multiplicites) of the Hermitian linear operatorHNc,proj:XNc→XNc defined as

HNc,proj= ΠNcHΠNc =−1

2ΠNc∆ΠNc+ ΠNcVΠNc. (2.20) Let (φ1,Nc, . . . , φN,Nc) be an orthonormal family of eigenvectors ofHNc,projassociated with the eigenvalues λ1,Nc≤ · · · ≤λN,Nc:

HNc,projφi,Nc=λi,Ncφi,Nc, Z

Ω

φi,Ncφj,Nc =δij, ∀1≤i, j≤N, and let ΦNc:= (φ1,Nc, . . . , φN,Nc)^T. Then ΦNc is a minimizer of (2.19). Denoting by

γ0,Nc=

N

X

i=1

|φi,Ncihφi,Nc| (2.21)

the associated density matrix, we have

E0,Nc= Tr (Hγ0,Nc) =

N

X

j=1

λj,Nc. (2.22)

2.4 A priori results on the density matrices

From now on, we make the following technical assumption:

Assumption2.2. Vis aR−periodic potential satisfying V∈H_#^s(Ω) fors >3/2.

Note that this assumption implies thatV∈L^∞_#(Ω) and∇V∈L³_#(Ω).

Thea priorierror estimates established in [3] for the nonlinear Kohn–Sham model also hold true for the linear subproblem. In order to use these results in the present setting, it is convenient to reformulate them in terms of density matrices. As in [3], we introduce

M^Φ⁰ :=

Ψ∈M

kΨ−Φ⁰kL²_#= min

U∈O(N)kUΨ−Φ⁰kL²_#

,

where Φ⁰ = (φ⁰₁, . . . , φ⁰_N)^T, (φ⁰₁, . . . , φ⁰_N) being a family of orthonormal eigenvectors ofH associated with the eigenvaluesλ⁰₁≤ · · · ≤λ⁰_N fixed once and for all.

Proceeding as in [3], it can be shown that, for Nc large enough, (2.19) has a unique minimizer Φ⁰_N_c = (φ⁰_1,N_c, . . . , φ⁰_N,N_c)^T belonging to M^Φ⁰, that the set of minimizers of (2.19) is O(N)Φ⁰_N_c, and that, conse- quently, all the minimizers of (2.19) share the same density matrix. In particular

γ0,Nc=

N

X

i=1

|φ⁰_i,N_cihφ⁰_i,N_c|.

(12)

We denote by

Λ⁰_N_c= (λ⁰_ij,N_c)1≤i,j≤N := (hφ⁰_i,N_c|H|φ⁰_j,N_ci)1≤i,j≤N ∈R^N^×^N (2.23) the Lagrange multiplier matrix of the orthonormality constraints. Note that the matrix Λ⁰_N_c is not diagonal in general, but that we have

E0,Nc= Tr (Λ⁰_N_c).

The following lemma allows one to translate the a prioriresults of [3, Theorem 4.2] in terms of density matrices.

Lemma 2.1. Under Assumption 2.1, there exist 0< c≤C <∞ andN_c⁰∈N,such that for all Nc≥N_c⁰, kΦ⁰_N_c−Φ⁰kL²_# ≤ kγ0,Nc−γ0kS2(L²_#) ≤ √

2kΦ⁰_N_c−Φ⁰kL²_#, (2.24) ck(1−∆)^1/2(Φ⁰_N_c−Φ⁰)kL²_# ≤ k(1−∆)^1/2(γ0,Nc−γ0)kS2(L²_#) ≤ Ck(1−∆)^1/2(Φ⁰_N_c−Φ⁰)kL²_#. (2.25)

The proof is given in the Appendix.

We then immediately infer from [3, Theorem 4.2] that under Assumptions 2.1 and 2.2, there existC∈R+

andN_c⁰∈Nsuch that for allNc≥N_c⁰,

kγ0−γ0,Nck^S2(L²_#)≤CN_c⁻², (2.26) kγ0−γ0,Nck^S2(L²_#)≤CN_c⁻¹k(1−∆)^1/2(γ0−γ0,Nc)k^S2(L²_#). (2.27) Moreover, there existsC∈R+ such that

kΛ⁰−Λ⁰_N_ck^F≤Ck(1−∆)^1/2(γ0−γ0,Nc)k²S2(L²_#), (2.28) wherek.k^F denotes the Frobenius norm.

3 Post-processing of the plane-wave approximation

3.1 A key remark

Let us introduce the Hamiltonian onL²_#(Ω) with domainH_#²(Ω) defined by

∀u∈H_#²(Ω), HNcu=−1

2∆u+ ΠNcVΠNcu.

Since XNc and X_N^⊥_c are invariant subspaces of HNc, the Hamiltonian HNc can be represented in term of HNc,projas follows:

HNc=







HNc,proj 0

0 −¹2∆











 XNc





 X_N^⊥_c

| {z }

XNc

| {z }

X_Nc^⊥

(3.1)

The eigenvalues of the Laplace operator, which is diagonal in plane-wave bases, are explicitly known and its smallest eigenvalue on the invariant subspaceX_N^⊥_c is larger than ¹₂ ^2πN_L^c²

. Therefore, as soon as λN,Nc <1

2

2πNc

L 2

, (3.2)

(13)

where we recall that λ1,Nc ≤ · · · ≤ λN,Nc are the lowest N eigenvalues (counting multiplicities) of the operatorHNc,projdefined in (2.20), we have

∀j= 1, . . . , N, HNcφj,Nc=λj,Ncφj,Nc, and ∀i, j= 1, . . . , N, Z

Ω

φi,Ncφj,Nc=δij, (3.3) andλ1,Nc≤ · · · ≤λN,Nc are also the lowestN eigenvalues (counting multiplicities) of the operatorHNc. A key observation is that the lowest energy eigenmodes ofH satisfy

∀j= 1, . . . , N, (HNc+V^⊥Nc)φ⁰_j =λ⁰_jφ⁰_j, and ∀i, j= 1, . . . , N, Z

Ω

φ⁰_iφ⁰_j =δij, (3.4) where

V^⊥Nc=V−ΠNcVΠNc. (3.5)

We can therefore apply the Rayleigh–Schr¨odinger perturbation method [15] using (φj,Nc, λj,Nc)j=1,...,N as the reference solution and (φ⁰_j, λ⁰_j)j=1,...,N as the perturbed solution, in order to build improved approximations of the orbitals and eigenvalues respectively denoted by (φ]j,Nc)j=1,...,N and (λ]j,Nc)j=1,...,N, as well as an improved density matrixγ]0,Nc and improved energyE]0,Nc.

3.2 Corrections computation

More precisely, we apply first-order perturbation to the analytic family of operators H(β) =HNc+βV^⊥_N_c, where β ∈ R is a parameter, which amounts to considering H(0) = HNc and (3.3) as the unperturbed eigenvalue problem, and H(1) = H and (3.4) as the perturbed eigenvalue problem. Assuming that the eigenvalues are not degenerate, we obtain at first order for the eigenfunctions, and at second order for the eigenvalues,

∀j= 1, . . . , N, φ⁰_j 'φ⁰_j,N_c+φ⁽¹⁾_j,N_c, λ⁰_j 'λj,Nc+λ⁽²⁾_j,N_c, where

φ⁽¹⁾_j,N_c =−

−1

2∆−λj,Nc

−1

rj∈X_N^⊥_c, (3.6)

withrj being the residual rj=

−1

2∆ +V−λj,Nc

φj,Nc= HNc+V^⊥Nc−λj,Nc

φj,Nc=V^⊥Ncφj,Nc, (3.7) and

λ⁽²⁾_j,N_c=hφ⁽¹⁾_j,N_c|rji=−hrj|

−1

2∆−λj,Nc

−1

|rji. (3.8)

We observe that the corrections on the eigenfunctions given in (3.6) are well-defined even ifλj,Ncis degenerate.

We therefore define the perturbed eigenvectors, density matrix, and energy for the general case as follows.

Definition3.1 (Perturbed eigenvectors, eigenvalues, density matrix, and energy). For allNc≥N_c⁰and all j= 1, . . . , N, the perturbed eigenvectors are defined as

φ]j,Nc =φj,Nc+φ⁽¹⁾_j,N_c, the perturbed eigenvalues as

λ]j,Nc =λj,Nc+λ⁽²⁾_j,N_c, the perturbed density matrix as

γ]0,Nc =γ0,Nc+γ_N⁽¹⁾_c,

(14)

with

γ_N⁽¹⁾_c =

N

X

j=1

|φ⁽¹⁾_j,N_cihφj,Nc|+

N

X

j=1

|φj,Ncihφ⁽¹⁾_j,N_c|, (3.9) and the perturbed energy as

E]0,Nc=

N

X

j=1

λ]j,Nc= Tr (γ0,NcHγ]0,Nc). (3.10)

Remark 3.1. Note that even if we call γ]0,Nc a density matrix, γ]0,Nc ∈/K in general. Indeed, γ]0,Nc=γ]0,Nc

∗

and Tr (γ]0,Nc) =N, but we do not have in general 0 ≤γ]0,Nc ≤ 1. Hence, the perturbed energy, which is defined as the sum of the perturbed eigenvalues, is not equal to the energy of the perturbed density matrix, i.e. E]0,Nc 6= Tr (Hγ]0,Nc). Nevertheless definition (3.10) corresponds to the correction on the eigenvalues (λj,Nc)j=1,...,N given by perturbation theory at second-order as in [6], the contribution at first-order on the eigenvalues being zero.

Remark 3.2. As we shall see in Section 5, the quantities φ⁽¹⁾_j,N_c are easily computable.

4 Convergence improvement on the density matrix and the energy

4.1 Main results

The main results of this article are collected in the following theorem.

Theorem 4.1. Under Assumptions 2.1–2.2, there existC∈R+ andN_c⁰∈N, such that for all Nc≥N_c⁰,

k(1−∆)^1/2(γ]0,Nc−γ0)k^S2(L²_#)≤CN_c⁻²k(1−∆)^1/2(γ0,Nc−γ0)k^S2(L²_#), (4.1) and

E]0,Nc−E0

≤CN_c⁻²

E0,Nc−E0

. (4.2)

4.2 Proofs

In order to prove Theorem 4.1, we first provide in Section 4.2.1 a decomposition of γ0 based on spectral projection in Lemma 4.2, relying on Lemma 4.1 for a rigorous justification of the contour integral. In Section 4.2.2, we decompose the differenceγ0−γ]0,Nc into three parts in Lemma 4.3, and we then estimate each of these terms in three of the following Lemmas 4.4, 4.6, and 4.7, relying on an intermediary estimate presented in Lemma 4.5, in order to prove estimate (4.1). Finally, in Section 4.2.3, we provide a proof for estimate (4.2).

4.2.1 Exact density matrix in terms of approximate density matrix

Lemma 4.1. Let Γbe the circle in the complex plane symmetric with respect to the real axis and containing the real numbers λ⁰₁−1 and F. There exists N_c⁰ ∈ N such that for all Nc ≥ N_c⁰, Γ encloses the lowest N eigenvalues of both the operators H andHNc (respectively defined in (2.1) and (3.1)), and none of the higher ones.

Proof. AsH andHNcare self-adjoint operators, their eigenvalues noted respectively (λ⁰_i)i∈N^∗and (λi,Nc)i∈N^∗

(with increasing values and counting multiplicities) are real. From the gap assumption 2.1, and the definition of the Fermi level, we have

∀i= 1, . . . , N, λ⁰_i < F, and ∀i > N, λ⁰_i > F. (4.3)

(15)

The plane-wave discretization being variational, there holds

∀i= 1, . . . ,dim(XNc), λ⁰_i ≤λi,Nc,

and the sequence (λi,Nc)Nc is decreasing. Moreover, classical convergence results (see e.g. [8, Chapter 5]) guarantee that

i=1,...,Nmax |λi,Nc−λ⁰_i| −→_N

c→+∞0.

Therefore, there existsN_c⁰∈Nsuch that for allNc ≥N_c⁰, λN,Nc≤λN,N_c⁰ < F, and the eigenvalues of the Laplace operator onX_N^⊥_c are larger thanλ⁰_N+1> F, so that

∀Nc≥N_c⁰, ∀i= 1, . . . , N, λi,Nc ≤λN,N_c⁰ < F, and ∀i > N, λi,Nc≥λ⁰_N₊₁> F. (4.4) Combining (4.3) and (4.4) concludes the proof of the lemma.

Using the Cauchy residue theorem and functional calculus for self-adjoint operators, the ground-state density matrix ofH can be written as

γ0= 1 2πi

I

Γ

(z−H)⁻¹dz= 1 2πi

I

Γ

z−HNc−V^⊥Nc

−1

dz. (4.5)

Since V∈L^∞_#(Ω), V^⊥_N_c is HNc-bounded, and we can perform a Dyson expansion of (4.5) at second order.

We obtain

γ0= 1 2πi

I

Γ

(z−HNc)⁻¹dz+ 1 2πi

I

Γ

(z−HNc)⁻¹V^⊥Nc(z−HNc)⁻¹dz + 1

2πi I

Γ

(z−H)⁻¹V^⊥Nc(z−HNc)⁻¹V^⊥Nc(z−HNc)⁻¹dz, (4.6) where each term of the right-hand side is well-defined.

Lemma 4.2 (Second order expansion ofγ0). There holds

γ0=γ0,Nc+γ_N⁽¹⁾_c+QgNc, (4.7) whereγ_N⁽¹⁾_c is the finite-rank operator defined in (3.9)and where

QgNc:= 1 2iπ

I

Γ

(z−H)⁻¹V^⊥Nc(z−HNc)⁻¹V^⊥Nc(z−HNc)⁻¹dz. (4.8) Proof. The operatorHNcbeing self-adjoint with compact resolvent, it can be diagonalized in an orthonormal basis. Hence, there exists a sequence (ψk, εk)k≥1with (ψk)k≥1 an orthonormal basis ofL²_#(Ω) consisting of functions ofH_#²(Ω) and (εk)_k≥1a non-decreasing sequence of real numbers such that

∀k≥1, HNcψk=εkψk.

Without loss of generality, we can choose a basis such that, in addition, for k= 1, . . . , N, ψk =φk,Nc and εk=λk,Nc. The operatorHNc can then be written as

HNc=X

k≥1

εk|ψkihψk|.

Let us show that the expansions (4.6) and (4.7) are identical. First, we have γ0,Nc= 1

2πi I

Γ

(z−HNc)⁻¹dz.