L. Chupin, A. Münch, Editors

GREEDY ALGORITHMS FOR HIGH-DIMENSIONAL NON-SYMMETRIC LINEAR PROBLEMS∗,∗∗

E. Cancès¹, V. Ehrlacher¹ and T. Lelièvre¹

Abstract. In this article, we present a family of numerical approaches to solve high-dimensional linear non-symmetric problems. The principle of these methods is to approximate a function which depends on a large number of variates by a sum of tensor product functions, each term of which is iteratively computed via a greedy algorithm [20]. There exists a good theoretical framework for these methods in the case of (linear and nonlinear) symmetric elliptic problems. However, the convergence results are not valid any more as soon as the problems under consideration are not symmetric. We present here a review of the main algorithms proposed in the literature to circumvent this difficulty, together with some new approaches. The theoretical convergence results and the practical implementation of these algorithms are discussed. Their behaviors are illustrated through some numerical examples.

Résumé. In this article, we present a family of numerical methods for solving high-dimensional non-symmetric linear problems. The principle of these approaches is to represent a function depending on a large number of variables as a sum of tensor product functions, each term of which is computed iteratively via a greedy algorithm [20]. These methods enjoy good theoretical properties in the case of symmetric elliptic problems (linear or nonlinear), but these properties no longer hold once the problems under consideration are not symmetric. We present a review of the main algorithms proposed in the literature to circumvent this difficulty, together with new approaches that we propose. The theoretical convergence results and the practical implementation of these algorithms are detailed, and their behavior is illustrated through numerical examples.

Introduction

High-dimensional problems arise in a wide range of fields such as quantum chemistry, molecular dynamics, uncertainty quantification, polymeric fluids and finance. In all these contexts, one wishes to approximate a function $u$ depending on $d$ variates $x_1, \dots, x_d$, where $d \in \mathbb{N}^*$ is typically very large. Classically, the function $u$ is defined as the solution of a Partial Differential Equation (PDE) and cannot be obtained by standard approximation techniques such as Galerkin methods. Indeed, let us consider a discretization basis with $N$ degrees of freedom for each variate ($N \in \mathbb{N}^*$), so that the discretization space is given by

$$V_N := \mathrm{Span}\left\{ \psi^{(1)}_{i_1}(x_1) \cdots \psi^{(d)}_{i_d}(x_d), \quad 1 \le i_1, \dots, i_d \le N \right\},$$

∗Funding from the Michelin company is acknowledged.

∗∗ Virginie Ehrlacher would like to thank Kathrin Smetana for very interesting discussions about symmetric formulations of non-symmetric problems.

¹ Université Paris Est, CERMICS, projet MICMAC, Ecole des Ponts Paristech - INRIA, 6 & 8 avenue Blaise Pascal, 77455 Marne-la-Vallée Cedex 2, France; e-mail: [email protected] & [email protected] & [email protected]

© EDP Sciences, SMAI 2013

Article published online by EDP Sciences and available at http://www.esaim-proc.org or http://dx.doi.org/10.1051/proc/201341005


where for all $1 \le j \le d$, $\left(\psi^{(j)}_i\right)_{1 \le i \le N}$ is a family of $N$ functions which only depend on the variate $x_j$. A Galerkin method consists in representing the solution $u$ of the initial PDE as

$$u(x_1, \dots, x_d) \approx \sum_{1 \le i_1, \dots, i_d \le N} \lambda_{i_1, \dots, i_d}\, \psi^{(1)}_{i_1}(x_1) \cdots \psi^{(d)}_{i_d}(x_d),$$

and computing the set of $N^d$ real numbers $(\lambda_{i_1, \dots, i_d})_{1 \le i_1, \dots, i_d \le N}$. Thus, the size of the finite-dimensional problem to solve grows exponentially with the number of variates involved in the problem. Such methods cannot be implemented when $d$ is too large: this is the so-called curse of dimensionality [2].

Several approaches have recently been proposed in order to circumvent this difficulty. Let us mention among others sparse grids [21], tensor formats [11], reduced bases [4] and adaptive polynomial approximations [6].

In this paper, we will focus on a particular set of methods, originally introduced by Ladevèze et al. to perform time-space variable separation [12], Chinesta et al. to solve high-dimensional Fokker-Planck equations in the context of kinetic models for polymers [1] and Nouy in the context of uncertainty quantification [15], under the name of Progressive Generalized Decomposition (PGD) methods.

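Let us assume that each variate $x_j$ belongs to a subset $\mathcal{X}_j$ of $\mathbb{R}^{m_j}$, where $m_j \in \mathbb{N}^*$ for all $1 \le j \le d$. For each $d$-tuple $(r^{(1)}, \dots, r^{(d)})$ of functions such that $r^{(j)}$ only depends on $x_j$ for all $1 \le j \le d$, we call a tensor product function and denote by $r^{(1)} \otimes \cdots \otimes r^{(d)}$ the function which depends on all the variates $x_1, \dots, x_d$ and is defined by

$$r^{(1)} \otimes \cdots \otimes r^{(d)} : \left\{ \begin{array}{ccc} \mathcal{X}_1 \times \cdots \times \mathcal{X}_d & \to & \mathbb{R} \\ (x_1, \dots, x_d) & \mapsto & r^{(1)}(x_1) \cdots r^{(d)}(x_d). \end{array} \right.$$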

The approach of Ladevèze, Chinesta, Nouy and coauthors consists in approximating the function $u$ by a separate variable decomposition, i.e.

$$u(x_1, \dots, x_d) \approx \sum_{k=1}^{n} r^{(1)}_k(x_1) \cdots r^{(d)}_k(x_d) = \sum_{k=1}^{n} \left( r^{(1)}_k \otimes \cdots \otimes r^{(d)}_k \right)(x_1, \dots, x_d), \tag{1}$$

for some $n \in \mathbb{N}^*$. In the above sum, each term is a tensor product function. Each $d$-tuple of functions $\left(r^{(1)}_k, \dots, r^{(d)}_k\right)$ is iteratively computed in a greedy way [20]: once the first $k$ terms in the sum (1) have been computed, they are fixed, and the $(k+1)$-th term is obtained as the next best tensor product function to approximate the solution. This will be made precise below.

Thus, the algorithm consists in solving several low-dimensional problems whose dimensions scale linearly with the number of variates, and it can be applied when classical methods cannot. In this case, if we use a discretization basis with $N$ degrees of freedom per variate as above, the size of the discretized problems involved in the computation of a $d$-tuple $\left(r^{(1)}_k, \dots, r^{(d)}_k\right)$ is of the order of $Nd$ and the total size of the discretization problems is $nNd$.
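As a rough illustration of these orders of magnitude, the following Python sketch compares the number of unknowns of the full tensor-product Galerkin discretization, $N^d$, with the total size $nNd$ quoted above for $n$ greedy terms. The function names and the sample values of $N$, $d$ and $n$ are illustrative assumptions, not taken from the article.

```python
def full_galerkin_size(N: int, d: int) -> int:
    """Number of coefficients lambda_{i_1,...,i_d} in the full tensor-product basis."""
    return N ** d


def greedy_total_size(N: int, d: int, n: int) -> int:
    """Total size n*N*d of the low-dimensional problems for n tensor-product terms."""
    return n * N * d


# Hypothetical sample values: N = 10 degrees of freedom per variate, n = 50 terms.
for d in (2, 5, 10, 20):
    print(d, full_galerkin_size(10, d), greedy_total_size(10, d, 50))
```

For small $d$ the full discretization may well be the cheaper of the two; the separated representation only pays off once $N^d$ exceeds $nNd$.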

This numerical strategy has been extensively studied for the resolution of (linear or nonlinear) elliptic problems [5, 10, 13, 18, 20]. More precisely, let $u$ be defined as the unique solution of a minimization problem of the form

$$u = \mathop{\mathrm{argmin}}_{v \in V} \mathcal{E}(v), \tag{2}$$

where $V$ is a reflexive Banach space of functions depending on the $d$ variates $x_1, \dots, x_d$, and $\mathcal{E} : V \to \mathbb{R}$ is a coercive real-valued energy functional. Besides, for all $1 \le j \le d$, let $V_{x_j}$ be a reflexive Banach space of functions which only depend on the variate $x_j$. The standard greedy algorithm reads:

(1) set $u_0 = 0$ and $n = 1$;

(2) find $\left(r^{(1)}_n, \dots, r^{(d)}_n\right) \in V_{x_1} \times \cdots \times V_{x_d}$ such that

$$\left(r^{(1)}_n, \dots, r^{(d)}_n\right) \in \mathop{\mathrm{argmin}}_{(r^{(1)}, \dots, r^{(d)}) \in V_{x_1} \times \cdots \times V_{x_d}} \mathcal{E}\left(u_{n-1} + r^{(1)} \otimes \cdots \otimes r^{(d)}\right);$$

(3) set $u_n = u_{n-1} + r^{(1)}_n \otimes \cdots \otimes r^{(d)}_n$ and $n = n + 1$.

Under some natural assumptions on the spaces $V$, $V_{x_1}, \dots, V_{x_d}$ and the energy functional $\mathcal{E}$, all the iterations of the greedy algorithm are well-defined and the sequence $(u_n)_{n \in \mathbb{N}^*}$ strongly converges in $V$ towards the solution $u$ of the original minimization problem (2).

This result holds in particular when $u$ is defined as the unique solution of

$$\text{find } u \in V \text{ such that } \forall v \in V, \quad a(u, v) = l(v),$$

where $V$ is a Hilbert space, $a$ is a symmetric continuous coercive bilinear form on $V \times V$ and $l$ is a continuous linear form on $V$. In this case, $u$ is equivalently the solution of a minimization problem of the form (2) with $\mathcal{E}(v) = \frac{1}{2} a(v, v) - l(v)$ for all $v \in V$.

However, there is no obvious generalization of the iterative algorithm presented above when the function $u$ is not defined as the solution of a minimization problem of the form (2). This situation typically occurs when $u$ is defined as the solution of a non-symmetric linear problem

$$\text{find } u \in V \text{ such that } \forall v \in V, \quad a(u, v) = l(v),$$

where $a$ is a non-symmetric continuous bilinear form on $V \times V$ and $l$ is a continuous linear form on $V$. The aim of this article is to give an overview of the state of the art of the numerical methods based on the greedy iterative approach used in this non-symmetric linear context and of the remaining open questions concerning this issue. In Section 1, we present the standard greedy algorithm for the resolution of symmetric coercive high-dimensional problems and the theoretical convergence results proved in this setting. Section 2 explains why a naive transposition of this algorithm for non-symmetric problems is doomed to failure and motivates the need for more subtle approaches. Section 3 describes the provably converging algorithms existing in the literature for non-symmetric problems. All of them consist in symmetrizing the original non-symmetric problem by minimizing the residual of the equation in a well-chosen norm. However, depending on the choice of the norm, either the conditioning of the discretized problems may behave badly or several intermediate problems may have to be solved online, which leads to a significant increase of simulation times and memory needs compared to the original algorithm in a symmetric linear coercive case. So far, there are no methods avoiding these two problems and for which there are theoretical convergence results in the general case. In Section 4, we present some existing algorithms designed by Nouy [16] and Lozinski [14] to circumvent these difficulties and the partial theoretical results which are known for these algorithms. Section 5 is concerned with another algorithm we propose, for which some partial convergence results are proved. In Section 6, the behaviors of the different algorithms presented here are illustrated on simple toy numerical examples. Lastly, we present in the Appendix some possible tracks to design other methods, but for which further work is needed.

1. The symmetric coercive case

1.1. Notation

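Let us first introduce some notation. Let $d$ be a positive integer, $m_1, \dots, m_d$ positive integers and $\mathcal{X}_1, \dots, \mathcal{X}_d$ open subsets of $\mathbb{R}^{m_1}, \dots, \mathbb{R}^{m_d}$ respectively.

Let $\mu_{x_1}, \dots, \mu_{x_d}$ denote measures on $\mathcal{X}_1, \dots, \mathcal{X}_d$ respectively. Let $L^2(\mathcal{X}_1; \mu_{x_1}), \dots, L^2(\mathcal{X}_d; \mu_{x_d})$ be associated $L^2$ spaces, i.e. vector spaces which are complete when endowed with the scalar products

$$\forall f, g \in L^2(\mathcal{X}_j; \mu_{x_j}), \quad \langle f, g \rangle_{\mathcal{X}_j} := \int_{\mathcal{X}_j} f(x_j)\, g(x_j)\, \mu_{x_j}(dx_j), \quad \forall 1 \le j \le d,$$

and their associated norms $\|\cdot\|_{\mathcal{X}_1}, \dots, \|\cdot\|_{\mathcal{X}_d}$. For instance, in the case when $\mathcal{X}_1 = (0,1)$ and $\mu_{x_1}$ is the standard Lebesgue measure on $\mathcal{X}_1$, the spaces $L^2(0,1)$, $L^2_{\mathrm{per}}(0,1)$ and $L^2_0(0,1) := \left\{ f \in L^2(0,1), \ \int_0^1 f = 0 \right\}$ are examples of such $L^2$ spaces.

In the rest of this article, for the sake of simplicity, we will omit the reference to the measures $\mu_{x_1}, \dots, \mu_{x_d}$ and denote $L^2(\mathcal{X}_1) = L^2(\mathcal{X}_1; \mu_{x_1}), \dots, L^2(\mathcal{X}_d) = L^2(\mathcal{X}_d; \mu_{x_d})$.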

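We introduce the space $L^2(\mathcal{X}_1 \times \cdots \times \mathcal{X}_d) := L^2(\mathcal{X}_1) \otimes \cdots \otimes L^2(\mathcal{X}_d)$. This space is a Hilbert space when endowed with the natural scalar product

$$\forall f, g \in L^2(\mathcal{X}_1 \times \cdots \times \mathcal{X}_d), \quad \langle f, g \rangle := \int_{\mathcal{X}_1 \times \cdots \times \mathcal{X}_d} f(x_1, \dots, x_d)\, g(x_1, \dots, x_d)\, \mu_{x_1}(dx_1) \cdots \mu_{x_d}(dx_d),$$

and the associated norm is denoted by $\|\cdot\|_{\mathcal{X}_1 \times \cdots \times \mathcal{X}_d}$.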

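Let $V \subset L^2(\mathcal{X}_1 \times \cdots \times \mathcal{X}_d)$, $V_{x_1} \subset L^2(\mathcal{X}_1)$, ..., $V_{x_d} \subset L^2(\mathcal{X}_d)$ be Hilbert spaces endowed respectively with scalar products denoted by $\langle \cdot, \cdot \rangle_V$, $\langle \cdot, \cdot \rangle_{V_{x_1}}$, ..., $\langle \cdot, \cdot \rangle_{V_{x_d}}$ and associated norms $\|\cdot\|_V$, $\|\cdot\|_{V_{x_1}}$, ..., $\|\cdot\|_{V_{x_d}}$.

We define $V'$, $V'_{x_1}$, ..., $V'_{x_d}$ as the dual spaces of $V$, $V_{x_1}$, ..., $V_{x_d}$ with respect to the $L^2$ scalar products $\langle \cdot, \cdot \rangle$, $\langle \cdot, \cdot \rangle_{\mathcal{X}_1}$, ..., $\langle \cdot, \cdot \rangle_{\mathcal{X}_d}$. These dual spaces are endowed with their natural norms $\|\cdot\|_{V'}$, etc.

Lastly, the Riesz operator $R_V : V \to V'$ is defined by

$$\forall v, w \in V, \quad \langle v, w \rangle_V = \langle R_V v, w \rangle_{V', V}.$$

It holds in particular that $\|v\|_V = \|R_V v\|_{V'}$. Similar operators $R_{V_{x_1}}, \dots, R_{V_{x_d}}$ are introduced for the spaces $V_{x_1}, \dots, V_{x_d}$.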

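For any $d$-tuple $\left(r^{(1)}, \dots, r^{(d)}\right) \in V_{x_1} \times \cdots \times V_{x_d}$, we define the tensor product function $r^{(1)} \otimes \cdots \otimes r^{(d)}$ as follows:

$$r^{(1)} \otimes \cdots \otimes r^{(d)} : \left\{ \begin{array}{ccc} \mathcal{X}_1 \times \cdots \times \mathcal{X}_d & \to & \mathbb{R} \\ (x_1, \dots, x_d) & \mapsto & r^{(1)}(x_1) \cdots r^{(d)}(x_d). \end{array} \right.$$

In the particular case when $d = 2$, we shall denote respectively $x_1$, $\mathcal{X}_1$, $m_1$, $V_{x_1}$ by $x$, $\mathcal{X}$, $m_x$, $V_x$ and $x_2$, $\mathcal{X}_2$, $m_2$, $V_{x_2}$ by $t$, $\mathcal{T}$, $m_t$, $V_t$.

Besides, for any Banach spaces $H_1$, $H_2$, the space of bounded linear operators from $H_1$ to $H_2$ will be denoted by $\mathcal{L}(H_1, H_2)$.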

1.2. Theoretical results

We recall here the theoretical framework of the standard greedy algorithm in the coercive symmetric case.

Let us consider the problem

$$\text{find } u \in V \text{ such that } \forall v \in V, \quad a(u, v) = l(v), \tag{3}$$

where

• $a(\cdot, \cdot)$ is a symmetric, coercive, continuous bilinear form on $V \times V$;

• $l$ is a continuous linear form on $V$.

Then, problem (3) is equivalent to the minimization problem

$$u = \mathop{\mathrm{argmin}}_{v \in V} \mathcal{E}(v), \tag{4}$$

where

$$\forall v \in V, \quad \mathcal{E}(v) := \frac{1}{2} a(v, v) - l(v). \tag{5}$$
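The equivalence between (3) and (4) follows from a one-line expansion, recalled here as a worked step (a standard argument, written out under the symmetry and coercivity assumptions above, with the energy functional of (5) denoted by $\mathcal{E}$). For any $u, v \in V$:

```latex
\mathcal{E}(u+v) - \mathcal{E}(u)
  = \frac{1}{2}\,a(u+v,\,u+v) - l(u+v) - \frac{1}{2}\,a(u,u) + l(u)
  = \bigl[\,a(u,v) - l(v)\,\bigr] + \frac{1}{2}\,a(v,v),
```

where the symmetry of $a$ was used to write $a(u,v) + a(v,u) = 2a(u,v)$. If $u$ solves (3), the bracket vanishes and coercivity gives $\mathcal{E}(u+v) > \mathcal{E}(u)$ for all $v \neq 0$, so that $u$ is the unique minimizer. Conversely, if $u$ is a minimizer, replacing $v$ by $\varepsilon v$ and letting $\varepsilon \to 0$ shows that the bracket must vanish for all $v \in V$.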

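The greedy algorithm reads:

(1) let $u_0 = 0$ and $n = 1$;

(2) define $\left(r^{(1)}_n, \dots, r^{(d)}_n\right) \in V_{x_1} \times \cdots \times V_{x_d}$ such that

$$\left(r^{(1)}_n, \dots, r^{(d)}_n\right) \in \mathop{\mathrm{argmin}}_{(r^{(1)}, \dots, r^{(d)}) \in V_{x_1} \times \cdots \times V_{x_d}} \mathcal{E}\left(u_{n-1} + r^{(1)} \otimes \cdots \otimes r^{(d)}\right); \tag{6}$$

(3) define $u_n = u_{n-1} + r^{(1)}_n \otimes \cdots \otimes r^{(d)}_n$ and set $n = n + 1$.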

Let us denote by

$$\Sigma := \left\{ r^{(1)} \otimes \cdots \otimes r^{(d)}, \quad r^{(1)} \in V_{x_1}, \dots, r^{(d)} \in V_{x_d} \right\} \tag{7}$$

and make the following assumptions:

(A1) $\overline{\mathrm{Span}(\Sigma)}^{V} = V$;

(A2) $\Sigma$ is weakly closed in $V$.

These assumptions are usually satisfied in the case of classical Sobolev spaces [5].

Theorem 1.1. Assume that (A1) and (A2) are satisfied. Then, for all $n \in \mathbb{N}^*$, there exists at least one solution $\left(r^{(1)}_n, \dots, r^{(d)}_n\right) \in V_{x_1} \times \cdots \times V_{x_d}$ (not necessarily unique) to (6), and any solution satisfies $r^{(1)}_n \otimes \cdots \otimes r^{(d)}_n \neq 0$ if and only if $u_{n-1} \neq u$. Besides, the sequence $(u_n)_{n \in \mathbb{N}^*}$ strongly converges towards $u$ in $V$.

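The following Lemma will be used later. Although the proof is given in [10], we recall it here for the sake of completeness.

Lemma 1.1. For all $v \in V$, let us denote by $\|v\|_a := \sqrt{a(v, v)}$. Then, for all $n \in \mathbb{N}^*$,

$$\left\| r^{(1)}_n \otimes \cdots \otimes r^{(d)}_n \right\|_a = \sup_{\substack{(r^{(1)}, \dots, r^{(d)}) \in V_{x_1} \times \cdots \times V_{x_d} \\ r^{(1)} \otimes \cdots \otimes r^{(d)} \neq 0}} \frac{a\left(u - u_{n-1},\ r^{(1)} \otimes \cdots \otimes r^{(d)}\right)}{\left\| r^{(1)} \otimes \cdots \otimes r^{(d)} \right\|_a}. \tag{8}$$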

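Proof. Let us prove (8) for $n = 1$. The proof is similar for larger $n \in \mathbb{N}^*$. The $d$-tuple $\left(r^{(1)}_1, \dots, r^{(d)}_1\right) \in V_{x_1} \times \cdots \times V_{x_d}$ solution of (6) for $n = 1$ equivalently satisfies:

$$\left(r^{(1)}_1, \dots, r^{(d)}_1\right) \in \mathop{\mathrm{argmin}}_{(r^{(1)}, \dots, r^{(d)}) \in V_{x_1} \times \cdots \times V_{x_d}} \frac{1}{2} \left\| u - r^{(1)} \otimes \cdots \otimes r^{(d)} \right\|_a^2. \tag{9}$$

The Euler equations associated to this minimization problem read: for all $\left(\delta r^{(1)}, \dots, \delta r^{(d)}\right) \in V_{x_1} \times \cdots \times V_{x_d}$,

$$\begin{aligned} & a\left( r^{(1)}_1 \otimes \cdots \otimes r^{(d)}_1,\ r^{(1)}_1 \otimes \cdots \otimes r^{(d-1)}_1 \otimes \delta r^{(d)} + r^{(1)}_1 \otimes \cdots \otimes r^{(d-2)}_1 \otimes \delta r^{(d-1)} \otimes r^{(d)}_1 + \cdots + \delta r^{(1)} \otimes r^{(2)}_1 \otimes \cdots \otimes r^{(d)}_1 \right) \\ & \quad = a\left( u,\ r^{(1)}_1 \otimes \cdots \otimes r^{(d-1)}_1 \otimes \delta r^{(d)} + r^{(1)}_1 \otimes \cdots \otimes r^{(d-2)}_1 \otimes \delta r^{(d-1)} \otimes r^{(d)}_1 + \cdots + \delta r^{(1)} \otimes r^{(2)}_1 \otimes \cdots \otimes r^{(d)}_1 \right), \end{aligned}$$

which implies that

$$\left\| r^{(1)}_1 \otimes \cdots \otimes r^{(d)}_1 \right\|_a^2 = a\left( u,\ r^{(1)}_1 \otimes \cdots \otimes r^{(d)}_1 \right). \tag{10}$$

Let now $\left(r^{(1)}, \dots, r^{(d)}\right) \in V_{x_1} \times \cdots \times V_{x_d}$ be such that $r^{(1)} \otimes \cdots \otimes r^{(d)} \neq 0$. Using (9) and (10), it holds that

$$\left\| u - \frac{a\left(u,\ r^{(1)}_1 \otimes \cdots \otimes r^{(d)}_1\right)}{\left\| r^{(1)}_1 \otimes \cdots \otimes r^{(d)}_1 \right\|_a^2}\, r^{(1)}_1 \otimes \cdots \otimes r^{(d)}_1 \right\|_a^2 = \left\| u - r^{(1)}_1 \otimes \cdots \otimes r^{(d)}_1 \right\|_a^2 \le \left\| u - \frac{a\left(u,\ r^{(1)} \otimes \cdots \otimes r^{(d)}\right)}{\left\| r^{(1)} \otimes \cdots \otimes r^{(d)} \right\|_a^2}\, r^{(1)} \otimes \cdots \otimes r^{(d)} \right\|_a^2.$$

Therefore,

$$\frac{a\left(u,\ r^{(1)}_1 \otimes \cdots \otimes r^{(d)}_1\right)^2}{\left\| r^{(1)}_1 \otimes \cdots \otimes r^{(d)}_1 \right\|_a^2} \ge \frac{a\left(u,\ r^{(1)} \otimes \cdots \otimes r^{(d)}\right)^2}{\left\| r^{(1)} \otimes \cdots \otimes r^{(d)} \right\|_a^2}.$$

Taking the supremum over all $\left(r^{(1)}, \dots, r^{(d)}\right) \in V_{x_1} \times \cdots \times V_{x_d}$ such that $r^{(1)} \otimes \cdots \otimes r^{(d)} \neq 0$ yields the result.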

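Equation (8) implies in particular that for all $n \in \mathbb{N}^*$,

$$\left\| r^{(1)}_n \otimes \cdots \otimes r^{(d)}_n \right\|_a = \sup_{\substack{(r^{(1)}, \dots, r^{(d)}) \in V_{x_1} \times \cdots \times V_{x_d} \\ r^{(1)} \otimes \cdots \otimes r^{(d)} \neq 0}} \frac{l\left(r^{(1)} \otimes \cdots \otimes r^{(d)}\right) - a\left(u_{n-1},\ r^{(1)} \otimes \cdots \otimes r^{(d)}\right)}{\left\| r^{(1)} \otimes \cdots \otimes r^{(d)} \right\|_a}. \tag{11}$$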

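Let us rewrite the greedy algorithm in the particular case when $d = 2$.

(1) Let $u_0 = 0$ and $n = 1$;

(2) define $(r_n, s_n) \in V_x \times V_t$ such that

$$(r_n, s_n) \in \mathop{\mathrm{argmin}}_{(r, s) \in V_x \times V_t} \mathcal{E}(u_{n-1} + r \otimes s); \tag{12}$$

(3) define $u_n = u_{n-1} + r_n \otimes s_n$ and set $n = n + 1$.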

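For the sake of simplicity, in the rest of the article, all the algorithms will be presented in the case when $d = 2$. The generalization of the approaches to a larger number of variates $d$ is straightforward unless mentioned otherwise.

The Euler equations associated to the minimization problem (12) read

$$a(u_{n-1} + r_n \otimes s_n,\ \delta r \otimes s_n + r_n \otimes \delta s) = l(\delta r \otimes s_n + r_n \otimes \delta s), \quad \forall (\delta r, \delta s) \in V_x \times V_t. \tag{13}$$

As a consequence of Theorem 1.1, provided that the set

$$\Sigma = \{ r \otimes s, \ r \in V_x, \ s \in V_t \} \tag{14}$$

satisfies assumptions (A1) and (A2), at the first iteration of the algorithm ($n = 1$), as soon as the form $l$ is nonzero, there exists at least one solution $(r_1, s_1) \in V_x \times V_t$ of

$$a(r_1 \otimes s_1,\ \delta r \otimes s_1 + r_1 \otimes \delta s) = l(\delta r \otimes s_1 + r_1 \otimes \delta s), \quad \forall (\delta r, \delta s) \in V_x \times V_t,$$

such that $r_1 \otimes s_1 \neq 0$.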

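In practice, at each iteration $n \in \mathbb{N}^*$, a pair $(r_n, s_n) \in V_x \times V_t$ is computed via the resolution of the Euler equations (13) using a fixed-point procedure which reads as follows:

• choose $\left(r^{(0)}_n, s^{(0)}_n\right) \in V_x \times V_t$ and set $m = 1$;

• find $\left(r^{(m)}_n, s^{(m)}_n\right) \in V_x \times V_t$ such that

$$\left\{ \begin{array}{ll} a\left(u_{n-1} + r^{(m)}_n \otimes s^{(m-1)}_n,\ \delta r \otimes s^{(m-1)}_n\right) = l\left(\delta r \otimes s^{(m-1)}_n\right), & \forall \delta r \in V_x, \\ a\left(u_{n-1} + r^{(m)}_n \otimes s^{(m)}_n,\ r^{(m)}_n \otimes \delta s\right) = l\left(r^{(m)}_n \otimes \delta s\right), & \forall \delta s \in V_t; \end{array} \right. \tag{15}$$

• set $m = m + 1$.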

This fixed-point algorithm is numerically observed to converge exponentially fast in most situations, although, at least to our knowledge, there is no rigorous proof in the general case.
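The greedy loop and the fixed-point procedure (15) can be sketched in a few lines of Python for a simple discrete model. The model is an assumption made only for illustration: functions on $\mathcal{X} \times \mathcal{T}$ are replaced by matrices $U = (U_{ij})$, the bilinear form is $a(U, V) = \sum_{i,j} (d^x_i + d^t_j) U_{ij} V_{ij}$ with positive weights (as if two symmetric positive operators had been diagonalized), and $l(V) = \sum_{i,j} F_{ij} V_{ij}$. With this choice, each equation of (15) reduces to an explicit formula. All names (`greedy_fixed_point`, `dx`, `dt`, `F`) are hypothetical.

```python
def energy(U, F, dx, dt):
    """E(U) = 1/2 a(U, U) - l(U) with a(U, V) = sum_ij (dx[i] + dt[j]) U_ij V_ij."""
    return sum(
        0.5 * (dx[i] + dt[j]) * U[i][j] ** 2 - F[i][j] * U[i][j]
        for i in range(len(dx))
        for j in range(len(dt))
    )


def greedy_fixed_point(F, dx, dt, n_terms=8, n_sweeps=30):
    """Greedy algorithm (d = 2); each term is computed by the alternating scheme (15)."""
    nx, nt = len(dx), len(dt)
    U = [[0.0] * nt for _ in range(nx)]
    energies = [energy(U, F, dx, dt)]
    for _ in range(n_terms):
        # Entrywise representation of the residual l(.) - a(u_{n-1}, .).
        res = [[F[i][j] - (dx[i] + dt[j]) * U[i][j] for j in range(nt)]
               for i in range(nx)]
        r, s = [0.0] * nx, [1.0] * nt  # initial guess s^(0) (arbitrary choice)
        for _ in range(n_sweeps):
            # First equation of (15): s is frozen, solve for r.
            ss = sum(v * v for v in s)
            s_dt_s = sum(dt[j] * s[j] ** 2 for j in range(nt))
            r = [sum(res[i][j] * s[j] for j in range(nt)) / (dx[i] * ss + s_dt_s)
                 for i in range(nx)]
            # Second equation of (15): r is frozen, solve for s.
            rr = sum(v * v for v in r)
            if rr == 0.0:  # the residual is orthogonal to the guess: nothing to add
                break
            r_dx_r = sum(dx[i] * r[i] ** 2 for i in range(nx))
            s = [sum(res[i][j] * r[i] for i in range(nx)) / (dt[j] * rr + r_dx_r)
                 for j in range(nt)]
        for i in range(nx):
            for j in range(nt):
                U[i][j] += r[i] * s[j]
        energies.append(energy(U, F, dx, dt))
    return U, energies


# Hypothetical data: eigenvalue-like weights and a constant right-hand side.
dx = [float(i * i) for i in range(1, 7)]
dt = [float(j * j) for j in range(1, 7)]
F = [[1.0] * 6 for _ in range(6)]
U, energies = greedy_fixed_point(F, dx, dt)
exact_energy = -0.5 * sum(F[i][j] ** 2 / (dx[i] + dt[j])
                          for i in range(6) for j in range(6))
print(energies[-1], exact_energy)
```

In this model the exact solution is $U_{ij} = F_{ij} / (d^x_i + d^t_j)$, which is not of low rank even though $F$ is, and the energy of the greedy iterates decreases monotonically towards the exact minimum. A fixed number of sweeps is used instead of a stopping criterion to keep the sketch short.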

2. The non-symmetric case

2.1. General framework

Let us now consider the case of a non-symmetric linear problem of the form

$$\text{find } u \in V \text{ such that } \forall v \in V, \quad a(u, v) = l(v), \tag{16}$$

where

• $a(\cdot, \cdot)$ is a non-symmetric continuous bilinear form on $V \times V$;

• $l$ is a continuous linear form on $V$.

In the rest of the article, we will assume that

(A3) problem (16) has a unique solution $u \in V$ for any continuous linear form $l \in \mathcal{L}(V, \mathbb{R})$.

We denote by $\mathcal{A} \in \mathcal{L}(V, V)$ the operator defined by

$$\forall v, w \in V, \quad \langle \mathcal{A} v, w \rangle_V = a(v, w),$$

and by $\mathcal{L}$ the element of $V$ such that

$$\forall v \in V, \quad \langle \mathcal{L}, v \rangle_V = l(v).$$

We also introduce the operator $A : V \to V'$ and the linear form $L \in V'$ defined by $A = R_V \mathcal{A}$ and $L = R_V \mathcal{L}$, so that the unique solution $u$ to (16) is also the unique solution to the problem

$$\text{find } u \in V \text{ such that } Au = L \text{ in } V'.$$

It follows from assumption (A3) that $A$ and $\mathcal{A}$ are invertible operators.

2.2. Prototypical examples

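Let us present two prototypical examples we will refer to throughout the rest of the paper.

• The first one is

$$\text{find } u \in H^1_0(\mathcal{X}) \otimes L^2(\mathcal{T}) \text{ such that } -\Delta_x u + b_x \cdot \nabla_x u + u = f \text{ in } \mathcal{D}'(\mathcal{X} \times \mathcal{T}), \tag{17}$$

with $f \in H^{-1}(\mathcal{X}) \otimes L^2(\mathcal{T})$ and $b_x \in \mathbb{R}^{m_x}$. For this problem, $V = H^1_0(\mathcal{X}) \otimes L^2(\mathcal{T})$, $V' = H^{-1}(\mathcal{X}) \otimes L^2(\mathcal{T})$ and

$$\forall u, v \in V, \quad a(u, v) = \int_{\mathcal{X} \times \mathcal{T}} \left( \nabla_x u \cdot \nabla_x v + v\, (b_x \cdot \nabla_x u) + uv \right), \qquad \forall v \in V, \quad l(v) = \int_{\mathcal{T}} \langle f, v \rangle_{H^{-1}(\mathcal{X}), H^1_0(\mathcal{X})}.$$

In this case, $A = -\Delta_x + b_x \cdot \nabla_x + 1$.

• The second example is

$$\text{find } u \in H^1_0(\mathcal{X} \times \mathcal{T}) \text{ such that } -\Delta_{x,t} u + b \cdot \nabla_{x,t} u + u = f \text{ in } \mathcal{D}'(\mathcal{X} \times \mathcal{T}), \tag{18}$$

with $f \in H^{-1}(\mathcal{X} \times \mathcal{T})$ and $b = (b_x, b_t) \in \mathbb{R}^{m_x} \times \mathbb{R}^{m_t}$. For this problem, $V = H^1_0(\mathcal{X} \times \mathcal{T})$, $V' = H^{-1}(\mathcal{X} \times \mathcal{T})$ and

$$\forall u, v \in V, \quad a(u, v) = \int_{\mathcal{X} \times \mathcal{T}} \left( \nabla_{x,t} u \cdot \nabla_{x,t} v + v\, (b \cdot \nabla_{x,t} u) + uv \right), \qquad \forall v \in V, \quad l(v) = \langle f, v \rangle_{H^{-1}(\mathcal{X} \times \mathcal{T}), H^1_0(\mathcal{X} \times \mathcal{T})}.$$

In this case, $A = -\Delta_{x,t} + b \cdot \nabla_{x,t} + 1$.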

2.3. Failure of the standard greedy algorithm

Problem (16) cannot be written as a minimization problem of the form (4) with an energy functional given by (5). The definition of the greedy algorithm via the minimization problems (6) or (12) cannot therefore be transposed to this case. However, a natural way to define the iterations of a greedy algorithm for the non-symmetric problem (16) is to define iteratively for $n \in \mathbb{N}^*$ the pair $(r_n, s_n) \in V_x \times V_t$ as a solution of the following equation

$$a(u_{n-1} + r_n \otimes s_n,\ \delta r \otimes s_n + r_n \otimes \delta s) = l(\delta r \otimes s_n + r_n \otimes \delta s), \quad \forall (\delta r, \delta s) \in V_x \times V_t, \tag{19}$$

by analogy with the Euler equations (13). This is the so-called PGD-Galerkin algorithm [3].

Actually, there are cases when $l \neq 0$ and any solution $(r_1, s_1) \in V_x \times V_t$ of the first iteration of the algorithm

$$a(r_1 \otimes s_1,\ \delta r \otimes s_1 + r_1 \otimes \delta s) = l(\delta r \otimes s_1 + r_1 \otimes \delta s), \quad \forall (\delta r, \delta s) \in V_x \times V_t, \tag{20}$$

necessarily satisfies $r_1 \otimes s_1 = 0$. Such an algorithm cannot converge since the approximation $u_n = \sum_{k=1}^{n} r_k \otimes s_k$ given by the algorithm is equal to 0 for any $n \in \mathbb{N}^*$. Besides, this situation may occur even when the norm of the antisymmetric part of the bilinear form $a(\cdot, \cdot)$ is arbitrarily small.

Let us give an explicit example.

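Example 2.1. Let $\mathcal{X} = \mathcal{T} = (-1, 1)$ and $\mu_x$ (respectively $\mu_t$) be the Lebesgue measure on $\mathcal{X}$ (respectively on $\mathcal{T}$). Let $b \in \mathbb{R}$, $V_x = H^1_{\mathrm{per}}(-1, 1)$, $V_t = L^2(-1, 1)$ and $V = V_x \otimes V_t$. Consider the non-symmetric problem (16) with

$$\forall v, w \in V, \quad a(v, w) = \int_{\mathcal{X} \times \mathcal{T}} \left( \nabla_x v \cdot \nabla_x w + (b \cdot \nabla_x v)\, w + vw \right),$$

and

$$\forall v \in V, \quad l(v) = \int_{\mathcal{X} \times \mathcal{T}} f v,$$

with $f \in L^2_{\mathrm{per}}(-1, 1) \otimes L^2(-1, 1)$.

Problem (16) is equivalent to

$$\text{find } u \in H^1_{\mathrm{per}}(-1, 1) \otimes L^2(-1, 1) \text{ such that } -\Delta_x u + b \nabla_x u + u = f \text{ in } \mathcal{D}'(\mathbb{R} \times \mathcal{T}). \tag{21}$$

In this context, equations (20) read

$$\left\{ \begin{array}{l} \text{find } (r_1, s_1) \in H^1_{\mathrm{per}}(-1, 1) \times L^2(-1, 1) \text{ such that} \\[4pt] \left[ \displaystyle\int_{-1}^{1} |s_1(t)|^2\, dt \right] \left( -r_1''(x) + b\, r_1'(x) + r_1(x) \right) = \displaystyle\int_{-1}^{1} f(x, t)\, s_1(t)\, dt, \\[8pt] \left[ \displaystyle\int_{-1}^{1} \left( |r_1'(x)|^2 + |r_1(x)|^2 \right) dx \right] s_1(t) = \displaystyle\int_{-1}^{1} f(x, t)\, r_1(x)\, dx, \end{array} \right. \tag{22}$$

since the periodic boundary conditions on $r_1$ imply that $\int_{-1}^{1} r_1(x)\, r_1'(x)\, dx = 0$.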

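Unlike the symmetric case, there exists an infinite set of functions $f \in L^2_{\mathrm{per}}(-1, 1) \otimes L^2(-1, 1)$ such that $f \neq 0$ and any solution $(r_1, s_1) \in V_x \times V_t$ of equations (22) necessarily satisfies $r_1 \otimes s_1 = 0$ for any arbitrarily small value of $|b|$. This is the case for example when $f(x, t) = \phi(x - t)$ for all $(x, t) \in \mathbb{R} \times (-1, 1)$ with $\phi \in L^2_{\mathrm{per}}(-1, 1)$ an odd real-valued function.

Let us argue by contradiction. If $(r_1, s_1) \in V_x \times V_t$ is a solution to (22) such that $r_1 \otimes s_1 \neq 0$, up to some rescaling, we can assume that

$$\int_{-1}^{1} |s_1(t)|^2\, dt = \int_{-1}^{1} \left( |r_1'(x)|^2 + |r_1(x)|^2 \right) dx = \lambda > 0.$$

Thus, we can rewrite (22) as

$$-r_1''(x) + b\, r_1'(x) + r_1(x) = \frac{1}{\lambda} \int_{-1}^{1} f(x, t)\, s_1(t)\, dt, \qquad s_1(t) = \frac{1}{\lambda} \int_{-1}^{1} f(x, t)\, r_1(x)\, dx.$$

Plugging the second equation into the first one, we obtain

$$-r_1''(x) + b\, r_1'(x) + r_1(x) = \frac{1}{\lambda^2} \int_{-1}^{1} \left( \int_{-1}^{1} f(x, t)\, f(y, t)\, dt \right) r_1(y)\, dy. \tag{23}$$

Let us denote by $g(x, y) = \int_{-1}^{1} f(x, t)\, f(y, t)\, dt$ for all $(x, y) \in \mathbb{R}^2$. As $\phi$ is an odd, 2-periodic function, it holds that

$$\begin{aligned} g(x, y) &= \int_{-1}^{1} f(x, t)\, f(y, t)\, dt \\ &= \int_{-1}^{1} \phi(x - t)\, \phi(y - t)\, dt \\ &= - \int_{-1}^{1} \phi(x - t)\, \phi(t - y)\, dt \\ &= - \int_{-1-y}^{1-y} \phi(x - y - u)\, \phi(u)\, du \\ &= - \int_{-1}^{1} \phi(x - y - u)\, \phi(u)\, du, \end{aligned}$$

where the change of variable $u = t - y$ was used in the fourth line and the 2-periodicity of the integrand in the last one.

Taking the Fourier transform of equation (23) yields that for all $k \in \pi\mathbb{Z}$,

$$\left( |k|^2 + ibk + 1 \right) \widehat{r_1}(k) = - \frac{4}{\lambda^2}\, \widehat{\phi}(k)^2\, \widehat{r_1}(k),$$

where

$$\widehat{r_1}(k) = \frac{1}{2} \int_{-1}^{1} r_1(x)\, e^{-ik \cdot x}\, dx.$$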

[Figure 1: table with two columns, "Practice" and "Theory", summarizing for each algorithm (residual minimization in the $L^2$ norm, residual minimization in the dual norm, PGD-Galerkin, MiniMax, Dual Greedy, X-Greedy, Decomposition) whether it is usable in practice and which convergence results are known.]

Figure 1. Summary of the different greedy algorithms used for non-symmetric high-dimensional linear problems.

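Furthermore, $\lambda \in \mathbb{R}^*_+$ and $\widehat{\phi}(0) = 0$ ($\phi$ is an odd function). Thus, since $\widehat{\phi}(k)$ is a purely imaginary number, $-\widehat{\phi}(k)^2 = |\widehat{\phi}(k)|^2$ and a solution $r_1$ necessarily satisfies $\widehat{r_1}(k) = 0$ for all $k \in \pi\mathbb{Z}$, which yields a contradiction.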
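The two facts used in this argument can be checked numerically: the Fourier coefficients of an odd real-valued $\phi$ are purely imaginary, so that $-\widehat{\phi}(k)^2 = |\widehat{\phi}(k)|^2$ is real, whereas the symbol $|k|^2 + ibk + 1$ has a non-zero imaginary part for $k \neq 0$ as soon as $b \neq 0$. The Python sketch below does this for one particular profile; the choice of $\phi$, the value of $b$ and the quadrature rule are illustrative assumptions.

```python
import cmath
import math


def fourier_coefficient(func, k, n_quad=4000):
    """Midpoint-rule approximation of (1/2) * int_{-1}^{1} func(x) exp(-i k x) dx."""
    h = 2.0 / n_quad
    total = 0j
    for j in range(n_quad):
        x = -1.0 + (j + 0.5) * h
        total += func(x) * cmath.exp(-1j * k * x)
    return 0.5 * h * total


def phi(x):
    """Hypothetical odd, 2-periodic profile (any odd phi in L^2_per would do)."""
    return math.sin(math.pi * x) + 0.5 * math.sin(2.0 * math.pi * x)


b = 1e-3  # arbitrarily small non-zero advection coefficient (assumption)
for m in range(-3, 4):
    k = m * math.pi
    phi_hat = fourier_coefficient(phi, k)
    symbol = k * k + 1j * b * k + 1.0   # Fourier symbol of -r'' + b r' + r
    rhs_factor = -phi_hat ** 2          # equals |phi_hat|^2, hence real and >= 0
    print(m, phi_hat, symbol, rhs_factor)
```

For $k \neq 0$ the left-hand factor is not real while the right-hand factor is, and for $k = 0$ the right-hand factor vanishes while the symbol equals 1; in both cases $\widehat{r_1}(k)$ has to vanish.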

This example clearly shows that a naive transposition of the greedy algorithm to the non-symmetric case by analogy with the Euler equations (13) obtained in the symmetric case may be doomed to failure.

This article presents a review of some methods which aim at circumventing this difficulty. Particular emphasis is placed on the practical implementation of these methods and on the existence of rigorous theoretical convergence results. The properties of the different algorithms which are dealt with in this article are summarized in Figure 1. In particular, the algorithms which are marked OK in the Practice column are those which

• do not suffer from conditioning problems;

• do not require any extra loop of the greedy algorithm to be implemented in practice.

3. Residual minimization algorithms

In this section, we present some numerical methods used for the computation of separate variable representations of the solution of non-symmetric problems, for which there are rigorous convergence proofs. A natural idea is to symmetrize (16) using a reformulation as a residual minimization problem in a well-chosen norm.

These algorithms are also called Minimum Residual PGD in the literature [3].

3.1. Minimization of the residual in the L²(X × T) norm

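Let us assume that $L \in L^2(\mathcal{X} \times \mathcal{T})$ and that there exists $D(A) \subset V$, a dense subdomain of $L^2(\mathcal{X} \times \mathcal{T})$, such that $A(D(A)) \subset L^2(\mathcal{X} \times \mathcal{T})$. The mapping $A : D(A) \to L^2(\mathcal{X} \times \mathcal{T})$ defines a linear operator on $L^2(\mathcal{X} \times \mathcal{T})$. Let us assume moreover that $(A, D(A))$ is a closed operator. This implies in particular that $D(A)$, endowed with the scalar product

$$\forall v, w \in D(A), \quad \langle v, w \rangle_{D(A)} = \langle v, w \rangle + \langle Av, Aw \rangle,$$

is a Hilbert space.

A first approach, inspired by [9], consists in applying a standard greedy algorithm on the energy functional

$$\mathcal{E}(v) = \|Av - L\|^2_{L^2(\mathcal{X} \times \mathcal{T})}, \quad \forall v \in D(A).$$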

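Let us consider the case when

$$A = \sum_{i=1}^{p} A^{(i)}_x \otimes A^{(i)}_t,$$

where for all $1 \le i \le p$, $A^{(i)}_x$ and $A^{(i)}_t$ are operators on $L^2(\mathcal{X})$ and $L^2(\mathcal{T})$ with domains $D\left(A^{(i)}_x\right)$ and $D\left(A^{(i)}_t\right)$ respectively. We denote by $D_x = \bigcap_{i=1}^{p} D\left(A^{(i)}_x\right)$ and $D_t = \bigcap_{i=1}^{p} D\left(A^{(i)}_t\right)$, and assume that $D_x$ and $D_t$ are dense subspaces of $L^2(\mathcal{X})$ and $L^2(\mathcal{T})$ respectively and are Hilbert spaces when endowed with the scalar products

$$\forall v, w \in D_x, \quad \langle v, w \rangle_{D_x} = \langle v, w \rangle_{\mathcal{X}} + \sum_{i=1}^{p} \left\langle A^{(i)}_x v, A^{(i)}_x w \right\rangle_{\mathcal{X}},$$

and

$$\forall v, w \in D_t, \quad \langle v, w \rangle_{D_t} = \langle v, w \rangle_{\mathcal{T}} + \sum_{i=1}^{p} \left\langle A^{(i)}_t v, A^{(i)}_t w \right\rangle_{\mathcal{T}}.$$

The greedy algorithm reads:

(1) let $u_0 = 0$ and set $n = 1$;

(2) define $(r_n, s_n) \in D_x \times D_t$ such that

$$(r_n, s_n) \in \mathop{\mathrm{argmin}}_{(r, s) \in D_x \times D_t} \|A(u_{n-1} + r \otimes s) - L\|^2_{L^2(\mathcal{X} \times \mathcal{T})}; \tag{24}$$

(3) set $u_n = u_{n-1} + r_n \otimes s_n$ and $n = n + 1$.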

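Let us denote by $\Sigma^D := \{ r \otimes s, \ r \in D_x, \ s \in D_t \}$. From Theorem 1.1, provided that

(B1) $\overline{\mathrm{Span}\left(\Sigma^D\right)}^{D(A)} = D(A)$;

(B2) $\Sigma^D$ is weakly closed in $D(A)$;

the sequence $(u_n)_{n \in \mathbb{N}^*}$ strongly converges towards $u$ in $D(A)$.

In the case of problem (17), $A = A_x \otimes A_t$ with $A_x = -\Delta_x + b_x \cdot \nabla_x + 1$ and $A_t = 1$, $D(A) = \left( H^2(\mathcal{X}) \cap H^1_0(\mathcal{X}) \right) \otimes L^2(\mathcal{T})$, $D_x = D(A_x) = H^2(\mathcal{X}) \cap H^1_0(\mathcal{X})$ and $D_t = D(A_t) = L^2(\mathcal{T})$.

For problem (18), $A = A^{(1)}_x \otimes A^{(1)}_t + A^{(2)}_x \otimes A^{(2)}_t$ with $A^{(1)}_x = -\Delta_x + b_x \cdot \nabla_x + 1$, $A^{(1)}_t = 1$, $A^{(2)}_x = 1$ and $A^{(2)}_t = -\Delta_t + b_t \cdot \nabla_t$, $D(A) = H^2(\mathcal{X} \times \mathcal{T}) \cap H^1_0(\mathcal{X} \times \mathcal{T})$, $D_x = H^2(\mathcal{X}) \cap H^1_0(\mathcal{X})$ and $D_t = H^2(\mathcal{T}) \cap H^1_0(\mathcal{T})$.

In both cases, assumptions (A1) and (A2) are satisfied.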

Actually, when $L$ is regular enough, i.e. if $L \in D(A^*)$, where $A^*$ denotes the adjoint of $A$ and $D(A^*)$ its domain, this method is equivalent to performing a standard greedy algorithm on the symmetric coercive problem

$$A^* A u = A^* L.$$

The Euler equations associated to the minimization problems (24) read

$$\langle A(u_{n-1} + r_n \otimes s_n) - L,\ A(\delta r \otimes s_n + r_n \otimes \delta s) \rangle = 0, \quad \forall (\delta r, \delta s) \in D_x \times D_t.$$

This method suffers from several drawbacks though. Firstly, the right-hand side $L$ needs more regularity than necessary for problem (16) to be well-posed (we need $L \in L^2(\mathcal{X} \times \mathcal{T})$ instead of $L \in V'$).

Secondly, and more importantly, the conditioning of the associated discretized problems behaves badly since it scales quadratically with the conditioning of the original problem $Au = L$.
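The quadratic growth of the conditioning can be observed on the smallest possible discrete analogue. The Python sketch below is an illustration under simplifying assumptions (a hypothetical non-symmetric $2 \times 2$ matrix standing for the discretized operator, and the Euclidean 2-norm condition number): it compares the condition number of $A$ with that of the normal-equations matrix $A^T A$.

```python
import math


def matmul2(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose2(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]


def cond2(A):
    """2-norm condition number of an invertible 2x2 matrix via the eigenvalues of A^T A."""
    M = matmul2(transpose2(A), A)
    mean = 0.5 * (M[0][0] + M[1][1])
    radius = math.hypot(0.5 * (M[0][0] - M[1][1]), M[0][1])
    return math.sqrt((mean + radius) / (mean - radius))


A = [[1.0, 0.9], [0.0, 0.05]]        # hypothetical non-symmetric matrix
normal = matmul2(transpose2(A), A)   # matrix of the normal equations A^T A u = A^T L
print(cond2(A), cond2(normal), cond2(A) ** 2)
```

The last two printed values agree up to round-off, which is the discrete counterpart of the statement above.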

3.2. Minimization of the residual in the dual norm

In order to avoid the conditioning problems encountered when minimizing the residual in the $L^2(\mathcal{X} \times \mathcal{T})$ norm, another method consists in performing a greedy algorithm on the energy functional

$$\mathcal{E}(v) = \|Av - L\|^2_{V'} = \left\| R_V^{-1}(Av - L) \right\|^2_V, \quad \forall v \in V.$$

Here, the residual $Av - L$ is evaluated in the dual norm $\|\cdot\|_{V'}$. In this method, the right-hand side $L$ does not need to be more regular than $L \in V'$ and this approach is equivalent to performing the standard greedy algorithm on the symmetric coercive problem

$$A^* (R_V)^{-1} A u = A^* (R_V)^{-1} L.$$

The conditioning of the resulting problem scales linearly with the conditioning of the original problem $Au = L$.

The algorithm reads:

(1) let $u_0 = 0$ and $n = 1$;

(2) let $(r_n, s_n) \in V_x \times V_t$ such that

$$(r_n, s_n) \in \mathop{\mathrm{argmin}}_{(r, s) \in V_x \times V_t} \left\| (R_V)^{-1} \left[ A(u_{n-1} + r \otimes s) - L \right] \right\|^2_V; \tag{25}$$

(3) set $u_n = u_{n-1} + r_n \otimes s_n$ and $n = n + 1$.

Provided that $\Sigma$ defined by (14) satisfies assumptions (A1) and (A2), we infer from Theorem 1.1 that the sequence $(u_n)_{n \in \mathbb{N}^*}$ strongly converges to $u$ in $V$.

The Euler equations associated with the minimization problems (25) read: for all $(\delta r, \delta s) \in V_x \times V_t$,

$$\left\langle R_V^{-1} \left[ A(u_{n-1} + r_n \otimes s_n) - L \right],\ R_V^{-1} \left[ A(\delta r \otimes s_n + r_n \otimes \delta s) \right] \right\rangle_V = 0,$$

or equivalently,

$$\left\langle A(u_{n-1} + r_n \otimes s_n) - L,\ R_V^{-1} \left[ A(\delta r \otimes s_n + r_n \otimes \delta s) \right] \right\rangle_{V', V} = 0.$$

However, even if the conditioning problem of the previous method is avoided, this algorithm still requires the inversion of the operator $R_V$.

In the case when $V = V_x \otimes V_t$, the dual space $V'$ satisfies $V' = V'_x \otimes V'_t$, so that the operator $R_V = R_{V_x} \otimes R_{V_t}$ is a tensorized operator and $R_V^{-1} = R_{V_x}^{-1} \otimes R_{V_t}^{-1}$. A prototypical example of this situation is given by the problem (17), where we have $V'_x = H^{-1}(\mathcal{X})$, $V'_t = L^2(\mathcal{T})$, $R_{V_x} = -\Delta_x$, $R_{V_t} = 1$ and $R_V = R_{V_x} \otimes R_{V_t}$. Thus, $R_V^{-1} = (-\Delta_x)^{-1} \otimes 1$ and carrying out the above greedy algorithm requires the solution of several low-dimensional Poisson problems, which remains doable but increases the time and memory needs compared to a standard greedy algorithm in the symmetric coercive case where $b = b_x = 0$.

The situation is even more intricate when $V \neq V_x \otimes V_t$, since the operator $R_V$ is not a tensorized operator in general. A prototypical example of this situation is problem (18) where $V' = H^{-1}(\mathcal{X} \times \mathcal{T})$, $R_V = -\Delta_{x,t}$ and $R_V^{-1}$ cannot be expanded as a finite sum of tensorized operators. The intermediate symmetric coercive high-dimensional problems involving $R_V$ can be solved with the standard greedy algorithm presented in Section 1, but this considerably increases the time needed to run a simulation.

In this particular case, since $R_V = -\Delta_x \otimes 1 - 1 \otimes \Delta_t$ is the sum of two tensorized operators which commute with one another, we can use an approach described in [11]. This method consists in using an approximate expansion of the inverse of the Laplacian operator, constructed as follows. The function $h : x \in [x_0, +\infty) \mapsto \frac{1}{x}$ (where $x_0$ is a positive real number) can be approximated by a sum of exponential functions of the form

$$\frac{1}{x} \approx \sum_{l=1}^{N} C_l\, e^{-c_l x},$$

for some $N \in \mathbb{N}^*$ and where $(C_l)_{1 \le l \le N}$ and $(c_l)_{1 \le l \le N}$ are two sets of well-chosen real numbers, depending on $x_0$. Provided that $x_0$ satisfies $x_0 < \min(1, \lambda^x_1, \lambda^t_1)$, where $\lambda^x_1$ (respectively $\lambda^t_1$) is the lowest eigenvalue of the operator $-\Delta_x$ on $H^1_0(\mathcal{X})$ with respect to the $L^2(\mathcal{X})$ scalar product (respectively the lowest eigenvalue of the operator $-\Delta_t$ on $H^1_0(\mathcal{T})$ with respect to the $L^2(\mathcal{T})$ scalar product), since the operators $-\Delta_x \otimes 1$ and $-1 \otimes \Delta_t$ commute, $R_V^{-1}$ can be approximated by

$$R_V^{-1} \approx \sum_{l=1}^{N} C_l\, e^{-c_l (-\Delta_x \otimes 1 - 1 \otimes \Delta_t)} = \sum_{l=1}^{N} C_l\, e^{c_l \Delta_x} \otimes e^{c_l \Delta_t}. \tag{26}$$

The computation of the expansion (26) only involves the computation of the exponential of small-dimensional operators. But of course, to have a reliable approximation of this operator, the number $N$ of terms in the above approximation may be very large. Besides, an explicit expansion is not always available for a general operator $R_V^{-1}$.
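To make the exponential-sum idea concrete, here is a minimal Python sketch of one possible construction of the coefficients. It is an assumption made for illustration, not necessarily the construction used in [11]: the identity $\frac{1}{x} = \int_0^{+\infty} e^{-xt}\, dt$ is rewritten with $t = e^{s}$ and discretized by the trapezoidal rule, which gives weights $C_l = h e^{s_l}$ and exponents $c_l = e^{s_l}$. The step `h` and the truncation bounds `s_min`, `s_max` are hypothetical parameters tuned for $x \in [1, 100]$. The last function applies the same coefficients to a sum $\lambda^x + \lambda^t$ of two positive numbers, which is the scalar analogue of (26).

```python
import math


def exponential_sum(h=0.5, s_min=-30.0, s_max=4.0):
    """Weights C_l and exponents c_l from the trapezoidal rule applied to
    1/x = int_R exp(s) * exp(-x * exp(s)) ds (substitution t = exp(s))."""
    n_nodes = int(round((s_max - s_min) / h)) + 1
    nodes = [s_min + l * h for l in range(n_nodes)]
    weights = [h * math.exp(s) for s in nodes]
    exponents = [math.exp(s) for s in nodes]
    return weights, exponents


def approx_inverse(x, weights, exponents):
    """Evaluate sum_l C_l exp(-c_l x), an approximation of 1/x."""
    return sum(C * math.exp(-c * x) for C, c in zip(weights, exponents))


def approx_inverse_of_sum(lam_x, lam_t, weights, exponents):
    """Separated form: each term is a product of a factor in lam_x and one in lam_t."""
    return sum(C * math.exp(-c * lam_x) * math.exp(-c * lam_t)
               for C, c in zip(weights, exponents))


weights, exponents = exponential_sum()
# Worst relative error on a grid of points of [1, 100].
worst = max(abs(x * approx_inverse(x, weights, exponents) - 1.0)
            for x in [1.0 + 0.5 * i for i in range(199)])
print(len(weights), worst)
```

With these hypothetical parameters the sum has 69 terms, which illustrates the remark above that $N$ may have to be fairly large to reach a given accuracy over a wide range of $x$.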

The algorithms presented in the following sections are attempts to find numerical methods which

• avoid the conditioning problem inherent to the method described in Section 3.1;

• avoid the use of inverse operators such as $R_V^{-1}$ in the approach using the dual norm.

Of course, a natural idea would be to find a suitable norm in which to minimize the residual so as to avoid both the conditioning and the inversion problems. So far, no norm with such properties has been proposed.

In Section 4, we present algorithms already existing in the literature, namely those suggested by Anthony Nouy [16] and Alexeï Lozinski [14]. In Section 5, a new algorithm is proposed. The known partial convergence results for these methods are presented and some details on the numerical implementations of these algorithms are provided.

4. Algorithms based on dual formulations

In this section, we present some classes of algorithms based on dual formulations of the non-symmetric problem (16).

4.1. MiniMax algorithm

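A first algorithm based on a dual formulation of problem (16) is the MiniMax algorithm proposed by Nouy [16].

The algorithm reads as follows:

(1) let $u_0 = 0$ and $n = 1$;

(2) let $(r_n, \tilde{r}_n, s_n, \tilde{s}_n) \in V_x^2 \times V_t^2$ such that

$$(r_n, \tilde{r}_n, s_n, \tilde{s}_n) \in \arg \max_{(\tilde{r}, \tilde{s}) \in V_x \times V_t}\ \min_{(r, s) \in V_x \times V_t} J_n(r \otimes s, \tilde{r} \otimes \tilde{s}), \tag{27}$$

where for all $v, \tilde{v} \in V$,

$$J_n(v, \tilde{v}) = \frac{1}{2} \|v\|^2_V - a(u_{n-1} + v, \tilde{v}) + l(\tilde{v});$$

(3) set $u_n = u_{n-1} + r_n \otimes s_n$ and $n = n + 1$.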

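At each iteration $n \in \mathbb{N}^*$, the computation of a quadruplet $(r_n, \tilde{r}_n, s_n, \tilde{s}_n) \in V_x^2 \times V_t^2$ satisfying (27) is done by solving the stationarity equations

$$\left\{ \begin{array}{ll} a(u_{n-1} + r_n \otimes s_n,\ \tilde{r}_n \otimes \delta\tilde{s} + \delta\tilde{r} \otimes \tilde{s}_n) = l(\tilde{r}_n \otimes \delta\tilde{s} + \delta\tilde{r} \otimes \tilde{s}_n), & \forall (\delta\tilde{r}, \delta\tilde{s}) \in V_x \times V_t, \\ a(r_n \otimes \delta s + \delta r \otimes s_n,\ \tilde{r}_n \otimes \tilde{s}_n) = \langle r_n \otimes \delta s + \delta r \otimes s_n,\ r_n \otimes s_n \rangle_V, & \forall (\delta r, \delta s) \in V_x \times V_t. \end{array} \right. \tag{28}$$

In practice, for each $n \in \mathbb{N}^*$, these equations are solved through a fixed-point procedure where the pairs $(r_n, \tilde{r}_n) \in V_x^2$ and $(s_n, \tilde{s}_n) \in V_t^2$ are computed iteratively. More precisely, the fixed-point algorithm reads:

• set $m = 0$, and choose an initial guess $\left(r^{(0)}_n, \tilde{r}^{(0)}_n, s^{(0)}_n, \tilde{s}^{(0)}_n\right) \in V_x^2 \times V_t^2$;

• find $\left(r^{(m+1)}_n, \tilde{r}^{(m+1)}_n\right) \in V_x^2$ such that

$$\left\{ \begin{array}{ll} a\left(u_{n-1} + r^{(m+1)}_n \otimes s^{(m)}_n,\ \delta\tilde{r} \otimes \tilde{s}^{(m)}_n\right) = l\left(\delta\tilde{r} \otimes \tilde{s}^{(m)}_n\right), & \forall \delta\tilde{r} \in V_x, \\ a\left(\delta r \otimes s^{(m)}_n,\ \tilde{r}^{(m+1)}_n \otimes \tilde{s}^{(m)}_n\right) = \left\langle \delta r \otimes s^{(m)}_n,\ r^{(m+1)}_n \otimes s^{(m)}_n \right\rangle_V, & \forall \delta r \in V_x; \end{array} \right.$$

• find $\left(s^{(m+1)}_n, \tilde{s}^{(m+1)}_n\right) \in V_t^2$ such that

$$\left\{ \begin{array}{ll} a\left(u_{n-1} + r^{(m+1)}_n \otimes s^{(m+1)}_n,\ \tilde{r}^{(m+1)}_n \otimes \delta\tilde{s}\right) = l\left(\tilde{r}^{(m+1)}_n \otimes \delta\tilde{s}\right), & \forall \delta\tilde{s} \in V_t, \\ a\left(r^{(m+1)}_n \otimes \delta s,\ \tilde{r}^{(m+1)}_n \otimes \tilde{s}^{(m+1)}_n\right) = \left\langle r^{(m+1)}_n \otimes \delta s,\ r^{(m+1)}_n \otimes s^{(m+1)}_n \right\rangle_V, & \forall \delta s \in V_t; \end{array} \right.$$

• set $m = m + 1$.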

In [17], it is proved that in the case when $a = a_x \otimes a_t$, where $a_x$ is a continuous bilinear form on $V_x \times V_x$ and $a_t$ a continuous bilinear form on $V_t \times V_t$, and $V = V_x \otimes V_t$, the algorithm converges. However, there is no convergence result in the full general case.

4.2. Greedy algorithms for Banach spaces

Another family of dual greedy algorithms is inspired from the methods suggested by Temlyakov in [20] for Banach spaces and was proposed by Lozinski [14] in order to deal with the resolution of high-dimensional problems of the form (16).

4.2.1. Greedy algorithms for general Banach spaces

For the sake of simplicity, let us present two particular greedy algorithms proposed by Temlyakov in the context of Banach spaces, namely the X-Greedy and the Dual Greedy algorithms.

Let $(X, \|\cdot\|_X)$ be a reflexive Banach space and $\mathcal{D}$ a dictionary of $X$, i.e. a subset of $X$ such that for all $g \in \mathcal{D}$, $\|g\|_X = 1$ and $\overline{\mathrm{Span}(\mathcal{D})}^{X} = X$. Let us also denote by $X^*$ the dual space of $X$.

Let $f \in X$. The aim of both the Dual Greedy and the X-Greedy algorithms is to give an approximation of $f$ as a linear combination of vectors of the dictionary $\mathcal{D}$. These numerical methods are generalizations of the Pure Greedy algorithm, which is defined for Hilbert spaces. When $X$ is a Hilbert space endowed with the scalar product $\langle \cdot, \cdot \rangle_X$, the Pure Greedy algorithm can be interpreted in two equivalent ways, namely:

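Pure Greedy algorithm (1):

(1) let $f_0 = 0$, $r_0 = f$ and $n = 1$;

(2) let $g_n \in \mathcal{D}$ and $\alpha_n \in \mathbb{R}$ such that (assuming existence)

$$\|r_{n-1} - \alpha_n g_n\|_X = \min_{g \in \mathcal{D},\ \alpha \in \mathbb{R}} \|r_{n-1} - \alpha g\|_X;$$

(3) let $f_n = f_{n-1} + \alpha_n g_n$, $r_n = r_{n-1} - \alpha_n g_n$ and $n = n + 1$;

and

Pure Greedy algorithm (2):

(1) let $f_0 = 0$, $r_0 = f$ and $n = 1$;

(2) let $g_n \in \mathcal{D}$ such that (assuming existence)

$$\langle r_{n-1}, g_n \rangle_X = \max_{g \in \mathcal{D}} \langle r_{n-1}, g \rangle_X;$$

(3) let $\alpha_n \in \mathbb{R}$ such that

$$\|r_{n-1} - \alpha_n g_n\|_X = \min_{\alpha \in \mathbb{R}} \|r_{n-1} - \alpha g_n\|_X;$$

(4) let $f_n = f_{n-1} + \alpha_n g_n$, $r_n = r_{n-1} - \alpha_n g_n$ and $n = n + 1$.

When $X$ is a Hilbert space, the two versions of the Pure Greedy algorithm are equivalent, but this is no longer the case as soon as $X$ is a general Banach space.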
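The second version of the Pure Greedy algorithm can be sketched as follows in the simplest Hilbert setting, $X = \mathbb{R}^3$ with the Euclidean scalar product and a finite dictionary. The dictionary, the target vector `f` and the function names are illustrative assumptions; the dictionary is treated as symmetric (both $g$ and $-g$ belong to it), so that maximizing $\langle r_{n-1}, g \rangle_X$ amounts to maximizing its absolute value.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def pure_greedy(f, dictionary, n_iter):
    """Pure Greedy algorithm (version 2) in R^m with the Euclidean scalar product.

    The dictionary is assumed to contain unit vectors and to be symmetric
    (g and -g both belong to it); only one representative of each pair is stored.
    """
    approx = [0.0] * len(f)
    residual = list(f)
    norms = [math.sqrt(dot(residual, residual))]
    for _ in range(n_iter):
        g = max(dictionary, key=lambda cand: abs(dot(residual, cand)))
        alpha = dot(residual, g)  # minimizer of ||r - alpha * g|| since ||g|| = 1
        approx = [a + alpha * gi for a, gi in zip(approx, g)]
        residual = [r - alpha * gi for r, gi in zip(residual, g)]
        norms.append(math.sqrt(dot(residual, residual)))
    return approx, residual, norms


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


# Hypothetical dictionary in R^3 (redundant: six directions in a 3-dimensional space).
dictionary = [normalize(v) for v in
              ([1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1], [1, 1, 1])]
f = [0.3, -1.2, 2.0]
approx, residual, norms = pure_greedy(f, dictionary, 50)
print(norms[0], norms[-1])
```

Since $\|r_{n-1} - \alpha g\|^2 = \|r_{n-1}\|^2 - \langle r_{n-1}, g \rangle^2$ at the optimal $\alpha$ for a unit vector $g$, selecting the largest scalar product and selecting the smallest residual coincide here, which is the equivalence of the two versions stated above.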

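The X-Greedy algorithm corresponds to the extension of the first version of the Pure Greedy algorithm:

(1) let $f_0 = 0$, $r_0 = f$ and $n = 1$;

(2) let $g_n \in \mathcal{D}$ and $\alpha_n \in \mathbb{R}$ such that (assuming existence)

$$\|r_{n-1} - \alpha_n g_n\|_X = \min_{g \in \mathcal{D},\ \alpha \in \mathbb{R}} \|r_{n-1} - \alpha g\|_X; \tag{29}$$

(3) let $f_n = f_{n-1} + \alpha_n g_n$, $r_n = r_{n-1} - \alpha_n g_n$ and $n = n + 1$.

The Dual Greedy algorithm generalizes the second version of the Pure Greedy algorithm and is slightly more subtle. It is based on the notion of peak functional. For any non-zero element $f \in X$, we say that $F_f \in X^*$ is a peak functional for $f$ if $\|F_f\|_{X^*} = 1$ and $F_f(f) = \|f\|_X$. The Dual Greedy algorithm reads:

(1) let $f_0 = 0$, $r_0 = f$ and $n = 1$;

(2) let $F_{r_{n-1}} \in X^*$ be a peak functional for $r_{n-1}$ and let $g_n \in \mathcal{D}$ such that (assuming existence)

$$g_n \in \mathop{\mathrm{argmax}}_{g \in \mathcal{D}} F_{r_{n-1}}(g); \tag{30}$$

(3) let $\alpha_n \in \mathbb{R}$ such that

$$\alpha_n \in \mathop{\mathrm{argmin}}_{\alpha \in \mathbb{R}} \|r_{n-1} - \alpha g_n\|_X; \tag{31}$$

(4) let $f_n = f_{n-1} + \alpha_n g_n$, $r_n = r_{n-1} - \alpha_n g_n$ and $n = n + 1$.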
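As an illustration of steps (30) and (31), the following Python sketch runs the Dual Greedy algorithm in the finite-dimensional Banach space $X = \mathbb{R}^3$ endowed with the $\ell^p$ norm, $1 < p < \infty$. This setting is an assumption chosen because the peak functional is explicit there: $F_r(g) = \sum_i \mathrm{sign}(r_i) |r_i|^{p-1} g_i / \|r\|_p^{p-1}$. The dictionary, the value $p = 4$, the target vector and the use of a ternary search for the one-dimensional minimization (31) are likewise hypothetical choices.

```python
import math


def lp_norm(v, p):
    return sum(abs(x) ** p for x in v) ** (1.0 / p)


def peak_functional(r, p):
    """Coefficients of the peak functional of r != 0 in l^p, 1 < p < infinity."""
    scale = lp_norm(r, p) ** (p - 1)
    return [math.copysign(abs(x) ** (p - 1), x) / scale for x in r]


def best_alpha(r, g, p, n_steps=200):
    """Ternary search for the minimizer of the convex map alpha -> ||r - alpha g||_p."""
    def value(alpha):
        return lp_norm([x - alpha * y for x, y in zip(r, g)], p)

    # Since ||g||_p = 1, the minimizer lies in [-2 ||r||_p, 2 ||r||_p].
    lo, hi = -2.0 * lp_norm(r, p), 2.0 * lp_norm(r, p)
    for _ in range(n_steps):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if value(m1) <= value(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def dual_greedy(f, dictionary, p, n_iter):
    approx = [0.0] * len(f)
    residual = list(f)
    norms = [lp_norm(residual, p)]
    for _ in range(n_iter):
        if norms[-1] == 0.0:
            break
        peak = peak_functional(residual, p)
        # Step (30): dictionary element maximizing the peak functional.
        g = max(dictionary, key=lambda cand: sum(a * b for a, b in zip(peak, cand)))
        # Step (31): one-dimensional minimization of the residual norm.
        alpha = best_alpha(residual, g, p)
        approx = [a + alpha * y for a, y in zip(approx, g)]
        residual = [x - alpha * y for x, y in zip(residual, g)]
        norms.append(lp_norm(residual, p))
    return approx, residual, norms


p = 4.0
base = ([1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0], [0, 1, 1], [1, 1, 1])
dictionary = []
for v in base:  # symmetric dictionary, normalized in the l^p norm
    n = lp_norm(v, p)
    dictionary.append([x / n for x in v])
    dictionary.append([-x / n for x in v])
f = [0.3, -1.2, 2.0]
approx, residual, norms = dual_greedy(f, dictionary, p, 30)
print(norms[0], norms[-1])
```

Unlike the Hilbert case, the element selected by (30) is in general not the one that would be selected by the joint minimization (29), which is why the two algorithms differ in a Banach space.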