HAL Id: hal-00725344

https://hal.archives-ouvertes.fr/hal-00725344

Submitted on 24 Aug 2012


Reduced-basis solutions of affinely-parametrized linear partial differential equations

Alexandre Janon

To cite this version:

Alexandre Janon. Reduced-basis solutions of affinely-parametrized linear partial differential equations.

[Research Report] LJK. 2010. hal-00725344

Reduced-basis solutions of affinely-parametrized linear partial differential equations

Alexandre Janon¹

June 2010

Contents

1 Notations and assumptions
2 Reduced basis approximation
  2.1 Opportunity for a RB approximation
  2.2 Offline and online procedures
3 Error bound
  3.1 Derivation of the error bound
  3.2 Dual norm of the residual $\|r_\mu\|_X$
  3.3 Stability constant $\alpha(\mu)$
    3.3.1 An easy case
    3.3.2 The general case: successive constraints method (SCM)
4 Choice of a reduced basis
  4.1 Greedy procedure
  4.2 Proper Orthogonal Decomposition (POD)-driven procedure
5 Case of a scalar output
  5.1 Primal approach
  5.2 Primal/dual approach
  5.3 Output-aware choice of basis: goal-oriented POD

¹ Laboratoire Jean Kuntzmann, Université Grenoble 1 / INRIA, alexandre.janon@imag.fr

Introduction and overview

In some contexts (for example, in Monte-Carlo or quasi-Monte-Carlo sensitivity analysis), it is necessary to compute the solution $u_\mu$ of some partial differential equation (PDE) depending on an input parameter vector $\mu$ (the dependence can be through the PDE coefficients, but also through the domain on which the equation is posed, or through the initial/boundary data), for a large number (say, thousands) of values of $\mu$. When no analytical solution of the PDE is known, one has to use computationally intensive numerical methods, such as finite elements or finite differences, to approximate the solution of the PDE.

This leads to complex computer codes, which can take several hours to run for a single value of $\mu$. In this case, one easily sees that simply calling the numerical code for each required value of $\mu$, one by one, is not practicable at all, and that we need to use a computationally cheaper code giving a surrogate solution $\tilde u_\mu$ (also called a metamodel) from a value of $\mu$. More specifically, we split our computation into two phases:

• an offline phase, where we make a reasonable number of calls to the expensive code in order to "learn" about the solutions of the PDE;

• an online phase, where, given some value of $\mu$, we use the information collected during the offline phase to produce a good approximation of the solution of the PDE for this specific value of $\mu$, faster than if we had no information at all.

This strategy is efficient if the fixed additional cost of data collection during the offline phase (done once for all values of $\mu$) is dominated by a strong reduction in the marginal cost (i.e., per value of $\mu$) when one replaces the original, expensive numerical code by the cheaper "online" code.

In this report, we describe a procedure, named reduced basis (RB) approximation, specifying the offline and online phases. The key to RB approximation is to look for (during the online phase) a surrogate $\tilde u_\mu$, solution of the PDE, as a linear combination of functions from a linearly independent family (called the reduced basis) that has to be wisely chosen during the offline phase. For the resulting problem to be well posed, one has to somewhat weaken the PDE, using a Galerkin-like method. An advantage of such a procedure is that one can quantify the loss of information made when replacing the expensive numerical solution $u_\mu$ by $\tilde u_\mu$, using an a posteriori, quickly computable error bound $\epsilon_\mu$ so that
$$\|u_\mu - \tilde u_\mu\| \le \epsilon_\mu$$
for some functional norm $\|\cdot\|$.

The report is organised as follows: in a first part, we present the assumptions on the PDE that are necessary for RB to be applicable; for the sake of simplicity, we restrict ourselves to the case of coercive, linear and "affinely parametrized" (we will explain what this means) PDEs, but it should be noted that there are extensions of RB which somewhat relax some of these assumptions. In a second part, we describe the core of the RB approximation, that is, the offline and online phases of the computation of the surrogate $\tilde u_\mu$. In a third part, we give a procedure to compute the error bound $\epsilon_\mu$. In a fourth part, we show two different ways (greedy and POD) to choose the reduced basis family. In a fifth and final part, we address the question of using RB to approximate outputs of the solution, that is, (scalar) functionals of $u_\mu$.

1 Notations and assumptions

Our main reference for sections 1 to 3 is [5].

As said earlier, we denote by $\mu \in \mathcal{P} \subset \mathbb{R}^p$ our parameter vector.

Our unknown is a function $u_\mu$, depending on $\mu$ besides the usual (e.g. time and space) variables, belonging to some function space $X$, and satisfying a $\mu$-parametrized linear partial differential equation, which can be written under a variational (a.k.a. weak) form:
$$a_\mu(u_\mu, v) = f_\mu(v) \quad \forall v \in X \qquad (1)$$
where $a_\mu$ is a $\mu$-dependent (continuous) bilinear form on $X$, and $f_\mu$ is a (continuous) linear form on $X$.

We further assume that our PDE is affinely parametrized, that is:

• $f_\mu = f$ is $\mu$-independent;

• and $a_\mu$ can be written as a sum of $Q$ bilinear forms:
$$a_\mu(w, v) = \sum_{q=1}^{Q} \Theta_q(\mu)\, a_q(w, v)$$
where:

  - the $a_q$ are $\mu$-independent bilinear forms on $X$;

  - the $\Theta_q$ are arbitrary functions on $\mathcal{P}$, capturing all the dependence in $\mu$; since these functions are to be evaluated many times during the online phase, we shall require them to be fast to compute.
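As a concrete illustration (a standard example from the reduced-basis literature, not taken from this report), a diffusion problem whose conductivity equals $1$ on a subdomain $\Omega_1$ and $\mu$ on a subdomain $\Omega_2$ is affinely parametrized with $Q = 2$:
$$a_\mu(w, v) = \underbrace{1}_{\Theta_1(\mu)} \underbrace{\int_{\Omega_1} \nabla w \cdot \nabla v}_{a_1(w, v)} + \underbrace{\mu}_{\Theta_2(\mu)} \underbrace{\int_{\Omega_2} \nabla w \cdot \nabla v}_{a_2(w, v)}.$$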

2 Reduced basis approximation

2.1 Opportunity for a RB approximation

The function space $X$ is typically a Sobolev space; such spaces, due to their infinite-dimensional nature, are not amenable to numerical computations. In order to discretize equation (1) and get an approximation of the actual mathematical solution, one can set $X$ to some finite-element space (e.g., $P_1$) and use (1) to get a linear system of equations satisfied by the coefficients of $u_\mu$ in a basis of $X$. In the sequel, $X$ will always stand for such a space. This means that the reduced basis solution we are to compute is an approximation to the discrete numerical solution, and thus we rely on (and assume) the fact that the latter is a good approximation to the analytical solution. A good treatment of finite element discretizations can be found in [7].

The computational cost required to compute this approximation is an increasing function of the dimension of the chosen discrete function space, and unfortunately, this dimension has to be very large for the approximation to be accurate. One explanation for this fact is that those spaces are "too general": they contain too many different functions, and so we need to deal with a large number of degrees of freedom for our unknown.

On the other hand, we expect that two solutions of (1), for two different values of $\mu$, will exhibit common "features". We could "encode" those features into a space $\widetilde X$, spanned by some ($\mu$-independent) functions $\zeta_1, \zeta_2, \ldots, \zeta_N \in X$, and replace $X$ by $\widetilde X$ in (1). The family $\{\zeta_1, \ldots, \zeta_N\}$ is called the reduced basis. For a sufficiently regular-in-$\mu$ problem, we expect that taking $N \ll \dim X$ will still give, for a well-chosen reduced basis, an accurate enough solution, while providing substantial computational economy.

2.2 Offline and online procedures

We suppose, for now, that we are given a reduced basis $\{\zeta_1, \ldots, \zeta_N\}$ of linearly independent functions in $X$ (the choice of this basis is discussed in section 4). We set $\widetilde X = \mathrm{Span}\{\zeta_1, \ldots, \zeta_N\}$.

Given $\mu \in \mathcal{P}$, we look for $\tilde u_\mu$ as a function belonging to $\widetilde X$, thus:
$$\tilde u_\mu = \sum_{n=1}^{N} c_n(\mu)\, \zeta_n \qquad (2)$$
where the unknowns are $c_n(\mu) \in \mathbb{R}$, $n = 1, \ldots, N$.

These unknowns are to be found by imposing that $\tilde u_\mu$ satisfy a relaxed version of (1); instead of verifying (1) for every $v \in X$, $\tilde u_\mu$ satisfies (1) for every $v \in \widetilde X$:
$$a_\mu(\tilde u_\mu, v) = f(v) \quad \forall v \in \widetilde X \qquad (3)$$
Since $\widetilde X \subset X$, (3) is weaker than (1). This standard technique is known as Galerkin projection.

Substituting (2) into (3), and setting $v = \zeta_p$ ($p = 1, \ldots, N$), gives that the $c_n(\mu)$ satisfy the following system of $N$ linear equations:
$$\sum_{n=1}^{N} \left( \sum_{q=1}^{Q} \Theta_q(\mu)\, a_q(\zeta_n, \zeta_p) \right) c_n(\mu) = f(\zeta_p) \quad \forall p = 1, \ldots, N \qquad (4)$$

This justifies the following offline/online procedure for the computation of $\tilde u_\mu$:

• in the offline phase:

  - we choose a reduced basis $\{\zeta_1, \ldots, \zeta_N\}$ (more details about this choice later);

  - we assemble, and store, $Q$ matrices of size $N \times N$:
$$\mathbf{a}_q = \big( a_q(\zeta_n, \zeta_p) \big)_{p, n = 1, \ldots, N}$$
    and the $N$-sized vector:
$$\mathbf{f} = \big( f(\zeta_p) \big)_{p = 1, \ldots, N}$$

• in the online phase, being given a value of $\mu$:

  - we assemble the $N \times N$ matrix:
$$\mathbf{a}(\mu) = \sum_{q=1}^{Q} \Theta_q(\mu)\, \mathbf{a}_q$$

  - our vector of unknowns $\mathbf{c}(\mu) = \big( c_n(\mu) \big)_{n = 1, \ldots, N}$ is given by:
$$\mathbf{c}(\mu) = \mathbf{a}(\mu)^{-1} \mathbf{f}$$
    where the matrix inversion solves the system (4), and so we get:
$$\tilde u_\mu = \sum_{n=1}^{N} c_n(\mu)\, \zeta_n$$

The complexity of the online phase is as follows: assembling $\mathbf{a}(\mu)$ takes $O(QN^2)$ operations, and its inversion takes $O(N^3)$ operations (one should note that $\mathbf{a}(\mu)$ has no reason to be a sparse matrix, and indeed it is not). So the coefficients of $\tilde u_\mu$ with respect to the reduced basis can be computed in $O(QN^2 + N^3)$ operations.

It is important to note that this complexity is independent of $\dim X$. Thus the online computation time is independent of the "quality" of the underlying reference numerical approximation, which can be as good as wanted without impacting the marginal cost. We notice that we neglected the cost of the evaluations of $\Theta_q(\mu)$, hence the hypothesis of fast evaluation of the $\Theta_q$ we have made. We also notice that the online complexity (in computational time and in storage requirements) is proportional to $Q$, the number of terms in the affine decomposition of $a_\mu$; for our procedure to be efficient, $Q$ should therefore be kept orders of magnitude smaller than $\dim X$.
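To fix ideas, here is a minimal sketch of this online stage in Python/NumPy; the names Aq_mats, f_vec and the helper theta(mu) are illustrative assumptions about how the offline data is stored, not part of the report.

```python
import numpy as np

# --- offline data (assumed precomputed from the reduced basis {zeta_n}) ---
# Aq_mats: list of Q dense (N, N) matrices, Aq_mats[q][p, n] = a_q(zeta_n, zeta_p)
# f_vec:   (N,) vector, f_vec[p] = f(zeta_p)
# theta:   callable mu -> (Q,) array of Theta_q(mu) values (assumed cheap)

def rb_online_solve(mu, theta, Aq_mats, f_vec):
    """Online stage: assemble a(mu) in O(Q N^2) and solve the N x N system (4)."""
    A_mu = sum(t * Aq for t, Aq in zip(theta(mu), Aq_mats))  # a(mu) = sum_q Theta_q(mu) a_q
    c_mu = np.linalg.solve(A_mu, f_vec)                      # coefficients c_n(mu), O(N^3)
    return c_mu  # the surrogate is u~_mu = sum_n c_mu[n] * zeta_n
```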

3 Error bound

3.1 Derivation of the error bound

In this section, we design an error bound $\epsilon_\mu$ so that:
$$\|u_\mu - \tilde u_\mu\| \le \epsilon_\mu \quad \text{for all } \mu \in \mathcal{P}.$$

We should note once again that, thanks to our assumption about the quality of the underlying approximation, we measure only the error between the reference, expensive numerical solution $u$ and our surrogate $\tilde u$; the error between $u$ and the analytical solution is neglected.

We endow $X$ with a Hilbert structure: an inner product $\langle \cdot, \cdot \rangle$ on $X \times X$ yielding a norm $\|\cdot\|$ on $X$. We need a crucial assumption about the bilinear form $a_\mu$ describing our PDE: the quantity $\alpha(\mu)$ defined by
$$\alpha(\mu) = \inf_{w \in X} \sup_{v \in X} \frac{a_\mu(w, v)}{\|w\|\, \|v\|}$$
has to be strictly positive: $\alpha(\mu) > 0 \;\; \forall \mu \in \mathcal{P}$.

The quantity $\alpha(\mu)$ is known as Babuška's "inf-sup" stability constant. The inf-sup stability hypothesis $\alpha(\mu) > 0$ is a sufficient condition for the well-posedness of the underlying variational problem (1) (Babuška-Brezzi condition).

In the sequel, we suppose that $a_\mu$ is symmetric. The inf-sup stability hypothesis is then equivalent to the coercivity of $a_\mu$ (and the well-posedness of the variational problem follows from Lax-Milgram), and we have:
$$\alpha(\mu) = \inf_{w \in X} \frac{a_\mu(w, w)}{\|w\|^2}$$

Let's move on to the error bound. First we remember that:
$$a_\mu(u_\mu, v) = f_\mu(v) \quad \forall v \in X$$
Subtracting $a_\mu(\tilde u_\mu, v)$ from both sides of this equation, we get, thanks to the bilinearity of $a_\mu$:
$$a_\mu(u_\mu - \tilde u_\mu, v) = f_\mu(v) - a_\mu(\tilde u_\mu, v) \quad \forall v \in X$$
So:
$$a_\mu(e_\mu, v) = r_\mu(v) \quad \forall v \in X \qquad (5)$$
if we set:
$$e_\mu = u_\mu - \tilde u_\mu, \qquad r_\mu(v) = f(v) - a_\mu(\tilde u_\mu, v)$$
($e_\mu$ is known as the error, and $r_\mu$ as the residual form).

Introducing $\alpha(\mu)$, we have the following inequality:
$$a_\mu(e_\mu, e_\mu) \ge \alpha(\mu)\, \|e_\mu\|^2$$
We also have:
$$|r_\mu(v)| \le \|r_\mu\|_X\, \|v\|$$

where
$$\|r_\mu\|_X = \sup_{v \in X,\, \|v\| = 1} r_\mu(v)$$
is the dual norm of the residual.

Thanks to these inequalities, and to (5) with $v = e_\mu$, we get:
$$\alpha(\mu)\, \|e_\mu\|^2 \le \|r_\mu\|_X\, \|e_\mu\|$$
and so, thanks to our stability assumption:
$$\|e_\mu\| = \|u_\mu - \tilde u_\mu\| \le \frac{\|r_\mu\|_X}{\alpha(\mu)} \qquad (6)$$

So we have:
$$\epsilon_\mu = \frac{\|r_\mu\|_X}{\alpha(\mu)}$$

In order to practically exploit the error bound (6), it remains to explain how to efficiently work out $\|r_\mu\|_X$ and $\alpha(\mu)$. This question is addressed in the two following sections.

3.2 Dual norm of the residual $\|r_\mu\|_X$

The Riesz representation theorem ensures that there exists $y_\mu \in X$ so that
$$\langle y_\mu, v \rangle = r_\mu(v) \quad \forall v \in X \qquad (7)$$
Moreover, computing $y_\mu$ is sufficient, because of the additional property:
$$\|r_\mu\|_X = \|y_\mu\|$$

Equation (7) can be rewritten, thanks to the definition of $r_\mu$ and the affine decomposition of $a_\mu$:
$$\langle y_\mu, v \rangle = f(v) - \sum_{q=1}^{Q} \Theta_q(\mu)\, a_q(\tilde u_\mu, v) \quad \forall v \in X$$
and so, making use of the reduced basis decomposition (2) of $\tilde u_\mu$:
$$\langle y_\mu, v \rangle = f(v) - \sum_{q=1}^{Q} \sum_{n=1}^{N} \Theta_q(\mu)\, c_n(\mu)\, a_q(\zeta_n, v) \quad \forall v \in X$$

This equation can be viewed as a linear variational problem whose unknown is $y_\mu \in X$. By Duhamel's superposition principle, $y_\mu$ can be written as:
$$y_\mu = \gamma - \sum_{q=1}^{Q} \sum_{n=1}^{N} \Theta_q(\mu)\, c_n(\mu)\, \Gamma_{qn} \qquad (8)$$
where $\gamma \in X$ and $\Gamma_{qn} \in X$ ($q = 1, \ldots, Q$, $n = 1, \ldots, N$) satisfy:
$$\langle \gamma, v \rangle = f(v) \quad \forall v \in X \qquad (9)$$
$$\langle \Gamma_{qn}, v \rangle = a_q(\zeta_n, v) \quad \forall v \in X \qquad (10)$$
The law of cosines applied to (8) gives:
$$\|y_\mu\|^2 = \|\gamma\|^2 - 2 \sum_{q=1}^{Q} \sum_{n=1}^{N} \Theta_q(\mu)\, c_n(\mu)\, \langle \gamma, \Gamma_{qn} \rangle + \sum_{q=1}^{Q} \sum_{q'=1}^{Q} \sum_{n=1}^{N} \sum_{n'=1}^{N} \Theta_q(\mu)\, c_n(\mu)\, \Theta_{q'}(\mu)\, c_{n'}(\mu)\, \langle \Gamma_{qn}, \Gamma_{q'n'} \rangle \qquad (11)$$

Equation (11) will be our key to an offline/online procedure for computing $\|r_\mu\|_X$. Each of these phases should be performed after the corresponding phase of the computation of $\tilde u_\mu$ described in section 2.2. The procedures read:

• in the offline phase:

  - we solve the $QN + 1$ linear variational problems (9) and (10) for $\gamma$ and the $\Gamma_{qn}$ ($q = 1, \ldots, Q$, $n = 1, \ldots, N$);

  - we compute and store the $1 + QN + (QN)^2$ inner products:
$$\langle \gamma, \gamma \rangle, \quad \langle \gamma, \Gamma_{qn} \rangle, \quad \langle \Gamma_{qn}, \Gamma_{q'n'} \rangle \qquad (q, q' = 1, \ldots, Q;\; n, n' = 1, \ldots, N)$$

• in the online phase, we compute $\|r_\mu\|_X = \|y_\mu\|$ using equation (11).

We note that the vectors $\gamma$ and $\Gamma_{qn}$ computed during the offline phase are not saved for the online phase; we store only the $1 + QN + (QN)^2$ scalars. Thus, in the online phase, the actual vector $y_\mu$ is never formed. Doing so would make the complexity of the online phase depend on $\dim X$; this is something we would like to avoid.
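A possible online evaluation of (11) from the stored scalars, again as an illustrative NumPy sketch (the arrays g_gamma, g_gamma_Gamma and g_Gamma_Gamma holding the inner products $\langle\gamma,\gamma\rangle$, $\langle\gamma,\Gamma_{qn}\rangle$, $\langle\Gamma_{qn},\Gamma_{q'n'}\rangle$ are assumed to have been computed offline):

```python
import numpy as np

def residual_dual_norm(theta_mu, c_mu, g_gamma, g_gamma_Gamma, g_Gamma_Gamma):
    """Online evaluation of ||r_mu||_X = ||y_mu|| via equation (11).
    theta_mu: (Q,) values Theta_q(mu); c_mu: (N,) RB coefficients;
    g_gamma: scalar <gamma, gamma>; g_gamma_Gamma: (Q, N); g_Gamma_Gamma: (Q, N, Q, N)."""
    w = np.outer(theta_mu, c_mu)                 # w[q, n] = Theta_q(mu) * c_n(mu)
    sq = (g_gamma
          - 2.0 * np.sum(w * g_gamma_Gamma)
          + np.einsum("qn,qnrm,rm->", w, g_Gamma_Gamma, w))
    return np.sqrt(max(sq, 0.0))                 # guard against round-off
```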

3.3 Stability constant $\alpha(\mu)$

We now turn to the problem of estimating $\alpha(\mu)$.

Recall that
$$\alpha(\mu) = \inf_{v \in X,\, \|v\| = 1} a_\mu(v, v)$$
$\alpha(\mu)$ is thus the optimal value of a minimization problem posed on $X$: the objective function is a quadratic form, which we seek to minimize on the unit sphere. It is a well-known fact that $\alpha(\mu)$ is the smallest eigenvalue of a (symmetric) matrix associated with the quadratic form $a_\mu(\cdot, \cdot)$. Indeed, if for all $v$ and $w$ in $X$ we have $a_\mu(v, w) = v^T M_\mu w$ and $\langle v, w \rangle = v^T \Omega w$ for symmetric matrices $M_\mu$ and $\Omega$, then $\alpha(\mu)$ is the smallest $\lambda \in \mathbb{R}$ such that the equation
$$M_\mu v = \lambda\, \Omega\, v$$
admits a nonzero solution $v$.

Due to the high dimensionality of $X$, it would be too expensive to solve this (generalized) eigenproblem online, for each value of $\mu$. However, we expect that the affine decomposition of $a_\mu$ will provide us with an efficient offline/online procedure for estimating $\alpha(\mu)$. By estimating we mean here finding a lower bound $\tilde\alpha(\mu)$ for $\alpha(\mu)$, because the form of the error estimator (6) shows that such a lower bound is all we need:
$$\tilde\alpha(\mu) \le \alpha(\mu) \quad \text{implies} \quad \|u_\mu - \tilde u_\mu\| \le \frac{\|r_\mu\|_X}{\tilde\alpha(\mu)}$$

3.3.1 An easy case

We should first address the case where $\Theta_q(\mu) \ge 0$ for all $q = 1, \ldots, Q$ and all $\mu \in \mathcal{P}$.

If this condition is met, then for any $\mu \in \mathcal{P}$, any $\bar\mu \in \mathcal{P}$ such that $\Theta_q(\bar\mu) \neq 0$ for all $q$, and any $w \in X$:
$$\sum_{q=1}^{Q} \Theta_q(\mu)\, a_q(w, w) = \sum_{q=1}^{Q} \frac{\Theta_q(\mu)}{\Theta_q(\bar\mu)}\, \Theta_q(\bar\mu)\, a_q(w, w) \ge \left( \inf_{q=1,\ldots,Q} \frac{\Theta_q(\mu)}{\Theta_q(\bar\mu)} \right) \sum_{q=1}^{Q} \Theta_q(\bar\mu)\, a_q(w, w) \ge \left( \inf_{q=1,\ldots,Q} \frac{\Theta_q(\mu)}{\Theta_q(\bar\mu)} \right) \alpha(\bar\mu)\, \|w\|^2$$

This leads us to the following offline/online procedure for $\tilde\alpha(\mu)$:

• Offline:

  - choose a "nominal" value $\bar\mu \in \mathcal{P}$ for which $\Theta_q(\bar\mu) \neq 0$ for all $q = 1, \ldots, Q$;

  - compute and store $\alpha(\bar\mu)$ by solving an eigenproblem on $X$.

• Online, given a value of $\mu$, take
$$\tilde\alpha(\mu) = \left( \inf_{q = 1, \ldots, Q} \frac{\Theta_q(\mu)}{\Theta_q(\bar\mu)} \right) \alpha(\bar\mu)$$

This procedure is simple and efficient, with a $\dim X$-independent online stage.
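In code, this lower bound is a one-liner; the sketch below is purely illustrative and presumes, as in the hypothesis above, that $\Theta_q(\mu) \ge 0$ and $\Theta_q(\bar\mu) > 0$:

```python
import numpy as np

def alpha_lower_bound(theta_mu, theta_mu_bar, alpha_mu_bar):
    """Online lower bound: min_q Theta_q(mu)/Theta_q(mu_bar) times alpha(mu_bar)."""
    return np.min(np.asarray(theta_mu) / np.asarray(theta_mu_bar)) * alpha_mu_bar
```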

However, if our positivity hypothesis on the $\Theta_q$ is violated, we have to resort to a more complex method, which we describe below.

3.3.2 The general case: successive constraints method (SCM)

The successive constraints method (SCM) is presented in [3].

$\alpha(\mu)$ can be rewritten as
$$\alpha(\mu) = \inf_{w \in X,\, \|w\| = 1} a_\mu(w, w)$$
We further note that:
$$\alpha(\mu) = \inf_{w \in X,\, \|w\| = 1} \sum_{q=1}^{Q} \Theta_q(\mu)\, a_q(w, w) = \inf_{y \in \mathcal{Y}} \sum_{q=1}^{Q} \Theta_q(\mu)\, y_q \qquad (12)$$
where:
$$\mathcal{Y} = \left\{ (y_1, \ldots, y_Q) \in \mathbb{R}^Q \;\text{ s.t. }\; \exists w \in X,\; \|w\| = 1 \text{ and } y_q = a_q(w, w) \;\; \forall q = 1, \ldots, Q \right\}$$
To obtain a lower bound on $\alpha(\mu)$, we should replace $\mathcal{Y}$ by a $\widetilde{\mathcal{Y}} \supseteq \mathcal{Y}$; in other words, we should relax the constraints of the optimization problem (12).

To define $\widetilde{\mathcal{Y}}$, we begin by noticing that $\mathcal{Y}$ is bounded: let
$$\sigma_q^- = \inf_{w \in X,\, \|w\| = 1} a_q(w, w), \qquad \sigma_q^+ = \sup_{w \in X,\, \|w\| = 1} a_q(w, w), \qquad \mathcal{B} = \prod_{q=1}^{Q} [\sigma_q^-; \sigma_q^+].$$

Now let $y = (y_1, \ldots, y_Q) \in \mathcal{Y}$ and let $w$ satisfy the properties in the definition of $\mathcal{Y}$. For every $\mu' \in \mathcal{P}$, we have:
$$\sum_{q} \Theta_q(\mu')\, y_q = \sum_{q} \Theta_q(\mu')\, a_q(w, w) \ge \inf_{w \in X,\, \|w\| = 1} \sum_{q} \Theta_q(\mu')\, a_q(w, w) = \alpha(\mu')$$
So far we have:
$$\mathcal{Y} \subset \left\{ y \in \mathcal{B} \;\middle|\; \sum_{q=1}^{Q} \Theta_q(\mu')\, y_q \ge \alpha(\mu') \;\; \forall \mu' \in \mathcal{P} \right\}$$

The right-hand side of this inclusion would be a good candidate for $\widetilde{\mathcal{Y}}$; however, it is defined by an infinite number of $\mu$-independent constraints (one for each $\mu' \in \mathcal{P}$). We should relax these constraints one more time in order to get a finite number of (now $\mu$-dependent) constraints. We take:
$$\widetilde{\mathcal{Y}} = \left\{ y \in \mathcal{B} \;\middle|\; \sum_{q=1}^{Q} \Theta_q(\mu')\, y_q \ge \alpha(\mu') \;\; \forall \mu' \in S_M(\mu, \mathcal{C}) \right\}$$
where $M \in \mathbb{N}$, $\mathcal{C}$ is a finite subset of $\mathcal{P}$, and $S_M(\mu, \mathcal{C})$ is the set of the $M$ points in $\mathcal{C}$ that are closest (with respect to some metric) to $\mu$.

So our SCM offline/online procedure is:

• offline:

  - choose $M$; choose $\mathcal{C}$ (we postpone the discussion about the choice of $\mathcal{C}$ until the next subsection);

  - for each $\mu' \in \mathcal{C}$, compute and store $\alpha(\mu')$ by solving an eigenproblem on $X$; compute and store $\sigma_q^-$ and $\sigma_q^+$ the same way, for each $q = 1, \ldots, Q$.

• online: solve the optimization problem:
$$\tilde\alpha(\mu) = \inf_{y = (y_1, \ldots, y_Q) \in \widetilde{\mathcal{Y}}} \sum_{q=1}^{Q} \Theta_q(\mu)\, y_q$$

The online optimization problem is what is called a linear programming problem: the goal is to minimize a linear function under a finite number of linear inequality constraints. In our case there are $Q$ variables and $Q + M$ constraints.

There exist efficient algorithms, such as the simplex algorithm (see [6] for instance), which solve such optimization problems with (on average) polynomial complexity with respect to the number of variables and the number of constraints, even if they can be exponential in the worst cases. The key here, once again, is that the online complexity is still independent of $\dim X$.

A last remark about the algorithm concerns the trade-off in the choice of $M$: whatever $M$ is, we always get a certified bound on $\alpha(\mu)$, but increasing $M$ will improve the sharpness of this bound, at the expense of an increase in online computation time.
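Since the online SCM problem is a small linear program, any LP solver can be used. Below is an illustrative sketch with scipy.optimize.linprog; the inputs sigma_minus, sigma_plus and neighbors (the stored $\Theta_q(\mu')$ and $\alpha(\mu')$ values for the $M$ closest points of $\mathcal{C}$) are assumptions about how the offline data is stored.

```python
import numpy as np
from scipy.optimize import linprog

def scm_lower_bound(theta_mu, sigma_minus, sigma_plus, neighbors):
    """Online SCM: minimize sum_q Theta_q(mu) y_q over the relaxed set Y~.
    theta_mu: (Q,) array; sigma_minus/plus: (Q,) box bounds;
    neighbors: list of (theta_mu_prime, alpha_mu_prime) pairs, one per point of S_M(mu, C)."""
    # stability constraints sum_q Theta_q(mu') y_q >= alpha(mu')  <=>  -Theta(mu') . y <= -alpha(mu')
    A_ub = np.array([-np.asarray(t) for t, _ in neighbors])
    b_ub = np.array([-a for _, a in neighbors])
    bounds = list(zip(sigma_minus, sigma_plus))
    res = linprog(c=theta_mu, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.fun  # the certified lower bound alpha~(mu), assuming the LP is feasible
```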

Choice of $\mathcal{C}$

We will use a so-called greedy procedure for choosing the finite subset $\mathcal{C} \subset \mathcal{P}$. Roughly speaking, a decision algorithm is said to be greedy if it makes the best possible choice at each step. This strategy does not always yield a globally optimal decision, but often gives a good solution in cases where direct global optimization would be impractical.

For our greedy choice of $\mathcal{C}$, we will start with $\mathcal{C} = \{\mu_1\}$, with an arbitrarily chosen $\mu_1 \in \mathcal{P}$. Then, at each step of the algorithm, we are going to add the point to $\mathcal{C}$ that gives the best possible expected improvement in the precision of our bound $\tilde\alpha(\mu)$.

Thus we get the following procedure for the choice of $\mathcal{C}$:

1. choose $M$;

2. initialize $\mathcal{C} = \{\mu_1\}$ with some arbitrary $\mu_1 \in \mathcal{P}$;

3. choose a finite-sized sample set $\Xi \subset \mathcal{P}$;

4. repeat:

  - add
$$\arg\max_{\mu \in \Xi} \big( \alpha(\mu) - \tilde\alpha(\mu) \big)$$
    to $\mathcal{C}$ (here $\tilde\alpha(\mu)$ stands for the SCM lower bound computed using the "current" $\mathcal{C}$).

The repeat loop can be stopped either when $\#\mathcal{C}$ has reached a maximal value, or when $\max_{\mu \in \Xi} \big( \alpha(\mu) - \tilde\alpha(\mu) \big)$ is less than a desired precision.

The greedy algorithm written above would require the computation of $\alpha(\mu)$ for $\#\Xi \times (1 + 2 + \ldots + (\#\mathcal{C} - 1))$ values of $\mu$. If one chooses a large sample $\Xi$, the cost of the choice of $\mathcal{C}$ might be prohibitive. Rather, one could consider a cheap upper bound $\tilde\alpha_{\mathrm{up}}(\mu)$ for $\alpha(\mu)$ and replace $\alpha(\mu) - \tilde\alpha(\mu)$ by the relative surrogate sharpness estimator:
$$\frac{\tilde\alpha_{\mathrm{up}}(\mu) - \tilde\alpha(\mu)}{\tilde\alpha_{\mathrm{up}}(\mu)}.$$

This cheap upper bound could be built using the following: let us define
$$\widetilde{\mathcal{Y}}_{\mathrm{up}} = \{ y(\mu_k),\; k = 1, \ldots, K \}$$
where $\{\mu_1, \ldots, \mu_K\} = \mathcal{C}$ and
$$y(\mu_k) = \arg\inf_{y \in \mathcal{Y}} \sum_{q=1}^{Q} \Theta_q(\mu_k)\, y_q \qquad (k = 1, \ldots, K)$$
We can define:
$$\tilde\alpha_{\mathrm{up}}(\mu) = \inf_{y \in \widetilde{\mathcal{Y}}_{\mathrm{up}}} \sum_{q=1}^{Q} \Theta_q(\mu)\, y_q$$
which satisfies $\tilde\alpha_{\mathrm{up}}(\mu) \ge \alpha(\mu)$ because $\widetilde{\mathcal{Y}}_{\mathrm{up}} \subset \mathcal{Y}$.

Using $\tilde\alpha_{\mathrm{up}}(\mu)$ as a surrogate for $\alpha(\mu)$ reduces the fixed complexity of the greedy algorithm to $K = \#\mathcal{C}$ eigenproblems on $X$ (for the computation of the $y(\mu_k)$) and $\#\Xi$ trivial minimizations per step over the reasonably-sized, finite $\widetilde{\mathcal{Y}}_{\mathrm{up}}$.

4 Choice of a reduced basis

In this section, we present two methods for choosing our reduced basis $\{\zeta_1, \ldots, \zeta_N\}$.

These two methods have different offline complexities and lead to different bases. Both nicely fit into our reduced basis framework, and can benefit from the same procedure for error estimation, detailed in section 3.

4.1 Greedy procedure

The greedy procedure is treated in [5].

The following algorithm is based on the same heuristic as the one described earlier for choosing $\mathcal{C}$ in the SCM: at each step of the algorithm, we put one new function in the reduced basis; this function is the solution $u_\mu$ of the PDE for a specific value of the parameter, chosen so as to give the best expected improvement in the online error bound, i.e. the value of the parameter for which we make the greatest (estimated) error when using the "current" reduced basis.

The greedy algorithm for the choice of the reduced basis is the following: given $N \in \mathbb{N}$, the desired final size of the reduced basis,

• Choose a finite-sized, random, large sample of parameters $\Xi \subset \mathcal{P}$.

• Choose $\mu_1 \in \mathcal{P}$ at random, and set $\zeta_1 = u_{\mu_1}$.

• Repeat, for $n$ from 2 to $N$:

  - For each $\mu \in \Xi$, compute the online error bound for $\tilde u_\mu$ when using $\{\zeta_1, \ldots, \zeta_{n-1}\}$ as reduced basis.

  - Find $\mu_n \in \Xi$ with the greatest online error bound.

  - Set $\zeta_n = u_{\mu_n}$.

This procedure requires the computation of $N$ actual solutions, and $\#\Xi \times N$ online error bounds. It relies crucially on the existence of a cheap online error bound.

Of course, instead of fixing a target size for the reduced basis, one can also keep iterating and adding functions to the reduced basis until the greatest online error bound (over the $\Xi$ sample of parameters) gets below some prescribed value.
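The greedy loop can be summarized by the following sketch; the callables truth_solve, error_bound and update_offline_data are placeholders (for the expensive solver, the online bound of section 3 and the offline assembly of section 2.2) and are not specified in the report.

```python
def greedy_reduced_basis(Xi, N, truth_solve, error_bound, update_offline_data, mu_1):
    """Greedy selection of a reduced basis of size N over the training sample Xi."""
    basis = [truth_solve(mu_1)]               # zeta_1 = u_{mu_1}
    offline_data = update_offline_data(basis)
    for _ in range(2, N + 1):
        # parameter with the worst certified (online) error bound for the current basis
        mu_worst = max(Xi, key=lambda mu: error_bound(mu, offline_data))
        basis.append(truth_solve(mu_worst))
        offline_data = update_offline_data(basis)
    return basis
```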

4.2 Proper Orthogonal Decomposition (POD)-driven procedure

To our knowledge, the following presentation of the Proper Orthogonal Decomposition is new. We refer to [8] for a different presentation, based on singular value decomposition (SVD).

First mode

Suppose, for a soft start, that we want to find a reduced basis containing only one item: $\{\zeta_1\}$. Without loss of generality, we can always suppose that $\|\zeta_1\| = 1$. It seems reasonable to make the choice which gives the least summed squared error when orthogonally projecting the solutions $u_\mu$ (for all $\mu \in \mathcal{P}$) onto the reduced space $\mathrm{Span}\{\zeta_1\}$. Said with formulas, we seek $\zeta_1$ satisfying:
$$\zeta_1 = \arg\inf_{p \in X,\, \|p\| = 1} \int_{\mu \in \mathcal{P}} \left\| u_\mu - \langle u_\mu, p \rangle\, p \right\|^2 d\mu$$

The function $\zeta_1$ will be called our first POD mode.

Unfortunately, the integral appearing in the objective function above is, in general, analytically intractable. We can approximate it using a finite sample of parameters $\Xi \subset \mathcal{P}$; the integral then gets replaced by a finite sum over $\Xi$:
$$\zeta_1 = \arg\inf_{p \in X,\, \|p\| = 1} \sum_{\mu \in \Xi} \left\| u_\mu - \langle u_\mu, p \rangle\, p \right\|^2$$

The set of solutions $\{u_\mu,\; \mu \in \Xi\}$ is called the snapshots ensemble.

The objective function can then be rewritten:
$$\sum_{\mu \in \Xi} \left\| u_\mu - \langle u_\mu, p \rangle\, p \right\|^2 = \sum_{\mu \in \Xi} \|u_\mu\|^2 - 2 \sum_{\mu \in \Xi} \langle u_\mu, p \rangle^2 + \sum_{\mu \in \Xi} \langle u_\mu, p \rangle^2 \|p\|^2 = \sum_{\mu \in \Xi} \|u_\mu\|^2 - \sum_{\mu \in \Xi} \langle u_\mu, p \rangle^2$$
using that $\|p\|^2 = 1$.

And so:
$$\zeta_1 = \arg\max_{p \in X,\, \|p\| = 1} \sum_{\mu \in \Xi} \langle u_\mu, p \rangle^2 \qquad (13)$$
The objective function in (13) is a quadratic form in $p$, which has to be maximized under a "unit norm" constraint. We know this can be reduced to an eigenproblem.

More specifically, let us introduce a basis $\{\phi_1, \ldots, \phi_{\mathcal{N}}\}$ of $X$, where $\mathcal{N} = \dim X$. As we stressed in the introduction, $\mathcal{N}$ is large. We enumerate the snapshot parameters: $\Xi = \{\mu_1, \ldots, \mu_{N_{\mathrm{snap}}}\}$.

Each snapshot $u^{(i)} = u_{\mu_i}$ ($i = 1, \ldots, N_{\mathrm{snap}}$) has an expansion in the basis of $X$:
$$u^{(i)} = \sum_{j=1}^{\mathcal{N}} u^{(i)}_j\, \phi_j$$
and we can write down $M$, the matrix of the $u^{(i)}_j$ coefficients:
$$M = \begin{pmatrix} u^{(1)}_1 & u^{(2)}_1 & \ldots & u^{(N_{\mathrm{snap}})}_1 \\ u^{(1)}_2 & u^{(2)}_2 & \ldots & u^{(N_{\mathrm{snap}})}_2 \\ \vdots & \vdots & & \vdots \\ u^{(1)}_{\mathcal{N}} & u^{(2)}_{\mathcal{N}} & \ldots & u^{(N_{\mathrm{snap}})}_{\mathcal{N}} \end{pmatrix}$$

Denote by $\Omega$ the (symmetric, positive definite) matrix of the scalar product $\langle \cdot, \cdot \rangle$ in $X$.

We can write down our objective function as:
$$\sum_{\mu \in \Xi} \langle u_\mu, p \rangle^2 = p^T \Omega M M^T \Omega\, p$$

The condition of optimality for (13) is then given by the Lagrange multiplier theorem: there has to exist $\lambda \in \mathbb{R}$ such that:
$$\Omega M M^T \Omega\, \zeta_1 = \lambda\, \Omega\, \zeta_1 \qquad (14)$$
and $\lambda$ is the optimal value of the problem (as can be seen by left-multiplying (14) by $\zeta_1^T$ and using $\|\zeta_1\|^2 = 1$), which should be made as large as possible.

Now left-multiply (14) by $\Omega^{-T} = \Omega^{-1}$:
$$M M^T \Omega\, \zeta_1 = \lambda\, \zeta_1 \qquad (15)$$
so that $\zeta_1$ is the unit eigenvector associated with the greatest eigenvalue of $M M^T \Omega$.

So we have reduced our problem to an eigenproblem of dimension $\mathcal{N}$, because $M M^T \Omega$ is of size $\mathcal{N} \times \mathcal{N}$. Since $\mathcal{N}$ can be large, this could turn out to be highly impractical. Fortunately, we are going to see that we can solve the same problem using an eigenproblem of size $N_{\mathrm{snap}} \times N_{\mathrm{snap}}$.

The key is to look for $\zeta_1$ as a linear combination of the snapshots, instead of the "generic" basis items $\{\phi_i\}$.

We write:
$$\zeta_1 = \sum_{i=1}^{N_{\mathrm{snap}}} z_{1i}\, u^{(i)} = M z_1$$
where $z_1 = (z_{11}, \ldots, z_{1,N_{\mathrm{snap}}})^T$. Setting $\zeta_1 = M z_1$ in (15) yields:
$$M M^T \Omega\, M z_1 = \lambda\, M z_1$$
and a sufficient condition for this to hold is:
$$M^T \Omega M\, z_1 = \lambda\, z_1$$
so that our new unknown $z_1 \in \mathbb{R}^{N_{\mathrm{snap}}}$ is an eigenvector of $M^T \Omega M$ (which is of size $N_{\mathrm{snap}} \times N_{\mathrm{snap}}$) associated with its greatest eigenvalue.

The absence of optimality loss when looking for $\zeta_1$ in $\mathrm{Range}(M)$ (in other words, the fact that $\zeta_1 = M z_1$, with $z_1$ as defined above, satisfies (13)) follows from these facts:

1. if $z$ is a nonzero eigenvector of $M^T \Omega M$ (associated with $\lambda \neq 0$), then $\zeta = M z$ is a nonzero eigenvector of $M M^T \Omega$ associated with $\lambda$;

2. if $\zeta$ is a nonzero eigenvector of $M M^T \Omega$ (associated with $\lambda \neq 0$), then $z = M^T \Omega \zeta$ is a nonzero eigenvector of $M^T \Omega M$ associated with $\lambda$;

3. $M^T \Omega M$ and $M M^T \Omega$ have the same nonzero eigenvalues, each with the same multiplicity in one and the other.

Proof: To see 1., first note that if we had $M z = 0$, then $M^T \Omega M z = 0$, and so $z$ could not be an eigenvector associated with a nonzero eigenvalue of $M^T \Omega M$; thus $M z \neq 0$. Now $M M^T \Omega (M z) = M (M^T \Omega M z) = M \lambda z = \lambda (M z)$, so that $M z$ is an eigenvector of $M M^T \Omega$ associated with $\lambda$. Point 2. is 1. mutatis mutandis. Point 3.: it is clear from 1. and 2. that the two matrices have the same nonzero eigenvalues. To get the statement about the multiplicities, suppose that $\mathrm{Span}\{z_1, \ldots, z_k\}$ is the eigenspace of $M^T \Omega M$ associated with the eigenvalue $\lambda$. Since $M^T \Omega M$ is self-adjoint with respect to the Euclidean inner product (i.e., is symmetric), we can suppose that $\{z_1, \ldots, z_k\}$ is orthogonal (for the Euclidean inner product). Now, from 1., $\mathrm{Span}\{M z_1, \ldots, M z_k\}$ is contained in the eigenspace of $M M^T \Omega$ associated with $\lambda$. Moreover, for every $i \neq j$:
$$\langle M z_i, M z_j \rangle = z_j^T M^T \Omega M z_i = \lambda\, z_j^T z_i = 0$$
so that $\{M z_1, \ldots, M z_k\}$ is $\langle \cdot, \cdot \rangle$-orthogonal, and therefore linearly independent, and so $\lambda$ has multiplicity $k$ in $M M^T \Omega$. $\square$

Second mode

Now let's say we are looking for a set of two functions as our reduced basis: $\{\zeta_1, \zeta_2\}$. As earlier, we want $(\zeta_1, \zeta_2)$ to minimize:
$$\sum_{\mu \in \Xi} \left\| u_\mu - \big( \langle u_\mu, \zeta_1 \rangle\, \zeta_1 + \langle u_\mu, \zeta_2 \rangle\, \zeta_2 \big) \right\|^2$$
among the $(\zeta_1, \zeta_2) \in X \times X$ satisfying $\|\zeta_1\| = \|\zeta_2\| = 1$ and $\langle \zeta_1, \zeta_2 \rangle = 0$.

In other words, we aim to minimize the overall error (over our sample of parameters $\Xi$) made when orthogonally projecting $u_\mu$ onto $\mathrm{Span}\{\zeta_1, \zeta_2\}$, with $\{\zeta_1, \zeta_2\}$ an orthonormal family.

Mimicking what we have done earlier, we rewrite our objective function as:
$$\sum_{\mu \in \Xi} \left\| u_\mu - \big( \langle u_\mu, \zeta_1 \rangle\, \zeta_1 + \langle u_\mu, \zeta_2 \rangle\, \zeta_2 \big) \right\|^2 = \sum_{\mu \in \Xi} \|u_\mu\|^2 - 2 \sum_{\mu \in \Xi} \langle u_\mu, \zeta_1 \rangle^2 - 2 \sum_{\mu \in \Xi} \langle u_\mu, \zeta_2 \rangle^2 + \sum_{\mu \in \Xi} \langle u_\mu, \zeta_1 \rangle^2 \|\zeta_1\|^2 + \sum_{\mu \in \Xi} \langle u_\mu, \zeta_2 \rangle^2 \|\zeta_2\|^2 + 2 \sum_{\mu \in \Xi} \langle u_\mu, \zeta_1 \rangle \langle u_\mu, \zeta_2 \rangle \langle \zeta_1, \zeta_2 \rangle$$
$$= \text{Constant} - \sum_{\mu \in \Xi} \langle u_\mu, \zeta_1 \rangle^2 - \sum_{\mu \in \Xi} \langle u_\mu, \zeta_2 \rangle^2$$
because of the constraints on $\zeta_1$ and $\zeta_2$.

The objective function, apart from the terms that are independent of $\zeta_1$ and $\zeta_2$, is the sum of a function depending on $\zeta_1$ only and a function depending on $\zeta_2$ only. Thus the joint optimization on $(\zeta_1, \zeta_2)$ is equivalent to successive optimizations on $\zeta_1$, then on $\zeta_2$. The optimization on $\zeta_1$ has been done previously.

The optimization in $\zeta_2$ is the following:
$$\zeta_2 = \arg\max_{p \in X,\, \|p\| = 1,\, \langle p, \zeta_1 \rangle = 0} \sum_{\mu \in \Xi} \langle u_\mu, p \rangle^2$$
The objective function has exactly the same matrix expression as the one of the problem defining the first POD mode:
$$\sum_{\mu \in \Xi} \langle u_\mu, p \rangle^2 = p^T \Omega M M^T \Omega\, p$$

The Lagrange multiplier theorem now gives that there exists $\lambda_2 \in \mathbb{R}$ such that:
$$\Omega M M^T \Omega\, \zeta_2 = \lambda_2\, \Omega\, \zeta_2 \qquad (16)$$
As before, left-multiplying by $\zeta_2^T$ gives that $\lambda_2$ is the optimal value of the problem, which should be made as large as possible.

Left-multiplying (16) by $\Omega^{-T}$, we get that (16) is equivalent to:
$$M M^T \Omega\, \zeta_2 = \lambda_2\, \zeta_2$$
so $\lambda_2$ is an eigenvalue of $M M^T \Omega$.

Now we have, for every $x, y \in X$:
$$\langle M M^T \Omega\, x,\, y \rangle = x^T \Omega M M^T \Omega\, y = x^T \big( M M^T \Omega \big)^T \Omega\, y = \langle x,\, M M^T \Omega\, y \rangle$$
so that $M M^T \Omega$ is self-adjoint with respect to $\langle \cdot, \cdot \rangle$. So we can find a unit eigenvector $\zeta_2$, orthogonal to $\zeta_1$, associated with the second greatest eigenvalue (note that this eigenvalue is equal to $\lambda_1$ if $\lambda_1$ has multiplicity $> 1$).

As earlier, we can use the fact that $M M^T \Omega$ and $M^T \Omega M$ have the same nonzero eigenvalues to get $\zeta_2 = M z_2$, where $z_2$ is an eigenvector of the smaller matrix $M^T \Omega M$. The fact that the multiplicities of nonzero eigenvalues in $M M^T \Omega$ and $M^T \Omega M$ are the same ensures that this procedure is correct even if $\lambda_1 = \lambda_2$.

Following modes, and summary for POD

We can readily repeat what we have done for $\zeta_2$ to prove:

Let $\Xi = \{\mu_1, \ldots, \mu_{N_{\mathrm{snap}}}\}$ be a finite part of $\mathcal{P}$. Let $M$ be the matrix whose columns are the coefficients of $u_{\mu_1}, \ldots, u_{\mu_{N_{\mathrm{snap}}}}$ with respect to some basis of $X$. Let $\Omega$ be the matrix of the inner product in $X$.

Then, the family $(\zeta_1, \ldots, \zeta_N)$ of vectors in $X$ that minimizes:
$$\sum_{\mu \in \Xi} \left\| u_\mu - \big( \langle u_\mu, \zeta_1 \rangle\, \zeta_1 + \langle u_\mu, \zeta_2 \rangle\, \zeta_2 + \ldots + \langle u_\mu, \zeta_N \rangle\, \zeta_N \big) \right\|^2$$
under the constraints
$$\langle \zeta_i, \zeta_j \rangle = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{else} \end{cases}$$
is defined by:
$$\zeta_1 = \frac{1}{\|M z_1\|}\, M z_1, \quad \ldots, \quad \zeta_N = \frac{1}{\|M z_N\|}\, M z_N$$
where $z_k$ ($k = 1, \ldots, N$) is a nonzero eigenvector associated with the $k$-th largest eigenvalue (counting repeated eigenvalues with their multiplicities) of $M^T \Omega M$.

The statement above holds for every $N \in \mathbb{N}$ such that the $N$-th largest eigenvalue of $M^T \Omega M$ is nonzero.

The functions $(\zeta_1, \ldots, \zeta_N)$ defined above are called the first $N$ POD modes associated with the snapshots ensemble $\Xi$.

A note about the (offline) complexity of the POD: for an $N$-sized reduced basis, it requires the computation of $N_{\mathrm{snap}}$ "expensive" solutions $u_\mu$ and the extraction of the $N$ leading eigenelements of an $N_{\mathrm{snap}} \times N_{\mathrm{snap}}$ symmetric matrix.
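In matrix terms, the whole POD construction boils down to one symmetric eigenproblem of size $N_{\mathrm{snap}}$; here is an illustrative NumPy sketch, where the snapshot matrix M and the inner-product matrix Omega are assumed given (e.g. from a finite element code).

```python
import numpy as np

def pod_modes(M, Omega, N):
    """First N POD modes from the snapshot matrix M (dim X x N_snap) and Gram matrix Omega."""
    C = M.T @ Omega @ M                      # correlation matrix M^T Omega M (N_snap x N_snap)
    eigval, eigvec = np.linalg.eigh(C)       # ascending eigenvalues of a symmetric matrix
    order = np.argsort(eigval)[::-1][:N]     # indices of the N largest eigenvalues
    modes = []
    for k in order:
        zeta = M @ eigvec[:, k]              # zeta_k = M z_k
        modes.append(zeta / np.sqrt(zeta @ Omega @ zeta))   # normalize in the <.,.> norm
    return np.column_stack(modes)            # columns are the POD modes zeta_1, ..., zeta_N
```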

5 Case of a scalar output

One is often not interested in the full solution $u_\mu$, but rather in an "output" calculated from $u_\mu$:
$$s_\mu = \eta(u_\mu)$$
where $\eta : X \to \mathbb{R}$ is a ($\mu$-independent) functional.

We present here some approaches for the computation of a surrogate output $\tilde s_\mu$ from $\tilde u_\mu$.

5.1 Primal approach

A trivial approach to the question is to compute $\tilde u_\mu$, and then, online:
$$\tilde s_\mu = \eta(\tilde u_\mu) \qquad (17)$$
Done naively, this procedure might be $\dim X$-dependent, since the evaluation of $\eta$ may require access to every component of $\tilde u_\mu$ in the original $X$ space.

A way of overcoming this problem, if $\eta$ is a linear functional, is to write that:
$$\tilde s_\mu = \eta\!\left( \sum_{n=1}^{N} c_n(\mu)\, \zeta_n \right) = \sum_{n=1}^{N} c_n(\mu)\, \eta(\zeta_n)$$

and the offline/online procedure becomes:

• offline: compute and store $\eta(\zeta_n)$, $n = 1, \ldots, N$;

• online: find the coefficients $c_n(\mu)$ ($n = 1, \ldots, N$) of (2) using the procedure described in section 2.2, and compute
$$\tilde s_\mu = \sum_{n=1}^{N} c_n(\mu)\, \eta(\zeta_n) \qquad (18)$$

And now the online phase is still $\dim X$-independent.

For a quadratic output functional, one proceeds the same way using:
$$\tilde s_\mu = \sum_{n=1}^{N} \sum_{n'=1}^{N} c_n(\mu)\, c_{n'}(\mu)\, \hat\eta(\zeta_n, \zeta_{n'})$$
where $\hat\eta$ is the symmetric bilinear form associated with $\eta$.

This time, one has to compute and store (offline) the $\hat\eta(\zeta_n, \zeta_{n'})$, for $n, n' = 1, \ldots, N$.

The same procedure can easily be generalized for polynomial outputs.

Error bound for the output

A straightforward bound for $|s_\mu - \tilde s_\mu|$ can be found under a Lipschitz hypothesis for $\eta$: let $L > 0$ be such that, whatever $v, w \in X$:
$$|\eta(v) - \eta(w)| \le L\, \|v - w\|$$
(for example, if $\eta$ is a continuous linear functional, $L$ can be chosen as the dual norm of $\eta$).

Then we have:
$$|s_\mu - \tilde s_\mu| \le L\, \epsilon_\mu \qquad (19)$$

5.2 Primal/dual approach

For linear outputs, the error bound (19) can be improved, at the expense of, in some cases, an increase in computation time and program complexity. The idea, present in [5], is to introduce the dual problem of (1) associated with $\eta$:
$$a_\mu(v, \psi_\mu) = -\eta(v) \quad \forall v \in X \qquad (20)$$
whose unknown is $\psi_\mu \in X$.

This dual problem is an affinely parametrized PDE to which we can apply the RB framework. That is, given a reduced basis $\{\zeta_1^{\mathrm{du}}, \ldots, \zeta_{N^{\mathrm{du}}}^{\mathrm{du}}\}$ and a reduced space $\widetilde X^{\mathrm{du}} = \mathrm{Span}\{\zeta_1^{\mathrm{du}}, \ldots, \zeta_{N^{\mathrm{du}}}^{\mathrm{du}}\}$, we find an approximation $\tilde\psi_\mu \in \widetilde X^{\mathrm{du}}$ of $\psi_\mu$ satisfying:
$$a_\mu(v, \tilde\psi_\mu) = -\eta(v) \quad \forall v \in \widetilde X^{\mathrm{du}}$$
with error bound:
$$\|\tilde\psi_\mu - \psi_\mu\| \le \epsilon_\mu^{\mathrm{du}}$$

After having found reduced basis solutions for the primal and dual problems, the following surrogate output can be taken:
$$\tilde s_\mu = \eta(\tilde u_\mu) - r_\mu(\tilde\psi_\mu) \qquad (21)$$
The error bound on the output now reads:
$$|\tilde s_\mu - s_\mu| \le \alpha(\mu)\, \epsilon_\mu\, \epsilon_\mu^{\mathrm{du}} \qquad (22)$$
Using (21) instead of (17) gives us the quadratic error bound (22) instead of the linear one (19), for only (roughly) twice the price in computation and storage requirements. And if the problem is self-dual (i.e. $a_\mu$ is symmetric and $\eta = f$), we can take $N = N^{\mathrm{du}}$, $\zeta_i = \zeta_i^{\mathrm{du}}$ for all $i = 1, \ldots, N$, and get this "squared" effect for free, by just adding up the corrective term in (21).

Proof of (22): We omit the $\mu$ subscripts.
$$|s - \tilde s| = \big| \eta(u - \tilde u) - a(\tilde u, \tilde\psi) + f(\tilde\psi) \big|$$
$$= \big| -a(u - \tilde u, \psi) - a(\tilde u, \tilde\psi) + f(\tilde\psi) \big| \quad \text{from (20)}$$
$$= \big| -a(u, \psi) + f(\tilde\psi) + a(\tilde u, \psi - \tilde\psi) \big| \quad \text{by reordering terms}$$
$$= \big| f(\tilde\psi - \psi) + a(\tilde u, \psi - \tilde\psi) \big| \quad \text{from (1)}$$
$$= \big| r(\psi - \tilde\psi) \big| \quad \text{by the very definition of } r$$
$$\le \|r\|_X\, \|\psi - \tilde\psi\| \le \|r\|_X\, \epsilon^{\mathrm{du}} = \alpha\, \epsilon\, \epsilon^{\mathrm{du}} \quad \text{from the definition of } \epsilon. \; \square$$

We shall now pass to the offline/online procedure: we first require the computation of $r_\mu(\tilde\psi_\mu)$ in (21).

Expanding $\tilde\psi_\mu$ on the dual reduced basis:
$$\tilde\psi_\mu = \sum_{p=1}^{N^{\mathrm{du}}} q_p(\mu)\, \zeta_p^{\mathrm{du}}$$
we have:
$$r_\mu(\tilde\psi_\mu) = \sum_{p=1}^{N^{\mathrm{du}}} q_p(\mu)\, r_\mu(\zeta_p^{\mathrm{du}}) \qquad (23)$$
and, for each $p = 1, \ldots, N^{\mathrm{du}}$, we have, from the definition of $r_\mu$, the affine expansion of $a_\mu$ and the reduced basis expansion (2) of $\tilde u_\mu$:
$$r_\mu(\zeta_p^{\mathrm{du}}) = f(\zeta_p^{\mathrm{du}}) - \sum_{q=1}^{Q} \Theta_q(\mu) \sum_{n=1}^{N} c_n(\mu)\, a_q(\zeta_n, \zeta_p^{\mathrm{du}}) \qquad (24)$$

So our full offline/online procedure for the primal/dual evaluation and error bound on $s_\mu$ reads:

• offline:

  - perform the offline-stage RB approximation (section 2.2) and error estimation (section 3) for the two problems (1) and (20);

  - compute and store $\eta(\zeta_n)$ for $n = 1, \ldots, N$;

  - compute and store $f(\zeta_p^{\mathrm{du}})$ ($p = 1, \ldots, N^{\mathrm{du}}$) and $a_q(\zeta_n, \zeta_p^{\mathrm{du}})$ ($q = 1, \ldots, Q$; $n = 1, \ldots, N$; $p = 1, \ldots, N^{\mathrm{du}}$);

• online:

  - perform the online-stage RB approximation and error estimation for the two problems (1) and (20); this gives the respective expansions $\{c_n(\mu)\}_{n=1,\ldots,N}$ and $\{q_p(\mu)\}_{p=1,\ldots,N^{\mathrm{du}}}$ of $\tilde u_\mu$ and $\tilde\psi_\mu$ in, respectively, the primal reduced basis $\{\zeta_1, \ldots, \zeta_N\}$ and the dual reduced basis $\{\zeta_1^{\mathrm{du}}, \ldots, \zeta_{N^{\mathrm{du}}}^{\mathrm{du}}\}$, together with $\alpha(\mu)$, $\epsilon_\mu$ (error bound for the primal problem) and $\epsilon_\mu^{\mathrm{du}}$ (error bound for the dual problem);

  - compute $\eta(\tilde u_\mu)$ using (18); compute $r_\mu(\tilde\psi_\mu)$ using (23) and (24); output $\tilde s_\mu$ using (21);

  - give the error bound (22) (see the sketch below).
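The online stage of this primal/dual procedure, once all the scalars above are stored, reduces to a few small dense operations. The sketch below is an illustration under assumed storage conventions: eta_vec, f_du and Aq_prdu are hypothetical names for the stored quantities $\eta(\zeta_n)$, $f(\zeta_p^{\mathrm{du}})$ and $a_q(\zeta_n, \zeta_p^{\mathrm{du}})$.

```python
import numpy as np

def corrected_output(theta_mu, c_mu, q_mu, eta_vec, f_du, Aq_prdu,
                     alpha_mu, eps_pr, eps_du):
    """Online primal/dual output (21) and its error bound (22).
    theta_mu: (Q,); c_mu: (N,) primal coefficients; q_mu: (Ndu,) dual coefficients;
    eta_vec: (N,) values eta(zeta_n); f_du: (Ndu,) values f(zeta_p^du);
    Aq_prdu: (Q, N, Ndu) values a_q(zeta_n, zeta_p^du)."""
    eta_utilde = eta_vec @ c_mu                                     # eta(u~_mu), equation (18)
    r_du = f_du - np.einsum("q,n,qnp->p", theta_mu, c_mu, Aq_prdu)  # r_mu(zeta_p^du), equation (24)
    s_tilde = eta_utilde - q_mu @ r_du                              # corrected output, equations (21) and (23)
    return s_tilde, alpha_mu * eps_pr * eps_du                      # output and error bound (22)
```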

5.3 Output-aware choice of basis: goal-oriented POD

When one is interested in a specific output of $u_\mu$, it could be wise to choose an adapted reduced basis for this output. For instance, if the output depends strongly on the values of $u_\mu(x)$ for $x$ in some region $R_1$ contained in the domain
