Optimal weighted least-squares methods for high-dimensional approximation
Giovanni Migliorati
Université Pierre et Marie Curie, Paris, France; joint work with Albert Cohen (UPMC)
Journée scientifique du groupe SMAI-SIGMA
Outline
1 Motivations and example of applications
2 Notation and definitions
3 Stability and accuracy of standard least squares with evaluations at random points
4 Stability and accuracy of weighted least squares with evaluations at random points
5 Sampling algorithms for the optimal density
6 Conclusions
Motivations and example of applications
Fast solution to parametric / stochastic PDEs
PDE model $F(u, y) = 0$ depending on a parameter vector $y \in \Gamma \subset \mathbb{R}^d$, $d \gg 1$.
For each $y \in \Gamma$ the PDE model is well-posed in some Hilbert space $V$. Example of PDE model:
$-\nabla \cdot (a \nabla u) = f$ in $D \subset \mathbb{R}^2$; $u = 0$ on $\partial D$.
$D = (0,1)^2$ is a checkerboard with $d = 2^{2k}$ squares $D_1, \ldots, D_d$, $k \ge 1$, and the diffusion coefficient $a$ is piecewise constant on $D_1, \ldots, D_d$ with values $a_1, \ldots, a_d$ that define the parameter vector
$y = (a_1, \ldots, a_d) \in \Gamma = [a_{\min}, a_{\max}]^d$, $0 < a_{\min} \le a_{\max} < +\infty$.
Examples of goals: using the evaluations $u(y_1), \ldots, u(y_m)$ with $y_j \in \Gamma$, reconstruction of the solution map $y \mapsto u(y) \in V$, or approximation of quantities of interest, like $y \mapsto \int_D u(x, y)\,dx$.
Each evaluation of $u$ is computationally expensive.
Evaluations of $u$ could be affected by measurement and numerical errors.
Notation and definitions
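For any $d \ge 1$: $\Gamma \subseteq \mathbb{R}^d$, $d\rho$ a probability measure on $\Gamma$, and $u : \Gamma \to \mathbb{R}$.
$d\mu$ is a sampling measure on $\Gamma$ such that
$w \, d\mu = d\rho$
for some $w : \Gamma \to \mathbb{R}^+$ defined everywhere and with $\int_\Gamma w^{-1} \, d\rho = 1$.
$\langle f, g \rangle := \int_\Gamma f(y) g(y) \, d\rho(y)$, $\langle f, g \rangle_m := \frac{1}{m} \sum_{j=1}^m w(y_j) f(y_j) g(y_j)$,
$\| \cdot \| := \langle \cdot, \cdot \rangle^{1/2}$, $\| \cdot \|_m := \langle \cdot, \cdot \rangle_m^{1/2}$, with $y_1, \ldots, y_m$ being i.i.d. according to $\mu$.
Goal: approximation of $u$ in $L^2(\Gamma, d\rho)$ using pointwise evaluations $u(y_j)$.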
Approximation space
Choose an orthonormal basis $(L_j)_{j \ge 1}$ of $L^2(\Gamma, d\rho)$.
Assumption: for every $y \in \Gamma$ there exists an index $k$ such that $L_k(y) \neq 0$.
Define the linear space
$V_n := \operatorname{span}\{L_1, \ldots, L_n\}$, where $n = \dim(V_n)$.
A simple sufficient condition for this assumption is that $V_n$ contains the functions that are constant over $\Gamma$.
Observation models
Assumption: the function $u$ is well-defined at every point of $\Gamma$, except possibly on a set of zero $d\rho$-measure, and $u \in L^2(\Gamma, d\rho)$.
• noiseless observation model:
$z_i = u(y_i)$, $i = 1, \ldots, m$, with $y_1, \ldots, y_m$ i.i.d. $\sim \mu$;
• noisy observation model:
$z_i = u(y_i) + \eta_i$, $i = 1, \ldots, m$.
This talk: only the noiseless model. Analogous results have been proven for the noisy observation model, under several different assumptions on the noise type.
Discrete least-squares approximation
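The continuous $L^2$ projection of $u$ over $V_n$ is defined as
$\operatorname{argmin}_{v \in V_n} \|u - v\|$,
and the discrete one as
$u_W := \operatorname{argmin}_{v \in V_n} \|u - v\|_m = \operatorname{argmin}_{v \in V_n} \sum_{i=1}^m w(y_i) |v(y_i) - z_i|^2$.
Normal equations:
$G \beta = b$, with
$[G]_{ij} = \langle L_i, L_j \rangle_m$, $[b]_j = m^{-1} \sum_{i=1}^m w(y_i) z_i L_j(y_i)$,
and $\beta$ contains the coefficients of the expansion $u_W = \sum_{j=1}^n \beta_j L_j$.
Standard least squares: $w \equiv 1$ and therefore $d\mu = d\rho$.
Weighted least squares: $w \not\equiv 1$ plus the previous conditions, thus $d\mu \neq d\rho$.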
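As a rough illustration of the normal equations above, the following sketch assembles $G$ and $b$ and solves for $\beta$ in one dimension. It assumes $\Gamma = [-1,1]$, $d\rho = dy/2$ and the orthonormal Legendre basis; the function names `basis` and `weighted_ls`, the test function `u` and the sample size are illustrative choices, not part of the talk.

```python
import numpy as np
from numpy.polynomial import legendre

def basis(y, n):
    """First n Legendre polynomials, orthonormal w.r.t. d rho = dy/2 on [-1, 1].
    Returns an array of shape (len(y), n)."""
    return legendre.legvander(y, n - 1) * np.sqrt(2 * np.arange(n) + 1)

def weighted_ls(y, z, w, n):
    """Assemble G and b of the normal equations G beta = b and solve them."""
    m = len(y)
    L = basis(y, n)
    G = (L * w[:, None]).T @ L / m      # [G]_ij = <L_i, L_j>_m
    b = L.T @ (w * z) / m               # [b]_j = m^{-1} sum_i w(y_i) z_i L_j(y_i)
    return np.linalg.solve(G, b), G

# Standard least squares (w = 1): the points are drawn from rho itself.
rng = np.random.default_rng(0)
m, n = 400, 5
y = rng.uniform(-1.0, 1.0, m)
w = np.ones(m)
u = lambda t: 1.0 + 2.0 * t - t**3      # hypothetical target, lies in V_n for n >= 4
beta, G = weighted_ls(y, u(y), w, n)
```

Since the hypothetical `u` belongs to $V_n$, the estimator reproduces it whenever $G$ is invertible; for a general $u$ the quality of the result depends on the conditioning of $G$.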
For a given function $u : \Gamma \to \mathbb{R}$ in $L^2(\Gamma, d\rho)$ and a given $V_n$ with $\dim(V_n) =: n \le m$:
i) how stable is the weighted discrete least-squares approximation of $u$ from $V_n$ using $m$ evaluations at random points?
ii) how accurate is the weighted least-squares estimator $u_W$ of $u$?
Comparison of the approximation error $\|u - u_W\|$ with the best approximation error of $u$ on $V_n$.
Least squares with evaluations at random points
The function
$y \mapsto k_n(y) = \sum_{j=1}^n |L_j(y)|^2$
is the diagonal of the integral kernel of the projector on $V_n$, and depends only on $V_n$ and $d\rho$.
In general we have the lower bound
$K_n := \|k_n\|_{L^\infty(\Gamma)} \ge n$.
First limitation: cannot address relevant situations like $\Gamma = \mathbb{R}^d$, $d\rho$ Gaussian measure on $\Gamma$ and $(L_j)_j$ Hermite polynomials, for which $k_n$ is unbounded and $K_n = +\infty$.
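A small numerical check of the lower bound $K_n \ge n$, assuming $\Gamma = [-1,1]$, $d\rho = dy/2$ and the orthonormal Legendre basis (an illustrative setting, not one singled out on the slide). The bound follows from $\int_\Gamma k_n \, d\rho = n$, which the sketch verifies with Gauss-Legendre quadrature; the grid size and the name `k_n` are arbitrary choices.

```python
import numpy as np
from numpy.polynomial import legendre

def k_n(y, n):
    """k_n(y) = sum_{j=1}^n |L_j(y)|^2 for the Legendre basis orthonormal w.r.t. dy/2."""
    L = legendre.legvander(y, n - 1) * np.sqrt(2 * np.arange(n) + 1)
    return np.sum(L**2, axis=1)

n = 10
grid = np.linspace(-1.0, 1.0, 2001)      # includes the endpoints, where k_n peaks
K_n = k_n(grid, n).max()                 # sup of k_n over the grid

# n-point Gauss-Legendre quadrature is exact for degree <= 2n - 1 (k_n has degree 2n - 2).
x, wq = legendre.leggauss(n)
mean_k_n = np.sum(wq * k_n(x, n)) / 2.0  # integral of k_n against d rho = dy/2
```

In this setting the maximum is attained at $y = \pm 1$, where $k_n(\pm 1) = \sum_{j=0}^{n-1} (2j+1) = n^2$, so $K_n$ grows quadratically while the mean of $k_n$ stays equal to $n$.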
Chernoff bound for random matrices [Tropp 2011]
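$G = m^{-1} \sum_{i=1}^m H(y_i)$, where $H_{jk}(y) = L_j(y) L_k(y)$.
Since $|||H||| \le K_n$ a.s., it holds
$\Pr_\rho(|||G - I||| > \delta) \le 2n \exp\left(-\frac{m\,c(\delta)}{K_n}\right)$,
where $c(\delta) = \delta + (1-\delta)\ln(1-\delta) > 0$.
Choose $\delta = 1/2$, so that $c(1/2) = (1 - \ln 2)/2 \approx 0.15$.
For any $r > 0$, if
$\frac{0.15}{1+r} \frac{m}{\ln m} \ge K_n$
then
$\Pr_\rho(|||G - I||| > 1/2) \le 2 m^{-r}$.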
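To get a feeling for the sample sizes that the condition above requires, the sketch below searches for the smallest $m$ satisfying $\frac{c(1/2)}{1+r} \frac{m}{\ln m} \ge K_n$. It uses the exact constant $c(1/2) = (1 - \ln 2)/2$ instead of the rounded value 0.15; the function names and the input values are illustrative assumptions.

```python
import math

KAPPA = (1.0 - math.log(2.0)) / 2.0      # c(1/2) = 1/2 + (1/2) ln(1/2), about 0.153

def condition_holds(m, K, r):
    """True if kappa / (1 + r) * m / ln(m) >= K."""
    return KAPPA / (1.0 + r) * m / math.log(m) >= K

def min_samples(K, r):
    """Smallest integer m >= 3 satisfying the condition (m / ln m is increasing for m >= 3)."""
    m = 3
    while not condition_holds(m, K, r):
        m += 1
    return m

m_needed = min_samples(K=200.0, r=1.0)   # hypothetical values of K_n and r
```

A plain linear search is enough here because the left-hand side is monotone in $m$; for very large $K_n$ a bisection would be the natural replacement.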
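Norm equivalence on $V_n$
For some $\delta \in (0,1)$ it holds
$(1-\delta)\|v\|^2 \le \|v\|_m^2 \le (1+\delta)\|v\|^2$, $\forall v \in V_n$.
For any $v \in V_n$, let $\mathbf{v} = (v_j)_j$ be the coefficients of the expansion $v = \sum_{j=1}^n v_j L_j$. Since $\|v\|_m^2 = \langle G\mathbf{v}, \mathbf{v} \rangle_{\mathbb{R}^n}$ and $\|v\|^2 = \langle \mathbf{v}, \mathbf{v} \rangle_{\mathbb{R}^n}$, the matrix $G$ satisfies
$|||G||| = \sup_{v \in V_n \setminus \{v \equiv 0\}} \frac{\|v\|_m^2}{\|v\|^2}$, $|||G^{-1}||| = \sup_{v \in V_n \setminus \{v \equiv 0\}} \frac{\|v\|^2}{\|v\|_m^2}$.
Hence, norm equivalence on $V_n$ w.h.p. iff concentration bounds
$1 - \delta \le |||G||| \le 1 + \delta$,
$\frac{1}{1+\delta} \le |||G^{-1}||| \le \frac{1}{1-\delta}$,
$|||G - I||| \le \delta$.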
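$\Gamma \subset \mathbb{R}^d$ bounded. Assume that $|u| \le \tau$ almost surely w.r.t. $d\rho$ and define
$T_\tau(t) := \operatorname{sign}(t) \min\{\tau, |t|\}$, $u_T := T_\tau \circ u_W$.
Theorem ([CCMNT-ESAIM:M2AN 2015])
In any dimension $d$, for any $r > 0$ and any $n \ge 1$, if
$\frac{0.15}{1+r} \frac{m}{\ln m} \ge K_n$,
then it holds that
$\Pr_\rho(\operatorname{cond}(G) \le 3) \ge 1 - 2m^{-r}$,
$\Pr_\rho\left(\|u - u_W\| \le (1 + \sqrt{2}) \inf_{v \in V_n} \|u - v\|_{L^\infty}\right) \ge 1 - 2m^{-r}$,
$\mathbb{E}_\rho\left(\|u - u_T\|^2\right) \le \left(1 + \frac{0.6}{(1+r)\ln m}\right) \min_{v \in V_n} \|u - v\|^2 + 8\tau^2 m^{-r}$.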
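Second limitation: superlinear growth of $K_n$ w.r.t. $n$.
Example: multivariate approximation with polynomials:
$\Gamma = [-1,1]^d$, $d\rho = \otimes_{j=1}^d (1 - y_j)^{\theta_1} (1 + y_j)^{\theta_2} \, dy_j$, $\theta_1, \theta_2 \ge -1/2$,
$\Lambda \subset \mathbb{N}_0^d$ downward closed: $\nu \in \Lambda$ and $\nu' \le \nu \implies \nu' \in \Lambda$,
$V_n = \mathbb{P}_\Lambda := \operatorname{span}\{y^\nu, \nu \in \Lambda\}$ with $n = \dim(\mathbb{P}_\Lambda) = \#(\Lambda)$.
Proven upper bounds ([CCMNT-ESAIM:M2AN 2015], [M-JAT 2015]):
$K_n \le \begin{cases} n^{\ln 3 / \ln 2}, & \text{if } \theta_1 = \theta_2 = -1/2, \\ n^{2\max\{\theta_1, \theta_2\} + 2}, & \text{if } \theta_1, \theta_2 \in \mathbb{N}_0. \end{cases}$
Equality attained for index sets of anisotropic tensor product type.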
Weighted least squares with evaluations at random points
Two "limitations": superlinear growth of $K_n$ w.r.t. $n$, and $\Gamma$ bounded.
How to circumvent them?
Back to the general setting: $\Gamma \subseteq \mathbb{R}^d$, $(L_j)_j$ orthonormal basis in $L^2(\Gamma, d\rho)$.
$k_{n,w}(y) := w(y) k_n(y) = w(y) \sum_{j=1}^n |L_j(y)|^2$,
$K_{n,w} := \|k_{n,w}\|_{L^\infty(\Gamma)} \ge n$.
Pros: freedom of choice for $w \ge 0$ (only need $\int_\Gamma w^{-1} \, d\rho = 1$).
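$\Gamma \subseteq \mathbb{R}^d$. Assume that $|u| \le \tau$ almost surely w.r.t. $d\rho$ and define
$T_\tau(t) := \operatorname{sign}(t) \min\{\tau, |t|\}$, $u_T := T_\tau \circ u_W$,
$u_C := u_W$ if $\operatorname{cond}(G) < 3$; $u_C := 0$ otherwise.
Theorem ([CM-SMAI JCM 2017])
In any dimension $d$, for any $r > 0$ and any $n \ge 1$, if
$\frac{0.15}{1+r} \frac{m}{\ln m} \ge K_{n,w}$,
then it holds that
$\Pr_\mu(\operatorname{cond}(G) \le 3) \ge 1 - 2m^{-r}$,
$\Pr_\mu\left(\|u - u_W\| \le (1 + \sqrt{2}) \inf_{v \in V_n} \|u - v\|_{L^\infty}\right) \ge 1 - 2m^{-r}$,
$\mathbb{E}_\mu\left(\|u - u_T\|^2\right) \le \left(1 + \frac{0.6}{(1+r)\ln m}\right) \min_{v \in V_n} \|u - v\|^2 + 8\tau^2 m^{-r}$,
$\mathbb{E}_\mu\left(\|u - u_C\|^2\right) \le \left(1 + \frac{0.6}{(1+r)\ln m}\right) \min_{v \in V_n} \|u - v\|^2 + 2\|u\|^2 m^{-r}$.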
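The two modified estimators $u_T$ and $u_C$ used in the theorem are simple post-processing steps. The sketch below shows one way they might be written, acting on point values of $u_W$ and on its coefficient vector respectively; the function names and the small matrices are made up for illustration.

```python
import numpy as np

def truncate(values, tau):
    """T_tau(t) = sign(t) * min(tau, |t|), applied entrywise to values of u_W."""
    return np.sign(values) * np.minimum(tau, np.abs(values))

def conditioned_coefficients(G, beta, threshold=3.0):
    """Coefficients of u_C: keep beta if cond(G) < threshold, otherwise the zero vector."""
    if np.linalg.cond(G) < threshold:
        return beta
    return np.zeros_like(beta)

# Hypothetical point values of u_W and two hypothetical Gramian matrices.
vals = np.array([-2.5, -0.3, 0.0, 0.7, 4.0])
clipped = truncate(vals, tau=1.0)

beta = np.array([0.5, -1.0])
G_good = np.array([[1.0, 0.1], [0.1, 1.0]])   # eigenvalues 0.9 and 1.1
G_bad = np.array([[1.0, 0.9], [0.9, 1.0]])    # eigenvalues 0.1 and 1.9
kept = conditioned_coefficients(G_good, beta)
dropped = conditioned_coefficients(G_bad, beta)
```

Note that $u_T$ needs the bound $\tau$ on $|u|$, whereas $u_C$ only needs the computable quantity $\operatorname{cond}(G)$.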
Optimal weighted least squares
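Choose the weight function as
$w = \frac{n}{k_n} = \frac{n}{\sum_{j=1}^n |L_j|^2}$,
and thus
$d\mu = w^{-1} d\rho = \frac{\sum_{j=1}^n |L_j|^2}{n} \, d\rho =: d\mu_n$.
$k_{n,w} \equiv n \implies K_{n,w} = n$.
In general $d\mu_n$ is not a product measure on $\Gamma$.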
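The two defining properties of the optimal weight, $\int_\Gamma w^{-1} \, d\rho = 1$ and $k_{n,w} \equiv n$, can be checked numerically. The sketch below does so under the assumption $\Gamma = [-1,1]$, $d\rho = dy/2$ with the orthonormal Legendre basis; `basis` and `optimal_weight` are illustrative names.

```python
import numpy as np
from numpy.polynomial import legendre

def basis(y, n):
    """First n Legendre polynomials, orthonormal w.r.t. d rho = dy/2 on [-1, 1]."""
    return legendre.legvander(y, n - 1) * np.sqrt(2 * np.arange(n) + 1)

def optimal_weight(y, n):
    """w(y) = n / k_n(y), with k_n(y) = sum_j |L_j(y)|^2."""
    return n / np.sum(basis(y, n) ** 2, axis=1)

n = 8
# Gauss-Legendre quadrature with n points integrates w^{-1} = k_n / n exactly.
x, wq = legendre.leggauss(n)
mass = np.sum(wq / 2.0 / optimal_weight(x, n))      # integral of w^{-1} d rho

y = np.linspace(-1.0, 1.0, 101)
k_nw = optimal_weight(y, n) * np.sum(basis(y, n) ** 2, axis=1)   # w * k_n
```

The weight is cheap to evaluate once the basis is available; the costly part is drawing samples from $d\mu_n$.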
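From the previous theorem, using the optimal choice of $w$ we obtain:
Corollary ([CM-SMAI JCM 2017])
In any dimension $d$, for any $r > 0$ and any $n \ge 1$, if
$\frac{0.15}{1+r} \frac{m}{\ln m} \ge n$,
then it holds that
$\Pr_\mu(\operatorname{cond}(G) \le 3) \ge 1 - 2m^{-r}$,
$\Pr_\mu\left(\|u - u_W\| \le (1 + \sqrt{2}) \inf_{v \in V_n} \|u - v\|_{L^\infty}\right) \ge 1 - 2m^{-r}$,
$\mathbb{E}_\mu\left(\|u - u_T\|^2\right) \le \left(1 + \frac{0.6}{(1+r)\ln m}\right) \min_{v \in V_n} \|u - v\|^2 + 8\tau^2 m^{-r}$,
$\mathbb{E}_\mu\left(\|u - u_C\|^2\right) \le \left(1 + \frac{0.6}{(1+r)\ln m}\right) \min_{v \in V_n} \|u - v\|^2 + 2\|u\|^2 m^{-r}$.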
Sampling algorithms for the optimal density
Multivariate polynomial approximation
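We use orthogonal polynomials, orthonormalized in $L^2(\Gamma, d\rho)$.
Assume $\Gamma$ has a Cartesian structure, e.g. $\Gamma = [-1,1]^d$ or $\Gamma = \mathbb{R}^d$. Given univariate orthonormal polynomials $(\phi_k)_{k \ge 0}$ and a multi-index set $\Lambda \subset \mathbb{N}_0^d$, for any $\nu \in \Lambda$ we define
$L_\nu(y) := \prod_{i=1}^d \phi_{\nu_i}(y_i)$, $y \in \Gamma$,
$\mathbb{P}_\Lambda := \operatorname{span}\{L_\nu : \nu \in \Lambda\}$, with $\dim(\mathbb{P}_\Lambda) = \#(\Lambda)$.
Then choose the approximation space as $V_n = \mathbb{P}_\Lambda$.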
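The tensorized basis $L_\nu$ can be evaluated coordinate by coordinate. The following sketch assumes $\Gamma = [-1,1]^d$ with the uniform probability measure and univariate orthonormal Legendre polynomials, and uses a total-degree index set as one example of a downward closed $\Lambda$; all names are illustrative.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre

def total_degree_set(d, p):
    """Multi-indices nu in N_0^d with |nu|_1 <= p (a downward closed set)."""
    return np.array([nu for nu in itertools.product(range(p + 1), repeat=d)
                     if sum(nu) <= p])

def tensor_basis(Y, Lambda):
    """Matrix with entries L_nu(y_i) = prod_k phi_{nu_k}(y_{i,k}); shape (m, #Lambda)."""
    Y = np.atleast_2d(Y)
    p = int(Lambda.max())
    out = np.ones((Y.shape[0], len(Lambda)))
    for k in range(Y.shape[1]):
        # Univariate orthonormal Legendre values phi_0, ..., phi_p at coordinate k.
        V = legendre.legvander(Y[:, k], p) * np.sqrt(2 * np.arange(p + 1) + 1)
        out *= V[:, Lambda[:, k]]
    return out

Lambda = total_degree_set(d=2, p=3)
# Check orthonormality with a tensorized Gauss-Legendre rule (exact for these degrees).
x, wq = legendre.leggauss(4)
X = np.array(list(itertools.product(x, x)))
W = np.array([a * b for a, b in itertools.product(wq, wq)]) / 4.0
B = tensor_basis(X, Lambda)
gram = B.T @ (W[:, None] * B)
```

Here `gram` approximates the Gramian of the basis in $L^2(\Gamma, d\rho)$ and should be the identity up to rounding.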
Connections with equilibrium measure
In some specific settings $d\mu_n$ converges in the weak-star sense to the equilibrium measure $d\mu^*$.
Example: choose the uniform measure on $\Gamma = [-1,1]$ and $\mathbb{P}_k = \operatorname{span}\{y^j : 0 \le j \le k-1\}$. Then
$d\mu_n \xrightarrow{n \to \infty} d\mu^* = \frac{1}{2\pi\sqrt{1 - y^2}} \, d\lambda$.
Whenever asymptotic equivalences are available,
$c \, d\mu^* \le d\mu_n \le C \, d\mu^*$,
the previous results on stability and accuracy carry over by choosing $w$ such that $d\mu = d\mu^*$, but under the more demanding condition
$\frac{0.15}{1+r} \frac{c}{C} \frac{m}{\ln m} \ge n$.
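How to sample the optimal density efficiently?
Algorithm 1: Sequential conditional sampling for $\mu_n$.
INPUT: $m$, $d$, $\Lambda$, $\rho_i$, $(\phi_j)_{j \ge 0}$ for $i = 1, \ldots, d$.
OUTPUT: $y^1, \ldots, y^m$ i.i.d. $\sim \mu_n$.
for $k = 1$ to $m$ do
    $\alpha_\nu \leftarrow (\#(\Lambda))^{-1}$, for any $\nu \in \Lambda$.
    Sample $y_1^k$ from $t \mapsto \varphi_1(t) = \rho_1(t) \sum_{\nu \in \Lambda} \alpha_\nu |\phi_{\nu_1}(t)|^2$.
    for $q = 2$ to $d$ do
        $\alpha_\nu \leftarrow \frac{\prod_{j=1}^{q-1} |\phi_{\nu_j}(y_j^k)|^2}{\sum_{\tilde\nu \in \Lambda} \prod_{j=1}^{q-1} |\phi_{\tilde\nu_j}(y_j^k)|^2}$, for any $\nu \in \Lambda$.
        Sample $y_q^k$ from $t \mapsto \varphi_q(t) = \rho_q(t) \sum_{\nu \in \Lambda} \alpha_\nu |\phi_{\nu_q}(t)|^2$.
    end for
    $y^k \leftarrow (y_1^k, \ldots, y_d^k)$.
end for
Here $y_q^k$ denotes the $q$-th coordinate of the $k$-th sample.
The overall computational cost of generating $m$ independent samples from $\mu_n$ is linear in both $d$ and $m$.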
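A minimal sketch of the sequential conditional sampling loop, assuming $\Gamma = [-1,1]^d$, the uniform probability measure and orthonormal Legendre polynomials. Each univariate density $\varphi_q$ is a mixture with weights $\alpha_\nu$, so the sketch first picks a component $\nu$ and then samples $\rho_q |\phi_{\nu_q}|^2$ by plain rejection from the uniform proposal. That rejection step is a stand-in chosen for simplicity and is not claimed to be the univariate sampler used by the authors; all function names are hypothetical.

```python
import itertools
import numpy as np
from numpy.polynomial import legendre

def phi_all(t, kmax):
    """phi_0(t), ..., phi_kmax(t): Legendre polynomials orthonormal w.r.t. dt/2 on [-1, 1]."""
    return np.sqrt(2 * np.arange(kmax + 1) + 1) * legendre.legvander(t, kmax)[0]

def sample_coordinate(k, rng):
    """Draw t with density |phi_k(t)|^2 / 2 on [-1, 1] by rejection from the uniform
    proposal; valid because |phi_k(t)|^2 <= 2k + 1 on [-1, 1]."""
    while True:
        t = rng.uniform(-1.0, 1.0)
        if rng.uniform(0.0, 2 * k + 1) <= phi_all(t, k)[k] ** 2:
            return t

def sample_optimal(m, Lambda, rng):
    """Sequential conditional sampling of m points from mu_n (rows of the output)."""
    Lambda = np.asarray(Lambda)
    n, d = Lambda.shape
    kmax = int(Lambda.max())
    Y = np.empty((m, d))
    for k in range(m):
        alpha = np.full(n, 1.0 / n)
        for q in range(d):
            # varphi_q is a mixture with weights alpha: pick a component, then sample it.
            nu = Lambda[rng.choice(n, p=alpha)]
            Y[k, q] = sample_coordinate(int(nu[q]), rng)
            # Conditional weights for the next coordinate.
            alpha = alpha * phi_all(Y[k, q], kmax)[Lambda[:, q]] ** 2
            alpha = alpha / alpha.sum()
    return Y

def christoffel(Y, Lambda):
    """k_n(y) = sum over nu in Lambda of |L_nu(y)|^2, for each row y of Y."""
    Lambda = np.asarray(Lambda)
    kmax = int(Lambda.max())
    out = np.empty(len(Y))
    for i, y in enumerate(Y):
        P = np.array([phi_all(t, kmax) for t in y])          # shape (d, kmax + 1)
        out[i] = np.sum(np.prod(P[np.arange(len(y)), Lambda] ** 2, axis=1))
    return out

rng = np.random.default_rng(0)
Lambda = [nu for nu in itertools.product(range(4), repeat=2) if sum(nu) <= 3]
Y = sample_optimal(300, Lambda, rng)
w = len(Lambda) / christoffel(Y, Lambda)    # optimal weights w = n / k_n at the samples
```

The rejection step accepts with probability $1/(2k+1)$ for degree $k$, so this sketch is only reasonable for low degrees; the loop structure over $k$ and $q$ is what mirrors Algorithm 1.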
[Figure: $\Pr\{\operatorname{cond}(G) \le 3\}$, $d = 1$: weighted LS vs LS. Panel columns: $d\rho$ uniform measure, $d\rho$ Gaussian measure, $d\rho$ Chebyshev measure; panel rows: weighted LS, LS; axes: $n$ and $m/\ln m$.]
[Figure: $\Pr\{\operatorname{cond}(G) \le 3\}$, $d = 10$: weighted LS vs LS. Panel columns: $d\rho$ uniform measure, $d\rho$ Gaussian measure, $d\rho$ Chebyshev measure; panel rows: weighted LS, LS; axes: $n$ and $m/\ln m$.]
method        $d\rho$      d = 1   d = 2   d = 5   d = 10   d = 50   d = 100
weighted LS   uniform      1       1       1       1        1        1
weighted LS   Gaussian     1       1       1       1        1        1
weighted LS   Chebyshev    1       1       1       1        1        1
standard LS   uniform      0       0       0.54    1        1        1
standard LS   Gaussian     0       0       0       0        0        0
standard LS   Chebyshev    1       1       1       1        1        1
Table: $\Pr\{\operatorname{cond}(G) \le 3\}$, with $m = 26559$ and $n = 200$.
method        $d\rho$      d = 1            d = 2            d = 5            d = 10           d = 50          d = 100
weighted LS   uniform      1.5593           1.4989           1.4407           1.4320           1.4535          1.4179
weighted LS   Gaussian     1.5994           1.5698           1.4743           1.4643           1.4676          1.4237
weighted LS   Chebyshev    1.5364           1.4894           1.4694           1.4105           1.4143          1.4216
standard LS   uniform      19.9584          29.8920          3.0847           1.9555           1.7228          1.5862
standard LS   Gaussian     $\sim 10^{19}$   $\sim 10^{19}$   $\sim 10^{19}$   $\sim 10^{16}$   $\sim 10^{9}$   $\sim 10^{3}$
standard LS   Chebyshev    1.5574           1.5367           1.5357           1.4752           1.4499          1.4625
Table: Average of $\operatorname{cond}(G)$, with $m = 26559$ and $n = 200$.
Conclusions - analysis of weighted least squares
RANDOM POINTS: analysis w.r.t. $m$, $n$, $d$, $d\rho$ and the smoothness of $u$: in any dimension $d$, with any measure $d\rho$ (e.g. Jacobi or Gaussian), stability and accuracy are proven w.h.p. and in expectation, provided that
$\frac{m}{\ln m} \ge C\,n = C \dim(V_n)$, with $C$ independent of $d$.
However:
• the results are for a given approximation space; adaptivity could be an issue.
• we have developed efficient algorithms for sampling the optimal density $\mu_n$, but they require $d\rho$ to be a product measure.
Thank you for your attention!
Some references
A. Cohen, G. Migliorati: Optimal weighted least-squares methods, SMAI Journal of Computational Mathematics, 2017.
A. Chkifa, A. Cohen, G. Migliorati, F. Nobile, R. Tempone: Discrete least squares polynomial approximation with random evaluations; application to parametric and stochastic elliptic PDEs, ESAIM: M2AN, 2015.
G. Migliorati: Multivariate Markov-type and Nikolskii-type inequalities for polynomials associated with downward closed multi-index sets, J. Approximation Theory, 2015.
G. Migliorati, F. Nobile, R. Tempone: Convergence estimates in probability and in expectation for discrete least squares with noisy evaluations at random points, J. Multivariate Analysis, 2015.
A. Cohen, G. Migliorati, F. Nobile: Discrete least-squares approximations over optimized downward closed polynomial spaces in arbitrary dimension, Constructive Approximation, 2017.