HAL Id: hal-00685998
https://hal.archives-ouvertes.fr/hal-00685998
Submitted on 6 Apr 2012
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Estimation of the Sobol indices in a linear functional multidimensional model
Jean-Claude Fort, Thierry Klein, Agnès Lagnoux, Béatrice Laurent
To cite this version:
Jean-Claude Fort, Thierry Klein, Agnès Lagnoux, Béatrice Laurent. Estimation of the Sobol indices in a linear functional multidimensional model. Journal of Statistical Planning and Inference, Elsevier, 2013, 143 (9), pp.1590-1605. ⟨hal-00685998⟩
Estimation of the Sobol indices in a linear functional multidimensional model
Jean-Claude Fort∗, Thierry Klein†, Agnès Lagnoux‡, Béatrice Laurent§

April 6, 2012
Abstract
We consider a functional linear model in which the explanatory variables are stochastic processes taking values in a Hilbert space; the main example is given by Gaussian processes in $L^2([0,1])$. We propose estimators of the Sobol indices in this functional linear model. Our estimators are based on $U$-statistics. We prove the asymptotic normality and the efficiency of our estimators, and we compare them, from both a theoretical and a practical point of view, with classical estimators of Sobol indices.
Mathematics Subject Classification:
Keywords: Karhunen-Loève expansion, fractional Gaussian process, semiparametric efficient estimation, sensitivity analysis, quadratic functionals.
Introduction
Many mathematical models encountered in applied sciences involve a large number of poorly-known parameters as inputs. It is important for the practitioner to assess the impact of this uncertainty on the model output. An aspect of this assessment is sensitivity analysis, which aims to identify the most sensitive parameters, that is, the parameters having the largest influence on the output. In global stochastic sensitivity analysis (see for example [17] and [19] and references therein), the input variables are assumed to be independent random variables.
Their probability distributions account for the practitioner's belief about the input uncertainty. This turns the model output into a random variable, whose total variance can be decomposed into different partial variances (this is the so-called Hoeffding decomposition, see [24]). Each of these partial variances measures the uncertainty on the output induced by the corresponding input variable.
By considering the ratio of each partial variance to the total variance, we obtain a measure of importance for each input variable that is called the Sobol index
∗Université Paris Descartes, 45 rue des Saints Pères, 75006 Paris, France
†Institut de Mathématiques de Toulouse, Université Toulouse 3, 31062 Toulouse Cédex 9, France
‡Institut de Mathématiques de Toulouse, Université Toulouse 2, 31058 Toulouse Cédex 9, France
§Institut de Mathématiques de Toulouse, INSA, 135 av. de Rangueil, 31077 Toulouse Cédex 4, France
or sensitivity index of the variable [20]; the most sensitive parameters can then be identified and ranked as the parameters with the largest Sobol indices.
Once the Sobol indices have been defined, the question of their effective computation or estimation remains open. In practice, one has to estimate (in a statistical sense) those indices using a finite sample of evaluations of the model output [6]. Many Monte Carlo or quasi-Monte Carlo approaches have been developed by the experimental sciences and engineering communities. This includes the FAST methods (see for example [3], [23] and references therein) and the Sobol pick-freeze (SPF) scheme (see [20, 22]). Nevertheless, those methods require many evaluations of the model output, which can be a strong limitation when those evaluations are expensive. Many approaches have been developed to overcome this issue. The most popular are Bayesian approaches (see for example [17]) and the construction of metamodels. As mentioned in Kleijnen [11] (see equation (1), page 121), one can use functional linear regression as a metamodel.
In this paper, we study the particular context of functional linear regression and propose a different estimation method. We consider nonparametric estimators of quadratic functionals by projection methods, which are related to the procedures developed by Laurent (see [12, 13]) in a density model and by Da Veiga and Gamboa in [4] in a regression model. This method allows us to estimate simultaneously all the Sobol indices with a single sample of reasonable size.
More precisely, we consider a separable Hilbert space $H$ endowed with the scalar product $\langle\cdot,\cdot\rangle$, and $X^1,\dots,X^p$, $p$ independent centered $H$-valued stochastic processes. The model that we consider is a linear regression model:
$$Y = \mu + \sum_{k=1}^{p} \langle \beta_k, X^k \rangle + \varepsilon,$$
where the $\beta_k$, $1\le k\le p$, are elements of $H$, $\mu$ is in $\mathbb{R}$, and $\varepsilon$ is a centered noise independent of the processes $X^1,\dots,X^p$.
Our approach is based on the so-called Karhunen-Loève decomposition of the processes $X^k$ ([16, 1]). Thanks to this decomposition, we construct natural estimators of the Sobol indices, for which we prove asymptotic normality and efficiency. (Asymptotic efficiency is a natural property which generalizes the notion of minimum-variance unbiased estimation; see [24], chapters 8 and 25, or [8] for more details.)
This paper is organized as follows: in the first section, we set up the notations for the model, review the definition of Sobol indices and present our estimators.
In the second section, we prove asymptotic normality and efficiency when we consider a simple functional linear regression model. These two properties are generalized in the third section to the general setting of the multiple functional linear regression. In Section 3, we also compare this method with the classical SPF. The fourth section gives numerical illustrations on a benchmark model.
1 Setting and notations
Let $H$ be a separable Hilbert space endowed with the scalar product $\langle\cdot,\cdot\rangle$ and let $X^1,\dots,X^p$ be $p$ independent centered $H$-valued stochastic processes. The generic model that we consider is the following:
$$Y = \mu + \sum_{k=1}^{p} \langle \beta_k, X^k \rangle + \varepsilon, \qquad (1)$$
where the $\beta_k$, $1\le k\le p$, are elements of $H$, $\mu$ is in $\mathbb{R}$, and $\varepsilon$ is a centered noise with variance $\eta^2$, independent of the processes $X^1,\dots,X^p$ and with finite fourth-order moment.
We define, as usual in a finite-dimensional setting, the Sobol index with respect to the $k$-th input (explanatory variable):
$$S^k = \frac{\operatorname{Var}\big(\mathbb{E}(Y\mid X^k)\big)}{\operatorname{Var}(Y)}.$$
From the observation of $(X_i^1,\dots,X_i^p, Y_i)_{1\le i\le n}$, i.i.d. and obeying Model (1), our aim is to estimate the vector $(S^1,\dots,S^p)$. Since $\operatorname{Var}(Y)$ can easily be estimated by the empirical variance based on $(Y_1,\dots,Y_n)$, the main purpose of the paper is to estimate, for all $k$, the quantity
$$V^k = \operatorname{Var}\big(\mathbb{E}(Y\mid X^k)\big).$$
We will assume that the distributions of the input processes $(X^k)_{1\le k\le p}$ are known. In the next section we consider the simple case where $p=1$. In this setting the Sobol index is of little interest in itself, but the computations then easily extend to the generic model.
2 Simple functional linear regression model
Using the same notations as in Section 1, we consider, in this section, the case $p=1$, which leads to the model
$$Y = \mu + \langle \beta, X\rangle + \varepsilon. \qquad (2)$$
We observe an $n$-sample of $(X,Y)$, which we denote $(X_i,Y_i)$, $1\le i\le n$. We have $\mathbb{E}(Y\mid X) = \mu + \langle\beta, X\rangle$ and
$$V_X = \operatorname{Var}\big(\mathbb{E}(Y\mid X)\big) = \operatorname{Var}(\langle\beta,X\rangle).$$
The Sobol index $S_X$ is defined by
$$S_X = \frac{V_X}{\operatorname{Var}(Y)}.$$
We assume that $\mathbb{E}\,\|X\|^2 < \infty$, so that the covariance operator, defined for all $f\in H$ by $\Gamma(f) = \mathbb{E}[\langle X,f\rangle X]$, is Hilbert-Schmidt and is diagonalizable via the Karhunen-Loève expansion in an orthonormal basis of eigenfunctions $(\varphi_l)_{l\ge 1}$, with decreasing eigenvalues $(\lambda_l)_{l\ge 1}$ such that $\sum_{l=1}^{\infty}\lambda_l<\infty$; see e.g. [14, 15, 10]. We set $\langle X, \varphi_l\rangle = \sqrt{\lambda_l}\,\xi_l$. The variables $(\xi_l)_{l\ge 1}$ are centered and uncorrelated, with variance 1. When $X$ is a Gaussian process, the variables $(\xi_l)_{l\ge 1}$ are i.i.d. standard Gaussian variables.
Example 2.1 (Examples of Karhunen-Loève expansions ([16, 1])).
1. Let $B = (B_t)_{t\in[0,1]}$ be a Brownian motion and $H = L^2([0,1],dt)$. Then
$$\lambda_l = \frac{1}{\big(\pi(l-\tfrac12)\big)^2} \quad\text{and}\quad \varphi_l(t) = \sqrt{2}\,\sin\!\Big(\frac{t}{\sqrt{\lambda_l}}\Big), \quad l\ge 1.$$
2. Let $W = (W_t)_{t\in[0,1]}$ be a Brownian bridge and $H = L^2([0,1],dt)$. Then
$$\lambda_l = \frac{1}{(\pi l)^2} \quad\text{and}\quad \varphi_l(t) = \sqrt{2}\,\sin(\pi l t), \quad l\ge 1.$$
3. Let $X = (X_t)_{t\in[0,1]}$ be a fractional Brownian motion with Hurst exponent $H$. Then the asymptotics of the eigenvalues are given by
$$\lambda_l = \frac{\sin(\pi H)\,\Gamma(2H+1)}{(l\pi)^{2H+1}} + o\!\Big(l^{-\frac{(2H+2)(4H+3)}{4H+5}+\delta}\Big).$$
In the case $H = 1/2$, the Brownian case, this agrees with the exact result $\lambda_l = \frac{1}{(\pi(l-1/2))^2}$.
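As a quick numerical sanity check of item 1 (a sketch of ours, not from the paper), one can verify that the Karhunen-Loève pair $(\lambda_l, \varphi_l)$ of the Brownian motion reconstructs its covariance kernel, $\sum_l \lambda_l\varphi_l(s)\varphi_l(t) = \min(s,t)$; the truncation level `L` is an arbitrary choice.

```python
import numpy as np

L = 5000
l = np.arange(1, L + 1)

# Brownian-motion Karhunen-Loeve eigenpairs on [0, 1]
lam = 1.0 / (np.pi * (l - 0.5)) ** 2
s, t = 0.3, 0.7
phi_s = np.sqrt(2.0) * np.sin((l - 0.5) * np.pi * s)   # phi_l(s) = sqrt(2) sin(s / sqrt(lam_l))
phi_t = np.sqrt(2.0) * np.sin((l - 0.5) * np.pi * t)

# Mercer expansion of the covariance kernel: should approach min(s, t)
approx = float(np.sum(lam * phi_s * phi_t))
```

The truncation error is bounded by the eigenvalue tail $\sum_{l>L}\lambda_l = O(1/L)$, so `approx` is within about $10^{-4}$ of $\min(0.3, 0.7) = 0.3$.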
In the basis $(\varphi_l)_{l\ge 1}$, $\beta$ has the expansion $\beta = \sum_{l=1}^{\infty}\gamma_l\varphi_l$, and we have:
$$\mathbb{E}(YX) = \mathbb{E}(\langle X,\beta\rangle X) = \Gamma(\beta) = \sum_{l=1}^{\infty}\lambda_l\gamma_l\varphi_l.$$
Setting $g = \sum_{l=1}^{\infty}\lambda_l\gamma_l\varphi_l$, we have $\gamma_l = \langle g,\varphi_l\rangle/\lambda_l$. The index $V_X$ is then given by
$$V_X = \operatorname{Var}\big(\mathbb{E}(Y\mid X)\big) = \operatorname{Var}(\langle\beta,X\rangle) = \mathbb{E}\big(\langle\beta,X\rangle\langle\beta,X\rangle\big) = \langle\beta,\Gamma(\beta)\rangle,$$
which expands as
$$V_X = \sum_{l=1}^{\infty}\lambda_l\gamma_l^2.$$
We consider the empirical unbiased estimator of $\gamma_l$:
$$\hat\gamma_l = \frac{1}{\lambda_l}\,\frac{1}{n}\sum_{i=1}^{n}\langle X_i,\varphi_l\rangle\, Y_i.$$
For all $m\in\mathbb{N}^*$, we introduce the $U$-statistic of order 2:
$$\hat V_m^X = \sum_{l=1}^{m}\frac{1}{\lambda_l}\,\frac{1}{n(n-1)}\sum_{1\le i\ne j\le n}\langle X_i,\varphi_l\rangle\, Y_i\,\langle X_j,\varphi_l\rangle\, Y_j. \qquad (3)$$
We have
$$\mathbb{E}\,\hat V_m^X = \sum_{l=1}^{m}\lambda_l\gamma_l^2,$$
hence $\hat V_m^X$ is a biased estimator of $V_X$, with bias
$$B_m = \sum_{l=m+1}^{\infty}\lambda_l\gamma_l^2.$$
In the following section, we study the asymptotic behaviour of $\hat V_m^X$, for a suitable choice of the parameter $m$.
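The estimator (3) is only a few lines of code once one works directly in the Karhunen-Loève coordinates $\langle X_i,\varphi_l\rangle = \sqrt{\lambda_l}\,\xi_{il}$. The sketch below is ours: it simulates Model (2) with a Brownian-motion input (eigenvalues from Example 2.1) and illustrative coefficients $\gamma_l$, sample size `n` and truncation `m`.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, L = 2000, 40, 100

# Brownian-motion eigenvalues; gamma holds the (illustrative) coefficients of beta
lam = 1.0 / (np.pi * (np.arange(1, L + 1) - 0.5)) ** 2
gam = np.zeros(L)
gam[:5] = [1.0, 0.5, 0.25, 0.125, 0.0625]
eta = 0.1

# Karhunen-Loeve coordinates <X_i, phi_l> = sqrt(lam_l) * xi_il, xi ~ N(0, 1)
xi = rng.standard_normal((n, L))
scores = np.sqrt(lam) * xi
y = scores @ gam + eta * rng.standard_normal(n)

# U-statistic (3): for each l <= m, average <X_i,phi_l> Y_i <X_j,phi_l> Y_j over i != j,
# using sum_{i != j} a_i a_j = (sum_i a_i)^2 - sum_i a_i^2
a = scores[:, :m] * y[:, None]
u = (a.sum(axis=0) ** 2 - (a ** 2).sum(axis=0)) / (n * (n - 1))
v_hat = float(np.sum(u / lam[:m]))

v_true = float(np.sum(lam * gam ** 2))   # V_X = sum_l lam_l gam_l^2
```

Since the $\gamma_l$ vanish beyond $l = 5 \le m$, the bias $B_m$ is zero here and `v_hat` fluctuates around `v_true` at the parametric rate $1/\sqrt{n}$.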
2.1 Central limit theorem
Using Hoeffding's decomposition of the $U$-statistic $\hat V_m^X$ (see [7]), we straightforwardly get the following result.
Proposition 1. $\hat V_m^X$ can be rewritten as the sum of a totally degenerate $U$-statistic of order 2, a centered linear term and a deterministic term, in the following way:
$$\hat V_m^X = U_n^K + P_n^L + \sum_{l=1}^{m}\gamma_l^2\lambda_l, \qquad (4)$$
where
$$U_n^K := \sum_{l=1}^{m}\frac{1}{\lambda_l}\,\frac{1}{n(n-1)}\sum_{1\le i\ne j\le n}\big(Y_i\langle X_i,\varphi_l\rangle-\gamma_l\lambda_l\big)\big(Y_j\langle X_j,\varphi_l\rangle-\gamma_l\lambda_l\big),$$
$$P_n^L := \frac{2}{n}\sum_{l=1}^{m}\sum_{i=1}^{n}\gamma_l\big(Y_i\langle X_i,\varphi_l\rangle-\gamma_l\lambda_l\big).$$
As a consequence, we have
$$\hat V_m^X - V_X = U_n^K + P_n^L - \sum_{l>m}\gamma_l^2\lambda_l = U_n^K + P_n^L - B_m.$$
Theorem 2. Let $(X_i,Y_i)_{1\le i\le n}$ be i.i.d. observations with the same distribution as $(X,Y)$ from Model (2). We assume that $\mathbb{E}(\|X\|^4)<+\infty$ and that $\mathbb{E}(\varepsilon^4)<+\infty$. We consider the Karhunen-Loève expansion of $X$:
$$X = \sum_{l\ge 1}\sqrt{\lambda_l}\,\xi_l\,\varphi_l.$$
We assume
$$\sup_{l\ge 1}\mathbb{E}(\xi_l^4)<+\infty. \qquad (5)$$
We consider the estimator $\hat V_m^X$ of $V_X$ defined by (3) with $m = m(n) = \sqrt{n}\,h(n)$, where $h(n)$ satisfies $h(n)\to 0$ and, for all $\alpha>0$, $n^\alpha h(n)\to+\infty$ as $n\to+\infty$. We assume that there exist $C>0$ and $\delta>1$ such that
$$\forall l\ge 1,\quad \lambda_l\le C l^{-\delta}.$$
Then we have
$$\sqrt{n}\,\big(\hat V_m^X - V_X\big) \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}\big(0,\,4\operatorname{Var}(Y\langle X,\beta\rangle)\big). \qquad (6)$$
Comments:
1. We may choose $h(n) = 1/\log(n)$, and hence $m(n) = \sqrt{n}/\log n$, to fulfill the condition $\forall\alpha>0$, $\lim_{n\to\infty}n^{\alpha}h(n)=+\infty$. The estimator $\hat V_m^X$ converges at the parametric rate $1/\sqrt{n}$, for any $\beta$.
2. The nature of the problem of estimating the quadratic functional $V_X$ is completely different from that of estimating the signal $\beta$, for which nonparametric rates are obtained (see for example [2] for the estimation of $\beta$ in a circular functional linear model).
3. We will prove in the next section the asymptotic efficiency of the estimator $\hat V_m^X$.
4. Note that condition (5) is satisfied when $X$ is a Gaussian process, since in that case the variables $(\xi_l)_{l\ge 1}$ are i.i.d. standard Gaussian variables.
5. In Theorem 2 we have assumed that there exist $C>0$ and $\delta>1$ such that $\forall l\ge 1$, $\lambda_l\le Cl^{-\delta}$. Recall that since $\mathbb{E}\,\|X\|^2<+\infty$, we have $\sum_{l\ge 1}\lambda_l<+\infty$, hence this assumption is not very strong.
Proof. In order to prove (6), we will show that
$$B_m^2 = o(1/n), \qquad \mathbb{E}\big((U_n^K)^2\big) = o(1/n), \qquad \sqrt{n}\,P_n^L \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}\big(0,\,4\operatorname{Var}(Y\langle X,\beta\rangle)\big).$$
a) Bias term.
Since we have assumed that $\forall l\ge 1$, $\lambda_l\le Cl^{-\delta}$ for some $C>0$ and $\delta>1$, and since $\sum_{l\ge 1}\gamma_l^2<+\infty$, we get
$$B_m = \sum_{l>m}\gamma_l^2\lambda_l \le C\sum_{l>m}\gamma_l^2\, l^{-\delta} \le C m^{-\delta}\sum_{l>m}\gamma_l^2.$$
Recalling the definition of $m = m(n)$, we obtain $B_m^2 = o(1/n)$.
b) Term $U_n^K$.
One has $\mathbb{E}\big((U_n^K)^2\big) = E_1 + E_2 + E_3$, where
$$E_1 = \frac{2}{(n(n-1))^2}\sum_{l_1,l_2=1}^{m}\frac{1}{\lambda_{l_1}\lambda_{l_2}}\sum_{1\le i_1\ne j_1\le n}\mathbb{E}\Big[(Y_{i_1}\langle X_{i_1},\varphi_{l_1}\rangle-\lambda_{l_1}\gamma_{l_1})(Y_{i_1}\langle X_{i_1},\varphi_{l_2}\rangle-\lambda_{l_2}\gamma_{l_2})(Y_{j_1}\langle X_{j_1},\varphi_{l_1}\rangle-\lambda_{l_1}\gamma_{l_1})(Y_{j_1}\langle X_{j_1},\varphi_{l_2}\rangle-\lambda_{l_2}\gamma_{l_2})\Big],$$
$$E_2 = \frac{4}{(n(n-1))^2}\sum_{l_1,l_2=1}^{m}\frac{1}{\lambda_{l_1}\lambda_{l_2}}\sum_{i_1,j_1,j_2\ \text{all}\ \ne}\mathbb{E}\Big[(Y_{i_1}\langle X_{i_1},\varphi_{l_1}\rangle-\lambda_{l_1}\gamma_{l_1})(Y_{i_1}\langle X_{i_1},\varphi_{l_2}\rangle-\lambda_{l_2}\gamma_{l_2})(Y_{j_1}\langle X_{j_1},\varphi_{l_1}\rangle-\lambda_{l_1}\gamma_{l_1})(Y_{j_2}\langle X_{j_2},\varphi_{l_2}\rangle-\lambda_{l_2}\gamma_{l_2})\Big],$$
$$E_3 = \frac{1}{(n(n-1))^2}\sum_{l_1,l_2=1}^{m}\frac{1}{\lambda_{l_1}\lambda_{l_2}}\sum_{i_1,j_1,i_2,j_2\ \text{all}\ \ne}\mathbb{E}\Big[(Y_{i_1}\langle X_{i_1},\varphi_{l_1}\rangle-\lambda_{l_1}\gamma_{l_1})(Y_{i_2}\langle X_{i_2},\varphi_{l_2}\rangle-\lambda_{l_2}\gamma_{l_2})(Y_{j_1}\langle X_{j_1},\varphi_{l_1}\rangle-\lambda_{l_1}\gamma_{l_1})(Y_{j_2}\langle X_{j_2},\varphi_{l_2}\rangle-\lambda_{l_2}\gamma_{l_2})\Big].$$
One easily sees that
$$E_2 = 0 \quad\text{and}\quad E_3 = 0,$$
since the variables $(Y\langle X,\varphi_l\rangle-\lambda_l\gamma_l)$ are centered and independent across observations.
Let us now compute $E_1$. Denote $\bar Z_{j,k} := Y_j\langle X_j,\varphi_k\rangle-\lambda_k\gamma_k$. Then
$$E_1 = \frac{2}{(n(n-1))^2}\sum_{l,k=1}^{m}\frac{1}{\lambda_l\lambda_k}\sum_{1\le i\ne j\le n}\mathbb{E}\big[\bar Z_{1,l}\bar Z_{2,l}\bar Z_{1,k}\bar Z_{2,k}\big]
= \frac{2}{n(n-1)}\sum_{l,k=1}^{m}\frac{1}{\lambda_l\lambda_k}\,\mathbb{E}\big[\bar Z_{1,l}\bar Z_{2,l}\bar Z_{1,k}\bar Z_{2,k}\big]
= \frac{2}{n(n-1)}\sum_{l,k=1}^{m}\frac{1}{\lambda_l\lambda_k}\,\mathbb{E}^2\big[\bar Z_{1,l}\bar Z_{1,k}\big],$$
and, bounding the variance-type terms by second moments and applying the Cauchy-Schwarz inequality,
$$E_1 \le \frac{2}{n(n-1)}\sum_{l,k=1}^{m}\frac{1}{\lambda_l\lambda_k}\,\mathbb{E}\big[(Y\langle X,\varphi_l\rangle)^2\big]\,\mathbb{E}\big[(Y\langle X,\varphi_k\rangle)^2\big]
= \frac{2}{n(n-1)}\sum_{l,k=1}^{m}\mathbb{E}\big[Y^2\xi_l^2\big]\,\mathbb{E}\big[Y^2\xi_k^2\big]
\le \frac{2}{n(n-1)}\,\mathbb{E}\big[Y^4\big]\left(\sum_{l=1}^{m}\mathbb{E}\big[\xi_l^4\big]^{1/2}\right)^{2}.$$
By assumption (5) we know that $\sup_k\mathbb{E}(\xi_k^4)\le K$, hence we have
$$E_1 \le \frac{2Km^2}{n(n-1)}\,\mathbb{E}\big[Y^4\big],$$
and since $m^2 = n\,h^2(n)$ with $h(n)\to 0$, we obtain that $\mathbb{E}\big((U_n^K)^2\big) = o(1/n)$.
c) Term $P_n^L$.
First, let
$$P_n^{L0} := \frac{2}{n}\sum_{i=1}^{n}\Big[Y_i\langle X_i,\beta\rangle-\mathbb{E}(Y\langle X,\beta\rangle)\Big] \qquad (7)$$
$$\phantom{P_n^{L0} :}= \frac{2}{n}\sum_{i=1}^{n}\Big(Y_i\langle X_i,\beta\rangle-\sum_{l\ge 1}\gamma_l^2\lambda_l\Big), \qquad (8)$$
since
$$\mathbb{E}(Y\langle X,\beta\rangle) = \mathbb{E}\big(\langle X,\beta\rangle^2\big) = \mathbb{E}\left(\sum_{l=1}^{\infty}\gamma_l\sqrt{\lambda_l}\,\xi_l\right)^{2} = \sum_{l=1}^{\infty}\gamma_l^2\lambda_l.$$
We write
$$P_n^L = \frac{2}{n}\sum_{l=1}^{m}\sum_{i=1}^{n}\gamma_l\big(Y_i\langle X_i,\varphi_l\rangle-\gamma_l\lambda_l\big)
= \frac{2}{n}\sum_{i=1}^{n}\Big(Y_i\langle X_i,\beta\rangle_m-\sum_{l=1}^{m}\gamma_l^2\lambda_l\Big)
= P_n^{L0} + \big(P_n^L-P_n^{L0}\big),$$
where $\langle X_i,\beta\rangle_m$ denotes $\sum_{l=1}^{m}\gamma_l\langle X_i,\varphi_l\rangle$. On the one hand,
$$P_n^{L0}-P_n^L = \frac{2}{n}\sum_{i=1}^{n}Y_i\big[\langle X_i,\beta\rangle-\langle X_i,\beta\rangle_m\big]-2\sum_{l>m}\gamma_l^2\lambda_l
= \frac{2}{n}\sum_{i=1}^{n}\sum_{l>m}\gamma_l\big[Y_i\langle X_i,\varphi_l\rangle-\gamma_l\lambda_l\big].$$
Let $Z := \sum_{l>m}\gamma_l\big[Y\langle X,\varphi_l\rangle-\gamma_l\lambda_l\big]$ and let us give an upper bound for $\operatorname{Var}(Z)$:
$$\operatorname{Var}(Z) = \operatorname{Var}\Big(\sum_{l>m}\gamma_l Y\langle X,\varphi_l\rangle\Big)
\le \mathbb{E}\left[Y^2\Big(\sum_{l>m}\gamma_l\langle X,\varphi_l\rangle\Big)^{2}\right]
= \mathbb{E}\big[Y^2\langle X,\beta\rangle_{m^\perp}^2\big]
\le \mathbb{E}\big[Y^2\|X\|_{m^\perp}^2\big]\,\|\beta\|_{m^\perp}^2
\le \mathbb{E}\big[Y^2\|X\|^2\big]\,\|\beta\|_{m^\perp}^2$$
by the Cauchy-Schwarz inequality, where $\langle X,\beta\rangle_{m^\perp}$, $\|X\|_{m^\perp}^2$ and $\|\beta\|_{m^\perp}^2$ respectively denote $\sum_{l>m}\gamma_l\langle X,\varphi_l\rangle$, $\sum_{l>m}\langle X,\varphi_l\rangle^2$ and $\sum_{l>m}\gamma_l^2$. The first expectation does not depend on $n$, and the second factor tends to 0 when $n$ (and therefore $m$) tends to $\infty$. As a consequence:
- $\operatorname{Var}(Z) = o(1)$,
- $\operatorname{Var}(P_n^L-P_n^{L0}) = o(1/n)$,
- $P_n^L-P_n^{L0} = o_P(1/\sqrt{n})$.
On the other hand, a central limit theorem holds for $P_n^{L0}$, since it is an empirical sum of i.i.d. centered variables. As a conclusion,
$$\sqrt{n}\,P_n^{L0} \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}\big(0,\,4\operatorname{Var}(Y\langle X,\beta\rangle)\big), \qquad (9)$$
and, by Slutsky's theorem, since $P_n^L-P_n^{L0} = o_P(1/\sqrt{n})$,
$$\sqrt{n}\,P_n^L \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}\big(0,\,4\operatorname{Var}(Y\langle X,\beta\rangle)\big). \qquad (10)$$
This concludes the proof.
Our aim is to estimate the Sobol index $S_X$, which is defined by $S_X = V_X/\operatorname{Var}(Y)$. It is therefore natural to introduce the estimator of $S_X$ defined by $\hat V_m^X/\hat V$, where $\hat V$ is the empirical variance estimating $\operatorname{Var}(Y)$:
$$\hat V := \frac{1}{n-1}\sum_{i=1}^{n}\big(Y_i-\bar Y_n\big)^2.$$
In the following theorem, we obtain the asymptotic behaviour of this estimator.
Theorem 3. Under the same assumptions as in Theorem 2, we have
$$\sqrt{n}\left(\frac{\hat V_m^X}{\hat V}-S_X\right) \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}\left(0,\,\frac{\operatorname{Var}(U)}{(\operatorname{Var}(Y))^2}\right), \qquad (11)$$
where $U := 2Y\langle X,\beta\rangle - S_X\,(Y-\mathbb{E}(Y))^2$.
Proof. (i) First, let $W_i = \big(2[\,Y_i\langle X_i,\beta\rangle-\mathbb{E}(Y\langle X,\beta\rangle)\,],\ Y_i^2,\ Y_i\big)^t$ for $i=1,\dots,n$, and $W = \big(2[\,Y\langle X,\beta\rangle-\mathbb{E}(Y\langle X,\beta\rangle)\,],\ Y^2,\ Y\big)^t$. Then
$$\overline W_n = \frac{1}{n}\sum_{i=1}^{n}W_i = \big(P_n^{L0},\ \overline{Y^2}_n,\ \overline Y_n\big) \quad\text{and}\quad \mathbb{E}_W := \mathbb{E}(W) = \big(0,\ \mathbb{E}(Y^2),\ \mathbb{E}(Y)\big).$$
Now, from the central limit theorem,
$$\sqrt{n}\,\big(\overline W_n-\mathbb{E}_W\big) \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}_3(0,\Sigma)$$
with
$$\Sigma = \begin{pmatrix} 4\operatorname{Var}(Y\langle X,\beta\rangle) & 2\operatorname{Cov}(Y\langle X,\beta\rangle, Y^2) & 2\operatorname{Cov}(Y\langle X,\beta\rangle, Y)\\ 2\operatorname{Cov}(Y\langle X,\beta\rangle, Y^2) & \operatorname{Var}(Y^2) & \operatorname{Cov}(Y^2, Y)\\ 2\operatorname{Cov}(Y\langle X,\beta\rangle, Y) & \operatorname{Cov}(Y^2, Y) & \operatorname{Var}(Y) \end{pmatrix}.$$
(ii) Then let $W_i' = W_i + \big(U_n^K+(P_n^L-P_n^{L0})+\sum_{l=1}^{m}\gamma_l^2\lambda_l,\ 0,\ 0\big)$. We have
$$\overline W_n' = \frac{1}{n}\sum_{i=1}^{n}W_i' = \big(\hat V_m^X,\ \overline{Y^2}_n,\ \overline Y_n\big).$$
Since $U_n^K+(P_n^L-P_n^{L0})-B_m = o_P(1/\sqrt{n})$, we still have a central limit theorem:
$$\sqrt{n}\,\big(\overline W_n'-\mathbb{E}_{W'}\big) \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}_3(0,\Sigma),$$
where $\mathbb{E}_{W'} = \big(\mathbb{E}(Y\langle X,\beta\rangle),\ \mathbb{E}(Y^2),\ \mathbb{E}(Y)\big)$.
(iii) Let $\Phi$ be the mapping from $\mathbb{R}^3$ to $\mathbb{R}$ defined by $\Phi(x,y,z) = \dfrac{x}{y-z^2}$. We want to apply the Delta method (cf. [24]) to $\overline W_n'$ and $\Phi$. Easily,
$$\frac{\partial\Phi}{\partial x}(x,y,z) = \frac{1}{y-z^2}, \qquad \frac{\partial\Phi}{\partial x}(\mathbb{E}_{W'}) = \frac{1}{\operatorname{Var}(Y)},$$
$$\frac{\partial\Phi}{\partial y}(x,y,z) = -\frac{x}{(y-z^2)^2}, \qquad \frac{\partial\Phi}{\partial y}(\mathbb{E}_{W'}) = -\frac{\mathbb{E}(Y\langle X,\beta\rangle)}{(\operatorname{Var}(Y))^2},$$
$$\frac{\partial\Phi}{\partial z}(x,y,z) = \frac{2xz}{(y-z^2)^2}, \qquad \frac{\partial\Phi}{\partial z}(\mathbb{E}_{W'}) = \frac{2\,\mathbb{E}(Y\langle X,\beta\rangle)\,\mathbb{E}(Y)}{(\operatorname{Var}(Y))^2},$$
so that
$$\sqrt{n}\,\big(\Phi(\overline W_n')-\Phi(\mathbb{E}_{W'})\big) \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}\big(0,\,\Phi'(\mathbb{E}_{W'})\,\Sigma\,\Phi'(\mathbb{E}_{W'})^t\big).$$
But $\Phi(\overline W_n') = \hat V_m^X/\hat V$, $\Phi(\mathbb{E}_{W'}) = S_X$, and
$$\Phi'(\mathbb{E}_{W'})\,\Sigma\,\Phi'(\mathbb{E}_{W'})^t = \frac{4\operatorname{Var}(Y\langle X,\beta\rangle)}{(\operatorname{Var}(Y))^2} + \frac{2\,\mathbb{E}(Y\langle X,\beta\rangle)}{(\operatorname{Var}(Y))^3}\Big[4\,\mathbb{E}(Y)\operatorname{Cov}(Y\langle X,\beta\rangle, Y) - 2\operatorname{Cov}(Y\langle X,\beta\rangle, Y^2) + 2(\mathbb{E}(Y))^2\,\mathbb{E}(Y\langle X,\beta\rangle)\Big] + \frac{(\mathbb{E}(Y\langle X,\beta\rangle))^2}{(\operatorname{Var}(Y))^4}\Big[\operatorname{Var}(Y^2) - 4\,\mathbb{E}(Y)\operatorname{Cov}(Y^2, Y)\Big].$$
One can check that
$$\Phi'(\mathbb{E}_{W'})\,\Sigma\,\Phi'(\mathbb{E}_{W'})^t = \frac{\operatorname{Var}(U)}{(\operatorname{Var}(Y))^2}.$$
Remark 2.1.
1. In the case where the variance of $Y$ is known and plugged into the estimator, the result is
$$\sqrt{n}\left(\frac{\hat V_m^X}{\operatorname{Var}(Y)}-S_X\right) \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}\left(0,\,\frac{4\operatorname{Var}(Y\langle X,\beta\rangle)}{(\operatorname{Var}(Y))^2}\right). \qquad (12)$$
2. In the general case where the variance of $Y$ is unknown and $Y$ is centered, we thus have an extra term (due to the estimation of the variance); namely, the difference between the two asymptotic variances is
$$\frac{\operatorname{Var}(U)-4\operatorname{Var}(Y\langle X,\beta\rangle)}{(\operatorname{Var}(Y))^2} = \frac{\mathbb{E}(Y\langle X,\beta\rangle)}{(\operatorname{Var}(Y))^4}\Big[\mathbb{E}(Y\langle X,\beta\rangle)\operatorname{Var}(Y^2) - 4\operatorname{Var}(Y)\operatorname{Cov}(Y\langle X,\beta\rangle, Y^2)\Big].$$
It is worth noticing that this term is not always positive. Namely, if $\langle X,\beta\rangle$ is a centered variable with second moment $s^2$ and fourth moment $k_1 s^4$, and if $\varepsilon$ is a centered variable with second moment $\eta^2$ and fourth moment $k_2\eta^4$, then the latter extra term is
$$-\frac{6s^4}{(s^2+\eta^2)^2} + \frac{s^4}{(s^2+\eta^2)^4}\Big((3-k_1)(3s^4+4\eta^2 s^2) - (3-k_2)\eta^4\Big).$$
In particular, when $\langle X,\beta\rangle\sim\mathcal{N}(0,s^2)$ and $\varepsilon\sim\mathcal{N}(0,\eta^2)$, we have $k_1 = k_2 = 3$ and one gets an extra term equal to
$$-\frac{6s^4(s^2+\eta^2)^2}{(\operatorname{Var}(Y))^4}.$$
In this particular case, the asymptotic variance of the estimator of the Sobol index is smaller when the variance of $Y$ is estimated than when it is known and plugged into the estimator.
2.2 Asymptotic efficiency
In this section we prove that $\hat V_m^X/\hat V$ is asymptotically efficient for estimating the Sobol index $S_X$ (see [24], Section 25, for the definition of asymptotic efficiency). This notion extends the notion of the Cramér-Rao bound and provides a criterion of optimality for estimators, called asymptotic efficiency.
Proposition 2.1 (Asymptotic efficiency). Under the same assumptions as in Theorem 2, $\hat V_m^X/\hat V$ is asymptotically efficient for estimating $S_X$.
Proof. Note that
$$S_X = \Phi\big(V_X,\operatorname{Var}(Y)\big) \quad\text{and}\quad \frac{\hat V_m^X}{\hat V} = \Phi\big(\hat V_m^X,\hat V\big),$$
where $\Phi$ is defined by $\Phi(x,y) = x/y$.
First, let us prove that $\hat V_m^X$ is asymptotically efficient for estimating $V_X$. The result comes directly from the Hoeffding decomposition (4), $\hat V_m^X - V_X = U_n^K + P_n^L - B_m$: the linear term $P_n^L$ is asymptotically efficient (example 25.24 of [24], efficiency of the empirical distribution), and the other terms are $o_P(1/\sqrt{n})$. From this we conclude the asymptotic efficiency of $\hat V_m^X$.
Second, using again example 25.24, we get the asymptotic efficiency of $\hat V$ for estimating $\operatorname{Var}(Y)$.
Then Theorem 25.50 (efficiency in product space) in [24] gives that $(\hat V_m^X,\hat V)$ is asymptotically efficient for estimating $(V_X,\operatorname{Var}(Y))$.
Now, since $\Phi$ is differentiable on $\mathbb{R}^2\setminus\{y=0\}$, Theorem 25.47 of [24] (efficiency and the Delta method) shows that $\Phi(\hat V_m^X,\hat V)$ is asymptotically efficient for estimating $\Phi(V_X,\operatorname{Var}(Y)) = S_X$.
3 Multiple functional linear regression model
We recall the model (1):
$$Y = \mu + \sum_{k=1}^{p}\langle\beta_k, X^k\rangle + \varepsilon.$$
In this setting we observe an $n$-sample $(Y_i, X_i^1,\dots,X_i^p)_{1\le i\le n}$ of $(Y, X^1,\dots,X^p)$.
3.1 Estimation by U-statistics of order 2
We assume that the processes $(X^k)_{1\le k\le p}$ are independent. Under this assumption, the whole of the previous section applies, and the Sobol indices can be estimated by $U$-statistics of order 2. For all $m\ge 1$, let
$$\hat V_m^k = \sum_{l=1}^{m}\frac{1}{\lambda_l^k}\,\frac{1}{n(n-1)}\sum_{1\le i\ne j\le n}\langle X_i^k,\varphi_l^k\rangle\, Y_i\,\langle X_j^k,\varphi_l^k\rangle\, Y_j.$$
For a suitable choice of $m = m(n)$, $\hat V_m^k$ is a consistent estimator of $V^k$. Indeed, by independence,
$$\mathbb{E}(Y\mid X^k) = \mu + \langle\beta_k, X^k\rangle,$$
and, denoting $\tilde\varepsilon = \sum_{j=1, j\ne k}^{p}\langle\beta_j, X^j\rangle + \varepsilon$, we retrieve the simple regression model.
In fact we can obtain a central limit theorem for $\hat V_m$, where
$$\hat V_m = (\hat V_m^1, \hat V_m^2, \dots, \hat V_m^p)^T. \qquad (13)$$
Let $V^j = \operatorname{Var}\big(\mathbb{E}(Y\mid X^j)\big)$, $j = 1,\dots,p$, and $V = (V^1, V^2, \dots, V^p)^T$.
Theorem 4. Let $(Y_i, X_i^1,\dots,X_i^p)$ be i.i.d. observations from Model (1). We assume that, for all $1\le j\le p$, $\mathbb{E}(\|X^j\|^4)<+\infty$, and that $\mathbb{E}(\varepsilon^4)<+\infty$. We consider the Karhunen-Loève expansion of $X^j$:
$$X^j = \sum_{l\ge 1}\sqrt{\lambda_l^j}\,\xi_l^j\,\varphi_l^j.$$
We assume
$$\forall\,1\le j\le p,\quad \sup_{l\ge 1}\mathbb{E}\big((\xi_l^j)^4\big)<+\infty. \qquad (14)$$
We consider the estimator $\hat V_m$ of $V$ defined by (13) with $m = m(n) = \sqrt{n}\,h(n)$, where $h(n)$ satisfies $h(n)\to 0$ as $n\to+\infty$ and, for all $\alpha>0$, $n^{\alpha}h(n)\to+\infty$ as $n\to+\infty$. We assume that
$$\forall\,1\le j\le p,\ \forall\,l\ge 1,\quad \lambda_l^j\le C_j\, l^{-\delta_j} \quad\text{for some } C_j>0 \text{ and } \delta_j>1.$$
The following result holds:
$$\sqrt{n}\,\big(\hat V_m-V\big) \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}_p\Big(0,\ 4\big(\operatorname{Cov}(Y\langle X^k,\beta_k\rangle,\ Y\langle X^l,\beta_l\rangle)\big)_{k,l=1,\dots,p}\Big). \qquad (15)$$
Proof. Using the same decomposition
$$\hat V_m - V = U_n^K + P_n^L - B_m,$$
and the fact that $\|U_n^K\|$ and $\|P_n^L-P_n^{L0}\|$ are $o_P(1/\sqrt{n})$ and $\|B_m\|^2 = o(1/n)$, we just need to check that $P_n^{L0} = (P_n^{L0,1}, P_n^{L0,2}, \dots, P_n^{L0,p})^T$ converges in distribution towards $\mathcal{N}_p\big(0,\ 4(\operatorname{Cov}(Y\langle X^k,\beta_k\rangle, Y\langle X^l,\beta_l\rangle))_{k,l=1,\dots,p}\big)$. We have
$$P_n^{L0} = \frac{2}{n}\sum_{i=1}^{n}W_i,$$
where $W_i = (W_i^1,\dots,W_i^p)^T$ and $W_i^k = Y_i\langle X_i^k,\beta_k\rangle-\mathbb{E}(Y\langle X^k,\beta_k\rangle)$. Since the $W_i$'s are i.i.d., the result follows from the central limit theorem in $\mathbb{R}^p$.
Now define $\hat S_m := \hat V_m/\hat V = (\hat V_m^1/\hat V, \dots, \hat V_m^p/\hat V)^T$, where $\hat V$ denotes the empirical estimator of $\operatorname{Var}(Y)$. Recall that $S = V/\operatorname{Var}(Y) = (S^1,\dots,S^p)^T$. Using once again the Delta method, we easily get the following result.
Proposition 3.1. Under the same assumptions as in Theorem 4, we have
$$\sqrt{n}\,\big(\hat S_m-S\big) \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}_p\left(0,\ \left(\frac{\operatorname{Cov}(U^k, U^l)}{(\operatorname{Var}(Y))^2}\right)_{k,l=1,\dots,p}\right), \qquad (16)$$
where $U^k := 2(Y-\mathbb{E}(Y))\langle X^k,\beta_k\rangle - S^k\,(Y-\mathbb{E}(Y))^2$.
3.2 Comparison with Sobol estimators
Sobol [21] proposed an empirical method, based on a particular design of experiments, to estimate Sobol indices. This method, called Sobol pick-freeze (SPF), is also studied in [18]. In [9, 5], the authors prove asymptotic normality and efficiency of the Sobol estimators. In this section, our aim is to compare the method proposed in this paper, which is based on $U$-statistics of order 2, with the method proposed by Sobol, in order to see which experimental setup should be used by the practitioner. It is important to notice that the two procedures are not based on the same design of experiments. Both methods lead to asymptotically normal and efficient estimators, but the asymptotic variances are different due to the designs of experiments. We therefore want to compare the asymptotic variances obtained by the two methods, for a similar number of experiments in both cases. Let us first recall the design of experiments and the estimators proposed by Sobol.
Let $(X,Y)$ obey Model (1), where $X = (X^1,\dots,X^p)$. Let $X' = (X'^1,\dots,X'^p)$ be an i.i.d. copy of $X$. For all $k\in\{1,\dots,p\}$, let $Y^k$ be defined by
$$Y^k = \langle X^k,\beta_k\rangle + \sum_{j=1, j\ne k}^{p}\langle X'^j,\beta_j\rangle + \varepsilon',$$
where $\varepsilon'$ is an i.i.d. copy of $\varepsilon$. Let $(Y_1,\dots,Y_n)$ be i.i.d. copies of $Y$ and, for all $k\in\{1,\dots,p\}$, let $(Y_1^k,\dots,Y_n^k)$ be i.i.d. copies of $Y^k$. The estimator proposed by Sobol to estimate $V^k = \operatorname{Var}(\mathbb{E}(Y\mid X^k))$ is defined by
$$\hat V_{SPF}^k = \frac{1}{n}\sum_{i=1}^{n}Y_i Y_i^k - \left(\frac{1}{n(p+1)}\sum_{i=1}^{n}\big(Y_i+Y_i^1+\dots+Y_i^p\big)\right)^{2}.$$
Now let $\hat S_{SPF} = \hat V_{SPF}/\tilde V = (\hat V_{SPF}^1/\tilde V, \dots, \hat V_{SPF}^p/\tilde V)^T$, where $\tilde V$ is the estimator of $\operatorname{Var}(Y)$ in the SPF method, defined by
$$\tilde V = \frac{1}{n(p+1)}\sum_{i=1}^{n}\Big((Y_i)^2+(Y_i^1)^2+\dots+(Y_i^p)^2\Big) - \left(\frac{1}{n(p+1)}\sum_{i=1}^{n}\big(Y_i+Y_i^1+\dots+Y_i^p\big)\right)^{2}.$$
Theorem 5 (see [9, 5]). We have
$$\sqrt{n}\,\big(\hat S_{SPF}-S\big) \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}_p\left(0,\ \left(\frac{\operatorname{Cov}(V^k, V^l)}{(\operatorname{Var}(Y))^2}\right)_{k,l=1,\dots,p}\right), \qquad (17)$$
where $V^k := Y Y^k - S^k\Big(Y^2+\sum_{j=1}^{p}(Y^j)^2\Big)\big/(p+1)$.
The asymptotic covariance matrix appearing in the central limit theorem for the estimator $\hat S_{SPF} = (\hat S_{SPF}^1,\dots,\hat S_{SPF}^p)$ (resp. $\hat S_m = (\hat S_m^1,\dots,\hat S_m^p)$) is denoted by $(\operatorname{Var}(Y))^{-2}\Gamma_{SPF}$ (resp. $(\operatorname{Var}(Y))^{-2}\Gamma$, cf. (15)). Our aim is to compare $\Gamma$ and $\Gamma_{SPF}$.
For the sake of simplicity, we assume $\varepsilon = 0$. We then define $W_i = \langle X^i,\beta_i\rangle$, $\sigma_i^2 = \operatorname{Var}(W_i)$, $\sigma^2 = \sum_{i=1}^{p}\sigma_i^2$ and $S^i = \sigma_i^2/\sigma^2$. After computations, one obtains
that $\Gamma$ is defined by
$$\forall k,\quad \Gamma(k,k) = \frac{\sigma_k^4}{\sigma^4}\sum_{i=1}^{p}\big(\mathbb{E}(W_i^4)-3\sigma_i^4\big) + 4\,\frac{\sigma^2-\sigma_k^2}{\sigma^2}\,\mathbb{E}(W_k^4) + 4\sigma^2\sigma_k^2 + 12\,\frac{\sigma_k^6}{\sigma^2} - 14\sigma_k^4,$$
$$\forall k\ne l,\quad \Gamma(k,l) = -\frac{2}{\sigma^2}\Big(\mathbb{E}(W_k^4)\,\sigma_l^2+\mathbb{E}(W_l^4)\,\sigma_k^2\Big) + 2\sigma_k^2\sigma_l^2\left(3\,\frac{\sigma_k^2+\sigma_l^2}{\sigma^2}-1\right) + \frac{\sigma_k^2\sigma_l^2}{\sigma^4}\sum_{i=1}^{p}\big(\mathbb{E}(W_i^4)-3\sigma_i^4\big),$$
and $\Gamma_{SPF}$ by
$$\forall k,\quad \Gamma_{SPF}(k,k) = \left(1-\frac{4}{p+1}\,\frac{\sigma_k^2}{\sigma^2}\right)\mathbb{E}(W_k^4) + \frac{\sigma_k^4}{\sigma^4}\,\frac{p^2-2p+5}{(p+1)^2}\sum_{i=1}^{p}\mathbb{E}(W_i^4) + \frac{\sigma_k^2}{(p+1)\,\sigma^2}\left(4+\frac{\sigma_k^2}{\sigma^2}\,\frac{-2p^2+5p-13}{p+1}\right)\sum_{i=1}^{p}\sigma_i^4 + \sigma^4 + \frac{-p^2-7p+6}{(p+1)^2}\,\sigma_k^4 + \frac{4}{p+1}\,\frac{\sigma_k^6}{\sigma^2} - \frac{4}{p+1}\,\sigma_k^2\sigma^2,$$
$$\forall k\ne l,\quad \Gamma_{SPF}(k,l) = -\frac{2}{p+1}\,\frac{\sigma_k^2\,\mathbb{E}(W_l^4)+\sigma_l^2\,\mathbb{E}(W_k^4)}{\sigma^2} + \frac{\sigma_k^2\sigma_l^2}{\sigma^4}\,\frac{p^2-2p+5}{(p+1)^2}\sum_{i=1}^{p}\mathbb{E}(W_i^4) + 2\big(\sigma_k^2+\sigma_l^2\big) + \frac{-2p^2+5p-13}{(p+1)^2\,\sigma^4}\sum_{i=1}^{p}\sigma_i^4 + \sigma^4 + \frac{2p^2-p+9}{(p+1)^2}\,\sigma_k^2\sigma_l^2 - \frac{p+3}{p+1}\,\sigma^2\big(\sigma_k^2+\sigma_l^2\big) + \frac{2}{p+1}\,\frac{\sigma_k^2\sigma_l^2}{\sigma^2}\big(\sigma_k^2+\sigma_l^2\big).$$
The Sobol experiment requires $(p+1)n$ observations (or computations of the black-box code) to estimate the $p$ indices. In order to make a fair comparison of both methods, we assume that we have $n(p+1)$ i.i.d. observations from Model (1) to estimate the Sobol indices by our method. With $n(p+1)$ observations instead of $n$, the asymptotic covariance matrix $\Gamma$ is divided by $p+1$; hence, we have to study the matrix
$$D := (p+1)\,\Gamma_{SPF} - \Gamma.$$
In order to compare the two methods, we evaluate the eigenvalues of the matrix $D$ to determine whether it is positive-definite or not. If it is not, we study only the signs of the diagonal terms of $D$, which correspond to the differences of the asymptotic variances obtained with the two methods.
First example. We assume that, for $i = 1,\dots,p$ with $p\ge 2$,
$$\langle X^i,\beta_i\rangle \sim \mathcal{N}(0,1).$$
We have
$$D_{i,i} = \frac{p^5+2p^4-5p^3-2p^2+6p-6}{p(p+1)} > 0 \quad \forall\,1\le i\le p,$$
$$D_{i,j} = \frac{p^5+4p^4+3p^3+2p-6}{p(p+1)} \quad \forall\,1\le i\ne j\le p.$$
Let $\alpha = p^5+2p^4-5p^3-2p^2+6p-6$ and $\beta = p^5+4p^4+3p^3+2p-6$. The eigenvalues of $D$ are $[\alpha-\beta]/[p(p+1)]$, with multiplicity $p-1$, and $[\alpha+(p-1)\beta]/[p(p+1)]$, with multiplicity 1. But
$$\alpha-\beta = 2p\big(-p^3-4p^2-p+2\big) < 0 \quad \forall p\ge 2,$$
and
$$\alpha-\beta+p\beta = p\big(p^5+4p^4+p^3-8p^2-2\big) > 0 \quad \forall p\ge 2.$$
For $p\ge 2$, the matrix $D$ is therefore neither positive nor negative definite. Hence, as mentioned before, we simply study the signs of the diagonal terms of the matrix $D$. Since $D_{i,i}>0$ for any $p\ge 2$ and $i = 1,\dots,p$, the asymptotic variance of each estimator of the Sobol indices with the SPF method is larger than the one obtained by the method proposed in this paper, using the same number of observations.
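The sign claims above are easy to check numerically (our sketch): $D$ has the constant $\alpha/(p(p+1))$ on its diagonal and $\beta/(p(p+1))$ off the diagonal, so it can be assembled directly from the two polynomials.

```python
import numpy as np

def D_matrix(p):
    """D for the Gaussian first example: alpha on the diagonal, beta elsewhere."""
    alpha = p**5 + 2*p**4 - 5*p**3 - 2*p**2 + 6*p - 6
    beta = p**5 + 4*p**4 + 3*p**3 + 2*p - 6
    return (beta * np.ones((p, p)) + (alpha - beta) * np.eye(p)) / (p * (p + 1))

for p in range(2, 8):
    D = D_matrix(p)
    eig = np.linalg.eigvalsh(D)
    # D is indefinite (one positive eigenvalue, p-1 negative ones) ...
    assert eig.min() < 0 < eig.max()
    # ... yet its diagonal is strictly positive, which is what the comparison uses
    assert np.all(np.diag(D) > 0)
```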
Second example. In the case $p=2$ and $\sigma^2 = \sigma_1^2+\sigma_2^2 = 1$, the model is
$$Y = \langle X^1,\beta_1\rangle + \langle X^2,\beta_2\rangle.$$
Then, with $W_1 = \langle X^1,\beta_1\rangle$ and $W_2 = \langle X^2,\beta_2\rangle$,
$$D_{1,1} = \mathbb{E}(W_1^4)\Big(-1+\frac{2}{3}\sigma_1^4\Big) + \frac{2}{3}\sigma_1^4\,\mathbb{E}(W_2^4) + 3 - 8\sigma_1^6 + 10\sigma_1^4 - 8\sigma_1^2 + \sigma_1^2\big(\sigma_1^4+\sigma_2^4\big)\Big(4-\frac{2}{3}\sigma_1^2\Big),$$
$$D_{2,2} = \mathbb{E}(W_2^4)\Big(-1+\frac{2}{3}\sigma_2^4\Big) + \frac{2}{3}\sigma_2^4\,\mathbb{E}(W_1^4) + 3 - 8\sigma_2^6 + 10\sigma_2^4 - 8\sigma_2^2 + \sigma_2^2\big(\sigma_1^4+\sigma_2^4\big)\Big(4-\frac{2}{3}\sigma_2^2\Big),$$
$$D_{2,1} = \sigma_1^2\sigma_2^2\Big(\frac{2}{3}\big(\mathbb{E}(W_1^4)+\mathbb{E}(W_2^4)\big)+3\Big) + \big(\sigma_1^4+\sigma_2^4\big)\Big(\frac{7}{3}+3\sigma_1^2\sigma_2^2\Big) - 2.$$
In the particular case where $W_1\sim\mathcal{N}(0,x)$ and $W_2\sim\mathcal{N}(0,1-x)$, for some $x\in\,]0,1[$, we obtain
$$D_{1,1}(x) = 3 - 4x + \frac{1}{3}\big(x^2 - 8x^3 + 8x^4\big),$$
$$D_{2,2}(x) = 3 - 4(1-x) + \frac{1}{3}\big((1-x)^2 - 8(1-x)^3 + 8(1-x)^4\big),$$
$$D_{1,2}(x) = \frac{1}{3} + \frac{10}{3}\,x - \frac{40}{3}\,x^2 + 20x^3 - 10x^4.$$
Let $\lambda_1(x)$ and $\lambda_2(x)$ be the eigenvalues of $D(x)$, and $x_1$ and $1-x_1$ the real zeros of its determinant ($x_1\approx 0.6701$). Then:
$$\begin{array}{c|ccccccc}
x & 0 & & 1-x_1 & & x_1 & & 1\\ \hline
\lambda_1(x)+\lambda_2(x) & & + & | & + & | & + & \\
\lambda_1(x)\,\lambda_2(x) & & - & 0 & + & 0 & - &
\end{array}$$
For $x\in\,]1-x_1, x_1[$, the matrix $D(x)$ is positive-definite. We can also study the signs of the diagonal terms of the matrix $D(x)$ for $x\in\,]0,1[$.
Now let $x_0$ be the zero of $D_{1,1}(x)$ ($x_0\approx 0.6738$). The following table gives the signs of the diagonal terms of the matrix $D(x)$:
$$\begin{array}{c|ccccccc}
x & 0 & & 1-x_0 & & x_0 & & 1\\ \hline
D_{1,1}(x) & & + & | & + & 0 & - & \\
D_{2,2}(x) & & - & 0 & + & | & + &
\end{array}$$
We deduce from this table that, for $x\in\,]1-x_0, x_0[$, our method leads to smaller variances for both estimators.
4 Numerical experiments
We perform a simulation study to evaluate the performance of the procedure proposed in this paper for estimating Sobol indices in a functional linear model, and to compare this procedure with Sobol's procedure.
We consider the model:
$$Y_i = \sum_{k=1}^{p}\langle\beta_k, X_i^k\rangle + \varepsilon_i, \qquad (18)$$
where, for all $k = 1,\dots,p$, $X^k$ and $\beta_k$ are defined from the coefficients of their expansions onto an orthonormal basis of $L^2([0,1])$. For all $k$, we consider the basis corresponding to the eigenfunctions of the Karhunen-Loève expansion of the process $X^k$, and we define the function $\beta_k$ by the coefficients of its expansion onto this basis. For the simulations, the basis that we consider is either the one associated with the Karhunen-Loève expansion of the Brownian motion (recalled in Section 2) or the one associated with the fractional Brownian motion. It is important to recall that the processes $X^k$ and the related functions $\beta_k$ are expanded onto the same basis, and that the estimators proposed in this paper will perform well in the case where the coefficients of the functions $\beta_k$ in those bases are non-increasing. The variables $\varepsilon_i$ are i.i.d. centered Gaussian variables with variance $\sigma^2$.
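A simulation of this kind can be sketched end to end. The code below is ours, with illustrative choices throughout ($p = 2$ Brownian-motion inputs, no noise, and $\gamma$ coefficients chosen so that $\beta_2 = \beta_1/2$, giving $S^1 = 0.8$ and $S^2 = 0.2$ exactly); it estimates both Sobol indices with the $U$-statistic estimators of Section 3.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, L, p = 10_000, 50, 200, 2

# Brownian-motion eigenvalues; illustrative coefficients for beta_1 and beta_2
lam = 1.0 / (np.pi * (np.arange(1, L + 1) - 0.5)) ** 2
gam = np.zeros((p, L))
gam[0, :3] = [1.0, 0.5, 0.25]
gam[1, :3] = [0.5, 0.25, 0.125]      # beta_2 = beta_1 / 2, so S = (0.8, 0.2)

# Karhunen-Loeve coordinates of the two independent input processes
scores = np.sqrt(lam) * rng.standard_normal((p, n, L))
y = sum(scores[k] @ gam[k] for k in range(p))   # noiseless model (eps_i = 0)

# U-statistic estimator of V^k for each input, as in Section 3.1
v_hat = np.empty(p)
for k in range(p):
    a = scores[k][:, :m] * y[:, None]
    u = (a.sum(axis=0) ** 2 - (a ** 2).sum(axis=0)) / (n * (n - 1))
    v_hat[k] = np.sum(u / lam[:m])
s_hat = v_hat / np.var(y, ddof=1)

v_true = (lam * gam ** 2).sum(axis=1)
s_true = v_true / v_true.sum()
```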
4.1 First example
4.1.1 Using the Karhunen-Loève expansion of the Brownian motion
We consider the model (18) with $p=2$ and $\varepsilon_i = 0$ for all $i$. The processes $X^1$ and $X^2$ are given by a truncated expansion of the Brownian motion onto its Karhunen-Loève basis, with coefficients:
$$\lambda_l = \frac{1}{\big(\pi(l-1/2)\big)^2}, \quad 1\le l\le L; \qquad \lambda_l = 0, \quad l>L.$$
Concerning the functions $\beta_1, \beta_2$, they are respectively characterized by the coefficients $(\gamma_l)_{l\ge 1}$ of their expansion onto the same basis: