
Estimation of quantile oriented sensitivity indices


HAL Id: hal-01448360

https://hal.archives-ouvertes.fr/hal-01448360

Submitted on 2 Feb 2017



Véronique Maume-Deschamps, Ibrahima Niang


Abstract. The paper concerns quantile oriented sensitivity analysis. We rewrite the corresponding indices using the Conditional Tail Expectation risk measure. Then, we use this new expression to build estimators.

1. Introduction

Many models encountered in applied sciences involve input parameters which are often not precisely known. Some of the input variables may strongly affect the output, while others have a small effect. Sensitivity analysis aims at measuring the impact of each input parameter uncertainty on the model output and, more specifically, at identifying the most sensitive parameters (or groups of parameters). One of the common metrics to evaluate the sensitivity is the Sobol index. More precisely, given two random variables $X$ and $Y$, the $X$-Sobol index on $Y$ compares the total variance of $Y$ to the expected variance of the variable $Y$ conditioned by $X$, i.e.:

\[
S_X = \frac{\mathrm{Var}(E(Y \mid X))}{\mathrm{Var}(Y)}. \tag{1.1}
\]

It rewrites $S_X = 1 - \frac{E(\mathrm{Var}(Y \mid X))}{\mathrm{Var}(Y)}$. It is a statistical measure of the relative impact of $X$ on the variability of $Y$; the most sensitive parameters can be identified and ranked as the parameters with the largest Sobol indices. Introduced in [7] under the name "Goal Oriented Sensitivity Analysis", contrast indices generalize the Sobol ones. The construction of these indices is based on contrast functions. Roughly speaking, given a contrast function $\psi$ and two random variables $X$, $Y$, the $\psi$-$X$ contrast index on $Y$ is given by (see [7]):

\[
S_X^{\psi} = \frac{\min_{\theta \in \mathbb{R}} E[\psi(Y;\theta)] - E\left(\min_{\theta \in \mathbb{R}} E[\psi(Y;\theta) \mid X]\right)}{\min_{\theta \in \mathbb{R}} E[\psi(Y;\theta)] - E\left(\min_{\theta \in \mathbb{R}} \psi(Y;\theta)\right)}.
\]

$S_X^{\psi}$ represents a statistical indicator of the impact of $X$ on the variability of the output $Y$ with respect to the argmin of a contrast function. If we consider the mean-contrast function $\psi : (y,\theta) \mapsto (y-\theta)^2$, then we retrieve the first order Sobol index defined in (1.1), which measures the impact of $X$ on the deviation of $Y$ with respect to the mean. There exists a large literature on this case (see for example [15], [2], [3], [14] or [9]).

Key words and phrases. Quantile oriented sensitivity analysis.

In this paper, we focus on contrast indices obtained with the α-quantile contrast function, i.e., with the contrast function given by:

\[
\psi_\alpha : (y,\theta) \mapsto (y-\theta)(\alpha - \mathbf{1}_{y \le \theta}), \quad \alpha \in\, ]0,1[. \tag{1.2}
\]

We call quantile contrast index the sensitivity measure that one obtains with the contrast function given by (1.2). This setting is also called quantile oriented sensitivity analysis (QOSA) (see [5]). In this work, we show that quantile contrast indices can be linked with the Conditional Tail Expectation (CTE) risk measure. Then, from this expression, we propose an estimation method based on Monte Carlo sampling techniques. Note that this approach is the first formal estimation of the quantile oriented sensitivity indices. It first appeared in [12]. Nevertheless, during the writing of this article, we have been informed of a recent work ([5]) where another procedure is proposed.

The paper is organized as follows. In Section 2, we recall basic definitions on the CTE and the contrast indices. In Section 3, we show that quantile contrast indices may be rewritten in terms of the CTE. Finally, Section 4 is devoted to the estimation of quantile contrast indices and to some simulations.

2. Main definitions

Let us consider a measurable function $f$ and a random vector $X = (X_1, \ldots, X_k) \in \mathbb{R}^k$, $k \ge 1$. Let $Y$ be the real-valued response variable: $Y = f(X)$. We assume that the random input variables $X_i$, $i = 1, \ldots, k$, are independent, which is the standard setting of sensitivity analysis.

For a random variable $Z$, $F_Z$ denotes its distribution function, and $F_Z^{-1}$ the generalized inverse of $F_Z$ (or the quantile function). We recall that for a continuous random variable $Z$ with a finite first moment, the Conditional Tail Expectation (CTE) of $Z$ at level $\alpha \in\, ]0,1[$ is given by:

\[
CTE_\alpha(Z) = E\left[Z \mid Z > F_Z^{-1}(\alpha)\right] = \frac{1}{1-\alpha} \int_\alpha^1 F_Z^{-1}(u)\,du. \tag{2.1}
\]

From an actuarial point of view, the CTE measures the average of losses given that a specified confidence level α is exceeded. We refer the reader to [6] for a detailed review.

In what follows, we introduce some concepts about contrast functions and contrast indices; the reader is referred for instance to [7]. We assume that the output $Y = f(X)$ is a continuous random variable. A contrast function is a function

\[
\psi : \mathbb{R} \times \mathbb{R} \to \mathbb{R}_+, \quad (y,\theta) \mapsto \psi(y,\theta),
\]

such that $E[\psi(Y;\theta)]$ has a unique minimizer $\theta^* \in \mathbb{R}$. Some examples of contrast functions that allow one to estimate various parameters associated with a probability distribution are listed in Table 1.

Parameter         Contrast function                      θ*
The mean          ψ(y, θ) = |y − θ|^2                    θ* = E(Y)
The median        ψ(y, θ) = |y − θ|                      θ* = F_Y^{-1}(1/2)
The α-quantile    ψ(y, θ) = (y − θ)(α − 1_{y≤θ})         θ* = F_Y^{-1}(α)

Table 1. Examples of contrast functions.
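The link between a contrast function and its parameter can be checked numerically. The following is a minimal sketch, not part of the original procedure: it minimizes the empirical contrast over a grid of candidate values and compares the minimizer with the sample mean and the sample quantile. The Gaussian sample, the grid and the helper names (`empirical_contrast`, `mean_contrast`, `quantile_contrast`) are illustrative assumptions.

```python
import numpy as np

def empirical_contrast(psi, y, thetas):
    """Empirical version of E[psi(Y; theta)] for each candidate theta."""
    # Broadcasting: rows index candidate thetas, columns index sample points.
    return psi(y[None, :], thetas[:, None]).mean(axis=1)

def mean_contrast(y, theta):
    # psi(y, theta) = |y - theta|^2
    return (y - theta) ** 2

def quantile_contrast(alpha):
    # psi_alpha(y, theta) = (y - theta) * (alpha - 1_{y <= theta})
    return lambda y, theta: (y - theta) * (alpha - (y <= theta))

rng = np.random.default_rng(0)
y = rng.normal(size=5_000)             # illustrative choice of distribution
thetas = np.linspace(-3.0, 3.0, 601)   # grid of candidate values, step 0.01

theta_mean = thetas[np.argmin(empirical_contrast(mean_contrast, y, thetas))]
theta_q90 = thetas[np.argmin(empirical_contrast(quantile_contrast(0.9), y, thetas))]

print(theta_mean, y.mean())             # grid minimizer vs. sample mean
print(theta_q90, np.quantile(y, 0.9))   # grid minimizer vs. sample 0.9-quantile
```

Up to the grid resolution, the two minimizers should agree with the sample mean and the sample quantile respectively.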

In what follows, we focus on the α-quantile contrast function, i.e. the contrast function given by:

\[
\psi_\alpha : (y,\theta) \mapsto (y-\theta)(\alpha - \mathbf{1}_{y \le \theta}), \quad \alpha \in\, ]0,1[. \tag{2.2}
\]

The quantile oriented index associated with the input $X_i$ is:

\[
S^\alpha_{X_i} = \frac{\min_{\theta \in \mathbb{R}} E[\psi_\alpha(Y;\theta)] - E\left(\min_{\theta \in \mathbb{R}} E[\psi_\alpha(Y;\theta) \mid X_i]\right)}{\min_{\theta \in \mathbb{R}} E[\psi_\alpha(Y;\theta)]}. \tag{2.3}
\]

Remark. It is straightforward to see that if $Y$ and $X_i$ are independent then $S^\alpha_{X_i} = 0$, and if $Y$ is $X_i$-measurable then $S^\alpha_{X_i} = 1$, which is an expected behavior for sensitivity indices.

We are interested in how one can express the quantile contrast index in terms of the Conditional Tail Expectation (CTE) risk measure.

3. Relation between contrast indices and Conditional Tail Expectation (CTE) risk measure

We recall that, for $X$ and $Y$ two random variables, the conditional cumulative distribution function $F_{Y|X}$ and the conditional quantile $F^{-1}_{Y|X}$ at level α are defined respectively as:

\[
F_{Y|X}(y) = P(Y \le y \mid X), \quad y \in \mathbb{R},
\]
\[
F^{-1}_{Y|X}(\alpha) = \inf\left\{ y \in \mathbb{R} \mid F_{Y|X}(y) \ge \alpha \right\}. \tag{3.1}
\]

These are $X$-measurable random functions. In the following proposition, we give a relation between the CTE risk measure and the quantile contrast index.

Proposition 3.1. Assume that $E|Y| < \infty$. Then, for all $\alpha \in\, ]0,1[$ and all $i = 1, \ldots, k$,

\[
S^\alpha_{X_i} = 1 - \frac{E\left[Y \mid Y > F^{-1}_{Y|X_i}(\alpha)\right] - E(Y)}{CTE_\alpha(Y) - E(Y)}. \tag{3.2}
\]

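Proof. $S^\alpha_{X_i}$ rewrites as:

\[
S^\alpha_{X_i} = 1 - \frac{E\left(\min_{\theta \in \mathbb{R}} E[\psi_\alpha(Y;\theta) \mid X_i]\right)}{\min_{\theta \in \mathbb{R}} E[\psi_\alpha(Y;\theta)]}.
\]

Consider $\theta_1^* = \operatorname{argmin}_{\theta \in \mathbb{R}} E[\psi_\alpha(Y;\theta)]$. We have $\theta_1^* = F_Y^{-1}(\alpha)$.

On the other hand,

\[
E\left(\min_{\theta \in \mathbb{R}} E[\psi_\alpha(Y;\theta) \mid X_i]\right) = E\left(E(\psi_\alpha(Y;\theta_2^*) \mid X_i)\right) = E\left(\psi_\alpha(Y;\theta_2^*)\right),
\]

with $\theta_2^* = \operatorname{argmin}_{\theta \in \mathbb{R}} E[\psi_\alpha(Y;\theta) \mid X_i]$, which is an $X_i$-measurable random variable. Let

\[
Z = Y - F^{-1}_{Y|X_i}(\alpha).
\]

Remark that $\frac{\partial E[\psi_\alpha(Y;\theta) \mid X_i]}{\partial \theta} = -\alpha + P(Y \le \theta \mid X_i)$ and $\theta_2^* = F^{-1}_{Y|X_i}(\alpha)$. We deduce:

\[
E\left(\psi_\alpha(Y;\theta_2^*)\right) = E\left(Z \mathbf{1}_{Y > F^{-1}_{Y|X_i}(\alpha)}\right) - (1-\alpha)E(Z)
\]
\[
= P(Y > \theta_2^*)\,E(Y \mid Y > \theta_2^*) - P(Y > \theta_2^*)\,E(\theta_2^* \mid Y > \theta_2^*) - (1-\alpha)E(Z).
\]

In addition,

\[
P(Y > \theta_2^*) = 1 - P(Y \le \theta_2^*) = 1 - E\left(E(\mathbf{1}_{Y \le \theta_2^*} \mid X_i)\right) = 1 - E\left(F_{Y|X_i}(\theta_2^*)\right) = 1 - \alpha.
\]

Moreover, conditioning by $X_i$ leads to

\[
E(\theta_2^* \mid Y > \theta_2^*) = \frac{1}{1-\alpha} E\left(\theta_2^* \mathbf{1}_{Y > \theta_2^*}\right) = \frac{1}{1-\alpha} E\left(E\left(\theta_2^* \mathbf{1}_{Y > \theta_2^*} \mid X_i\right)\right) = E(\theta_2^*).
\]

Consequently,

\[
E\left(\psi_\alpha(Y;\theta_2^*)\right) = (1-\alpha)E(Y \mid Y > \theta_2^*) - (1-\alpha)E(\theta_2^*) - (1-\alpha)E(Y) + (1-\alpha)E(\theta_2^*)
\]
\[
= (1-\alpha)\left(E(Y \mid Y > \theta_2^*) - E(Y)\right).
\]

The same computation with the deterministic quantile $\theta_1^*$ in place of $\theta_2^*$ gives $\min_{\theta \in \mathbb{R}} E[\psi_\alpha(Y;\theta)] = (1-\alpha)\left(CTE_\alpha(Y) - E(Y)\right)$, which concludes the proof. □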
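As a sanity check on the unconditional part of this computation, the identity $\min_\theta E[\psi_\alpha(Y;\theta)] = (1-\alpha)(CTE_\alpha(Y) - E(Y))$ can be verified on an empirical distribution. The sketch below is illustrative only: the lognormal sample and the variable names are assumptions, and $\alpha n$ is chosen to be an integer so that the identity holds exactly for the empirical measure.

```python
import numpy as np

def quantile_contrast(y, theta, alpha):
    """psi_alpha(y, theta) = (y - theta) * (alpha - 1_{y <= theta})."""
    return (y - theta) * (alpha - (y <= theta))

rng = np.random.default_rng(1)
alpha, n = 0.9, 1_000
y = rng.lognormal(size=n)   # illustrative continuous output sample

# Empirical alpha-quantile inf{t : F_n(t) >= alpha}: the k-th order statistic,
# which is a minimizer of the empirical contrast when alpha * n is an integer.
k = int(round(alpha * n))
theta_1 = np.sort(y)[k - 1]

lhs = quantile_contrast(y, theta_1, alpha).mean()   # empirical E[psi_alpha(Y; theta_1)]
cte = y[y > theta_1].mean()                         # empirical CTE_alpha(Y)
rhs = (1 - alpha) * (cte - y.mean())

print(lhs, rhs)
```

Both printed values should coincide up to floating-point rounding.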

We shall use this expression of the quantile oriented indices to propose an estimation of them.

4. Estimation of quantile contrast index

We consider the statistical estimation of the quantile contrast index. Fix an index $i \in \{1, \ldots, k\}$ and a level $\alpha \in\, ]0,1[$, and assume that $E|Y| < \infty$. By Proposition 3.1, we have

\[
S^\alpha_{X_i} = 1 - \frac{E\left[Y \mid Y > F^{-1}_{Y|X_i}(\alpha)\right] - E(Y)}{CTE_\alpha(Y) - E(Y)}. \tag{4.1}
\]

4.1. Estimation procedure. Let $\theta^* = F_Y^{-1}(\alpha)$ and $\theta_i(x) = F^{-1}_{Y|X_i=x}(\alpha)$ be the respective quantiles at level α of $Y$ and of $Y \mid X_i = x$.

Let $Y_1, \ldots, Y_n$, $n \ge 1$, be an i.i.d. sample of the distribution of $Y$. $F_n$ denotes the empirical distribution function:

\[
F_n(y) = \frac{1}{n} \sum_{j=1}^n \mathbf{1}_{\{Y_j \le y\}}, \quad \forall y \in \mathbb{R},
\]

and $\hat\theta^*$ is the classical quantile estimator.

The conditional distribution function $F_{Y|X_i=x}$ can be estimated by a kernel estimator:

\[
F_n(y \mid X_i = x) = \frac{\sum_{j=1}^n K\left(\frac{x - X_i^j}{h_n}\right) \mathbf{1}_{\{Y_j \le y\}}}{\sum_{j=1}^n K\left(\frac{x - X_i^j}{h_n}\right)},
\]

where $K$ is a kernel and $h_n$ is a positive number depending on the sample size $n$, called the bandwidth. Then $\theta_i(x)$ is estimated by

\[
\hat\theta_i(x) = F_n^{-1}(\alpha \mid x) = \inf\left\{ y : F_n(y \mid x) \ge \alpha \right\}. \tag{4.2}
\]

We refer the reader to [1], [11], [8] or [16] for a detailed review on quantile and conditional quantile estimation.

In what follows, we give a procedure for estimating the quantile contrast index. It requires the two following steps (recall that $Y = f(X_1, \ldots, X_k)$):

(1) Generate $X_1^j, \ldots, X_k^j$ and compute $Y_j = f(X_1^j, \ldots, X_k^j)$, for $j = 1, \ldots, n$. Replace in (4.1) the expectation $E(Y)$ by its empirical version.

(2) Generate $X^{*j} = (X_1^{*j}, \ldots, X_k^{*j})$ for $j = 1, \ldots, n$ (independent copies of the previous vectors) and compute $Y_j^* = f(X_1^{*j}, \ldots, X_k^{*j})$, $j = 1, \ldots, n$. Then, from the sample $(X^{*j}, Y_j^*)$, $j = 1, \ldots, n$, compute $\hat\theta^*$ and $\hat\theta_i(X_i^j)$, $j = 1, \ldots, n$.

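An estimator of the index (4.1) is given by

\[
S^\alpha_{n,X_i} = 1 - \frac{\frac{1}{n(1-\alpha)} \sum_{j=1}^n Y_j \mathbf{1}_{Y_j > \hat\theta_i(X_i^j)} - \frac{1}{n} \sum_{j=1}^n Y_j}{\frac{1}{n(1-\alpha)} \sum_{j=1}^n Y_j \mathbf{1}_{Y_j > \hat\theta^*} - \frac{1}{n} \sum_{j=1}^n Y_j}. \tag{4.3}
\]

Remark that equation (4.3) is equivalent to:

\[
S^\alpha_{n,X_i} = 1 - \frac{\frac{1}{n} \sum_{j=1}^n R_j^i}{\frac{1}{n} \sum_{j=1}^n Z_j}, \tag{4.4}
\]

where

\[
R_j^i = Y_j \left( \frac{1}{1-\alpha} \mathbf{1}_{Y_j > \hat\theta_i(X_i^j)} - 1 \right), \quad Z_j = Y_j \left( \frac{1}{1-\alpha} \mathbf{1}_{Y_j > \hat\theta^*} - 1 \right),
\]

which, conditionally to the sample $X^*$, are two independent and identically distributed (i.i.d.) samples of the distribution of $R_i$ and $Z$, where

\[
R_i = Y \left( \frac{1}{1-\alpha} \mathbf{1}_{Y > F^{-1}_{Y|X_i}(\alpha)} - 1 \right), \quad Z = Y \left( \frac{1}{1-\alpha} \mathbf{1}_{Y > F_Y^{-1}(\alpha)} - 1 \right).
\]

For the sake of simplicity, we denote by

\[
\bar Z_n = \frac{1}{n} \sum_{j=1}^n Z_j, \quad \bar R_n = \frac{1}{n} \sum_{j=1}^n R_j^i.
\]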
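The two-sample procedure and the estimator (4.3) can be sketched in a few lines. This is one possible reading under stated assumptions, not the authors' implementation: a Gaussian kernel, the bandwidth $n^{-1/5}$, and the function names `kernel_conditional_quantile`, `qosa_index`, `additive_model` and `sample_inputs` are all illustrative choices. The additive model $Y = X_1 + X_2$ with $X_1 \sim \mathrm{Exp}(1)$ and $X_2$ distributed as $-X_1$ is used as a test case. The kernel step builds an $n \times n$ weight matrix, so the sketch is only suitable for moderate $n$.

```python
import numpy as np

def kernel_conditional_quantile(x_eval, x_train, y_train, alpha, h):
    """Kernel estimate of the alpha-quantile of Y | X_i = x, as in (4.2)."""
    order = np.argsort(y_train)
    x_sorted, y_sorted = x_train[order], y_train[order]
    # weights[j, k] = K((x_eval[j] - x_sorted[k]) / h) with a Gaussian kernel K
    weights = np.exp(-0.5 * ((x_eval[:, None] - x_sorted[None, :]) / h) ** 2)
    cum = np.cumsum(weights, axis=1)
    # First order statistic at which the weighted conditional CDF reaches alpha.
    idx = np.argmax(cum >= alpha * cum[:, -1:], axis=1)
    return y_sorted[idx]

def qosa_index(f, sample_inputs, i, alpha, n, rng):
    """Estimator (4.3) of the quantile contrast index of input i."""
    x = sample_inputs(n, rng)          # step (1): first sample
    y = f(x)
    x_star = sample_inputs(n, rng)     # step (2): independent copy
    y_star = f(x_star)
    h = n ** (-1 / 5)                  # assumed bandwidth choice
    # Empirical alpha-quantile of Y from the second sample.
    theta_star = np.sort(y_star)[max(int(np.ceil(alpha * n)), 1) - 1]
    theta_i = kernel_conditional_quantile(x[:, i], x_star[:, i], y_star, alpha, h)
    r = y * ((y > theta_i) / (1 - alpha) - 1)     # R_j^i
    z = y * ((y > theta_star) / (1 - alpha) - 1)  # Z_j
    return 1 - r.mean() / z.mean()

def additive_model(x):
    return x[:, 0] + x[:, 1]

def sample_inputs(n, rng):
    # X1 ~ Exp(1) and X2 distributed as -X1, independent of X1
    return np.column_stack([rng.exponential(size=n), -rng.exponential(size=n)])

rng = np.random.default_rng(0)
s1 = qosa_index(additive_model, sample_inputs, i=0, alpha=0.5, n=2_000, rng=rng)
print(s1)
```

For this model the index of $X_1$ at $\alpha = 0.5$ is known in closed form (about 0.307), so the printed value gives a rough idea of the Monte Carlo error at this sample size.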

Proposition 4.1 (Consistency). Assume that $E|Y| < \infty$. Then

\[
S^\alpha_{n,X_i} \xrightarrow[n \to \infty]{a.s.} S^\alpha_{X_i}. \tag{4.5}
\]

Proof. The result is a straightforward application of the strong law of large numbers conditionally to $X^*$. □

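Proposition 4.2 (Asymptotic normality). Assume that $Y$ is square integrable. Then,

\[
\sqrt{n}\left(S^\alpha_{n,X_i} - S^\alpha_{X_i}\right) \xrightarrow[n \to \infty]{\mathcal{L}} \mathcal{N}\left(0, \sigma_S^2\right), \tag{4.6}
\]

where

\[
\sigma_S^2 = \frac{\mathrm{Var}(R_i) - 2\beta(1 - S^\alpha_{X_i}) + (1 - S^\alpha_{X_i})^2 \mathrm{Var}(Z)}{\left(CTE_\alpha(Y) - E(Y)\right)^2},
\]

and $\beta = \mathrm{Cov}(R_i, Z)$.

Proof. We do the proof conditionally to $X^*$. Denote

\[
S^\alpha_{n,X_i} = h(\bar U_n),
\]

where $U_j = (R_j^i, Z_j)^T$ and $h(x,y) = 1 - \frac{x}{y}$. The central limit theorem gives that:

\[
\sqrt{n}\left(\bar U_n - \mu\right) \xrightarrow[n \to \infty]{\mathcal{L}} \mathcal{N}_2(0, \Gamma),
\]

where $\Gamma$ is the covariance matrix of $(R_i, Z)$ and $\mu = (E(R_i), E(Z))^T$. Since the function $h$ is differentiable at $\mu$, the Delta method gives:

\[
\sqrt{n}\left(S^\alpha_{n,X_i} - S^\alpha_{X_i}\right) \xrightarrow[n \to \infty]{\mathcal{L}} \mathcal{N}\left(0, g^T \Gamma g\right), \quad \text{where } g = \nabla h(\mu).
\]

By differentiation, we get that, for any $x, y$ such that $y \ne 0$:

\[
\nabla h(x,y) = \left( -\frac{1}{y}, \frac{x}{y^2} \right)^T.
\]

Hence, using (4.1), we get that

\[
g = \left( -\frac{1 - S^\alpha_{X_i}}{E(R_i)}, \frac{(1 - S^\alpha_{X_i})^2}{E(R_i)} \right)^T.
\]

Thus

\[
g^T \Gamma g = \mathrm{Var}(R_i)\frac{(1 - S^\alpha_{X_i})^2}{E(R_i)^2} - 2\,\mathrm{Cov}(R_i, Z)\frac{(1 - S^\alpha_{X_i})^3}{E(R_i)^2} + \mathrm{Var}(Z)\frac{(1 - S^\alpha_{X_i})^4}{E(R_i)^2}
\]
\[
= \frac{(1 - S^\alpha_{X_i})^2}{E(R_i)^2}\left( \mathrm{Var}(R_i) - 2\,\mathrm{Cov}(R_i, Z)(1 - S^\alpha_{X_i}) + \mathrm{Var}(Z)(1 - S^\alpha_{X_i})^2 \right).
\]

Remark that $\frac{(1 - S^\alpha_{X_i})^2}{E(R_i)^2} = \frac{1}{E(Z)^2}$ and that $E(Z) = CTE_\alpha(Y) - E(Y)$, hence we deduce that

\[
g^T \Gamma g = \frac{\mathrm{Var}(R_i) - 2\beta(1 - S^\alpha_{X_i}) + (1 - S^\alpha_{X_i})^2 \mathrm{Var}(Z)}{\left(CTE_\alpha(Y) - E(Y)\right)^2},
\]

where $\beta = \mathrm{Cov}(R_i, Z)$, which concludes the proof. □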

Remark. When computing $\sigma_S^2$, we replace the variances of $R_i$ and $Z$ as well as the covariance of the random vector $(R_i, Z)$ by their empirical versions.
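The plug-in computation described in this remark can be sketched as follows. The function name `index_with_confidence_interval` and the synthetic arrays `r` and `z` are hypothetical; in practice `r` and `z` would hold the samples $R_j^i$ and $Z_j$ built from the model output. The sketch also checks that the closed-form variance agrees with the Delta-method quadratic form $g^T \Gamma g$.

```python
import numpy as np

def index_with_confidence_interval(r, z, normal_quantile=1.96):
    """Plug-in estimate of the index, of sigma_S^2, and a normal confidence interval.

    r, z: arrays holding the samples R_j^i and Z_j.
    normal_quantile: standard normal quantile (1.96 for a 95% interval).
    """
    n = len(r)
    s = 1.0 - r.mean() / z.mean()
    cov = np.cov(r, z)   # 2x2 empirical covariance matrix of (R_i, Z)
    var_r, var_z, beta = cov[0, 0], cov[1, 1], cov[0, 1]
    sigma2 = (var_r - 2.0 * beta * (1.0 - s) + (1.0 - s) ** 2 * var_z) / z.mean() ** 2
    half_width = normal_quantile * np.sqrt(sigma2 / n)
    return s, sigma2, (s - half_width, s + half_width)

# Synthetic stand-ins for (R_j^i, Z_j), only to exercise the formula.
rng = np.random.default_rng(0)
z = 1.0 + rng.exponential(size=1_000)
r = 0.5 * z + 0.1 * rng.normal(size=1_000)

s, sigma2, ci = index_with_confidence_interval(r, z)

# Delta-method form g^T Gamma g with g = grad h at the empirical means.
g = np.array([-1.0 / z.mean(), r.mean() / z.mean() ** 2])
print(sigma2, g @ np.cov(r, z) @ g)
```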

Remark. Following [10], in order to improve the efficiency of the estimator, $CTE_\alpha(Y)$ and $E(Y)$ could be estimated by using the complete sample $(X, X^*)$. A further study is needed to get its asymptotic properties.

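4.2. Numerical illustrations. In order to validate our estimation procedure, we illustrate the asymptotic results of Proposition 4.2 in the following example that is considered in [7].

Example 1. Let us consider an output of type $Y = f(X_1, X_2) = X_1 + X_2$, where $X_1 \sim \mathrm{Exp}(1)$ and $X_2 \sim -X_1$, with $X_1$ and $X_2$ independent. The quantile contrast indices of $X_1$ and $X_2$ are known analytically (see [7]) and they are given by

\[
S^\alpha_{X_1} =
\begin{cases}
\dfrac{(1-\alpha)(1 - \log(2(1-\alpha))) + \alpha \log(\alpha)}{(1-\alpha)(1 - \log(2(1-\alpha)))} & \text{if } \alpha \ge \frac{1}{2}, \\[2ex]
\dfrac{\alpha(1 - \log(2\alpha)) + \alpha \log(\alpha)}{\alpha(1 - \log(2\alpha))} & \text{if } \alpha < \frac{1}{2},
\end{cases}
\]

and

\[
S^\alpha_{X_2} =
\begin{cases}
\dfrac{(1-\alpha)(1 - \log(2(1-\alpha))) + (1-\alpha) \log(1-\alpha)}{(1-\alpha)(1 - \log(2(1-\alpha)))} & \text{if } \alpha \ge \frac{1}{2}, \\[2ex]
\dfrac{\alpha(1 - \log(2\alpha)) + (1-\alpha) \log(1-\alpha)}{\alpha(1 - \log(2\alpha))} & \text{if } \alpha < \frac{1}{2}.
\end{cases}
\]

In the case where $\alpha = \frac{1}{2}$, we have $S^\alpha_{X_1} = S^\alpha_{X_2}$.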
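These closed-form expressions are straightforward to evaluate, which gives reference values against which estimates can be compared. A minimal sketch (the function name `example1_indices` is an illustrative choice):

```python
import math

def example1_indices(alpha):
    """Closed-form quantile contrast indices (S_X1, S_X2) for Y = X1 + X2."""
    # The denominator is the same for both indices; it depends on the side of 1/2.
    if alpha >= 0.5:
        denom = (1 - alpha) * (1 - math.log(2 * (1 - alpha)))
    else:
        denom = alpha * (1 - math.log(2 * alpha))
    s1 = (denom + alpha * math.log(alpha)) / denom
    s2 = (denom + (1 - alpha) * math.log(1 - alpha)) / denom
    return s1, s2

for alpha in (0.05, 0.1, 0.5, 0.7, 0.99):
    s1, s2 = example1_indices(alpha)
    print(f"alpha = {alpha}: S_X1 = {s1:.4f}, S_X2 = {s2:.4f}")
```

The printed values can be compared with the true-index columns of the tables reported for this example.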

Table 2 displays the sensitivity indices of $X_1$ and $X_2$ for different values of α. We estimate for each variable a 95% confidence interval. We denote by $S^\alpha_{n,X_1}$ and $S^\alpha_{n,X_2}$ the empirical estimators of $S^\alpha_{X_1}$ and $S^\alpha_{X_2}$ given in (4.3) for a sample size $n = 100000$. The conditional quantiles of $Y \mid X_1$ and $Y \mid X_2$ are computed using a Gaussian kernel with a bandwidth $h_n = n^{-1/5}$. The quantities $IC_1$ and $IC_2$ denote the respective confidence intervals for $X_1$ and $X_2$. We also compute the relative mean square error (RMSE) of the estimators for $X_1$ and $X_2$, which measures the relative average of the square of the "errors", that is the following quantity:

\[
RMSE_{X_i} = \sqrt{\frac{1}{n} \sum_{j=1}^n \left( \frac{(S^\alpha_{n,X_i})_j - S^\alpha_{X_i}}{S^\alpha_{X_i}} \right)^2}, \quad i = 1, 2.
\]

As we can see in Table 3, the RMSE is low, which supports the reliability of our estimator. From this result, we can conclude that the estimated indices $S^\alpha_{n,X_i}$, $i = 1, 2$, are close to the true values, up to the Monte Carlo errors due to the numerical simulation of the indices.

α       S^α_{X1}   S^α_{n,X1}   IC_1                S^α_{X2}   S^α_{n,X2}   IC_2
0.05    0.0929     0.0950       [0.0807, 0.1052]    0.7049     0.7005       [0.6917, 0.7181]
0.1     0.1176     0.1166       [0.1097, 0.1255]    0.6366     0.6397       [0.6259, 0.6474]
0.5     0.3069     0.3108       [0.3009, 0.3128]    0.3069     0.3042       [0.3010, 0.3127]
0.7     0.4491     0.4436       [0.4417, 0.4566]    0.2031     0.2026       [0.1980, 0.2082]
0.99    0.7974     0.7907       [0.7756, 0.8193]    0.0625     0.0630       [0.0309, 0.0941]

Table 2. Quantile contrast index confidence interval at level 95% for different values of α.

α       S^α_{X1}   RMSE      S^α_{X2}   RMSE
0.05    0.0929     0.0318    0.7049     0.0064
0.1     0.1176     0.0241    0.6366     0.0065
0.5     0.3069     0.0091    0.3069     0.0096
0.7     0.4491     0.0075    0.2031     0.0130

Table 3. Relative mean square error (RMSE).
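For reference, the relative RMSE used in this comparison is simple to compute from a set of replicated estimates. The sketch below assumes that the sum runs over independent replications of the estimator; the function name `relative_rmse` and the numbers in `replications` are hypothetical and are not taken from the experiments.

```python
import numpy as np

def relative_rmse(estimates, true_value):
    """Square root of the mean squared relative error of replicated estimates."""
    estimates = np.asarray(estimates, dtype=float)
    return np.sqrt(np.mean(((estimates - true_value) / true_value) ** 2))

# Hypothetical replicated estimates around a true index equal to 0.30.
replications = [0.29, 0.31, 0.30, 0.32]
print(relative_rmse(replications, 0.30))
```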

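Example 2 (Vasicek model). Here we present a financial application of the use of the quantile contrast index. We focus on the classical Vasicek model, where the yield curve is given as an output of an instantaneous spot rate model with the following risk-neutral dynamics:

\[
dr_t = a(b - r_t)\,dt + \sigma\,dW_t, \tag{4.7}
\]

where $a$ is the speed of mean reversion, $b$ the long-term mean-reversion level and $\sigma$ the volatility. The price at time $t$ of a zero-coupon bond with maturity $T$ is (see [4])

\[
P(t,T) = A(t,T)\,e^{-B(t,T)\,r_t},
\]

where

\[
A(t,T) = \exp\left( \left(b - \frac{\sigma^2}{2a^2}\right)\left(B(t,T) - (T-t)\right) - \frac{\sigma^2}{4a} B^2(t,T) \right)
\quad \text{and} \quad
B(t,T) = \frac{1 - e^{-a(T-t)}}{a}.
\]

The yield curve can be obtained as a deterministic transformation of zero-coupon bond prices at different maturities.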
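The model output used in this example can be sketched as a plain function of the three parameters. The sketch assumes the standard Vasicek closed form $P(t,T) = A(t,T)\,e^{-B(t,T) r_t}$ evaluated at $t = 0$; the function name `vasicek_bond_price` and its default arguments are illustrative choices. Note that the formula requires $a > 0$.

```python
import numpy as np

def vasicek_bond_price(a, b, sigma, r0=0.10, maturity=1.0):
    """Zero-coupon bond price P(0, T) in the Vasicek model (requires a > 0)."""
    B = (1.0 - np.exp(-a * maturity)) / a
    A = np.exp((b - sigma**2 / (2.0 * a**2)) * (B - maturity)
               - sigma**2 / (4.0 * a) * B**2)
    return A * np.exp(-B * r0)

# Five draws of (a, b, sigma), each uniform on [0, 1], as inputs of the model.
rng = np.random.default_rng(0)
a, b, sigma = rng.uniform(size=(3, 5))
prices = vasicek_bond_price(a, b, sigma)
print(prices)
```

A function of this form can be passed as the model $f$ to any Monte Carlo estimator of the indices, with the input vector $(a, b, \sigma)$ playing the role of $(X_1, X_2, X_3)$.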

In what follows, we quantify the sensitivity of the input parameters $\{a, b, \sigma\}$ affecting the uncertainty in the bond price at time $t = 0$. In the following numerical experiments, the maturity $T$ and the initial spot rate $r_0$ are chosen such that $T = 1$ and $r_0 = 10\%$. Table 4 reports the estimated quantile contrast indices of the parameters $a$, $b$, $\sigma$ for different values of α, where the probability laws of these parameters are uniform on $[0, 1]$. For this model, no closed form formula is available.

α       Ŝ^α_a     Ŝ^α_b     Ŝ^α_σ
0.05    0.4961    0.5961    0.0210
0.1     0.1948    0.4036    0.0667
0.5     0.1722    0.2685    0.0508
0.7     0.1096    0.1679    0.1913
0.9     0.1053    0.0025    0.2928
0.99    0.4744    0.2596    0.3965

Table 4. Estimation of α-quantile contrast indices for different values of α.

Contrary to Sobol indices, the sensitivity of our model parameters strongly depends on the confidence level α. For $\alpha \in \{0.05, 0.1, 0.5\}$, most of the uncertainty in the bond price is due to the long-term mean-reversion level $b$ and the speed of mean reversion $a$, whereas for $\alpha = 0.9$ or $\alpha = 0.99$, $a$ and $\sigma$ become more important than the long-term mean-reversion level $b$ in terms of output uncertainty.

5. Conclusion

The main goal of this paper was to propose an estimation of the quantile oriented sensitivity indices. Our proposal is based on a rewriting of these indices using the Conditional Tail Expectation. This domain deserves much additional work in order to make the QOSA a useful and practical tool. In particular, the performance of our estimator with respect to the procedure proposed in [5] has to be studied.

References

[2] Emanuele Borgonovo. Measuring uncertainty importance: investigation and comparison of alternative approaches. Risk Analysis, 26(5):1349–1361, 2006.

[3] Emanuele Borgonovo and Lorenzo Peccati. On the quantification and decomposition of uncertainty. In Uncertainty and Risk, pages 41–59. Springer, 2007.

[4] Damiano Brigo and Fabio Mercurio. Interest rate models-theory and practice: with smile, inflation and credit. Springer Science & Business Media, 2007.

[5] Thomas Browne, Jean-Claude Fort, Bertrand Iooss, and Loïc Le Gratiet. Estimation of quantile-oriented sensitivity indices. 2017.

[6] Michel Denuit, Jan Dhaene, Marc Goovaerts, and Rob Kaas. Actuarial theory for dependent risks: measures, orders and models. John Wiley & Sons, 2006.

[7] Jean-Claude Fort, Thierry Klein, and Nabil Rachdi. New sensitivity analysis subordinated to a contrast. Communications in Statistics-Theory and Methods, 45(15):4349–4364, 2016.

[8] Ali Gannoun, Jérôme Saracco, and Keming Yu. Nonparametric prediction by conditional median and quantiles. Journal of Statistical Planning and Inference, 117(2):207–223, 2003.

[9] Alexandre Janon. Analyse de sensibilité et réduction de dimension. Application à l'océanographie. PhD thesis, Université de Grenoble, 2012.

[10] Alexandre Janon, Thierry Klein, Agnès Lagnoux, Maëlle Nodet, and Clémentine Prieur. Asymptotic normality and efficiency of two Sobol index estimators. ESAIM: Probability and Statistics, 18:342–364, 2014.

[11] Roger Koenker and Quanshui Zhao. Conditional quantile estimation and inference for ARCH models. Econometric Theory, 12(05):793–813, 1996.

[12] Ibrahima Niang. Quantification et méthodes statistiques pour le risque de modèle. PhD thesis, Université Claude Bernard Lyon 1, 2016.

[13] Andrea Saltelli. Making best use of model evaluations to compute sensitivity indices. Computer Physics Communications, 145(2):280–297, 2002.

[14] Andrea Saltelli, Marco Ratto, Terry Andres, Francesca Campolongo, Jessica Cariboni, Debora Gatelli, Michaela Saisana, and Stefano Tarantola. Global sensitivity analysis: the primer. John Wiley & Sons, 2008.

[15] Ilya M Sobol. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Mathematics and Computers in Simulation, 55(1):271–280, 2001.

[16] Ichiro Takeuchi, Quoc V Le, Timothy D Sears, and Alexander J Smola. Nonparametric quantile estimation. The Journal of Machine Learning Research, 7:1231–1264, 2006.

[17] Stefano Tarantola, Debora Gatelli, and Thierry Alex Mara. Random balance designs for the estimation of first order global sensitivity indices.

Université de Lyon, Université Lyon 1, Institut Camille Jordan ICJ UMR 5208 CNRS

Altran Research
