Efficient evaluation of reliability-oriented sensitivity indices

HAL Id: hal-01689366

https://hal.archives-ouvertes.fr/hal-01689366

Submitted on 22 Jan 2018


Gilles Defaux, Guillaume Perrin

To cite this version:

Gilles Defaux, Guillaume Perrin. Efficient evaluation of reliability-oriented sensitivity indices. Journal of Scientific Computing, Springer Verlag, 2018. ⟨hal-01689366⟩

G. Perrin$^{a}$, G. Defaux$^{a}$

$^{a}$ CEA/DAM/DIF, F-91297, Arpajon, France

Abstract

The role of simulation keeps increasing for the reliability analysis of complex systems. Most of the time, these analyses can be reduced to estimating the probability of occurrence of an undesirable event, also called failure probability, using a stochastic model of the system. If the considered event is rare, sophisticated sample-based procedures are generally introduced to get a relevant estimate of the failure probability. Based on the samples constructed for the evaluation of this estimate, this work defines two types of reliability-oriented sensitivity indices. The first ones are introduced to identify the model inputs whose variability has to be reduced in priority to decrease this probability. The second ones are used to find the model inputs whose distribution has to be particularly well-characterized for the available estimate to be realistic. It is also shown how these sensitivity indices can be derived when the true model is approximated by a surrogate model. In particular, an innovative procedure is proposed to take into account the surrogate model uncertainty in the estimation of these sensitivity indices. The proposed approach is then applied to the reliability analysis of a series of numerical and industrial examples.

Keywords: Sobol indices, Gaussian process, sensitivity analysis, risk analysis.

Email address: guillaume.perrin2@cea.fr (G. Perrin)

1. Introduction

The reliability analysis of complex systems is more and more quantified using numerical simulations. Hence, computer codes in the form $y(\boldsymbol{x}) = g(\boldsymbol{x}; \boldsymbol{d})$ are generally involved. Here, $\boldsymbol{x} = (x_1, \ldots, x_D) \in \mathbb{X}$ is the vector of stochastic inputs and $\boldsymbol{d}$ is the vector of deterministic inputs. Formally, given an adapted threshold $q$, the problem of estimating the probability of occurrence of an undesirable event, seen as $y(\boldsymbol{x})$ exceeding $q$, can be reduced to computing the probability

$$p := \mathbb{P}_{\boldsymbol{x}}(y(\boldsymbol{x}) > q) = \mathbb{E}_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}\right], \tag{1}$$

$$1_{y(\boldsymbol{x})>q} = \begin{cases} 1 & \text{if } y(\boldsymbol{x}) > q, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$

When $D$ increases and low values of $p$ are considered, the evaluation of $p$ cannot be handled with usual quadratures. Sampling techniques are preferred, such as the Monte Carlo Simulation (MCS) [34]. In MCS, the code is evaluated at a large number of inputs, and probability $p$ is estimated by counting the number of responses that are above the threshold $q$. However, the squared coefficient of variation of the estimator provided by MCS is proportional to $1/p$. Hence, the number of code evaluations required for MCS to estimate small values of $p$ (say $p < 10^{-3}$ for the considered applications) quickly becomes burdensome. To circumvent this problem, various approaches have been proposed. On the one hand, several nonstatistical approaches, such as the first-order or second-order reliability methods (FORM/SORM) [14, 31, 18, 5], propose to approximate the limit state function as a parametric function. Then, these approximations are used to evaluate $p$ at a low computational cost, but at the expense of a reduced precision. On the other hand, the splitting methods [16] propose to rewrite $p$ using a finite sequence of increasing thresholds. Depending on the choice of the thresholds, the variance of the aggregated estimator can be much smaller than the one given by MCS, as will be explained in the next section.

In addition to this estimation of $p$, it is useful to quantify the importance of each model input on the failure probability. This is the purpose of reliability-oriented global sensitivity analysis (ro-GSA). Here, the word global refers to the definition given by [36], as the whole variation domain of the inputs is considered. Indeed, if one model input happens to be strongly influential, it could be worth trying to decrease its variability. On the contrary, if one model input seems to have no influence on $p$, its variability can be neglected, resulting in a simpler model. Several methods have been proposed [...] how the probability of exceeding $q$ is affected by fixing one model input to a given value. It can be shown that this is equivalent to computing the Sobol indices associated with the indicator function $1_{y(\boldsymbol{x})>q}$ (see Section 3). However, dedicated evaluations of the code are generally required to assess such indices [40], which can be much more numerous than the ones required to get a good estimate of $p$. Hence, the first objective of this work is to propose a method using nonparametric statistics to evaluate such Sobol indices without additional code evaluations.

The sensitivity of probability $p$ to each model input can also be evaluated by comparing the partial derivatives of $p$ with respect to the statistical moments of each model input. Using adapted strategies, such derivatives can be computed as a simple post-processing of the code evaluations that were carried out for the estimation of $p$ [33]. Whereas Sobol indices are associated with a fixed distribution for the inputs, such indicators assess the sensitivity of $p$ to small changes of the input distribution. Hence, the meanings of these two sets of indicators are different but complementary. On the one hand, the Sobol indices indicate the model inputs whose variability has to be reduced in priority if we want to decrease $p$. On the other hand, the derivative-based indices show the model inputs whose statistical moments have to be particularly well-controlled to get a relevant estimation of $p$. In that prospect, generalizing the works achieved in [19], the second objective of this work is to propose reliability-oriented indices that can consider more general modifications of the model input distributions, and which can be used to evaluate cross-effects between model inputs. Such indices will also be computed as a simple post-processing of the simulations used to estimate $p$.

When the numerical cost associated with one evaluation of the code is high (from several minutes to several days of CPU time), surrogate models are commonly introduced to emulate the time-demanding computer code for the estimation of $p$. Among these methods, the Gaussian process regression (GPR) method, or kriging, plays a major role. This is mostly due to its ability to provide an uncertainty on the evaluation of $p$ that is due to the substitution of the true code by its emulator [35, 37]. Finally, this paper shows how to derive each reliability-oriented sensitivity index in the case when the code is replaced by a Gaussian emulator. In particular, the impact of the emulator uncertainty on the estimation of these indices is quantified.

The outline of this work is as follows. First, Section 2 briefly reviews existing sample-based methods for estimating $p$. Section 3 introduces reliability-oriented [...] on $p$. Another type of sensitivity indices is defined in Section 4, in order to quantify the robustness of the evaluation of $p$ to small perturbations of the input distribution. Then, Section 5 introduces the estimation of these indices when the computer code is replaced by a Gaussian emulator. At last, a series of examples are shown in Section 6 to illustrate the interest of the proposed methods.

Notations

The following notations are adopted:

• $x, y$ correspond to scalars.
• $X, Y$ correspond to integers.
• $\boldsymbol{x}, \boldsymbol{y}$ correspond to vectors.

Let $x_i$ be the components of a vector $\boldsymbol{x}$.

For all $D$-dimensional vector $\boldsymbol{x} = (x_1, \ldots, x_D)$, we denote by $\boldsymbol{x}_{-i} := (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_D)$ the vector that gathers all the components of $\boldsymbol{x}$ but the $i$th.

For all random vector $\boldsymbol{x}$, $\mathbb{E}_{\boldsymbol{x}}[\,\cdot\,]$ and $\mathbb{V}_{\boldsymbol{x}}[\,\cdot\,]$ denote the mathematical expectation and the variance operator associated with the distribution of $\boldsymbol{x}$.

2. Background: sample-based methods to estimate probabilities of exceeding thresholds

Let $\mathcal{S}$ be the system we are interested in, whose properties (dimensions, boundary conditions, material properties...) can be characterized by a vector of $D \geq 1$ parameters $\boldsymbol{x} \in \mathbb{X}$, where $\mathbb{X}$ is a subset of $\mathbb{R}^D$. Vector $\boldsymbol{x}$ is modelled by a random vector to take into account the fact that these parameters are not perfectly known. The components of $\boldsymbol{x}$ are assumed to be statistically independent. For all $1 \leq i \leq D$, let $\mathbb{X}_i$, $f_{x_i}$ and $F_{x_i}$ be the definition domain, the probability density function (PDF) and the cumulative distribution function (CDF) of component $x_i$ respectively. It follows that

$$\mathbb{X} = \mathop{\times}_{i=1}^{D} \mathbb{X}_i, \quad f_{\boldsymbol{x}}(\boldsymbol{x}) = \prod_{i=1}^{D} f_{x_i}(x_i), \quad F_{\boldsymbol{x}}(\boldsymbol{x}) = \prod_{i=1}^{D} F_{x_i}(x_i), \tag{3}$$

where $f_{\boldsymbol{x}}$ and $F_{\boldsymbol{x}}$ are the PDF and the CDF of $\boldsymbol{x}$ respectively. In addition, let

$$y : \begin{cases} \mathbb{X} \rightarrow \mathbb{R} \\ \boldsymbol{x} \mapsto y(\boldsymbol{x}) \end{cases} \tag{4}$$

be the real-valued deterministic mapping describing the behaviour of $\mathcal{S}$. In this work, we are interested in the evaluation of the probability $p$ for $y(\boldsymbol{x})$ to exceed a given threshold $q \in \mathbb{R}$,

$$p := \mathbb{P}_{\boldsymbol{x}}(y(\boldsymbol{x}) > q) = \int_{\mathbb{X}} 1_{y(\boldsymbol{x})>q}\, f_{\boldsymbol{x}}(\boldsymbol{x})\, d\boldsymbol{x} = \mathbb{E}_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}\right], \tag{5}$$

but also in the identification of the components of $\boldsymbol{x}$ that play the most important roles on this probability. We moreover assume that the computational cost associated with one evaluation of $y$ is high (from several minutes to several hours of CPU time), so that the number of code evaluations is supposed to be bounded (less than $10^3$ for instance). In that context, we are particularly interested in methods that could allow all the computational budget to be used at the same time for the estimation of $p$ and for the sensitivity analysis.

As the model inputs are assumed independent, they all can be considered as normally distributed, centred and of variance equal to 1 without loss of generality. Indeed, an isoprobabilistic transform can be applied to each model input [32, 24, 17], impacting neither the definition of $p$ nor the results of the sensitivity analysis. Therefore, in the following,

$$f_{x_i}(x_i) = \varphi(x_i; 0, 1), \quad x_i \in \mathbb{X}_i = \mathbb{R}, \tag{6}$$

where for all $(\mu, \sigma)$ in $\mathbb{R} \times \mathbb{R}^{+*}$,

$$\varphi(x_i; \mu, \sigma) := \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x_i - \mu)^2}{2\sigma^2}\right). \tag{7}$$
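As an illustration only, the isoprobabilistic transform mentioned above can be sketched for a single input as $u = \Phi^{-1}(F_{x_i}(x_i))$, where $\Phi$ is the standard normal CDF. The exponential input with rate 2 and the function names below are assumptions made for this example; they do not come from the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def to_standard_normal(value, cdf):
    """Map a physical input to standard normal space: u = Phi^{-1}(F(value))."""
    return STD_NORMAL.inv_cdf(cdf(value))

def from_standard_normal(u, inv_cdf):
    """Inverse map: value = F^{-1}(Phi(u))."""
    return inv_cdf(STD_NORMAL.cdf(u))

# Hypothetical input distribution: exponential with rate 2 (assumption).
RATE = 2.0

def exp_cdf(t):
    return 1.0 - math.exp(-RATE * t)

def exp_inv_cdf(prob):
    return -math.log(1.0 - prob) / RATE

median = math.log(2.0) / RATE
u_median = to_standard_normal(median, exp_cdf)    # the median maps to u = 0
physical = from_standard_normal(1.5, exp_inv_cdf)  # u = 1.5 mapped back
```

Since the transform is monotone for each input, the event $y(\boldsymbol{x}) > q$ is unchanged when the model is composed with the inverse map, which is why $p$ is not affected.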

The most famous sample-based method to estimate $p$ is the MCS. If $\boldsymbol{x}_n$, $1 \leq n \leq N$, denote $N$ independent copies of $\boldsymbol{x}$, it is well known [34] that

$$p_{\mathrm{MC}} := \frac{1}{N} \sum_{n=1}^{N} 1_{y(\boldsymbol{x}_n)>q} \tag{8}$$

defines an unbiased estimator of $p$. The associated coefficient of variation verifies

$$\delta_{\mathrm{MC}}^2 = \frac{1 - p}{Np}. \tag{9}$$
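A minimal sketch of the estimator of Eq. (8) and of a plug-in version of Eq. (9) is given below. The toy model $y(\boldsymbol{x}) = x_1 + x_2$, the threshold 3 and the sample size are assumptions chosen so that the example runs quickly; they are not taken from the paper.

```python
import math
import random

def crude_monte_carlo(model, threshold, dim, n_samples, rng):
    """Estimate p = P(model(x) > threshold) for x ~ N(0, I_dim), as in Eq. (8)."""
    failures = 0
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        if model(x) > threshold:
            failures += 1
    p_hat = failures / n_samples
    # Plug-in version of Eq. (9); not defined when no failure was observed.
    if failures:
        cov = math.sqrt((1.0 - p_hat) / (n_samples * p_hat))
    else:
        cov = float("inf")
    return p_hat, cov

def toy_model(x):
    # Hypothetical model (assumption): y = x_1 + x_2, so that y ~ N(0, 2).
    return x[0] + x[1]

rng = random.Random(0)
p_hat, cov = crude_monte_carlo(toy_model, 3.0, dim=2, n_samples=20000, rng=rng)
```

For this toy case the exact value is $p = 1 - \Phi(3/\sqrt{2}) \approx 0.017$, so a few tens of thousands of samples are enough; for $p < 10^{-3}$ the same coefficient of variation would require a much larger $N$.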

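This approach is particularly easy to implement, but requires a lot of code evaluations to get acceptable values for $\delta_{\mathrm{MC}}$. Alternatively, the splitting methods rewrite $p$ using a finite sequence of increasing thresholds $(q_k)_{k=0}^{K}$,

$$p = \mathbb{P}_{\boldsymbol{x}}(y(\boldsymbol{x}) > q_K \mid y(\boldsymbol{x}) > q_{K-1}) \times \cdots \times \mathbb{P}_{\boldsymbol{x}}(y(\boldsymbol{x}) > q_1 \mid y(\boldsymbol{x}) > q_0) \times \mathbb{P}_{\boldsymbol{x}}(y(\boldsymbol{x}) > q_0), \tag{10}$$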

with $q_0 = -\infty$ and $q_K = q$. Then, classical Monte Carlo estimators can be proposed for each conditional probability. All these estimators being unbiased and independent, the mean of their product is still equal to $p$. However, the variance of the aggregated estimator strongly depends on the choice of the thresholds. In practice, the sequence of thresholds is defined on the fly, which is generally referred to as adaptive splitting [7]. In particular, the Markov chain

$$y_k := (y(\boldsymbol{x}) \mid y(\boldsymbol{x}) > y_{k-1}), \quad y_0 = -\infty, \quad k \geq 1, \tag{11}$$

is called an increasing random walk. It can be shown that the counting random variable of the number of events before $q$, which is denoted by $M := \mathrm{Card}\{k \geq 1 \mid y_k \leq q\}$, follows a Poisson law with parameter $-\log(p)$ [41]. Hence, given $Q \geq 1$ independent random counting variables $(M_q)_{1 \leq q \leq Q}$,

$$p_{\mathrm{MP}} := \left(1 - \frac{1}{Q}\right)^{\sum_{q=1}^{Q} M_q} \tag{12}$$

also defines an unbiased estimator of $p$, whose coefficient of variation is approximately equal (for large $Q$) to $\sqrt{-\log(p)/Q}$. This approach is referred to as the Moving Particle (MP) method in the following. Another strategy to choose the different thresholds is given by the Subset Simulation (SS) method. The interested reader may refer to [1, 8] for further details about this approach.
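The estimator of Eq. (12) can be illustrated on a toy case where the conditional law of Eq. (11) can be sampled exactly. The sketch below assumes $y(x) = x$ with $x \sim \mathcal{N}(0,1)$ and uses inverse-CDF sampling of the truncated normal instead of a Markov chain sampler; this simplification, the threshold and the number of particles are assumptions made for the example.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution

def sample_above(level, rng):
    """Draw y ~ N(0, 1) conditioned on y > level, by inverse-CDF sampling."""
    lower = PHI.cdf(level) if level > -math.inf else 0.0
    u = lower + (1.0 - lower) * rng.random()
    u = min(max(u, 1e-300), 1.0 - 1e-16)  # keep u strictly inside (0, 1)
    return PHI.inv_cdf(u)

def counting_variable(threshold, rng):
    """Number M of states of the increasing random walk that stay <= threshold."""
    level, count = -math.inf, 0
    while True:
        level = sample_above(level, rng)
        if level > threshold:
            return count
        count += 1

def moving_particles_estimate(threshold, n_particles, rng):
    """Estimator of Eq. (12) and its approximate coefficient of variation."""
    total = sum(counting_variable(threshold, rng) for _ in range(n_particles))
    p_hat = (1.0 - 1.0 / n_particles) ** total
    cov = math.sqrt(-math.log(p_hat) / n_particles)
    return p_hat, cov

rng = random.Random(1)
q = 3.0
p_exact = 1.0 - PHI.cdf(q)  # about 1.35e-3 for this toy case
p_hat, cov = moving_particles_estimate(q, n_particles=200, rng=rng)
```

With 200 particles, roughly $-200 \log(p) \approx 1300$ conditional draws are needed here, whereas a crude Monte Carlo estimate with a comparable coefficient of variation would need on the order of $1/(p\,\delta^2)$ samples.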

Indeed, if we focus on the Markov chain defined by Eq. (11), $y_k$ has to be randomly generated conditionally greater than $y_{k-1}$. This can be done using the Metropolis-Hastings algorithm [22, 15]. If $T$ is the number of steps that is used to control the convergence of the Markov chain to its stationary distribution, it follows that, on average, $1 - T\log(p)$ samples have to be generated to get one realization of the counting variable $M_q$. Thus, the mean total number of code evaluations to get the $Q$ samples $M_1, \ldots, M_Q$ is equal to $N = Q(1 - T\log(p))$. It follows that the coefficient of variation $\delta_{\mathrm{MP}}$ of $p_{\mathrm{MP}}$ can be approximated as

$$\delta_{\mathrm{MP}}^2 \approx \frac{\log(p)\,(T\log(p) - 1)}{N} \approx \frac{T\log(p)^2}{N}. \tag{13}$$

This has to be compared to $\delta_{\mathrm{MC}}^2 \approx 1/(pN)$ for the crude Monte Carlo.
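To make this comparison concrete, the two approximations can be inverted to get the number of code evaluations needed for a target coefficient of variation. The short sketch below does only this arithmetic; the target of 10% and the value $T = 10$ are assumptions chosen for illustration.

```python
import math

def mc_budget(p, target_cov):
    """Code evaluations needed by crude MC, from delta_MC^2 ~ 1 / (p N)."""
    return math.ceil(1.0 / (p * target_cov ** 2))

def mp_budget(p, target_cov, n_mh_steps):
    """Code evaluations needed by the MP method, from the first form of Eq. (13)."""
    log_p = math.log(p)
    return math.ceil(log_p * (n_mh_steps * log_p - 1.0) / target_cov ** 2)

# Hypothetical settings (assumptions): 10% target accuracy, T = 10 steps.
budgets = {
    p: (mc_budget(p, 0.1), mp_budget(p, 0.1, n_mh_steps=10))
    for p in (1e-2, 1e-4, 1e-6)
}
```

Under these assumed settings, crude Monte Carlo remains cheaper at $p = 10^{-2}$, while the MP budget grows only like $\log(p)^2$ and becomes much smaller than the Monte Carlo budget at $p = 10^{-4}$ and below.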

In MCS, SS and MP methods, let $\widehat{p}$ denote the best estimate of $p$ we get once the maximal computational budget is attained. For each of these methods, it can be noticed that the points $\boldsymbol{x}^{(k)}$, where it was observed that $y(\boldsymbol{x}^{(k)})$ is greater than $q$, are independent realizations of the conditioned random vector $(\boldsymbol{x} \mid y(\boldsymbol{x}) > q)$. Let us gather all these realizations of $(\boldsymbol{x} \mid y(\boldsymbol{x}) > q)$ in the set $\mathcal{D}_f := \{\boldsymbol{x}^{(1)}, \ldots, \boldsymbol{x}^{(N)}\}$. Hence, in the following, $N$ denotes the number of points that have been sampled in the failure domain.

3. Compared influence of the inputs variability on $p$

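Based on the estimated value of $p$ and the elements of $\mathcal{D}_f$ only, the purpose of this section is to identify the components of $\boldsymbol{x}$ whose variability has to be reduced in priority if we want to decrease the value of $p$. To this end, it can be interesting to quantify the effect on $p$ due to the fact that $x_i$ is fixed to the particular value $x_i^{\star}$. Indeed, the higher $\left(\mathbb{P}_{\boldsymbol{x}}(y(\boldsymbol{x}) > q) - \mathbb{P}_{\boldsymbol{x}_{-i}}(y(\boldsymbol{x}) > q \mid x_i = x_i^{\star})\right)^2$, the more influential $x_i$. Thus, averaging over $x_i$, the quantity

$$\mathbb{E}_{x_i}\left[\left(\mathbb{P}_{\boldsymbol{x}}(y(\boldsymbol{x}) > q) - \mathbb{P}_{\boldsymbol{x}_{-i}}(y(\boldsymbol{x}) > q \mid x_i)\right)^2\right] = \mathbb{V}_{x_i}\left[\mathbb{E}_{\boldsymbol{x}_{-i}}\left[1_{y(\boldsymbol{x})>q} \mid x_i\right]\right] \tag{14}$$

can be used to analyse the sensitivity of $p$ to model input $x_i$. Normalizing these quantities by $\mathbb{V}_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}\right]$, we find back the well-known first order Sobol indices [38] associated with function $1_{y(\boldsymbol{x})>q}$:

$$s_i := \frac{\mathbb{V}_{x_i}\left[\mathbb{E}_{\boldsymbol{x}_{-i}}\left[1_{y(\boldsymbol{x})>q} \mid x_i\right]\right]}{\mathbb{V}_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}\right]}. \tag{15}$$

By construction, index $s_i$ indicates the share of the variance of $1_{y(\boldsymbol{x})>q}$ caused by $x_i$ individually. The share of the variance of $1_{y(\boldsymbol{x})>q}$ caused by $x_i$ including interactions with the components of $\boldsymbol{x}_{-i}$ is given by the $i$th total Sobol index, denoted by $t_i$, which verifies:

$$t_i := 1 - \frac{\mathbb{V}_{\boldsymbol{x}_{-i}}\left[\mathbb{E}_{x_i}\left[1_{y(\boldsymbol{x})>q} \mid \boldsymbol{x}_{-i}\right]\right]}{\mathbb{V}_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}\right]}. \tag{16}$$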

Based on Eqs. (15) and (16), the computation of $s_i$ and $t_i$ is nontrivial, since $\mathbb{E}_{\boldsymbol{x}_{-i}}[\,\cdot\,]$ and $\mathbb{V}_{\boldsymbol{x}_{-i}}[\,\cdot\,]$ refer to multidimensional integrals. This motivated the introduction of various algorithms to reduce the computational cost of the Sobol indices. In particular, efficient sample-based methods can be found in [40, 39, 25, 20] to replace the naive and very expensive double-loop MCS. However, in spite of these developments, the number of dedicated code evaluations that are needed for these methods is still very high. To circumvent this problem, another approach is proposed in this paper, which is based on Proposition 1, whose proof has been moved to the Appendix.

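Proposition 1. For all $1 \leq i \leq D$, we have:

$$s_i = \frac{p}{1 - p}\, \mathbb{V}_{x_i}\left[\frac{f_{x_i \mid y(\boldsymbol{x})>q}(x_i)}{f_{x_i}(x_i)}\right], \tag{17}$$

$$t_i = 1 - \frac{p}{1 - p}\, \mathbb{V}_{\boldsymbol{x}_{-i}}\left[\frac{f_{\boldsymbol{x}_{-i} \mid y(\boldsymbol{x})>q}(\boldsymbol{x}_{-i})}{f_{\boldsymbol{x}_{-i}}(\boldsymbol{x}_{-i})}\right], \tag{18}$$

where, for all $\boldsymbol{x}$ in $\mathbb{X}$,

$$\begin{cases}
f_{\boldsymbol{x} \mid y(\boldsymbol{x})>q}(\boldsymbol{x}) := \dfrac{1}{p}\, 1_{y(\boldsymbol{x})>q}\, f_{\boldsymbol{x}}(\boldsymbol{x}), \\[2mm]
f_{x_i \mid y(\boldsymbol{x})>q}(x_i) := \displaystyle\int_{\times_{1 \leq j \leq D,\, j \neq i} \mathbb{X}_j} f_{\boldsymbol{x} \mid y(\boldsymbol{x})>q}(\boldsymbol{x}) \prod_{1 \leq j \leq D,\, j \neq i} dx_j, \\[2mm]
f_{\boldsymbol{x}_{-i} \mid y(\boldsymbol{x})>q}(\boldsymbol{x}_{-i}) := \displaystyle\int_{\mathbb{X}_i} f_{\boldsymbol{x} \mid y(\boldsymbol{x})>q}(\boldsymbol{x})\, dx_i.
\end{cases} \tag{19}$$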
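The proof itself is in the Appendix of the paper. As a reading aid only, the following lines sketch why Eq. (17) is consistent with Eq. (15), using Bayes' rule for the conditional expectation and the fact that the indicator is a Bernoulli variable of parameter $p$. This is a short sketch under the paper's assumptions, not the authors' proof.

```latex
\begin{align*}
\mathbb{E}_{\boldsymbol{x}_{-i}}\left[1_{y(\boldsymbol{x})>q} \mid x_i\right]
  &= \mathbb{P}\left(y(\boldsymbol{x})>q \mid x_i\right)
   = p\,\frac{f_{x_i \mid y(\boldsymbol{x})>q}(x_i)}{f_{x_i}(x_i)}, \\
\mathbb{V}_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}\right]
  &= p\,(1-p), \\
s_i
  &= \frac{p^2\,\mathbb{V}_{x_i}\left[f_{x_i \mid y(\boldsymbol{x})>q}(x_i)\,/\,f_{x_i}(x_i)\right]}{p\,(1-p)}
   = \frac{p}{1-p}\,\mathbb{V}_{x_i}\left[\frac{f_{x_i \mid y(\boldsymbol{x})>q}(x_i)}{f_{x_i}(x_i)}\right].
\end{align*}
```

The same argument applied to $\boldsymbol{x}_{-i}$ in place of $x_i$ gives the variance term of Eq. (18).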

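[...] indicator function are proportional to the variances of the ratios between the a priori PDFs of $x_i$ and $\boldsymbol{x}_{-i}$ and their PDFs conditioned by the fact that $y(\boldsymbol{x})$ is greater than $q$. In this work, we propose to approximate these PDFs using one of the nonparametric approaches described in [30, 29]. These methods are particularly suited for this kind of approximations, as the construction they propose only requires the presence of independent realizations of the random vector to be modelled. For each $1 \leq i \leq D$, let $\widehat{f}_{x_i \mid y(\boldsymbol{x})>q}$ and $\widehat{f}_{\boldsymbol{x}_{-i} \mid y(\boldsymbol{x})>q}$ be these approximations of functions $f_{x_i \mid y(\boldsymbol{x})>q}$ and $f_{\boldsymbol{x}_{-i} \mid y(\boldsymbol{x})>q}$ based on the elements of the set $\mathcal{D}_f$ only. Sobol indices $s_i$ and $t_i$ can then be approximated as:

$$s_i \approx \widehat{s}_i := \frac{\widehat{p}}{1 - \widehat{p}}\, \mathbb{V}_{x_i}\left[\frac{\widehat{f}_{x_i \mid y(\boldsymbol{x})>q}(x_i)}{f_{x_i}(x_i)}\right], \tag{20}$$

$$t_i \approx \widehat{t}_i := 1 - \frac{\widehat{p}}{1 - \widehat{p}}\, \mathbb{V}_{\boldsymbol{x}_{-i}}\left[\frac{\widehat{f}_{\boldsymbol{x}_{-i} \mid y(\boldsymbol{x})>q}(\boldsymbol{x}_{-i})}{f_{\boldsymbol{x}_{-i}}(\boldsymbol{x}_{-i})}\right], \tag{21}$$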

where it is reminded that $\widehat{p}$ is the estimated value of $p$ based on one of the sample-based methods presented in Section 2. Finally, generating independent realizations under $\widehat{f}_{x_i}$ and $\widehat{f}_{\boldsymbol{x}_{-i}}$ being quick and easy, Monte-Carlo estimations of $\widehat{s}_i$ and $\widehat{t}_i$ can be calculated numerically with a controlled precision.
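A possible post-processing for Eq. (20) is sketched below. It assumes a plain one-dimensional Gaussian kernel density estimate with Silverman's bandwidth rule in place of the nonparametric approaches of [30, 29], which are not reproduced here, and it draws $x_i$ from its prior $\mathcal{N}(0,1)$ to estimate the variance. The toy model $y(\boldsymbol{x}) = x_1$, the threshold and the sample sizes are also assumptions; the failure set is built by crude Monte Carlo only to keep the example self-contained.

```python
import math
import random
from statistics import NormalDist, pvariance, stdev

PHI = NormalDist()  # prior of each input after the isoprobabilistic transform

def gaussian_kde(points):
    """1D Gaussian kernel density estimate with Silverman's rule bandwidth."""
    n = len(points)
    h = 1.06 * stdev(points) * n ** (-0.2)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))

    def density(t):
        return norm * sum(math.exp(-0.5 * ((t - pt) / h) ** 2) for pt in points)

    return density

def first_order_index(failure_points, i, p_hat, n_mc, rng):
    """Estimate s_i as p/(1-p) * Var[ f_hat(x_i | failure) / phi(x_i) ], Eq. (20)."""
    cond_density = gaussian_kde([x[i] for x in failure_points])
    ratios = []
    for _ in range(n_mc):
        t = rng.gauss(0.0, 1.0)  # x_i drawn from its prior N(0, 1)
        ratios.append(cond_density(t) / PHI.pdf(t))
    return p_hat / (1.0 - p_hat) * pvariance(ratios)

def toy_model(x):
    # Hypothetical model (assumption): only the first input matters.
    return x[0]

q, dim, n_total = 1.5, 2, 6000
rng = random.Random(2)
samples = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_total)]
failure_points = [x for x in samples if toy_model(x) > q]  # the set D_f
p_hat = len(failure_points) / n_total
s_hat = [first_order_index(failure_points, i, p_hat, 2000, rng) for i in range(dim)]
```

In this toy case the first index should come out close to 1 and the second close to 0, with some bias coming from the kernel smoothing of the sharp edge of the failure domain. No extra evaluation of the model is needed once $\mathcal{D}_f$ and $\widehat{p}$ are available.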

4. Robustness analysis of the estimation of $p$

Indices $s_i$ and $t_i$, which are defined by Eqs. (17) and (18), are associated with a fixed distribution of the model inputs. In this section, another type of sensitivity indices is defined, which can be used to quantify the robustness of the evaluation of $p$ to small changes of the input distribution. As explained in the Introduction, the information provided by these new indices is complementary to the information provided by $s_i$ and $t_i$. Whereas $s_i$ and $t_i$ allow the identification of the components of $\boldsymbol{x}$ whose variability has to be reduced in priority to decrease the value of $p$, these new indices can be used to identify the components of $\boldsymbol{x}$ whose distributions have to be particularly well-characterized for a relevant estimation of $p$.

To quantify the robustness of the estimation of $p$ to small changes of the input distribution, we generally compute the gradient of the failure probability with respect to the parameters that characterize the PDF of the inputs.
