Efficient evaluation of reliability-oriented sensitivity indices


HAL Id: hal-01689366

https://hal.archives-ouvertes.fr/hal-01689366

Submitted on 22 Jan 2018


Gilles Defaux, Guillaume Perrin

To cite this version:

Gilles Defaux, Guillaume Perrin. Efficient evaluation of reliability-oriented sensitivity indices. Journal of Scientific Computing, Springer Verlag, 2018. ⟨hal-01689366⟩

G. Perrin$^a$, G. Defaux$^a$

$^a$ CEA/DAM/DIF, F-91297, Arpajon, France

Abstract

The role of simulation keeps increasing for the reliability analysis of complex systems. Most of the time, these analyses can be reduced to estimating the probability of occurrence of an undesirable event, also called failure probability, using a stochastic model of the system. If the considered event is rare, sophisticated sample-based procedures are generally introduced to get a relevant estimate of the failure probability. Based on the samples constructed for the evaluation of this estimate, this work defines two types of reliability-oriented sensitivity indices. The first ones are introduced to identify the model inputs whose variability has to be reduced in priority to decrease this probability. The second ones are used to find the model inputs whose distribution has to be particularly well-characterized for the available estimate to be realistic. It is also shown how these sensitivity indices can be derived when the true model is approximated by a surrogate model. In particular, an innovative procedure is proposed to take into account the surrogate model uncertainty in the estimation of these sensitivity indices. The proposed approach is then applied to the reliability analysis of a series of numerical and industrial examples.

Keywords: Sobol indices, Gaussian process, sensitivity analysis, risk analysis.

Email address: guillaume.perrin2@cea.fr (G. Perrin)

1. Introduction

The reliability analysis of complex systems is more and more quantified using numerical simulations. Hence, computer codes in the form $y(\boldsymbol{x}) = g(\boldsymbol{x}; \boldsymbol{d})$ are generally involved. Here, $\boldsymbol{x} = (x_1, \ldots, x_D) \in \mathbb{X}$ is the vector of stochastic inputs and $\boldsymbol{d}$ is the vector of deterministic inputs. Formally, given an adapted threshold $q$, the problem of estimating the probability of occurrence of an undesirable event, seen as $y(\boldsymbol{x})$ exceeding $q$, can be reduced to computing the probability

$$p := P_{\boldsymbol{x}}(y(\boldsymbol{x}) > q) = E_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}\right], \tag{1}$$

$$1_{y(\boldsymbol{x})>q} = \begin{cases} 1 & \text{if } y(\boldsymbol{x}) > q, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$

When $D$ increases and low values of $p$ are considered, the evaluation of $p$ cannot be handled with usual quadratures. Sampling techniques are preferred, such as the Monte Carlo Simulation (MCS) [34]. In MCS, the code is evaluated at a large number of input points, and probability $p$ is estimated by counting the number of responses that are above the threshold $q$. However, the squared coefficient of variation of the estimator provided by MCS is proportional to $1/p$. Hence, the number of code evaluations required for MCS to estimate small values of $p$ (say $p < 10^{-3}$ for the considered applications) quickly becomes burdensome. To circumvent this problem, various approaches have been proposed. On the one hand, several nonstatistical approaches, such as the first-order or second-order reliability methods (FORM/SORM) [14, 31, 18, 5], propose to approximate the limit state function as a parametric function. Then, these approximations are used to evaluate $p$ at a low computational cost, but at the expense of a reduced precision. On the other hand, the splitting methods [16] propose to rewrite $p$ using a finite sequence of increasing thresholds. Depending on the choice of the thresholds, the variance of the aggregated estimator can be much smaller than the one given by MCS, as will be explained in the next section.

In addition to this estimation of $p$, it is useful to quantify the importance of each model input on the failure probability. This is the purpose of reliability-oriented global sensitivity analysis (ro-GSA). Here, the word global refers to the definition given by [36], as the whole variation domain of the inputs is considered. Indeed, if one model input happens to be strongly influential, it could be worth trying to decrease its variability. On the contrary, if one model input seems to have no influence on $p$, its variability can be neglected, resulting in a simpler model. Several methods have been proposed [...] how the probability of exceeding $q$ is affected by fixing one model input to a given value. It can be shown that this is equivalent to computing the Sobol indices associated with the indicator function $1_{y(\boldsymbol{x})>q}$ (see Section 3). However, dedicated evaluations of the code are generally required to assess such indices [40], which can be much more numerous than the ones required to get a good estimate of $p$. Hence, the first objective of this work is to propose a method using nonparametric statistics to evaluate such Sobol indices without additional code evaluations.

The sensitivity of probability $p$ to each model input can also be evaluated by comparing the partial derivatives of $p$ with respect to the statistical moments of each model input. Using adapted strategies, such derivatives can be computed as a simple post-processing of the code evaluations that were carried out for the estimation of $p$ [33]. Whereas Sobol indices are associated with a fixed distribution for the inputs, such indicators assess the sensitivity of $p$ to small changes of the input distribution. Hence, the meanings of these two sets of indicators are different but complementary. On the one hand, the Sobol indices indicate the model inputs whose variability has to be reduced in priority if we want to decrease $p$. On the other hand, the derivative-based indices show the model inputs whose statistical moments have to be particularly well-controlled to get a relevant estimation of $p$. In that prospect, generalizing the works achieved in [19], the second objective of this work is to propose reliability-oriented indices that can consider more general modifications of the model input distributions, and which can be used to evaluate cross-effects between model inputs. Such indices will also be computed as a simple post-processing of the simulations used to estimate $p$.

When the numerical cost associated with one evaluation of the code is high (from several minutes to several days of CPU time), surrogate models are commonly introduced to emulate the time-demanding computer code for the estimation of $p$. Among these methods, the Gaussian process regression (GPR) method, or kriging, plays a major role. This is mostly due to its ability to provide an uncertainty on the evaluation of $p$ that is due to the substitution of the true code by its emulator [35, 37]. Finally, this paper shows how to derive each reliability-oriented sensitivity index in the case when the code is replaced by a Gaussian emulator. In particular, the impact of the emulator uncertainty on the estimation of these indices is quantified.

The outline of this work is as follows. First, Section 2 briefly reviews existing sample-based methods for estimating $p$. Section 3 introduces reliability-oriented [...] on $p$. Another type of sensitivity indices is defined in Section 4, in order to quantify the robustness of the evaluation of $p$ to small perturbations of the input distribution. Then, Section 5 introduces the estimation of these indices when the computer code is replaced by a Gaussian emulator. At last, a series of examples are shown in Section 6 to illustrate the interest of the proposed methods.

Notations

The following notations are adopted:

• $x, y$ correspond to scalars.

• $X, Y$ correspond to integers.

• $\boldsymbol{x}, \boldsymbol{y}$ correspond to vectors.

• Let $x_i$ be the components of a vector $\boldsymbol{x}$.

• For all $D$-dimensional vector $\boldsymbol{x} = (x_1, \ldots, x_D)$, we denote by $\boldsymbol{x}_{-i} := (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_D)$ the vector that gathers all the components of $\boldsymbol{x}$ but the $i$th.

• For all random vector $\boldsymbol{x}$, $E_{\boldsymbol{x}}[\,\cdot\,]$ and $V_{\boldsymbol{x}}[\,\cdot\,]$ denote the mathematical expectation and the variance operator associated with the distribution of $\boldsymbol{x}$.

2. Background: sample-based methods to estimate probabilities of exceeding thresholds

Let $\mathcal{S}$ be the system we are interested in, whose properties (dimensions, boundary conditions, material properties...) can be characterized by a vector of $D \geq 1$ parameters $\boldsymbol{x} \in \mathbb{X}$, where $\mathbb{X}$ is a subset of $\mathbb{R}^D$. Vector $\boldsymbol{x}$ is modelled by a random vector to take into account the fact that these parameters are not perfectly known. The components of $\boldsymbol{x}$ are assumed to be statistically independent. For all $1 \leq i \leq D$, let $\mathbb{X}_i$, $f_{x_i}$ and $F_{x_i}$ be the definition domain, the probability density function (PDF) and the cumulative distribution function (CDF) of component $x_i$ respectively. It follows that

$$\mathbb{X} = \prod_{i=1}^{D} \mathbb{X}_i, \quad f_{\boldsymbol{x}}(\boldsymbol{x}) = \prod_{i=1}^{D} f_{x_i}(x_i), \quad F_{\boldsymbol{x}}(\boldsymbol{x}) = \prod_{i=1}^{D} F_{x_i}(x_i), \tag{3}$$

where $f_{\boldsymbol{x}}$ and $F_{\boldsymbol{x}}$ are the PDF and the CDF of $\boldsymbol{x}$ respectively. In addition, let

$$y : \begin{cases} \mathbb{X} \to \mathbb{R} \\ \boldsymbol{x} \mapsto y(\boldsymbol{x}) \end{cases} \tag{4}$$

be the real-valued deterministic mapping describing the behaviour of $\mathcal{S}$. In this work, we are interested in the evaluation of the probability $p$ for $y(\boldsymbol{x})$ to exceed a given threshold $q \in \mathbb{R}$,

$$p := P_{\boldsymbol{x}}(y(\boldsymbol{x}) > q) = \int_{\mathbb{X}} 1_{y(\boldsymbol{x})>q}\, f_{\boldsymbol{x}}(\boldsymbol{x})\, d\boldsymbol{x} = E_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}\right], \tag{5}$$

but also in the identification of the components of $\boldsymbol{x}$ that play the most important roles on this probability. We moreover assume that the computational cost associated with one evaluation of $y$ is high (from several minutes to several hours of CPU time), so that the number of code evaluations is supposed to be bounded (less than $10^3$ for instance). In that context, we are particularly interested in methods that could allow all the computational budget to be used at the same time for the estimation of $p$ and for the sensitivity analysis.

As the model inputs are assumed independent, they can all be considered as normally distributed, centred and of variance equal to 1 without loss of generality. Indeed, an isoprobabilistic transform can be applied to each model input [32, 24, 17], impacting neither the definition of $p$ nor the results of the sensitivity analysis. Therefore, in the following,

$$f_{x_i}(x_i) = \phi(x_i; 0, 1), \quad x_i \in \mathbb{X}_i = \mathbb{R}, \tag{6}$$

where for all $(\mu, \sigma)$ in $\mathbb{R} \times \mathbb{R}^{+*}$,

$$\phi(x_i; \mu, \sigma) := \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x_i - \mu)^2}{2\sigma^2} \right). \tag{7}$$
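As an illustration of the isoprobabilistic transform mentioned above, the following minimal sketch maps an input with known CDF $F_{x_i}$ to a standard normal variable through $z_i = \Phi^{-1}(F_{x_i}(x_i))$. The uniform input law on $[2, 5]$, the clipping constant and all function names are assumptions chosen for illustration; they do not come from the paper.

```python
from statistics import NormalDist

import numpy as np

_STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def uniform_cdf(x, low, high):
    """CDF of a uniform variable on [low, high] (illustrative input law)."""
    return (x - low) / (high - low)


def to_standard_normal(x, cdf):
    """Isoprobabilistic transform z = Phi^{-1}(F(x)) for a scalar x."""
    # Clip away from 0 and 1, where the inverse normal CDF is undefined.
    u = min(max(cdf(x), 1e-15), 1.0 - 1e-15)
    return _STD_NORMAL.inv_cdf(u)


rng = np.random.default_rng(0)
low, high = 2.0, 5.0
x_samples = rng.uniform(low, high, size=20_000)
z_samples = np.array(
    [to_standard_normal(x, lambda v: uniform_cdf(v, low, high)) for x in x_samples]
)

# The transformed sample should be close to centred with unit variance.
print(z_samples.mean(), z_samples.std())
```

Because the map is increasing, an event such as $y > q$ has the same probability before and after the change of variables, which is why $p$ is unaffected.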

The most famous sample-based method to estimate $p$ is the MCS. If $\boldsymbol{x}_n$, $1 \leq n \leq N$, denote $N$ independent copies of $\boldsymbol{x}$, it is well known [34] that

$$p^{MC} := \frac{1}{N} \sum_{n=1}^{N} 1_{y(\boldsymbol{x}_n)>q} \tag{8}$$

defines an unbiased estimator of $p$. The associated coefficient of variation verifies

$$\delta_{MC}^2 = \frac{1 - p}{Np}. \tag{9}$$
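The estimator of Eq. (8) and the coefficient of variation of Eq. (9) can be sketched as follows. The toy model $y(\boldsymbol{x}) = x_1 + x_2$ with standard normal inputs, the threshold and the sample size are assumptions made for illustration only; in this toy case $y \sim N(0, 2)$, so $p$ is known in closed form and can be compared with the estimate.

```python
import math

import numpy as np


def toy_model(x):
    """Hypothetical stand-in for the expensive code: y(x) = x_1 + x_2."""
    return x[:, 0] + x[:, 1]


def mcs_estimate(model, q, n_samples, dim, rng):
    """Crude Monte Carlo estimate of p = P(y(x) > q), as in Eq. (8)."""
    x = rng.standard_normal((n_samples, dim))  # independent N(0, 1) inputs
    failed = model(x) > q                      # indicator 1_{y(x) > q}
    return failed.mean(), x[failed]            # estimate and failure points


def mcs_squared_cov(p, n_samples):
    """Squared coefficient of variation of the MCS estimator, Eq. (9)."""
    return (1.0 - p) / (n_samples * p)


rng = np.random.default_rng(0)
q, n_samples = 3.0, 200_000
p_hat, failure_points = mcs_estimate(toy_model, q, n_samples, dim=2, rng=rng)

# Exact value for this toy case: y ~ N(0, 2), so p = P(N(0, 1) > q / sqrt(2)).
p_exact = 0.5 * math.erfc(q / 2.0)
print(p_hat, p_exact, math.sqrt(mcs_squared_cov(p_exact, n_samples)))
```

Inverting Eq. (9) gives $N = (1 - p)/(p\,\delta_{MC}^2)$: for $p = 10^{-3}$ and a target $\delta_{MC} = 10\%$, about $10^5$ code evaluations are needed.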

This approach is particularly easy to implement, but requires a lot of code evaluations to get acceptable values for $\delta_{MC}$. Alternatively, the splitting methods rewrite $p$ using a finite sequence of increasing thresholds $(q_k)_{k=0}^{K}$,

$$p = P_{\boldsymbol{x}}(y(\boldsymbol{x}) > q_K \mid y(\boldsymbol{x}) > q_{K-1}) \times \cdots \times P_{\boldsymbol{x}}(y(\boldsymbol{x}) > q_1 \mid y(\boldsymbol{x}) > q_0) \times P_{\boldsymbol{x}}(y(\boldsymbol{x}) > q_0), \tag{10}$$

with $q_0 = -\infty$ and $q_K = q$. Then, classical Monte Carlo estimators can be proposed for each conditional probability. All these estimators being unbiased and independent, the mean of their product is still equal to $p$. However, the variance of the aggregated estimator strongly depends on the choice of the thresholds. In practice, the sequence of thresholds is defined on the fly, which is generally referred to as adaptive splitting [7]. In particular, the Markov chain

$$y_k := (y(\boldsymbol{x}) \mid y(\boldsymbol{x}) > y_{k-1}), \quad y_0 = -\infty, \quad k \geq 1, \tag{11}$$

is called an increasing random walk. It can be shown that the counting random variable of the number of events before $q$, which is denoted by $M := \mathrm{Card}\{k \geq 1 \mid y_k \leq q\}$, follows a Poisson law with parameter $-\log(p)$ [41]. Hence, given $Q \geq 1$ independent random counting variables $(M_q)_{1 \leq q \leq Q}$,

$$p^{MP} := \left(1 - \frac{1}{Q}\right)^{\sum_{q=1}^{Q} M_q} \tag{12}$$

also defines an unbiased estimator of $p$, whose coefficient of variation is equal to $\sqrt{-\log(p)/Q}$. This approach is referred to as the Moving Particle (MP) method in the following. Another strategy to choose the different thresholds is given by the Subset Simulation (SS) method. The interested reader may refer to [1, 8] for further details about this approach.
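The MP estimator of Eq. (12) can be illustrated on a toy case where the conditional draws of Eq. (11) are available exactly. The sketch below assumes $y(x) = x$ with a single standard normal input, so that $(x \mid x > y_{k-1})$ can be sampled by inversion and no Markov chain sampler is needed; this toy model, the threshold and the function names are illustrative assumptions, not the setting of the paper.

```python
import math
from statistics import NormalDist

import numpy as np

_STD_NORMAL = NormalDist()


def count_states_below(q, rng):
    """Simulate the increasing random walk of Eq. (11) for y(x) = x, x ~ N(0, 1).

    Returns M = Card{k >= 1 : y_k <= q}.
    """
    count = 0
    survival = 1.0  # P(x > y_{k-1}), equal to 1 for y_0 = -inf
    while True:
        # Exact draw of (x | x > y_{k-1}): the survival probability of the
        # new state is uniform on (0, P(x > y_{k-1})).
        survival *= rng.uniform(np.finfo(float).tiny, 1.0)
        y_k = -_STD_NORMAL.inv_cdf(survival)
        if y_k > q:
            return count
        count += 1


def moving_particle_estimate(q, n_walks, rng):
    """Estimator of Eq. (12): (1 - 1/Q) ** (M_1 + ... + M_Q)."""
    total = sum(count_states_below(q, rng) for _ in range(n_walks))
    return (1.0 - 1.0 / n_walks) ** total


rng = np.random.default_rng(0)
q, n_walks = 3.0, 2_000
p_mp = moving_particle_estimate(q, n_walks, rng)
p_exact = 0.5 * math.erfc(q / math.sqrt(2.0))   # P(N(0, 1) > q)
cov_theory = math.sqrt(-math.log(p_exact) / n_walks)
print(p_mp, p_exact, cov_theory)
```

For a real code, each conditional draw would instead require several model evaluations through a Markov chain sampler, which is what drives the cost of the method.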

Indeed, if we focus on the Markov chain defined by Eq. (11), $y_k$ has to be randomly generated conditionally greater than $y_{k-1}$. This can be done using the Metropolis-Hastings algorithm [22, 15]. If $T$ is the number of steps that is used to control the convergence of the Markov chain to its stationary distribution, it follows that, on average, $1 - T\log(p)$ samples have to be generated to get one realization of the counting variable $M_q$. Thus, the mean total number of code evaluations to get the $Q$ samples $M_1, \ldots, M_Q$ is equal to $N = Q(1 - T\log(p))$. It follows that the coefficient of variation $\delta_{MP}$ of $p^{MP}$ can be approximated as

$$\delta_{MP}^2 \approx \frac{\log(p)\,(T\log(p) - 1)}{N} \approx \frac{T\log(p)^2}{N}. \tag{13}$$

This has to be compared to $\delta_{MC}^2 \approx 1/(pN)$ for the crude Monte Carlo.
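To make this comparison concrete, the two approximations can be evaluated for a common budget $N$. The values of $N$, $T$ and $p$ below are illustrative assumptions, not figures taken from the paper.

```python
import math


def squared_cov_mc(p, n_evals):
    """delta_MC^2 ~ 1 / (p N) for crude Monte Carlo."""
    return 1.0 / (p * n_evals)


def squared_cov_mp(p, n_evals, n_steps):
    """delta_MP^2 ~ log(p) (T log(p) - 1) / N, first form of Eq. (13)."""
    log_p = math.log(p)
    return log_p * (n_steps * log_p - 1.0) / n_evals


n_evals, n_steps = 10_000, 20   # N and T, illustrative values
for p in (1e-2, 1e-4, 1e-6):
    ratio = squared_cov_mc(p, n_evals) / squared_cov_mp(p, n_evals, n_steps)
    print(f"p = {p:.0e}: delta_MC^2 / delta_MP^2 = {ratio:.3g}")
```

Under these assumed values, the ratio grows quickly as $p$ decreases, while crude Monte Carlo remains competitive for probabilities around $10^{-2}$, because of the factor $T$ paid at each move of the Markov chain.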

In MCS, SS and MP methods, let $\hat{p}$ denote the best estimate of $p$ we get once the maximal computational budget is attained. For each of these methods, it can be noticed that the points $\boldsymbol{x}^{(k)}$, where it was observed that $y(\boldsymbol{x}^{(k)})$ is greater than $q$, are independent realizations of the conditioned random vector $(\boldsymbol{x} \mid y(\boldsymbol{x}) > q)$. Let us gather all these realizations of $(\boldsymbol{x} \mid y(\boldsymbol{x}) > q)$ in the set $\mathcal{D}^f := \{\boldsymbol{x}^{(1)}, \ldots, \boldsymbol{x}^{(N^*)}\}$. Hence, in the following, $N^*$ denotes the number of points that have been sampled in the failure domain.

3. Compared influence of the inputs variability on $p$

Based on the estimated value of $p$ and the elements of $\mathcal{D}^f$ only, the purpose of this section is to identify the components of $\boldsymbol{x}$ whose variability has to be reduced in priority if we want to decrease the value of $p$. To this end, it can be interesting to quantify the effect on $p$ of fixing $x_i$ to the particular value $x_i^\star$. Indeed, the higher $\left(P_{\boldsymbol{x}}(y(\boldsymbol{x}) > q) - P_{\boldsymbol{x}_{-i}}(y(\boldsymbol{x}) > q \mid x_i = x_i^\star)\right)^2$, the more influential $x_i$. Thus, averaging over $x_i$, the quantity

$$E_{x_i}\left[\left(P_{\boldsymbol{x}}(y(\boldsymbol{x}) > q) - P_{\boldsymbol{x}_{-i}}(y(\boldsymbol{x}) > q \mid x_i)\right)^2\right] = V_{x_i}\left[E_{\boldsymbol{x}_{-i}}\left[1_{y(\boldsymbol{x})>q} \mid x_i\right]\right] \tag{14}$$

can be used to analyse the sensitivity of $p$ to model input $x_i$. Normalizing these quantities by $V_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}\right]$, we find back the well-known first order Sobol indices [38] associated with function $1_{y(\boldsymbol{x})>q}$:

$$s_i := \frac{V_{x_i}\left[E_{\boldsymbol{x}_{-i}}\left[1_{y(\boldsymbol{x})>q} \mid x_i\right]\right]}{V_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}\right]}. \tag{15}$$

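By construction, index $s_i$ indicates the fraction of the variance of $1_{y(\boldsymbol{x})>q}$ caused by $x_i$ individually. The fraction of the variance of $1_{y(\boldsymbol{x})>q}$ caused by $x_i$ including interactions with the components of $\boldsymbol{x}_{-i}$ is given by the $i$th total Sobol index, denoted by $t_i$, which verifies:

$$t_i := 1 - \frac{V_{\boldsymbol{x}_{-i}}\left[E_{x_i}\left[1_{y(\boldsymbol{x})>q} \mid \boldsymbol{x}_{-i}\right]\right]}{V_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}\right]}. \tag{16}$$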

Based on Eqs. (15) and (16), the computation of $s_i$ and $t_i$ is nontrivial, since $E_{\boldsymbol{x}_{-i}}[\,\cdot\,]$ and $V_{\boldsymbol{x}_{-i}}[\,\cdot\,]$ refer to multidimensional integrals. This motivated the introduction of various algorithms to reduce the computational cost of the Sobol' indices. In particular, efficient sample-based methods can be found in [40, 39, 25, 20] to replace the naive and very expensive double-loop MCS. However, in spite of these developments, the number of dedicated code evaluations that are needed for these methods is still very high. To circumvent this problem, another approach is proposed in this paper, which is based on Proposition 1, whose proof has been moved to the Appendix.

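Proposition 1. For all $1 \leq i \leq D$, we have:

$$s_i = \frac{p}{1 - p}\, V_{x_i}\left[\frac{f_{x_i|y(\boldsymbol{x})>q}(x_i)}{f_{x_i}(x_i)}\right], \tag{17}$$

$$t_i = 1 - \frac{p}{1 - p}\, V_{\boldsymbol{x}_{-i}}\left[\frac{f_{\boldsymbol{x}_{-i}|y(\boldsymbol{x})>q}(\boldsymbol{x}_{-i})}{f_{\boldsymbol{x}_{-i}}(\boldsymbol{x}_{-i})}\right], \tag{18}$$

where, for all $\boldsymbol{x}$ in $\mathbb{X}$,

$$\begin{cases}
f_{\boldsymbol{x}|y(\boldsymbol{x})>q}(\boldsymbol{x}) := \dfrac{1}{p}\, 1_{y(\boldsymbol{x})>q}\, f_{\boldsymbol{x}}(\boldsymbol{x}), \\[2mm]
f_{x_i|y(\boldsymbol{x})>q}(x_i) := \displaystyle\int_{\prod_{1 \leq j \leq D,\, j \neq i} \mathbb{X}_j} f_{\boldsymbol{x}|y(\boldsymbol{x})>q}(\boldsymbol{x}) \prod_{1 \leq j \leq D,\, j \neq i} dx_j, \\[2mm]
f_{\boldsymbol{x}_{-i}|y(\boldsymbol{x})>q}(\boldsymbol{x}_{-i}) := \displaystyle\int_{\mathbb{X}_i} f_{\boldsymbol{x}|y(\boldsymbol{x})>q}(\boldsymbol{x})\, dx_i.
\end{cases} \tag{19}$$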
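For intuition, Eq. (17) can be recovered from Eq. (15) in a few lines. This is only a sketch relying on the independence of the inputs and on the definitions of Eq. (19); the complete proof remains the one given in the Appendix.

```latex
\begin{align*}
E_{\boldsymbol{x}_{-i}}\left[1_{y(\boldsymbol{x})>q} \mid x_i\right]
  &= \int 1_{y(\boldsymbol{x})>q}\, f_{\boldsymbol{x}_{-i}}(\boldsymbol{x}_{-i})\, d\boldsymbol{x}_{-i}
   = \frac{1}{f_{x_i}(x_i)} \int 1_{y(\boldsymbol{x})>q}\, f_{\boldsymbol{x}}(\boldsymbol{x})\, d\boldsymbol{x}_{-i}
   = p\, \frac{f_{x_i|y(\boldsymbol{x})>q}(x_i)}{f_{x_i}(x_i)}, \\
V_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}\right]
  &= E_{\boldsymbol{x}}\left[1_{y(\boldsymbol{x})>q}^2\right] - p^2 = p(1 - p), \\
s_i
  &= \frac{p^2\, V_{x_i}\left[ f_{x_i|y(\boldsymbol{x})>q}(x_i) / f_{x_i}(x_i) \right]}{p(1 - p)}
   = \frac{p}{1 - p}\, V_{x_i}\left[\frac{f_{x_i|y(\boldsymbol{x})>q}(x_i)}{f_{x_i}(x_i)}\right].
\end{align*}
```

The same steps with the roles of $x_i$ and $\boldsymbol{x}_{-i}$ exchanged lead to Eq. (18).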

[...] indicator function are proportional to the variances of the ratios between the a priori PDFs of $x_i$ and $\boldsymbol{x}_{-i}$ and their PDFs conditioned by the fact that $y(\boldsymbol{x})$ is greater than $q$. In this work, we propose to approximate these PDFs using one of the nonparametric approaches described in [30, 29]. These methods are particularly suited for this kind of approximation, as the construction they propose only requires independent realizations of the random vector to be modelled. For each $1 \leq i \leq D$, let $\hat{f}_{x_i|y(\boldsymbol{x})>q}$ and $\hat{f}_{\boldsymbol{x}_{-i}|y(\boldsymbol{x})>q}$ be these approximations of functions $f_{x_i|y(\boldsymbol{x})>q}$ and $f_{\boldsymbol{x}_{-i}|y(\boldsymbol{x})>q}$ based on the elements of the set $\mathcal{D}^f$ only. Sobol indices $s_i$ and $t_i$ can then be approximated as:

$$s_i \approx \hat{s}_i := \frac{\hat{p}}{1 - \hat{p}}\, V_{x_i}\left[\frac{\hat{f}_{x_i|y(\boldsymbol{x})>q}(x_i)}{f_{x_i}(x_i)}\right], \tag{20}$$

$$t_i \approx \hat{t}_i := 1 - \frac{\hat{p}}{1 - \hat{p}}\, V_{\boldsymbol{x}_{-i}}\left[\frac{\hat{f}_{\boldsymbol{x}_{-i}|y(\boldsymbol{x})>q}(\boldsymbol{x}_{-i})}{f_{\boldsymbol{x}_{-i}}(\boldsymbol{x}_{-i})}\right], \tag{21}$$

where it is reminded that $\hat{p}$ is the estimated value of $p$ based on one of the sample-based methods presented in Section 2. Finally, generating independent realizations under $\hat{f}_{x_i}$ and $\hat{f}_{\boldsymbol{x}_{-i}}$ being quick and easy, Monte Carlo estimations of $\hat{s}_i$ and $\hat{t}_i$ can be calculated numerically with a controlled precision.
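A minimal sketch of the first-order estimate of Eq. (20) is given below. It relies on several assumptions that are not taken from the paper: a Gaussian kernel density estimate with Silverman's rule-of-thumb bandwidth stands in for the nonparametric methods of [30, 29], the failure set comes from a crude Monte Carlo run on the toy model $y(\boldsymbol{x}) = x_1$ with two standard normal inputs, and the variance is computed from draws of the kernel estimate through the identity $V_{x_i}[\hat{f}/f] = E_{\hat{f}}[\hat{f}/f] - 1$, which holds because $\hat{f}$ integrates to one. For this toy model the exact indices are $s_1 = 1$ and $s_2 = 0$; kernel smoothing near the threshold is expected to bias the first estimate somewhat below 1.

```python
import numpy as np


def gaussian_kde_1d(points, bandwidth):
    """Return a function evaluating a 1-D Gaussian kernel density estimate."""
    norm = 1.0 / (len(points) * bandwidth * np.sqrt(2.0 * np.pi))

    def density(x):
        z = (np.asarray(x)[:, None] - points[None, :]) / bandwidth
        return norm * np.exp(-0.5 * z**2).sum(axis=1)

    return density


def first_order_index(failure_xi, p_hat, rng, n_draws=4_000):
    """Estimate s_i from the i-th coordinates of the failure points, Eq. (20)."""
    n = len(failure_xi)
    # Silverman's rule of thumb (an assumption, not the choice of [30, 29]).
    bandwidth = 1.06 * failure_xi.std(ddof=1) * n ** (-0.2)
    f_hat = gaussian_kde_1d(failure_xi, bandwidth)
    # Sampling from the KDE: pick a data point, add Gaussian kernel noise.
    draws = rng.choice(failure_xi, size=n_draws)
    draws = draws + bandwidth * rng.standard_normal(n_draws)
    prior_pdf = np.exp(-0.5 * draws**2) / np.sqrt(2.0 * np.pi)  # phi(x; 0, 1)
    # V[f_hat / f] = E_{f_hat}[f_hat / f] - 1 since f_hat integrates to one.
    variance_of_ratio = np.mean(f_hat(draws) / prior_pdf) - 1.0
    return p_hat / (1.0 - p_hat) * variance_of_ratio


rng = np.random.default_rng(0)
x = rng.standard_normal((20_000, 2))
failed = x[:, 0] > 2.0            # toy model y(x) = x_1, threshold q = 2
p_hat = failed.mean()
failure_set = x[failed]           # plays the role of the set D^f

s_hat = [first_order_index(failure_set[:, i], p_hat, rng) for i in range(2)]
print(s_hat)
```

No additional model evaluation is needed once the failure set is available, which is the point of the approach. The total indices of Eq. (21) would require a density estimate in dimension $D - 1$ and are not covered by this sketch.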

4. Robustness analysis of the estimation of $p$

Indices $s_i$ and $t_i$, which are defined by Eqs. (17) and (18), are associated with a fixed distribution of the model inputs. In this section, another type of sensitivity indices is defined, which can be used to quantify the robustness of the evaluation of $p$ to small changes of the input distribution. As explained in the Introduction, the information provided by these new indices is complementary to the information provided by $s_i$ and $t_i$. While $s_i$ and $t_i$ allow the identification of the components of $\boldsymbol{x}$ whose variability has to be reduced in priority to decrease the value of $p$, these new indices can be used to identify the components of $\boldsymbol{x}$ whose distributions have to be particularly well-characterized for a relevant estimation of $p$.

To quantify the robustness of the estimation of $p$ to small changes of the input distribution, we generally compute the gradient of the failure probability with respect to the parameters that characterize the PDF of the inputs.