HAL Id: hal-02608782
https://hal.inrae.fr/hal-02608782
Submitted on 29 Jul 2021
Distributed under a Creative Commons Attribution 4.0 International License
On the use of derivatives in the polynomial chaos based global sensitivity and uncertainty analysis applied to the
distributed parameter models
Igor Gejadze, Pierre-Olivier Malaterre, Victor Shutyaev
To cite this version:
Igor Gejadze, Pierre-Olivier Malaterre, Victor Shutyaev. On the use of derivatives in the polynomial chaos based global sensitivity and uncertainty analysis applied to the distributed parameter models. Journal of Computational Physics, Elsevier, 2019, 381, pp. 218-245. doi:10.1016/j.jcp.2018.12.023. hal-02608782.
I. Gejadze a,∗, P.-O. Malaterre a, V. Shutyaev b

a UMR G-EAU, IRSTEA Centre de Montpellier, 361 Rue J.F. Breton, BP 5095, 34196 Montpellier, France
b Marchuk Institute of Numerical Mathematics, Russian Academy of Sciences (RAS), Federal State Budget Scientific Institution "Marine Hydrophysical Institute of RAS", Moscow Institute for Physics and Technology, 119333 Gubkina 8, Moscow, Russia
Sensitivity Analysis (SA) and Uncertainty Quantification (UQ) are important components of modern numerical analysis. However, solving the relevant tasks involving large-scale fluid flow models currently remains a serious challenge. The difficulties are associated with the computational cost of running such models and with the large dimensions of the discretized model input. The common approach is to construct a metamodel – an inexpensive approximation to the original model suitable for the Monte Carlo simulations.

The polynomial chaos (PC) expansion is the most popular approach for constructing metamodels. Some fluid flow models of interest are involved in the process of variational estimation/data assimilation. This implies that the tangent linear and adjoint counterparts of such models are available and, therefore, computing the gradient (first derivative) and the Hessian (second derivative) of a given function of the state variables is possible. New techniques for SA and UQ which benefit from using the derivatives are presented in this paper. The gradient-enhanced regression methods for computing the PC expansion have been developed recently. An intrinsic step of the regression method is a minimization process. It is often assumed that generating 'data' for the regression problem is significantly more expensive than solving the regression problem itself. This depends, however, on the size of the importance set, which is a subset of 'influential' inputs. A distinguishing feature of the distributed parameter models is that the number of such inputs could still be large, which means that solving the regression problem becomes increasingly expensive. In this paper we propose a derivative-enhanced projection method, where no minimization is required. The method is based on the explicit relationships between the PC coefficients and the derivatives, complemented with a relatively inexpensive filtering procedure. The method is currently limited to the PC expansion of the third order. Besides, we suggest an improved derivative-based global sensitivity measure. The numerical tests have been performed for the Burgers' model. The results confirm that the low-order PC expansion obtained by our method represents a useful tool for modeling the non-gaussian behavior of the chosen quantity of interest (QoI).
* Corresponding author.
E-mail addresses: igor.gejadze@irstea.fr (I. Gejadze), p-o.malaterre@irstea.fr (P.-O. Malaterre), victor.shutyaev@mail.ru (V. Shutyaev).
Introduction
Sensitivity Analysis (SA) and Uncertainty Quantification (UQ) are important components of modern numerical analysis [8]. In SA one studies the relative importance of different inputs on the model output, which is usually represented by the quantities of interest (QoI). The main purpose of SA is the design of optimal (and feasible) control, UQ and data assimilation (DA) algorithms. In UQ one focuses on quantifying the uncertainty in the QoI, given the uncertainty in the model inputs. The most straightforward UQ method is the direct Monte Carlo method, where the model is run in a loop with the perturbed inputs; then the sample of QoI is processed to obtain its statistical characterization (skill scores, for example). The basic Monte Carlo method has a number of more sophisticated variants, e.g. using the importance sampling [46], quasi-random sequences [43], etc. However, for computationally expensive fluid flow models such as those common in geophysics, these methods are not particularly suitable. While in the absence of a viable alternative the Monte Carlo method is presently used in weather and ocean forecasting (ensemble forecasting, [27]), the need for development of more efficient methods is obvious.
One possible approach is to use the 'local' sensitivities. In particular, the corresponding SA and UQ methods have been recently applied in the field of computational fluid dynamics [22,38,3,15]. Here, the adjoint (local) sensitivities are used to evaluate the variance of the uncertainty in QoI (or the covariance for a vector of QoI). The uncertainty in the input vector can be either prior or posterior, i.e. before and after DA, respectively. The posterior covariance of the input uncertainty is approximated by the inverse Hessian of the auxiliary data assimilation cost function [16]. The adjoint approach is computationally very efficient; however, its accuracy for nonlinear models depends on the validity of the tangent linear approximation for the error dynamics (case of weak nonlinearity and/or small uncertainties). This is a key assumption in the ensemble forecasting [27], since the method relies on computing the singular vectors of the propagator considered at the optimal solution point. Besides, in the case of strongly nonlinear dynamics, both the posterior input uncertainty and the uncertainty in QoI may have non-gaussian probability distributions, which means that knowing the variance/covariance is not sufficient. Thus, the global sensitivity based methods have to be considered.
An alternative option is to develop reduced-order dynamical models, either using the proper orthogonal decomposition (POD) [48,28] or the dynamic mode decomposition (DMD) [39,35]. In the POD method, the distributed state variables are represented by linear combinations of the spatial modes. These representations are substituted into the model equations and the Galerkin projection is applied. In this process the original model governed by the nonlinear partial differential equations is transformed into a system of nonlinear ordinary differential equations for the time-dependent weights associated with the POD modes. The resulting 'surrogate' model is much less expensive than the original one, and it is generally suitable for sampling. The method is intrusive in the sense that the surrogate model has to be actually created, which means an additional development effort. Nor is this always easy, despite several successful examples being reported.
In the DMD method, one looks for a (linear) Koopman operator (modes, eigenvalues and eigenfunctions), which defines the evolution model explicitly. Thus, the DMD method is non-intrusive, but relies on a linear approximation of the original propagator. It is not obvious that using this type of linearized model for UQ is better than just using the local sensitivities.
Another approach is to use a phenomenological model, i.e. a model which describes the relationship between the model inputs and outputs without attempting to explain why the variables interact the way they do. Such a model can be based on the polynomial chaos (PC) expansion (for a concise review of the PC methods the reader may refer to [26]).
In the intrusive version of the PC-based UQ technique, called the 'direct non-sampling method', see [30], the uncertainty-loaded model variables are expressed using the PC expansions. These are substituted into the model equations and the Galerkin projection is applied. Technically this approach is somewhat similar to the one used with POD; however, the resulting model has to be solved just once (no sampling required).
In our paper we focus on a non-intrusive approach, where the PC expansion describes the relationship between the model inputs and outputs directly. The task of constructing the continuous response surface, given a finite sample of model responses, is solved by two main approaches: regression and projection. In the regression methods one has to minimize an appropriate cost function. A few types of regression methods are used: the least-squares regression [5,11,21], l1-minimization [33], least angle regression [7], and adaptive methods [1]. In the projection methods one benefits from the orthogonality of the polynomial bases involved, which allows the PC expansion coefficients to be computed as multidimensional integrals [49,14,17,18]. In its original form, i.e. involving a full-tensor product quadrature, this method is not feasible for high-dimensional problems. The major upgrades of this method have been designed to improve the integration efficiency. They include, for example, the use of sparse grid sampling techniques [40], nested quadratures (Clenshaw–Curtis), quasi-random sequences [43] or Latin Hypercube sampling [9]. In all cases, however, the integrals over the model response function (QoI) are considered. As opposed to this classical approach, in the present paper the integrals over the derivatives of the model response function are considered, which gives some essential benefits.
Variational data assimilation (DA) is a method based on the optimal control theory [25,29]. Presently, this method is preferred for weather and ocean forecasting in major operational centers around the globe, particularly in the form of the incremental 4D-Var [13], and in the form of the ensemble 4D-Var [12]. In general, the variational DA method belongs to the family of variational estimation methods, which are widely used in a variety of applications, including geophysics, astrophysics, bio-medical, mechanical and aerospace engineering. One useful bonus associated with the numerical models involved in variational estimation is the availability of the adjoint model, which allows the gradient (first derivative) of the standard data mismatch cost function to be calculated in a single model run. This adjoint model can be easily adapted to compute the gradient of any other QoI defined as a function of the state variables. Moreover, computing the Hessian (second derivative) of QoI is also possible, though more expensive, with a modification of the existing adjoint model into the second-order adjoint model [47]. Thus, at least the first two derivatives can be available in the discussed framework for the specified group of models.
The idea of the gradient-enhanced methods for computing the PC expansion has been introduced and implemented almost simultaneously by several authors [2,34,32] in the regression framework. In this paper we propose a derivative-enhanced projection method. It has been mentioned that the major drawback of the projection methods is a slow convergence of the corresponding integrals. However, the use of the derivatives accelerates this convergence. For example, if the gradients are used, the convergence of the corresponding integrals for the first- and higher-order PC coefficients is accelerated by an order of magnitude. Besides, we develop a filtering procedure which allows the sampling error to be further reduced. Another benefit of using the gradients is an easy way of accessing the 'total effect' global sensitivity indices [42]. These are used for ranking and defining the importance set, i.e. a presumably small subset of the full input including the most influential components. The proposed method is currently limited to the case of the third-order PC expansion. For the distributed parameter systems a large dimension of the full model input is expected. Moreover, the dimension of the importance set can still be large, in which case considering the PC expansion of the fourth and higher orders does not look feasible anyway. While the method is devised for the gaussian inputs (i.e. using Hermite polynomials), it can be extended to different types of inputs using other polynomials of the Askey scheme.
The paper is organized as follows. In §1 we define QoI as a function of the model input vector (called the 'design function' hereafter), and its PC expansion using the 'probabilistic' Hermite polynomials. In §2 we describe the regression approach, its gradient-enhanced version and the proposed Hessian-enhanced version. The derivative-enhanced projection method introduced in this paper is based on the explicit expressions for the PC expansion coefficients via the derivatives of arbitrary order, presented in §3 (see Theorem 1, proof in Appendix 1). In §4 the relationship between the known derivative-based sensitivity measure [24] and the global sensitivity indices is analyzed (see Theorem 2, proof in Appendices 2 and 3), and an improved derivative-based sensitivity estimate is suggested (see Corollary 2). The implementation details, including the description of the filtering algorithms, are presented in §5. Taking into account that the PC coefficients are estimated and represented in the limited-memory form, the appropriate expressions for the metamodels are derived in §6. The proposed method is validated using the Burgers' model: the model equation, the adjoint equation and the Hessian-vector product operator are defined in §7. The description of numerical experiments and the results of simulations are presented in §8. The main outcomes of the paper are summarized in the Conclusions, and major mathematical proofs and derivations are collected in the Appendixes.
1. Problem setup
Let us consider the numerical model describing the behavior of a dynamical system in terms of its state variables $X \in \mathcal{X}$. Given a full set of the model inputs $U \in \mathcal{U}$, the model can be considered as a 'control-to-state' mapping $M: \mathcal{U} \to \mathcal{X}$:
$$X = M(U), \qquad (1)$$
where $\mathcal{X}$ and $\mathcal{U}$ are the state and control spaces, correspondingly.

For modeling the system behavior the true input set $\bar U$ must be specified. Under the 'perfect model' assumption the following can be postulated: $\bar X = M(\bar U)$. In reality, some components of $\bar U$ contain uncertainties $\bar\xi \in \mathcal{U}$. Thus, instead of $\bar U$ we use its best available approximation, either prior (background) or posterior (estimate):
$$U^* = \bar U + \bar\xi. \qquad (2)$$
Because of the presence of $\bar\xi$, the predicted state $X|U^*$, that is, $X$ evaluated (or conditioned) on $U^*$, also contains an error $\delta X = X|U^* - X|\bar U$.

In many practical situations the quantities of interest (QoI) are represented by a functional of the state $D(X) \in \mathcal{G}$, where $\mathcal{G}$ is the 'design' space and $D: \mathcal{X} \to \mathcal{G}$ is a linear or nonlinear mapping. Thus, one can define a generalized 'control-to-design' mapping $G: \mathcal{U} \to \mathcal{G}$, such that
$$D(M(U)) = G(U). \qquad (3)$$
The uncertainty $\bar\xi$ results in the error
$$\varepsilon = G(\bar U + \bar\xi) - G(\bar U). \qquad (4)$$
From now on we shall call $\varepsilon$ the design function. Quantification of $\varepsilon$ is an important task in a variety of applications.

Below we suppose that the input space is finite-dimensional, i.e. $\mathcal{U} = \mathbb{R}^N$, and $\bar\xi = (\bar\xi_1, \ldots, \bar\xi_N)^T$. Consider $\bar\xi$ in the form
$$\bar\xi = R(\xi), \qquad (5)$$
where $\xi = (\xi_1, \ldots, \xi_N)^T$ with $\xi_i \sim N(0, 1)$ being i.i.d. gaussian random variables with unit variance, and $R$ is a transformation from the standard normal to a desired probability distribution. For example, if $R(\xi) = B^{1/2}\xi$, then one obtains $\bar\xi \sim N(0, B)$; if $R(\xi) = \mu + \sigma \exp(\xi)$, one gets the log-normally distributed $\bar\xi$. Thus, the design function can finally be defined in the form
$$\varepsilon(\xi) = G(\bar U + R(\xi)) - G(\bar U). \qquad (6)$$

Let us consider the polynomial expansion of $\varepsilon(\xi)$ using 'probabilistic' Hermite polynomials:
$$\varepsilon(\xi) = b_0 H_0 + \sum_{i_1=1}^{N} b_{i_1} H_1(\xi_{i_1}) + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} b_{i_1 i_2} H_2(\xi_{i_1}, \xi_{i_2}) + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} b_{i_1 i_2 i_3} H_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) + \ldots + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \ldots \sum_{i_p=1}^{i_{p-1}} b_{i_1 i_2 \ldots i_p} H_p(\xi_{i_1}, \xi_{i_2}, \ldots, \xi_{i_p}), \qquad (7)$$
where
$$H_0 = 1, \quad H_1(\xi_{i_1}) = \xi_{i_1}, \quad H_2(\xi_{i_1}, \xi_{i_2}) = \xi_{i_1}\xi_{i_2} - \delta_{i_1 i_2},$$
$$H_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) = \xi_{i_1}\xi_{i_2}\xi_{i_3} - \xi_{i_1}\delta_{i_2 i_3} - \xi_{i_2}\delta_{i_1 i_3} - \xi_{i_3}\delta_{i_1 i_2}, \;\ldots \qquad (8)$$
The convergence of the series (7) for the gaussian entries $\xi$ is proved in [10]. The total number of the coefficients $b = (b_0, \{b_{i_1}\}, \{b_{i_1 i_2}\}, \ldots, \{b_{i_1 \ldots i_p}\})^T$ is given by
$$M = \frac{(N + p)!}{N!\, p!}, \qquad (9)$$
where $p$ is the chosen order of polynomials. Representation (7) is often referred to as the polynomial chaos (PC) expansion.
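Formula (9) makes explicit why a full PC expansion quickly becomes infeasible as the input dimension grows. A minimal sketch of this count (the function name is ours, for illustration only):

```python
from math import comb

def num_pc_coefficients(N: int, p: int) -> int:
    """Total number of PC coefficients, Eq. (9): M = (N+p)! / (N! p!)."""
    return comb(N + p, p)

# Growth of M for a third-order expansion (p = 3) as the
# importance-set size N increases:
for N in (10, 50, 100, 500):
    print(N, num_pc_coefficients(N, 3))
```

Even for a modest importance set of a few hundred inputs, a third-order expansion already involves millions of coefficients, which motivates the cost discussion in §2.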
2. Implicit approach (least-squares, regression)
According to (9), computing the PC expansion is feasible only for small $N$ and $p$. Thus, one should define the inputs $U'$ (and, correspondingly, $\xi'$) having the most impact on $\varepsilon(\xi)$. These inputs constitute the importance set of size $N'$, which is expected to be much smaller than $N$. For the importance set, the vector of PC coefficients is $b'$. Below we deal with the importance set, keeping however the original notations (i.e. without $'$) for the sake of simplicity.

Let us represent (7) in the dot product form
$$\varepsilon(\xi) = H(\xi)\, b,$$
where $H(\xi)$ is a vector of weights (i.e. the values of the corresponding Hermite polynomials). In the least-squares regression method one looks for the minimum of the cost function
$$J = \Phi(b) + J_0, \qquad (10)$$
with
$$J_0 = \sum_{l=1}^{L} \left( \varepsilon^*(\xi^l) - H(\xi^l)\, b \right)^2, \qquad (11)$$
where $\varepsilon^*(\xi^l)$ is the computed value of the design function (6) at point $\xi^l$, and $\Phi(b)$ is a regularization term. The latter should be used if the system is under-determined and other types of regularization are not involved. In practice, in order to get a sensible estimate of $b$ one needs an over-determined system $L = kM$, where $k > 2$ at least. Let us mention that the least-squares method is not the only suitable option for finding vector $b$. For example, some authors prefer the l1-minimization (see [32,33] and the references presented there).

The idea of using the first derivative (gradient) of $\varepsilon(\xi)$ in the UQ context comes from the optimal control theory [29], where the concept of 'adjoint equations' is introduced. The adjoint model allows the gradient of any function of the state variables to be computed at a cost comparable to the cost of computing the function itself. The gradient of (7) is a vector of size $N$ with elements given by
$$\frac{\partial \varepsilon(\xi)}{\partial \xi_i} = \sum_{i_1=1}^{N} b_{i_1} \frac{\partial H_1(\xi_{i_1})}{\partial \xi_i} + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} b_{i_1 i_2} \frac{\partial H_2(\xi_{i_1}, \xi_{i_2})}{\partial \xi_i} + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} b_{i_1 i_2 i_3} \frac{\partial H_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3})}{\partial \xi_i} + \ldots, \quad i = 1, \ldots, N. \qquad (12)$$
The above expression can be re-written in the dot product form
$$\frac{\partial \varepsilon(\xi)}{\partial \xi_i} = \frac{\partial H(\xi)}{\partial \xi_i}\, b_1,$$
where $b_1 = (\{b_{i_1}\}, \{b_{i_1 i_2}\}, \ldots, \{b_{i_1 \ldots i_p}\})^T$ is a vector which includes all the PC coefficients except $b_0$. Now, the generalized cost function looks as follows:
$$J = \Phi(b) + J_0 + J_1, \qquad (13)$$
with
$$J_1 = \sum_{l=1}^{L_1} \sum_{i=1}^{N} \left( \frac{\partial \varepsilon^*(\xi^l)}{\partial \xi_i} - \frac{\partial H(\xi^l)}{\partial \xi_i}\, b_1 \right)^2, \qquad (14)$$
where $\partial \varepsilon^*(\xi^l)/\partial \xi$ is the gradient of the design function (6) computed at point $\xi^l$. Thus, one run of the adjoint model simultaneously provides 'data' for $N$ constraints. The earliest reports on the gradient-enhanced PC expansion method can be found in [2,34].

A natural extension to this method would be to consider the second derivative (Hessian) of (6). In particular, we know that the Hessian-vector product is given by the second-order adjoint model [47]. Using the Hessian-vector product, the eigenvalue decomposition $\{\lambda_s, U_s\}$, $s = 1, \ldots, S$, of the Hessian can be obtained using the Lanczos method. Then, the Hessian can be presented via its eigenpairs in the limited-memory form as follows:
$$\frac{\partial^2 \varepsilon(\xi)}{\partial \xi^2} = \sum_{s=1}^{S} \lambda_s U_s U_s^T. \qquad (15)$$
The second derivative of (7) is a symmetric $N \times N$ matrix with elements
$$\frac{\partial^2 \varepsilon(\xi)}{\partial \xi_i \partial \xi_j} = \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} b_{i_1 i_2} \frac{\partial^2 H_2(\xi_{i_1}, \xi_{i_2})}{\partial \xi_i \partial \xi_j} + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} b_{i_1 i_2 i_3} \frac{\partial^2 H_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3})}{\partial \xi_i \partial \xi_j} + \ldots \qquad (16)$$
The above expression can be re-written in the dot product form
$$\frac{\partial^2 \varepsilon(\xi)}{\partial \xi_i \partial \xi_j} = \frac{\partial^2 H(\xi)}{\partial \xi_i \partial \xi_j}\, b_2,$$
where $b_2 = (\{b_{i_1 i_2}\}, \ldots, \{b_{i_1 \ldots i_p}\})^T$. Now, the generalized cost function looks as follows:
$$J = \Phi(b) + J_0 + J_1 + J_2, \qquad (17)$$
with
$$J_2 = \sum_{l=1}^{L_2} \sum_{i=1}^{N} \sum_{j=1}^{i} \left( \frac{\partial^2 \varepsilon^*(\xi^l)}{\partial \xi_i \partial \xi_j} - \frac{\partial^2 H(\xi^l)}{\partial \xi_i \partial \xi_j}\, b_2 \right)^2, \qquad (18)$$
where $\partial^2 \varepsilon^*(\xi^l)/\partial \xi^2$ is the Hessian of (6) computed at point $\xi^l$ and presented in the form (15).

While the idea of using the Hessian computations in the context of the PC expansion looks attractive, its practicality depends on the circumstances. The Hessian is a far more expensive object to compute than the gradient. For each Hessian-vector product the tangent linear and adjoint models have to be run in a sequence (see §7.2). The number of Lanczos iterations depends on the eigenvalue spectrum of the Hessian. If this spectrum contains a well separated subset of 'leading' eigenvalues, the number of Lanczos iterations is nearly equivalent to the size of this subset; it could be significantly larger otherwise. On the other hand, one computed Hessian simultaneously adds $N(N+1)/2$ constraints for each $l$.

There is one important aspect of the regression method which is often overlooked. It is usually assumed that running the model for data is much more expensive than solving the minimization problem for (10). In fact, this depends on $N$ (the size of the importance set). Let us consider, for example, the third-order PC expansion, $p = 3$. In this case, computing $H(\xi^l)\, b$ for each $l$ requires $N + N(N+1) + N(N+1)(N+2)/2$ multiplications. However, taking into account the sparsity of the derivatives of $H$, computing $\partial H(\xi^l)/\partial \xi_i \cdot b_1$ requires $N + N(N+1)$ multiplications, and computing $\partial^2 H(\xi^l)/(\partial \xi_i \partial \xi_j) \cdot b_2$ – just $N$ multiplications, for each $l$. Therefore, each equation from (14) and (18) is about $N$ and $N^2$ times less expensive than the one from (11), respectively. So, the question here is what is more expensive: producing the 'data', i.e. the sample of $\partial \varepsilon^*/\partial \xi$, $\partial^2 \varepsilon^*/\partial \xi^2$, or resolving the subsequent minimization problem. Let us note that for distributed systems the size of the importance set $N$ could still be quite large.

While the Hessian-based enhancement of the regression methods can be useful in certain circumstances, one needs more efficient tools allowing for larger $N$. Below we present an alternative method in which the derivatives are used, but minimization is not required. This method, designed for the third-order PC expansion, i.e. $p \le 3$, is related to the quasi-regression or projection methods [4,3,17,18]. The key elements of this method are the 2D and 3D filtering algorithms which allow the sampling error to be efficiently removed.
3. Explicit expressions for the PC coefficients
Let us denote by $E[\cdot]$ the expectation with respect to the standard normal distribution $N(0, 1)$. The PC expansion coefficients $b$ can be expressed via expectations of $\varepsilon(\xi)$ and its derivatives. For the high-dimensional distributed models in mind we limit ourselves by considering the third-order polynomials. The main results are summarized in the following theorem.

Theorem 1. For the coefficients $b_0$, $\{b_i\}$, $\{b_{ij}\}$, $\{b_{ijk}\}$, $i = 1, \ldots, N$, $1 \le j \le i$, $1 \le k \le j$, of the expansion (7), the following expressions hold:
$$b_0 = E[\varepsilon(\xi)], \qquad (19)$$
$$\beta_i b_i = E[H_1(\xi_i)\, \varepsilon(\xi)] = E\!\left[ H_0\, \frac{\partial \varepsilon(\xi)}{\partial \xi_i} \right], \qquad (20)$$
$$\beta_{ij} b_{ij} = E[H_2(\xi_i, \xi_j)\, \varepsilon(\xi)] = E\!\left[ H_1(\xi_i)\, \frac{\partial \varepsilon(\xi)}{\partial \xi_j} \right] = E\!\left[ H_0\, \frac{\partial^2 \varepsilon(\xi)}{\partial \xi_i \partial \xi_j} \right], \qquad (21)$$
$$\beta_{ijk} b_{ijk} = E[H_3(\xi_i, \xi_j, \xi_k)\, \varepsilon(\xi)] = E\!\left[ H_2(\xi_i, \xi_j)\, \frac{\partial \varepsilon(\xi)}{\partial \xi_k} \right] = E\!\left[ H_1(\xi_i)\, \frac{\partial^2 \varepsilon(\xi)}{\partial \xi_j \partial \xi_k} \right] = E\!\left[ H_0\, \frac{\partial^3 \varepsilon(\xi)}{\partial \xi_i \partial \xi_j \partial \xi_k} \right], \qquad (22)$$
where
$$\beta_i = E\left[ (H_1(\xi_i))^2 \right] = 1, \quad i = 1, \ldots, N, \qquad (23)$$
$$\beta_{ij} = E\left[ (H_2(\xi_i, \xi_j))^2 \right] = \begin{cases} 2! & i = j, \\ 1 & i \ne j, \; i > j, \end{cases} \qquad (24)$$
$$\beta_{ijk} = E\left[ (H_3(\xi_i, \xi_j, \xi_k))^2 \right] = \begin{cases} 3! & i = j = k, \\ 2! & i = j \ne k, \\ 2! & i \ne j = k, \\ 1 & i \ne j \ne k. \end{cases} \qquad (25)$$
The proof of this theorem is presented in Appendix 1. Due to the super-symmetry, the order of indexing in (21) and (22) can be changed. The expressions for $b$ via $\varepsilon(\xi)$ (first equalities in the sequences (20)–(22)) are known in the 'projection' or 'quasi-regression' methods. However, the expressions for $b$ via the derivatives of $\varepsilon(\xi)$ presented here have not been previously reported in the literature. These expressions can be used as a basis for explicit evaluation of the PC coefficients using the derivatives of $\varepsilon(\xi)$.

Proposition 1. One can prove by induction that
$$E\left[ H_p(\xi_{i_1}, \ldots, \xi_{i_p})\, \varepsilon(\xi) \right] = E\!\left[ H_{p-1}(\xi_{i_1}, \ldots, \xi_{i_{p-1}})\, \frac{\partial \varepsilon(\xi)}{\partial \xi_{i_p}} \right] = \ldots = E\!\left[ H_0\, \frac{\partial^p \varepsilon(\xi)}{\partial \xi_{i_1} \ldots \partial \xi_{i_p}} \right]. \qquad (26)$$
This implies that the chains of equalities (21) and (22) can be generalized to the polynomials of arbitrary degree $p$, with corresponding weights $\beta$. Let us note, however, that for high-dimensional distributed models only computing the first two derivatives (gradient and Hessian) may be feasible.

Corollary 1. The PC representation (7) can be re-written via the derivatives of the design function $\varepsilon(\xi)$ as follows:
$$\varepsilon(\xi) = E[\varepsilon(\xi)] + \sum_{i_1=1}^{N} E\!\left[ \frac{\partial \varepsilon(\xi)}{\partial \xi_{i_1}} \right] H_1(\xi_{i_1}) + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \frac{1}{\beta_{i_1 i_2}}\, E\!\left[ H_1(\xi_{i_1})\, \frac{\partial \varepsilon(\xi)}{\partial \xi_{i_2}} \right] H_2(\xi_{i_1}, \xi_{i_2}) + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} \frac{1}{\beta_{i_1 i_2 i_3}}\, E\!\left[ H_2(\xi_{i_1}, \xi_{i_2})\, \frac{\partial \varepsilon(\xi)}{\partial \xi_{i_3}} \right] H_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) + \ldots, \qquad (27)$$
or
$$\varepsilon(\xi) = E[\varepsilon(\xi)] + \sum_{i_1=1}^{N} E\!\left[ \frac{\partial \varepsilon(\xi)}{\partial \xi_{i_1}} \right] H_1(\xi_{i_1}) + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \frac{1}{\beta_{i_1 i_2}}\, E\!\left[ \frac{\partial^2 \varepsilon(\xi)}{\partial \xi_{i_1} \partial \xi_{i_2}} \right] H_2(\xi_{i_1}, \xi_{i_2}) + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} \frac{1}{\beta_{i_1 i_2 i_3}}\, E\!\left[ H_1(\xi_{i_1})\, \frac{\partial^2 \varepsilon(\xi)}{\partial \xi_{i_2} \partial \xi_{i_3}} \right] H_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) + \ldots \qquad (28)$$
These expressions provide different forms of what can be referred to as the probabilistic Taylor expansion.
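A practical consequence of Theorem 1 is that the derivative forms of (20) have a much smaller per-sample variance than the classical projection form, so they converge faster in a Monte Carlo estimate. A sketch with an invented two-input design function (illustrative only, not from the paper):

```python
import numpy as np

# Toy 2-input design function with known first-order coefficient b1 = 0.7:
#   eps(xi) = 0.7*xi1 + 0.2*xi2 + 0.4*xi1*xi2
def eps(xi1, xi2):
    return 0.7*xi1 + 0.2*xi2 + 0.4*xi1*xi2

def deps_dxi1(xi1, xi2):
    return 0.7 + 0.4*xi2

rng = np.random.default_rng(2)
L = 2000
xi1, xi2 = rng.standard_normal(L), rng.standard_normal(L)

# Classical projection estimate, Eq. (20), first equality: b1 = E[H1(xi1)*eps]
b1_proj = np.mean(xi1 * eps(xi1, xi2))

# Derivative-enhanced estimate, Eq. (20), second equality: b1 = E[d eps/d xi1]
b1_grad = np.mean(deps_dxi1(xi1, xi2))

# The per-sample variance of the gradient-based integrand is much smaller,
# so b1_grad converges faster for the same sample size L.
print(b1_proj, b1_grad)
print(np.var(xi1 * eps(xi1, xi2)), np.var(deps_dxi1(xi1, xi2)))
```

For this toy function the integrand variance drops by roughly an order of magnitude when the gradient form is used, which mirrors the convergence acceleration claimed in the Introduction.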
4. Global sensitivity analysis
Global sensitivity analysis (GSA) aims to quantify the relative impact of the inputs $\xi$ on the design function $\varepsilon$. The commonly used GSA method is the variance-based method of Sobol' sensitivity indices [41], also known as the ANOVA decomposition. In this paper we limit ourselves by considering the first-order 'main effect' indices $S_i$ and the 'total effect' indices $S_i^T$, defined in [37] as follows:
$$S_i = \frac{Var(E[\varepsilon(\xi)|\xi_i])}{Var(\varepsilon(\xi))}, \qquad (29)$$
$$S_i^T = \frac{Var(\varepsilon(\xi)) - Var(E[\varepsilon(\xi)|\xi_{-i}])}{Var(\varepsilon(\xi))}, \qquad (30)$$
where $\varepsilon(\xi)|\xi_i$ means the random value of $\varepsilon$ with the input component $\xi_i$ being fixed at its 'generic' value, $\xi_{-i}$ stands for 'all but $i$', and $Var(\varepsilon(\xi))$ is the total variance of $\varepsilon(\xi)$.

The unique relationship between the global sensitivity indices (GSI) and the PC coefficients has been derived in [6]. For the PC expansion in the form (7), which is different from the one used in [6], we can write (see Appendix 3 for detail):
$$S_i\, Var(\varepsilon(\xi)) = \beta_i b_i^2 + \beta_{ii} b_{ii}^2 + \beta_{iii} b_{iii}^2 + \ldots, \qquad (31)$$
$$S_i^T\, Var(\varepsilon(\xi)) = \beta_i b_i^2 + \sum_{j=1}^{i} \beta_{ij} b_{ij}^2 + \sum_{j=1}^{i} \sum_{k=1}^{j} \beta_{ijk} b_{ijk}^2 + \ldots \qquad (32)$$
The derivative-based global sensitivity measures have been introduced in [42]. It has been proved that for $\xi$ being a vector of i.i.d. normal random variables $\xi_i \sim N(0, 1)$:
$$S_i^T\, Var(\varepsilon(\xi)) \le E\!\left[ \left( \frac{\partial \varepsilon(\xi)}{\partial \xi_i} \right)^2 \right]. \qquad (33)$$
The above formula gives a convenient tool for computing the upper bound for $S_i^T$, to be used as a basis for ranking the inputs and, subsequently, for defining the importance set. If the final goal is to construct the probability density function (pdf) of $\varepsilon(\xi)$, using the upper bound guarantees that all important variables are accounted for; however, the corresponding importance set may contain some less important variables.

The upper-bound estimate can be improved if some of the PC coefficients are known. This follows from the theorem below.

Theorem 2. The expectation of the squared gradient is given by the series
$$E\!\left[ \left( \frac{\partial \varepsilon(\xi)}{\partial \xi_i} \right)^2 \right] = \tilde\beta_i b_i^2 + \sum_{j=1}^{i} \tilde\beta_{ij} b_{ij}^2 + \sum_{j=1}^{i} \sum_{k=1}^{j} \tilde\beta_{ijk} b_{ijk}^2 + \ldots, \qquad (34)$$
where
$$\tilde\beta_i = 1, \quad i = 1, \ldots, N, \qquad (35)$$
$$\tilde\beta_{ij} = \begin{cases} 2 \times 2! & i = j, \\ 1 & i \ne j, \; i > j, \end{cases} \qquad (36)$$
$$\tilde\beta_{ijk} = \begin{cases} 3 \times 3! & i = j = k, \\ 2 \times 2! & i = j \ne k, \\ 1 \times 2! & i \ne j = k, \\ 1 & i \ne j \ne k. \end{cases} \qquad (37)$$
The proof of this theorem is presented in Appendix 2. The theorem essentially replicates some of the results from [44], but represents them in a more transparent and convenient form.

Corollary 2. For any given order of polynomials, the improved derivative-based upper bound for the 'total effect' GSI is given by the expression
$$S_i^T\, Var(\varepsilon(\xi)) \le E\!\left[ \left( \frac{\partial \varepsilon(\xi)}{\partial \xi_i} \right)^2 \right] - \sum_{j=1}^{i} \frac{\tilde\beta_{ij} - \beta_{ij}}{\beta_{ij}^2}\, E^2\!\left[ H_1(\xi_i)\, \frac{\partial \varepsilon(\xi)}{\partial \xi_j} \right] - \sum_{j=1}^{i} \sum_{k=1}^{j} \frac{\tilde\beta_{ijk} - \beta_{ijk}}{\beta_{ijk}^2}\, E^2\!\left[ H_2(\xi_i, \xi_j)\, \frac{\partial \varepsilon(\xi)}{\partial \xi_k} \right] - \ldots \qquad (38)$$
The proof of Corollary 2 is fairly trivial. Comparing (32) and (34) we notice that
$$E\!\left[ \left( \frac{\partial \varepsilon(\xi)}{\partial \xi_i} \right)^2 \right] - S_i^T\, Var(\varepsilon(\xi)) \ge \sum_{j=1}^{i} (\tilde\beta_{ij} - \beta_{ij})\, b_{ij}^2 + \sum_{j=1}^{i} \sum_{k=1}^{j} (\tilde\beta_{ijk} - \beta_{ijk})\, b_{ijk}^2 + \ldots$$
For the infinite series the above expression becomes an equality. Since $\tilde\beta \ge \beta$, all terms in the series to the right are non-negative. Therefore, the more terms in the series are truncated, the stronger the inequality becomes. Formula (33) corresponds to the case when all terms are truncated. That is why (38) must (at least in theory) provide a less overestimated upper bound for $S_i^T$ than (33). Taking into account the expressions for $b_{ij}, b_{ijk}, \ldots$ from (20)–(22), we obtain (38). It can be easily seen that, for the truncated series, formula (32) gives a lower bound for $S_i^T$, which may not be a reliable basis for defining the importance set.

In practice, given a sample of functions and gradients $\{\varepsilon^*(\xi^l), \partial \varepsilon^*(\xi^l)/\partial \xi\}$, $l = 1, \ldots, L$, instead of (38) one would use
$$S_i^T\, Var(\varepsilon(\xi)) < L^{-1} \sum_{l=1}^{L} \left( \frac{\partial \varepsilon^*(\xi^l)}{\partial \xi_i} \right)^2 - \sum_{j=1}^{i} (\tilde\beta_{ij} - \beta_{ij})\, b_{ij}^2 - \sum_{j=1}^{i} \sum_{k=1}^{j} (\tilde\beta_{ijk} - \beta_{ijk})\, b_{ijk}^2 - \ldots, \qquad (39)$$
where
$$Var(\varepsilon(\xi)) = \frac{1}{L - 1} \sum_{l=1}^{L} \left( \varepsilon^*(\xi^l) - b_0 \right)^2 \qquad (40)$$
is the sample variance and $b_{ij}, b_{ijk}, \ldots$ are the estimated PC expansion coefficients based on this sample. Of course, for small samples there is no guarantee that the upper bound evaluated from (39) is better than the one without the correction terms.

Let us finally note (considering (20) and (31)) that
$$S_i\, Var(\varepsilon(\xi)) \ge E^2\!\left[ \frac{\partial \varepsilon(\xi)}{\partial \xi_i} \right], \qquad (41)$$
i.e. the squared expectation of the gradient gives a lower bound for the first-order 'main effect' indices.
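The ranking workflow implied by the bound (33) can be sketched as follows: estimate the per-input mean squared gradient from samples, rank the inputs by it, and truncate at a cumulative threshold. Both the toy function and the threshold value below are illustrative; in the paper the gradient samples would come from adjoint runs.

```python
import numpy as np

# Gradient of a toy 3-input design function
#   eps = 0.9*xi0 + 0.1*xi1 + 0.5*xi0*xi2   (illustrative, not from the paper)
def eps_grad(xi):
    g = np.zeros_like(xi)
    g[:, 0] = 0.9 + 0.5 * xi[:, 2]
    g[:, 1] = 0.1
    g[:, 2] = 0.5 * xi[:, 0]
    return g

rng = np.random.default_rng(4)
grads = eps_grad(rng.standard_normal((4000, 3)))

# Sample version of Eq. (33): one upper bound per input for S_i^T * Var(eps).
bounds = np.mean(grads**2, axis=0)
ranking = np.argsort(bounds)[::-1]          # the ranking map i = I(i')

# Keep the smallest N' whose cumulative share reaches a given threshold.
cum = np.cumsum(bounds[ranking]) / np.sum(bounds)
N_prime = int(np.searchsorted(cum, 0.95) + 1)
importance_set = ranking[:N_prime]
print(np.round(bounds, 2), ranking, importance_set)
```

Here the interaction term inflates the bound for the first input, as expected: the bound is conservative, so the importance set may admit some mildly influential inputs, but it never drops an important one.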
As discussed above, given the sample of gradients one can evaluate the upper bound for the 'total effect' sensitivity indices $S_i^T$, $i = 1, \ldots, N$, by (33). Then, the inputs $U$ can be ranked accordingly, which means that the mapping $i = I(i')$ between the original and the ranked sequences of elements in $U$, as well as its inverse $i' = I^{-1}(i)$, are defined. The importance set can be chosen by considering $N' < N$ elements (having the largest $S_i^T$), where $N'$ satisfies the condition
$$\sum_{i'=1}^{N'} S_{I(i')}^T \bigg/ \sum_{i=1}^{N} S_i^T \approx \epsilon,$$
where $\epsilon$ is a given threshold. Below we deal with the importance set of size $N'$, which means that all indexes involved are $i'$. For the sake of simplicity, however, we keep the original notations, i.e. we use $N$ instead of $N'$, $i$ instead of $i'$, etc.

5. Finite sample estimates for the PC coefficients

5.1. Zero- and first-order coefficients
It has been already mentioned that the explicit ('projection' or 'quasi-regression') methods for estimating the PC coefficients have been previously reported, though without the derivatives of the design function being used. These methods are useful if computing a sufficiently large sample of model responses is feasible.

Note that equations (19)–(22) represent the statistical moments of a multivariate random function $\varepsilon^*(\xi)$ and its derivatives. In theory, the uncertainty (variance) in the sample mean, variance, skewness and kurtosis is proportional to $1/L$, $2/L$, $6/L$ and $24/L$, respectively, where $L$ is the sample size [23]. These values are significantly larger in practice. Let us assume that the following estimates of the PC coefficients have a sufficient level of statistical conditioning with $L = L^*$:
$$b_0 = L^{-1} \sum_{l=1}^{L} \varepsilon^*(\xi^l), \quad b_i = L^{-1} \sum_{l=1}^{L} \frac{\partial \varepsilon^*(\xi^l)}{\partial \xi_i}, \quad b_{ij} = \beta_{ij}^{-1} L^{-1} \sum_{l=1}^{L} \frac{\partial^2 \varepsilon^*(\xi^l)}{\partial \xi_i \partial \xi_j}. \qquad (42)$$
For the estimates
$$b_i = L^{-1} \sum_{l=1}^{L} \xi_i^l\, \varepsilon^*(\xi^l), \quad b_{ij} = \beta_{ij}^{-1} L^{-1} \sum_{l=1}^{L} \xi_i^l\, \frac{\partial \varepsilon^*(\xi^l)}{\partial \xi_j}, \qquad (43)$$
the equivalent level of statistical conditioning can be achieved with the sample size $L > 2 \times L^*$, and for
$$b_{ij} = \beta_{ij}^{-1} L^{-1} \sum_{l=1}^{L} (\xi_i^l \xi_j^l - \delta_{ij})\, \varepsilon^*(\xi^l) \qquad (44)$$
– with the sample size $L > 6 \times L^*$.

Since the computational cost of the function and of its gradient are often comparable, computing the sample of the gradients of size $L^*$ is feasible. Therefore, the second formula in (42) can be used for evaluating $b_i$. However, the computational cost of the Hessian is significantly (50–100 times) larger, which means that the sample of Hessians of size $L^*$ may not be available. Therefore, $b_{ij}$ have to be computed using the second formula from (43). This requires either a larger sample size to be consistent with the estimates of $b_0$ and $b_i$, or a special treatment to reduce the sampling error. The filtering procedures for computing $b_{ij}$ and $b_{ijk}$ are introduced below.
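As a sanity check of the gradient-based estimate of b_ij (the middle formula of (43)), the sketch below recovers the single second-order coefficient of a toy bilinear design function; the function itself is invented for illustration, not taken from the paper.

```python
import numpy as np

# Toy design function: eps = 0.4 * xi1 * xi2, so the exact b_12 = 0.4
# (H2(xi1, xi2) = xi1*xi2 for i != j, with beta_12 = 1, Eq. (24)).
rng = np.random.default_rng(3)
L = 5000
xi = rng.standard_normal((L, 2))

# Gradient component d eps / d xi2 = 0.4 * xi1 at each sample point;
# in the paper this would come from an adjoint run per sample.
deps_dxi2 = 0.4 * xi[:, 0]

# Second formula of Eq. (43): b_ij = beta_ij^{-1} L^{-1} sum_l xi_i^l * d eps/d xi_j.
b12_hat = np.mean(xi[:, 0] * deps_dxi2)

print(b12_hat)
```

In higher dimensions this same estimator fills the matrix A of (45), whose residual asymmetry at finite L is exactly what the filtering of §5.2 removes.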
5.2. Second-order coefficients with filtering
The second-order coefficients $b_{ij}$ can be computed via the gradient of $\varepsilon(\xi)$ using the second formula in (43). Let $A$ be an $N \times N$ matrix with the elements
$$A_{i,j} = L^{-1} \sum_{l=1}^{L} \xi_i^l\, \frac{\partial \varepsilon^*(\xi^l)}{\partial \xi_j}, \qquad (45)$$
where $N$ is the importance set size. According to the last equality in (21), matrix $A$ must be symmetric in the limit $L \to \infty$. For a finite $L$, however, it is not symmetric due to the sampling error, see Fig. 3 (left). From this figure one can also notice that the presented 2D field is smooth in the horizontal direction (index $j$), and oscillatory in the vertical direction (index $i$). Indeed, as it follows from (45), each row of $A$ is a transposed weighted average of the gradient. The latter often is a sufficiently smooth function of the spatial coordinates (for distributed parameter systems).

The purpose of filtering is to retrieve the symmetric part from a noisy 2D 'image'. Let us define the matrix–vector product $w = A^T v$, where the summation is performed along the oscillatory direction $i$, as follows:
$$w_j = \sum_{i=1}^{N} A_{i,j} v_i = L^{-1} \sum_{l=1}^{L} \frac{\partial \varepsilon^*(\xi^l)}{\partial \xi_j} \sum_{i=1}^{N} \xi_i^l v_i, \quad j = 1, \ldots, N. \qquad (46)$$
This matrix–vector product is used in the Lanczos algorithm to evaluate a number $S_1$ of eigenpairs (the largest magnitude eigenvalues $\lambda_s$ and the corresponding eigenvectors $U_s$) of matrix $A$. Then, the limited-memory symmetric representation of $A$ is given by the formula
$$\hat A = \sum_{s=1}^{S_1} \lambda_s U_s U_s^T, \qquad (47)$$
and for the second-order coefficients we finally obtain
$$b_{ij} = \beta_{ij}^{-1} \hat A_{ij} = \beta_{ij}^{-1} \sum_{s=1}^{S_1} \lambda_s U_{s,i} U_{s,j}. \qquad (48)$$
Two versions of this algorithm are possible. In the 'full-memory' version matrix $A$ is explicitly computed ($LN^2$ multiplications) and stored in memory ($N^2$ numbers) before the eigenvalue analysis step. In this case computing the matrix–vector