HAL Id: hal-02608782
https://hal.inrae.fr/hal-02608782
Submitted on 29 Jul 2021
Distributed under a Creative Commons Attribution 4.0 International License
On the use of derivatives in the polynomial chaos based global sensitivity and uncertainty analysis applied to the
distributed parameter models
Igor Gejadze, Pierre-Olivier Malaterre, Victor Shutyaev
To cite this version:
Igor Gejadze, Pierre-Olivier Malaterre, Victor Shutyaev. On the use of derivatives in the polynomial chaos based global sensitivity and uncertainty analysis applied to the distributed parameter models. Journal of Computational Physics, Elsevier, 2019, 381, pp. 218-245. doi:10.1016/j.jcp.2018.12.023. hal-02608782.
I. Gejadze a,∗, P.-O. Malaterre a, V. Shutyaev b

a UMR G-EAU, IRSTEA Centre de Montpellier, 361 Rue J.F. Breton, BP 5095, 34196 Montpellier, France
b Marchuk Institute of Numerical Mathematics, Russian Academy of Sciences (RAS), Federal State Budget Scientific Institution "Marine Hydrophysical Institute of RAS", Moscow Institute for Physics and Technology, 119333 Gubkina 8, Moscow, Russia
Sensitivity Analysis (SA) and Uncertainty Quantification (UQ) are important components of modern numerical analysis. However, solving the relevant tasks involving large-scale fluid flow models currently remains a serious challenge. The difficulties are associated with the computational cost of running such models and with the large dimensions of the discretized model input. The common approach is to construct a metamodel – an inexpensive approximation to the original model suitable for the Monte Carlo simulations.

The polynomial chaos (PC) expansion is the most popular approach for constructing metamodels. Some fluid flow models of interest are involved in the process of variational estimation/data assimilation. This implies that the tangent linear and adjoint counterparts of such models are available and, therefore, computing the gradient (first derivative) and the Hessian (second derivative) of a given function of the state variables is possible. New techniques for SA and UQ which benefit from using the derivatives are presented in this paper. The gradient-enhanced regression methods for computing the PC expansion have been developed recently. An intrinsic step of the regression method is a minimization process. It is often assumed that generating 'data' for the regression problem is significantly more expensive than solving the regression problem itself. This depends, however, on the size of the importance set, which is a subset of 'influential' inputs. A distinguishing feature of the distributed parameter models is that the number of such inputs could still be large, which means that solving the regression problem becomes increasingly expensive. In this paper we propose a derivative-enhanced projection method, where no minimization is required. The method is based on the explicit relationships between the PC coefficients and the derivatives, complemented with a relatively inexpensive filtering procedure. The method is currently limited to the PC expansion of the third order. Besides, we suggest an improved derivative-based global sensitivity measure. The numerical tests have been performed for the Burgers' model. The results confirm that the low-order PC expansion obtained by our method represents a useful tool for modeling the non-gaussian behavior of the chosen quantity of interest (QoI).
* Corresponding author.
E-mail addresses: igor.gejadze@irstea.fr (I. Gejadze), p-o.malaterre@irstea.fr (P.-O. Malaterre), victor.shutyaev@mail.ru (V. Shutyaev).
Introduction
Sensitivity Analysis (SA) and Uncertainty Quantification (UQ) are important components of modern numerical analysis [8]. In SA one studies the relative importance of different inputs on the model output, which is usually represented by the quantities of interest (QoI). The main purpose of SA is the design of optimal (and feasible) control, UQ and data assimilation (DA) algorithms. In UQ one focuses on quantifying the uncertainty in the QoI, given the uncertainty in the model inputs. The most straightforward UQ method is the direct Monte Carlo method, where the model is run in a loop with the perturbed inputs; then the sample of QoI is processed to obtain its statistical characterization (skill scores, for example). The basic Monte Carlo method has a number of more sophisticated variants, e.g. using the importance sampling [46], quasi-random sequences [43], etc. However, for computationally expensive fluid flow models such as those common in geophysics, these methods are not particularly suitable. While in the absence of a viable alternative the Monte Carlo method is presently used in weather and ocean forecasting (ensemble forecasting, [27]), the need for development of more efficient methods is obvious.
One possible approach is to use the 'local' sensitivities. In particular, the corresponding SA and UQ methods have been recently applied in the field of computational fluid dynamics [22,38,3,15]. Here, the adjoint (local) sensitivities are used to evaluate the variance of the uncertainty in QoI (or the covariance for a vector of QoI). The uncertainty in the input vector can be either prior or posterior, i.e. before and after DA, respectively. The posterior covariance of the input uncertainty is approximated by the inverse Hessian of the auxiliary data assimilation cost function [16]. The adjoint approach is computationally very efficient; however, its accuracy for nonlinear models depends on the validity of the tangent linear approximation for the error dynamics (case of weak nonlinearity and/or small uncertainties). This is a key assumption in the ensemble forecasting [27], since the method relies on computing the singular vectors of the propagator considered at the optimal solution point. Besides, in the case of strongly nonlinear dynamics, both the posterior input uncertainty and the uncertainty in QoI may have non-gaussian probability distributions, which means that knowing the variance/covariance is not sufficient. Thus, the global sensitivity based methods have to be considered.
An alternative option is to develop reduced-order dynamical models, either using the proper orthogonal decomposition (POD) [48,28] or the dynamic mode decomposition (DMD) [39,35]. In the POD method, the distributed state variables are represented by linear combinations of the spatial modes. These representations are substituted into the model equations and the Galerkin projection is applied. In this process the original model governed by the nonlinear partial differential equations is transformed into a system of nonlinear ordinary differential equations for the time-dependent weights associated with the POD modes. The resulting 'surrogate' model is much less expensive than the original one, and it is generally suitable for sampling. The method is intrusive in the sense that the surrogate model has to be actually created, which means an additional development effort. Nor is this always easy, despite several successful examples being reported.
In the DMD method, one looks for a (linear) Koopman operator (modes, eigenvalues and eigenfunctions), which defines the evolution model explicitly. Thus, the DMD method is non-intrusive, but relies on a linear approximation of the original propagator. It is not obvious that using this type of linearized model for UQ is better than just using the local sensitivities.
Another approach is to use a phenomenological model, i.e. a model which describes the relationship between the model inputs and outputs without attempting to explain why the variables interact the way they do. Such a model can be based on the polynomial chaos (PC) expansion (for a concise review of the PC methods the reader may refer to [26]).
In the intrusive version of the PC-based UQ technique, called the 'direct non-sampling method', see [30], the uncertainty-loaded model variables are expressed using the PC expansions. These are substituted into the model equations and the Galerkin projection is applied. Technically this approach is somewhat similar to the one used with POD; however, the resulting model has to be solved just once (no sampling required).
In our paper we focus on a non-intrusive approach, where the PC expansion describes the relationship between the model inputs and outputs directly. The task of constructing the continuous response surface, given a finite sample of model responses, is solved by two main approaches: regression and projection. In the regression methods one has to minimize an appropriate cost function. A few types of regression methods are used: the least-squares regression [5,11,21], l1-minimization [33], least angle regression [7], and adaptive methods [1]. In the projection methods one benefits from the orthogonality of the polynomial bases involved, which allows the PC expansion coefficients to be computed as multidimensional integrals [49,14,17,18]. In its original form, i.e. involving a full-tensor product quadrature, this method is not feasible for high-dimensional problems. The major upgrades of this method have been designed to improve the integration efficiency. They include, for example, the use of sparse grid sampling techniques [40], nested quadratures (Clenshaw–Curtis), quasi-random sequences [43] or Latin Hypercube sampling [9]. In all cases, however, the integrals over the model response function (QoI) are considered. As opposed to this classical approach, in the present paper the integrals over the derivatives of the model response function are considered, which gives some essential benefits.
Variational data assimilation (DA) is a method based on the optimal control theory [25,29]. Presently, this method is preferred for weather and ocean forecasting in major operational centers around the globe, particularly in the form of the incremental 4D-Var [13], and in the form of the ensemble 4D-Var [12]. In general, the variational DA method belongs to the family of variational estimation methods, which are widely used in a variety of applications, including geophysics, astrophysics, bio-medical, mechanical and aerospace engineering. One useful bonus associated with the numerical models involved in variational estimation is the availability of the adjoint model, which allows the gradient (first derivative) of the standard data mismatch cost function to be calculated in a single model run. This adjoint model can be easily adapted to compute the gradient of any other QoI defined as a function of the state variables. Moreover, computing the Hessian (second derivative) of QoI is also possible, though more expensive, with a modification of the existing adjoint model into the second-order adjoint model [47]. Thus, at least the first two derivatives can be available in the discussed framework for the specified group of models.
The idea of the gradient-enhanced methods for computing the PC expansion has been introduced and implemented almost simultaneously by several authors [2,34,32] in the regression framework. In this paper we propose a derivative-enhanced projection method. It has been mentioned that the major drawback of the projection methods is a slow convergence of the corresponding integrals. However, the use of the derivatives accelerates this convergence. For example, if the gradients are used, the convergence of the corresponding integrals for the first- and higher-order PC coefficients is accelerated by an order of magnitude. Besides, we develop a filtering procedure which allows the sampling error to be further reduced. Another benefit of using the gradients is an easy way of accessing the 'total effect' global sensitivity indices [42]. These are used for ranking and defining the importance set, i.e. a presumably small subset of the full input including the most influential components. The proposed method is currently limited to the case of the third-order PC expansion. For the distributed parameter systems a large dimension of the full model input is expected. Moreover, the dimension of the importance set can still be large, in which case considering the PC expansion of the fourth and higher orders does not look feasible anyway. While the method is devised for the gaussian inputs (i.e. using Hermite polynomials), it can be extended to different types of inputs using other polynomials of the Askey scheme.
The paper is organized as follows. In §1 we define QoI as a function of the model input vector (called the 'design function' hereafter), and its PC expansion using the 'probabilistic' Hermite polynomials. In §2 we describe the regression approach, its gradient-enhanced version and the proposed Hessian-enhanced version. The derivative-enhanced projection method introduced in this paper is based on the explicit expressions for the PC expansion coefficients via the derivatives of arbitrary order, presented in §3 (see Theorem 1, proof in Appendix 1). In §4 the relationship between the known derivative-based sensitivity measure [24] and the global sensitivity indices is analyzed (see Theorem 2, proof in Appendices 2 and 3), and an improved derivative-based sensitivity estimate is suggested (see Corollary 2). The implementation details, including the description of the filtering algorithms, are presented in §5. Taking into account that the PC coefficients are estimated and represented in the limited-memory form, the appropriate expressions for the metamodels are derived in §6. The proposed method is validated using the Burgers' model: the model equation, the adjoint equation and the Hessian-vector product operator are defined in §7. The description of numerical experiments and the results of simulations are presented in §8. The main outcomes of the paper are summarized in the Conclusions, and major mathematical proofs and derivations are collected in the Appendixes.
1. Problem setup
Let us consider the numerical model describing the behavior of a dynamical system in terms of its state variables $X \in \mathcal{X}$. Given a full set of the model inputs $U \in \mathcal{U}$, the model can be considered as a 'control-to-state' mapping $M: \mathcal{U} \to \mathcal{X}$:
$$X = M(U), \qquad (1)$$
where $\mathcal{X}$ and $\mathcal{U}$ are the state and control spaces, correspondingly.

For modeling the system behavior the true input set $\bar U$ must be specified. Under the 'perfect model' assumption the following can be postulated: $\bar X = M(\bar U)$. In reality, some components of $\bar U$ contain uncertainties $\bar\xi \in \mathcal{U}$. Thus, instead of $\bar U$ we use its best available approximation, either prior (background) or posterior (estimate):
$$U^* = \bar U + \bar\xi. \qquad (2)$$
Because of the presence of $\bar\xi$, the predicted state $X|U^*$, that is, $X$ evaluated (or conditioned) on $U^*$, also contains an error $\delta X = X|U^* - X|\bar U$.

In many practical situations the quantities of interest (QoI) are represented by a functional of the state $D(X) \in \mathcal{G}$, where $\mathcal{G}$ is the 'design' space and $D: \mathcal{X} \to \mathcal{G}$ is a linear or nonlinear mapping. Thus, one can define a generalized 'control-to-design' mapping $G: \mathcal{U} \to \mathcal{G}$, such that
$$D(M(U)) = G(U). \qquad (3)$$
The uncertainty $\bar\xi$ results in the error
$$\varepsilon = G(\bar U + \bar\xi) - G(\bar U). \qquad (4)$$
From now on we shall call $\varepsilon$ the design function. Quantification of $\varepsilon$ is an important task in a variety of applications.

Below we suppose that the input space is finite-dimensional, i.e. $\mathcal{U} = \mathbb{R}^N$, and $\bar\xi = (\bar\xi_1, \ldots, \bar\xi_N)^T$. Consider $\bar\xi$ in the form
$$\bar\xi = R(\xi), \qquad (5)$$
where $\xi = (\xi_1, \ldots, \xi_N)^T$ with $\xi_i \sim N(0, 1)$ being i.i.d. gaussian random variables with unit variance, and $R$ is a transformation from the standard normal to a desired probability distribution. For example, if $R(\xi) = B^{1/2}\xi$, then one obtains $\bar\xi \sim N(0, B)$; if $R(\xi) = \mu + \sigma \exp(\xi)$, one gets the log-normally distributed $\bar\xi$. Thus, the design function can finally be defined in the form
$$\varepsilon(\xi) = G(\bar U + R(\xi)) - G(\bar U). \qquad (6)$$

Let us consider the polynomial expansion of $\varepsilon(\xi)$ using 'probabilistic' Hermite polynomials:
$$\varepsilon(\xi) = b_0 H_0 + \sum_{i_1=1}^{N} b_{i_1} H_1(\xi_{i_1}) + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} b_{i_1 i_2} H_2(\xi_{i_1}, \xi_{i_2}) + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} b_{i_1 i_2 i_3} H_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) + \ldots + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \ldots \sum_{i_p=1}^{i_{p-1}} b_{i_1 i_2 \ldots i_p} H_p(\xi_{i_1}, \xi_{i_2}, \ldots, \xi_{i_p}), \qquad (7)$$
where
$$H_0 = 1, \quad H_1(\xi_{i_1}) = \xi_{i_1}, \quad H_2(\xi_{i_1}, \xi_{i_2}) = \xi_{i_1}\xi_{i_2} - \delta_{i_1 i_2},$$
$$H_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) = \xi_{i_1}\xi_{i_2}\xi_{i_3} - \xi_{i_1}\delta_{i_2 i_3} - \xi_{i_2}\delta_{i_1 i_3} - \xi_{i_3}\delta_{i_1 i_2}, \;\ldots \qquad (8)$$
The convergence of the series (7) for the gaussian entries $\xi$ is proved in [10]. The total number of the coefficients $b = (b_0, \{b_{i_1}\}, \{b_{i_1 i_2}\}, \ldots, \{b_{i_1 \ldots i_p}\})^T$ is given by
$$M = \frac{(N + p)!}{N!\, p!}, \qquad (9)$$
where $p$ is the chosen order of polynomials. Representation (7) is often referred to as the polynomial chaos (PC) expansion.
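Formula (9) makes explicit why a full PC expansion quickly becomes infeasible as the input dimension grows. A minimal sketch of this count (the function name is ours, for illustration only):

```python
from math import comb

def num_pc_coefficients(N: int, p: int) -> int:
    """Total number of PC coefficients, Eq. (9): M = (N+p)! / (N! p!)."""
    return comb(N + p, p)

# Growth of M for a third-order expansion (p = 3) as the
# importance-set size N increases:
for N in (10, 50, 100, 500):
    print(N, num_pc_coefficients(N, 3))
```

Even for a modest importance set of a few hundred inputs, a third-order expansion already involves millions of coefficients, which motivates the cost discussion in §2.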
2. Implicit approach (least-squares, regression)
According to (9), computing the PC expansion is feasible only for small $N$ and $p$. Thus, one should define the inputs $U'$ (and, correspondingly, $\xi'$) having the most impact on $\varepsilon(\xi)$. These inputs constitute the importance set of size $N'$, which is expected to be much smaller than $N$. For the importance set, the vector of PC coefficients is $b'$. Below we deal with the importance set, keeping however the original notations (i.e. without $'$) for the sake of simplicity.

Let us represent (7) in the dot product form
$$\varepsilon(\xi) = H(\xi)\, b,$$
where $H(\xi)$ is a vector of weights (i.e. the values of the corresponding Hermite polynomials). In the least-squares regression method one looks for the minimum of the cost function
$$J = \Phi(b) + J_0, \qquad (10)$$
with
$$J_0 = \sum_{l=1}^{L} \left( \varepsilon^*(\xi^l) - H(\xi^l)\, b \right)^2, \qquad (11)$$
where $\varepsilon^*(\xi^l)$ is the computed value of the design function (6) at point $\xi^l$, and $\Phi(b)$ is a regularization term. The latter should be used if the system is under-determined and other types of regularization are not involved. In practice, in order to get a sensible estimate of $b$ one needs an over-determined system $L = kM$, where $k > 2$ at least. Let us mention that the least-squares method is not the only suitable option for finding vector $b$. For example, some authors prefer the l1-minimization (see [32,33] and the references presented there).

The idea of using the first derivative (gradient) of $\varepsilon(\xi)$ in the UQ context comes from the optimal control theory [29], where the concept of 'adjoint equations' is introduced. The adjoint model allows the gradient of any function of the state variables to be computed at a cost comparable to the cost of computing the function itself. The gradient of (7) is a vector of size $N$ with elements given by
$$\frac{\partial \varepsilon(\xi)}{\partial \xi_i} = \sum_{i_1=1}^{N} b_{i_1} \frac{\partial H_1(\xi_{i_1})}{\partial \xi_i} + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} b_{i_1 i_2} \frac{\partial H_2(\xi_{i_1}, \xi_{i_2})}{\partial \xi_i} + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} b_{i_1 i_2 i_3} \frac{\partial H_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3})}{\partial \xi_i} + \ldots, \quad i = 1, \ldots, N. \qquad (12)$$
The above expression can be re-written in the dot product form
$$\frac{\partial \varepsilon(\xi)}{\partial \xi_i} = \frac{\partial H(\xi)}{\partial \xi_i}\, b_1,$$
where $b_1 = (\{b_{i_1}\}, \{b_{i_1 i_2}\}, \ldots, \{b_{i_1 \ldots i_p}\})^T$ is a vector which includes all the PC coefficients except $b_0$. Now, the generalized cost function looks as follows:
$$J = \Phi(b) + J_0 + J_1, \qquad (13)$$
with
$$J_1 = \sum_{l=1}^{L_1} \sum_{i=1}^{N} \left( \frac{\partial \varepsilon^*(\xi^l)}{\partial \xi_i} - \frac{\partial H(\xi^l)}{\partial \xi_i}\, b_1 \right)^2, \qquad (14)$$
where $\partial \varepsilon^*(\xi^l)/\partial \xi$ is the gradient of the design function (6) computed at point $\xi^l$. Thus, one run of the adjoint model simultaneously provides 'data' for $N$ constraints. The earliest reports on the gradient-enhanced PC expansion method can be found in [2,34].

A natural extension to this method would be to consider the second derivative (Hessian) of (6). In particular, we know that the Hessian-vector product is given by the second-order adjoint model [47]. Using the Hessian-vector product, the eigenvalue decomposition $\{\lambda_s, U_s\}$, $s = 1, \ldots, S$, of the Hessian can be obtained using the Lanczos method. Then, the Hessian can be presented via its eigenpairs in the limited-memory form as follows:
$$\frac{\partial^2 \varepsilon(\xi)}{\partial \xi^2} = \sum_{s=1}^{S} \lambda_s U_s U_s^T. \qquad (15)$$
The second derivative of (7) is a symmetric $N \times N$ matrix with elements
$$\frac{\partial^2 \varepsilon(\xi)}{\partial \xi_i \partial \xi_j} = \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} b_{i_1 i_2} \frac{\partial^2 H_2(\xi_{i_1}, \xi_{i_2})}{\partial \xi_i \partial \xi_j} + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} b_{i_1 i_2 i_3} \frac{\partial^2 H_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3})}{\partial \xi_i \partial \xi_j} + \ldots \qquad (16)$$
The above expression can be re-written in the dot product form
$$\frac{\partial^2 \varepsilon(\xi)}{\partial \xi_i \partial \xi_j} = \frac{\partial^2 H(\xi)}{\partial \xi_i \partial \xi_j}\, b_2,$$
where $b_2 = (\{b_{i_1 i_2}\}, \ldots, \{b_{i_1 \ldots i_p}\})^T$. Now, the generalized cost function looks as follows:
$$J = \Phi(b) + J_0 + J_1 + J_2, \qquad (17)$$
with
$$J_2 = \sum_{l=1}^{L_2} \sum_{i=1}^{N} \sum_{j=1}^{i} \left( \frac{\partial^2 \varepsilon^*(\xi^l)}{\partial \xi_i \partial \xi_j} - \frac{\partial^2 H(\xi^l)}{\partial \xi_i \partial \xi_j}\, b_2 \right)^2, \qquad (18)$$
where $\partial^2 \varepsilon^*(\xi^l)/\partial \xi^2$ is the Hessian of (6) computed at point $\xi^l$ and presented in the form (15).

While the idea of using the Hessian computations in the context of the PC expansion looks attractive, its practicality depends on the circumstances. The Hessian is a far more expensive object to compute than the gradient. For each Hessian-vector product the tangent linear and adjoint models have to be run in a sequence (see §7.2). The number of Lanczos iterations depends on the eigenvalue spectrum of the Hessian. If this spectrum contains a well separated subset of 'leading' eigenvalues, the number of Lanczos iterations is nearly equivalent to the size of this subset; it could be significantly larger otherwise. On the other hand, one computed Hessian simultaneously adds $N(N+1)/2$ constraints for each $l$.

There is one important aspect of the regression method which is often overlooked. It is usually assumed that running the model for data is much more expensive than solving the minimization problem for (10). In fact, this depends on $N$ (the size of the importance set). Let us consider, for example, the third-order PC expansion, $p = 3$. In this case, computing $H(\xi^l)\, b$ for each $l$ requires $N + N(N+1) + N(N+1)(N+2)/2$ multiplications. However, taking into account the sparsity of the derivatives of $H$, computing $\partial H(\xi^l)/\partial \xi_i \cdot b_1$ requires $N + N(N+1)$ multiplications, and computing $\partial^2 H(\xi^l)/(\partial \xi_i \partial \xi_j) \cdot b_2$ – just $N$ multiplications, for each $l$. Therefore, each equation from (14) and (18) is about $N$ and $N^2$ times less expensive than the one from (11), respectively. So, the question here is what is more expensive: producing the 'data', i.e. the sample of $\partial \varepsilon^*/\partial \xi$, $\partial^2 \varepsilon^*/\partial \xi^2$, or resolving the subsequent minimization problem. Let us note that for distributed systems the size of the importance set $N$ could still be quite large.

While the Hessian-based enhancement of the regression methods can be useful in certain circumstances, one needs more efficient tools allowing for larger $N$. Below we present an alternative method in which the derivatives are used, but minimization is not required. This method, designed for the third-order PC expansion, i.e. $p \le 3$, is related to the quasi-regression or projection methods [4,3,17,18]. The key elements of this method are the 2D and 3D filtering algorithms which allow the sampling error to be efficiently removed.
3. Explicit expressions for the PC coefficients
Let us denote by $E[\cdot]$ the expectation with respect to the standard normal distribution $N(0, 1)$. The PC expansion coefficients $b$ can be expressed via expectations of $\varepsilon(\xi)$ and its derivatives. For the high-dimensional distributed models in mind we limit ourselves by considering the third-order polynomials. The main results are summarized in the following theorem.

Theorem 1. For the coefficients $b_0$, $\{b_i\}$, $\{b_{ij}\}$, $\{b_{ijk}\}$, $i = 1, \ldots, N$, $1 \le j \le i$, $1 \le k \le j$, of the expansion (7), the following expressions hold:
$$b_0 = E[\varepsilon(\xi)], \qquad (19)$$
$$\beta_i b_i = E[H_1(\xi_i)\, \varepsilon(\xi)] = E\!\left[ H_0\, \frac{\partial \varepsilon(\xi)}{\partial \xi_i} \right], \qquad (20)$$
$$\beta_{ij} b_{ij} = E[H_2(\xi_i, \xi_j)\, \varepsilon(\xi)] = E\!\left[ H_1(\xi_i)\, \frac{\partial \varepsilon(\xi)}{\partial \xi_j} \right] = E\!\left[ H_0\, \frac{\partial^2 \varepsilon(\xi)}{\partial \xi_i \partial \xi_j} \right], \qquad (21)$$
$$\beta_{ijk} b_{ijk} = E[H_3(\xi_i, \xi_j, \xi_k)\, \varepsilon(\xi)] = E\!\left[ H_2(\xi_i, \xi_j)\, \frac{\partial \varepsilon(\xi)}{\partial \xi_k} \right] = E\!\left[ H_1(\xi_i)\, \frac{\partial^2 \varepsilon(\xi)}{\partial \xi_j \partial \xi_k} \right] = E\!\left[ H_0\, \frac{\partial^3 \varepsilon(\xi)}{\partial \xi_i \partial \xi_j \partial \xi_k} \right], \qquad (22)$$
where
$$\beta_i = E\left[ (H_1(\xi_i))^2 \right] = 1, \quad i = 1, \ldots, N, \qquad (23)$$
$$\beta_{ij} = E\left[ (H_2(\xi_i, \xi_j))^2 \right] = \begin{cases} 2! & i = j, \\ 1 & i \ne j, \; i > j, \end{cases} \qquad (24)$$
$$\beta_{ijk} = E\left[ (H_3(\xi_i, \xi_j, \xi_k))^2 \right] = \begin{cases} 3! & i = j = k, \\ 2! & i = j \ne k, \\ 2! & i \ne j = k, \\ 1 & i \ne j \ne k. \end{cases} \qquad (25)$$
The proof of this theorem is presented in Appendix 1. Due to the super-symmetry, the order of indexing in (21) and (22) can be changed. The expressions for $b$ via $\varepsilon(\xi)$ (first equalities in the sequences (20)–(22)) are known in the 'projection' or 'quasi-regression' methods. However, the expressions for $b$ via the derivatives of $\varepsilon(\xi)$ presented here have not been previously reported in the literature. These expressions can be used as a basis for explicit evaluation of the PC coefficients using the derivatives of $\varepsilon(\xi)$.

Proposition 1. One can prove by induction that
$$E\left[ H_p(\xi_{i_1}, \ldots, \xi_{i_p})\, \varepsilon(\xi) \right] = E\!\left[ H_{p-1}(\xi_{i_1}, \ldots, \xi_{i_{p-1}})\, \frac{\partial \varepsilon(\xi)}{\partial \xi_{i_p}} \right] = \ldots = E\!\left[ H_0\, \frac{\partial^p \varepsilon(\xi)}{\partial \xi_{i_1} \ldots \partial \xi_{i_p}} \right]. \qquad (26)$$
This implies that the chains of equalities (21) and (22) can be generalized to the polynomials of arbitrary degree $p$, with corresponding weights $\beta$. Let us note, however, that for high-dimensional distributed models only computing the first two derivatives (gradient and Hessian) may be feasible.

Corollary 1. The PC representation (7) can be re-written via the derivatives of the design function $\varepsilon(\xi)$ as follows:
$$\varepsilon(\xi) = E[\varepsilon(\xi)] + \sum_{i_1=1}^{N} E\!\left[ \frac{\partial \varepsilon(\xi)}{\partial \xi_{i_1}} \right] H_1(\xi_{i_1}) + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \frac{1}{\beta_{i_1 i_2}}\, E\!\left[ H_1(\xi_{i_1})\, \frac{\partial \varepsilon(\xi)}{\partial \xi_{i_2}} \right] H_2(\xi_{i_1}, \xi_{i_2}) + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} \frac{1}{\beta_{i_1 i_2 i_3}}\, E\!\left[ H_2(\xi_{i_1}, \xi_{i_2})\, \frac{\partial \varepsilon(\xi)}{\partial \xi_{i_3}} \right] H_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) + \ldots, \qquad (27)$$
or
$$\varepsilon(\xi) = E[\varepsilon(\xi)] + \sum_{i_1=1}^{N} E\!\left[ \frac{\partial \varepsilon(\xi)}{\partial \xi_{i_1}} \right] H_1(\xi_{i_1}) + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \frac{1}{\beta_{i_1 i_2}}\, E\!\left[ \frac{\partial^2 \varepsilon(\xi)}{\partial \xi_{i_1} \partial \xi_{i_2}} \right] H_2(\xi_{i_1}, \xi_{i_2}) + \sum_{i_1=1}^{N} \sum_{i_2=1}^{i_1} \sum_{i_3=1}^{i_2} \frac{1}{\beta_{i_1 i_2 i_3}}\, E\!\left[ H_1(\xi_{i_1})\, \frac{\partial^2 \varepsilon(\xi)}{\partial \xi_{i_2} \partial \xi_{i_3}} \right] H_3(\xi_{i_1}, \xi_{i_2}, \xi_{i_3}) + \ldots \qquad (28)$$
These expressions provide different forms of what can be referred to as the probabilistic Taylor expansion.
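A practical consequence of Theorem 1 is that the derivative forms of (20) have a much smaller per-sample variance than the classical projection form, so they converge faster in a Monte Carlo estimate. A sketch with an invented two-input design function (illustrative only, not from the paper):

```python
import numpy as np

# Toy 2-input design function with known first-order coefficient b1 = 0.7:
#   eps(xi) = 0.7*xi1 + 0.2*xi2 + 0.4*xi1*xi2
def eps(xi1, xi2):
    return 0.7*xi1 + 0.2*xi2 + 0.4*xi1*xi2

def deps_dxi1(xi1, xi2):
    return 0.7 + 0.4*xi2

rng = np.random.default_rng(2)
L = 2000
xi1, xi2 = rng.standard_normal(L), rng.standard_normal(L)

# Classical projection estimate, Eq. (20), first equality: b1 = E[H1(xi1)*eps]
b1_proj = np.mean(xi1 * eps(xi1, xi2))

# Derivative-enhanced estimate, Eq. (20), second equality: b1 = E[d eps/d xi1]
b1_grad = np.mean(deps_dxi1(xi1, xi2))

# The per-sample variance of the gradient-based integrand is much smaller,
# so b1_grad converges faster for the same sample size L.
print(b1_proj, b1_grad)
print(np.var(xi1 * eps(xi1, xi2)), np.var(deps_dxi1(xi1, xi2)))
```

For this toy function the integrand variance drops by roughly an order of magnitude when the gradient form is used, which mirrors the convergence acceleration claimed in the Introduction.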
4. Global sensitivity analysis
Global sensitivity analysis (GSA) aims to quantify the relative impact of the inputs $\xi$ on the design function $\varepsilon$. The commonly used GSA method is the variance-based method of Sobol' sensitivity indices [41], also known as the ANOVA decomposition. In this paper we limit ourselves by considering the first-order 'main effect' indices $S_i$ and the 'total effect' indices $S_i^T$, defined in [37] as follows:
$$S_i = \frac{Var(E[\varepsilon(\xi)|\xi_i])}{Var(\varepsilon(\xi))}, \qquad (29)$$
$$S_i^T = \frac{Var(\varepsilon(\xi)) - Var(E[\varepsilon(\xi)|\xi_{-i}])}{Var(\varepsilon(\xi))}, \qquad (30)$$
where $\varepsilon(\xi)|\xi_i$ means the random value of $\varepsilon$ with the input component $\xi_i$ being fixed at its 'generic' value, $\xi_{-i}$ stands for 'all but $i$', and $Var(\varepsilon(\xi))$ is the total variance of $\varepsilon(\xi)$.

The unique relationship between the global sensitivity indices (GSI) and the PC coefficients has been derived in [6]. For the PC expansion in the form (7), which is different from the one used in [6], we can write (see Appendix 3 for detail):
$$S_i\, Var(\varepsilon(\xi)) = \beta_i b_i^2 + \beta_{ii} b_{ii}^2 + \beta_{iii} b_{iii}^2 + \ldots, \qquad (31)$$
$$S_i^T\, Var(\varepsilon(\xi)) = \beta_i b_i^2 + \sum_{j=1}^{i} \beta_{ij} b_{ij}^2 + \sum_{j=1}^{i} \sum_{k=1}^{j} \beta_{ijk} b_{ijk}^2 + \ldots \qquad (32)$$
The derivative-based global sensitivity measures have been introduced in [42]. It has been proved that for $\xi$ being a vector of i.i.d. normal random variables $\xi_i \sim N(0, 1)$:
$$S_i^T\, Var(\varepsilon(\xi)) \le E\!\left[ \left( \frac{\partial \varepsilon(\xi)}{\partial \xi_i} \right)^2 \right]. \qquad (33)$$
The above formula gives a convenient tool for computing the upper bound for $S_i^T$, to be used as a basis for ranking the inputs and, subsequently, for defining the importance set. If the final goal is to construct the probability density function (pdf) of $\varepsilon(\xi)$, using the upper bound guarantees that all important variables are accounted for; however, the corresponding importance set may contain some less important variables.

The upper-bound estimate can be improved if some of the PC coefficients are known. This follows from the theorem below.

Theorem 2. The expectation of the squared gradient is given by the series
$$E\!\left[ \left( \frac{\partial \varepsilon(\xi)}{\partial \xi_i} \right)^2 \right] = \tilde\beta_i b_i^2 + \sum_{j=1}^{i} \tilde\beta_{ij} b_{ij}^2 + \sum_{j=1}^{i} \sum_{k=1}^{j} \tilde\beta_{ijk} b_{ijk}^2 + \ldots, \qquad (34)$$
where
$$\tilde\beta_i = 1, \quad i = 1, \ldots, N, \qquad (35)$$
$$\tilde\beta_{ij} = \begin{cases} 2 \times 2! & i = j, \\ 1 & i \ne j, \; i > j, \end{cases} \qquad (36)$$
$$\tilde\beta_{ijk} = \begin{cases} 3 \times 3! & i = j = k, \\ 2 \times 2! & i = j \ne k, \\ 1 \times 2! & i \ne j = k, \\ 1 & i \ne j \ne k. \end{cases} \qquad (37)$$
The proof of this theorem is presented in Appendix 2. The theorem essentially replicates some of the results from [44], but represents them in a more transparent and convenient form.

Corollary 2. For any given order of polynomials, the improved derivative-based upper bound for the 'total effect' GSI is given by the expression
$$S_i^T\, Var(\varepsilon(\xi)) \le E\!\left[ \left( \frac{\partial \varepsilon(\xi)}{\partial \xi_i} \right)^2 \right] - \sum_{j=1}^{i} \frac{\tilde\beta_{ij} - \beta_{ij}}{\beta_{ij}^2}\, E^2\!\left[ H_1(\xi_i)\, \frac{\partial \varepsilon(\xi)}{\partial \xi_j} \right] - \sum_{j=1}^{i} \sum_{k=1}^{j} \frac{\tilde\beta_{ijk} - \beta_{ijk}}{\beta_{ijk}^2}\, E^2\!\left[ H_2(\xi_i, \xi_j)\, \frac{\partial \varepsilon(\xi)}{\partial \xi_k} \right] - \ldots \qquad (38)$$
The proof of Corollary 2 is fairly trivial. Comparing (32) and (34) we notice that
$$E\!\left[ \left( \frac{\partial \varepsilon(\xi)}{\partial \xi_i} \right)^2 \right] - S_i^T\, Var(\varepsilon(\xi)) \ge \sum_{j=1}^{i} (\tilde\beta_{ij} - \beta_{ij})\, b_{ij}^2 + \sum_{j=1}^{i} \sum_{k=1}^{j} (\tilde\beta_{ijk} - \beta_{ijk})\, b_{ijk}^2 + \ldots$$
For the infinite series the above expression becomes an equality. Since $\tilde\beta \ge \beta$, all terms in the series to the right are non-negative. Therefore, the more terms in the series are truncated, the stronger the inequality becomes. Formula (33) corresponds to the case when all terms are truncated. That is why (38) must (at least in theory) provide a less overestimated upper bound for $S_i^T$ than (33). Taking into account the expressions for $b_{ij}, b_{ijk}, \ldots$ from (20)–(22), we obtain (38). It can be easily seen that, for the truncated series, formula (32) gives a lower bound for $S_i^T$, which may not be a reliable basis for defining the importance set.

In practice, given a sample of functions and gradients $\{\varepsilon^*(\xi^l), \partial \varepsilon^*(\xi^l)/\partial \xi\}$, $l = 1, \ldots, L$, instead of (38) one would use
$$S_i^T\, Var(\varepsilon(\xi)) < L^{-1} \sum_{l=1}^{L} \left( \frac{\partial \varepsilon^*(\xi^l)}{\partial \xi_i} \right)^2 - \sum_{j=1}^{i} (\tilde\beta_{ij} - \beta_{ij})\, b_{ij}^2 - \sum_{j=1}^{i} \sum_{k=1}^{j} (\tilde\beta_{ijk} - \beta_{ijk})\, b_{ijk}^2 - \ldots, \qquad (39)$$
where
$$Var(\varepsilon(\xi)) = \frac{1}{L - 1} \sum_{l=1}^{L} \left( \varepsilon^*(\xi^l) - b_0 \right)^2 \qquad (40)$$
is the sample variance and $b_{ij}, b_{ijk}, \ldots$ are the estimated PC expansion coefficients based on this sample. Of course, for small samples there is no guarantee that the upper bound evaluated from (39) is better than the one without the correction terms.

Let us finally note (considering (20) and (31)) that
$$S_i\, Var(\varepsilon(\xi)) \ge E^2\!\left[ \frac{\partial \varepsilon(\xi)}{\partial \xi_i} \right], \qquad (41)$$
i.e. the squared expectation of the gradient gives a lower bound for the first-order 'main effect' indices.
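The ranking workflow implied by the bound (33) can be sketched as follows: estimate the per-input mean squared gradient from samples, rank the inputs by it, and truncate at a cumulative threshold. Both the toy function and the threshold value below are illustrative; in the paper the gradient samples would come from adjoint runs.

```python
import numpy as np

# Gradient of a toy 3-input design function
#   eps = 0.9*xi0 + 0.1*xi1 + 0.5*xi0*xi2   (illustrative, not from the paper)
def eps_grad(xi):
    g = np.zeros_like(xi)
    g[:, 0] = 0.9 + 0.5 * xi[:, 2]
    g[:, 1] = 0.1
    g[:, 2] = 0.5 * xi[:, 0]
    return g

rng = np.random.default_rng(4)
grads = eps_grad(rng.standard_normal((4000, 3)))

# Sample version of Eq. (33): one upper bound per input for S_i^T * Var(eps).
bounds = np.mean(grads**2, axis=0)
ranking = np.argsort(bounds)[::-1]          # the ranking map i = I(i')

# Keep the smallest N' whose cumulative share reaches a given threshold.
cum = np.cumsum(bounds[ranking]) / np.sum(bounds)
N_prime = int(np.searchsorted(cum, 0.95) + 1)
importance_set = ranking[:N_prime]
print(np.round(bounds, 2), ranking, importance_set)
```

Here the interaction term inflates the bound for the first input, as expected: the bound is conservative, so the importance set may admit some mildly influential inputs, but it never drops an important one.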
As discussed above, given the sample of gradients one can evaluate the upper bound for the 'total effect' sensitivity indices $S_i^T$, $i = 1, \ldots, N$, by (33). Then, the inputs $U$ can be ranked accordingly, which means that the mapping $i = I(i')$ between the original and the ranked sequences of elements in $U$, as well as its inverse $i' = I^{-1}(i)$, are defined. The importance set can be chosen by considering $N' < N$ elements (having the largest $S_i^T$), where $N'$ satisfies the condition
$$\sum_{i'=1}^{N'} S_{I(i')}^T \bigg/ \sum_{i=1}^{N} S_i^T \approx \epsilon,$$
where $\epsilon$ is a given threshold. Below we deal with the importance set of size $N'$, which means that all indexes involved are $i'$. For the sake of simplicity, however, we keep the original notations, i.e. we use $N$ instead of $N'$, $i$ instead of $i'$, etc.

5. Finite sample estimates for the PC coefficients

5.1. Zero- and first-order coefficients
It has been already mentioned that the explicit ('projection' or 'quasi-regression') methods for estimating the PC coefficients have been previously reported, though without the derivatives of the design function being used. These methods are useful if computing a sufficiently large sample of model responses is feasible.

Note that equations (19)–(22) represent the statistical moments of a multivariate random function $\varepsilon^*(\xi)$ and its derivatives. In theory, the uncertainty (variance) in the sample mean, variance, skewness and kurtosis is proportional to $1/L$, $2/L$, $6/L$ and $24/L$, respectively, where $L$ is the sample size [23]. These values are significantly larger in practice. Let us assume that the following estimates of the PC coefficients have a sufficient level of statistical conditioning with $L = L^*$:
$$b_0 = L^{-1} \sum_{l=1}^{L} \varepsilon^*(\xi^l), \quad b_i = L^{-1} \sum_{l=1}^{L} \frac{\partial \varepsilon^*(\xi^l)}{\partial \xi_i}, \quad b_{ij} = \beta_{ij}^{-1} L^{-1} \sum_{l=1}^{L} \frac{\partial^2 \varepsilon^*(\xi^l)}{\partial \xi_i \partial \xi_j}. \qquad (42)$$
For the estimates
$$b_i = L^{-1} \sum_{l=1}^{L} \xi_i^l\, \varepsilon^*(\xi^l), \quad b_{ij} = \beta_{ij}^{-1} L^{-1} \sum_{l=1}^{L} \xi_i^l\, \frac{\partial \varepsilon^*(\xi^l)}{\partial \xi_j}, \qquad (43)$$
the equivalent level of statistical conditioning can be achieved with the sample size $L > 2 \times L^*$, and for
$$b_{ij} = \beta_{ij}^{-1} L^{-1} \sum_{l=1}^{L} (\xi_i^l \xi_j^l - \delta_{ij})\, \varepsilon^*(\xi^l) \qquad (44)$$
– with the sample size $L > 6 \times L^*$.

Since the computational cost of the function and of its gradient are often comparable, computing the sample of the gradients of size $L^*$ is feasible. Therefore, the second formula in (42) can be used for evaluating $b_i$. However, the computational cost of the Hessian is significantly (50–100 times) larger, which means that the sample of Hessians of size $L^*$ may not be available. Therefore, $b_{ij}$ have to be computed using the second formula from (43). This requires either a larger sample size to be consistent with the estimates of $b_0$ and $b_i$, or a special treatment to reduce the sampling error. The filtering procedures for computing $b_{ij}$ and $b_{ijk}$ are introduced below.
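As a sanity check of the gradient-based estimate of b_ij (the middle formula of (43)), the sketch below recovers the single second-order coefficient of a toy bilinear design function; the function itself is invented for illustration, not taken from the paper.

```python
import numpy as np

# Toy design function: eps = 0.4 * xi1 * xi2, so the exact b_12 = 0.4
# (H2(xi1, xi2) = xi1*xi2 for i != j, with beta_12 = 1, Eq. (24)).
rng = np.random.default_rng(3)
L = 5000
xi = rng.standard_normal((L, 2))

# Gradient component d eps / d xi2 = 0.4 * xi1 at each sample point;
# in the paper this would come from an adjoint run per sample.
deps_dxi2 = 0.4 * xi[:, 0]

# Second formula of Eq. (43): b_ij = beta_ij^{-1} L^{-1} sum_l xi_i^l * d eps/d xi_j.
b12_hat = np.mean(xi[:, 0] * deps_dxi2)

print(b12_hat)
```

In higher dimensions this same estimator fills the matrix A of (45), whose residual asymmetry at finite L is exactly what the filtering of §5.2 removes.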
5.2. Second-order coefficients with filtering
The second-order coefficients $b_{ij}$ can be computed via the gradient of $\varepsilon(\xi)$ using the second formula in (43). Let $A$ be an $N \times N$ matrix with the elements
$$A_{i,j} = L^{-1} \sum_{l=1}^{L} \xi_i^l\, \frac{\partial \varepsilon^*(\xi^l)}{\partial \xi_j}, \qquad (45)$$
where $N$ is the importance set size. According to the last equality in (21), matrix $A$ must be symmetric in the limit $L \to \infty$. For a finite $L$, however, it is not symmetric due to the sampling error, see Fig. 3 (left). From this figure one can also notice that the presented 2D field is smooth in the horizontal direction (index $j$), and oscillatory in the vertical direction (index $i$). Indeed, as it follows from (45), each row of $A$ is a transposed weighted average of the gradient. The latter often is a sufficiently smooth function of the spatial coordinates (for distributed parameter systems).

The purpose of filtering is to retrieve the symmetric part from a noisy 2D 'image'. Let us define the matrix–vector product $w = A^T v$, where the summation is performed along the oscillatory direction $i$, as follows:
$$w_j = \sum_{i=1}^{N} A_{i,j} v_i = L^{-1} \sum_{l=1}^{L} \frac{\partial \varepsilon^*(\xi^l)}{\partial \xi_j} \sum_{i=1}^{N} \xi_i^l v_i, \quad j = 1, \ldots, N. \qquad (46)$$
This matrix–vector product is used in the Lanczos algorithm to evaluate a number $S_1$ of eigenpairs (the largest magnitude eigenvalues $\lambda_s$ and the corresponding eigenvectors $U_s$) of matrix $A$. Then, the limited-memory symmetric representation of $A$ is given by the formula
$$\hat A = \sum_{s=1}^{S_1} \lambda_s U_s U_s^T, \qquad (47)$$
and for the second-order coefficients we finally obtain
$$b_{ij} = \beta_{ij}^{-1} \hat A_{ij} = \beta_{ij}^{-1} \sum_{s=1}^{S_1} \lambda_s U_{s,i} U_{s,j}. \qquad (48)$$
Two versions of this algorithm are possible. In the 'full-memory' version matrix $A$ is explicitly computed ($LN^2$ multiplications) and stored in memory ($N^2$ numbers) before the eigenvalue analysis step. In this case computing the matrix–vector