HAL Id: hal-00770006
https://hal-upec-upem.archives-ouvertes.fr/hal-00770006
Submitted on 4 Jan 2013
Identification of polynomial chaos representations in high dimension from a set of realizations
Guillaume Perrin, Christian Soize, Denis Duhamel, Christine Fünfschilling
To cite this version:
Guillaume Perrin, Christian Soize, Denis Duhamel, Christine Fünfschilling. Identification of polynomial chaos representations in high dimension from a set of realizations. SIAM Journal on Scientific Computing, Society for Industrial and Applied Mathematics, 2012, 34 (6), pp.A2917-A2945. ⟨10.1137/11084950X⟩. ⟨hal-00770006⟩
G. PERRIN∗†‡, C. SOIZE∗, D. DUHAMEL†, AND C. FUNFSCHILLING‡
Abstract. This paper deals with the identification in high dimension of polynomial chaos expansion of random vectors from a set of realizations. Due to numerical and memory constraints, the usual polynomial chaos identification methods are based on a series of truncations that induces a numerical bias. This bias becomes very detrimental to the convergence analysis of polynomial chaos identification in high dimension. This paper therefore proposes a new formulation of the usual polynomial chaos identification algorithms to avoid this numerical bias. After a review of the polynomial chaos identification method, the influence of the numerical bias on the identification accuracy is quantified. The new formulation is then described in detail, and illustrated on two examples.

Key words. polynomial chaos expansion, high dimension, computation.

AMS subject classifications. 60H35, 60H15, 60H25, 60H40, 65C50
∗ Université Paris-Est, Modélisation et Simulation Multi-Échelle (MSME UMR 8208 CNRS), 5 Bd. Descartes, 77454 Marne-la-Vallée, France (christian.soize@univ-paris-est.fr).
† Université Paris-Est, Navier (ENPC-IFSTTAR-CNRS UMR 8205), École Nationale des Ponts et Chaussées, 6 et 8 Avenue Blaise Pascal, Cité Descartes, Champs sur Marne, 77455 Marne-la-Vallée, Cedex 2, France (denis.duhamel@enpc.fr).
‡ SNCF, Innovation and Research Department, Immeuble Lumière, 40 avenue des Terroirs de France, 75611, Paris, Cedex 12, France (guillaume.perrin@sncf.fr, christine.funfschilling@sncf.fr).

1. Introduction. In spite of always more accurate numerical solvers, deterministic models are not able to represent most of the experimental data, which are variable and often uncertain by nature. Hence, the application fields of nondeterministic modeling, which can take into account the variability of the model parameters as well as the model error uncertainties, have kept increasing. Uncertainties are therefore introduced in computational mechanical models with more and more degrees of freedom. In this context, the characterization of the probability distribution $P_\eta(dx)$ of an $N_\eta$-dimension random vector $\eta$ from sets of experimental measurements is bound to play a key role, in particular in high dimension, that is to say for a large value of $N_\eta$. In this work, it is assumed that $P_\eta(dx) = p_\eta(x)\,dx$, in which the probability density function (PDF) $p_\eta$ is a function in the set $\mathcal{F}(\mathcal{D}, \mathbb{R}^+)$ of all the positive-valued functions defined on any part $\mathcal{D}$ of $\mathbb{R}^{N_\eta}$ and for which the integral over $\mathcal{D}$ is 1.

Two kinds of methods can be used to build such a PDF: the direct and the indirect methods. Among the direct methods, the Prior Algebraic Stochastic Modeling (PASM) methods postulate an algebraic representation $\eta \approx t^{alg}(\Xi, w)$, with $t^{alg}$ a prior transformation, $\Xi$ a given random vector and $w$ a vector of parameters to identify. In the same category, the methods based on the Information Theory and the Maximum Entropy Principle (MEP) have been developed (see [13] and [27]) to compute $p_\eta$ from the only available information on the random vector $\eta$. This information can be seen as the admissible set $\mathcal{C}_{ad}$ for $p_\eta$:

$$\mathcal{C}_{ad} = \left\{ p_\eta \in \mathcal{F}(\mathcal{D}, \mathbb{R}^+) \;\middle|\; \int_{\mathcal{D}} p_\eta(x)\,dx = 1,\ \ \forall\, 1 \le m \le M,\ \int_{\mathcal{D}} g_m(x)\, p_\eta(x)\,dx = f_m \right\}, \tag{1.1}$$
where $\{ f_m,\ 1 \le m \le M \}$ gathers $M$ given vectors which are respectively associated with given vector-valued functions $\{ g_m,\ 1 \le m \le M \}$. Hence, the MEP allows building $p_\eta$ as the solution of the optimization problem:

$$p_\eta = \arg\max_{p_\eta \in \mathcal{C}_{ad}} \left\{ -\int_{\mathcal{D}} p_\eta(x) \log\left(p_\eta(x)\right) dx \right\}. \tag{1.2}$$

On the other hand, the indirect methods allow the construction of the PDF $p_\eta$ of the considered random vector $\eta$ from a transformation $t$ of a known random vector $\xi = (\xi_1, \ldots, \xi_{N_g})$ of given dimension $N_g \le N_\eta$:

$$\eta = t(\xi), \tag{1.3}$$

defining a transformation $T$ between $p_\eta$ and the PDF $p_\xi$ of $\xi$:

$$p_\eta = T(p_\xi). \tag{1.4}$$

The construction of the transformation $t$ is thus the key point of these indirect methods. In this context, the isoprobabilistic transformations such as the Nataf transformation (see [20]) or the Rosenblatt transformation (see [23]) have allowed the development of interesting results in the second part of the twentieth century but are still limited to very small dimension cases and not to the high dimension case considered in this work. Nowadays, the most popular indirect methods are the polynomial chaos expansion (PCE) methods, which were first introduced by Wiener [33] for stochastic processes, and pioneered by Ghanem and Spanos [10] [11] for their use in computational sciences. In the last decade, this very promising method has thus been applied in many works (see, for instance [1], [2], [3], [4], [5], [7], [8], [9], [12], [14], [15], [16], [19], [18], [17], [21], [22], [24], [26], [28], [31], [32], [25], [34]). The PCE is based on a direct projection of the random vector $\eta$ on a chosen hilbertian basis $\mathcal{B}_{orth} = \{ \psi_\alpha(\xi),\ \alpha \in \mathbb{N}^{N_g} \}$ of all the second-order random vectors with values in $\mathbb{R}^{N_\eta}$:

$$\eta = \sum_{\alpha \in \mathbb{N}^{N_g}} y^{(\alpha)} \psi_\alpha(\xi), \tag{1.5}$$

$$\xi \mapsto \psi_\alpha(\xi) = X_{\alpha_1}(\xi_1) \otimes \cdots \otimes X_{\alpha_{N_g}}(\xi_{N_g}), \tag{1.6}$$

where $x \mapsto X_{\alpha_\ell}(x)$ is the normalized polynomial basis of degree $\alpha_\ell$ associated with the PDF $p_{\xi_\ell}$ of the random variable $\xi_\ell$, and $\alpha$ is the multi-index of the multidimensional polynomial basis element $\psi_\alpha(\xi)$. Building the transformation $t$ therefore requires the construction of the projection vectors $\{ y^{(\alpha)},\ \alpha \in \mathbb{N}^{N_g} \}$.

The present work is devoted to the identification in high dimension of the PCE coefficients $\{ y^{(\alpha)},\ \alpha \in \mathbb{N}^{N_g} \}$, when the only available information on the random vector $\eta$ is a set of $\nu_{exp}$ independent realizations $\{ \eta^{(1)}, \cdots, \eta^{(\nu_{exp})} \}$.

In practice, the PCE of $\eta$ has first to be truncated:

$$\eta \approx \eta^{chaos}(N) = \sum_{\alpha \in \mathcal{A}_p} y^{(\alpha)} \psi_\alpha(\xi), \tag{1.7}$$

$$\mathcal{A}_p = \left\{ \alpha = (\alpha_1, \ldots, \alpha_{N_g}) \;\middle|\; |\alpha| = \sum_{\ell=1}^{N_g} \alpha_\ell \le p \right\} = \left\{ \alpha^{(1)}, \cdots, \alpha^{(N)} \right\}, \tag{1.8}$$

where $\eta^{chaos}(N)$ is the projection of $\eta$ on the $N$-dimension subspace spanned by $\{ \psi_\alpha(\xi),\ \alpha \in \mathcal{A}_p \} \subset \mathcal{B}_{orth}$. It can be noticed that $N$ increases very quickly with respect to the dimension $N_g$ of $\xi$ and the maximum degree $p$ of the truncated basis $\{ \psi_\alpha(\xi),\ \alpha \in \mathcal{A}_p \}$, as:

$$N = \frac{(N_g + p)!}{N_g!\; p!}. \tag{1.9}$$

Methods to perform the convergence analysis in high dimension with respect to a given error threshold on the PCE residue $\eta - \eta^{chaos}(N)$ are therefore of great concern to justify the truncation parameters $N_g$ and $p$.
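The growth of $N$ stated in Eq. (1.9) can be made concrete with a few lines of Python. The sketch below is illustrative only: the function names `basis_size` and `multi_indices` are hypothetical, and the brute-force enumeration of $\mathcal{A}_p$ is meant for small $N_g$ and $p$, not for the high-dimension case.

```python
from itertools import product
from math import comb


def basis_size(n_g, p):
    """Number N of multi-indices with |alpha| <= p, i.e. (N_g + p)! / (N_g! p!)."""
    return comb(n_g + p, p)


def multi_indices(n_g, p):
    """Enumerate A_p by brute force, ordered by total degree.

    The first element is the zero multi-index, so that psi_{alpha^(1)} = 1.
    """
    candidates = product(range(p + 1), repeat=n_g)
    kept = [alpha for alpha in candidates if sum(alpha) <= p]
    return sorted(kept, key=lambda alpha: (sum(alpha), alpha))


# N as a function of (N_g, p): the count explodes as both parameters grow.
for n_g, p in [(2, 3), (5, 5), (10, 5), (20, 8)]:
    print(n_g, p, basis_size(n_g, p))
```

For small cases the closed form and the enumeration agree, for instance `len(multi_indices(2, 3)) == basis_size(2, 3)`.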
In this prospect, the article [29] provides advanced algorithms to compute the PCE coefficients from the $\nu_{exp}$ independent realizations $\{ \eta^{(1)}, \cdots, \eta^{(\nu_{exp})} \}$ by focusing on the maximization of the likelihood. In particular, one of the key points of these algorithms is the calculation of the $(N \times \nu_{chaos})$ real matrix $[\Psi]$ of independent realizations of the truncated PCE basis $\{ \psi_\alpha(\xi),\ \alpha \in \mathcal{A}_p \}$:

$$[\Psi] = \left[ \Psi(\xi(\theta_1), p) \ \cdots \ \Psi(\xi(\theta_{\nu_{chaos}}), p) \right], \tag{1.10}$$

$$\Psi(\xi, p) = \left( \psi_{\alpha^{(1)}}(\xi_1, \cdots, \xi_{N_g}), \cdots, \psi_{\alpha^{(N)}}(\xi_1, \cdots, \xi_{N_g}) \right), \tag{1.11}$$

where the set $\{ \xi(\theta_1), \cdots, \xi(\theta_{\nu_{chaos}}) \}$ gathers $\nu_{chaos}$ independent realizations of the random vector $\xi$.

Recurrence formulas or algebraic explicit representations are generally used to compute such a matrix $[\Psi]$, which is supposed to verify the asymptotic property:

$$\lim_{\nu_{chaos} \to +\infty} \frac{1}{\nu_{chaos}} [\Psi][\Psi]^T = [I_N], \tag{1.12}$$

as a direct consequence of the orthonormality of the PCE basis $\{ \psi_\alpha,\ \alpha \in \mathcal{A}_p \}$, where $[I_N]$ is the $N$-dimension identity matrix.

However, for numerically admissible values of $\nu_{chaos}$ (between 1000 and 10000), it has been shown in [30] that the difference $\frac{1}{\nu_{chaos}} [\Psi][\Psi]^T - [I_N]$ can be very significant when high values of the maximum degree $p$ are encountered simultaneously with significant values of $N_g$. This difference induces a detrimental bias in the PCE identification. In [30], a method using singular matrix decomposition is therefore proposed to numerically adapt classical generations of $[\Psi]$, and make this difference zero for any values of $p$ and $N_g$. Nevertheless, this conditioning on $[\Psi]$ modifies the initial structure of $[\Psi]$, and makes the identified PCE coefficients $\{ y^{(\alpha)},\ \alpha \in \mathcal{A}_p \}$ impossible to reuse on another matrix $[\Psi^*]$ of $\nu_{chaos,*}$ new realizations of $\Psi(\xi, p)$.
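The finite-sample gap between $\frac{1}{\nu_{chaos}}[\Psi][\Psi]^T$ and $[I_N]$ can be observed numerically. The following is a minimal sketch under stated assumptions: the germ $\xi$ is taken as standard Gaussian, the basis is built from normalized probabilists' Hermite polynomials through the classical three-term recurrence, and the helper names (`hermite_1d`, `psi_matrix`, `gram_deviation`) are hypothetical. It illustrates the classical generation of $[\Psi]$, not the correction of [30] nor the new formulation of this paper.

```python
from itertools import product
from math import factorial, sqrt

import numpy as np


def hermite_1d(x, p):
    """Rows n = 0..p hold He_n(x) / sqrt(n!), orthonormal for a standard Gaussian."""
    he = np.ones((p + 1, x.size))
    if p >= 1:
        he[1] = x
    for n in range(1, p):
        he[n + 1] = x * he[n] - n * he[n - 1]  # He_{n+1} = x He_n - n He_{n-1}
    norms = np.array([sqrt(factorial(n)) for n in range(p + 1)])
    return he / norms[:, None]


def psi_matrix(xi, p):
    """Build the (N x nu_chaos) matrix [Psi] from germ samples xi of shape (N_g, nu_chaos)."""
    n_g, nu_chaos = xi.shape
    uni = [hermite_1d(xi[l], p) for l in range(n_g)]
    alphas = sorted(
        (a for a in product(range(p + 1), repeat=n_g) if sum(a) <= p),
        key=lambda a: (sum(a), a),
    )
    psi = np.ones((len(alphas), nu_chaos))
    for j, alpha in enumerate(alphas):
        for l, degree in enumerate(alpha):
            psi[j] *= uni[l][degree]  # tensor product of univariate polynomials
    return psi


def gram_deviation(psi):
    """Frobenius norm of (1/nu_chaos) [Psi][Psi]^T - [I_N]."""
    n, nu_chaos = psi.shape
    return np.linalg.norm(psi @ psi.T / nu_chaos - np.eye(n))


rng = np.random.default_rng(0)
for p in (2, 4, 6):
    psi = psi_matrix(rng.standard_normal((3, 2000)), p)
    print(p, psi.shape[0], gram_deviation(psi))
```

Running the loop for increasing $p$ at fixed $\nu_{chaos}$ gives a direct way to see how the deviation behaves as the degree grows.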
As an extension of the works described in [29] and [30], this article proposes an original decomposition of the PCE coefficients $\{ y^{(\alpha)},\ \alpha \in \mathcal{A}_p \}$ that reduces the numerical bias introduced during the identification by the finite dimension of $[\Psi]$ and for large values of the degree $p$. This new formulation is particularly adapted to the high dimension, and allows the identified coefficients to be reused for other matrices of realizations $[\Psi^*]$.

In Section 2, the PCE identification from a set of experimental data with an arbitrary measure is described. In particular, the role played by the matrix of independent realizations $[\Psi]$ is emphasized. Section 3 focuses on the convergence properties of this matrix $[\Psi]$ with respect to three statistical measures, and describes an innovative method to generate this matrix without using computational recurrence formulas or algebraic explicit representations. In Section 4, the new formulation of the PCE identification problem is given. Finally, two applications of the former method with a Gaussian measure are presented in Section 5.
2. PCE identification of random vectors from a set of independent realizations. In this section, a description of the PCE identification with respect to an arbitrary measure is given. The objective is to summarize the different key steps of the PCE identification method and the way they are practically implemented.

After having defined the theoretical frame of the PCE identification, the cost-function that leads to the computation of the PCE coefficients $\{ y^{(\alpha)},\ \alpha \in \mathcal{A}_p \}$ is presented, for given truncation parameters $N_g$ and $p$. At last, to justify the choice of these truncation parameters, a method to perform the convergence analysis is introduced.
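2.1. Theoretical frame. Let $(\Theta, \mathcal{T}, P)$ be a probability space. Let $L^2_P(\Theta, \mathbb{R}^{N_\eta})$ be the space of all the second-order $N_\eta$-dimension random vectors defined on $(\Theta, \mathcal{T}, P)$ with values in $\mathbb{R}^{N_\eta}$, equipped with the inner product $\langle \cdot, \cdot \rangle$:

$$\langle U, V \rangle = \int_\Theta U^T(\theta) V(\theta)\, dP(\theta) = E\left( U^T V \right), \quad \forall\, U, V \in L^2_P(\Theta, \mathbb{R}^{N_\eta}), \tag{2.1}$$

where $E(\cdot)$ is the mathematical expectation.

Let $\eta = (\eta_1, \cdots, \eta_{N_\eta})$ be an element of $L^2_P(\Theta, \mathbb{R}^{N_\eta})$. It is assumed that $\nu_{exp}$ independent realizations $\{ \eta^{(1)}, \cdots, \eta^{(\nu_{exp})} \}$ of $\eta$ are known and gathered in the $(N_\eta \times \nu_{exp})$ real matrix $[\eta^{exp}]$:

$$[\eta^{exp}] = \left[ \eta^{(1)} \ \cdots \ \eta^{(\nu_{exp})} \right]. \tag{2.2}$$

Equation (1.7) can be rewritten as:

$$\eta^{chaos}(N) = [y]\, \Psi(\xi, p), \tag{2.3}$$

$$[y] = \left[ y^{(\alpha^{(1)})} \ \cdots \ y^{(\alpha^{(N)})} \right]. \tag{2.4}$$

The orthonormality property of the projection basis $\{ \psi_\alpha(\xi),\ \alpha \in \mathcal{A}_p \}$ yields the condition:

$$E\left( \Psi(\xi, p)\, \Psi(\xi, p)^T \right) = [I_N]. \tag{2.5}$$

Since $\psi_{\alpha^{(1)}}(\xi) = 1$, the orthonormality gives $E(\psi_\alpha(\xi)) = E(\psi_\alpha(\xi)\, \psi_{\alpha^{(1)}}(\xi)) = 0$ for every $\alpha \ne \alpha^{(1)}$, and it can be seen that:

$$E\left( \eta^{chaos}(N) \right) = y^{(\alpha^{(1)})}. \tag{2.6}$$

Let $[R_\eta]$ and $[R^{chaos}_\eta(N)]$ be the autocorrelation matrices of the random vectors $\eta$ and $\eta^{chaos}(N)$:

$$[R_\eta] = E\left( \eta\, \eta^T \right), \tag{2.7}$$

$$[R^{chaos}_\eta(N)] = E\left( \eta^{chaos}(N)\, \eta^{chaos}(N)^T \right) = [y]\, E\left( \Psi(\xi, p)\, \Psi(\xi, p)^T \right) [y]^T = [y][y]^T. \tag{2.8}$$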
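Equations (2.6) and (2.8) can be checked by Monte Carlo on a toy case. The sketch below assumes a one-dimensional standard Gaussian germ ($N_g = 1$), a normalized Hermite basis of degree $p = 2$, and an arbitrary random coefficient matrix `y`; all of these choices are illustrative assumptions, not values from the paper.

```python
from math import factorial, sqrt

import numpy as np
from numpy.polynomial.hermite_e import hermevander

rng = np.random.default_rng(1)
p, n_eta, nu = 2, 2, 400_000

# Orthonormal basis samples: column n of hermevander is He_n(xi); divide by sqrt(n!).
xi = rng.standard_normal(nu)
norms = np.array([sqrt(factorial(n)) for n in range(p + 1)])
psi = (hermevander(xi, p) / norms).T          # shape (N, nu) with N = p + 1

# Arbitrary (N_eta x N) coefficient matrix and the realizations of [y] Psi.
y = rng.standard_normal((n_eta, p + 1))
eta_chaos = y @ psi

# Eq. (2.6): the mean is the first column of [y].
mean_gap = np.abs(eta_chaos.mean(axis=1) - y[:, 0]).max()
# Eq. (2.8): the autocorrelation matrix is [y][y]^T.
corr_gap = np.abs(eta_chaos @ eta_chaos.T / nu - y @ y.T).max()

print(mean_gap, corr_gap)
```

Both gaps are sampling errors only; they shrink as `nu` grows, in line with the expectation identities above.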
2.2. Identification of the polynomial chaos expansion coefficients. In this section, particular values of the truncation parameters $N_g$ and $p$ are considered. Let $\mathbb{M}_{N_\eta N}$ be the space of all the $(N_\eta \times N)$ real matrices. For a given value of $[y^*]$ in $\mathbb{M}_{N_\eta N}$, the random vector $U([y^*]) = [y^*]\, \Psi(\xi, p)$ is an $N_\eta$-dimension random vector, for which the autocorrelation is equal to $[y^*][y^*]^T$. Let $p_{U([y^*])}$ be its multidimensional PDF.

When the only available information on $\eta$ is a set of $\nu_{exp}$ independent realizations, the optimal coefficients matrix $[y]$ of its truncated PCE, $\eta^{chaos}(N) = [y]\, \Psi(\xi, p)$, can be seen as the argument which maximizes the log-likelihood $\mathcal{L}_{U([y^*])}([\eta^{exp}])$ of $U([y^*])$:

$$[y] = \arg\max_{[y^*] \in \mathbb{M}_{N_\eta N}} \mathcal{L}_{U([y^*])}([\eta^{exp}]), \tag{2.9}$$

$$\mathcal{L}_{U([y^*])}([\eta^{exp}]) = \sum_{i=1}^{\nu_{exp}} \ln p_{U([y^*])}\left( \eta^{(i)} \right). \tag{2.10}$$

2.3. Practical solving of the log-likelihood maximization.
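2.3.1. The need for statistical algorithms to maximize the log-likelihood. The log-likelihood $\mathcal{L}_{U([y^*])}([\eta^{exp}])$ being non-convex, deterministic algorithms such as gradient algorithms cannot be applied to solve Eq. (2.9), and random search algorithms have to be used. Hence, the precision of the PCE has to be correlated to a numerical cost $M$, which corresponds to a number of independent trials of $[y^*]$ in $\mathbb{M}_{N_\eta N}$. Let $\mathcal{Y} = \{ [y^*]^{(r)},\ 1 \le r \le M \}$ be a set of $M$ elements, which have been chosen randomly in $\mathbb{M}_{N_\eta N}$. For a given numerical cost $M$, the most accurate PCE coefficients matrix $[y]$ is approximated by:

$$[y] \approx [y^{\mathcal{Y}}] = \arg\max_{[y^*] \in \mathcal{Y}} \mathcal{L}_{U([y^*])}([\eta^{exp}]). \tag{2.11}$$

2.3.2. Restriction of the maximization domain. From the $\nu_{exp}$ independent realizations $\{ \eta^{(1)}, \cdots, \eta^{(\nu_{exp})} \}$, the mean value $E(\eta)$ and the autocorrelation matrix $[R_\eta]$ of $\eta$ can be estimated by:

$$E(\eta) \approx \widehat{\eta}(\nu_{exp}) = \frac{1}{\nu_{exp}} \sum_{i=1}^{\nu_{exp}} \eta^{(i)}, \tag{2.12}$$

$$[R_\eta] \approx [\widehat{R}_\eta(\nu_{exp})] = \frac{1}{\nu_{exp}} \sum_{i=1}^{\nu_{exp}} \eta^{(i)} \eta^{(i)T} = \frac{1}{\nu_{exp}} [\eta^{exp}][\eta^{exp}]^T. \tag{2.13}$$

A good way to improve the efficiency of the numerical identification of $[y]$ is then to restrict the search set to $\mathcal{O}_\eta \subset \mathbb{M}_{N_\eta N}$, with:

$$\mathcal{O}_\eta = \left\{ [y] = \left[ y^{(\alpha^{(1)})}, \cdots, y^{(\alpha^{(N)})} \right] \in \mathbb{M}_{N_\eta N} \;\middle|\; y^{(\alpha^{(1)})} = \widehat{\eta}(\nu_{exp}),\ [y][y]^T = [\widehat{R}_\eta(\nu_{exp})] \right\}, \tag{2.14}$$

which, taking into account Eqs. (2.6) and (2.8), guarantees by construction that:

$$[R^{chaos}_\eta(N)] = [\widehat{R}_\eta(\nu_{exp})], \qquad E\left( \eta^{chaos}(N) \right) = \widehat{\eta}(\nu_{exp}). \tag{2.15}$$

Hence, the PCE coefficients matrix $[y]$ can be approximated as the argument in $\mathcal{O}_\eta$ that maximizes the log-likelihood $\mathcal{L}_{U([y^*])}([\eta^{exp}])$. By defining $\mathcal{W}$ as the set that gathers $M$ randomly drawn elements of $\mathcal{O}_\eta$, $[y]$ can then be assessed as the solution of the new optimization problem:

$$[y] \approx [y^{\mathcal{W}}] = \arg\max_{[y^*] \in \mathcal{W}} \mathcal{L}_{U([y^*])}([\eta^{exp}]). \tag{2.16}$$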
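One way to draw random elements of $\mathcal{O}_\eta$ is sketched below. It is an illustrative parameterization and not the algorithm of [29]: the first column is set to the empirical mean, and the remaining columns are $L Q$, where $L L^T$ is the Cholesky factorization of the empirical covariance $[\widehat{R}_\eta] - \widehat{\eta}\, \widehat{\eta}^T$ and $Q$ has orthonormal rows, so that $[y][y]^T = \widehat{\eta}\, \widehat{\eta}^T + L Q Q^T L^T = [\widehat{R}_\eta]$. The sketch assumes $N - 1 \ge N_\eta$ and a positive definite empirical covariance; the function names are hypothetical.

```python
import numpy as np


def empirical_moments(eta_exp):
    """Eqs. (2.12)-(2.13): empirical mean and autocorrelation of the columns of eta_exp."""
    nu_exp = eta_exp.shape[1]
    mean = eta_exp.mean(axis=1)
    autocorr = eta_exp @ eta_exp.T / nu_exp
    return mean, autocorr


def draw_in_o_eta(mean, autocorr, n_basis, rng):
    """Draw one (N_eta x N) matrix satisfying the two constraints of Eq. (2.14)."""
    n_eta = mean.size
    if n_basis - 1 < n_eta:
        raise ValueError("this construction needs N - 1 >= N_eta")
    cov = autocorr - np.outer(mean, mean)
    chol = np.linalg.cholesky(cov)  # requires a positive definite covariance
    # Reduced QR of a Gaussian matrix gives (N-1) x N_eta orthonormal columns.
    q, _ = np.linalg.qr(rng.standard_normal((n_basis - 1, n_eta)))
    return np.column_stack([mean, chol @ q.T])


rng = np.random.default_rng(0)
eta_exp = rng.standard_normal((3, 500)) + 1.0   # synthetic stand-in for measurements
mean, autocorr = empirical_moments(eta_exp)
y_star = draw_in_o_eta(mean, autocorr, n_basis=10, rng=rng)
print(np.allclose(y_star @ y_star.T, autocorr))
```

Repeating the draw $M$ times yields a candidate set playing the role of $\mathcal{W}$ in Eq. (2.16).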
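2.3.3. Approximation of the log-likelihood function. From a particular matrix of realizations $[\Psi]$ (which is defined in Eq. (1.10)), if $[y^*]$ is an element of $\mathcal{O}_\eta$, $\nu_{chaos}$ independent realizations $\{ U([y^*], \theta_n) = [y^*]\, \Psi(\xi(\theta_n), p),\ 1 \le n \le \nu_{chaos} \}$ of the random vector $U([y^*])$ can be computed and gathered in the matrix $[U]$:

$$[U] = \left[ U([y^*], \theta_1) \ \cdots \ U([y^*], \theta_{\nu_{chaos}}) \right] = [y^*][\Psi]. \tag{2.17}$$

Hence, using Gaussian kernels, the PDF $p_{U([y^*])}$ of $U([y^*])$ can be directly estimated by its nonparametric estimator $\widehat{p}_U$:

$$\forall\, x \in \mathbb{R}^{N_\eta}, \quad p_{U([y^*])}(x) \approx \widehat{p}_U(x) = \frac{1}{(2\pi)^{N_\eta/2}\, \nu_{chaos} \prod_{k=1}^{N_\eta} h_k} \sum_{n=1}^{\nu_{chaos}} \exp\left( -\frac{1}{2} \sum_{k=1}^{N_\eta} \left( \frac{x_k - U_k([y^*], \theta_n)}{h_k} \right)^2 \right), \tag{2.18}$$

where $h = (h_1, \cdots, h_{N_\eta})$ is the multidimensional optimal Silverman bandwidth vector (see [6]) of the kernel smoothing estimation of $p_{U([y^*])}$:

$$\forall\, 1 \le k \le N_\eta, \quad h_k = \widehat{\sigma}_{U_k} \left( \frac{4}{(2 + N_\eta)\, \nu_{exp}} \right)^{1/(N_\eta + 4)}, \tag{2.19}$$

where $\widehat{\sigma}_{U_k}$ is the empirical estimation of the standard deviation of each component $U_k$ of $U$. It has to be noticed that $\widehat{p}_U$ only depends on the bandwidth vector $h$ and the two matrices $[y^*]$ and $[\Psi]$. Hence, according to Eqs. (2.10), (2.17) and (2.18), for a given value of $\nu_{chaos}$, the maximization of the log-likelihood function $\mathcal{L}_{U([y^*])}$ can be replaced by the maximization of the cost-function $\mathcal{C}([\eta^{exp}], [y^*], [\Psi])$ such that:

$$[y] \approx [y_{\mathcal{O}_\eta}] = \arg\max_{[y^*] \in \mathcal{O}_\eta} \mathcal{C}([\eta^{exp}], [y^*], [\Psi]), \tag{2.20}$$

where:

$$\mathcal{C}([\eta^{exp}], [y^*], [\Psi]) = \mathcal{C}_C + \mathcal{C}_V([\eta^{exp}], [y^*], [\Psi]), \tag{2.21}$$

$$\mathcal{C}_C = -\nu_{exp} \ln\left( (2\pi)^{N_\eta/2}\, \nu_{chaos} \prod_{k=1}^{N_\eta} h_k \right), \tag{2.22}$$

$$\mathcal{C}_V([\eta^{exp}], [y^*], [\Psi]) = \sum_{i=1}^{\nu_{exp}} \ln\left( \sum_{n=1}^{\nu_{chaos}} \exp\left( -\frac{1}{2} \sum_{k=1}^{N_\eta} \left( \frac{\eta_k^{(i)} - U_k([y^*], \theta_n)}{h_k} \right)^2 \right) \right). \tag{2.23}$$

Hence, the optimization problem defined by Eq. (2.16) can finally be estimated by:

$$[y] \approx [y^{M}_{\mathcal{O}_\eta}] = \arg\max_{[y^*] \in \mathcal{W}} \mathcal{C}([\eta^{exp}], [y^*], [\Psi]). \tag{2.24}$$

The optimization problem defined by Eq. (2.24) is now supposed to be solved with the advanced algorithms described in [29] to optimize the trials of the elements of $\mathcal{W}$ for a given computation cost $M$. The higher the value of $M$ is, the better the PCE identification should be. Therefore, this value has to be chosen as high as possible.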
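2.3.4. Accuracy of the PCE identification. For a given computation cost $M$, let $[y^{M}_{\mathcal{O}_\eta}]$ be an optimal solution of Eq. (2.24). $[y^{M}_{\mathcal{O}_\eta}]$ is a numerical estimation of the PCE coefficients matrix $[y]$. For a new $(N \times \nu_{chaos,*})$ real matrix $[\Psi^*]$ of independent realizations ($\nu_{chaos,*}$ can be higher than $\nu_{chaos}$), the robustness of $[y^{M}_{\mathcal{O}_\eta}]$ regarding the choice of $[\Psi]$ can then be estimated by comparing $\mathcal{C}([\eta^{exp}], [y^{M}_{\mathcal{O}_\eta}], [\Psi])$ and $\mathcal{C}([\eta^{exp}], [y^{M}_{\mathcal{O}_\eta}], [\Psi^*])$. In addition, if $\nu_{exp}$ new independent realizations of $\eta$ were available and gathered in the matrix $[\eta^{exp,new}]$, the over-learning of the method could be measured by comparing $\mathcal{C}([\eta^{exp}], [y^{M}_{\mathcal{O}_\eta}], [\Psi])$ and $\mathcal{C}([\eta^{exp,new}], [y^{M}_{\mathcal{O}_\eta}], [\Psi])$. At last, for the same computation cost $M$, if $[y^{M,new}_{\mathcal{O}_\eta}]$ is a new optimal solution of Eq. (2.24), the global accuracy of the identification stems from the comparison between $\mathcal{C}([\eta^{exp,new}], [y^{M}_{\mathcal{O}_\eta}], [\Psi^*])$ and $\mathcal{C}([\eta^{exp,new}], [y^{M,new}_{\mathcal{O}_\eta}], [\Psi^*])$.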
2.4. Identification of the PCE truncation parameters. As shown in the Introduction, two truncation parameters, $N_g$ and $p$, appear in the truncated PCE, $\eta^{chaos}(N) = [y]\, \Psi(\xi, p)$, of $\eta$. The values of these parameters have to be determined from a convergence analysis. The objective of this section is thus to give the fundamental elements to perform such a convergence analysis.

2.4.1. Definition of a log error function. For each component $\eta_k^{chaos}(N)$ of the truncated PCE, $\eta^{chaos}(N) = [y]\, \Psi(\xi, p)$, of $\eta$, the $L^1$-log error function $err_k$ is introduced as described in [29]:

$$\forall\, 1 \le k \le N_\eta, \quad err_k(N_g, p) = \int_{BI_k} \left| \log_{10}\left( p_{\eta_k}(x_k) \right) - \log_{10}\left( p_{\eta_k^{chaos}}(x_k) \right) \right| dx_k, \tag{2.25}$$

where:
• $BI_k$ is the support of $\eta_k^{exp}$;
• $p_{\eta_k}$ and $p_{\eta_k^{chaos}}$ are the PDFs of $\eta_k$ and $\eta_k^{chaos}$ respectively.

The multidimensional error function $err(N_g, p)$ is then deduced from the unidimensional $L^1$-log error function as:

$$err(N_g, p) = \sum_{k=1}^{N_\eta} err_k(N_g, p). \tag{2.26}$$

The parameters $N_g$ and $p$ have thus to be determined to minimize the multidimensional $L^1$-log error function $err(N_g, p)$.
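A possible numerical evaluation of Eqs. (2.25) and (2.26) is sketched below. Several choices are assumptions of this sketch rather than prescriptions of the text: both marginal PDFs are estimated with a one-dimensional Gaussian kernel estimator using the usual univariate Silverman bandwidth, $BI_k$ is approximated by the range of the experimental samples, the integral is computed with the trapezoidal rule on a uniform grid, and the estimated densities are floored to avoid taking the logarithm of zero. The function names are hypothetical.

```python
import numpy as np


def kde_1d(samples, x):
    """Gaussian kernel estimate of a 1-D PDF at points x (univariate Silverman bandwidth)."""
    n = samples.size
    h = samples.std(ddof=1) * (4.0 / (3.0 * n)) ** 0.2
    z = (x[:, None] - samples[None, :]) / h
    pdf = np.exp(-0.5 * z**2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))
    return np.maximum(pdf, np.finfo(float).tiny)  # floor so that log10 stays finite


def l1_log_error(eta_exp, eta_chaos, n_grid=200):
    """Sum over components of the L1 distance between log10 marginal PDFs."""
    total = 0.0
    for exp_k, chaos_k in zip(eta_exp, eta_chaos):
        x = np.linspace(exp_k.min(), exp_k.max(), n_grid)   # approximation of BI_k
        gap = np.abs(np.log10(kde_1d(exp_k, x)) - np.log10(kde_1d(chaos_k, x)))
        total += np.sum(0.5 * (gap[1:] + gap[:-1]) * np.diff(x))  # trapezoidal rule
    return total


rng = np.random.default_rng(0)
reference = rng.standard_normal((2, 1000))
close = rng.standard_normal((2, 1000))        # same distribution, new samples
shifted = close + 3.0                         # clearly different distribution
print(l1_log_error(reference, close), l1_log_error(reference, shifted))
```

In a convergence study, `eta_chaos` would hold realizations of the identified truncated PCE for each tested pair $(N_g, p)$, and the pair with the smallest error would be retained.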
For given values of the truncation parameters $N_g$ and $p$, it is reminded that the PCE coefficients matrix $[y]$ is searched in order to maximize the multidimensional log-likelihood function, which allows us to consider a priori strongly correlated problems. Once this matrix $[y]$ is identified, it is possible to generate as many independent realizations of the truncated PCE $\eta^{chaos}(N)$ as needed to estimate as precisely as possible the nonparametric estimator $\widehat{p}_U$ of its multidimensional PDF. The number $\nu_{exp}$ of available experimental realizations of $\eta$ is however limited. This number is generally too small for the nonparametric estimator of the multidimensional PDF $p_\eta$ of $\eta$ to be relevant, whereas it is most of the time large enough to define the estimators of the marginals of