HAL Id: tel-02092072
https://hal.archives-ouvertes.fr/tel-02092072v4
Submitted on 9 Dec 2019
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Gaussian process regression of two nested computer
codes
Sophie Marque-Pucheu
To cite this version:
Sophie Marque-Pucheu. Gaussian process regression of two nested computer codes. General Mathematics [math.GM]. Université Sorbonne Paris Cité, 2018. English. NNT: 2018USPCC155. tel-02092072v4
PhD thesis of Université Sorbonne Paris Cité
Prepared at Université Paris Diderot
École doctorale no. 386 Mathématiques Paris Centre
Laboratoire de Probabilités, Statistique et Modélisation

Gaussian process regression of two nested computer codes

By Sophie Marque-Pucheu

Doctoral thesis in Applied Mathematics
Presented and publicly defended in Paris on 10 October 2018 before the following jury:

Examiner: Aurélie Fischer, Maître de conférences, Université Paris Diderot
Thesis advisor: Josselin Garnier, Professor, École Polytechnique
Examiner: Amandine Marrel, Research engineer, CEA
Reviewer: Hervé Monod, Research director, INRA
President of the jury: Anthony Nouy, Professor, École Centrale Nantes
Examiner: Guillaume Perrin, Research engineer, CEA

Based on the reports of:
Hervé Monod, Research director, INRA
Youssef Marzouk, Professor, MIT
Acknowledgments

My first thanks naturally go to the two advisors who accompanied me during these three years of thesis. Josselin Garnier, my thesis director, followed my work attentively. His very broad scientific culture and his responsiveness were a precious help. Guillaume Perrin, my advisor at CEA, also showed a very strong involvement in the supervision of this thesis. I thank him for his pedagogy and his trust.

I also thank the two reviewers of this thesis: Hervé Monod and Youssef Marzouk. Thank you, Mr. Marzouk, for your careful reading of the manuscript. Thanks also to Mr. Monod for his attentive reading of the manuscript, as well as for having been a member of the jury.

I also wish to thank Aurélie Fischer, Amandine Marrel and Anthony Nouy for having accepted to be part of my thesis jury.

Finally, I salute all the members of the uncertainty team at CEA, as well as the PhD students of building B.

I also thank all the administrative staff of CEA and Université Paris Diderot who facilitated my daily life in one way or another.

My last thanks go to my family, in particular my parents, who passed on to me the taste for learning. Last but not least, I finally thank my partner for his wonderful support.
Résumé

This thesis deals with the Gaussian process surrogate modeling (or emulation) of two coupled codes. The term "two coupled codes" refers here to a system of two chained codes: the output of the first code is one of the inputs of the second code.

The two codes are computationally expensive. In order to perform a sensitivity analysis of the output of the coupled code, we seek to build a surrogate model of this output from a small number of observations. Three types of observations of the system exist: those of the complete chain, those of the first code only, and those of the second code only. The obtained surrogate model has to be accurate in the most probable regions of the input domain.

The surrogate models are obtained by universal Kriging, with a Bayesian approach.

First, the case without intermediary information, with scalar outputs, is addressed. An innovative method for defining the mean function of the Gaussian process, based on the coupling of two polynomials, is proposed. Then the case with intermediary information is addressed. A predictor based on the coupling of the Gaussian predictors associated with the two codes is proposed. Methods for quickly evaluating the mean and the variance of the obtained predictor are proposed. The results obtained for the scalar case are then extended to the case where the two codes have high dimensional outputs. To that end, an efficient dimension reduction method of the high dimensional intermediary variable is proposed, in order to facilitate the Gaussian process regression of the second code. The proposed methods are applied to numerical examples.

Keywords

Nested computer codes, coupled codes, chained codes, Gaussian process regression, surrogate modeling, functional variable, dimension reduction, Stepwise Uncertainty Reduction, sequential designs of experiments.
Abstract

This thesis deals with the Gaussian process regression of two nested codes. The term "nested codes" refers to a system of two chained computer codes: the output of the first code is one of the inputs of the second code.

The two codes are computationally expensive. In order to perform a sensitivity analysis, we aim at emulating the output of the nested code from a small number of observations. Three types of observations of the system exist: those of the chained code, those of the first code only, and those of the second code only. The surrogate model has to be accurate on the most likely regions of the input domain of the nested code.

In this work, the surrogate models are constructed using the Universal Kriging framework, with a Bayesian approach.

First, the case when there is no information about the intermediary variable (the output of the first code) is addressed. An innovative parametrization of the mean function of the Gaussian process modeling the nested code is proposed. It is based on the coupling of two polynomials.

Then, the case with intermediary observations is addressed. A stochastic predictor based on the coupling of the predictors associated with the two codes is proposed. Methods aiming at computing quickly the mean and the variance of this predictor are proposed. Finally, the methods obtained for the case of codes with scalar outputs are extended to the case of codes with high dimensional vectorial outputs.

We propose an efficient dimension reduction method of the high dimensional vectorial input of the second code, in order to facilitate its Gaussian process regression. The proposed methods are applied to numerical examples.

Keywords

Nested computer codes, Gaussian process regression, surrogate modeling, functional variable, dimension reduction, Stepwise Uncertainty Reduction, sequential designs of experiments.
Extended summary

This thesis presents new developments for the surrogate modeling of expensive chained codes, where the output of the first code is one of the inputs of the following code. This configuration, and its generalization to more than two codes, is frequently encountered in practice, but the construction of surrogate models suited to this configuration has been little studied so far.

This manuscript contains three new contributions with respect to the state of the art, detailed in Chapters 3 to 5. The first contribution concerns Gaussian process regression with a mean function defined by a polynomial. A new method for defining the polynomial trend, based on the composition of two polynomials, is proposed. In this setting, the intermediary variable between the two codes is not known.

The second contribution assumes the knowledge of the intermediary variable and deals with the enrichment of the design of experiments for the Gaussian process regression of the output of the chain of two codes. The choice of a new observation raises several questions. First, for a given code, the input variables of the new observation have to be chosen. Then, since there are two codes, the question also arises (if possible) of choosing which of the two codes a new observation should be added to.

The third contribution deals with the case of two codes whose outputs have a very high dimension (for example, functions of time). In this configuration, the second code has a functional output, but also a functional input. A dimension reduction method of the functional input suited to this case is then proposed. The enrichment criteria proposed previously are combined with this dimension reduction method in order to extend them to the case of two codes with functional outputs. The proposed methods are then applied to an industrial test case modeling the explosion of a charge in a spherical tank. This test case is associated with a coupling between a detonation code and a structural dynamics code.

The following paragraphs present the structure of the manuscript in more detail.

The first chapter reviews the state of the art concerning the surrogate modeling of a single code with low dimensional input and output. A brief presentation of linear regression and polynomial chaos is given, as well as of regularization methods like LASSO or LARS. The rest of the chapter is dedicated to Gaussian process (GP) regression, or Kriging. After a reminder of the basics of Gaussian process regression, such as the choice of the covariance function, universal Kriging in a Bayesian framework is presented. Then, criteria for designs of experiments for Gaussian process regression and Bayesian optimization are reviewed. The chapter concludes with a brief part concerning sensitivity analysis, in particular the methods based on a decomposition of the variance (Sobol indices).

The second chapter reviews the methods for the Gaussian process regression of a code whose input and/or output is defined as a discretized function of time. The focus here is on the reduction of the dimension of the input or the output. Concerning the reduction of the dimension of the input, some methods take into account the functional input only, while others aim at reducing the dimension of the input in a way adapted to the output. The latter are particularly well suited to the chained system considered in this work. Concerning the functional output, two approaches are possible. The first consists in projecting the functional output onto a basis of reduced dimension.

The third chapter contains the first contribution of this thesis: the construction of a mean function of the Gaussian process by coupling two polynomials. This approach integrates the prior information about the chained structure of the two codes, but without observations or knowledge of the structure of the intermediary variable. In this case, the configuration is close to a classical Gaussian process regression, with observations of the inputs and output of the chain of codes. The specificity of the method lies in the use of the information about this chained structure. The definition of the mean function comprises a first step of composition of two polynomials, then a second step of linearization of this composition. This linearization makes it possible to limit the impact of an estimation error of the parameters of each of the two polynomials. Then the predictor of the output of the chain of codes is constructed using universal Kriging in a Bayesian framework. Moreover, the proposed structure for the polynomial trend offers a great flexibility, since the total orders of each of the two polynomials, but also the dimension of the output of the first polynomial, can be optimized. However, this flexibility requires the resolution of a complex, non-convex optimization problem. A heuristic approach, based on an alternate minimization with respect to the variables, is proposed to solve this optimization problem. Furthermore, a criterion based on the Leave One Out (LOO) error is used to characterize the prediction performance of the Gaussian predictor. This criterion is used to choose the best-performing combination of values for the total orders of the two polynomials and the dimension of the output of the first polynomial.

The fourth chapter contains the second contribution of this thesis: the surrogate modeling of two chained codes when observations of the intermediary variable are available. The proposed predictor is based on a coupling of the Gaussian predictors of each of the two codes. The chapter proposes in particular two criteria for the enrichment of the design of experiments. These criteria rely on a minimization of the integrated prediction variance (IMSE). The prediction variance must therefore be evaluated at a very large number of points. The first criterion corresponds to the case where the two codes cannot be called separately. The second corresponds to the case where the codes can be run separately. In this case, one can choose which of the two codes to call, by retaining the one which maximizes the reduction of the integrated prediction variance per unit of computation time for one evaluation of the code. A major difficulty of this approach lies in the fact that the coupling of two Gaussian predictors is not Gaussian. The prediction variance must therefore be evaluated using quadrature or Monte Carlo methods. In order to overcome these numerical difficulties, two methods for a fast evaluation of the prediction variance are proposed. In the first case, if the Gaussian process associated with the second code has a Gaussian covariance function and a polynomial trend, then the variance can be evaluated analytically. In the case where these conditions do not hold, another approach relying on the linearization of the coupling of the two predictors can be used. The proposed methods are then applied to two numerical examples: a first analytical one and a second one concerning the ballistic trajectory of a conical projectile. The obtained results show the interest of taking into account the observations of the intermediary variable and of being able to call each of the two codes separately.

The fifth chapter contains the final contributions of this thesis and concerns the Gaussian process surrogate modeling of two chained codes with functional (very high dimensional) outputs. The major contribution of this chapter is a dimension reduction method of the functional input of the second code, which is linear with respect to this functional input (which is also the output of the first code). The proposed linear model is in fact a causal filter, parametrized by a small number of variables which can be estimated from a small number of observations.

This combination of a linear approximation and of a dimension reduction adapted to this linear model makes it possible to reduce the dimension of the functional input of the second code in a way adapted to the prediction of the output of this code.

Thanks to this dimension reduction, each of the two codes can be associated with a Gaussian process with a low dimensional input vector. Two Gaussian predictors are obtained using a tensorized covariance in order to take into account the multidimensional character of the outputs of the considered functions. The predictors are then coupled and the coupling is linearized. This yields a Gaussian predictor of the functional output of the chain of two codes. The mean and the variance of the predictor can then be evaluated analytically, and therefore very quickly. The enrichment criteria proposed in the previous chapter are then adapted to the case of two coupled codes with functional outputs. Finally, the proposed methods are applied to the industrial test case which motivated this thesis, namely the coupling of a detonation code with a structural dynamics code. The outputs of each of the codes are discretized functions of time. The obtained results show the interest of taking into account the observations of the intermediary variable, compared to a simple Gaussian process regression of the output of the chain of codes.
Contents

Introduction
Notations

I State of the art for the surrogate modeling of computer codes

1 Surrogate modeling of a single code with scalar inputs and output
1.1 Linear regression
1.2 Polynomial Chaos Expansion
1.3 Methods for the selection of the regressors of a linear model
1.3.1 Stepwise and all-subsets regressions
1.3.2 Ridge regression
1.3.3 LASSO
1.3.4 Forward stagewise regression
1.3.5 Least Angle Regression
1.3.6 Dantzig selector
1.3.7 Conclusions
1.4 Gaussian process regression or Kriging
1.4.1 Gaussian processes
1.4.2 Ordinary, simple and universal Kriging
1.4.3 Estimation of a parametric covariance function
1.5 Design of experiments
1.5.1 Space-filling designs
1.5.2 Criterion-based designs
1.5.3 Gaussian processes for pointwise global optimization
1.6 Sensitivity analysis

2 Gaussian process regression of a code with a functional input or output
2.1 Dimension reduction of a functional variable
2.1.1 Dimension reduction adapted to the functional variable only
2.1.2 Dimension reduction adapted to the functional variable and a dependent variable
2.2 Gaussian process prediction of a computer code with a functional output
2.2.1 Projection of the functional output on a basis

II Contributions

3 Nested polynomial trends for the improvement of Gaussian predictors
3.1 Introduction
3.2 Gaussian process predictors
3.2.1 General framework
3.2.2 Choice of the covariance function
3.2.3 Choice of the mean function
3.3 Nested polynomial trends for Gaussian process predictors
3.3.1 Nested polynomial representations
3.3.2 Coupling nested representations and Gaussian processes
3.3.3 Linearization of the nested polynomial trend
3.3.4 Error evaluation
3.3.5 Convergence analysis
3.4 Applications
3.4.1 d = 1
3.4.2 d > 1
3.4.3 Relevance of the LOO error
3.5 Conclusions

4 Gaussian process regression of two nested codes with scalar output
4.1 Introduction
4.2 Surrogate modeling for two nested computer codes
4.2.1 General framework
4.2.2 Gaussian process-based surrogate models
4.2.3 Sequential designs for the improvement of Gaussian process predictors
4.3 Fast computation of the variance of the predictor of the nested code
4.3.1 Explicit derivation of the two first statistical moments of the predictor
4.3.2 Linearized approach
4.4 Applications
4.4.1 Characteristics of the examples
4.4.2 Prediction performance for a given set of observations
4.4.3 Performances of the sequential designs
4.5 Conclusions
4.6 Proofs
4.6.1 Proof of Proposition 4.2.1
4.6.2 Proof of Lemma 4.3.1
4.6.3 Proof of Lemma 4.3.2
4.6.4 Proof of Proposition 4.3.1
4.6.5 Proof of Proposition 4.3.2
4.6.6 Proof of Corollary 4.3.3

5 Gaussian process regression of two nested codes with functional outputs
5.1 Introduction
5.2 Dimension reduction of the functional input of a code
5.2.1 Formalism
5.2.2 Dimension reduction of the functional input only
5.2.3 Partial Least Squares regression
5.2.4 A linear model-based dimension reduction of the functional input
5.3 Gaussian process regression with low dimensional inputs and a functional output
5.4 Surrogate modeling of the nested code
5.4.1 A linearized Gaussian predictor of the nested code
5.4.2 Sequential designs
5.5 First numerical example
5.5.1 Description of the numerical example
5.5.2 Dimension reduction of the functional input of the second code
5.5.3 Prediction of the nested code
5.5.4 Sensitivity analysis
5.6 Second numerical example
5.6.1 Description of the numerical example
5.6.2 Results of the numerical example
5.7 Conclusions
5.8 Proofs
5.8.1 Proof of Proposition 5.2.1
5.8.2 Proof of Lemma 1
5.8.3 Proof of Proposition 5.3.1
5.8.4 Proof of Proposition 5.4.1
5.8.5 Proof of Proposition 5.4.2

Conclusions
Context

Surrogate modeling for the sensitivity analysis of two nested computer codes

This thesis is motivated by an application case: the coupling of two computationally costly computer codes. The first code is a detonation code and the second code is a structural dynamics code. The two codes have functional (i.e. high dimensional vectorial) outputs, and the functional output of the first code is one of the inputs of the second code.

If we aim at performing design and certification studies of such a system, the evaluation of the output of the system at a large number of input points is often necessary. This is especially true when methods like sensitivity analysis, risk analysis or optimization are performed.

In this work we aim at performing a sensitivity analysis of the system mentioned above. Given the computational cost of the two codes, the first objective is to build an emulator, or a surrogate model, of the output of the two nested codes. This surrogate model will be constructed from a small set of observations of the two codes. The number of observations cannot be very high because of the computational costs of the codes.

As the role of simulation is increasing, the surrogate modeling of high-cost codes generates growing interest. However, the existing methods are generally applied to a single code, or consider a system of codes as a single code.

In this work, the framework of Gaussian process regression is considered for the surrogate modeling of the computer codes. In this framework, the output of a code is considered to be the realization of a Gaussian process. The framework used for the Gaussian process regression is the Universal Kriging framework, and a Bayesian approach is utilized. If some not very restrictive assumptions on the prior distribution of the Gaussian process are fulfilled, a Gaussian predictor of the code can be obtained by computing the posterior distribution of the Gaussian process given the observations of the code output.

Moreover, the existing methods for the surrogate modeling of codes generally consider the case of codes with low dimensional vectorial inputs. If a code has a functional input, the dimension of the functional input is often reduced thanks to a projection. The choice of the optimal method of dimension reduction of the functional input for the surrogate modeling of the output remains a research topic.
Contributions of the thesis

This thesis makes contributions to the surrogate modeling of two nested codes with scalar or functional outputs. These contributions aim at addressing the following difficulties of the studied system:

• there are two codes,
• the codes are coupled by a functional intermediary variable,
• the second code has a functional input.

First, the case of two nested codes with scalar outputs is investigated. The considered system is then:
$$
\begin{array}{rcl}
x_1 & \longrightarrow & y_1(x_1) \searrow \\
& & \qquad y_{\mathrm{nest}}(x_{\mathrm{nest}}) := y_2\big(y_1(x_1),\, x_2\big), \\
x_2 & \longrightarrow & \nearrow
\end{array}
\qquad (0.0.1)
$$

with $x_1 \in \mathbb{R}^{d_1}$ and $x_2 \in \mathbb{R}^{d_2}$ the low dimensional vectorial inputs of the two codes, $y_1 \in \mathbb{R}$ and $y_2 \in \mathbb{R}$ the outputs of the two codes, and $d_1$ and $d_2$ two integers.

In a first step, the case where there are no observations of the intermediary variable $y_1(x_1)$ is considered. An innovative parametrization of the mean function of the Gaussian process is proposed. This parametrization is based on the coupling of polynomials and improves the prediction accuracy compared to a classical constant or polynomial mean function.

Then the case where observations of the intermediary variable are available is considered. A stochastic predictor of the nested code is obtained by coupling the Gaussian predictors of the two codes. Such an approach makes it possible to take into account all the types of observations: observations of the nested code, of the first code only and of the second code only. The predictor is non-Gaussian, but its moments can be computed using Monte Carlo methods.

Then we define sequential design criteria which aim at improving the prediction accuracy of the proposed predictor. The criteria are based on a reduction of the integrated prediction variance, because the predictor has to be accurate on the most probable areas of the input domain for the sensitivity analysis. Finally, two adaptations of the proposed predictor are developed in order to evaluate the prediction variance, and thus the proposed sequential design criteria, quickly. The first adaptation is called "analytic" and the second one "linearized". They both make it possible to compute the mean and the variance of the proposed predictor in closed form. The "linearized" method also leads to a Gaussian predictor of the nested code. Moreover, the interest of taking into account the intermediary observations is shown.
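To fix ideas, here is a minimal sketch of the nested structure (0.0.1), with two cheap stand-in functions playing the role of the expensive codes; the bodies of y1 and y2 are purely illustrative, only the nesting pattern mirrors the notation above.

```python
import numpy as np

# Toy stand-ins for the two expensive codes; only the nesting matters.
def y1(x1):                      # first code: R^{d1} -> R
    return float(np.sum(np.sin(x1)))

def y2(y1_val, x2):              # second code: R x R^{d2} -> R
    return y1_val**2 + float(np.sum(np.cos(x2)))

def y_nest(x1, x2):              # nested code: y2(y1(x1), x2), as in (0.0.1)
    return y2(y1(x1), x2)

print(y_nest(np.array([0.1, 0.2]), np.array([0.3])))
```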
Finally, the case of two nested codes with functional outputs is investigated. The considered system is then:

$$
\begin{array}{rcl}
x_1 & \longrightarrow & y_1(x_1) \searrow \\
& & \qquad y_{\mathrm{nest}}(x_{\mathrm{nest}}) := y_2\big(y_1(x_1),\, x_2\big), \\
x_2 & \longrightarrow & \nearrow
\end{array}
\qquad (0.0.2)
$$

with $y_1 \in \mathbb{R}^{N_t}$ and $y_2 \in \mathbb{R}^{N_t}$ the outputs of the two codes when they are functional, $N_t \gg 1$ denoting the number of discretization steps of the functional outputs.

The second code has a functional input, and the existing methods of Gaussian process regression generally consider low dimensional vectorial inputs. The Gaussian process regression of the second code therefore requires the reduction of the dimension of this functional input. We propose a dimension reduction of the functional input of a code which is suited for the prediction of the functional output of this code. This dimension reduction method is based on a two-step approach. First, the output of the second code is approximated by a linear causal filter (a minimal sketch of this idea is given at the end of this section). This linear model has a sparse structure, which is defined by only $N_t$ variables. These variables can be estimated from a small set of observations of the functional input and output of the second code. The second step is the use of a proposed projection basis which is adapted to a linear model. The combination of these two steps yields a dimension reduction of the functional input of the second code which:

• is adapted to the output of this code,
• can be estimated from a small set of observations,
• does not require the knowledge of the derivatives of the output of the code.

Once the dimension of the functional intermediary variable has been efficiently reduced, the previously defined linearized method is adapted to the case of two nested codes with functional outputs. A Gaussian predictor of the functional output of the nested code, with analytic mean and variance, is obtained. Finally, the previously defined sequential design criteria are adapted to this functional setting.
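As an illustration of the causal-filter approximation mentioned above, the following sketch applies a discrete causal convolution to a discretized functional input; the kernel a (the $N_t$ coefficients) is a hypothetical stand-in for the variables that would be estimated from observations.

```python
import numpy as np

def causal_filter(u, a):
    """Causal linear filter: output step i only uses input steps j <= i."""
    N_t = len(u)
    return np.array([np.dot(a[: i + 1][::-1], u[: i + 1]) for i in range(N_t)])

# Hypothetical example: a smoothed, delayed response to a discretized input.
t = np.linspace(0.0, 1.0, 100)
u = np.sin(2 * np.pi * t)                      # output of the first code
a = np.exp(-10 * t) / np.sum(np.exp(-10 * t))  # illustrative filter coefficients
y2_approx = causal_filter(u, a)                # linear proxy of the second code
```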
Outline of the manuscript

The thesis has two parts.

Part I provides a review of the state of the art for the surrogate modeling of computer codes.

In Chapter 1, we review methods for the surrogate modeling of a single code with low dimensional vectorial inputs and a scalar output.
Section 1.1 describes the surrogate modeling of a single code by Linear Regression.
Section 1.2 focuses on the surrogate modeling of a code by Polynomial Chaos Expansion.
Section 1.3 reviews the existing methods for the selection of the regressors in the framework of Linear Regression.
Section 1.4 provides a review of the Gaussian process regression framework for the surrogate modeling of a single code with low dimensional vectorial inputs and a scalar output.
Section 1.5 presents a review of the designs of experiments for an accurate surrogate model on the whole input domain of a code with low dimensional vectorial inputs and a scalar output.
Section 1.6 focuses on the sensitivity analysis of the output of a code, or a quantity associated with it, with respect to the inputs of the code.

In Chapter 2, we review methods for the surrogate modeling of a single code with a functional output, low dimensional vectorial inputs and possibly a functional input.
Section 2.1 is devoted to the existing methods for the dimension reduction of a functional variable.
Section 2.2 reviews the existing methods for the Gaussian process regression of a code with low dimensional vectorial inputs and a functional output.

Part II details our contributions to the construction of a surrogate model of two nested codes with scalar or functional outputs.

In Chapter 3, we focus on the case where the two codes have scalar outputs and no observations of the intermediary variable are available. We propose to define the mean function of the Gaussian process modeling the nested code as a coupling of two polynomials. We show how this parametrization can improve the prediction accuracy of the Gaussian predictor compared to the case where the mean function is defined by polynomials.

In Chapter 4, we focus on the case where the two codes have scalar outputs and observations of the intermediary variable are available. We propose a stochastic predictor of the nested code based on the coupling of the Gaussian predictors of the two codes. This stochastic predictor is non-Gaussian, but its mean and variance can be evaluated using Monte Carlo methods. This predictor can take into account all the possible observations: those of the nested code, those of the first code and those of the second code. Then sequential design criteria are proposed. These design criteria aim at improving the prediction accuracy on the whole input domain of the nested code. One of the criteria can also take into account the difference of computational costs between the two codes. Finally, we propose two adaptations of the previously proposed predictor of the nested code in order to accelerate the computation of its mean and variance. They both make it possible to compute the prediction mean and variance in closed form. In addition, the proposed linearized predictor of the nested code yields a Gaussian predictor of the nested code with conditioned mean and variance functions. The application of the proposed methods to numerical examples shows the interest of taking into account the intermediary observations.

In Chapter 5, we focus on the case of the coupling of two codes with functional outputs. We first propose an efficient dimension reduction of the functional input of the second code. This dimension reduction is based on a linear projection of the functional input of the second code. The proposed projection basis can be estimated from a small set of observations of the second code and does not require the knowledge of the derivatives of the code.

We also extend the linearized predictor of the nested code proposed in Chapter 4 to the case of two nested codes with functional outputs. This extension relies on the dimension reduction of the functional output and a tensorized structure of the Gaussian process modeling the code. By tensorized structure we mean a separation between the index of the output and the inputs. The sequential design criteria are also adapted to the case of two nested codes with functional outputs.

The proposed methods are applied to numerical examples. The results show again the interest of taking appropriately into account the intermediary observations.

The predictor obtained at the end of the sequential enrichment of the initial design is used in order to perform a sensitivity analysis of a scalar quantity of interest based on the functional output of the nested code.
Notations

Ordinal variables

$n$: number of observations
$d$: dimension of an input variable
$p$: number of functions of a basis of functions in the case of Universal Kriging
$N_t$: dimension of the time-varying output of a code
$\mathrm{card}(A)$: number of elements of the set $A$

Matrices, vectors and scalars

$x$: a scalar
$\boldsymbol{x}$: a vector
$x_i$ or $(\boldsymbol{x})_i$: the $i$-th entry of the vector $\boldsymbol{x}$
$\boldsymbol{X}$: a matrix
$(\boldsymbol{X})_{ij}$: the entry at row $i$ and column $j$ of the matrix $\boldsymbol{X}$
$(\boldsymbol{X})_{\cdot i}$: the vector of the entries of the $i$-th column of the matrix $\boldsymbol{X}$
$(\boldsymbol{X})_{i \cdot}$: the vector of the entries of the $i$-th row of the matrix $\boldsymbol{X}$
$\boldsymbol{X}^T$: transpose of the matrix $\boldsymbol{X}$
$\mathrm{diag}(\boldsymbol{x})$: diagonal matrix with diagonal $\boldsymbol{x}$
$\mathrm{diag}(\boldsymbol{X})$: vector corresponding to the diagonal of the matrix $\boldsymbol{X}$
$\mathrm{Tr}(\boldsymbol{X})$: trace of the matrix $\boldsymbol{X}$

Probabilistic notations

$\overset{d}{=}$: equality in distribution
$\mathbb{E}[\cdot]$: mean of a random quantity
$\mathbb{V}[\cdot]$: variance of a random quantity
$\mathcal{N}(m, K)$: multivariate normal distribution with mean $m$ and covariance matrix $K$
$GP(m(\cdot), C(\cdot,\cdot))$: one-dimensional Gaussian process with mean function $m$ and covariance function $C$
$\boldsymbol{GP}(\boldsymbol{m}(\cdot), \boldsymbol{C}(\cdot,\cdot))$: multidimensional Gaussian process with vector-valued mean function $\boldsymbol{m}$ and matrix-valued covariance function $\boldsymbol{C}$

Norms and scalar products

$(\cdot,\cdot)_{\mathbb{X}}$: scalar product in the space of square integrable real-valued functions on $\mathbb{X}$, such that $(y,z)_{\mathbb{X}} := \int_{\mathbb{X}} y(x)\, z(x)\, dx$
$\|\cdot\|_{\mathbb{X}}$: norm in the space of square integrable real-valued functions on $\mathbb{X}$, such that $\|y\|^2_{\mathbb{X}} := (y,y)_{\mathbb{X}}$
$\|\cdot\|_F$: Frobenius norm
$\|\cdot\|_1$: $L^1$ norm, such that $\|x\|_1 = \sum_{i=1}^d |x_i|$
$\|\cdot\|_2$: $L^2$ norm, such that $\|x\|_2 = \sqrt{\sum_{i=1}^d x_i^2}$
Part I

State of the art for the surrogate modeling of computer codes

Methods like uncertainty propagation, sensitivity analysis or optimization require the evaluation of the output of the code at a huge number of input points. If the computational cost of the computer code is high, and only a small number of observations of its output is available, the use of a surrogate model is necessary. In this part we review some existing methods for the surrogate modeling of computer codes.

This part includes two chapters. The first one is devoted to the surrogate modeling of a computer code with scalar (i.e. low dimensional vectorial) inputs and output. The second one focuses on the surrogate modeling with Gaussian process regression of a code with a functional input or output.

Chapter 1

Surrogate modeling of a single code with scalar inputs and output
In this chapter we consider a model of the form $x \mapsto y(x)$, $x \in \mathbb{X} \subset \mathbb{R}^d$, $d$ a positive integer, where $\mu_X$ is a probability measure on the measurable space formed by $\mathbb{X}$ and a $\sigma$-algebra over $\mathbb{X}$. The following sections detail the state of the art for the surrogate modeling of $y$ from a set of $n$ observations of the input and the output of the code. These observations are denoted by:

$$X_{\mathrm{obs}} = \left( x^{(1)} \cdots x^{(n)} \right), \qquad (1.0.1)$$

and

$$y_{\mathrm{obs}} = \left( y^{(1)} = y\big(x^{(1)}\big), \ldots, y^{(n)} = y\big(x^{(n)}\big) \right), \qquad (1.0.2)$$

where $X_{\mathrm{obs}}$ is an $(n \times d)$-dimensional matrix and $y_{\mathrm{obs}}$ is an $n$-dimensional vector.

The first section is devoted to linear regression. The second one deals with the use of Polynomial Chaos Expansion as a surrogate model. The third one focuses on the methods for the selection of regressors in regression models. The fourth one presents the Gaussian process regression for the surrogate modeling of a computer code. Finally, the last section reviews some existing designs of experiments which are adapted for the acquisition of knowledge of the computer code or the sequential improvement of a surrogate model.
1.1 Linear regression

Generalized additive models are a very common tool for the emulation of a response surface [Hastie and Tibshirani, 1990]. The idea is the projection of the output $y$ on a basis of functions $h_i$, $1 \le i \le p$, $p$ a positive integer, of the inputs $x$. The emulator can be written in the form:

$$\hat{y}(x) = h(x)^T \beta, \qquad (1.1.1)$$

where $h(x)$ and $\beta$ are in $\mathbb{R}^p$. The functions of the basis can be polynomials (with Polynomial Chaos Expansion as a particular case), wavelets, trigonometric functions, and so on.

Note that simple linear regression can be regarded as a particular case of the generalized additive models, with a basis of functions comprising only the covariates: $h(x) = x$.

The regression coefficients $\beta$ can be estimated from a set of $n$ observations of the inputs and the output of the code, $X_{\mathrm{obs}}$ and $y_{\mathrm{obs}}$, through the minimization of the quadratic loss function:

$$\hat{\beta} = \underset{\beta \in \mathbb{R}^p}{\mathrm{argmin}} \; \sum_{i=1}^n \left( y\big(x^{(i)}\big) - h\big(x^{(i)}\big)^T \beta \right)^2. \qquad (1.1.2)$$

If we denote:

$$H = \begin{pmatrix} h\big(x^{(1)}\big)^T \\ \vdots \\ h\big(x^{(n)}\big)^T \end{pmatrix}, \qquad (1.1.3)$$

then the least squares estimate of the regression coefficients can be written:

$$\hat{\beta} = H^{+} y_{\mathrm{obs}}, \qquad (1.1.4)$$

where $H^{+}$ is the pseudo-inverse of $H$. If $n \ge p$ and $H$ is of rank $p$, then $H^T H$ is invertible and $H^{+} = \big(H^T H\big)^{-1} H^T$. By definition, $H$ is an $(n \times p)$-dimensional matrix.

However, the matrix $H^T H$ is not always invertible. The number of observations can be smaller than the number of regression coefficients ($n \le p$), or the functions of the basis can be correlated according to the probability measure $\mu_X$, which means that the columns of $H$ are correlated, thus reducing the rank of the matrix $H$.

The matrix $H^T H$ is more likely to be invertible if the basis functions are decorrelated with respect to the probability measure $\mu_X$ of the inputs, as performed with Polynomial Chaos Expansion. Another possible approach is the use of a regularization term for the inversion of the matrix, or the selection of the most influential regressors. The two following sections detail these two approaches.
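As a minimal numerical illustration of the estimate (1.1.4), the following sketch fits a small polynomial basis by least squares; the toy response and the choice of basis are assumptions made for the example only.

```python
import numpy as np

# Illustrative basis h(x) = (1, x, x^2) for a scalar input, so H is (n x p).
def basis(x):
    return np.stack([np.ones_like(x), x, x**2], axis=1)

rng = np.random.default_rng(0)
x_obs = rng.uniform(-1.0, 1.0, size=50)
y_obs = np.cos(2.0 * x_obs) + 0.05 * rng.standard_normal(50)  # toy "code" output

H = basis(x_obs)
beta_hat = np.linalg.pinv(H) @ y_obs        # beta_hat = H^+ y_obs, Eq. (1.1.4)
y_hat = basis(np.array([0.3])) @ beta_hat   # emulator prediction at a new point
```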
1.2 Polynomial Chaos Expansion

Polynomial Chaos Expansion can be used to emulate a model response $y$ with inputs $x$, provided that the probability measure $\mu_X$ associated with $x$ is a product measure, so that the components of the input vector are independent. It was applied by Ghanem and Spanos [1990] to stochastic finite elements methods. Polynomial Chaos Expansion can be seen as the projection of the model output $y$ on a polynomial basis which depends on the distribution of the model inputs $x$. The polynomials are orthonormal with respect to the distribution of $x$. The model response can therefore be expanded as:

$$y(x) = \sum_{\alpha \in \mathbb{N}^d} \beta_\alpha \Phi_\alpha(x), \qquad (1.2.1)$$

with $\beta_\alpha \in \mathbb{R}$ and $\Phi_\alpha$ orthonormal multidimensional polynomials, which means:

$$\int_{\mathbb{X}} \Phi_\alpha(x)\, \Phi_\gamma(x)\, d\mu_X(x) = \delta_{\alpha\gamma}, \qquad (1.2.2)$$

with $\delta_{\alpha\gamma}$ denoting the Kronecker delta.

In practice, the expansion of Eq. (1.2.1) can be truncated in order to obtain a surrogate model of the model response. If we denote by $\mathcal{A} \subset \mathbb{N}^d$ the truncated set of indices, by $\beta_{\mathcal{A}}$ the vector gathering the $\beta_\alpha$, $\alpha \in \mathcal{A}$, and by $\Phi_{\mathcal{A}}$ the vector gathering the selected polynomials, this surrogate model is defined as:

$$\hat{y}(x) = \Phi_{\mathcal{A}}(x)^T \beta_{\mathcal{A}}. \qquad (1.2.3)$$

Note that the truncation is generally defined by an upper bound $r$ on the total order of the polynomials, which means $\mathcal{A} = \{\alpha \in \mathbb{N}^d,\ \|\alpha\|_1 \le r\}$. The total order $r$ can be chosen adaptively according to a target precision, with an estimation of the error thanks to a cross-validation criterion [Blatman and Sudret, 2010, 2011].

A coefficient $\beta_\alpha$ is defined as the projection of the model response on the function $\Phi_\alpha$:

$$\beta_\alpha = \int_{\mathbb{X}} y(x)\, \Phi_\alpha(x)\, d\mu_X(x). \qquad (1.2.4)$$
Distribution | Density | Orthonormal basis
Uniform | $\frac{1}{2} \mathbf{1}_{[-1,1]}(x)$ | $P_k(x)\sqrt{2k+1}$, with $P_k$ the Legendre polynomial
Gaussian | $\frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)$ | $\frac{H_k(x)}{\sqrt{k!}}$, with $H_k$ the Hermite polynomial
Gamma | $\frac{x^a}{\Gamma(a+1)} \exp(-x)\, \mathbf{1}_{x>0}$ | $L_k(x) \sqrt{\frac{k!\,\Gamma(a+1)}{\Gamma(k+a+1)}}$, with $L_k$ the Laguerre polynomial

Table 1.1: Classical univariate polynomial families used for Polynomial Chaos Expansion.
The integral can be estimated using Monte-Carlo methods, quadrature rules [Ghiocel and Ghanem, 2002] or stochastic collocation methods [Xiu, 2009].

The coefficients can also be estimated by least squares regression [Blatman and Sudret, 2010, 2011] from a set of $n$ observations:

$$\hat{\beta}_{\mathcal{A}} = \underset{\beta_{\mathcal{A}} \in \mathbb{R}^{\mathrm{card}(\mathcal{A})}}{\mathrm{argmin}} \; \sum_{i=1}^n \left( y^{(i)} - \Phi_{\mathcal{A}}\big(x^{(i)}\big)^T \beta_{\mathcal{A}} \right)^2. \qquad (1.2.5)$$

Note that if the observations are drawn according to the distribution of the inputs, the metamodel will be more accurate in the high-probability regions of the input domain.

The usual one-dimensional polynomial families used for Polynomial Chaos Expansion, which are chosen according to the distribution of the one-dimensional variable $x$, are given in Table 1.1.

Furthermore, the inputs can be transformed using an isoprobabilistic transformation, such as the Nataf or the Rosenblatt transformations [Nataf, 1962; Rosenblatt, 1952; Lebrun and Dutfoy, 2009]. Such transformations map $x$ to a $d$-dimensional standard Gaussian variable $\xi$ (i.e. $d$ independent standard Gaussian variables). Then a Polynomial Chaos Expansion can be performed using Hermite polynomials [Blatman and Sudret, 2011]. The expansion becomes:

$$y(x) = \sum_{\alpha \in \mathbb{N}^d} \beta_\alpha H_\alpha(T(x)), \qquad (1.2.6)$$

where $H_\alpha = \prod_{i=1}^d H_{\alpha_i}$ and

$$\beta_\alpha = \int_{T(\mathbb{X})} y\big(T^{-1}(\xi)\big)\, H_\alpha(\xi) \prod_{i=1}^d \varphi(\xi_i)\, d\xi. \qquad (1.2.7)$$

Here, $T : x \mapsto \xi$ is the isoprobabilistic transformation and $T^{-1}$ its inverse, $H_\alpha$ are Hermite polynomials, and $\varphi$ is the standard univariate Gaussian probability density function.

Thanks to this isoprobabilistic transformation, the Polynomial Chaos Expansion of a computer code can be performed even when the inputs do not follow one of the classical distributions of Table 1.1.
1.3 Methods for the selection of the regressors of a linear model

In this section we review the existing methods for the selection of the most influential regressors for linear regression or Polynomial Chaos Expansion. The methods are presented in the chronological order of their appearance. Two approaches can be distinguished: the first one selects the regressors which are the most influential; the second one shrinks the coefficients associated with the least influential regressors.

1.3.1 Stepwise and all-subsets regressions

Stepwise regression aims at selecting the regressors which improve the prediction accuracy the most. There are three main approaches to perform this selection: forward selection, backward elimination and bidirectional elimination.

In the forward method, the set of the selected regressors is empty at the initial step. Then, at each step, one adds the regressor which best improves the prediction accuracy of the regression model. The addition continues until a stopping criterion is reached.

On the contrary, with the backward elimination, a huge number of regressors are selected at the initial step. Then the regressors which contribute the least to the prediction accuracy are removed step by step from the regression model.

Efroymson [1960] introduced an approach combining forward selection and backward elimination. At each step of the forward selection, the interest of removing one of the previously selected regressors is studied.

However, stepwise regression is known as being greedy and quite unstable [Hesterberg et al., 2008].

In parallel, all-subsets regression has been introduced by Furnival and Wilson [1974]. It relies on the evaluation of the accuracy of all the regression models based on all the subsets of the set of regressors. Even though exhaustive, this approach can be computationally expensive, especially when the number of regressors is high.
1.3.2 Ridge regression

Introduced by Hoerl and Kennard [1970], ridge regression is based on a penalization of the coefficients of the regressors. This penalization can be seen as a regularization of the regression problem. The coefficients obtained with the ridge regression are the solutions of the following optimization problem:

$$\hat{\beta}_{\mathrm{ridge}} = \underset{\beta \in \mathbb{R}^p}{\mathrm{argmin}} \; \sum_{i=1}^n \left( y\big(x^{(i)}\big) - h\big(x^{(i)}\big)^T \beta \right)^2 + \delta \|\beta\|_2^2, \qquad (1.3.1)$$

with $\delta$ a non-negative real-valued constant. This leads to the normal equation:

$$\left( H^T H + \delta I_p \right) \hat{\beta}_{\mathrm{ridge}} = H^T y_{\mathrm{obs}}. \qquad (1.3.2)$$

Practically, the optimal value of $\delta$ can be estimated thanks to a cross-validation criterion. The absolute value of the coefficients decreases as $\delta$ increases. When $\delta = 0$, the result is the same as the one of ordinary least squares. If $\delta > 0$, then the matrix $H^T H + \delta I_p$ is positive definite and thus invertible.
Ridge regression is a particular case of the Tikhonov regularization [Tikhonov and Arsenin, 1977], which is defined as follows:

$$\hat{\beta}_{\mathrm{Tikhonov}} = \underset{\beta \in \mathbb{R}^p}{\mathrm{argmin}} \; \sum_{i=1}^n \left( y\big(x^{(i)}\big) - h\big(x^{(i)}\big)^T \beta \right)^2 + \|\Gamma \beta\|^2, \qquad (1.3.3)$$

with $\Gamma$ a $(p \times p)$-dimensional matrix. If $\Gamma^T \Gamma$ is positive definite, this problem has the following explicit solution:

$$\hat{\beta}_{\mathrm{Tikhonov}} = \left( H^T H + \Gamma^T \Gamma \right)^{-1} H^T y_{\mathrm{obs}}. \qquad (1.3.4)$$

Note that if $\Gamma$ is defined such that $\Gamma^T \Gamma$ is positive definite, then the matrix $H^T H + \Gamma^T \Gamma$ is an invertible matrix.
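A minimal sketch of solving the normal equation (1.3.2); the function name and the random design used in the usage example are hypothetical.

```python
import numpy as np

def ridge_fit(H, y_obs, delta):
    """Solve (H^T H + delta * I_p) beta = H^T y_obs, cf. Eq. (1.3.2)."""
    p = H.shape[1]
    # For delta > 0 the system matrix is positive definite, hence invertible.
    return np.linalg.solve(H.T @ H + delta * np.eye(p), H.T @ y_obs)

# Usage (illustrative): the larger delta, the smaller the coefficients.
rng = np.random.default_rng(0)
H = rng.standard_normal((30, 5))
y_obs = H @ np.array([1.0, -2.0, 0.0, 0.5, 3.0]) + 0.1 * rng.standard_normal(30)
print(ridge_fit(H, y_obs, delta=1.0))
```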
1.3.3 LASSO
TheLeastAbsoluteShrinkage andSele tionOperator (LASSO)methodhasbeen introdu ed
by Tibshirani [1989℄. It relies on a
L
1
-penalization of the estimation ofβ
, whi h an be written:b
β
LASSO= argmin
β
∈R
p
n
X
i=1
y
x
(i)
− h
x
(i)
T
β
2
+ δ
kβk
1
,
(1.3.5)with
δ
a non-negative onstant.Thehigher
δ
is, themore zero oe ientsthere areand thesparsertheregressionmodel is.1.3.4 Forward stagewise regression
Hastie et al. [2001] have introduced the forward stagewise regression. Although different from LASSO, it yields similar results. The procedure can be defined by the following algorithm:

• Initialize with $R = y_{\mathrm{obs}}$ and $\beta_i = 0$, $i \in \{1, \ldots, p\}$, then repeat until no regressor is correlated with $R$:
  – find $i \in \{1, \ldots, p\}$ such that $h_i\big(X_{\mathrm{obs}}\big)$ is the most correlated with $R$,
  – update $\beta_i = \beta_i + \epsilon_i$, with $\epsilon_i = \epsilon \cdot \mathrm{sign}\big(\mathrm{corr}\big(h_i\big(X_{\mathrm{obs}}\big), R\big)\big)$,
  – update $R = R - \epsilon_i\, h_i\big(X_{\mathrm{obs}}\big)$,

where, by abuse of notation, $h_i\big(X_{\mathrm{obs}}\big) = \big(h_i\big(x^{(1)}\big), \ldots, h_i\big(x^{(n)}\big)\big)$. In practice, $\epsilon$ is set to a small value, like $\epsilon = 0.01$. In general, this approach is more reliable than the classical stepwise regression.
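A minimal sketch of the procedure above, assuming centered regressor columns and a centered response; the names and the stopping tolerance are illustrative.

```python
import numpy as np

def forward_stagewise(H, y, eps=0.01, n_steps=5000, tol=1e-8):
    """Forward stagewise regression with step size eps."""
    beta = np.zeros(H.shape[1])
    R = y.astype(float).copy()
    for _ in range(n_steps):
        corr = H.T @ R                   # correlation proxy with the residual
        i = int(np.argmax(np.abs(corr)))
        if abs(corr[i]) < tol:           # no regressor correlated with R
            break
        step = eps * np.sign(corr[i])
        beta[i] += step
        R -= step * H[:, i]
    return beta
```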
1.3.5 Least Angle Regression

Introduced by Efron et al. [2004], Least Angle Regression (LAR) is similar to the forward stagewise regression, given that it selects the regressor $h_i\big(X_{\mathrm{obs}}\big)$ which is the most correlated with the current residual $R$. However, the computation of the value of $\beta_i$ is different. Instead of being slightly modified, the value of $\beta_i$ is chosen such that the correlation between the new residual $R - \beta_i\, h_i\big(X_{\mathrm{obs}}\big)$ and its most correlated regressor $h_j\big(X_{\mathrm{obs}}\big)$ is equal to the correlation between $R - \beta_i\, h_i\big(X_{\mathrm{obs}}\big)$ and $h_i\big(X_{\mathrm{obs}}\big)$. This method can also be seen as a less greedy version of the classical forward selection.

1.3.5.1 The algorithm
Least Angle Regression (LAR) is associated with the following algorithm:

1. Initialize with $R = y_{\mathrm{obs}}$ and $\beta_i = 0$, $i \in \{1, \ldots, p\}$.
2. Find $i \in \{1, \ldots, p\}$ such that $h_i\big(X_{\mathrm{obs}}\big)$ is the most correlated with $R$.
3. Move $\beta_i$ from $0$ toward its least squares coefficient, until another regressor $h_j\big(X_{\mathrm{obs}}\big)$ has as much correlation with $R - \beta_i\, h_i\big(X_{\mathrm{obs}}\big)$ as $h_i\big(X_{\mathrm{obs}}\big)$.
4. Move $(\beta_i, \beta_j)$ jointly in the direction defined by their joint least squares coefficient of the current residual on $\big(h_i\big(X_{\mathrm{obs}}\big), h_j\big(X_{\mathrm{obs}}\big)\big)$, until some regressor $h_k\big(X_{\mathrm{obs}}\big)$ is as much correlated with the current residual.
5. Continue until $\min(p, n-1)$ regressors have been retained.
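In practice, LAR paths can be computed with existing implementations; the following sketch assumes scikit-learn is available, and the design matrix H and observations y_obs are illustrative.

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(3)
H = rng.standard_normal((50, 10))
y_obs = H[:, 0] - 2.0 * H[:, 3] + 0.1 * rng.standard_normal(50)

# method="lar" follows the steps above; method="lasso" uses the modified
# algorithm whose path coincides with the LASSO coefficient paths (see below).
alphas, active, coefs = lars_path(H, y_obs, method="lar")
print(active)  # order in which the regressors enter the model
```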
1.3.5.2 LASSO can be seen as a specific case of LAR

Efron et al. [2004] and Hastie et al. [2007] have shown that a slightly modified LAR algorithm can provide the entire paths of the LASSO coefficients as the $\delta$ coefficient increases. This modified algorithm is defined as follows:

• run the LAR algorithm from step 1 to 4,
• if a non-zero coefficient reaches zero, remove the associated regressor from the linear model and recompute the joint least squares direction,
• continue until $\min(p, n-1)$ regressors have been retained.

In the same way, a modified LAR algorithm can be used to perform a forward stagewise regression in the case of $\epsilon \to 0$ [Hastie et al., 2007]. Note that the label LARS generally refers to this modified LAR algorithm (where S refers to Stagewise or LASSO).

1.3.5.3 Hybrid LARS
Introduced by Efron et al. [2004], hybrid LARS is derived from the original LARS (referring to the original LAR or LASSO here). This modified algorithm comprises a LAR step which enables the selection of the regressors. The next step is the estimation by ordinary least squares of the coefficients associated with the selected regressors.

Hybrid LARS thus relies on a separation between the choice of the regressors and the estimation of the linear model. It increases the accuracy of the linear model compared to the original LARS.

Relaxed LASSO [Meinshausen et al., 2007] is an extension of the LARS-based LASSO algorithm. The first step is the same as for hybrid LARS. The ordinary least squares estimation of the coefficients at the second step is replaced by a LASSO estimation with a small penalty. In this approach, for the selected regressors at a given step of the LARS algorithm, one performs LASSO with a small penalty coefficient $\delta$, such that no regressor is eliminated. Hybrid LARS is a particular case of this algorithm, with $\delta = 0$.

1.3.6 Dantzig selector
The Dantzig selector of Candes and Tao [2007] is based on the resolution of the following optimization problem:

$$\beta_{\mathrm{Dantzig}} = \underset{\beta \in \mathbb{R}^p}{\mathrm{argmin}} \; \left\| H^T \left( y_{\mathrm{obs}} - H\beta \right) \right\|_\infty \quad \text{subject to} \quad \|\beta\|_1 \le t, \qquad (1.3.6)$$

with $t \in \mathbb{R}_+$.

In the same way as LARS, the Dantzig selector sets some coefficients to zero, thus selecting some regressors. However, Efron et al. [2004] and Meinshausen et al. [2007] have shown that the linear model obtained with LASSO is as accurate as or more accurate than the one obtained with the Dantzig selector.

Note that a DASSO (DAntzig Selector with Sequential Optimization) algorithm has been proposed by James et al. [2008] in order to compute in one step the whole path of the Dantzig selector.
1.3.7 Conclusions

In this section, methods which enable the selection of the regressors of a linear model have been reviewed. Such approaches are particularly useful when the number of observations $n$ is small compared to the number of possible regressors $p$ of the linear model.

1.4 Gaussian process regression or Kriging
This section is devoted to the surrogate modeling of a computer code by Gaussian process regression.

Gaussian process regression is widely used in computer experiments [Sacks et al., 1989; Santner et al., 2003; Rasmussen and Williams, 2006]. In the Gaussian process regression framework, the output $y$ of the code can be seen as a realization of a Gaussian process.

In the remainder of the section, we first outline the multidimensional Gaussian distribution and the definition of a Gaussian process. Then the Gaussian process regression framework for a known covariance function is presented. Finally, the estimation of the hyperparameters of parametric covariance functions is described.
1.4.1 Gaussian processes

1.4.1.1 Multidimensional (multivariate) Gaussian distribution

A random vector $u = (u_1, \ldots, u_n)$, $n \ge 1$, is a Gaussian vector if the following equivalent assumptions are verified:

• for any $a \in \mathbb{R}^n$, $a^T u$ has a Gaussian distribution,
• the characteristic function of $u$ is of the form $v \mapsto \exp\left( i v^T m - \frac{1}{2} v^T K v \right)$, with $m$ an $n$-dimensional vector and $K$ an $(n \times n)$-dimensional matrix which is symmetric and positive semidefinite.

1.4.1.2 Gaussian processes
A random process associates to any value of $x$ a random variable $Y(x)$. A random process is a Gaussian process if its finite-dimensional distributions are Gaussian distributions. A Gaussian process $Y$ is characterized by its mean and covariance functions. The mean function is defined by:

$$m(x) = \mathbb{E}[Y(x)]. \qquad (1.4.1)$$

The covariance function is defined by:

$$C\left(x, x'\right) = \mathrm{cov}\left(Y(x), Y\left(x'\right)\right), \qquad (1.4.2)$$

for $x$, $x'$ in $\mathbb{X}$.

A Gaussian process is said to be stationary if, for all $x^{(1)}, \ldots, x^{(n)}$ in $\mathbb{X}$ and $h \in \mathbb{R}^d$ such that $x^{(1)} + h, \ldots, x^{(n)} + h$ are still in $\mathbb{X}$, the multidimensional distribution of the Gaussian process $Y$ at $x^{(1)}, \ldots, x^{(n)}$ is the same as the one at $x^{(1)} + h, \ldots, x^{(n)} + h$.

It follows that a covariance function is said to be stationary if, for all $x, x', x + h, x' + h \in \mathbb{X}$, one has:

$$C\left(x + h, x' + h\right) = C\left(x, x'\right) = C\left(x - x', 0\right). \qquad (1.4.3)$$

Finally, a Gaussian process is stationary if and only if its mean function is constant and its covariance function is stationary.

The next section outlines some classical parametric families of stationary covariance functions and their properties. For a more detailed review of covariance functions, the interested reader may refer to Abrahamsen [1997] and Rasmussen and Williams [2006].
1.4.1.3 Parametric families of stationary covariance functions

Typical parametric families of covariance functions are of the form:

$$C\left(x, x'\right) = \sigma^2 K_\ell\left(x - x'\right), \qquad (1.4.4)$$

where $K_\ell$ is a correlation function parametrized by the vector of correlation lengths $\ell \in (0, +\infty)^d$, and $\sigma^2 \in (0, +\infty)$ is a variance parameter.

The following paragraphs present some classical stationary correlation functions $K_\ell$.

The nugget correlation function
The nugget correlation function is defined by:

$$K_\ell\left(x - x'\right) = \delta_{x = x'}, \qquad (1.4.5)$$

where $\delta$ denotes the Kronecker delta. Note that this covariance function does not depend on any correlation length.

By construction, the observations of a Gaussian process with a nugget correlation function are not correlated, and consequently independent and identically distributed.

Figure 1.1 presents an example of a path of the centered Gaussian process with the nugget correlation function and a unit variance $\sigma^2$. The trajectory is very rough and all the observations are independent.
0.2
0.4
0.6
0.8
1.0
−2
−1
0
1
2
PSfragrepla ementsx
y
(x
)
Figure1.1: Anexampleofapathofthe enteredGaussianpro esswiththenugget orrelation
fun tionand a unitvarian e
σ
2
.
The squared exponential orrelation fun tion
Thesquaredexponential (orGaussian) orrelationfun tion isdened by:
K
ℓ
x
− x
′
= exp
−d
ℓ
x
− x
′
2
,
(1.4.6) whered
ℓ
(x
− x
′
) =
v
u
u
t
X
d
i=1
x
i
− x
′
i
ℓ
i
2
. Thetraje toriesofaGaussianpro esswithasquared
exponential orrelationfun tionareinnitelydierentiable. This ovarian efun tioniswidely
usedinKrigingmodels. However,theassumptionofinnitedierentiabilitymaybeunrealisti
[Stein,1999℄.
Figure1.2presentsthesquared-exponential orrelationfun tion andan exampleof apathof
the enteredGaussianpro esswithasquared-exponential orrelationfun tion,aunitvarian e
σ
2
,andthefollowing orrelation lengths:ℓ
∈ {0.05, 0.1, 0.2}
. It an be seenthat theshorter the orrelationlengthis,thefasterthe orrelationfun tionde reases. Besides,thepathvariesmoreifthe orrelationlength isshort. Finally,notethatthetraje tories areverysmooth,in
agreement withtheir innitedierentiability.
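Paths like the ones shown in these figures can be simulated directly; here is a minimal sketch, assuming a centered process with the squared exponential correlation (1.4.6) discretized on a regular grid of [0, 1].

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200)
ell, sigma2 = 0.1, 1.0
# Covariance matrix of the process on the grid, cf. (1.4.4) and (1.4.6).
K = sigma2 * np.exp(-(((x[:, None] - x[None, :]) / ell) ** 2))
K += 1e-10 * np.eye(len(x))              # jitter for numerical stability
# A path is L z, with K = L L^T (Cholesky) and z a standard Gaussian vector.
L = np.linalg.cholesky(K)
path = L @ np.random.default_rng(2).standard_normal(len(x))
```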
The Matérn correlation function

The multi-dimensional Matérn kernel can be defined as:

$$K_\ell\left(x - x'\right) = \frac{1}{\Gamma(\nu)\, 2^{\nu - 1}} \left( 2\sqrt{\nu}\, d_\ell\left(x - x'\right) \right)^\nu K_\nu\left( 2\sqrt{\nu}\, d_\ell\left(x - x'\right) \right), \qquad (1.4.7)$$

with $\Gamma(\cdot)$ the gamma function, $K_\nu$ a modified Bessel function [Abramowitz and Stegun, 1965] and $\nu \ge \frac{1}{2}$ the smoothness hyperparameter.

Note that as $\nu \to \infty$, the Matérn kernel tends to the squared exponential correlation function. Besides, when $\nu = k + \frac{1}{2}$, $k \in \mathbb{N}$, the Matérn kernel has a simpler form. In particular, we have:

• if $\nu = \frac{1}{2}$:

$$K_\ell\left(x - x'\right) = \exp\left( - d_\ell\left(x - x'\right) \right), \qquad (1.4.8)$$
• if $\nu = \frac{3}{2}$:

$$K_\ell\left(x - x'\right) = \left( 1 + \sqrt{3}\, d_\ell\left(x - x'\right) \right) \exp\left( - \sqrt{3}\, d_\ell\left(x - x'\right) \right), \qquad (1.4.9)$$

• if $\nu = \frac{5}{2}$:

$$K_\ell\left(x - x'\right) = \left( 1 + \sqrt{5}\, d_\ell\left(x - x'\right) + \frac{5}{3}\, d_\ell\left(x - x'\right)^2 \right) \exp\left( - \sqrt{5}\, d_\ell\left(x - x'\right) \right). \qquad (1.4.10)$$

Figure 1.2: On the left figure: plot of the squared-exponential correlation function. On the right plot: an example of a path of the centered Gaussian processes with a squared-exponential correlation function $K_\ell$, $\ell \in \{0.05, 0.1, 0.2\}$, and a unit variance.
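The correlation functions above translate directly into code; a minimal sketch with anisotropic correlation lengths follows (the function names are illustrative).

```python
import numpy as np

# ell is a length-d vector of correlation lengths; sigma2 a variance parameter.
def d_ell(x, xp, ell):
    return np.sqrt(np.sum(((x - xp) / ell) ** 2))

def squared_exponential(x, xp, ell, sigma2=1.0):
    return sigma2 * np.exp(-d_ell(x, xp, ell) ** 2)          # Eq. (1.4.6)

def matern_32(x, xp, ell, sigma2=1.0):
    h = np.sqrt(3.0) * d_ell(x, xp, ell)
    return sigma2 * (1.0 + h) * np.exp(-h)                   # Eq. (1.4.9)

def matern_52(x, xp, ell, sigma2=1.0):
    h = np.sqrt(5.0) * d_ell(x, xp, ell)
    return sigma2 * (1.0 + h + h**2 / 3.0) * np.exp(-h)      # Eq. (1.4.10)

print(matern_32(np.array([0.1]), np.array([0.4]), ell=np.array([0.2])))
```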
Figure 1.3 presents the exponential correlation function and an example of a path of the centered Gaussian processes with an exponential correlation function, a unit variance $\sigma^2$ and the following correlation lengths: $\ell \in \{0.05, 0.1, 0.2\}$. The trajectories are not differentiable.

Figure 1.4 presents the Matérn $\frac{3}{2}$ correlation function and examples of a path of the centered Gaussian processes with a Matérn $\frac{3}{2}$ correlation function, a unit variance $\sigma^2$, and the following correlation lengths: $\ell \in \{0.05, 0.1, 0.2\}$. The trajectories are not very smooth, but smoother than with the exponential correlation function.

Figure 1.5 presents the Matérn $\frac{5}{2}$ correlation function and an example of a path of the centered Gaussian processes with a Matérn $\frac{5}{2}$ correlation function, a unit variance $\sigma^2$, and the following correlation lengths: $\ell \in \{0.05, 0.1, 0.2\}$. The trajectories are relatively smooth.

It can be seen on Figures 1.2 to 1.5 that the shorter the correlation length is, the faster the correlation function decreases. Besides, the path varies more if the correlation length is short.

Figure 1.6 presents the Matérn correlation function and examples of a path of the centered Gaussian processes with a Matérn correlation function, a correlation length equal to $0.5$, a unit variance $\sigma^2$, and the following values of the smoothness parameter: $\nu \in \{\frac{1}{2}, \frac{3}{2}, \frac{5}{2}, \infty\}$. It can be seen that the smoothness parameter strongly impacts the form of the correlation function.
Figure 1.3: On the left figure: plot of the exponential correlation function. On the right plot: an example of paths of the centered Gaussian processes with an exponential correlation function $K_\ell$, $\ell \in \{0.05, 0.1, 0.2\}$, and a unit variance.
Figure 1.4: On the left figure: plot of the Matérn