HAL Id: tel-02092072
https://hal.archives-ouvertes.fr/tel-02092072v4
Submitted on 9 Dec 2019
HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
Gaussian process regression of two nested computer
codes
Sophie Marque-Pucheu
To cite this version:
Sophie Marque-Pucheu. Gaussian process regression of two nested computer codes. General Mathematics [math.GM]. Université Sorbonne Paris Cité, 2018. English. NNT: 2018USPCC155. tel-02092072v4
PhD thesis of Université Sorbonne Paris Cité
Prepared at Université Paris Diderot
École doctorale no. 386 Mathématiques Paris Centre
Laboratoire de Probabilités, Statistique et Modélisation

Gaussian process regression of two nested computer codes

By Sophie Marque-Pucheu

Doctoral thesis in Applied Mathematics
Presented and publicly defended in Paris on 10 October 2018 before the following jury:

Examiner: Aurélie Fischer, Maître de conférences, Université Paris Diderot
Thesis advisor: Josselin Garnier, Professor, École Polytechnique
Examiner: Amandine Marrel, Research engineer, CEA
Reviewer: Hervé Monod, Research director, INRA
President of the jury: Anthony Nouy, Professor, École Centrale Nantes
Examiner: Guillaume Perrin, Research engineer, CEA

Based on the reports of:
Hervé Monod, Research director, INRA
Youssef Marzouk, Professor, MIT
Acknowledgments

My first thanks naturally go to the two advisors who accompanied me during these three years of thesis. Josselin Garnier, my thesis director, followed my work attentively. His very broad scientific culture and his responsiveness were a precious help. Guillaume Perrin, my advisor at CEA, also showed a very strong involvement in the supervision of this thesis. I thank him for his pedagogy and his trust.

I also thank the two reviewers of this thesis: Hervé Monod and Youssef Marzouk. Thank you, Mr. Marzouk, for your careful reading of the manuscript. Thanks also to Mr. Monod for his attentive reading of the manuscript, as well as for having been a member of the jury.

I also wish to thank Aurélie Fischer, Amandine Marrel and Anthony Nouy for having accepted to be part of my thesis jury.

Finally, I salute all the members of the uncertainty team at CEA, as well as the PhD students of building B.

I also thank all the administrative staff of CEA and Université Paris Diderot who facilitated my daily life in one way or another.

My last thanks go to my family, in particular my parents, who passed on to me the taste for learning. Last but not least, I finally thank my partner for his wonderful support.
Résumé

This thesis deals with the Gaussian process surrogate modeling (or emulation) of two coupled codes. The term "two coupled codes" refers here to a system of two chained codes: the output of the first code is one of the inputs of the second code.

The two codes are computationally expensive. In order to perform a sensitivity analysis of the output of the coupled code, we seek to build a surrogate model of this output from a small number of observations. Three types of observations of the system exist: those of the complete chain, those of the first code only, and those of the second code only. The obtained surrogate model has to be accurate in the most probable regions of the input domain.

The surrogate models are obtained by universal Kriging, with a Bayesian approach.

First, the case without intermediary information, with scalar outputs, is addressed. An innovative method for defining the mean function of the Gaussian process, based on the coupling of two polynomials, is proposed. Then the case with intermediary information is addressed. A predictor based on the coupling of the Gaussian predictors associated with the two codes is proposed. Methods for quickly evaluating the mean and the variance of the obtained predictor are proposed. The results obtained for the scalar case are then extended to the case where the two codes have high dimensional outputs. To that end, an efficient dimension reduction method of the high dimensional intermediary variable is proposed, in order to facilitate the Gaussian process regression of the second code. The proposed methods are applied to numerical examples.

Keywords

Nested computer codes, coupled codes, chained codes, Gaussian process regression, surrogate modeling, functional variable, dimension reduction, Stepwise Uncertainty Reduction, sequential designs of experiments.
Abstract

This thesis deals with the Gaussian process regression of two nested codes. The term "nested codes" refers to a system of two chained computer codes: the output of the first code is one of the inputs of the second code.

The two codes are computationally expensive. In order to perform a sensitivity analysis, we aim at emulating the output of the nested code from a small number of observations. Three types of observations of the system exist: those of the chained code, those of the first code only, and those of the second code only. The surrogate model has to be accurate on the most likely regions of the input domain of the nested code.

In this work, the surrogate models are constructed using the Universal Kriging framework, with a Bayesian approach.

First, the case when there is no information about the intermediary variable (the output of the first code) is addressed. An innovative parametrization of the mean function of the Gaussian process modeling the nested code is proposed. It is based on the coupling of two polynomials.

Then, the case with intermediary observations is addressed. A stochastic predictor based on the coupling of the predictors associated with the two codes is proposed. Methods aiming at computing quickly the mean and the variance of this predictor are proposed. Finally, the methods obtained for the case of codes with scalar outputs are extended to the case of codes with high dimensional vectorial outputs.

We propose an efficient dimension reduction method of the high dimensional vectorial input of the second code, in order to facilitate its Gaussian process regression. The proposed methods are applied to numerical examples.

Keywords

Nested computer codes, Gaussian process regression, surrogate modeling, functional variable, dimension reduction, Stepwise Uncertainty Reduction, sequential designs of experiments.
Extended summary

This thesis presents new developments for the surrogate modeling of expensive chained codes, where the output of the first code is one of the inputs of the following code. This configuration, and its generalization to more than two codes, is frequently encountered in practice, but the construction of surrogate models suited to this configuration has been little studied so far.

This manuscript contains three new contributions with respect to the state of the art, detailed in Chapters 3 to 5. The first contribution concerns Gaussian process regression with a mean function defined by a polynomial. A new method for defining the polynomial trend, based on the composition of two polynomials, is proposed. In this setting, the intermediary variable between the two codes is not known.

The second contribution assumes the knowledge of the intermediary variable and deals with the enrichment of the design of experiments for the Gaussian process regression of the output of the chain of two codes. The choice of a new observation raises several questions. First, for a given code, the input variables of the new observation have to be chosen. Then, since there are two codes, the question also arises (if possible) of choosing which of the two codes a new observation should be added to.

The third contribution deals with the case of two codes whose outputs have a very high dimension (for example, functions of time). In this configuration, the second code has a functional output, but also a functional input. A dimension reduction method of the functional input suited to this case is then proposed. The enrichment criteria proposed previously are combined with this dimension reduction method in order to extend them to the case of two codes with functional outputs. The proposed methods are then applied to an industrial test case modeling the explosion of a charge in a spherical tank. This test case is associated with a coupling between a detonation code and a structural dynamics code.

The following paragraphs present the structure of the manuscript in more detail.

The first chapter reviews the state of the art concerning the surrogate modeling of a single code with low dimensional input and output. A brief presentation of linear regression and polynomial chaos is given, as well as of regularization methods like LASSO or LARS. The rest of the chapter is dedicated to Gaussian process (GP) regression, or Kriging. After a reminder of the basics of Gaussian process regression, such as the choice of the covariance function, universal Kriging in a Bayesian framework is presented. Then, criteria for designs of experiments for Gaussian process regression and Bayesian optimization are reviewed. The chapter concludes with a brief part concerning sensitivity analysis, in particular the methods based on a decomposition of the variance (Sobol indices).

The second chapter reviews the methods for the Gaussian process regression of a code whose input and/or output is defined as a discretized function of time. The focus here is on the reduction of the dimension of the input or the output. Concerning the reduction of the dimension of the input, some methods take into account the functional input only, while others aim at reducing the dimension of the input in a way adapted to the output. The latter are particularly well suited to the chained system considered in this work. Concerning the functional output, two approaches are possible. The first consists in projecting the functional output onto a basis of reduced dimension.

The third chapter contains the first contribution of this thesis: the construction of a mean function of the Gaussian process by coupling two polynomials. This approach integrates the prior information about the chained structure of the two codes, but without observations or knowledge of the structure of the intermediary variable. In this case, the configuration is close to a classical Gaussian process regression, with observations of the inputs and output of the chain of codes. The specificity of the method lies in the use of the information about this chained structure. The definition of the mean function comprises a first step of composition of two polynomials, then a second step of linearization of this composition. This linearization makes it possible to limit the impact of an estimation error of the parameters of each of the two polynomials. Then the predictor of the output of the chain of codes is constructed using universal Kriging in a Bayesian framework. Moreover, the proposed structure for the polynomial trend offers a great flexibility, since the total orders of each of the two polynomials, but also the dimension of the output of the first polynomial, can be optimized. However, this flexibility requires the resolution of a complex, non-convex optimization problem. A heuristic approach, based on an alternate minimization with respect to the variables, is proposed to solve this optimization problem. Furthermore, a criterion based on the Leave One Out (LOO) error is used to characterize the prediction performance of the Gaussian predictor. This criterion is used to choose the best-performing combination of values for the total orders of the two polynomials and the dimension of the output of the first polynomial.

The fourth chapter contains the second contribution of this thesis: the surrogate modeling of two chained codes when observations of the intermediary variable are available. The proposed predictor is based on a coupling of the Gaussian predictors of each of the two codes. The chapter proposes in particular two criteria for the enrichment of the design of experiments. These criteria rely on a minimization of the integrated prediction variance (IMSE). The prediction variance must therefore be evaluated at a very large number of points. The first criterion corresponds to the case where the two codes cannot be called separately. The second corresponds to the case where the codes can be run separately. In this case, one can choose which of the two codes to call, by retaining the one which maximizes the reduction of the integrated prediction variance per unit of computation time for one evaluation of the code. A major difficulty of this approach lies in the fact that the coupling of two Gaussian predictors is not Gaussian. The prediction variance must therefore be evaluated using quadrature or Monte Carlo methods. In order to overcome these numerical difficulties, two methods for a fast evaluation of the prediction variance are proposed. In the first case, if the Gaussian process associated with the second code has a Gaussian covariance function and a polynomial trend, then the variance can be evaluated analytically. In the case where these conditions do not hold, another approach relying on the linearization of the coupling of the two predictors can be used. The proposed methods are then applied to two numerical examples: a first analytical one and a second one concerning the ballistic trajectory of a conical projectile. The obtained results show the interest of taking into account the observations of the intermediary variable and of being able to call each of the two codes separately.

The fifth chapter contains the final contributions of this thesis and concerns the Gaussian process surrogate modeling of two chained codes with functional (very high dimensional) outputs. The major contribution of this chapter is a dimension reduction method of the functional input of the second code, which is linear with respect to this functional input (which is also the output of the first code). The proposed linear model is in fact a causal filter, parametrized by a small number of variables which can be estimated from a small number of observations.

This combination of a linear approximation and of a dimension reduction adapted to this linear model makes it possible to reduce the dimension of the functional input of the second code in a way adapted to the prediction of the output of this code.

Thanks to this dimension reduction, each of the two codes can be associated with a Gaussian process with a low dimensional input vector. Two Gaussian predictors are obtained using a tensorized covariance in order to take into account the multidimensional character of the outputs of the considered functions. The predictors are then coupled and the coupling is linearized. This yields a Gaussian predictor of the functional output of the chain of two codes. The mean and the variance of the predictor can then be evaluated analytically, and therefore very quickly. The enrichment criteria proposed in the previous chapter are then adapted to the case of two coupled codes with functional outputs. Finally, the proposed methods are applied to the industrial test case which motivated this thesis, namely the coupling of a detonation code with a structural dynamics code. The outputs of each of the codes are discretized functions of time. The obtained results show the interest of taking into account the observations of the intermediary variable, compared to a simple Gaussian process regression of the output of the chain of codes.
Contents

Introduction
Notations

I State of the art for the surrogate modeling of computer codes

1 Surrogate modeling of a single code with scalar inputs and output
1.1 Linear regression
1.2 Polynomial Chaos Expansion
1.3 Methods for the selection of the regressors of a linear model
1.3.1 Stepwise and all-subsets regressions
1.3.2 Ridge regression
1.3.3 LASSO
1.3.4 Forward stagewise regression
1.3.5 Least Angle Regression
1.3.6 Dantzig selector
1.3.7 Conclusions
1.4 Gaussian process regression or Kriging
1.4.1 Gaussian processes
1.4.2 Ordinary, simple and universal Kriging
1.4.3 Estimation of a parametric covariance function
1.5 Design of experiments
1.5.1 Space-filling designs
1.5.2 Criterion-based designs
1.5.3 Gaussian processes for pointwise global optimization
1.6 Sensitivity analysis

2 Gaussian process regression of a code with a functional input or output
2.1 Dimension reduction of a functional variable
2.1.1 Dimension reduction adapted to the functional variable only
2.1.2 Dimension reduction adapted to the functional variable and a dependent variable
2.2 Gaussian process prediction of a computer code with a functional output
2.2.1 Projection of the functional output on a basis

II Contributions

3 Nested polynomial trends for the improvement of Gaussian predictors
3.1 Introduction
3.2 Gaussian process predictors
3.2.1 General framework
3.2.2 Choice of the covariance function
3.2.3 Choice of the mean function
3.3 Nested polynomial trends for Gaussian process predictors
3.3.1 Nested polynomial representations
3.3.2 Coupling nested representations and Gaussian processes
3.3.3 Linearization of the nested polynomial trend
3.3.4 Error evaluation
3.3.5 Convergence analysis
3.4 Applications
3.4.1 d = 1
3.4.2 d > 1
3.4.3 Relevance of the LOO error
3.5 Conclusions

4 Gaussian process regression of two nested codes with scalar output
4.1 Introduction
4.2 Surrogate modeling for two nested computer codes
4.2.1 General framework
4.2.2 Gaussian process-based surrogate models
4.2.3 Sequential designs for the improvement of Gaussian process predictors
4.3 Fast computation of the variance of the predictor of the nested code
4.3.1 Explicit derivation of the two first statistical moments of the predictor
4.3.2 Linearized approach
4.4 Applications
4.4.1 Characteristics of the examples
4.4.2 Prediction performance for a given set of observations
4.4.3 Performances of the sequential designs
4.5 Conclusions
4.6 Proofs
4.6.1 Proof of Proposition 4.2.1
4.6.2 Proof of Lemma 4.3.1
4.6.3 Proof of Lemma 4.3.2
4.6.4 Proof of Proposition 4.3.1
4.6.5 Proof of Proposition 4.3.2
4.6.6 Proof of Corollary 4.3.3

5 Gaussian process regression of two nested codes with functional outputs
5.1 Introduction
5.2 Dimension reduction of the functional input of a code
5.2.1 Formalism
5.2.2 Dimension reduction of the functional input only
5.2.3 Partial Least Squares regression
5.2.4 A linear model-based dimension reduction of the functional input
5.3 Gaussian process regression with low dimensional inputs and a functional output
5.4 Surrogate modeling of the nested code
5.4.1 A linearized Gaussian predictor of the nested code
5.4.2 Sequential designs
5.5 First numerical example
5.5.1 Description of the numerical example
5.5.2 Dimension reduction of the functional input of the second code
5.5.3 Prediction of the nested code
5.5.4 Sensitivity analysis
5.6 Second numerical example
5.6.1 Description of the numerical example
5.6.2 Results of the numerical example
5.7 Conclusions
5.8 Proofs
5.8.1 Proof of Proposition 5.2.1
5.8.2 Proof of Lemma 1
5.8.3 Proof of Proposition 5.3.1
5.8.4 Proof of Proposition 5.4.1
5.8.5 Proof of Proposition 5.4.2

Conclusions
Context

Surrogate modeling for the sensitivity analysis of two nested computer codes

This thesis is motivated by an application case: the coupling of two computationally costly computer codes. The first code is a detonation code and the second code is a structural dynamics code. The two codes have functional (i.e. high dimensional vectorial) outputs, and the functional output of the first code is one of the inputs of the second code.

If we aim at performing design and certification studies of such a system, the evaluation of the output of the system at a large number of input points is often necessary. This is especially true when methods like sensitivity analysis, risk analysis or optimization are performed.

In this work we aim at performing a sensitivity analysis of the system mentioned above. Given the computational cost of the two codes, the first objective is to build an emulator, or a surrogate model, of the output of the two nested codes. This surrogate model will be constructed from a small set of observations of the two codes. The number of observations cannot be very high because of the computational costs of the codes.

As the role of simulation is increasing, the surrogate modeling of high-cost codes generates growing interest. However, the existing methods are generally applied to a single code, or consider a system of codes as a single code.

In this work, the framework of Gaussian process regression is considered for the surrogate modeling of the computer codes. In this framework, the output of a code is considered to be the realization of a Gaussian process. The framework used for the Gaussian process regression is the Universal Kriging framework, and a Bayesian approach is utilized. If some not very restrictive assumptions on the prior distribution of the Gaussian process are fulfilled, a Gaussian predictor of the code can be obtained by computing the posterior distribution of the Gaussian process given the observations of the code output.

Moreover, the existing methods for the surrogate modeling of codes generally consider the case of codes with low dimensional vectorial inputs. If a code has a functional input, the dimension of the functional input is often reduced thanks to a projection. The choice of the optimal method of dimension reduction of the functional input for the surrogate modeling of the output remains a research topic.
Contributions of the thesis

This thesis makes contributions to the surrogate modeling of two nested codes with scalar or functional outputs. These contributions aim at addressing the following difficulties of the studied system:

• there are two codes,
• the codes are coupled by a functional intermediary variable,
• the second code has a functional input.

First, the case of two nested codes with scalar outputs is investigated. The considered system is then:
$$
\begin{array}{rcl}
x_1 & \longrightarrow & y_1(x_1) \searrow \\
& & \qquad y_{\mathrm{nest}}(x_{\mathrm{nest}}) := y_2\big(y_1(x_1),\, x_2\big), \\
x_2 & \longrightarrow & \nearrow
\end{array}
\qquad (0.0.1)
$$

with $x_1 \in \mathbb{R}^{d_1}$ and $x_2 \in \mathbb{R}^{d_2}$ the low dimensional vectorial inputs of the two codes, $y_1 \in \mathbb{R}$ and $y_2 \in \mathbb{R}$ the outputs of the two codes, and $d_1$ and $d_2$ two integers.

In a first step, the case where there are no observations of the intermediary variable $y_1(x_1)$ is considered. An innovative parametrization of the mean function of the Gaussian process is proposed. This parametrization is based on the coupling of polynomials and improves the prediction accuracy compared to a classical constant or polynomial mean function.

Then the case where observations of the intermediary variable are available is considered. A stochastic predictor of the nested code is obtained by coupling the Gaussian predictors of the two codes. Such an approach makes it possible to take into account all the types of observations: observations of the nested code, of the first code only and of the second code only. The predictor is non-Gaussian, but its moments can be computed using Monte Carlo methods.

Then we define sequential design criteria which aim at improving the prediction accuracy of the proposed predictor. The criteria are based on a reduction of the integrated prediction variance, because the predictor has to be accurate on the most probable areas of the input domain for the sensitivity analysis. Finally, two adaptations of the proposed predictor are developed in order to evaluate the prediction variance, and thus the proposed sequential design criteria, quickly. The first adaptation is called "analytic" and the second one "linearized". They both make it possible to compute the mean and the variance of the proposed predictor in closed form. The "linearized" method also leads to a Gaussian predictor of the nested code. Moreover, the interest of taking into account the intermediary observations is shown.
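To fix ideas, here is a minimal sketch of the nested structure (0.0.1), with two cheap stand-in functions playing the role of the expensive codes; the bodies of y1 and y2 are purely illustrative, only the nesting pattern mirrors the notation above.

```python
import numpy as np

# Toy stand-ins for the two expensive codes; only the nesting matters.
def y1(x1):                      # first code: R^{d1} -> R
    return float(np.sum(np.sin(x1)))

def y2(y1_val, x2):              # second code: R x R^{d2} -> R
    return y1_val**2 + float(np.sum(np.cos(x2)))

def y_nest(x1, x2):              # nested code: y2(y1(x1), x2), as in (0.0.1)
    return y2(y1(x1), x2)

print(y_nest(np.array([0.1, 0.2]), np.array([0.3])))
```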
Finally, the case of two nested codes with functional outputs is investigated. The considered system is then:

$$
\begin{array}{rcl}
x_1 & \longrightarrow & y_1(x_1) \searrow \\
& & \qquad y_{\mathrm{nest}}(x_{\mathrm{nest}}) := y_2\big(y_1(x_1),\, x_2\big), \\
x_2 & \longrightarrow & \nearrow
\end{array}
\qquad (0.0.2)
$$

with $y_1 \in \mathbb{R}^{N_t}$ and $y_2 \in \mathbb{R}^{N_t}$ the outputs of the two codes when they are functional, $N_t \gg 1$ denoting the number of discretization steps of the functional outputs.

The second code has a functional input, and the existing methods of Gaussian process regression generally consider low dimensional vectorial inputs. The Gaussian process regression of the second code therefore requires the reduction of the dimension of this functional input. We propose a dimension reduction of the functional input of a code which is suited for the prediction of the functional output of this code. This dimension reduction method is based on a two-step approach. First, the output of the second code is approximated by a linear causal filter (a minimal sketch of this idea is given at the end of this section). This linear model has a sparse structure, which is defined by only $N_t$ variables. These variables can be estimated from a small set of observations of the functional input and output of the second code. The second step is the use of a proposed projection basis which is adapted to a linear model. The combination of these two steps yields a dimension reduction of the functional input of the second code which:

• is adapted to the output of this code,
• can be estimated from a small set of observations,
• does not require the knowledge of the derivatives of the output of the code.

Once the dimension of the functional intermediary variable has been efficiently reduced, the previously defined linearized method is adapted to the case of two nested codes with functional outputs. A Gaussian predictor of the functional output of the nested code, with analytic mean and variance, is obtained. Finally, the previously defined sequential design criteria are adapted to this functional setting.
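As an illustration of the causal-filter approximation mentioned above, the following sketch applies a discrete causal convolution to a discretized functional input; the kernel a (the $N_t$ coefficients) is a hypothetical stand-in for the variables that would be estimated from observations.

```python
import numpy as np

def causal_filter(u, a):
    """Causal linear filter: output step i only uses input steps j <= i."""
    N_t = len(u)
    return np.array([np.dot(a[: i + 1][::-1], u[: i + 1]) for i in range(N_t)])

# Hypothetical example: a smoothed, delayed response to a discretized input.
t = np.linspace(0.0, 1.0, 100)
u = np.sin(2 * np.pi * t)                      # output of the first code
a = np.exp(-10 * t) / np.sum(np.exp(-10 * t))  # illustrative filter coefficients
y2_approx = causal_filter(u, a)                # linear proxy of the second code
```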
Outline of the manuscript

The thesis has two parts.

Part I provides a review of the state of the art for the surrogate modeling of computer codes.

In Chapter 1, we review methods for the surrogate modeling of a single code with low dimensional vectorial inputs and a scalar output.
Section 1.1 describes the surrogate modeling of a single code by Linear Regression.
Section 1.2 focuses on the surrogate modeling of a code by Polynomial Chaos Expansion.
Section 1.3 reviews the existing methods for the selection of the regressors in the framework of Linear Regression.
Section 1.4 provides a review of the Gaussian process regression framework for the surrogate modeling of a single code with low dimensional vectorial inputs and a scalar output.
Section 1.5 presents a review of the designs of experiments for an accurate surrogate model on the whole input domain of a code with low dimensional vectorial inputs and a scalar output.
Section 1.6 focuses on the sensitivity analysis of the output of a code, or a quantity associated with it, with respect to the inputs of the code.

In Chapter 2, we review methods for the surrogate modeling of a single code with a functional output, low dimensional vectorial inputs and possibly a functional input.
Section 2.1 is devoted to the existing methods for the dimension reduction of a functional variable.
Section 2.2 reviews the existing methods for the Gaussian process regression of a code with low dimensional vectorial inputs and a functional output.

Part II details our contributions to the construction of a surrogate model of two nested codes with scalar or functional outputs.

In Chapter 3, we focus on the case where the two codes have scalar outputs and no observations of the intermediary variable are available. We propose to define the mean function of the Gaussian process modeling the nested code as a coupling of two polynomials. We show how this parametrization can improve the prediction accuracy of the Gaussian predictor compared to the case where the mean function is defined by polynomials.

In Chapter 4, we focus on the case where the two codes have scalar outputs and observations of the intermediary variable are available. We propose a stochastic predictor of the nested code based on the coupling of the Gaussian predictors of the two codes. This stochastic predictor is non-Gaussian, but its mean and variance can be evaluated using Monte Carlo methods. This predictor can take into account all the possible observations: those of the nested code, those of the first code and those of the second code. Then sequential design criteria are proposed. These design criteria aim at improving the prediction accuracy on the whole input domain of the nested code. One of the criteria can also take into account the difference of computational costs between the two codes. Finally, we propose two adaptations of the previously proposed predictor of the nested code in order to accelerate the computation of its mean and variance. They both make it possible to compute the prediction mean and variance in closed form. In addition, the proposed linearized predictor of the nested code yields a Gaussian predictor of the nested code with conditioned mean and variance functions. The application of the proposed methods to numerical examples shows the interest of taking into account the intermediary observations.

In Chapter 5, we focus on the case of the coupling of two codes with functional outputs. We first propose an efficient dimension reduction of the functional input of the second code. This dimension reduction is based on a linear projection of the functional input of the second code. The proposed projection basis can be estimated from a small set of observations of the second code and does not require the knowledge of the derivatives of the code.

We also extend the linearized predictor of the nested code proposed in Chapter 4 to the case of two nested codes with functional outputs. This extension relies on the dimension reduction of the functional output and a tensorized structure of the Gaussian process modeling the code. By tensorized structure we mean a separation between the index of the output and the inputs. The sequential design criteria are also adapted to the case of two nested codes with functional outputs.

The proposed methods are applied to numerical examples. The results show again the interest of taking appropriately into account the intermediary observations.

The predictor obtained at the end of the sequential enrichment of the initial design is used in order to perform a sensitivity analysis of a scalar quantity of interest based on the functional output of the nested code.
Notations

Ordinal variables

$n$: number of observations
$d$: dimension of an input variable
$p$: number of functions of a basis of functions in the case of Universal Kriging
$N_t$: dimension of the time-varying output of a code
$\mathrm{card}(A)$: number of elements of the set $A$

Matrices, vectors and scalars

$x$: a scalar
$\boldsymbol{x}$: a vector
$x_i$ or $(\boldsymbol{x})_i$: the $i$-th entry of the vector $\boldsymbol{x}$
$\boldsymbol{X}$: a matrix
$(\boldsymbol{X})_{ij}$: the entry at row $i$ and column $j$ of the matrix $\boldsymbol{X}$
$(\boldsymbol{X})_{\cdot i}$: the vector of the entries of the $i$-th column of the matrix $\boldsymbol{X}$
$(\boldsymbol{X})_{i \cdot}$: the vector of the entries of the $i$-th row of the matrix $\boldsymbol{X}$
$\boldsymbol{X}^T$: transpose of the matrix $\boldsymbol{X}$
$\mathrm{diag}(\boldsymbol{x})$: diagonal matrix with diagonal $\boldsymbol{x}$
$\mathrm{diag}(\boldsymbol{X})$: vector corresponding to the diagonal of the matrix $\boldsymbol{X}$
$\mathrm{Tr}(\boldsymbol{X})$: trace of the matrix $\boldsymbol{X}$

Probabilistic notations

$\overset{d}{=}$: equality in distribution
$\mathbb{E}[\cdot]$: mean of a random quantity
$\mathbb{V}[\cdot]$: variance of a random quantity
$\mathcal{N}(m, K)$: multivariate normal distribution with mean $m$ and covariance matrix $K$
$GP(m(\cdot), C(\cdot,\cdot))$: one-dimensional Gaussian process with mean function $m$ and covariance function $C$
$\boldsymbol{GP}(\boldsymbol{m}(\cdot), \boldsymbol{C}(\cdot,\cdot))$: multidimensional Gaussian process with vector-valued mean function $\boldsymbol{m}$ and matrix-valued covariance function $\boldsymbol{C}$

Norms and scalar products

$(\cdot,\cdot)_{\mathbb{X}}$: scalar product in the space of square integrable real-valued functions on $\mathbb{X}$, such that $(y,z)_{\mathbb{X}} := \int_{\mathbb{X}} y(x)\, z(x)\, dx$
$\|\cdot\|_{\mathbb{X}}$: norm in the space of square integrable real-valued functions on $\mathbb{X}$, such that $\|y\|^2_{\mathbb{X}} := (y,y)_{\mathbb{X}}$
$\|\cdot\|_F$: Frobenius norm
$\|\cdot\|_1$: $L^1$ norm, such that $\|x\|_1 = \sum_{i=1}^d |x_i|$
$\|\cdot\|_2$: $L^2$ norm, such that $\|x\|_2 = \sqrt{\sum_{i=1}^d x_i^2}$
Part I

State of the art for the surrogate modeling of computer codes

Methods like uncertainty propagation, sensitivity analysis or optimization require the evaluation of the output of the code at a huge number of input points. If the computational cost of the computer code is high, and only a small number of observations of its output is available, the use of a surrogate model is necessary. In this part we review some existing methods for the surrogate modeling of computer codes.

This part includes two chapters. The first one is devoted to the surrogate modeling of a computer code with scalar (i.e. low dimensional vectorial) inputs and output. The second one focuses on the surrogate modeling with Gaussian process regression of a code with a functional input or output.

Chapter 1

Surrogate modeling of a single code with scalar inputs and output
In this chapter we consider a model of the form $x \mapsto y(x)$, $x \in \mathbb{X} \subset \mathbb{R}^d$, $d$ a positive integer, where $\mu_X$ is a probability measure on the measurable space formed by $\mathbb{X}$ and a $\sigma$-algebra over $\mathbb{X}$. The following sections detail the state of the art for the surrogate modeling of $y$ from a set of $n$ observations of the input and the output of the code. These observations are denoted by:

$$X_{\mathrm{obs}} = \left( x^{(1)} \cdots x^{(n)} \right), \qquad (1.0.1)$$

and

$$y_{\mathrm{obs}} = \left( y^{(1)} = y\big(x^{(1)}\big), \ldots, y^{(n)} = y\big(x^{(n)}\big) \right), \qquad (1.0.2)$$

where $X_{\mathrm{obs}}$ is an $(n \times d)$-dimensional matrix and $y_{\mathrm{obs}}$ is an $n$-dimensional vector.

The first section is devoted to linear regression. The second one deals with the use of Polynomial Chaos Expansion as a surrogate model. The third one focuses on the methods for the selection of regressors in regression models. The fourth one presents the Gaussian process regression for the surrogate modeling of a computer code. Finally, the last section reviews some existing designs of experiments which are adapted for the acquisition of knowledge of the computer code or the sequential improvement of a surrogate model.
1.1 Linear regression

Generalized additive models are a very common tool for the emulation of a response surface [Hastie and Tibshirani, 1990]. The idea is the projection of the output $y$ on a basis of functions $h_i$, $1 \le i \le p$, $p$ a positive integer, of the inputs $x$. The emulator can be written in the form:

$$\hat{y}(x) = h(x)^T \beta, \qquad (1.1.1)$$

where $h(x)$ and $\beta$ are in $\mathbb{R}^p$. The functions of the basis can be polynomials (with Polynomial Chaos Expansion as a particular case), wavelets, trigonometric functions, and so on.

Note that simple linear regression can be regarded as a particular case of the generalized additive models, with a basis of functions comprising only the covariates: $h(x) = x$.

The regression coefficients $\beta$ can be estimated from a set of $n$ observations of the inputs and the output of the code, $X_{\mathrm{obs}}$ and $y_{\mathrm{obs}}$, through the minimization of the quadratic loss function:

$$\hat{\beta} = \underset{\beta \in \mathbb{R}^p}{\mathrm{argmin}} \; \sum_{i=1}^n \left( y\big(x^{(i)}\big) - h\big(x^{(i)}\big)^T \beta \right)^2. \qquad (1.1.2)$$

If we denote:

$$H = \begin{pmatrix} h\big(x^{(1)}\big)^T \\ \vdots \\ h\big(x^{(n)}\big)^T \end{pmatrix}, \qquad (1.1.3)$$

then the least squares estimate of the regression coefficients can be written:

$$\hat{\beta} = H^{+} y_{\mathrm{obs}}, \qquad (1.1.4)$$

where $H^{+}$ is the pseudo-inverse of $H$. If $n \ge p$ and $H$ is of rank $p$, then $H^T H$ is invertible and $H^{+} = \big(H^T H\big)^{-1} H^T$. By definition, $H$ is an $(n \times p)$-dimensional matrix.

However, the matrix $H^T H$ is not always invertible. The number of observations can be smaller than the number of regression coefficients ($n \le p$), or the functions of the basis can be correlated according to the probability measure $\mu_X$, which means that the columns of $H$ are correlated, thus reducing the rank of the matrix $H$.

The matrix $H^T H$ is more likely to be invertible if the basis functions are decorrelated with respect to the probability measure $\mu_X$ of the inputs, as performed with Polynomial Chaos Expansion. Another possible approach is the use of a regularization term for the inversion of the matrix, or the selection of the most influential regressors. The two following sections detail these two approaches.
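As a minimal numerical illustration of the estimate (1.1.4), the following sketch fits a small polynomial basis by least squares; the toy response and the choice of basis are assumptions made for the example only.

```python
import numpy as np

# Illustrative basis h(x) = (1, x, x^2) for a scalar input, so H is (n x p).
def basis(x):
    return np.stack([np.ones_like(x), x, x**2], axis=1)

rng = np.random.default_rng(0)
x_obs = rng.uniform(-1.0, 1.0, size=50)
y_obs = np.cos(2.0 * x_obs) + 0.05 * rng.standard_normal(50)  # toy "code" output

H = basis(x_obs)
beta_hat = np.linalg.pinv(H) @ y_obs        # beta_hat = H^+ y_obs, Eq. (1.1.4)
y_hat = basis(np.array([0.3])) @ beta_hat   # emulator prediction at a new point
```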
1.2 Polynomial Chaos Expansion

Polynomial Chaos Expansion can be used to emulate a model response $y$ with inputs $x$, provided that the probability measure $\mu_X$ associated with $x$ is a product measure, so that the components of the input vector are independent. It was applied by Ghanem and Spanos [1990] to stochastic finite elements methods. Polynomial Chaos Expansion can be seen as the projection of the model output $y$ on a polynomial basis which depends on the distribution of the model inputs $x$. The polynomials are orthonormal with respect to the distribution of $x$. The model response can therefore be expanded as:

$$y(x) = \sum_{\alpha \in \mathbb{N}^d} \beta_\alpha \Phi_\alpha(x), \qquad (1.2.1)$$

with $\beta_\alpha \in \mathbb{R}$ and $\Phi_\alpha$ orthonormal multidimensional polynomials, which means:

$$\int_{\mathbb{X}} \Phi_\alpha(x)\, \Phi_\gamma(x)\, d\mu_X(x) = \delta_{\alpha\gamma}, \qquad (1.2.2)$$

with $\delta_{\alpha\gamma}$ denoting the Kronecker delta.

In practice, the expansion of Eq. (1.2.1) can be truncated in order to obtain a surrogate model of the model response. If we denote by $\mathcal{A} \subset \mathbb{N}^d$ the truncated set of indices, by $\beta_{\mathcal{A}}$ the vector gathering the $\beta_\alpha$, $\alpha \in \mathcal{A}$, and by $\Phi_{\mathcal{A}}$ the vector gathering the selected polynomials, this surrogate model is defined as:

$$\hat{y}(x) = \Phi_{\mathcal{A}}(x)^T \beta_{\mathcal{A}}. \qquad (1.2.3)$$

Note that the truncation is generally defined by an upper bound $r$ on the total order of the polynomials, which means $\mathcal{A} = \{\alpha \in \mathbb{N}^d,\ \|\alpha\|_1 \le r\}$. The total order $r$ can be chosen adaptively according to a target precision, with an estimation of the error thanks to a cross-validation criterion [Blatman and Sudret, 2010, 2011].

A coefficient $\beta_\alpha$ is defined as the projection of the model response on the function $\Phi_\alpha$:

$$\beta_\alpha = \int_{\mathbb{X}} y(x)\, \Phi_\alpha(x)\, d\mu_X(x). \qquad (1.2.4)$$
Distribution | Density | Orthonormal basis
Uniform | $\frac{1}{2} \mathbf{1}_{[-1,1]}(x)$ | $P_k(x)\sqrt{2k+1}$, with $P_k$ the Legendre polynomial
Gaussian | $\frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x^2}{2}\right)$ | $\frac{H_k(x)}{\sqrt{k!}}$, with $H_k$ the Hermite polynomial
Gamma | $\frac{x^a}{\Gamma(a+1)} \exp(-x)\, \mathbf{1}_{x>0}$ | $L_k(x) \sqrt{\frac{k!\,\Gamma(a+1)}{\Gamma(k+a+1)}}$, with $L_k$ the Laguerre polynomial

Table 1.1: Classical univariate polynomial families used for Polynomial Chaos Expansion.
The integral can be estimated using Monte-Carlo methods, quadrature rules [Ghiocel and Ghanem, 2002] or stochastic collocation methods [Xiu, 2009].

The coefficients can also be estimated by least squares regression [Blatman and Sudret, 2010, 2011] from a set of $n$ observations:

$$\hat{\beta}_{\mathcal{A}} = \underset{\beta_{\mathcal{A}} \in \mathbb{R}^{\mathrm{card}(\mathcal{A})}}{\mathrm{argmin}} \; \sum_{i=1}^n \left( y^{(i)} - \Phi_{\mathcal{A}}\big(x^{(i)}\big)^T \beta_{\mathcal{A}} \right)^2. \qquad (1.2.5)$$

Note that if the observations are drawn according to the distribution of the inputs, the metamodel will be more accurate in the high-probability regions of the input domain.

The usual one-dimensional polynomial families used for Polynomial Chaos Expansion, which are chosen according to the distribution of the one-dimensional variable $x$, are given in Table 1.1.

Furthermore, the inputs can be transformed using an isoprobabilistic transformation, such as the Nataf or the Rosenblatt transformations [Nataf, 1962; Rosenblatt, 1952; Lebrun and Dutfoy, 2009]. Such transformations map $x$ to a $d$-dimensional standard Gaussian variable $\xi$ (i.e. $d$ independent standard Gaussian variables). Then a Polynomial Chaos Expansion can be performed using Hermite polynomials [Blatman and Sudret, 2011]. The expansion becomes:

$$y(x) = \sum_{\alpha \in \mathbb{N}^d} \beta_\alpha H_\alpha(T(x)), \qquad (1.2.6)$$

where $H_\alpha = \prod_{i=1}^d H_{\alpha_i}$ and

$$\beta_\alpha = \int_{T(\mathbb{X})} y\big(T^{-1}(\xi)\big)\, H_\alpha(\xi) \prod_{i=1}^d \varphi(\xi_i)\, d\xi. \qquad (1.2.7)$$

Here, $T : x \mapsto \xi$ is the isoprobabilistic transformation and $T^{-1}$ its inverse, $H_\alpha$ are Hermite polynomials, and $\varphi$ is the standard univariate Gaussian probability density function.

Thanks to this isoprobabilistic transformation, the Polynomial Chaos Expansion of a computer code can be performed even when the inputs do not follow one of the classical distributions of Table 1.1.
1.3 Methods for the selection of the regressors of a linear model

In this section we review the existing methods for the selection of the most influential regressors for linear regression or Polynomial Chaos Expansion. The methods are presented in the chronological order of their appearance. Two approaches can be distinguished: the first one selects the regressors which are the most influential; the second one shrinks the coefficients associated with the least influential regressors.

1.3.1 Stepwise and all-subsets regressions

Stepwise regression aims at selecting the regressors which improve the prediction accuracy the most. There are three main approaches to perform this selection: forward selection, backward elimination and bidirectional elimination.

In the forward method, the set of the selected regressors is empty at the initial step. Then, at each step, one adds the regressor which best improves the prediction accuracy of the regression model. The addition continues until a stopping criterion is reached.

On the contrary, with the backward elimination, a huge number of regressors are selected at the initial step. Then the regressors which contribute the least to the prediction accuracy are removed step by step from the regression model.

Efroymson [1960] introduced an approach combining forward selection and backward elimination. At each step of the forward selection, the interest of removing one of the previously selected regressors is studied.

However, stepwise regression is known as being greedy and quite unstable [Hesterberg et al., 2008].

In parallel, all-subsets regression has been introduced by Furnival and Wilson [1974]. It relies on the evaluation of the accuracy of all the regression models based on all the subsets of the set of regressors. Even though exhaustive, this approach can be computationally expensive, especially when the number of regressors is high.
1.3.2 Ridge regression

Introduced by Hoerl and Kennard [1970], ridge regression is based on a penalization of the coefficients of the regressors. This penalization can be seen as a regularization of the regression problem. The coefficients obtained with the ridge regression are the solutions of the following optimization problem:

$$\hat{\beta}_{\mathrm{ridge}} = \underset{\beta \in \mathbb{R}^p}{\mathrm{argmin}} \; \sum_{i=1}^n \left( y\big(x^{(i)}\big) - h\big(x^{(i)}\big)^T \beta \right)^2 + \delta \|\beta\|_2^2, \qquad (1.3.1)$$

with $\delta$ a non-negative real-valued constant. This leads to the normal equation:

$$\left( H^T H + \delta I_p \right) \hat{\beta}_{\mathrm{ridge}} = H^T y_{\mathrm{obs}}. \qquad (1.3.2)$$

Practically, the optimal value of $\delta$ can be estimated thanks to a cross-validation criterion. The absolute value of the coefficients decreases as $\delta$ increases. When $\delta = 0$, the result is the same as the one of ordinary least squares. If $\delta > 0$, then the matrix $H^T H + \delta I_p$ is positive definite and thus invertible.
Ridge regression is a particular case of the Tikhonov regularization [Tikhonov and Arsenin, 1977], which is defined as follows:

$$\hat{\beta}_{\mathrm{Tikhonov}} = \underset{\beta \in \mathbb{R}^p}{\mathrm{argmin}} \; \sum_{i=1}^n \left( y\big(x^{(i)}\big) - h\big(x^{(i)}\big)^T \beta \right)^2 + \|\Gamma \beta\|^2, \qquad (1.3.3)$$

with $\Gamma$ a $(p \times p)$-dimensional matrix. If $\Gamma^T \Gamma$ is positive definite, this problem has the following explicit solution:

$$\hat{\beta}_{\mathrm{Tikhonov}} = \left( H^T H + \Gamma^T \Gamma \right)^{-1} H^T y_{\mathrm{obs}}. \qquad (1.3.4)$$

Note that if $\Gamma$ is defined such that $\Gamma^T \Gamma$ is positive definite, then the matrix $H^T H + \Gamma^T \Gamma$ is an invertible matrix.
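A minimal sketch of solving the normal equation (1.3.2); the function name and the random design used in the usage example are hypothetical.

```python
import numpy as np

def ridge_fit(H, y_obs, delta):
    """Solve (H^T H + delta * I_p) beta = H^T y_obs, cf. Eq. (1.3.2)."""
    p = H.shape[1]
    # For delta > 0 the system matrix is positive definite, hence invertible.
    return np.linalg.solve(H.T @ H + delta * np.eye(p), H.T @ y_obs)

# Usage (illustrative): the larger delta, the smaller the coefficients.
rng = np.random.default_rng(0)
H = rng.standard_normal((30, 5))
y_obs = H @ np.array([1.0, -2.0, 0.0, 0.5, 3.0]) + 0.1 * rng.standard_normal(30)
print(ridge_fit(H, y_obs, delta=1.0))
```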
1.3.3 LASSO
TheLeastAbsoluteShrinkage andSele tionOperator (LASSO)methodhasbeen introdu ed
by Tibshirani [1989℄. It relies on a
L
1
-penalization of the estimation ofβ
, whi h an be written:b
β
LASSO= argmin
β
∈R
p
n
X
i=1
y
x
(i)
− h
x
(i)
T
β
2
+ δ
kβk
1
,
(1.3.5)with
δ
a non-negative onstant.Thehigher
δ
is, themore zero oe ientsthere areand thesparsertheregressionmodel is.1.3.4 Forward stagewise regression
Hastie et al. [2001] have introduced the forward stagewise regression. Although different from LASSO, it yields similar results. The procedure can be defined by the following algorithm:

• Initialize with $R = y_{\mathrm{obs}}$ and $\beta_i = 0$, $i \in \{1, \ldots, p\}$, then repeat until no regressor is correlated with $R$:
  – find $i \in \{1, \ldots, p\}$ such that $h_i\big(X_{\mathrm{obs}}\big)$ is the most correlated with $R$,
  – update $\beta_i = \beta_i + \epsilon_i$, with $\epsilon_i = \epsilon \cdot \mathrm{sign}\big(\mathrm{corr}\big(h_i\big(X_{\mathrm{obs}}\big), R\big)\big)$,
  – update $R = R - \epsilon_i\, h_i\big(X_{\mathrm{obs}}\big)$,

where, by abuse of notation, $h_i\big(X_{\mathrm{obs}}\big) = \big(h_i\big(x^{(1)}\big), \ldots, h_i\big(x^{(n)}\big)\big)$. In practice, $\epsilon$ is set to a small value, like $\epsilon = 0.01$. In general, this approach is more reliable than the classical stepwise regression.
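A minimal sketch of the procedure above, assuming centered regressor columns and a centered response; the names and the stopping tolerance are illustrative.

```python
import numpy as np

def forward_stagewise(H, y, eps=0.01, n_steps=5000, tol=1e-8):
    """Forward stagewise regression with step size eps."""
    beta = np.zeros(H.shape[1])
    R = y.astype(float).copy()
    for _ in range(n_steps):
        corr = H.T @ R                   # correlation proxy with the residual
        i = int(np.argmax(np.abs(corr)))
        if abs(corr[i]) < tol:           # no regressor correlated with R
            break
        step = eps * np.sign(corr[i])
        beta[i] += step
        R -= step * H[:, i]
    return beta
```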
1.3.5 Least Angle Regression

Introduced by Efron et al. [2004], Least Angle Regression (LAR) is similar to the forward stagewise regression, given that it selects the regressor $h_i\big(X_{\mathrm{obs}}\big)$ which is the most correlated with the current residual $R$. However, the computation of the value of $\beta_i$ is different. Instead of being slightly modified, the value of $\beta_i$ is chosen such that the correlation between the new residual $R - \beta_i\, h_i\big(X_{\mathrm{obs}}\big)$ and its most correlated regressor $h_j\big(X_{\mathrm{obs}}\big)$ is equal to the correlation between $R - \beta_i\, h_i\big(X_{\mathrm{obs}}\big)$ and $h_i\big(X_{\mathrm{obs}}\big)$. This method can also be seen as a less greedy version of the classical forward selection.

1.3.5.1 The algorithm
Least Angle Regression (LAR) is associated with the following algorithm:

1. Initialize with $R = y_{\mathrm{obs}}$ and $\beta_i = 0$, $i \in \{1, \ldots, p\}$.
2. Find $i \in \{1, \ldots, p\}$ such that $h_i\big(X_{\mathrm{obs}}\big)$ is the most correlated with $R$.
3. Move $\beta_i$ from $0$ toward its least squares coefficient, until another regressor $h_j\big(X_{\mathrm{obs}}\big)$ has as much correlation with $R - \beta_i\, h_i\big(X_{\mathrm{obs}}\big)$ as $h_i\big(X_{\mathrm{obs}}\big)$.
4. Move $(\beta_i, \beta_j)$ jointly in the direction defined by their joint least squares coefficient of the current residual on $\big(h_i\big(X_{\mathrm{obs}}\big), h_j\big(X_{\mathrm{obs}}\big)\big)$, until some regressor $h_k\big(X_{\mathrm{obs}}\big)$ is as much correlated with the current residual.
5. Continue until $\min(p, n-1)$ regressors have been retained.
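In practice, LAR paths can be computed with existing implementations; the following sketch assumes scikit-learn is available, and the design matrix H and observations y_obs are illustrative.

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(3)
H = rng.standard_normal((50, 10))
y_obs = H[:, 0] - 2.0 * H[:, 3] + 0.1 * rng.standard_normal(50)

# method="lar" follows the steps above; method="lasso" uses the modified
# algorithm whose path coincides with the LASSO coefficient paths (see below).
alphas, active, coefs = lars_path(H, y_obs, method="lar")
print(active)  # order in which the regressors enter the model
```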
1.3.5.2 LASSO can be seen as a specific case of LAR

Efron et al. [2004] and Hastie et al. [2007] have shown that a slightly modified LAR algorithm can provide the entire paths of the LASSO coefficients as the $\delta$ coefficient increases. This modified algorithm is defined as follows:

• run the LAR algorithm from step 1 to 4,
• if a non-zero coefficient reaches zero, remove the associated regressor from the linear model and recompute the joint least squares direction,
• continue until $\min(p, n-1)$ regressors have been retained.

In the same way, a modified LAR algorithm can be used to perform a forward stagewise regression in the case of $\epsilon \to 0$ [Hastie et al., 2007]. Note that the label LARS generally refers to this modified LAR algorithm (where S refers to Stagewise or LASSO).

1.3.5.3 Hybrid LARS
Introduced by Efron et al. [2004], hybrid LARS is derived from the original LARS (referring to the original LAR or LASSO here). This modified algorithm comprises a LAR step which enables the selection of the regressors. The next step is the estimation by ordinary least squares of the coefficients associated with the selected regressors.

Hybrid LARS thus relies on a separation between the choice of the regressors and the estimation of the linear model. It increases the accuracy of the linear model compared to the original LARS.

Relaxed LASSO [Meinshausen et al., 2007] is an extension of the LARS-based LASSO algorithm. The first step is the same as for hybrid LARS. The ordinary least squares estimation of the coefficients at the second step is replaced by a LASSO estimation with a small penalty. In this approach, for the selected regressors at a given step of the LARS algorithm, one performs LASSO with a small penalty coefficient $\delta$, such that no regressor is eliminated. Hybrid LARS is a particular case of this algorithm, with $\delta = 0$.

1.3.6 Dantzig selector
The Dantzig selector of Candes and Tao [2007] is based on the resolution of the following optimization problem:

$$\beta_{\mathrm{Dantzig}} = \underset{\beta \in \mathbb{R}^p}{\mathrm{argmin}} \; \left\| H^T \left( y_{\mathrm{obs}} - H\beta \right) \right\|_\infty \quad \text{subject to} \quad \|\beta\|_1 \le t, \qquad (1.3.6)$$

with $t \in \mathbb{R}_+$.

In the same way as LARS, the Dantzig selector sets some coefficients to zero, thus selecting some regressors. However, Efron et al. [2004] and Meinshausen et al. [2007] have shown that the linear model obtained with LASSO is as accurate as or more accurate than the one obtained with the Dantzig selector.

Note that a DASSO (DAntzig Selector with Sequential Optimization) algorithm has been proposed by James et al. [2008] in order to compute in one step the whole path of the Dantzig selector.
1.3.7 Conclusions

In this section, methods which enable the selection of the regressors of a linear model have been reviewed. Such approaches are particularly useful when the number of observations $n$ is small compared to the number of possible regressors $p$ of the linear model.

1.4 Gaussian process regression or Kriging
This section is devoted to the surrogate modeling of a computer code by Gaussian process regression.

Gaussian process regression is widely used in computer experiments [Sacks et al., 1989; Santner et al., 2003; Rasmussen and Williams, 2006]. In the Gaussian process regression framework, the output $y$ of the code can be seen as a realization of a Gaussian process.

In the remainder of the section, we first outline the multidimensional Gaussian distribution and the definition of a Gaussian process. Then the Gaussian process regression framework for a known covariance function is presented. Finally, the estimation of the hyperparameters of parametric covariance functions is described.
1.4.1 Gaussian processes

1.4.1.1 Multidimensional (multivariate) Gaussian distribution

A random vector $u = (u_1, \ldots, u_n)$, $n \ge 1$, is a Gaussian vector if the following equivalent assumptions are verified:

• for any $a \in \mathbb{R}^n$, $a^T u$ has a Gaussian distribution,
• the characteristic function of $u$ is of the form $v \mapsto \exp\left( i v^T m - \frac{1}{2} v^T K v \right)$, with $m$ an $n$-dimensional vector and $K$ an $(n \times n)$-dimensional matrix which is symmetric and positive semidefinite.

1.4.1.2 Gaussian processes
A random process associates to any value of $x$ a random variable $Y(x)$. A random process is a Gaussian process if its finite-dimensional distributions are Gaussian distributions. A Gaussian process $Y$ is characterized by its mean and covariance functions. The mean function is defined by:

$$m(x) = \mathbb{E}[Y(x)]. \qquad (1.4.1)$$

The covariance function is defined by:

$$C\left(x, x'\right) = \mathrm{cov}\left(Y(x), Y\left(x'\right)\right), \qquad (1.4.2)$$

for $x$, $x'$ in $\mathbb{X}$.

A Gaussian process is said to be stationary if, for all $x^{(1)}, \ldots, x^{(n)}$ in $\mathbb{X}$ and $h \in \mathbb{R}^d$ such that $x^{(1)} + h, \ldots, x^{(n)} + h$ are still in $\mathbb{X}$, the multidimensional distribution of the Gaussian process $Y$ at $x^{(1)}, \ldots, x^{(n)}$ is the same as the one at $x^{(1)} + h, \ldots, x^{(n)} + h$.

It follows that a covariance function is said to be stationary if, for all $x, x', x + h, x' + h \in \mathbb{X}$, one has:

$$C\left(x + h, x' + h\right) = C\left(x, x'\right) = C\left(x - x', 0\right). \qquad (1.4.3)$$

Finally, a Gaussian process is stationary if and only if its mean function is constant and its covariance function is stationary.

The next section outlines some classical parametric families of stationary covariance functions and their properties. For a more detailed review of covariance functions, the interested reader may refer to Abrahamsen [1997] and Rasmussen and Williams [2006].
1.4.1.3 Parametric families of stationary covariance functions

Typical parametric families of covariance functions are of the form:

$$C\left(x, x'\right) = \sigma^2 K_\ell\left(x - x'\right), \qquad (1.4.4)$$

where $K_\ell$ is a correlation function parametrized by the vector of correlation lengths $\ell \in (0, +\infty)^d$, and $\sigma^2 \in (0, +\infty)$ is a variance parameter.

The following paragraphs present some classical stationary correlation functions $K_\ell$.

The nugget correlation function
The nugget correlation function is defined by:

$$K_\ell\left(x - x'\right) = \delta_{x = x'}, \qquad (1.4.5)$$

where $\delta$ denotes the Kronecker delta. Note that this covariance function does not depend on any correlation length.

By construction, the observations of a Gaussian process with a nugget correlation function are not correlated, and consequently independent and identically distributed.

Figure 1.1 presents an example of a path of the centered Gaussian process with the nugget correlation function and a unit variance $\sigma^2$. The trajectory is very rough and all the observations are independent.
0.2
0.4
0.6
0.8
1.0
−2
−1
0
1
2
PSfragrepla ementsx
y
(x
)
Figure1.1: Anexampleofapathofthe enteredGaussianpro esswiththenugget orrelation
fun tionand a unitvarian e
σ
2
.
The squared exponential orrelation fun tion
Thesquaredexponential (orGaussian) orrelationfun tion isdened by:
K
ℓ
x
− x
′
= exp
−d
ℓ
x
− x
′
2
,
(1.4.6) whered
ℓ
(x
− x
′
) =
v
u
u
t
X
d
i=1
x
i
− x
′
i
ℓ
i
2
. Thetraje toriesofaGaussianpro esswithasquared
exponential orrelationfun tionareinnitelydierentiable. This ovarian efun tioniswidely
usedinKrigingmodels. However,theassumptionofinnitedierentiabilitymaybeunrealisti
[Stein,1999℄.
Figure1.2presentsthesquared-exponential orrelationfun tion andan exampleof apathof
the enteredGaussianpro esswithasquared-exponential orrelationfun tion,aunitvarian e
σ
2
,andthefollowing orrelation lengths:ℓ
∈ {0.05, 0.1, 0.2}
. It an be seenthat theshorter the orrelationlengthis,thefasterthe orrelationfun tionde reases. Besides,thepathvariesmoreifthe orrelationlength isshort. Finally,notethatthetraje tories areverysmooth,in
agreement withtheir innitedierentiability.
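Paths like the ones shown in these figures can be simulated directly; here is a minimal sketch, assuming a centered process with the squared exponential correlation (1.4.6) discretized on a regular grid of [0, 1].

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200)
ell, sigma2 = 0.1, 1.0
# Covariance matrix of the process on the grid, cf. (1.4.4) and (1.4.6).
K = sigma2 * np.exp(-(((x[:, None] - x[None, :]) / ell) ** 2))
K += 1e-10 * np.eye(len(x))              # jitter for numerical stability
# A path is L z, with K = L L^T (Cholesky) and z a standard Gaussian vector.
L = np.linalg.cholesky(K)
path = L @ np.random.default_rng(2).standard_normal(len(x))
```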
The Matérn correlation function

The multi-dimensional Matérn kernel can be defined as:

$$K_\ell\left(x - x'\right) = \frac{1}{\Gamma(\nu)\, 2^{\nu - 1}} \left( 2\sqrt{\nu}\, d_\ell\left(x - x'\right) \right)^\nu K_\nu\left( 2\sqrt{\nu}\, d_\ell\left(x - x'\right) \right), \qquad (1.4.7)$$

with $\Gamma(\cdot)$ the gamma function, $K_\nu$ a modified Bessel function [Abramowitz and Stegun, 1965] and $\nu \ge \frac{1}{2}$ the smoothness hyperparameter.

Note that as $\nu \to \infty$, the Matérn kernel tends to the squared exponential correlation function. Besides, when $\nu = k + \frac{1}{2}$, $k \in \mathbb{N}$, the Matérn kernel has a simpler form. In particular, we have:

• if $\nu = \frac{1}{2}$:

$$K_\ell\left(x - x'\right) = \exp\left( - d_\ell\left(x - x'\right) \right), \qquad (1.4.8)$$
• if $\nu = \frac{3}{2}$:

$$K_\ell\left(x - x'\right) = \left( 1 + \sqrt{3}\, d_\ell\left(x - x'\right) \right) \exp\left( - \sqrt{3}\, d_\ell\left(x - x'\right) \right), \qquad (1.4.9)$$

• if $\nu = \frac{5}{2}$:

$$K_\ell\left(x - x'\right) = \left( 1 + \sqrt{5}\, d_\ell\left(x - x'\right) + \frac{5}{3}\, d_\ell\left(x - x'\right)^2 \right) \exp\left( - \sqrt{5}\, d_\ell\left(x - x'\right) \right). \qquad (1.4.10)$$

Figure 1.2: On the left figure: plot of the squared-exponential correlation function. On the right plot: an example of a path of the centered Gaussian processes with a squared-exponential correlation function $K_\ell$, $\ell \in \{0.05, 0.1, 0.2\}$, and a unit variance.
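The correlation functions above translate directly into code; a minimal sketch with anisotropic correlation lengths follows (the function names are illustrative).

```python
import numpy as np

# ell is a length-d vector of correlation lengths; sigma2 a variance parameter.
def d_ell(x, xp, ell):
    return np.sqrt(np.sum(((x - xp) / ell) ** 2))

def squared_exponential(x, xp, ell, sigma2=1.0):
    return sigma2 * np.exp(-d_ell(x, xp, ell) ** 2)          # Eq. (1.4.6)

def matern_32(x, xp, ell, sigma2=1.0):
    h = np.sqrt(3.0) * d_ell(x, xp, ell)
    return sigma2 * (1.0 + h) * np.exp(-h)                   # Eq. (1.4.9)

def matern_52(x, xp, ell, sigma2=1.0):
    h = np.sqrt(5.0) * d_ell(x, xp, ell)
    return sigma2 * (1.0 + h + h**2 / 3.0) * np.exp(-h)      # Eq. (1.4.10)

print(matern_32(np.array([0.1]), np.array([0.4]), ell=np.array([0.2])))
```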
Figure 1.3 presents the exponential correlation function and an example of a path of the centered Gaussian processes with an exponential correlation function, a unit variance $\sigma^2$ and the following correlation lengths: $\ell \in \{0.05, 0.1, 0.2\}$. The trajectories are not differentiable.

Figure 1.4 presents the Matérn $\frac{3}{2}$ correlation function and examples of a path of the centered Gaussian processes with a Matérn $\frac{3}{2}$ correlation function, a unit variance $\sigma^2$, and the following correlation lengths: $\ell \in \{0.05, 0.1, 0.2\}$. The trajectories are not very smooth, but smoother than with the exponential correlation function.

Figure 1.5 presents the Matérn $\frac{5}{2}$ correlation function and an example of a path of the centered Gaussian processes with a Matérn $\frac{5}{2}$ correlation function, a unit variance $\sigma^2$, and the following correlation lengths: $\ell \in \{0.05, 0.1, 0.2\}$. The trajectories are relatively smooth.

It can be seen on Figures 1.2 to 1.5 that the shorter the correlation length is, the faster the correlation function decreases. Besides, the path varies more if the correlation length is short.

Figure 1.6 presents the Matérn correlation function and examples of a path of the centered Gaussian processes with a Matérn correlation function, a correlation length equal to $0.5$, a unit variance $\sigma^2$, and the following values of the smoothness parameter: $\nu \in \{\frac{1}{2}, \frac{3}{2}, \frac{5}{2}, \infty\}$. It can be seen that the smoothness parameter strongly impacts the form of the correlation function.
Figure 1.3: On the left figure: plot of the exponential correlation function. On the right plot: an example of paths of the centered Gaussian processes with an exponential correlation function $K_\ell$, $\ell \in \{0.05, 0.1, 0.2\}$, and a unit variance.
Figure 1.4: On the left figure: plot of the Matérn