
Physically interpretable machine learning algorithm on multidimensional non-linear fields


Academic year: 2021


HAL Id: hal-03181089

https://hal.archives-ouvertes.fr/hal-03181089

Submitted on 25 Mar 2021

HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.



Rem-Sophia Mouradi, Cédric Goeury, Olivier Thual, Fabrice Zaoui, Pablo Tassi

To cite this version:

Rem-Sophia Mouradi, Cédric Goeury, Olivier Thual, Fabrice Zaoui, Pablo Tassi. Physically interpretable machine learning algorithm on multidimensional non-linear fields. Journal of Computational Physics, Elsevier, 2021, 428, pp.110074. ⟨10.1016/j.jcp.2020.110074⟩. ⟨hal-03181089⟩


This is a publisher’s version published in: https://oatao.univ-toulouse.fr/27580


Contents lists available at ScienceDirect

Journal of Computational Physics

www.elsevier.com/locate/jcp


Rem-Sophia Mouradi a,b,*, Cédric Goeury a, Olivier Thual b,c, Fabrice Zaoui a, Pablo Tassi a,d

a EDF R&D, National Laboratory for Hydraulics and Environment (LNHE), 6 Quai Watier, 78400 Chatou, France

b Climate, Environment, Coupling and Uncertainties research unit (CECI) at the European Center for Research and Advanced Training in Scientific Computation (CERFACS), French National Research Center (CNRS), 42 Avenue Gaspard Coriolis, 31820 Toulouse, France

c Institut de Mécanique des Fluides de Toulouse (IMFT), Université de Toulouse, CNRS, Toulouse, France

d Saint-Venant Laboratory for Hydraulics (LHSV), Chatou, France

Article info

Article history:
Received 25 May 2020
Received in revised form 6 November 2020
Accepted 10 December 2020
Available online 7 January 2021

Keywords:
Data-Driven Model (DDM)
Proper Orthogonal Decomposition (POD)
Dimensionality Reduction (DR)
Polynomial Chaos Expansion (PCE)
Machine Learning (ML)
Geosciences

Abstract

In an ever-increasing interest for Machine Learning (ML) and a favorable data development context, we here propose an original methodology for data-based prediction of two-dimensional physical fields. Polynomial Chaos Expansion (PCE), widely used in the Uncertainty Quantification community (UQ), has long been employed as a robust representation for probabilistic input-to-output mapping. It has been recently tested in a pure ML context, and shown to be as powerful as classical ML techniques for point-wise prediction. Some advantages are inherent to the method, such as its explicitness and adaptability to small training sets, in addition to the associated probabilistic framework. Simultaneously, Dimensionality Reduction (DR) techniques are increasingly used for pattern recognition and data compression and have gained interest due to improved data quality. In this study, the interest of Proper Orthogonal Decomposition (POD) for the construction of a statistical predictive model is demonstrated. Both POD and PCE have amply proved their worth in their respective frameworks. The goal of the present paper was to combine them for a field-measurement-based forecasting. The described steps are also useful to analyze the data. Some challenging issues encountered when using multidimensional field measurements are addressed, for example when dealing with few data. The POD-PCE coupling methodology is presented, with particular focus on input data characteristics and training-set choice. A simple methodology for evaluating the importance of each physical parameter is proposed for the PCE model and extended to the POD-PCE coupling.

© 2020 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

1. Introduction

Deep Learning techniques (DL [1]) and more generally Machine Learning (ML [2]), and their applications to physical problems (fluid mechanics [3]; plasma physics [4]; quantum mechanics [5], etc.) have made a promising take-off in the last few years. This has been particularly the case for fields where the measurement potential has dramatically increased (e.g. Geoscience Data [6]). In this context, learning techniques are of interest to establish non-linear physical relationships from the data by a combination of steps, in particular using transformation functions, to capture the complexity of the system [2].

* Corresponding author at: EDF R&D, National Laboratory for Hydraulics and Environment (LNHE), 6 Quai Watier, 78400 Chatou, France (R.-S. Mouradi).

Fig. 1. Representation of the POD-PCE ML approach.

In particular, multi-layer Neural Networks (NN) [7] are widely used for physical applications. Their popularity comes from this complex structure, which makes them adaptable for various applications [8,9]. However, some limitations prevent the use of NN for physical applications: (i) it is difficult to provide an explicit input-to-output formulation, due to the combinations of steps involved in the learning (Activation Functions, Hidden Layers [1]). Physical interpretation of the constructed model is therefore tedious [10]; (ii) too many hyper-parameters and choices are involved, depending on the number of neurons and layers (curse of dimensionality) [11]; (iii) no general proof for the theoretical ability of approximating arbitrary functions is available, except the Universal Approximation Theorem and its extensions [12,13] for particular cases.

To overcome these limitations, we propose an alternative ML method, based on a coupling between Proper Orthogonal Decomposition (POD) [14] and Polynomial Chaos Expansion (PCE) [15,16]. This approach is proposed for the prediction of spatially-distributed physical fields that vary in time. The idea is to use POD to separate the spatial patterns from the temporal variations, that are related to the conditioning parameters using PCE. To correspond to common NN paradigms, an adequate representation of this idea is given in Fig. 1. In particular, POD is used for both Encoding and Decoding whereas PCE is used as an Activation Function in the Latent Representation [1].

The proposed POD-PCE addresses these drawbacks of ML.

(i) It is explicit and simple to implement, as it consists of the association of two linear decompositions. POD is a linear separation of the spatiotemporal patterns [17], shown to be accurate for both linear and non-linear problems [18], combining simplicity and relevance. PCE is a well-established method in Uncertainty Quantification (UQ) [19,20], widely used for the study of stochastic behavior in physics [21,22]. It is a linear polynomial expansion that allows non-linearities to be gradually added to the model by increasing the polynomial degree. The linearity and orthonormality of the POD and PCE components and the probabilistic framework of PCE make the output’s statistical moments easier to study [23], enabling straightforward physical interpretation of the model [24].

(ii) It only has two hyper-parameters: a number of POD components, and a PCE polynomial degree. Both can be chosen according to quantitative criteria [14,25]. All other forms of parameterization (choice of the polynomial basis) can be achieved with robust physical and/or statistical arguments [26], as assessed in the present paper. Furthermore, the orthonormality of the POD and PCE bases minimizes the number of components necessary to capture essential variations in data. Additionally, the POD modes capture more energy than any other decomposition [27], PCE is known to exponentially converge with polynomial degree [16], and the cardinality of the latter can be reduced by sparse basis selection [28].

(iii) It can be considered as a universal expansion for physical field approximation: a physical field has a finite variance, which implies that it belongs to the Hilbert space of random variables with finite second order moments. There therefore exists a countable set of orthogonal random variables, that form the basis of this Hilbert space, on which the field of interest can be expanded (strict equality, not approximation) [20]. A mathematical setting for basis construction based on input was established by Soize and Ghanem [26] for the general case of dependent variables with arbitrary density, provided that the set of inputs is finite.

In the literature, associating regression techniques to Reduced Order Models (ROM), that include POD, is not novel [29,30]. The cited studies, however, focused on dimensionality reduction, whereas the explicit formulation and applicability to complex physical processes are emphasized in the present study. Secondly, coupling PCE to ROM was recently addressed [31,32] and the use of PCE as ML is consistent with the work of Torre et al. [33], where the authors showed that PCE is as powerful as classical ML techniques. However, neither spatiotemporal fields nor physical interpretability were addressed. The data in these studies were either obtained from numerical experiments, emulated from analytical benchmark functions such as Sobol or Ishigami, or based on one-dimensional data sets [33]. In contrast, the proposed POD-PCE methodology is herein assessed on two-dimensional physical fields. In particular, a toy example where synthetic data are emulated using an analytical function (groundwater perturbations due to tidal loadings [34]), and a real data set (high-resolution field measurements of underwater topography) are used. Although similar from a learning point of view, these two applications differ in character. In particular, the toy problem is purely parametric and controllable, whereas the real data concern temporal dynamics and are of limited size. The cases are therefore complementary, in the sense that they allow demonstrating different properties of the proposed methodology. Hence, using the particularities of each case, the study consists in: i) the evaluation of the combined use of POD and PCE as ML for point-wise prediction; ii) the robustness of the methodology to noise; iii) the application to field data with the inherent challenges not encountered with numerical data (e.g. paucity); iv) a focus on model explicitness as a key condition for physical understanding and v) the study of the influence of forcing variables, based on a classical measure of importance (Garson weights [35]) directly computed with the POD-PCE expansion coefficients.

The paper is organized as follows. Section 2 gives a detailed explanation of the methodology, with a proposal for physical importance measures in Subsection 2.2.2. Section 3 deals with the assessment of the methodology on synthetic data, for both prediction and physical interpretation. In particular, the robustness of the approach to noise is evaluated in Subsection 3.3. The model is then deployed on field measurements in Section 4. The study case and data are described in 4.1. POD and PCE performances are then demonstrated independently in 4.2 with a deep physical analysis. The performance of the POD-PCE predictor is discussed in 4.3. A summary of the study and perspectives of the proposed methodology are presented in Section 5.

2. Theoretical framework

In this section, the objective is to define the framework of the proposed POD-PCE Machine Learning methodology, along with physical influence indicators for the inputs. This is the object of Subsection 2.3, but first, a reminder of the existing POD and PCE theoretical bases is presented in 2.1 and 2.2 respectively.

2.1. Proper Orthogonal Decomposition

POD is a dimensionality reduction technique [17] that is well documented in literature [14,18]. Theoretical details and demonstrations can be found in [27,36]. For clarity’s sake, the essential elements of POD are summarized below.

The goal of POD is to extract the main patterns of continuous bi-variate functions. These patterns, when added and multiplied by appropriate coefficients, explain the dynamics of the variable of interest: a real-valued physical field.

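Let $u: \Omega \times \mathcal{T} \to D$ be a continuous function of two variables $(x,t) \in \Omega \times \mathcal{T}$. The following relationships and properties hold for any $\Omega \times \mathcal{T}$ and Hilbert space $D$ characterized by its scalar product $(\cdot,\cdot)_D$ and induced norm $\|\cdot\|_D$. However, as is the case for a majority of physical fields, we shall consider $\Omega$ as a set of spatial coordinates (e.g. $\mathbb{R}^2$ or $\mathbb{R}^3$), $\mathcal{T}$ an event space (e.g. parameters space $\mathbb{R}^V$ with $V \in \mathbb{N}$, or a temporal subset $[0,T] \subseteq \mathbb{R}^+$), and $D$ as a set of scalar real values or vector real values (e.g. $\mathbb{R}$ or $\mathbb{R}^2$). POD consists in an approximation of $u(x,t)$ at a given order $d \in \mathbb{N}$ (Lumley [17]) as in Equation (1),

$$u(x,t) \approx \sum_{k=1}^{d} v_k(t)\,\phi_k(x), \tag{1}$$

where $\{v_k(\cdot)\}_{k=1}^{d} \subset C(\mathcal{T},\mathbb{R})$ and $\{\phi_k(\cdot)\}_{k=1}^{d} \subset C(\Omega,D)$, with $C(A,B)$ denoting the space of continuous functions defined over $A$ and arriving at $B$. The objective of POD is to identify $\{\phi_k(\cdot)\}_{k=1}^{d}$ that minimizes the distance of the approximation from the true value $u(\cdot,\cdot)$, over the whole $\Omega \times \mathcal{T}$ domain, with an orthogonality constraint for $\{\phi_k(\cdot)\}_{k=1}^{d}$ using the scalar product $(\cdot,\cdot)_D$. This can be defined, in the least-squares sense, as a minimization problem.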
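For reference, one common way of writing this least-squares problem is sketched below. It is the standard POD formulation rather than an equation quoted from this paper, and it assumes that each snapshot $u(\cdot,t)$ is treated as an element of $D$ and that $\langle\cdot\rangle_{\mathcal{T}}$ denotes the average over $\mathcal{T}$.

$$
\min_{\{\phi_k\}_{k=1}^{d}} \Big\langle \Big\| u(\cdot,t) - \sum_{k=1}^{d} \big(u(\cdot,t),\phi_k\big)_D\, \phi_k \Big\|_D^2 \Big\rangle_{\mathcal{T}}
\quad \text{subject to} \quad (\phi_i,\phi_j)_D = \delta_{ij}.
$$

Under the orthonormality constraint, the optimal coefficients are the projections $v_k(t) = (u(\cdot,t),\phi_k)_D$, which is why only the basis has to be identified.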

The minimization problem is defined for all orders $d \in \mathbb{N}$, so that the members $\phi_k$ are ordered according to their importance. In particular, for order 1, $\phi_1$ is the linear generator of the sub-vector space most representative of $u(x,t)$ in $D$. For $D = \mathrm{Im}(u)$, the family $\{\phi_k(\cdot)\}_{k=1}^{d}$ is called the POD basis of $D$ of rank $d$. The solution to this problem has already been established in literature [17,37]. The theoretical aspects of POD and demonstrations of mathematical properties can, for example, be found in [27]: the POD basis of $D$ of order $d$ is the orthonormal set of eigenvectors of an operator $R: D \to D$ defined as $R\phi = \langle (u,\phi)_D \times u \rangle_{\mathcal{T}}$, with $\langle\cdot\rangle_{\mathcal{T}}$ the average over $\mathcal{T}$, if the eigenvectors are taken in decreasing order of the corresponding eigenvalues $\{\lambda_k\}_{k=1}^{d}$.

For this expansion, an accuracy rate, also called the Explained Variance Rate (EVR), denoted $e_d$ at rank $d$, can be calculated as in Equation (2). EVR tends to 1 (perfect approximation) when $d \to +\infty$.

$$e_d = \frac{\sum_{k \le d} \lambda_k}{\sum_{k=1}^{+\infty} \lambda_k}. \tag{2}$$
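As an illustration of Equation (2), the following minimal sketch computes the EVR for every rank from a finite list of eigenvalues and returns the smallest rank that reaches a target rate. It assumes that the infinite sum is replaced by the sum over the available eigenvalues, as happens with a finite data set. The function names and the default threshold are hypothetical choices made for this example, not values taken from the paper.

```python
import numpy as np

def explained_variance_rate(eigenvalues):
    """Return e_d for d = 1..K, where K is the number of available eigenvalues."""
    # Sort in decreasing order, as required for the POD basis.
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    # Guard against tiny negative values caused by round-off.
    lam = np.clip(lam, 0.0, None)
    # Cumulative sum over k <= d divided by the total sum (Equation (2)).
    return np.cumsum(lam) / lam.sum()

def smallest_rank(eigenvalues, threshold=0.99):
    """Smallest rank d whose EVR is at least `threshold` (hypothetical criterion)."""
    evr = explained_variance_rate(eigenvalues)
    # searchsorted returns the first index where evr >= threshold.
    return int(np.searchsorted(evr, threshold) + 1)
```

A typical use would be to pass the eigenvalues of the covariance matrix and keep only the first `smallest_rank(...)` modes.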


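In practice, for $D = \mathbb{R}$, when $u(\cdot,\cdot)$ is a discrete sample on a set of $m \in \mathbb{N}$ space coordinates $X = \{x_1,\ldots,x_m\}$ and for $n \in \mathbb{N}$ measurement events $T = \{t_1,\ldots,t_n\}$ (e.g. realizations of the parameters, time coordinates, etc.), the available data set is arranged in a matrix $U(X,T) = [u(x_i,t_j)]_{i,j} \in \mathbb{R}^{m \times n}$, called the snapshot matrix, so as to be able to work in a discrete space. The POD problem formulated in Equation (1) can be written in its discrete form as $U(X,T) = \Phi^{(d)}(X)\,V^{(d)}(T)$, where $\Phi^{(d)}(X) := [\phi_j(x_i)]_{i,j} \in \mathbb{R}^{m \times d}$ and $V^{(d)}(T) := [v_i(t_j)]_{i,j} \in \mathbb{R}^{d \times n}$. The problem can therefore be viewed as if working with a new function $U(X,\cdot) = [u(x_i,\cdot)]_{i \in \{1,\ldots,m\}}: \mathcal{T} \to D = \mathbb{R}^m$. Then, the average over $\mathcal{T}$ can be defined as the statistical mean over the subset $T$, and the scalar product $(\cdot,\cdot)_D$ as the canonical product over $\mathbb{R}^m$. The POD operator $R$ can be written as in Equation (3),

$$R\,\Phi(X) = \frac{1}{n} \sum_{j=1}^{n} U(X,t_j)^{\top} \Phi(X)\, U(X,t_j) = \frac{1}{n}\, U(X,T)\, U(X,T)^{\top} \Phi(X), \tag{3}$$

where $U(X,t_j) = [u(x_i,t_j)]_{i \in \{1,\ldots,m\}}$ is the column number $j$ of the matrix $U(X,T)$ (i.e. realization $t_j$ of the measurement over $X$), and $\Phi(X) = [\phi(x_i)]_{i \in \{1,\ldots,m\}}$. As finding the POD basis is equivalent to identifying the orthonormal set of eigenvectors of the operator $R$, then for this discrete representation the problem becomes equivalent to solving the eigenproblem of the matrix $R := \frac{1}{n} U(X,T)\,U(X,T)^{\top}$, called the covariance matrix. A number $d \in \mathbb{N}$ of eigenvectors $\Phi(X)$ are identified and stored in the columns of the matrix $\Phi^{(d)}(X)$. For the eigenvalues of the covariance matrix $R$ denoted $\{\lambda_k\}_{k=1}^{d}$, the expansion in Equation (1) can also be written as in Equation (4), where $\{\phi_k(\cdot)\}_{k=1}^{d}$ together with $\{a_k(\cdot)\}_{k=1}^{d}$ are bi-orthonormal, and $v_k(\cdot) = a_k(\cdot)\sqrt{n \times \lambda_k}$.

$$u(x,t) \approx \sum_{k=1}^{d} a_k(t)\,\sqrt{n \times \lambda_k}\;\phi_k(x). \tag{4}$$

By defining the matrix $A^{(d)}(T) := [a_i(t_j)]_{i,j} \in \mathbb{R}^{d \times n}$ and the operator $D^{(d)}(\lambda_1,\ldots,\lambda_d)$ corresponding to the diagonal matrix of elements $\lambda_i$, we have $U(X,T) = \Phi^{(d)}(X)\, D^{(d)}(\sqrt{n \times \lambda_1},\ldots,\sqrt{n \times \lambda_d})\, A^{(d)}(T)$. Therefore the transposed form is $U(X,T)^{\top} = A^{(d)}(T)^{\top} D^{(d)}(\sqrt{n \times \lambda_1},\ldots,\sqrt{n \times \lambda_d})\, \Phi^{(d)}(X)^{\top}$. Thanks to the orthonormality of $\{a_k(\cdot)\}_{k=1}^{d}$, the covariance matrix reads

$$R = \frac{1}{n}\,\Phi^{(d)}(X)\, D^{(d)}(n \times \lambda_1,\ldots,n \times \lambda_d)\, \Phi^{(d)}(X)^{\top} = \Phi^{(d)}(X)\, D^{(d)}(\lambda_1,\ldots,\lambda_d)\, \Phi^{(d)}(X)^{\top}.$$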
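The discrete construction above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pod` is hypothetical, the snapshot matrix is assumed to be a dense `m x n` array, and the retained eigenvalues are assumed to be strictly positive so that the normalization by $\sqrt{n \lambda_k}$ is defined.

```python
import numpy as np

def pod(U, d):
    """Rank-d POD of a snapshot matrix U (m x n) via the covariance matrix.

    Returns (Phi, lam, A) with Phi (m x d) the spatial modes, lam (d,) the
    eigenvalues of R = U U^T / n, and A (d x n) the normalized coefficients
    a_k(t_j), so that U is approximated by Phi @ diag(sqrt(n * lam)) @ A.
    """
    m, n = U.shape
    R = U @ U.T / n                      # covariance matrix, m x m
    lam, Phi = np.linalg.eigh(R)         # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1][:d]    # keep the d largest
    lam, Phi = lam[order], Phi[:, order]
    V = Phi.T @ U                        # v_k(t_j): projections on the modes
    A = V / np.sqrt(n * lam)[:, None]    # a_k = v_k / sqrt(n * lambda_k)
    return Phi, lam, A
```

When `d` equals the rank of `U`, the product `Phi @ np.diag(np.sqrt(n * lam)) @ A` should reproduce `U` up to round-off, and `A @ A.T` should be close to the identity, which mirrors the bi-orthonormality stated in the text.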

When $n \ll m$, it is more computationally efficient to solve the eigenproblem of $R^{\top}$ instead of the eigenproblem of $R$, as highlighted by Sirovich [37]. This is often the case when a limited number of occurrences is measured for a two-dimensional physical field, as is the case encountered for the application described in Section 4.
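One common reading of this remark is the method of snapshots, in which the eigenproblem is solved for the small $n \times n$ matrix $\frac{1}{n} U(X,T)^{\top} U(X,T)$ and the spatial modes are recovered afterwards. The sketch below follows that reading; it is an assumption about how the remark is usually implemented, not code from the paper, and the function name `pod_snapshots` is hypothetical. Retained eigenvalues are assumed to be strictly positive.

```python
import numpy as np

def pod_snapshots(U, d):
    """Rank-d POD of U (m x n) through the n x n matrix C = U^T U / n."""
    m, n = U.shape
    C = U.T @ U / n                      # small matrix when n << m
    lam, W = np.linalg.eigh(C)           # ascending eigenvalues
    order = np.argsort(lam)[::-1][:d]
    lam, W = lam[order], W[:, order]     # columns of W are orthonormal
    # Spatial modes: U w_k / sqrt(n * lambda_k) is a unit eigenvector of U U^T / n.
    Phi = (U @ W) / np.sqrt(n * lam)
    A = W.T                              # normalized coefficients a_k(t_j)
    return Phi, lam, A
```

The non-zero eigenvalues of `C` coincide with those of the covariance matrix, so both routes yield the same modes up to sign, while this one only factorizes an `n x n` matrix.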

When an order $d \ll \min(m,n)$ corresponds to a high EVR as defined in Equation (2), we speak of dimensionality reduction, because the data are projected in a sub-space that is of much smaller dimension than $\mathbb{R}^{m \times n}$. When diverse enough records are available for the variable under study, we may consider that $\{\phi_k(X)\}_{k=1}^{d} = \{[\phi_k(x_i)]_{i \in \{1,\ldots,m\}}\}_{k=1}^{d}$, i.e. the resulting POD basis, is a generator of all possible states. Predicting the associated expansion coefficients $\{a_k(t)\}_{k=1}^{d}$ for a given event $t$ would therefore be enough to predict the whole state. Hence, we propose to use the POD as a basis extractor. This would first enable study of the dynamics of the variable of interest and eventually extraction of physical information, as shown in the applications Sections 3 and 4. Then, the basis can be used as a generator for the prediction of diverse states. This implies predicting $\{a_k(t)\}_{k=1}^{d}$, for which we propose to use Polynomial Chaos Expansion (PCE), as described in the following Section 2.2.
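The idea of predicting a field through its POD coefficients can be sketched as follows. This is a deliberately simplified illustration and not the paper's algorithm: it assumes a single scalar input already scaled to $[-1, 1]$, uses NumPy's standard (non-normalized) Legendre polynomials as the polynomial basis, and fits the coefficients by ordinary least squares. The function names are hypothetical. The spatial modes are taken from the SVD of the snapshot matrix, whose left singular vectors are the eigenvectors of the covariance matrix.

```python
import numpy as np
from numpy.polynomial import legendre

def fit_pod_pce(U, theta, d, degree):
    """Fit a toy POD + polynomial model.

    U: (m, n) snapshots; theta: (n,) scalar input per snapshot, in [-1, 1].
    Returns the modes Phi (m, d) and polynomial coefficients (degree + 1, d).
    """
    left, _, _ = np.linalg.svd(U, full_matrices=False)
    Phi = left[:, :d]                          # POD modes
    V = Phi.T @ U                              # (d, n) coefficients v_k(t_j)
    B = legendre.legvander(theta, degree)      # (n, degree + 1) design matrix
    coef, *_ = np.linalg.lstsq(B, V.T, rcond=None)
    return Phi, coef

def predict_pod_pce(Phi, coef, theta_new):
    """Predict the field for new input values; returns an (m, n_new) array."""
    theta_new = np.atleast_1d(np.asarray(theta_new, dtype=float))
    B = legendre.legvander(theta_new, coef.shape[0] - 1)
    return Phi @ (B @ coef).T
```

For a field that is exactly of rank 2 in space and quadratic in the input, calling `fit_pod_pce(U, theta, d=2, degree=2)` should recover it up to round-off. In the paper's setting the inputs are multivariate and the basis is chosen from their probability distribution, which this sketch does not attempt.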

2.2. Polynomial Chaos Expansion

A reminder of the theoretical base of PCE is presented in Subsection 2.2.1. Theoretical details, demonstrations and interesting references can be found in [23,19]. Then, a simple indicator is proposed in Subsection 2.2.2 for the analysis of the variables' influence on the output value. The latter is later generalized for POD-PCE in Section 2.3.

2.2.1. Learning

The idea behind Polynomial Chaos Expansion (PCE) is to formulate an explicit model that links a variable of interest (output) to conditioning parameters (inputs), both in a probability space. This enables the propagation path of probabilistic information (uncertainties, occurrence frequencies) to be mapped from the input to the output space. The variable of interest, $Y$, and the input parameters $\theta = (\theta_1, \theta_2, \ldots, \theta_V)$ are therefore considered random variables, characterized by a given Probability Density Function (PDF) denoted $f_{\theta}$. It should be kept in mind that the outputs of our problem are the POD expansion coefficients $Y = [a_k(t)]_{k \in \{1,\ldots,d\}}$, and that the inputs correspond to physical forcings, as described later in Section 2.3. The objective is to derive the variations of the POD coefficients as the outcome of the forcings. Let us now recall some fundamentals of the mathematical probabilistic framework, taking the example of a one-dimensional real-valued variable. The definitions can be easily extended to $\mathbb{R}^M$.

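Let $(\Omega, \mathcal{F}, P)$ be a probability space, where $\Omega$ is the event space (space of all the possible events $\omega$) equipped with $\sigma$-algebra $\mathcal{F}$ (some events of $\Omega$) and its probability measure $P$ (likelihood of a given event occurrence). A random variable defines an application $Y(\omega): \Omega \to D_Y \subseteq \mathbb{R}$, with realizations denoted by $y \in D_Y$. The PDF of $Y$ is a function $f_Y: D_Y \to \mathbb{R}$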
