
Science Arts & Métiers (SAM) is an open access repository that collects the work of Arts et Métiers Institute of Technology researchers and makes it freely available over the web where possible.

This is an author-deposited version published in: https://sam.ensam.eu
Handle ID: http://hdl.handle.net/10985/19561

To cite this version:
Quercus HERNÁNDEZ, Alberto BADÍAS, David GONZÁLEZ, Francisco CHINESTA, Elías CUETO - Structure-preserving neural networks - Journal of Computational Physics p. 1-16 - 2020

Any correspondence concerning this service should be sent to the repository administrator.

Structure-preserving neural networks

Quercus Hernández^a, Alberto Badías^a, David González^a, Francisco Chinesta^b, Elías Cueto^a

^a Aragon Institute of Engineering Research (I3A), Universidad de Zaragoza, Maria de Luna 3, E-50018 Zaragoza, Spain
^b ESI Chair and PIMM Lab, ENSAM ParisTech, 155 Boulevard de l'Hôpital, 75013 Paris, France

Abstract

Keywords: Scientific machine learning; Neural networks; Structure preservation; GENERIC

We develop a method to learn physical systems from data that employs feedforward neural networks and whose predictions comply with the first and second principles of thermodynamics. The method employs a minimum amount of data by enforcing the metriplectic structure of dissipative Hamiltonian systems in the form of the so-called General Equation for the Non-Equilibrium Reversible-Irreversible Coupling, GENERIC (Öttinger and Grmela (1997) [36]). The method does not need to enforce any kind of balance equation, and thus no previous knowledge on the nature of the system is needed. Conservation of energy and dissipation of entropy in the prediction of previously unseen situations arise as a natural by-product of the structure of the method.

Examples of the performance of the method are shown that comprise conservative as well as dissipative systems, discrete as well as continuous ones.

1. Introduction

With the irruption of the so-called fourth paradigm of science [17], a growing interest is detected in the machine learning of scientific laws. A plethora of methods have been developed that are able to produce more or less accurate predictions about the response of physical systems in previously unseen situations, by employing techniques ranging from classical regression to the most sophisticated deep learning methods.

For instance, recent works in solid mechanics have substituted the constitutive equations with experimental data [1,25], while conserving the traditional approach to physical laws with high epistemic value (i.e., balance equations, equilibrium). Similar approaches have applied this concept to the unveiling (or correction) of plasticity models [21], while others created the new concept of the constitutive manifold [20,22]. Other approaches are designed to unveil an explicit, closed-form expression for the physical law governing the phenomenon at hand [3].

An interest is observed in the incorporation of the already existing scientific knowledge into these data-driven procedures. This interest is two-fold. Indeed, we prefer not to get rid of centuries of scientific knowledge and rely exclusively on powerful machine learning strategies. Existing theories have proved to be useful in the prediction of physical phenomena and are still in the position of helping to produce very accurate predictions. This is the procedure followed in the so-called data-driven computational mechanics approach mentioned before. On the other hand, these theories help to keep the consumption of data to a minimum. Data are expensive to produce and to maintain. Already existing scientific knowledge could alleviate the amount of data needed to produce a successful prediction.

This project has been partially funded by the ESI Group through the ESI Chair at ENSAM ParisTech and through the project "Simulated Reality: an intelligence augmentation system based on hybrid twins and augmented reality" at the University of Zaragoza. The support of the Spanish Ministry of Economy and Competitiveness through grant number CICYT-DPI2017-85139-C2-1-R and by the Regional Government of Aragon, through the project T24_20R, and the European Social Fund, are also gratefully acknowledged.

* Corresponding author.
E-mail addresses: quercus@unizar.es (Q. Hernández), abadias@unizar.es (A. Badías), gonzal@unizar.es (D. González), francisco.chinesta@ensam.eu (F. Chinesta), ecueto@unizar.es (E. Cueto).

The mentioned works on data-driven computational mechanics usually rely on traditional machine learning algorithms, which are very precise and well tested but usually computationally expensive. With the recent advances in data processing, computing resources and machine learning, neural networks have become a powerful tool to analyze traditionally hard problems such as image classification [26,45], speech recognition [12,18] or data compression [42,46]. These new machine learning methods outperform many of the traditional ones, both in modeling capacity and computational time (once trained, certain neural networks can easily handle real-time requirements). Recent works in the machine learning community [29,31,34] have shown that neural networks are also versatile in constrained optimization.

This is the approach followed by several authors in the context of physical simulations, which aim to solve a set of partial differential equations (PDEs) in complex dynamical systems. Physical problems must inherently satisfy certain conditions dictated by physics, often formulated as conservation laws, and these can be imposed on a neural network using extra loss terms in the constrained optimization process [32].

Similar constraints are imposed in the so-called physics-informed neural networks approach [40,48]. This family of methods employs neural networks to solve highly nonlinear partial differential equations (PDEs), resulting in very accurate and numerically stable results. However, they rely on prior knowledge of the governing equations of the problem.

The authors have introduced the so-called thermodynamically consistent data-driven computational mechanics [7,9,10]. Unlike other existing works, this approach does not impose any particular balance equation to solve for. Instead, it relies on the imposition of the right thermodynamic structure on the resulting predictions, as dictated by the so-called GENERIC formalism [15]. As will be seen, this ensures conservation of energy and the right amount of entropy dissipation, thus giving rise to predictions satisfying the first and second principles of thermodynamics. These techniques, however, employ regression to unveil the thermodynamic structure of the problem at the sampling points. For previously unseen situations, they employ interpolation on the matrix manifold describing the system.

Recent works on symplectic networks [23] have by-passed those drawbacks by exploiting the mathematical properties of Hamiltonian systems, so no prior knowledge of the system is required. However, this technique only operates on conservative systems with no entropy generation.

The aim of this work is the development of a new structure-preserving neural network architecture capable of predicting the time evolution of a system based on experimental observations of the system, with no prior knowledge of its governing equations, valid for both conservative and dissipative systems. The key idea is to merge the proven computational power of neural networks in highly nonlinear physics with thermodynamically consistent data-driven algorithms. The resulting methodology, as will be seen, is a powerful neural network architecture, conceptually very simple (based on standard feed-forward methodologies), that exploits the right thermodynamic structure of the system as unveiled from experimental data, and that produces interpretable results [33].

The outline of the paper is as follows. A brief description of the problem setup is presented in Section 2. Next, in Section 3, the methodology is presented, covering both the GENERIC formalism and the feed-forward neural networks used to solve the stated problem. This technique is then applied to different physical systems of increasing complexity: a double thermo-elastic pendulum (Section 4) and a Couette flow of a viscoelastic fluid (Section 5). The paper is completed with a discussion in Section 6.

2. Problem statement

Weinan E seems to be the first author to interpret the process of learning physical systems as the solution of a dynamical system [6]. Consider a system whose governing variables will be hereafter denoted by $z \in \mathcal{M} \subseteq \mathbb{R}^n$, with $\mathcal{M}$ the state space of these variables, which is assumed to have the structure of a differentiable manifold in $\mathbb{R}^n$.

The problem of learning a given physical phenomenon can thus be seen as the one of finding an expression for the time evolution of its governing variables $z$,

$$\dot{z} = \frac{dz}{dt} = F(x, z, t), \quad x \in \Omega \subset \mathbb{R}^D, \; t \in I = (0, T], \quad z(0) = z_0, \tag{1}$$

where $x$ and $t$ refer to the space and time coordinates within a domain $\Omega$ with $D = 2, 3$ dimensions. $F(x, z, t)$ is the function that gives, after a prescribed time horizon $T$, the flow map $z_0 \to z(z_0, T)$.

While this problem can be seen as a general supervised learning problem (we fix both $z_0$ and $z$), when we have additional information about the physics being represented by the sought function $F$, it is legitimate to try to include it in the search procedure. Weinan E seems to have been the first to suggest imposing a Hamiltonian structure on $F$ if we know that energy is conserved, for instance [6]. Very recently, two different approaches have followed this same rationale [2,23].

For conservative systems, therefore, imposing a Hamiltonian structure seems a very appealing way to obtain thermodynamics-aware results. However, when the system is dissipative, this method does not provide valid results. Given the importance of dissipative phenomena (viscous solids, fluid dynamics, ...), we explore the right thermodynamic structure to impose on the search methodology.

The goal of this paper is to develop a new method of solving Eq. (1) using state-of-the-art deep learning tools, in order to predict the time evolution of the state variables of a given system. The solution is forced to fulfill the basic thermodynamic requirements of energy conservation and entropy inequality restrictions via the GENERIC formalism, presented in the next section.

3. Methodology

In this section we develop the appropriate thermodynamic structure for dissipative systems. Classical systems modeling can be done at a variety of scales. We could think of the most detailed (yet often impractical) scale of molecular dynamics, where energy conservation applies and the Hamiltonian paradigm can be imposed. However, the number of degrees of freedom and, noteworthy, the time scale, render this approach of little interest for many applications. On the other side of the spectrum lies thermodynamics, where only conserved, invariant quantities are described and thus there is no need for conservation principles. At any other (mesoscopic) scale, unresolved degrees of freedom give rise to the appearance of fluctuations in the results (or, equivalently, dissipation). At these scales, traditional modeling procedures imply expressing physical insights in the form of governing equations [13]. These equations are then validated against experimental observations.

Alternatively, thermodynamics can be thought of as a meta-physics, in the sense that it is actually a theory of theories [14]. It provides us with the right theoretical framework in which basic principles are met. And, in particular, for any of these intermediate or mesoscopic scales, a so-called metriplectic structure emerges. The term metriplectic comes from the combination of symplectic and Riemannian (metric) geometry and emphasizes the fact that there are conservative as well as dissipative contributions to the general evolution of such a system. Once such a geometric structure is found for the system, we are in the position of fixing the framework in which our neural networks can look for the adequate prediction of the future states of the system. The particular metriplectic structure that we employ for such a task is known, as stated before, as GENERIC.

3.1. The GENERIC formalism

The "General Equation for Non-Equilibrium Reversible-Irreversible Coupling" (GENERIC) formalism [15,36] establishes a mathematical framework in order to model the dynamics of a system. Furthermore, it is compatible with classical equilibrium thermodynamics [35], preserving the symmetries of the system as stated in Noether's theorem. It has served as the basis for the development of several consistent numerical integration algorithms that exploit these desirable properties [11,41].

The GENERIC structure for the evolution in Eq. (1) is obtained after finding two algebraic or differential operators

$$L : T^*\mathcal{M} \to T\mathcal{M}, \quad M : T^*\mathcal{M} \to T\mathcal{M},$$

where $T^*\mathcal{M}$ and $T\mathcal{M}$ represent, respectively, the cotangent and tangent bundles of $\mathcal{M}$. As in general Hamiltonian systems, there will be an energy potential, which we will denote hereafter by $E(z)$. In order to take into account the dissipative effects, a second potential (the so-called Massieu potential) is introduced in the formulation. It is, of course, the entropy potential of the GENERIC formulation, $S(z)$. With all these ingredients, we arrive at a description of the dynamics of the system of the type

$$\frac{dz}{dt} = L \frac{\partial E}{\partial z} + M \frac{\partial S}{\partial z}. \tag{2}$$

As shown in Eq. (2), the time evolution of the system described by the nonlinear operator $F(x, z, t)$ presented in Eq. (1) is now split into two separate terms:

Reversible term: It accounts for all the reversible (non-dissipative) phenomena of the system. In the context of classical mechanics, this term is equivalent to Hamilton's equations of motion, which relate the particle position and momentum. The operator $L(z)$ is the Poisson matrix (it defines a Poisson bracket) and is required to be skew-symmetric (a cosymplectic matrix).

Non-reversible term: The rest of the non-reversible (dissipative) phenomena of the system are modeled here. The operator $M(z)$ is the friction matrix and is required to be symmetric and positive semi-definite.

The GENERIC formulation of the problem is completed with the following so-called degeneracy conditions:

$$L \frac{\partial S}{\partial z} = M \frac{\partial E}{\partial z} = 0. \tag{3}$$

The first condition expresses the reversible nature of the $L$ contribution to the dynamics, whereas the second requirement expresses the conservation of the total energy by the $M$ contribution. This means nothing other than that the energy potential does not contribute to the production of entropy and, conversely, that the entropy functional does not contribute to reversible dynamics. This mutual degeneracy requirement, in addition to the already mentioned requirements on the $L$ and $M$ matrices, ensures that

$$\frac{dE}{dt} = \frac{\partial E}{\partial z} \cdot \frac{dz}{dt} = \frac{\partial E}{\partial z} \cdot \left( L \frac{\partial E}{\partial z} + M \frac{\partial S}{\partial z} \right) = 0,$$

which expresses the conservation of energy in an isolated system, also known as the first law of thermodynamics. (The first term vanishes by the skew-symmetry of $L$; the second by the degeneracy condition $M \, \partial E / \partial z = 0$ together with the symmetry of $M$.) Applying the same reasoning to the entropy $S$:

$$\frac{dS}{dt} = \frac{\partial S}{\partial z} \cdot \frac{dz}{dt} = \frac{\partial S}{\partial z} \cdot \left( L \frac{\partial E}{\partial z} + M \frac{\partial S}{\partial z} \right) = \frac{\partial S}{\partial z} \cdot M \frac{\partial S}{\partial z} \geq 0,$$

which guarantees the entropy inequality (the reversible term cancels by the degeneracy condition $L \, \partial S / \partial z = 0$, and the inequality follows from the positive semi-definiteness of $M$), that is, the second law of thermodynamics.
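Before moving to the discrete setting, these two derivations can be checked numerically with a few lines of code. The following is a minimal sketch of ours, not part of the original paper: the heat-exchange friction block κ[[T2/T1, −1], [−1, T1/T2]] is a standard choice from the GENERIC literature, and all numerical values are purely illustrative.

```python
import numpy as np

# Illustrative values (assumptions, not taken from the paper).
T1, T2, kappa = 300.0, 350.0, 0.5

# State z = (q, p, s1, s2): canonical skew-symmetric block for (q, p),
# zero rows/columns for the entropies.
L = np.zeros((4, 4))
L[0, 1], L[1, 0] = 1.0, -1.0

# Heat-exchange friction block on (s1, s2): symmetric, positive semi-definite,
# and degenerate against the temperature vector (T1, T2).
M = np.zeros((4, 4))
M[2:, 2:] = kappa * np.array([[T2 / T1, -1.0],
                              [-1.0, T1 / T2]])

dE = np.array([2.0, -1.0, T1, T2])   # dE/ds_i = T_i, cf. Eq. (19) in Section 4
dS = np.array([0.0, 0.0, 1.0, 1.0])  # S = s1 + s2, cf. Eq. (18) in Section 4

zdot = L @ dE + M @ dS               # GENERIC evolution, Eq. (2)

print(np.allclose(L @ dS, 0), np.allclose(M @ dE, 0))  # degeneracy, Eq. (3)
print(np.isclose(dE @ zdot, 0))      # first law: dE/dt = 0
print(dS @ zdot >= 0)                # second law: dS/dt >= 0
```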

3.2. Proposed integration algorithm

Once the learning procedure is accomplished, our neural network is expected to integrate the system dynamics in time, given previously unseen initial conditions. In order to numerically solve the GENERIC equation, we formulate the discretized version of Eq. (2) following previous works [11]:

$$\frac{z_{n+1} - z_n}{\Delta t} = L \cdot \frac{\mathrm{D}E}{\mathrm{D}z} + M \cdot \frac{\mathrm{D}S}{\mathrm{D}z}. \tag{4}$$

The time derivative of the original equation is discretized with a forward Euler scheme in time increments $\Delta t$, where $z_{n+1} = z_{t + \Delta t}$. $L$ and $M$ are the discretized versions of the Poisson and friction matrices. Last, $\mathrm{D}E/\mathrm{D}z$ and $\mathrm{D}S/\mathrm{D}z$ represent the discrete gradients, which can be approximated in a finite element sense as:

$$\frac{\mathrm{D}E}{\mathrm{D}z} \simeq A z, \quad \frac{\mathrm{D}S}{\mathrm{D}z} \simeq B z, \tag{5}$$

where $A$ and $B$ represent the discrete matrix form of the gradient operators.

Finally, algebraically manipulating Eq. (4) together with Eq. (5) and including the degeneracy conditions of Eq. (3), the proposed integration scheme for predicting the dynamics of a physical system is the following:

$$z_{n+1} = z_n + \Delta t \left( L \cdot A z_n + M \cdot B z_n \right), \tag{6}$$

subject to:

$$L \cdot B z_n = 0, \quad M \cdot A z_n = 0,$$

ensuring the thermodynamic consistency of the resulting model.

To sum up, the main objective of this work is to compute the form of the $A(z)$ and $B(z)$ gradient operator matrices, subject to the degeneracy conditions, in order to integrate the initial system state variables $z_0$ over certain time steps $\Delta t$ of the time interval $I$. Usually, the form of the matrices $L$ and $M$ is known in advance, given the vast literature in the field. If necessary, these terms can also be computed [11].
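In code, the scheme of Eq. (6) reduces to a few matrix products per step. The sketch below is a minimal illustration of ours under the paper's notation; `A_fn` and `B_fn` stand for whatever supplies the gradient-operator matrices (in this work, the neural network of Section 3.3) and are placeholder names, not the authors' API.

```python
import numpy as np

def generic_euler_step(z_n, L, M, A_fn, B_fn, dt):
    """One forward-Euler GENERIC step, Eq. (6):
    z_{n+1} = z_n + dt * (L @ (A z_n) + M @ (B z_n))."""
    A = A_fn(z_n)          # discrete gradient operator for E, Eq. (5)
    B = B_fn(z_n)          # discrete gradient operator for S, Eq. (5)
    return z_n + dt * (L @ (A @ z_n) + M @ (B @ z_n))
```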

3.3. Feed-forward neural networks

In the introduction we already mentioned the intrinsic power of neural networks in many fields. The main reason behind the fact that neural networks are able to learn and reproduce such a variety of problems is that they are considered to be universal approximators [4,19], meaning that they are capable of approximating any measurable function to any desired degree of accuracy. The main limitation of this technique is the correct selection of the tuning parameters of the network, also called hyperparameters.

Polynomials are another universal approximator, as they can approximate any infinitely differentiable function as a Taylor power series expansion. The main difference is that neural networks rely on composition of functions rather than sums of power series:

$$\hat{y} = \left( f^{[L]} \circ f^{[L-1]} \circ \cdots \circ f^{[l]} \circ \cdots \circ f^{[2]} \circ f^{[1]} \right)(x). \tag{7}$$

Eq. (7) shows that the desired output $\hat{y}$ for a given input $x$ of a neural network is a composition of different functions $f^{[l]}$, acting as building blocks of the network, in $L$ total layers. The challenge is to select the best combination of functions in the correct order such that it approximates the solution of the studied problem.


Fig. 1. Representation of a single neuron (left) as a part of a fully connected neural net (right).

The simplest building block of artificial deep neural network architectures is the neuron or perceptron (Fig. 1, left). Several neurons are stacked in a multilayer perceptron (MLP), which is mathematically defined as follows:

$$x^{[l]} = \sigma \left( w^{[l]} x^{[l-1]} + b^{[l]} \right), \tag{8}$$

where $l$ is the index of the current layer, $x^{[l-1]}$ and $x^{[l]}$ are the layer input and output vectors respectively, $w^{[l]}$ is the weight matrix of that layer, $b^{[l]}$ is its bias vector and $\sigma$ is the activation function. If no activation function is applied, the MLP is equivalent to a linear operator. However, $\sigma$ is chosen to be a nonlinear function in order to increase the capacity for modeling more complex problems, which are commonly nonlinear. In classification problems, the traditional activation function is the logistic function (sigmoid), whereas in regression problems the Rectified Linear Unit (ReLU) [8] or the hyperbolic tangent are commonly used.

In this work, we use a deep neural network architecture known as a feed-forward neural network [44]. It consists of several layers of multilayer perceptrons with no cyclic connections, as shown in Fig. 1 (right).

The input of the neural net is the state vector of a given time step, $z_n$, and the outputs are the concatenated GENERIC matrices $A^{\text{net}}_n$ and $B^{\text{net}}_n$: for a system with $n$ state variables, the numbers of inputs and outputs are $N_{\text{in}} = n$ and $N_{\text{out}} = 2n^2$. Then, using the GENERIC integration scheme, the state vector at the next time step, $z^{\text{net}}_{n+1}$, is obtained. This method is repeated for the whole simulation time $T$ with a total of $N_T$ snapshots.
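A minimal PyTorch sketch of the architecture just described could look as follows. This is an illustration of ours, not the authors' released code; the class name `SPNN` and the reshaping convention for the concatenated output are assumptions.

```python
import torch
import torch.nn as nn

class SPNN(nn.Module):
    """MLP mapping z_n (N_in = n) to the concatenated GENERIC matrices
    A_n and B_n (N_out = 2 n^2), reshaped into n x n matrices."""

    def __init__(self, n_state, n_hidden_layers=5):
        super().__init__()
        width = 2 * n_state**2        # hidden size equal to N_out (see below)
        layers, size_in = [], n_state
        for _ in range(n_hidden_layers):
            layers += [nn.Linear(size_in, width), nn.ReLU()]
            size_in = width
        layers += [nn.Linear(size_in, 2 * n_state**2)]   # linear last layer
        self.mlp = nn.Sequential(*layers)
        self.n = n_state

    def forward(self, z):
        out = self.mlp(z)             # input normalization omitted for brevity
        A = out[..., : self.n**2].reshape(*z.shape[:-1], self.n, self.n)
        B = out[..., self.n**2 :].reshape(*z.shape[:-1], self.n, self.n)
        return A, B
```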

The state variables of a general dynamical system may differ by several orders of magnitude from each other, due to their own physical nature or measurement units. Thus, a pre-processing of the input data (scaling or normalization) can improve the model performance and stability.

The number of hidden layers $N_h$ depends on the complexity of the problem. Increasing the net size raises the computational power of the net to model more complex phenomena. However, it slows the training process and could lead to data overfitting, limiting its generalization and extrapolation capacity. The size of the hidden layers is chosen to be the same as the output size of the net, $N_{\text{out}}$.

The cost function for our neural network is composed of three different terms:

Data loss: The main loss condition is the agreement between the network output and the real data. It is computed as the squared error sum between the predicted state vector $z^{\text{net}}_{n+1}$ and the ground truth solution $z^{\text{GT}}_{n+1}$ for each time step:

$$\mathcal{L}^{\text{data}}_n = \left\| z^{\text{GT}}_{n+1} - z^{\text{net}}_{n+1} \right\|_2^2. \tag{9}$$

Fulfillment of the degeneracy conditions: The cost function will also account for the degeneracy conditions in order to ensure the thermodynamic consistency of the solution, implemented as the sum of the squared elements of the degeneracy vectors for each time step:

$$\mathcal{L}^{\text{degen}}_n = \left\| L \cdot B^{\text{net}}_n z^{\text{net}}_n \right\|_2^2 + \left\| M \cdot A^{\text{net}}_n z^{\text{net}}_n \right\|_2^2. \tag{10}$$

This term acts as a regularization of the loss function and, at the same time, is responsible for ensuring thermodynamic consistency. So to speak, it is the cornerstone of our method.

Regularization: In order to avoid overfitting, an extra L2 regularization term $\mathcal{L}^{\text{reg}}$ is added to the loss function, defined as the sum over the squared weight parameters of the network:

$$\mathcal{L}^{\text{reg}} = \sum_{l}^{L} \sum_{i}^{n^{[l]}} \sum_{j}^{n^{[l+1]}} \left( w^{[l]}_{i,j} \right)^2. \tag{11}$$

Fig. 2. Sketch of a structure-preserving neural network training algorithm.

The total cost function is computed as the sum squared error (SSE) of the data loss and degeneracy residual, in addition to the regularization term, at the end of the simulation time $T$ for each train case. The regularization loss is highly dependent on the size of the network layers and has a different scaling with respect to the other terms, so it is compensated with the regularization hyperparameter (weight decay) $\lambda_r$. An additional weight $\lambda_d$ is added to the data loss term, which accounts for the relative scaling error with respect to the degeneracy conditions:

$$\mathcal{L} = \sum_{n=0}^{N_T} \left( \lambda_d \mathcal{L}^{\text{data}}_n + \mathcal{L}^{\text{degen}}_n \right) + \lambda_r \mathcal{L}^{\text{reg}}. \tag{12}$$

The usual backpropagation algorithm [37] is then used to calculate the gradient of the loss function with respect to each net parameter (weight and bias vectors), which are updated with the gradient descent technique [43]. The process is then repeated for a maximum number of epochs $n_{\text{epoch}}$. The resulting training algorithm is sketched in Fig. 2.
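Putting the pieces together, one training iteration over a single trajectory might be sketched as below. This is a hedged illustration of ours: the variable names are ours, and the paper delegates the L2 term of Eq. (11) to the optimizer's weight decay rather than computing it explicitly.

```python
import torch

def train_trajectory(spnn, optimizer, z_gt, L_mat, M_mat, dt, lambda_d=1e2):
    """One optimizer step over a ground-truth trajectory z_gt of shape (N_T + 1, n)."""
    loss_data = loss_degen = 0.0
    for n in range(z_gt.shape[0] - 1):
        A, B = spnn(z_gt[n])                               # forward pass, Eq. (8)
        dE, dS = A @ z_gt[n], B @ z_gt[n]                  # discrete gradients, Eq. (5)
        z_next = z_gt[n] + dt * (L_mat @ dE + M_mat @ dS)  # integration, Eq. (6)
        loss_data += ((z_gt[n + 1] - z_next) ** 2).sum()   # data loss, Eq. (9)
        loss_degen += ((L_mat @ dS) ** 2).sum() \
                    + ((M_mat @ dE) ** 2).sum()            # degeneracy loss, Eq. (10)
    loss = lambda_d * loss_data + loss_degen               # total SSE loss, Eq. (12)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```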

The proposed methodology is tested with two different databases of nonlinear physical systems, split into a partition of train cases ($N_{\text{train}} = 80\%$ of the database) and test cases ($N_{\text{test}} = 20\%$ of the database). The net performance is evaluated with the mean squared error (MSE) of the state variables prediction, associated with the data loss term, Eq. (9), over all the time snapshots:

$$\text{MSE}^{\text{data}}(z_i) = \frac{1}{N_T} \sum_{n=0}^{N_T} \left( z^{\text{GT}}_{i,n} - z^{\text{net}}_{i,n} \right)^2. \tag{13}$$

The same procedure is applied to the degeneracy constraint, associated with the degeneracy loss term, Eq. (10), over all the time snapshots:

$$\text{MSE}^{\text{degen}}(z_i) = \frac{1}{N_T} \sum_{n=0}^{N_T} \left( \left\| L \cdot B^{\text{net}}_{i,n} z^{\text{net}}_{i,n} \right\|_2^2 + \left\| M \cdot A^{\text{net}}_{i,n} z^{\text{net}}_{i,n} \right\|_2^2 \right). \tag{14}$$

As a general error magnitude of the algorithm, the average MSE over both the train ($N = N_{\text{train}}$) and test trajectories ($N = N_{\text{test}}$) is also reported for both the data ($m = \text{data}$) and degeneracy ($m = \text{degen}$) constraints:

$$\overline{\text{MSE}}^{\,m}(z) = \frac{1}{N} \sum_{i=1}^{N} \text{MSE}^{m}(z_i). \tag{15}$$

Algorithm 1 and Algorithm 2 show pseudocode for the training and test processes, respectively. The proposed method is fully implemented in PyTorch [38] and trained on an Intel Core i7-8665U CPU.

4. Validation examples: double thermo-elastic pendulum

4.1. Description

The first example is a double thermo-elastic pendulum (Fig. 3), consisting of two masses $m_1$ and $m_2$ connected by two springs of variable lengths $\lambda_1$ and $\lambda_2$ and natural lengths at rest $\lambda_1^0$ and $\lambda_2^0$.

The set of variables describing the double pendulum is here chosen to be

$$\mathcal{S} = \left\{ z = (q_1, q_2, p_1, p_2, s_1, s_2) \in (\mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R} \times \mathbb{R}), \; q_1 \neq 0, \; q_1 \neq q_2 \right\}, \tag{16}$$

where $q_i$, $p_i$ and $s_i$ are the position, linear momentum and entropy of each mass $i = 1, 2$.

Algorithm 1 Pseudocode for the train algorithm.

Load train database: z^GT (train partition), Δt, L, M
Define network architecture: N_in, N_out = 2 N_in², N_h, σ_j
Define hyperparameters: η, λ_d, λ_r
Initialize w_{i,j}, b_j
for epoch ← 1, n_epoch do
    for train_case ← 1, N_train do
        Initialize state vector: z_0^net ← z_0^GT
        Initialize losses: L^data, L^degen = 0
        for snapshot ← 1, N_T do
            Forward propagation: [A_n^net, B_n^net] ← Net(z_n^GT)    ▷ Eq. (8)
            Time integration: z_{n+1}^net ← z_n^net + Δt (L · A_n^net z_n^net + M · B_n^net z_n^net)    ▷ Eq. (4)
            Update data loss: L^data ← L^data + L_n^data    ▷ Eq. (9)
            Update degeneracy loss: L^degen ← L^degen + L_n^degen    ▷ Eq. (10)
        end for
        SSE loss function: L ← λ_d L^data + L^degen + λ_r L^reg    ▷ Eq. (11), Eq. (12)
        Backward propagation
        Optimizer step
    end for
    Learning rate scheduler
end for

Algorithm 2 Pseudocode for the test algorithm.

Load test database: z^GT (test partition), Δt, L, M
Load network parameters
for test_case ← 1, N_test do
    Initialize state vector: z_0^net ← z_0^GT
    for snapshot ← 1, N_T do
        Forward propagation: [A_n^net, B_n^net] ← Net(z_n^net)    ▷ Eq. (8)
        Time step integration: z_{n+1}^net ← z_n^net + Δt (L · A_n^net z_n^net + M · B_n^net z_n^net)    ▷ Eq. (4)
        Update state vector: z_n^net ← z_{n+1}^net
        Update snapshot: n ← n + 1
    end for
    Compute MSE^data, MSE^degen    ▷ Eq. (13), Eq. (14)
end for
Compute averaged MSE^data, MSE^degen    ▷ Eq. (15)

Fig. 3. Double thermo-elastic pendulum.

The lengths of the springs $\lambda_1$ and $\lambda_2$ are defined solely in terms of the positions as

$$\lambda_1 = \sqrt{q_1 \cdot q_1}, \quad \lambda_2 = \sqrt{(q_2 - q_1) \cdot (q_2 - q_1)}.$$

The total energy of the system can be expressed as the sum of the kinetic energy of the two masses, $K_i$, and the internal energy of the springs, $e_i$, for $i = 1, 2$:

$$E = E(z) = \sum_i K_i(z) + \sum_i e_i(\lambda_i, s_i), \quad K_i = \frac{1}{2 m_i} |p_i|^2. \tag{17}$$

Fig. 4. Loss evolution of data and degeneracy constraints for each epoch of the structure-preserving neural network training process of the double pendulum example.

The total entropy of the double pendulum is the sum of the entropies of the two masses, $s_i$:

$$S = S(z) = s_1 + s_2. \tag{18}$$

This model includes thermal effects in the stretching of the springs due to the Gough-Joule effect. The absolute temperatures $T_i$ at each spring are obtained through Eq. (19). These temperature changes induce a heat flux between both springs, proportional to the temperature difference and to a conductivity constant $\kappa > 0$:

$$T_i = \frac{\partial e_i}{\partial s_i}. \tag{19}$$

In this case, there is a clear contribution of both conservative Hamiltonian mechanics (mass movement) and non-Hamiltonian dissipative effects (heat flux), resulting in a non-zero friction matrix ($M \neq 0$). Thus, the GENERIC matrices associated with this physical system are known to be [11]

$$L = \begin{pmatrix} 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}, \quad M = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & -1/2 \\ 0 & 0 & 0 & 0 & -1/2 & 1 \end{pmatrix}, \tag{20}$$

where the unit entries in $L$ act as identity blocks on the planar $q_i$, $p_i$ pairs, so that $L$ is skew-symmetric and $M$ is symmetric positive semi-definite.
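As a sketch of ours (not the authors' code) of how these matrices can be assembled in practice, reading the unit entries of Eq. (20) as 2x2 identity blocks for the planar positions and momenta, so that z ∈ R¹⁰:

```python
import numpy as np

# Ordering z = (q1, q2, p1, p2, s1, s2), with q_i, p_i in R^2.
I2 = np.eye(2)
L = np.zeros((10, 10))
L[0:2, 4:6] = I2;  L[2:4, 6:8] = I2    # dq_i/dt <- +dE/dp_i
L[4:6, 0:2] = -I2; L[6:8, 2:4] = -I2   # dp_i/dt <- -dE/dq_i

M = np.zeros((10, 10))
M[8:, 8:] = np.array([[1.0, -0.5],     # entropic (heat-exchange) block,
                      [-0.5, 1.0]])    # values as reconstructed in Eq. (20)

assert np.allclose(L, -L.T)                      # skew-symmetry
assert np.all(np.linalg.eigvalsh(M) >= -1e-12)   # positive semi-definiteness
```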

4.2. Database and hyperparameters

The training database is generated with a thermodynamically consistent time-stepping algorithm [41] in MATLAB. The masses of the double pendulum are set to $m_1 = 1$ kg and $m_2 = 2$ kg, joined by springs of natural lengths $\lambda_1^0 = 2$ m and $\lambda_2^0 = 1$ m, thermal constants $C_1 = 0.02$ J and $C_2 = 0.2$ J, and a conductivity constant $\kappa = 0.5$. The simulation time of the movement is $T = 60$ s in time increments of $\Delta t = 0.3$ s ($N_T = 200$ snapshots).

The database consists of the state vector, Eq. (16), of 50 different trajectories with random initial conditions of position $q_i$ and linear momentum $p_i$ of both masses $m_i$ ($i = 1, 2$) around a mean position and linear momentum of $q_1 = [4.5, 4.5]$ m, $p_1 = [2, 4.5]$ kg·m/s, and $q_2 = [-0.5, 1.5]$ m, $p_2 = [1.4, 0.2]$ kg·m/s, respectively. Although the initial conditions of the simulations are similar, they result in a wide variety of mass trajectories due to the chaotic behavior of the system. This database is split randomly into 40 train trajectories and 10 test trajectories. Thus, there is a total of 8,000 training snapshots and 2,000 test snapshots.

The net input and output sizes are $N_{\text{in}} = 10$ and $N_{\text{out}} = 2 N_{\text{in}}^2 = 200$. The state vector is normalized based on the statistical mean and standard deviation of the training set. The number of hidden layers is $N_h = 5$, with ReLU activation functions and a linear activation in the last layer. The network is initialized according to the Kaiming method [16] with normal distribution, and the optimizer used is Adam [24] with a weight decay of $\lambda_r = 10^{-5}$ and a data loss weight of $\lambda_d = 10^2$. A multistep learning rate scheduler is used, starting at $\eta = 10^{-3}$ and decaying by a factor of $\gamma = 0.1$ at epochs 600 and 1200. The training process ends when a fixed number of epochs $n_{\text{epoch}} = 1800$ is reached.
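The training configuration just described maps directly onto standard PyTorch APIs. A sketch of ours follows, with `SPNN` as in the illustration of Section 3.3, and zero bias initialization as an extra assumption:

```python
import torch
import torch.nn as nn

def init_weights(m):
    if isinstance(m, nn.Linear):
        nn.init.kaiming_normal_(m.weight)  # Kaiming method, normal distribution [16]
        nn.init.zeros_(m.bias)             # assumption: biases start at zero

model = SPNN(n_state=10)                   # N_in = 10, N_out = 200
model.apply(init_weights)

# Adam with weight decay lambda_r = 1e-5; initial learning rate eta = 1e-3.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-5)
# Decay by gamma = 0.1 at epochs 600 and 1200.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[600, 1200], gamma=0.1)

# for epoch in range(1800): ...train over all trajectories...; scheduler.step()
```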

Fig. 5. Time evolution of the state variables in a test trajectory of a double thermo-elastic pendulum using a time-stepping solver (Ground Truth, GT) and the proposed GENERIC integration scheme (SPNN). Since every variable has a vectorial character, both components are depicted and labeled as X and Y, respectively.

Table 1
Mean squared error of the data loss (MSE^data) and degeneracy loss (MSE^degen) for all the state variables of the double pendulum.

State variable        MSE^data (train)  MSE^data (test)  MSE^degen (train)  MSE^degen (test)
q1 [m]        X       1.95·10−2         3.87·10−2        3.56·10−8          4.43·10−8
              Y       2.72·10−2         8.21·10−2        4.74·10−8          5.77·10−8
q2 [m]        X       2.04·10−2         3.65·10−2        9.28·10−8          8.55·10−8
              Y       2.72·10−2         3.73·10−2        3.55·10−8          5.01·10−8
p1 [kg·m/s]   X       6.43·10−4         1.33·10−4        4.00·10−8          7.08·10−8
              Y       1.06·10−3         4.06·10−3        1.21·10−7          1.40·10−7
p2 [kg·m/s]   X       4.88·10−4         9.84·10−4        6.00·10−8          4.58·10−8
              Y       8.47·10−4         1.79·10−4        9.76·10−8          1.20·10−7
s1 [J/K]              1.21·10−5         3.51·10−5        1.31·10−7          2.06·10−7
s2 [J/K]              1.22·10−5         3.18·10−5        2.40·10−7          2.95·10−7

Table 2
Mean squared error of the energy (MSE(E)) and entropy (MSE(S)) of the double pendulum.

Variable   Train       Test
E [J]      7.99·10−3   8.86·10−3
S [J/K]    6.52·10−8   6.33·10−8

4.3. Results

Fig. 5 shows the time evolution of the state variables (position, momentum and entropy) of each mass given by the solver and by the neural net.

Table 1 shows the mean squared error of the data and degeneracy loss terms for all the state variables of the double pendulum. The results are computed separately as the mean over all the train and test trajectories using Eq. (15).

Fig. 6 and Fig. 7 show the time evolution of the internal and kinetic energy (Eq. (17)) and of the entropy (Eq. (18)), respectively, for the two pendulum masses ($i = 1, 2$). The total energy is conserved and the total entropy satisfies the entropy inequality, fulfilling the first and second laws of thermodynamics respectively. The mean error for both train and test trajectories is reported in Table 2.

Fig. 6. Time evolution of the energy in a test trajectory of a double thermo-elastic pendulum using a time-stepping solver (Ground Truth, GT) and the proposed GENERIC integration scheme (Net).

Fig. 7. Time evolution of the entropy in a test trajectory of a double thermo-elastic pendulum using a time-stepping solver (Ground Truth, GT) and the proposed GENERIC integration scheme (SPNN).

5. Couette flow of an Oldroyd-B fluid

5.1. Description

The second example is a shear (Couette) flow of an Oldroyd-B fluid. This is a constitutive model for viscoelastic fluids, consisting of linear elastic dumbbells (representing polymer chains) immersed in a solvent.

The Oldroyd-B model arises in the modeling of flows of diluted polymeric solutions. This model can be obtained both from a purely macroscopic point of view as well as from a microscopic one, by modeling polymer chains as linear dumbbells diluted in a Newtonian substrate. Alternatively, it can also be obtained by considering the deviatoric part $T$ of the stress tensor $\sigma$ (the so-called extra-stress tensor) to be of the form

$$T + \lambda_1 \overset{\triangledown}{T} = \eta_0 \left( \dot{\gamma} + \lambda_2 \overset{\triangledown}{\dot{\gamma}} \right), \tag{21}$$


Fig. 8. Couette flow in an Oldroyd-B fluid.

where the triangle denotes the non-linear Oldroyd upper-convected derivative [39]. The coefficients $\eta_0$, $\lambda_1$ and $\lambda_2$ are model parameters. It is standard to denote the strain rate tensor by $\dot{\gamma} = \nabla^s v = D$.

Finally, the stress in the solvent (denoted by a subscript s) and in the polymer (denoted by a subscript p) are given by

$$T = \eta_s \dot{\gamma} + \tau,$$

so that

$$\tau + \lambda_1 \overset{\triangledown}{\tau} = \eta_p \dot{\gamma},$$

which is the constitutive equation for the elastic stress.

Pseudo-experimental data are obtained by the CONNFFESSIT technique [27], based on the Fokker-Planck equation [28]. This equation is solved by converting it into its corresponding Itô stochastic differential equation,

$$dr_x = \left( \frac{\partial v}{\partial y} r_y - \frac{1}{2\,\text{We}} r_x \right) dt + \frac{1}{\sqrt{\text{We}}} \, dV_t,$$

$$dr_y = -\frac{1}{2\,\text{We}} r_y \, dt + \frac{1}{\sqrt{\text{We}}} \, dW_t, \tag{22}$$

where $v$ is the flow velocity, $r = [r_x, r_y]$, with $r_x = r_x(y, t)$ the position vector and, assuming a Couette flow, $r_y = r_y(t)$ depending only on time. We stands for the Weissenberg number, and $V_t$, $W_t$ are two independent one-dimensional Brownian motions. This equation is solved via Monte Carlo techniques, by replacing the mathematical expectation by the empirical mean.
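For reference, the Monte Carlo solution of Eq. (22) amounts to an Euler-Maruyama march of K dumbbells per node. The sketch below is our own illustration, reading the noise amplitude as 1/sqrt(We) as in the reconstruction of Eq. (22), and taking the local shear rate dv/dy as given.

```python
import numpy as np

def dumbbell_step(rx, ry, dv_dy, We, dt, rng):
    """One Euler-Maruyama step of Eq. (22) for K dumbbells (arrays of shape (K,))."""
    dV = np.sqrt(dt) * rng.standard_normal(rx.shape)   # increment of V_t
    dW = np.sqrt(dt) * rng.standard_normal(ry.shape)   # increment of W_t
    rx_new = rx + (dv_dy * ry - rx / (2 * We)) * dt + dV / np.sqrt(We)
    ry_new = ry - ry / (2 * We) * dt + dW / np.sqrt(We)
    return rx_new, ry_new
```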

The model relies on the microscopic description of the state of the dumbbells. Thus, it is particularly useful to base the microscopic description on the evolution of the conformation tensor $c = \langle r r \rangle$, that is, the second moment of the dumbbell end-to-end distance distribution function. This tensor is in general not experimentally measurable and plays the role of an internal variable. The expected $xy$ stress component will be given by

$$\tau = \frac{\epsilon}{\text{We}} \frac{1}{K} \sum_{k=1}^{K} r_x r_y,$$

where $K$ is the number of simulated dumbbells and $\epsilon = \nu_p / \nu_s$ is the ratio of the polymer to solvent viscosities.

The state variables selected for this problem are the position of the fluid at each node of the mesh (see Fig. 8), its velocity $v$ in the $x$ direction, the internal energy $e$ and the conformation tensor shear component $\tau$:

$$\mathcal{S} = \left\{ z = (q, v, e, \tau) \in (\mathbb{R}^2 \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}) \right\}. \tag{23}$$

The GENERIC matrices associated with each node of this physical system are the following:

$$L = \begin{pmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & -1 & 0 & 0 \end{pmatrix}, \quad M = \begin{pmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & -1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}. \tag{24}$$

Fig. 9. Loss evolution of data and degeneracy constraints for each epoch of the neural network training process of the Couette flow example.

In order to simulate a measurement of real captured data, Gaussian noise is added to the state vector, computed as a random variable following a normal distribution with zero mean and standard deviation proportional to the standard deviation of the database, $\sigma_z$, and the noise level $\nu$:

$$z^{\text{GT}}_{\text{noise}} = z^{\text{GT}} + \nu \cdot \sigma_z \cdot \mathcal{N}(0, 1). \tag{25}$$
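In code, Eq. (25) is a one-line perturbation of the database. A sketch of ours, with the noise level ν as a parameter:

```python
import numpy as np

def add_noise(z_gt, nu=0.01, rng=None):
    """Eq. (25): zero-mean Gaussian noise scaled by the database std and level nu."""
    if rng is None:
        rng = np.random.default_rng()
    sigma_z = z_gt.std(axis=0)   # per-variable standard deviation of the database
    return z_gt + nu * sigma_z * rng.standard_normal(z_gt.shape)
```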

The results of both the noise-free and the noisy databases are compared with two different network architectures:

Unconstrained network: This architecture is the same as the proposed network but removes the degeneracy conditions of the energy and entropy, Eq. (10), from the loss function. These conditions ensure the thermodynamic consistency of the resulting integrator, so not including them negatively affects the accuracy of the results, as will be seen.

Black-box network: In this case, no GENERIC architecture is imposed, and the network acts as a black-box integrator trained to directly predict the state vector time evolution $z_{t+1}$ from the previous time step $z_t$. This naive approach is shown to be inappropriate, as no physical restrictions are given to the model.

5.2. Database and hyperparameters

The training database for this Oldroyd-B model is generated in MATLAB with a multiscale approach [28] in dimensionless form. The fluid is discretized in the vertical direction with $N = 100$ elements (101 nodes) over a total height of $H = 1$. A total of 10,000 dumbbells were considered at each nodal location in the model. The lid velocity is set to $V = 1$, the viscoelastic Weissenberg number to $\text{We} = 1$ and the Reynolds number to $\text{Re} = 0.1$. The simulation time of the movement is $T = 1$ in time increments of $\Delta t = 0.0067$ ($N_T = 150$ snapshots).

The database consists of the state vector (Eq. (23)) of the 100 node trajectories (excluding the node at $h = H$, for which a no-slip condition $v = 0$ has been imposed). This database is split into 80 train trajectories and 20 test trajectories.

The net input and output sizes are $N_{\text{in}} = 5$ and $N_{\text{out}} = 2 N_{\text{in}}^2 = 50$. The number of hidden layers is $N_h = 5$, with ReLU activation functions and a linear activation in the last layer. The network is initialized according to the Kaiming method [16] with normal distribution, and the optimizer used is Adam [24], with a weight decay of $\lambda_r = 10^{-5}$ and a data loss weight of $\lambda_d = 10^3$. A multistep learning rate scheduler is used, starting at $\eta = 10^{-3}$ and decaying by a factor of $\gamma = 0.1$ at epochs 500 and 1000. The training process ends when a fixed number of epochs $n_{\text{epoch}} = 1500$ is reached. The same parameters are also considered for the noisy database network ($\nu = 1\%$) and for the unconstrained network.

The black-box network training parameters are analogous to those of the structure-preserving network, except for the output size $N_{\text{out}} = N_{\text{in}} = 5$. Several network architectures were tested, and the lowest error is achieved with $N_h = 5$ hidden layers and 25 neurons per layer.

The time evolution of the data ($\mathcal{L}^{\text{data}}$) and degeneracy ($\mathcal{L}^{\text{degen}}$) loss terms for each training epoch is shown in Fig. 9.

5.3. Results

Fig. 10 shows the time evolution of the state variables (position $q$, velocity $v$, internal energy $e$ and conformation tensor shear component $\tau$) given by the solver and by the neural net. There is good agreement between both plots. Moreover, the proposed scheme is able to predict the time evolution of the flow for several snapshots beyond the training simulation time $T = 1$, as shown in the same figure.

Table 3 shows the mean squared error of the data and degeneracy loss terms for all the state variables of the Couette flow of an Oldroyd-B fluid. The results are computed separately as the mean over all the train and test trajectories using Eq. (15).

Fig. 11 shows a box plot of the data error (MSE^data) for the train and test sets in the four studied architectures. The results of the structure-preserving neural network outperform the other two approaches even with noisy training data. The error of the unconstrained neural network is more than one order of magnitude greater than that of our approach, proving the importance of the degeneracy conditions in the GENERIC formulation. Last, the naive black-box approach shows the worst performance of the four networks, as no physical restriction is considered.

Fig. 10. Time evolution of the state variables in five test nodes of a Couette flow using a solver (Ground Truth, GT) and the proposed GENERIC integration scheme (Net). The dotted vertical line represents the simulation time T = 1 of the training dataset.

Table 3
Mean squared error of the data loss (MSE^data) and degeneracy loss (MSE^degen) for all the state variables of the Couette flow.

State variable        MSE^data (train)  MSE^data (test)  MSE^degen (train)  MSE^degen (test)
q [-]         X       5.40·10−6         6.29·10−6        1.72·10−7          1.96·10−7
              Y       0.00              0.00             0.00               0.00
v [-]                 3.23·10−5         4.75·10−5        1.19·10−6          1.48·10−6
e [-]                 7.85·10−6         6.60·10−6        7.06·10−7          9.11·10−7
τ [-]                 2.36·10−5         1.26·10−5        1.07·10−6          1.31·10−6

Fig. 11. Box plots for the data integration mean squared error (MSEdata) of the Couette flow in both train and test cases.

With respect to our previous work [11], which employed a piece-wise linear regression approach, these examples show similar levels of accuracy, but a much greater level of robustness. For instance, this same example was included in the mentioned reference. However, in that case, the problem had to be solved with the help of a reduced-order model with only six degrees of freedom, due to the computational burden of the approach. In our former approach, the GENERIC structure was identified by piece-wise linear regression for each of the few global modes of the approximation. So to speak, in that case, we learnt the characteristics of the flow. Here, on the contrary, the net is able to find an approximation for any velocity value at the 101 nodes of the mesh (say, fluid particles) without any difficulty. In this case, we are learning the behavior of fluid particles. It will be interesting, however, to study to what extent the use of variational autoencoders, as in Bertalan et al. [2], could help in solving more intricate models. Autoencoders help in determining the actual number of degrees of freedom needed to represent a given physical phenomenon.

Table 4
Computation training time of the proposed algorithm for the two reported examples in the noise-free networks.

Example           Epoch time      Total time
Double pendulum   2.44 s/epoch    73.18 min
Couette flow      1.22 s/epoch    30.53 min

6. Conclusions

In this work we have presented a new methodology to ensure thermodynamic consistency in the deep learning of physical phenomena. In contrast to existing methods, this methodology does not need to know in advance any information related to balance equations or the precise form of the PDE governing the phenomena at hand. The method is constructed on top of the right thermodynamic principles, ensuring the fulfillment of energy conservation and entropy production. It is valid, therefore, for conservative as well as dissipative systems, thus overcoming previous approaches in the field.

When compared with our previous works in the field (see Gonzalez et al. [11]), the present methodology proved to be more robust, allowing us to find approximations for systems with orders of magnitude more degrees of freedom. This new approach is also less computationally demanding. For the double pendulum case, the snapshot optimization of the GENERIC matrices proposed in [11] has a measured performance of 10 min per trajectory, which adds up to 400 minutes considering the 40 studied trajectories, whereas our new neural-network approach trains in only 73.18 minutes. The computational time of the other examples is shown in Table 4.

The reported results show good agreement between the network output and the synthetic ground truth solution, even with moderately noisy data. We have also shown the importance of including the degeneracy conditions of the GENERIC formulation in the neural network constraints, as this ensures the thermodynamic consistency of the integrator. The structure-preserving neural network outperforms other naive black-box approaches, since the physical constraints act as an inductive bias, facilitating the learning process. However, the error can be reduced using several techniques:

Database: As a general method of increasing the precision of an Euler integration scheme, the time step $\Delta t$ can be decreased so that the total number of snapshots is increased. On the other hand, the database will then be larger, slowing the training process. In the same way, the database can be enriched with a wider variety of cases, improving the net's predictive capabilities.

Integration scheme: A higher-order Runge-Kutta integration scheme could be introduced in Eq. (4) in order to get higher solution accuracy [47] (see the sketch after this list). However, it requires several forward passes through the neural net for each time step, increasing the complexity of the integration scheme and the training process. Additionally, GENERIC-based integration schemes have shown very good performance even for first-order approaches [41].

Net architecture: To increase the computational power of the net, more and larger hidden layers $N_h$ can be added. However, this could lead to a more over-fitted solution, which limits the prediction power and versatility of the net. It also increases the computational cost of both the training process and the testing of the net.

Training hyperparameters: The neural networks trained in this work could be optimized using several hyperparameter tuning methods, such as random search, Bayesian optimization or gradient-based optimization, to get a more efficient solution.
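As an illustration of the integration-scheme option above, a second-order Runge-Kutta (midpoint) variant of Eq. (6) could look as follows. This is a sketch of ours, not a scheme from the paper; `A_fn` and `B_fn` are the placeholders used in the Section 3.2 illustration, and each stage costs one extra forward pass through the network.

```python
def generic_rk2_step(z_n, L, M, A_fn, B_fn, dt):
    """Midpoint-rule GENERIC step; A and B are re-evaluated at the midpoint."""
    def f(z):
        return L @ (A_fn(z) @ z) + M @ (B_fn(z) @ z)
    z_mid = z_n + 0.5 * dt * f(z_n)   # first network evaluation
    return z_n + dt * f(z_mid)        # second network evaluation
```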

Several open questions remain as future work. A more exhaustive analysis can be performed to evaluate the influence of noisy data on the integrator evolution, in order to add robustness to the method, and even to predict wider simulation times using incremental learning [5,30].

CRediT authorship contribution statement

Quercus Hernandez: software, investigation, validation.
Alberto Badias: software, investigation, validation.
David González: data, investigation, validation.
Francisco Chinesta: methodology, validation.


Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This project has been partially funded by the ESI Group through the ESI Chair at ENSAM Arts et Metiers Institute of Technology, and through the project 2019-0060 "Simulated Reality" at the University of Zaragoza. The support of the Spanish Ministry of Economy and Competitiveness through grant number CICYT-DPI2017-85139-C2-1-R and by the Regional Government of Aragon, grant T24_20R, and the European Social Fund, are also gratefully acknowledged.

References

[1] Jacobo Ayensa-Jiménez, Mohamed H. Doweidar, Jose A. Sanz-Herrera, Manuel Doblaré, A new reliability-based data-driven approach for noisy experimental data with physical constraints, Comput. Methods Appl. Mech. Eng. 328 (2018) 752–774.
[2] Tom Bertalan, Felix Dietrich, Igor Mezić, Ioannis G. Kevrekidis, On learning Hamiltonian systems from data, Chaos: Interdiscip. J. Nonlinear Sci. 29 (12) (Dec 2019) 121107.
[3] Steven L. Brunton, Joshua L. Proctor, J. Nathan Kutz, Discovering governing equations from data by sparse identification of nonlinear dynamical systems, Proc. Natl. Acad. Sci. (2016).
[4] George Cybenko, Approximation by superpositions of a sigmoidal function, Math. Control Signals Syst. 2 (4) (1989) 303–314.
[5] Euntae Choi, Kyungmi Lee, Kiyoung Choi, Autoencoder-based incremental class learning without retraining on old data, arXiv preprint, arXiv:1907.07872, 2019.
[6] Weinan E, A proposal on machine learning via dynamical systems, Commun. Math. Stat. 5 (1) (Mar 2017) 1–11.
[7] Chady Ghnatios, Iciar Alfaro, David González, Francisco Chinesta, Elias Cueto, Data-driven GENERIC modeling of poroviscoelastic materials, Entropy 21 (12) (2019).
[8] Xavier Glorot, Antoine Bordes, Yoshua Bengio, Deep sparse rectifier neural networks, in: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, AISTATS, 2011, pp. 315–323.
[9] D. González, F. Chinesta, E. Cueto, Consistent data-driven computational mechanics, AIP Conf. Proc. 1960 (1) (2018) 090005.
[10] David González, Francisco Chinesta, Elías Cueto, Learning corrections for hyperelastic models from data, Front. Mater. 6 (2019) 14.
[11] David González, Francisco Chinesta, Elías Cueto, Thermodynamically consistent data-driven computational mechanics, Contin. Mech. Thermodyn. 31 (1) (2019) 239–253.
[12] Alex Graves, Abdel-rahman Mohamed, Geoffrey Hinton, Speech recognition with deep recurrent neural networks, in: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, 2013, pp. 6645–6649.
[13] Miroslav Grmela, Generic guide to the multiscale dynamics and thermodynamics, Comput. Phys. Commun. 2 (3) (2018) 032001.
[14] Miroslav Grmela, Vaclav Klika, Michal Pavelka, Gradient and GENERIC evolution towards reduced dynamics, 2019.
[15] Miroslav Grmela, Hans Christian Öttinger, Dynamics and thermodynamics of complex fluids. I. Development of a general formalism, Phys. Rev. E 56 (6) (1997) 6620.
[16] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, Delving deep into rectifiers: surpassing human-level performance on ImageNet classification, in: Proceedings of the IEEE International Conference on Computer Vision, ICCV, 2015, pp. 1026–1034.
[17] Tony Hey, Stewart Tansley, Kristin Tolle, The Fourth Paradigm: Data-Intensive Scientific Discovery, Microsoft Research, October 2009.
[18] Geoffrey Hinton, Li Deng, Dong Yu, George E. Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N. Sainath, et al., Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups, IEEE Signal Process. Mag. 29 (6) (2012) 82–97.
[19] Kurt Hornik, Maxwell Stinchcombe, Halbert White, et al., Multilayer feedforward networks are universal approximators, Neural Netw. 2 (5) (1989) 359–366.
[20] Rubén Ibañez, Emmanuelle Abisset-Chavanne, Jose Vicente Aguado, David Gonzalez, Elias Cueto, Francisco Chinesta, A manifold learning approach to data-driven computational elasticity and inelasticity, Arch. Comput. Methods Eng. 25 (1) (2018) 47–57.
[21] Rubén Ibáñez, Emmanuelle Abisset-Chavanne, David González, Jean-Louis Duval, Elias Cueto, Francisco Chinesta, Hybrid constitutive modeling: data-driven learning of corrections to plasticity models, Int. J. Mater. Forming 12 (4) (2019) 717–725.
[22] Ruben Ibañez, Domenico Borzacchiello, Jose Vicente Aguado, Emmanuelle Abisset-Chavanne, Elías Cueto, Pierre Ladevèze, Francisco Chinesta, Data-driven non-linear elasticity: constitutive manifold construction and problem discretization, Comput. Mech. 60 (5) (2017) 813–826.
[23] Pengzhan Jin, Aiqing Zhu, George Em Karniadakis, Yifa Tang, Symplectic networks: intrinsic structure-preserving networks for identifying Hamiltonian systems, arXiv preprint, arXiv:2001.03750, 2020.
[24] Diederik P. Kingma, Jimmy Ba, Adam: a method for stochastic optimization, arXiv preprint, arXiv:1412.6980, 2014.
[25] Trenton Kirchdoerfer, Michael Ortiz, Data-driven computational mechanics, Comput. Methods Appl. Mech. Eng. 304 (2016) 81–101.
[26] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, ImageNet classification with deep convolutional neural networks, in: F. Pereira, C.J.C. Burges, L. Bottou, K.Q. Weinberger (Eds.), Advances in Neural Information Processing Systems, vol. 25, Curran Associates, Inc., 2012, pp. 1097–1105.
[27] Manuel Laso, Hans Christian Öttinger, Calculation of viscoelastic flow using molecular models: the CONNFFESSIT approach, J. Non-Newton. Fluid Mech. 47 (1993) 1–20.
[28] Claude Le Bris, Tony Lelievre, Multiscale modelling of complex fluids: a mathematical initiation, in: Multiscale Modeling and Simulation in Science, Springer, 2009, pp. 49–137.
[29] Jay Yoon Lee, Sanket Vaibhav Mehta, Michael Wick, Jean-Baptiste Tristan, Jaime Carbonell, Gradient-based inference for networks with output constraints, in: Proceedings of the AAAI Conference on Artificial Intelligence, 2019.
[30] Zhizhong Li, Derek Hoiem, Learning without forgetting, in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
[31] Pablo Márquez-Neila, Mathieu Salzmann, Pascal Fua, Imposing hard constraints on deep networks: promises and limitations, arXiv preprint, arXiv:1706.02025, 2017.
[32] Jim Magiera, Deep Ray, Jan S. Hesthaven, Christian Rohde, Constraint-aware neural networks for Riemann problems, J. Comput. Phys. 409 (2020) 109345.
[33] W. James Murdoch, Chandan Singh, Karl Kumbier, Reza Abbasi-Asl, Bin Yu, Interpretable machine learning: definitions, methods, and applications, arXiv preprint, arXiv:1901.04592, 2019.
[34] Yatin Nandwani, Abhishek Pathak, Parag Singla, et al., A primal dual formulation for deep learning with constraints, in: Advances in Neural Information Processing Systems, 2019.
[35] Hans Christian Öttinger, Beyond Equilibrium Thermodynamics, John Wiley & Sons, 2005.
[36] Hans Christian Öttinger, Miroslav Grmela, Dynamics and thermodynamics of complex fluids. II. Illustrations of a general formalism, Phys. Rev. E 56 (6) (1997) 6633.
[37] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, Adam Lerer, Automatic differentiation in PyTorch, in: Autodiff Workshop: The Future of Gradient-Based Machine Learning Software and Techniques, 2017.
[38] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala, PyTorch: an imperative style, high-performance deep learning library, in: H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, R. Garnett (Eds.), Advances in Neural Information Processing Systems, vol. 32, Curran Associates, Inc., 2019, pp. 8026–8037.
[39] R.G. Owens, T.N. Phillips, Computational Rheology, Imperial College Press, 2002.
[40] Maziar Raissi, Paris Perdikaris, George E. Karniadakis, Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys. 378 (2019) 686–707.
[41] Ignacio Romero, Thermodynamically consistent time-stepping algorithms for non-linear thermomechanical systems, Int. J. Numer. Methods Eng. 79 (6) (2009) 706–732.
[42] Jonathan Romero, Jonathan P. Olson, Alan Aspuru-Guzik, Quantum autoencoders for efficient compression of quantum data, Quantum Sci. Technol. 2 (4) (2017) 045001.
[43] Sebastian Ruder, An overview of gradient descent optimization algorithms, arXiv preprint, arXiv:1609.04747, 2016.
[44] Jürgen Schmidhuber, Deep learning in neural networks: an overview, Neural Netw. 61 (2015) 85–117.
[45] Hoo-Chang Shin, Holger R. Roth, Mingchen Gao, Le Lu, Ziyue Xu, Isabella Nogues, Jianhua Yao, Daniel Mollura, Ronald M. Summers, Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning, IEEE Trans. Med. Imaging 35 (5) (2016) 1285–1298.
[46] Lucas Theis, Wenzhe Shi, Andrew Cunningham, Ferenc Huszár, Lossy image compression with compressive autoencoders, arXiv preprint, arXiv:1703.00395, 2017.
[47] Yi-Jen Wang, Chin-Teng Lin, Runge-Kutta neural network for identification of dynamical systems in high accuracy, IEEE Trans. Neural Netw. 9 (2) (1998) 294–307.
[48] Dongkun Zhang, Ling Guo, George Em Karniadakis, Learning in modal space: solving time-dependent stochastic PDEs using physics-informed neural networks, SIAM J. Sci. Comput. 42 (2) (2020) A639–A665.

