Science Arts & Métiers (SAM) is an open access repository that collects the work of Arts et Métiers Institute of Technology researchers and makes it freely available over the web where possible.

This is an author-deposited version published in: https://sam.ensam.eu
Handle ID: http://hdl.handle.net/10985/19561

To cite this version:
Quercus HERNÁNDEZ, Alberto BADÍAS, David GONZÁLEZ, Francisco CHINESTA, Elías CUETO - Structure-preserving neural networks - Journal of Computational Physics, p. 1-16, 2020

Any correspondence concerning this service should be sent to the repository administrator.
Structure-preserving neural networks ✩

Quercus Hernández a, Alberto Badías a, David González a, Francisco Chinesta b, Elías Cueto a,*

a Aragon Institute of Engineering Research (I3A), Universidad de Zaragoza, Maria de Luna 3, E-50018 Zaragoza, Spain
b ESI Chair and PIMM Lab, ENSAM ParisTech, 155 Boulevard de l'Hôpital, 75013 Paris, France

Keywords: Scientific machine learning; Neural networks; Structure preservation; GENERIC

Abstract
We develop a method to learn physical systems from data that employs feedforward neural networks and whose predictions comply with the first and second principles of thermodynamics. The method employs a minimum amount of data by enforcing the metriplectic structure of dissipative Hamiltonian systems in the form of the so-called General Equation for the Non-Equilibrium Reversible-Irreversible Coupling, GENERIC (Öttinger and Grmela (1997) [36]). The method does not need to enforce any kind of balance equation, and thus no previous knowledge of the nature of the system is needed. Conservation of energy and dissipation of entropy in the prediction of previously unseen situations arise as a natural by-product of the structure of the method. Examples of the performance of the method are shown that comprise conservative as well as dissipative systems, discrete as well as continuous ones.
1. Introduction
With the irruption of the so-called fourth paradigm of science [17], a growing interest is detected in the machine learning of scientific laws. A plethora of methods have been developed that are able to produce more or less accurate predictions about the response of physical systems in previously unseen situations, by employing techniques ranging from classical regression to the most sophisticated deep learning methods.
For instance, recent works in solid mechanics have substituted the constitutive equations with experimental data [1,25], while conserving the traditional approach on physical laws with high epistemic value (i.e., balance equations, equilibrium). Similar approaches have applied this concept to the unveiling (or correction) of plasticity models [21], while others created the new concept of constitutive manifold [20,22]. Other approaches are designed to unveil an explicit, closed-form expression for the physical law governing the phenomenon at hand [3].
An interest is observed in the incorporation of already existing scientific knowledge into these data-driven procedures. This interest is two-fold. On the one hand, we prefer not to get rid of centuries of scientific knowledge and rely exclusively on powerful machine learning strategies. Existing theories have proved to be useful in the prediction of physical phenomena and are still in the position of helping to produce very accurate predictions. This is the procedure followed in the so-called data-driven computational mechanics approach mentioned before. On the other hand, these theories help to keep the consumption of data to a minimum. Data are expensive to produce and to maintain. Already existing scientific knowledge can alleviate the amount of data needed to produce a successful prediction.

✩ This project has been partially funded by the ESI Group through the ESI Chair at ENSAM ParisTech and through the project "Simulated Reality: an intelligence augmentation system based on hybrid twins and augmented reality" at the University of Zaragoza. The support of the Spanish Ministry of Economy and Competitiveness through grant number CICYT-DPI2017-85139-C2-1-R and by the Regional Government of Aragon, through the project T24_20R, and the European Social Fund, are also gratefully acknowledged.

* Corresponding author. E-mail addresses: quercus@unizar.es (Q. Hernández), abadias@unizar.es (A. Badías), gonzal@unizar.es (D. González), francisco.chinesta@ensam.eu (F. Chinesta), ecueto@unizar.es (E. Cueto).
The mentioned works on data-driven computational mechanics usually rely on traditional machine learning algorithms, which are very precise and well tested but usually computationally expensive. With the recent advances in data processing, computing resources and machine learning, neural networks have become a powerful tool to analyze traditionally hard problems such as image classification [26,45], speech recognition [12,18] or data compression [42,46]. These new machine learning methods outperform many of the traditional ones, both in modeling capacity and computational time (once trained, certain neural networks can easily handle real-time requirements). Recent works in the machine learning community [29,31,34] have shown that neural networks are also versatile in constrained optimization.
This is the approach followed by several authors in the context of physical simulations, which aim to solve a set of partial differential equations (PDEs) in complex dynamical systems. Physical problems must inherently satisfy certain conditions dictated by physics, often formulated as conservation laws, which can be imposed on a neural network using extra loss terms in the constrained optimization process [32].
Similar constraints are imposed in the so-called physics-informed neural networks approach [40,48]. This family of methods employs neural networks to solve highly nonlinear partial differential equations (PDEs), resulting in very accurate and numerically stable results. However, they rely on prior knowledge of the governing equations of the problem.
The authors have introduced the so-called thermodynamically consistent data-driven computational mechanics [7,9,10]. Unlike other existing works, this approach does not impose any particular balance equation to solve for. Instead, it relies on the imposition of the right thermodynamic structure on the resulting predictions, as dictated by the so-called GENERIC formalism [15]. As will be seen, this ensures conservation of energy and the right amount of entropy dissipation, thus giving rise to predictions satisfying the first and second principles of thermodynamics. These techniques, however, employ regression to unveil the thermodynamic structure of the problem at the sampling points. For previously unseen situations, they employ interpolation on the matrix manifold describing the system.
Recent works in symplectic networks [23] have bypassed those drawbacks by exploiting the mathematical properties of Hamiltonian systems, so no prior knowledge of the system is required. However, this technique only operates on conservative systems with no entropy generation.
The aim of this work is the development of a new structure-preserving neural network architecture capable of predicting the time evolution of a system based on experimental observations of the system, with no prior knowledge of its governing equations, and valid for both conservative and dissipative systems. The key idea is to merge the proven computational power of neural networks in highly nonlinear physics with thermodynamically consistent data-driven algorithms. The resulting methodology, as will be seen, is a powerful neural network architecture, conceptually very simple (based on standard feed-forward methodologies), that exploits the right thermodynamic structure of the system as unveiled from experimental data, and that produces interpretable results [33].
The outline of the paper is as follows. A brief description of the problem setup is presented in Section 2. Next, in Section 3, the methodology is presented, covering both the GENERIC formalism and the feed-forward neural networks used to solve the stated problem. This technique is then applied to different physical systems of increasing complexity: a double thermo-elastic pendulum (Section 4) and a Couette flow in a viscoelastic fluid (Section 5). The paper is completed with a discussion in Section 6.
2. Problem statement
Weinan E seems to be the first author in interpreting the process of learning physical systems as the solution of a dynamical system [6]. Consider a system whose governing variables will be hereafter denoted by z ∈ M ⊆ R^n, with M the state space of these variables, which is assumed to have the structure of a differentiable manifold in R^n. The problem of learning a given physical phenomenon can thus be seen as the one of finding an expression for the time evolution of its governing variables z,

$$\dot{z} = \frac{dz}{dt} = F(x, z, t), \quad x \in \Omega \subseteq \mathbb{R}^D, \quad t \in I = (0, T], \quad z(0) = z_0, \tag{1}$$

where x and t refer to the space and time coordinates within a domain of D = 2, 3 dimensions. F(x, z, t) is the function that gives, after a prescribed time horizon T, the flow map z_0 → z(z_0, T).

While this problem can be seen as a general supervised learning problem (we fix both z_0 and z), when we have additional information about the physics being represented by the sought function F, it is legitimate to try to include it in the search procedure. W. E seems to have been the first in suggesting to impose a Hamiltonian structure on F if we know that energy is conserved, for instance [6]. Very recently, two different approaches have followed this same rationale [2,23].
For conservative systems, therefore, imposing a Hamiltonian structure seems a very appealing way to obtain thermodynamics-aware results. However, when the system is dissipative, this method does not provide valid results. Given the importance of dissipative phenomena (viscous solids, fluid dynamics, ...), we explore the right thermodynamic structure to impose on the search methodology.
The goal of this paper is to develop a new method of solving Eq. (1) using state-of-the-art deep learning tools, in order to predict the time evolution of the state variables of a given system. The solution is forced to fulfill the basic thermodynamic requirements of energy conservation and the entropy inequality restriction via the GENERIC formalism, presented in the next section.
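To make the supervised setup of Eq. (1) concrete, the following sketch generates snapshot data (z_0, z_1, ..., z_N) from a dynamical system with a forward Euler integrator. The right-hand side F used here is an arbitrary toy damped oscillator chosen only for illustration; it is not one of the systems studied in this paper.

```python
import numpy as np

def generate_trajectory(F, z0, dt, n_steps):
    """Integrate dz/dt = F(z, t) with forward Euler to build a
    trajectory of snapshots for the learning problem of Eq. (1)."""
    z = np.array(z0, dtype=float)
    traj = [z.copy()]
    for n in range(n_steps):
        z = z + dt * F(z, n * dt)
        traj.append(z.copy())
    return np.stack(traj)

# Hypothetical toy right-hand side: a lightly damped oscillator.
F = lambda z, t: np.array([z[1], -z[0] - 0.1 * z[1]])
Z = generate_trajectory(F, z0=[1.0, 0.0], dt=0.01, n_steps=100)
```

Pairs of consecutive rows of `Z` then play the role of the input/target pairs (z_n, z_{n+1}) in the supervised learning problem.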
3. Methodology
In this section we develop the appropriate thermodynamic structure for dissipative systems. Classical systems modeling can be done at a variety of scales. We could think of the most detailed (yet often impractical) scale of molecular dynamics, where energy conservation applies and the Hamiltonian paradigm can be imposed. However, the number of degrees of freedom and, noteworthy, the time scale render this approach of little interest for many applications. On the other side of the spectrum lies thermodynamics, where only conserved, invariant quantities are described and thus there is no need for conservation principles. At any other (mesoscopic) scale, unresolved degrees of freedom give rise to the appearance of fluctuation in the results (or its equivalent, dissipation). At these scales, traditional modeling procedures imply expressing physical insights in the form of governing equations [13]. These equations are then validated from experimental observations.
Alternatively, thermodynamics can be thought of as a meta-physics, in the sense that it is actually a theory of theories [14]. It provides us with the right theoretical framework in which basic principles are met. And, in particular, for any of these intermediate or mesoscopic scales, a so-called metriplectic structure emerges. The term metriplectic comes from the combination of symplectic and Riemannian (metric) geometry and emphasizes the fact that there are conservative as well as dissipative contributions to the general evolution of such a system. Once such a geometric structure is found for the system, we are in the position of fixing the framework in which our neural networks can look for the adequate prediction of the future states of the system. The particular metriplectic structure that we employ for such a task is known, as stated before, as GENERIC.
3.1. The GENERIC formalism
The “General Equation for Non-Equilibrium Reversible-Irreversible Coupling” (GENERIC) formalism [15,36] establishes a mathematical framework in order to model the dynamics of a system. Furthermore, it is compatible with classical equilibrium thermodynamics [35], preserving the symmetries of the system as stated in Noether’s theorem. It has served as the basis for the development of several consistent numerical integration algorithms that exploit these desirable properties [11,41].
The GENERIC structure for the evolution in Eq. (1) is obtained after finding two algebraic or differential operators

$$L : T^*M \to TM, \quad M : T^*M \to TM,$$

where T^*M and TM represent, respectively, the cotangent and tangent bundles of M. As in general Hamiltonian systems, there will be an energy potential, which we will denote hereafter by E(z). In order to take into account the dissipative effects, a second potential (the so-called Massieu potential) is introduced in the formulation. It is, of course, the entropy potential of the GENERIC formulation, S(z). With all these ingredients, we arrive at a description of the dynamics of the system of the type

$$\frac{dz}{dt} = L \frac{\partial E}{\partial z} + M \frac{\partial S}{\partial z}. \tag{2}$$

As shown in Eq. (2), the time evolution of the system described by the nonlinear operator F(x, z, t) presented in Eq. (1) is now split into two separate terms:

• Reversible term: it accounts for all the reversible (non-dissipative) phenomena of the system. In the context of classical mechanics, this term is equivalent to Hamilton's equations of motion, which relate the particle position and momentum. The operator L(z) is the Poisson matrix (it defines a Poisson bracket) and is required to be skew-symmetric (a cosymplectic matrix).

• Non-reversible term: the rest of the (dissipative) phenomena of the system are modeled here. The operator M(z) is the friction matrix and is required to be symmetric and positive semi-definite.

The GENERIC formulation of the problem is completed with the following so-called degeneracy conditions,

$$L \frac{\partial S}{\partial z} = M \frac{\partial E}{\partial z} = 0. \tag{3}$$

The first condition expresses the reversible nature of the L contribution to the dynamics, whereas the second requirement expresses the conservation of the total energy by the M contribution. This means nothing other than that the energy potential does not contribute to the production of entropy and, conversely, that the entropy functional does not contribute to the reversible dynamics. This mutual degeneracy requirement, in addition to the already mentioned requirements on the L and M matrices, ensures that:
$$\frac{\partial E}{\partial t} = \frac{\partial E}{\partial z} \cdot \frac{\partial z}{\partial t} = \frac{\partial E}{\partial z} \cdot \left( L \frac{\partial E}{\partial z} + M \frac{\partial S}{\partial z} \right) = 0,$$
which expresses the conservation of energy in an isolated system, also known as the first law of thermodynamics. Applying the same reasoning to the entropy S:
$$\frac{\partial S}{\partial t} = \frac{\partial S}{\partial z} \cdot \frac{\partial z}{\partial t} = \frac{\partial S}{\partial z} \cdot \left( L \frac{\partial E}{\partial z} + M \frac{\partial S}{\partial z} \right) = \frac{\partial S}{\partial z} \cdot M \frac{\partial S}{\partial z} \geq 0,$$

which guarantees the entropy inequality, that is, the second law of thermodynamics.
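The two identities above can be checked numerically. The sketch below builds random toy operators (not the L and M of any system in this paper) that satisfy skew-symmetry, positive semi-definiteness and the degeneracy conditions by construction, and verifies that the resulting GENERIC evolution conserves E while producing entropy:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
gE = rng.normal(size=n)   # stand-in for dE/dz at some state
gS = rng.normal(size=n)   # stand-in for dS/dz at the same state

# Skew-symmetric L with L·gS = 0: project a random skew matrix
A = rng.normal(size=(n, n))
L0 = A - A.T
P_S = np.eye(n) - np.outer(gS, gS) / (gS @ gS)  # projector orthogonal to gS
L = P_S @ L0 @ P_S.T                            # still skew, annihilates gS

# Symmetric positive semi-definite M with M·gE = 0
B = rng.normal(size=(n, n))
P_E = np.eye(n) - np.outer(gE, gE) / (gE @ gE)
M = P_E @ (B @ B.T) @ P_E.T                     # symmetric PSD, annihilates gE

zdot = L @ gE + M @ gS    # GENERIC evolution, Eq. (2)
dE = gE @ zdot            # first law: should vanish
dS = gS @ zdot            # second law: should be non-negative
```

By skew-symmetry, gE·L·gE = 0, and by the degeneracy condition M·gE = 0 (with M symmetric), gE·M·gS = 0, so dE = 0; likewise dS reduces to gS·M·gS ≥ 0.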
3.2. Proposed integration algorithm
Once the learning procedure is accomplished, our neural network is expected to integrate the system dynamics in time, given previously unseen initial conditions. In order to numerically solve the GENERIC equation, we formulate the discretized version of Eq. (2) following previous works [11]:

$$\frac{z_{n+1} - z_n}{\Delta t} = L \cdot \frac{DE}{Dz} + M \cdot \frac{DS}{Dz}. \tag{4}$$

The time derivative of the original equation is discretized with a forward Euler scheme in time increments Δt, where z_{n+1} = z(t + Δt). L and M are the discretized versions of the Poisson and friction matrices. Last, DE/Dz and DS/Dz represent the discrete gradients, which can be approximated in a finite element sense as

$$\frac{DE}{Dz} \simeq A z, \quad \frac{DS}{Dz} \simeq B z, \tag{5}$$

where A and B represent the discrete matrix form of the gradient operators.
Finally, manipulating Eq. (4) algebraically with Eq. (5) and including the degeneracy conditions of Eq. (3), the proposed integration scheme for predicting the dynamics of a physical system is the following:

$$z_{n+1} = z_n + \Delta t \, (L \cdot A z_n + M \cdot B z_n), \tag{6}$$

subject to:

$$L \cdot B z_n = 0, \quad M \cdot A z_n = 0,$$

ensuring the thermodynamical consistency of the resulting model.
To sum up, the main objective of this work is to compute the form of the A(z) and B(z) gradient operator matrices, subject to the degeneracy conditions, in order to integrate the initial system state variables z_0 over certain time steps Δt of the time interval I. Usually, the form of the matrices L and M is known in advance, given the vast literature in the field. If necessary, these terms can also be computed [11].

3.3. Feed-forward neural networks
In the introduction we already mentioned the intrinsic power of neural networks in many fields. The main reason why neural networks are able to learn and reproduce such a variety of problems is that they are considered to be universal approximators [4,19], meaning that they are capable of approximating any measurable function to any desired degree of accuracy. The main limitation of this technique is the correct selection of the tuning parameters of the network, also called hyperparameters.
Another family of universal approximators is polynomials, as they can approximate any infinitely differentiable function as a Taylor power series expansion. The main difference is that neural networks rely on composition of functions rather than on sums of power series:

$$\hat{y} = (f^{[L]} \circ f^{[L-1]} \circ \ldots \circ f^{[l]} \circ \ldots \circ f^{[2]} \circ f^{[1]})(x). \tag{7}$$

Eq. (7) shows that the desired output ŷ from a given input x of a neural network is a composition of different functions f^{[l]}, acting as building blocks of the network in a total of L layers. The challenge is to select the best combination of functions in the correct order such that it approximates the solution of the studied problem.
Fig. 1. Representation of a single neuron (left) as a part of a fully connected neural net (right).
The simplest building block of artificial deep neural network architectures is the neuron or perceptron (Fig. 1, left). Several neurons are stacked in a multilayer perceptron (MLP), which is mathematically defined as follows:

$$x^{[l]} = \sigma(w^{[l]} x^{[l-1]} + b^{[l]}), \tag{8}$$

where l is the index of the current layer, x^{[l-1]} and x^{[l]} are the layer input and output vectors respectively, w^{[l]} is the weight matrix of the layer, b^{[l]} is its bias vector and σ is the activation function. If no activation function is applied, the MLP is equivalent to a linear operator. However, σ is chosen to be a nonlinear function in order to increase the capacity to model more complex problems, which are commonly nonlinear. In classification problems, the traditional activation function is the logistic function (sigmoid), whereas in regression problems the Rectified Linear Unit (ReLU) [8] or the hyperbolic tangent are commonly used. In this work, we use a deep neural network architecture known as a feed-forward neural network [44]. It consists of several layers of perceptrons with no cyclic connections, as shown in Fig. 1 (right).
The input of the neural net is the state vector at a given time step, z_n, and the outputs are the concatenated GENERIC matrices A_n^net and B_n^net: for a system with n state variables, the number of inputs and outputs are N_in = n and N_out = 2n². Then, using the GENERIC integration scheme, the state vector at the next time step, z_{n+1}^net, is obtained. This method is repeated for the whole simulation time T, with a total of N_T snapshots.
The state variables of a general dynamical system may differ by several orders of magnitude from each other, due to their own physical nature or measurement units. Thus, a pre-processing of the input data (scaling or normalization) can improve the model performance and stability.
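A typical standardization step of this kind, computed from training-set statistics only so that no test information leaks into the preprocessing (a generic sketch, not code from the paper):

```python
import numpy as np

def standardize(Z_train, Z):
    """Z-score each state variable using the mean and standard
    deviation of the training split only."""
    mu = Z_train.mean(axis=0)
    sigma = Z_train.std(axis=0)
    sigma[sigma == 0.0] = 1.0   # guard against constant variables
    return (Z - mu) / sigma
```

The same `mu` and `sigma` are then reused to normalize any test trajectory before it is fed to the network.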
The number of hidden layers N_h depends on the complexity of the problem. Increasing the net size raises the computational power of the net to model more complex phenomena. However, it slows the training process and could lead to data overfitting, limiting its generalization and extrapolation capacity. The size of the hidden layers is chosen to be the same as the output size of the net, N_out.
The cost function for our neural network is composed of three different terms:

• Data loss: the main loss condition is the agreement between the network output and the real data. It is computed as the squared error sum between the predicted state vector z_{n+1}^net and the ground truth solution z_{n+1}^GT for each time step,

$$\mathcal{L}^{\text{data}}_n = \| z^{\text{GT}}_{n+1} - z^{\text{net}}_{n+1} \|_2^2. \tag{9}$$

• Fulfillment of the degeneracy conditions: the cost function also accounts for the degeneracy conditions in order to ensure the thermodynamic consistency of the solution, implemented as the sum of the squared elements of the degeneracy vectors for each time step,

$$\mathcal{L}^{\text{degen}}_n = \| L \cdot B^{\text{net}}_n z^{\text{net}}_n \|_2^2 + \| M \cdot A^{\text{net}}_n z^{\text{net}}_n \|_2^2. \tag{10}$$

This term acts as a regularization of the loss function and, at the same time, is responsible for ensuring thermodynamic consistency. So to speak, it is the cornerstone of our method.

• Regularization: in order to avoid overfitting, an extra L2 regularization term L^reg is added to the loss function, defined as the sum over the squared weight parameters of the network,

$$\mathcal{L}^{\text{reg}} = \sum_l^L \sum_i^{n^{[l]}} \sum_j^{n^{[l+1]}} \left( w^{[l]}_{i,j} \right)^2. \tag{11}$$

The total cost function is computed as the sum squared error (SSE) of the data loss and degeneracy residual, in addition to the regularization term, at the end of the simulation time T for each train case. The regularization loss is highly dependent on the size of the network layers and has a different scaling with respect to the other terms, so it is compensated with the regularization hyperparameter (weight decay) λ_r. An additional weight λ_d is added to the data loss term, which accounts for the relative scaling error with respect to the degeneracy conditions:

$$\mathcal{L} = \sum_{n=0}^{N_T} \left( \lambda_d \mathcal{L}^{\text{data}}_n + \mathcal{L}^{\text{degen}}_n \right) + \lambda_r \mathcal{L}^{\text{reg}}. \tag{12}$$

The usual backpropagation algorithm [37] is then used to calculate the gradient of the loss function for each net parameter (weight and bias vectors), which are updated with the gradient descent technique [43]. The process is then repeated for a maximum number of epochs n_epoch. The resulting training algorithm is sketched in Fig. 2.

Fig. 2. Sketch of a structure-preserving neural network training algorithm.
The proposed methodology is tested with two different databases of nonlinear physical systems, split into a partition of train cases (N_train = 80% of the database) and test cases (N_test = 20% of the database). The net performance is evaluated with the mean squared error (MSE) of the state variables prediction, associated with the data loss term, Eq. (9), over all the time snapshots,

$$\text{MSE}^{\text{data}}(z_i) = \frac{1}{N_T} \sum_{n=0}^{N_T} \| z^{\text{GT}}_{i,n} - z^{\text{net}}_{i,n} \|_2^2. \tag{13}$$

The same procedure is applied to the degeneracy constraint, associated with the degeneracy loss term, Eq. (10), over all the time snapshots,

$$\text{MSE}^{\text{degen}}(z_i) = \frac{1}{N_T} \sum_{n=0}^{N_T} \left( \| L \cdot B^{\text{net}}_{i,n} z^{\text{net}}_{i,n} \|_2^2 + \| M \cdot A^{\text{net}}_{i,n} z^{\text{net}}_{i,n} \|_2^2 \right). \tag{14}$$

As a general error magnitude of the algorithm, the average MSE over both the train (N = N_train) and test (N = N_test) trajectories is also reported for both the data (m = data) and degeneracy (m = degen) constraints,

$$\overline{\text{MSE}}^{m}(z) = \frac{1}{N} \sum_{i=1}^{N} \text{MSE}^{m}(z_i). \tag{15}$$

Algorithm 1 and Algorithm 2 show pseudocode for the training and test processes, respectively. The proposed method is fully implemented in PyTorch [38] and trained on an Intel Core i7-8665U CPU.
4. Validation examples: double thermo-elastic pendulum
4.1. Description
The first example is a double thermo-elastic pendulum (Fig. 3), consisting of two masses m_1 and m_2 connected by two springs of variable lengths λ_1 and λ_2 and natural lengths at rest λ_1^0 and λ_2^0.

The set of variables describing the double pendulum is here chosen to be

$$S = \{ z = (q_1, q_2, p_1, p_2, s_1, s_2) \in (\mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R}^2 \times \mathbb{R} \times \mathbb{R}),\ q_1 \neq 0,\ q_1 \neq q_2 \}, \tag{16}$$

where q_i, p_i and s_i are the position, linear momentum and entropy of each mass i = 1, 2.

Algorithm 1 Pseudocode for the training algorithm.

    Load train database: z^GT (train partition), Δt, L, M;
    Define network architecture: N_in, N_out = 2N_in², N_h, σ_j;
    Define hyperparameters: η, λ_d, λ_r;
    Initialize w_{i,j}, b_j;
    for epoch ← 1, n_epoch do
        for train_case ← 1, N_train do
            Initialize state vector: z_0^net ← z_0^GT;
            Initialize losses: L^data, L^degen ← 0;
            for snapshot ← 1, N_T do
                Forward propagation: [A_n^net, B_n^net] ← Net(z_n^GT);                               Eq. (8)
                Time integration: z_{n+1}^net ← z_n^net + Δt (L·A_n^net z_n^net + M·B_n^net z_n^net);  Eq. (4)
                Update data loss: L^data ← L^data + L_n^data;                                        Eq. (9)
                Update degeneracy loss: L^degen ← L^degen + L_n^degen;                               Eq. (10)
            end for
            SSE loss function: L ← λ_d L^data + L^degen + λ_r L^reg;                                 Eq. (11), Eq. (12)
            Backward propagation;
            Optimizer step;
        end for
        Learning rate scheduler;
    end for
Algorithm 2 Pseudocode for the test algorithm.

    Load test database: z^GT (test partition), Δt, L, M;
    Load network parameters;
    for test_case ← 1, N_test do
        Initialize state vector: z_0^net ← z_0^GT;
        for snapshot ← 1, N_T do
            Forward propagation: [A_n^net, B_n^net] ← Net(z_n^net);                                   Eq. (8)
            Time step integration: z_{n+1}^net ← z_n^net + Δt (L·A_n^net z_n^net + M·B_n^net z_n^net);  Eq. (4)
            Update state vector: z_n^net ← z_{n+1}^net;
            Update snapshot: n ← n + 1;
        end for
        Compute MSE^data, MSE^degen;                                                                 Eq. (13), Eq. (14)
    end for
    Compute average MSE^data, MSE^degen;                                                             Eq. (15)
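Algorithm 2 amounts to a rollout in which the network is queried at its own predictions. A sketch of that loop, with a constant-output stand-in for the trained network (the stand-in matrices are illustrative only, not from the paper):

```python
import numpy as np

def rollout(net, z0, L, M, dt, n_steps):
    """Test-time integration (in the spirit of Algorithm 2): starting
    from z_0, query the network for [A_n, B_n] at the current *predicted*
    state and apply the GENERIC step of Eq. (6) recursively."""
    z = np.array(z0, dtype=float)
    traj = [z.copy()]
    for _ in range(n_steps):
        A, B = net(z)                              # forward propagation
        z = z + dt * (L @ (A @ z) + M @ (B @ z))   # Eq. (6)
        traj.append(z.copy())
    return np.stack(traj)

# Hypothetical stand-in for a trained net: constant A = I, B = 0.
n = 2
L_mat = np.array([[0.0, 1.0], [-1.0, 0.0]])
M_mat = np.zeros((n, n))
net = lambda z: (np.eye(n), np.zeros((n, n)))
traj = rollout(net, [1.0, 0.0], L_mat, M_mat, dt=0.01, n_steps=100)
```

In the real method, `net` is the trained feed-forward network and the rollout error is what MSE^data of Eq. (13) measures.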
Fig. 3. Double thermo-elastic pendulum.
The lengths of the springs λ_1 and λ_2 are defined solely in terms of the positions as

$$\lambda_1 = \sqrt{q_1 \cdot q_1}, \quad \lambda_2 = \sqrt{(q_2 - q_1) \cdot (q_2 - q_1)}.$$

The total energy of the system can be expressed as the sum of the kinetic energy of the two masses K_i and the internal energy of the springs e_i for i = 1, 2,

$$E = E(z) = \sum_i K_i(z) + \sum_i e_i(\lambda_i, s_i), \quad K_i = \frac{1}{2 m_i} |p_i|^2. \tag{17}$$

Fig. 4. Loss evolution of data and degeneracy constraints for each epoch of the structure-preserving neural network training process of the double pendulum example.
The total entropy of the double pendulum is the sum of the entropies of the two masses s_i,

$$S = S(z) = s_1 + s_2. \tag{18}$$

This model includes thermal effects in the stretching of the springs due to the Gough-Joule effect. The absolute temperatures T_i at each spring are obtained through Eq. (19). These temperature changes induce a heat flux between both springs, proportional to the temperature difference and a conductivity constant κ > 0,

$$T_i = \frac{\partial e_i}{\partial s_i}. \tag{19}$$

In this case, there is a clear contribution of both conservative Hamiltonian mechanics (mass movement) and non-Hamiltonian dissipative effects (heat flux), resulting in a non-zero friction matrix (M ≠ 0). Thus, the GENERIC matrices associated with this physical system are known to be [11]

$$L = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad
M = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & -1/2 \\ 0 & 0 & 0 & 0 & -1/2 & 1 \end{bmatrix}. \tag{20}$$

4.2. Database and hyperparameters
The training database is generated with a thermodynamically consistent time-stepping algorithm [41] in MATLAB. The masses of the double pendulum are set to m_1 = 1 kg and m_2 = 2 kg, joined with springs of natural lengths λ_1^0 = 2 m and λ_2^0 = 1 m, thermal constants C_1 = 0.02 J and C_2 = 0.2 J, and a conductivity constant κ = 0.5. The simulation time of the movement is T = 60 s in time increments of Δt = 0.3 s (N_T = 200 snapshots). The database consists of the state vector, Eq. (16), of 50 different trajectories with random initial conditions of position q_i and linear momentum p_i of both masses m_i (i = 1, 2) around a mean position and linear momentum of q_1 = [4.5, 4.5] m, p_1 = [2, 4.5] kg·m/s, and q_2 = [-0.5, 1.5] m, p_2 = [1.4, -0.2] kg·m/s, respectively. Although the initial conditions of the simulations are similar, they result in a wide variety of mass trajectories due to the chaotic behavior of the system.

This database is split randomly into 40 train trajectories and 10 test trajectories. Thus, there is a total of 8,000 training snapshots and 2,000 test snapshots.

The net input and output sizes are N_in = 10 and N_out = 2N_in² = 200. The state vector is normalized based on the training set statistical mean and standard deviation. The number of hidden layers is N_h = 5, with ReLU activation functions and linear activation in the last layer. The network is initialized according to the Kaiming method [16] with normal distribution, and the optimizer used is Adam [24] with a weight decay of λ_r = 10⁻⁵ and a data loss weight of λ_d = 10². A multistep learning rate scheduler is used, starting at η = 10⁻³ and decaying by a factor of γ = 0.1 at epochs 600 and 1200. The training process ends when a fixed number of epochs n_epoch = 1800 is reached.

Fig. 5. Time evolution of the state variables in a test trajectory of a double thermo-elastic pendulum using a time-stepping solver (Ground Truth, GT) and the proposed GENERIC integration scheme (SPNN). Since every variable has a vectorial character, both components are depicted and labeled as X and Y, respectively.
Table 1
Mean squared error of the data loss (MSE^data) and degeneracy loss (MSE^degen) for all the state variables of the double pendulum.

State variable        MSE^data             MSE^degen
                      Train      Test      Train      Test
q1 [m]         X      1.95·10⁻²  3.87·10⁻²  3.56·10⁻⁸  4.43·10⁻⁸
               Y      2.72·10⁻²  8.21·10⁻²  4.74·10⁻⁸  5.77·10⁻⁸
q2 [m]         X      2.04·10⁻²  3.65·10⁻²  9.28·10⁻⁸  8.55·10⁻⁸
               Y      2.72·10⁻²  3.73·10⁻²  3.55·10⁻⁸  5.01·10⁻⁸
p1 [kg·m/s]    X      6.43·10⁻⁴  1.33·10⁻⁴  4.00·10⁻⁸  7.08·10⁻⁸
               Y      1.06·10⁻³  4.06·10⁻³  1.21·10⁻⁷  1.40·10⁻⁷
p2 [kg·m/s]    X      4.88·10⁻⁴  9.84·10⁻⁴  6.00·10⁻⁸  4.58·10⁻⁸
               Y      8.47·10⁻⁴  1.79·10⁻⁴  9.76·10⁻⁸  1.20·10⁻⁷
s1 [J/K]              1.21·10⁻⁵  3.51·10⁻⁵  1.31·10⁻⁷  2.06·10⁻⁷
s2 [J/K]              1.22·10⁻⁵  3.18·10⁻⁵  2.40·10⁻⁷  2.95·10⁻⁷

Table 2
Mean squared error of the energy (MSE(E)) and entropy (MSE(S)) of the double pendulum.

Variable   Train      Test
E [J]      7.99·10⁻³  8.86·10⁻³
S [J/K]    6.52·10⁻⁸  6.33·10⁻⁸
4.3. Results
Fig. 5 shows the time evolution of the state variables (position, momentum and entropy) of each mass given by the solver and the neural net.

Table 1 shows the mean squared error of the data and degeneracy loss terms for all the state variables of the double pendulum. The results are computed separately as the mean over all the train and test trajectories using Eq. (15).

Fig. 6 and Fig. 7 show the time evolution of the internal and kinetic energy (Eq. (17)) and the entropy (Eq. (18)), respectively, for the two pendulum masses (i = 1, 2). The total energy is conserved and the total entropy satisfies the entropy inequality, fulfilling the first and second laws of thermodynamics respectively. The mean error for both train and test trajectories is reported in Table 2.

Fig. 6. Time evolution of the energy in a test trajectory of a double thermo-elastic pendulum using a time-stepping solver (Ground Truth, GT) and the proposed GENERIC integration scheme (Net).

Fig. 7. Time evolution of the entropy in a test trajectory of a double thermo-elastic pendulum using a time-stepping solver (Ground Truth, GT) and the proposed GENERIC integration scheme (SPNN).
5. Couette flow of an Oldroyd-B fluid

5.1. Description

The second example is a shear (Couette) flow of an Oldroyd-B fluid model. This is a constitutive model for viscoelastic fluids, consisting of linear elastic dumbbells (representing polymer chains) immersed in a solvent.
The Oldroyd-B model arises in the modeling of flows of diluted polymeric solutions. This model can be obtained both from a purely macroscopic point of view as well as from a microscopic one, by modeling polymer chains as linear dumbbells diluted in a Newtonian substrate. Alternatively, it can also be obtained by considering the deviatoric part T of the stress tensor σ (the so-called extra-stress tensor) to be of the form

$$T + \lambda_1 \overset{\triangledown}{T} = \eta_0 \dot{\gamma} + \lambda_2 \overset{\triangledown}{\dot{\gamma}}, \tag{21}$$

Fig. 8. Couette flow in an Oldroyd-B fluid.

where the triangle denotes the non-linear Oldroyd's upper-convected derivative [39]. Coefficients η_0, λ_1 and λ_2 are model parameters. It is standard to denote the strain rate tensor by γ̇ = ∇^s v = D. Finally, the stresses in the solvent (denoted by a subscript s) and polymer (denoted by a subscript p) are given by

$$T = \eta_s \dot{\gamma} + \tau, \quad \text{so that} \quad \tau + \lambda_1 \overset{\triangledown}{\tau} = \eta_p \dot{\gamma},$$

which is the constitutive equation for the elastic stress.
Pseudo-experimental data are obtained by the CONNFFESSIT technique [27], based on the Fokker-Planck equation [28]. This equation is solved by converting it into its corresponding Itô stochastic differential equation,

$$dr_x = \left( \frac{\partial v}{\partial y} r_y - \frac{1}{2\text{We}} r_x \right) dt + \frac{1}{\sqrt{\text{We}}}\, dV_t, \quad dr_y = -\frac{1}{2\text{We}} r_y\, dt + \frac{1}{\sqrt{\text{We}}}\, dW_t, \tag{22}$$

where v is the flow velocity, r = [r_x, r_y] is the position vector, with r_x = r_x(y, t) and, assuming a Couette flow, r_y = r_y(t) depending only on time; We stands for the Weissenberg number and V_t, W_t are two independent one-dimensional Brownian motions. This equation is solved via Monte Carlo techniques, by replacing the mathematical expectation by the empirical mean.

The model relies on the microscopic description of the state of the dumbbells. Thus, it is particularly useful to base the microscopic description on the evolution of the conformation tensor c = ⟨rr⟩, that is, the second moment of the dumbbell end-to-end distance distribution function. This tensor is in general not experimentally measurable and plays the role of an internal variable. The expected xy stress component of the tensor will be given by

$$\tau = \frac{\epsilon}{\text{We}} \frac{1}{K} \sum_{k=1}^{K} r_x r_y,$$

where K is the number of simulated dumbbells and ε = ν_p/ν_s is the ratio of the polymer to solvent viscosities.
The state variables selected for this problem are the position of the fluid on each node of the mesh (see Fig. 8), its velocity v in the x direction, the internal energy e and the conformation tensor shear component τ,

$$S = \{ z = (q, v, e, \tau) \in (\mathbb{R}^2 \times \mathbb{R} \times \mathbb{R} \times \mathbb{R}) \}. \tag{23}$$

The GENERIC matrices associated with each node of this physical system are the following:

$$L = \begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 1 & -1 \\ 0 & 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \end{bmatrix}, \quad
M = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}. \tag{24}$$
In order to simulate a measurement of real captured data, Gaussian noise is added to the state vector, computed as a random variable following a normal distribution with zero mean and standard deviation proportional to the standard deviation of the database, σ_z, and a noise level ν,

$$z^{\text{GT}}_{\text{noise}} = z^{\text{GT}} + \nu \cdot \sigma_z \cdot \mathcal{N}(0, 1). \tag{25}$$

Fig. 9. Loss evolution of data and degeneracy constraints for each epoch of the neural network training process of the Couette flow example.
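The noise model of Eq. (25) is straightforward to reproduce (a generic sketch, with per-variable standard deviations computed column-wise over the database):

```python
import numpy as np

def add_measurement_noise(Z, nu, seed=0):
    """Eq. (25): corrupt the ground-truth database with zero-mean Gaussian
    noise scaled by each state variable's standard deviation sigma_z and
    the noise level nu."""
    rng = np.random.default_rng(seed)
    sigma_z = Z.std(axis=0)
    return Z + nu * sigma_z * rng.standard_normal(Z.shape)
```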
The results of both the noise-free and the noisy databases are compared with two different network architectures:

• Unconstrained network: this architecture is the same as the proposed network, but removing the degeneracy conditions of the energy and entropy, Eq. (10), from the loss function. These conditions ensure the thermodynamic consistency of the resulting integrator, so not including them negatively affects the accuracy of the results, as will be seen.

• Black-box network: in this case, no GENERIC architecture is imposed; the network acts as a black-box integrator trained to directly predict the state vector time evolution z_{t+1} from the previous time step z_t. This naive approach is shown to be inappropriate, as no physical restrictions are given to the model.

5.2. Database and hyperparameters
The training database for this Oldroyd-B model is generated in MATLAB with a multiscale approach [28] in dimensionless form. The fluid is discretized in the vertical direction with N = 100 elements (101 nodes) in a total height of H = 1. A total of 10,000 dumbbells were considered at each nodal location in the model. The lid velocity is set to V = 1, the viscoelastic Weissenberg number to We = 1 and the Reynolds number to Re = 0.1. The simulation time of the movement is T = 1 in time increments of Δt = 0.0067 (N_T = 150 snapshots).

The database consists of the state vector (Eq. (23)) of the 100 node trajectories (excluding the node at h = H, for which a no-slip condition v = 0 has been imposed). This database is split into 80 train trajectories and 20 test trajectories.

The net input and output sizes are N_in = 5 and N_out = 2N_in² = 50. The number of hidden layers is N_h = 5, with ReLU activation functions and linear activation in the last layer. The network is initialized according to the Kaiming method [16] with normal distribution, and the optimizer used is Adam [24], with a weight decay of λ_r = 10⁻⁵ and a data loss weight of λ_d = 10³. A multistep learning rate scheduler is used, starting at η = 10⁻³ and decaying by a factor of γ = 0.1 at epochs 500 and 1000. The training process ends when a fixed number of epochs n_epoch = 1500 is reached. The same parameters are also considered for the noisy database network (ν = 1%) and the unconstrained network.

The black-box network training parameters are analogous to those of the structure-preserving network, except for the output size N_out = N_in = 5. Several network architectures were tested, and the lowest error is achieved with N_h = 5 hidden layers and 25 neurons per layer.

The time evolution of the data (L^data) and degeneracy (L^degen) loss terms for each training epoch is shown in Fig. 9.

5.3. Results
Fig. 10 shows the time evolution of the state variables (position q, velocity v, internal energy e, and conformation tensor shear component τ) given by the solver and by the neural net. There is good agreement between both plots. Moreover, the proposed scheme is able to predict the time evolution of the flow for several snapshots beyond the training simulation time T = 1, as shown in the same figure.

Table 3 shows the mean squared error of the data and degeneracy loss terms for all the state variables of the Couette flow of an Oldroyd-B fluid. The results are computed separately as the mean over all the train and test trajectories using Eq. (15).
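The per-trajectory error measure can be written out explicitly. The following is a plain restatement of a trajectory-wise mean squared error in the spirit of Eq. (15), not the authors' code; the function name `mse` and the toy data are illustrative:

```python
# Mean squared error of a predicted trajectory against the ground truth,
# averaged over all snapshots and all state-variable components.
def mse(pred, truth):
    n = sum(len(snapshot) for snapshot in truth)  # total number of scalar entries
    return sum((p - t) ** 2
               for zp, zt in zip(pred, truth)
               for p, t in zip(zp, zt)) / n

# Two-snapshot toy trajectory with two state variables per snapshot:
print(mse([[1.0, 2.0], [0.0, 0.0]], [[1.0, 4.0], [0.0, 0.0]]))  # 1.0
```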
Fig. 11 shows a box plot of the data error (MSE_data) for the train and test sets in the four studied architectures. The results of the structure-preserving neural network outperform the other two approaches, even with noisy training data. The error of the unconstrained neural network is more than an order of magnitude larger than that of our approach, proving the importance of the degeneracy conditions in the GENERIC formulation. Last, the naive black-box approach shows the worst performance of the four networks, as no physical restriction is considered.
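The noisy training data discussed above are generated with the perturbation of Eq. (25). A minimal sketch, assuming per-variable database standard deviations; the names `add_noise`, `sigma_z` and `nu` are illustrative, not from the authors' code:

```python
import random

# Sketch of the noise model of Eq. (25): each state component is perturbed by
# zero-mean Gaussian noise whose spread is the per-variable standard deviation
# of the database, sigma_z, scaled by the noise level nu.
def add_noise(z_gt, sigma_z, nu=0.01):
    return [z + nu * s * random.gauss(0.0, 1.0) for z, s in zip(z_gt, sigma_z)]

z_gt = [0.5, 0.5, 1.0, 0.2, 0.05]     # one snapshot of (q_x, q_y, v, e, tau)
sigma_z = [0.3, 0.0, 0.6, 0.1, 0.02]  # per-variable std of the database
z_noisy = add_noise(z_gt, sigma_z, nu=0.01)  # nu = 1%, as in Section 5.2
```

Note that a variable with zero spread in the database (here the constant y coordinate) is left unperturbed, consistent with the zero error reported for it in Table 3.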
Fig. 10. Time evolution of the state variables at five test nodes of a Couette flow using a solver (Ground Truth, GT) and the proposed GENERIC integration scheme (Net). The dotted vertical line represents the simulation time T = 1 of the training dataset.
Table 3
Mean squared error of the data loss (MSE_data) and degeneracy loss (MSE_degen) for all the state variables of the Couette flow.

State variable   MSE_data (train)   MSE_data (test)   MSE_degen (train)   MSE_degen (test)
q [-], X         5.40·10^−6         6.29·10^−6        1.72·10^−7          1.96·10^−7
q [-], Y         0.00               0.00              0.00                0.00
v [-]            3.23·10^−5         4.75·10^−5        1.19·10^−6          1.48·10^−6
e [-]            7.85·10^−6         6.60·10^−6        7.06·10^−7          9.11·10^−7
τ [-]            2.36·10^−5         1.26·10^−5        1.07·10^−6          1.31·10^−6
Fig. 11. Box plots of the data integration mean squared error (MSE_data) of the Couette flow in both train and test cases.
With respect to our previous work [11], which employed a piecewise linear regression approach, these examples show similar levels of accuracy but a much greater level of robustness. For instance, this same example was included in the mentioned reference. However, in that case the problem had to be solved with the help of a reduced-order model with only six degrees of freedom, due to the computational burden of the approach. In our former approach, the GENERIC structure was identified by piecewise linear regression for each of the few global modes of the approximation. So to speak, in that case we learnt the characteristics of the flow. Here, on the contrary, the net is able to find an approximation for any velocity value at the 101 nodes of the mesh (say, fluid particles) without any difficulty: in this case, we are learning the behavior of fluid particles. It will be interesting, however, to study to what extent the use of variational autoencoders, as in Bertalan et al. [2], could help in solving more intricate models. Autoencoders help in determining the actual number of degrees of freedom needed to represent a given physical phenomenon.

Table 4
Computation training time of the proposed algorithm for the two reported examples in the noise-free networks.

Example           Epoch time      Total time
Double Pendulum   2.44 s/epoch    73.18 min
Couette Flow      1.22 s/epoch    30.53 min
6. Conclusions
In this work we have presented a new methodology to ensure thermodynamic consistency in the deep learning of physical phenomena. In contrast to existing methods, this methodology does not need to know in advance any information related to balance equations or the precise form of the PDE governing the phenomenon at hand. The method is constructed on top of the right thermodynamic principles, which ensure the fulfillment of energy conservation and entropy production. It is valid, therefore, for conservative as well as dissipative systems, thus overcoming previous approaches in the field.
When compared with our previous works in the field (see González et al. [11]), the present methodology proved to be more robust, allowing us to find approximations for systems with orders of magnitude more degrees of freedom. This new approach is also less computationally demanding. For the double pendulum case, the snapshot optimization of the GENERIC matrices proposed in [11] has a measured performance of 10 min per trajectory, which adds up to 400 minutes considering the 40 studied trajectories, whereas our new neural-network approach trains in only 73.18 minutes. The computational time of the other examples is shown in Table 4.
The reported results show good agreement between the network output and the synthetic ground-truth solution, even with moderately noisy data. We have also shown the importance of including the degeneracy conditions of the GENERIC formulation in the neural network constraints, as they ensure the thermodynamic consistency of the integrator. The structure-preserving neural network outperforms other naive black-box approaches, since the physical constraints act as an inductive bias, facilitating the learning process. However, the error can be further reduced using several techniques:
• Database: As a general method of increasing the precision of an Euler integration scheme, the time step Δt can be decreased so that the total number of snapshots is increased. However, the database will then be larger, slowing the training process. Similarly, the database can be enriched with a wider variety of cases, improving the predictive capabilities of the net.
• Integration scheme: A higher-order Runge-Kutta integration scheme could be introduced in Eq. (4) in order to obtain higher solution accuracy [47]. However, it requires several forward passes through the neural net for each time step, increasing the complexity of the integration scheme and of the training process. Additionally, GENERIC-based integration schemes have shown very good performance even for first-order approaches [41].
• Net architecture: To increase the computational power of the net, more and larger hidden layers N_h can be added. However, this could lead to a more over-fitted solution, which limits the prediction power and versatility of the net. It also increases the computational cost of both the training and the testing of the net.
• Training hyperparameters: The neural networks trained in this work could be optimized using several hyperparameter tuning methods, such as random search, Bayesian optimization or gradient-based optimization, to obtain a more efficient solution.

Several open questions remain as future work. A more exhaustive analysis can be performed to evaluate the influence of noisy data on the integrator evolution, in order to add robustness to the method and even predict wider simulation times using incremental learning [4,30].
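For reference, the first-order update that the integration-scheme point would replace with Runge-Kutta is a single forward-Euler step of the GENERIC equation, z_{t+1} = z_t + Δt (L ∇E + M ∇S). A minimal sketch with plain Python lists; the gradient arguments stand in for the network-predicted energy and entropy gradients, and all names are illustrative:

```python
# One explicit Euler step of the GENERIC evolution equation:
# z_{t+1} = z_t + dt * (L @ grad_E + M @ grad_S).
def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def euler_step(z, L, M, grad_E, grad_S, dt):
    rhs = [le + ms for le, ms in zip(matvec(L, grad_E), matvec(M, grad_S))]
    return [zi + dt * ri for zi, ri in zip(z, rhs)]

# Toy 2-variable system: skew-symmetric L (reversible part),
# symmetric positive semi-definite M (dissipative part).
L = [[0.0, 1.0], [-1.0, 0.0]]
M = [[0.0, 0.0], [0.0, 1.0]]
z1 = euler_step([1.0, 0.0], L, M, grad_E=[0.0, 1.0], grad_S=[1.0, 0.0], dt=0.1)
print(z1)  # [1.1, 0.0]
```

The skew symmetry of L and the symmetry of M are exactly the algebraic properties that the degeneracy constraints of Eq. (10) protect during learning.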
CRediT authorship contribution statement

• Quercus Hernández: software, investigation, validation
• Alberto Badías: software, investigation, validation
• David González: data, investigation, validation
• Francisco Chinesta: methodology, validation

Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This project has been partially funded by the ESI Group through the ESI Chair at ENSAM Arts et Metiers Institute of Technology, and through the project 2019-0060 "Simulated Reality" at the University of Zaragoza. The support of the Spanish Ministry of Economy and Competitiveness through grant number CICYT-DPI2017-85139-C2-1-R, and by the Regional Government of Aragon, grant T24_20R, and the European Social Fund, is also gratefully acknowledged.
References
[1] Jacobo Ayensa-Jiménez, Mohamed H. Doweidar, Jose A. Sanz-Herrera, Manuel Doblaré, A new reliability-based data-driven approach for noisy experimental data with physical constraints, Comput. Methods Appl. Mech. Eng. 328 (2018) 752–774.
[2] Tom Bertalan, Felix Dietrich, Igor Mezić, Ioannis G. Kevrekidis, On learning Hamiltonian systems from data, Chaos: Interdiscip. J. Nonlinear Sci. 29 (12) (Dec 2019) 121107.
[3] Steven L. Brunton, Joshua L. Proctor, J. Nathan Kutz, Discovering governing equations from data by sparse identification of nonlinear dynamical systems, Proc. Natl. Acad. Sci. (2016).
[4] Euntae Choi, Kyungmi Lee, Kiyoung Choi, Autoencoder-based incremental class learning without retraining on old data, arXiv preprint, arXiv:1907.07872, 2019.
[5] George Cybenko, Approximation by superpositions of a sigmoidal function, Math. Control Signals Syst. 2 (4) (1989) 303–314.
[6] Weinan E, A proposal on machine learning via dynamical systems, Commun. Math. Stat. 5 (1) (Mar 2017) 1–11.
[7] Chady Ghnatios, Iciar Alfaro, David González, Francisco Chinesta, Elias Cueto, Data-driven GENERIC modeling of poroviscoelastic materials, Entropy 21 (12) (2019).
[8] Xavier Glorot, Antoine Bordes, Yoshua Bengio, Deep sparse rectifier neural networks, in: Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, AISTATS, 2011, pp. 315–323.
[9] D. González, F. Chinesta, E. Cueto, Consistent data-driven computational mechanics, AIP Conf. Proc. 1960 (1) (2018) 090005.
[10] David González, Francisco Chinesta, Elías Cueto, Learning corrections for hyperelastic models from data, Front. Mater. 6 (2019) 14.
[11] David González, Francisco Chinesta, Elías Cueto, Thermodynamically consistent data-driven computational mechanics, Contin. Mech. Thermodyn. 31 (1) (2019) 239–253.
[12] Alex Graves, Abdel-rahman Mohamed, Geoffrey Hinton, Speech recognition with deep recurrent neural networks, in: 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE, 2013, pp. 6645–6649.
[13] Miroslav Grmela, Generic guide to the multiscale dynamics and thermodynamics, Comput. Phys. Commun. 2 (3) (2018) 032001.
[14] Miroslav Grmela, Vaclav Klika, Michal Pavelka, Gradient and generic evolution towards reduced dynamics, 2019.
[15] Miroslav Grmela, Hans Christian Öttinger, Dynamics and thermodynamics of complex fluids. I. Development of a general formalism, Phys. Rev. E 56 (6) (1997) 6620.
[16] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, Delving deep into rectifiers: surpassing human-level performance on ImageNet classification, in: Proceedings of the IEEE International Conference on Computer Vision, ICCV, 2015, pp. 1026–1034.
[17] Tony Hey, Stewart Tansley, Kristin Tolle, The Fourth Paradigm: Data-Intensive Scientific Discovery, Microsoft Research, October 2009.
[18] Geoffrey Hinton, Li Deng, Dong Yu, George E. Dahl, Abdel-rahman Mohamed, Navdeep Jaitly, Andrew Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N. Sainath, et al., Deep neural networks for acoustic modeling in speech recognition: the shared views of four research groups, IEEE Signal Process. Mag. 29 (6) (2012) 82–97.
[19] Kurt Hornik, Maxwell Stinchcombe, Halbert White, et al., Multilayer feedforward networks are universal approximators, Neural Netw. 2 (5) (1989) 359–366.
[20] Rubén Ibanez, Emmanuelle Abisset-Chavanne, Jose Vicente Aguado, David Gonzalez, Elias Cueto, Francisco Chinesta, A manifold learning approach to data-driven computational elasticity and inelasticity, Arch. Comput. Methods Eng. 25 (1) (2018) 47–57.
[21] Rubén Ibáñez, Emmanuelle Abisset-Chavanne, David González, Jean-Louis Duval, Elias Cueto, Francisco Chinesta, Hybrid constitutive modeling: data-driven learning of corrections to plasticity models, Int. J. Mater. Forming 12 (4) (2019) 717–725.
[22] Ruben Ibañez, Domenico Borzacchiello, Jose Vicente Aguado, Emmanuelle Abisset-Chavanne, Elías Cueto, Pierre Ladevèze, Francisco Chinesta, Data-driven non-linear elasticity: constitutive manifold construction and problem discretization, Comput. Mech. 60 (5) (2017) 813–826.
[23] Pengzhan Jin, Aiqing Zhu, George Em Karniadakis, Yifa Tang, Symplectic networks: intrinsic structure-preserving networks for identifying Hamiltonian systems, arXiv preprint, arXiv:2001.03750, 2020.
[24] Diederik P. Kingma, Jimmy Ba, Adam: a method for stochastic optimization, arXiv preprint, arXiv:1412.6980, 2014.
[25] Trenton Kirchdoerfer, Michael Ortiz, Data-driven computational mechanics, Comput. Methods Appl. Mech. Eng. 304 (2016) 81–101.
[26] Alex Krizhevsky, Ilya Sutskever, Geoffrey E. Hinton, ImageNet classification with deep convolutional neural networks, in: F. Pereira, C.J.C. Burges, L. Bottou, K.Q. Weinberger (Eds.), Advances in Neural Information Processing Systems, vol. 25, Curran Associates, Inc., 2012, pp. 1097–1105.
[27] Manuel Laso, Hans Christian Öttinger, Calculation of viscoelastic flow using molecular models: the CONNFFESSIT approach, J. Non-Newton. Fluid Mech. 47 (1993) 1–20.
[28] Claude Le Bris, Tony Lelievre, Multiscale modelling of complex fluids: a mathematical initiation, in: Multiscale Modeling and Simulation in Science, Springer, 2009, pp. 49–137.
[29] Jay Yoon Lee, Sanket Vaibhav Mehta, Michael Wick, Jean-Baptiste Tristan, Jaime Carbonell, Gradient-based inference for networks with output constraints, in: Proceedings of the AAAI Conference on Artificial Intelligence, 2019.
[30] Zhizhong Li, Derek Hoiem, Learning without forgetting, in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017.
[31] Pablo Márquez-Neila, Mathieu Salzmann, Pascal Fua, Imposing hard constraints on deep networks: promises and limitations, arXiv preprint, arXiv:1706.02025, 2017.
[32] Jim Magiera, Deep Ray, Jan S. Hesthaven, Christian Rohde, Constraint-aware neural networks for Riemann problems, J. Comput. Phys. 409 (2020) 109345.
[33] W. James Murdoch, Chandan Singh, Karl Kumbier, Reza Abbasi-Asl, Bin Yu, Interpretable machine learning: definitions, methods, and applications, arXiv preprint, arXiv:1901.04592, 2019.
[34] Yatin Nandwani, Abhishek Pathak, Parag Singla, et al., A primal dual formulation for deep learning with constraints, in: Advances in Neural Information Processing Systems, 2019.
[35] Hans Christian Öttinger, Beyond Equilibrium Thermodynamics, John Wiley & Sons, 2005.
[36] Hans Christian Öttinger, Miroslav Grmela, Dynamics and thermodynamics of complex fluids. II. Illustrations of a general formalism, Phys. Rev. E 56 (6) (1997) 6633.
[37] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, Adam Lerer, Automatic differentiation in PyTorch, in: Autodiff Workshop: The Future of Gradient-Based Machine Learning Software and Techniques, 2017.
[38] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala, PyTorch: an imperative style, high-performance deep learning library, in: H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, R. Garnett (Eds.), Advances in Neural Information Processing Systems, vol. 32, Curran Associates, Inc., 2019, pp. 8026–8037.
[39] R.G. Owens, T.N. Phillips, Computational Rheology, Imperial College Press, 2002.
[40] Maziar Raissi, Paris Perdikaris, George E. Karniadakis, Physics-informed neural networks: a deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, J. Comput. Phys. 378 (2019) 686–707.
[41] Ignacio Romero, Thermodynamically consistent time-stepping algorithms for non-linear thermomechanical systems, Int. J. Numer. Methods Eng. 79 (6) (2009) 706–732.
[42] Jonathan Romero, Jonathan P. Olson, Alan Aspuru-Guzik, Quantum autoencoders for efficient compression of quantum data, Quantum Sci. Technol. 2 (4) (2017) 045001.
[43] Sebastian Ruder, An overview of gradient descent optimization algorithms, arXiv preprint, arXiv:1609.04747, 2016.
[44] Jürgen Schmidhuber, Deep learning in neural networks: an overview, Neural Netw. 61 (2015) 85–117.
[45] Hoo-Chang Shin, Holger R. Roth, Mingchen Gao, Le Lu, Ziyue Xu, Isabella Nogues, Jianhua Yao, Daniel Mollura, Ronald M. Summers, Deep convolutional neural networks for computer-aided detection: CNN architectures, dataset characteristics and transfer learning, IEEE Trans. Med. Imaging 35 (5) (2016) 1285–1298.
[46] Lucas Theis, Wenzhe Shi, Andrew Cunningham, Ferenc Huszár, Lossy image compression with compressive autoencoders, arXiv preprint, arXiv:1703.00395, 2017.
[47] Yi-Jen Wang, Chin-Teng Lin, Runge-Kutta neural network for identification of dynamical systems in high accuracy, IEEE Trans. Neural Netw. 9 (2) (1998) 294–307.
[48] Dongkun Zhang, Ling Guo, George Em Karniadakis, Learning in modal space: solving time-dependent stochastic PDEs using physics-informed neural networks, SIAM J. Sci. Comput. 42 (2) (2020) A639–A665.