Augmenting Physical Models with Deep Networks for Complex Dynamic Forecasting
Nicolas Thome - Prof. at Cnam Paris CEDRIC Lab, Vertigo Team Artificial Intelligence Seminar Imagerie et Vision Artificielle (Imvia) Lab
July 8, 2021
Outline
1 Context
2 PhyDNet: Physically-constrained deep video prediction
3 APHYNITY: physics & ML cooperation
Big data & complex dynamics
▸ Big data: superabundance of data: times series (sensor measurements), images (fisheye, satellite), spatio-temporal data (weather forecasts), videos, text, etc
weather forecasts Sensors, pyranometers 100M monitoring cameras
▸ Obvious need for Artificial Intelligencewith these data
⇒ Forecasting
1/ 37 [email protected] - Augmented Physical Models
Ex: forecasting in climate
▸ Hurricane track forecast [7]
▸ Sea surface temperature prediction [3]
State-of-the-art in complex dynamic forecasting
▸ Model-Based (MB) approaches,e.g.using PDE/ODE: deep understanding of complex underlying phenomenon
▸ Machine Learning (ML) / Deep learning (DL): more agnostic, now state-of-the art in several tasks,e.g.ConvLSTM [26], Neural ODE [2]
3/ 37 [email protected] - Augmented Physical Models
Neural ODE [2]
▸ Parameterizing ODE hidden state derivative using a NN instead of a discrete sequence of hidden layers
Residual network ODE network ht+1=ht+f(ht, θt) dhdt(t) =f(h(t), t, θ)
▸ Backward pass with the adjoint method
Solar energy forecasting
Industrial application at EDF: short-term solar energy forecasting
Data: > 7 Million Fisheye images and measured solar irradiance every 10s
Goal: predict future solar irradiance values (0-20min) given previous fisheye images
5/ 37 [email protected] - Augmented Physical Models
Challenges in solar energy forecasting
▸ Data-driven forecasts struggle to properly extrapolate
▸ Lag behind GT, unable to capture sharp changes
Figure:5min solar irradiance forescasting (from Paletta et al. [17])
Research directions to improve forecasts
▸ Design training loss functions with shape and temporal criteria (instead of the dominant MSE), e.g.DILATE [8]
▸ Incorporating prior (physical) knowledge in data-driven models
7/ 37 [email protected] - Augmented Physical Models
Incorporating prior information in machine learning
▸ Learn transformations between images [6, 10, 27], use optical flow [13, 12, 11]
▸ Use disentangled representations [20, 5, 9]
▸ Combine MB and ML models, hybrid models(gray box)
⇒ Physically-constrained deep forecasting[4, 14]
Figure:Advection-diffusion flow [4]
PDE-Net [14]
NN architecture to approximate the solution of a linear PDE
∂u
∂t(t, x, y) =F(x, y,∂u
∂x,∂u
∂y,∂2u
∂x2, ...) = ∑
(i,j)∶i+j≤q
ci,j
∂i+jh
∂xi∂yj(t,x) (1)
▸ Euler discretization: u(t+1) ≈u(t) +δt.F(x, y, ux(t), ...) ⇒approximate derivatives with a residual architecture and constrained convolutions
▸ Estimate unknown ODE parameters,e.g. {ci,j}in Eq (1)
⇒ Physical parameters identification
9/ 37 [email protected] - Augmented Physical Models
Focus & Contributions
▸ Physical models: approximations of real world dynamics
⇒ Augmenting simplified physical models with data-driven networks
▸ Fully vs partial observability: physical models not always applicable in input space
⇒Leveraging PDE dynamics in a learned latent space
▸ Proper cooperation between prior physical and data-driven augmentation
⇒Decomposition with uniqueness guarantees
Outline
1 Context
2 PhyDNet: Physically-constrained deep video prediction
3 APHYNITY: physics & ML cooperation
Unsupervised Video Prediction
▸ Deep learning state-of-the-art for video prediction
▸ Seminal works: Sequence To Sequence LSTM [19], ConvLSTM [26]
▸ Many further architectures based on 2D/3D CNNs [15, 21, 18], or RNNs [24, 22, 16, 25, 23]
▸ However, still very challenging for pure data-driven methods:
▸ high-dimensionality of images
▸ inherent uncertainty of future (often blurry predictions)
PhyDNet
Motivation
Disentangling Physics,e.g. PDE, from complementary info in latent space
12/ 37 [email protected] - Augmented Physical Models
PhyDNet
Modeling physics in latent space
▸ input image: u(t,x)
▸ E[u(t,x)] =h(t,x)in latent spaceH
▸ PDE dynamics inH:
∂h(t,x)
∂t =∂hp
∂t +∂hr
∂t ∶=Mp(hp,u)+Mr(hr,u)
▸ Latent dynamics decomposed into:
▸ ∂h∂tp with physical prior⇒PhyCell
▸ Data-driven augmentation: ConvLSTM
▸ Such that: D[h(t+1,x))] ≈u(t+1,x)
PhyCell
▸ Prediction/correction:Mp(h,u) ∶=Φ(h) + C(h,u), discretization:
⎧⎪⎪⎨⎪⎪
⎩
h˜t+1=ht+Φ(ht) Prediction ht+1=h˜t+1+Kt⊙ (E(ut) −h˜t+1) Correction
(2) (3)
PhyCell
Atomic recurrent cell for building physically-constrained RNNs
Prediction-correction revisited with deep learning
▸ Decoupling prediction/correction:
good for long-term forecasting and missing data
Prediction:h˜t+1=ht+Φ(ht)
▸ Φ(h(t,x)) = ∑i,j∶i+j≤qci,j
∂i+jh
∂xi∂yj(t,x), PDE-Net in latent space
▸ Approximate ∂i+jh
∂xi∂yj with conv & moment loss
▸ {ci,j}learned
Correction:
ht+1=h˜t+1+Kt⊙ (E(ut) −h˜t+1)
▸ Learned Kalman gainKt for correction:Kt=
tanh(Wh∗h˜t+1+Wu∗E(ut) +bk)
▸ ≠locally linear approximate Kalman gain in [1]
16/ 37 [email protected] - Augmented Physical Models
Experiments
State-of-the-art wrt
▸ Recent general video prediction methods
▸ specialized models for Moving Mnist and SST
Qualitative results
18/ 37 [email protected] - Augmented Physical Models
Ablation study
Ablation study
20/ 37 [email protected] - Augmented Physical Models
Analysis of learned filters
Moving MNIST Traffic BJ
SST Human 3.6
Figure:Mean amplitude of the combining coefficientsci,j with respect to the order of the differential operators approximated.
Long-term prediction and missing data
(a) Long-term forecasting (b) Missing data
Figure:MSE comparison between PhyDNet and DDPAE [9] when dealing with unreliable inputs.
22/ 37 [email protected] - Augmented Physical Models
Outline
1 Context
2 PhyDNet: Physically-constrained deep video prediction
3 APHYNITY: physics & ML cooperation
Motivation: data-driven vs. simplified physical models
▸ Data-driven modelsstruggle to extrapolate complex dynamics, in particular in data-scarce contexts
▸ Physical modelsfail to extrapolate when they are misspecified:
forecasting & parameter identification failure
(a) Data-driven Neural ODE (b) Simple physical model (c) Our APHYNITY framework
⇒ Augmenting PHYsical models for ideNtIfying and forecasTing complex dYnamic (APHYNITY)
23/ 37 [email protected] - Augmented Physical Models
APHYNITY
▸ dXdtt =F(Xt), Xt∈Rd (vector) orXt(x) ∈Rd,x∈Ω⊂Rk(vector field)
▸ F∈ Fnormed vector space,Fp∈ Fp⊂ Fphysical model (ODE/PDE)
▸ Augment approximate physical modelFp with data-drivenFa∈ F:
dXt
dt =F(Xt) =Fp+Fa
▸ However, decompositionF=Fp+Fain general not unique
▸ APHYNITY:
Fp∈Fp ,Fa∈Fmin ∥Fa∥ subject toF= (Fp+Fa)(4)
▸ IfFpChebyshev set1, decomposition in Eq (4) exists and is unique (metric projection ontoFp).
Intuition:min∣∣Fa∣∣ ⇒augmentation only models information that cannot be captured by the physical priorFp
aIn finite-dim space, closed convex sets
APHYNITY training
▸ Dataset of observed trajectories: D = {X⋅∶ [0, T] → F ∣ ∀t∈ [0, T],dXt/dt=F(Xt)}
APHYNITY objective:
min
Fp∈Fp,Fa∈F ∥Fa∥ subject to ∀X∈ D,∀t,dXt
dt = (Fp+Fa)(Xt)
▸ Parametrized modelsFpθp (θp physical parameters),Faθa (θa deep NN)
25/ 37 [email protected] - Augmented Physical Models
APHYNITY optimization
▸ Trajectory based training: multi-step prediction, differentiable ODE solver
▸ In practice: adaptive constraint optimization (variant of Uzawa algorithm):
Lλj(θp, θa) = ∥Faθa∥ +λj⋅ Ltraj(θp, θa) (5)
▸ Ltraj(θp, θa) = ∑Ni=1∑Th=1/∆t∥Xh∆t(i) − ̃Xh∆t(i) ∥
▸ θ= (θp, θa), Iterativeλj setting:
▸ λj+1=λj+τ2Ltraj(θj+1),τ2
hyper-parameter
▸ Stable and robust convergence
Algorithm 1: APHYNITY
Initialization:λ0≥0, τ1>0, τ2>0;
forepoch =1∶Nepochsdo foriter in1∶Niterdo
forbatch in1∶Bdo
θj+1=θj−τ1∇ [λjLtraj(θj) + ∥Fa∥]
λj+1=λj+τ2Ltraj(θj+1)
APHYNITY - quantitative results
Experiments on 3 classes of physical phenomena:
▸ Damped pendulum: ddt2θ2 +ω02sinθ+λdθdt =0
▸ SimplifiedFp: Hamiltonian (energy conservation), ODE withoutλ
▸ Reaction-diffusion: ∂u∂t =a∆u+Ru(u, v;k), ∂v
∂t =b∆v+Rv(u, v))
▸ Reaction terms: Ru(u, v;k) =u−u3−k−v, Rv(u, v) =u−v
▸ SimplifiedFp: PDE without reaction
▸ Damped wave: ∂∂t2w2 −c2∆w+k∂w∂t =0
▸ SimplifiedFp: PDE without damping
AllFp’s are closed and convex inF ⇒Chebyshev
27/ 37 [email protected] - Augmented Physical Models
Experiments: APHYNITY results
▸ Better forecasting performances
▸ Better physical parameter identification
▸ ∣∣Fa∣∣2∼level ofFp
approximation
APHYNITY - qualitative results
29/ 37 [email protected] - Augmented Physical Models
APHYNITY - model analysis on indentification
Application to solar energy forecasting (CVPR’20 workshop)
▸ Short-term (<20min) solar irradiance forecasting with fisheye images
▸ Improved PhyDNet model with separate encoders/decoders & min∣∣Fa∣∣2
31/ 37 [email protected] - Augmented Physical Models
Qualitative results
Conclusions and perspectives
New hybrid MB/ML models
▸ Leveraging approximate physics to improve deep forecasting models
▸ Importance of proper physics/ML decomposition, improving forecasting performances and parameter identification
Future works:
▸ application for other important applicative problems,e.g.optical flow
▸ application for control, model-based RL (e.g.modelling contacts)
▸ application for misspecified physical phenomena (e.g.modelling turbulences)
33/ 37 [email protected] - Augmented Physical Models
Thank your for your attention!
Questions?
▸ PhyDNet: Vincent Le Guen, Nicolas Thome
▸ CVPR’20 paper: Disentangling Physical Dynamics from Unknown Factors for Unsupervised Video Prediction
▸ CVPR’20 OMNI-CV workshop: A Deep Physical Model for Solar Irradiance Forecasting with Fisheye Images.
▸ GitHub code: https://github.com/vincent-leguen/PhyDNet
▸ APHYNITY: Yuan Yin, Vincent Le Guen, Jérémie Dona, Ibrahim Ayed, Emmanuel de Bézenac, Nicolas Thome, Patrick Gallinari.
▸ ICLR’21 paper: Augmenting Physical Models with Deep Networks for Complex Dynamics Forecasting
References I
[1] Philipp Becker, Harit Pandya, Gregor Gebhardt, Cheng Zhao, C James Taylor, and Gerhard Neumann.
Recurrent Kalman networks: Factorized inference in high-dimensional deep feature spaces.
InInternational Conference on Machine Learning (ICML), pages 544–552, 2019.
[2] Tian Qi Chen, Yulia Rubanova, Jesse Bettencourt, and David K Duvenaud.
Neural ordinary differential equations.
InAdvances in neural information processing systems (NeurIPS), 2018.
[3] Emmanuel de Bezenac, Arthur Pajot, and Patrick Gallinari.
Deep learning for physical processes: Incorporating prior scientific knowledge.
ICLR, 2018.
[4] Emmanuel de Bezenac, Arthur Pajot, and Patrick Gallinari.
Deep learning for physical processes: Incorporating prior scientific knowledge.
International Conference on Learning Representations (ICLR), 2018.
[5] Emily L Denton et al.
Unsupervised learning of disentangled representations from video.
InAdvances in neural information processing systems (NeurIPS), pages 4414–4423, 2017.
[6] Chelsea Finn, Ian Goodfellow, and Sergey Levine.
Unsupervised learning for physical interaction through video prediction.
InAdvances in neural information processing systems (NeurIPS), pages 64–72, 2016.
[7] Sophie Giffard-Roisin, Mo Yang, Guillaume Charpiat, Balázs Kégl, and Claire Monteleoni.
Deep learning for hurricane track forecasting from aligned spatio-temporal climate datasets.
2018.
[8] Vincent Le Guen and Nicolas Thome.
Shape and time distortion loss for training deep time series forecasting models.
arXiv preprint arXiv:1909.09020, 2019.
[9] Jun-Ting Hsieh, Bingbin Liu, De-An Huang, Li F Fei-Fei, and Juan Carlos Niebles.
Learning to decompose and disentangle representations for video prediction.
InAdvances in Neural Information Processing Systems (NeurIPS), pages 517–526, 2018.
References II
[10] Xu Jia, Bert De Brabandere, Tinne Tuytelaars, and Luc V Gool.
Dynamic filter networks.
InAdvances in Neural Information Processing Systems (NeurIPS), pages 667–675, 2016.
[11] Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, and Ming-Hsuan Yang.
Flow-grounded spatial-temporal video prediction from still images.
InEuropean Conference on Computer Vision (ECCV), pages 600–615, 2018.
[12] Xiaodan Liang, Lisa Lee, Wei Dai, and Eric P Xing.
Dual motion GAN for future-flow embedded video prediction.
InInternational Conference on Computer Vision (ICCV), pages 1744–1752, 2017.
[13] Ziwei Liu, Raymond A Yeh, Xiaoou Tang, Yiming Liu, and Aseem Agarwala.
Video frame synthesis using deep voxel flow.
InInternational Conference on Computer Vision (ICCV), pages 4463–4471, 2017.
[14] Zichao Long, Yiping Lu, Xianzhong Ma, and Bin Dong.
PDE-Net: Learning PDEs from data.
InInternational Conference on Machine Learning, pages 3214–3222, 2018.
[15] Michael Mathieu, Camille Couprie, and Yann LeCun.
Deep multi-scale video prediction beyond mean square error.
InInternational Conference on Learning Representations (ICLR), 2015.
[16] Marc Oliu, Javier Selva, and Sergio Escalera.
Folded recurrent neural networks for future video prediction.
InEuropean Conference on Computer Vision (ECCV), pages 716–731, 2018.
[17] Quentin Paletta, Guillaume Arbod, and Joan Lasenby.
Benchmarking of deep learning irradiance forecasting models from sky images–an in-depth analysis.
arXiv preprint arXiv:2102.00721, 2021.
References III
[18] Fitsum A Reda, Guilin Liu, Kevin J Shih, Robert Kirby, Jon Barker, David Tarjan, Andrew Tao, and Bryan Catanzaro.
SDC-Net: Video prediction using spatially-displaced convolution.
InEuropean Conference on Computer Vision (ECCV), pages 718–733, 2018.
[19] Nitish Srivastava, Elman Mansimov, and Ruslan Salakhudinov.
Unsupervised learning of video representations using LSTMs.
InInternational Conference on Machine Learning (ICML), pages 843–852, 2015.
[20] Ruben Villegas, Jimei Yang, Seunghoon Hong, Xunyu Lin, and Honglak Lee.
Decomposing motion and content for natural video sequence prediction.
International Conference on Learning Representations (ICLR), 2017.
[21] Carl Vondrick, Hamed Pirsiavash, and Antonio Torralba.
Generating videos with scene dynamics.
InAdvances In Neural Information Processing Systems (NeurIPS), pages 613–621, 2016.
[22] Yunbo Wang, Zhifeng Gao, Mingsheng Long, Jianmin Wang, and Philip S Yu.
PredRNN++: Towards a resolution of the deep-in-time dilemma in spatiotemporal predictive learning.
arXiv preprint arXiv:1804.06300, 2018.
[23] Yunbo Wang, Lu Jiang, Ming-Hsuan Yang, Li-Jia Li, Mingsheng Long, and Li Fei-Fei.
Eidetic 3D LSTM: A model for video prediction and beyond.
InInternational Conference on Learning Representations (ICLR), 2019.
[24] Yunbo Wang, Mingsheng Long, Jianmin Wang, Zhifeng Gao, and S Yu Philip.
PredRNN: Recurrent neural networks for predictive learning using spatiotemporal lstms.
InAdvances in Neural Information Processing Systems (NeurIPS), pages 879–888, 2017.
[25] Yunbo Wang, Jianjin Zhang, Hongyu Zhu, Mingsheng Long, Jianmin Wang, and Philip S Yu.
Memory in memory: A predictive neural network for learning higher-order non-stationarity from spatiotemporal dynamics.
InComputer Vision and Pattern Recognition (CVPR), pages 9154–9162, 2019.
References IV
[26] Shi Xingjian, Zhourong Chen, Hao Wang, Dit-Yan Yeung, Wai-Kin Wong, and Wang-chun Woo.
Convolutional LSTM network: A machine learning approach for precipitation nowcasting.
InAdvances in neural information processing systems (NeurIPS), pages 802–810, 2015.
[27] Tianfan Xue, Jiajun Wu, Katherine Bouman, and Bill Freeman.
Visual dynamics: Probabilistic future frame synthesis via cross convolutional networks.
InAdvances in neural information processing systems (NeurIPS), pages 91–99, 2016.