ResNet After All: Neural ODEs and Their Numerical Solution

03/05/2021

Katharina Ott, Prateek Katiyar, Philipp Hennig, Michael Tiemann

Abstract: A key appeal of the recently proposed Neural Ordinary Differential Equation (ODE) framework is that it seems to provide a continuous-time extension of discrete residual neural networks. As we show herein, though, trained Neural ODE models actually depend on the specific numerical method used during training. If the trained model is supposed to be a flow generated from an ODE, it should be possible to choose another numerical solver with equal or smaller numerical error without loss of performance. We observe that if training relies on a solver with overly coarse discretization, then testing with another solver of equal or smaller numerical error results in a sharp drop in accuracy. In such cases, the combination of vector field and numerical method cannot be interpreted as a flow generated from an ODE, which arguably poses a fatal breakdown of the Neural ODE concept. We observe, however, that there exists a critical step size beyond which the training yields a valid ODE vector field. We propose a method that monitors the behavior of the ODE solver during training to adapt its step size, aiming to ensure a valid ODE without unnecessarily increasing computational cost. We verify this adaptation algorithm on a common benchmark dataset as well as a synthetic dataset.
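
The solver-swap check described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm: `vector_field` is a hypothetical stand-in for a trained network (a fixed linear rotation field), and `euler`, `midpoint`, and `solver_gap` are illustrative names. The sketch integrates the same field with explicit Euler and with the explicit midpoint method, which has a smaller numerical error at the same step count, and compares the end states for a coarse and a fine discretization.

```python
import numpy as np


def vector_field(t, y):
    # Hypothetical stand-in for a trained network: a linear rotation field.
    A = np.array([[0.0, -4.0], [4.0, 0.0]])
    return A @ y


def euler(f, y0, t0, t1, n_steps):
    # Explicit Euler with a fixed step size (first-order accurate).
    h = (t1 - t0) / n_steps
    t, y = t0, np.asarray(y0, dtype=float)
    for _ in range(n_steps):
        y = y + h * f(t, y)
        t += h
    return y


def midpoint(f, y0, t0, t1, n_steps):
    # Explicit midpoint method with a fixed step size (second-order accurate).
    h = (t1 - t0) / n_steps
    t, y = t0, np.asarray(y0, dtype=float)
    for _ in range(n_steps):
        k1 = f(t, y)
        y = y + h * f(t + 0.5 * h, y + 0.5 * h * k1)
        t += h
    return y


def solver_gap(n_steps):
    # Distance between the end states of the two solvers on [0, 1].
    y0 = [1.0, 0.0]
    a = euler(vector_field, y0, 0.0, 1.0, n_steps)
    b = midpoint(vector_field, y0, 0.0, 1.0, n_steps)
    return float(np.linalg.norm(a - b))


coarse_gap = solver_gap(2)     # overly coarse discretization
fine_gap = solver_gap(2000)    # fine discretization
print(f"coarse gap: {coarse_gap:.4f}, fine gap: {fine_gap:.4f}")
```

With two steps the solvers end far apart, so the Euler output cannot be read as the flow of the ODE; with 2000 steps the two end states nearly coincide. A model trained against the coarse Euler map would see a different function once the solver is swapped, which is the failure mode the abstract describes.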

The talk and the corresponding paper were published at the ICLR 2021 virtual conference.

Similar Papers

14/06/2020

Fixed-Point Back-Propagation Training

Xishan Zhang, Shaoli Liu, Rui Zhang, Chang Liu, Di Huang, Shiyi Zhou, Jiaming Guo, Qi Guo, Zidong Du, Tian Zhi, Yunji Chen

Keywords: network quantization, fixed-point training, deep learning, neural network

1:01

06/12/2021

Second-Order Neural ODE Optimizer

Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou

Keywords: deep learning, optimization, machine learning, vision

14:59

26/04/2020

Effect of Activation Functions on the Training of Overparametrized Neural Nets

Abhishek Panigrahi, Abhishek Shetty, Navin Goyal

Keywords: activation functions, deep learning theory, neural networks

5:13

26/04/2020

Implicit Bias of Gradient Descent based Adversarial Training on Separable Data

Yan Li, Ethan X. Fang, Huan Xu, Tuo Zhao

Keywords: implicit bias, adversarial training, robustness, gradient descent

4:53

26/04/2020

Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity

Jingzhao Zhang, Tianxing He, Suvrit Sra, Ali Jadbabaie

Keywords: Adaptive methods, optimization, deep learning

14:15

06/12/2020

Neural Controlled Differential Equations for Irregular Time Series

Patrick Kidger, James Morrill, James Foster, Terry Lyons

3:09

12/07/2020

How to Train Your Neural ODE: the World of Jacobian and Kinetic Regularization

Chris Finlay, Joern-Henrik Jacobsen, Levon Nurbekyan, Adam Oberman

Keywords: Probabilistic Inference - Models and Probabilistic Programming

12:34

06/12/2020

STEER: Simple Temporal Regularization For Neural ODE

Arnab Ghosh, Harkirat Behl, Emilien Dupont, Philip Torr, Vinay Namboodiri

3:19

26/04/2020

On Robustness of Neural Ordinary Differential Equations

Hanshu Yan, Jiawei Du, Vincent Tan, Jiashi Feng

Keywords: Neural ODE

5:09

19/08/2021

Towards Understanding the Spectral Bias of Deep Learning

Yuan Cao, Zhiying Fang, Yue Wu, Ding-Xuan Zhou, Quanquan Gu

Keywords: Machine Learning, Deep Learning, Kernel Methods

14:42

03/05/2021

CPT: Efficient Deep Neural Network Training via Cyclic Precision

Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin

Keywords: low precision training, Efficient training

8:55

06/12/2021

When Are Solutions Connected in Deep Networks?

Quynh Nguyen, Pierre Bréchet, Marco Mondelli

Keywords: theory, deep learning, optimization

14:44

02/02/2021

HyDRA: Hypergradient Data Relevance Analysis for Interpreting Deep Neural Networks

Yuanyuan Chen, Boyang Li, Han Yu, Pengcheng Wu, Chunyan Miao

20:40

02/02/2021

Any-Precision Deep Neural Networks

Haichao Yu, Haoxiang Li, Humphrey Shi, Thomas S. Huang, Gang Hua

14:26

06/12/2021

Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State

Mingqing Xiao, Qingyan Meng, Zongpeng Zhang, Yisen Wang, Zhouchen Lin

Keywords: deep learning

12:22

09/07/2020

Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss

Lénaïc Chizat, Francis Bach

Keywords: Neural networks/deep learning, Non-convex optimization

14:41

18/07/2021

Guarantees for Tuning the Step Size using a Learning-to-Learn Approach

Xiang Wang, Shuai Yuan, Chenwei Wu, Rong Ge

Keywords: Theory, Computational Learning Theory

5:20

02/02/2021

Step-Ahead Error Feedback for Distributed Training with Compressed Gradient

An Xu, Zhouyuan Huo, Heng Huang

18:26

06/12/2021

Explicit loss asymptotics in the gradient descent training of neural networks

Maksim Velikanov, Dmitry Yarotsky

Keywords: theory, deep learning, optimization

9:54

13/04/2021

On the generalization properties of adversarial training

Yue Xing, Qifan Song, Guang Cheng

3:05

02/02/2021

Distribution Adaptive INT8 Quantization for Training CNNs

Kang Zhao, Sida Huang, Pan Pan, Yinghan Li, Yingya Zhang, Zhenyu Gu, Yinghui Xu

16:42

18/07/2021

Momentum Residual Neural Networks

Michael Sander, Pierre Ablin, Mathieu Blondel, Gabriel Peyré

Keywords: Deep Learning

5:07

03/05/2021

Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability

Jeremy Cohen, Simran Kaur, Yuanzhi Li, Zico Kolter, Ameet Talwalkar

Keywords: implicit bias, stability, science of deep learning, L-smoothness, trajectory, optimization, sharpness, implicit regularization, deep learning theory

5:06

06/12/2020

A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks

Zixiang Chen, Yuan Cao, Quanquan Gu, Tong Zhang

3:16

18/07/2021

On the Explicit Role of Initialization on the Convergence and Implicit Bias of Overparametrized Linear Networks

Hancheng Min, Salma Tarmoun, Rene Vidal, Enrique Mallada

Keywords: Theory

5:16

06/12/2020

FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training

Yonggan Fu, Haoran You, Yang Zhao, Yue Wang, Chaojian Li, Kailash Gopalakrishnan, Zhangyang Wang, Yingyan Lin

3:19

20/07/2020

A type of generalization error induced by initialization in deep neural networks

Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

17:33

12/07/2020

Dynamics of Deep Neural Networks and Neural Tangent Hierarchy

Jiaoyang Huang, Horng-Tzer Yau

Keywords: Deep Learning - Theory

15:28

06/12/2020

Self-Distillation Amplifies Regularization in Hilbert Space

Hossein Mobahi, Mehrdad Farajtabar, Peter Bartlett

3:18

18/07/2021

Align, then memorise: the dynamics of learning with feedback alignment

Maria Refinetti, Stéphane d'Ascoli, Ruben Ohana, Sebastian Goldt

Keywords: Theory, Models of Learning and Generalization

4:38

06/12/2021

End-to-End Weak Supervision

Salva Rühling Cachay, Benedikt Boecking, Artur Dubrawski

Keywords: deep learning, machine learning, robustness

14:43

12/07/2020

Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE

Juntang Zhuang, Nicha Dvornek, Xiaoxiao Li, Sekhar Tatikonda, Xenophon Papademetris, James Duncan

Keywords: Deep Learning - Algorithms

14:04

06/12/2020

Understanding and Improving Fast Adversarial Training

Maksym Andriushchenko, Nicolas Flammarion

3:23

26/04/2020

Training Recurrent Neural Networks Online by Learning Explicit State Variables

Somjit Nath, Vincent Liu, Alan Chan, Xin Li, Adam White, Martha White

Keywords: Recurrent Neural Network, Partial Observability, Online Prediction, Incremental Learning

5:06

06/12/2020

A Dynamical Central Limit Theorem for Shallow Neural Networks

Zhengdao Chen, Grant Rotskoff, Joan Bruna, Eric Vanden-Eijnden

3:22

12/07/2020

Do We Need Zero Training Loss After Achieving Zero Training Error?

Takashi Ishida, Ikko Yamane, Tomoya Sakai, Gang Niu, Masashi Sugiyama

Keywords: Deep Learning - Algorithms

9:58

18/07/2021

Skew Orthogonal Convolutions

Sahil Singla, Soheil Feizi

Keywords: Algorithms, Adversarial Examples

5:18

19/04/2021

Neural data-to-text generation with LM-based text augmentation

Ernie Chang, Xiaoyu Shen, Dawei Zhu, Vera Demberg, Hui Su

7:32

12/07/2020

A Mean Field Analysis Of Deep ResNet And Beyond: Towards Provably Optimization Via Overparameterization From Depth

Yiping Lu, Chao Ma, Yulong Lu, Jianfeng Lu, Lexing Ying

Keywords: Deep Learning - Theory

4:37

06/12/2021

Convergence and Alignment of Gradient Descent with Random Backpropagation Weights

Ganlin Song, Ruitu Xu, John Lafferty

Keywords: deep learning, optimization

5:13