"Hey, that's not an ODE": Faster ODE Adjoints via Seminorms

Abstract: Neural differential equations may be trained by backpropagating gradients via the adjoint method, which is another differential equation typically solved using an adaptive-step-size numerical differential equation solver. A proposed step is accepted if its error, \emph{relative to some norm}, is sufficiently small; else it is rejected, the step is shrunk, and the process is repeated. Here, we demonstrate that the particular structure of the adjoint equations makes the usual choices of norm (such as $L^2$) unnecessarily stringent. By replacing it with a more appropriate (semi)norm, fewer steps are unnecessarily rejected and the backpropagation is made faster. This requires only minor code modifications. Experiments on a wide range of tasks---including time series, generative modeling, and physical control---demonstrate a median improvement of 40\% fewer function evaluations. On some problems we see as much as 62\% fewer function evaluations, so that the overall training time is roughly halved.

26/04/2020

"Hey, that's not an ODE": Faster ODE Adjoints via Seminorms

Patrick Kidger, Ricky T. Q. Chen, Terry Lyons

Comments

Similar Papers

SNODE: Spectral Discretization of Neural ODEs for System Identification

Alessio Quaglino, Marco Gallieri, Jonathan Masci, Jan Koutník

Keywords Abstract Paper

Recurrent neural networks, system identification, neural ODEs

How to Train Your Neural ODE: the World of Jacobian and Kinetic Regularization

Chris Finlay, Joern-Henrik Jacobsen, Levon Nurbekyan, Adam Oberman

Keywords Abstract Paper

Second-Order Neural ODE Optimizer

Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou

Keywords Abstract Paper

deep learning, optimization, machine learning, vision

Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State

Mingqing Xiao, Qingyan Meng, Zongpeng Zhang and Yisen Wang, Zhouchen Lin

Keywords Abstract Paper

AC-GC: Lossy Activation Compression with Guaranteed Convergence

R David Evans, Tor Aamodt

Keywords Abstract Paper

deep learning, optimization, graph learning

Up or Down? Adaptive Rounding for Post-Training Quantization

Markus Nagel, Rana Ali Amjad, Marinus van Baalen and Christos Louizos, Tijmen Blankevoort

Keywords Abstract Paper

A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems

Jiawei Zhang, Peijun Xiao, Ruoyu Sun, Zhiquan Luo

Keywords Abstract Paper

Fast Sketching of Polynomial Kernels of Polynomial Degree

Zhao Song, David Woodruff, Zheng Yu, Lichen Zhang

Keywords Abstract Paper

Heavy Ball Neural Ordinary Differential Equations

Hedi Xia, Vai Suliafu, Hangjie Ji and Tan Nguyen, Andrea Bertozzi, Stanley Osher, Bao Wang

Keywords Abstract Paper

deep learning, optimization, machine learning, vision

Guarantees for Tuning the Step Size using a Learning-to-Learn Approach

Xiang Wang, Shuai Yuan, Chenwei Wu, Rong Ge

Keywords Abstract Paper

Theory, Computational Learning Theory

Improved Natural Language Generation via Loss Truncation

Daniel Kang, Tatsunori Hashimoto

Keywords Abstract Paper

Natural Generation, optimization, estimation, distinguishability

Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss

Lénaïc Chizat, Francis Bach

Keywords Abstract Paper

Neural networks/deep learning, Non-convex optimization

Effect of Activation Functions on the Training of Overparametrized Neural Nets

Abhishek Panigrahi, Abhishek Shetty, Navin Goyal

Keywords Abstract Paper

activation functions, deep learning theory, neural networks

Differentiable Optimization of Generalized Nondecomposable Functions using Linear Programs

Zihang Meng, Lopamudra Mukherjee, Yichao Wu and Vikas Singh, Sathya Narayanan Ravi

Keywords Abstract Paper

deep learning, optimization

Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: Joint Gradient Estimation and Tracking

Haoran Sun, Songtao Lu, Mingyi Hong

Keywords Abstract Paper

Stability & Generalisation of Gradient Descent for Shallow Neural Networks without the Neural Tangent Kernel

Dominic Richards, Ilja Kuzborskij

Keywords Abstract Paper

deep learning, optimization

Efficient Learning of Generative Models via Finite-Difference Score Matching

Tianyu Pang, Kun Xu, Chongxuan LI and Yang Song, Stefano Ermon, Jun Zhu

Keywords Abstract Paper

Opening the Blackbox: Accelerating Neural Differential Equations by Regularizing Internal Solver Heuristics

Avik Pal, Yingbo Ma, Viral Shah, Christopher Rackauckas

Keywords Abstract Paper

Step-Ahead Error Feedback for Distributed Training with Compressed Gradient

An Xu, Zhouyuan Huo, Heng Huang

Keywords Abstract Paper

A Corrective View of Neural Networks: Representation, Memorization and Learning

Dheeraj M Nagaraj, Guy Bresler

Keywords Abstract Paper

Neural networks/deep learning, Learning with algebraic or combinatorial structure, Supervised learning

Class Normalization for (Continual)? Generalized Zero-Shot Learning

Ivan Skorokhodov, Mohamed Elhoseiny

Keywords Abstract Paper

initialization, normalization, zero-shot learning, continual learning

Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems

Keywords Paper

Keywords Paper

Keywords Paper

Mingqing Xiao, Qingyan Meng, Zongpeng Zhang and
Yisen Wang, Zhouchen Lin

Keywords Paper

Keywords Paper

Markus Nagel, Rana Ali Amjad, Marinus van Baalen and
Christos Louizos, Tijmen Blankevoort

Keywords Paper

Keywords Paper

Keywords Paper

Hedi Xia, Vai Suliafu, Hangjie Ji and
Tan Nguyen, Andrea Bertozzi, Stanley Osher, Bao Wang

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Zihang Meng, Lopamudra Mukherjee, Yichao Wu and
Vikas Singh, Sathya Narayanan Ravi

Keywords Paper

Keywords Paper

Keywords Paper

Tianyu Pang, Kun Xu, Chongxuan LI and
Yang Song, Stefano Ermon, Jun Zhu

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Jiawei Huang, Ruomin Huang, wenjie liu and
Nikolaos Freris, Hu Ding

Keywords Paper

Yogesh Balaji, Mohammadmahdi Sajedi, Neha Kalibhat and
Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, Soheil Feizi

Keywords Paper

Pan Zhou, Hanshu Yan, Xiaotong Yuan and
Jiashi Feng, Shuicheng Yan

Keywords Paper

Keywords Paper

Keywords Paper

Neha Wadia, Daniel Duckworth, Samuel Schoenholz and
Ethan Dyer, Jascha Sohl-Dickstein

Keywords Paper

Keywords Paper

Keywords Paper

Ben Mussay, Margarita Osadchy, Vladimir Braverman and
Samson Zhou, Dan Feldman

Keywords Paper

Keywords Paper

Davide Abati, Jakub Tomczak, Tijmen Blankevoort and
Simone Calderara, Rita Cucchiara, Babak Ehteshami Bejnordi

Keywords Paper

Keywords Paper

Yibo Yang, Hongyang Li, Shan You and
Fei Wang, Chen Qian, Zhouchen Lin

Keywords Paper

Keywords Paper