Second-Order Neural ODE Optimizer

Abstract: We propose a novel second-order optimization framework for training the emerging deep continuous-time models, specifically the Neural Ordinary Differential Equations (Neural ODEs). Since their training already involves expensive gradient computation by solving a backward ODE, deriving efficient second-order methods becomes highly nontrivial. Nevertheless, inspired by the recent Optimal Control (OC) interpretation of training deep networks, we show that a specific continuous-time OC methodology, called Differential Programming, can be adopted to derive backward ODEs for higher-order derivatives at the same O(1) memory cost. We further explore a low-rank representation of the second-order derivatives and show that it leads to efficient preconditioned updates with the aid of Kronecker-based factorization. The resulting method – named SNOpt – converges much faster than first-order baselines in wall-clock time, and the improvement remains consistent across various applications, e.g. image classification, generative flow, and time-series prediction. Our framework also enables direct architecture optimization, such as the integration time of Neural ODEs, with second-order feedback policies, strengthening the OC perspective as a principled tool of analyzing optimization in deep learning. Our code is available at https://github.com/ghliu/snopt.

06/12/2021

Second-Order Neural ODE Optimizer

Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou

Comments

Similar Papers

A Faster Decentralized Algorithm for Nonconvex Minimax Problems

Wenhan Xian, Feihu Huang, Yanfu Zhang, Heng Huang

Keywords Abstract Paper

optimization, machine learning, adversarial robustness and security

SNODE: Spectral Discretization of Neural ODEs for System Identification

Alessio Quaglino, Marco Gallieri, Jonathan Masci, Jan Koutník

Keywords Abstract Paper

Recurrent neural networks, system identification, neural ODEs

Opening the Blackbox: Accelerating Neural Differential Equations by Regularizing Internal Solver Heuristics

Avik Pal, Yingbo Ma, Viral Shah, Christopher Rackauckas

Keywords Abstract Paper

Deep Learning

SpikeGrad: An ANN-equivalent Computation Model for Implementing Backpropagation with Spikes

Johannes C. Thiele, Olivier Bichler, Antoine Dupret

Keywords Abstract Paper

spiking neural network, neuromorphic engineering, backpropagation

Implicit Bias of Gradient Descent based Adversarial Training on Separable Data

Yan Li, Ethan X.Fang, Huan Xu, Tuo Zhao

Keywords Abstract Paper

implicit bias, adversarial training, robustness, gradient descent

Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State

Mingqing Xiao, Qingyan Meng, Zongpeng Zhang and Yisen Wang, Zhouchen Lin

Keywords Abstract Paper

deep learning

Meta-Learning with Neural Tangent Kernels

Yufan Zhou, Zhenyi Wang, Jiayi Xian and Changyou Chen, Jinhui Xu

Keywords Abstract Paper

neural tangent kernel, meta-learning

Training Recurrent Neural Networks Online by Learning Explicit State Variables

Somjit Nath, Vincent Liu, Alan Chan and Xin Li, Adam White, Martha White

Keywords Abstract Paper

Recurrent Neural Network, Partial Observability, Online Prediction, Incremental Learning

Efficient Learning of Generative Models via Finite-Difference Score Matching

Tianyu Pang, Kun Xu, Chongxuan LI and Yang Song, Stefano Ermon, Jun Zhu

Keywords Abstract Paper

Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss

Lénaïc Chizat, Francis Bach

Keywords Abstract Paper

Neural networks/deep learning, Non-convex optimization

Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems

Subhabrata Dutta, Tanya Gautam, Soumen Chakrabarti, Tanmoy Chakraborty

Keywords Abstract Paper

deep learning, transformers

Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees

Vyacheslav Kungurtsev, Malcolm Egan, Bapi Chatterjee, Dan Alistarh

Keywords Abstract Paper

On the Curse of Memory in Recurrent Neural Networks: Approximation and Optimization Analysis

Zhong Li, Jiequn Han, Weinan E, Qianxiao Li

Keywords Abstract Paper

universal approximation, optimization, curse of memory, recurrent neural network, dynamical system

Distribution Adaptive INT8 Quantization for Training CNNs

Kang Zhao, Sida Huang, Pan Pan and Yinghan Li, Yingya Zhang, Zhenyu Gu, Yinghui Xu

Keywords Abstract Paper

DDPNOpt: Differential Dynamic Programming Neural Optimizer

Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou

Keywords Abstract Paper

differential dynamica programming, trajectory optimization, deep learning training, optimal control

Temporal Spike Sequence Learning via Backpropagation for Deep Spiking Neural Networks

Wenrui Zhang, Peng Li

Keywords Abstract Paper

Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks

Zhishuai Guo, Mingrui Liu, Zhuoning Yuan and Li Shen, Wei Liu, Tianbao Yang

Keywords Abstract Paper

Optimization - Large Scale, Parallel and Distributed

Robust Implicit Networks via Non-Euclidean Contractions

Saber Jafarpour, Alexander Davydov, Anton Proskurnikov, Francesco Bullo

Keywords Abstract Paper

theory, deep learning, machine learning, robustness, vision

Momentum Residual Neural Networks

Michael Sander, Pierre Ablin, Mathieu Blondel, Gabriel Peyré

Keywords Abstract Paper

Deep Learning

Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians

Juhan Bae, Roger Grosse

Keywords Abstract Paper

Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Mingqing Xiao, Qingyan Meng, Zongpeng Zhang and
Yisen Wang, Zhouchen Lin

Keywords Paper

Yufan Zhou, Zhenyi Wang, Jiayi Xian and
Changyou Chen, Jinhui Xu

Keywords Paper

Somjit Nath, Vincent Liu, Alan Chan and
Xin Li, Adam White, Martha White

Keywords Paper

Tianyu Pang, Kun Xu, Chongxuan LI and
Yang Song, Stefano Ermon, Jun Zhu

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Kang Zhao, Sida Huang, Pan Pan and
Yinghan Li, Yingya Zhang, Zhenyu Gu, Yinghui Xu

Keywords Paper

Keywords Paper

Keywords Paper

Zhishuai Guo, Mingrui Liu, Zhuoning Yuan and
Li Shen, Wei Liu, Tianbao Yang

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

T. Anderson Keller, Jorn Peters, Priyank Jaini and
Emiel Hoogeboom, Patrick Forré, Max Welling

Keywords Paper

Keywords Paper

Keywords Paper

Yangjun Ruan, Yuanhao Xiong, Sashank Reddi and
Sanjiv Kumar, Cho-Jui Hsieh

Keywords Paper

Zhuoran Yang, Chi Jin, Zhaoran Wang and
Mengdi Wang, Michael Jordan

Keywords Paper

Mingrui Liu, Youssef Mroueh, Jerret Ross and
Wei Zhang, Xiaodong Cui, Payel Das, Tianbao Yang

Keywords Paper

Yibo Yang, Hongyang Li, Shan You and
Fei Wang, Chen Qian, Zhouchen Lin

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper