How to Train Your Neural ODE: the World of Jacobian and Kinetic Regularization

Abstract: Training neural ODEs on large datasets has not been tractable due to the necessity of allowing the adaptive numerical ODE solver to refine its step size to very small values. In practice this leads to dynamics equivalent to many hundreds or even thousands of layers. In this paper, we overcome this apparent difficulty by introducing a theoretically-grounded combination of both optimal transport and stability regularizations which encourage neural ODEs to prefer simpler dynamics out of all the dynamics that solve a problem well. Simpler dynamics lead to faster convergence and to fewer discretizations of the solver, considerably decreasing wall-clock time without loss in performance. Our approach allows us to train neural ODE-based generative models to the same performance as the unregularized dynamics, with significant reductions in training time. This brings neural ODEs closer to practical relevance in large-scale applications.
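As a rough illustration of the two penalties named in the title, the sketch below assumes the kinetic term is the time integral of the squared velocity norm and the Jacobian term is the time integral of the squared Frobenius norm of the vector field's Jacobian. The abstract does not state these formulas, so treat them as assumptions. The toy field `vector_field`, the weights `lam_k` and `lam_j`, and every function name are hypothetical. For simplicity the sketch swaps in a fixed-step Euler solver and a finite-difference Jacobian rather than an adaptive solver and automatic differentiation.

```python
import math


def vector_field(x, t):
    """Hypothetical toy dynamics f(x, t) = tanh(W x), standing in for a neural network."""
    W = [[0.5, -1.0], [1.0, 0.5]]
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W]


def jacobian_fd(f, x, t, eps=1e-5):
    """Central finite-difference Jacobian with J[i][j] = d f_i / d x_j."""
    n = len(x)
    cols = []
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp, t), f(xm, t)
        cols.append([(a - b) / (2 * eps) for a, b in zip(fp, fm)])
    # cols[j][i] holds d f_i / d x_j, so transpose into row-major J[i][j].
    return [[cols[j][i] for j in range(n)] for i in range(len(cols[0]))]


def integrate_with_penalties(f, x0, t0=0.0, t1=1.0, steps=100):
    """Euler solve of dx/dt = f(x, t) that also accumulates both penalties."""
    h = (t1 - t0) / steps
    x, t = list(x0), t0
    kinetic, jac_frob = 0.0, 0.0
    for _ in range(steps):
        v = f(x, t)
        J = jacobian_fd(f, x, t)
        kinetic += h * sum(vi * vi for vi in v)               # assumed: integral of ||f||^2
        jac_frob += h * sum(a * a for row in J for a in row)  # assumed: integral of ||df/dx||_F^2
        x = [xi + h * vi for xi, vi in zip(x, v)]
        t += h
    return x, kinetic, jac_frob


def regularized_loss(task_loss, kinetic, jac_frob, lam_k=0.01, lam_j=0.01):
    """Task loss plus weighted penalties; the weights are illustrative values only."""
    return task_loss + lam_k * kinetic + lam_j * jac_frob


x1, kinetic, jac_frob = integrate_with_penalties(vector_field, [1.0, 0.0])
print(regularized_loss(0.0, kinetic, jac_frob))
```

Both penalties vanish for a field that does not move the state at all, and they grow as trajectories become faster or more sensitive to their starting point, which is one way to read the abstract's preference for "simpler dynamics".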

06/12/2021

Chris Finlay, Joern-Henrik Jacobsen, Levon Nurbekyan, Adam Oberman

Similar Papers

Powerpropagation: A sparsity inducing weight reparameterisation

Jonathan Schwarz, Siddhant M Jayakumar, Razvan Pascanu, Peter E Latham, Yee Teh

Keywords: deep learning, optimization, continual learning

"Hey, that's not an ODE": Faster ODE Adjoints via Seminorms

Patrick Kidger, Ricky T. Q. Chen, Terry Lyons

Step-Ahead Error Feedback for Distributed Training with Compressed Gradient

An Xu, Zhouyuan Huo, Heng Huang

Rigging the Lottery: Making All Tickets Winners

Utku Evci, Trevor Gale, Jacob Menick, Pablo Samuel Castro, Erich Elsen

STEER: Simple Temporal Regularization For Neural ODE

Arnab Ghosh, Harkirat Behl, Emilien Dupont, Philip Torr, Vinay Namboodiri

Top-KAST: Top-K Always Sparse Training

Sid Jayakumar, Razvan Pascanu, Jack Rae, Simon Osindero, Erich Elsen

Boost Neural Networks by Checkpoints

Feng Wang, Guoyizhe Wei, Qiao Liu, Jinxiang Ou, Xian Wei, Hairong Lv

Training independent subnetworks for robust prediction

Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew Dai, Dustin Tran

Keywords: robustness, efficient ensembles

Network Pruning by Greedy Subnetwork Selection

Mao Ye, Chengyue Gong, Lizhen Nie, Denny Zhou, Adam Klivans, Qiang Liu

Improved Natural Language Generation via Loss Truncation

Daniel Kang, Tatsunori Hashimoto

Keywords: Natural Generation, optimization, estimation, distinguishability

Opening the Blackbox: Accelerating Neural Differential Equations by Regularizing Internal Solver Heuristics

Avik Pal, Yingbo Ma, Viral Shah, Christopher Rackauckas

Second-Order Neural ODE Optimizer

Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou

Keywords: deep learning, optimization, machine learning, vision

Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks

Alexander Shevchenko, Marco Mondelli

MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius

Runtian Zhai, Chen Dan, Di He, Huan Zhang, Boqing Gong, Pradeep Ravikumar, Cho-Jui Hsieh, Liwei Wang

Keywords: Adversarial Robustness, Provable Adversarial Defense, Randomized Smoothing, Robustness Certification

Learning Deeper Non-Monotonic Networks by Softly Transferring Solution Space

Zheng-Fan Wu, Hui Xue, Weimin Bai

Keywords: Machine Learning, Kernel Methods, Deep Learning, Classification

Inducing and Exploiting Activation Sparsity for Fast Inference on Deep Neural Networks

Mark Kurtz, Justin Kopinsky, Rati Gelashvili, Alexander Matveev, John Carr, Michael Goin, William Leiserson, Sage Moore, Nir Shavit, Dan Alistarh

Analyzing the effect of neural network architecture on training performance

Karthik Abinav Sankararaman, Soham De, Zheng Xu, W. Ronny Huang, Tom Goldstein

Sparse Spiking Gradient Descent

Nicolas Perez-Nieves, Dan Goodman

Keywords: deep learning, optimization

Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss

Lénaïc Chizat, Francis Bach

Keywords: Neural networks/deep learning, Non-convex optimization

Confidence-Aware Learning for Deep Neural Networks

Sangheum Hwang, Jooyoung Moon, Jihyo Kim, Younghak Shin

PHEW: Constructing Sparse Networks that Learn Fast and Generalize Well without Training Data

Shreyas Malakarjun Patil, Constantine Dovrolis

AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty

Dan Hendrycks*, Norman Mu*, Ekin Dogus Cubuk, Barret Zoph, Justin Gilmer, Balaji Lakshminarayanan

Keywords: robustness, uncertainty

Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers

Zhuohan Li, Eric Wallace, Sheng Shen, Kevin Lin, Kurt Keutzer, Dan Klein, Joseph Gonzalez
