12/07/2020

Towards Adaptive Residual Network Training: A Neural-ODE Perspective

Chengyu Dong, Liyuan Liu, Zichao Li, Jingbo Shang

Keywords: Deep Learning - Algorithms

Abstract: The depth of a residual network is a crucial factor that balances model capacity, performance, and training efficiency. However, depth has long been fixed as a hyper-parameter requiring laborious tuning, due to the lack of theory describing its dynamics. Here, we conduct a theoretical analysis of network depth and introduce adaptive residual network training, which gradually increases model depth during training. Specifically, from an ordinary differential equation perspective, we describe the effect of depth growth with embedding errors, characterize the impact of model depth with truncation errors, and derive bounds for both. Guided by these derivations, we propose an adaptive training algorithm for residual networks, LipGrow, which automatically increases network depth and accelerates model training. In our experiments, it achieves better or comparable performance while reducing training time by roughly 50%.
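The abstract describes the algorithm only at a high level, so the sketch below illustrates the general depth-growing idea rather than LipGrow itself: under the Euler-discretization view of a residual network, each block is one step of an ODE solver, and depth can be doubled by duplicating blocks while halving their step size, which approximately preserves the learned function. The names GrowingResNet, ResBlock, and the step-size field h are illustrative assumptions; the paper's actual growth criterion is driven by its derived error bounds.

```python
# Minimal sketch of depth growth under the Euler/ODE view of a ResNet.
# NOT the authors' LipGrow implementation; names and the halve-step-size
# rule are illustrative assumptions based on the abstract alone.
import copy

import torch
import torch.nn as nn


class ResBlock(nn.Module):
    """One Euler step: x <- x + h * f(x)."""

    def __init__(self, dim: int, h: float):
        super().__init__()
        self.h = h
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.h * self.f(x)


class GrowingResNet(nn.Module):
    """Residual network whose depth can be doubled during training."""

    def __init__(self, dim: int, depth: int):
        super().__init__()
        # Fixed total integration time of 1.0, split across `depth` steps.
        self.blocks = nn.ModuleList(ResBlock(dim, h=1.0 / depth) for _ in range(depth))

    def forward(self, x):
        for block in self.blocks:
            x = block(x)
        return x

    @torch.no_grad()
    def grow(self):
        """Double the depth: copy each block and halve its step size.

        Refining the solver's step size this way keeps the network's
        input-output map approximately unchanged, so training can resume
        from the grown model with little disruption (small embedding error).
        """
        new_blocks = []
        for block in self.blocks:
            block.h *= 0.5
            new_blocks.extend([block, copy.deepcopy(block)])
        self.blocks = nn.ModuleList(new_blocks)
```

A typical (hypothetical) usage pattern would train the shallow model, grow when a schedule or error-based criterion fires, and rebuild the optimizer since new parameters now exist:

```python
model = GrowingResNet(dim=64, depth=4)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
# ... train for some epochs, then:
model.grow()                                       # depth 4 -> 8
opt = torch.optim.SGD(model.parameters(), lr=0.1)  # rebuild over the new parameter set
```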

The talk and the accompanying paper were presented at the ICML 2020 virtual conference.
