CPT: Efficient Deep Neural Network Training via Cyclic Precision

03/05/2021

CPT: Efficient Deep Neural Network Training via Cyclic Precision

Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin

Keywords: low precision training, Efficient training

Abstract Paper Similar Papers

Abstract: Low-precision deep neural network (DNN) training has gained tremendous attention as reducing precision is one of the most effective knobs for boosting DNNs' training time/energy efficiency. In this paper, we attempt to explore low-precision training from a new perspective as inspired by recent findings in understanding DNN training: we conjecture that DNNs' precision might have a similar effect as the learning rate during DNN training, and advocate dynamic precision along the training trajectory for further boosting the time/energy efficiency of DNN training. Specifically, we propose Cyclic Precision Training (CPT) to cyclically vary the precision between two boundary values which can be identified using a simple precision range test within the first few training epochs. Extensive simulations and ablation studies on five datasets and eleven models demonstrate that CPT's effectiveness is consistent across various models/tasks (including classification and language modeling). Furthermore, through experiments and visualization we show that CPT helps to (1) converge to a wider minima with a lower generalization error and (2) reduce training variance which we believe opens up a new design knob for simultaneously improving the optimization and efficiency of DNN training.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at ICLR 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

06/12/2020

FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training

Yonggan Fu, Haoran You, Yang Zhao and
Yue Wang, Chaojian Li, Kailash Gopalakrishnan, Zhangyang Wang, Yingyan Lin

Keywords Paper

0

1

0

1

3:19

18/07/2021

Training Data Subset Selection for Regression with Controlled Generalization Error

Durga S, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

Keywords Paper

, Algorithms, Online Learning, Algorithms, Supervised Learning

0

0

0

0

4:15

12/07/2020

Extrapolation for Large-batch Training in Deep Learning

Tao LIN, Lingjing Kong, Sebastian Stich, Martin Jaggi

Keywords Paper

Deep Learning - Algorithms

0

0

0

0

13:21

06/12/2020

Minimax Lower Bounds for Transfer Learning with Linear and One-hidden Layer Neural Networks

Mohammadreza Mousavi Kalan, Zalan Fabian, Salman Avestimehr, Mahdi Soltanolkotabi

Keywords Paper

0

0

0

0

3:16

14/06/2020

Fixed-Point Back-Propagation Training

Xishan Zhang, Shaoli Liu, Rui Zhang and
Chang Liu, Di Huang, Shiyi Zhou, Jiaming Guo, Qi Guo, Zidong Du, Tian Zhi, Yunji Chen

Keywords Paper

network quantization, fixed-point training, deep learning, neural network

1

0

0

0

1:01

18/07/2021

Offline Meta-Reinforcement Learning with Advantage Weighting

Eric Mitchell, Rafael Rafailov, Xue Bin Peng and
Sergey Levine, Chelsea Finn

Keywords Paper

Algorithms, Multitask, Transfer, and Meta Learning

1

0

0

0

5:08

06/12/2021

A Theoretical Analysis of Fine-tuning with Linear Teachers

Gal Shachaf, Alon Brutzkus, Amir Globerson

Keywords Paper

theory, deep learning, transfer learning

0

0

0

0

14:01

06/12/2021

Second-Order Neural ODE Optimizer

Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou

Keywords Paper

deep learning, optimization, machine learning, vision

0

0

0

0

14:59

13/04/2021

On the generalization properties of adversarial training

Yue Xing, Qifan Song, Guang Cheng

Keywords Paper

0

0

0

0

3:05

12/07/2020

Don't Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TinyScript

Fangcheng Fu, Yuzheng Hu, Yihan He and
Jiawei Jiang, Yingxia Shao, Ce Zhang, Bin Cui

Keywords Paper

Optimization - Large Scale, Parallel and Distributed

0

0

0

0

9:59

06/12/2020

Auxiliary Task Reweighting for Minimum-data Learning

Baifeng Shi, Judy Hoffman, Kate Saenko and
Trevor Darrell, Huijuan Xu

Keywords Paper

0

0

0

0

3:28

02/02/2021

Any-Precision Deep Neural Networks

Haichao Yu, Haoxiang Li, Humphrey Shi and
Thomas S. Huang, Gang Hua

Keywords Paper

0

0

0

0

14:26

12/07/2020

Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime

Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli, Florent Krzakala

Keywords Paper

Deep Learning - Theory

0

0

0

0

15:11

20/07/2020

A type of generalization error induced by initialization in deep neural networks

Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

Keywords Paper

0

0

0

0

17:33

03/05/2021

Dataset Meta-Learning from Kernel Ridge-Regression

Timothy Nguyen, Zhourong Chen, Jaehoon Lee

Keywords Paper

dataset corruption, infinite-width networks, neural kernels, kernel-ridge regression, dataset compression, dataset distillation, meta-learning

0

0

0

0

4:59

26/04/2020

The Break-Even Point on Optimization Trajectories of Deep Neural Networks

Stanislaw Jastrzebski, Maciej Szymczak, Stanislav Fort and
Devansh Arpit, Jacek Tabor, Kyunghyun Cho, Krzysztof Geras

Keywords Paper

generalization, sgd, learning rate, batch size, hessian, curvature, trajectory, optimization

0

0

0

0

4:42

03/05/2021

BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction

Yuhang Li, Ruihao Gong, Xu Tan and
Yang Yang, Peng Hu, Qi Zhang, fengwei yu, Wei Wang, Shi Gu

Keywords Paper

Second-order analysis, Mixed Precision, Post Training Quantization

0

0

0

0

4:36

06/12/2020

Predicting Training Time Without Training

Luca Zancato, Alessandro Achille, Avinash Ravichandran and
RAHUL Bhotika, Stefano Soatto

Keywords Paper

0

0

0

0

3:23

06/12/2021

R-Drop: Regularized Dropout for Neural Networks

xiaobo liang, Lijun Wu, Juntao Li and
Yue Wang, Qi Meng, Tao Qin, Wei Chen, Min Zhang, Tie-Yan Liu

Keywords Paper

deep learning, machine learning, transformers, vision, language

0

0

0

0

13:02

03/05/2021

Meta-learning with negative learning rates

Alberto Bernacchia

Keywords Paper

Meta-learning

0

0

0

0

5:19

18/07/2021

DORO: Distributional and Outlier Robust Optimization

Runtian Zhai, Chen Dan, Zico Kolter, Pradeep Ravikumar

Keywords Paper

Probabilistic Methods, Robust statistics

0

0

0

1

5:06

09/07/2020

Precise Tradeoffs in Adversarial Training for Linear Regression

Adel Javanmard, Mahdi Soltanolkotabi, Hamed Hassani

Keywords Paper

Adversarial learning and robustness, High-dimensional statistics, Regression

0

0

0

0

15:49

18/07/2021

Nondeterminism and Instability in Neural Network Optimization

Cecilia Summers, Michael J Dinneen

Keywords Paper

Deep Learning, Optimization for Deep Networks

0

0

0

0

5:12

26/04/2020

Learned step size quantization

Steven K. Esser, Jeffrey L. McKinstry, Deepika Bablani and
Rathinakumar Appuswamy, Dharmendra S. Modha

Keywords Paper

deep learning, low precision, classification, quantization

0

0

0

0

4:40

03/05/2021

Entropic gradient descent algorithms and wide flat minima

Fabrizio Pittorino, Carlo Lucibello, Christoph Feinauer and
Gabriele Perugini, Carlo Baldassi, Elizaveta Demyanenko, Riccardo Zecchina

Keywords Paper

flat minima, belief-propagation, statistical physics, entropic algorithms

0

0

0

0

5:38

18/07/2021

Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision

Johan Björck, Xiangyu Chen, Christopher De Sa and
Carla Gomes, Kilian Weinberger

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

5:19

18/07/2021

Monotonic Robust Policy Optimization with Model Discrepancy

yuankun jiang, Chenglin Li, Wenrui Dai and
Junni Zou, Hongkai Xiong

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

5:17

06/12/2020

Curriculum Learning by Dynamic Instance Hardness

Tianyi Zhou, Shengjie Wang, Jeff A Bilmes

Keywords Paper

0

0

0

0

3:24

18/07/2021

Whitening and Second Order Optimization Both Make Information in the Dataset Unusable During Training, and Can Reduce or Prevent Generalization

Neha Wadia, Daniel Duckworth, Samuel Schoenholz and
Ethan Dyer, Jascha Sohl-Dickstein

Keywords Paper

Optimization, Probabilistic Methods, Topic Models, Probabilistic Methods, Latent Variable Models

0

0

0

0

5:17

19/08/2021

Towards Understanding the Spectral Bias of Deep Learning

Yuan Cao, Zhiying Fang, Yue Wu and
Ding-Xuan Zhou, Quanquan Gu

Keywords Paper

Machine Learning, Deep Learning, Kernel Methods

0

0

0

0

14:42

06/12/2020

Temporal Spike Sequence Learning via Backpropagation for Deep Spiking Neural Networks

Wenrui Zhang, Peng Li

Keywords Paper

0

0

0

0

3:06

26/04/2020

Revisiting Self-Training for Neural Sequence Generation

Junxian He, Jiatao Gu, Jiajun Shen, Marc'Aurelio Ranzato

Keywords Paper

self-training, semi-supervised learning, neural sequence generatioin

0

0

0

0

5:07

06/12/2020

Training Stronger Baselines for Learning to Optimize

Tianlong Chen, Weiyi Zhang, Zhou Jingyang and
Shiyu Chang, Sijia Liu, Lisa Amini, Zhangyang Wang

Keywords Paper

0

0

0

0

3:18

02/02/2021

Distribution Adaptive INT8 Quantization for Training CNNs

Kang Zhao, Sida Huang, Pan Pan and
Yinghan Li, Yingya Zhang, Zhenyu Gu, Yinghui Xu

Keywords Paper

0

0

0

0

16:42

03/05/2021

Class Normalization for (Continual)? Generalized Zero-Shot Learning

Ivan Skorokhodov, Mohamed Elhoseiny

Keywords Paper

initialization, normalization, zero-shot learning, continual learning

0

0

0

0

4:45

06/12/2020

Task-Robust Model-Agnostic Meta-Learning

Liam Collins, Aryan Mokhtari, Sanjay Shakkottai

Keywords Paper

0

0

0

0

3:17

03/05/2021

ResNet After All: Neural ODEs and Their Numerical Solution

Katharina Ott, Prateek Katiyar, Philipp Hennig, Michael Tiemann

Keywords Paper

0

0

0

0

5:10

06/12/2021

Functional Regularization for Reinforcement Learning via Learned Fourier Features

Alexander Li, Deepak Pathak

Keywords Paper

deep learning, optimization, reinforcement learning and planning

0

0

0

0

14:35

07/09/2020

Meta-RetinaNet for Few-shot Object Detection

Shaoqi Li, Wenfeng Song, Shuai Li and
Aimin Hao, Hong Qin

Keywords Paper

Few shot, object detection, meta-learning, Meta-RetinaNet, Balanced Loss, coefficient vector

0

0

0

0

8:51

06/12/2020

On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them

Chen Liu, Mathieu Salzmann, Tao Lin and
Ryota Tomioka, Sabine Süsstrunk

Keywords Paper

Algorithms -> Representation Learning, Applications -> Dialog- or Communication-Based Learning

0

0

0

0

3:29