06/12/2021

Efficient Combination of Rematerialization and Offloading for Training DNNs

Olivier Beaumont, Lionel Eyraud-Dubois, Alena Shilova

Keywords: deep learning, optimization

Abstract: Rematerialization and offloading are two well-known strategies for saving memory during the training phase of deep neural networks, allowing data scientists to consider larger models, batch sizes, or higher-resolution data. Rematerialization trades memory for computation time, whereas offloading trades memory for data movements. As these two resources are independent, it is appealing to combine both strategies to save even more memory. We precisely model the costs and constraints of Deep Learning frameworks such as PyTorch and TensorFlow, propose optimal algorithms to find a valid memory-constrained sequence of operations, and finally evaluate the performance of the proposed algorithms on realistic networks and computation platforms. Our experiments show that the ability to offload can remove one third of the rematerialization overhead, and that together the two strategies can reduce the memory used for activations by a factor of 4 to 6, with an overhead below 20%.
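
To make the two strategies concrete, here is a minimal, illustrative PyTorch sketch (not the paper's combined algorithm): rematerialization via torch.utils.checkpoint, which discards a segment's activations and recomputes them during backward, and offloading via torch.autograd.graph.save_on_cpu, which moves saved activations to host memory and back. The model, layer sizes, and batch size are arbitrary placeholders chosen for illustration.

import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# A small stack of layers whose activations would normally dominate memory.
blocks = nn.ModuleList(
    nn.Sequential(nn.Linear(1024, 1024), nn.ReLU()) for _ in range(8)
)
x = torch.randn(32, 1024, requires_grad=True)

# Rematerialization: activations inside each checkpointed block are not kept
# during the forward pass; they are recomputed during backward, trading extra
# computation time for lower activation memory.
h = x
for block in blocks:
    h = checkpoint(block, h)
h.sum().backward()

# Offloading: saved activations are transferred to pinned CPU memory during
# the forward pass and brought back during backward, trading memory for data
# movement instead of recomputation (only meaningful when running on a GPU).
if torch.cuda.is_available():
    blocks_gpu = blocks.cuda()
    x_gpu = torch.randn(32, 1024, device="cuda", requires_grad=True)
    with torch.autograd.graph.save_on_cpu(pin_memory=True):
        h = x_gpu
        for block in blocks_gpu:
            h = block(h)
        h.sum().backward()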
