Critical parameters for scalable distributed learning with large batches and asynchronous updates

Abstract: It has been experimentally observed that the efficiency of distributed training with stochastic gradient descent (SGD) depends decisively on the batch size and, in asynchronous implementations, on the gradient staleness. In particular, it has been observed that the speedup saturates beyond a certain batch size and/or when the delays grow too large. We identify a data-dependent parameter that explains the speedup saturation in both these settings. Our comprehensive theoretical analysis, for strongly convex, convex and non-convex settings, unifies and generalizes prior lines of work that often focused on only one of these two aspects. In particular, our approach allows us to derive improved speedup results under frequently considered sparsity assumptions. Our insights give rise to theoretically grounded guidelines on how the learning rates can be adjusted in practice. We show that our results are tight and illustrate key findings in numerical experiments.

06/12/2020

Sebastian Stich, Amirkeivan Mohtashami, Martin Jaggi

Similar Papers

On Warm-Starting Neural Network Training

Jordan Ash, Ryan Adams

Model Performance Scaling with Multiple Data Sources

Tatsunori Hashimoto

Keywords: Algorithms, Supervised Learning

Efficient Training of Retrieval Models using Negative Cache

Erik Lindgren, Sashank Reddi, Ruiqi Guo, Sanjiv Kumar

Keywords: deep learning, machine learning

GLISTER: Generalization based Data Subset Selection for Efficient and Robust Learning

Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, Rishabh Iyer

Batch Active Learning at Scale

Gui Citovsky, Giulia DeSalvo, Claudio Gentile, Lazaros Karydas, Anand Rajagopalan, Afshin Rostamizadeh, Sanjiv Kumar

Keywords: active learning

Extrapolation for Large-batch Training in Deep Learning

Tao LIN, Lingjing Kong, Sebastian Stich, Martin Jaggi

Keywords: Deep Learning - Algorithms

On the Loss Landscape of Adversarial Training: Identifying Challenges and How to Overcome Them

Chen Liu, Mathieu Salzmann, Tao Lin, Ryota Tomioka, Sabine Süsstrunk

Keywords: Algorithms -> Representation Learning, Applications -> Dialog- or Communication-Based Learning

Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints

Mengtian Li, Ersin Yumer, Deva Ramanan

Keywords: budgeted training, learning rate schedule, linear schedule, annealing, learning rate decay

Robust Unsupervised Learning via L-statistic Minimization

Andreas Maurer, Daniela Angela Parletta, Andrea Paudice, Massimiliano Pontil

Keywords: Theory, Statistical Learning Theory

Theoretical bounds on estimation error for meta-learning

James Lucas, Mengye Ren, Irene Raissa KAMENI KAMENI, Toniann Pitassi, Richard Zemel

Keywords: meta learning, minimax risk, few-shot, lower bounds, learning theory

The Impact of Record Linkage on Learning from Feature Partitioned Data

Richard Nock, Stephen J Hardy, Wilko Henecka, Hamish Ivey-Law, Jakub Nabaglo, Giorgio Patrini, Guillaume Smith, Brian Thorne

Keywords: Theory, Statistical Learning Theory

Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces

Kirill Struminsky, Artyom Gadetsky, Denis Rakitin, Danil Karpushkin, Dmitry Vetrov

Keywords: deep learning, optimization

RATT: Leveraging Unlabeled Data to Guarantee Generalization

Saurabh Garg, Sivaraman Balakrishnan, Zico Kolter, Zachary Lipton

Keywords: Probabilistic Methods, Graphical Models, Theory, Computational Complexity, Theory, Models of Learning and Generalization

Self Normalizing Flows

T. Anderson Keller, Jorn Peters, Priyank Jaini, Emiel Hoogeboom, Patrick Forré, Max Welling

Keywords: Deep Learning, Generative Models

Submodular Meta-Learning

Arman Adibi, Aryan Mokhtari, Hamed Hassani

Task-Robust Model-Agnostic Meta-Learning

Liam Collins, Aryan Mokhtari, Sanjay Shakkottai

Training Data Subset Selection for Regression with Controlled Generalization Error

Durga S, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

Keywords: Algorithms, Online Learning, Algorithms, Supervised Learning

Adversarial Regression with Doubly Non-negative Weighting Matrices

Tam Le, Truyen Nguyen, Makoto Yamada, Jose Blanchet, Viet Anh Nguyen

Keywords: machine learning

Generative adversarial training of product of policies for robust and adaptive movement primitives

Emmanuel Pignat, Hakan Girgin, Sylvain Calinon

Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models

Rares-Darius Buhai, Yoni Halpern, Yoon Kim, Andrej Risteski, David Sontag

Keywords: Probabilistic Inference - Models and Probabilistic Programming

More Data Can Expand The Generalization Gap Between Adversarially Robust and Standard Models

