Elastic Consistency: A Practical Consistency Model for Distributed Stochastic Gradient Descent

02/02/2021

Giorgi Nadiradze, Ilia Markov, Bapi Chatterjee, Vyacheslav Kungurtsev, Dan Alistarh

Abstract: One key element behind the recent progress of machine learning has been the ability to train machine learning models in large-scale distributed shared-memory and message-passing environments. Most of these models are trained employing variants of stochastic gradient descent (SGD) based optimization, but most methods involve some type of consistency relaxation relative to sequential SGD, to mitigate its large communication or synchronization costs at scale. In this paper, we introduce a general consistency condition covering communication-reduced and asynchronous distributed SGD implementations. Our framework, called elastic consistency, decouples the system-specific aspects of the implementation from the SGD convergence requirements, giving a general way to obtain convergence bounds for a wide variety of distributed SGD methods used in practice. Elastic consistency can be used to re-derive or improve several previous convergence bounds in message-passing and shared-memory settings, but also to analyze new models and distribution schemes. As a direct application, we propose and analyze a new synchronization-avoiding scheduling scheme for distributed SGD, and show that it can be used to efficiently train deep convolutional models for image classification.

The video of this talk cannot be embedded. You can watch it here:

https://slideslive.com/38949244

The talk and the corresponding paper were published at the AAAI 2021 virtual conference.

Similar Papers

26/08/2020

Deterministic Decoding for Discrete Data in Variational Autoencoders

Daniil Polykovskiy, Dmitry Vetrov

9:00

03/05/2021

On Data-Augmentation and Consistency-Based Semi-Supervised Learning

Atin Ghosh, Alexandre Thiery

Keywords: Semi-Supervised Learning, Regularization, Data augmentation

4:42

12/07/2020

Decoupled Greedy Learning of CNNs

Eugene Belilovsky, Michael Eickenberg, Edouard Oyallon

Keywords: Optimization - Large Scale, Parallel and Distributed

16:04

26/04/2020

Target-Embedding Autoencoders for Supervised Representation Learning

Daniel Jarrett, Mihaela van der Schaar

Keywords: autoencoders, supervised learning, representation learning, target-embedding, label-embedding

10:47

02/02/2021

Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees

Vyacheslav Kungurtsev, Malcolm Egan, Bapi Chatterjee, Dan Alistarh

19:56

14/06/2020

Semi-Supervised Semantic Segmentation With Cross-Consistency Training

Yassine Ouali, Céline Hudelot, Myriam Tami

Keywords: semantic segmentation, semi-supervised learning, consistency training, semi-supervised semantic segmentation

1:01

06/12/2020

Learning from Aggregate Observations

Yivan Zhang, Nontawat Charoenphakdee, Zhenguo Wu, Masashi Sugiyama

3:21

19/08/2021

On Learning Sets of Symmetric Elements (Extended Abstract)

Haggai Maron, Or Litany, Gal Chechik, Ethan Fetaya

Keywords: Machine Learning, Deep Learning

13:14

06/12/2020

Untangling tradeoffs between recurrence and self-attention in artificial neural networks

Giancarlo Kerg, Bhargav Kanuparthi, Anirudh Goyal, Kyle Goyette, Yoshua Bengio, Guillaume Lajoie

3:20

18/07/2021

A Wasserstein Minimax Framework for Mixed Linear Regression

Theo Diamandis, Yonina Eldar, Alireza Fallah, Farzan Farnia, Asuman Ozdaglar

Keywords: Algorithms, Multimodal Learning

25:41

12/07/2020

On Learning Sets of Symmetric Elements

Haggai Maron, Or Litany, Gal Chechik, Ethan Fetaya

Keywords: Deep Learning - General

11:46

18/07/2021

Cross-Gradient Aggregation for Decentralized Learning from Non-IID Data

Yasaman Esfandiari, Sin Yong Tan, Zhanhong Jiang, Aditya Balu, Ethan Herron, Chinmay Hegde, Soumik Sarkar

Keywords: Optimization, Distributed and Parallel Optimization

5:15

16/11/2020

Generative adversarial training of product of policies for robust and adaptive movement primitives

Emmanuel Pignat, Hakan Girgin, Sylvain Calinon

4:26

03/05/2021

Learning explanations that are hard to vary

Giambattista Parascandolo, Alexander Neitz, Antonio Orvieto, Luigi Gresele, Bernhard Schoelkopf

Keywords: invariances, gradient alignment, consistency

5:16

18/07/2021

Byzantine-Resilient High-Dimensional SGD with Local Iterations on Heterogeneous Data

Deepesh Data, Suhas Diggavi

Keywords: Social Aspects of Machine Learning, Privacy, Anonymity, and Security

5:12

06/12/2021

Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces

Kirill Struminsky, Artyom Gadetsky, Denis Rakitin, Danil Karpushkin, Dmitry Vetrov

Keywords: deep learning, optimization

14:26

14/06/2020

Online Deep Clustering for Unsupervised Representation Learning

Xiaohang Zhan, Jiahao Xie, Ziwei Liu, Yew-Soon Ong, Chen Change Loy

Keywords: unsupervised representation learning, self-supervised learning, clustering, unsupervised learning, unlabeled data, recognition, low-shot, classification, imagenet, feature

1:00

12/07/2020

Learning Autoencoders with Relational Regularization

Hongteng Xu, Dixin Luo, Ricardo Henao, Svati Shah, Lawrence Carin

Keywords: Deep Learning - Generative Models and Autoencoders

13:59

18/07/2021

Stochastic Sign Descent Methods: New Algorithms and Better Theory

Mher Safaryan, Peter Richtarik

Keywords: Optimization, Distributed and Parallel Optimization

5:12

18/07/2021

Asynchronous Decentralized Optimization With Implicit Stochastic Variance Reduction

Kenta Niwa, Guoqiang Zhang, W. Bastiaan Kleijn, Noboru Harada, Hiroshi Sawada, Akinori Fujino

Keywords: Optimization, Distributed and Parallel Optimization

5:41

06/12/2021

Learning with Algorithmic Supervision via Continuous Relaxations

Felix Petersen, Christian Borgelt, Hilde Kuehne, Oliver Deussen

Keywords: deep learning

11:39

18/07/2021

A New Representation of Successor Features for Transfer across Dissimilar Environments

Majid Abdolshah, Hung Le, Thommen Karimpanal George, Sunil Gupta, Santu Rana, Svetha Venkatesh

Keywords: Reinforcement Learning and Planning

5:43

26/04/2020

Meta-Learning with Warped Gradient Descent

Sebastian Flennerhag, Andrei A. Rusu, Razvan Pascanu, Francesco Visin, Hujun Yin, Raia Hadsell

Keywords: meta-learning, transfer learning

13:43

07/09/2020

On the Exploration of Incremental Learning for Fine-grained Image Retrieval

Wei Chen, Yu Liu, Weiping Wang, Tinne Tuytelaars, Erwin M. Bakker, Michael Lew

Keywords: Incremental learning, Fine-grained image retrieval, Catastrophic forgetting, Maximum Mean Discrepancy

8:32

06/12/2020

Beyond the Mean-Field: Structured Deep Gaussian Processes Improve the Predictive Uncertainties

Jakob Lindinger, David Reeb, Christoph Lippert, Barbara Rakitsch

3:21

22/11/2021

Multi-Source Domain Adaptation via supervised contrastive learning and confident consistency regularization

Marin Scalbert, Florent Couzinié-Devy, Maria Vakalopoulou

Keywords: unsupervised domain adaptation, contrastive learning, semi-supervised learning, consistency regularization, domain shift

2:57

18/07/2021

Benchmarks, Algorithms, and Metrics for Hierarchical Disentanglement

Andrew Ross, Finale Doshi-Velez

Keywords: Deep Learning, Embedding and Representation learning

15:59

03/05/2021

CPR: Classifier-Projection Regularization for Continual Learning

Sungmin Cha, Hsiang Hsu, Taebaek Hwang, Flavio Calmon, Taesup Moon

Keywords: regularization, wide local minima, continual learning

5:21

18/07/2021

Robust Unsupervised Learning via L-statistic Minimization

Andreas Maurer, Daniela Angela Parletta, Andrea Paudice, Massimiliano Pontil

Keywords: Theory, Statistical Learning Theory

5:03

13/04/2021

Contrastive learning of strong-mixing continuous-time stochastic processes

Bingbin Liu, Pradeep Ravikumar, Andrej Risteski

2:57

06/12/2021

Unleashing the Power of Contrastive Self-Supervised Visual Models via Contrast-Regularized Fine-Tuning

Yifan Zhang, Bryan Hooi, Dapeng Hu, Jian Liang, Jiashi Feng

Keywords: optimization, machine learning, self-supervised learning, vision, contrastive learning, representation learning, transfer learning

14:34

03/05/2021

Conditional Generative Modeling via Learning the Latent Space

Sameera Ramasinghe, Kanchana Ranasinghe, Salman Khan, Nick Barnes, Stephen Gould

Keywords: Generative Modeling, Conditional Generation, Multimodal Spaces

4:57

02/02/2021

Harmonized Dense Knowledge Distillation Training for Multi-Exit Architectures

Xinglu Wang, Yingming Li

15:12

06/12/2021

Constrained Robust Submodular Partitioning

Shengjie Wang, Tianyi Zhou, Chandrashekhar Lavania, Jeff A Bilmes

Keywords: optimization, machine learning

15:20

03/05/2021

Initialization and Regularization of Factorized Neural Layers

Misha Khodak, Neil Tenenholtz, Lester Mackey, Nicolo Fusi

Keywords: matrix factorization, knowledge distillation, multi-head attention, model compression

4:25

07/09/2020

Adversarial Concurrent Training: Optimizing Robustness and Accuracy Trade-off of Deep Neural Networks

Elahe Arani, Fahad Sarfraz, Bahram Zonooz

Keywords: Adversarial Robustness, Generalization, Adversarial Training, Deep Learning, Collaborative Learning

3:39

06/12/2021

Generalization Bounds For Meta-Learning: An Information-Theoretic Analysis

Qi Chen, Changjian Shui, Mario Marchand

Keywords: deep learning, meta learning, few shot learning

11:45

03/05/2021

Disentangled Recurrent Wasserstein Autoencoder

Jun Han, Martin Min, Ligong Han, Li Erran Li, Xuan Zhang

Keywords: Recurrent Generative Model, Sequential Representation Learning, Disentanglement

9:17

03/05/2021

Remembering for the Right Reasons: Explanations Reduce Catastrophic Forgetting

Sayna Ebrahimi, Suzanne Petryk, Akash Gokul, William Gan, Joseph E Gonzalez, Marcus Rohrbach, Trevor Darrell

Keywords: Explainability, Catastrophic Forgetting, Continual Learning, XAI, Lifelong Learning

5:13

06/12/2021

A Minimalist Approach to Offline Reinforcement Learning

Scott Fujimoto, Shixiang (Shane) Gu

Keywords: reinforcement learning and planning, generative model

8:31