A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning

03/05/2021

Samuel Horváth, Peter Richtarik

Keywords: distributed optimization, communication efficiency

Abstract: Modern large-scale machine learning applications require stochastic optimization algorithms to be implemented on distributed computing systems. A key bottleneck of such systems is the communication overhead for exchanging information across the workers, such as stochastic gradients. Among the many techniques proposed to remedy this issue, one of the most successful is the framework of compressed communication with error feedback (EF). EF remains the only known technique that can deal with the error induced by contractive compressors which are not unbiased, such as Top-$K$ or PowerSGD. In this paper, we propose a new and theoretically and practically better alternative to EF for dealing with contractive compressors. In particular, we propose a construction which can transform any contractive compressor into an induced unbiased compressor. Following this transformation, existing methods able to work with unbiased compressors can be applied. We show that our approach leads to vast improvements over EF, including reduced memory requirements, better communication complexity guarantees and fewer assumptions. We further extend our results to federated learning with partial participation following an arbitrary distribution over the nodes and demonstrate the benefits thereof. We perform several numerical experiments which validate our theoretical findings.
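The construction mentioned in the abstract can be illustrated with a small sketch. The abstract does not spell out the formula, so the code below assumes one natural form: apply the contractive compressor (here Top-K), then apply an unbiased compressor (here a rescaled Rand-K) to the residual and add the two. The function names `top_k`, `rand_k`, and `induced` and all parameter choices are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def top_k(x, k):
    """Contractive, biased compressor: keep the k largest-magnitude entries."""
    out = np.zeros_like(x)
    idx = np.argpartition(np.abs(x), -k)[-k:]
    out[idx] = x[idx]
    return out


def rand_k(x, k, rng):
    """Unbiased compressor: keep k uniformly chosen entries, rescaled by d/k."""
    d = x.size
    out = np.zeros_like(x)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = x[idx] * (d / k)
    return out


def induced(x, k_top, k_rand, rng):
    """Assumed induced compressor: C1(x) + C2(x - C1(x)).

    C1 is Top-K and C2 is Rand-K. Since E[C2(r)] = r for any r,
    the expectation of the sum is C1(x) + (x - C1(x)) = x.
    """
    c1 = top_k(x, k_top)
    return c1 + rand_k(x - c1, k_rand, rng)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.array([3.0, -1.0, 0.5, 2.0, -0.25, 0.1])
    # Averaging many compressed copies should approach x itself.
    estimate = np.mean([induced(x, 2, 1, rng) for _ in range(20000)], axis=0)
    print("x        :", x)
    print("estimate :", estimate)
```

Under these assumptions, each compressed vector carries at most `k_top + k_rand` nonzero entries, and no error vector has to be stored between rounds, which is consistent with the reduced memory requirements the abstract claims.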

The talk and the respective paper were published at the ICLR 2021 virtual conference.

Similar Papers

06/12/2020

Practical Low-Rank Communication Compression in Decentralized Deep Learning

Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi

3:18

06/12/2021

Error Compensated Distributed SGD Can Be Accelerated

Xun Qian, Peter Richtarik, Tong Zhang

Keywords: machine learning

8:18

06/12/2021

Escaping Saddle Points with Compressed SGD

Dmitrii Avdiukhin, Grigory Yaroslavtsev

Keywords: optimization, machine learning

11:42

06/12/2021

Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems

Subhabrata Dutta, Tanya Gautam, Soumen Chakrabarti, Tanmoy Chakraborty

Keywords: deep learning, transformers

11:54

12/07/2020

Variable-Bitrate Neural Compression via Bayesian Arithmetic Coding

Yibo Yang, Robert Bamler, Stephan Mandt

Keywords: Deep Learning - General

15:08

14/06/2020

Learning to Optimize on SPD Manifolds

Zhi Gao, Yuwei Wu, Yunde Jia, Mehrtash Harandi

Keywords: riemannian optimization, symmetric positive definite (spd) manifolds, optimization-based meta-learning, automatical spd optimizer design, learning to optimize, gradient-based spd optimization, optimization problems with spd constraints

0:50

02/02/2021

A Flexible Framework for Communication-Efficient Machine Learning

Sarit Khirirat, Sindri Magnússon, Arda Aytekin, Mikael Johansson

17:49

06/12/2020

ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training

Chia-Yu Chen, Jiamin Ni, Songtao Lu, Xiaodong Cui, Pin-Yu Chen, Xiao Sun, Naigang Wang, Swagath Venkataramani, Vijayalakshmi (Viji) Srinivasan, Wei Zhang, Kailash Gopalakrishnan

3:06

03/05/2021

Reducing the Computational Cost of Deep Generative Models with Binary Neural Networks

Thomas Bird, Friso Kingma, David Barber

Keywords: generative, binary, optimization, compression

5:14

02/02/2021

Learning a Gradient-free Riemannian Optimizer on Tangent Spaces

Xiaomeng Fan, Zhi Gao, Yuwei Wu, Yunde Jia, Mehrtash Harandi

16:43

03/05/2021

IDF++: Analyzing and Improving Integer Discrete Flows for Lossless Compression

Rianne van den Berg, Alexey Gritsenko, Mostafa Dehghani, Casper Sønderby, Tim Salimans

Keywords: generative modeling, normalizing flows, lossless source compression

5:04

06/12/2021

EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback

Peter Richtarik, Igor Sokolov, Ilyas Fatkhullin

Keywords: optimization, machine learning

19:56

13/04/2021

Communication efficient primal-dual algorithm for nonconvex nonsmooth distributed optimization

Congliang Chen, Jiawei Zhang, Li Shen, Peilin Zhao, Zhiquan Luo

3:01

06/12/2021

Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization

Mher Safaryan, Filip Hanzely, Peter Richtarik

Keywords: theory, optimization, machine learning

10:21

13/04/2021

A linearly convergent algorithm for decentralized optimization: Sending less bits for free!

Dmitry Kovalev, Anastasia Koloskova, Martin Jaggi, Peter Richtarik, Sebastian Stich

3:07

18/07/2021

Communication-Efficient Distributed Optimization with Quantized Preconditioners

Foivos Alimisis, Peter Davies, Dan Alistarh

Keywords: Optimization, Distributed and Parallel Optimization

5:33

06/12/2020

Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians

Juhan Bae, Roger Grosse

3:20

06/12/2020

Memory-Efficient Learning of Stable Linear Dynamical Systems for Prediction and Control

Giorgos Mamakoukas, Orest Xherija, Todd Murphey

Keywords: Optimization -> Non-Convex Optimization, Optimization -> Stochastic Optimization

3:13

06/12/2020

Efficient Learning of Generative Models via Finite-Difference Score Matching

Tianyu Pang, Kun Xu, Chongxuan Li, Yang Song, Stefano Ermon, Jun Zhu

2:59

18/07/2021

A Novel Sequential Coreset Method for Gradient Descent Algorithms

Jiawei Huang, Ruomin Huang, Wenjie Liu, Nikolaos Freris, Hu Ding

Keywords: Optimization

5:15

06/12/2021

Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding

Shengjie Luo, Shanda Li, Tianle Cai, Di He, Dinglan Peng, Shuxin Zheng, Guolin Ke, Liwei Wang, Tie-Yan Liu

Keywords: optimization, machine learning, transformers, vision

10:07

12/07/2020

Extrapolation for Large-batch Training in Deep Learning

Tao Lin, Lingjing Kong, Sebastian Stich, Martin Jaggi

Keywords: Deep Learning - Algorithms

13:21

06/12/2021

Differentiable Spline Approximations

Minsu Cho, Aditya Balu, Ameya Joshi, Anjana Deva Prasad, Biswajit Khara, Soumik Sarkar, Baskar Ganapathysubramanian, Adarsh Krishnamurthy, Chinmay Hegde

Keywords: optimization, machine learning

7:18

18/07/2021

Intermediate Layer Optimization for Inverse Problems using Deep Generative Models

Giannis Daras, Joseph Dean, Ajil Jalal, Alex Dimakis

Keywords: Algorithms, Sparsity and Compressed Sensing

5:16

02/02/2021

Communication-Efficient Frank-Wolfe Algorithm for Nonconvex Decentralized Distributed Learning

Wenhan Xian, Feihu Huang, Heng Huang

16:02

06/12/2021

Discrete-Valued Neural Communication

Dianbo Liu, Alex Lamb, Kenji Kawaguchi, Anirudh Goyal ALIAS PARTH GOYAL, Chen Sun, Michael Mozer, Yoshua Bengio

Keywords: deep learning, robustness, transformers, generative model, graph learning

11:09

12/07/2020

Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks

Zhishuai Guo, Mingrui Liu, Zhuoning Yuan, Li Shen, Wei Liu, Tianbao Yang

Keywords: Optimization - Large Scale, Parallel and Distributed

14:42

06/12/2020

Walsh-Hadamard Variational Inference for Bayesian Deep Learning

Simone Rossi, Sebastien Marmin, Maurizio Filippone

2:59

14/06/2020

Automatic Neural Network Compression by Sparsity-Quantization Joint Learning: A Constrained Optimization-Based Approach

Haichuan Yang, Shupeng Gui, Yuhao Zhu, Ji Liu

Keywords: model compression, pruning, quantization, structured projection

1:01

14/06/2020

Forward and Backward Information Retention for Accurate Binary Neural Networks

Haotong Qin, Ruihao Gong, Xianglong Liu, Mingzhu Shen, Ziran Wei, Fengwei Yu, Jingkuan Song

Keywords: model compression, binary neural networks, deep learning, quantization, computer vision

1:00

06/12/2021

Smooth Bilevel Programming for Sparse Regularization

Clarice Poon, Gabriel Peyré

Keywords: machine learning

13:06

06/12/2021

Second-Order Neural ODE Optimizer

Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou

Keywords: deep learning, optimization, machine learning, vision

14:59

05/04/2021

An Efficient Statistical-based Gradient Compression Technique for Distributed Training Systems

Ahmed M. Abdelmoniem, Ahmed Elzanaty, Mohamed-Slim Alouini, Marco Canini

4:13

05/04/2021

An Efficient Statistical-based Gradient Compression Technique for Distributed Training Systems

Ahmed M. Abdelmoniem, Ahmed Elzanaty, Mohamed-Slim Alouini, Marco Canini

22:37

12/07/2020

Duality in RKHSs with Infinite Dimensional Outputs: Application to Robust Losses

Pierre Laforgue, Alex Lambert, Luc Brogat-Motte, Florence d'Alche-Buc

Keywords: General Machine Learning Techniques

14:36

06/12/2021

Differentiable Optimization of Generalized Nondecomposable Functions using Linear Programs

Zihang Meng, Lopamudra Mukherjee, Yichao Wu, Vikas Singh, Sathya Narayanan Ravi

Keywords: deep learning, optimization

13:21

26/08/2020

Gaussian-Smoothed Optimal Transport: Metric Structure and Statistical Efficiency

Ziv Goldfeld, Kristjan Greenewald

14:45

06/12/2020

CSER: Communication-efficient SGD with Error Reset

Cong Xie, Shuai Zheng, Sanmi Koyejo, Indranil Gupta, Mu Li, Haibin Lin

3:12

06/12/2021

Distributional Gradient Matching for Learning Uncertain Neural Dynamics Models

Lenart Treven, Philippe Wenk, Florian Dorfler, Andreas Krause

Keywords: deep learning, reinforcement learning and planning, kernel methods, active learning

14:46

06/12/2021

A Faster Decentralized Algorithm for Nonconvex Minimax Problems

Wenhan Xian, Feihu Huang, Yanfu Zhang, Heng Huang

Keywords: optimization, machine learning, adversarial robustness and security

13:59