Practical Low-Rank Communication Compression in Decentralized Deep Learning

06/12/2020

Thijs Vogels, Sai Praneeth Karimireddy, Martin Jaggi

Abstract: Lossy gradient compression has become a practical tool to overcome the communication bottleneck in centrally coordinated distributed training of machine learning models. However, algorithms for decentralized training with compressed communication over arbitrary connected networks have been more complicated, requiring additional memory and hyperparameters. We introduce a simple algorithm that directly compresses the model differences between neighboring workers using low-rank linear compressors. We prove that our method does not require any additional hyperparameters, converges faster than prior methods, and is asymptotically independent of both the network and the compression. Inspired by the PowerSGD algorithm for centralized deep learning, we execute power iteration steps on model differences to maximize the information transferred per bit. Out of the box, these compressors perform on par with state-of-the-art tuned compression algorithms in a series of deep learning benchmarks.
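The idea of running a power iteration step on a model difference can be illustrated with a small sketch. This is not the authors' implementation: the function name `low_rank_compress`, the matrix shapes, and the rank are assumptions chosen for illustration, and the difference matrix is constructed to be exactly rank 2 so that a single step recovers it.

```python
import numpy as np


def low_rank_compress(diff, q):
    """One power-iteration step on `diff` (hypothetical helper).

    Returns factors (p, q_new) such that diff is approximated by p @ q_new.T.
    """
    p = diff @ q                 # project onto the current right factor, shape (n, r)
    p, _ = np.linalg.qr(p)       # orthonormalize the columns of p
    q_new = diff.T @ p           # refit the right factor, shape (m, r)
    return p, q_new


rng = np.random.default_rng(0)
n, m, rank = 64, 32, 2

# Two neighboring workers whose parameters differ by a rank-2 matrix
# (an assumption made so the example has a known answer).
x_i = rng.standard_normal((n, m))
low_rank_part = rng.standard_normal((n, rank)) @ rng.standard_normal((rank, m))
x_j = x_i - low_rank_part
diff = x_i - x_j

q = rng.standard_normal((m, rank))   # random initial right factor
p, q = low_rank_compress(diff, q)
approx = p @ q.T

sent = p.size + q.size               # numbers exchanged with compression
full = diff.size                     # numbers exchanged without compression
rel_error = np.linalg.norm(diff - approx) / np.linalg.norm(diff)
print(f"sent {sent} of {full} values, relative error {rel_error:.2e}")
```

In this sketch only the two thin factors are exchanged instead of the full difference matrix. For a difference that is not low-rank, one step gives only an approximation, and reusing `q` across rounds would refine it in the spirit of power iteration.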

The talk and the respective paper were published at the NeurIPS 2020 virtual conference.

Similar Papers

03/05/2021

A Better Alternative to Error Feedback for Communication-Efficient Distributed Learning

Samuel Horváth, Peter Richtarik

Keywords: distributed optimization, communication efficiency

05/04/2021

An Efficient Statistical-based Gradient Compression Technique for Distributed Training Systems

Ahmed M. Abdelmoniem, Ahmed Elzanaty, Mohamed-Slim Alouini, Marco Canini

05/04/2021

Pufferfish: Communication-efficient Models At No Extra Cost

Hongyi Wang, Saurabh Agarwal, Dimitrios Papailiopoulos

13/04/2021

A linearly convergent algorithm for decentralized optimization: Sending less bits for free!

Dmitry Kovalev, Anastasia Koloskova, Martin Jaggi, Peter Richtarik, Sebastian Stich

06/12/2020

ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training

Chia-Yu Chen, Jiamin Ni, Songtao Lu, Xiaodong Cui, Pin-Yu Chen, Xiao Sun, Naigang Wang, Swagath Venkataramani, Vijayalakshmi (Viji) Srinivasan, Wei Zhang, Kailash Gopalakrishnan

06/12/2021

DeepReduce: A Sparse-tensor Communication Framework for Federated Deep Learning

Hang Xu, Kelly Kostopoulou, Aritra Dutta, Xin Li, Alexandros Ntoulas, Panos Kalnis

Keywords: deep learning, federated learning

07/09/2020

Paying more Attention to Snapshots of Iterative Pruning: Improving Model Compression via Ensemble Distillation

Duong Le, Nhan Vo, Nam Thoai

Keywords: network pruning, knowledge distillation, ensemble learning

06/12/2021

Escaping Saddle Points with Compressed SGD

Dmitrii Avdiukhin, Grigory Yaroslavtsev

Keywords: optimization, machine learning

30/11/2020

Lossless Image Compression Using a Multi-Scale Progressive Statistical Model

Honglei Zhang, Francesco Cricri, Hamed R. Tavakoli, Nannan Zou, Emre Aksu, Miska M. Hannuksela

12/07/2020

Variable-Bitrate Neural Compression via Bayesian Arithmetic Coding

Yibo Yang, Robert Bamler, Stephan Mandt

Keywords: Deep Learning - General

06/12/2020

Improved Analysis of Clipping Algorithms for Non-convex Optimization

Bohang Zhang, Jikai Jin, Cong Fang, Liwei Wang

12/07/2020

Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks

Zhishuai Guo, Mingrui Liu, Zhuoning Yuan, Li Shen, Wei Liu, Tianbao Yang

Keywords: Optimization - Large Scale, Parallel and Distributed

02/02/2021

A Flexible Framework for Communication-Efficient Machine Learning

Sarit Khirirat, Sindri Magnússon, Arda Aytekin, Mikael Johansson

06/12/2020

Constant-Expansion Suffices for Compressed Sensing with Generative Priors

Constantinos Daskalakis, Dhruv Rohatgi, Emmanouil Zampetakis

12/07/2020

Operation-Aware Soft Channel Pruning using Differentiable Masks

Minsoo Kang, Bohyung Han

Keywords: Applications - Computer Vision

06/12/2021

AC-GC: Lossy Activation Compression with Guaranteed Convergence

R David Evans, Tor Aamodt

Keywords: deep learning, optimization, graph learning

06/12/2021

Error Compensated Distributed SGD Can Be Accelerated

Xun Qian, Peter Richtarik, Tong Zhang

Keywords: machine learning

06/12/2020

Self-Supervised Generative Adversarial Compression

Chong Yu, Jeff Pool

06/12/2021

Rethinking gradient sparsification as total error minimization

Atal Sahu, Aritra Dutta, Ahmed M. Abdelmoniem, Trambak Banerjee, Marco Canini, Panos Kalnis

Keywords: deep learning, optimization

13/04/2021

Federated learning with compression: Unified analysis and sharp guarantees

Farzin Haddadpour, Mohammad Mahdi Kamani, Aryan Mokhtari, Mehrdad Mahdavi

06/12/2021

Smoothness Matrices Beat Smoothness Constants: Better Communication Compression Techniques for Distributed Optimization

Mher Safaryan, Filip Hanzely, Peter Richtarik

Keywords: theory, optimization, machine learning

18/07/2021

A Novel Sequential Coreset Method for Gradient Descent Algorithms

Jiawei Huang, Ruomin Huang, Wenjie Liu, Nikolaos Freris, Hu Ding

Keywords: Optimization

06/12/2021

A Faster Decentralized Algorithm for Nonconvex Minimax Problems

Wenhan Xian, Feihu Huang, Yanfu Zhang, Heng Huang

Keywords: optimization, machine learning, adversarial robustness and security

18/07/2021

Self Normalizing Flows

T. Anderson Keller, Jorn Peters, Priyank Jaini, Emiel Hoogeboom, Patrick Forré, Max Welling

Keywords: Deep Learning, Generative Models

09/07/2020

Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss

Lénaïc Chizat, Francis Bach

Keywords: Neural networks/deep learning, Non-convex optimization

18/07/2021

Training Adversarially Robust Sparse Networks via Bayesian Connectivity Sampling

Ozan Özdenizci, Robert Legenstein

Keywords: Algorithms, Adversarial Examples

06/12/2021

Large-Scale Learning with Fourier Features and Tensor Decompositions

Frederiek Wesel, Kim Batselier

Keywords: machine learning, kernel methods

03/05/2021

Linear Convergent Decentralized Optimization with Compression

Xiaorui Liu, Yao Li, Rongrong Wang, Jiliang Tang, Ming Yan

Keywords: Decentralized Optimization, Heterogeneous data, Linear Convergence, Communication Compression

14/06/2020

Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking

Jin Gao, Weiming Hu, Yan Lu

Keywords: online learning, visual tracking, continual learning, recursive least-squares estimation, deep learning, memory retention, recursive learning, mini-batch sgd, normal equation, mlp layer

14/06/2020

On the Acceleration of Deep Learning Model Parallelism With Staleness

An Xu, Zhouyuan Huo, Heng Huang

Keywords: layer-wise staleness, asynchronous model parallelism, convolutional neural networks

23/08/2020

Rethinking pruning for accelerating deep inference at the edge

Dawei Gao, Xiaoxi He, Zimu Zhou, Yongxin Tong, Ke Xu, Lothar Thiele

Keywords: automatic speech recognition, deep learning, name entity recognition, network pruning, sequence labelling

26/04/2020

Data-Independent Neural Pruning via Coresets

Ben Mussay, Margarita Osadchy, Vladimir Braverman, Samson Zhou, Dan Feldman

Keywords: coresets, neural pruning, network compression

18/07/2021

Opening the Blackbox: Accelerating Neural Differential Equations by Regularizing Internal Solver Heuristics

Avik Pal, Yingbo Ma, Viral Shah, Christopher Rackauckas

Keywords: Deep Learning

26/04/2020

Why Gradient Clipping Accelerates Training: A Theoretical Justification for Adaptivity

Jingzhao Zhang, Tianxing He, Suvrit Sra, Ali Jadbabaie

Keywords: Adaptive methods, optimization, deep learning

06/12/2020

Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks

Alexander Shekhovtsov, Viktor Yanush, Boris Flach

06/12/2021

Second-Order Neural ODE Optimizer

Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou

Keywords: deep learning, optimization, machine learning, vision

06/12/2021

EF21: A New, Simpler, Theoretically Better, and Practically Faster Error Feedback

Peter Richtarik, Igor Sokolov, Ilyas Fatkhullin

Keywords: optimization, machine learning

26/04/2020

SpikeGrad: An ANN-equivalent Computation Model for Implementing Backpropagation with Spikes

Johannes C. Thiele, Olivier Bichler, Antoine Dupret

Keywords: spiking neural network, neuromorphic engineering, backpropagation