LENA: Communication-efficient distributed learning with self-triggered gradient uploads

Abstract: In distributed optimization, parameter updates from the gradient-computing nodes must be aggregated at the orchestrating server in every iteration. When these updates are sent over an arbitrary commodity network, bandwidth and latency can become limiting factors. We propose a communication framework in which nodes may skip unnecessary uploads. Every node locally accumulates an error vector in memory and uses a significance filter to self-trigger the upload of the memory contents to the parameter server. The server then uses a history of the nodes’ gradients to update the parameter. We characterize the convergence rate of our algorithm in smooth settings (strongly convex, convex, and non-convex) and show that it enjoys the same convergence rate as sending gradients in every iteration, with substantially fewer uploads. Numerical experiments on real data indicate a significant reduction in network resource usage (total communicated bits and latency) compared to state-of-the-art algorithms, especially in large networks. Our results provide practical insights for using machine learning over resource-constrained networks, including Internet-of-Things deployments and geographically separated datasets.
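To make the upload rule in the abstract concrete, here is a minimal sketch of one way such a scheme could look. It is not the authors' algorithm: the quadratic local losses, the fixed-threshold significance filter, the choice to accumulate gradient changes in the memory, and the names `Node`, `simulate`, and `threshold` are all assumptions made for illustration. Each node accumulates the change in its local gradient, uploads the accumulated vector only when its squared norm reaches the threshold, and the server steps along the average of the last gradients it knows for every node.

```python
from typing import List, Optional, Tuple

Vector = List[float]


def norm_sq(v: Vector) -> float:
    """Squared Euclidean norm."""
    return sum(c * c for c in v)


class Node:
    """Worker with the toy local loss 0.5 * ||x - target||^2 (an assumption)."""

    def __init__(self, target: Vector) -> None:
        self.target = target
        self.prev_grad = [0.0] * len(target)  # local gradient at the previous round
        self.memory = [0.0] * len(target)     # gradient change not yet uploaded
        self.uploads = 0

    def step(self, x: Vector, threshold: float) -> Optional[Vector]:
        grad = [xi - ti for xi, ti in zip(x, self.target)]
        # Accumulate how much the local gradient moved since the last round.
        self.memory = [m + g - p
                       for m, g, p in zip(self.memory, grad, self.prev_grad)]
        self.prev_grad = grad
        # Hypothetical significance filter: skip the upload if the memory is small.
        if norm_sq(self.memory) < threshold:
            return None
        message, self.memory = self.memory, [0.0] * len(grad)
        self.uploads += 1
        return message


def simulate(targets: List[Vector], rounds: int, lr: float,
             threshold: float) -> Tuple[Vector, int]:
    """Run the toy scheme; return the final parameter and the total upload count."""
    dim = len(targets[0])
    nodes = [Node(t) for t in targets]
    history = [[0.0] * dim for _ in nodes]  # server's last known gradient per node
    x = [0.0] * dim
    for _ in range(rounds):
        for i, node in enumerate(nodes):
            message = node.step(x, threshold)
            if message is not None:
                history[i] = [h + m for h, m in zip(history[i], message)]
        # The server steps with the stored gradients, whether fresh or stale.
        avg = [sum(h[k] for h in history) / len(nodes) for k in range(dim)]
        x = [xk - lr * ak for xk, ak in zip(x, avg)]
    return x, sum(node.uploads for node in nodes)


if __name__ == "__main__":
    targets = [[1.0, 2.0], [3.0, -1.0], [-1.0, 0.5], [5.0, 4.0]]
    x, uploads = simulate(targets, rounds=500, lr=0.1, threshold=1e-4)
    print("final x:", x)
    print("uploads:", uploads, "of", len(targets) * 500, "possible")
```

With a fixed threshold this toy version only reaches a neighbourhood of the minimiser (here, the mean of the targets), whose radius shrinks with the threshold; the convergence guarantees stated in the abstract refer to the authors' own filter and server update, which this sketch does not reproduce.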

12/07/2020

distributed optimization, decentralized training methods, communication-efficient distributed training with momentum, large-scale parallel SGD

03/08/2020

Hossein Shokri Ghadikolaei, Sebastian Stich, Martin Jaggi

Similar Papers

Distributed Online Optimization over a Heterogeneous Network

Nima Eshraghi, Ben Liang

Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks

Zhishuai Guo, Mingrui Liu, Zhuoning Yuan, Li Shen, Wei Liu, Tianbao Yang

Optimization - Large Scale, Parallel and Distributed

SlowMo: Improving Communication-Efficient Distributed SGD with Slow Momentum

Jianyu Wang, Vinayak Tantia, Nicolas Ballas, Michael Rabbat

distributed optimization, decentralized training methods, communication-efficient distributed training with momentum, large-scale parallel SGD

Brief announcement: Deterministic lower bound for dynamic balanced graph partitioning

Maciej Pacut, Mahmoud Parham, Stefan Schmid

online algorithms, graph partitioning, self-adjusting networks

Learning to Sparsify Differences of Synaptic Signal for Efficient Event Processing

Yusuke Sekikawa, Keisuke Uto

event-based processing, sigma-delta neuron, temporally sparse processing, learning to sparsify

Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization

Jianyu Wang, Qinghua Liu, Hao Liang, Gauri Joshi, H. Vincent Poor

Decentralised Learning with Random Features and Distributed Gradient Descent

Dominic Richards, Patrick Rebeschini, Lorenzo Rosasco

GCN meets GPU: Decoupling “When to Sample” from “How to Sample”

Morteza Ramezani, Weilin Cong, Mehrdad Mahdavi, Anand Sivasubramaniam, Mahmut Kandemir

Statistically Preconditioned Accelerated Gradient Method for Distributed Optimization

Hadrien Hendrikx, Lin Xiao, Sebastien Bubeck, Francis Bach, Laurent Massoulié

Optimization - Large Scale, Parallel and Distributed

Communication-efficient SGD: From Local SGD to One-Shot Averaging

Artin Spiridonoff, Alex Olshevsky, Yannis Paschalidis

SCAFFOLD: Stochastic Controlled Averaging for Federated Learning

Sai Praneeth Reddy Karimireddy, Satyen Kale, Mehryar Mohri, Sashank Jakkam Reddi, Sebastian Stich, Ananda Theertha Suresh

On the Convergence of FedAvg on Non-IID Data

Xiang Li, Kaixuan Huang, Wenhao Yang, Shusen Wang, Zhihua Zhang

Federated Learning, stochastic optimization, Federated Averaging

Revisiting Consistent Hashing with Bounded Loads

John Chen, Benjamin Coleman, Anshumali Shrivastava

99% of Worker-Master Communication in Distributed Optimization Is Not Needed

Konstantin Mishchenko, Filip Hanzely, Peter Richtarik

A Hybrid Variance-Reduced Method for Decentralized Stochastic Non-Convex Optimization

Ran Xin, Usman Khan, Soummya Kar

Optimization, Distributed and Parallel Optimization

Symplectic Adjoint Method for Exact Gradient of Neural ODE with Minimal Memory

Takashi Matsubara, Yuto Miyatake, Takaharu Yaguchi

deep learning, graph learning

Optimal and Practical Algorithms for Smooth and Strongly Convex Decentralized Optimization

Dmitry Kovalev, Adil Salim, Peter Richtarik

Hierarchical Multiple Kernel Clustering

Jiyuan Liu, Xinwang Liu, Siwei Wang, Sihang Zhou, Yuexiang Yang

Efficient Learning of Generative Models via Finite-Difference Score Matching

Tianyu Pang, Kun Xu, Chongxuan Li, Yang Song, Stefano Ermon, Jun Zhu

A Unified Theory of Decentralized SGD with Changing Topology and Local Updates

Anastasiia Koloskova, Nicolas Loizou, Sadra Boreiri, Martin Jaggi, Sebastian Stich

Optimization - Large Scale, Parallel and Distributed

STL-SGD: Speeding Up Local SGD with Stagewise Communication Period

Shuheng Shen, Yifei Cheng, Jingchang Liu, Linli Xu

Inducing and Exploiting Activation Sparsity for Fast Inference on Deep Neural Networks

Mark Kurtz, Justin Kopinsky, Rati Gelashvili, Alexander Matveev, John Carr, Michael Goin, William Leiserson, Sage Moore, Nir Shavit, Dan Alistarh

Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks
