Minibatch vs Local SGD for Heterogeneous Distributed Learning

06/12/2020


Blake Woodworth, Kumar Kshitij Patel, Nati Srebro


Abstract: We analyze Local SGD (aka parallel or federated SGD) and Minibatch SGD in the heterogeneous distributed setting, where each machine has access to stochastic gradient estimates for a different, machine-specific, convex objective; the goal is to optimize w.r.t. the average objective; and machines can only communicate intermittently. We argue that (i) Minibatch SGD (even without acceleration) dominates all existing analyses of Local SGD in this setting and (ii) accelerated Minibatch SGD is optimal when the heterogeneity is high, and (iii) we present the first upper bound for Local SGD that improves over Minibatch SGD in a non-homogeneous regime.
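To make the two algorithms named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy one-dimensional setting in which machine m holds the quadratic F_m(x) = 0.5 (x - b_m)^2, so the average objective is minimized at the mean of the b_m. The function names, the Gaussian noise model, and the parameters (`targets`, `rounds`, `local_steps`, `lr`, `sigma`) are illustrative assumptions. Both methods spend the same budget per round: `local_steps` stochastic gradients per machine and one communication.

```python
import random


def stochastic_grad(x, b, sigma, rng):
    # Noisy gradient of the assumed local objective F_m(x) = 0.5 * (x - b)^2.
    return (x - b) + rng.gauss(0.0, sigma)


def minibatch_sgd(targets, rounds, local_steps, lr, sigma=0.1, seed=0):
    # Each round: every machine draws `local_steps` gradients at the SAME
    # shared point; all gradients are averaged and one step is taken.
    rng = random.Random(seed)
    x = 0.0
    for _ in range(rounds):
        grads = [stochastic_grad(x, b, sigma, rng)
                 for b in targets
                 for _ in range(local_steps)]
        x -= lr * sum(grads) / len(grads)
    return x


def local_sgd(targets, rounds, local_steps, lr, sigma=0.1, seed=0):
    # Each round: every machine runs `local_steps` SGD steps on its own
    # objective starting from the shared point; the iterates are averaged.
    rng = random.Random(seed)
    x = 0.0
    for _ in range(rounds):
        local_iterates = []
        for b in targets:
            x_m = x
            for _ in range(local_steps):
                x_m -= lr * stochastic_grad(x_m, b, sigma, rng)
            local_iterates.append(x_m)
        x = sum(local_iterates) / len(local_iterates)
    return x


if __name__ == "__main__":
    targets = [-1.0, 0.0, 4.0]  # hypothetical per-machine optima; mean is 1.0
    print(minibatch_sgd(targets, rounds=50, local_steps=4, lr=0.5))
    print(local_sgd(targets, rounds=50, local_steps=4, lr=0.5))
```

Because every machine in this toy problem has the same curvature, it does not exhibit the drift that heterogeneity can cause in Local SGD; it only illustrates how the update structure of the two methods differs.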


The talk and the corresponding paper were published at the NeurIPS 2020 virtual conference.


Similar Papers

12/07/2020

A Unified Theory of Decentralized SGD with Changing Topology and Local Updates

Anastasiia Koloskova, Nicolas Loizou, Sadra Boreiri, Martin Jaggi, Sebastian Stich

Keywords: Optimization - Large Scale, Parallel and Distributed

13:46

12/07/2020

Randomly Projected Additive Gaussian Processes for Regression

Ian Delbridge, David Bindel, Andrew Wilson

Keywords: Gaussian Processes

17:58

12/07/2020

Is Local SGD Better than Minibatch SGD?

Blake Woodworth, Kumar Kshitij Patel, Sebastian Stich, Zhen Dai, Brian Bullins, Brendan McMahan, Ohad Shamir, Nati Srebro

Keywords: Optimization - Large Scale, Parallel and Distributed

14:36

02/02/2021

STL-SGD: Speeding Up Local SGD with Stagewise Communication Period

Shuheng Shen, Yifei Cheng, Jingchang Liu, Linli Xu


14:53

03/05/2021

New Bounds For Distributed Mean Estimation and Variance Reduction

Peter Davies, Vijaykrishna Gurunathan, Niusha Moshrefi, Saleh Ashkboos, Dan Alistarh

Keywords: distributed machine learning, variance reduction, mean estimation, lattices

4:51

06/12/2021

Dynamics of Stochastic Momentum Methods on Large-scale, Quadratic Models

Courtney Paquette, Elliot Paquette

Keywords: theory, optimization

15:10

06/12/2021

Fast Minimum-norm Adversarial Attacks through Adaptive Norm Constraints

Maura Pintor, Fabio Roli, Wieland Brendel, Battista Biggio

Keywords: optimization, machine learning, robustness, adversarial robustness and security, vision

11:35

06/12/2021

Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation

Ke Wang, Vidya Muthukumar, Christos Thrampoulidis

Keywords: machine learning

12:38

06/12/2020

A novel variational form of the Schatten-$p$ quasi-norm

Paris Giampouras, Rene Vidal, Athanasios Rontogiannis, Benjamin Haeffele


3:14

06/12/2021

Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD

Rémi Bardenet, Subhroshekhar Ghosh, Meixia Lin

Keywords: optimization, machine learning

14:51

06/12/2021

Double Machine Learning Density Estimation for Local Treatment Effects with Instruments

Yonghan Jung, Jin Tian, Elias Bareinboim

Keywords: machine learning, causality

14:24

06/12/2020

Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization

Jianyu Wang, Qinghua Liu, Hao Liang, Gauri Joshi, H. Vincent Poor

3:14

06/12/2020

CSER: Communication-efficient SGD with Error Reset

Cong Xie, Shuai Zheng, Sanmi Koyejo, Indranil Gupta, Mu Li, Haibin Lin

3:12

02/02/2021

Efficient Truthful Scheduling and Resource Allocation through Monitoring

Dimitris Fotakis, Piotr Krysta, Carmine Ventre


19:40

03/05/2021

Understanding Over-parameterization in Generative Adversarial Networks

Yogesh Balaji, Mohammadmahdi Sajedi, Neha Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, Soheil Feizi

Keywords: min-max optimization, Over-parameterization, GAN

5:04

12/07/2020

Random extrapolation for primal-dual coordinate descent

Ahmet Alacaoglu, Olivier Fercoq, Volkan Cevher

Keywords: Optimization - Convex

14:34

06/12/2021

Spatio-Temporal Variational Gaussian Processes

Oliver Hamelijnck, William Wilkinson, Niki Loppi, Arno Solin, Theodoros Damoulas

Keywords: generative model, kernel methods

6:04

06/12/2021

Identity testing for Mallows model

Róbert Busa-Fekete, Dimitris Fotakis, Balazs Szorenyi, Emmanouil Zampetakis


14:51

12/07/2020

Tensor denoising and completion based on ordinal observations

Chanwoo Lee, Miaoyan Wang

Keywords: General Machine Learning Techniques

12:44

06/12/2021

An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias

Lu Yu, Krishnakumar Balasubramanian, Stanislav Volgushev, Murat Erdogdu

Keywords: optimization, machine learning

10:21

06/12/2020

Nonasymptotic Guarantees for Spiked Matrix Recovery with Generative Priors

Jorio Cocola, Paul Hand, Vlad Voroninski


3:15

13/04/2021

Accumulations of projections—a unified framework for random sketches in kernel ridge regression

Yifan Chen, Yun Yang


3:03

06/12/2021

The Benefits of Implicit Regularization from SGD in Least Squares Problems

Difan Zou, Jingfeng Wu, Vladimir Braverman, Quanquan Gu, Dean Foster, Sham Kakade

Keywords: optimization, machine learning

16:05

06/12/2021

Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems

Suhas Kowshik, Dheeraj Nagaraj, Prateek Jain, Praneeth Netrapalli

Keywords: theory

14:43

06/12/2021

Low-Rank Extragradient Method for Nonsmooth and Low-Rank Matrix Optimization Problems

Atara Kaplan, Dan Garber

Keywords: optimization, machine learning

15:02

06/12/2020

Multi-task Additive Models for Robust Estimation and Automatic Structure Discovery

Yingjie Wang, Hong Chen, Feng Zheng, Chen Xu, Tieliang Gong, Yanhong Chen

Keywords: Applications -> Time Series Analysis; Probabilistic Methods -> Variational Inference, Probabilistic Methods -> Causal Inference

3:00

06/12/2020

Dual-Free Stochastic Decentralized Optimization with Variance Reduction

Hadrien Hendrikx, Francis Bach, Laurent Massoulié


3:28

06/12/2020

Task-Robust Model-Agnostic Meta-Learning

Liam Collins, Aryan Mokhtari, Sanjay Shakkottai


3:17

09/07/2020

High probability guarantees for stochastic convex optimization

Damek Davis, Dmitriy Drusvyatskiy

Keywords: Stochastic optimization, Computational complexity, Convex optimization, Excess risk bounds and generalization error bounds

15:10

03/05/2021

Sharpness-aware Minimization for Efficiently Improving Generalization

Pierre Foret, Ariel Kleiner, Hossein Mobahi, Behnam Neyshabur

Keywords: Generalization, Deep Learning, Training Method, Regularization, Sharpness Minimization

13:14

06/12/2021

An Improved Analysis and Rates for Variance Reduction under Without-replacement Sampling Orders

Xinmeng Huang, Kun Yuan, Xianghui Mao, Wotao Yin

Keywords: optimization

14:39

06/12/2020

An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

Julian Katz-Samuels, Lalit Jain, Zohar Karnin, Kevin Jamieson

3:20

26/08/2020

DAve-QN: A Distributed Averaged Quasi-Newton Method with Local Superlinear Convergence Rate

Saeed Soori, Konstantin Mishchenko, Aryan Mokhtari, Maryam Mehri Dehnavi, Mert Gurbuzbalaban

8:45

13/04/2021

One-round communication efficient distributed m-estimation

Yajie Bao, Weijia Xiong


3:00

06/12/2021

Kernel Functional Optimisation

Arun Kumar Anjanapura Venkatesh, Alistair Shilton, Santu Rana, Sunil Gupta, Svetha Venkatesh

Keywords: machine learning, kernel methods

12:48

06/12/2020

Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses

Raef Bassily, Vitaly Feldman, Cristóbal Guzmán, Kunal Talwar


3:11

06/12/2020

Distributionally Robust Federated Averaging

Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi


3:11

19/08/2021

GSPL: A Succinct Kernel Model for Group-Sparse Projections Learning of Multiview Data

Danyang Wu, Jin Xu, Xia Dong, Meng Liao, Rong Wang, Feiping Nie, Xuelong Li

Keywords: Machine Learning, Learning Sparse Models, Multi-instance; Multi-label; Multi-view learning, Unsupervised Learning

11:48

18/07/2021

Active Slices for Sliced Stein Discrepancy

Wenbo Gong, Kaibo Zhang, Yingzhen Li, Jose Miguel Hernandez-Lobato

Keywords: Deep Learning, Efficient Inference Methods, Algorithms, Kernel Methods

5:47

06/12/2021

Averaging on the Bures-Wasserstein manifold: dimension-free convergence of gradient descent

Jason Altschuler, Sinho Chewi, Patrik R Gerber, Austin Stromme

Keywords: optimization, optimal transport

15:03