Random Reshuffling is Not Always Better

Abstract: Many learning algorithms, such as stochastic gradient descent, are affected by the order in which training examples are used. It is often observed that sampling the training examples without replacement, also known as random reshuffling, causes learning algorithms to converge faster. We give a counterexample to the Operator Inequality of Noncommutative Arithmetic and Geometric Means, a longstanding conjecture that relates to the performance of random reshuffling in learning algorithms (Recht and Ré, "Toward a noncommutative arithmetic-geometric mean inequality: conjectures, case-studies, and consequences," COLT 2012). We use this to give an example of a learning task and algorithm for which with-replacement random sampling actually outperforms random reshuffling.
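
The two sampling schemes contrasted in the abstract can be sketched as follows. This is only an illustration of how the example orders differ, not the paper's construction; the function names and the `rng` parameter are hypothetical, and each "epoch" is assumed to consist of n draws from n training examples.

```python
import random


def with_replacement_order(n, epochs, rng):
    """Draw n indices per epoch independently and uniformly; repeats are allowed."""
    return [[rng.randrange(n) for _ in range(n)] for _ in range(epochs)]


def random_reshuffling_order(n, epochs, rng):
    """Visit every index exactly once per epoch, using a fresh random permutation."""
    orders = []
    for _ in range(epochs):
        perm = list(range(n))
        rng.shuffle(perm)  # in-place uniform shuffle
        orders.append(perm)
    return orders


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so the illustration is repeatable
    print("with replacement:  ", with_replacement_order(5, 2, rng))
    print("random reshuffling:", random_reshuffling_order(5, 2, rng))
```

Under random reshuffling every example is used exactly once per epoch, while with-replacement sampling may repeat some examples and skip others within the same epoch.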

26/08/2020

Christopher De Sa

Similar Papers

Revisiting Stochastic Extragradient

Konstantin Mishchenko, Dmitry Kovalev, Egor Shulgin, Peter Richtarik, Yura Malitsky

Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods

Taiji Suzuki, Akiyama Shunta

Keywords: local Rademacher complexity, minimax optimal rate, excess risk, linear estimator, kernel method, fast learning rate

Robust Density Estimation from Batches: The Best Things in Life are (Nearly) Free

Ayush Jain, Alon Orlitsky

Keywords: Theory, Statistical Learning Theory

Large deviations for the perceptron model and consequences for active learning

Hugo Cui, Luca Saglietti, Lenka Zdeborova

Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation

Yuxuan Song, Ning Miao, Hao Zhou, Lantao Yu, Mingxuan Wang, Lei Li

How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks

Keyulu Xu, Mozhi Zhang, Jingling Li, Simon Du, Ken-Ichi Kawarabayashi, Stefanie Jegelka

Keywords: graph neural networks, out-of-distribution, deep learning, extrapolation, deep learning theory

Gradient descent follows the regularization path for general losses

Ziwei Ji, Miroslav Dudik, Robert Schapire, Matus Telgarsky

Keywords: Loss functions, Classification, Convex optimization

Reconciling Modern Deep Learning with Traditional Optimization Analyses: The Intrinsic Learning Rate

Zhiyuan Li, Kaifeng Lyu, Sanjeev Arora

Linear-Sample Learning of Low-Rank Distributions

Ayush Jain, Alon Orlitsky

Theoretical bounds on estimation error for meta-learning

James Lucas, Mengye Ren, Irene Raissa KAMENI KAMENI, Toniann Pitassi, Richard Zemel

Keywords: meta learning, minimax risk, few-shot, lower bounds, learning theory

Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks

Zhishuai Guo, Mingrui Liu, Zhuoning Yuan, Li Shen, Wei Liu, Tianbao Yang

Keywords: Optimization - Large Scale, Parallel and Distributed

Finite-Time Analysis for Double Q-learning

Huaqing Xiong, Lin Zhao, Yingbin Liang, Wei Zhang

Keywords: Deep Learning -> Embedding Approaches, Applications -> Natural Language Processing

Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces

Kirill Struminsky, Artyom Gadetsky, Denis Rakitin, Danil Karpushkin, Dmitry Vetrov

Keywords: deep learning, optimization

Don't Waste Your Bits! Squeeze Activations and Gradients for Deep Neural Networks via TinyScript

Fangcheng Fu, Yuzheng Hu, Yihan He, Jiawei Jiang, Yingxia Shao, Ce Zhang, Bin Cui

Keywords: Optimization - Large Scale, Parallel and Distributed

Asynchronous Gibbs Sampling

Alexander Terenin, Daniel Simpson, David Draper

Training Recurrent Neural Networks Online by Learning Explicit State Variables

Somjit Nath, Vincent Liu, Alan Chan, Xin Li, Adam White, Martha White

Keywords: Recurrent Neural Network, Partial Observability, Online Prediction, Incremental Learning

Learning with Optimized Random Features: Exponential Speedup by Quantum Machine Learning without Sparsity and Low-Rank Assumptions

Hayata Yamasaki, Sathyawageeswar Subramanian, Sho Sonoda, Masato Koashi

Efficiently learning structured distributions from untrusted batches

Sitan Chen, Jerry Li, Ankur Moitra

Keywords: sum-of-squares, federated learning, VC complexity, robust statistics

Fast Rates for Structured Prediction

Vivien A Cabannes, Francis Bach, Alessandro Rudi

At Stability's Edge: How to Adjust Hyperparameters to Preserve Minima Selection in Asynchronous Training of Neural Networks?

Niv Giladi, Mor Shpigel Nacson, Elad Hoffer, Daniel Soudry

Keywords: implicit bias, stability, neural networks, generalization gap, asynchronous SGD

A Minimalist Approach to Offline Reinforcement Learning

Scott Fujimoto, Shixiang (Shane) Gu

Keywords: reinforcement learning and planning, generative model
