A Rule for Gradient Estimator Selection, with an Application to Variational Inference

Abstract: Stochastic gradient descent (SGD) is the workhorse of modern machine learning. Sometimes, many different gradient estimators can be used; in that case, choosing the one with the best tradeoff between cost and variance is important. This paper analyzes the convergence rates of SGD as a function of time rather than iterations. This yields a simple rule for selecting the estimator that leads to the best optimization convergence guarantee. The choice is the same across different variants of SGD and under different assumptions about the objective (e.g., convexity or smoothness). Inspired by this principle, we propose a technique to automatically select an estimator when a finite pool of estimators is given. We then extend it to infinite pools of estimators, where each estimator is indexed by control variate weights. Empirically, automatically choosing an estimator performs comparably to the best estimator chosen in hindsight.
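The cost/variance tradeoff described in the abstract can be illustrated with a small sketch. The abstract does not spell out the rule, so the scoring used here is an assumption: if a convergence guarantee after N iterations depends on a bound on the expected squared gradient norm, and a fixed time budget allows N = budget / cost iterations, then comparing estimators by (estimated expected squared norm) x (cost per call) is one plausible way to rank them. The function `select_estimator`, the `(grad_fn, cost)` interface, and the toy estimators `noisy` and `averaged` are hypothetical names introduced only for this example.

```python
import numpy as np

def select_estimator(estimators, theta, n_probe=1000, seed=0):
    """Return the name of the estimator with the smallest score, plus all scores.

    estimators: dict mapping name -> (grad_fn, cost). grad_fn(theta, rng) returns
    one stochastic gradient; cost is the time per call (hypothetical interface).
    Score (an assumption, see lead-in): mean squared gradient norm * cost.
    """
    rng = np.random.default_rng(seed)
    scores = {}
    for name, (grad_fn, cost) in estimators.items():
        # Monte Carlo estimate of E||g||^2 at the current parameters.
        sq_norms = [np.sum(grad_fn(theta, rng) ** 2) for _ in range(n_probe)]
        scores[name] = float(np.mean(sq_norms)) * cost
    best = min(scores, key=scores.get)
    return best, scores

# Toy objective f(theta) = 0.5 * ||theta||^2, whose true gradient is theta.
def noisy(theta, rng):
    # Unbiased gradient with additive Gaussian noise (standard deviation 3).
    return theta + rng.normal(scale=3.0, size=theta.shape)

def averaged(theta, rng):
    # Average of 4 noisy draws: 4x the cost, one quarter of the noise variance.
    return np.mean([noisy(theta, rng) for _ in range(4)], axis=0)

theta0 = np.ones(5)
pool = {"single": (noisy, 1.0), "avg4": (averaged, 4.0)}
best, scores = select_estimator(pool, theta0)
print(best, scores)
```

In this toy setup the expected scores are 50 for `single` (squared norm 5 + 45, cost 1) and 65 for `avg4` (squared norm 5 + 11.25, cost 4), so averaging does not pay for its extra cost here. With costs measured by wall-clock timing instead of fixed constants, the ranking could differ.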

26/04/2020

Tomas Geffner, Justin Domke

Similar Papers

Kernelized Wasserstein Natural Gradient

M Arbel, A Gretton, W Li, G Montufar

Keywords: kernel methods, natural gradient, information geometry, Wasserstein metric

Slice Sampling Reparameterization Gradients

David M Zoltowski, Diana Cai, Ryan Adams

Keywords: optimization, machine learning, generative model

A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions

Wei Deng, Guang Lin, Faming Liang

Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization

Kenji Kawaguchi, Haihao Lu

Greed Meets Sparsity: Understanding and Improving Greedy Coordinate Descent for Sparse Optimization

Huang Fang, Zhenan Fan, Yifan Sun, Michael Friedlander

Fair regression via plug-in estimator and recalibration with statistical guarantees

Evgenii Chzhen, Christophe Denis, Mohamed Hebiri, Luca Oneto, Massimiliano Pontil

One Sample Stochastic Frank-Wolfe

Mingrui Zhang, Zebang Shen, Aryan Mokhtari, Hamed Hassani, Amin Karbasi

Super-efficiency of automatic differentiation for functions defined as a minimum

Pierre Ablin, Gabriel Peyré, Thomas Moreau

Keywords: Optimization - General

Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD

Rémi Bardenet, Subhroshekhar Ghosh, Meixia LIN

Keywords: optimization, machine learning

Parabolic Approximation Line Search for DNNs

Maximus Mutschler, Andreas Zell

Learning Interpretable Decision Rule Sets: A Submodular Optimization Approach

Fan Yang, Kai He, Linxiao Yang, Hongxia Du, Jingbang Yang, Bo Yang, Liang Sun

Keywords: optimization

How Data Augmentation affects Optimization for Linear Regression

Boris Hanin, Yi Sun

Keywords: optimization, machine learning

Fast convergence of stochastic subgradient method under interpolation

Huang Fang, Zhenan Fan, Michael Friedlander

Keywords: interpolation, stochastic subgradient method, convergence analysis, optimization

Model Selection for Bayesian Autoencoders

Ba-Hien Tran, Simone Rossi, Dimitrios Milios, Pietro Michiardi, Edwin Bonilla, Maurizio Filippone

Keywords: optimization, self-supervised learning, generative model, representation learning

Bayesian Optimization of Function Networks

Raul Astudillo, Peter Frazier

Keywords: optimization, reinforcement learning and planning, kernel methods

On the Bias-Variance-Cost Tradeoff of Stochastic Optimization

Yifan Hu, Xin Chen, Niao He

Keywords: theory, optimization, machine learning

Differentiable greedy algorithm for monotone submodular maximization: Guarantees, gradient estimators, and applications

Shinsaku Sakaue

Distributionally Robust Federated Averaging

Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi

Stability and Generalization for Randomized Coordinate Descent

Puyu Wang, Liang Wu, Yunwen Lei

Keywords: Machine Learning, Learning Theory, Online Learning

Stochastic Optimization for Non-convex Inf-Projection Problems

Yan Yan, Yi Xu, Lijun Zhang, Wang Xiaoyu, Tianbao Yang

Keywords: Optimization - Non-convex

Sinkhorn Barycenter via Functional Gradient Descent

Zebang Shen, Zhenfu Wang, Alejandro Ribeiro, Hamed Hassani

Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Guy Lorberbom, Chris J. Maddison, Nicolas Heess, Tamir Hazan, Daniel Tarlow
