06/12/2020

Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

Raphaël Berthier, Francis Bach, Pierre Gaillard

Keywords: Optimization -> Non-Convex Optimization, Deep Learning -> Optimization for Deep Networks

Abstract: In the context of statistical supervised learning, the noiseless linear model assumes that there exists a deterministic linear relation $Y = \langle \theta_*, \Phi(U) \rangle$ between the random output $Y$ and the random feature vector $\Phi(U)$, a potentially non-linear transformation of the inputs~$U$. We analyze the convergence of single-pass, fixed step-size stochastic gradient descent on the least-square risk under this model. The convergence of the iterates to the optimum $\theta_*$ and the decay of the generalization error follow polynomial convergence rates with exponents that both depend on the regularities of the optimum $\theta_*$ and of the feature vectors $\Phi(U)$. We interpret our result in the reproducing kernel Hilbert space framework. As a special case, we analyze an online algorithm for estimating a real function on the unit hypercube from the noiseless observation of its value at randomly sampled points; the convergence depends on the Sobolev smoothness of the function and of a chosen kernel. Finally, we apply our analysis beyond the supervised learning setting to obtain convergence rates for the averaging process (a.k.a. gossip algorithm) on a graph depending on its spectral dimension.
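Below is a minimal sketch, in Python/NumPy, of the recursion the abstract describes: single-pass, constant step-size SGD on the least-squares risk, run on synthetic data satisfying the noiseless linear model $Y = \langle \theta_*, \Phi(U) \rangle$. The dimension, step size, covariance spectrum, and optimum are illustrative assumptions (chosen so that both $\theta_*$ and the features have polynomially decaying regularity), not values taken from the paper.

```python
import numpy as np

# Sketch of single-pass, constant step-size SGD on the least-squares risk
# under the noiseless linear model y = <theta_star, phi(u)>.
# All problem sizes and constants below are illustrative assumptions.

rng = np.random.default_rng(0)

d = 200        # feature dimension (assumed)
n = 10_000     # number of samples, seen once each (assumed)
gamma = 0.2    # constant step size; should roughly satisfy gamma * E||Phi(U)||^2 < 1

# Synthetic features with a polynomially decaying covariance spectrum and a
# "smooth" optimum, mimicking the regularity assumptions of the analysis.
eigvals = (1.0 + np.arange(d)) ** -2.0       # covariance eigenvalues (assumed decay)
theta_star = (1.0 + np.arange(d)) ** -1.5    # optimum coefficients (assumed decay)

theta = np.zeros(d)
dist_to_opt, gen_error = [], []

for _ in range(n):
    phi = rng.standard_normal(d) * np.sqrt(eigvals)   # feature vector Phi(U)
    y = theta_star @ phi                              # noiseless output
    # SGD step on the instantaneous loss (y - <theta, phi>)^2 / 2
    theta += gamma * (y - theta @ phi) * phi
    dist_to_opt.append(np.sum((theta - theta_star) ** 2))        # ||theta - theta_*||^2
    gen_error.append(np.sum(eigvals * (theta - theta_star) ** 2))  # excess least-squares risk

print(f"||theta - theta_*||^2 after one pass: {dist_to_opt[-1]:.3e}")
print(f"generalization error after one pass:  {gen_error[-1]:.3e}")
```

Plotting dist_to_opt and gen_error against the iteration count on a log-log scale should display the polynomial decay the abstract refers to, with exponents governed by the assumed regularities of theta_star and of the feature covariance.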

Published at NeurIPS 2020 (virtual conference).
