Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation

Abstract: We consider the question of learning the $Q$-function in a sample-efficient manner for reinforcement learning with continuous state and action spaces under a generative model. If the $Q$-function is Lipschitz continuous, then the minimal sample complexity for estimating an $\epsilon$-optimal $Q$-function is known to scale as $\Omega(\frac{1}{\epsilon^{d_1+d_2+2}})$ per classical non-parametric learning theory, where $d_1$ and $d_2$ denote the dimensions of the state and action spaces respectively. The $Q$-function, when viewed as a kernel, induces a Hilbert-Schmidt operator and hence possesses a square-summable spectrum. This motivates us to consider a parametric class of $Q$-functions parameterized by its "rank" $r$, which contains all Lipschitz $Q$-functions as $r\to\infty$. As our key contribution, we develop a simple, iterative learning algorithm that finds an $\epsilon$-optimal $Q$-function with sample complexity $\widetilde{O}(\frac{1}{\epsilon^{\max(d_1, d_2)+2}})$ when the optimal $Q$-function has low rank $r$ and the discount factor $\gamma$ is below a certain threshold. This is an exponential improvement in sample complexity: the exponent of $\frac{1}{\epsilon}$ drops from $d_1+d_2+2$ to $\max(d_1, d_2)+2$. To enable our result, we develop a novel Matrix Estimation algorithm that faithfully estimates an unknown low-rank matrix in the $\ell_\infty$ sense even in the presence of arbitrary bounded noise, which might be of interest in its own right. Empirical results on several stochastic control tasks confirm the efficacy of our "low-rank" algorithms.
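To make the gap between the two rates in the abstract concrete, the following is a minimal sketch that compares only the leading-order scalings $\frac{1}{\epsilon^{d_1+d_2+2}}$ and $\frac{1}{\epsilon^{\max(d_1, d_2)+2}}$. It ignores constants, logarithmic factors, and any dependence on $r$ or $\gamma$, none of which the abstract specifies. The function names and the example values of `d1`, `d2`, and `eps` are illustrative assumptions, not part of the paper.

```python
def lipschitz_exponent(d1: int, d2: int) -> int:
    """Exponent of 1/eps in the Omega(1/eps^(d1+d2+2)) lower bound from the abstract."""
    return d1 + d2 + 2


def low_rank_exponent(d1: int, d2: int) -> int:
    """Exponent of 1/eps in the O~(1/eps^(max(d1,d2)+2)) rate from the abstract."""
    return max(d1, d2) + 2


def sample_scale(eps: float, exponent: int) -> float:
    """Leading-order scaling (1/eps)^exponent; constants and log factors are dropped."""
    return (1.0 / eps) ** exponent


if __name__ == "__main__":
    # Hypothetical problem sizes chosen only for illustration.
    d1, d2, eps = 3, 3, 0.1
    general = sample_scale(eps, lipschitz_exponent(d1, d2))
    low_rank = sample_scale(eps, low_rank_exponent(d1, d2))
    print(f"Lipschitz scaling: {general:.3g}")
    print(f"Low-rank scaling:  {low_rank:.3g}")
    print(f"Ratio:             {general / low_rank:.3g}")
```

The ratio of the two scalings is $(1/\epsilon)^{\min(d_1, d_2)}$, so under these assumptions the saving grows exponentially in the smaller of the two dimensions.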

06/12/2020

Devavrat Shah, Dogyoon Song, Zhi Xu, Yuzhe Yang

Similar Papers

Fourier Sparse Leverage Scores and Approximate Kernel Learning

Tamas Erdelyi, Cameron Musco, Christopher Musco

Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model

Raphaël Berthier, Francis Bach, Pierre Gaillard

Optimization -> Non-Convex Optimization, Deep Learning -> Optimization for Deep Networks

Nearly Minimax Optimal Reinforcement Learning for Linear Mixture MDPs

Dongruo Zhou, Quanquan Gu, Csaba Szepesvari

Provably Efficient Reinforcement Learning with Kernel and Neural Function Approximations

Zhuoran Yang, Chi Jin, Zhaoran Wang, Mengdi Wang, Michael Jordan

Statistical-Query Lower Bounds via Functional Gradients

Surbhi Goel, Aravind Gollakota, Adam Klivans

Combinatorial Pure Exploration with Full-Bandit or Partial Linear Feedback

Yihan Du, Yuko Kuroki, Wei Chen

Robust Implicit Networks via Non-Euclidean Contractions

Saber Jafarpour, Alexander Davydov, Anton Proskurnikov, Francesco Bullo

theory, deep learning, machine learning, robustness, vision

Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: Joint Gradient Estimation and Tracking

Haoran Sun, Songtao Lu, Mingyi Hong

Optimization - Non-convex

A Reduction from Reinforcement Learning to No-Regret Online Learning

Ching-An Cheng, Remi Tachet des Combes, Byron Boots, Geoff Gordon

Exponential Weights Algorithms for Selective Learning

Mingda Qiao, Gregory Valiant

Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems

Luo Luo, Haishan Ye, Zhichao Huang, Tong Zhang

A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions

Wei Deng, Guang Lin, Faming Liang

Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators

Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam

Nearly Horizon-Free Offline Reinforcement Learning

Tongzheng Ren, Jialian Li, Bo Dai, Simon Du, Sujay Sanghavi

theory, optimization, reinforcement learning and planning

A Single-Loop Smoothed Gradient Descent-Ascent Algorithm for Nonconvex-Concave Min-Max Problems

Jiawei Zhang, Peijun Xiao, Ruoyu Sun, Zhiquan Luo

Learning Polynomials in Few Relevant Dimensions

Sitan Chen, Raghu Meka

Regression, Convex optimization, High-dimensional statistics, Non-convex optimization

Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction

Gen Li, Yuting Wei, Yuejie Chi, Yuantao Gu, Yuxin Chen

Outlier-Robust Learning of Ising Models Under Dobrushin's Condition

Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart, Yuxin Sun

Towards a Unified Information-Theoretic Framework for Generalization

Mahdi Haghifam, Gintare Karolina Dziugaite, Shay Moran, Dan Roy

graph learning

PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization

Zhize Li, Hongyan Bao, Xiangliang Zhang, Peter Richtarik

Optimization

Estimating Principal Components under Adversarial Perturbations

Pranjal Awasthi, Xue Chen, Aravindan Vijayaraghavan

Unsupervised and semi-supervised learning, Adversarial learning and robustness

From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model

Aadirupa Saha, Aditya Gopalan

Online Learning, Active Learning, and Bandits

Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II
