Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with √T Regret

18/07/2021

Asaf Cassel, Tomer Koren

Keywords: Theory, RL, Decisions and Control Theory

Abstract: We consider the task of learning to control a linear dynamical system under fixed quadratic costs, known as the Linear Quadratic Regulator (LQR) problem. While model-free approaches are often favorable in practice, thus far only model-based methods, which rely on costly system identification, have been shown to achieve regret that scales with the optimal dependence on the time horizon T. We present the first model-free algorithm that achieves similar regret guarantees. Our method relies on an efficient policy gradient scheme, and a novel and tighter analysis of the cost of exploration in policy space in this setting.
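To make the setting concrete, the following is a minimal sketch of model-free policy search on a toy scalar LQR instance. It is not the algorithm from the paper: it uses a generic two-point (finite-difference) gradient estimate, and every name and constant in it (the system coefficients a and b, the cost weights q and r, the step size, the perturbation delta, the horizon) is a hypothetical choice made for illustration. The learner only observes costs from rollouts; the dynamics stay hidden inside the simulator.

```python
import random


def average_cost(k, a=0.9, b=0.5, q=1.0, r=1.0, horizon=2000, seed=0):
    """Empirical average cost of the linear policy u = -k * x.

    Simulates the scalar system x' = a*x + b*u + w with unit-variance
    Gaussian noise w and per-step cost q*x^2 + r*u^2.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    x, total = 0.0, 0.0
    for _ in range(horizon):
        u = -k * x
        total += q * x * x + r * u * u
        x = a * x + b * u + rng.gauss(0.0, 1.0)
    return total / horizon


def two_point_gradient(k, delta=0.05, seed=0):
    """Finite-difference estimate of d(cost)/dk from two rollouts.

    Both rollouts share the same noise seed (common random numbers),
    which reduces the variance of the difference.
    """
    up = average_cost(k + delta, seed=seed)
    down = average_cost(k - delta, seed=seed)
    return (up - down) / (2.0 * delta)


def policy_gradient_descent(k0=0.0, step=0.02, iterations=100):
    """Gradient descent on the gain k using only observed rollout costs."""
    k = k0
    for i in range(iterations):
        k -= step * two_point_gradient(k, seed=i)
    return k


if __name__ == "__main__":
    k = policy_gradient_descent()
    print("learned gain:", k)
    print("cost at k=0:", average_cost(0.0, seed=123))
    print("cost at learned gain:", average_cost(k, seed=123))
```

Each gradient step here spends two full rollouts on perturbed policies. That is the kind of exploration cost in policy space that the abstract refers to; the sketch makes no attempt to control it or to achieve any particular regret rate.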

The talk and the corresponding paper were published at the ICML 2021 virtual conference.

Similar Papers

06/12/2020

PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning

Alekh Agarwal, Mikael Henaff, Sham Kakade, Wen Sun

3:13

06/12/2021

Offline Constrained Multi-Objective Reinforcement Learning via Pessimistic Dual Value Iteration

Runzhe Wu, Yufeng Zhang, Zhuoran Yang, Zhaoran Wang

Keywords: reinforcement learning and planning

13:40

06/12/2021

Derivative-Free Policy Optimization for Linear Risk-Sensitive and Robust Control Design: Implicit Regularization and Sample Complexity

Kaiqing Zhang, Xiangyuan Zhang, Bin Hu, Tamer Basar

Keywords: theory, optimization, reinforcement learning and planning

15:57

12/07/2020

Responsive Safety in Reinforcement Learning

Adam Stooke, Joshua Achiam, Pieter Abbeel

Keywords: Reinforcement Learning - Deep RL

13:36

02/02/2021

Sample Efficient Reinforcement Learning with REINFORCE

Junzi Zhang, Jongho Kim, Brendan O'Donoghue, Stephen Boyd

20:13

18/07/2021

EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL

Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Gu

Keywords: Reinforcement Learning and Planning

5:54

06/12/2020

Off-Policy Imitation Learning from Observations

Zhuangdi Zhu, Kaixiang Lin, Bo Dai, Jiayu Zhou

3:24

18/07/2021

Provably Correct Optimization and Exploration with Non-linear Policies

Fei Feng, Wotao Yin, Alekh Agarwal, Lin Yang

Keywords: Deep Learning, Adversarial Networks, Applications, Fairness, Accountability, and Transparency, Theory, RL, Decisions and Control Theory

5:03

13/04/2021

Online model selection for reinforcement learning with function approximation

Jonathan Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill

3:15

06/12/2020

Dynamic Regret of Policy Optimization in Non-Stationary Environments

Yingjie Fei, Zhuoran Yang, Zhaoran Wang, Qiaomin Xie

2:41

06/12/2021

Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning

Aurelien Bibaut, Nathan Kallus, Maria Dimakopoulou, Antoine Chambaz, Mark van der Laan

Keywords: theory, reinforcement learning and planning, machine learning, bandits

16:07

09/07/2020

Provably Efficient Reinforcement Learning with Linear Function Approximation

Chi Jin, Zhuoran Yang, Zhaoran Wang, Michael Jordan

Keywords: Reinforcement learning

13:04

18/07/2021

A Distribution-dependent Analysis of Meta Learning

Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvari

Keywords: Theory, Statistical Learning Theory

5:06

09/07/2020

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes

Alekh Agarwal, Sham Kakade, Jason Lee, Gaurav Mahajan

Keywords: Reinforcement learning, Non-convex optimization

11:00

06/12/2020

Adaptive Sampling for Stochastic Risk-Averse Learning

Sebastian Curi, Kfir Y. Levy, Stefanie Jegelka, Andreas Krause

3:13

18/07/2021

Policy Gradient Bayesian Robust Optimization for Imitation Learning

Zaynah Javed, Daniel Brown, Satvik Sharma, Jerry Zhu, Ashwin Balakrishna, Marek Petrik, Anca Dragan, Ken Goldberg

Keywords: Social Aspects of Machine Learning, AI Safety

5:10

03/05/2021

Modeling the Second Player in Distributionally Robust Optimization

Paul Michel, Tatsunori Hashimoto, Graham Neubig

Keywords: adversarial learning, deep learning, robustness, distributionally robust optimization

5:09

06/12/2020

Differentiable Meta-Learning of Bandit Policies

Craig Boutilier, Chih-wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

3:10

18/07/2021

Is Pessimism Provably Efficient for Offline RL?

Ying Jin, Zhuoran Yang, Zhaoran Wang

Keywords: Reinforcement Learning and Planning, Others

5:17

14/06/2020

Cogradient Descent for Bilinear Optimization

Li'an Zhuo, Baochang Zhang, Linlin Yang, Hanlin Chen, Qixiang Ye, David Doermann, Rongrong Ji, Guodong Guo

Keywords: bilinear optimization, gradient descent algorithm, convolutional sparse coding, network pruning

1:01

06/12/2020

Geometric Exploration for Online Control

Orestis Plevrakis, Elad Hazan

3:21

12/07/2020

Tightening Exploration in Upper Confidence Reinforcement Learning

Hippolyte Bourel, Odalric-Ambrym Maillard, Mohammad Sadegh Talebi

Keywords: Reinforcement Learning - General

16:14

06/12/2021

Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses

Haipeng Luo, Chen-Yu Wei, Chung-Wei Lee

Keywords: optimization, reinforcement learning and planning, bandits

15:17

02/02/2021

Stable Adversarial Learning under Distributional Shifts

Jiashuo Liu, Zheyan Shen, Peng Cui, Linjun Zhou, Kun Kuang, Bo Li, Yishi Lin

14:30

03/05/2021

Blending MPC & Value Function Approximation for Efficient Reinforcement Learning

Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots

Keywords: reinforcement learning, model-predictive control

5:09

06/12/2021

Joint inference and input optimization in equilibrium networks

Swaminathan Gurumurthy, Shaojie Bai, Zachary Manchester, J. Zico Kolter

Keywords: deep learning, optimization, adversarial robustness and security, generative model, meta learning

15:03

14/09/2020

Escaping Saddle Points of Empirical Risk Privately and Scalably via DP-Trust Region Method

Di Wang, Jinhui Xu

Keywords: differential privacy, empirical risk minimization, private machine learning

15:13

26/04/2020

Ranking Policy Gradient

Kaixiang Lin, Jiayu Zhou

Keywords: Sample-efficient reinforcement learning, off-policy learning

5:43

06/12/2021

Towards Robust Bisimulation Metric Learning

Mete Kemertas, Tristan Aumentado-Armstrong

Keywords: reinforcement learning and planning, robustness, representation learning

12:24

02/02/2021

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Stefanos Leonardos, Georgios Piliouras

20:17

02/02/2021

Deep Bayesian Quadrature Policy Optimization

Ravi Tej Akella, Kamyar Azizzadenesheli, Mohammad Ghavamzadeh, Animashree Anandkumar, Yisong Yue

15:39

04/08/2021

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

20:29

16/11/2020

Safe Policy Learning for Continuous Control

Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Dueñez-Guzman, Mohammad Ghavamzadeh

5:20

06/12/2020

Stage-wise Conservative Linear Bandits

Ahmadreza Moradipari, Christos Thrampoulidis, Mahnoosh Alizadeh

3:18

18/07/2021

Guarantees for Tuning the Step Size using a Learning-to-Learn Approach

Xiang Wang, Shuai Yuan, Chenwei Wu, Rong Ge

Keywords: Theory, Computational Learning Theory

5:20

06/12/2020

Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration

Andrea Zanette, Alessandro Lazaric, Mykel J Kochenderfer, Emma Brunskill

3:11

06/12/2021

Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD

Rémi Bardenet, Subhroshekhar Ghosh, Meixia LIN

Keywords: optimization, machine learning

14:51

12/07/2020

Extrapolation for Large-batch Training in Deep Learning

Tao LIN, Lingjing Kong, Sebastian Stich, Martin Jaggi

Keywords: Deep Learning - Algorithms

13:21

26/04/2020

CAQL: Continuous Action Q-Learning

Moonkyung Ryu, Yinlam Chow, Ross Anderson, Christian Tjandraatmadja, Craig Boutilier

Keywords: Reinforcement learning (RL), DQN, Continuous control, Mixed-Integer Programming (MIP)

5:36

13/04/2021

Power of hints for online learning with movement costs

Aditya Bhaskara, Ashok Cutkosky, Ravi Kumar, Manish Purohit

3:19