Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees

18/07/2021

Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees

Kishan Panaganti, Dileep Kalathil

Keywords: Theory, RL, Decisions and Control Theory

Abstract Paper Similar Papers

Abstract: This paper addresses the problem of model-free reinforcement learning for Robust Markov Decision Process (RMDP) with large state spaces. The goal of the RMDPs framework is to find a policy that is robust against the parameter uncertainties due to the mismatch between the simulator model and real-world settings. We first propose the Robust Least Squares Policy Evaluation algorithm, which is a multi-step online model-free learning algorithm for policy evaluation. We prove the convergence of this algorithm using stochastic approximation techniques. We then propose Robust Least Squares Policy Iteration (RLSPI) algorithm for learning the optimal robust policy. We also give a general weighted Euclidean norm bound on the error (closeness to optimality) of the resulting policy. Finally, we demonstrate the performance of our RLSPI algorithm on some benchmark problems from OpenAI Gym.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at ICML 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

02/02/2021

Robust Reinforcement Learning: A Case Study in Linear Quadratic Regulation

Bo Pang, Zhong-Ping Jiang

Keywords Paper

0

0

0

0

20:01

18/07/2021

Provably Efficient Algorithms for Multi-Objective Competitive RL

Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra

Keywords Paper

Theory, RL, Decisions and Control Theory

0

0

0

0

17:04

06/12/2020

Robust Multi-Agent Reinforcement Learning with Model Uncertainty

Kaiqing Zhang, TAO SUN, Yunzhe Tao and
Sahika Genc, Sunil Mallya, Tamer Basar

Keywords Paper

0

0

0

0

3:11

06/12/2021

Adversarial Intrinsic Motivation for Reinforcement Learning

Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone

Keywords Paper

reinforcement learning and planning, generative model

0

0

0

0

13:11

06/12/2021

Online Robust Reinforcement Learning with Model Uncertainty

Yue Wang, Shaofeng Zou

Keywords Paper

reinforcement learning and planning, robustness

0

0

0

0

14:45

26/08/2020

Finite-Time Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation

Jun Sun, Gang Wang, Georgios B. Giannakis and
Qinmin Yang, Zaiyue Yang

Keywords Paper

0

0

0

0

17:07

18/07/2021

Density Constrained Reinforcement Learning

Zengyi Qin, Yuxiao Chen, Chuchu Fan

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

4:50

06/12/2020

A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions

Wei Deng, Guang Lin, Faming Liang

Keywords Paper

0

0

0

0

3:26

18/07/2021

A Sharp Analysis of Model-based Reinforcement Learning with Self-Play

Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin

Keywords Paper

Algorithms, Multitask and Transfer Learning, Algorithms, Meta-Learning; Applications, Object Recognition; Data, Challenges, Implementations, and Software, Benchmarks;, Theory, RL, Decisions and Control Theory

0

0

0

0

4:49

13/04/2021

Logistic q-learning

Joan Bas-Serrano, Sebastian Curi, Andreas Krause, Gergely Neu

Keywords Paper

0

0

0

0

2:44

13/04/2021

Provably efficient safe exploration via primal-dual policy optimization

Dongsheng Ding, Xiaohan Wei, Zhuoran Yang and
Zhaoran Wang, Mihailo Jovanovic

Keywords Paper

0

0

0

0

3:07

18/07/2021

Adaptive Sampling for Best Policy Identification in Markov Decision Processes

Aymen Al Marjani, Alexandre Proutiere

Keywords Paper

Theory, RL, Decisions and Control Theory

0

0

0

0

5:35

02/02/2021

Loop Estimator for Discounted Values in Markov Reward Processes

Falcon Z. Dai, Matthew R. Walter

Keywords Paper

0

0

0

0

21:51

06/12/2020

On the Stability and Convergence of Robust Adversarial Reinforcement Learning: A Case Study on Linear Quadratic Systems

Kaiqing Zhang, Bin Hu, Tamer Basar

Keywords Paper

0

0

0

0

3:22

06/12/2020

Generalization Bound of Gradient Descent for Non-Convex Metric Learning

MINGZHI DONG, Xiaochen Yang, Rui Zhu and
Yujiang Wang, Jing-Hao Xue

Keywords Paper

0

0

0

0

3:18

06/12/2020

Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning

Fei Feng, Ruosong Wang, Wotao Yin and
Simon Du, Lin Yang

Keywords Paper

Reinforcement Learning and Planning -> Decision and Control, Probabilistic Methods -> Gaussian Processes

0

0

0

0

3:11

06/12/2021

A Max-Min Entropy Framework for Reinforcement Learning

Seungyul Han, Youngchul Sung

Keywords Paper

optimization, reinforcement learning and planning

0

0

0

0

14:35

13/04/2021

Finite-sample regret bound for distributionally robust offline tabular reinforcement learning

Zhengqing Zhou, Zhengyuan Zhou, Qinxun Bai and
Linhai Qiu, Jose Blanchet, Peter Glynn

Keywords Paper

0

0

0

0

3:02

06/12/2021

Towards a Unified Information-Theoretic Framework for Generalization

Mahdi Haghifam, Gintare Karolina Dziugaite, Shay Moran, Dan Roy

Keywords Paper

graph learning

0

0

0

0

11:51

18/07/2021

Model-based Reinforcement Learning for Continuous Control with Posterior Sampling

Ying Fan, Yifei Ming

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

18:34

09/07/2020

Optimality and Approximation with Policy Gradient Methods in Markov Decision Processes

Alekh Agarwal, Sham Kakade, Jason Lee, Gaurav Mahajan

Keywords Paper

Reinforcement learning, Non-convex optimization

0

0

0

0

11:00

26/08/2020

Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration

Matteo Papini, Andrea Battistello, Marcello Restelli

Keywords Paper

0

0

0

0

12:47

06/12/2020

Deep Inverse Q-learning with Constraints

Gabriel Kalweit, Maria Huegle, Moritz Werling, Joschka Boedecker

Keywords Paper

0

0

0

0

3:14

19/08/2021

Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts

Weinan Zhang, Xihuai Wang, Jian Shen, Ming Zhou

Keywords Paper

Machine Learning, Deep Reinforcement Learning, Multi-agent Learning

0

0

0

0

13:10

06/12/2021

Finite Sample Analysis of Average-Reward TD Learning and $Q$-Learning

Sheng Zhang, Zhe Zhang, Siva Theja Maguluri

Keywords Paper

theory, reinforcement learning and planning

0

0

0

0

10:20

06/12/2021

An Analysis of Constant Step Size SGD in the Non-convex Regime: Asymptotic Normality and Bias

Lu Yu, Krishnakumar Balasubramanian, Stanislav Volgushev, Murat Erdogdu

Keywords Paper

optimization, machine learning

0

0

0

0

10:21

13/04/2021

Adversarially robust estimate and risk analysis in linear regression

Yue Xing, Ruizhi Zhang, Guang Cheng

Keywords Paper

0

0

0

0

3:03

18/07/2021

Stability and Generalization of Stochastic Gradient Methods for Minimax Problems

Yunwen Lei, Zhenhuan Yang, Tianbao Yang, Yiming Ying

Keywords Paper

Theory, Statistical Learning Theory

0

0

0

0

16:24

16/11/2020

Safe Policy Learning for Continuous Control

Yinlam Chow, Ofir Nachum, Aleksandra Faust and
Edgar Dueñez-Guzman, Mohammad Ghavamzadeh

Keywords Paper

0

0

0

0

5:20

06/12/2020

Adaptive Discretization for Model-Based Reinforcement Learning

Sean Sinclair, Tianyu Wang, Gauri Jain and
Sid Banerjee, Christina Yu

Keywords Paper

0

0

0

0

3:12

26/04/2020

CAQL: Continuous Action Q-Learning

Moonkyung Ryu, Yinlam Chow, Ross Anderson and
Christian Tjandraatmadja, Craig Boutilier

Keywords Paper

Reinforcement learning (RL), DQN, Continuous control, Mixed-Integer Programming (MIP)

0

0

0

0

5:36

06/12/2021

Taming Communication and Sample Complexities in Decentralized Policy Evaluation for Cooperative Multi-Agent Reinforcement Learning

Xin Zhang, Zhuqing Liu, Jia Liu and
Zhengyuan Zhu, Songtao Lu

Keywords Paper

theory, optimization, reinforcement learning and planning

0

0

0

0

14:54

12/07/2020

Representations for Stable Off-Policy Reinforcement Learning

Dibya Ghosh, Marc Bellemare

Keywords Paper

Reinforcement Learning - General

0

0

0

0

14:38

09/07/2020

Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise

Maksim Kaledin, Eric Moulines, Alexey Naumov and
Vladislav Tadic, Hoi-To Wai

Keywords Paper

Stochastic optimization, Reinforcement learning

0

0

0

0

12:29

09/07/2020

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium

Qiaomin Xie, Yudong Chen, Zhaoran Wang, Zhuoran Yang

Keywords Paper

Reinforcement learning, Planning and control

0

0

0

0

15:16

06/12/2021

Reward is enough for convex MDPs

Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh

Keywords Paper

reinforcement learning and planning

0

0

0

0

14:12

18/07/2021

Provably Correct Optimization and Exploration with Non-linear Policies

Fei Feng, Wotao Yin, Alekh Agarwal, Lin Yang

Keywords Paper

Deep Learning, Adversarial Networks, Applications, Fairness, Accountability, and Transparency, Theory, RL, Decisions and Control Theory

0

0

0

0

5:03

26/08/2020

Finite-Time Error Bounds for Biased Stochastic Approximation with Applications to Q-Learning

Gang Wang, Georgios B. Giannakis

Keywords Paper

0

0

0

0

14:03

13/04/2021

Provably eﬃcient actor-critic for risk-sensitive and robust adversarial RL: A linear-quadratic case

Yufeng Zhang, Zhuoran Yang, Zhaoran Wang

Keywords Paper

0

0

0

0

2:53

06/12/2021

Neural Active Learning with Performance Guarantees

Zhilei Wang, Pranjal Awasthi, Christoph Dann and
Ayush Sekhari, Claudio Gentile

Keywords Paper

deep learning, active learning

0

0

0

0

10:43