Model-Based Reinforcement Learning for Infinite-Horizon Discounted Constrained Markov Decision Processes

Abstract: In many real-world reinforcement learning (RL) problems, in addition to maximizing the objective, the learning agent has to maintain some necessary safety constraints. We formulate the problem of learning a safe policy as an infinite-horizon discounted Constrained Markov Decision Process (CMDP) with an unknown transition probability matrix, where the safety requirements are modeled as constraints on expected cumulative costs. We propose two model-based constrained reinforcement learning (CRL) algorithms for learning a safe policy, namely, (i) GM-CRL algorithm, where the algorithm has access to a generative model, and (ii) UC-CRL algorithm, where the algorithm learns the model using an upper confidence style online exploration method. We characterize the sample complexity of these algorithms, i.e., the the number of samples needed to ensure a desired level of accuracy with high probability, both with respect to objective maximization and constraint satisfaction.

13/04/2021

Model-Based Reinforcement Learning for Infinite-Horizon Discounted Constrained Markov Decision Processes

Aria HasanzadeZonuzy, Dileep Kalathil, Srinivas Shakkottai

Comments

Similar Papers

Provably efficient safe exploration via primal-dual policy optimization

Dongsheng Ding, Xiaohan Wei, Zhuoran Yang and Zhaoran Wang, Mihailo Jovanovic

Keywords Abstract Paper

WCSAC: Worst-Case Soft Actor Critic for Safety-Constrained Reinforcement Learning

Qisong Yang, Thiago D. Simão, Simon H Tindemans, Matthijs T. J. Spaan

Keywords Abstract Paper

Constrained Markov Decision Processes via Backward Value Functions

Harsh Satija, Philip Amortila, Joelle Pineau

Keywords Abstract Paper

Safe Policy Learning for Continuous Control

Yinlam Chow, Ofir Nachum, Aleksandra Faust and Edgar Dueñez-Guzman, Mohammad Ghavamzadeh

Keywords Abstract Paper

Provably safe PAC-MDP exploration using analogies

Melrose Roderick, Vaishnavh Nagarajan, Zico Kolter

Keywords Abstract Paper

Infinite Time Horizon Safety of Bayesian Neural Networks

Mathias Lechner, Đorđe Žikelić, Krishnendu Chatterjee, Thomas Henzinger

Keywords Abstract Paper

deep learning, reinforcement learning and planning

Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs

Tao Liu, Ruida Zhou, Dileep Kalathil and Panganamala Kumar, Chao Tian

Keywords Abstract Paper

Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration

Matteo Papini, Andrea Battistello, Marcello Restelli

Keywords Abstract Paper

Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs

Aria HasanzadeZonuzy, Archana Bura, Dileep Kalathil, Srinivas Shakkottai

Keywords Abstract Paper

Stage-wise Conservative Linear Bandits

Ahmadreza Moradipari, Christos Thrampoulidis, Mahnoosh Alizadeh

Keywords Abstract Paper

CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee

Tengyu Xu, Yingbin LIANG, Guanghui Lan

Keywords Abstract Paper

Theory, RL, Decisions and Control Theory

Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs

harsh satija, Philip S. Thomas, Joelle Pineau, Romain Laroche

Keywords Abstract Paper

Policy Learning with Constraints in Model-free Reinforcement Learning: A Survey

Yongshuai Liu, Avishai Halev, Xin Liu

Keywords Abstract Paper

Machine learning, General, General, General

Conservative Safety Critics for Exploration

Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart and Sergey Levine, Florian Shkurti, Animesh Garg

Keywords Abstract Paper

Safe exploration, Reinforcement Learning

Density Constrained Reinforcement Learning

Zengyi Qin, Yuxiao Chen, Chuchu Fan

Keywords Abstract Paper

Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees

Kishan Panaganti, Dileep Kalathil

Keywords Abstract Paper

Theory, RL, Decisions and Control Theory

Regularized policies are reward robust

Hisham Husain, Kamil Ciosek, Ryota Tomioka

Keywords Abstract Paper

Safe Reinforcement Learning Using Advantage-Based Intervention

Nolan Wagener, Byron Boots, Ching-An Cheng

Keywords Abstract Paper

Theory, RL, Decisions and Control Theory

High Confidence Generalization for Reinforcement Learning

James Kostas, Yash Chandak, Scott Jordan and Georgios Theocharous, Philip Thomas

Keywords Abstract Paper

Algorithms, AutoML, Probabilistic Methods, Gaussian Processes, Reinforcement Learning and Planning

Adaptive Sampling for Best Policy Identification in Markov Decision Processes

Aymen Al Marjani, Alexandre Proutiere

Keywords Abstract Paper

Theory, RL, Decisions and Control Theory

Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning

Tianyu Li, Bogdan Mazoure, Doina Precup, Guillaume Rabusseau

Keywords Abstract Paper

Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation

Aaron Sonabend, Junwei Lu, Leo Anthony Celi and Tianxi Cai, Peter Szolovits

Keywords Abstract Paper

Conservative Offline Distributional Reinforcement Learning

Yecheng Ma, Dinesh Jayaraman, Osbert Bastani

Dongsheng Ding, Xiaohan Wei, Zhuoran Yang and
Zhaoran Wang, Mihailo Jovanovic

Keywords Paper

Keywords Paper

Keywords Paper

Yinlam Chow, Ofir Nachum, Aleksandra Faust and
Edgar Dueñez-Guzman, Mohammad Ghavamzadeh

Keywords Paper

Keywords Paper

Keywords Paper

Tao Liu, Ruida Zhou, Dileep Kalathil and
Panganamala Kumar, Chao Tian

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart and
Sergey Levine, Florian Shkurti, Animesh Garg

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

James Kostas, Yash Chandak, Scott Jordan and
Georgios Theocharous, Philip Thomas

Keywords Paper

Keywords Paper

Keywords Paper

Aaron Sonabend, Junwei Lu, Leo Anthony Celi and
Tianxi Cai, Peter Szolovits

Keywords Paper

Keywords Paper

Keywords Paper

Tsung-Yen Yang, Michael Y Hu, Yinlam Chow and
Peter J Ramadge, Karthik Narasimhan

Keywords Paper

Keywords Paper

Keywords Paper

Alexander Robey, Luiz Chamon, George J. Pappas and
Hamed Hassani, Alejandro Ribeiro

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper