Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs

Abstract: Many physical systems have underlying safety considerations that require the employed policy to satisfy a set of constraints. The analytical formulation usually takes the form of a Constrained Markov Decision Process (CMDP). We focus on the case where the CMDP is unknown, and RL algorithms obtain samples to discover the model and compute an optimal constrained policy. Our goal is to characterize the relationship between safety constraints and the number of samples needed to ensure a desired level of accuracy, in terms of both objective maximization and constraint satisfaction, in a PAC sense. We explore two classes of RL algorithms, namely, (i) a generative model based approach, wherein samples are taken initially to estimate a model, and (ii) an online approach, wherein the model is updated as samples are obtained. Our main finding is that, compared to the best known bounds of the unconstrained regime, the sample complexity of constrained RL algorithms is increased by a factor that is logarithmic in the number of constraints, which suggests that the approach may be easily utilized in real systems.
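
As a reading aid, one common way to write the constrained problem and the PAC criterion mentioned in the abstract is sketched below. The notation ($V_r^{\pi}$ for the expected cumulative reward of policy $\pi$, $V_{c_i}^{\pi}$ for the expected cumulative cost of the $i$-th constraint, $\bar{C}_i$ for its threshold, $N$ for the number of constraints, $\hat{\pi}$ for the learned policy, and $\epsilon$, $\delta$ for the accuracy and confidence parameters) is assumed here for illustration and is not taken from the paper. Writing the constraints as upper bounds on costs is one convention among several.

```latex
\begin{align*}
\pi^{*} \in \arg\max_{\pi}\; & V_{r}^{\pi}
  \quad \text{s.t.} \quad V_{c_i}^{\pi} \le \bar{C}_i, \qquad i = 1, \dots, N, \\
\Pr\Big( V_{r}^{\hat{\pi}} \ge V_{r}^{\pi^{*}} - \epsilon
  \;\text{ and }\; V_{c_i}^{\hat{\pi}} \le \bar{C}_i + \epsilon \;\; \forall i \Big) & \ge 1 - \delta .
\end{align*}
```

Under this reading, the sample complexity is the number of samples needed for $\hat{\pi}$ to satisfy the second condition, and the abstract's claim is that it exceeds the best known unconstrained bounds only by a factor logarithmic in $N$.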

19/08/2021

Aria HasanzadeZonuzy, Archana Bura, Dileep Kalathil, Srinivas Shakkottai

Similar Papers

Model-Based Reinforcement Learning for Infinite-Horizon Discounted Constrained Markov Decision Processes

Aria HasanzadeZonuzy, Dileep Kalathil, Srinivas Shakkottai

Keywords: Machine Learning, Reinforcement Learning, Markov Decision Processes

Safe Policy Learning for Continuous Control

Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Dueñez-Guzman, Mohammad Ghavamzadeh

Density Constrained Reinforcement Learning

Zengyi Qin, Yuxiao Chen, Chuchu Fan

Keywords: Reinforcement Learning and Planning

Policy Learning with Constraints in Model-free Reinforcement Learning: A Survey

Yongshuai Liu, Avishai Halev, Xin Liu

Keywords: Machine learning, General, General, General

CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee

Tengyu Xu, Yingbin Liang, Guanghui Lan

Keywords: Theory, RL, Decisions and Control Theory

Counterexample Guided RL Policy Refinement Using Bayesian Optimization

Briti Gangopadhyay, Pallab Dasgupta

Keywords: optimization, reinforcement learning and planning

Task-Optimal Exploration in Linear Dynamical Systems

Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

Keywords: Theory, RL, Decisions and Control Theory

First Order Constrained Optimization in Policy Space

Yiming Zhang, Quan Vuong, Keith Ross

Sim2Real Transfer for Deep Reinforcement Learning with Stochastic State Transition Delays

Sandeep Singh Sandha, Luis Garcia, Bharathan Balaji, Fatima Anwar, Mani Srivastava

Efficient Online Estimation of Causal Effects by Deciding What to Observe

Shantanu Gupta, Zachary Lipton, David Childers

Keywords: reinforcement learning and planning, graph learning, causality

Design of Experiments for Stochastic Contextual Linear Bandits

Andrea Zanette, Kefan Dong, Jonathan N Lee, Emma Brunskill

Keywords: reinforcement learning and planning, bandits

Blending MPC & Value Function Approximation for Efficient Reinforcement Learning

Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots

Keywords: reinforcement learning, model-predictive control

Progression Heuristics for Planning with Probabilistic LTL Constraints

Ian Mallett, Sylvie Thiebaux, Felipe Trevizan

A Provably Efficient Sample Collection Strategy for Reinforcement Learning

Jean Tarbouriech, Matteo Pirotta, Michal Valko, Alessandro Lazaric

Keywords: theory, reinforcement learning and planning, generative model

Auditing Black-Box Prediction Models for Data Minimization Compliance

Bashir Rastegarpanah, Krishna Gummadi, Mark Crovella

Keywords: reinforcement learning and planning, bandits, privacy

High-Dimensional Contextual Policy Search with Unknown Context Rewards using Bayesian Optimization

Qing Feng, Ben Letham, Hongzi Mao, Eytan Bakshy

Scalable Intervention Target Estimation in Linear Models

Burak Varici, Karthikeyan Shanmugam, Prasanna Sattigeri, Ali Tajer

Keywords: theory, graph learning, causality

Learning Composable Energy Surrogates for PDE Order Reduction

Alex Beatson, Jordan Ash, Geoffrey Roeder, Tianju Xue, Ryan Adams

High Dimensional Level Set Estimation with Bayesian Neural Network

Huong Ha, Sunil Gupta, Santu Rana, Svetha Venkatesh

Gaussian Process-Based Real-Time Learning for Safety Critical Applications

Armin Lederer, Alejandro Ordóñez Conejo, Korbinian Maier, Wenxin Xiao, Jonas Umlauft, Sandra Hirche

Keywords: Probabilistic Methods, Gaussian Processes and Bayesian non-parametrics

Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs

Harsh Satija, Philip S. Thomas, Joelle Pineau, Romain Laroche
