Safe Reinforcement Learning in Constrained Markov Decision Processes

12/07/2020

Akifumi Wachi, Yanan Sui

Keywords: Reinforcement Learning - General

Abstract: Safe reinforcement learning has been a promising approach for optimizing the policy of an agent that operates in safety-critical applications. In this paper, we propose an algorithm, SNO-MDP, that explores and optimizes Markov decision processes under unknown safety constraints. Specifically, we take a step-wise approach for optimizing safety and cumulative reward. In our method, the agent first learns safety constraints by expanding the safe region, and then optimizes the cumulative reward in the certified safe region. We provide theoretical guarantees on both the satisfaction of the safety constraint and the near-optimality of the cumulative reward under proper regularity assumptions. We demonstrate the effectiveness of SNO-MDP through two experiments: one uses synthetic data in a new, openly available environment named GP-Safety-Gym, and the other simulates Mars surface exploration using real observation data.
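The step-wise idea in the abstract (first expand a certified safe region, then optimize reward inside it) can be illustrated with a toy sketch. This is not the authors' SNO-MDP implementation: it assumes a deterministic 1-D chain of states, noise-free observations of the safety function, and a known Lipschitz constant used to certify neighboring states. The abstract does not specify the safety model, so the Lipschitz bound is a stand-in. The names `expand_safe_region`, `optimize_reward`, `g`, `h`, and `lipschitz` are hypothetical.

```python
ACTIONS = (-1, 0, 1)  # move left, stay, move right on a 1-D chain (assumed toy dynamics)


def expand_safe_region(g, seed, h, lipschitz):
    """Stage 1: grow the certified safe set outward from a known-safe seed state.

    g[s] is the safety value revealed when the agent visits state s
    (assumed noise-free here); a state counts as safe when g[s] >= h.
    """
    safe = {seed}
    observed = set()
    while True:
        frontier = safe - observed
        if not frontier:
            return safe  # no unvisited certified state is left
        for s in frontier:
            observed.add(s)  # visiting s reveals g[s]
            for t in range(len(g)):
                # Lipschitz lower bound on g[t] given the observation at s
                if g[s] - lipschitz * abs(s - t) >= h:
                    safe.add(t)


def optimize_reward(r, safe, gamma=0.9, iters=200):
    """Stage 2: value iteration restricted to the certified safe set.

    The reward r[t] is collected on entering state t; actions that would
    leave the safe set are never considered.
    """
    v = {s: 0.0 for s in safe}
    for _ in range(iters):
        v = {
            s: max(r[s + a] + gamma * v[s + a] for a in ACTIONS if s + a in safe)
            for s in safe
        }
    policy = {
        s: max(
            (a for a in ACTIONS if s + a in safe),
            key=lambda a: r[s + a] + gamma * v[s + a],
        )
        for s in safe
    }
    return v, policy


if __name__ == "__main__":
    g = [0.0, 0.2, 0.6, 1.0, 0.7, 0.3, 0.1]  # hidden safety values (toy data)
    r = [5.0, 0.0, 0.0, 0.0, 0.0, 1.0, 9.0]  # largest rewards sit at unsafe states
    safe = expand_safe_region(g, seed=3, h=0.25, lipschitz=0.4)
    values, policy = optimize_reward(r, safe)
    print(sorted(safe), policy)
```

In this toy instance the most rewarding states lie outside the safe region, so the resulting policy settles for the best reward reachable without leaving the certified set.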

The talk and the respective paper are published at the ICML 2020 virtual conference.

Similar Papers

18/07/2021

Density Constrained Reinforcement Learning

Zengyi Qin, Yuxiao Chen, Chuchu Fan

Keywords: Reinforcement Learning and Planning

4:50

26/08/2020

Adaptive Discretization for Evaluation of Probabilistic Cost Functions

Christoph Zimmer, Danny Driess, Mona Meister, Nguyen-Tuong Duy

14:13

06/12/2021

Safe Policy Optimization with Local Generalized Linear Function Approximations

Akifumi Wachi, Yunyue Wei, Yanan Sui

Keywords: theory, optimization, reinforcement learning and planning

9:58

06/12/2020

Multi-Robot Collision Avoidance under Uncertainty with Probabilistic Safety Barrier Certificates

Wenhao Luo, Wen Sun, Ashish Kapoor

Keywords: Algorithms -> Clustering; Algorithms -> Semi-Supervised Learning; Theory -> Learning Theory, Algorithms -> Active Learning

3:20

18/07/2021

Quantum algorithms for reinforcement learning with a generative model

Daochen Wang, Aarthi Sundaram, Robin Kothari, Ashish Kapoor, Martin Roetteler

Keywords: Optimization, Non-Convex Optimization, Algorithms, Collaborative Filtering; Applications, Information Retrieval; Applications, Matrix and Tensor Factorization; Theory, RL, Decisions and Control Theory

4:55

06/12/2020

Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization

Sreejith Balakrishnan, Quoc Phong Nguyen, Bryan Kian Hsiang Low, Harold Soh

3:22

03/05/2021

Conservative Safety Critics for Exploration

Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg

Keywords: Safe exploration, Reinforcement Learning

5:14

16/11/2020

Probably Approximately Correct Vision-Based Planning using Motion Primitives

Sushant Veer, Anirudha Majumdar

5:01

18/07/2021

Principled Exploration via Optimistic Bootstrapping and Backward Induction

Chenjia Bai, Lingxiao Wang, Lei Han, Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang

Keywords: Reinforcement Learning and Planning

5:18

18/07/2021

Gaussian Process-Based Real-Time Learning for Safety Critical Applications

Armin Lederer, Alejandro Ordóñez Conejo, Korbinian Maier, Wenxin Xiao, Jonas Umlauft, Sandra Hirche

Keywords: Probabilistic Methods, Gaussian Processes and Bayesian non-parametrics

4:59

26/04/2020

Keep Doing What Worked: Behavior Modelling Priors for Offline Reinforcement Learning

Noah Siegel, Jost Tobias Springenberg, Felix Berkenkamp, Abbas Abdolmaleki, Michael Neunert, Thomas Lampe, Roland Hafner, Nicolas Heess, Martin Riedmiller

Keywords: Reinforcement Learning, Off-policy, Multitask, Continuous Control

5:04

13/04/2021

Deep probabilistic accelerated evaluation: A robust certifiable rare-event simulation methodology for black-box safety-critical systems

Mansur Arief, Zhiyuan Huang, Guru Koushik Senthil Kumar, Yuanlu Bai, Shengyi He, Wenhao Ding, Henry Lam, Ding Zhao

3:03

06/12/2021

Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs

Harsh Satija, Philip S. Thomas, Joelle Pineau, Romain Laroche

Keywords: reinforcement learning and planning

12:27

02/02/2021

Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning

Shangtong Zhang, Bo Liu, Shimon Whiteson

17:22

16/11/2020

Learning Certified Control Using Contraction Metric

Dawei Sun, Susmit Jha, Chuchu Fan

5:02

16/11/2020

ACNMP: Skill Transfer and Task Extrapolation through Learning from Demonstration and Reinforcement Learning via Representation Sharing

Mete Akbulut, Erhan Oztop, Muhammet Yunus Seker, Hh X, Ahmet Tekden, Emre Ugur

5:03

19/08/2021

Model-Based Reinforcement Learning for Infinite-Horizon Discounted Constrained Markov Decision Processes

Aria HasanzadeZonuzy, Dileep Kalathil, Srinivas Shakkottai

Keywords: Machine Learning, Reinforcement Learning, Markov Decision Processes

13:26

13/04/2021

Provably safe PAC-MDP exploration using analogies

Melrose Roderick, Vaishnavh Nagarajan, Zico Kolter

2:51

12/07/2020

Enhancing Simple Models by Exploiting What They Already Know

Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss

Keywords: Supervised Learning

13:57

03/05/2021

Plan-Based Relaxed Reward Shaping for Goal-Directed Tasks

Ingmar Schubert, Oz Oguz, Marc Toussaint

Keywords: reinforcement learning, robotics, robotic manipulation, plan-based reward shaping, reward shaping

4:38

06/12/2021

Counterexample Guided RL Policy Refinement Using Bayesian Optimization

Briti Gangopadhyay, Pallab Dasgupta

Keywords: optimization, reinforcement learning and planning

12:49

18/07/2021

On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game

Shuang Qiu, Jieping Ye, Zhaoran Wang, Zhuoran Yang

Keywords: Reinforcement Learning and Planning

11:19

06/12/2020

PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals

Henry Charlesworth, Giovanni Montana

3:20

16/11/2020

EXI-Net: EXplicitly/Implicitly Conditioned Network for Multiple Environment Sim-to-Real Transfer

Takayuki Murooka, Masashi Hamaya, Felix von Drigalski, Kazutoshi Tanaka, Yoshihisa Ijiri

4:44

06/12/2020

First Order Constrained Optimization in Policy Space

Yiming Zhang, Quan Vuong, Keith Ross

3:15

18/07/2021

CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee

Tengyu Xu, Yingbin Liang, Guanghui Lan

Keywords: Theory, RL, Decisions and Control Theory

5:15

06/12/2021

Physics-Integrated Variational Autoencoders for Robust and Interpretable Generative Modeling

Naoya Takeishi, Alexandros Kalousis

Keywords: deep learning, machine learning, generative model, interpretability

8:13

19/08/2021

Deep Reinforcement Learning for Multi-contact Motion Planning of Hexapod Robots

Huiqiao Fu, Kaiqiang Tang, Peng Li, Wenqi Zhang, Xinpeng Wang, Guizhou Deng, Tao Wang, Chunlin Chen

Keywords: Machine Learning, Deep Reinforcement Learning, Learning in Robotics, Motion and Path Planning

10:54

12/07/2020

Constrained Markov Decision Processes via Backward Value Functions

Harsh Satija, Philip Amortila, Joelle Pineau

Keywords: Reinforcement Learning - General

10:40

16/11/2020

Sim2Real Transfer for Deep Reinforcement Learning with Stochastic State Transition Delays

Sandeep Singh Sandha, Luis Garcia, Bharathan Balaji, Fatima Anwar, Mani Srivastava

4:59

18/07/2021

Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees

Kishan Panaganti, Dileep Kalathil

Keywords: Theory, RL, Decisions and Control Theory

5:15

18/07/2021

PODS: Policy Optimization via Differentiable Simulation

Miguel Angel Zamora Mora, Momchil Peychev, Sehoon Ha, Martin Vechev, Stelian Coros

Keywords: Reinforcement Learning and Planning, Deep RL

4:28

18/07/2021

Value-at-Risk Optimization with Gaussian Processes

Quoc Phong Nguyen, Zhongxiang Dai, Bryan Kian Hsiang Low, Patrick Jaillet

Keywords: Probabilistic Methods, Gaussian Processes and Bayesian non-parametrics

4:56

06/12/2021

Generalized Proximal Policy Optimization with Sample Reuse

James Queeney, Yannis Paschalidis, Christos G Cassandras

Keywords: optimization, reinforcement learning and planning

13:45

06/12/2020

Classification with Valid and Adaptive Coverage

Yaniv Romano, Matteo Sesia, Emmanuel Candes

3:14

06/12/2021

Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees

Gregory Dexter, Kevin Bello, Jean Honorio

Keywords: theory, reinforcement learning and planning

14:49

18/07/2021

Robust Pure Exploration in Linear Bandits with Limited Budget

Ayya Alieva, Ashok Cutkosky, Abhimanyu Das

Keywords: Algorithms, Adversarial Learning, Algorithms, Unsupervised Learning, Reinforcement Learning and Planning, Bandits

6:02

06/12/2021

Safe Pontryagin Differentiable Programming

Wanxin Jin, Shaoshuai Mou, George J. Pappas

Keywords: optimization, reinforcement learning and planning

14:31

26/10/2020

Revisiting Bounded-Suboptimal Safe Interval Path Planning

Konstantin Yakovlev, Anton Andreychuk, Roni Stern

Keywords: Safe interval path planning, Heuristic search, Path planning, Motion planning

10:29

02/02/2021

Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs

Aria HasanzadeZonuzy, Archana Bura, Dileep Kalathil, Srinivas Shakkottai

17:18