Conservative Safety Critics for Exploration

Abstract: Safe exploration presents a major challenge in reinforcement learning (RL): when active data collection requires deploying partially trained policies, we must ensure that these policies avoid catastrophically unsafe regions while still enabling trial-and-error learning. In this paper, we target the problem of safe exploration in RL by learning a conservative safety estimate of environment states through a critic, and we provably upper bound the likelihood of catastrophic failures at every training iteration. We theoretically characterize the tradeoff between safety and policy improvement, show that the safety constraints are satisfied with high probability during training, derive provable convergence guarantees showing that our approach is asymptotically no worse than standard RL, and empirically demonstrate the efficacy of the proposed approach on a suite of challenging navigation, manipulation, and locomotion tasks. Our results demonstrate that the proposed approach can achieve competitive task performance while incurring significantly lower catastrophic failure rates during training than prior methods. Videos are available at https://sites.google.com/view/conservative-safety-critics/

18/07/2021

Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg

Similar Papers

Safe Reinforcement Learning Using Advantage-Based Intervention

Nolan Wagener, Byron Boots, Ching-An Cheng

Keywords: Theory, RL, Decisions and Control Theory

Provably safe PAC-MDP exploration using analogies

Melrose Roderick, Vaishnavh Nagarajan, Zico Kolter

Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs

Harsh Satija, Philip S. Thomas, Joelle Pineau, Romain Laroche

Keywords: reinforcement learning and planning

Safe Policy Optimization with Local Generalized Linear Function Approximations

Akifumi Wachi, Yunyue Wei, Yanan Sui

Keywords: theory, optimization, reinforcement learning and planning

Counterexample Guided RL Policy Refinement Using Bayesian Optimization

Briti Gangopadhyay, Pallab Dasgupta

Keywords: optimization, reinforcement learning and planning

WCSAC: Worst-Case Soft Actor Critic for Safety-Constrained Reinforcement Learning

Qisong Yang, Thiago D. Simão, Simon H Tindemans, Matthijs T. J. Spaan

Safe Reinforcement Learning by Imagining the Near Future

Garrett Thomas, Yuping Luo, Tengyu Ma

Keywords: reinforcement learning and planning

Safe Pontryagin Differentiable Programming

Wanxin Jin, Shaoshuai Mou, George J. Pappas

Keywords: optimization, reinforcement learning and planning

Security Analysis of Safe and Seldonian Reinforcement Learning Algorithms

Pinar Ozisik, Philip Thomas

Model-Based Reinforcement Learning for Infinite-Horizon Discounted Constrained Markov Decision Processes

Aria HasanzadeZonuzy, Dileep Kalathil, Srinivas Shakkottai

Keywords: Machine Learning, Reinforcement Learning, Markov Decision Processes

Constrained Markov Decision Processes via Backward Value Functions

Harsh Satija, Philip Amortila, Joelle Pineau

Keywords: Reinforcement Learning - General

Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings

Jesse Zhang, Brian Cheung, Chelsea Finn, Sergey Levine, Dinesh Jayaraman

Keywords: Reinforcement Learning - Deep RL

First Order Constrained Optimization in Policy Space

Yiming Zhang, Quan Vuong, Keith Ross

Guaranteeing Safety of Learned Perception Modules via Measurement-Robust Control Barrier Functions

Sarah Dean, Andrew Taylor, Ryan Cosner, Benjamin Recht, Aaron Ames

Towards Safe Policy Improvement for Non-Stationary MDPs

Yash Chandak, Scott Jordan, Georgios Theocharous, Martha White, Philip Thomas

Keywords: Applications -> Computer Vision; Deep Learning -> Attention Models, Deep Learning

PAC Confidence Predictions for Deep Neural Network Classifiers

Sangdon Park, Shuo Li, Insup Lee, Osbert Bastani

Keywords: classification, fast DNN inference, probably approximately correct guarantee, calibration, safe planning

Safe Policy Learning for Continuous Control

Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Dueñez-Guzman, Mohammad Ghavamzadeh

Deep Verifier Networks: Verification of Deep Discriminative Models with Deep Generative Models

Tong Che, Xiaofeng Liu, Site Li, Yubin Ge, Ruixiang Zhang, Caiming Xiong, Yoshua Bengio

Safe Reinforcement Learning with Natural Language Constraints

Tsung-Yen Yang, Michael Y Hu, Yinlam Chow, Peter J Ramadge, Karthik Narasimhan

Keywords: reinforcement learning and planning

Safe Reinforcement Learning with Linear Function Approximation

Sanae Amani, Christos Thrampoulidis, Lin Yang

Keywords: Reinforcement Learning and Planning

Adaptive Discretization for Evaluation of Probabilistic Cost Functions

Christoph Zimmer, Danny Driess, Mona Meister, Nguyen-Tuong Duy
