Provably safe PAC-MDP exploration using analogies

Abstract: A key challenge in applying reinforcement learning to safety-critical domains is understanding how to balance exploration (needed to attain good performance on the task) with safety (needed to avoid catastrophic failure). Although a growing line of work in reinforcement learning has investigated this area of "safe exploration," most existing techniques (1) do not guarantee safety during the actual exploration process and/or (2) limit the problem to a priori known and/or deterministic transition dynamics with strong smoothness assumptions. Addressing this gap, we propose Analogous Safe-state Exploration (ASE), an algorithm for provably safe exploration in MDPs with unknown, stochastic dynamics. Our method exploits analogies between state-action pairs to safely learn a near-optimal policy in a PAC-MDP sense. ASE also guides exploration towards the most task-relevant states, which empirically results in significant improvements in sample efficiency compared to existing methods.

03/05/2021


Melrose Roderick, Vaishnavh Nagarajan, Zico Kolter


Similar Papers

Conservative Safety Critics for Exploration

Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg

Keywords: Safe exploration, Reinforcement Learning

Constrained Markov Decision Processes via Backward Value Functions

Harsh Satija, Philip Amortila, Joelle Pineau


Model-Based Reinforcement Learning for Infinite-Horizon Discounted Constrained Markov Decision Processes

Aria HasanzadeZonuzy, Dileep Kalathil, Srinivas Shakkottai

Keywords: Machine Learning, Reinforcement Learning, Markov Decision Processes

WCSAC: Worst-Case Soft Actor Critic for Safety-Constrained Reinforcement Learning

Qisong Yang, Thiago D. Simão, Simon H Tindemans, Matthijs T. J. Spaan


Safe Reinforcement Learning Using Advantage-Based Intervention

Nolan Wagener, Byron Boots, Ching-An Cheng

Keywords: Theory, RL, Decisions and Control Theory

Safe Policy Optimization with Local Generalized Linear Function Approximations

Akifumi Wachi, Yunyue Wei, Yanan Sui

Keywords: theory, optimization, reinforcement learning and planning

PAC Confidence Predictions for Deep Neural Network Classifiers

Sangdon Park, Shuo Li, Insup Lee, Osbert Bastani

Keywords: classification, fast DNN inference, probably approximately correct guarantee, calibration, safe planning

Safe Reinforcement Learning by Imagining the Near Future

Garrett Thomas, Yuping Luo, Tengyu Ma


Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings

Jesse Zhang, Brian Cheung, Chelsea Finn, Sergey Levine, Dinesh Jayaraman


Safe Reinforcement Learning with Linear Function Approximation

Sanae Amani, Christos Thrampoulidis, Lin Yang


Infinite Time Horizon Safety of Bayesian Neural Networks

Mathias Lechner, Đorđe Žikelić, Krishnendu Chatterjee, Thomas Henzinger

Keywords: deep learning, reinforcement learning and planning

Deep probabilistic accelerated evaluation: A robust certifiable rare-event simulation methodology for black-box safety-critical systems

Mansur Arief, Zhiyuan Huang, Guru Koushik Senthil Kumar, Yuanlu Bai, Shengyi He, Wenhao Ding, Henry Lam, Ding Zhao


Neurosymbolic Reinforcement Learning with Formally Verified Exploration

Greg Anderson, Abhinav Verma, Isil Dillig, Swarat Chaudhuri


Guaranteeing Safety of Learned Perception Modules via Measurement-Robust Control Barrier Functions

Sarah Dean, Andrew Taylor, Ryan Cosner, Benjamin Recht, Aaron Ames


Provably efficient safe exploration via primal-dual policy optimization

Dongsheng Ding, Xiaohan Wei, Zhuoran Yang, Zhaoran Wang, Mihailo Jovanovic


Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations

Yuping Luo, Tengyu Ma

Keywords: reinforcement learning and planning, adversarial robustness and security

Gaussian Process-Based Real-Time Learning for Safety Critical Applications

Armin Lederer, Alejandro Ordóñez Conejo, Korbinian Maier, Wenxin Xiao, Jonas Umlauft, Sandra Hirche

Keywords: Probabilistic Methods, Gaussian Processes and Bayesian non-parametrics

Expert-Supervised Reinforcement Learning for Offline Policy Learning and Evaluation

Aaron Sonabend, Junwei Lu, Leo Anthony Celi, Tianxi Cai, Peter Szolovits


Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning

Tianyu Li, Bogdan Mazoure, Doina Precup, Guillaume Rabusseau


Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration

Matteo Papini, Andrea Battistello, Marcello Restelli


Relaxing Local Robustness

Klas Leino, Matt Fredrikson

Keywords: deep learning, optimization, machine learning, robustness, adversarial robustness and security

Density Constrained Reinforcement Learning

Zengyi Qin, Yuxiao Chen, Chuchu Fan


Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration

