Counterexample Guided RL Policy Refinement Using Bayesian Optimization

Abstract: Constructing Reinforcement Learning (RL) policies that adhere to safety requirements is an emerging field of study. RL agents learn via trial and error with an objective to optimize a reward signal. Often policies that are designed to accumulate rewards do not satisfy safety specifications. We present a methodology for counterexample guided refinement of a trained RL policy against a given safety specification. Our approach has two main components. The first component is an approach to discover failure trajectories using Bayesian optimization over multiple parameters of uncertainty from a policy learnt in a model-free setting. The second component selectively modifies the failure points of the policy using gradient-based updates. The approach has been tested on several RL environments, and we demonstrate that the policy can be made to respect the safety specifications through such targeted changes.
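The first component described in the abstract, searching for failure trajectories with Bayesian optimization, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single uncertainty parameter in [0, 1] (the abstract refers to several), a hypothetical `safety_margin` function standing in for a policy rollout scored against the safety specification (negative means a violation), a hand-written zero-mean Gaussian process with an RBF kernel, and a lower-confidence-bound acquisition rule. All names and constants are illustrative.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, jitter=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP."""
    k_train = rbf_kernel(x_train, x_train) + jitter * np.eye(len(x_train))
    k_cross = rbf_kernel(x_train, x_query)
    weights = np.linalg.solve(k_train, k_cross)  # K^{-1} k_*
    mean = weights.T @ y_train
    var = 1.0 - np.sum(k_cross * weights, axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))

def safety_margin(x):
    """Hypothetical stand-in for one policy rollout under uncertainty
    parameter x; a negative value means the safety specification is violated."""
    return 0.5 - np.exp(-((x - 0.7) ** 2) / 0.02)

def find_counterexample(n_iter=15, kappa=2.0):
    """Search for the parameter with the lowest safety margin."""
    grid = np.linspace(0.0, 1.0, 201)
    x_obs = np.array([0.0, 0.5, 1.0])  # initial evaluations
    y_obs = safety_margin(x_obs)
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid)
        lcb = mean - kappa * std  # low predicted margin or high uncertainty
        lcb[np.isin(grid, x_obs)] = np.inf  # skip points already evaluated
        x_next = grid[np.argmin(lcb)]
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, safety_margin(x_next))
    worst = np.argmin(y_obs)
    return x_obs[worst], y_obs[worst]

x_fail, margin = find_counterexample()
print(f"worst parameter: {x_fail:.3f}, margin: {margin:.3f}")
```

A parameter value with a negative margin plays the role of a counterexample. In the approach the abstract describes, such failure points would then be handed to the second component, which modifies the policy with gradient-based updates.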

03/05/2021

Briti Gangopadhyay, Pallab Dasgupta

Similar Papers

Conservative Safety Critics for Exploration

Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg

Keywords: Safe exploration, Reinforcement Learning

Model-based Adversarial Meta-Reinforcement Learning

Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma

First Order Constrained Optimization in Policy Space

Yiming Zhang, Quan Vuong, Keith Ross

Measuring the Reliability of Reinforcement Learning Algorithms

Stephanie C.Y. Chan, Samuel Fishman, Anoop Korattikara, John Canny, Sergio Guadarrama

Keywords: reinforcement learning, metrics, statistics, reliability

An Imitation from Observation Approach to Transfer Learning with Dynamics Mismatch

Siddharth Desai, Ishan Durugkar, Haresh Karnan, Garrett Warnell, Josiah Hanna, Peter Stone

Conservative Offline Distributional Reinforcement Learning

Yecheng Ma, Dinesh Jayaraman, Osbert Bastani

Active deep Q-learning with demonstration

Si-An Chen, Hsuan-Tien Lin, Voot Tangkaratt, Masashi Sugiyama

Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization

Sicheng Zhu, Xiao Zhang, David Evans

Sim2Real Transfer for Deep Reinforcement Learning with Stochastic State Transition Delays

Sandeep Singh Sandha, Luis Garcia, Bharathan Balaji, Fatima Anwar, Mani Srivastava

Guaranteeing Safety of Learned Perception Modules via Measurement-Robust Control Barrier Functions

Sarah Dean, Andrew Taylor, Ryan Cosner, Benjamin Recht, Aaron Ames

Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration

Matteo Papini, Andrea Battistello, Marcello Restelli

Model-Based Reinforcement Learning for Infinite-Horizon Discounted Constrained Markov Decision Processes

Aria HasanzadeZonuzy, Dileep Kalathil, Srinivas Shakkottai

Keywords: Machine Learning, Reinforcement Learning, Markov Decision Processes

Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs

Harsh Satija, Philip S. Thomas, Joelle Pineau, Romain Laroche

Safe Reinforcement Learning by Imagining the Near Future

Garrett Thomas, Yuping Luo, Tengyu Ma

Tolerance-Guided Policy Learning for Adaptable and Transferrable Delicate Industrial Insertion

Boshen Niu, Chenxi Wang, Changliu Liu

Trusted Multi-View Classification

Zongbo Han, Changqing Zhang, Huazhu Fu, Joey T Zhou

Keywords: Uncertainty Machine Learning, Multi-View Learning, Multi-Modal Learning

What Did You Think Would Happen? Explaining Agent Behaviour through Intended Outcomes

Herman Yau, Chris Russell, Simon Hadfield

Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees

Gregory Dexter, Kevin Bello, Jean Honorio

Keywords: theory, reinforcement learning and planning

Unsupervised Domain Adaptation for Spatio-Temporal Action Localization

Nakul Agarwal, Yi-Ting Chen, Behzad Dariush, Ming-Hsuan Yang

Keywords: Spatio-Temporal Action Localization, Unsupervised Domain Adaptation, Adversarial Learning, Video Analysis, Deep Learning

Instabilities of Offline RL with Pre-Trained Neural Representation

Ruosong Wang, Yifan Wu, Russ Salakhutdinov, Sham Kakade

Generating High-Quality Explanations for Navigation in Partially-Revealed Environments

Outcome-Driven Reinforcement Learning via Variational Inference

Tim G. J. Rudner, Vitchyr Pong, Rowan McAllister, Yarin Gal, Sergey Levine

Keywords: reinforcement learning and planning, generative model

Constrained Markov Decision Processes via Backward Value Functions

Harsh Satija, Philip Amortila, Joelle Pineau

Safe Policy Learning for Continuous Control

Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Dueñez-Guzman, Mohammad Ghavamzadeh