Defining admissible rewards for high-confidence policy evaluation in batch reinforcement learning

Abstract: A key impediment to reinforcement learning (RL) in real applications with limited, batch data is defining a reward function that reflects what we implicitly know about reasonable behaviour for a task and allows for robust off-policy evaluation. In this work, we develop a method to identify an admissible set of reward functions for policies that (a) do not deviate too far in performance from prior behaviour, and (b) can be evaluated with high confidence, given only a collection of past trajectories. Together, these ensure that we avoid proposing unreasonable policies in high-risk settings. We demonstrate our approach to reward design on synthetic domains as well as in a critical care context, to guide the design of a reward function that consolidates clinical objectives to learn a policy for weaning patients from mechanical ventilation.

06/12/2021

Niranjani Prasad, Barbara Engelhardt, Finale Doshi-Velez

Similar Papers

Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs

Harsh Satija, Philip S. Thomas, Joelle Pineau, Romain Laroche

Learning "What-if" Explanations for Sequential Decision-Making

Ioana Bica, Dan Jarrett, Alihan Hüyük, Mihaela van der Schaar

counterfactuals, preference learning, explaining decision-making

Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions

Omer Gottesman, Joseph Futoma, Yao Liu, Sonali Parbhoo, Leo Celi, Emma Brunskill, Finale Doshi-Velez

Scalable Bayesian Inverse Reinforcement Learning

Alex Chan, Mihaela van der Schaar

Bayesian, Imitation Learning, Inverse reinforcement learning

Off-policy evaluation in infinite-horizon reinforcement learning with latent confounders

Andrew Bennett, Nathan Kallus, Lihong Li, Ali Mousavi

Counterfactual Propagation for Semi-Supervised Individual Treatment Effect Estimation

Shonosuke Harada, Hisashi Kashima

causal inference, treatment effect estimation, semi-supervised learning

Conservative Safety Critics for Exploration

Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg

Safe exploration, Reinforcement Learning

Identifying Causal-Effect Inference Failure with Uncertainty-Aware Models

Andrew Jesson, Sören Mindermann, Uri Shalit, Yarin Gal

Theory -> Learning Theory

Clinician-in-the-Loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies

Shengpu Tang, Aditya Modi, Michael Sjoding, Jenna Wiens

Applications - Neuroscience, Cognitive Science, Biology and Health

Towards Robust Bisimulation Metric Learning

Mete Kemertas, Tristan Aumentado-Armstrong

reinforcement learning and planning, robustness, representation learning

Evaluating model robustness and stability to dataset shift

Adarsh Subbaswamy, Roy Adams, Suchi Saria

FAR: A General Framework for Attributional Robustness

Adam Ivankay, Ivan Girardi, Chiara Marchiori, Pascal Frossard

robustness, attribution robustness, adversarial attacks, explainability, attribution maps

The Value Equivalence Principle for Model-Based Reinforcement Learning

Christopher Grimm, Andre Barreto, Satinder Singh, David Silver

Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies

Nathan Kallus, Masatoshi Uehara

A general knowledge distillation framework for counterfactual recommendation via uniform data

Dugang Liu, Pengxiang Cheng, Zhenhua Dong, Xiuqiang He, Weike Pan, Zhong Ming

counterfactual learning, uniform data, recommender systems, knowledge distillation

Corruption-robust exploration in episodic reinforcement learning

Thodoris Lykouris, Max Simchowitz, Alex Slivkins, Wen Sun

Robust Pre-Training by Adversarial Contrastive Learning

Ziyu Jiang, Tianlong Chen, Ting Chen, Zhangyang Wang

Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings

Jesse Zhang, Brian Cheung, Chelsea Finn, Sergey Levine, Dinesh Jayaraman

Generating High-Quality Explanations for Navigation in Partially-Revealed Environments

Security Analysis of Safe and Seldonian Reinforcement Learning Algorithms

Pinar Ozisik, Philip Thomas

Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates

Yang Liu, Hongyi Guo

COMBO: Conservative Offline Model-Based Policy Optimization

Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn

deep learning, optimization, reinforcement learning and planning

Detecting Rewards Deterioration in Episodic Reinforcement Learning

Tianren Zhang, Shangqi Guo, Tian Tan, Xiaolin Hu, Feng Chen

Sarah Dean, Andrew Taylor, Ryan Cosner, Benjamin Recht, Aaron Ames

Andreea-Ioana Deac, Petar Veličković, Ognjen Milinkovic, Pierre-Luc Bacon, Jian Tang, Mladen Nikolic

Jingfeng Zhang, Xilie Xu, Bo Han, Gang Niu, Lizhen Cui, Masashi Sugiyama, Mohan Kankanhalli

Andrew Jesson, Panagiotis Tigas, Joost van Amersfoort, Andreas Kirsch, Uri Shalit, Yarin Gal

Tung Nguyen, Rui Shu, Tuan Pham, Hung Bui, Stefano Ermon

Daniel Neider, Jean-Raphael Gaglione, Ivan Gavran, Ufuk Topcu, Bo Wu, Zhe Xu
