06/12/2021

Explicable Reward Design for Reinforcement Learning Agents

Rati Devidze, Goran Radanovic, Parameswaran Kamalaruban, Adish Singla

Keywords: optimization, reinforcement learning and planning, interpretability

Abstract: We study the design of explicable reward functions for a reinforcement learning agent while guaranteeing that an optimal policy induced by the function belongs to a set of target policies. By being explicable, we seek to capture two properties: (a) informativeness so that the rewards speed up the agent's convergence, and (b) sparseness as a proxy for ease of interpretability of the rewards. The key challenge is that higher informativeness typically requires dense rewards for many learning tasks, and existing techniques do not allow one to balance these two properties appropriately. In this paper, we investigate the problem from the perspective of discrete optimization and introduce a novel framework, ExpRD, to design explicable reward functions. ExpRD builds upon an informativeness criterion that captures the (sub-)optimality of target policies at different time horizons in terms of actions taken from any given starting state. We provide a mathematical analysis of ExpRD, and show its connections to existing reward design techniques, including potential-based reward shaping. Experimental results on two navigation tasks demonstrate the effectiveness of ExpRD in designing explicable reward functions.
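The abstract describes ExpRD only at a high level. As a rough illustration of the kind of discrete optimization involved, the following sketch greedily places a small budget of reward bonuses on a toy chain MDP so as to widen the worst-case advantage of a target policy's actions, while keeping the designed reward sparse. This is our own minimal construction, not the authors' EXPRD implementation: the chain MDP, the worst-case-advantage proxy for informativeness, the greedy selection rule, and all names (chain_mdp, informativeness, greedy_sparse_reward, B, bonus) are assumptions made for illustration.

import numpy as np

def chain_mdp(n=8, gamma=0.95):
    # Deterministic chain MDP: action 0 moves left, action 1 moves right.
    P = np.zeros((n, 2, n))
    for s in range(n):
        P[s, 0, max(s - 1, 0)] = 1.0
        P[s, 1, min(s + 1, n - 1)] = 1.0
    R = np.zeros(n)
    R[n - 1] = 1.0  # sparse task reward: only the goal state pays off
    return P, R, gamma

def q_values(P, R, gamma, iters=500):
    # Plain value iteration; R is a reward on the next state reached.
    n_states, n_actions, _ = P.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(iters):
        V = Q.max(axis=1)
        Q = P @ (R + gamma * V)  # (n, a, n) @ (n,) -> (n, a)
    return Q

def informativeness(P, R, gamma, target=1):
    # Proxy criterion: worst-case advantage of the target action
    # (here, "always move right") over the alternative action.
    Q = q_values(P, R, gamma)
    return float((Q[:, target] - Q[:, 1 - target]).min())

def greedy_sparse_reward(P, R_task, gamma, B=2, bonus=0.5):
    # Discrete optimization by greedy search: add a bonus to at most B
    # states, each time picking the state that most improves the proxy
    # criterion. A bonus that would make the target policy suboptimal
    # drives the worst-case advantage down, yields no gain, and is skipped.
    R, chosen = R_task.copy(), []
    for _ in range(B):
        base = informativeness(P, R, gamma)
        best_s, best_gain = None, 0.0
        for s in range(len(R)):
            if R[s] != 0.0:
                continue
            trial = R.copy()
            trial[s] += bonus
            gain = informativeness(P, trial, gamma) - base
            if gain > best_gain:
                best_s, best_gain = s, gain
        if best_s is None:
            break
        R[best_s] += bonus
        chosen.append(best_s)
    return R, chosen

P, R_task, gamma = chain_mdp()
R_designed, picked = greedy_sparse_reward(P, R_task, gamma, B=2)
print("bonus states:", picked)
print("designed reward:", np.round(R_designed, 2))

Greedy subset selection under a sparsity budget is only a simple stand-in here: the paper's EXPRD framework uses a more refined informativeness criterion defined over (sub-)optimality at multiple time horizons, rather than the single worst-case advantage gap used in this toy.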

Talk and the respective paper are published at the NeurIPS 2021 virtual conference.
