What Can Learned Intrinsic Rewards Capture?

Abstract: The objective of a reinforcement learning agent is to behave so as to maximise the sum of a suitable scalar function of state: the reward. These rewards are typically given and immutable. In this paper, we instead consider the proposition that the reward function itself can be a good locus of learned knowledge. To investigate this, we propose a scalable meta-gradient framework for learning useful intrinsic reward functions across multiple lifetimes of experience. Through several proof-of-concept experiments, we show that it is feasible to learn and capture knowledge about long-term exploration and exploitation into a reward function. Furthermore, we show that unlike policy transfer methods that capture "how" the agent should behave, the learned reward functions can generalise to other kinds of agents and to changes in the dynamics of the environment by capturing "what" the agent should strive to do.

26/10/2020

Intrinsic Motivation, Intrinsic Reward, Intrinsically Motivated Reinforcement Learning, Deep Reinforcement Learning, Reinforcement Learning

03/05/2021

Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh

Similar Papers

Symbolic Plans as High-Level Instructions for Reinforcement Learning

León Illanes, Xi Yan, Rodrigo Toro Icarte, Sheila A. McIlraith

Planning, Reinforcement Learning, Sparse rewards, Sample efficiency, High-level instructions

Maximum Likelihood Constraint Inference for Inverse Reinforcement Learning

Dexter R.R. Scobee, S. Shankar Sastry

learning from demonstration, inverse reinforcement learning, constraint inference

Inverse Reinforcement Learning in a Continuous State Space with Formal Guarantees

Gregory Dexter, Kevin Bello, Jean Honorio

theory, reinforcement learning and planning

Explicable Reward Design for Reinforcement Learning Agents

Rati Devidze, Goran Radanovic, Parameswaran Kamalaruban, Adish Singla

optimization, reinforcement learning and planning, interpretability

Learning to Incentivize Other Learning Agents

Jiachen Yang, Ang Li, Mehrdad Farajtabar, Peter Sunehag, Edward Hughes, Hongyuan Zha

Learning to Utilize Shaping Rewards: A New Approach of Reward Shaping

Yujing Hu, Weixun Wang, Hangtian Jia, Yixiang Wang, Yingfeng Chen, Jianye Hao, Feng Wu, Changjie Fan

Provably Efficient Learning of Transferable Rewards

Alberto Maria Metelli, Giorgia Ramponi, Alessandro Concetti, Marcello Restelli

Optimization, Convex Optimization, Reinforcement Learning and Planning, Optimization, Combinatorial Optimization

Mutual Information State Intrinsic Control

Rui Zhao, Yang Gao, Pieter Abbeel, Volker Tresp, Wei Xu

Intrinsic Motivation, Intrinsic Reward, Intrinsically Motivated Reinforcement Learning, Deep Reinforcement Learning, Reinforcement Learning

Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers

Ben Eysenbach, Shreyas Chaudhari, Swapnil Asawa, Sergey Levine, Ruslan Salakhutdinov

reinforcement learning, domain adaptation, transfer learning

Imitation Learning over Heterogeneous Agents with Restraining Bolts

Giuseppe De Giacomo, Marco Favorito, Luca Iocchi, Fabio Patrizi

Restraining Bolts, Non-markovian Rewards, Transfer Learning

Emergent Prosociality in Multi-Agent Games Through Gifting

Woodrow Z. Wang, Mark Beliaev, Erdem Bıyık, Daniel A. Lazar, Ramtin Pedarsani, Dorsa Sadigh

Agent-based and Multi-agent Systems, Coordination and Cooperation, Multi-agent Learning, Noncooperative Games

RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments

Roberta Raileanu, Tim Rocktäschel

reinforcement learning, exploration, curiosity

Advice-Guided Reinforcement Learning in a non-Markovian Environment

Daniel Neider, Jean-Raphael Gaglione, Ivan Gavran, Ufuk Topcu, Bo Wu, Zhe Xu

Temporal-Logic-Based Reward Shaping for Continuing Reinforcement Learning Tasks

Yuqian Jiang, Suda Bharadwaj, Bo Wu, Rishi Shah, Ufuk Topcu, Peter Stone

Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments

Jun Yamada, Youngwoon Lee, Gautam Salhotra, Karl Pertsch, Max Pflueger, Gaurav Sukhatme, Joseph Lim, Peter Englert

Reinforcement Learning of Sequential Price Mechanisms

Gianluca Brero, Alon Eden, Matthias Gerstgrasser, David Parkes, Duncan Rheingans-Yoo

RD$^2$: Reward Decomposition with Representation Decomposition

Zichuan Lin, Derek Yang, Li Zhao, Tao Qin, Guangwen Yang, Tie-Yan Liu

Risk-Aware Transfer in Reinforcement Learning using Successor Features

Michael Gimelfarb, Andre Barreto, Scott Sanner, Chi-Guhn Lee

reinforcement learning and planning, representation learning, transfer learning

Reward Identification in Inverse Reinforcement Learning

Kuno Kim, Shivam Garg, Kiran Shiragur, Stefano Ermon

Theory, RL, Decisions and Control Theory

Observational Overfitting in Reinforcement Learning

Xingyou Song, Yiding Jiang, Stephen Tu, Yilun Du, Behnam Neyshabur

observational, overfitting, reinforcement, learning, generalization, implicit, regularization, overparametrization

Reward-Constrained Behavior Cloning

Zhaorong Wang, Meng Wang, Jingqi Zhang, Yingfeng Chen, Chongjie Zhang
