Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement

06/12/2020

Benjamin Eysenbach, Xinyang Geng, Sergey Levine, Russ Salakhutdinov

Keywords: Optimization -> Non-Convex Optimization, Theory -> Statistical Physics of Learning

Abstract: Multi-task reinforcement learning (RL) aims to simultaneously learn policies for solving many tasks. Several prior works have found that relabeling past experience with different reward functions can improve sample efficiency. Relabeling methods typically pose the question: if, in hindsight, we assume that our experience was optimal for some task, for what task was it optimal? Inverse RL answers this question. In this paper we show that inverse RL is a principled mechanism for reusing experience across tasks. We use this idea to generalize goal-relabeling techniques from prior work to arbitrary types of reward functions. Our experiments confirm that relabeling data using inverse RL outperforms prior relabeling methods on goal-reaching tasks, and accelerates learning on more general multi-task settings where prior methods are not applicable, such as domains with discrete sets of rewards and those with linear reward functions.
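The hindsight question in the abstract ("for what task was this experience optimal?") can be illustrated with a small sketch. It assumes a discrete set of candidate tasks and a table of trajectory returns, and it scores each task by the trajectory's return minus a per-task normalizer estimated from the batch. The function name `relabel_trajectories`, the log-mean-exp normalizer, and the example numbers are assumptions made for illustration, not the authors' implementation.

```python
import math


def relabel_trajectories(returns):
    """Assign each trajectory the candidate task it best explains.

    returns[i][k] is the total reward of trajectory i under task k
    (hypothetical input format). Each task's score is the trajectory's
    return minus a per-task normalizer, so a task on which every
    trajectory scores highly does not absorb all of the data.
    """
    n_traj = len(returns)
    n_tasks = len(returns[0])

    # Per-task normalizer: log-mean-exp of the returns over the batch,
    # computed with the usual max-shift for numerical stability.
    log_z = []
    for k in range(n_tasks):
        column = [returns[i][k] for i in range(n_traj)]
        peak = max(column)
        mean_exp = sum(math.exp(r - peak) for r in column) / n_traj
        log_z.append(peak + math.log(mean_exp))

    # Relabel each trajectory with the task of highest normalized score.
    labels = []
    for i in range(n_traj):
        scores = [returns[i][k] - log_z[k] for k in range(n_tasks)]
        labels.append(max(range(n_tasks), key=lambda k: scores[k]))
    return labels


# Three trajectories scored under two candidate tasks (made-up numbers).
example_returns = [[5.0, 0.0], [0.0, 1.0], [4.0, 0.5]]
print(relabel_trajectories(example_returns))
```

In this example the third trajectory has a higher raw return under task 0, yet it is assigned to task 1 because task 0 returns are large across the whole batch. A plain argmax over raw returns would ignore that effect.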

The talk and the corresponding paper were published at the NeurIPS 2020 virtual conference.

Similar Papers

18/07/2021

A New Representation of Successor Features for Transfer across Dissimilar Environments

Majid Abdolshah, Hung Le, Thommen Karimpanal George, Sunil Gupta, Santu Rana, Svetha Venkatesh

Keywords: Reinforcement Learning and Planning

5:43

26/10/2020

Symbolic Plans as High-Level Instructions for Reinforcement Learning

León Illanes, Xi Yan, Rodrigo Toro Icarte, Sheila A. McIlraith

Keywords: Planning, Reinforcement Learning, Sparse rewards, Sample efficiency, High-level instructions

9:06

06/12/2021

Offline Reinforcement Learning as One Big Sequence Modeling Problem

Michael Janner, Qiyang Li, Sergey Levine

Keywords: reinforcement learning and planning, transformers, language

9:48

06/12/2021

Regret Minimization Experience Replay in Off-Policy Reinforcement Learning

Xu-Hui Liu, Zhenghai Xue, Jingcheng Pang, Shengyi Jiang, Feng Xu, Yang Yu

Keywords: theory, reinforcement learning and planning

14:06

18/07/2021

Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning

Shariq Iqbal, Christian Schroeder, Bei Peng, Wendelin Boehmer, Shimon Whiteson, Fei Sha

Keywords: Optimization, Convex Optimization, Reinforcement Learning and Planning, Multi-Agent RL, Algorithms, Large Scale Learning; Probabilistic Methods, Distributed Inference

20:08

26/10/2020

Joint Inference of Reward Machines and Policies for Reinforcement Learning

Zhe Xu, Ivan Gavran, Yousef Ahmad, Rupak Majumdar, Daniel Neider, Ufuk Topcu, Bo Wu

Keywords: Reward Machines, Automata Learning, Reinforcement Learning

9:57

03/05/2021

Learning Robust State Abstractions for Hidden-Parameter Block MDPs

Amy Zhang, Shagun Sodhani, Khimya Khetarpal, Joelle Pineau

Keywords: bisimulation, block mdp, hidden-parameter mdp, multi-task reinforcement learning

4:17

06/12/2021

Variational Multi-Task Learning with Gumbel-Softmax Priors

Jiayi Shen, Xiantong Zhen, Marcel Worring, Ling Shao

Keywords: machine learning, generative model

13:09

06/12/2021

Model-Based Episodic Memory Induces Dynamic Hybrid Controls

Hung Le, Thommen Karimpanal George, Majid Abdolshah, Truyen Tran, Svetha Venkatesh

Keywords: reinforcement learning and planning

15:06

26/04/2020

Meta-Q-Learning

Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola

Keywords: meta reinforcement learning, propensity estimation, off-policy

15:50

18/07/2021

Taylor Expansion of Discount Factors

Yunhao Tang, Mark Rowland, Remi Munos, Michal Valko

Keywords: Reinforcement Learning and Planning

5:22

26/04/2020

Observational Overfitting in Reinforcement Learning

Xingyou Song, Yiding Jiang, Stephen Tu, Yilun Du, Behnam Neyshabur

Keywords: observational, overfitting, reinforcement, learning, generalization, implicit, regularization, overparametrization

4:52

19/08/2021

Reward-Constrained Behavior Cloning

Zhaorong Wang, Meng Wang, Jingqi Zhang, Yingfeng Chen, Chongjie Zhang

Keywords: Machine Learning, Deep Reinforcement Learning, Reinforcement Learning, Constraint Optimization

14:43

18/07/2021

Representation Matters: Offline Pretraining for Sequential Decision Making

Mengjiao Yang, Ofir Nachum

Keywords: Reinforcement Learning and Planning

5:06

12/07/2020

Data Valuation using Reinforcement Learning

Jinsung Yoon, Sercan Arik, Tomas Pfister

Keywords: Deep Learning - Algorithms

14:35

03/05/2021

Learning to Sample with Local and Global Contexts in Experience Replay Buffer

Youngmin Oh, Kimin Lee, Jinwoo Shin, Eunho Yang, Sung Ju Hwang

Keywords: reinforcement learning, off-policy RL, experience replay buffer

5:20

02/02/2021

Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning

Wenzhen Huang, Qiyue Yin, Junge Zhang, Kaiqi Huang

14:38

19/08/2021

Variational Model-based Policy Optimization

Yinlam Chow, Brandon Cui, Moonkyung Ryu, Mohammad Ghavamzadeh

Keywords: Machine Learning, Reinforcement Learning

15:31

02/02/2021

The Value-Improvement Path: Towards Better Representations for Reinforcement Learning

Will Dabney, André Barreto, Mark Rowland, Robert Dadashi, John Quan, Marc G. Bellemare, David Silver

20:06

06/12/2020

Learning to Incentivize Other Learning Agents

Jiachen Yang, Ang Li, Mehrdad Farajtabar, Peter Sunehag, Edward Hughes, Hongyuan Zha

3:20

19/08/2021

Solving Continuous Control with Episodic Memory

Igor Kuznetsov, Andrey Filchenkov

Keywords: Machine Learning, Deep Reinforcement Learning, Reinforcement Learning

12:13

06/12/2020

Meta-Learning Requires Meta-Augmentation

Janarthanan Rajendran, Alex Irpan, Eric Jang

2:59

06/12/2020

An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay

Scott Fujimoto, David Meger, Doina Precup

2:53

12/07/2020

Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards

Umer Siddique, Paul Weng, Matthieu Zimmer

Keywords: Reinforcement Learning - Deep RL

15:17

19/08/2021

MapGo: Model-Assisted Policy Optimization for Goal-Oriented Tasks

Menghui Zhu, Minghuan Liu, Jian Shen, Zhicheng Zhang, Sheng Chen, Weinan Zhang, Deheng Ye, Yong Yu, Qiang Fu, Wei Yang

Keywords: Machine Learning, Deep Reinforcement Learning, Reinforcement Learning

11:28

12/07/2020

What Can Learned Intrinsic Rewards Capture?

Zeyu Zheng, Junhyuk Oh, Matteo Hessel, Zhongwen Xu, Manuel Kroiss, Hado van Hasselt, David Silver, Satinder Singh

Keywords: Reinforcement Learning - Deep RL

14:47

06/12/2021

Distributional Reinforcement Learning for Multi-Dimensional Reward Functions

Pushi Zhang, Xiaoyu Chen, Li Zhao, Wei Xiong, Tao Qin, Tie-Yan Liu

Keywords: reinforcement learning and planning

8:30

12/07/2020

Revisiting Fundamentals of Experience Replay

William Fedus, Prajit Ramachandran, Rishabh Agarwal, Yoshua Bengio, Hugo Larochelle, Mark Rowland, Will Dabney

Keywords: Reinforcement Learning - Deep RL

13:35

18/07/2021

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Jin Zhang, Jianhao Wang, Hao Hu, Tong Chen, Yingfeng Chen, Changjie Fan, Chongjie Zhang

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

4:19

18/11/2020

A state aggregation approach for solving knapsack problem with deep reinforcement learning

Reza Refaei Afshar, Yingqian Zhang, Murat Firat, Uzay Kaymak

12:23

06/12/2020

On Efficiency in Hierarchical Reinforcement Learning

Zheng Wen, Doina Precup, Morteza Ibrahimi, Andre Barreto, Benjamin Van Roy, Satinder Singh

3:05

26/08/2020

Independent Subspace Analysis for Unsupervised Learning of Disentangled Representations

Jan Stuehmer, Richard Turner, Sebastian Nowozin

11:43

18/07/2021

Improved Corruption Robust Algorithms for Episodic Reinforcement Learning

Yifang Chen, Simon Du, Kevin Jamieson

Keywords: Optimization, Non-Convex Optimization, Theory, Online Learning Theory

5:20

06/12/2021

Flexible Option Learning

Martin Klissarov, Doina Precup

Keywords: reinforcement learning and planning

15:47

06/12/2021

Reward is enough for convex MDPs

Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh

Keywords: reinforcement learning and planning

14:12

06/12/2020

On Warm-Starting Neural Network Training

Jordan Ash, Ryan Adams

2:30

12/07/2020

Sequential Transfer in Reinforcement Learning with a Generative Model

Andrea Tirinzoni, Riccardo Poiani, Marcello Restelli

Keywords: Reinforcement Learning - General

10:54

18/07/2021

Policy Caches with Successor Features

Mark Nemecek, Ron Parr

Keywords: Reinforcement Learning and Planning, Markov Decision Processes, Reinforcement Learning

5:15

18/07/2021

Phasic Policy Gradient

Karl Cobbe, Jacob Hilton, Oleg Klimov, John Schulman

Keywords: Reinforcement Learning and Planning, Deep RL

20:40

06/12/2021

Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL

Charles Packer, Pieter Abbeel, Joseph Gonzalez

Keywords: reinforcement learning and planning

14:03