A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation

18/07/2021

A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation

Scott Fujimoto, David Meger, Doina Precup

Keywords: Reinforcement Learning and Planning, Deep RL

Abstract Paper Similar Papers

Abstract: Marginalized importance sampling (MIS), which measures the density ratio between the state-action occupancy of a target policy and that of a sampling distribution, is a promising approach for off-policy evaluation. However, current state-of-the-art MIS methods rely on complex optimization tricks and succeed mostly on simple toy problems. We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy. The successor representation can be trained through deep reinforcement learning methodology and decouples the reward optimization from the dynamics of the environment, making the resulting algorithm stable and applicable to high-dimensional domains. We evaluate the empirical performance of our approach on a variety of challenging Atari and MuJoCo environments.

1

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at ICML 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

02/02/2021

DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning

Mohammadhosein Hasanbeig, Natasha Yogananda Jeppu, Alessandro Abate and
Tom Melham, Daniel Kroening

Keywords Paper

0

0

0

0

15:45

06/12/2021

Provable Representation Learning for Imitation with Contrastive Fourier Features

Ofir Nachum, Mengjiao Yang

Keywords Paper

reinforcement learning and planning, contrastive learning, representation learning

0

0

0

0

15:06

06/12/2020

RD$^2$: Reward Decomposition with Representation Decomposition

Zichuan Lin, Derek Yang, Li Zhao and
Tao Qin, Guangwen Yang, Tie-Yan Liu

Keywords Paper

0

0

0

0

3:10

18/07/2021

APS: Active Pretraining with Successor Features

Hao Liu, Pieter Abbeel

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

14:29

06/12/2021

Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning

Siyuan Zhang, Nan Jiang

Keywords Paper

reinforcement learning and planning

0

0

0

0

13:23

06/12/2021

Dynamic Bottleneck for Robust Self-Supervised Exploration

Chenjia Bai, Lingxiao Wang, Lei Han and
Animesh Garg, Jianye Hao, Peng Liu, Zhaoran Wang

Keywords Paper

reinforcement learning and planning

0

0

0

0

6:17

06/12/2020

Discovering Reinforcement Learning Algorithms

Junhyuk Oh, Matteo Hessel, Wojciech Czarnecki and
Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver

Keywords Paper

0

0

0

0

3:21

02/02/2021

Bayesian Distributional Policy Gradients

Luchen Li, A. Aldo Faisal

Keywords Paper

1

0

0

0

18:06

06/12/2020

The Value Equivalence Principle for Model-Based Reinforcement Learning

Christopher Grimm, Andre Barreto, Satinder Singh, David Silver

Keywords Paper

0

0

0

0

3:19

06/12/2021

Adversarial Intrinsic Motivation for Reinforcement Learning

Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone

Keywords Paper

reinforcement learning and planning, generative model

0

0

0

0

13:11

03/05/2021

Data-Efficient Reinforcement Learning with Self-Predictive Representations

Max Schwarzer, Ankesh Anand, Rishab Goel and
R Devon Hjelm, Aaron Courville, Philip Bachman

Keywords Paper

Representation Learning, Self-Supervised Learning, Reinforcement Learning, Sample Efficiency

0

0

0

1

10:04

12/07/2020

Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences

Daniel Brown, Scott Niekum, Russell Coleman, Ravi Srinivasan

Keywords Paper

Reinforcement Learning - Deep RL

0

0

0

0

15:11

06/12/2020

Value-driven Hindsight Modelling

Arthur Guez, Fabio Viola, Theophane Weber and
Lars Buesing, Steven Kapturowski, Doina Precup, David Silver, Nicolas Heess

Keywords Paper

1

0

0

0

3:20

06/12/2020

Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization

Sreejith Balakrishnan, Quoc Phong Nguyen, Bryan Kian Hsiang Low, Harold Soh

Keywords Paper

0

0

0

0

3:22

26/04/2020

Disagreement-Regularized Imitation Learning

Kiante Brantley, Wen Sun, Mikael Henaff

Keywords Paper

imitation learning, reinforcement learning, uncertainty

0

0

0

0

4:53

06/12/2021

Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration

Lulu Zheng, Jiarui Chen, Jianhao Wang and
Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang

Keywords Paper

reinforcement learning and planning

0

0

0

0

12:25

19/08/2021

Novelty Detection via Contrastive Learning with Negative Data Augmentation

Chengwei Chen, Yuan Xie, Shaohui Lin and
Ruizhi Qiao, Jian Zhou, Xin Tan, Yi Zhang, Lizhuang Ma

Keywords Paper

Computer Vision, 2D and 3D Computer Vision, Anomaly/Outlier Detection

0

0

0

0

14:37

12/07/2020

Learning Human Objectives by Evaluating Hypothetical Behavior

Siddharth Reddy, Anca Dragan, Sergey Levine and
Shane Legg, Jan Leike

Keywords Paper

Reinforcement Learning - Deep RL

0

0

0

0

10:21

23/08/2020

Adversarial infidelity learning for model interpretation

Jian Liang, Bing Bai, Yuren Cao and
Kun Bai, Fei Wang

Keywords Paper

infidelity, model interpretation, adversarial learning, black-box explanations

0

0

0

0

5:34

26/04/2020

Posterior sampling for multi-agent reinforcement learning: solving extensive games with imperfect information

Yichi Zhou, Jialian Li, Jun Zhu

Keywords Paper

0

0

0

0

12:55

06/12/2020

An Imitation from Observation Approach to Transfer Learning with Dynamics Mismatch

Siddharth Desai, Ishan Durugkar, Haresh Karnan and
Garrett Warnell, Josiah Hanna, Peter Stone

Keywords Paper

0

0

0

0

3:22

26/04/2020

Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling

Yuping Luo, Huazhe Xu, Tengyu Ma

Keywords Paper

imitation learning, model-based imitation learning, model-based RL, behavior cloning, covariate shift

0

0

0

0

4:38

06/12/2021

Towards Robust Bisimulation Metric Learning

Mete Kemertas, Tristan Aumentado-Armstrong

Keywords Paper

reinforcement learning and planning, robustness, representation learning

0

0

0

0

12:24

02/02/2021

Addressing Action Oscillations through Learning Policy Inertia

Chen Chen, Hongyao Tang, Jianye Hao and
Wulong Liu, Zhaopeng Meng

Keywords Paper

0

0

0

0

14:57

06/12/2020

Model-based Policy Optimization with Unsupervised Model Adaptation

Jian Shen, Han Zhao, Weinan Zhang, Yong Yu

Keywords Paper

0

0

0

0

3:09

03/05/2021

Modeling the Second Player in Distributionally Robust Optimization

Paul Michel, Tatsunori Hashimoto, Graham Neubig

Keywords Paper

adversarial learning, deep learning, robustness, distributionally robust optimization

0

0

0

0

5:09

18/07/2021

Quantum algorithms for reinforcement learning with a generative model

Daochen Wang, Aarthi Sundaram, Robin Kothari and
Ashish Kapoor, Martin Roetteler

Keywords Paper

Optimization, Non-Convex Optimization, Algorithms, Collaborative Filtering; Applications, Information Retrieval; Applications, Matrix and Tensor Factorization; , Theory, RL, Decisions and Control Theory

0

0

0

0

4:55

06/12/2020

High-Dimensional Contextual Policy Search with Unknown Context Rewards using Bayesian Optimization

Qing Feng , Ben Letham, Hongzi Mao, Eytan Bakshy

Keywords Paper

0

0

0

0

3:29

18/07/2021

Monotonic Robust Policy Optimization with Model Discrepancy

yuankun jiang, Chenglin Li, Wenrui Dai and
Junni Zou, Hongkai Xiong

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

5:17

16/11/2020

Positive-Unlabeled Reward Learning

Danfei Xu, Misha Denil

Keywords Paper

0

0

0

0

5:04

18/07/2021

Emphatic Algorithms for Deep Reinforcement Learning

Ray Jiang, Tom Zahavy, Zhongwen Xu and
Adam White, Matteo Hessel, Charles Blundell, Hado van Hasselt

Keywords Paper

Reinforcement Learning and Planning, Deep RL

1

0

0

0

5:21

06/12/2021

Which Mutual-Information Representation Learning Objectives are Sufficient for Control?

Kate Rakelly, Abhishek Gupta, Carlos Florensa, Sergey Levine

Keywords Paper

reinforcement learning and planning, representation learning

1

0

0

0

10:44

06/12/2021

COMBO: Conservative Offline Model-Based Policy Optimization

Tianhe Yu, Aviral Kumar, Rafael Rafailov and
Aravind Rajeswaran, Sergey Levine, Chelsea Finn

Keywords Paper

deep learning, optimization, reinforcement learning and planning

0

0

0

0

12:35

06/12/2021

Local policy search with Bayesian optimization

Sarah Müller, Alexander von Rohr, Sebastian Trimpe

Keywords Paper

theory, optimization, reinforcement learning and planning, active learning

0

0

0

0

11:42

06/12/2020

Non-Crossing Quantile Regression for Distributional Reinforcement Learning

Fan Zhou, Jianing Wang, Xingdong Feng

Keywords Paper

0

0

0

0

3:11

19/08/2021

Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts

Weinan Zhang, Xihuai Wang, Jian Shen, Ming Zhou

Keywords Paper

Machine Learning, Deep Reinforcement Learning, Multi-agent Learning

0

0

0

0

13:10

18/07/2021

Provably Correct Optimization and Exploration with Non-linear Policies

Fei Feng, Wotao Yin, Alekh Agarwal, Lin Yang

Keywords Paper

Deep Learning, Adversarial Networks, Applications, Fairness, Accountability, and Transparency, Theory, RL, Decisions and Control Theory

0

0

0

0

5:03

13/04/2021

Improving KernelSHAP: Practical shapley value estimation using linear regression

Ian Covert, Su-In Lee

Keywords Paper

0

0

0

0

2:52

06/12/2021

Conservative Offline Distributional Reinforcement Learning

Yecheng Ma, Dinesh Jayaraman, Osbert Bastani

Keywords Paper

reinforcement learning and planning

1

0

0

0

13:54

19/08/2021

Hindsight Trust Region Policy Optimization

Hanbo Zhang, Site Bai, Xuguang Lan and
David Hsu, Nanning Zheng

Keywords Paper

Machine Learning, Deep Reinforcement Learning, Reinforcement Learning

0

0

0

0

13:14