Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments

03/05/2021

Daochen Zha, Wenye Ma, Lei Yuan, Xia Hu, Ji Liu

Keywords: Exploration, Reinforcement Learning, Self-Imitation, Generalization of Reinforcement Learning

Abstract: Exploration under sparse reward is a long-standing challenge of model-free reinforcement learning. The state-of-the-art methods address this challenge by introducing intrinsic rewards to encourage exploration in novel states or uncertain environment dynamics. Unfortunately, methods based on intrinsic rewards often fall short in procedurally-generated environments, where a different environment is generated in each episode so that the agent is not likely to visit the same state more than once. Motivated by how humans distinguish good exploration behaviors by looking into the entire episode, we introduce RAPID, a simple yet effective episode-level exploration method for procedurally-generated environments. RAPID regards each episode as a whole and gives an episodic exploration score from both per-episode and long-term views. Those highly scored episodes are treated as good exploration behaviors and are stored in a small ranking buffer. The agent then imitates the episodes in the buffer to reproduce the past good exploration behaviors. We demonstrate our method on several procedurally-generated MiniGrid environments, a first-person-view 3D Maze navigation task from MiniWorld, and several sparse MuJoCo tasks. The results show that RAPID significantly outperforms the state-of-the-art intrinsic reward strategies in terms of sample efficiency and final performance. The code is available at https://github.com/daochenzha/rapid
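The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scoring formula, the weights `w_local` and `w_global`, and the names `episode_score`, `RankingBuffer`, and `rank_episodes` are all assumptions made for illustration, and the imitation step is left out. Episodes are assumed to be non-empty lists of hashable states.

```python
import heapq
from collections import Counter
from math import sqrt


def episode_score(states, visit_counts, w_local=1.0, w_global=1.0):
    """Hypothetical episodic exploration score (an assumption, not the paper's formula)."""
    # Per-episode view: fraction of distinct states within this episode.
    local = len(set(states)) / len(states)
    # Long-term view: states rarely visited over all training so far score higher.
    long_term = sum(1.0 / sqrt(visit_counts[s]) for s in states) / len(states)
    return w_local * local + w_global * long_term


class RankingBuffer:
    """Keeps only the `capacity` highest-scoring episodes seen so far."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._heap = []    # min-heap of (score, insertion_id, episode)
        self._next_id = 0  # unique tie-breaker so episodes are never compared

    def add(self, score, episode):
        item = (score, self._next_id, episode)
        self._next_id += 1
        if len(self._heap) < self.capacity:
            heapq.heappush(self._heap, item)
        elif score > self._heap[0][0]:
            # Evict the current lowest-scoring episode.
            heapq.heapreplace(self._heap, item)

    def episodes(self):
        """Stored episodes, best score first."""
        return [ep for _, _, ep in sorted(self._heap, reverse=True)]


def rank_episodes(episodes, capacity):
    """Score each episode in order and return the buffer contents."""
    visit_counts = Counter()
    buffer = RankingBuffer(capacity)
    for states in episodes:
        # Counts include the current episode, so no count is ever zero.
        visit_counts.update(states)
        buffer.add(episode_score(states, visit_counts), states)
    return buffer.episodes()
```

In this sketch, an episode that revisits one state repeatedly scores low on both views and is the first to be evicted once the buffer is full; an agent would then be trained to imitate the state-action pairs of the episodes that remain.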

The talk and the paper were published at the ICLR 2021 virtual conference.

Similar Papers

06/12/2021

Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration

Lulu Zheng, Jiarui Chen, Jianhao Wang, Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang

Keywords: reinforcement learning and planning

12:25

26/04/2020

RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments

Roberta Raileanu, Tim Rocktäschel

Keywords: reinforcement learning, exploration, curiosity

4:48

19/08/2021

Don’t Do What Doesn’t Matter: Intrinsic Motivation with Action Usefulness

Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

Keywords: Machine Learning, Reinforcement Learning, Deep Reinforcement Learning

14:48

12/07/2020

Reward-Free Exploration for Reinforcement Learning

Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

Keywords: Reinforcement Learning - Theory

14:37

12/07/2020

Implicit Generative Modeling for Efficient Exploration

Neale Ratzlaff, Qinxun Bai, Fuxin Li, Wei Xu

Keywords: Reinforcement Learning - Deep RL

14:01

06/12/2021

MADE: Exploration via Maximizing Deviation from Explored Regions

Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao, Yuandong Tian, Joseph Gonzalez, Stuart Russell

Keywords: reinforcement learning and planning

13:09

18/07/2021

Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices

Evan Liu, Aditi Raghunathan, Percy Liang, Chelsea Finn

Keywords: Reinforcement Learning and Planning, Deep RL

5:41

02/02/2021

Sequential Generative Exploration Model for Partially Observable Reinforcement Learning

Haiyan Yin, Jianda Chen, Sinno Jialin Pan, Sebastian Tschiatschek

14:40

06/12/2020

Provably Efficient Reward-Agnostic Navigation with Linear Value Iteration

Andrea Zanette, Alessandro Lazaric, Mykel J Kochenderfer, Emma Brunskill

3:11

02/02/2021

Exploration by Maximizing Renyi Entropy for Reward-Free RL Framework

Chuheng Zhang, Yuanying Cai, Longbo Huang, Jian Li

16:03

18/07/2021

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Jin Zhang, Jianhao Wang, Hao Hu, Tong Chen, Yingfeng Chen, Changjie Fan, Chongjie Zhang

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

4:19

06/12/2020

Novelty Search in Representational Space for Sample Efficient Exploration

David Tao, Vincent Francois-Lavet, Joelle Pineau

3:04

18/07/2021

Fast active learning for pure exploration in reinforcement learning

Pierre MENARD, Omar Darwiche Domingues, Anders Jonsson, Emilie Kaufmann, Edouard Leurent, Michal Valko

Keywords: Theory, RL, Decisions and Control Theory

4:54

18/07/2021

Cooperative Exploration for Multi-Agent Deep Reinforcement Learning

Iou-Jen Liu, Unnat Jain, Raymond Yeh, Alex Schwing

Keywords: Reinforcement Learning and Planning, Deep RL

20:35

06/12/2021

(Almost) Free Incentivized Exploration from Decentralized Learning Agents

Chengshuai Shi, Haifeng Xu, Wei Xiong, Cong Shen

Keywords: reinforcement learning and planning, bandits

14:00

06/12/2021

Landmark-Guided Subgoal Generation in Hierarchical Reinforcement Learning

Junsu Kim, Younggyo Seo, Jinwoo Shin

Keywords: reinforcement learning and planning, graph learning

13:42

26/04/2020

Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies

Sungryull Sohn, Hyunjae Woo, Jongwook Choi, Honglak Lee

Keywords: Meta reinforcement learning, subtask graph

5:26

06/12/2020

PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals

Henry Charlesworth, Giovanni Montana

3:20

18/07/2021

MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

Kevin Li, Abhishek Gupta, Ashwin D Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

Keywords: Reinforcement Learning and Planning, Deep RL

5:16

06/12/2021

Learning to Synthesize Programs as Interpretable and Generalizable Policies

Dweep Trivedi, Jesse Zhang, Shao-Hua Sun, Joseph Lim

Keywords: deep learning, reinforcement learning and planning

10:30

12/07/2020

Active World Model Learning in Agent-rich Environments with Progress Curiosity

Kuno Kim, Megumi Sano, Julian De Freitas, Nick Haber, Daniel Yamins

Keywords: Applications - Neuroscience, Cognitive Science, Biology and Health

15:25

06/12/2020

The NetHack Learning Environment

Heinrich Küttler, Nantas Nardelli, Alexander Miller, Roberta Raileanu, Marco Selvatici, Edward Grefenstette, Tim Rocktäschel

3:14

18/07/2021

On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game

Shuang Qiu, Jieping Ye, Zhaoran Wang, Zhuoran Yang

Keywords: Reinforcement Learning and Planning

11:19

06/12/2021

A Max-Min Entropy Framework for Reinforcement Learning

Seungyul Han, Youngchul Sung

Keywords: optimization, reinforcement learning and planning

14:35

18/07/2021

Task-Optimal Exploration in Linear Dynamical Systems

Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

Keywords: Theory, RL, Decisions and Control Theory

19:33

02/02/2021

Exploration via State Influence Modeling

Yongxin Kang, Enmin Zhao, Kai Li, Junliang Xing

14:03

12/07/2020

Flexible and Efficient Long-Range Planning Through Curious Exploration

Aidan Curtis, Minjian Xin, Dilip Arumugam, Kevin Feigelis, Daniel Yamins

Keywords: Reinforcement Learning - General

16:25

06/12/2021

Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies

Tim Seyde, Igor Gilitschenski, Wilko Schwarting, Bartolomeo Stellato, Martin Riedmiller, Markus Wulfmeier, Daniela Rus

Keywords: reinforcement learning and planning

6:48

06/12/2021

Learning Markov State Abstractions for Deep Reinforcement Learning

Cameron Allen, Neev Parikh, Omer Gottesman, George Konidaris

Keywords: reinforcement learning and planning, contrastive learning, representation learning

12:31

06/12/2020

Adversarial Style Mining for One-Shot Unsupervised Domain Adaptation

Yawei Luo, Ping Liu, Tao Guan, Junqing Yu, Yi Yang

3:22

12/07/2020

Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning

Dipendra Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford

Keywords: Reinforcement Learning - Theory

15:43

05/01/2021

Proposal Learning for Semi-Supervised Object Detection

Peng Tang, Chetan Ramaiah, Yan Wang, Ran Xu, Caiming Xiong

4:51

06/12/2021

Offline Meta Reinforcement Learning -- Identifiability Challenges and Effective Data Collection Strategies

Ron Dorfman, Idan Shenfeld, Aviv Tamar

Keywords: reinforcement learning and planning

14:44

12/07/2020

Learning Human Objectives by Evaluating Hypothetical Behavior

Siddharth Reddy, Anca Dragan, Sergey Levine, Shane Legg, Jan Leike

Keywords: Reinforcement Learning - Deep RL

10:21

26/04/2020

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

Siddharth Reddy, Anca D. Dragan, Sergey Levine

Keywords: Imitation Learning, Reinforcement Learning

4:38

18/07/2021

State Entropy Maximization with Random Encoders for Efficient Exploration

Younggyo Seo, Lili Chen, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee

Keywords: Reinforcement Learning and Planning, Deep RL

5:02

06/12/2021

Visual Adversarial Imitation Learning using Variational Models

Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn

Keywords: theory, reinforcement learning and planning, adversarial robustness and security, representation learning

7:25

02/02/2021

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Stefanos Leonardos, Georgios Piliouras

20:17

06/12/2021

An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning

Tianpei Yang, Weixun Wang, Hongyao Tang, Jianye Hao, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yingfeng Chen, Yujing Hu, Changjie Fan, Chengwei Zhang

Keywords: reinforcement learning and planning, transfer learning

15:21

22/09/2020

Deep Bayesian bandits: Exploring in online personalized recommendations

Dalin Guo, Sofia Ira Ktena, Pranay Kumar Myana, Ferenc Huszar, Wenzhe Shi, Alykhan Tejani, Michael Kneier, Sourav Das

Keywords: Contextual bandit, Recommender Systems, Algorithmic bias

2:59