Prioritized Level Replay

18/07/2021

Minqi Jiang, Edward Grefenstette, Tim Rocktäschel

Keywords: Reinforcement Learning and Planning, Deep RL

Abstract: Environments with procedurally generated content serve as important benchmarks for testing systematic generalization in deep reinforcement learning. In this setting, each level is an algorithmically created environment instance with a unique configuration of its factors of variation. Training on a prespecified subset of levels allows for testing generalization to unseen levels. What can be learned from a level depends on the current policy, yet prior work defaults to uniform sampling of training levels independently of the policy. We introduce Prioritized Level Replay (PLR), a general framework for selectively sampling the next training level by prioritizing those with higher estimated learning potential when revisited in the future. We show TD-errors effectively estimate a level's future learning potential and, when used to guide the sampling procedure, induce an emergent curriculum of increasingly difficult levels. By adapting the sampling of training levels, PLR significantly improves sample-efficiency and generalization on Procgen Benchmark—matching the previous state-of-the-art in test return—and readily combines with other methods. Combined with the previous leading method, PLR raises the state-of-the-art to over 76% improvement in test return relative to standard RL baselines.
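The sampling idea in the abstract can be illustrated with a minimal sketch. It assumes a level is scored by the mean absolute one-step TD-error of its last episode, and that scores are turned into sampling probabilities by rank with a temperature. The abstract specifies neither choice, and the function names, parameters, and level IDs below are hypothetical, not the authors' implementation.

```python
import random


def mean_abs_td_error(rewards, values, gamma=0.99):
    """Average |r_t + gamma * V(s_{t+1}) - V(s_t)| over one episode.

    `values` has one more entry than `rewards`: the last entry is the
    bootstrap value of the final state (0.0 if the episode terminated).
    """
    errors = [
        abs(r + gamma * values[t + 1] - values[t])
        for t, r in enumerate(rewards)
    ]
    return sum(errors) / len(errors)


def replay_distribution(scores, temperature=0.1):
    """Map per-level scores to sampling probabilities by rank (assumed scheme).

    The highest score gets rank 1; each level's weight is
    (1 / rank) ** (1 / temperature), then weights are normalized.
    """
    order = sorted(scores, key=scores.get, reverse=True)
    weights = {
        level: (1.0 / rank) ** (1.0 / temperature)
        for rank, level in enumerate(order, start=1)
    }
    total = sum(weights.values())
    return {level: w / total for level, w in weights.items()}


def sample_level(scores, rng, temperature=0.1):
    """Draw the next training level, favoring high-score levels."""
    probs = replay_distribution(scores, temperature)
    levels = list(probs)
    return rng.choices(levels, weights=[probs[lv] for lv in levels], k=1)[0]


# Hypothetical usage: three level seeds with scores from earlier episodes.
scores = {"level_0": 0.1, "level_1": 0.9, "level_2": 0.5}
next_level = sample_level(scores, random.Random(0))
```

A lower temperature concentrates sampling on the top-ranked levels, while a temperature near 1 spreads it more evenly. After each episode, the visited level's score would be refreshed with `mean_abs_td_error`, so the distribution follows the current policy rather than staying uniform.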

The talk and the respective paper were published at the ICML 2021 virtual conference.

Similar Papers

06/12/2020

Information-theoretic Task Selection for Meta-Reinforcement Learning

Ricardo Luna Gutierrez, Matteo Leonetti

2:57

06/12/2020

Self-Paced Deep Reinforcement Learning

Pascal Klink, Carlo D'Eramo, Jan Peters, Joni Pajarinen

3:00

13/04/2021

Curriculum learning by optimizing learning dynamics

Tianyi Zhou, Shengjie Wang, Jeff Bilmes

3:03

07/09/2020

Rethinking Curriculum Learning with Incremental Labels and Adaptive Compensation

Madan Ravi Ganesh, Jason Corso

Keywords: label smoothing, curriculum learning, incremental labels, adaptive compensation, negative mining

5:18

03/05/2021

Adaptive Procedural Task Generation for Hard-Exploration Problems

Kuan Fang, Yuke Zhu, Silvio Savarese, Li Fei-Fei

Keywords: reinforcement learning, task generation, procedural generation, curriculum learning

5:06

06/12/2020

Automatic Curriculum Learning through Value Disagreement

Yunzhi Zhang, Pieter Abbeel, Lerrel Pinto

3:17

12/07/2020

Optimizing Data Usage via Differentiable Rewards

Xinyi Wang, Hieu Pham, Paul Michel, Antonios Anastasopoulos, Jaime Carbonell, Graham Neubig

Keywords: Deep Learning - Algorithms

12:53

03/05/2021

When Do Curricula Work?

Xiaoxia (Shirley) Wu, Ethan Dyer, Behnam Neyshabur

Keywords: Empirical Investigation, Understanding Deep Learning, Curriculum Learning

14:37

14/09/2020

Partial Label Learning via Self-Paced Curriculum Strategy

Gengyu Lyu, Songhe Feng, Yi Jin, Yidong Li

Keywords: partial-label learning, self-paced learning strategy, curriculum learning strategy, instructor-student-collaborative

6:46

18/07/2021

PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training

Kimin Lee, Laura Smith, Pieter Abbeel

Keywords: Reinforcement Learning and Planning, Deep RL

15:02

06/12/2021

Deep Learning on a Data Diet: Finding Important Examples Early in Training

Mansheej Paul, Surya Ganguli, Gintare Karolina Dziugaite

Keywords: deep learning

10:18

26/04/2020

Automated curriculum generation through setter-solver interactions

Sebastien Racaniere, Andrew Lampinen, Adam Santoro, David Reichert, Vlad Firoiu, Timothy Lillicrap

Keywords: Deep Reinforcement Learning, Automatic Curriculum

3:55

06/12/2020

Structured Prediction for Conditional Meta-Learning

Ruohan Wang, Yiannis Demiris, Carlo Ciliberto

3:12

06/12/2021

Meta-learning with an Adaptive Task Scheduler

Huaxiu Yao, Yu Wang, Ying Wei, Peilin Zhao, Mehrdad Mahdavi, Defu Lian, Chelsea Finn

Keywords: optimization, meta learning

15:12

12/07/2020

Data Valuation using Reinforcement Learning

Jinsung Yoon, Sercan Arik, Tomas Pfister

Keywords: Deep Learning - Algorithms

14:35

03/05/2021

Learning the Pareto Front with Hypernetworks

Aviv Navon, Aviv Shamsian, Ethan Fetaya, Gal Chechik

Keywords: multi-task learning, Multi-objective optimization

5:19

26/04/2020

Meta-Q-Learning

Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola

Keywords: meta reinforcement learning, propensity estimation, off-policy

15:50

12/07/2020

Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data

Felipe Petroski Such, Aditya Rawal, Joel Lehman, Kenneth Stanley, Jeffrey Clune

Keywords: Transfer, Multitask and Meta-learning

7:25

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

18/07/2021

Offline Meta-Reinforcement Learning with Advantage Weighting

Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

5:08

06/12/2020

Submodular Meta-Learning

Arman Adibi, Aryan Mokhtari, Hamed Hassani

3:17

16/11/2020

Improving Grammatical Error Correction Models with Purpose-Built Adversarial Examples

Lihao Wang, Xiaoqing Zheng

Keywords: grammatical correction, sequence-to-sequence learning, neural networks, gec

11:40

06/12/2020

Model-based Adversarial Meta-Reinforcement Learning

Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma

3:31

12/07/2020

Provably Efficient Model-based Policy Adaptation

Yuda Song, Aditi Mavalankar, Wen Sun, Sicun Gao

Keywords: Reinforcement Learning - Deep RL

15:49

18/07/2021

Optimizing Black-box Metrics with Iterative Example Weighting

Gaurush Hiranandani, Jatin Mathur, Harikrishna Narasimhan, Mahdi Milani Fard, Sanmi Koyejo

Keywords: Algorithms, Supervised Learning

5:49

18/07/2021

Improved OOD Generalization via Adversarial Training and Pretraining

Mingyang Yi, Lu Hou, Jiacheng Sun, Lifeng Shang, Xin Jiang, Qun Liu, Zhiming Ma

Keywords: Theory, Deep learning Theory

4:11

06/12/2020

An Imitation from Observation Approach to Transfer Learning with Dynamics Mismatch

Siddharth Desai, Ishan Durugkar, Haresh Karnan, Garrett Warnell, Josiah Hanna, Peter Stone

3:22

14/06/2020

Fast Template Matching and Update for Video Object Tracking and Segmentation

Mingjie Sun, Jimin Xiao, Eng Gee Lim, Bingfeng Zhang, Yao Zhao

Keywords: video object segmentation, video object tracking, reinforcement learning

1:01

03/05/2021

Parrot: Data-Driven Behavioral Priors for Reinforcement Learning

Avi Singh, Huihan Liu, Gaoyue Zhou, Albert Yu, Nicholas Rhinehart, Sergey Levine

Keywords: reinforcement learning, imitation learning

14:21

16/11/2020

Improving Text Generation with Student-Forcing Optimal Transport

Jianqiao Li, Chunyuan Li, Guoyin Wang, Hao Fu, Yuhchen Lin, Liqun Chen, Yizhe Zhang, Chenyang Tao, Ruiyi Zhang, Wenlin Wang, Dinghan Shen, Qian Yang, Lawrence Carin

Keywords: testing, ot learning, machine translation, text summarization

11:51

19/08/2021

Learning with Selective Forgetting

Takashi Shibata, Go Irie, Daiki Ikami, Yu Mitsuzumi

Keywords: Computer Vision, Recognition, Incremental Learning

12:43

19/04/2021

Does the order of training samples matter? Improving neural data-to-text generation with curriculum learning

Ernie Chang, Hui-Syuan Yeh, Vera Demberg

5:42

06/12/2021

LADA: Look-Ahead Data Acquisition via Augmentation for Deep Active Learning

Yoon-Yeong Kim, Kyungwoo Song, JoonHo Jang, Il-chul Moon

Keywords: deep learning, active learning

14:01

02/02/2021

Generalising without Forgetting for Lifelong Person Re-Identification

Guile Wu, Shaogang Gong

17:10

26/04/2020

Understanding Knowledge Distillation in Non-autoregressive Machine Translation

Chunting Zhou, Jiatao Gu, Graham Neubig

Keywords: knowledge distillation, non-autoregressive neural machine translation

4:55

07/09/2020

Zero-Shot Domain Generalization

Udit Maniyar, Joseph K J, Aniket Anand Deshmukh, Urun Dogan, Vineeth N Balasubramanian

Keywords: Domain Generalization, zero-shot learning, semantic space, multi task learning, Learning with limited data, representation learning, classification

9:59

02/02/2021

DeepCollaboration: Collaborative Generative and Discriminative Models for Class Incremental Learning

Bo Cui, Guyue Hu, Shan Yu

15:13

06/12/2021

Replay-Guided Adversarial Environment Design

Minqi Jiang, Michael Dennis, Jack Parker-Holder, Jakob Foerster, Edward Grefenstette, Tim Rocktäschel

Keywords: theory, reinforcement learning and planning, robustness, self-supervised learning, continual learning

12:04

13/04/2021

Understanding robustness in teacher-student setting: A new perspective

Zhuolin Yang, Zhaoxi Chen, Tiffany Cai, Xinyun Chen, Bo Li, Yuandong Tian

3:03

06/12/2020

Continuous Meta-Learning without Tasks

James Harrison, Apoorva Sharma, Chelsea Finn, Marco Pavone

3:09