Temporally-Extended ε-Greedy Exploration

03/05/2021

Will Dabney, Georg Ostrovski, Andre Barreto

Keywords: reinforcement learning, exploration

Abstract: Recent work on exploration in reinforcement learning (RL) has led to a series of increasingly complex solutions to the problem. This increase in complexity often comes at the expense of generality. Recent empirical studies suggest that, when applied to a broader set of domains, some sophisticated exploration methods are outperformed by simpler counterparts, such as ε-greedy. In this paper we propose an exploration algorithm that retains the simplicity of ε-greedy while reducing dithering. We build on a simple hypothesis: the main limitation of ε-greedy exploration is its lack of temporal persistence, which limits its ability to escape local optima. We propose a temporally extended form of ε-greedy that simply repeats the sampled action for a random duration. It turns out that, for many duration distributions, this suffices to improve exploration on a large set of domains. Interestingly, a class of distributions inspired by ecological models of animal foraging behaviour yields particularly strong performance.
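
The idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the class name `TemporallyExtendedEpsilonGreedy`, the helper `make_zeta_sampler`, and the parameters `mu` and `max_duration` are hypothetical, and the truncated power-law (zeta-like) duration distribution is an assumed example of a heavy-tailed choice, since the abstract does not name a specific distribution.

```python
import random
from typing import Callable, Sequence


def make_zeta_sampler(mu: float = 2.0, max_duration: int = 1000) -> Callable[[random.Random], int]:
    """Return a sampler for durations n in [1, max_duration] with P(n) proportional to n**(-mu).

    Assumption: a truncated power law stands in for a heavy-tailed duration distribution.
    """
    durations = list(range(1, max_duration + 1))
    weights = [n ** -mu for n in durations]

    def sample(rng: random.Random) -> int:
        return rng.choices(durations, weights=weights, k=1)[0]

    return sample


class TemporallyExtendedEpsilonGreedy:
    """Epsilon-greedy that repeats each exploratory action for a random duration."""

    def __init__(self, num_actions: int, epsilon: float,
                 sample_duration: Callable[[random.Random], int], seed: int = 0):
        self.num_actions = num_actions
        self.epsilon = epsilon
        self.sample_duration = sample_duration
        self.rng = random.Random(seed)
        self.steps_left = 0          # remaining repeats of the current exploratory action
        self.repeated_action = None  # the exploratory action being repeated

    def act(self, q_values: Sequence[float]) -> int:
        # Still inside an exploratory run: keep repeating the sampled action.
        if self.steps_left > 0:
            self.steps_left -= 1
            return self.repeated_action
        # With probability epsilon, start a new exploratory run.
        if self.rng.random() < self.epsilon:
            self.repeated_action = self.rng.randrange(self.num_actions)
            self.steps_left = self.sample_duration(self.rng) - 1
            return self.repeated_action
        # Otherwise act greedily with respect to the current value estimates.
        return max(range(self.num_actions), key=lambda a: q_values[a])


if __name__ == "__main__":
    policy = TemporallyExtendedEpsilonGreedy(4, epsilon=0.1, sample_duration=make_zeta_sampler())
    print([policy.act([0.0, 1.0, 0.5, 0.2]) for _ in range(20)])
```

With a duration sampler that always returns 1, this sketch reduces to ordinary ε-greedy, which makes the role of temporal persistence easy to isolate in an experiment.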

The talk and the corresponding paper were published at the ICLR 2021 virtual conference.

Similar Papers

06/12/2021

Sample-Efficient Reinforcement Learning Is Feasible for Linearly Realizable MDPs with Limited Revisiting

Gen Li, Yuxin Chen, Yuejie Chi, Yuantao Gu, Yuting Wei

Keywords: theory, reinforcement learning and planning, generative model

15:34

06/12/2021

MADE: Exploration via Maximizing Deviation from Explored Regions

Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao, Yuandong Tian, Joseph Gonzalez, Stuart Russell

Keywords: reinforcement learning and planning

13:09

19/08/2021

Type-WA*: Using Exploration in Bounded Suboptimal Planning

Eldan Cohen, Richard Valenzano, Sheila McIlraith

Keywords: Planning and Scheduling, Planning Algorithms, Search in Planning and Scheduling, Heuristic Search

11:00

02/02/2021

Improving Causal Discovery By Optimal Bayesian Network Learning

Ni Y Lu, Kun Zhang, Changhe Yuan

15:12

06/12/2021

Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning

Gen Li, Laixi Shi, Yuxin Chen, Yuantao Gu, Yuejie Chi

Keywords: theory, reinforcement learning and planning

15:32

12/07/2020

Online Continual Learning from Imbalanced Data

Aristotelis Chrysakis, Marie-Francine Moens

Keywords: Transfer, Multitask and Meta-learning

15:12

02/02/2021

Theoretically Principled Deep RL Acceleration via Nearest Neighbor Function Approximation

Junhong Shen, Lin F. Yang

19:12

06/12/2020

Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards

Yijie Guo, Jongwook Choi, Marcin Moczulski, Shengyu Feng, Samy Bengio, Mohammad Norouzi, Honglak Lee

3:30

06/12/2020

Goal-directed Generation of Discrete Structures with Conditional Generative Models

Amina Mollaysa, Brooks Paige, Alexandros Kalousis

3:10

06/12/2020

MOPO: Model-based Offline Policy Optimization

Tianhe (Kevin) Yu, Garrett Thomas, Lantao Yu, Stefano Ermon, James Zou, Sergey Levine, Chelsea Finn, Tengyu Ma

3:30

13/04/2021

Online model selection for reinforcement learning with function approximation

Jonathan Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill

3:15

02/02/2021

Evolutionary Approach for AutoAugment Using the Thermodynamical Genetic Algorithm

Akira Terauchi, Naoki Mori

17:42

18/07/2021

Unbalanced minibatch Optimal Transport; applications to Domain Adaptation

Kilian Fatras, Thibault Séjourné, Rémi Flamary, Nicolas Courty

Keywords: Algorithms, Optimal Transport

4:57

26/08/2020

Adaptive Exploration in Linear Contextual Bandit

Botao Hao, Tor Lattimore, Csaba Szepesvari

14:29

06/12/2020

On Efficiency in Hierarchical Reinforcement Learning

Zheng Wen, Doina Precup, Morteza Ibrahimi, Andre Barreto, Benjamin Van Roy, Satinder Singh

3:05

18/07/2021

Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks

Sungryull Sohn, Sungtae Lee, Jongwook Choi, Harm van Seijen, Mehdi Fatemi, Honglak Lee

Keywords: Reinforcement Learning and Planning, Deep RL

5:19

12/07/2020

Thompson Sampling via Local Uncertainty

Zhendong Wang, Mingyuan Zhou

Keywords: Probabilistic Inference - Models and Probabilistic Programming

11:59

12/07/2020

History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms

Kaiyi Ji, Zhe Wang, Bowen Weng, Yi Zhou, Wei Zhang, Yingbin Liang

Keywords: Optimization - Non-convex

14:41

06/12/2021

Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks

Rong Zhu, Mattia Rigotti

Keywords: theory, deep learning, reinforcement learning and planning, bandits

8:45

06/12/2020

Effective Diversity in Population Based Reinforcement Learning

Jack Parker-Holder, Aldo Pacchiano, Krzysztof M Choromanski, Stephen J Roberts

3:23

06/12/2020

Promoting Stochasticity for Expressive Policies via a Simple and Efficient Regularization Method

Qi Zhou, Yufei Kuang, Zherui Qiu, Houqiang Li, Jie Wang

3:10

12/07/2020

On Thompson Sampling with Langevin Algorithms

Eric Mazumdar, Aldo Pacchiano, Yian Ma, Michael Jordan, Peter Bartlett

Keywords: Online Learning, Active Learning, and Bandits

15:33

18/07/2021

Adapting to misspecification in contextual bandits with offline regression oracles

Sanath Kumar Krishnamurthy, Vitor Hadad, Susan Athey

Keywords: Reinforcement Learning and Planning, Bandits

4:17

06/12/2020

Efficient Model-Based Reinforcement Learning through Optimistic Policy Search and Planning

Sebastian Curi, Felix Berkenkamp, Andreas Krause

3:23

06/12/2020

Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs

Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang

3:16

26/04/2020

Hypermodels for Exploration

Vikranth Dwaracherla, Xiuyuan Lu, Morteza Ibrahimi, Ian Osband, Zheng Wen, Benjamin Van Roy

Keywords: exploration, hypermodel, reinforcement learning

5:02

13/04/2021

Active online learning with hidden shifting domains

Yining Chen, Haipeng Luo, Tengyu Ma, Chicheng Zhang

3:06

06/12/2020

LoCo: Local Contrastive Representation Learning

Yuwen Xiong, Mengye Ren, Raquel Urtasun

3:18

06/12/2021

Implicit Task-Driven Probability Discrepancy Measure for Unsupervised Domain Adaptation

Mao Li, Kaiqi Jiang, Xinhua Zhang

Keywords: optimization, machine learning, generative model, domain adaptation

14:01

06/12/2020

Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters

Kaiyi Ji, Jason Lee, Yingbin Liang, H. Vincent Poor

3:11

26/04/2020

Exploration in Reinforcement Learning with Deep Covering Options

Yuu Jinnai, Jee Won Park, Marlos C. Machado, George Konidaris

Keywords: Reinforcement learning, temporal abstraction, exploration

5:06

06/12/2020

Beyond the Mean-Field: Structured Deep Gaussian Processes Improve the Predictive Uncertainties

Jakob Lindinger, David Reeb, Christoph Lippert, Barbara Rakitsch

3:21

02/02/2021

Maximum Roaming Multi-Task Learning

Lucas Pascal, Pietro Michiardi, Xavier Bost, Benoit Huet, Maria A. Zuluaga

19:54

06/12/2020

Dynamic allocation of limited memory resources in reinforcement learning

Nisheet Patel, Luigi Acerbi, Alexandre Pouget

3:19

23/08/2020

Diverse rule sets

Guangyi Zhang, Aristides Gionis

Keywords: sampling, classifier, pattern mining, rule learning, diversification, rule sets

9:41

03/05/2021

Exploring the Uncertainty Properties of Neural Networks’ Implicit Priors in the Infinite-Width Limit

Ben Adlam, Jaehoon Lee, Lechao Xiao, Jeffrey Pennington, Jasper Snoek

Keywords: Deep Learning, Bayesian Neural Networks, Neural Network Gaussian Process, Infinite-Width Limit, Uncertainty, Gaussian Process

4:34

06/12/2021

Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

Dylan J Foster, Akshay Krishnamurthy

Keywords: theory, reinforcement learning and planning, bandits, online learning

19:34

06/12/2020

Learning Guidance Rewards with Trajectory-space Smoothing

Tanmay Gangwani, Yuan Zhou, Jian Peng

3:16

06/12/2020

Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits

Jack Parker-Holder, Vu Nguyen, Stephen J Roberts

3:22

03/05/2021

MONGOOSE: A Learnable LSH Framework for Efficient Neural Network Training

Beidi Chen, Zichang Liu, Binghui Peng, Zhaozhuo Xu, Jonathan L Li, Tri Dao, Zhao Song, Anshumali Shrivastava, Christopher Re

Keywords: Randomized Algorithms, Efficient Training, Large-scale Machine Learning, Large-scale Deep Learning

15:07