The Value-Improvement Path: Towards Better Representations for Reinforcement Learning

02/02/2021

The Value-Improvement Path: Towards Better Representations for Reinforcement Learning

Will Dabney, André Barreto, Mark Rowland, Robert Dadashi, John Quan, Marc G. Bellemare, David Silver

Keywords:

Abstract Paper Similar Papers

Abstract: In value-based reinforcement learning (RL), unlike in supervised learning, the agent faces not a single, stationary, approximation problem, but a sequence of value prediction problems. Each time the policy improves, the nature of the problem changes, shifting both the distribution of states and their values. In this paper we take a novel perspective, arguing that the value prediction problems faced by an RL agent should not be addressed in isolation, but rather as a single, holistic, prediction problem. An RL algorithm generates a sequence of policies that, at least approximately, improve towards the optimal policy. We explicitly characterize the associated sequence of value functions and call it the value-improvement path. Our main idea is to approximate the value-improvement path holistically, rather than to solely track the value function of the current policy. Specifically, we discuss the impact that this holistic view of RL has on representation learning. We demonstrate that a representation that spans the past value-improvement path will also provide an accurate value approximation for future policy improvements. We use this insight to better understand existing approaches to auxiliary tasks and to propose new ones. To test our hypothesis empirically, we augmented a standard deep RL agent with an auxiliary task of learning the value-improvement path. In a study of Atari 2600 games, the augmented agent achieved approximately double the mean and median performance of the baseline agent.

The video of this talk cannot be embedded. You can watch it here:

https://slideslive.com/38948914

(Link will open in new window)

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at AAAI 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

06/12/2021

Offline Reinforcement Learning as One Big Sequence Modeling Problem

Michael Janner, Qiyang Li, Sergey Levine

Keywords Paper

reinforcement learning and planning, transformers, language

0

0

0

0

9:48

06/12/2020

Value-driven Hindsight Modelling

Arthur Guez, Fabio Viola, Theophane Weber and
Lars Buesing, Steven Kapturowski, Doina Precup, David Silver, Nicolas Heess

Keywords Paper

1

0

0

0

3:20

13/04/2021

Logistic q-learning

Joan Bas-Serrano, Sebastian Curi, Andreas Krause, Gergely Neu

Keywords Paper

0

0

0

0

2:44

19/08/2021

Variational Model-based Policy Optimization

Yinlam Chow, Brandon Cui, Moonkyung Ryu, Mohammad Ghavamzadeh

Keywords Paper

Machine Learning, Reinforcement Learning

0

0

0

0

15:31

02/02/2021

Bayesian Distributional Policy Gradients

Luchen Li, A. Aldo Faisal

Keywords Paper

1

0

0

0

18:06

12/07/2020

Minimax Weight and Q-Function Learning for Off-Policy Evaluation

Masatoshi Uehara, Jiawei Huang, Nan Jiang

Keywords Paper

Reinforcement Learning - Theory

0

0

0

0

14:20

06/12/2021

Regret Minimization Experience Replay in Off-Policy Reinforcement Learning

Xu-Hui Liu, Zhenghai Xue, Jingcheng Pang and
Shengyi Jiang, Feng Xu, Yang Yu

Keywords Paper

theory, reinforcement learning and planning

0

0

0

0

14:06

18/07/2021

Provably Efficient Algorithms for Multi-Objective Competitive RL

Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra

Keywords Paper

Theory, RL, Decisions and Control Theory

0

0

0

0

17:04

18/07/2021

Convex Regularization in Monte-Carlo Tree Search

Tuan Q Dam, Carlo D'Eramo, Jan Peters, Joni Pajarinen

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

4:52

06/12/2020

Discovering Reinforcement Learning Algorithms

Junhyuk Oh, Matteo Hessel, Wojciech Czarnecki and
Zhongwen Xu, Hado van Hasselt, Satinder Singh, David Silver

Keywords Paper

0

0

0

0

3:21

18/07/2021

Representation Matters: Offline Pretraining for Sequential Decision Making

Mengjiao Yang, Ofir Nachum

Keywords Paper

Reinforcement Learning and Planning

1

0

0

0

5:06

06/12/2020

Deep Reinforcement and InfoMax Learning

Bogdan Mazoure, Remi Tachet des Combes, Thang Doan and
Philip Bachman, R Devon Hjelm

Keywords Paper

0

0

0

0

3:15

02/02/2021

Expected Eligibility Traces

Hado van Hasselt, Sephora Madjiheurem, Matteo Hessel and
David Silver, André Barreto, Diana Borsa

Keywords Paper

0

0

0

0

18:39

19/08/2021

Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts

Weinan Zhang, Xihuai Wang, Jian Shen, Ming Zhou

Keywords Paper

Machine Learning, Deep Reinforcement Learning, Multi-agent Learning

0

0

0

0

13:10

06/12/2020

The Value Equivalence Principle for Model-Based Reinforcement Learning

Christopher Grimm, Andre Barreto, Satinder Singh, David Silver

Keywords Paper

0

0

0

0

3:19

12/07/2020

Sequential Transfer in Reinforcement Learning with a Generative Model

Andrea Tirinzoni, Riccardo Poiani, Marcello Restelli

Keywords Paper

Reinforcement Learning - General

0

0

0

0

10:54

06/12/2021

Decision Transformer: Reinforcement Learning via Sequence Modeling

Lili Chen, Kevin Lu, Aravind Rajeswaran and
Kimin Lee, Aditya Grover, Misha Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch

Keywords Paper

deep learning, reinforcement learning and planning, transformers, generative model

0

0

0

0

6:51

13/04/2021

Model updating after interventions paradoxically introduces bias

James Liley, Samuel Emerson, Bilal Mateen and
Catalina Vallejos, Louis Aslett, Sebastian Vollmer

Keywords Paper

0

0

0

0

3:02

06/12/2021

Generalized Proximal Policy Optimization with Sample Reuse

James Queeney, Yannis Paschalidis, Christos G Cassandras

Keywords Paper

optimization, reinforcement learning and planning

0

0

0

0

13:45

06/12/2021

Risk-Aware Transfer in Reinforcement Learning using Successor Features

Michael Gimelfarb, Andre Barreto, Scott Sanner, Chi-Guhn Lee

Keywords Paper

reinforcement learning and planning, representation learning, transfer learning

0

0

0

0

12:06

18/07/2021

PODS: Policy Optimization via Differentiable Simulation

Miguel Angel Zamora Mora, Momchil Peychev, Sehoon Ha and
Martin Vechev, Stelian Coros

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

4:28

06/12/2020

Rewriting History with Inverse RL: Hindsight Inference for Policy Improvement

Benjamin Eysenbach, XINYANG GENG, Sergey Levine, Russ Salakhutdinov

Keywords Paper

Optimization -> Non-Convex Optimization, Theory -> Statistical Physics of Learning

0

0

0

0

3:19

26/04/2020

Explain Your Move: Understanding Agent Actions Using Focused Feature Saliency

Piyush Gupta, Nikaash Puri, Sukriti Verma and
Dhruv Kayastha, Shripad Deshmukh, Balaji Krishnamurthy, Sameer Singh

Keywords Paper

Deep Reinforcement Learning, Saliency maps, Chess, Atari games, Interpretable AI

0

0

0

0

4:59

06/12/2021

Stateful Strategic Regression

Keegan Harris, Hoda Heidari, Steven Wu

Keywords Paper

optimization

0

0

0

0

14:02

26/04/2020

Meta-Q-Learning

Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola

Keywords Paper

meta reinforcement learning, propensity estimation, off-policy

0

0

0

0

15:50

06/12/2021

Boosting with Multiple Sources

Corinna Cortes, Mehryar Mohri, Dmitry Storcheus, Ananda Theertha Suresh

Keywords Paper

federated learning

0

0

0

0

13:17

18/07/2021

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

Ilya Kostrikov, Rob Fergus, Jonathan Tompson, Ofir Nachum

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

4:49

12/07/2020

Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards

Umer Siddique, Paul Weng, Matthieu Zimmer

Keywords Paper

Reinforcement Learning - Deep RL

0

0

0

0

15:17

19/08/2021

Average-Reward Reinforcement Learning with Trust Region Methods

Xiaoteng Ma, Xiaohang Tang, Li Xia and
Jun Yang, Qianchuan Zhao

Keywords Paper

Machine Learning, Deep Reinforcement Learning, Reinforcement Learning, Markov Decision Processes

0

0

0

0

14:41

06/12/2021

Test-time Collective Prediction

Celestine Mendler-Dünner, Wenshuo Guo, Stephen Bates, Michael Jordan

Keywords Paper

machine learning, federated learning

0

0

0

0

14:10

18/07/2021

Taylor Expansion of Discount Factors

Yunhao Tang, Mark Rowland, Remi Munos, Michal Valko

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

5:22

03/05/2021

Learning Robust State Abstractions for Hidden-Parameter Block MDPs

Amy Zhang, Shagun Sodhani, Khimya Khetarpal, Joelle Pineau

Keywords Paper

bisimulation, block mdp, hidden-parameter mdp, multi-task reinforcement learning

0

0

0

0

4:17

06/12/2021

Local policy search with Bayesian optimization

Sarah Müller, Alexander von Rohr, Sebastian Trimpe

Keywords Paper

theory, optimization, reinforcement learning and planning, active learning

0

0

0

0

11:42

26/10/2020

Joint Inference of Reward Machines and Policies for Reinforcement Learning

Zhe Xu, Ivan Gavran, Yousef Ahmad and
Rupak Majumdar, Daniel Neider, Ufuk Topcu, Bo Wu

Keywords Paper

Reward Machines, Automata Learning, Reinforcement Learning

0

0

0

0

9:57

26/08/2020

Momentum in Reinforcement Learning

Nino Vieillard, Bruno Scherrer, Olivier Pietquin, Matthieu Geist

Keywords Paper

0

0

0

0

13:20

26/08/2020

Independent Subspace Analysis for Unsupervised Learning of Disentangled Representations

Jan Stuehmer, Richard Turner, Sebastian Nowozin

Keywords Paper

0

0

0

0

11:43

03/08/2020

No-regret Exploration in Contextual Reinforcement Learning

Aditya Modi, Ambuj Tewari

Keywords Paper

0

0

0

0

8:19

06/12/2020

Inverse Reinforcement Learning from a Gradient-based Learner

Giorgia Ramponi, Gianluca Drappo, Marcello Restelli

Keywords Paper

0

0

0

0

2:42

06/12/2021

Explicable Reward Design for Reinforcement Learning Agents

Rati Devidze, Goran Radanovic, Parameswaran Kamalaruban, Adish Singla

Keywords Paper

optimization, reinforcement learning and planning, interpretability

0

0

0

0

4:10

03/05/2021

On the role of planning in model-based deep reinforcement learning

Jessica Hamrick, Abram Friesen, Feryal Behbahani and
Arthur Guez, Fabio Viola, Sims Witherspoon, Thomas Anthony, Lars Buesing, Petar Veličković, Theo Weber

Keywords Paper

planning, MuZero, model-based RL

0

0

0

0

5:15