Non-decreasing Quantile Function Network with Efficient Exploration for Distributional Reinforcement Learning

19/08/2021

Fan Zhou, Zhoufan Zhu, Qi Kuang, Liwen Zhang

Keywords: Machine Learning, Deep Reinforcement Learning

Abstract: Although distributional reinforcement learning (DRL) has been widely examined in the past few years, two open questions remain. The first is how to ensure the validity of the learned quantile function; the second is how to use the distribution information efficiently. This paper offers new perspectives to encourage future in-depth studies of both questions. We first propose a non-decreasing quantile function network (NDQFN) to guarantee the monotonicity of the obtained quantile estimates, and then design a general exploration framework for DRL, called distributional prediction error (DPE), which utilizes the entire distribution of the quantile function. We discuss the theoretical necessity of our method and show the performance gain it achieves in practice by comparing it with several competitors on Atari 2600 games, especially on hard-exploration games.
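
To make the monotonicity requirement concrete, here is a minimal sketch of one generic way to force quantile estimates to be non-decreasing: pass unconstrained outputs through a non-negative function and accumulate them on top of a baseline. This is an illustration under stated assumptions, not the NDQFN architecture from the paper; the function names `softplus` and `monotone_quantiles`, the choice of softplus, and the inputs are all hypothetical.

```python
import numpy as np


def softplus(x):
    # Numerically stable log(1 + exp(x)); the result is never negative.
    return np.logaddexp(0.0, x)


def monotone_quantiles(baseline, raw_increments):
    """Turn unconstrained values into non-decreasing quantile estimates.

    baseline: estimate at the lowest quantile level (hypothetical input).
    raw_increments: unconstrained values, e.g. raw network outputs
        (hypothetical input), one per gap between adjacent quantile levels.
    """
    increments = softplus(np.asarray(raw_increments, dtype=float))
    # A cumulative sum of non-negative steps can never decrease.
    return baseline + np.concatenate(([0.0], np.cumsum(increments)))


# Illustrative usage with random stand-ins for network outputs.
rng = np.random.default_rng(0)
q = monotone_quantiles(-1.0, rng.normal(size=7))
assert np.all(np.diff(q) >= 0.0)
```

Because the ordering is built into the parameterization, no estimate at a higher quantile level can fall below one at a lower level, whatever the raw outputs are.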

The talk and the respective paper were published at the IJCAI 2021 virtual conference.
