Adapting to Reward Progressivity via Spectral Reinforcement Learning

Abstract: In this paper we consider reinforcement learning tasks with progressive rewards; that is, tasks where the rewards tend to increase in magnitude over time. We hypothesise that this property may be problematic for value-based deep reinforcement learning agents, particularly if the agent must first succeed in relatively unrewarding regions of the task in order to reach more rewarding regions. To address this issue, we propose Spectral DQN, which decomposes the reward into frequencies such that the high frequencies only activate when large rewards are found. This allows the training loss to be balanced so that it weights small and large reward regions more evenly. In two domains with extreme reward progressivity, where standard value-based methods struggle significantly, Spectral DQN makes much further progress. Moreover, when evaluated on a set of six standard Atari games that do not overtly favour the approach, Spectral DQN remains more than competitive: while it underperforms one of the benchmarks in a single game, it comfortably surpasses the benchmarks in three games. These results demonstrate that the approach is not overfit to its target problem, and suggest that Spectral DQN may have advantages beyond addressing reward progressivity.
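
The idea of splitting a reward into frequencies, with the high frequencies activating only for large rewards, can be illustrated with a small sketch. This is not taken from the paper: the function names `spectral_decompose` and `spectral_reconstruct`, the geometric `base`, the clipping rule, and the restriction to non-negative rewards are all assumptions chosen to match the abstract's description, and the authors' actual decomposition may differ.

```python
def spectral_decompose(reward, base=2.0, num_freqs=8):
    """Split a non-negative reward into num_freqs components, each in [0, 1].

    Hypothetical scheme (an assumption, not the paper's definition):
    component i carries weight base**i and only becomes non-zero once the
    reward exceeds the total weight of all lower components.
    """
    components = []
    offset = 0.0  # reward magnitude already covered by lower frequencies
    for i in range(num_freqs):
        scale = base ** i
        # Clip to [0, 1] so every component has the same bounded range.
        value = min(max((reward - offset) / scale, 0.0), 1.0)
        components.append(value)
        offset += scale
    return components


def spectral_reconstruct(components, base=2.0):
    """Recombine components into a scalar reward (inverse of the above,
    provided the reward did not exceed the representable maximum)."""
    return sum(value * base ** i for i, value in enumerate(components))


if __name__ == "__main__":
    for r in (0.5, 5.0, 100.0):
        parts = spectral_decompose(r)
        print(r, parts, spectral_reconstruct(parts))
```

Under these assumptions, a small reward only touches the lowest component while a large reward saturates the low components and spills into higher ones. Because every component lies in the same range, a per-component loss would weight small-reward and large-reward regions more evenly than a loss on the raw scalar.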

19/08/2021

Michael Dann, John Thangarajah

Similar Papers

Hindsight Trust Region Policy Optimization

Hanbo Zhang, Site Bai, Xuguang Lan, David Hsu, Nanning Zheng

Machine Learning, Deep Reinforcement Learning, Reinforcement Learning

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

Siddharth Reddy, Anca D. Dragan, Sergey Levine

Imitation Learning, Reinforcement Learning

Robust Deep Reinforcement Learning through Adversarial Loss

Tuomas Oikarinen, Wang Zhang, Alexandre Megretski, Luca Daniel, Tsui-Wei Weng

reinforcement learning and planning, robustness, adversarial robustness and security

Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction

Gal Dalal, Assaf Hallak, Steven Dalton, Iuri Frosio, Shie Mannor, Gal Chechik

theory, reinforcement learning and planning

Deep Reinforcement Learning at the Edge of the Statistical Precipice

Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc Bellemare

reinforcement learning and planning

Non-Crossing Quantile Regression for Distributional Reinforcement Learning

Fan Zhou, Jianing Wang, Xingdong Feng

Provably Efficient Algorithms for Multi-Objective Competitive RL

Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra

Theory, RL, Decisions and Control Theory

Data-Efficient Reinforcement Learning with Self-Predictive Representations

Max Schwarzer, Ankesh Anand, Rishab Goel, R Devon Hjelm, Aaron Courville, Philip Bachman

Representation Learning, Self-Supervised Learning, Reinforcement Learning, Sample Efficiency

Sequential Generative Exploration Model for Partially Observable Reinforcement Learning

Haiyan Yin, Jianda Chen, Sinno Jialin Pan, Sebastian Tschiatschek

Independence-aware Advantage Estimation

Pushi Zhang, Li Zhao, Guoqing Liu, Jiang Bian, Minlie Huang, Tao Qin, Tie-Yan Liu

Machine Learning, Reinforcement Learning, Deep Reinforcement Learning

Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences

Daniel Brown, Scott Niekum, Russell Coleman, Ravi Srinivasan

Reinforcement Learning - Deep RL

Reward-Constrained Behavior Cloning

Zhaorong Wang, Meng Wang, Jingqi Zhang, Yingfeng Chen, Chongjie Zhang

Machine Learning, Deep Reinforcement Learning, Reinforcement Learning, Constraint Optimization

An Optimistic Perspective on Offline Deep Reinforcement Learning

Rishabh Agarwal, Dale Schuurmans, Mohammad Norouzi

Reinforcement Learning - Deep RL

Active deep Q-learning with demonstration

Si-An Chen, Hsuan-Tien Lin, Voot Tangkaratt, Masashi Sugiyama

GaussianPath: A Bayesian Multi-Hop Reasoning Framework for Knowledge Graph Reasoning

Guojia Wan, Bo Du

No-regret reinforcement learning with heavy-tailed rewards

Vincent Zhuang, Yanan Sui

Fast Task Inference with Variational Intrinsic Successor Features

Steven Hansen, Will Dabney, Andre Barreto, David Warde-Farley, Tom Van de Wiele, Volodymyr Mnih

Reinforcement Learning, Variational Intrinsic Control, Successor Features

On Bonus Based Exploration Methods In The Arcade Learning Environment

Adrien Ali Taiga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

exploration, arcade learning environment, bonus-based methods

Exploring supervised and unsupervised rewards in machine translation

Julia Ive, Zixu Wang, Marina Fomicheva, Lucia Specia

Trust the Model When It Is Confident: Masked Model-based Actor-Critic

Feiyang Pan, Jia He, Dandan Tu, Qing He

Distributional Reinforcement Learning for Multi-Dimensional Reward Functions

Pushi Zhang, Xiaoyu Chen, Li Zhao, Wei Xiong, Tao Qin, Tie-Yan Liu
