Average-Reward Learning and Planning with Options

Abstract: We extend the options framework for temporal abstraction in reinforcement learning from discounted Markov decision processes (MDPs) to average-reward MDPs. Our contributions include general convergent off-policy inter-option learning algorithms, intra-option algorithms for learning values and models, as well as sample-based planning variants of our learning algorithms. Our algorithms and convergence proofs extend those recently developed by Wan, Naik, and Sutton. We also extend the notion of option-interrupting behaviour from the discounted to the average-reward formulation. We show the efficacy of the proposed algorithms with experiments on a continuing version of the Four-Room domain.

03/08/2020

Yi Wan, Abhishek Naik, Rich Sutton

Similar Papers

No-regret Exploration in Contextual Reinforcement Learning

Aditya Modi, Ambuj Tewari

Deep Rao-Blackwellised Particle Filters for Time Series Forecasting

Richard Kurle, Syama Sundar Rangapuram, Emmanuel de Bézenac, Stephan Günnemann, Jan Gasthaus

Reinforcement Learning of Sequential Price Mechanisms

Gianluca Brero, Alon Eden, Matthias Gerstgrasser, David Parkes, Duncan Rheingans-Yoo

Lipschitz Lifelong Reinforcement Learning

Erwan Lecarpentier, David Abel, Kavosh Asadi, Yuu Jinnai, Emmanuel Rachelson, Michael L. Littman

Reward is enough for convex MDPs

Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh

Meta-Q-Learning

Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola

Keywords: meta reinforcement learning, propensity estimation, off-policy

Deep Reinforcement and InfoMax Learning

Bogdan Mazoure, Remi Tachet des Combes, Thang Doan, Philip Bachman, R Devon Hjelm

Dynamic Automaton-Guided Reward Shaping for Monte Carlo Tree Search

Alvaro Velasquez, Brett Bissey, Lior Barak, Andre Beckus, Ismail Alkhouri, Daniel Melcer, George Atia

Generalised Bayesian Filtering via Sequential Monte Carlo

Ayman Boustati, Omer Deniz Akyildiz, Theo Damoulas, Adam Johansen

On the Convergence Theory of Debiased Model-Agnostic Meta-Reinforcement Learning

Alireza Fallah, Kristian Georgiev, Aryan Mokhtari, Asuman Ozdaglar

Keywords: theory, optimization, reinforcement learning and planning, meta learning

Flexible Option Learning

Martin Klissarov, Doina Precup

Harmonized Dense Knowledge Distillation Training for Multi-Exit Architectures

Xinglu Wang, Yingming Li

Offline Reinforcement Learning as One Big Sequence Modeling Problem

Michael Janner, Qiyang Li, Sergey Levine

Keywords: reinforcement learning and planning, transformers, language

Invariant Causal Prediction for Block MDPs

Clare Lyle, Amy Zhang, Angelos Filos, Shagun Sodhani, Marta Kwiatkowska, Yarin Gal, Doina Precup, Joelle Pineau

Learning Robust State Abstractions for Hidden-Parameter Block MDPs

Amy Zhang, Shagun Sodhani, Khimya Khetarpal, Joelle Pineau

Keywords: bisimulation, block MDP, hidden-parameter MDP, multi-task reinforcement learning

Discount Factor as a Regularizer in Reinforcement Learning

Ron Amit, Kamil Ciosek, Ron Meir

Counterfactual representation learning with balancing weights

Serge Assaad, Shuxi Zeng, Chenyang Tao, Shounak Datta, Nikhil Mehta, Ricardo Henao, Fan Li, Lawrence Carin Duke

Sharing Knowledge in Multi-Task Deep Reinforcement Learning

Carlo D'Eramo, Davide Tateo, Andrea Bonarini, Marcello Restelli, Jan Peters

Keywords: deep reinforcement learning, multi-task

Asynchronous Coagent Networks

James Kostas, Chris Nota, Philip Thomas

Towards Minimax Optimal Reinforcement Learning in Factored Markov Decision Processes

Yi Tian, Jian Qian, Suvrit Sra

Minimax Weight and Q-Function Learning for Off-Policy Evaluation

Masatoshi Uehara, Jiawei Huang, Nan Jiang

Logistic q-learning

Joan Bas-Serrano, Sebastian Curi, Andreas Krause, Gergely Neu

Learning infinite-horizon average-reward MDPs with linear function approximation

Chen-Yu Wei, Mehdi Jafarnia Jahromi, Haipeng Luo, Rahul Jain

PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals

Henry Charlesworth, Giovanni Montana
