FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Abstract: We study the offline meta-reinforcement learning (OMRL) problem, a paradigm that enables reinforcement learning (RL) algorithms to adapt quickly to unseen tasks without any interaction with the environment, making RL truly practical in many real-world applications. The problem is still not fully understood, and two major challenges need to be addressed. First, offline RL usually suffers from bootstrapping errors on out-of-distribution state-action pairs, which lead to divergence of value functions. Second, meta-RL requires efficient and robust task inference learned jointly with the control policy. In this work, we enforce behavior regularization on the learned policy as a general approach to offline RL, combined with a deterministic context encoder for efficient task inference. We propose a novel negative-power distance metric on a bounded context embedding space, whose gradient propagation is detached from the Bellman backup. We provide analysis and insight showing that some simple design choices can yield substantial improvements over recent approaches involving meta-RL and distance metric learning. To the best of our knowledge, our method is the first model-free and end-to-end OMRL algorithm, which is computationally efficient and demonstrated to outperform prior algorithms on several meta-RL benchmarks.

03/05/2021

Algorithms -> Representation Learning; Algorithms -> Structured Prediction; Applications -> Computational Biology and Bioinform, Deep Learning -> Embedding Approaches

06/12/2021

Christian Henning, Maria Cervera, Francesco D'Angelo, Johannes von Oswald, Regina Traber, Benjamin Ehret, Seijin Kobayashi, Benjamin F. Grewe, João Sacramento

FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Lanqing Li, Rui Yang, Dijun Luo

Similar Papers

Batch Reinforcement Learning Through Continuation Method

Yijie Guo, Shengyu Feng, Nicolas Le Roux, Ed H. Chi, Honglak Lee, Minmin Chen

batch reinforcement learning, relaxed regularization, continuation method

Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning

Yiqin Yang, Xiaoteng Ma, Li Chenghao, Zewu Zheng, Qiyuan Zhang, Gao Huang, Jun Yang, Qianchuan Zhao

Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble

Gaon An, Seungyong Moon, Jang-Hyun Kim, Hyun Oh Song

deep learning, reinforcement learning and planning

EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL

Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Gu

COMBO: Conservative Offline Model-Based Policy Optimization

Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn

deep learning, optimization, reinforcement learning and planning

Conservative Offline Distributional Reinforcement Learning

Yecheng Ma, Dinesh Jayaraman, Osbert Bastani

OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation

Jongmin Lee, Wonseok Jeon, Byung-Jun Lee, Joelle Pineau, Kee-Eung Kim

Offline Meta-Reinforcement Learning with Advantage Weighting

Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn

Algorithms, Multitask, Transfer, and Meta Learning

Addressing Action Oscillations through Learning Policy Inertia

Chen Chen, Hongyao Tang, Jianye Hao, Wulong Liu, Zhaopeng Meng

Conservative Data Sharing for Multi-Task Offline Reinforcement Learning

Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Karol Hausman, Sergey Levine, Chelsea Finn

On the generalization properties of adversarial training

Yue Xing, Qifan Song, Guang Cheng

On the Stability and Convergence of Robust Adversarial Reinforcement Learning: A Case Study on Linear Quadratic Systems

Kaiqing Zhang, Bin Hu, Tamer Basar

A Minimalist Approach to Offline Reinforcement Learning

Scott Fujimoto, Shixiang (Shane) Gu

reinforcement learning and planning, generative model

DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction

Aviral Kumar, Abhishek Gupta, Sergey Levine

Off-Policy Imitation Learning from Observations

Zhuangdi Zhu, Kaixiang Lin, Bo Dai, Jiayu Zhou

POMO: Policy Optimization with Multiple Optima for Reinforcement Learning

Yeong-Dae Kwon, Jinho Choo, Byoungjip Kim, Iljoo Yoon, Youngjune Gwon, Seungjai Min

Adversarial Robustness with Semi-Infinite Constrained Learning

Alexander Robey, Luiz Chamon, George J. Pappas, Hamed Hassani, Alejandro Ribeiro

theory, deep learning, optimization, robustness, adversarial robustness and security

MOPO: Model-based Offline Policy Optimization

Tianhe (Kevin) Yu, Garrett Thomas, Lantao Yu, Stefano Ermon, James Zou, Sergey Levine, Chelsea Finn, Tengyu Ma

Making Sense of Reinforcement Learning and Probabilistic Inference

Brendan O'Donoghue, Ian Osband, Catalin Ionescu

Reinforcement learning, Bayesian inference, Exploration

Learning Nearly Decomposable Value Functions Via Communication Minimization

Tonghan Wang*, Jianhao Wang*, Chongyi Zheng, Chongjie Zhang

Multi-agent reinforcement learning, Nearly decomposable value function, Minimized communication

Online model selection for reinforcement learning with function approximation

Jonathan Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill

When False Positive is Intolerant: End-to-End Optimization with Low FPR for Multipartite Ranking

Peisong Wen, Qianqian Xu, Zhiyong Yang, Yuan He, Qingming Huang

deep learning, optimization, machine learning, vision

Towards Robust Bisimulation Metric Learning
