Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning

06/12/2021

Yiqin Yang, Xiaoteng Ma, Li Chenghao, Zewu Zheng, Qiyuan Zhang, Gao Huang, Jun Yang, Qianchuan Zhao

Keywords: reinforcement learning and planning

Abstract: Learning from datasets without interaction with environments (Offline Learning) is an essential step to apply Reinforcement Learning (RL) algorithms in real-world scenarios. However, compared with its single-agent counterpart, offline multi-agent RL involves more agents and larger state and action spaces, which makes it more challenging, yet it has attracted little attention. We demonstrate that current offline RL algorithms are ineffective in multi-agent systems due to the accumulated extrapolation error. In this paper, we propose a novel offline RL algorithm, named Implicit Constraint Q-learning (ICQ), which effectively alleviates the extrapolation error by only trusting the state-action pairs given in the dataset for value estimation. Moreover, we extend ICQ to multi-agent tasks by decomposing the joint policy under the implicit constraint. Experimental results demonstrate that the extrapolation error is successfully controlled within a reasonable range and is insensitive to the number of agents. We further show that ICQ achieves state-of-the-art performance in challenging multi-agent offline tasks (StarCraft II). Our code is publicly available at https://github.com/YiqinYang/ICQ.
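To make the idea of "only trusting the state-action pairs given in the dataset" concrete, here is a minimal sketch that contrasts a standard Q-learning target with a target that bootstraps only from the next action stored in the dataset. It is an illustration of the general principle, not the authors' ICQ update: the tabular `q`, the function names, the transition, and all numbers are hypothetical.

```python
def q_learning_target(q, reward, next_state, gamma=0.99):
    # Standard target: maximizes over every action in the next state,
    # including actions that never appear in the offline dataset.
    return reward + gamma * max(q[next_state])


def in_sample_target(q, reward, next_state, next_action, gamma=0.99):
    # Dataset-only target: bootstraps from the (next_state, next_action)
    # pair that was actually recorded, so unseen actions are never queried.
    return reward + gamma * q[next_state][next_action]


# Hypothetical 2-state, 3-action value table. Assume action 2 in state 1
# never occurs in the data, so its large estimate is pure extrapolation.
q = [
    [0.0, 0.0, 0.0],
    [1.0, 0.5, 10.0],
]

# Hypothetical recorded transition: the dataset took action 0 in state 1.
transition = {"reward": 0.0, "next_state": 1, "next_action": 0}

max_target = q_learning_target(q, transition["reward"], transition["next_state"])
data_target = in_sample_target(q, **transition)

print(f"max-based target:    {max_target:.2f}")
print(f"dataset-only target: {data_target:.2f}")
```

In this toy setting the max-based target inherits the unverified estimate for the unseen action, while the dataset-only target does not. Repeated over many updates and many agents, that difference is one way to picture the accumulated extrapolation error the abstract refers to.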

The talk and the accompanying paper were published at the NeurIPS 2021 virtual conference.

Similar Papers

03/05/2021

FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance Metric Learning and Behavior Regularization

Lanqing Li, Rui Yang, Dijun Luo

Keywords: distance metric learning, offline/batch reinforcement learning, meta-reinforcement learning, contrastive learning, multi-task reinforcement learning

6:21

18/07/2021

Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning

Yue Wu, Shuangfei Zhai, Nitish Srivastava, Josh Susskind, Jian Zhang, Russ Salakhutdinov, Hanlin Goh

Keywords: Reinforcement Learning and Planning, Deep RL

5:01

18/07/2021

PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning

Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar

Keywords: Reinforcement Learning and Planning, Deep RL

15:18

06/12/2021

Conservative Data Sharing for Multi-Task Offline Reinforcement Learning

Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Karol Hausman, Sergey Levine, Chelsea Finn

Keywords: reinforcement learning and planning

14:27

06/12/2021

IQ-Learn: Inverse soft-Q Learning for Imitation

Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Stefano Ermon

Keywords: optimization, reinforcement learning and planning, adversarial robustness and security

8:25

02/02/2021

A Free Lunch for Unsupervised Domain Adaptive Object Detection without Source Data

Xianfeng Li, Weijie Chen, Di Xie, Shicai Yang, Peng Yuan, Shiliang Pu, Yueting Zhuang

19:06

06/12/2020

Improving Generalization in Reinforcement Learning with Mixture Regularization

Kaixin Wang, Bingyi Kang, Jie Shao, Jiashi Feng

3:14

06/12/2020

MOPO: Model-based Offline Policy Optimization

Tianhe (Kevin) Yu, Garrett Thomas, Lantao Yu, Stefano Ermon, James Zou, Sergey Levine, Chelsea Finn, Tengyu Ma

3:30

06/12/2021

Curriculum Offline Imitating Learning

Minghuan Liu, Hanye Zhao, Zhengyu Yang, Jian Shen, Weinan Zhang, Li Zhao, Tie-Yan Liu

Keywords: reinforcement learning and planning

14:28

14/06/2020

Learning to Forget for Meta-Learning

Sungyong Baik, Seokil Hong, Kyoung Mu Lee

Keywords: meta learning, few-shot learning, reinforcement learning

1:01

06/12/2020

POMO: Policy Optimization with Multiple Optima for Reinforcement Learning

Yeong-Dae Kwon, Jinho Choo, Byoungjip Kim, Iljoo Yoon, Youngjune Gwon, Seungjai Min

3:19

13/04/2021

Experimental design for regret minimization in linear bandits

Andrew Wagenmaker, Julian Katz-Samuels, Kevin Jamieson

3:05

26/04/2020

Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks

Tianyu Pang, Kun Xu, Jun Zhu

Keywords: Trustworthy Machine Learning, Adversarial Robustness, Inference Principle, Mixup

4:59

18/07/2021

PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration

Yuda Song, Wen Sun

Keywords: Reinforcement Learning and Planning

5:13

26/04/2020

Learning Nearly Decomposable Value Functions Via Communication Minimization

Tonghan Wang, Jianhao Wang, Chongyi Zheng, Chongjie Zhang

Keywords: Multi-agent reinforcement learning, Nearly decomposable value function, Minimized communication

5:00

18/07/2021

Matrix Sketching for Secure Collaborative Machine Learning

Mengjiao Zhang, Shusen Wang

Keywords: Social Aspects of Machine Learning, Privacy, Anonymity, and Security

4:25

06/12/2021

Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble

Gaon An, Seungyong Moon, Jang-Hyun Kim, Hyun Oh Song

Keywords: deep learning, reinforcement learning and planning

13:50

18/07/2021

Continuous Coordination As a Realistic Scenario for Lifelong Learning

Hadi Nekoei, Akilesh Badrinaaraayanan, Aaron Courville, Sarath Chandar

Keywords: Algorithms, Continual Learning

5:27

14/09/2020

Treant: Training Evasion-Aware Decision Trees

Stefano Calzavara, Claudio Lucchese, Gabriele Tolomei, Seyum Assefa Abebe, Salvatore Orlando

18:49

06/12/2020

Adversarial Learning for Robust Deep Clustering

Xu Yang, Cheng Deng, Kun Wei, Junchi Yan, Wei Liu

3:23

13/04/2021

Free-rider attacks on model aggregation in federated learning

Yann Fraboni, Richard Vidal, Marco Lorenzi

3:02

13/04/2021

On the generalization properties of adversarial training

Yue Xing, Qifan Song, Guang Cheng

3:05

06/12/2021

COMBO: Conservative Offline Model-Based Policy Optimization

Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn

Keywords: deep learning, optimization, reinforcement learning and planning

12:35

06/12/2021

Towards Deeper Deep Reinforcement Learning with Spectral Normalization

Nils Bjorck, Carla Gomes, Kilian Weinberger

Keywords: reinforcement learning and planning, vision, language

9:28

18/07/2021

Shortest-Path Constrained Reinforcement Learning for Sparse Reward Tasks

Sungryull Sohn, Sungtae Lee, Jongwook Choi, Harm van Seijen, Mehdi Fatemi, Honglak Lee

Keywords: Reinforcement Learning and Planning, Deep RL

5:19

06/12/2020

Adversarial Self-Supervised Contrastive Learning

Minseon Kim, Jihoon Tack, Sung Ju Hwang

3:19

02/02/2021

Tempered Sigmoid Activations for Deep Learning with Differential Privacy

Nicolas Papernot, Abhradeep Thakurta, Shuang Song, Steve Chien, Úlfar Erlingsson

15:38

06/12/2021

Regularized Softmax Deep Multi-Agent Q-Learning

Ling Pan, Tabish Rashid, Bei Peng, Longbo Huang, Shimon Whiteson

Keywords: reinforcement learning and planning

10:58

18/07/2021

Learning and Planning in Average-Reward Markov Decision Processes

Yi Wan, Abhishek Naik, Richard Sutton

Keywords: Reinforcement Learning and Planning

5:05

12/07/2020

Self-supervised Label Augmentation via Input Transformations

Hankook Lee, Sung Ju Hwang, Jinwoo Shin

Keywords: Deep Learning - Algorithms

14:34

12/07/2020

Goal-Aware Prediction: Learning to Model What Matters

Suraj Nair, Silvio Savarese, Chelsea Finn

Keywords: Reinforcement Learning - Deep RL

11:16

06/12/2020

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

Paul Barde, Julien Roy, Wonseok Jeon, Joelle Pineau, Chris Pal, Derek Nowrouzezahrai

3:08

12/07/2020

Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods

Dan Fu, Mayee Chen, Frederic Sala, Sarah Hooper, Kayvon Fatahalian, Christopher Re

Keywords: Probabilistic Inference - Models and Probabilistic Programming

15:01

02/02/2021

Improving Sample Efficiency in Model-Free Reinforcement Learning from Images

Denis Yarats, Amy Zhang, Ilya Kostrikov, Brandon Amos, Joelle Pineau, Rob Fergus

12:19

14/06/2020

Equalization Loss for Long-Tailed Object Recognition

Jingru Tan, Changbao Wang, Buyu Li, Quanquan Li, Wanli Ouyang, Changqing Yin, Junjie Yan

Keywords: long tail, object detection, lvis, object recognition

1:00

06/12/2020

Understanding and Improving Fast Adversarial Training

Maksym Andriushchenko, Nicolas Flammarion

3:23

18/07/2021

Offline Meta-Reinforcement Learning with Advantage Weighting

Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

5:08

02/02/2021

Fast and Scalable Adversarial Training of Kernel SVM via Doubly Stochastic Gradients

Huimin Wu, Zhengmian Hu, Bin Gu

14:04

06/12/2021

Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage

Jonathan Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun

Keywords: theory, reinforcement learning and planning

8:37

06/12/2020

Critic Regularized Regression

Ziyu Wang, Alexander Novikov, Konrad Zolna, Josh Merel, Jost Tobias Springenberg, Scott Reed, Bobak Shahriari, Noah Siegel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

3:20