Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration

06/12/2021

Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration

Lulu Zheng, Jiarui Chen, Jianhao Wang, Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang

Keywords: reinforcement learning and planning

Abstract Paper Similar Papers

Abstract: Efficient exploration in deep cooperative multi-agent reinforcement learning (MARL) still remains challenging in complex coordination problems. In this paper, we introduce a novel Episodic Multi-agent reinforcement learning with Curiosity-driven exploration, called EMC. We leverage an insight of popular factorized MARL algorithms that the ``induced" individual Q-values, i.e., the individual utility functions used for local execution, are the embeddings of local action-observation histories, and can capture the interaction between agents due to reward backpropagation during centralized training. Therefore, we use prediction errors of individual Q-values as intrinsic rewards for coordinated exploration and utilize episodic memory to exploit explored informative experience to boost policy training. As the dynamics of an agent's individual Q-value function captures the novelty of states and the influence from other agents, our intrinsic reward can induce coordinated exploration to new or promising states. We illustrate the advantages of our method by didactic examples, and demonstrate its significant outperformance over state-of-the-art MARL baselines on challenging tasks in the StarCraft II micromanagement benchmark.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at NeurIPS 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

03/05/2021

Rank the Episodes: A Simple Approach for Exploration in Procedurally-Generated Environments

Daochen Zha, Wenye Ma, Lei Yuan and
Xia Hu, Ji Liu

Keywords Paper

Exploration, Reinforcement Learning, Self-Imitation, Generalization of Reinforcement Learning

0

0

0

0

5:10

18/07/2021

MetaCURE: Meta Reinforcement Learning with Empowerment-Driven Exploration

Jin Zhang, Jianhao Wang, Hao Hu and
Tong Chen, Yingfeng Chen, Changjie Fan, Chongjie Zhang

Keywords Paper

Algorithms, Multitask, Transfer, and Meta Learning

0

0

0

0

4:19

06/12/2020

PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals

Henry Charlesworth, Giovanni Montana

Keywords Paper

0

0

0

0

3:20

12/07/2020

Implicit Generative Modeling for Efficient Exploration

Neale Ratzlaff, Qinxun Bai, Fuxin Li, Wei Xu

Keywords Paper

Reinforcement Learning - Deep RL

0

0

0

0

14:01

26/04/2020

RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments

Roberta Raileanu, Tim Rocktäschel

Keywords Paper

reinforcement learning, exploration, curiosity

0

0

0

0

4:48

06/12/2021

MADE: Exploration via Maximizing Deviation from Explored Regions

Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao and
Yuandong Tian, Joseph Gonzalez, Stuart Russell

Keywords Paper

reinforcement learning and planning

0

0

0

0

13:09

26/08/2020

Nested-Wasserstein Self-Imitation Learning for Sequence Generation

Ruiyi Zhang, Changyou Chen, Zhe Gan and
Zheng Wen, Wenlin Wang, Lawrence Carin

Keywords Paper

0

0

0

0

11:18

02/02/2021

Sequential Generative Exploration Model for Partially Observable Reinforcement Learning

Haiyan Yin, Jianda Chen, Sinno Jialin Pan, Sebastian Tschiatschek

Keywords Paper

0

0

0

0

14:40

02/02/2021

Exploration by Maximizing Renyi Entropy for Reward-Free RL Framework

Chuheng Zhang, Yuanying Cai, Longbo Huang, Jian Li

Keywords Paper

0

0

0

0

16:03

06/12/2021

Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies

Tim Seyde, Igor Gilitschenski, Wilko Schwarting and
Bartolomeo Stellato, Martin Riedmiller, Markus Wulfmeier, Daniela Rus

Keywords Paper

reinforcement learning and planning

0

0

0

0

6:48

06/12/2021

Learning to Synthesize Programs as Interpretable and Generalizable Policies

Dweep Trivedi, Jesse Zhang, Shao-Hua Sun, Joseph Lim

Keywords Paper

deep learning, reinforcement learning and planning

0

0

0

0

10:30

18/07/2021

MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

Kevin Li, Abhishek Gupta, Ashwin D Reddy and
Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

5:16

06/12/2020

Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization

Sreejith Balakrishnan, Quoc Phong Nguyen, Bryan Kian Hsiang Low, Harold Soh

Keywords Paper

0

0

0

0

3:22

06/12/2021

Discovery of Options via Meta-Learned Subgoals

Vivek Veeriah, Tom Zahavy, Matteo Hessel and
Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh

Keywords Paper

deep learning, reinforcement learning and planning

0

0

0

0

4:13

26/04/2020

Influence-Based Multi-Agent Exploration

Tonghan Wang, Jianhao Wang, Yi Wu, Chongjie Zhang

Keywords Paper

Multi-agent reinforcement learning, Exploration

0

0

0

0

4:53

12/07/2020

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Yaodong Yang, Jianye Hao, Guangyong Chen and
Hongyao Tang, Yingfeng Chen, Yujing Hu, Changjie Fan, Zhongyu Wei

Keywords Paper

Planning, Control, and Multiagent Learning

0

0

0

0

6:42

06/12/2021

Towards Robust Bisimulation Metric Learning

Mete Kemertas, Tristan Aumentado-Armstrong

Keywords Paper

reinforcement learning and planning, robustness, representation learning

0

0

0

0

12:24

06/12/2021

Towards Understanding Cooperative Multi-Agent Q-Learning with Value Factorization

Jianhao Wang, Zhizhou Ren, Beining Han and
Jianing Ye, Chongjie Zhang

Keywords Paper

theory, reinforcement learning and planning

0

0

0

0

11:35

26/04/2020

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

Siddharth Reddy, Anca D. Dragan, Sergey Levine

Keywords Paper

Imitation Learning, Reinforcement Learning

0

0

0

0

4:38

12/07/2020

Reward-Free Exploration for Reinforcement Learning

Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

Keywords Paper

Reinforcement Learning - Theory

0

0

0

0

14:37

06/12/2020

Cooperative Heterogeneous Deep Reinforcement Learning

Han Zheng, Pengfei Wei, Jing Jiang and
Guodong Long, Qinghua Lu, Chengqi Zhang

Keywords Paper

0

0

0

0

3:08

19/08/2021

Reward-Constrained Behavior Cloning

Zhaorong Wang, Meng Wang, Jingqi Zhang and
Yingfeng Chen, Chongjie Zhang

Keywords Paper

Machine Learning, Deep Reinforcement Learning, Reinforcement Learning, Constraint Optimization

0

0

0

0

14:43

06/12/2021

An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning

Tianpei Yang, Weixun Wang, Hongyao Tang and
Jianye Hao, Zhaopeng Meng, Hangyu Mao, Dong Li, Wulong Liu, Yingfeng Chen, Yujing Hu, Changjie Fan, Chengwei Zhang

Keywords Paper

reinforcement learning and planning, transfer learning

0

0

0

0

15:21

16/11/2020

Positive-Unlabeled Reward Learning

Danfei Xu, Misha Denil

Keywords Paper

0

0

0

0

5:04

19/08/2021

Don’t Do What Doesn’t Matter: Intrinsic Motivation with Action Usefulness

Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

Keywords Paper

Machine Learning, Reinforcement Learning, Deep Reinforcement Learning

0

0

0

0

14:48

18/07/2021

Representation Matters: Offline Pretraining for Sequential Decision Making

Mengjiao Yang, Ofir Nachum

Keywords Paper

Reinforcement Learning and Planning

1

0

0

0

5:06

26/04/2020

Intrinsic Motivation for Encouraging Synergistic Behavior

Rohan Chitnis, Shubham Tulsiani, Saurabh Gupta, Abhinav Gupta

Keywords Paper

reinforcement learning, intrinsic motivation, synergistic, robot manipulation

0

0

0

0

5:02

06/12/2020

Policy Improvement via Imitation of Multiple Oracles

Ching-An Cheng, Andrey Kolobov, Alekh Agarwal

Keywords Paper

0

0

0

0

3:12

16/11/2020

Motion Planner Augmented Reinforcement Learning for Robot Manipulation in Obstructed Environments

Jun Yamada, Youngwoon Lee, Gautam Salhotra and
Karl Pertsch, Max Pflueger, Gaurav Sukhatme, Joseph Lim, Peter Englert

Keywords Paper

0

0

0

0

4:59

18/07/2021

Policy Gradient Bayesian Robust Optimization for Imitation Learning

Zaynah Javed, Daniel Brown, Satvik Sharma and
Jerry Zhu, Ashwin Balakrishna, Marek Petrik, Anca Dragan, Ken Goldberg

Keywords Paper

Social Aspects of Machine Learning, AI Safety

0

0

0

1

5:10

22/09/2020

Deep bayesian bandits: Exploring in online personalized recommendations

Dalin Guo, Sofia Ira Ktena, Pranay Kumar Myana and
Ferenc Huszar, Wenzhe Shi, Alykhan Tejani, Michael Kneier, Sourav Das

Keywords Paper

Contextual bandit, Recommender Systems, Algorithmic bias

0

0

0

0

2:59

06/12/2020

Off-Policy Imitation Learning from Observations

Zhuangdi Zhu, Kaixiang Lin, Bo Dai, Jiayu Zhou

Keywords Paper

0

0

0

1

3:24

18/07/2021

Provably Correct Optimization and Exploration with Non-linear Policies

Fei Feng, Wotao Yin, Alekh Agarwal, Lin Yang

Keywords Paper

Deep Learning, Adversarial Networks, Applications, Fairness, Accountability, and Transparency, Theory, RL, Decisions and Control Theory

0

0

0

0

5:03

03/05/2021

Learning perturbation sets for robust machine learning

Eric Wong, Zico Kolter

Keywords Paper

conditional variational autoencoder, adversarial examples, perturbation sets, robust machine learning

0

1

0

0

5:06

06/12/2021

Local policy search with Bayesian optimization

Sarah Müller, Alexander von Rohr, Sebastian Trimpe

Keywords Paper

theory, optimization, reinforcement learning and planning, active learning

0

0

0

0

11:42

18/07/2021

A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation

Scott Fujimoto, David Meger, Doina Precup

Keywords Paper

Reinforcement Learning and Planning, Deep RL

1

0

0

0

3:50

26/04/2020

Learning Nearly Decomposable Value Functions Via Communication Minimization

Tonghan Wang, Jianhao Wang, Chongyi Zheng, Chongjie Zhang

Keywords Paper

Multi-agent reinforcement learning, Nearly decomposable value function, Minimized communication

0

0

0

0

5:00

18/07/2021

Decoupling Exploration and Exploitation for Meta-Reinforcement Learning without Sacrifices

Evan Liu, Aditi Raghunathan, Percy Liang, Chelsea Finn

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

5:41

06/12/2021

Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning

Jinxin Liu, Hao Shen, Donglin Wang and
Yachen Kang, Qiangxing Tian

Keywords Paper

reinforcement learning and planning, domain adaptation

0

0

0

0

8:07

18/07/2021

Provably Efficient Algorithms for Multi-Objective Competitive RL

Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra

Keywords Paper

Theory, RL, Decisions and Control Theory

0

0

0

0

17:04