Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Abstract: Deep multiagent reinforcement learning (MARL) has recently become a highly active research area, since many real-world problems can naturally be viewed as multiagent systems. A particularly interesting and widely applicable class of problems is the partially observable cooperative multiagent setting, in which a team of agents learns to coordinate its behavior conditioned on private observations and a commonly shared global reward signal. One natural solution is the centralized training and decentralized execution paradigm. During centralized training, one key challenge is multiagent credit assignment: how to allocate the global reward among individual agent policies so that they coordinate better towards maximizing the system-level benefit. In this paper, we propose a new method called Q-value Path Decomposition (QPD) to decompose the system's global Q-values into individual agents' Q-values. Unlike previous works, which restrict the representational relation between the individual Q-values and the global one, we bring the integrated gradients attribution technique into deep MARL to directly decompose global Q-values along trajectory paths and assign credit to agents. We evaluate QPD on the challenging StarCraft II micromanagement tasks and show that QPD achieves state-of-the-art performance in both homogeneous and heterogeneous multiagent scenarios compared with existing cooperative MARL algorithms.
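To make the attribution idea in the abstract concrete, here is a minimal sketch of integrated gradients applied to a toy "global Q-function". It is not the authors' implementation: the function `toy_q`, its hand-written gradient, the four-feature layout, and the split of features between two agents are all hypothetical. It also integrates along a straight line from a zero baseline, whereas the abstract describes decomposing along trajectory paths.

```python
import numpy as np

def toy_q(x):
    """Hypothetical global Q-function (illustration only).
    x = [agent1_f1, agent1_f2, agent2_f1, agent2_f2]."""
    return x[0] * x[2] + 2.0 * x[1] + x[3] ** 2

def toy_q_grad(x):
    """Analytic gradient of toy_q with respect to x."""
    return np.array([x[2], 2.0, x[0], 2.0 * x[3]])

def integrated_gradients(grad_fn, x, baseline, steps=200):
    """Approximate integrated gradients along the straight line
    from baseline to x, using a midpoint Riemann sum."""
    alphas = (np.arange(steps) + 0.5) / steps
    path = baseline + alphas[:, None] * (x - baseline)
    grads = np.stack([grad_fn(p) for p in path])
    return (x - baseline) * grads.mean(axis=0)

x = np.array([1.0, 0.5, 2.0, -1.0])
baseline = np.zeros(4)  # assumed reference input

attr = integrated_gradients(toy_q_grad, x, baseline)

# Assumed feature ownership: first two features belong to agent 1,
# the last two to agent 2. Summing attributions per agent gives a credit.
agent_slices = {"agent_1": slice(0, 2), "agent_2": slice(2, 4)}
agent_credit = {name: float(attr[s].sum()) for name, s in agent_slices.items()}

total = toy_q(x) - toy_q(baseline)
print(agent_credit, total)
```

In this toy case each agent receives a credit of about 2.0, and the credits sum to `toy_q(x) - toy_q(baseline)`. That is the completeness property of integrated gradients, and it is what lets a global value be split into per-agent parts without constraining how the individual values combine.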

12/07/2020

Neuroscience and Cognitive Science, Neuroscience, Reinforcement Learning and Planning, Algorithms, Representation Learning; Algorithms, Sparse Coding and Dimensionality Expansion; Applications, Matrix and Ten

Yaodong Yang, Jianye Hao, Guangyong Chen, Hongyao Tang, Yingfeng Chen, Yujing Hu, Changjie Fan, Zhongyu Wei

Similar Papers

Multi-Agent Determinantal Q-Learning

Yaodong Yang, Ying Wen, Jun Wang, Liheng Chen, Kun Shao, David Mguni, Weinan Zhang

Keywords: Planning, Control, and Multiagent Learning

QPLEX: Duplex Dueling Multi-Agent Q-Learning

Jianhao Wang, Zhizhou Ren, Terry Liu, Yang Yu, Chongjie Zhang

Keywords: Dueling structure, Value factorization, Multi-agent reinforcement learning

Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning

Jongwook Choi, Archit Sharma, Honglak Lee, Sergey Levine, Shixiang Gu

Keywords: Neuroscience and Cognitive Science, Neuroscience, Reinforcement Learning and Planning, Algorithms, Representation Learning; Algorithms, Sparse Coding and Dimensionality Expansion; Applications, Matrix and Ten

Learning Fair Policies in Multi-Objective (Deep) Reinforcement Learning with Average and Discounted Rewards

Umer Siddique, Paul Weng, Matthieu Zimmer

Keywords: Reinforcement Learning - Deep RL

Explicable Reward Design for Reinforcement Learning Agents

Rati Devidze, Goran Radanovic, Parameswaran Kamalaruban, Adish Singla

Keywords: optimization, reinforcement learning and planning, interpretability

Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization

Michael Volpp, Lukas P. Fröhlich, Kirsten Fischer, Andreas Doerr, Stefan Falkner, Frank Hutter, Christian Daniel

Keywords: Transfer Learning, Meta Learning, Bayesian Optimization, Reinforcement Learning

Towards Understanding Cooperative Multi-Agent Q-Learning with Value Factorization

Jianhao Wang, Zhizhou Ren, Beining Han, Jianing Ye, Chongjie Zhang

Keywords: theory, reinforcement learning and planning

Learning Nearly Decomposable Value Functions Via Communication Minimization

Tonghan Wang*, Jianhao Wang*, Chongyi Zheng, Chongjie Zhang

Keywords: Multi-agent reinforcement learning, Nearly decomposable value function, Minimized communication

Reinforced Imitative Graph Representation Learning for Mobile User Profiling: An Adversarial Training Perspective

Dongjie Wang, Pengyang Wang, Kunpeng Liu, Yuanchun Zhou, Charles E Hughes, Yanjie Fu

Finite-Time Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation

Jun Sun, Gang Wang, Georgios B. Giannakis, Qinmin Yang, Zaiyue Yang

Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning

Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Anima Anandkumar

Keywords: Reinforcement Learning and Planning, Multi-Agent RL

Sequential Generative Exploration Model for Partially Observable Reinforcement Learning

Haiyan Yin, Jianda Chen, Sinno Jialin Pan, Sebastian Tschiatschek

Is Bang-Bang Control All You Need? Solving Continuous Control with Bernoulli Policies

Tim Seyde, Igor Gilitschenski, Wilko Schwarting, Bartolomeo Stellato, Martin Riedmiller, Markus Wulfmeier, Daniela Rus

Keywords: reinforcement learning and planning

Ready Policy One: World Building Through Active Learning

Philip Ball, Jack Parker-Holder, Aldo Pacchiano, Krzysztof Choromanski, Stephen Roberts

Keywords: Reinforcement Learning - Deep RL

Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration

Lulu Zheng, Jiarui Chen, Jianhao Wang, Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang

Keywords: reinforcement learning and planning

Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

Tabish Rashid, Gregory Farquhar, Bei Peng, Shimon Whiteson

Reinforcement Learning Based Multi-Agent Resilient Control: From Deep Neural Networks to an Adaptive Law

Jian Hou, Fangyuan Wang, Lili Wang, Zhiyong Chen

Implementation Matters in Deep RL: A Case Study on PPO and TRPO

Logan Engstrom, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry

Keywords: deep policy gradient methods, deep reinforcement learning, trpo, ppo

Entity Summarization with User Feedback

Qingxia Liu, Yue Chen, Gong Cheng, Evgeny Kharlamov, Junyou Li, Yuzhong Qu

Celebrating Diversity in Shared Multi-Agent Reinforcement Learning

Li Chenghao, Tonghan Wang, Chengjie Wu, Qianchuan Zhao, Jun Yang, Chongjie Zhang

Keywords: deep learning, optimization, reinforcement learning and planning

Joint Inference of Reward Machines and Policies for Reinforcement Learning

Zhe Xu, Ivan Gavran, Yousef Ahmad, Rupak Majumdar, Daniel Neider, Ufuk Topcu, Bo Wu
