FOP: Factorizing Optimal Joint Policy of Maximum-Entropy Multi-Agent Reinforcement Learning

18/07/2021

Tianhao Zhang, Yueheng Li, Chen Wang, Guangming Xie, Zongqing Lu

Keywords: Reinforcement Learning and Planning, Multi-Agent RL

Abstract: Value decomposition has recently reinvigorated multi-agent actor-critic methods. However, existing decomposed actor-critic methods cannot guarantee convergence to the global optimum. In this paper, we present a novel multi-agent actor-critic method, FOP, which factorizes the optimal joint policy induced by maximum-entropy multi-agent reinforcement learning (MARL) into individual policies. Theoretically, we prove that the factorized individual policies of FOP converge to the global optimum. Empirically, in the well-known matrix game and differential game, we verify that FOP converges to the global optimum for both discrete and continuous action spaces. We also evaluate FOP on a set of StarCraft II micromanagement tasks and demonstrate that FOP substantially outperforms state-of-the-art decomposed value-based and actor-critic methods.

The talk and the corresponding paper were published at the ICML 2021 virtual conference.

Similar Papers

06/12/2021

Stateful Strategic Regression

Keegan Harris, Hoda Heidari, Steven Wu

Keywords: optimization

14:02

18/07/2021

Convex Regularization in Monte-Carlo Tree Search

Tuan Q Dam, Carlo D'Eramo, Jan Peters, Joni Pajarinen

Keywords: Reinforcement Learning and Planning

4:52

06/12/2020

Reinforcement Learning for Control with Multiple Frequencies

Jongmin Lee, Byung-Jun Lee, Kee-Eung Kim

Keywords: Algorithms -> Multitask and Transfer Learning; Deep Learning -> Supervised Deep Networks; Theory -> Learning Theory; Theory -> , Deep Learning

3:21

03/05/2021

DOP: Off-Policy Multi-Agent Decomposed Policy Gradients

Yihan Wang, Beining Han, Tonghan Wang, Heng Dong, Chongjie Zhang

Keywords: Multi-Agent Reinforcement Learning, Multi-Agent Policy Gradients

4:38

18/07/2021

Provably Efficient Algorithms for Multi-Objective Competitive RL

Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra

Keywords: Theory, RL, Decisions and Control Theory

17:04

18/07/2021

Provably Correct Optimization and Exploration with Non-linear Policies

Fei Feng, Wotao Yin, Alekh Agarwal, Lin Yang

Keywords: Deep Learning, Adversarial Networks, Applications, Fairness, Accountability, and Transparency, Theory, RL, Decisions and Control Theory

5:03

06/12/2021

Taming Communication and Sample Complexities in Decentralized Policy Evaluation for Cooperative Multi-Agent Reinforcement Learning

Xin Zhang, Zhuqing Liu, Jia Liu, Zhengyuan Zhu, Songtao Lu

Keywords: theory, optimization, reinforcement learning and planning

14:54

06/12/2020

On the Stability and Convergence of Robust Adversarial Reinforcement Learning: A Case Study on Linear Quadratic Systems

Kaiqing Zhang, Bin Hu, Tamer Basar

3:22

06/12/2021

Coordinated Proximal Policy Optimization

Zifan Wu, Chao Yu, Deheng Ye, Junge Zhang, Haiyin Piao, Hankz Hankui Zhuo

Keywords: optimization, reinforcement learning and planning

6:19

06/12/2021

Automated Dynamic Mechanism Design

Hanrui Zhang, Vincent Conitzer

14:35

26/08/2020

A Hybrid Stochastic Policy Gradient Algorithm for Reinforcement Learning

Nhan Pham, Lam Nguyen, Dzung Phan, Phuong Ha Nguyen, Marten van Dijk, Quoc Tran-Dinh

15:49

06/12/2021

Multi-Agent Reinforcement Learning in Stochastic Networked Systems

Yiheng Lin, Guannan Qu, Longbo Huang, Adam Wierman

Keywords: reinforcement learning and planning, graph learning

11:20

26/08/2020

Discrete Action On-Policy Learning with Action-Value Critic

Yuguang Yue, Yunhao Tang, Mingzhang Yin, Mingyuan Zhou

14:23

06/12/2021

Adversarial Attack Generation Empowered by Min-Max Optimization

Jingkang Wang, Tianyun Zhang, Sijia Liu, Pin-Yu Chen, Jiacen Xu, Makan Fardad, Bo Li

Keywords: optimization, robustness, adversarial robustness and security

15:11

12/07/2020

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Yaodong Yang, Jianye Hao, Guangyong Chen, Hongyao Tang, Yingfeng Chen, Yujing Hu, Changjie Fan, Zhongyu Wei

Keywords: Planning, Control, and Multiagent Learning

6:42

18/07/2021

Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality

Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang

Keywords: Theory, RL, Decisions and Control Theory

4:23

02/02/2021

Progression Heuristics for Planning with Probabilistic LTL Constraints

Ian Mallett, Sylvie Thiebaux, Felipe Trevizan

18:23

06/12/2021

Reward is enough for convex MDPs

Tom Zahavy, Brendan O'Donoghue, Guillaume Desjardins, Satinder Singh

Keywords: reinforcement learning and planning

14:12

06/12/2021

Stochastic Anderson Mixing for Nonconvex Stochastic Optimization

Fuchao Wei, Chenglong Bao, Yang Liu

Keywords: theory, deep learning, optimization, machine learning, vision

9:55

06/12/2020

Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization

Sreejith Balakrishnan, Quoc Phong Nguyen, Bryan Kian Hsiang Low, Harold Soh

3:22

06/12/2020

Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Guy Lorberbom, Chris J. Maddison, Nicolas Heess, Tamir Hazan, Daniel Tarlow

3:16

04/08/2021

Adaptivity in Adaptive Submodularity

Hossein Esfandiari, Amin Karbasi, Vahab Mirrokni

13:54

06/12/2021

Towards Hyperparameter-free Policy Selection for Offline Reinforcement Learning

Siyuan Zhang, Nan Jiang

Keywords: reinforcement learning and planning

13:23

13/04/2021

Improving KernelSHAP: Practical Shapley value estimation using linear regression

Ian Covert, Su-In Lee

2:52

13/04/2021

Provably efficient actor-critic for risk-sensitive and robust adversarial RL: A linear-quadratic case

Yufeng Zhang, Zhuoran Yang, Zhaoran Wang

2:53

19/08/2021

Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts

Weinan Zhang, Xihuai Wang, Jian Shen, Ming Zhou

Keywords: Machine Learning, Deep Reinforcement Learning, Multi-agent Learning

13:10

06/12/2020

Promoting Stochasticity for Expressive Policies via a Simple and Efficient Regularization Method

Qi Zhou, Yufei Kuang, Zherui Qiu, Houqiang Li, Jie Wang

3:10

06/12/2020

Robust Multi-Agent Reinforcement Learning with Model Uncertainty

Kaiqing Zhang, Tao Sun, Yunzhe Tao, Sahika Genc, Sunil Mallya, Tamer Basar

3:11

06/12/2020

Distributionally Robust Federated Averaging

Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi

3:11

26/08/2020

Finite-Time Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation

Jun Sun, Gang Wang, Georgios B. Giannakis, Qinmin Yang, Zaiyue Yang

17:07

14/09/2020

Target to Source Coordinate-wise Adaptation of Pre-trained Models

Luxin Zhang, Pascal Germain, Yacine Kessaci, Christophe Biernacki

Keywords: domain adaptation, optimal transport, feature selection

14:57

18/07/2021

Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning

Anuj Mahajan, Mikayel Samvelyan, Lei Mao, Viktor Makoviychuk, Animesh Garg, Jean Kossaifi, Shimon Whiteson, Yuke Zhu, Anima Anandkumar

Keywords: Reinforcement Learning and Planning, Multi-Agent RL

5:16

06/12/2020

Is Long Horizon RL More Difficult Than Short Horizon RL?

Ruosong Wang, Simon Du, Lin Yang, Sham Kakade

3:20

06/12/2021

Risk Bounds and Calibration for a Smart Predict-then-Optimize Method

Heyuan Liu, Paul Grigas

Keywords: theory, optimization, machine learning

14:56

12/07/2020

Exploration Through Bias: Revisiting Biased Maximum Likelihood Estimation in Stochastic Multi-Armed Bandits

Xi Liu, Ping-Chun Hsieh, Yu Heng Hung, Anirban Bhattacharya, P. Kumar

Keywords: Online Learning, Active Learning, and Bandits

14:46

04/08/2021

Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

Dylan Foster, Alexander Rakhlin, David Simchi-Levi, Yunzong Xu

16:53

03/05/2021

Learning Value Functions in Deep Policy Gradients using Residual Variance

Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux

4:49

18/07/2021

Estimating $\alpha$-Rank from A Few Entries with Low Rank Matrix Completion

Yali Du, Xue Yan, Xu Chen, Jun Wang, Haifeng Zhang

Keywords: Optimization, Probabilistic Methods, Distributed Inference, Algorithms, Algorithms Evaluation

4:52

02/02/2021

Sequential Generative Exploration Model for Partially Observable Reinforcement Learning

Haiyan Yin, Jianda Chen, Sinno Jialin Pan, Sebastian Tschiatschek

14:40

06/12/2021

Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses

Haipeng Luo, Chen-Yu Wei, Chung-Wei Lee

Keywords: optimization, reinforcement learning and planning, bandits

15:17