Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization

03/05/2021

Discovering Diverse Multi-Agent Strategic Behavior via Reward Randomization

Zhenggang Tang, Chao Yu, Boyuan Chen, Huazhe Xu, Xiaolong Wang, Fei Fang, Simon Du, Yu Wang, Yi Wu

Keywords: reward randomization, strategic behavior, diverse strategies, multi-agent reinforcement learning

Abstract Paper Similar Papers

Abstract: We propose a simple, general and effective technique, Reward Randomization for discovering diverse strategic policies in complex multi-agent games. Combining reward randomization and policy gradient, we derive a new algorithm, Reward-Randomized Policy Gradient (RPG). RPG is able to discover a set of multiple distinctive human-interpretable strategies in challenging temporal trust dilemmas, including grid-world games and a real-world game Agar.io, where multiple equilibria exist but standard multi-agent policy gradient algorithms always converge to a fixed one with a sub-optimal payoff for every player even using state-of-the-art exploration techniques. Furthermore, with the set of diverse strategies from RPG, we can (1) achieve higher payoffs by fine-tuning the best policy from the set; and (2) obtain an adaptive agent by using this set of strategies as its training opponents.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at ICLR 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

18/07/2021

Provably Efficient Algorithms for Multi-Objective Competitive RL

Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra

Keywords Paper

Theory, RL, Decisions and Control Theory

0

0

0

0

17:04

02/02/2021

Bounded Risk-Sensitive Markov Games: Forward Policy Design and Inverse Reward Learning with Iterative Reasoning and Cumulative Prospect Theory

Ran Tian, Liting Sun, Masayoshi Tomizuka

Keywords Paper

0

0

0

0

16:28

06/12/2020

Calibration of Shared Equilibria in General Sum Partially Observable Markov Games

Nelson Vadori, Sumitra Ganesh, Prashant Reddy, Manuela Veloso

Keywords Paper

0

0

0

0

3:18

18/07/2021

Learning in Nonzero-Sum Stochastic Games with Potentials

David Mguni, Yutong Wu, Yali Du and
Yaodong Yang, Ziyi Wang, M. Li, Ying Wen, Joel Jennings, Jun Wang

Keywords Paper

Theory, Game Theory and Computational Economics

0

0

0

0

5:36

02/02/2021

An Efficient Algorithm for Deep Stochastic Contextual Bandits

Tan Zhu, Guannan Liang, Chunjiang Zhu and
Haining Li, Jinbo Bi

Keywords Paper

0

0

0

0

14:36

04/08/2021

Survival of the strictest: Stable and unstable equilibria under regularized learning with partial information

Angeliki Giannou, Emmanouil Vasileios Vlatakis-Gkaragkounis, Panayotis Mertikopoulos

Keywords Paper

0

0

0

0

16:33

02/02/2021

Estimating α-Rank by Maximizing Information Gain

Tabish Rashid, Cheng Zhang, Kamil Ciosek

Keywords Paper

0

0

0

0

14:52

18/07/2021

A Sharp Analysis of Model-based Reinforcement Learning with Self-Play

Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin

Keywords Paper

Algorithms, Multitask and Transfer Learning, Algorithms, Meta-Learning; Applications, Object Recognition; Data, Challenges, Implementations, and Software, Benchmarks;, Theory, RL, Decisions and Control Theory

0

0

0

0

4:49

18/07/2021

From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization

Julien Perolat, Remi Munos, Jean-Baptiste Lespiau and
Shayegan Omidshafiei, Mark Rowland, Pedro Ortega, Neil Burch, Thomas Anthony, David Balduzzi, Bart De Vylder, Georgios Piliouras, Marc Lanctot, Karl Tuyls

Keywords Paper

Probabilistic Methods, Causal Inference, Reinforcement Learning and Planning, Multi-Agent RL, Probabilistic Methods, Graphical Models

0

0

0

0

5:24

12/07/2020

Multi-Step Greedy Reinforcement Learning Algorithms

Manan Tomar, Yonathan Efroni, Mohammad Ghavamzadeh

Keywords Paper

Reinforcement Learning - General

0

0

0

0

12:49

06/12/2021

Stateful Strategic Regression

Keegan Harris, Hoda Heidari, Steven Wu

Keywords Paper

optimization

0

0

0

0

14:02

16/11/2020

DORB: Dynamically Optimizing Multiple Rewards with Bandits

Ramakanth Pasunuru, Han Guo, Mohit Bansal

Keywords Paper

language tasks, optimization rewards, nlg tasks, question generation

0

0

0

0

11:34

12/09/2020

Nondeterministic Strategies and their Refinement in Strategy Logic

Giuseppe De Giacomo, Bastien Maubert, Aniello Murano

Keywords Paper

Reasoning about actions and change, action languages-General

0

0

0

0

15:05

06/12/2021

Continuous Mean-Covariance Bandits

Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang

Keywords Paper

bandits

0

0

0

0

11:33

06/12/2021

Meta-Learning the Search Distribution of Black-Box Random Search Based Adversarial Attacks

Maksym Yatsura, Jan Metzen, Matthias Hein

Keywords Paper

robustness, adversarial robustness and security, meta learning

0

0

0

0

12:55

18/07/2021

Efficient Deviation Types and Learning for Hindsight Rationality in Extensive-Form Games

Dustin Morrill, Ryan D'Orazio, Marc Lanctot and
James Wright, Michael Bowling, Amy Greenwald

Keywords Paper

Theory, Game Theory and Computational Economics

0

0

0

0

5:12

06/12/2020

Robust Multi-Agent Reinforcement Learning with Model Uncertainty

Kaiqing Zhang, TAO SUN, Yunzhe Tao and
Sahika Genc, Sunil Mallya, Tamer Basar

Keywords Paper

0

0

0

0

3:11

12/07/2020

OPtions as REsponses: Grounding behavioural hierarchies in multi-agent reinforcement learning

Alexander Vezhnevets, Yuhuai Wu, Maria Eckstein and
Rémi Leblond, Joel Z Leibo

Keywords Paper

Reinforcement Learning - Deep RL

0

0

0

0

14:17

26/04/2020

Posterior sampling for multi-agent reinforcement learning: solving extensive games with imperfect information

Yichi Zhou, Jialian Li, Jun Zhu

Keywords Paper

0

0

0

0

12:55

09/07/2020

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium

Qiaomin Xie, Yudong Chen, Zhaoran Wang, Zhuoran Yang

Keywords Paper

Reinforcement learning, Planning and control

0

0

0

0

15:16

12/07/2020

Provable Self-Play Algorithms for Competitive Reinforcement Learning

Yu Bai, Chi Jin

Keywords Paper

Reinforcement Learning - Theory

0

0

0

0

14:28

18/07/2021

Dynamic Planning and Learning under Recovering Rewards

David Simchi-Levi, Zeyu Zheng, Feng Zhu

Keywords Paper

Reinforcement Learning and Planning, Bandits

0

0

0

0

4:53

06/12/2021

Neural Algorithmic Reasoners are Implicit Planners

Andreea-Ioana Deac, Petar Veličković, Ognjen Milinkovic and
Pierre-Luc Bacon, Jian Tang, Mladen Nikolic

Keywords Paper

deep learning, reinforcement learning and planning, self-supervised learning, generative model, graph learning

0

0

0

0

13:10

06/12/2020

Learning Strategy-Aware Linear Classifiers

Yiling Chen, Yang Liu, Chara Podimata

Keywords Paper

0

0

0

0

3:15

02/02/2021

Evolutionary Game Theory Squared: Evolving Agents in Endogenously Evolving Zero-Sum Games

Stratis Skoulakis, Tanner Fiez, Ryann Sim and
Georgios Piliouras, Lillian Ratliff

Keywords Paper

0

0

0

0

20:14

19/08/2021

Temporal Induced Self-Play for Stochastic Bayesian Games

Weizhe Chen, Zihan Zhou, Yi Wu, Fei Fang

Keywords Paper

Agent-based and Multi-agent Systems, Multi-agent Learning, Applications of Reinforcement Learning

0

0

0

0

11:52

06/12/2021

Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded Rationality

Stefanos Leonardos, Georgios Piliouras, Kelly Spendlove

Keywords Paper

reinforcement learning and planning

0

0

0

0

14:11

13/04/2021

Reinforcement learning for mean field games with strategic complementarities

Kiyeob Lee, Desik Rengarajan, Dileep Kalathil, Srinivas Shakkottai

Keywords Paper

0

0

0

0

2:57

06/12/2021

Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games

Yu Bai, Chi Jin, Huan Wang, Caiming Xiong

Keywords Paper

theory, reinforcement learning and planning, bandits

0

0

0

0

12:14

18/07/2021

Learning While Playing in Mean-Field Games: Convergence and Optimality

Qiaomin Xie, Zhuoran Yang, Zhaoran Wang, Andreea Minca

Keywords Paper

Applications, Privacy, Anonymity, and Security, Algorithms, Components Analysis (e.g., CCA, ICA, LDA, PCA), Reinforcement Learning and Planning, Multi-Agent RL

0

0

0

0

5:24

03/05/2021

Robust Reinforcement Learning on State Observations with Learned Optimal Adversary

Huan Zhang, Hongge Chen, Duane S Boning, Cho-Jui Hsieh

Keywords Paper

reinforcement learning, robustness, adversarial attacks, adversarial defense

0

0

0

0

5:14

26/04/2020

Learning Expensive Coordination: An Event-Based Deep RL Approach

Zhenyu Shi, Runsheng Yu, Xinrun Wang* and
Rundong Wang, Youzhi Zhang, Hanjiang Lai, Bo An

Keywords Paper

Multi-Agent Deep Reinforcement Learning, Deep Reinforcement Learning, Leader–Follower Markov Game, Expensive Coordination

0

0

0

0

3:55

12/07/2020

Adaptive Reward-Poisoning Attacks against Reinforcement Learning

Xuezhou Zhang, Yuzhe Ma, Adish Singla, Jerry Zhu

Keywords Paper

Trustworthy Machine Learning

0

0

0

0

14:19

12/07/2020

Q-value Path Decomposition for Deep Multiagent Reinforcement Learning

Yaodong Yang, Jianye Hao, Guangyong Chen and
Hongyao Tang, Yingfeng Chen, Yujing Hu, Changjie Fan, Zhongyu Wei

Keywords Paper

Planning, Control, and Multiagent Learning

0

0

0

0

6:42

19/08/2021

Learning Nash Equilibria in Zero-Sum Stochastic Games via Entropy-Regularized Policy Approximation

Yue Guan, Qifan Zhang, Panagiotis Tsiotras

Keywords Paper

Machine Learning, Reinforcement Learning, Multi-agent Learning, Noncooperative Games

0

0

0

0

12:26

02/02/2021

Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games

Gabriele Farina, Tuomas Sandholm

Keywords Paper

0

0

0

0

17:09

13/04/2021

Provably eﬃcient actor-critic for risk-sensitive and robust adversarial RL: A linear-quadratic case

Yufeng Zhang, Zhuoran Yang, Zhaoran Wang

Keywords Paper

0

0

0

0

2:53

06/12/2020

Learning to Play Sequential Games versus Unknown Opponents

Pier Giuseppe Sessa, Ilija Bogunovic, Maryam Kamgarpour, Andreas Krause

Keywords Paper

0

0

0

0

3:04

18/07/2021

Off-Belief Learning

Hengyuan Hu, Adam Lerer, Brandon Cui and
Luis Pineda, Noam Brown, Jakob Foerster

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

5:10

18/07/2021

DFAC Framework: Factorizing the Value Function via Quantile Mixture for Multi-Agent Distributional Q-Learning

Wei-Fang Sun, Cheng-Kuang Lee, Chun-Yi Lee

Keywords Paper

Reinforcement Learning and Planning, Multi-Agent RL

0

0

0

0

5:43