Modelling Behavioural Diversity for Learning in Open-Ended Games

18/07/2021

Nicolas Perez-Nieves, Yaodong Yang, Oliver Slumbers, David Mguni, Ying Wen, Jun Wang

Keywords: Theory, Game Theory and Computational Economics

Abstract: Promoting behavioural diversity is critical for solving games with non-transitive dynamics, where strategic cycles exist and there is no consistent winner (e.g., Rock-Paper-Scissors). Yet there is a lack of rigorous treatment for defining diversity and constructing diversity-aware learning dynamics. In this work, we offer a geometric interpretation of behavioural diversity in games and introduce a novel diversity metric based on determinantal point processes (DPP). By incorporating the diversity metric into best-response dynamics, we develop diverse fictitious play and diverse policy-space response oracle (PSRO) for solving normal-form games and open-ended games. We prove the uniqueness of the diverse best response and the convergence of our algorithms on two-player games. Importantly, we show that maximising the DPP-based diversity metric is guaranteed to enlarge the gamescape, the convex polytope spanned by agents' mixtures of strategies. To validate our diversity-aware solvers, we test on tens of games that show strong non-transitivity. Results suggest that our methods achieve at least the same, and in most games lower, exploitability than PSRO solvers by finding effective and diverse strategies.
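
The abstract names a DPP-based diversity metric but does not state its formula. The sketch below is one plausible reading, offered only as an illustration: it assumes the DPP kernel is built from a payoff matrix M as L = M Mᵀ and that diversity is scored by the expected number of items in a DPP sample, which for an L-ensemble equals Tr(I - (L + I)^-1). The function name `expected_cardinality` and both example populations are hypothetical and do not come from the talk page.

```python
import numpy as np


def expected_cardinality(payoffs: np.ndarray) -> float:
    """Expected sample size of a DPP whose L-ensemble kernel is M @ M.T.

    Assumption: each row of `payoffs` is one strategy's payoff vector
    against a fixed set of opponents.
    """
    m = np.asarray(payoffs, dtype=float)
    kernel = m @ m.T                      # positive semi-definite by construction
    identity = np.eye(kernel.shape[0])
    # Standard L-ensemble identity: E|Y| = Tr(I - (L + I)^-1).
    return float(np.trace(identity - np.linalg.inv(kernel + identity)))


# Rock-Paper-Scissors: three strategies with distinct payoff vectors.
rps = np.array([[0, -1, 1],
                [1, 0, -1],
                [-1, 1, 0]])

# A population that repeats the same strategy three times.
duplicated = np.array([[0, -1, 1],
                       [0, -1, 1],
                       [0, -1, 1]])

rps_score = expected_cardinality(rps)
duplicated_score = expected_cardinality(duplicated)
print(rps_score, duplicated_score)
```

Under these assumptions the cyclic Rock-Paper-Scissors population scores higher than the population of identical strategies, which matches the intuition that redundant strategies add no diversity.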

The talk and the corresponding paper were published at the ICML 2021 virtual conference.

Similar Papers

06/12/2021

Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games

Xiangyu Liu, Hangtian Jia, Ying Wen, Yaodong Yang, Yujing Hu, Yingfeng Chen, Changjie Fan, Zhipeng Hu

13:43

26/04/2020

Double Neural Counterfactual Regret Minimization

Hui Li, Kailiang Hu, Shaohua Zhang, Yuan Qi, Le Song

Keywords: Counterfactual Regret Minimization, Imperfect Information game, Neural Strategy, Deep Learning, Robust Sampling

4:49

09/07/2020

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium

Qiaomin Xie, Yudong Chen, Zhaoran Wang, Zhuoran Yang

Keywords: Reinforcement learning, Planning and control

15:16

06/12/2020

Joint Policy Search for Multi-agent Collaboration with Imperfect Information

Yuandong Tian, Qucheng Gong, Yu Jiang

3:32

02/02/2021

Solving Common-Payoff Games with Approximate Policy Iteration

Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot

18:14

04/08/2021

Online Learning with Simple Predictors and a Combinatorial Characterization of Minimax in 0/1 Games

Steve Hanneke, Roi Livni, Shay Moran

18:07

06/12/2021

Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded Rationality

Stefanos Leonardos, Georgios Piliouras, Kelly Spendlove

Keywords: reinforcement learning and planning

14:11

06/12/2021

Neural Auto-Curricula in Two-Player Zero-Sum Games

Xidong Feng, Oliver Slumbers, Ziyu Wan, Bo Liu, Stephen McAleer, Ying Wen, Jun Wang, Yaodong Yang

Keywords: deep learning, optimization, reinforcement learning and planning, meta learning

14:46

04/08/2021

Learning in Matrix Games can be Arbitrarily Complex

Gabriel P Andrade, Rafael Frongillo, Georgios Piliouras

14:59

18/07/2021

A Sharp Analysis of Model-based Reinforcement Learning with Self-Play

Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin

Keywords: Algorithms, Multitask and Transfer Learning, Algorithms, Meta-Learning; Applications, Object Recognition; Data, Challenges, Implementations, and Software, Benchmarks; Theory, RL, Decisions and Control Theory

4:49

19/08/2021

Temporal Induced Self-Play for Stochastic Bayesian Games

Weizhe Chen, Zihan Zhou, Yi Wu, Fei Fang

Keywords: Agent-based and Multi-agent Systems, Multi-agent Learning, Applications of Reinforcement Learning

11:52

23/08/2020

Diverse rule sets

Guangyi Zhang, Aristides Gionis

Keywords: sampling, classifier, pattern mining, rule learning, diversification, rule sets

9:41

12/07/2020

Provable Self-Play Algorithms for Competitive Reinforcement Learning

Yu Bai, Chi Jin

Keywords: Reinforcement Learning - Theory

14:28

13/04/2021

On the suboptimality of negative momentum for minimax optimization

Guodong Zhang, Yuanhao Wang

3:11

02/02/2021

On the Approximation of Nash Equilibria in Sparse Win-Lose Multi-player Games

Zhengyang Liu, Jiawei Li, Xiaotie Deng

16:33

06/12/2020

Follow the Perturbed Leader: Optimism and Fast Parallel Algorithms for Smooth Minimax Games

Arun Suggala, Praneeth Netrapalli

3:29

03/05/2021

Taming GANs with Lookahead-Minmax

Tatjana Chavdarova, Matteo Pagliardini, Sebastian Stich, François Fleuret, Martin Jaggi

Keywords: Generative Adversarial Networks, Minmax

5:25

18/07/2021

Trajectory Diversity for Zero-Shot Coordination

Andrei Lupu, Brandon Cui, Hengyuan Hu, Jakob Foerster

Keywords: Reinforcement Learning and Planning, Multi-Agent RL

5:16

02/02/2021

Programmatic Strategies for Real-Time Strategy Games

Julian R. H. Mariño, Rubens O. Moraes, Tassiana C. Oliveira, Claudio Toledo, Levi H. S. Lelis

19:22

02/02/2021

Computing Quantal Stackelberg Equilibrium in Extensive-Form Games

Jakub Černý, Viliam Lisý, Branislav Bošanský, Bo An

15:01

02/02/2021

Hindsight and Sequential Rationality of Correlated Play

Dustin Morrill, Ryan D'Orazio, Reca Sarfati, Marc Lanctot, James R Wright, Amy R Greenwald, Michael Bowling

18:34

02/02/2021

Newton Optimization on Helmholtz Decomposition for Continuous Games

Giorgia Ramponi, Marcello Restelli

17:15

26/08/2020

A Tight and Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Differentiable Games

Waïss Azizian, Ioannis Mitliagkas, Simon Lacoste-Julien, Gauthier Gidel

14:40

18/07/2021

Dissecting Supervised Contrastive Learning

Florian Graf, Christoph Hofer, Marc Niethammer, Roland Kwitt

Keywords: Theory, Deep learning Theory

17:13

06/12/2021

Optimal Algorithms for Stochastic Contextual Preference Bandits

Aadirupa Saha

Keywords: bandits

16:00

06/12/2021

Differentiable Equilibrium Computation with Decision Diagrams for Stackelberg Models of Combinatorial Congestion Games

Shinsaku Sakaue, Kengo Nakamura

Keywords: optimization

15:07

06/12/2020

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

Stephen McAleer, J.B. Lanier, Roy Fox, Pierre Baldi

3:12

12/07/2020

Converging to Team-Maxmin Equilibria in Zero-Sum Multiplayer Games

Youzhi Zhang, Bo An

Keywords: Learning Theory

15:52

02/02/2021

Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games

Gabriele Farina, Tuomas Sandholm

17:09

06/12/2021

On The Structure of Parametric Tournaments with Application to Ranking from Pairwise Comparisons

Vishnu Veerathu, Arun Rajkumar

Keywords: theory

14:53

18/07/2021

A New Formalism, Method and Open Issues for Zero-Shot Coordination

Johannes Treutlein, Michael Dennis, Caspar Oesterheld, Jakob Foerster

Keywords: Reinforcement Learning and Planning, Multi-Agent RL

5:28

18/07/2021

Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions

Shuang Qiu, Xiaohan Wei, Jieping Ye, Zhaoran Wang, Zhuoran Yang

Keywords: Reinforcement Learning and Planning

11:21

19/08/2021

Choosing the Right Algorithm With Hints From Complexity Theory

Shouda Wang, Weijie Zheng, Benjamin Doerr

Keywords: Heuristic Search and Game Playing, Combinatorial Search and Optimisation, Heuristic Search, Meta-Reasoning and Meta-Heuristics

13:54

12/07/2020

Implicit Learning Dynamics in Stackelberg Games: Equilibria Characterization, Convergence Analysis, and Empirical Study

Tanner Fiez, Benjamin Chasnov, Lillian Ratliff

Keywords: Learning Theory

15:14

02/02/2021

Finding and Certifying (Near-)Optimal Strategies in Black-Box Extensive-Form Games

Brian Hu Zhang, Tuomas Sandholm

15:00

18/07/2021

Batch Value-function Approximation with Only Realizability

Tengyang Xie, Nan Jiang

Keywords: Algorithms, Multitask and Transfer Learning, Algorithms, Unsupervised Learning; Applications, Image Segmentation, Theory, RL, Decisions and Control Theory

5:05

06/12/2020

Near-Optimal Reinforcement Learning with Self-Play

Yu Bai, Chi Jin, Tiancheng Yu

Keywords: Theory -> Regularization, Applications -> Fairness, Accountability, and Transparency

3:33

03/05/2021

On the Impossibility of Global Convergence in Multi-Loss Optimization

Alistair Letcher

Keywords: convergence, descent, gradient, multi-player, global, impossibility, multi-loss, optimization, multi-agent

5:23

02/02/2021

Complexity and Algorithms for Exploiting Quantal Opponents in Large Two-Player Games

David Milec, Jakub Černý, Viliam Lisý, Bo An

14:22

06/12/2021

Recurrent Submodular Welfare and Matroid Blocking Semi-Bandits

Orestis Papadigenopoulos, Constantine Caramanis

Keywords: bandits

12:28