“Other-Play” for Zero-Shot Coordination

12/07/2020

Hengyuan Hu, Alexander Peysakhovich, Adam Lerer, Jakob Foerster

Keywords: Planning, Control, and Multiagent Learning

Abstract: We consider the problem of zero-shot coordination: constructing AI agents that can coordinate with novel partners they have not seen before (e.g., humans). Standard Multi-Agent Reinforcement Learning (MARL) methods typically focus on the self-play (SP) setting, where agents construct strategies by playing the game with themselves repeatedly. Unfortunately, applying SP naively to the zero-shot coordination problem can produce agents that establish highly specialized conventions that do not carry over to novel partners they have not been trained with. We introduce a novel learning algorithm called other-play (OP) that enhances self-play by looking for more robust strategies. We characterize OP theoretically as well as experimentally. We study the cooperative card game Hanabi and show that OP agents achieve higher scores than SP agents when paired with independently trained agents as well as with human players.
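
The abstract does not spell out the OP objective, so the following is only a rough sketch. It assumes a hypothetical ten-lever matching game and assumes that "more robust" means scoring a strategy against a partner whose arbitrary action labels have been shuffled by a random symmetry of the game. The payoffs, the names `self_play_value` and `other_play_value`, and the game itself are illustrative choices, not taken from this page.

```python
from fractions import Fraction

# Hypothetical toy game (an assumption, not taken from this page): both players
# pick one of ten levers and are paid only if they pick the same one.
PAYOFFS = [Fraction(1)] * 9 + [Fraction(9, 10)]
EQUIVALENT = set(range(9))  # levers that differ only by their arbitrary label


def self_play_value(lever):
    """Return when an agent is paired with an exact copy of itself."""
    return PAYOFFS[lever]


def other_play_value(lever):
    """Return when the partner's lever labels are shuffled by a random symmetry."""
    # A symmetry may send an equivalent lever to any equivalent lever,
    # while the one distinct lever can only map to itself.
    images = sorted(EQUIVALENT) if lever in EQUIVALENT else [lever]
    matches = sum(1 for image in images if image == lever)
    return PAYOFFS[lever] * Fraction(matches, len(images))


best_sp = max(range(10), key=self_play_value)
best_op = max(range(10), key=other_play_value)
print("self-play pick:", best_sp, "value with a relabeled partner:", other_play_value(best_sp))
print("other-play pick:", best_op, "value with a relabeled partner:", other_play_value(best_op))
```

In this toy setting the self-play pick is one of the nine interchangeable levers, which pays 1 against a copy of itself but only 1/9 in expectation against a partner that settled on a different label. The shuffled objective instead prefers the single distinct lever, which pays 9/10 with any partner that reasons the same way.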

This is an embedded video. The talk and the respective paper were published at the ICML 2020 virtual conference.

Similar Papers

26/04/2020

Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning

Hengyuan Hu, Jakob N Foerster

Keywords: multi-agent RL, theory of mind

5:20

06/12/2021

K-level Reasoning for Zero-Shot Coordination in Hanabi

Brandon Cui, Hengyuan Hu, Luis Pineda, Jakob Foerster

Keywords: reinforcement learning and planning

11:40

06/12/2021

Neural Auto-Curricula in Two-Player Zero-Sum Games

Xidong Feng, Oliver Slumbers, Ziyu Wan, Bo Liu, Stephen McAleer, Ying Wen, Jun Wang, Yaodong Yang

Keywords: deep learning, optimization, reinforcement learning and planning, meta learning

14:46

12/07/2020

OPtions as REsponses: Grounding behavioural hierarchies in multi-agent reinforcement learning

Alexander Vezhnevets, Yuhuai Wu, Maria Eckstein, Rémi Leblond, Joel Z Leibo

Keywords: Reinforcement Learning - Deep RL

14:17

18/07/2021

Trajectory Diversity for Zero-Shot Coordination

Andrei Lupu, Brandon Cui, Hengyuan Hu, Jakob Foerster

Keywords: Reinforcement Learning and Planning, Multi-Agent RL

5:16

06/12/2021

Learning Diverse Policies in MOBA Games via Macro-Goals

Yiming Gao, Bei Shi, Xueying Du, Liang Wang, Guangwei Chen, Zhenjie Lian, Fuhao Qiu, GUOAN HAN, Weixuan Wang, Deheng Ye, Qiang Fu, Wei Yang, Lanxiao Huang

Keywords: reinforcement learning and planning

6:49

06/12/2021

Exploiting Opponents Under Utility Constraints in Sequential Games

Martino Bernasconi-de-Luca, Federico Cacciamani, Simone Fioravanti, Nicola Gatti, Alberto Marchesi, Francesco Trovò

Keywords: online learning

12:59

18/07/2021

Learning in Nonzero-Sum Stochastic Games with Potentials

David Mguni, Yutong Wu, Yali Du, Yaodong Yang, Ziyi Wang, M. Li, Ying Wen, Joel Jennings, Jun Wang

Keywords: Theory, Game Theory and Computational Economics

5:36

19/04/2021

An empirical study on the generalization power of neural representations learned via visual guessing games

Alessandro Suglia, Yonatan Bisk, Ioannis Konstas, Antonio Vergari, Emanuele Bastianelli, Andrea Vanzo, Oliver Lemon

7:16

06/12/2021

Evaluation of Human-AI Teams for Learned and Rule-Based Agents in Hanabi

Ho Chit Siu, Jaime Peña, Edenna Chen, Yutai Zhou, Victor Lopez, Kyle Palko, Kimberlee Chang, Ross Allen

Keywords: deep learning, reinforcement learning and planning, interpretability

9:15

03/05/2021

Chaos of Learning Beyond Zero-sum and Coordination via Game Decompositions

Yun Kuen Cheung, Yixin Tao

Keywords: Dynamical Systems, Volume Analysis, Follow-the-Regularized-Leader, Multiplicative Weights Update, Game Decomposition, Lyapunov Chaos, Learning in Games

3:53

06/12/2020

Adversarial Example Games

Joey Bose, Gauthier Gidel, Hugo Berard, Andre Cianflone, Pascal Vincent, Simon Lacoste-Julien, Will Hamilton

3:22

03/05/2021

On the Critical Role of Conventions in Adaptive Human-AI Collaboration

Andy Shih, Arjun Sawhney, Jovana Kondic, Stefano Ermon, Dorsa Sadigh

Keywords: human-AI collaboration, emergent behavior, Multi-agent games, transfer learning

5:10

06/12/2021

Emergent Discrete Communication in Semantic Spaces

Mycal Tucker, Huao Li, Siddharth Agrawal, Dana Hughes, Katia Sycara, Michael Lewis, Julie A Shah

Keywords: reinforcement learning and planning, language

14:56

19/08/2021

Combining Tree Search and Action Prediction for State-of-the-Art Performance in DouDiZhu

Yunsheng Zhang, Dong Yan, Bei Shi, Haobo Fu, Qiang Fu, Hang Su, Jun Zhu, Ning Chen

Keywords: Machine Learning, Reinforcement Learning, Game Playing and Machine Learning

12:03

02/02/2021

Hindsight and Sequential Rationality of Correlated Play

Dustin Morrill, Ryan D'Orazio, Reca Sarfati, Marc Lanctot, James R Wright, Amy R Greenwald, Michael Bowling

18:34

18/07/2021

Estimating $\alpha$-Rank from A Few Entries with Low Rank Matrix Completion

Yali Du, Xue Yan, Xu Chen, Jun Wang, Haifeng Zhang

Keywords: Optimization, Probabilistic Methods, Distributed Inference, Algorithms, Algorithms Evaluation

4:52

02/02/2021

NeuralAC: Learning Cooperation and Competition Effects for Match Outcome Prediction

Yin Gu, Qi Liu, Kai Zhang, Zhenya Huang, Runze Wu, Jianrong Tao

16:48

02/02/2021

Estimating α-Rank by Maximizing Information Gain

Tabish Rashid, Cheng Zhang, Kamil Ciosek

14:52

18/07/2021

DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning

Daochen Zha, Jingru Xie, Wenye Ma, Sheng Zhang, Xiangru Lian, Xia Hu, Ji Liu

Keywords: Reinforcement Learning and Planning, Deep RL

5:20

02/02/2021

Solving Common-Payoff Games with Approximate Policy Iteration

Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D'Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot

18:14

06/12/2021

The Utility of Explainable AI in Ad Hoc Human-Machine Teaming

Rohan Paleja, Muyleng Ghuy, Nadun Ranawaka Arachchige, Reed Jensen, Matthew Gombolay

Keywords: machine learning, interpretability

12:32

03/05/2021

The role of Disentanglement in Generalisation

Milton Montero, Casimir JH Ludwig, Rui Ponte Costa, Gaurav Malhotra, Jeffrey Bowers

Keywords: generalisation, compositional generalization, generative models, compositionality, variational autoencoders, disentanglement

4:16

06/12/2021

Machine versus Human Attention in Deep Reinforcement Learning Tasks

Suna (Sihang) Guo, Ruohan Zhang, Bo Liu, Yifeng Zhu, Dana Ballard, Mary Hayhoe, Peter Stone

Keywords: deep learning, reinforcement learning and planning, interpretability

13:13

12/07/2020

Provable Self-Play Algorithms for Competitive Reinforcement Learning

Yu Bai, Chi Jin

Keywords: Reinforcement Learning - Theory

14:28

26/04/2020

Double Neural Counterfactual Regret Minimization

Hui Li, Kailiang Hu, Shaohua Zhang, Yuan Qi, Le Song

Keywords: Counterfactual Regret Minimization, Imperfect Information game, Neural Strategy, Deep Learning, Robust Sampling

4:49

03/05/2021

Mastering Atari with Discrete World Models

Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba

Keywords: reinforcement learning, actor critic, model-based reinforcement learning, world models, Atari, planning

5:52

06/12/2020

Learning to Play No-Press Diplomacy with Best Response Policy Iteration

Thomas Anthony, Tom Eccles, Andrea Tacchetti, János Kramár, Ian Gemp, Thomas Hudson, Nicolas Porcel, Marc Lanctot, Julien Perolat, Richard Everett, Satinder Singh, Thore Graepel, Yoram Bachrach

3:23

06/12/2021

Learning to Simulate Self-driven Particles System with Coordinated Policy Optimization

Zhenghao Peng, Quanyi Li, Ka Ming Hui, Chunxiao Liu, Bolei Zhou

Keywords: optimization, reinforcement learning and planning

12:08

12/08/2020

Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference

Klas Leino, Matt Fredrikson

12:07

18/07/2021

SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II

Xiangjun Wang, Junxiao SONG, Penghui Qi, Peng Peng, Zhenkun Tang, Wei Zhang, Weimin Li, Xiongjun Pi, Jujie He, Chao Gao, Haitao Long, Quan Yuan

Keywords: Reinforcement Learning and Planning, Deep RL

5:11

18/07/2021

Emergent Social Learning via Multi-agent Reinforcement Learning

Kamal Ndousse, Douglas Eck, Sergey Levine, Natasha Jaques

Keywords: Reinforcement Learning and Planning, Multi-Agent RL

5:30

06/12/2021

Emergent Communication of Generalizations

Jesse Mu, Noah Goodman

Keywords: interpretability

11:26

03/05/2021

Meta-Learning of Structured Task Distributions in Humans and Machines

Sreejan Kumar, Ishita Dasgupta, Jonathan Cohen, Nathaniel Daw, Thomas L Griffiths

Keywords: reinforcement learning, compositionality, human cognition, meta-learning

5:18

26/04/2020

Explain Your Move: Understanding Agent Actions Using Focused Feature Saliency

Piyush Gupta, Nikaash Puri, Sukriti Verma, Dhruv Kayastha, Shripad Deshmukh, Balaji Krishnamurthy, Sameer Singh

Keywords: Deep Reinforcement Learning, Saliency maps, Chess, Atari games, Interpretable AI

4:59

25/04/2020

How Points and Theme Affect Performance and Experience in a Gamified Cognitive Task

Katelyn Wiley, Sarah Vedress, Regan Mandryk

Keywords: cognitive tasks, dot probe, games, gamification, assessment

7:25

16/11/2020

Positive-Unlabeled Reward Learning

Danfei Xu, Misha Denil

5:04

26/04/2020

NAS evaluation is frustratingly hard

Antoine Yang, Pedro M. Esperança, Fabio M. Carlucci

Keywords: neural architecture search, nas, benchmark, reproducibility, harking

4:56

06/12/2020

Calibration of Shared Equilibria in General Sum Partially Observable Markov Games

Nelson Vadori, Sumitra Ganesh, Prashant Reddy, Manuela Veloso

3:18

06/12/2020

Contextual Games: Multi-Agent Learning with Side Information

Pier Giuseppe Sessa, Ilija Bogunovic, Andreas Krause, Maryam Kamgarpour

3:30