Independent Policy Gradient Methods for Competitive Reinforcement Learning

06/12/2020

Constantinos Daskalakis, Dylan Foster, Noah Golowich

Keywords: Applications -> Web Applications and Internet Data; Theory -> Learning Theory; Probabilistic Methods -> Causal Inference

Abstract: We obtain global, non-asymptotic convergence guarantees for independent learning algorithms in competitive reinforcement learning settings with two agents (i.e., zero-sum stochastic games). We consider an episodic setting where in each episode, each player independently selects a policy and observes only their own actions and rewards, along with the state. We show that if both players run policy gradient methods in tandem, their policies will converge to a min-max equilibrium of the game, as long as their learning rates follow a two-timescale rule (which is necessary). To the best of our knowledge, this constitutes the first finite-sample convergence result for independent learning in competitive RL, as prior work has largely focused on centralized/coordinated procedures for equilibrium computation.
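
As a rough illustration of the two-timescale rule mentioned in the abstract, the toy sketch below runs simultaneous projected gradient descent-ascent on matching pennies, a single-state zero-sum game. It is not the authors' algorithm: exact gradients stand in for the sampled policy gradient estimates of the episodic setting, the step sizes `eta_slow` and `eta_fast` are arbitrary illustrative values, and all function names are hypothetical. It assumes that "two-timescale" means the minimizing player uses a much smaller learning rate than the maximizing player, and it inspects the slow player's averaged policy.

```python
def clip01(z):
    """Project a scalar onto the interval [0, 1]."""
    return min(1.0, max(0.0, z))


def two_timescale_gda(p0=0.9, q0=0.5, eta_slow=0.01, eta_fast=0.5, steps=5000):
    """Toy two-timescale gradient descent-ascent on matching pennies.

    p is the min-player's probability of its first action, q the max-player's.
    The expected payoff to the max-player is f(p, q) = (2p - 1) * (2q - 1).
    Assumption: the min-player is the slow learner (eta_slow << eta_fast).
    """
    p, q = p0, q0
    p_sum = 0.0
    for _ in range(steps):
        p_sum += p
        # Exact partial derivatives of f; a sampled estimate built from each
        # player's own actions and rewards would replace these in practice.
        grad_p = 2.0 * (2.0 * q - 1.0)
        grad_q = 2.0 * (2.0 * p - 1.0)
        # Simultaneous updates: the min-player descends, the max-player ascends.
        p = clip01(p - eta_slow * grad_p)
        q = clip01(q + eta_fast * grad_q)
    return p_sum / steps, p, q


def min_player_exploitability(p):
    """Best-response payoff against p; the game value is 0, attained at p = 0.5."""
    return abs(2.0 * p - 1.0)


if __name__ == "__main__":
    avg_p, last_p, last_q = two_timescale_gda()
    print("averaged min-player policy:", avg_p)
    print("exploitability of the average:", min_player_exploitability(avg_p))
```

In this toy run the last iterate keeps oscillating around the equilibrium, so the averaged policy of the slow player is the more meaningful quantity to look at.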

The talk and the corresponding paper were published at the NeurIPS 2020 virtual conference.

Similar Papers

04/08/2021

Last-iterate Convergence of Decentralized Optimistic Gradient Descent/Ascent in Infinite-horizon Competitive Markov Games

Chen-Yu Wei, Chung-Wei Lee, Mengxiao Zhang, Haipeng Luo

18:24

06/12/2021

Decentralized Q-learning in Zero-sum Markov Games

Muhammed Sayin, Kaiqing Zhang, David Leslie, Tamer Basar, Asuman Ozdaglar

Keywords: reinforcement learning and planning

15:07

06/12/2020

Near-Optimal Reinforcement Learning with Self-Play

Yu Bai, Chi Jin, Tiancheng Yu

Keywords: Theory -> Regularization; Applications -> Fairness, Accountability, and Transparency

3:33

06/12/2021

Learning in two-player zero-sum partially observable Markov games with perfect recall

Tadashi Kozuno, Pierre Ménard, Remi Munos, Michal Valko

Keywords: reinforcement learning and planning, bandits, online learning

9:31

18/07/2021

Learning While Playing in Mean-Field Games: Convergence and Optimality

Qiaomin Xie, Zhuoran Yang, Zhaoran Wang, Andreea Minca

Keywords: Applications, Privacy, Anonymity, and Security, Algorithms, Components Analysis (e.g., CCA, ICA, LDA, PCA), Reinforcement Learning and Planning, Multi-Agent RL

5:24

02/02/2021

Evolutionary Game Theory Squared: Evolving Agents in Endogenously Evolving Zero-Sum Games

Stratis Skoulakis, Tanner Fiez, Ryann Sim, Georgios Piliouras, Lillian Ratliff

20:14

26/04/2020

Actor-Critic Provably Finds Nash Equilibria of Linear-Quadratic Mean-Field Games

Zuyue Fu, Zhuoran Yang, Yongxin Chen, Zhaoran Wang

5:09

09/07/2020

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium

Qiaomin Xie, Yudong Chen, Zhaoran Wang, Zhuoran Yang

Keywords: Reinforcement learning, Planning and control

15:16

18/07/2021

A Sharp Analysis of Model-based Reinforcement Learning with Self-Play

Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin

Keywords: Algorithms, Multitask and Transfer Learning, Algorithms, Meta-Learning; Applications, Object Recognition; Data, Challenges, Implementations, and Software, Benchmarks; Theory, RL, Decisions and Control Theory

4:49

12/07/2020

Provable Self-Play Algorithms for Competitive Reinforcement Learning

Yu Bai, Chi Jin

Keywords: Reinforcement Learning - Theory

14:28

19/08/2021

Temporal Induced Self-Play for Stochastic Bayesian Games

Weizhe Chen, Zihan Zhou, Yi Wu, Fei Fang

Keywords: Agent-based and Multi-agent Systems, Multi-agent Learning, Applications of Reinforcement Learning

11:52

18/07/2021

Learning in Nonzero-Sum Stochastic Games with Potentials

David Mguni, Yutong Wu, Yali Du, Yaodong Yang, Ziyi Wang, M. Li, Ying Wen, Joel Jennings, Jun Wang

Keywords: Theory, Game Theory and Computational Economics

5:36

12/07/2020

Implicit Learning Dynamics in Stackelberg Games: Equilibria Characterization, Convergence Analysis, and Empirical Study

Tanner Fiez, Benjamin Chasnov, Lillian Ratliff

Keywords: Learning Theory

15:14

02/02/2021

Convergence Analysis of No-Regret Bidding Algorithms in Repeated Auctions

Zhe Feng, Guru Guruganesh, Christopher Liaw, Aranyak Mehta, Abhishek Sethi

20:14

06/12/2021

Global Convergence to Local Minmax Equilibrium in Classes of Nonconvex Zero-Sum Games

Tanner Fiez, Lillian Ratliff, Eric Mazumdar, Evan Faulkner, Adhyyan Narang

Keywords: theory, optimization

15:13

06/12/2021

Reinforcement Learning in Reward-Mixing MDPs

Jeongyeol Kwon, Yonathan Efroni, Constantine Caramanis, Shie Mannor

Keywords: theory, reinforcement learning and planning

12:57

06/12/2021

Optimality and Stability in Federated Learning: A Game-theoretic Approach

Kate Donahue, Jon Kleinberg

Keywords: theory, federated learning

12:30

13/04/2021

Reinforcement learning for mean field games with strategic complementarities

Kiyeob Lee, Desik Rengarajan, Dileep Kalathil, Srinivas Shakkottai

2:57

06/12/2021

XDO: A Double Oracle Algorithm for Extensive-Form Games

Stephen McAleer, JB Lanier, Kevin A Wang, Pierre Baldi, Roy Fox

Keywords: reinforcement learning and planning

14:51

06/12/2020

Pipeline PSRO: A Scalable Approach for Finding Approximate Nash Equilibria in Large Games

Stephen McAleer, J.B. Lanier, Roy Fox, Pierre Baldi

3:12

04/08/2021

Survival of the strictest: Stable and unstable equilibria under regularized learning with partial information

Angeliki Giannou, Emmanouil Vasileios Vlatakis-Gkaragkounis, Panayotis Mertikopoulos

16:33

02/02/2021

Decentralized Policy Gradient Descent Ascent for Safe Multi-Agent Reinforcement Learning

Songtao Lu, Kaiqing Zhang, Tianyi Chen, Tamer Başar, Lior Horesh

16:54

18/07/2021

Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games

Hongyi Guo, Zuyue Fu, Zhuoran Yang, Zhaoran Wang

Keywords: Reinforcement Learning and Planning

4:22

12/07/2020

Optimally Solving Two-Agent Decentralized POMDPs Under One-Sided Information Sharing

Yuxuan Xie, Jilles Dibangoye, Olivier Buffet

Keywords: Planning, Control, and Multiagent Learning

8:59

06/12/2021

Exploration-Exploitation in Multi-Agent Competition: Convergence with Bounded Rationality

Stefanos Leonardos, Georgios Piliouras, Kelly Spendlove

Keywords: reinforcement learning and planning

14:11

18/07/2021

Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees

Kishan Panaganti, Dileep Kalathil

Keywords: Theory, RL, Decisions and Control Theory

5:15

02/02/2021

Evolution Strategies for Approximate Solution of Bayesian Games

Zun Li, Michael P. Wellman

18:18

26/08/2020

Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity

Aaron Sidford, Mengdi Wang, Lin Yang, Yinyu Ye

14:51

18/07/2021

A New Formalism, Method and Open Issues for Zero-Shot Coordination

Johannes Treutlein, Michael Dennis, Caspar Oesterheld, Jakob Foerster

Keywords: Reinforcement Learning and Planning, Multi-Agent RL

5:28

04/08/2021

Near Optimal Distributed Learning of Halfspaces with Two Parties

Mark Braverman, Gillat Kol, Shay Moran, Raghuvansh R. Saxena

16:43

18/07/2021

Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions

Shuang Qiu, Xiaohan Wei, Jieping Ye, Zhaoran Wang, Zhuoran Yang

Keywords: Reinforcement Learning and Planning

11:21

18/07/2021

Trajectory Diversity for Zero-Shot Coordination

Andrei Lupu, Brandon Cui, Hengyuan Hu, Jakob Foerster

Keywords: Reinforcement Learning and Planning, Multi-Agent RL

5:16

06/12/2021

Fast Policy Extragradient Methods for Competitive Games with Entropy Regularization

Shicong Cen, Yuting Wei, Yuejie Chi

Keywords: theory, optimization, reinforcement learning and planning

12:57

13/04/2021

Reinforcement learning for constrained Markov decision processes

Ather Gattami, Qinbo Bai, Vaneet Aggarwal

3:08

13/04/2021

Sample complexity bounds for two timescale value-based reinforcement learning algorithms

Tengyu Xu, Yingbin Liang

2:57

06/12/2020

Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions

Matthew Faw, Rajat Sen, Karthikeyan Shanmugam, Constantine Caramanis, Sanjay Shakkottai

3:24

26/08/2020

Finite-Time Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation

Jun Sun, Gang Wang, Georgios B. Giannakis, Qinmin Yang, Zaiyue Yang

17:07

06/12/2020

Cooperative Multi-player Bandit Optimization

Ilai Bistritz, Nicholas Bambos

3:13

02/02/2021

Loop Estimator for Discounted Values in Markov Reward Processes

Falcon Z. Dai, Matthew R. Walter

21:51

12/07/2020

Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games

Darren Lin, Zhengyuan Zhou, Panayotis Mertikopoulos, Michael Jordan

Keywords: Learning Theory

12:31