Last-iterate Convergence in Extensive-Form Games

Abstract: Regret-based algorithms are highly efficient at finding approximate Nash equilibria in sequential games such as poker. However, most regret-based algorithms, including counterfactual regret minimization (CFR) and its variants, rely on iterate averaging to achieve convergence. Inspired by recent advances in last-iterate convergence of optimistic algorithms in zero-sum normal-form games, we study this phenomenon in sequential games and provide a comprehensive analysis of last-iterate convergence for zero-sum extensive-form games with perfect recall (EFGs), using various optimistic regret-minimization algorithms over treeplexes. These include algorithms using the vanilla entropy or squared Euclidean norm regularizers, as well as their dilated versions, which admit more efficient implementations. In contrast to CFR, we show that all of these algorithms enjoy last-iterate convergence, with some of them even converging exponentially fast. We also provide experiments to further support our theoretical results.
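
To make the contrast between averaged and last-iterate behavior concrete, the following is a minimal sketch of the normal-form setting that the abstract cites as inspiration, not the paper's treeplex algorithms. It assumes a 2x2 matching-pennies game, an entropy regularizer (multiplicative weights), a step size of 0.1, and hypothetical helper names (`mw_step`, `duality_gap`, `run`); none of these come from the paper.

```python
import math

# Matching pennies (assumed example game): the row player x minimizes
# x^T A y, the column player y maximizes it. Unique equilibrium: (0.5, 0.5).
A = [[1.0, -1.0],
     [-1.0, 1.0]]

def loss_x(y):
    # Gradient of x^T A y with respect to x, i.e. A y.
    return [sum(A[i][j] * y[j] for j in range(2)) for i in range(2)]

def gain_y(x):
    # Gradient of x^T A y with respect to y, i.e. A^T x.
    return [sum(A[i][j] * x[i] for i in range(2)) for j in range(2)]

def mw_step(p, direction, eta):
    # Entropy-regularized (multiplicative weights) step on the simplex.
    w = [pi * math.exp(eta * d) for pi, d in zip(p, direction)]
    total = sum(w)
    return [wi / total for wi in w]

def duality_gap(x, y):
    # max_y' x^T A y' - min_x' x'^T A y; zero exactly at a Nash equilibrium.
    return max(gain_y(x)) - min(loss_x(y))

def run(optimistic, eta=0.1, steps=3000):
    x, y = [0.8, 0.2], [0.3, 0.7]
    prev_gx, prev_gy = loss_x(y), gain_y(x)
    for _ in range(steps):
        gx, gy = loss_x(y), gain_y(x)
        if optimistic:
            # Optimistic prediction: step along 2 * current - previous gradient.
            dx = [-(2 * g - pg) for g, pg in zip(gx, prev_gx)]
            dy = [2 * g - pg for g, pg in zip(gy, prev_gy)]
        else:
            # Vanilla multiplicative weights: step along the current gradient.
            dx = [-g for g in gx]
            dy = list(gy)
        x, y = mw_step(x, dx, eta), mw_step(y, dy, eta)
        prev_gx, prev_gy = gx, gy
    return x, y  # the last iterate, with no averaging

x_opt, y_opt = run(optimistic=True)
x_mw, y_mw = run(optimistic=False)
print("optimistic last-iterate gap:", duality_gap(x_opt, y_opt))
print("vanilla last-iterate gap:   ", duality_gap(x_mw, y_mw))
```

In this toy game the optimistic variant's final strategies should sit close to the uniform equilibrium, while the vanilla update's last iterate keeps cycling away from it and only its running average would approach equilibrium. A simplex is the simplest treeplex; the extensive-form case studied in the paper would need the treeplex regularizers described in the abstract.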

12/07/2020

Non-convex optimization, Combinatorial optimization, Computational complexity, High-dimensional statistics, Unsupervised and semi-supervised learning

15:31

06/12/2020

Chung-Wei Lee, Christian Kroer, Haipeng Luo

Similar Papers

Stochastic Regret Minimization in Extensive-Form Games

Gabriele Farina, Christian Kroer, Tuomas Sandholm

Structure learning in polynomial time: Greedy algorithms, Bregman information, and exponential families

Goutham Rajendran, Bohdan Kivva, Ming Gao, Bryon Aragam

Increasing Iterate Averaging for Solving Saddle-Point Problems

Yuan Gao, Christian Kroer, Donald Goldfarb

Low-Variance and Zero-Variance Baselines for Extensive-Form Games

Trevor Davis, Martin Schmid, Michael Bowling

Planning, Control, and Multiagent Learning

Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits

Shinji Ito, Shuichi Hirahara, Tasuku Soma, Yuichi Yoshida

Upper bounds for Model-Free Row-Sparse Principal Component Analysis

Guanyi Wang, Santanu Dey

A Greedy Anytime Algorithm for Sparse PCA

Dan Vilenchik, Adam Soffer, Guy Holtzman

Non-convex optimization, Combinatorial optimization, Computational complexity, High-dimensional statistics, Unsupervised and semi-supervised learning

Flexible mean field variational inference using mixtures of non-overlapping exponential families

Double Neural Counterfactual Regret Minimization

Hui Li, Kailiang Hu, Shaohua Zhang, Yuan Qi, Le Song

Counterfactual Regret Minimization, Imperfect Information game, Neural Strategy, Deep Learning, Robust Sampling

One More Step Towards Reality: Cooperative Bandits with Imperfect Communication

Udari Madhushani, Abhimanyu Dubey, Naomi Leonard, Alex Pentland

Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games

Gabriele Farina, Tuomas Sandholm

A Tight Lower Bound and Efficient Reduction for Swap Regret

Bias no more: high-probability data-dependent regret bounds for adversarial bandits and MDPs

Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang

Marginalized Stochastic Natural Gradients for Black-Box Variational Inference

Geng Ji, Debora Sujono, Erik Sudderth

Probabilistic Methods, Approximate Inference

No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium

Andrea Celli, Alberto Marchesi, Gabriele Farina, Nicola Gatti

Double Oracle Algorithm for Computing Equilibria in Continuous Games

Lukáš Adam, Rostislav Horčík, Tomáš Kasl, Tomáš Kroupa

Iteratively Reweighted Least Squares for Basis Pursuit with Global Linear Convergence Rate

Christian Kümmerle, Claudio Mayrink Verdun, Dominik Stöger

theory, optimization, machine learning

Gradient-based Hyperparameter Optimization Over Long Horizons

Paul Micaelli, Amos Storkey

optimization, meta learning

Lazy-CFR: fast and near-optimal regret minimization for extensive games with imperfect information

Yichi Zhou, Tongzheng Ren, Jialian Li, Dong Yan, Jun Zhu

Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity

Nicolas Loizou, Hugo Berard, Gauthier Gidel, Ioannis Mitliagkas, Simon Lacoste-Julien

Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards

Aadirupa Saha, Pierre Gaillard, Michal Valko

Online Learning, Active Learning, and Bandits

MOTS: Minimax Optimal Thompson Sampling

Tianyuan Jin, Pan Xu, Jieming Shi, Xiaokui Xiao, Quanquan Gu

Algorithms, Online Learning Algorithms

Global Convergence and Variance Reduction for a Class of Nonconvex-Nonconcave Minimax Problems

Junchi Yang, Negar Kiyavash, Niao He

Graphical Models in Heavy-Tailed Markets
