Non-Stochastic Multi-Player Multi-Armed Bandits: Optimal Rate With Collision Information, Sublinear Without

09/07/2020

Sebastien Bubeck, Yuanzhi Li, Yuval Peres, Mark Sellke

Keywords: Bandit problems

Abstract: We consider the non-stochastic version of the (cooperative) multi-player multi-armed bandit problem. The model assumes no communication and no shared randomness at all between the players, and furthermore when two (or more) players select the same action this results in a maximal loss. We prove the first $\sqrt{T}$-type regret guarantee for this problem, assuming only two players, under the feedback model where collisions are announced to the colliding players. We also prove the first sublinear regret guarantee for the feedback model where collision information is not available, namely $T^{1-\frac{1}{2m}}$ where $m$ is the number of players.
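The loss model described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm; it only encodes the collision rule and the stated regret exponent. The function names, and the assumption that losses lie in [0, 1] so that the maximal loss equals 1, are introduced here for illustration only.

```python
from collections import Counter
from typing import List, Sequence

MAX_LOSS = 1.0  # assumption: losses are normalized to [0, 1]


def round_losses(actions: Sequence[int], arm_losses: Sequence[float]) -> List[float]:
    """Per-player loss for one round.

    A player who shares an arm with at least one other player
    suffers the maximal loss; otherwise the player suffers the
    loss assigned to that arm in this round.
    """
    counts = Counter(actions)
    return [MAX_LOSS if counts[a] > 1 else arm_losses[a] for a in actions]


def collision_flags(actions: Sequence[int]) -> List[bool]:
    """Feedback in the 'collisions are announced' model:
    each player learns whether they collided."""
    counts = Counter(actions)
    return [counts[a] > 1 for a in actions]


def no_collision_info_exponent(m: int) -> float:
    """Exponent of T in the regret bound T^(1 - 1/(2m)) for m players
    when collision information is not available."""
    return 1.0 - 1.0 / (2 * m)


# Three players, three arms: players 0 and 1 collide on arm 0.
losses = round_losses([0, 0, 2], [0.1, 0.5, 0.3])
flags = collision_flags([0, 0, 2])
```

For two players the exponent is 3/4, so the guarantee without collision information is $T^{3/4}$, compared with the $\sqrt{T}$-type rate when collisions are announced.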

The talk and the corresponding paper were published at the COLT 2020 virtual conference.

Similar Papers

09/07/2020

Coordination without communication: optimal regret in two players multi-armed bandits

Sebastien Bubeck, Thomas Budzinski

Keywords: Bandit problems

26/08/2020

A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players

Abbas Mehrabian, Etienne Boursier, Emilie Kaufmann, Vianney Perchet

04/08/2021

Cooperative and Stochastic Multi-Player Multi-Armed Bandit: Optimal Regret With Neither Communication Nor Collisions

Mark Sellke, Sebastien Bubeck, Thomas Budzinski

26/08/2020

The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits

Ronshee Chawla, Abishek Sankararaman, Ayalvadi Ganesh, Sanjay Shakkottai

12/07/2020

My Fair Bandit: Distributed Learning of Max-Min Fairness with Multi-player Bandits

Ilai Bistritz, Tavor Baharav, Amir Leshem, Nicholas Bambos

Keywords: Online Learning, Active Learning, and Bandits

26/08/2020

Optimal Algorithms for Multiplayer Multi-Armed Bandits

Po-An Wang, Alexandre Proutiere, Kaito Ariu, Yassir Jedra, Alessio Russo

26/08/2020

Decentralized Multi-player Multi-armed Bandits with No Collision Information

Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang

09/07/2020

Selfish Robustness and Equilibria in Multi-Player Bandits

Etienne Boursier, Vianney Perchet

Keywords: Bandit problems, Economics, game theory, and incentives

02/02/2021

Decentralized Multi-Agent Linear Bandits with Safety Constraints

Sanae Amani, Christos Thrampoulidis

26/08/2020

OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits

Niladri Chatterji, Vidya Muthukumar, Peter Bartlett

26/04/2020

Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication

Yuanhao Wang, Jiachen Hu, Xiaoyu Chen, Liwei Wang

Keywords: Theory, Bandit Algorithms, Communication Efficiency

22/06/2020

Contention resolution without collision detection

Michael A. Bender, Tsvi Kopelowitz, William Kuszmaul, Seth Pettie

Keywords: backoff, parallelism, networks, throughput

06/12/2020

On Regret with Multiple Best Arms

Yinglun Zhu, Robert Nowak

02/02/2021

DART: Adaptive Accept Reject Algorithm for Non-Linear Combinatorial Bandits

Mridul Agarwal, Vaneet Aggarwal, Abhishek Kumar Umrawal, Chris Quinn

09/07/2020

Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium

Qiaomin Xie, Yudong Chen, Zhaoran Wang, Zhuoran Yang

Keywords: Reinforcement learning, Planning and control

02/02/2021

Adaptive Algorithms for Multi-armed Bandit with Composite and Anonymous Feedback

Siwei Wang, Haoyun Wang, Longbo Huang

06/12/2020

Exploiting the Surrogate Gap in Online Multiclass Classification

Dirk van der Hoeven

12/07/2020

No-Regret Exploration in Goal-Oriented Reinforcement Learning

Jean Tarbouriech, Evrard Garcelon, Michal Valko, Matteo Pirotta, Alessandro Lazaric

Keywords: Reinforcement Learning - General

06/12/2020

No-Regret Learning and Mixed Nash Equilibria: They Do Not Mix

Manolis Vlatakis-Gkaragkounis, Lampros Flokas, Thanasis Lianeas, Panayotis Mertikopoulos, Georgios Piliouras

Keywords: Algorithms -> Semi-Supervised Learning; Applications -> Computer Vision; Deep Learning, Applications -> Computational Photography

06/12/2021

Dueling Bandits with Adversarial Sleeping

Aadirupa Saha, Pierre Gaillard

Keywords: optimization, bandits

04/08/2021

Adaptive Learning in Continuous Games: Optimal Regret Bounds and Convergence to Nash Equilibrium

Yu-Guan Hsieh, Kimon Antonakopoulos, Panayotis Mertikopoulos

18/07/2021

Online Learning in Unknown Markov Games

Yi Tian, Yuanhao Wang, Tiancheng Yu, Suvrit Sra

Keywords: Theory, RL, Decisions and Control Theory

06/12/2021

Cooperative Stochastic Bandits with Asynchronous Agents and Constrained Feedback

Lin Yang, Yu-Zhen Janice Chen, Stephen Pasteris, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley

Keywords: bandits

12/07/2020

Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards

Aadirupa Saha, Pierre Gaillard, Michal Valko

Keywords: Online Learning, Active Learning, and Bandits

06/12/2020

From Finite to Countable-Armed Bandits

Anand Kalvit, Assaf Zeevi

Keywords: Theory -> Control Theory

13/04/2021

Contextual blocking bandits

Soumya Basu, Orestis Papadigenopoulos, Constantine Caramanis, Sanjay Shakkottai

09/07/2020

Tsallis-INF for Decoupled Exploration and Exploitation in Multi-armed Bandits

Chloé Rouyer, Yevgeny Seldin

Keywords: Bandit problems, Online learning

06/12/2021

Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games

Yu Bai, Chi Jin, Huan Wang, Caiming Xiong

Keywords: theory, reinforcement learning and planning, bandits

18/07/2021

Trajectory Diversity for Zero-Shot Coordination

Andrei Lupu, Brandon Cui, Hengyuan Hu, Jakob Foerster

Keywords: Reinforcement Learning and Planning, Multi-Agent RL

03/08/2020

Regret Bounds for Decentralized Learning in Cooperative Multi-Agent Dynamical Systems

Seyed Mohammad Asghari, Yi Ouyang, Ashutosh Nayyar

06/12/2020

No-Regret Learning Dynamics for Extensive-Form Correlated Equilibrium

Andrea Celli, Alberto Marchesi, Gabriele Farina, Nicola Gatti

06/12/2021

Decentralized Q-learning in Zero-sum Markov Games

Muhammed Sayin, Kaiqing Zhang, David Leslie, Tamer Basar, Asuman Ozdaglar

Keywords: reinforcement learning and planning

18/07/2021

Beyond $\log^2(T)$ regret for decentralized bandits in matching markets

Soumya Basu, Karthik Abinav Sankararaman, Abishek Sankararaman

Keywords: Reinforcement Learning and Planning, Bandits

12/07/2020

Provable Self-Play Algorithms for Competitive Reinforcement Learning

Yu Bai, Chi Jin

Keywords: Reinforcement Learning - Theory

26/08/2020

Thompson Sampling for Linearly Constrained Bandits

Vidit Saxena, Joakim Jalden, Joseph Gonzalez

26/04/2020

Double Neural Counterfactual Regret Minimization

Hui Li, Kailiang Hu, Shaohua Zhang, Yuan Qi, Le Song

Keywords: Counterfactual Regret Minimization, Imperfect Information game, Neural Strategy, Deep Learning, Robust Sampling

02/02/2021

Computing Quantal Stackelberg Equilibrium in Extensive-Form Games

Jakub Černý, Viliam Lisý, Branislav Bošanský, Bo An

06/12/2020

Simple and Fast Algorithm for Binary Integer and Online Linear Programming

Xiaocheng Li, Chunlin Sun, Yinyu Ye

09/07/2020

Efficient and robust algorithms for adversarial linear contextual bandits

Gergely Neu, Julia Olkhovskaya

Keywords: Bandit problems, Online learning

06/12/2020

Adversarial Blocking Bandits

Nicholas Bishop, Hau Chan, Debmalya Mandal, Long Tran-Thanh
