My Fair Bandit: Distributed Learning of Max-Min Fairness with Multi-player Bandits

Abstract: Consider N cooperative but non-communicating players, each of whom plays one of M arms in each of T turns. Players have different utilities for each arm, representable as an N×M matrix, and these utilities are unknown to the players. In each turn, a player receives a noisy observation of their utility for the arm they selected. If two or more players select the same arm in a turn, however, they all receive zero utility because of the conflict. No other communication or coordination between the players is possible. Our goal is to design a distributed algorithm that learns the matching between players and arms that achieves max-min fairness while minimizing the regret. We present an algorithm and prove that it is regret optimal up to a log(log T) factor. This is the first max-min fairness multi-player bandit algorithm with (near) order optimal regret.
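To make the objective in the abstract concrete, here is a minimal sketch of the collision model and of the max-min fair matching it targets. This is not the paper's algorithm: it assumes the utility matrix is already known and searches every assignment centrally, whereas the paper's players must learn the matching without communicating. The function names `collision_rewards` and `max_min_matching` and the example matrix are hypothetical, chosen only for illustration.

```python
from itertools import permutations


def collision_rewards(utilities, choices):
    """Noise-free reward per player: own utility if alone on the arm, else 0."""
    counts = {}
    for arm in choices:
        counts[arm] = counts.get(arm, 0) + 1
    return [
        utilities[player][arm] if counts[arm] == 1 else 0.0
        for player, arm in enumerate(choices)
    ]


def max_min_matching(utilities):
    """Brute-force the collision-free assignment that maximizes the minimum utility.

    Assumes the N x M matrix is known and N <= M (illustration only).
    """
    n_players = len(utilities)
    n_arms = len(utilities[0])
    best_value, best_assignment = float("-inf"), None
    # Each permutation gives every player a distinct arm, so no collisions occur.
    for assignment in permutations(range(n_arms), n_players):
        value = min(utilities[p][assignment[p]] for p in range(n_players))
        if value > best_value:
            best_value, best_assignment = value, assignment
    return best_assignment, best_value


# Hypothetical 2-player, 3-arm utility matrix.
utilities = [
    [0.9, 0.2, 0.5],
    [0.8, 0.7, 0.1],
]
assignment, value = max_min_matching(utilities)
collided = collision_rewards(utilities, [0, 0])  # both players pick arm 0
```

In this example both players prefer arm 0, but choosing it together yields zero for each, while the max-min matching gives the worse-off player the highest utility any collision-free assignment can guarantee. The brute-force search is exponential in the number of players and serves only to define the target.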

26/08/2020

Ilai Bistritz, Tavor Baharav, Amir Leshem, Nicholas Bambos

Similar Papers

Optimal Algorithms for Multiplayer Multi-Armed Bandits

Po-An Wang, Alexandre Proutiere, Kaito Ariu, Yassir Jedra, Alessio Russo

The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits

Ronshee Chawla, Abishek Sankararaman, Ayalvadi Ganesh, Sanjay Shakkottai

Non-Stochastic Multi-Player Multi-Armed Bandits: Optimal Rate With Collision Information, Sublinear Without

Sebastien Bubeck, Yuanzhi Li, Yuval Peres, Mark Sellke

Bandit problems

Cooperative and Stochastic Multi-Player Multi-Armed Bandit: Optimal Regret With Neither Communication Nor Collisions

Mark Sellke, Sebastien Bubeck, Thomas Budzinski

Slot Machines: Discovering Winning Combinations of Random Weights in Neural Networks

Maxwell M Aladago, Lorenzo Torresani

Deep Learning, Optimization for Deep Networks

A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players

Abbas Mehrabian, Etienne Boursier, Emilie Kaufmann, Vianney Perchet

Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards

Aadirupa Saha, Pierre Gaillard, Michal Valko

Online Learning, Active Learning, and Bandits

Double Neural Counterfactual Regret Minimization

Hui Li, Kailiang Hu, Shaohua Zhang, Yuan Qi, Le Song

Counterfactual Regret Minimization, Imperfect Information game, Neural Strategy, Deep Learning, Robust Sampling

Adaptive Learning in Continuous Games: Optimal Regret Bounds and Convergence to Nash Equilibrium

Yu-Guan Hsieh, Kimon Antonakopoulos, Panayotis Mertikopoulos

Equilibrium Refinement for the Age of Machines: The One-Sided Quasi-Perfect Equilibrium

Gabriele Farina, Tuomas Sandholm

On Regret with Multiple Best Arms

Yinglun Zhu, Robert Nowak

Selfish Robustness and Equilibria in Multi-Player Bandits

Etienne Boursier, Vianney Perchet

Bandit problems, Economics, game theory, and incentives

Sparsity-Agnostic Lasso Bandit

Min-hwan Oh, Garud Iyengar, Assaf Zeevi

Reinforcement Learning and Planning, Bandits

Coordination without communication: optimal regret in two players multi-armed bandits

Sebastien Bubeck, Thomas Budzinski

Bandit problems

On the Approximation of Nash Equilibria in Sparse Win-Lose Multi-player Games

Zhengyang Liu, Jiawei Li, Xiaotie Deng

Cooperative Stochastic Bandits with Asynchronous Agents and Constrained Feedback

Lin Yang, Yu-Zhen Janice Chen, Stephen Pasteris, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley

bandits

Online Multi-Armed Bandits with Adaptive Inference

Maria Dimakopoulou, Zhimei Ren, Zhengyuan Zhou

theory, reinforcement learning and planning, bandits, online learning, causality

Online Learning for Load Balancing of Unknown Monotone Resource Allocation Games

Ilai Bistritz, Nicholas Bambos

Theory, Game Theory and Computational Economics

Dueling Bandits with Adversarial Sleeping

Aadirupa Saha, Pierre Gaillard

optimization, bandits

Converging to Team-Maxmin Equilibria in Zero-Sum Multiplayer Games

Youzhi Zhang, Bo An

Learning Theory

DART: Adaptive Accept Reject Algorithm for Non-Linear Combinatorial Bandits

Mridul Agarwal, Vaneet Aggarwal, Abhishek Kumar Umrawal, Chris Quinn

Provably Efficient Fictitious Play Policy Optimization for Zero-Sum Markov Games with Structured Transitions

Shuang Qiu, Xiaohan Wei, Jieping Ye, Zhaoran Wang, Zhuoran Yang