Robust Multi-Agent Decision-Making with Heavy-Tailed Payoffs

Abstract: We study the heavy-tailed stochastic bandit problem in the cooperative multi-agent setting, where a group of agents interacts with a common bandit problem while communicating over a network with delays. Existing algorithms for the stochastic bandit in this setting rely on confidence intervals derived from an averaging-based communication protocol known as running consensus, which does not lend itself to robust estimation in heavy-tailed settings. We propose MP-UCB, a decentralized multi-agent algorithm for the cooperative stochastic bandit that combines robust estimation with a message-passing protocol. We prove optimal regret bounds for MP-UCB in several problem settings and demonstrate its superiority to existing methods. Furthermore, we establish the first lower bounds for the cooperative bandit problem and provide efficient algorithms for robust bandit estimation of location.
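To illustrate why plain averaging is fragile under heavy-tailed payoffs, here is a minimal sketch of one standard robust location estimator, median-of-means. The abstract does not say which robust estimator MP-UCB uses, so the choice of median-of-means, the function name `median_of_means`, and the parameter `num_blocks` are assumptions made only for illustration.

```python
import statistics


def median_of_means(samples, num_blocks):
    """Robust location estimate (illustrative sketch, not the paper's method).

    Splits the samples into `num_blocks` consecutive blocks of equal size,
    averages each block, and returns the median of the block means.
    Samples left over after the last full block are ignored.
    """
    if num_blocks < 1 or num_blocks > len(samples):
        raise ValueError("num_blocks must be between 1 and len(samples)")
    block_size = len(samples) // num_blocks
    block_means = [
        statistics.fmean(samples[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)


# Nine ordinary rewards and one extreme reward, as a heavy tail might produce.
rewards = [1.0] * 9 + [1000.0]

plain_mean = statistics.fmean(rewards)         # dominated by the outlier
robust_mean = median_of_means(rewards, 5)      # outlier affects one block only

print(plain_mean, robust_mean)
```

In this toy input the single extreme reward shifts the plain average far from the typical value, while it only contaminates one of the five block means, so the median of those means is unaffected.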

06/12/2021

Abhimanyu Dubey, Alex 'Sandy' Pentland

Similar Papers

One More Step Towards Reality: Cooperative Bandits with Imperfect Communication

Udari Madhushani, Abhimanyu Dubey, Naomi Leonard, Alex Pentland

Keywords: bandits

Kernel Methods for Cooperative Multi-Agent Learning with Delays

Abhimanyu Dubey, Alex 'Sandy' Pentland

Keywords: planning, control, and multiagent learning

Decentralized Multi-player Multi-armed Bandits with No Collision Information

Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang

Decentralized Multi-Agent Linear Bandits with Safety Constraints

Sanae Amani, Christos Thrampoulidis

Multi-Agent Reinforcement Learning in Stochastic Networked Systems

Yiheng Lin, Guannan Qu, Longbo Huang, Adam Wierman

Keywords: reinforcement learning and planning, graph learning

Bayesian decision-making under misspecified priors with applications to meta-learning

Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miro Dudik, Robert E Schapire

Keywords: meta learning, bandits

Learning from eXtreme Bandit Feedback

Romain Lopez, Inderjit S. Dhillon, Michael I. Jordan

Hybrid Regret Bounds for Combinatorial Semi-Bandits and Adversarial Linear Bandits

Shinji Ito

Keywords: bandits

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

Dylan Foster, Alexander Rakhlin, David Simchi-Levi, Yunzong Xu

Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition

Tiancheng Jin, Haipeng Luo

Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes

Yichun Hu, Nathan Kallus, Xiaojie Mao

Keywords: bandit problems

Nonasymptotic Guarantees for Spiked Matrix Recovery with Generative Priors

Jorio Cocola, Paul Hand, Vlad Voroninski

The one-way communication complexity of submodular maximization with applications to streaming and robustness

Moran Feldman, Ashkan Norouzi-Fard, Ola Svensson, Rico Zenklusen

Keywords: submodular maximization, approximation algorithms, robustness, streaming, communication complexity

Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games

Yu Bai, Chi Jin, Huan Wang, Caiming Xiong

Keywords: theory, reinforcement learning and planning, bandits

Tackling the Objective Inconsistency Problem in Heterogeneous Federated Optimization

Jianyu Wang, Qinghua Liu, Hao Liang, Gauri Joshi, H. Vincent Poor

Decentralized Policy Gradient Descent Ascent for Safe Multi-Agent Reinforcement Learning

Songtao Lu, Kaiqing Zhang, Tianyi Chen, Tamer Başar, Lior Horesh

OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits

Niladri Chatterji, Vidya Muthukumar, Peter Bartlett

On Optimal Robustness to Adversarial Corruption in Online Decision Problems

Shinji Ito

Keywords: robustness, adversarial robustness and security, bandits

Queue-Learning: A Reinforcement Learning Approach for Providing Quality of Service

Majid Raeis, Ali Tizghadam, Alberto Leon-Garcia

Stochastic bandits with linear constraints

Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett, Heinrich Jiang

Cooperative Stochastic Bandits with Asynchronous Agents and Constrained Feedback

Lin Yang, Yu-Zhen Janice Chen, Stephen Pasteris, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley

Keywords: bandits
