Regret Minimization in Heavy-Tailed Bandits

Abstract: We revisit the classic regret-minimization problem in the stochastic multi-armed bandit setting when the arm-distributions are allowed to be heavy-tailed. Regret minimization has been well studied in simpler settings of either bounded support reward distributions or distributions that belong to a single parameter exponential family. We work under the much weaker assumption that the moments of order \((1+\epsilon)\) are uniformly bounded by a known constant \(B\), for some given \( \epsilon > 0\). We propose an optimal algorithm that matches the lower bound exactly in the first-order term. We also give a finite-time bound on its regret. We show that our index concentrates faster than the well-known truncated or trimmed empirical mean estimators for the mean of heavy-tailed distributions. Computing our index can be computationally demanding. To address this, we develop a batch-based algorithm that is optimal up to a multiplicative constant depending on the batch size. We hence provide a controlled trade-off between statistical optimality and computational cost.

06/12/2020

Deep Learning, Attention Models, Applications, Time Series Analysis; Deep Learning, Predictive Models, Reinforcement Learning and Planning, Bandits

6:18

06/12/2021

Regret Minimization in Heavy-Tailed Bandits

Shubhada Agrawal, Sandeep K Juneja, Wouter M Koolen

Comments

Similar Papers

Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed Rewards

Kyungjae Lee, Hongjun Yang, Sungbin Lim, Songhwai Oh

Keywords Abstract Paper

Low-rank generalized linear bandit problems

Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

Keywords Abstract Paper

Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes

YICHUN HU, Nathan Kallus, Xiaojie Mao

Keywords Abstract Paper

Bandit problems,

Parameter-Free Multi-Armed Bandit Algorithms with Hybrid Data-Dependent Regret Bounds

Shinji Ito

Keywords Abstract Paper

Stochastic bandits with groups of similar arms.

Fabien Pesquerel, Hassan SABER, Odalric-Ambrym Maillard

Keywords Abstract Paper

optimization, generative model, bandits

Covariance-adapting algorithm for semi-bandits with application to sparse outcomes

Pierre Perrault, Vianney Perchet, Michal Valko

Keywords Abstract Paper

Bandit problems,

On information gain and regret bounds in gaussian process bandits

Sattar Vakili, Kia Khezeli, Victor Picheny

Keywords Abstract Paper

An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

Julian Katz-Samuels, Lalit Jain, zohar karnin, Kevin Jamieson

Keywords Abstract Paper

Adapting to Delays and Data in Adversarial Multi-Armed Bandits

András György, Pooria Joulani

Keywords Abstract Paper

Deep Learning, Attention Models, Applications, Time Series Analysis; Deep Learning, Predictive Models, Reinforcement Learning and Planning, Bandits

Bayesian decision-making under misspecified priors with applications to meta-learning

Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy and Daniel Hsu, Thodoris Lykouris, Miro Dudik, Robert E Schapire

Keywords Abstract Paper

meta learning, bandits

Improved Optimistic Algorithms for Logistic Bandits

Louis Faury, Marc Abeille, Clément Calauzènes, Olivier Fercoq

Keywords Abstract Paper

Online Learning, Active Learning, and Bandits

Greedy-GQ with Variance Reduction: Finite-time Analysis and Improved Complexity

Shaocong Ma, Ziyi Chen, Yi Zhou, Shaofeng Zou

Keywords Abstract Paper

Machine Learning, Reinforcement Learning, Optimization

On Regret with Multiple Best Arms

Yinglun Zhu, Robert Nowak

Keywords Abstract Paper

On Limited-Memory Subsampling Strategies for Bandits

Dorian Baudry, Yoan Russac, Olivier Cappé

Keywords Abstract Paper

Reinforcement Learning and Planning, Bandits

Non-Exponentially Weighted Aggregation: Regret Bounds for Unbounded Loss Functions

Pierre Alquier

Keywords Abstract Paper

Probabilistic Methods, Bayesian Methods

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

Keywords Abstract Paper

Thompson Sampling for Linearly Constrained Bandits

Vidit Saxena, Joakim Jalden, Joseph Gonzalez

Keywords Abstract Paper

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits

Andrea Tirinzoni, Matteo Pirotta, Marcello Restelli, Alessandro Lazaric

Keywords Abstract Paper

Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms

Mohsen Bayati, Nima Hamidi, Ramesh Johari, Khashayar Khosravi

Keywords Abstract Paper

Stochastic bandits with linear constraints

Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett, Heinrich Jiang

Keywords Abstract Paper

A Novel Confidence-Based Algorithm for Structured Bandits

Andrea Tirinzoni, Alessandro Lazaric, Marcello Restelli

Keywords Abstract Paper

Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits

Shinji Ito, Shuichi Hirahara, Tasuku Soma, Yuichi Yoshida

Keywords Abstract Paper

Doubly Robust Thompson Sampling with Linear Payoffs

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy and
Daniel Hsu, Thodoris Lykouris, Miro Dudik, Robert E Schapire

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Dheeraj Nagaraj, Xian Wu, Guy Bresler and
Prateek Jain, Praneeth Netrapalli

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Gen Li, Laixi Shi, Yuxin Chen and
Yuantao Gu, Yuejie Chi

Keywords Paper