Bandit algorithms: Letting go of logarithmic regret for statistical robustness

Abstract: We study regret minimization in a stochastic multi-armed bandit setting and establish a fundamental trade-off between the regret an algorithm suffers and its statistical robustness. Considering broad classes of underlying arm distributions, we show that bandit learning algorithms with logarithmic regret are always inconsistent, and that consistent learning algorithms always suffer super-logarithmic regret. This result highlights the inevitable statistical fragility of all “logarithmic regret” bandit algorithms available in the literature. For instance, if a UCB algorithm designed for 1-subGaussian distributions is used in a subGaussian setting with a mismatched variance parameter, the learning performance could be inconsistent. Next, we show a positive result: statistically robust and consistent learning performance is attainable if we allow the regret to be slightly worse than logarithmic. Specifically, we propose three classes of distribution-oblivious algorithms that achieve an asymptotic regret arbitrarily close to logarithmic.
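
The variance-mismatch scenario described in the abstract can be explored with a small simulation. The sketch below is not the authors' code and does not implement any of their three proposed algorithm classes; the names `run_ucb`, `log_bonus`, and `superlog_bonus`, the choice of Gaussian arms, and the exponent `eps` are all assumptions made for illustration. The first exploration term is a common choice for rewards assumed to be 1-subGaussian; the second grows slightly faster than logarithmically.

```python
import math
import random


def run_ucb(means, sigma_true, horizon, exploration, seed=0):
    """Run an index policy on Gaussian arms.

    Returns (pseudo_regret, pull_counts). `exploration(t)` is the numerator
    of the confidence width, so the index is mean + sqrt(exploration(t) / n).
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    best = max(means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialise the estimates
        else:
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(exploration(t) / counts[i]),
            )
        # True noise level sigma_true may differ from what the index assumes.
        reward = rng.gauss(means[arm], sigma_true)
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]
    return regret, counts


def log_bonus(t):
    # Assumed model: 1-subGaussian rewards, logarithmic exploration term.
    return 2.0 * math.log(t)


def superlog_bonus(t, eps=0.5):
    # Hypothetical slightly super-logarithmic term: log(t)^(1 + eps).
    return math.log(t) ** (1.0 + eps)
```

Running `run_ucb([0.0, 0.5], sigma_true=3.0, horizon=10_000, exploration=log_bonus)` and the same call with `superlog_bonus`, averaged over several seeds, gives a rough empirical feel for the trade-off. A finite simulation of this kind says nothing definitive about the asymptotic statements in the abstract.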

18/07/2021

Kumar Ashutosh, Jayakrishnan Nair, Anmol Kagrecha, Krishna Jagannathan

Similar Papers

Leveraging Good Representations in Linear Contextual Bandits

Matteo Papini, Andrea Tirinzoni, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta

Reinforcement Learning and Planning, Bandits

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

Improved exploration in factored average-reward MDPs

Mohammad Sadegh Talebi, Anders Jonsson, Odalric Maillard

Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection

Matteo Papini, Andrea Tirinzoni, Aldo Pacchiano, Marcello Restelli, Alessandro Lazaric, Matteo Pirotta

Near-Optimal Representation Learning for Linear Bandits and Linear RL

Jiachen Hu, Xiaoyu Chen, Chi Jin, Lihong Li, Liwei Wang

Theory, Online Learning Theory

Lenient Regret for Multi-Armed Bandits

Nadav Merlis, Shie Mannor

Sequential prediction under log-loss and misspecification

Meir Feder, Yury Polyanskiy

Improved Sleeping Bandits with Stochastic Action Sets and Adversarial Rewards

Aadirupa Saha, Pierre Gaillard, Michal Valko

Online Learning, Active Learning, and Bandits

Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits

Shinji Ito, Shuichi Hirahara, Tasuku Soma, Yuichi Yoshida

Neural Regret-Matching for Distributed Constraint Optimization Problems

Yanchen Deng, Runsheng Yu, Xinrun Wang, Bo An

Agent-based and Multi-agent Systems, Coordination and Cooperation, Constraint Optimization, Distributed Constraints

Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes

Yichun Hu, Nathan Kallus, Xiaojie Mao

Bandit problems

Adaptive Exploration in Linear Contextual Bandit

Botao Hao, Tor Lattimore, Csaba Szepesvari

Improved Worst-Case Regret Bounds for Randomized Least-Squares Value Iteration

Priyank Agrawal, Jinglin Chen, Nan Jiang

Adaptive Discretization for Adversarial Lipschitz Bandits

Chara Podimata, Alex Slivkins

Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis

Vidyashankar Sivakumar, Steven Wu, Arindam Banerjee

Online Learning, Active Learning, and Bandits

Smooth bandit optimization: Generalization to Hölder space

Yusha Liu, Yining Wang, Aarti Singh

Surrogate Regret Bounds for Polyhedral Losses

Rafael Frongillo, Bo Waggoner

Model Selection in Contextual Stochastic Bandit Problems

Aldo Pacchiano, My Phan, Yasin Abbasi Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvari

Cooperative Stochastic Bandits with Asynchronous Agents and Constrained Feedback

Lin Yang, Yu-Zhen Janice Chen, Stephen Pasteris, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley

On Optimal Robustness to Adversarial Corruption in Online Decision Problems

robustness, adversarial robustness and security, bandits

Improved Optimistic Algorithms for Logistic Bandits

Louis Faury, Marc Abeille, Clément Calauzènes, Olivier Fercoq

Online Learning, Active Learning, and Bandits

Thompson Sampling for Linearly Constrained Bandits

Vidit Saxena, Joakim Jalden, Joseph Gonzalez

Parameter-Free Multi-Armed Bandit Algorithms with Hybrid Data-Dependent Regret Bounds

Crush Optimism with Pessimism: Structured Bandits Beyond Asymptotic Optimality
