Stochastic bandits with arm-dependent delays

Abstract: Significant work has recently been dedicated to the stochastic delayed bandit setting because of its relevance in applications. The applicability of existing algorithms is, however, restricted by the strong assumptions often made on the delay distributions, such as full observability, restrictive shape constraints, or uniformity over arms. In this work, we weaken these assumptions significantly and only assume that there is a bound on the tail of the delay. In particular, we cover the important case where the delay distributions vary across arms, and the case where the delays are heavy-tailed. To address these difficulties, we propose a simple but efficient UCB-based algorithm called PATIENTBANDITS. We provide both problem-dependent and problem-independent bounds on the regret, as well as performance lower bounds.
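
To make the setting concrete, the following is a minimal sketch of a generic UCB1 learner that only sees rewards after an arm-dependent random delay. It is not the PATIENTBANDITS algorithm, whose index the abstract does not specify; the function name `delayed_ucb`, the Bernoulli rewards, and the Pareto delay samplers in the usage note are all assumptions made for illustration.

```python
import math
import random


def delayed_ucb(means, delay_samplers, horizon, seed=0):
    """Generic UCB1 on Bernoulli arms whose rewards arrive after a random delay.

    means          -- true mean reward of each arm (hypothetical test values)
    delay_samplers -- one callable per arm, rng -> integer delay in rounds
    Returns (pull counts per arm, cumulative pseudo-regret).
    """
    rng = random.Random(seed)
    k = len(means)
    pulls = [0] * k         # times each arm was played
    observed = [0] * k      # rewards already received per arm
    reward_sum = [0.0] * k  # sum of rewards already received per arm
    pending = []            # (arrival_round, arm, reward) not yet visible
    best = max(means)
    regret = 0.0

    def index(i, t):
        # Optimistic index built from observed feedback only.
        if observed[i] == 0:
            return math.inf
        bonus = math.sqrt(2.0 * math.log(t) / observed[i])
        return reward_sum[i] / observed[i] + bonus

    for t in range(1, horizon + 1):
        # Reveal every reward whose delay has elapsed by round t.
        still_pending = []
        for arrival, arm, reward in pending:
            if arrival <= t:
                observed[arm] += 1
                reward_sum[arm] += reward
            else:
                still_pending.append((arrival, arm, reward))
        pending = still_pending

        arm = max(range(k), key=lambda i: index(i, t))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        # The reward is generated now but only becomes visible later.
        pending.append((t + delay_samplers[arm](rng), arm, reward))
        pulls[arm] += 1
        regret += best - means[arm]

    return pulls, regret
```

For example, `delayed_ucb([0.9, 0.5], [lambda r: int(r.paretovariate(2.0)), lambda r: int(r.paretovariate(1.0))], horizon=500)` simulates two arms with different heavy-tailed delays. A plain UCB index like this one ignores rewards that are still pending, which is the kind of bias a delay-aware algorithm has to correct for.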

18/07/2021

Deep Learning, Attention Models, Applications, Time Series Analysis; Deep Learning, Predictive Models, Reinforcement Learning and Planning, Bandits

12/07/2020

Anne Gael Manegueu, Claire Vernade, Alexandra Carpentier, Michal Valko

Similar Papers

Adapting to Delays and Data in Adversarial Multi-Armed Bandits

András György, Pooria Joulani

Deep Learning, Attention Models, Applications, Time Series Analysis; Deep Learning, Predictive Models, Reinforcement Learning and Planning, Bandits

Non-Stationary Bandits with Intermediate Observations

Claire Vernade, András György, Timothy Mann

Online Learning, Active Learning, and Bandits

Principal component regression with semirandom observations via matrix completion

Aditya Bhaskara, Aravinda Kanchana Ruwanpathirana, Maheshakya Wijewardena

Twice regularized MDPs and the equivalence between robustness and regularization

Esther Derman, Matthieu Geist, Shie Mannor

optimization, reinforcement learning and planning, robustness

Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks

Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak

Dealing With Misspecification In Fixed-Confidence Linear Top-m Identification

Clémence Réda, Andrea Tirinzoni, Rémy Degenne

theory, reinforcement learning and planning, bandits

Can gradient clipping mitigate label noise?

Aditya Krishna Menon, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar

Escaping Saddle Points of Empirical Risk Privately and Scalably via DP-Trust Region Method

Di Wang, Jinhui Xu

differential privacy, empirical risk minimization, private machine learning

Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits

Shinji Ito, Shuichi Hirahara, Tasuku Soma, Yuichi Yoshida

Non-convex Distributionally Robust Optimization: Non-asymptotic Analysis

Jikai Jin, Bohang Zhang, Haiyang Wang, Liwei Wang

optimization

Locally Valid and Discriminative Prediction Intervals for Deep Learning Models

Zhen Lin, Shubhendu Trivedi, Jimeng Sun

deep learning

STORM+: Fully Adaptive SGD with Recursive Momentum for Nonconvex Optimization

Kfir Levy, Ali Kavis, Volkan Cevher

optimization

Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation

Andrea Zanette, Ching-An Cheng, Alekh Agarwal

Robust compressed sensing using generative models

Ajil Jalal, Liu Liu, Alex Dimakis, Constantine Caramanis

Neuroscience and Cognitive Science -> Neuroscience, Neuroscience and Cognitive Science -> Neural Coding

Adversarial Robustness with Semi-Infinite Constrained Learning

Alexander Robey, Luiz Chamon, George J. Pappas, Hamed Hassani, Alejandro Ribeiro

theory, deep learning, optimization, robustness, adversarial robustness and security

Hausdorff Dimension, Heavy Tails, and Generalization in Neural Networks

Umut Simsekli, Ozan Sener, George Deligiannidis, Murat Erdogdu

Deep Learning -> Supervised Deep Networks, Deep Learning -> Embedding Approaches

Parameter-Free Multi-Armed Bandit Algorithms with Hybrid Data-Dependent Regret Bounds

Shinji Ito

Unbalanced minibatch Optimal Transport; applications to Domain Adaptation

Kilian Fatras, Thibault Séjourné, Rémi Flamary, Nicolas Courty

Algorithms, Optimal Transport

Online model selection for reinforcement learning with function approximation

Jonathan Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill

On Thompson Sampling with Langevin Algorithms

Eric Mazumdar, Aldo Pacchiano, Yian Ma, Michael Jordan, Peter Bartlett

Online Learning, Active Learning, and Bandits

Acting in Delayed Environments with Non-Stationary Markov Policies

Esther Derman, Gal Dalal, Shie Mannor
