Instance-wise minimax-optimal algorithms for logistic bandits

Abstract: Logistic bandits have recently attracted substantial attention by providing an uncluttered yet challenging framework for understanding the impact of non-linearity in parametrized bandits. Faury et al. (2020) showed that the learning-theoretic difficulties of logistic bandits can be embodied by a (sometimes prohibitively) large problem-dependent constant \kappa, which characterizes the magnitude of the reward’s non-linearity. In this paper we introduce an algorithm for which we provide a refined analysis. This allows for a better characterization of the effect of non-linearity and yields improved problem-dependent guarantees. In the most favorable cases this leads to a regret upper bound scaling as \tilde{\mathcal{O}}(d\sqrt{T/\kappa}), which dramatically improves over the \tilde{\mathcal{O}}(d\sqrt{T}+\kappa) state-of-the-art guarantees. We prove that this rate is minimax-optimal by deriving an \Omega(d\sqrt{T/\kappa}) problem-dependent lower bound. Our analysis identifies two regimes of the regret (permanent and transitory), which ultimately reconciles Faury et al. (2020) with the Bayesian approach of Dong et al. (2019). In contrast to previous works, we find that in the permanent regime non-linearity can dramatically ease the exploration-exploitation trade-off. While it also affects the length of the transitory phase in a problem-dependent fashion, we show that this effect is mild in most reasonable configurations.

26/08/2020

Marc Abeille, Louis Faury, Clément Calauzènes

Similar Papers

Adaptive Exploration in Linear Contextual Bandit

Botao Hao, Tor Lattimore, Csaba Szepesvari

Adaptive Discretization for Adversarial Lipschitz Bandits

Chara Podimata, Alex Slivkins

Stage-wise Conservative Linear Bandits

Ahmadreza Moradipari, Christos Thrampoulidis, Mahnoosh Alizadeh

Improved Optimistic Algorithms for Logistic Bandits

Louis Faury, Marc Abeille, Clément Calauzènes, Olivier Fercoq

Keywords: Online Learning, Active Learning, and Bandits

Bayesian decision-making under misspecified priors with applications to meta-learning

Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miro Dudik, Robert E Schapire

Keywords: meta learning, bandits

A Closer Look at the Worst-case Behavior of Multi-armed Bandit Algorithms

Anand Kalvit, Assaf Zeevi

Policy Optimization in Adversarial MDPs: Improved Exploration via Dilated Bonuses

Haipeng Luo, Chen-Yu Wei, Chung-Wei Lee

Keywords: optimization, reinforcement learning and planning, bandits

Cautiously Optimistic Policy Optimization and Exploration with Linear Function Approximation

Andrea Zanette, Ching-An Cheng, Alekh Agarwal

Asymptotically Optimal Information-Directed Sampling

Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvari

Dynamic Regret of Policy Optimization in Non-Stationary Environments

Yingjie Fei, Zhuoran Yang, Zhaoran Wang, Qiaomin Xie

An Efficient Pessimistic-Optimistic Algorithm for Stochastic Linear Bandits with General Constraints

Xin Liu, Bin Li, Pengyi Shi, Lei Ying

Keywords: optimization, bandits

Optimal Gradient-based Algorithms for Non-concave Bandit Optimization

Baihe Huang, Kaixuan Huang, Sham Kakade, Jason Lee, Qi Lei, Runzhe Wang, Jiaqi Yang

Keywords: theory, deep learning, optimization, generative model, bandits

Parameter-Free Multi-Armed Bandit Algorithms with Hybrid Data-Dependent Regret Bounds

Logistic Regression Regret: What’s the Catch?

Keywords: Online learning, Convex optimization, Information theory, Regression

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

Low-rank generalized linear bandit problems

Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning

Gen Li, Laixi Shi, Yuxin Chen, Yuantao Gu, Yuejie Chi

Keywords: theory, reinforcement learning and planning

Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis

Vidyashankar Sivakumar, Steven Wu, Arindam Banerjee

Keywords: Online Learning, Active Learning, and Bandits

Misspecified Gaussian Process Bandit Optimization

Ilija Bogunovic, Andreas Krause

Keywords: optimization, bandits, kernel methods

Bandit Phase Retrieval

Tor Lattimore, Botao Hao

Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret

Yingjie Fei, Zhuoran Yang, Yudong Chen, Zhaoran Wang, Qiaomin Xie

Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning

Aurelien Bibaut, Nathan Kallus, Maria Dimakopoulou, Antoine Chambaz, Mark van der Laan

Keywords: theory, reinforcement learning and planning, machine learning, bandits

Doubly Robust Thompson Sampling with Linear Payoffs

Wonyoung Kim, Gi-Soo Kim, Myunghee Cho Paik
