Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes

09/07/2020

Yichun Hu, Nathan Kallus, Xiaojie Mao

Keywords: Bandit problems

Abstract: We study a nonparametric contextual bandit problem where the expected reward functions belong to a Hölder class with smoothness parameter $\beta$. We show how this interpolates between two extremes that were previously studied in isolation: non-differentiable bandits ($\beta\leq1$), where rate-optimal regret is achieved by running separate non-contextual bandits in different context regions, and parametric-response bandits (satisfying $\beta=\infty$), where rate-optimal regret can be achieved with minimal or no exploration due to infinite extrapolatability. We develop a novel algorithm that carefully adjusts to all smoothness settings and we prove its regret is rate-optimal by establishing matching upper and lower bounds, recovering the existing results at the two extremes. In this sense, our work bridges the gap between the existing literature on parametric and non-differentiable contextual bandit problems and between bandit algorithms that exclusively use global or local information, shedding light on the crucial interplay of complexity and regret in contextual bandits.
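As an illustration of the non-differentiable extreme described in the abstract, the following is a minimal Python sketch of running a separate non-contextual bandit in each context region. It is not the algorithm developed in the paper. The class name `BinnedUCB`, the helper `simulate`, the choice of UCB1 as the per-region bandit, the one-dimensional context on [0, 1], the fixed number of bins, and the two reward functions are all assumptions made here for illustration.

```python
import math
import random


class BinnedUCB:
    """One independent UCB1 bandit per context bin on [0, 1] (illustrative only)."""

    def __init__(self, n_bins, n_arms):
        self.n_bins = n_bins
        self.n_arms = n_arms
        # Per-bin, per-arm pull counts and running mean rewards.
        self.counts = [[0] * n_arms for _ in range(n_bins)]
        self.means = [[0.0] * n_arms for _ in range(n_bins)]

    def bin_of(self, x):
        # Map a context in [0, 1] to a bin index; x = 1.0 falls in the last bin.
        return min(int(x * self.n_bins), self.n_bins - 1)

    def select(self, x):
        b = self.bin_of(x)
        counts = self.counts[b]
        # Pull every arm once in this bin before using confidence bounds.
        for arm in range(self.n_arms):
            if counts[arm] == 0:
                return arm
        total = sum(counts)
        # UCB1 index computed only from data observed in this bin.
        ucb = [
            self.means[b][a] + math.sqrt(2.0 * math.log(total) / counts[a])
            for a in range(self.n_arms)
        ]
        return max(range(self.n_arms), key=lambda a: ucb[a])

    def update(self, x, arm, reward):
        b = self.bin_of(x)
        self.counts[b][arm] += 1
        n = self.counts[b][arm]
        # Incremental update of the running mean.
        self.means[b][arm] += (reward - self.means[b][arm]) / n


def simulate(horizon=2000, n_bins=8, seed=0):
    rng = random.Random(seed)
    # Hypothetical Lipschitz expected-reward functions, chosen for illustration.
    reward_fns = [lambda x: abs(x - 0.5), lambda x: 0.25]
    policy = BinnedUCB(n_bins, len(reward_fns))
    regret = 0.0
    for _ in range(horizon):
        x = rng.random()  # context drawn uniformly from [0, 1)
        arm = policy.select(x)
        expected = [f(x) for f in reward_fns]
        reward = expected[arm] + rng.gauss(0.0, 0.1)  # noisy observed reward
        policy.update(x, arm, reward)
        regret += max(expected) - expected[arm]  # cumulative expected regret
    return policy, regret


if __name__ == "__main__":
    policy, regret = simulate()
    print(f"cumulative expected regret: {regret:.2f}")
```

Each bin uses only its own observations, which is the purely local use of information that the abstract contrasts with global, parametric extrapolation. In this sketch the number of bins is a free parameter, and how to set it as a function of the horizon and the smoothness is not addressed here.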

Talk and the respective paper are published at the COLT 2020 virtual conference.

Similar Papers

04/08/2021

Adaptive Discretization for Adversarial Lipschitz Bandits

Chara Podimata, Alex Slivkins

18:13

13/04/2021

Smooth bandit optimization: Generalization to Hölder space

Yusha Liu, Yining Wang, Aarti Singh

2:52

13/04/2021

Low-rank generalized linear bandit problems

Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

2:49

26/08/2020

Thompson Sampling for Linearly Constrained Bandits

Vidit Saxena, Joakim Jalden, Joseph Gonzalez

13:06

13/04/2021

Stochastic bandits with linear constraints

Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett, Heinrich Jiang

3:02

04/08/2021

Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

Dylan Foster, Alexander Rakhlin, David Simchi-Levi, Yunzong Xu

16:53

12/07/2020

Improved Optimistic Algorithms for Logistic Bandits

Louis Faury, Marc Abeille, Clément Calauzènes, Olivier Fercoq

Keywords: Online Learning, Active Learning, and Bandits

15:22

04/08/2021

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

20:29

04/08/2021

Parameter-Free Multi-Armed Bandit Algorithms with Hybrid Data-Dependent Regret Bounds

Shinji Ito

15:29

18/07/2021

On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization

Xu Cai, Jonathan Scarlett

Keywords: Applications, Natural Language Processing, Applications, Network Analysis, Reinforcement Learning and Planning, Bandits

4:19

06/12/2021

Optimal Gradient-based Algorithms for Non-concave Bandit Optimization

Baihe Huang, Kaixuan Huang, Sham Kakade, Jason Lee, Qi Lei, Runzhe Wang, Jiaqi Yang

Keywords: theory, deep learning, optimization, generative model, bandits

10:53

12/07/2020

Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis

Vidyashankar Sivakumar, Steven Wu, Arindam Banerjee

Keywords: Online Learning, Active Learning, and Bandits

17:56

02/02/2021

Regret Bounds for Batched Bandits

Hossein Esfandiari, Amin Karbasi, Abbas Mehrabian, Vahab Mirrokni

17:53

06/12/2021

Bayesian decision-making under misspecified priors with applications to meta-learning

Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miro Dudik, Robert E Schapire

Keywords: meta learning, bandits

14:58

06/12/2020

Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed Rewards

Kyungjae Lee, Hongjun Yang, Sungbin Lim, Songhwai Oh

3:26

06/12/2020

Stage-wise Conservative Linear Bandits

Ahmadreza Moradipari, Christos Thrampoulidis, Mahnoosh Alizadeh

3:18

06/12/2020

Model Selection in Contextual Stochastic Bandit Problems

Aldo Pacchiano, My Phan, Yasin Abbasi Yadkori, Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvari

3:22

02/02/2021

Lenient Regret for Multi-Armed Bandits

Nadav Merlis, Shie Mannor

20:14

26/08/2020

OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits

Niladri Chatterji, Vidya Muthukumar, Peter Bartlett

8:20

12/07/2020

Tightening Exploration in Upper Confidence Reinforcement Learning

Hippolyte Bourel, Odalric-Ambrym Maillard, Mohammad Sadegh Talebi

Keywords: Reinforcement Learning - General

16:14

06/12/2021

Minimax Optimal Quantile and Semi-Adversarial Regret via Root-Logarithmic Regularizers

Jeffrey Negrea, Blair Bilodeau, Nicolò Campolongo, Francesco Orabona, Dan Roy

Keywords: generative model, online learning

14:30

04/08/2021

Regret Minimization in Heavy-Tailed Bandits

Shubhada Agrawal, Sandeep K Juneja, Wouter M Koolen

17:35

06/12/2021

Doubly Robust Thompson Sampling with Linear Payoffs

Wonyoung Kim, Gi-Soo Kim, Myunghee Cho Paik

Keywords: bandits

14:18

18/07/2021

Beyond $\log^2(T)$ regret for decentralized bandits in matching markets

Soumya Basu, Karthik Abinav Sankararaman, Abishek Sankararaman

Keywords: Reinforcement Learning and Planning, Bandits

5:11

06/12/2021

Hybrid Regret Bounds for Combinatorial Semi-Bandits and Adversarial Linear Bandits

Shinji Ito

Keywords: bandits

10:49

18/07/2021

Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously

Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei, Mengxiao Zhang, Xiaojin Zhang

Keywords: Reinforcement Learning and Planning, Bandits

5:12

04/08/2021

Asymptotically Optimal Information-Directed Sampling

Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvari

15:01

06/12/2021

Cooperative Stochastic Bandits with Asynchronous Agents and Constrained Feedback

Lin Yang, Yu-Zhen Janice Chen, Stephen Pasteris, Mohammad Hajiesmaili, John C. S. Lui, Don Towsley

Keywords: bandits

12:07

18/07/2021

Near-Optimal Representation Learning for Linear Bandits and Linear RL

Jiachen Hu, Xiaoyu Chen, Chi Jin, Lihong Li, Liwei Wang

Keywords: Theory, Online Learning Theory

5:13

04/08/2021

Fine-Grained Gap-Dependent Bounds for Tabular MDPs via Adaptive Multi-Step Bootstrap

Haike Xu, Tengyu Ma, Simon Du

10:42

18/07/2021

Adversarial Combinatorial Bandits with General Non-linear Reward Functions

Yanjun Han, Yining Wang, Xi Chen

Keywords: Applications, Computer Vision, Applications, Computational Photography, Theory, Online Learning Theory

5:21

06/12/2020

Dynamic Regret of Policy Optimization in Non-Stationary Environments

Yingjie Fei, Zhuoran Yang, Zhaoran Wang, Qiaomin Xie

2:41

09/07/2020

Tight Lower Bounds for Combinatorial Multi-Armed Bandits

Nadav Merlis, Shie Mannor

Keywords: Bandit problems, Learning with algebraic or combinatorial structure

14:00

06/12/2021

On Optimal Robustness to Adversarial Corruption in Online Decision Problems

Shinji Ito

Keywords: robustness, adversarial robustness and security, bandits

5:41

06/12/2021

Stochastic bandits with groups of similar arms

Fabien Pesquerel, Hassan Saber, Odalric-Ambrym Maillard

Keywords: optimization, generative model, bandits

13:22

18/07/2021

Adapting to Delays and Data in Adversarial Multi-Armed Bandits

András György, Pooria Joulani

Keywords: Deep Learning, Attention Models, Applications, Time Series Analysis; Deep Learning, Predictive Models, Reinforcement Learning and Planning, Bandits

6:18

04/08/2021

Minimax Regret for Stochastic Shortest Path with Adversarial Costs and Known Transition

Liyu Chen, Haipeng Luo, Chen-Yu Wei

14:48

13/04/2021

Instance-wise minimax-optimal algorithms for logistic bandits

Marc Abeille, Louis Faury, Clement Calauzenes

3:06

02/02/2021

Disposable Linear Bandits for Online Recommendations

Melda Korkut, Andrew Li

17:20

13/04/2021

Tractable contextual bandits beyond realizability

Sanath Kumar Krishnamurthy, Vitor Hadad, Susan Athey

2:51