OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits

26/08/2020

OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits

Niladri Chatterji, Vidya Muthukumar, Peter Bartlett

Keywords:

Abstract Paper Similar Papers

Abstract: We consider the stochastic linear (multi-armed) contextual bandit problem with the possibility of hidden simple multi-armed bandit structure in which the rewards are independent of the contextual information. Algorithms that are designed solely for one of the regimes are known to be sub-optimal for their alternate regime. We design a single computationally efficient algorithm that simultaneously obtains problem-dependent optimal regret rates in the simple multi-armed bandit regime and minimax optimal regret rates in the linear contextual bandit regime, without knowing a priori which of the two models generates the rewards. These results are proved under the condition of stochasticity of contextual information over multiple rounds. Our results should be viewed as a step towards principled data-dependent policy class selection for contextual bandits.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at AISTATS 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

06/12/2021

Hybrid Regret Bounds for Combinatorial Semi-Bandits and Adversarial Linear Bandits

Shinji Ito

Keywords Paper

bandits

0

0

0

0

10:49

09/07/2020

Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes

YICHUN HU, Nathan Kallus, Xiaojie Mao

Keywords Paper

Bandit problems,

0

0

0

0

14:35

02/02/2021

Disposable Linear Bandits for Online Recommendations

Melda Korkut, Andrew Li

Keywords Paper

0

0

0

0

17:20

06/12/2021

Optimal Gradient-based Algorithms for Non-concave Bandit Optimization

Baihe Huang, Kaixuan Huang, Sham Kakade and
Jason Lee, Qi Lei, Runzhe Wang, Jiaqi Yang

Keywords Paper

theory, deep learning, optimization, generative model, bandits

0

0

0

0

10:53

13/04/2021

Stochastic linear bandits robust to adversarial attacks

Ilija Bogunovic, Arpan Losalka, Andreas Krause, Jonathan Scarlett

Keywords Paper

0

0

0

0

3:01

13/04/2021

Stochastic bandits with linear constraints

Aldo Pacchiano, Mohammad Ghavamzadeh, Peter Bartlett, Heinrich Jiang

Keywords Paper

0

0

0

0

3:02

13/04/2021

Experimental design for regret minimization in linear bandits

Andrew Wagenmaker, Julian Katz-Samuels, Kevin Jamieson

Keywords Paper

0

0

0

0

3:05

13/04/2021

Tractable contextual bandits beyond realizability

Sanath Kumar Krishnamurthy, Vitor Hadad, Susan Athey

Keywords Paper

0

0

0

0

2:51

06/12/2020

Model Selection in Contextual Stochastic Bandit Problems

Aldo Pacchiano, My Phan, Yasin Abbasi Yadkori and
Anup Rao, Julian Zimmert, Tor Lattimore, Csaba Szepesvari

Keywords Paper

0

0

0

0

3:22

02/02/2021

Adaptive Algorithms for Multi-armed Bandit with Composite and Anonymous Feedback

Siwei Wang, Haoyun Wang, Longbo Huang

Keywords Paper

0

0

0

0

19:29

06/12/2021

Continuous Mean-Covariance Bandits

Yihan Du, Siwei Wang, Zhixuan Fang, Longbo Huang

Keywords Paper

bandits

0

0

0

0

11:33

13/04/2021

Contextual blocking bandits

Soumya Basu, Orestis Papadigenopoulos, Constantine Caramanis, Sanjay Shakkottai

Keywords Paper

0

0

0

0

2:47

06/12/2020

On Regret with Multiple Best Arms

Yinglun Zhu, Robert Nowak

Keywords Paper

0

0

0

0

3:22

06/12/2021

Observation-Free Attacks on Stochastic Bandits

Yinglun Xu, Bhuvesh Kumar, Jacob Abernethy

Keywords Paper

bandits

0

0

0

0

3:07

04/08/2021

Corruption-robust exploration in episodic reinforcement learning

Thodoris Lykouris, Max Simchowitz, Alex Slivkins, Wen Sun

Keywords Paper

0

0

0

0

18:27

04/08/2021

Adaptive Discretization for Adversarial Lipschitz Bandits

Chara Podimata, Alex Slivkins

Keywords Paper

0

0

0

0

18:13

18/07/2021

Bias-Robust Bayesian Optimization via Dueling Bandits

Johannes Kirschner, Andreas Krause

Keywords Paper

Reinforcement Learning and Planning, Bandits

0

0

0

0

4:54

06/12/2020

On Reward-Free Reinforcement Learning with Linear Function Approximation

Ruosong Wang, Simon Du, Lin Yang, Russ Salakhutdinov

Keywords Paper

0

0

0

0

3:12

04/08/2021

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

Keywords Paper

0

0

0

0

20:29

13/04/2021

Low-rank generalized linear bandit problems

Yangyi Lu, Amirhossein Meisami, Ambuj Tewari

Keywords Paper

0

0

0

0

2:49

06/12/2020

Latent Bandits Revisited

Joey Hong, Branislav Kveton, Manzil Zaheer and
Yinlam Chow, Amr Ahmed, Craig Boutilier

Keywords Paper

0

0

0

0

3:11

09/07/2020

Tight Lower Bounds for Combinatorial Multi-Armed Bandits

Nadav Merlis, Shie Mannor

Keywords Paper

Bandit problems, Learning with algebraic or combinatorial structure

0

0

0

0

14:00

26/08/2020

Thompson Sampling for Linearly Constrained Bandits

Vidit Saxena, Joakim Jalden, Joseph Gonzalez

Keywords Paper

0

0

0

0

13:06

04/08/2021

Asymptotically Optimal Information-Directed Sampling

Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvari

Keywords Paper

0

0

0

0

15:01

06/12/2020

A Bandit Learning Algorithm and Applications to Auction Design

Kim Thang Nguyen

Keywords Paper

0

0

0

0

2:43

06/12/2021

The balancing principle for parameter choice in distance-regularized domain adaptation

Werner Zellinger, Natalia Shepeleva, Marius-Constantin Dinu and
Hamid Eghbal-zadeh, Hoan Duc Nguyen, Bernhard Nessler, Sergei Pereverzyev, Bernhard A. Moser

Keywords Paper

domain adaptation

0

0

0

0

12:47

16/11/2020

DORB: Dynamically Optimizing Multiple Rewards with Bandits

Ramakanth Pasunuru, Han Guo, Mohit Bansal

Keywords Paper

language tasks, optimization rewards, nlg tasks, question generation

0

0

0

0

11:34

18/07/2021

Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously

Chung-Wei Lee, Haipeng Luo, Chen-Yu Wei and
Mengxiao Zhang, Xiaojin Zhang

Keywords Paper

Reinforcement Learning and Planning, Bandits

0

0

0

0

5:12

18/07/2021

On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization

Xu Cai, Jonathan Scarlett

Keywords Paper

Applications, Natural Language Processing, Applications, Network Analysis, Reinforcement Learning and Planning, Bandits

0

0

0

0

4:19

04/08/2021

Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

Dylan Foster, Alexander Rakhlin, David Simchi-Levi, Yunzong Xu

Keywords Paper

0

0

0

0

16:53

06/12/2021

Bayesian decision-making under misspecified priors with applications to meta-learning

Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy and
Daniel Hsu, Thodoris Lykouris, Miro Dudik, Robert E Schapire

Keywords Paper

meta learning, bandits

0

0

0

0

14:58

04/08/2021

Double Explore-then-Commit: Asymptotic Optimality and Beyond

Tianyuan Jin, Pan Xu, Xiaokui Xiao, Quanquan Gu

Keywords Paper

0

0

0

0

13:57

09/07/2020

Information Directed Sampling for Linear Partial Monitoring

Johannes Kirschner, Tor Lattimore, Andreas Krause

Keywords Paper

Bandit problems, Partial monitoring

0

0

0

0

13:02

09/07/2020

Efficient and robust algorithms for adversarial linear contextual bandits

Gergely Neu, Julia Olkhovskaya

Keywords Paper

Bandit problems, Online learning

0

0

0

0

9:53

18/07/2021

Beyond Variance Reduction: Understanding the True Impact of Baselines on Policy Optimization

Wes Chung, Valentin Thomas, Marlos C. Machado, Nicolas Le Roux

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

5:13

12/07/2020

Structure Adaptive Algorithms for Stochastic Bandits

Rémy Degenne, Han Shao, Wouter Koolen

Keywords Paper

Online Learning, Active Learning, and Bandits

0

0

0

0

16:05

02/02/2021

Lenient Regret for Multi-Armed Bandits

Nadav Merlis, Shie Mannor

Keywords Paper

0

0

0

0

20:14

06/12/2021

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism

Paria Rashidinejad, Banghua Zhu, Cong Ma and
Jiantao Jiao, Stuart Russell

Keywords Paper

theory, reinforcement learning and planning, bandits

0

0

0

0

12:21

26/08/2020

Decentralized Multi-player Multi-armed Bandits with No Collision Information

Chengshuai Shi, Wei Xiong, Cong Shen, Jing Yang

Keywords Paper

0

0

0

0

14:18

06/12/2020

Simultaneously Learning Stochastic and Adversarial Episodic MDPs with Known Transition

Tiancheng Jin, Haipeng Luo

Keywords Paper

0

0

0

0

3:39