06/12/2020

Adapting to Misspecification in Contextual Bandits

Dylan Foster, Claudio Gentile, Mehryar Mohri, Julian Zimmert

Keywords:

Abstract: A major research direction in contextual bandits is to develop algorithms that are computationally efficient, yet support flexible, general-purpose function approximation. Algorithms based on modeling rewards have shown strong empirical performance, yet typically require a well-specified model, and can fail when this assumption does not hold. Can we design algorithms that are efficient and flexible, yet degrade gracefully in the face of model misspecification? We introduce a new family of oracle-efficient algorithms for $\varepsilon$-misspecified contextual bandits that adapt to unknown model misspecification---both for finite and infinite action settings. Given access to an \emph{online oracle} for square loss regression, our algorithm attains optimal regret and---in particular---optimal dependence on the misspecification level, with \emph{no prior knowledge}. Specializing to linear contextual bandits with infinite actions in $d$ dimensions, we obtain the first algorithm that achieves the optimal $\widetilde{O}(d\sqrt{T} + \varepsilon\sqrt{d}T)$ regret bound for unknown $\varepsilon$. On a conceptual level, our results are enabled by a new optimization-based perspective on the regression oracle reduction framework of Foster and Rakhlin (2020), which we believe will be useful more broadly.
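To make the regression-oracle reduction mentioned in the abstract concrete, below is a minimal sketch of the underlying SquareCB-style scheme of Foster and Rakhlin (2020) for a finite action set: an online square-loss oracle predicts rewards, and actions are sampled with inverse-gap weighting. The names (`inverse_gap_weighting`, `RidgeOracle`, `contextual_bandit_round`) and the fixed exploration parameter `gamma` are illustrative assumptions; the paper's actual contribution, tuning this parameter adaptively so the regret degrades optimally with an unknown misspecification level $\varepsilon$, is not reproduced here.

```python
import numpy as np


def inverse_gap_weighting(predicted_rewards, gamma):
    """Inverse-gap-weighted exploration distribution over a finite action set.

    This is the action-selection rule of the SquareCB reduction
    (Foster & Rakhlin, 2020). `gamma` is a learning-rate parameter; the
    paper tunes it adaptively to the unknown misspecification level,
    which this sketch does not do.
    """
    preds = np.asarray(predicted_rewards, dtype=float)
    K = preds.shape[0]
    best = int(np.argmax(preds))
    probs = 1.0 / (K + gamma * (preds[best] - preds))  # small gap => high prob
    probs[best] = 0.0
    probs[best] = 1.0 - probs.sum()  # remaining mass goes to the greedy action
    return probs


class RidgeOracle:
    """Toy online square-loss oracle: ridge regression on features phi(x, a)."""

    def __init__(self, dim, reg=1.0):
        self.A = reg * np.eye(dim)
        self.b = np.zeros(dim)

    def predict(self, features):
        # features: (K, dim) array, one row per candidate action.
        theta = np.linalg.solve(self.A, self.b)
        return features @ theta

    def update(self, feature, reward):
        # Incorporate the single (context, action, reward) example observed.
        self.A += np.outer(feature, feature)
        self.b += reward * feature


def contextual_bandit_round(oracle, features, gamma, reward_fn, rng):
    """One round of the regression-oracle reduction with bandit feedback."""
    preds = oracle.predict(features)
    probs = inverse_gap_weighting(preds, gamma)
    action = rng.choice(len(probs), p=probs)
    reward = reward_fn(action)  # only the chosen action's reward is observed
    oracle.update(features[action], reward)
    return action, reward
```

In this sketch the choice of `gamma` governs the exploration-exploitation tradeoff; Foster and Rakhlin set it as a function of the horizon and the oracle's regret bound, whereas the paper's algorithms select it in a data-driven way so that the $\varepsilon$-dependent term in the regret is obtained without prior knowledge of $\varepsilon$.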

The talk and the paper were published at the NeurIPS 2020 virtual conference.
