Information Directed Sampling for Sparse Linear Bandits

Abstract: Stochastic sparse linear bandits offer a practical model for high-dimensional online decision-making problems and have a rich information-regret structure. In this work we explore the use of information-directed sampling (IDS), which naturally balances the information-regret trade-off. We develop a class of information-theoretic Bayesian regret bounds that nearly match existing lower bounds on a variety of problem instances, demonstrating the adaptivity of IDS. To efficiently implement sparse IDS, we propose an empirical Bayesian approach for sparse posterior sampling using a spike-and-slab Gaussian-Laplace prior. Numerical results demonstrate significant regret reductions by sparse IDS relative to several baselines.
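The abstract does not spell out how IDS chooses an action. As a rough illustration only, the sketch below uses the commonly stated deterministic form of IDS: pick the action that minimizes the ratio of squared expected regret to expected information gain. The function name `ids_action`, the `eps` guard, and all numbers are hypothetical and are not taken from the paper, which works with a sparse posterior and its own estimates of these quantities.

```python
import numpy as np

def ids_action(regret, info_gain, eps=1e-12):
    """Deterministic IDS sketch: argmin over actions of regret^2 / info_gain.

    regret    -- estimated expected regret of each action (assumed given)
    info_gain -- estimated information gain of each action (assumed given)
    eps       -- hypothetical floor that avoids division by zero
    """
    regret = np.asarray(regret, dtype=float)
    info_gain = np.asarray(info_gain, dtype=float)
    ratio = regret ** 2 / np.maximum(info_gain, eps)
    return int(np.argmin(ratio))

# Made-up estimates for three actions.
regret = [0.1, 0.5, 0.3]
info_gain = [0.001, 0.5, 0.05]

chosen = ids_action(regret, info_gain)   # information ratios: 10.0, 0.5, 1.8
greedy = int(np.argmin(regret))          # a purely greedy rule ignores information
```

With these made-up numbers the greedy rule picks action 0 (lowest regret), while the ratio rule picks action 1, accepting more regret now in exchange for a much more informative observation. That is the information-regret trade-off the abstract refers to.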

19/08/2021

Botao Hao, Tor Lattimore, Wei Deng

Similar Papers

Neural Regret-Matching for Distributed Constraint Optimization Problems

Yanchen Deng, Runsheng Yu, Xinrun Wang, Bo An

Keywords: Agent-based and Multi-agent Systems, Coordination and Cooperation, Constraint Optimization, Distributed Constraints

Asymptotically Optimal Information-Directed Sampling

Johannes Kirschner, Tor Lattimore, Claire Vernade, Csaba Szepesvari

Policy Optimization as Online Learning with Mediator Feedback

Alberto Maria Metelli, Matteo Papini, Pierluca D'Oro, Marcello Restelli

High-Dimensional Sparse Linear Bandits

Botao Hao, Tor Lattimore, Mengdi Wang

Stochastic Online Linear Regression: the Forward Algorithm to Replace Ridge

Reda Ouhamma, Odalric-Ambrym Maillard, Vianney Perchet

Keywords: robustness, bandits

Tracking regret bounds for online submodular optimization

Tatsuya Matsuoka, Shinji Ito, Naoto Ohsaka

Optimal Algorithms for Stochastic Multi-Armed Bandits with Heavy Tailed Rewards

Kyungjae Lee, Hongjun Yang, Sungbin Lim, Songhwai Oh

Learning-to-learn non-convex piecewise-Lipschitz functions

Maria-Florina Balcan, Mikhail Khodak, Dravyansh Sharma, Ameet S Talwalkar

Keywords: optimization, machine learning, robustness, meta learning, online learning

Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis

Vidyashankar Sivakumar, Steven Wu, Arindam Banerjee

Keywords: Online Learning, Active Learning, and Bandits

Thompson Sampling for Bandits with Clustered Arms

Emil Carlsson, Devdatt Dubhashi, Fredrik D. Johansson

Keywords: Machine Learning, Online Learning, Learning Theory, Reinforcement Learning

Surrogate Regret Bounds for Polyhedral Losses

Rafael Frongillo, Bo Waggoner

Keywords: machine learning

Adaptive Importance Sampling for Finite-Sum Optimization and Sampling with Decreasing Step-Sizes

Ayoub El Hanchi, David Stephens

Adaptive Sampling for Stochastic Risk-Averse Learning

Sebastian Curi, Kfir Y. Levy, Stefanie Jegelka, Andreas Krause

Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits

Shinji Ito, Shuichi Hirahara, Tasuku Soma, Yuichi Yoshida

Covariance-adapting algorithm for semi-bandits with application to sparse outcomes

Pierre Perrault, Vianney Perchet, Michal Valko

Keywords: Bandit problems

Boosting for Online Convex Optimization

Elad Hazan, Karan Singh

Keywords: Theory, Online Learning Theory

Non-Exponentially Weighted Aggregation: Regret Bounds for Unbounded Loss Functions

Pierre Alquier

Keywords: Probabilistic Methods, Bayesian Methods

Bandit algorithms: Letting go of logarithmic regret for statistical robustness

Kumar Ashutosh, Jayakrishnan Nair, Anmol Kagrecha, Krishna Jagannathan

Efficient Bandit Convex Optimization: Beyond Linear Losses

Arun Sai Suggala, Pradeep Ravikumar, Praneeth Netrapalli

A Robust Univariate Mean Estimator is All You Need

Adarsh Prasad, Sivaraman Balakrishnan, Pradeep Ravikumar

Logistic Regression Regret: What’s the Catch?

Gil I Shamir

Keywords: Online learning, Convex optimization, Information theory, Regression

Adaptive Discretization for Adversarial Lipschitz Bandits

Chara Podimata, Alex Slivkins

Yu-Hang Zhou, Peng Hu, Chen Liang and
Huan Xu, Guangda Huzhang, Yinfu Feng, Qing Da, Xinshang Wang, An-Xiang Zeng
