12/07/2020

On Thompson Sampling with Langevin Algorithms

Eric Mazumdar, Aldo Pacchiano, Yian Ma, Michael Jordan, Peter Bartlett

Keywords: Online Learning, Active Learning, and Bandits

Abstract: Thompson sampling has been demonstrated both theoretically and empirically to enjoy favorable performance in tackling multi-armed bandit problems. Despite its successes, however, one key obstacle to its use in a much broader range of scenarios is the need for perfect samples from posterior distributions at every iteration, which is oftentimes not feasible in practice. We propose a Markov Chain Monte Carlo (MCMC) method tailored to Thompson sampling to address this issue. We construct a fast-converging Langevin algorithm to generate approximate samples with accuracy guarantees. We then leverage novel posterior concentration rates to analyze the statistical risk of the overall Thompson sampling method. Finally, we specify the necessary hyperparameters and the required computational resources for the MCMC procedure to match the optimal risk. The resulting algorithm enjoys both optimal instance-dependent frequentist regret and appealing computational complexity.
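To make the idea concrete, here is a minimal sketch of Thompson sampling where each posterior draw is replaced by a few unadjusted Langevin algorithm (ULA) steps, for a Gaussian bandit with a standard normal prior on each arm's mean. This is an illustration of the general approach only, not the paper's algorithm; the function names (`ula_sample`, `langevin_ts`), step-size rule, and all constants are assumptions for the example, and the paper's actual method prescribes specific hyperparameters and guarantees not reproduced here.

```python
import numpy as np

def ula_sample(grad_log_post, step, theta0=0.0, n_steps=100, rng=None):
    """Unadjusted Langevin: theta <- theta + step * grad + sqrt(2*step) * noise."""
    rng = np.random.default_rng() if rng is None else rng
    theta = theta0
    for _ in range(n_steps):
        theta = theta + step * grad_log_post(theta) + np.sqrt(2.0 * step) * rng.normal()
    return theta

def langevin_ts(true_means, horizon=500, sigma=1.0, seed=0):
    """Thompson sampling with approximate posterior draws from ULA.

    Each arm's mean has a N(0, 1) prior; rewards are N(mu_a, sigma^2).
    """
    rng = np.random.default_rng(seed)
    k = len(true_means)
    sums = np.zeros(k)    # cumulative reward per arm
    counts = np.zeros(k)  # number of pulls per arm
    best = max(true_means)
    regret = 0.0
    for _ in range(horizon):
        samples = np.empty(k)
        for a in range(k):
            # Gradient of the Gaussian log posterior: prior term plus likelihood term.
            grad = lambda th, a=a: -th + (sums[a] - counts[a] * th) / sigma**2
            # Scale the step with the posterior precision so ULA stays stable
            # as observations accumulate (an illustrative choice).
            precision = 1.0 + counts[a] / sigma**2
            samples[a] = ula_sample(grad, step=0.5 / precision, rng=rng)
        a = int(np.argmax(samples))        # act greedily on the sampled means
        regret += best - true_means[a]
        reward = true_means[a] + sigma * rng.normal()
        sums[a] += reward
        counts[a] += 1.0

    return regret

total = langevin_ts([0.2, 0.5, 0.9])
```

Because this posterior is conjugate Gaussian, exact sampling would be trivial here; the point of the sketch is only the plumbing — swapping the exact draw for a fixed budget of gradient-plus-noise iterations — which is what makes the approach applicable when exact posterior samples are unavailable.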

The talk and the paper were published at the ICML 2020 virtual conference.

