Meta-Thompson Sampling

18/07/2021

Branislav Kveton, Mikhail Konobeev, Manzil Zaheer, Chih-wei Hsu, Martin Mladenov, Craig Boutilier, Csaba Szepesvari

Keywords: Reinforcement Learning and Planning, Bandits

Abstract: Efficient exploration in bandits is a fundamental online learning problem. We propose a variant of Thompson sampling that learns to explore better as it interacts with bandit instances drawn from an unknown prior. The algorithm meta-learns the prior and thus we call it MetaTS. We propose several efficient implementations of MetaTS and analyze it in Gaussian bandits. Our analysis shows the benefit of meta-learning and is of a broader interest, because we derive a novel prior-dependent Bayes regret bound for Thompson sampling. Our theory is complemented by empirical evaluation, which shows that MetaTS quickly adapts to the unknown prior.
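
As a rough illustration of the idea in the abstract, the sketch below runs Gaussian Thompson sampling on a sequence of bandit instances while maintaining a Gaussian meta-posterior over the unknown prior mean. It is a simplified sketch under stated assumptions, not the authors' implementation: the function names (run_ts_task, meta_ts), the parameter names (sigma_q, sigma0, sigma), the zero-mean meta-prior, and the per-arm meta-posterior update (which treats the pull counts as fixed) are all assumptions made for this example.

```python
import numpy as np

def run_ts_task(theta, prior_mean, sigma0, sigma, n, rng):
    """Gaussian Thompson sampling for n rounds on one bandit instance.

    theta holds the true arm means (hidden from the learner); the learner
    assumes the prior N(prior_mean, sigma0^2 I) and reward noise sigma.
    Returns per-arm pull counts, per-arm reward sums, and the task's regret.
    """
    K = len(theta)
    pulls = np.zeros(K)
    sums = np.zeros(K)
    regret = 0.0
    for _ in range(n):
        # Conjugate Gaussian posterior of each arm mean given its rewards so far.
        prec = 1.0 / sigma0**2 + pulls / sigma**2
        mean = (prior_mean / sigma0**2 + sums / sigma**2) / prec
        sample = rng.normal(mean, np.sqrt(1.0 / prec))
        arm = int(np.argmax(sample))
        reward = rng.normal(theta[arm], sigma)
        pulls[arm] += 1
        sums[arm] += reward
        regret += theta.max() - theta[arm]
    return pulls, sums, regret

def meta_ts(mu_star, sigma_q, sigma0, sigma, n, num_tasks, rng):
    """Sketch of meta-learning the prior mean across tasks (assumed setup).

    mu_star is the unknown prior mean used only to generate instances.
    The meta-prior is assumed to be N(0, sigma_q^2 I).
    """
    K = len(mu_star)
    meta_mean = np.zeros(K)
    meta_var = np.full(K, sigma_q**2)
    regrets = []
    for _ in range(num_tasks):
        # A new instance is drawn from the unknown prior N(mu_star, sigma0^2 I).
        theta = rng.normal(mu_star, sigma0)
        # Sample a candidate prior mean from the meta-posterior and run TS with it.
        prior_mean = rng.normal(meta_mean, np.sqrt(meta_var))
        pulls, sums, regret = run_ts_task(theta, prior_mean, sigma0, sigma, n, rng)
        regrets.append(regret)
        # Assumed update: the empirical mean of a pulled arm is treated as a
        # Gaussian observation of mu_star with variance sigma0^2 + sigma^2 / pulls.
        pulled = pulls > 0
        obs_var = sigma0**2 + sigma**2 / pulls[pulled]
        obs_mean = sums[pulled] / pulls[pulled]
        new_prec = 1.0 / meta_var[pulled] + 1.0 / obs_var
        meta_mean[pulled] = (meta_mean[pulled] / meta_var[pulled]
                             + obs_mean / obs_var) / new_prec
        meta_var[pulled] = 1.0 / new_prec
    return meta_mean, meta_var, np.array(regrets)
```

Calling meta_ts(np.array([0.0, 1.0]), sigma_q=1.0, sigma0=0.1, sigma=1.0, n=50, num_tasks=20, rng=np.random.default_rng(0)) returns the meta-posterior mean and variance per arm together with the regret of each task. In this sketch the meta-posterior variance can only shrink as tasks accumulate, which is one simple way to picture the algorithm adapting to the unknown prior.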

The talk and the corresponding paper were published at the ICML 2021 virtual conference.

Similar Papers

06/12/2021

Metadata-based Multi-Task Bandits with Bayesian Hierarchical Models

Runzhe Wan, Lin Ge, Rui Song

Keywords: meta learning, bandits, transfer learning

13:18

12/07/2020

Influence Diagram Bandits

Tong Yu, Branislav Kveton, Zheng Wen, Ruiyi Zhang, Ole J. Mengshoel

Keywords: Online Learning, Active Learning, and Bandits

14:14

06/12/2020

Latent Bandits Revisited

Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed, Craig Boutilier

3:11

12/07/2020

Meta-learning with Stochastic Linear Bandits

Leonardo Cella, Alessandro Lazaric, Massimiliano Pontil

Keywords: Transfer, Multitask and Meta-learning

13:17

02/02/2021

Dynamic Automaton-Guided Reward Shaping for Monte Carlo Tree Search

Alvaro Velasquez, Brett Bissey, Lior Barak, Andre Beckus, Ismail Alkhouri, Daniel Melcer, George Atia

18:52

06/12/2021

Learning to Learn Dense Gaussian Processes for Few-Shot Learning

Ze Wang, Zichen Miao, Xiantong Zhen, Qiang Qiu

Keywords: deep learning, optimization, generative model, meta learning, kernel methods, few shot learning

5:21

06/12/2020

Modeling and Optimization Trade-off in Meta-learning

Katelyn Gao, Ozan Sener

3:21

04/08/2021

Bounded Memory Active Learning through Enriched Queries

Max Hopkins, Daniel Kane, Shachar Lovett, Michal Moshkovitz

18:26

18/07/2021

Message Passing Adaptive Resonance Theory for Online Active Semi-supervised Learning

Taehyeong Kim, Injune Hwang, Hyundo Lee, Hyunseo Kim, Won-Seok Choi, Joseph Lim, Byoung-Tak Zhang

Keywords: Algorithms, Active Learning

4:53

18/07/2021

DriftSurf: Stable-State / Reactive-State Learning under Concept Drift

Ashraf Tahmasbi, Ellango Jothimurugesan, Srikanta Tirthapura, Phil Gibbons

Keywords: Algorithms, Online Learning Algorithms

5:07

06/12/2021

NeurWIN: Neural Whittle Index Network For Restless Bandits Via Deep RL

Khaled Nakhleh, Santosh Ganji, Ping-Chun Hsieh, I-Hong Hou, Srinivas Shakkottai

Keywords: deep learning, reinforcement learning and planning, bandits

14:51

02/02/2021

Learning from eXtreme Bandit Feedback

Romain Lopez, Inderjit S. Dhillon, Michael I. Jordan

19:29

06/12/2021

No Regrets for Learning the Prior in Bandits

Soumya Basu, Branislav Kveton, Manzil Zaheer, Csaba Szepesvari

Keywords: theory, meta learning, bandits, online learning

11:56

26/04/2020

Learning to Plan in High Dimensions via Neural Exploration-Exploitation Trees

Binghong Chen, Bo Dai, Qinjie Lin, Guo Ye, Han Liu, Le Song

Keywords: learning to plan, representation learning, learning to design algorithm, reinforcement learning, meta learning

4:59

26/04/2020

Bayesian Meta Sampling for Fast Uncertainty Adaptation

Zhenyi Wang, Yang Zhao, Ping Yu, Ruiyi Zhang, Changyou Chen

Keywords: Bayesian Sampling, Uncertainty Adaptation, Meta Learning, Variational Inference

4:44

06/12/2020

Differentiable Meta-Learning of Bandit Policies

Craig Boutilier, Chih-wei Hsu, Branislav Kveton, Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

3:10

26/08/2020

Active Community Detection with Maximal Expected Model Change

Dan Kushnir, Benjamin Mirabelli

14:55

12/07/2020

Thompson Sampling via Local Uncertainty

Zhendong Wang, Mingyuan Zhou

Keywords: Probabilistic Inference - Models and Probabilistic Programming

11:59

19/08/2021

Ordering-Based Causal Discovery with Reinforcement Learning

Xiaoqiang Wang, Yali Du, Shengyu Zhu, Liangjun Ke, Zhitang Chen, Jianye Hao, Jun Wang

Keywords: Machine Learning Applications, Applications of Reinforcement Learning, Bayesian Networks

8:44

02/02/2021

Reward-Biased Maximum Likelihood Estimation for Linear Stochastic Bandits

Yu-Heng Hung, Ping-Chun Hsieh, Xi Liu, P. R. Kumar

19:35

03/05/2021

Auxiliary Task Update Decomposition: The Good, the Bad and the Neutral

Lucio Dery, Yann Dauphin, David Grangier

Keywords: multitask learning, deep learning, pre-training, gradient decomposition

5:22

06/12/2021

Hindsight Task Relabelling: Experience Replay for Sparse Reward Meta-RL

Charles Packer, Pieter Abbeel, Joseph Gonzalez

Keywords: reinforcement learning and planning

14:03

06/12/2021

Meta Learning Backpropagation And Improving It

Louis Kirsch, Jürgen Schmidhuber

Keywords: deep learning, optimization, generative model, meta learning

12:39

22/09/2020

Deep Bayesian bandits: Exploring in online personalized recommendations

Dalin Guo, Sofia Ira Ktena, Pranay Kumar Myana, Ferenc Huszar, Wenzhe Shi, Alykhan Tejani, Michael Kneier, Sourav Das

Keywords: Contextual bandit, Recommender Systems, Algorithmic bias

2:59

06/12/2020

Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning

Younggyo Seo, Kimin Lee, Ignasi Clavera Gilaberte, Thanard Kurutach, Jinwoo Shin, Pieter Abbeel

3:20

04/08/2021

Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective

Dylan Foster, Alexander Rakhlin, David Simchi-Levi, Yunzong Xu

16:53

06/12/2021

Online Sign Identification: Minimization of the Number of Errors in Thresholding Bandits

Reda Ouhamma, Rémy Degenne, Vianney Perchet, Pierre Gaillard

Keywords: bandits, online learning

14:36

18/11/2020

CCA-flow: Deep multi-view subspace learning with inverse autoregressive flow

Jia He, Feiyang Pan, Fuzhen Zhuang, Qing He

11:33

04/08/2021

Online Markov Decision Processes with Aggregate Bandit Feedback

Alon Cohen, Haim Kaplan, Tomer Koren, Yishay Mansour

13:07

06/12/2021

NeuroLKH: Combining Deep Learning Model with Lin-Kernighan-Helsgaun Heuristic for Solving the Traveling Salesman Problem

Liang Xin, Wen Song, Zhiguang Cao, Jie Zhang

Keywords: deep learning, self-supervised learning, graph learning

8:14

03/05/2021

Neurally Augmented ALISTA

Freya Behrens, Jonathan Sauder, Peter Jung

Keywords: learned ISTA, unrolled algorithms, compressed sensing, sparse reconstruction

5:18

06/12/2021

Reducing Collision Checking for Sampling-Based Motion Planning Using Graph Neural Networks

Chenning Yu, Sicun Gao

Keywords: deep learning, reinforcement learning and planning, graph learning

2:51

06/12/2021

Posterior Meta-Replay for Continual Learning

Christian Henning, Maria Cervera, Francesco D'Angelo, Johannes von Oswald, Regina Traber, Benjamin Ehret, Seijin Kobayashi, Benjamin F. Grewe, João Sacramento

Keywords: deep learning, continual learning

12:27

12/07/2020

Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning

Dipendra Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford

Keywords: Reinforcement Learning - Theory

15:43

02/02/2021

Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search

Chuchu Han, Zhedong Zheng, Changxin Gao, Nong Sang, Yi Yang

10:34

06/12/2020

Bandit Samplers for Training Graph Neural Networks

Ziqi Liu, Zhengwei Wu, Zhiqiang Zhang, Jun Zhou, Shuang Yang, Le Song, Yuan Qi

3:15

26/04/2020

Continual Learning with Bayesian Neural Networks for Non-Stationary Data

Richard Kurle, Botond Cseke, Alexej Klushyn, Patrick van der Smagt, Stephan Günnemann

Keywords: Continual Learning, Online Variational Bayes, Non-Stationary Data, Bayesian Neural Networks, Variational Inference, Lifelong Learning, Concept Drift, Episodic Memory

5:26

22/06/2020

Learning Credal Sum-Product Networks

Amelie Levray, Vaishak Belle

Keywords: credal networks, imprecise probabilities, tractable learning

5:10

12/07/2020

Searching to Exploit Memorization Effect in Learning with Noisy Labels

Quanming Yao, Hansi Yang, Bo Han, Gang Niu, James Kwok

Keywords: Transfer, Multitask and Meta-learning

12:25

26/04/2020

Frequency-based Search-control in Dyna

Yangchen Pan, Jincheng Mei, Amir-massoud Farahmand

Keywords: Model-based reinforcement learning, search-control, Dyna, frequency of a signal

4:32