Abstract:
The RKHS bandit problem (also called the kernelized multi-armed bandit problem)
is an online optimization problem over non-linear functions with noisy feedback.
Although the problem has been extensively studied,
several results remain unsatisfactory compared to
those available for the well-studied linear bandit case.
Specifically, there is no general algorithm for the adversarial RKHS bandit problem.
In addition, the high computational complexity of existing algorithms hinders practical application.
We address these issues by considering a novel amalgamation
of approximation theory and the misspecified linear bandit problem.
Using an approximation method,
we propose efficient algorithms for the stochastic
RKHS bandit problem and the first general algorithm for the adversarial RKHS bandit problem.
Furthermore,
we empirically show that one of our proposed methods achieves
cumulative regret comparable to that of IGP-UCB while requiring much less running time.