19/08/2021

Epsilon Best Arm Identification in Spectral Bandits

Tomáš Kocák, Aurélien Garivier

Keywords: Machine Learning, Learning Theory, Online Learning

Abstract: We propose an analysis of Probably Approximately Correct (PAC) identification of an ϵ-best arm in graph bandit models with Gaussian distributions. We consider finite but potentially very large bandit models where the set of arms is endowed with a graph structure, and we assume that the arms' expectations μ are smooth with respect to this graph. Our goal is to identify an arm whose expectation is at most ϵ below the largest of all means. We focus on the fixed-confidence setting: given a risk parameter δ, we consider sequential strategies that yield an ϵ-optimal arm with probability at least 1-δ. All such strategies use at least T*(μ)log(1/δ) samples, where R is the smoothness parameter. We identify the complexity term T*(μ) as the solution of a min-max problem for which we give a game-theoretic analysis and an approximation procedure. This procedure is the key element required by the asymptotically optimal Track-and-Stop strategy.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at IJCAI 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers