06/12/2021

Indexed Minimum Empirical Divergence for Unimodal Bandits

Hassan SABER, Pierre Ménard, Odalric-Ambrym Maillard

Keywords: bandits

Abstract: We consider a stochastic multi-armed bandit problem specified by a set of one-dimensional family exponential distributions endowed with a unimodal structure. The unimodal structure is of practical relevance for several applications. We introduce IMED-UB, an algorithm that exploits provably optimally the unimodal-structure, by adapting to this setting the Indexed Minimum Empirical Divergence (IMED) algorithm introduced by Honda and Takemura (2015). Owing to our proof technique, we are able to provide a concise finite-time analysis of the IMED-UB algorithm, that is simple and yet yields asymptotic optimality. We finally provide numerical experiments showing that IMED-UB competes favorably with the recently introduced state-of-the-art algorithms.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at NeurIPS 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd

Similar Papers