An Optimal Elimination Algorithm for Learning a Best Arm

06/12/2020

An Optimal Elimination Algorithm for Learning a Best Arm

Avinatan Hassidim, Ron Kupfer, Yaron Singer

Keywords:

Abstract Paper Similar Papers

Abstract: We consider the classic problem of $(\epsilon,\delta)$-\texttt{PAC} learning a best arm where the goal is to identify with confidence $1-\delta$ an arm whose mean is an $\epsilon$-approximation to that of the highest mean arm in a multi-armed bandit setting. This problem is one of the most fundamental problems in statistics and learning theory, yet somewhat surprisingly its worst case sample complexity is not well understood. In this paper we propose a new approach for $(\epsilon,\delta)$-\texttt{PAC} learning a best arm. This approach leads to an algorithm whose sample complexity converges to \emph{exactly} the optimal sample complexity of $(\epsilon,\delta)$-learning the mean of $n$ arms separately and we complement this result with a conditional matching lower bound. More specifically: \begin{itemize} \item The algorithm's sample complexity converges to \emph{exactly} $\frac{n}{2\epsilon^2}\log \frac{1}{\delta}$ as $n$ grows and $\delta \geq \frac{1}{n}$; % \item We prove that no elimination algorithm obtains sample complexity arbitrarily lower than $\frac{n}{2\epsilon^2}\log \frac{1}{\delta}$. Elimination algorithms is a broad class of $(\epsilon,\delta)$-\texttt{PAC} best arm learning algorithms that includes many algorithms in the literature. \end{itemize} When $n$ is independent of $\delta$ our approach yields an algorithm whose sample complexity converges to $\frac{2n}{\epsilon^2} \log \frac{1}{\delta}$ as $n$ grows. In comparison with the best known algorithm for this problem our approach improves the sample complexity by a factor of over 1500 and over 6000 when $\delta\geq \frac{1}{n}$.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at NeurIPS 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

06/12/2021

Bandits with many optimal arms

Rianne de Heide, James Cheshire, Pierre Ménard, Alexandra Carpentier

Keywords Paper

bandits

0

0

0

0

12:23

06/12/2021

List-Decodable Mean Estimation in Nearly-PCA Time

Ilias Diakonikolas, Daniel Kane, Daniel Kongsgaard and
Jerry Li, Kevin Tian

Keywords Paper

theory, clustering

0

0

0

0

14:21

12/07/2020

From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model

Aadirupa Saha, Aditya Gopalan

Keywords Paper

Online Learning, Active Learning, and Bandits

0

0

0

0

15:22

18/07/2021

PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization

Zhize Li, Hongyan Bao, Xiangliang Zhang, Peter Richtarik

Keywords Paper

Optimization

0

0

0

0

11:53

12/07/2020

On Efficient Low Distortion Ultrametric Embedding

Vincent Cohen-Addad, Karthik C. S., Guillaume Lagarde

Keywords Paper

Unsupervised and Semi-Supervised Learning

0

0

0

0

16:37

06/12/2021

Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination

Arpan Mukherjee, Ali Tajer, Pin-Yu Chen, Payel Das

Keywords Paper

theory, bandits

0

0

0

0

15:07

06/12/2020

Deterministic Approximation for Submodular Maximization over a Matroid in Nearly Linear Time

Kai Han, zongmai Cao, Shuang Cui, Benwei Wu

Keywords Paper

0

0

0

0

3:16

06/12/2021

Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers

Tommaso d'Orsi, Chih-Hung Liu, Rajai Nasser and
Gleb Novikov, David Steurer, Stefan Tiegel

Keywords Paper

optimization

0

0

0

0

10:44

06/12/2021

Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning

Gen Li, Laixi Shi, Yuxin Chen and
Yuantao Gu, Yuejie Chi

Keywords Paper

theory, reinforcement learning and planning

0

0

0

0

15:32

12/07/2020

Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: Joint Gradient Estimation and Tracking

Haoran Sun, Songtao Lu, Mingyi Hong

Keywords Paper

Optimization - Non-convex

0

0

0

0

13:56

06/12/2021

On the Sample Complexity of Privately Learning Axis-Aligned Rectangles

Menachem Sadigurschi, Uri Stemmer

Keywords Paper

theory, privacy

0

0

0

0

14:00

18/07/2021

Variance Reduction via Primal-Dual Accelerated Dual Averaging for Nonsmooth Convex Finite-Sums

Chaobing Song, Stephen Wright, Jelena Diakonikolas

Keywords Paper

Optimization, Convex Optimization

0

0

0

0

18:04

06/12/2020

Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time

Jerry Li, Guanghao Ye

Keywords Paper

0

0

0

0

3:13

18/07/2021

Top-k eXtreme Contextual Bandits with Arm Hierarchy

Rajat Sen, Alexander Rakhlin, Lexing Ying and
Rahul Kidambi, Dean Foster, Daniel Hill, Inderjit Dhillon

Keywords Paper

Reinforcement Learning and Planning, Bandits

0

0

0

0

5:23

06/12/2020

Escaping Saddle-Point Faster under Interpolation-like Conditions

Abhishek Roy, Krishnakumar Balasubramanian, Saeed Ghadimi, Prasant Mohapatra

Keywords Paper

0

0

0

0

3:19

09/07/2020

A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates

Zhixian Lei, Kyle Luh, Prayaag Venkat, Fred Zhang

Keywords Paper

High-dimensional statistics, Adversarial learning and robustness

0

0

0

0

15:00

06/12/2021

Improved Coresets and Sublinear Algorithms for Power Means in Euclidean Spaces

Vincent Cohen-Addad, David Saulpic, Chris Schwiegelshohn

Keywords Paper

clustering

0

0

0

0

16:06

09/07/2020

How to trap a gradient flow

Dan Mikulincer, Sebastien Bubeck

Keywords Paper

Non-convex optimization,

0

0

0

0

15:01

12/07/2020

Near-optimal sample complexity bounds for learning Latent $k-$polytopes and applications to Ad-Mixtures

Chiranjib Bhattacharyya, Ravindran Kannan

Keywords Paper

Learning Theory

0

0

0

0

15:04

18/07/2021

Private Stochastic Convex Optimization: Optimal Rates in L1 Geometry

Hilal Asi, Vitaly Feldman, Tomer Koren, Kunal Talwar

Keywords Paper

Deep Learning, Algorithms, Multitask and Transfer Learning; Algorithms, Online Learning, Social Aspects of Machine Learning, Privacy, Anonymity, and Security

0

0

0

0

17:27

09/07/2020

Locally Private Hypothesis Selection

Sivakanth Gopi, Gautam Kamath, Janardhan D Kulkarni and
Aleksandar Nikolov, Steven Wu, Huanyu Zhang

Keywords Paper

Privacy, fairness, Distribution learning/testing

0

0

0

0

14:58

06/12/2020

The Flajolet-Martin Sketch Itself Preserves Differential Privacy: Private Counting with Minimal Space

Adam Smith, Shuang Song, Abhradeep Guha Thakurta

Keywords Paper

0

0

0

0

3:17

06/12/2021

Better Algorithms for Individually Fair $k$-Clustering

Maryam Negahbani, Deeparnab Chakrabarty

Keywords Paper

theory, self-supervised learning, clustering, fairness

0

0

0

0

14:02

06/12/2021

Sequential Algorithms for Testing Closeness of Distributions

Aadil Oufkir, Omar Fawzi, Nicolas Flammarion, Aurélien Garivier

Keywords Paper

theory

0

0

0

0

14:09

06/12/2021

LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes

Aditya Kusupati, Matthew Wallingford, Vivek Ramanujan and
Raghav Somani, Jae Sung Park, Krishna Pillutla, Prateek Jain, Sham Kakade, Ali Farhadi

Keywords Paper

machine learning, representation learning

0

0

0

0

15:05

09/07/2020

Estimating Principal Components under Adversarial Perturbations

Pranjal Awasthi, Xue Chen, Aravindan Vijayaraghavan

Keywords Paper

Unsupervised and semi-supervised learning, Adversarial learning and robustness

0

0

0

0

15:40

26/08/2020

An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays

Julian Zimmert, Yevgeny Seldin

Keywords Paper

0

0

0

0

13:12

06/12/2021

Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators

Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam

Keywords Paper

0

0

0

0

14:56

22/06/2020

Efficiently learning structured distributions from untrusted batches

Sitan Chen, Jerry Li, Ankur Moitra

Keywords Paper

sum-of-squares, federated learning, VC complexity, Robust statistics

0

0

0

0

24:38

09/07/2020

Bessel Smoothing and Multi-Distribution Property Estimation

Yi Hao, Ping Li

Keywords Paper

Distribution learning/testing, High-dimensional statistics, Information theory

0

0

0

0

14:48

06/12/2020

An Empirical Process Approach to the Union Bound: Practical Algorithms for Combinatorial and Linear Bandits

Julian Katz-Samuels, Lalit Jain, zohar karnin, Kevin Jamieson

Keywords Paper

0

0

0

0

3:20

18/07/2021

Streaming and Distributed Algorithms for Robust Column Subset Selection

Shuli Jiang, Dongyu Li, Irene Mengze Li and
Arvind Mahankali, David Woodruff

Keywords Paper

Algorithms, Deep Learning, Generative Models, Deep Learning, Predictive Models; Deep Learning, Recurrent Networks

0

0

0

0

7:26

06/12/2020

Fourier Sparse Leverage Scores and Approximate Kernel Learning

Tamas Erdelyi, Cameron Musco, Christopher Musco

Keywords Paper

0

0

0

0

3:25

06/12/2021

Optimal Sketching for Trace Estimation

Shuli Jiang, Hai Pham, David Woodruff, Richard Zhang

Keywords Paper

machine learning

0

0

0

0

15:14

06/12/2020

Optimal Best-arm Identification in Linear Bandits

Yassir Jedra, Alexandre Proutiere

Keywords Paper

0

0

0

0

3:21

18/07/2021

Improving Ultrametrics Embeddings Through Coresets

Vincent Cohen-Addad, Rémi de Joannis de Verclos, Guillaume Lagarde

Keywords Paper

Algorithms, Clustering

0

0

0

0

5:19

18/07/2021

First-Order Methods for Wasserstein Distributionally Robust MDP

Julien Grand-Clement, Christian Kroer

Keywords Paper

Theory, RL, Decisions and Control Theory

0

0

0

0

5:18

08/07/2020

The Online Min-Sum Set Cover Problem

Dimitris Fotakis, Loukas Kavouras, Grigorios Koumoutsos and
Stratis Skoulakis, Manolis Vardas

Keywords Paper

Online Algorithms, Competitive Analysis, Min-Sum Set Cover

0

0

0

0

25:10

06/12/2021

Doubly Robust Thompson Sampling with Linear Payoffs

Wonyoung Kim, Gi-Soo Kim, Myunghee Cho Paik

Keywords Paper

bandits

0

0

0

0

14:18

06/12/2020

Task-Robust Model-Agnostic Meta-Learning

Liam Collins, Aryan Mokhtari, Sanjay Shakkottai

Keywords Paper

0

0

0

0

3:17