From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model

12/07/2020

Aadirupa Saha, Aditya Gopalan

Keywords: Online Learning, Active Learning, and Bandits

Abstract: We consider PAC learning of a good item from $k$-subsetwise feedback sampled from a Plackett-Luce probability model, with instance-dependent sample complexity guarantees. In the setting where subsets of a fixed size can be tested and top-ranked feedback is made available to the learner, we give an optimal instance-dependent algorithm for PAC best-arm identification with a sample complexity bound of $O\bigg(\frac{\Theta_{[k]}}{k}\sum_{i = 2}^n\max\Big(1,\frac{1}{\Delta_i^2}\Big) \ln\frac{k}{\delta}\Big(\ln \frac{1}{\Delta_i}\Big)\bigg)$, where $\Delta_i$ is the Plackett-Luce parameter gap between the best and the $i^{th}$ best item, and $\Theta_{[k]}$ is the sum of the Plackett-Luce parameters of the top-$k$ items. The algorithm is a wrapper around a PAC winner-finding algorithm with weaker performance guarantees, which lets it adapt to the hardness of the input instance. The sample complexity is also shown to improve by a multiplicative factor that depends on the length of the rank-ordered feedback available in each subset-wise play. We show optimality of our algorithms with matching sample complexity lower bounds. We next address the winner-finding problem in Plackett-Luce models in the fixed-budget setting, with instance-dependent upper and lower bounds on the misidentification probability of $\Omega\left(\exp(-2 \tilde \Delta Q) \right)$ for a given budget $Q$, where $\tilde \Delta$ is an explicit instance-dependent problem complexity parameter. Numerical performance results are also reported for the algorithms.
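To make the feedback model in the abstract concrete, here is a minimal Python sketch of subset-wise Plackett-Luce feedback. It is not the authors' algorithm, only the standard sampling model: within a played subset, item $i$ wins with probability proportional to its parameter, and a top-$m$ ranking is obtained by repeatedly drawing a winner from the items that remain. The function names, the example parameter vector `theta`, and the subset are illustrative assumptions, not taken from the paper.

```python
import random


def winner_probabilities(theta, subset):
    """Plackett-Luce winner probabilities within `subset`:
    P(i wins | subset) = theta[i] / sum of theta[j] over j in subset."""
    total = sum(theta[j] for j in subset)
    return {i: theta[i] / total for i in subset}


def sample_top_m_ranking(theta, subset, m, rng):
    """Draw top-m rank-ordered feedback for one subset-wise play.

    Repeatedly samples a winner among the remaining items (with
    probability proportional to theta) and removes it, which is the
    usual sequential description of the Plackett-Luce model.
    """
    remaining = list(subset)
    ranking = []
    for _ in range(min(m, len(remaining))):
        weights = [theta[j] for j in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        ranking.append(pick)
        remaining.remove(pick)
    return ranking


# Hypothetical instance: item 0 has the largest parameter (the "best" item).
demo_theta = [1.0, 0.8, 0.5, 0.2, 0.1]
demo_subset = [0, 2, 3]  # one subset-wise play of size k = 3
demo_rng = random.Random(0)

print(winner_probabilities(demo_theta, demo_subset))
print(sample_top_m_ranking(demo_theta, demo_subset, m=2, rng=demo_rng))
```

With `m=1` this reduces to winner-only feedback; larger `m` gives the longer rank-ordered feedback that the abstract says improves the sample complexity.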

The talk and the accompanying paper were published at the ICML 2020 virtual conference.

Similar Papers

02/02/2021

Combinatorial Pure Exploration with Full-Bandit or Partial Linear Feedback

Yihan Du, Yuko Kuroki, Wei Chen

17:13

26/08/2020

Best-item Learning in Random Utility Models with Subset Choices

Aadirupa Saha, Aditya Gopalan

16:30

06/12/2020

Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation

Devavrat Shah, Dogyoon Song, Zhi Xu, Yuzhe Yang

3:22

12/07/2020

The Sample Complexity of Best-$k$ Items Selection from Pairwise Comparisons

Wenbo Ren, Jia Liu, Ness Shroff

Keywords: Supervised Learning

13:16

06/12/2021

Nearly Horizon-Free Offline Reinforcement Learning

Tongzheng Ren, Jialian Li, Bo Dai, Simon Du, Sujay Sanghavi

Keywords: theory, optimization, reinforcement learning and planning

8:44

09/07/2020

Proper Learning, Helly Number, and an Optimal SVM Bound

Olivier Bousquet, Steve Hanneke, Shay Moran, Nikita Zhivotovskiy

Keywords: PAC learning, Classification, Excess risk bounds and generalization error bounds

14:43

06/12/2021

List-Decodable Mean Estimation in Nearly-PCA Time

Ilias Diakonikolas, Daniel Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian

Keywords: theory, clustering

14:21

04/08/2021

Exponential Weights Algorithms for Selective Learning

Mingda Qiao, Gregory Valiant

12:52

06/12/2020

Agnostic $Q$-learning with Function Approximation in Deterministic Systems: Near-Optimal Bounds on Approximation Error and Sample Complexity

Simon Du, Jason Lee, Gaurav Mahajan, Ruosong Wang

1:56

18/07/2021

Multi-group Agnostic PAC Learnability

Guy Rothblum, Gal Yona

Keywords: Social Aspects of Machine Learning, Fairness, Accountability, and Transparency

5:30

06/12/2020

An Optimal Elimination Algorithm for Learning a Best Arm

Avinatan Hassidim, Ron Kupfer, Yaron Singer

3:23

04/08/2021

Learning and testing junta distributions with subcube conditioning

Xi Chen, Rajesh Jayaram, Amit Levi, Erik Waingarten

19:51

06/12/2021

Instance-Dependent Bounds for Zeroth-order Lipschitz Optimization with Error Certificates

Francois Bachoc, Tom Cesari, Sébastien Gerchinovitz

Keywords: theory, optimization

14:51

04/08/2021

Adaptivity in Adaptive Submodularity

Hossein Esfandiari, Amin Karbasi, Vahab Mirrokni

13:54

06/12/2021

Optimal Algorithms for Stochastic Contextual Preference Bandits

Aadirupa Saha

Keywords: bandits

16:00

18/07/2021

Optimal regret algorithm for Pseudo-1d Bandit Convex Optimization

Aadirupa Saha, Nagarajan Natarajan, Praneeth Netrapalli, Prateek Jain

Keywords: Optimization, Convex Optimization

6:19

26/08/2020

Contextual Combinatorial Volatile Multi-armed Bandit with Adaptive Discretization

Andi Nika, Sepehr Elahi, Cem Tekin

13:12

18/07/2021

Adaptive Sampling for Best Policy Identification in Markov Decision Processes

Aymen Al Marjani, Alexandre Proutiere

Keywords: Theory, RL, Decisions and Control Theory

5:35

06/12/2021

Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning

Gen Li, Laixi Shi, Yuxin Chen, Yuantao Gu, Yuejie Chi

Keywords: theory, reinforcement learning and planning

15:32

06/12/2021

Bandit Phase Retrieval

Tor Lattimore, Botao Hao

Keywords: bandits

14:14

06/12/2021

Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators

Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam

14:56

18/07/2021

Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity

Zhang Zihan, Yuan Zhou, Xiangyang Ji

Keywords: Reinforcement Learning and Planning

5:03

09/07/2020

Bessel Smoothing and Multi-Distribution Property Estimation

Yi Hao, Ping Li

Keywords: Distribution learning/testing, High-dimensional statistics, Information theory

14:48

06/12/2020

Improving Sample Complexity Bounds for (Natural) Actor-Critic Algorithms

Tengyu Xu, Zhe Wang, Yingbin Liang

3:12

06/12/2020

Escaping Saddle-Point Faster under Interpolation-like Conditions

Abhishek Roy, Krishnakumar Balasubramanian, Saeed Ghadimi, Prasant Mohapatra

3:19

06/12/2021

Bandits with many optimal arms

Rianne de Heide, James Cheshire, Pierre Ménard, Alexandra Carpentier

Keywords: bandits

12:23

09/07/2020

Privately Learning Thresholds: Closing the Exponential Gap

Haim Kaplan, Katrina Ligett, Yishay Mansour, Moni Naor, Uri Stemmer

Keywords: Privacy, fairness, PAC learning

14:44

06/12/2021

Sample-Efficient Reinforcement Learning for Linearly-Parameterized MDPs with a Generative Model

Bingyan Wang, Yuling Yan, Jianqing Fan

Keywords: theory, reinforcement learning and planning, generative model

7:34

06/12/2021

Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings

Ming Yin, Yu-Xiang Wang

Keywords: theory, reinforcement learning and planning

8:46

06/12/2020

Agnostic Learning with Multiple Objectives

Corinna Cortes, Mehryar Mohri, Javier Gonzalvo, Dmitry Storcheus

3:07

06/12/2020

Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction

Gen Li, Yuting Wei, Yuejie Chi, Yuantao Gu, Yuxin Chen

3:06

18/07/2021

Streaming and Distributed Algorithms for Robust Column Subset Selection

Shuli Jiang, Dongyu Li, Irene Mengze Li, Arvind Mahankali, David Woodruff

Keywords: Algorithms, Deep Learning, Generative Models, Deep Learning, Predictive Models; Deep Learning, Recurrent Networks

7:26

26/04/2020

CAQL: Continuous Action Q-Learning

Moonkyung Ryu, Yinlam Chow, Ross Anderson, Christian Tjandraatmadja, Craig Boutilier

Keywords: Reinforcement learning (RL), DQN, Continuous control, Mixed-Integer Programming (MIP)

5:36

06/12/2020

Finite-Time Analysis for Double Q-learning

Huaqing Xiong, Lin Zhao, Yingbin Liang, Wei Zhang

Keywords: Deep Learning -> Embedding Approaches, Applications -> Natural Language Processing

3:18

06/12/2021

Bayesian decision-making under misspecified priors with applications to meta-learning

Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miro Dudik, Robert E Schapire

Keywords: meta learning, bandits

14:58

02/02/2021

Near-Optimal Regret Bounds for Contextual Combinatorial Semi-Bandits with Linear Payoff Functions

Kei Takemura, Shinji Ito, Daisuke Hatano, Hanna Sumita, Takuro Fukunaga, Naonori Kakimura, Ken-ichi Kawarabayashi

14:16

18/07/2021

Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning

Gen Li, Changxiao Cai, Yuxin Chen, Yuantao Gu, Yuting Wei, Yuejie Chi

Keywords: Reinforcement Learning and Planning

4:49

18/07/2021

Learning from Biased Data: A Semi-Parametric Approach

Patrice Bertail, Stephan Clémençon, Yannick Guyonvarch, Nathan NOIRY

Keywords: Applications, Fairness, Accountability, and Transparency, Theory, Algorithms, Clustering; Applications, Hardware and Systems; Applications, Privacy, Anonymity, and Security

5:09

09/07/2020

Taking a hint: How to leverage loss predictors in contextual bandits?

Chen-Yu Wei, Haipeng Luo, Alekh Agarwal

Keywords: Bandit problems, Online learning

14:35

04/08/2021

Generalizing Complex Hypotheses on Product Distributions: Auctions, Prophet Inequalities, and Pandora's Problem

Chenghao Guo, Zhiyi Huang, Zhihao Gavin Tang, Xinzhi Zhang

13:12