Planning in Markov Decision Processes with Gap-Dependent Sample Complexity

06/12/2020

Anders Jonsson, Emilie Kaufmann, Pierre Menard, Omar Darwiche Domingues, Edouard Leurent, Michal Valko

Abstract: We propose MDP-GapE, a new trajectory-based Monte-Carlo Tree Search algorithm for planning in a Markov Decision Process in which transitions have finite support. We prove an upper bound on the number of sampled trajectories that MDP-GapE needs to identify a near-optimal action with high probability. This problem-dependent bound is expressed in terms of the sub-optimality gaps of the state-action pairs visited during exploration. Our experiments show that MDP-GapE is also effective in practice, in contrast with other algorithms that have sample complexity guarantees in the fixed-confidence setting, which are mostly theoretical.
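
To make the notion of a sub-optimality gap concrete, the following is a minimal sketch and not the MDP-GapE algorithm itself. It assumes the textbook definition gap(s, a) = V*(s) - Q*(s, a), which may differ in detail from the definition used in the paper, and it computes the gaps exactly by backward induction on a toy two-state MDP whose transitions have finite support. The function names, the toy transition table `P`, and the reward table `R` are all hypothetical, chosen only for illustration.

```python
import numpy as np


def optimal_q_values(P, R, horizon, gamma=1.0):
    """Exact finite-horizon Q-values by backward induction.

    P[s, a, s2] is the probability of moving from s to s2 under action a,
    R[s, a] is the mean reward. Returns an array of shape (horizon, S, A).
    """
    n_states, n_actions, _ = P.shape
    Q = np.zeros((horizon, n_states, n_actions))
    V_next = np.zeros(n_states)  # value after the last step is zero
    for h in reversed(range(horizon)):
        Q[h] = R + gamma * (P @ V_next)  # expected reward plus expected next value
        V_next = Q[h].max(axis=1)        # act greedily at step h
    return Q


def suboptimality_gaps(Q_h):
    """Assumed textbook gap: V*(s) - Q*(s, a) for every state-action pair."""
    return Q_h.max(axis=1, keepdims=True) - Q_h


# Hypothetical toy MDP: 2 states, 2 actions, each transition has finite support.
P = np.array([
    [[1.0, 0.0], [0.5, 0.5]],  # from state 0: action 0, action 1
    [[0.0, 1.0], [1.0, 0.0]],  # from state 1: action 0, action 1
])
R = np.array([
    [0.0, 0.5],
    [1.0, 0.0],
])

Q = optimal_q_values(P, R, horizon=2)
gaps = suboptimality_gaps(Q[0])  # gaps at the first step
print(gaps)
```

In this toy problem the gaps are known exactly because the model is given. A planner such as the one described in the abstract only has sampled trajectories, and the intuition behind a gap-dependent bound is that state-action pairs with large gaps can be ruled out after fewer samples.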

The talk and the respective paper are published at the NeurIPS 2020 virtual conference.
