Benchmarks for Deep Off-Policy Evaluation

03/05/2021

Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Sherry Yang, Michael Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Paine

Keywords: reinforcement learning, benchmarks, off-policy evaluation

Abstract: Off-policy evaluation (OPE) holds the promise of being able to leverage large, offline datasets for both evaluating and selecting complex policies for decision making. The ability to learn offline is particularly important in many real-world domains, such as in healthcare, recommender systems, or robotics, where online data collection is an expensive and potentially dangerous process. Being able to accurately evaluate and select high-performing policies without requiring online interaction could yield significant benefits in safety, time, and cost for these applications. While many OPE methods have been proposed in recent years, comparing results between papers is difficult because currently there is a lack of a comprehensive and unified benchmark, and measuring algorithmic progress has been challenging due to the lack of difficult evaluation tasks. In order to address this gap, we present a collection of policies that in conjunction with existing offline datasets can be used for benchmarking off-policy evaluation. Our tasks include a range of challenging high-dimensional continuous control problems, with wide selections of datasets and policies for performing policy selection. The goal of our benchmark is to provide a standardized measure of progress that is motivated from a set of principles designed to challenge and test the limits of existing OPE methods. We perform an evaluation of state-of-the-art algorithms and provide open-source access to our data and code to foster future research in this area.

The talk and the corresponding paper were published at the ICLR 2021 virtual conference.

Similar Papers

06/12/2021

Active Offline Policy Selection

Ksenia Konyushkova, Yutian Chen, Thomas Paine, Caglar Gulcehre, Cosmin Paduraru, Daniel J Mankowitz, Misha Denil, Nando de Freitas

Keywords: optimization, reinforcement learning and planning, active learning

12:46

06/12/2021

USCO-Solver: Solving Undetermined Stochastic Combinatorial Optimization Problems

Guangmo Tong

Keywords: optimization

15:00

06/12/2020

High-Dimensional Contextual Policy Search with Unknown Context Rewards using Bayesian Optimization

Qing Feng, Ben Letham, Hongzi Mao, Eytan Bakshy

3:29

06/12/2020

Parabolic Approximation Line Search for DNNs

Maximus Mutschler, Andreas Zell

3:19

02/02/2021

A Scalable Two Stage Approach to Computing Optimal Decision Sets

Alexey Ignatiev, Edward Lam, Peter J. Stuckey, Joao Marques-Silva

16:23

19/08/2021

Greybox Algorithm Configuration

Marie Anastacio

Keywords: Heuristic Search and Game Playing, Combinatorial Search and Optimisation, Heuristic Search and Machine Learning

14:13

04/08/2021

Group testing and local search: is there a computational-statistical gap?

Fotis Iliopoulos, Ilias Zadik

17:50

12/07/2020

Tuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging Problems

Kaixuan Wei, Angelica I Aviles-Rivero, Jingwei Liang, Ying Fu, Carola-Bibiane Schönlieb, Hua Huang

Keywords: Deep Learning - Algorithms

11:48

18/07/2021

Provably Correct Optimization and Exploration with Non-linear Policies

Fei Feng, Wotao Yin, Alekh Agarwal, Lin Yang

Keywords: Deep Learning, Adversarial Networks, Applications, Fairness, Accountability, and Transparency, Theory, RL, Decisions and Control Theory

5:03

15/06/2020

Learning fast and precise numerical analysis

Jingxuan He, Gagandeep Singh, Markus Püschel, Martin Vechev

Keywords: Abstract interpretation, Performance optimization, Machine learning, Numerical domains

14:20

04/08/2021

Adversarially Robust Low Dimensional Representations

Pranjal Awasthi, Vaggos Chatziafratis, Xue Chen, Aravindan Vijayaraghavan

20:19

12/07/2020

Searching to Exploit Memorization Effect in Learning with Noisy Labels

Quanming Yao, Hansi Yang, Bo Han, Gang Niu, James Kwok

Keywords: Transfer, Multitask and Meta-learning

12:25

26/04/2020

Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning

Ali Mousavi, Lihong Li, Qiang Liu, Denny Zhou

Keywords: reinforcement learning, off-policy estimation, importance sampling, propensity score

5:25

23/08/2020

Diverse rule sets

Guangyi Zhang, Aristides Gionis

Keywords: sampling, classifier, pattern mining, rule learning, diversification, rule sets

9:41

03/08/2020

Semi-bandit Optimization in the Dispersed Setting

Travis Dick, Wesley Pegden, Maria-Florina Balcan

8:04

06/12/2021

Factored Policy Gradients: Leveraging Structure for Efficient Learning in MOMDPs

Thomas Spooner, Nelson Vadori, Sumitra Ganesh

Keywords: bandits

14:40

02/02/2021

On the Tractability of SHAP Explanations

Guy Van den Broeck, Anton Lykov, Maximilian Schleich, Dan Suciu

19:31

06/12/2021

Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks

Tolga Birdal, Aaron Lou, Leonidas Guibas, Umut Simsekli

Keywords: theory, deep learning, optimization

14:38

06/12/2020

Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View

Christos Thrampoulidis, Samet Oymak, Mahdi Soltanolkotabi

4:25

06/12/2021

Overcoming the curse of dimensionality with Laplacian regularization in semi-supervised learning

Vivien Cabannes, Loucas Pillaud-Vivien, Francis Bach, Alessandro Rudi

Keywords: machine learning, kernel methods, semi-supervised learning

14:24

12/07/2020

Estimation of Bounds on Potential Outcomes For Decision Making

Maggie Makar, Fredrik Johansson, John Guttag, David Sontag

Keywords: Causality

13:12

18/07/2021

Post-selection inference with HSIC-Lasso

Tobias Freidling, Benjamin Poignard, Héctor Climente-González, Makoto Yamada

Keywords: Theory, Statistical Learning Theory

5:03

18/07/2021

DORO: Distributional and Outlier Robust Optimization

Runtian Zhai, Chen Dan, Zico Kolter, Pradeep Ravikumar

Keywords: Probabilistic Methods, Robust statistics

5:06

14/09/2020

An efficient K-means clustering algorithm for tall data

Marco Capó, Aritz Pérez, Jose A. Lozano

14:46

02/02/2021

Over-MAP: Structural Attention Mechanism and Automated Semantic Segmentation Ensembled for Uncertainty Prediction

Charles A. Kantor, Léonard Boussioux, Brice Rauby, Hugues Talbot

16:38

13/04/2021

Benchmarking simulation-based inference

Jan-Matthis Lueckmann, Jan Boelts, David Greenberg, Pedro Goncalves, Jakob Macke

3:04

13/04/2021

Non-stationary off-policy optimization

Joey Hong, Branislav Kveton, Manzil Zaheer, Yinlam Chow, Amr Ahmed

2:57

06/12/2020

Beyond Individualized Recourse: Interpretable and Interactive Summaries of Actionable Recourses

Kai Rawal, Himabindu Lakkaraju

3:31

02/02/2021

Learning Generalized Relational Heuristic Networks for Model-Agnostic Planning

Rushang Karia, Siddharth Srivastava

16:56

02/02/2021

Learning Prediction Intervals for Model Performance

Benjamin Elder, Matthew Arnold, Anupama Murthi, Jiří Navrátil

20:12

04/07/2020

SEEK: Segmented Embedding of Knowledge Graphs

Wentao Xu, Shun Zheng, Liang He, Bin Shao, Jian Yin, Tie-Yan Liu

Keywords: Segmented Graphs, knowledge embedding, artificial intelligence, recommendation

12:01

06/12/2020

Approximate Cross-Validation with Low-Rank Data in High Dimensions

Will Stephenson, Madeleine Udell, Tamara Broderick

3:02

03/05/2021

In Search of Lost Domain Generalization

Ishaan Gulrajani, David Lopez-Paz

Keywords: reproducible research, domain generalization

5:38

06/12/2021

Cross-modal Domain Adaptation for Cost-Efficient Visual Reinforcement Learning

Xiong-Hui Chen, Shengyi Jiang, Feng Xu, Zongzhang Zhang, Yang Yu

Keywords: reinforcement learning and planning, domain adaptation

10:55

06/12/2021

Learning with Labeling Induced Abstentions

Kareem Amin, Giulia DeSalvo, Afshin Rostamizadeh

Keywords: machine learning, active learning

11:22

12/07/2020

Robust Black Box Explanations Under Distribution Shift

Himabindu Lakkaraju, Nino Arsov, Osbert Bastani

Keywords: Accountability, Transparency and Interpretability

14:02

06/12/2021

Automatic Unsupervised Outlier Model Selection

Yue Zhao, Ryan Rossi, Leman Akoglu

Keywords: machine learning, self-supervised learning, meta learning, clustering

15:08

03/05/2021

C-Learning: Horizon-Aware Cumulative Accessibility Estimation

Panteha Naderian, Gabriel Loaiza-Ganem, Harry Braviner, Anthony Caterini, Jesse C Cresswell, Tong Li, Animesh Garg

Keywords: reinforcement learning, goal reaching, Q-learning

4:49

06/12/2020

Off-Policy Imitation Learning from Observations

Zhuangdi Zhu, Kaixiang Lin, Bo Dai, Jiayu Zhou

3:24

18/07/2021

Meta-Cal: Well-controlled Post-hoc Calibration by Ranking

Xingchen Ma, Matthew B Blaschko

Keywords: Algorithms, Supervised Learning

4:28