06/12/2021

Asymptotically Exact Error Characterization of Offline Policy Evaluation with Misspecified Linear Models

Kohei Miyaguchi

Keywords: reinforcement learning and planning

Abstract: We consider the problem of offline policy evaluation (OPE) with Markov decision processes (MDPs), where the goal is to estimate the utility of given decision-making policies based on static datasets. Recently, theoretical understanding of OPE has advanced rapidly under (approximate) realizability assumptions, i.e., where the environments of interest are well approximated by the given hypothetical models. On the other hand, OPE under unrealizability is not nearly as well understood, despite its importance in real-world applications. To address this issue, we study the behavior of a simple existing OPE method, the linear direct method (DM), under unrealizability. We obtain an asymptotically exact characterization of the OPE error in a doubly robust form. Leveraging this result, we also establish the nonparametric consistency of tile-coding estimators under quite mild assumptions.
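To make the setting concrete, below is a minimal sketch of a linear direct-method (DM) estimator for OPE, of the general kind studied in the paper: fit a linear action-value model from offline transitions via an LSTD-style fixed-point equation, then plug the fitted model into the initial-state distribution. The random MDP, the one-hot feature map, the deterministic target policy, and all names are illustrative assumptions, not taken from the paper; with coarser features than one-hot, the linear model would be misspecified, which is the regime the paper analyzes.

```python
# Minimal sketch of a linear direct-method (DM) OPE estimator (illustrative,
# not the paper's exact procedure). Assumptions: small random finite MDP,
# deterministic target policy, tabular one-hot features.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 5, 2, 0.9

# Random MDP: transition kernel P[s, a] (distribution over next states) and reward R[s, a].
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(size=(n_states, n_actions))

def phi(s, a):
    """One-hot feature map over state-action pairs (illustrative choice)."""
    x = np.zeros(n_states * n_actions)
    x[s * n_actions + a] = 1.0
    return x

# Target policy to evaluate (deterministic, chosen arbitrarily).
pi = rng.integers(n_actions, size=n_states)

# Offline dataset collected by a uniform behavior policy.
n_samples = 20000
S = rng.integers(n_states, size=n_samples)
A = rng.integers(n_actions, size=n_samples)
Rews = R[S, A] + 0.1 * rng.standard_normal(n_samples)
S_next = np.array([rng.choice(n_states, p=P[s, a]) for s, a in zip(S, A)])

# LSTD-Q: solve the empirical fixed-point equation  A_hat w = b_hat, where
# A_hat = (1/n) sum phi(s,a) (phi(s,a) - gamma * phi(s', pi(s')))^T
# b_hat = (1/n) sum r * phi(s,a)
d = n_states * n_actions
A_hat, b_hat = np.zeros((d, d)), np.zeros(d)
for s, a, r, s_next in zip(S, A, Rews, S_next):
    f = phi(s, a)
    f_next = phi(s_next, pi[s_next])
    A_hat += np.outer(f, f - gamma * f_next)
    b_hat += r * f
w = np.linalg.solve(A_hat / n_samples + 1e-8 * np.eye(d), b_hat / n_samples)

# Direct-method estimate: plug the fitted Q into the initial-state distribution.
d0 = np.full(n_states, 1.0 / n_states)
J_dm = sum(d0[s] * phi(s, pi[s]) @ w for s in range(n_states))
print(f"DM estimate of J(pi): {J_dm:.3f}")
```

The paper's contribution concerns how the error of this kind of estimator behaves when the linear model cannot represent the true action-value function; the sketch only illustrates the estimator itself.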

The talk and the paper are published at the NeurIPS 2021 virtual conference.
