06/12/2020

Off-Policy Evaluation via the Regularized Lagrangian

Sherry Yang, Ofir Nachum, Bo Dai, Lihong Li, Dale Schuurmans

Keywords:

Abstract: The recently proposed distribution correction estimation (DICE) family of estimators has advanced the state of the art in off-policy evaluation from behavior-agnostic data. While these estimators all perform some form of stationary distribution correction, they arise from different derivations and objective functions. In this paper, we unify these estimators as regularized Lagrangians of the same linear program. The unification allows us to expand the space of DICE estimators to new alternatives that demonstrate improved performance. More importantly, by analyzing the expanded space of estimators both mathematically and empirically, we find that dual solutions offer greater flexibility in navigating the tradeoff between optimization stability and estimation bias, and generally provide superior estimates in practice.
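For readers unfamiliar with the DICE setting, a minimal sketch of the kind of linear program and Lagrangian the abstract refers to, with notation borrowed from the broader DICE literature as an assumption rather than taken from this page: the value of a target policy $\pi$ can be cast as an LP over stationary distributions $d$,
\[
\max_{d \ge 0}\; \mathbb{E}_{(s,a)\sim d}[r(s,a)]
\quad \text{s.t.} \quad
d(s,a) = (1-\gamma)\,\mu_0(s)\,\pi(a\mid s) + \gamma \sum_{\bar s,\bar a} P(s\mid \bar s,\bar a)\,\pi(a\mid s)\, d(\bar s,\bar a),
\]
where $\gamma$ is the discount, $\mu_0$ the initial state distribution, and $P$ the transition kernel. Introducing multipliers $Q(s,a)$ and writing $\zeta = d / d^{\mathcal D}$ for the correction ratio against the off-policy data distribution $d^{\mathcal D}$ yields a Lagrangian of the form
\[
L(Q,\zeta) = (1-\gamma)\,\mathbb{E}_{s_0\sim\mu_0,\,a_0\sim\pi}[Q(s_0,a_0)]
+ \mathbb{E}_{(s,a,r,s')\sim d^{\mathcal D},\,a'\sim\pi}\big[\zeta(s,a)\,\big(r + \gamma Q(s',a') - Q(s,a)\big)\big].
\]
The different DICE estimators then correspond, roughly, to adding convex regularizers on $Q$ and/or $\zeta$ (and redundant constraints) to this saddle-point objective, with the policy value recovered from either the primal or the dual solution.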

The talk and the respective paper were published at the NeurIPS 2020 virtual conference.
