High-Dimensional Contextual Policy Search with Unknown Context Rewards using Bayesian Optimization

06/12/2020

High-Dimensional Contextual Policy Search with Unknown Context Rewards using Bayesian Optimization

Qing Feng , Ben Letham, Hongzi Mao, Eytan Bakshy

Keywords:

Abstract Paper Similar Papers

Abstract: Contextual policies are used in many settings to customize system parameters and actions to the specifics of a particular setting. In some real-world settings, such as randomized controlled trials or A/B tests, it may not be possible to measure policy outcomes at the level of context—we observe only aggregate rewards across a distribution of contexts. This makes policy optimization much more difficult because we must solve a high-dimensional optimization problem over the entire space of contextual policies, for which existing optimization methods are not suitable. We develop effective models that leverage the structure of the search space to enable contextual policy optimization directly from the aggregate rewards using Bayesian optimization. We use a collection of simulation studies to characterize the performance and robustness of the models, and show that our approach of inferring a low-dimensional context embedding performs best. Finally, we show successful contextual policy optimization in a real-world video bitrate policy problem.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at NeurIPS 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

06/12/2021

Design of Experiments for Stochastic Contextual Linear Bandits

Andrea Zanette, Kefan Dong, Jonathan N Lee, Emma Brunskill

Keywords Paper

reinforcement learning and planning, bandits

0

0

0

0

13:58

03/05/2021

Blending MPC & Value Function Approximation for Efficient Reinforcement Learning

Mohak Bhardwaj, Sanjiban Choudhury, Byron Boots

Keywords Paper

reinforcement learning, model-predictive control

0

0

0

0

5:09

06/12/2021

Stateful Strategic Regression

Keegan Harris, Hoda Heidari, Steven Wu

Keywords Paper

optimization

0

0

0

0

14:02

18/07/2021

Provably Correct Optimization and Exploration with Non-linear Policies

Fei Feng, Wotao Yin, Alekh Agarwal, Lin Yang

Keywords Paper

Deep Learning, Adversarial Networks, Applications, Fairness, Accountability, and Transparency, Theory, RL, Decisions and Control Theory

0

0

0

0

5:03

26/08/2020

Mixed Strategies for Robust Optimization of Unknown Objectives

Pier Giuseppe Sessa, Ilija Bogunovic, Maryam Kamgarpour, Andreas Krause

Keywords Paper

0

0

0

0

14:13

12/07/2020

A distributional view on multi objective policy optimization

Abbas Abdolmaleki, Sandy Huang, Leonard Hasenclever and
Michael Neunert, Martina Zambelli, Murilo Martins, Francis Song, Nicolas Heess, Raia Hadsell, Martin Riedmiller

Keywords Paper

Reinforcement Learning - Deep RL

0

0

0

0

15:04

18/07/2021

Implicit rate-constrained optimization of non-decomposable objectives

Abhishek Kumar, Harikrishna Narasimhan, Andrew Cotter

Keywords Paper

Algorithms, Supervised Learning

0

0

0

0

3:48

06/12/2021

Automated Dynamic Mechanism Design

Hanrui Zhang, Vincent Conitzer

Keywords Paper

0

0

0

0

14:35

18/07/2021

The Power of Adaptivity for Stochastic Submodular Cover

Rohan Ghuge, Anupam Gupta, viswanath nagarajan

Keywords Paper

Optimization, Stochastic Optimization

0

0

0

0

16:47

18/07/2021

Monotonic Robust Policy Optimization with Model Discrepancy

yuankun jiang, Chenglin Li, Wenrui Dai and
Junni Zou, Hongkai Xiong

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

5:17

06/12/2021

Control Variates for Slate Off-Policy Evaluation

Nikos Vlassis, Ashok Chandrashekar, Fernando Amat, Nathan Kallus

Keywords Paper

optimization, bandits

0

0

0

0

12:25

06/12/2021

USCO-Solver: Solving Undetermined Stochastic Combinatorial Optimization Problems

Guangmo Tong

Keywords Paper

optimization

0

0

0

0

15:00

06/12/2020

The Advantage of Conditional Meta-Learning for Biased Regularization and Fine Tuning

Giulia Denevi, Massimiliano Pontil, Carlo Ciliberto

Keywords Paper

0

0

0

0

3:24

13/04/2021

Linear models are robust optimal under strategic behavior

Wei Tang, Chien-Ju Ho, Yang Liu

Keywords Paper

0

0

0

0

3:32

26/08/2020

Finite-Time Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation

Jun Sun, Gang Wang, Georgios B. Giannakis and
Qinmin Yang, Zaiyue Yang

Keywords Paper

0

0

0

0

17:07

02/02/2021

Learning Prediction Intervals for Model Performance

Benjamin Elder, Matthew Arnold, Anupama Murthi, Jiří Navrátil

Keywords Paper

0

0

0

0

20:12

06/12/2021

Factored Policy Gradients: Leveraging Structure for Efficient Learning in MOMDPs

Thomas Spooner, Nelson Vadori, Sumitra Ganesh

Keywords Paper

bandits

0

0

0

0

14:40

13/04/2021

Learning prediction intervals for regression: Generalization and calibration

Haoxian Chen, Ziyi Huang, Henry Lam and
Huajie Qian, Haofeng Zhang

Keywords Paper

0

0

0

0

3:26

26/08/2020

Truly Batch Model-Free Inverse Reinforcement Learning about Multiple Intentions

Giorgia Ramponi, Amarildo Likmeta, Alberto Maria Metelli and
Andrea Tirinzoni, Marcello Restelli

Keywords Paper

0

0

0

0

9:41

18/07/2021

Meta-learning Hyperparameter Performance Prediction with Neural Processes

Ying WEI, Peilin Zhao, Junzhou Huang

Keywords Paper

Algorithms, Supervised Learning

0

0

0

0

5:07

03/05/2021

Benchmarks for Deep Off-Policy Evaluation

Justin Fu, Mohammad Norouzi, Ofir Nachum and
George Tucker, ziyu wang, Alexander Novikov, Sherry Yang, Michael Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Paine

Keywords Paper

reinforcement learning, benchmarks, off-policy evaluation

0

0

0

0

10:05

26/04/2020

Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization

Michael Volpp, Lukas P. Fröhlich, Kirsten Fischer and
Andreas Doerr, Stefan Falkner, Frank Hutter, Christian Daniel

Keywords Paper

Transfer Learning, Meta Learning, Bayesian Optimization, Reinforcement Learning

0

0

0

0

5:05

06/12/2020

Online Bayesian Goal Inference for Boundedly Rational Planning Agents

Tan Zhi-Xuan, Jordyn Mann, Tom Silver and
Josh Tenenbaum, Vikash Mansinghka

Keywords Paper

0

0

0

0

3:23

06/12/2021

Generalized Proximal Policy Optimization with Sample Reuse

James Queeney, Yannis Paschalidis, Christos G Cassandras

Keywords Paper

optimization, reinforcement learning and planning

0

0

0

0

13:45

12/07/2020

Interpreting Robust Optimization via Adversarial Influence Functions

Zhun Deng, Cynthia Dwork, Jialiang Wang, Linjun Zhang

Keywords Paper

Trustworthy Machine Learning

0

0

0

0

13:20

12/07/2020

Robust Black Box Explanations Under Distribution Shift

Himabindu Lakkaraju, Nino Arsov, Osbert Bastani

Keywords Paper

Accountability, Transparency and Interpretability

0

0

0

0

14:02

06/12/2021

Out-of-Distribution Generalization in Kernel Regression

Abdulkadir Canatar, Blake Bordelon, Cengiz Pehlevan

Keywords Paper

theory, deep learning, machine learning

0

0

0

0

15:07

06/12/2020

Model Inversion Networks for Model-Based Optimization

Aviral Kumar, Sergey Levine

Keywords Paper

Algorithms -> Semi-Supervised Learning, Applications -> Fairness, Accountability, and Transparency

0

0

0

0

3:10

02/02/2021

Integrated Optimization of Bipartite Matching and Its Stochastic Behavior: New Formulation and Approximation Algorithm via Min-cost Flow Optimization

Yuya Hikima, Yasunori Akagi, Hideaki Kim and
Masahiro Kohjima, Takeshi Kurashima, Hiroyuki Toda

Keywords Paper

0

0

0

0

17:19

06/12/2020

Normalizing Kalman Filters for Multivariate Time Series Analysis

Emmanuel de Bézenac, Syama Sundar Rangapuram, Konstantinos Benidis and
Michael Bohlke-Schneider, Richard Kurle, Lorenzo Stella, Hilaf Hasson, Patrick Gallinari, Tim Januschowski

Keywords Paper

0

0

0

0

3:19

18/07/2021

PODS: Policy Optimization via Differentiable Simulation

Miguel Angel Zamora Mora, Momchil Peychev, Sehoon Ha and
Martin Vechev, Stelian Coros

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

4:28

03/05/2021

Entropic gradient descent algorithms and wide flat minima

Fabrizio Pittorino, Carlo Lucibello, Christoph Feinauer and
Gabriele Perugini, Carlo Baldassi, Elizaveta Demyanenko, Riccardo Zecchina

Keywords Paper

flat minima, belief-propagation, statistical physics, entropic algorithms

0

0

0

0

5:38

02/02/2021

Stable Adversarial Learning under Distributional Shifts

Jiashuo Liu, Zheyan Shen, Peng Cui and
Linjun Zhou, Kun Kuang, Bo Li, Yishi Lin

Keywords Paper

0

0

0

0

14:30

18/07/2021

Sequential Domain Adaptation by Synthesizing Distributionally Robust Experts

Bahar Taskesen, Man Chung Yue, Jose Blanchet and
Daniel Kuhn, Viet Anh Nguyen

Keywords Paper

Optimization, Convex Optimization, Theory, Regularization

0

0

0

0

17:53

02/02/2021

Wasserstein Distributionally Robust Inverse Multiobjective Optimization

Chaosheng Dong, Bo Zeng

Keywords Paper

0

0

0

0

14:45

26/04/2020

Black-box Off-policy Estimation for Infinite-Horizon Reinforcement Learning

Ali Mousavi, Lihong Li, Qiang Liu, Denny Zhou

Keywords Paper

reinforcement learning, off-policy estimation, importance sampling, propensity score

0

0

0

0

5:25

06/12/2021

Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Making by Reinforcement Learning

Kai Wang, Sanket Shah, Haipeng Chen and
Andrew Perrault, Finale Doshi-Velez, Milind Tambe

Keywords Paper

deep learning, optimization, reinforcement learning and planning

0

0

0

0

14:52

06/12/2021

Efficient Online Estimation of Causal Effects by Deciding What to Observe

Shantanu Gupta, Zachary Lipton, David Childers

Keywords Paper

reinforcement learning and planning, graph learning, causality

0

0

0

0

14:18

06/12/2021

Robust Generalization despite Distribution Shift via Minimum Discriminating Information

Tobias Sutter, Andreas Krause, Daniel Kuhn

Keywords Paper

optimization, machine learning

0

0

0

0

15:05

18/07/2021

Beyond the Pareto Efficient Frontier: Constraint Active Search for Multiobjective Experimental Design

Gustavo Malkomes, Harvey Cheng, Eric Lee, Michael McCourt

Keywords Paper

Probabilistic Methods, Gaussian Processes and Bayesian non-parametrics

0

0

0

0

5:20