Iterative Amortized Policy Optimization

06/12/2021

Joseph Marino, Alexandre Piche, Alessandro Davide Ialongo, Yisong Yue

Keywords: optimization, reinforcement learning and planning, generative model

Abstract: Policy networks are a central feature of deep reinforcement learning (RL) algorithms for continuous control, enabling the estimation and sampling of high-value actions. From the variational inference perspective on RL, policy networks, when used with entropy or KL regularization, are a form of amortized optimization, optimizing network parameters rather than the policy distributions directly. However, direct amortized mappings can yield suboptimal policy estimates and restricted distributions, limiting performance and exploration. Given this perspective, we consider the more flexible class of iterative amortized optimizers. We demonstrate that the resulting technique, iterative amortized policy optimization, yields performance improvements over direct amortization on benchmark continuous control tasks.
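The contrast the abstract draws between direct and iterative amortization can be illustrated with a toy sketch. This is not the authors' implementation: the quadratic critic, the entropy weight, the under-fit linear "policy network", and the fixed-step update are all assumptions chosen so the objective has a closed form. In an iterative amortized optimizer the update would come from a learned network that sees the current estimate and its gradient; a plain gradient step stands in for that network here.

```python
import numpy as np

ALPHA = 0.2  # entropy weight (assumed value for this toy example)


def objective(state, mean, log_std, alpha=ALPHA):
    """Entropy-regularized objective of a Gaussian policy N(mean, std^2)
    under the hypothetical critic Q(s, a) = -(a - 2s)^2."""
    std = np.exp(log_std)
    # E[(a - c)^2] = (mean - c)^2 + std^2 for a ~ N(mean, std^2)
    expected_q = -(mean - 2.0 * state) ** 2 - std ** 2
    entropy = log_std + 0.5 * np.log(2.0 * np.pi * np.e)
    return expected_q + alpha * entropy


def objective_grad(state, mean, log_std, alpha=ALPHA):
    """Closed-form gradient of the objective w.r.t. (mean, log_std)."""
    std = np.exp(log_std)
    return -2.0 * (mean - 2.0 * state), -2.0 * std ** 2 + alpha


def direct_policy(state):
    """Direct amortization: a single fixed mapping from state to parameters.
    The slope 1.5 (optimum is 2.0) mimics an imperfect policy network."""
    return 1.5 * state, 0.0


def iterative_policy(state, num_steps=5, step_size=0.1):
    """Iterative refinement: start from the direct estimate and repeatedly
    update it using the objective gradient at the current estimate."""
    mean, log_std = direct_policy(state)
    for _ in range(num_steps):
        g_mean, g_log_std = objective_grad(state, mean, log_std)
        # Stand-in for a learned update network (assumption).
        mean += step_size * g_mean
        log_std += step_size * g_log_std
    return mean, log_std


state = 1.0
j_direct = objective(state, *direct_policy(state))
j_iterative = objective(state, *iterative_policy(state))
print(f"direct: {j_direct:.3f}, iterative: {j_iterative:.3f}")
```

In this toy setting the refined parameters score higher on the objective than the one-shot estimate, which is the kind of gap between direct and iterative amortization the abstract refers to.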

The talk and the accompanying paper were published at the NeurIPS 2021 virtual conference.

Similar Papers

06/12/2020

Direct Policy Gradients: Direct Optimization of Policies in Discrete Action Spaces

Guy Lorberbom, Chris J. Maddison, Nicolas Heess, Tamir Hazan, Daniel Tarlow

3:16

06/12/2020

Variational Policy Gradient Method for Reinforcement Learning with General Utilities

Junyu Zhang, Alec Koppel, Amrit Bedi, Csaba Szepesvari, Mengdi Wang

3:20

06/12/2021

Risk-Aware Transfer in Reinforcement Learning using Successor Features

Michael Gimelfarb, Andre Barreto, Scott Sanner, Chi-Guhn Lee

Keywords: reinforcement learning and planning, representation learning, transfer learning

12:06

18/07/2021

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

Ilya Kostrikov, Rob Fergus, Jonathan Tompson, Ofir Nachum

Keywords: Reinforcement Learning and Planning, Deep RL

4:49

18/07/2021

Discovering symbolic policies with deep reinforcement learning

Mikel Landajuela Larma, Brenden Petersen, Sookyung Kim, Claudio Santiago, Ruben Glatt, Nathan Mundhenk, Jacob Pettit, Daniel Faissol

Keywords: Reinforcement Learning and Planning, Deep RL

4:55

06/12/2021

Stochastic Multi-Armed Bandits with Control Variates

Arun Verma, Manjesh Kumar Hanawal

Keywords: theory, bandits

14:02

18/07/2021

Deep Coherent Exploration for Continuous Control

Yijie Zhang, Herke van Hoof

Keywords: Reinforcement Learning and Planning

5:14

19/08/2021

Variational Model-based Policy Optimization

Yinlam Chow, Brandon Cui, Moonkyung Ryu, Mohammad Ghavamzadeh

Keywords: Machine Learning, Reinforcement Learning

15:31

02/02/2021

A New Bounding Scheme for Influence Diagrams

Radu Marinescu, Junkyu Lee, Rina Dechter

16:40

06/12/2021

Local policy search with Bayesian optimization

Sarah Müller, Alexander von Rohr, Sebastian Trimpe

Keywords: theory, optimization, reinforcement learning and planning, active learning

11:42

26/08/2020

Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration

Matteo Papini, Andrea Battistello, Marcello Restelli

12:47

12/07/2020

Structured Policy Iteration for Linear Quadratic Regulator

Youngsuk Park, Ryan Rossi, Zheng Wen, Gang Wu, Handong Zhao

Keywords: Reinforcement Learning - General

16:08

26/08/2020

Discrete Action On-Policy Learning with Action-Value Critic

Yuguang Yue, Yunhao Tang, Mingzhang Yin, Mingyuan Zhou

14:23

19/08/2021

Monte Carlo Filtering Objectives

Shuangshuang Chen, Sihao Ding, Yiannis Karayiannidis, Mårten Björkman

Keywords: Machine Learning, Learning Generative Models, Time-series; Data Streams, Unsupervised Learning, Approximate Probabilistic Inference

13:39

18/07/2021

Provably Correct Optimization and Exploration with Non-linear Policies

Fei Feng, Wotao Yin, Alekh Agarwal, Lin Yang

Keywords: Deep Learning, Adversarial Networks, Applications, Fairness, Accountability, and Transparency, Theory, RL, Decisions and Control Theory

5:03

03/05/2021

Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime

Andrea Agazzi, Jianfeng Lu

Keywords: policy gradient, mean-field dynamics, entropy regularization, neural networks

5:09

18/07/2021

Representation Matters: Offline Pretraining for Sequential Decision Making

Mengjiao Yang, Ofir Nachum

Keywords: Reinforcement Learning and Planning

5:06

06/12/2021

On the Convergence and Sample Efficiency of Variance-Reduced Policy Gradient Method

Junyu Zhang, Chengzhuo Ni, Zheng Yu, Csaba Szepesvari, Mengdi Wang

Keywords: theory, reinforcement learning and planning

14:49

06/12/2020

Deep Rao-Blackwellised Particle Filters for Time Series Forecasting

Richard Kurle, Syama Sundar Rangapuram, Emmanuel de Bézenac, Stephan Günnemann, Jan Gasthaus

3:14

16/11/2020

Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity

Tanmay Gangwani, Jian Peng, Yuan Zhou

4:27

12/07/2020

Learning to Score Behaviors for Guided Policy Optimization

Aldo Pacchiano, Jack Parker-Holder, Yunhao Tang, Krzysztof Choromanski, Anna Choromanska, Michael Jordan

Keywords: Reinforcement Learning - General

14:10

06/12/2021

Towards Robust Bisimulation Metric Learning

Mete Kemertas, Tristan Aumentado-Armstrong

Keywords: reinforcement learning and planning, robustness, representation learning

12:24

06/12/2020

Distributionally Robust Federated Averaging

Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi

3:11

06/12/2020

Promoting Stochasticity for Expressive Policies via a Simple and Efficient Regularization Method

Qi Zhou, Yufei Kuang, Zherui Qiu, Houqiang Li, Jie Wang

3:10

02/02/2021

Progression Heuristics for Planning with Probabilistic LTL Constraints

Ian Mallett, Sylvie Thiebaux, Felipe Trevizan

18:23

26/08/2020

Variance Reduction for Evolution Strategies via Structured Control Variates

Yunhao Tang, Krzysztof Choromanski, Alp Kucukelbir

13:37

19/08/2021

Hindsight Value Function for Variance Reduction in Stochastic Dynamic Environment

Jiaming Guo, Rui Zhang, Xishan Zhang, Shaohui Peng, Qi Yi, Zidong Du, Xing Hu, Qi Guo, Yunji Chen

Keywords: Machine Learning, Deep Learning, Deep Reinforcement Learning, Sequential Decision Making

14:36

12/07/2020

Uniform Convergence of Rank-weighted Learning

Liu Leqi, Justin Khim, Adarsh Prasad, Pradeep Ravikumar

Keywords: Learning Theory

13:21

12/07/2020

Momentum-Based Policy Gradient Methods

Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Keywords: Reinforcement Learning - General

13:28

06/12/2021

Generalized Proximal Policy Optimization with Sample Reuse

James Queeney, Yannis Paschalidis, Christos G Cassandras

Keywords: optimization, reinforcement learning and planning

13:45

06/12/2021

Multi-Agent Reinforcement Learning in Stochastic Networked Systems

Yiheng Lin, Guannan Qu, Longbo Huang, Adam Wierman

Keywords: reinforcement learning and planning, graph learning

11:20

06/12/2021

Control Variates for Slate Off-Policy Evaluation

Nikos Vlassis, Ashok Chandrashekar, Fernando Amat, Nathan Kallus

Keywords: optimization, bandits

12:25

18/07/2021

Better Training using Weight-Constrained Stochastic Dynamics

Benedict Leimkuhler, Tiffany Vlaar, Timothée Pouchon, Amos Storkey

Keywords: Deep Learning, Bayesian Deep Learning

5:14

06/12/2021

Sampling with Trustworthy Constraints: A Variational Gradient Framework

Xingchao Liu, Xin Tong, Qiang Liu

Keywords: optimization, machine learning, fairness, interpretability

11:21

18/07/2021

A Regret Minimization Approach to Iterative Learning Control

Naman Agarwal, Elad Hazan, Anirudha Majumdar, Karan Singh

Keywords: Reinforcement Learning and Planning, Planning and Control

5:13

26/04/2020

Implementation Matters in Deep RL: A Case Study on PPO and TRPO

Logan Engstrom, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry

Keywords: deep policy gradient methods, deep reinforcement learning, trpo, ppo

20:41

18/07/2021

Active Feature Acquisition with Generative Surrogate Models

Yang Li, Junier Oliva

Keywords: Deep Learning, Generative Models, Applications, Computational Biology and Bioinformatics, Reinforcement Learning and Planning, Deep RL

5:44

16/11/2020

Safe Policy Learning for Continuous Control

Yinlam Chow, Ofir Nachum, Aleksandra Faust, Edgar Dueñez-Guzman, Mohammad Ghavamzadeh

5:20

03/08/2020

Neural Likelihoods via Cumulative Distribution Functions

Pawel Chilinski, Ricardo Silva

8:07

06/12/2020

Approximation Based Variance Reduction for Reparameterization Gradients

Tomas Geffner, Justin Domke

3:11