Generalized Proximal Policy Optimization with Sample Reuse

06/12/2021

Generalized Proximal Policy Optimization with Sample Reuse

James Queeney, Yannis Paschalidis, Christos G Cassandras

Keywords: optimization, reinforcement learning and planning

Abstract Paper Similar Papers

Abstract: In real-world decision making tasks, it is critical for data-driven reinforcement learning methods to be both stable and sample efficient. On-policy methods typically generate reliable policy improvement throughout training, while off-policy methods make more efficient use of data through sample reuse. In this work, we combine the theoretically supported stability benefits of on-policy algorithms with the sample efficiency of off-policy algorithms. We develop policy improvement guarantees that are suitable for the off-policy setting, and connect these bounds to the clipping mechanism used in Proximal Policy Optimization. This motivates an off-policy version of the popular algorithm that we call Generalized Proximal Policy Optimization with Sample Reuse. We demonstrate both theoretically and empirically that our algorithm delivers improved performance by effectively balancing the competing goals of stability and sample efficiency.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at NeurIPS 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

13/04/2021

Logistic q-learning

Joan Bas-Serrano, Sebastian Curi, Andreas Krause, Gergely Neu

Keywords Paper

0

0

0

0

2:44

19/08/2021

Variational Model-based Policy Optimization

Yinlam Chow, Brandon Cui, Moonkyung Ryu, Mohammad Ghavamzadeh

Keywords Paper

Machine Learning, Reinforcement Learning

0

0

0

0

15:31

02/02/2021

Variance Penalized On-Policy and Off-Policy Actor-Critic

Arushi Jain, Gandharv Patil, Ayush Jain and
Khimya Khetarpal, Doina Precup

Keywords Paper

0

0

0

0

17:58

18/07/2021

Representation Matters: Offline Pretraining for Sequential Decision Making

Mengjiao Yang, Ofir Nachum

Keywords Paper

Reinforcement Learning and Planning

1

0

0

0

5:06

26/08/2020

Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration

Matteo Papini, Andrea Battistello, Marcello Restelli

Keywords Paper

0

0

0

0

12:47

06/12/2020

A Unifying View of Optimism in Episodic Reinforcement Learning

Gergely Neu, Ciara Pike-Burke

Keywords Paper

0

0

0

0

3:13

18/07/2021

Offline Reinforcement Learning with Fisher Divergence Critic Regularization

Ilya Kostrikov, Rob Fergus, Jonathan Tompson, Ofir Nachum

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

4:49

06/12/2020

Adaptive Discretization for Model-Based Reinforcement Learning

Sean Sinclair, Tianyu Wang, Gauri Jain and
Sid Banerjee, Christina Yu

Keywords Paper

0

0

0

0

3:12

06/12/2021

Adversarial Intrinsic Motivation for Reinforcement Learning

Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone

Keywords Paper

reinforcement learning and planning, generative model

0

0

0

0

13:11

06/12/2021

Time-series Generation by Contrastive Imitation

Daniel Jarrett, Ioana Bica, Mihaela van der Schaar

Keywords Paper

generative model

0

0

0

0

8:47

12/07/2020

Momentum-Based Policy Gradient Methods

Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Keywords Paper

Reinforcement Learning - General

0

0

0

0

13:28

19/08/2021

Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts

Weinan Zhang, Xihuai Wang, Jian Shen, Ming Zhou

Keywords Paper

Machine Learning, Deep Reinforcement Learning, Multi-agent Learning

0

0

0

0

13:10

06/12/2021

On Effective Scheduling of Model-based Reinforcement Learning

Hang Lai, Jian Shen, Weinan Zhang and
Yimin Huang, Xing Zhang, Ruiming Tang, Yong Yu, Zhenguo Li

Keywords Paper

optimization, reinforcement learning and planning

0

0

0

0

10:28

18/07/2021

Learning Routines for Effective Off-Policy Reinforcement Learning

Edoardo Cetin, Oya Celiktutan

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

5:17

26/04/2020

Ranking Policy Gradient

Kaixiang Lin, Jiayu Zhou

Keywords Paper

Sample-efficient reinforcement learning, off-policy learning.

0

0

0

0

5:43

12/07/2020

Structured Policy Iteration for Linear Quadratic Regulator

Youngsuk Park, Ryan Rossi, Zheng Wen and
Gang Wu, Handong Zhao

Keywords Paper

Reinforcement Learning - General

0

0

0

0

16:08

19/08/2021

Average-Reward Reinforcement Learning with Trust Region Methods

Xiaoteng Ma, Xiaohang Tang, Li Xia and
Jun Yang, Qianchuan Zhao

Keywords Paper

Machine Learning, Deep Reinforcement Learning, Reinforcement Learning, Markov Decision Processes

0

0

0

0

14:41

06/12/2021

Algorithmic stability and generalization of an unsupervised feature selection algorithm

xinxing wu, Qiang Cheng

Keywords Paper

deep learning

0

0

0

0

12:41

02/02/2021

Robust Reinforcement Learning: A Case Study in Linear Quadratic Regulation

Bo Pang, Zhong-Ping Jiang

Keywords Paper

0

0

0

0

20:01

18/07/2021

Bilevel Optimization: Convergence Analysis and Enhanced Design

Kaiyi Ji, Junjie Yang, Yingbin LIANG

Keywords Paper

Optimization, Non-Convex Optimization

0

0

0

0

5:02

03/05/2021

Entropic gradient descent algorithms and wide flat minima

Fabrizio Pittorino, Carlo Lucibello, Christoph Feinauer and
Gabriele Perugini, Carlo Baldassi, Elizaveta Demyanenko, Riccardo Zecchina

Keywords Paper

flat minima, belief-propagation, statistical physics, entropic algorithms

0

0

0

0

5:38

18/07/2021

Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality

Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin LIANG

Keywords Paper

Theory, RL, Decisions and Control Theory

0

0

0

0

4:23

06/12/2021

Risk-Aware Transfer in Reinforcement Learning using Successor Features

Michael Gimelfarb, Andre Barreto, Scott Sanner, Chi-Guhn Lee

Keywords Paper

reinforcement learning and planning, representation learning, transfer learning

0

0

0

0

12:06

12/07/2020

Sequential Transfer in Reinforcement Learning with a Generative Model

Andrea Tirinzoni, Riccardo Poiani, Marcello Restelli

Keywords Paper

Reinforcement Learning - General

0

0

0

0

10:54

12/07/2020

Optimization and Analysis of the pAp@k Metric for Recommender Systems

Gaurush Hiranandani, Warut Vijitbenjaronk, Sanmi Koyejo, Prateek Jain

Keywords Paper

Learning Theory

0

0

0

0

16:11

12/07/2020

Extrapolation for Large-batch Training in Deep Learning

Tao LIN, Lingjing Kong, Sebastian Stich, Martin Jaggi

Keywords Paper

Deep Learning - Algorithms

0

0

0

0

13:21

06/12/2021

Stability and Generalization of Bilevel Programming in Hyperparameter Optimization

Fan Bao, Guoqiang Wu, Chongxuan LI and
Jun Zhu, Bo Zhang

Keywords Paper

optimization

0

0

0

0

8:58

06/12/2021

Explicable Reward Design for Reinforcement Learning Agents

Rati Devidze, Goran Radanovic, Parameswaran Kamalaruban, Adish Singla

Keywords Paper

optimization, reinforcement learning and planning, interpretability

0

0

0

0

4:10

13/04/2021

Learning prediction intervals for regression: Generalization and calibration

Haoxian Chen, Ziyi Huang, Henry Lam and
Huajie Qian, Haofeng Zhang

Keywords Paper

0

0

0

0

3:26

03/05/2021

Learning Value Functions in Deep Policy Gradients using Residual Variance

Yannis Flet-Berliac, reda ouhamma, odalric-ambrym maillard, philippe preux

Keywords Paper

0

0

0

0

4:49

18/07/2021

Phasic Policy Gradient

Karl Cobbe, Jacob Hilton, Oleg Klimov, John Schulman

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

20:40

26/04/2020

Population-Guided Parallel Policy Search for Reinforcement Learning

Whiyoung Jung, Giseung Park, Youngchul Sung

Keywords Paper

Reinforcement Learning, Parallel Learning, Population Based Learning

0

0

0

0

5:01

06/12/2020

A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions

Wei Deng, Guang Lin, Faming Liang

Keywords Paper

0

0

0

0

3:26

13/04/2021

Adversarially robust estimate and risk analysis in linear regression

Yue Xing, Ruizhi Zhang, Guang Cheng

Keywords Paper

0

0

0

0

3:03

26/04/2020

Implementation Matters in Deep RL: A Case Study on PPO and TRPO

Logan Engstrom, Andrew Ilyas, Shibani Santurkar and
Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry

Keywords Paper

deep policy gradient methods, deep reinforcement learning, trpo, ppo

0

0

0

0

20:41

18/07/2021

Taylor Expansion of Discount Factors

Yunhao Tang, Mark Rowland, Remi Munos, Michal Valko

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

5:22

03/05/2021

Batch Reinforcement Learning Through Continuation Method

Yijie Guo, Shengyu Feng, Nicolas Le Roux and
Ed H. Chi, Honglak Lee, Minmin Chen

Keywords Paper

batch reinforcement learning, relaxed regularization, continuation method

1

0

0

0

5:34

06/12/2020

Off-Policy Imitation Learning from Observations

Zhuangdi Zhu, Kaixiang Lin, Bo Dai, Jiayu Zhou

Keywords Paper

0

0

0

1

3:24

18/07/2021

Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies

Jimmy Yang, Justinian Rosca, Karthik Narasimhan, Peter Ramadge

Keywords Paper

Algorithms, Adversarial Learning, Applications, Computer Vision; Deep Learning, Adversarial Networks; Deep Learning, Generative Models, Reinforcement Learning and Planning, Deep RL

0

0

0

0

5:20

06/12/2021

Towards Robust Bisimulation Metric Learning

Mete Kemertas, Tristan Aumentado-Armstrong

Keywords Paper

reinforcement learning and planning, robustness, representation learning

0

0

0

0

12:24