Population-Guided Parallel Policy Search for Reinforcement Learning

Abstract: In this paper, a new population-guided parallel learning scheme is proposed to enhance the performance of off-policy reinforcement learning (RL). In the proposed scheme, multiple identical learners, each with its own value function and policy, share a common experience replay buffer and collaboratively search for a good policy, guided by information from the best policy. The key point is that the information of the best policy is fused in a soft manner by constructing an augmented loss function for the policy update, which enlarges the overall search region covered by the multiple learners. The guidance by the previous best policy and the enlarged search range enable faster and better policy search, and monotone improvement of the expected cumulative return under the proposed scheme is proved theoretically. Working algorithms are constructed by applying the proposed scheme to the twin delayed deep deterministic (TD3) policy gradient algorithm, and numerical results show that the resulting P3S-TD3 outperforms most of the current state-of-the-art RL algorithms, with a significant gain in sparse-reward environments.
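
The soft fusion of best-policy information described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the augmented loss, so the squared-distance guidance term, the weight `beta`, the linear toy policies, and the names `select_best_learner`, `augmented_policy_loss`, and `toy_critic` are all assumptions made here for illustration, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def select_best_learner(recent_returns):
    """Return the index of the learner with the highest recent return."""
    return int(np.argmax(recent_returns))

def augmented_policy_loss(q_values, actions, best_actions, beta):
    """Hypothetical augmented loss for one learner.

    The first term is the usual deterministic policy objective (maximize Q).
    The second term (an assumed form) softly pulls the learner's actions
    toward the actions of the current best policy, weighted by beta.
    """
    q_term = -float(np.mean(q_values))
    sq_dist = np.sum((actions - best_actions) ** 2, axis=1)
    guidance = 0.5 * float(np.mean(sq_dist))
    return q_term + beta * guidance

def toy_critic(states, actions):
    """Stand-in critic for the demo: prefers small actions."""
    return -np.sum(actions ** 2, axis=1)

# A minibatch of states drawn from the shared replay buffer (random here).
states = rng.normal(size=(32, 4))

# Three learners with toy linear policies: action = state @ W.
policies = [rng.normal(size=(4, 2)) for _ in range(3)]
recent_returns = [10.0, 25.0, 17.5]

best = select_best_learner(recent_returns)  # index 1 for these returns
best_actions = states @ policies[best]

losses = []
for i, W in enumerate(policies):
    actions = states @ W
    # The best learner is not pulled toward itself.
    beta = 0.0 if i == best else 0.5
    q_values = toy_critic(states, actions)
    losses.append(augmented_policy_loss(q_values, actions, best_actions, beta))
```

Because the guidance term is a penalty rather than a hard copy of the best policy's parameters, each learner can keep exploring its own region while staying loosely tied to the best one; how `beta` is chosen or adapted is left open in this sketch.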

18/07/2021

Neuroscience and Cognitive Science, Neuroscience, Reinforcement Learning and Planning, Algorithms, Representation Learning; Algorithms, Sparse Coding and Dimensionality Expansion; Applications, Matrix and Ten

06/12/2020

Whiyoung Jung, Giseung Park, Youngchul Sung

Similar Papers

Representation Matters: Offline Pretraining for Sequential Decision Making

Mengjiao Yang, Ofir Nachum

Reinforcement Learning and Planning

Adversarial Intrinsic Motivation for Reinforcement Learning

Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone

reinforcement learning and planning, generative model

Provably Efficient Algorithms for Multi-Objective Competitive RL

Tiancheng Yu, Yi Tian, Jingzhao Zhang, Suvrit Sra

Theory, RL, Decisions and Control Theory

Cross-Gradient Aggregation for Decentralized Learning from Non-IID Data

Yasaman Esfandiari, Sin Yong Tan, Zhanhong Jiang, Aditya Balu, Ethan Herron, Chinmay Hegde, Soumik Sarkar

Optimization, Distributed and Parallel Optimization

Logistic Q-Learning

Joan Bas-Serrano, Sebastian Curi, Andreas Krause, Gergely Neu

Bayesian Reinforcement Learning via Deep, Sparse Sampling

Divya Grover, Debabrota Basu, Christos Dimitrakakis

Finite-Time Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation

Jun Sun, Gang Wang, Georgios B. Giannakis, Qinmin Yang, Zaiyue Yang

Variational Empowerment as Representation Learning for Goal-Conditioned Reinforcement Learning

Jongwook Choi, Archit Sharma, Honglak Lee, Sergey Levine, Shixiang Gu

Neuroscience and Cognitive Science, Neuroscience, Reinforcement Learning and Planning, Algorithms, Representation Learning; Algorithms, Sparse Coding and Dimensionality Expansion; Applications, Matrix and Ten

Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions

Matthew Faw, Rajat Sen, Karthikeyan Shanmugam, Constantine Caramanis, Sanjay Shakkottai

C-Learning: Horizon-Aware Cumulative Accessibility Estimation

Panteha Naderian, Gabriel Loaiza-Ganem, Harry Braviner, Anthony Caterini, Jesse C Cresswell, Tong Li, Animesh Garg

reinforcement learning, goal reaching, Q-learning

Hindsight Trust Region Policy Optimization

Hanbo Zhang, Site Bai, Xuguang Lan, David Hsu, Nanning Zheng

Machine Learning, Deep Reinforcement Learning, Reinforcement Learning

A Hybrid Stochastic Policy Gradient Algorithm for Reinforcement Learning

Nhan Pham, Lam Nguyen, Dzung Phan, Phuong Ha Nguyen, Marten van Dijk, Quoc Tran-Dinh

Adaptive Discretization for Model-Based Reinforcement Learning

Sean Sinclair, Tianyu Wang, Gauri Jain, Sid Banerjee, Christina Yu

Improved Corruption Robust Algorithms for Episodic Reinforcement Learning

Yifang Chen, Simon Du, Kevin Jamieson

Optimization, Non-Convex Optimization, Theory, Online Learning Theory

A state aggregation approach for solving knapsack problem with deep reinforcement learning

Reza Refaei Afshar, Yingqian Zhang, Murat Firat, Uzay Kaymak

Sequential Transfer in Reinforcement Learning with a Generative Model

Andrea Tirinzoni, Riccardo Poiani, Marcello Restelli

Reinforcement Learning - General

Reinforcement Learning Based Multi-Agent Resilient Control: From Deep Neural Networks to an Adaptive Law

Jian Hou, Fangyuan Wang, Lili Wang, Zhiyong Chen

Generalized Proximal Policy Optimization with Sample Reuse

James Queeney, Yannis Paschalidis, Christos G Cassandras

optimization, reinforcement learning and planning

Bilevel Optimization: Convergence Analysis and Enhanced Design

Kaiyi Ji, Junjie Yang, Yingbin Liang

Optimization, Non-Convex Optimization

Implementation Matters in Deep RL: A Case Study on PPO and TRPO

Logan Engstrom, Andrew Ilyas, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry

deep policy gradient methods, deep reinforcement learning, trpo, ppo

Modeling and Optimization Trade-off in Meta-learning

Katelyn Gao, Ozan Sener

A Minimalist Approach to Offline Reinforcement Learning

