12/07/2020

Momentum-Based Policy Gradient Methods

Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Keywords: Reinforcement Learning - General

Abstract: Policy gradient methods are a class of powerful algorithms in reinforcement learning (RL). Recently, several variance-reduced policy gradient methods have been developed to improve sample efficiency, achieving a near-optimal sample complexity of $O(\epsilon^{-3})$ for finding an $\epsilon$-stationary point of a non-concave performance function in model-free RL. However, the practical performance of these variance-reduced policy gradient methods does not match their near-optimal sample complexity, because they require large batch sizes and restrictive learning rates to attain it. In this paper, we therefore propose a class of efficient momentum-based policy gradient methods that use adaptive learning rates and do not require large batches. Specifically, we propose a fast importance-sampling momentum-based policy gradient (IS-MBPG) method based on the importance sampling technique. We also propose a fast Hessian-aided momentum-based policy gradient (HA-MBPG) method using semi-Hessian information. In our theoretical analysis, we prove that both algorithms attain the sample complexity of $O(\epsilon^{-3})$, matching the best existing policy gradient methods. In experiments, we use benchmark tasks to demonstrate the effectiveness of our algorithms.
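
The core idea the abstract describes, a momentum-based (STORM-style) variance-reduced gradient estimator in which an importance weight lets a single fresh trajectory also correct the gradient at the previous parameters, can be sketched compactly. Below is a minimal Python sketch on a toy two-armed bandit standing in for the trajectory-level RL setting; the bandit, the constants k, m, c, and the step-size schedule eta_t = k/(m+t)^(1/3) are illustrative assumptions, not the authors' exact algorithm or tuning.

import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in: 2-armed bandit with a softmax policy (assumption for illustration).
true_rewards = np.array([1.0, 0.2])

def softmax(theta):
    z = np.exp(theta - theta.max())
    return z / z.sum()

def grad_log_pi(theta, a):
    # Gradient of log softmax(theta)[a] with respect to theta.
    p = softmax(theta)
    g = -p
    g[a] += 1.0
    return g

def pg_estimate(theta, a, r):
    # Single-sample REINFORCE-style policy gradient estimate.
    return r * grad_log_pi(theta, a)

theta = np.zeros(2)
theta_prev = theta.copy()
u = np.zeros_like(theta)
k, m, c = 0.5, 4.0, 2.0  # illustrative step-size / momentum constants

for t in range(1, 2001):
    eta = k / (m + t) ** (1.0 / 3.0)  # adaptive step size, no large batches needed
    beta = min(1.0, c * eta ** 2)     # momentum weight shrinks over time

    # Sample one action (one "trajectory") from the *current* policy.
    p = softmax(theta)
    a = rng.choice(2, p=p)
    r = true_rewards[a] + 0.1 * rng.standard_normal()

    g_cur = pg_estimate(theta, a, r)
    # Importance weight: the sample came from the current policy, so reweight
    # the previous-parameter gradient to keep its correction term unbiased.
    w = softmax(theta_prev)[a] / p[a]
    g_prev = pg_estimate(theta_prev, a, r)

    if t == 1:
        u = g_cur
    else:
        u = g_cur + (1.0 - beta) * (u - w * g_prev)  # STORM-style momentum update

    theta_prev = theta.copy()
    theta = theta + eta * u  # gradient ascent on the performance function

print("learned policy:", softmax(theta))  # probability mass should concentrate on arm 0

The HA-MBPG variant replaces the importance weight with semi-Hessian (Hessian-vector product) information to build the same style of correction term; the momentum recursion and adaptive step size are unchanged.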

The talk and paper were presented at the ICML 2020 virtual conference.

