02/02/2021

On Convergence of Gradient Expected Sarsa(λ)

Long Yang, Gang Zheng, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan


Abstract: We study the convergence of Expected Sarsa(λ) with function approximation. We show that Expected Sarsa(λ) with off-line estimation (multi-step bootstrapping) is unstable for off-policy learning. Based on the convex-concave saddle-point framework, we then propose a convergent Gradient Expected Sarsa(λ) (GES(λ)) algorithm. Our theoretical analysis shows that GES(λ) converges to the optimal solution at a linear rate in the true-gradient setting. Furthermore, we develop a Lyapunov-function technique to investigate how the step-size influences the finite-time performance of GES(λ); this technique can potentially be generalized to other gradient temporal-difference algorithms. Finally, our experiments verify the effectiveness of GES(λ). For details of the proofs, please refer to https://arxiv.org/pdf/2012.07199.pdf.
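To make the saddle-point idea concrete, below is a minimal sketch of a primal-dual temporal-difference update with eligibility traces, in the spirit of the GES(λ) described in the abstract. The class name SaddlePointGES, the step-sizes alpha/beta, and the exact update rules are illustrative assumptions drawn from the standard GTD(λ) template, not the authors' published algorithm (see the arXiv paper for the precise updates).

```python
import numpy as np

# Illustrative primal-dual (saddle-point) TD update with eligibility traces,
# in the spirit of GES(lambda). The class name, step-sizes, and exact update
# rules are assumptions based on the standard GTD(lambda) template, not the
# authors' published algorithm.
class SaddlePointGES:
    def __init__(self, n_features, alpha=0.01, beta=0.02, gamma=0.99, lam=0.8):
        self.theta = np.zeros(n_features)  # primal weights: linear value estimate
        self.omega = np.zeros(n_features)  # dual weights: TD-error correction
        self.e = np.zeros(n_features)      # eligibility trace
        self.alpha, self.beta = alpha, beta
        self.gamma, self.lam = gamma, lam

    def update(self, phi, phi_next, reward, rho=1.0):
        """One transition. phi: current state-action features;
        phi_next: Expected-Sarsa features, i.e. the next-state features
        averaged over the target policy's action distribution;
        rho: importance-sampling ratio for off-policy learning."""
        # Accumulate the off-policy eligibility trace.
        self.e = rho * (self.gamma * self.lam * self.e + phi)
        # Expected Sarsa TD error.
        delta = reward + self.gamma * phi_next @ self.theta - phi @ self.theta
        # Descent on the primal variable, with a gradient-correction term
        # that keeps the update stable under off-policy sampling.
        self.theta += self.alpha * (
            delta * self.e
            - self.gamma * (1.0 - self.lam) * (self.e @ self.omega) * phi_next
        )
        # Ascent on the dual variable toward the projected TD error.
        self.omega += self.beta * (delta * self.e - (phi @ self.omega) * phi)
```

Running the primal step-size slower than the dual step-size (a two-timescale schedule) is how such saddle-point TD methods are usually analyzed; the paper's finite-time bounds make the step-size dependence precise.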

Video of the talk: https://slideslive.com/38948258
The talk and the paper were presented at the AAAI 2021 virtual conference.

