06/12/2020

A new convergent variant of Q-learning with linear function approximation

Diogo Carvalho, Francisco S. Melo, Pedro A. Santos

Abstract: In this work, we identify a novel set of conditions that ensure convergence with probability 1 of Q-learning with linear function approximation, by proposing a two time-scale variation thereof. On the faster time scale, the algorithm features an update similar to that of DQN, where the impact of bootstrapping is attenuated by using a Q-value estimate akin to that of the target network in DQN. The slower time scale, in turn, can be seen as a modified target-network update. We establish the convergence of our algorithm, provide an error bound, and discuss our results in light of existing convergence results on reinforcement learning with function approximation. Finally, we illustrate the convergent behavior of our method in domains where standard Q-learning has previously been shown to diverge.
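
The two time-scale structure described in the abstract — a fast, DQN-style update that bootstraps from a slowly tracked second parameter vector, plus a slow update that plays the role of a target network — can be sketched roughly as follows. This is a minimal illustration only, not the paper's algorithm: the feature map `phi`, the step sizes `alpha` and `beta`, and the exact form of the slow update are assumptions for exposition; the precise update rules, step-size conditions, and convergence guarantees are those established in the paper.

```python
import numpy as np

def two_timescale_q_update(w, w_target, phi, s, a, r, s_next, actions,
                           gamma=0.99, alpha=0.1, beta=0.01):
    """One illustrative two time-scale update with linear features.

    w        -- fast ("online") weight vector
    w_target -- slow weight vector, analogous to DQN's target network
    phi      -- feature map: phi(state, action) -> np.ndarray
    alpha    -- fast time-scale step size (much larger than beta)
    beta     -- slow time-scale step size

    Names and the exact updates are assumptions for illustration only.
    """
    feats = phi(s, a)
    # Fast time scale: a TD-style update that bootstraps from the slow
    # (target-like) parameters, attenuating the effect of bootstrapping.
    q_next = max(phi(s_next, b) @ w_target for b in actions)
    td_error = r + gamma * q_next - feats @ w
    w = w + alpha * td_error * feats
    # Slow time scale: the target parameters slowly track the online ones,
    # akin to a (modified) target-network update.
    w_target = w_target + beta * (w - w_target)
    return w, w_target
```

Because beta is much smaller than alpha, the bootstrapping target changes quasi-statically relative to the online weights, which is the mechanism the abstract credits for restoring convergence where standard Q-learning with linear function approximation can diverge.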

Talk and paper published at the NeurIPS 2020 virtual conference.
