Lookahead-Bounded Q-learning

Abstract: We introduce the lookahead-bounded Q-learning (LBQL) algorithm, a new, provably convergent variant of Q-learning that seeks to improve the performance of standard Q-learning in stochastic environments through the use of “lookahead” upper and lower bounds. To do this, LBQL employs previously collected experience and each iteration’s state-action values as dual feasible penalties to construct a sequence of sampled information relaxation problems. The solutions to these problems provide estimated upper and lower bounds on the optimal value, which we track via stochastic approximation. These quantities are then used to constrain the iterates to stay within the bounds at every iteration. Numerical experiments confirm the fast convergence of LBQL as compared to the standard Q-learning algorithm and several related techniques.
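The bounding step described in the abstract can be illustrated with a minimal tabular sketch. It assumes the lower and upper bounds are already available as arrays; in the paper they come from sampled information relaxation problems, which are not reproduced here. The names `bounded_q_update`, `track_bound`, `lower`, `upper`, `alpha`, `beta`, and `gamma` are hypothetical and chosen for illustration only. This is not the authors' implementation.

```python
import numpy as np


def bounded_q_update(Q, s, a, r, s_next, lower, upper, alpha=0.1, gamma=0.95):
    """One standard Q-learning step, then clip the iterate to [lower, upper].

    The bounds are assumed to be given; this sketch does not compute them.
    """
    target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])
    # Constrain the updated value to stay within the current bound estimates.
    Q[s, a] = np.clip(Q[s, a], lower[s, a], upper[s, a])
    return Q


def track_bound(old_estimate, sampled_value, beta=0.1):
    """Stochastic-approximation style smoothing of a noisy bound estimate."""
    return (1.0 - beta) * old_estimate + beta * sampled_value


# Toy example with placeholder bounds (assumed values, not from the paper).
n_states, n_actions = 3, 2
Q = np.zeros((n_states, n_actions))
lower = np.full((n_states, n_actions), -1.0)
upper = np.full((n_states, n_actions), 5.0)

# A large reward would push Q[0, 1] to 10.0; the upper bound caps it at 5.0.
Q = bounded_q_update(Q, s=0, a=1, r=100.0, s_next=2, lower=lower, upper=upper)
print(Q[0, 1])  # 5.0
```

In this sketch the clipping only changes the iterate when the plain Q-learning update would leave the interval, so an update that already lies inside the bounds is left untouched.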

06/12/2020

Ibrahim El Shar, Daniel Jiang

Similar Papers

A new convergent variant of Q-learning with linear function approximation

Diogo Carvalho, Francisco S. Melo, Pedro A. Santos

Bilevel Optimization: Convergence Analysis and Enhanced Design

Kaiyi Ji, Junjie Yang, Yingbin Liang

Optimization, Non-Convex Optimization

Harmonized Dense Knowledge Distillation Training for Multi-Exit Architectures

Xinglu Wang, Yingming Li

Revisiting the Sample Complexity of Sparse Spectrum Approximation of Gaussian Processes

Minh Hoang, Nghia Hoang, Hai Pham, David Woodruff

Deep Learning

A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions

Wei Deng, Guang Lin, Faming Liang

Efficient Generalization with Distributionally Robust Learning

Soumyadip Ghosh, Mark Squillante, Ebisa Wollega

optimization, machine learning

Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation

Si Yi Meng, Sharan Vaswani, Issam Hadj Laradji, Mark Schmidt, Simon Lacoste-Julien

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms

Philip Amortila, Doina Precup, Prakash Panangaden, Marc G. Bellemare

Zap Q-Learning With Nonlinear Function Approximation

Shuhang Chen, Adithya M Devraj, Fan Lu, Ana Busic, Sean Meyn

Learning to Optimize Non-Rigid Tracking

Yang Li, Aljaž Božič, Tianwei Zhang, Yanli Ji, Tatsuya Harada, Matthias Nießner

non-rigid tracking, learnable optimization, differentiable solver, non-rigid icp, gauss newton, pcg, preconditioning, non-linear optimization, 4d perception, deep learning

Minibatch Stochastic Approximate Proximal Point Methods

Hilal Asi, Karan Chadha, Gary Cheng, John Duchi

On Convergence of Gradient Expected Sarsa(λ)

Long Yang, Gang Zheng, Yu Zhang, Qian Zheng, Pengfei Li, Gang Pan

Revisiting Stochastic Extragradient

Konstantin Mishchenko, Dmitry Kovalev, Egor Shulgin, Peter Richtarik, Yura Malitsky

Damped Anderson Mixing for Deep Reinforcement Learning: Acceleration, Convergence, and Stabilization

Ke Sun, Yafei Wang, Yi Liu, Yingnan Zhao, Bo Pan, Shangling Jui, Bei Jiang, Linglong Kong

reinforcement learning and planning

Logistic Q-learning

Joan Bas-Serrano, Sebastian Curi, Andreas Krause, Gergely Neu

Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning

Gen Li, Changxiao Cai, Yuxin Chen, Yuantao Gu, Yuting Wei, Yuejie Chi

Reinforcement Learning and Planning

On the Convergence Theory of Debiased Model-Agnostic Meta-Reinforcement Learning

Alireza Fallah, Kristian Georgiev, Aryan Mokhtari, Asuman Ozdaglar

theory, optimization, reinforcement learning and planning, meta learning

Adaptive Gradient Quantization for Data-Parallel SGD

Fartash Faghri, Iman Tabrizian, Ilia Markov, Dan Alistarh, Dan Roy, Ali Ramezani-Kebrya

A simpler approach to accelerated optimization: iterative averaging meets optimism

Pooria Joulani, Anant Raj, András György, Csaba Szepesvari

Online Learning, Active Learning, and Bandits

Constrained Robust Submodular Partitioning

Shengjie Wang, Tianyi Zhou, Chandrashekhar Lavania, Jeff A Bilmes

optimization, machine learning

On the Convergence Theory of Gradient-Based Model-Agnostic Meta-Learning Algorithms

Alireza Fallah, Aryan Mokhtari, Asuman Ozdaglar

Gaussian Process Bandit Optimization of the Thermodynamic Variational Objective

Vu Nguyen, Vaden Masrani, Rob Brekelmans, Michael A Osborne, Frank Wood

Accelerating SGD with momentum for over-parameterized learning

Thomas Parnell, Andreea Anghel, Małgorzata Łazuka, Nikolas Ioannou, Sebastian Kurella, Peshal Agarwal, Nikolaos Papandreou, Haris Pozidis

Simon Du, Sham Kakade, Jason Lee, Shachar Lovett, Gaurav Mahajan, Wen Sun, Ruosong Wang

Mingrui Zhang, Zebang Shen, Aryan Mokhtari, Hamed Hassani, Amin Karbasi

André Martins, António Farinhas, Marcos Treviso, Vlad Niculae, Pedro Aguiar, Mario Figueiredo
