On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game

Abstract: To achieve sample efficiency in reinforcement learning (RL), it necessitates to efficiently explore the underlying environment. Under the offline setting, addressing the exploration challenge lies in collecting an offline dataset with sufficient coverage. Motivated by such a challenge, we study the reward-free RL problem, where an agent aims to thoroughly explore the environment without any pre-specified reward function. Then, given any extrinsic reward, the agent computes the optimal policy via offline RL with data collected in the exploration stage. Moreover, we tackle this problem under the context of function approximation, leveraging powerful function approximators. Specifically, we propose to explore via an optimistic variant of the value-iteration algorithm incorporating kernel and neural function approximations, where we adopt the associated exploration bonus as the exploration reward. Moreover, we design exploration and planning algorithms for both single-agent MDPs and zero-sum Markov games and prove that our methods can achieve $\widetilde{\mathcal{O}}(1 /\varepsilon^2)$ sample complexity for generating a $\varepsilon$-suboptimal policy or $\varepsilon$-approximate Nash equilibrium when given an arbitrary extrinsic reward. To the best of our knowledge, we establish the first provably efficient reward-free RL algorithm with kernel and neural function approximators.

12/07/2020

Algorithms, Multitask and Transfer Learning, Algorithms, Meta-Learning; Applications, Object Recognition; Data, Challenges, Implementations, and Software, Benchmarks;, Theory, RL, Decisions and Control Theory

4:49

06/12/2020

On Reward-Free RL with Kernel and Neural Function Approximations: Single-Agent MDP and Markov Game

Shuang Qiu, Jieping Ye, Zhaoran Wang, Zhuoran Yang

Comments

Similar Papers

Reward-Free Exploration for Reinforcement Learning

Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

Keywords Abstract Paper

Reinforcement Learning - Theory

Exploration by Maximizing Renyi Entropy for Reward-Free RL Framework

Chuheng Zhang, Yuanying Cai, Longbo Huang, Jian Li

Keywords Abstract Paper

Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation

Weitong ZHANG, Dongruo Zhou, Quanquan Gu

Keywords Abstract Paper

reinforcement learning and planning

Near Optimal Reward-Free Reinforcement Learning

Zhang Zihan, Simon Du, Xiangyang Ji

Keywords Abstract Paper

Theory, RL, Decisions and Control Theory

Provably efficient safe exploration via primal-dual policy optimization

Dongsheng Ding, Xiaohan Wei, Zhuoran Yang and Zhaoran Wang, Mihailo Jovanovic

Keywords Abstract Paper

Implicit Generative Modeling for Efficient Exploration

Neale Ratzlaff, Qinxun Bai, Fuxin Li, Wei Xu

Keywords Abstract Paper

Reinforcement Learning - Deep RL

Task-agnostic Exploration in Reinforcement Learning

Xuezhou Zhang, Yuzhe Ma, Adish Singla

Keywords Abstract Paper

CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee

Tengyu Xu, Yingbin LIANG, Guanghui Lan

Keywords Abstract Paper

Theory, RL, Decisions and Control Theory

Task-Optimal Exploration in Linear Dynamical Systems

Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

Keywords Abstract Paper

Theory, RL, Decisions and Control Theory

MADE: Exploration via Maximizing Deviation from Explored Regions

Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao and Yuandong Tian, Joseph Gonzalez, Stuart Russell

Keywords Abstract Paper

reinforcement learning and planning

A Max-Min Entropy Framework for Reinforcement Learning

Seungyul Han, Youngchul Sung

Keywords Abstract Paper

optimization, reinforcement learning and planning

On Reward-Free Reinforcement Learning with Linear Function Approximation

Ruosong Wang, Simon Du, Lin Yang, Russ Salakhutdinov

Keywords Abstract Paper

PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning

Alekh Agarwal, Mikael Henaff, Sham Kakade, Wen Sun

Keywords Abstract Paper

Adversarial Intrinsic Motivation for Reinforcement Learning

Ishan Durugkar, Mauricio Tec, Scott Niekum, Peter Stone

Keywords Abstract Paper

reinforcement learning and planning, generative model

A Sharp Analysis of Model-based Reinforcement Learning with Self-Play

Qinghua Liu, Tiancheng Yu, Yu Bai, Chi Jin

Keywords Abstract Paper

Algorithms, Multitask and Transfer Learning, Algorithms, Meta-Learning; Applications, Object Recognition; Data, Challenges, Implementations, and Software, Benchmarks;, Theory, RL, Decisions and Control Theory

Provably Efficient Exploration for Reinforcement Learning Using Unsupervised Learning

Fei Feng, Ruosong Wang, Wotao Yin and Simon Du, Lin Yang

Keywords Abstract Paper

Reinforcement Learning and Planning -> Decision and Control, Probabilistic Methods -> Gaussian Processes

Preference-based Reinforcement Learning with Finite-Time Guarantees

Yichong Xu, Ruosong Wang, Lin Yang and Aarti Singh, Artur Dubrawski

Keywords Abstract Paper

Deep Bandits Show-Off: Simple and Efficient Exploration with Deep Networks

Rong Zhu, Mattia Rigotti

Keywords Abstract Paper

theory, deep learning, reinforcement learning and planning, bandits

Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory

Stefanos Leonardos, Georgios Piliouras

Keywords Abstract Paper

Combinatorial Pure Exploration with Bottleneck Reward Function

Yihan Du, Yuko Kuroki, Wei Chen

Keywords Abstract Paper

theory, reinforcement learning and planning, bandits

Near-Optimal Offline Reinforcement Learning via Double Variance Reduction

Ming Yin, Yu Bai, Yu-Xiang Wang

Keywords Abstract Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Dongsheng Ding, Xiaohan Wei, Zhuoran Yang and
Zhaoran Wang, Mihailo Jovanovic

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Tianjun Zhang, Paria Rashidinejad, Jiantao Jiao and
Yuandong Tian, Joseph Gonzalez, Stuart Russell

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Fei Feng, Ruosong Wang, Wotao Yin and
Simon Du, Lin Yang

Keywords Paper

Yichong Xu, Ruosong Wang, Lin Yang and
Aarti Singh, Artur Dubrawski

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Tengyang Xie, Nan Jiang, Huan Wang and
Caiming Xiong, Yu Bai

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Chenjia Bai, Lingxiao Wang, Lei Han and
Jianye Hao, Animesh Garg, Peng Liu, Zhaoran Wang

Keywords Paper

Craig Boutilier, Chih-wei Hsu, Branislav Kveton and
Martin Mladenov, Csaba Szepesvari, Manzil Zaheer

Keywords Paper

Zhuoran Yang, Chi Jin, Zhaoran Wang and
Mengdi Wang, Michael Jordan

Keywords Paper

Yinlam Chow, Ofir Nachum, Aleksandra Faust and
Edgar Dueñez-Guzman, Mohammad Ghavamzadeh

Keywords Paper

Keywords Paper

Keywords Paper

Siddharth Reddy, Anca Dragan, Sergey Levine and
Shane Legg, Jan Leike

Keywords Paper

Haque Ishfaq, Qiwen Cui, Viet Nguyen and
Alex Ayoub, Zhuoran Yang, Zhaoran Wang, Doina Precup, Lin Yang

Keywords Paper

Keywords Paper