Disagreement-Regularized Imitation Learning

Abstract: We present a simple and effective algorithm designed to address the covariate shift problem in imitation learning. It operates by training an ensemble of policies on the expert demonstration data and using the variance of their predictions as a cost, which is minimized with RL together with a supervised behavioral cloning cost. Unlike adversarial imitation methods, it uses a fixed reward function that is easy to optimize. We prove a regret bound for the algorithm that is linear in the time horizon multiplied by a coefficient, which we show to be low for certain problems in which behavioral cloning fails. We evaluate our algorithm empirically across multiple pixel-based Atari environments and continuous control tasks, and show that it matches or significantly outperforms behavioral cloning and generative adversarial imitation learning.
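
The disagreement cost described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the linear softmax policies, the bootstrap resampling, the toy expert data, and every name below (`train_bc_policy`, `ensemble_action_probs`, `disagreement_cost`) are assumptions made for illustration. The abstract does not specify details such as clipping or thresholding of the cost, so none are shown.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def train_bc_policy(states, actions, n_actions, rng, steps=200, lr=0.5):
    """Behavioral cloning of a linear softmax policy (an assumed policy class).

    Each ensemble member is fit on a bootstrap resample of the expert data,
    which is one common way to obtain diversity; it is an assumption here.
    """
    idx = rng.integers(0, len(states), size=len(states))
    X, y = states[idx], actions[idx]
    W = np.zeros((X.shape[1], n_actions))
    onehot = np.eye(n_actions)[y]
    for _ in range(steps):
        probs = softmax(X @ W)
        # Gradient of the mean cross-entropy loss with respect to W.
        W -= lr * X.T @ (probs - onehot) / len(X)
    return W


def ensemble_action_probs(ensemble, state):
    """Action probabilities of every ensemble member at one state, shape (E, A)."""
    return np.stack([softmax(state @ W) for W in ensemble])


def disagreement_cost(probs, action):
    """Variance across ensemble members of the probability assigned to `action`."""
    return probs[:, action].var()


rng = np.random.default_rng(0)

# Hypothetical expert: 2-D states in [0, 1]^2, action 1 when the first
# coordinate exceeds 0.5, otherwise action 0.
expert_states = rng.uniform(0.0, 1.0, size=(200, 2))
expert_actions = (expert_states[:, 0] > 0.5).astype(int)

ensemble = [
    train_bc_policy(expert_states, expert_actions, n_actions=2, rng=rng)
    for _ in range(5)
]

in_dist = np.array([0.9, 0.5])
cost_in = disagreement_cost(ensemble_action_probs(ensemble, in_dist), action=1)
print(f"disagreement cost at {in_dist}: {cost_in:.6f}")
```

In an RL loop, the learner would receive the negative of this cost as a fixed reward at each visited state-action pair, alongside the usual behavioral cloning loss on the expert data. How the two terms are weighted or alternated is not stated in the abstract and is left open here.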

06/12/2021

Kiante Brantley, Wen Sun, Mikael Henaff

Similar Papers

Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning

Jongjin Park, Younggyo Seo, Chang Liu, Li Zhao, Tao Qin, Jinwoo Shin, Tie-Yan Liu

reinforcement learning and planning, causality

Emphatic Algorithms for Deep Reinforcement Learning

Ray Jiang, Tom Zahavy, Zhongwen Xu, Adam White, Matteo Hessel, Charles Blundell, Hado van Hasselt

Reinforcement Learning and Planning, Deep RL

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

Siddharth Reddy, Anca D. Dragan, Sergey Levine

Imitation Learning, Reinforcement Learning

Examining and Combating Spurious Features under Distribution Shift

Chunting Zhou, Xuezhe Ma, Paul Michel, Graham Neubig

Deep Learning, Embedding and Representation learning

Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences

Daniel Brown, Scott Niekum, Russell Coleman, Ravi Srinivasan

Reinforcement Learning - Deep RL

Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling

Yuping Luo, Huazhe Xu, Tengyu Ma

imitation learning, model-based imitation learning, model-based RL, behavior cloning, covariate shift

Addressing Action Oscillations through Learning Policy Inertia

Chen Chen, Hongyao Tang, Jianye Hao, Wulong Liu, Zhaopeng Meng

Imitation Learning via Off-Policy Distribution Matching

Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

reinforcement learning, deep learning, imitation learning, adversarial learning

Non-Crossing Quantile Regression for Distributional Reinforcement Learning

Fan Zhou, Jianing Wang, Xingdong Feng

Hindsight Trust Region Policy Optimization

Hanbo Zhang, Site Bai, Xuguang Lan, David Hsu, Nanning Zheng

Machine Learning, Deep Reinforcement Learning, Reinforcement Learning

Imitation by Predicting Observations

Andrew Jaegle, Yury Sulsky, Arun Ahuja, Jake Bruce, Rob Fergus, Greg Wayne

Reinforcement Learning and Planning, Others

IQ-Learn: Inverse soft-Q Learning for Imitation

Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Stefano Ermon

optimization, reinforcement learning and planning, adversarial robustness and security

Supervising the Transfer of Reasoning Patterns in VQA

Corentin Kervadec, Christian Wolf, Grigory Antipov, Moez Baccouche, Madiha Nadri

theory, deep learning, vision

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

Paul Barde, Julien Roy, Wonseok Jeon, Joelle Pineau, Chris Pal, Derek Nowrouzezahrai

ConQUR: Mitigating Delusional Bias in Deep Q-Learning

DiJia Su, Jayden Ooi, Tyler Lu, Dale Schuurmans, Craig Boutilier

Reinforcement Learning - General

Stable Adversarial Learning under Distributional Shifts

Jiashuo Liu, Zheyan Shen, Peng Cui, Linjun Zhou, Kun Kuang, Bo Li, Yishi Lin

Robust Learning from Discriminative Feature Feedback

Sanjoy Dasgupta, Sivan Sabato

Data-Efficient Reinforcement Learning with Self-Predictive Representations

Max Schwarzer, Ankesh Anand, Rishab Goel, R Devon Hjelm, Aaron Courville, Philip Bachman

Representation Learning, Self-Supervised Learning, Reinforcement Learning, Sample Efficiency

Improve Agents without Retraining: Parallel Tree Search with Off-Policy Correction

Gal Dalal, Assaf Hallak, Steven Dalton, Iuri Frosio, Shie Mannor, Gal Chechik

theory, reinforcement learning and planning

Fast Task Inference with Variational Intrinsic Successor Features

Steven Hansen, Will Dabney, Andre Barreto, David Warde-Farley, Tom Van de Wiele, Volodymyr Mnih

Reinforcement Learning, Variational Intrinsic Control, Successor Features

Decision Transformer: Reinforcement Learning via Sequence Modeling

Lili Chen, Kevin Lu, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Misha Laskin, Pieter Abbeel, Aravind Srinivas, Igor Mordatch
