Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

Abstract: Adversarial Imitation Learning alternates between learning a discriminator (which tells expert demonstrations apart from generated ones) and a generator's policy that produces trajectories able to fool this discriminator. This alternating optimization is known to be delicate in practice, since it compounds unstable adversarial training with brittle and sample-inefficient reinforcement learning. We propose to remove the burden of the policy optimization steps by leveraging a novel discriminator formulation. Specifically, our discriminator is explicitly conditioned on two policies: the one from the previous generator's iteration and a learnable policy. When optimized, this discriminator directly learns the optimal generator's policy. Consequently, our discriminator's update solves the generator's optimization problem for free: learning a policy that imitates the expert does not require an additional optimization loop. This formulation effectively halves the implementation and computational burden of Adversarial Imitation Learning algorithms by removing the Reinforcement Learning phase altogether. We show on a variety of tasks that our simpler approach is competitive with prevalent Imitation Learning methods.
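
To make the idea of a discriminator "conditioned on two policies" concrete, here is a minimal sketch. It assumes one natural form, D = p_learn / (p_learn + p_prev), trained with the usual binary cross-entropy; the abstract does not spell out the formula, so this form, the function names, and the toy numbers are all illustrative assumptions rather than the authors' implementation. The inputs are log-probabilities that each policy assigns to the actions in expert and generated samples.

```python
import numpy as np


def log_sigmoid(x):
    # Numerically stable log(sigmoid(x)) = -log(1 + exp(-x)).
    return -np.logaddexp(0.0, -x)


def structured_discriminator_loss(logp_learn_expert, logp_prev_expert,
                                  logp_learn_gen, logp_prev_gen):
    """Binary cross-entropy for the assumed form D = p_learn / (p_learn + p_prev).

    logp_learn_*: log-probabilities under the learnable policy (hypothetical name).
    logp_prev_*:  log-probabilities under the previous generator policy.
    *_expert:     evaluated on expert samples; *_gen: on generated samples.
    """
    # log D = log sigmoid(log p_learn - log p_prev)
    log_d_expert = log_sigmoid(logp_learn_expert - logp_prev_expert)
    # log (1 - D) = log sigmoid(log p_prev - log p_learn)
    log_one_minus_d_gen = log_sigmoid(logp_prev_gen - logp_learn_gen)
    return -(np.mean(log_d_expert) + np.mean(log_one_minus_d_gen))


# Toy log-probabilities (made up for illustration only).
prev_expert = np.log(np.array([0.2, 0.3, 0.25]))
prev_gen = np.log(np.array([0.5, 0.6, 0.4]))

# If the learnable policy equals the previous one, D = 1/2 on every sample.
loss_equal = structured_discriminator_loss(prev_expert, prev_expert,
                                           prev_gen, prev_gen)

# A learnable policy that puts more mass on expert actions and less on
# generated ones yields a lower loss.
loss_better = structured_discriminator_loss(prev_expert + 0.5, prev_expert,
                                            prev_gen - 0.5, prev_gen)
```

Under this assumed form, minimizing the loss only updates the learnable policy inside the discriminator, which is how the discriminator update can stand in for a separate policy-optimization step.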

03/05/2021

distance metric learning, offline/batch reinforcement learning, meta-reinforcement learning, contrastive learning, multi-task reinforcement learning

Paul Barde, Julien Roy, Wonseok Jeon, Joelle Pineau, Chris Pal, Derek Nowrouzezahrai

Similar Papers

Learning to Reach Goals via Iterated Supervised Learning

Dibya Ghosh, Abhishek Gupta, Ashwin D Reddy, Justin Fu, Coline M Devin, Ben Eysenbach, Sergey Levine

goal reaching, reinforcement learning, goal-conditioned RL, behavior cloning

State-only Imitation with Transition Dynamics Mismatch

Tanmay Gangwani, Jian Peng

Imitation learning, Reinforcement Learning, Inverse Reinforcement Learning

Tempered Sigmoid Activations for Deep Learning with Differential Privacy

Nicolas Papernot, Abhradeep Thakurta, Shuang Song, Steve Chien, Úlfar Erlingsson

Self-supervised Label Augmentation via Input Transformations

Hankook Lee, Sung Ju Hwang, Jinwoo Shin

Deep Learning - Algorithms

A theoretical characterization of semi-supervised learning with self-training for Gaussian mixture models

Samet Oymak, Talha Cihad Gulcu

Cogradient Descent for Bilinear Optimization

Li'an Zhuo, Baochang Zhang, Linlin Yang, Hanlin Chen, Qixiang Ye, David Doermann, Rongrong Ji, Guodong Guo

bilinear optimization, gradient descent algorithm, convolutional sparse coding, network pruning

Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage

Jonathan Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun

theory, reinforcement learning and planning

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

Siddharth Reddy, Anca D. Dragan, Sergey Levine

Imitation Learning, Reinforcement Learning

Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces

Kirill Struminsky, Artyom Gadetsky, Denis Rakitin, Danil Karpushkin, Dmitry Vetrov

deep learning, optimization

Batch Reinforcement Learning Through Continuation Method

Yijie Guo, Shengyu Feng, Nicolas Le Roux, Ed H. Chi, Honglak Lee, Minmin Chen

batch reinforcement learning, relaxed regularization, continuation method

Reinforcement Learning for Route Optimization with Robustness Guarantees

Tobias Jacobs, Francesco Alesiani, Gulcin Ermis

Machine Learning, Deep Reinforcement Learning, Planning under Uncertainty, Applications of Reinforcement Learning

Modeling and Optimization Trade-off in Meta-learning

Katelyn Gao, Ozan Sener

Learning Self-Correctable Policies and Value Functions from Demonstrations with Negative Sampling

Yuping Luo, Huazhe Xu, Tengyu Ma

imitation learning, model-based imitation learning, model-based RL, behavior cloning, covariate shift

A Decision-Theoretic Approach for Model Interpretability in Bayesian Framework

Homayun Afrabandpey, Tomi Peltola, Juho Piironen, Aki Vehtari, Samuel Kaski

Meta-Learning with Warped Gradient Descent

Sebastian Flennerhag, Andrei A. Rusu, Razvan Pascanu, Francesco Visin, Hujun Yin, Raia Hadsell

meta-learning, transfer learning

Imitation Learning via Off-Policy Distribution Matching

Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

reinforcement learning, deep learning, imitation learning, adversarial learning

Bridging the Imitation Gap by Adaptive Insubordination

Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alex Schwing

reinforcement learning and planning

Believe What You See: Implicit Constraint Approach for Offline Multi-Agent Reinforcement Learning

Yiqin Yang, Xiaoteng Ma, Li Chenghao, Zewu Zheng, Qiyuan Zhang, Gao Huang, Jun Yang, Qianchuan Zhao

reinforcement learning and planning

Parameter-Based Value Functions

Francesco Faccio, Louis Kirsch, Jürgen Schmidhuber

Off-Policy Reinforcement Learning, Reinforcement Learning

Fast and Scalable Adversarial Training of Kernel SVM via Doubly Stochastic Gradients

Huimin Wu, Zhengmian Hu, Bin Gu

Time-series Generation by Contrastive Imitation
