State-only Imitation with Transition Dynamics Mismatch

26/04/2020

State-only Imitation with Transition Dynamics Mismatch

Tanmay Gangwani, Jian Peng

Keywords: Imitation learning, Reinforcement Learning, Inverse Reinforcement Learning

Abstract Paper Code Similar Papers

Abstract: Imitation Learning (IL) is a popular paradigm for training agents to achieve complicated goals by leveraging expert behavior, rather than dealing with the hardships of designing a correct reward function. With the environment modeled as a Markov Decision Process (MDP), most of the existing IL algorithms are contingent on the availability of expert demonstrations in the same MDP as the one in which a new imitator policy is to be learned. This is uncharacteristic of many real-life scenarios where discrepancies between the expert and the imitator MDPs are common, especially in the transition dynamics function. Furthermore, obtaining expert actions may be costly or infeasible, making the recent trend towards state-only IL (where expert demonstrations constitute only states or observations) ever so promising. Building on recent adversarial imitation approaches that are motivated by the idea of divergence minimization, we present a new state-only IL algorithm in this paper. It divides the overall optimization objective into two subproblems by introducing an indirection step and solves the subproblems iteratively. We show that our algorithm is particularly effective when there is a transition dynamics mismatch between the expert and imitator MDPs, while the baseline IL methods suffer from performance degradation. To analyze this, we construct several interesting MDPs by modifying the configuration parameters for the MuJoCo locomotion tasks from OpenAI Gym.

0

0

0

1

Share

This is an embedded video. Talk and the respective paper are published at ICLR 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

18/07/2021

Cross-domain Imitation from Observations

Dripta S. Raychaudhuri, Sujoy Paul, Jeroen Vanbaar, Amit Roy-Chowdhury

Keywords Paper

Reinforcement Learning and Planning, Deep RL

0

0

0

0

20:49

06/12/2021

Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage

Jonathan Chang, Masatoshi Uehara, Dhruv Sreenivas and
Rahul Kidambi, Wen Sun

Keywords Paper

theory, reinforcement learning and planning

0

0

0

0

8:37

06/12/2020

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

Paul Barde, Julien Roy, Wonseok Jeon and
Joelle Pineau, Chris Pal, Derek Nowrouzezahrai

Keywords Paper

0

0

0

0

3:08

26/04/2020

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

Siddharth Reddy, Anca D. Dragan, Sergey Levine

Keywords Paper

Imitation Learning, Reinforcement Learning

0

0

0

0

4:38

06/12/2020

Policy Improvement via Imitation of Multiple Oracles

Ching-An Cheng, Andrey Kolobov, Alekh Agarwal

Keywords Paper

0

0

0

0

3:12

18/07/2021

Robust Asymmetric Learning in POMDPs

Andrew Warrington, Jonathan Lavington, Adam Scibior and
Mark Schmidt, Frank Wood

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

16:53

16/11/2020

f-IRL: Inverse Reinforcement Learning via State Marginal Matching

Tianwei Ni, Harshit Sikchi, Yufei Wang and
Tejus Gupta, Lisa Lee, Ben Eysenbach

Keywords Paper

0

0

0

0

5:07

04/07/2020

Active Imitation Learning with Noisy Guidance

Kianté Brantley, Hal Daumé III, Amr Sharaf

Keywords Paper

Active Learning, structured tasks, sequence tasks, Imitation algorithms

0

0

0

0

7:59

16/11/2020

Transformer Based Multi-Source Domain Adaptation

Dustin Wright, Isabelle Augenstein

Keywords Paper

unsupervised adaptation, cnns, rnns, domain classifiers

0

0

0

0

11:30

06/12/2021

IQ-Learn: Inverse soft-Q Learning for Imitation

Divyansh Garg, Shuvam Chakraborty, Chris Cundy and
Jiaming Song, Stefano Ermon

Keywords Paper

optimization, reinforcement learning and planning, adversarial robustness and security

0

0

0

0

8:25

06/12/2020

Training Stronger Baselines for Learning to Optimize

Tianlong Chen, Weiyi Zhang, Zhou Jingyang and
Shiyu Chang, Sijia Liu, Lisa Amini, Zhangyang Wang

Keywords Paper

0

0

0

0

3:18

18/07/2021

Imitation by Predicting Observations

Andrew Jaegle, Yury Sulsky, Arun Ahuja and
Jake Bruce, Rob Fergus, Greg Wayne

Keywords Paper

Reinforcement Learning and Planning, Others

0

0

0

0

5:15

06/12/2020

Learning from Failure: De-biasing Classifier from Biased Classifier

Junhyun Nam, Hyuntak Cha, Sungsoo Ahn and
Jaeho Lee, Jinwoo Shin

Keywords Paper

0

0

0

0

3:21

19/08/2021

Robust Adversarial Imitation Learning via Adaptively-Selected Demonstrations

Yunke Wang, Chang Xu, Bo Du

Keywords Paper

Machine Learning, Adversarial Machine Learning, Unsupervised Learning

0

0

0

0

9:57

06/12/2021

Network-to-Network Regularization: Enforcing Occam's Razor to Improve Generalization

Rohan Ghosh, Mehul Motani

Keywords Paper

theory, deep learning, machine learning

0

0

0

0

14:07

06/12/2020

Toward the Fundamental Limits of Imitation Learning

Nived Rajaraman, Lin Yang, Jiantao Jiao, Kannan Ramchandran

Keywords Paper

0

0

0

1

3:22

09/07/2020

Precise Tradeoffs in Adversarial Training for Linear Regression

Adel Javanmard, Mahdi Soltanolkotabi, Hamed Hassani

Keywords Paper

Adversarial learning and robustness, High-dimensional statistics, Regression

0

0

0

0

15:49

03/05/2021

Behavioral Cloning from Noisy Demonstrations

Fumihiro Sasaki, Ryota Yamashina

Keywords Paper

Inverse Reinforcement Learning, Noisy Demonstrations, Imitation Learning

0

0

0

0

9:36

06/12/2021

On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

Tim G. J. Rudner, Cong Lu, Michael A Osborne and
Yarin Gal, Yee Teh

Keywords Paper

reinforcement learning and planning, kernel methods

0

0

0

0

12:26

06/12/2021

Adversarial Robustness with Semi-Infinite Constrained Learning

Alexander Robey, Luiz Chamon, George J. Pappas and
Hamed Hassani, Alejandro Ribeiro

Keywords Paper

theory, deep learning, optimization, robustness, adversarial robustness and security

0

0

0

0

14:48

18/07/2021

Training Data Subset Selection for Regression with Controlled Generalization Error

Durga S, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

Keywords Paper

, Algorithms, Online Learning, Algorithms, Supervised Learning

0

0

0

0

4:15

26/04/2020

Deep Imitative Models for Flexible Inference, Planning, and Control

Nicholas Rhinehart, Rowan McAllister, Sergey Levine

Keywords Paper

imitation learning, planning, autonomous driving

0

0

0

0

5:02

02/02/2021

Theoretically Principled Deep RL Acceleration via Nearest Neighbor Function Approximation

Junhong Shen, Lin F. Yang

Keywords Paper

0

0

0

0

19:12

06/12/2021

Automatic Data Augmentation for Generalization in Reinforcement Learning

Roberta Raileanu, Maxwell Goldstein, Denis Yarats and
Ilya Kostrikov, Rob Fergus

Keywords Paper

reinforcement learning and planning, machine learning

0

0

0

0

14:26

13/04/2021

Online model selection for reinforcement learning with function approximation

Jonathan Lee, Aldo Pacchiano, Vidya Muthukumar and
Weihao Kong, Emma Brunskill

Keywords Paper

0

0

0

0

3:15

14/06/2020

Overcoming Multi-Model Forgetting in One-Shot NAS With Diversity Maximization

Miao Zhang, Huiqi Li, Shirui Pan and
Xiaojun Chang, Steven Su

Keywords Paper

automl, neural architecture search, catastrophic forgetting, novelty search, continual learning

0

0

0

0

1:01

06/12/2021

Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces

Kirill Struminsky, Artyom Gadetsky, Denis Rakitin and
Danil Karpushkin, Dmitry Vetrov

Keywords Paper

deep learning, optimization

0

0

0

0

14:26

06/12/2021

Fast Axiomatic Attribution for Neural Networks

Robin Hesse, Simone Schaub-Meyer, Stefan Roth

Keywords Paper

deep learning, interpretability

0

0

0

0

14:49

18/07/2021

PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees

Jonas Rothfuss, Vincent Fortuin, Martin Josifoski, Andreas Krause

Keywords Paper

Algorithms, Multitask, Transfer, and Meta Learning

1

1

0

0

5:46

06/12/2020

Causal Imitation Learning With Unobserved Confounders

Junzhe Zhang, Daniel Kumor, Elias Bareinboim

Keywords Paper

0

0

0

0

3:18

12/07/2020

Optimizing Multiagent Cooperation via Policy Evolution and Shared Experiences

Somdeb Majumdar, Shauharda Khadka, Santiago Miret and
Stephen Mcaleer, Kagan Tumer

Keywords Paper

Reinforcement Learning - Deep RL

0

0

0

0

15:53

23/08/2020

Diverse rule sets

Guangyi Zhang, Aristides Gionis

Keywords Paper

sampling, classifier, pattern mining, rule learning, diversification, rule sets

0

0

0

0

9:41

02/02/2021

Stable Adversarial Learning under Distributional Shifts

Jiashuo Liu, Zheyan Shen, Peng Cui and
Linjun Zhou, Kun Kuang, Bo Li, Yishi Lin

Keywords Paper

0

0

0

0

14:30

26/04/2020

Scalable and Order-robust Continual Learning with Additive Parameter Decomposition

Jaehong Yoon, Saehoon Kim, Eunho Yang, Sung Ju Hwang

Keywords Paper

Continual Learning, Lifelong Learning, Catastrophic Forgetting, Deep Learning

0

0

0

0

5:07

04/08/2021

Adversarially Robust Low Dimensional Representations

Pranjal Awasthi, Vaggos Chatziafratis, Xue Chen, Aravindan Vijayaraghavan

Keywords Paper

0

0

0

0

20:19

26/04/2020

Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML

Aniruddh Raghu, Maithra Raghu, Samy Bengio, Oriol Vinyals

Keywords Paper

deep learning analysis, representation learning, meta-learning, few-shot learning

0

0

0

0

5:25

18/07/2021

Hyperparameter Selection for Imitation Learning

Léonard Hussenot, Marcin Andrychowicz, Damien Vincent and
Robert Dadashi, Anton Raichuk, Sabela Ramos, Nikola Momchev, Sertan Girgin, Raphael Marinier, Lukasz Stafiniak, Emmanuel Orsini, Olivier Bachem, Matthieu Geist, Olivier Pietquin

Keywords Paper

Reinforcement Learning and Planning

0

0

0

0

16:23

26/04/2020

Imitation Learning via Off-Policy Distribution Matching

Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

Keywords Paper

reinforcement learning, deep learning, imitation learning, adversarial learning

0

0

0

0

5:31

02/02/2021

Fast and Scalable Adversarial Training of Kernel SVM via Doubly Stochastic Gradients

Huimin Wu, Zhengmian Hu, Bin Gu

Keywords Paper

0

0

0

0

14:04

06/12/2021

Towards Deeper Deep Reinforcement Learning with Spectral Normalization

Nils Bjorck, Carla Gomes, Kilian Weinberger

Keywords Paper

reinforcement learning and planning, vision, language

0

0

0

0

9:28