Adversarial Option-Aware Hierarchical Imitation Learning

18/07/2021

Mingxuan Jing, Wenbing Huang, Fuchun Sun, Xiaojian Ma, Tao Kong, Chuang Gan, Lei Li

Keywords: Reinforcement Learning and Planning, Planning and Control

Abstract: Learning skills for an agent from long-horizon unannotated demonstrations has been a challenge. Existing approaches such as Hierarchical Imitation Learning (HIL) are prone to compounding errors or suboptimal solutions. In this paper, we propose Option-GAIL, a novel method for learning skills over long horizons. The key idea of Option-GAIL is to model the task hierarchy with options and to train the policy via generative adversarial optimization. In particular, we propose an Expectation-Maximization (EM)-style algorithm: an E-step that samples the options of the expert conditioned on the currently learned policy, and an M-step that updates the low- and high-level policies of the agent simultaneously to minimize the newly proposed option-occupancy measurement between the expert and the agent. We theoretically prove the convergence of the proposed algorithm. Experiments show that Option-GAIL consistently outperforms other counterparts across a variety of tasks.
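
The E-step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes tabular policies, a one-step option model in which the high-level policy depends on the current state and the previous option, and it replaces sampling with a Viterbi-style most-likely decoding. The names infer_options, pi_high and pi_low are hypothetical and chosen only for illustration. The M-step (the adversarial update of both policies) is not shown.

```python
import math


def infer_options(states, actions, pi_high, pi_low, n_options, init_option=0):
    """Return the most likely option sequence for one demonstration.

    Assumed (hypothetical) tabular layout:
      pi_high[o_prev][s][o] = P(o | s, o_prev)   high-level policy
      pi_low[o][s][a]       = P(a | s, o)        low-level policy
    """
    tiny = 1e-12  # avoids log(0) for zero-probability entries
    s0, a0 = states[0], actions[0]
    # score[o]: best log-probability of any option path ending in option o
    score = [
        math.log(pi_high[init_option][s0][o] + tiny)
        + math.log(pi_low[o][s0][a0] + tiny)
        for o in range(n_options)
    ]
    back = []  # back[t-1][o]: best previous option when option o is used at step t
    for t in range(1, len(states)):
        s, a = states[t], actions[t]
        new_score, pointers = [], []
        for o in range(n_options):
            cands = [
                score[p] + math.log(pi_high[p][s][o] + tiny)
                for p in range(n_options)
            ]
            best_prev = max(range(n_options), key=lambda p: cands[p])
            new_score.append(cands[best_prev] + math.log(pi_low[o][s][a] + tiny))
            pointers.append(best_prev)
        score = new_score
        back.append(pointers)
    # Backtrack from the best final option to recover the full path.
    o = max(range(n_options), key=lambda k: score[k])
    path = [o]
    for pointers in reversed(back):
        o = pointers[o]
        path.append(o)
    path.reverse()
    return path


# Toy problem: one state, two actions, two "sticky" options.
pi_high = [[[0.8, 0.2]], [[0.2, 0.8]]]  # keep the previous option with prob. 0.8
pi_low = [[[0.9, 0.1]], [[0.1, 0.9]]]   # option k prefers action k
print(infer_options([0, 0, 0, 0], [0, 0, 1, 1], pi_high, pi_low, 2))
# prints [0, 0, 1, 1]
```

In an EM-style loop, the inferred option labels would be attached to the expert demonstrations before each policy update, so that expert and agent data can be compared on (state, action, option) tuples.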

The talk and the corresponding paper were published at the ICML 2021 virtual conference.

Similar Papers

26/04/2020

On Computation and Generalization of Generative Adversarial Imitation Learning

Minshuo Chen, Yizhou Wang, Tianyi Liu, Zhuoran Yang, Xingguo Li, Zhaoran Wang, Tuo Zhao

5:08

13/04/2021

Provable hierarchical imitation learning via EM

Zhiyu Zhang, Ioannis Paschalidis

3:05

12/07/2020

Generative Adversarial Imitation Learning with Neural Network Parameterization: Global Optimality and Convergence Rate

Yufeng Zhang, Qi Cai, Zhuoran Yang, Zhaoran Wang

Keywords: Planning, Control, and Multiagent Learning

10:52

06/12/2020

f-GAIL: Learning f-Divergence for Generative Adversarial Imitation Learning

Xin Zhang, Yanhua Li, Ziming Zhang, Zhi-Li Zhang

3:22

18/07/2021

EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL

Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Gu

Keywords: Reinforcement Learning and Planning

5:54

06/12/2020

Off-Policy Imitation Learning from Observations

Zhuangdi Zhu, Kaixiang Lin, Bo Dai, Jiayu Zhou

3:24

06/12/2021

Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Making by Reinforcement Learning

Kai Wang, Sanket Shah, Haipeng Chen, Andrew Perrault, Finale Doshi-Velez, Milind Tambe

Keywords: deep learning, optimization, reinforcement learning and planning

14:52

19/08/2021

Variational Model-based Policy Optimization

Yinlam Chow, Brandon Cui, Moonkyung Ryu, Mohammad Ghavamzadeh

Keywords: Machine Learning, Reinforcement Learning

15:31

06/12/2020

Modeling and Optimization Trade-off in Meta-learning

Katelyn Gao, Ozan Sener

3:21

03/05/2021

UPDeT: Universal Multi-agent RL via Policy Decoupling with Transformers

Siyi Hu, Fengda Zhu, Xiaojun Chang, Xiaodan Liang

Keywords: Transfer Learning, Multi-agent Reinforcement Learning

2:46

18/07/2021

Provably Correct Optimization and Exploration with Non-linear Policies

Fei Feng, Wotao Yin, Alekh Agarwal, Lin Yang

Keywords: Deep Learning, Adversarial Networks, Applications, Fairness, Accountability, and Transparency, Theory, RL, Decisions and Control Theory

5:03

03/05/2021

Neurally Augmented ALISTA

Freya Behrens, Jonathan Sauder, Peter Jung

Keywords: learned ISTA, unrolled algorithms, compressed sensing, sparse reconstruction

5:18

18/07/2021

Doubly Robust Off-Policy Actor-Critic: Convergence and Optimality

Tengyu Xu, Zhuoran Yang, Zhaoran Wang, Yingbin Liang

Keywords: Theory, RL, Decisions and Control Theory

4:23

18/07/2021

A Regret Minimization Approach to Iterative Learning Control

Naman Agarwal, Elad Hazan, Anirudha Majumdar, Karan Singh

Keywords: Reinforcement Learning and Planning, Planning and Control

5:13

06/12/2020

Learning Linear Programs from Optimal Decisions

Yingcong Tan, Daria Terekhov, Andrew Delong

Keywords: Applications -> Privacy, Anonymity, and Security

3:21

18/07/2021

A Distribution-dependent Analysis of Meta Learning

Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvari

Keywords: Theory, Statistical Learning Theory

5:06

06/12/2021

Robust Implicit Networks via Non-Euclidean Contractions

Saber Jafarpour, Alexander Davydov, Anton Proskurnikov, Francesco Bullo

Keywords: theory, deep learning, machine learning, robustness, vision

14:59

03/05/2021

Learning Value Functions in Deep Policy Gradients using Residual Variance

Yannis Flet-Berliac, Reda Ouhamma, Odalric-Ambrym Maillard, Philippe Preux

4:49

18/07/2021

Representation Matters: Offline Pretraining for Sequential Decision Making

Mengjiao Yang, Ofir Nachum

Keywords: Reinforcement Learning and Planning

5:06

06/12/2020

Distributionally Robust Federated Averaging

Yuyang Deng, Mohammad Mahdi Kamani, Mehrdad Mahdavi

3:11

26/04/2020

Learning Nearly Decomposable Value Functions Via Communication Minimization

Tonghan Wang, Jianhao Wang, Chongyi Zheng, Chongjie Zhang

Keywords: Multi-agent reinforcement learning, Nearly decomposable value function, Minimized communication

5:00

06/12/2021

Dual Adaptivity: A Universal Algorithm for Minimizing the Adaptive Regret of Convex Functions

Lijun Zhang, Guanghui Wang, Wei-Wei Tu, Wei Jiang, Zhi-Hua Zhou

Keywords: optimization, online learning

11:38

22/09/2020

FISSA: Fusing item similarity models with self-attention networks for sequential recommendation

Jing Lin, Weike Pan, Zhong Ming

Keywords: Item Similarity Models, Sequential Recommendation, Gating Networks, Self-Attention

2:06

06/12/2021

Taming Communication and Sample Complexities in Decentralized Policy Evaluation for Cooperative Multi-Agent Reinforcement Learning

Xin Zhang, Zhuqing Liu, Jia Liu, Zhengyuan Zhu, Songtao Lu

Keywords: theory, optimization, reinforcement learning and planning

14:54

22/11/2021

Parameter Efficient Dynamic Convolution via Tensor Decomposition

Zejiang Hou, Sun-Yuan Kung

Keywords: dynamic convolution, input-dependent reparameterization, parameter efficiency, tensor decomposition

3:58

06/12/2021

COMBO: Conservative Offline Model-Based Policy Optimization

Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn

Keywords: deep learning, optimization, reinforcement learning and planning

12:35

12/07/2020

Tightening Exploration in Upper Confidence Reinforcement Learning

Hippolyte Bourel, Odalric-Ambrym Maillard, Mohammad Sadegh Talebi

Keywords: Reinforcement Learning - General

16:14

06/12/2020

Multi-task Batch Reinforcement Learning with Metric Learning

Jiachen Li, Quan Vuong, Shuang Liu, Minghua Liu, Kamil Ciosek, Henrik Christensen, Hao Su

Keywords: Algorithms -> Multitask and Transfer Learning; Algorithms -> Representation Learning; Data, Challenges, Implementations, and So, Applications -> Natural Language Processing

3:15

13/04/2021

Online model selection for reinforcement learning with function approximation

Jonathan Lee, Aldo Pacchiano, Vidya Muthukumar, Weihao Kong, Emma Brunskill

3:15

13/04/2021

Bayesian active learning by soft mean objective cost of uncertainty

Guang Zhao, Edward Dougherty, Byung-Jun Yoon, Francis J. Alexander, Xiaoning Qian

3:02

08/12/2020

Uncertainty Modeling for Machine Comprehension Systems using Efficient Bayesian Neural Networks

Zhengyuan Liu, Pavitra Krishnaswamy, Ai Ti Aw, Nancy Chen

7:30

26/04/2020

ES-MAML: Simple Hessian-Free Meta Learning

Xingyou Song, Wenbo Gao, Yuxiang Yang, Krzysztof Choromanski, Aldo Pacchiano, Yunhao Tang

Keywords: ES, MAML, evolution, strategies, meta, learning, gaussian, perturbation, reinforcement, learning, adaptation

3:23

03/05/2021

Parameter-Based Value Functions

Francesco Faccio, Louis Kirsch, Jürgen Schmidhuber

Keywords: Off-Policy Reinforcement Learning, Reinforcement Learning

2:45

07/09/2020

Transferring Pretrained Networks to Small Data via Category Decorrelation

Ying Jin, Zhangjie Cao, Mingsheng Long, Jianmin Wang

Keywords: Category Decorrelation, Under Transfer

8:39

03/05/2021

Meta-Learning with Neural Tangent Kernels

Yufan Zhou, Zhenyi Wang, Jiayi Xian, Changyou Chen, Jinhui Xu

Keywords: neural tangent kernel, meta-learning

3:54

06/12/2020

Model-based Adversarial Meta-Reinforcement Learning

Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma

3:31

26/04/2020

CAQL: Continuous Action Q-Learning

Moonkyung Ryu, Yinlam Chow, Ross Anderson, Christian Tjandraatmadja, Craig Boutilier

Keywords: Reinforcement learning (RL), DQN, Continuous control, Mixed-Integer Programming (MIP)

5:36

03/08/2020

Joint Stochastic Approximation and Its Application to Learning Discrete Latent Variable Models

Zhijian Ou, Yunfu Song

8:24

06/12/2020

Trajectory-wise Multiple Choice Learning for Dynamics Generalization in Reinforcement Learning

Younggyo Seo, Kimin Lee, Ignasi Clavera Gilaberte, Thanard Kurutach, Jinwoo Shin, Pieter Abbeel

3:20

06/12/2020

Deep Inverse Q-learning with Constraints

Gabriel Kalweit, Maria Huegle, Moritz Werling, Joschka Boedecker

3:14