Learning Efficient Dialogue Policy from Demonstrations through Shaping

04/07/2020

Huimin Wang, Baolin Peng, Kam-Fai Wong

Keywords: Demonstrations, learning progress, domain task, human evaluation

Abstract: Training a task-oriented dialogue agent with reinforcement learning is prohibitively expensive because it requires a large volume of interactions with users. Human demonstrations can accelerate learning, but how to leverage them effectively to learn a dialogue policy remains underexplored. In this paper, we present S^2Agent, which efficiently learns a dialogue policy from demonstrations through policy shaping and reward shaping. We use an imitation model to distill knowledge from the demonstrations; based on this model, policy shaping estimates feedback in policy space on how the agent should act. Reward shaping then explicitly adds a bonus in value space to state-action pairs that resemble the demonstrations, encouraging better exploration. The effectiveness of the proposed S^2Agent is demonstrated in three dialogue domains and a challenging domain adaptation task, with both user simulator evaluation and human evaluation.
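
As a rough illustration of the two shaping steps described in the abstract, the sketch below combines an agent's action distribution with an imitation model's distribution (policy shaping) and adds a similarity bonus to the environment reward (reward shaping). The multiplicative combination rule, the scalar `similarity` score, the `bonus_weight` parameter, and all function names are assumptions made for illustration; the abstract does not specify how S^2Agent implements either step.

```python
def shape_policy(agent_probs, imitation_probs):
    """Policy shaping (assumed form): multiply the agent's action
    distribution by the imitation model's distribution and renormalize."""
    product = [a * b for a, b in zip(agent_probs, imitation_probs)]
    total = sum(product)
    if total == 0.0:
        # The two distributions share no support; fall back to the agent.
        return list(agent_probs)
    return [p / total for p in product]


def shape_reward(env_reward, similarity, bonus_weight=0.5):
    """Reward shaping (assumed form): add a bonus proportional to how
    closely the state-action pair resembles the demonstrations.
    `similarity` is a hypothetical score in [0, 1]."""
    return env_reward + bonus_weight * similarity


# Hypothetical example with two dialogue actions.
agent_probs = [0.5, 0.5]        # agent is undecided
imitation_probs = [0.8, 0.2]    # imitation model prefers action 0
shaped_probs = shape_policy(agent_probs, imitation_probs)
shaped_reward = shape_reward(env_reward=-1.0, similarity=1.0)
```

Under these assumptions, an undecided agent inherits the imitation model's preference, and a per-turn penalty is partly offset when the chosen action matches a demonstration.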

The talk and the corresponding paper were published at the ACL 2020 virtual conference.

Similar Papers

26/04/2020

Watch, Try, Learn: Meta-Learning from Demonstrations and Rewards

Allan Zhou, Eric Jang, Daniel Kappler, Alex Herzog, Mohi Khansari, Paul Wohlhart, Yunfei Bai, Mrinal Kalakrishnan, Sergey Levine, Chelsea Finn

Keywords: meta-learning, reinforcement learning, imitation learning

4:34

16/11/2020

Interactive Imitation Learning in State-Space

Snehal Jauhri, Carlos Celemin, Jens Kober

5:05

18/07/2021

Demonstration-Conditioned Reinforcement Learning for Few-Shot Imitation

Christopher Dance, Perez Julien, Théo Cachet

Keywords: Reinforcement Learning and Planning, Planning and Control

5:13

04/07/2020

Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition

Ryuichi Takanobu, Runze Liang, Minlie Huang

Keywords: pretraining, Multi-Agent Learning, Role-Aware Decomposition, reinforcement learning

13:00

06/12/2021

Visual Adversarial Imitation Learning using Variational Models

Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn

Keywords: theory, reinforcement learning and planning, adversarial robustness and security, representation learning

7:25

18/07/2021

PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training

Kimin Lee, Laura Smith, Pieter Abbeel

Keywords: Reinforcement Learning and Planning, Deep RL

15:02

04/07/2020

Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation

Xinting Huang, Jianzhong Qi, Yu Sun, Rui Zhang

Keywords: Semi-Supervised Learning, generalization function, Stochastic Estimation, Dialogue optimization

11:31

18/07/2021

MURAL: Meta-Learning Uncertainty-Aware Rewards for Outcome-Driven Reinforcement Learning

Kevin Li, Abhishek Gupta, Ashwin D Reddy, Vitchyr Pong, Aurick Zhou, Justin Yu, Sergey Levine

Keywords: Reinforcement Learning and Planning, Deep RL

5:16

04/07/2020

Learning Dialog Policies from Weak Demonstrations

Gabriel Gordon-Hall, Philip John Gorinski, Shay B. Cohen

Keywords: Weak Demonstrations, dialog manager, multi-domain systems, expert demonstrators

11:14

03/05/2021

Learning to Represent Action Values as a Hypergraph on the Action Vertices

Arash Tavakoli, Mehdi Fatemi, Petar Kormushev

Keywords: reinforcement learning, learning action representations, multi-dimensional discrete action spaces, structural inductive bias, structural credit assignment

3:43

06/12/2020

AvE: Assistance via Empowerment

Yuqing Du, Stas Tiomkin, Emre Kiciman, Daniel Polani, Pieter Abbeel, Anca Dragan

3:23

18/07/2021

Goal-Conditioned Reinforcement Learning with Imagined Subgoals

Elliot Chane-Sane, Cordelia Schmid, Ivan Laptev

Keywords: Reinforcement Learning and Planning, Deep RL

4:57

08/12/2020

LAVA: Latent Action Spaces via Variational Auto-encoding for Dialogue Policy Optimization

Nurul Lubis, Christian Geishauser, Michael Heck, Hsien-chin Lin, Marco Moresi, Carel van Niekerk, Milica Gasic

15:12

01/07/2020

Multi-Action Dialog Policy Learning with Interactive Human Teaching

Megha Jhunjhunwala, Caleb Bryant, Pararth Shah

7:09

18/07/2021

Guided Exploration with Proximal Policy Optimization using a Single Demonstration

Gabriele Libardi, Gianni De Fabritiis, Sebastian Dittert

Keywords: Reinforcement Learning and Planning, Deep RL

5:12

06/12/2021

Teachable Reinforcement Learning via Advice Distillation

Olivia Watkins, Abhishek Gupta, Trevor Darrell, Pieter Abbeel, Jacob Andreas

Keywords: reinforcement learning and planning, active learning

12:45

16/11/2020

The EMPATHIC Framework for Task Learning from Implicit Human Feedback

Yuchen Cui, Qiping Zhang, Brad Knox, Alessandro Allievi, Peter Stone, Scott Niekum

5:11

12/07/2020

Variational Imitation Learning with Diverse-quality Demonstrations

Voot Tangkaratt, Bo Han, Mohammad Emtiyaz Khan, Masashi Sugiyama

Keywords: Reinforcement Learning - Deep RL

13:52

06/12/2021

Episodic Multi-agent Reinforcement Learning with Curiosity-driven Exploration

Lulu Zheng, Jiarui Chen, Jianhao Wang, Jiamin He, Yujing Hu, Yingfeng Chen, Changjie Fan, Yang Gao, Chongjie Zhang

Keywords: reinforcement learning and planning

12:25

04/07/2020

Data Manipulation: Towards Effective Instance Learning for Neural Dialogue Generation via Learning to Augment and Reweight

Hengyi Cai, Hongshen Chen, Yonghao Song, Cheng Zhang, Xiaofang Zhao, Dawei Yin

Keywords: Data Manipulation, Neural Generation, learning, dialogue generation

9:39

02/02/2021

Automatic Curriculum Learning With Over-repetition Penalty for Dialogue Policy Learning

Yangyang Zhao, Zhenyu Wang, Zhenhua Huang

15:41

06/12/2021

Confidence-Aware Imitation Learning from Demonstrations with Varying Optimality

Songyuan Zhang, Zhangjie Cao, Dorsa Sadigh, Yanan Sui

Keywords: reinforcement learning and planning

13:50

18/07/2021

Targeted Data Acquisition for Evolving Negotiation Agents

Minae Kwon, Sidd Karamcheti, Mariano-Florentino Cuellar, Dorsa Sadigh

Keywords: Reinforcement Learning and Planning, Multi-Agent RL

5:15

26/04/2020

Automated curriculum generation through setter-solver interactions

Sebastien Racaniere, Andrew Lampinen, Adam Santoro, David Reichert, Vlad Firoiu, Timothy Lillicrap

Keywords: Deep Reinforcement Learning, Automatic Curriculum

3:55

25/07/2020

GoChat: Goal-oriented chatbots with hierarchical reinforcement learning

Jianfeng Liu, Feiyang Pan, Ling Luo

Keywords: dialogue system, reinforcement learning, goal-oriented chatbot

9:15

06/12/2021

Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning

Jinxin Liu, Hao Shen, Donglin Wang, Yachen Kang, Qiangxing Tian

Keywords: reinforcement learning and planning, domain adaptation

8:07

26/04/2020

Intrinsic Motivation for Encouraging Synergistic Behavior

Rohan Chitnis, Shubham Tulsiani, Saurabh Gupta, Abhinav Gupta

Keywords: reinforcement learning, intrinsic motivation, synergistic, robot manipulation

5:02

26/04/2020

Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning

Hengyuan Hu, Jakob N Foerster

Keywords: multi-agent RL, theory of mind

5:20

26/04/2020

Dynamical Distance Learning for Semi-Supervised and Unsupervised Skill Discovery

Kristian Hartikainen, Xinyang Geng, Tuomas Haarnoja, Sergey Levine

Keywords: reinforcement learning, semi-supervised learning, unsupervised learning, robotics, deep learning

5:07

16/11/2020

Model-Based Inverse Reinforcement Learning from Visual Demonstrations

Neha Das, Sarah Bechtle, Todor Davchev, Dinesh Jayaraman, Akshara Rai, Franziska Meier

5:03

19/08/2021

Reward-Constrained Behavior Cloning

Zhaorong Wang, Meng Wang, Jingqi Zhang, Yingfeng Chen, Chongjie Zhang

Keywords: Machine Learning, Deep Reinforcement Learning, Reinforcement Learning, Constraint Optimization

14:43

03/05/2021

Learning to Reach Goals via Iterated Supervised Learning

Dibya Ghosh, Abhishek Gupta, Ashwin D Reddy, Justin Fu, Coline M Devin, Ben Eysenbach, Sergey Levine

Keywords: goal reaching, reinforcement learning, goal-conditioned RL, behavior cloning

15:19

06/12/2020

Policy Improvement via Imitation of Multiple Oracles

Ching-An Cheng, Andrey Kolobov, Alekh Agarwal

3:12

13/04/2021

Sample elicitation

Jiaheng Wei, Zuyue Fu, Yang Liu, Xingyu Li, Zhuoran Yang, Zhaoran Wang

3:16

06/12/2021

Discovery of Options via Meta-Learned Subgoals

Vivek Veeriah, Tom Zahavy, Matteo Hessel, Zhongwen Xu, Junhyuk Oh, Iurii Kemaev, Hado van Hasselt, David Silver, Satinder Singh

Keywords: deep learning, reinforcement learning and planning

4:13

06/12/2021

Bridging the Imitation Gap by Adaptive Insubordination

Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alex Schwing

Keywords: reinforcement learning and planning

6:51

18/07/2021

Interactive Learning from Activity Description

Khanh Nguyen, Dipendra Misra, Robert Schapire, Miro Dudik, Patrick Shafto

Keywords: Reinforcement Learning and Planning

4:57

26/04/2020

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

Siddharth Reddy, Anca D. Dragan, Sergey Levine

Keywords: Imitation Learning, Reinforcement Learning

4:38

19/08/2021

Bayesian Experience Reuse for Learning from Multiple Demonstrators

Mike Gimelfarb, Scott Sanner, Chi-Guhn Lee

Keywords: Machine Learning, Deep Reinforcement Learning, Transfer, Adaptation, Multi-task Learning, Approximate Probabilistic Inference, Bayesian Networks

12:09

06/12/2020

Zero-Resource Knowledge-Grounded Dialogue Generation

Linxiao Li, Can Xu, Wei Wu, Yufan Zhao, Xueliang Zhao, Chongyang Tao

3:22