Adaptive Dialog Policy Learning with Hindsight and User Modeling

01/07/2020

Yan Cao, Keting Lu, Xiaoping Chen, Shiqi Zhang

Abstract: Reinforcement learning (RL) methods have been widely used for learning dialog policies. Sample efficiency, i.e., the efficiency of learning from limited dialog experience, is particularly important in RL-based dialog policy learning, because interacting with people is costly and low-quality dialog policies produce very poor user experience. In this paper, we develop LHUA (Learning with Hindsight, User modeling, and Adaptation) that, for the first time, enables dialog agents to adaptively learn with hindsight from both simulated and real users. Simulation and hindsight provide the dialog agent with more experience and more (positive) reinforcement respectively. Experimental results suggest that LHUA outperforms competitive baselines from the literature, including its no-simulation, no-adaptation, and no-hindsight counterparts.

The talk and the respective paper were published at the SIGDIAL 2020 virtual conference.

Similar Papers

18/07/2021

Uncertainty Weighted Actor-Critic for Offline Reinforcement Learning

Yue Wu, Shuangfei Zhai, Nitish Srivastava, Josh Susskind, Jian Zhang, Russ Salakhutdinov, Hanlin Goh

Keywords: Reinforcement Learning and Planning, Deep RL

5:01

06/12/2020

Improving Generalization in Reinforcement Learning with Mixture Regularization

Kaixin Wang, Bingyi Kang, Jie Shao, Jiashi Feng

3:14

01/07/2020

Learning to Classify Intents and Slot Labels Given a Handful of Examples

Jason Krone, Yi Zhang, Mona Diab

11:52

18/07/2021

PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning

Angelos Filos, Clare Lyle, Yarin Gal, Sergey Levine, Natasha Jaques, Gregory Farquhar

Keywords: Reinforcement Learning and Planning, Deep RL

15:18

06/12/2020

The Power of Comparisons for Actively Learning Linear Classifiers

Max Hopkins, Daniel Kane, Shachar Lovett

3:25

06/12/2020

Off-Policy Imitation Learning from Observations

Zhuangdi Zhu, Kaixiang Lin, Bo Dai, Jiayu Zhou

3:24

18/07/2021

A Representation Learning Perspective on the Importance of Train-Validation Splitting in Meta-Learning

Nikunj Saunshi, Arushi Gupta, Wei Hu

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

5:20

18/07/2021

REPAINT: Knowledge Transfer in Deep Reinforcement Learning

Yunzhe Tao, Sahika Genc, Jonathan Chung, Tao Sun, Sunil Mallya

Keywords: Algorithms, Ranking and Preference Learning, Algorithms, Regression; Applications, Health; Theory, Learning Theory; Theory, Regularization, Algorithms, Multitask, Transfer, and Meta Learning

5:04

06/12/2021

Uniform Sampling over Episode Difficulty

Sébastien Arnold, Guneet Dhillon, Avinash Ravichandran, Stefano Soatto

Keywords: meta learning, few shot learning

13:08

19/08/2021

Cross-Domain Few-Shot Classification via Adversarial Task Augmentation

Haoqing Wang, Zhi-Hong Deng

Keywords: Computer Vision, Recognition, Adversarial Machine Learning, Deep Learning

10:39

06/12/2020

Minimax Lower Bounds for Transfer Learning with Linear and One-hidden Layer Neural Networks

Mohammadreza Mousavi Kalan, Zalan Fabian, Salman Avestimehr, Mahdi Soltanolkotabi

3:16

06/12/2021

Visual Adversarial Imitation Learning using Variational Models

Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn

Keywords: theory, reinforcement learning and planning, adversarial robustness and security, representation learning

7:25

06/12/2021

Bridging the Imitation Gap by Adaptive Insubordination

Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alex Schwing

Keywords: reinforcement learning and planning

6:51

18/07/2021

PC-MLP: Model-based Reinforcement Learning with Policy Cover Guided Exploration

Yuda Song, Wen Sun

Keywords: Reinforcement Learning and Planning

5:13

18/07/2021

Accelerating Safe Reinforcement Learning with Constraint-mismatched Baseline Policies

Jimmy Yang, Justinian Rosca, Karthik Narasimhan, Peter Ramadge

Keywords: Algorithms, Adversarial Learning, Applications, Computer Vision; Deep Learning, Adversarial Networks; Deep Learning, Generative Models, Reinforcement Learning and Planning, Deep RL

5:20

18/07/2021

Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning

Hiroki Furuta, Tatsuya Matsushima, Tadashi Kozuno, Yutaka Matsuo, Sergey Levine, Ofir Nachum, Shixiang Gu

Keywords: Reinforcement Learning and Planning, Deep RL

5:51

06/12/2021

COMBO: Conservative Offline Model-Based Policy Optimization

Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn

Keywords: deep learning, optimization, reinforcement learning and planning

12:35

06/12/2021

Which Mutual-Information Representation Learning Objectives are Sufficient for Control?

Kate Rakelly, Abhishek Gupta, Carlos Florensa, Sergey Levine

Keywords: reinforcement learning and planning, representation learning

10:44

06/12/2021

A Theoretical Analysis of Fine-tuning with Linear Teachers

Gal Shachaf, Alon Brutzkus, Amir Globerson

Keywords: theory, deep learning, transfer learning

14:01

18/07/2021

Examining and Combating Spurious Features under Distribution Shift

Chunting Zhou, Xuezhe Ma, Paul Michel, Graham Neubig

Keywords: Deep Learning, Embedding and Representation learning

5:53

03/05/2021

When Do Curricula Work?

Xiaoxia (Shirley) Wu, Ethan Dyer, Behnam Neyshabur

Keywords: Empirical Investigation, Understanding Deep Learning, Curriculum Learning

14:37

26/04/2020

Ranking Policy Gradient

Kaixiang Lin, Jiayu Zhou

Keywords: Sample-efficient reinforcement learning, off-policy learning

5:43

03/05/2021

Hierarchical Reinforcement Learning by Discovering Intrinsic Options

Jesse Zhang, Haonan Yu, Wei Xu

Keywords: reinforcement learning, unsupervised skill discovery, exploration, options, hierarchical reinforcement learning

4:58

03/05/2021

Meta-learning with negative learning rates

Alberto Bernacchia

Keywords: Meta-learning

5:19

12/07/2020

Fair Generative Modeling via Weak Supervision

Kristy Choi, Aditya Grover, Trisha Singh, Rui Shu, Stefano Ermon

Keywords: Deep Learning - Generative Models and Autoencoders

13:01

13/04/2021

On the generalization properties of adversarial training

Yue Xing, Qifan Song, Guang Cheng

3:05

18/07/2021

Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision

Johan Björck, Xiangyu Chen, Christopher De Sa, Carla Gomes, Kilian Weinberger

Keywords: Reinforcement Learning and Planning

5:19

25/07/2020

Deep critiquing for VAE-based recommender systems

Kai Luo, Hojin Yang, Ga Wu, Scott Sanner

Keywords: deep learning, recommender systems, critiquing

14:15

26/04/2020

On the interaction between supervision and self-play in emergent communication

Ryan Lowe, Abhinav Gupta, Jakob Foerster, Douwe Kiela, Joelle Pineau

Keywords: multi-agent communication, self-play, emergent languages

5:02

22/11/2021

Selective Pseudo-Labeling with Reinforcement Learning for Semi-Supervised Domain Adaptation

Bingyu Liu, Yuhong Guo, Jieping Ye, Weihong Deng

Keywords: semi-supervised domain adaptation, reinforcement learning, pseudo-label

3:02

06/12/2020

Generalized Hindsight for Reinforcement Learning

Alex Li, Lerrel Pinto, Pieter Abbeel

3:20

03/05/2021

MoPro: Webly Supervised Learning with Momentum Prototypes

Junnan Li, Caiming Xiong, Steven Hoi

Keywords: weakly-supervised learning, webly-supervised learning, contrastive learning, representation learning

4:47

06/12/2020

A Variational Approach for Learning from Positive and Unlabeled Data

Hui Chen, Fangqing Liu, Yin Wang, Liyue Zhao, Hao Wu

3:13

06/12/2021

IQ-Learn: Inverse soft-Q Learning for Imitation

Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Stefano Ermon

Keywords: optimization, reinforcement learning and planning, adversarial robustness and security

8:25

06/12/2021

To Beam Or Not To Beam: That is a Question of Cooperation for Language GANs

Thomas Scialom, Paul-Alexis Dray, Jacopo Staiano, Sylvain Lamprier, Benjamin Piwowarski

Keywords: reinforcement learning and planning, generative model

9:26

04/07/2020

Active Imitation Learning with Noisy Guidance

Kianté Brantley, Hal Daumé III, Amr Sharaf

Keywords: Active Learning, structured tasks, sequence tasks, Imitation algorithms

7:59

13/04/2021

Finite-sample regret bound for distributionally robust offline tabular reinforcement learning

Zhengqing Zhou, Zhengyuan Zhou, Qinxun Bai, Linhai Qiu, Jose Blanchet, Peter Glynn

3:02

12/07/2020

Structured Prediction with Partial Labelling through the Infimum Loss

Vivien Cabannes, Francis Bach, Alessandro Rudi

Keywords: Sequential, Network, and Time-Series Modeling

13:01

06/12/2021

Curriculum Offline Imitating Learning

Minghuan Liu, Hanye Zhao, Zhengyu Yang, Jian Shen, Weinan Zhang, Li Zhao, Tie-Yan Liu

Keywords: reinforcement learning and planning

14:28

13/04/2021

Learning with gradient descent and weakly convex losses

Dominic Richards, Mike Rabbat

3:20