25/07/2020

GoChat: Goal-oriented chatbots with hierarchical reinforcement learning

Jianfeng Liu, Feiyang Pan, Ling Luo

Keywords: dialogue system, reinforcement learning, goal-oriented chatbot

Abstract: A chatbot that converses like a human should be goal-oriented (i.e., be purposeful in conversation), which is beyond language generation. However, existing goal-oriented dialogue systems often heavily rely on cumbersome hand-crafted rules or costly labelled datasets, which limits the applicability. In this paper, we propose Goal-oriented Chatbots (GoChat), a framework for end-to-end training the chatbot to maximize the long-term return from offline multi-turn dialogue datasets. Our framework utilizes hierarchical reinforcement learning (HRL), where the high-level policy determines some sub-goals to guide the conversation towards the final goal, and the low-level policy fulfills the sub-goals by generating the corresponding utterance for response. In our experiments conducted on a real-world dialogue dataset for anti-fraud in financial, our approach outperforms previous methods on both the quality of response generation as well as the success rate of accomplishing the goal.

The video of this talk cannot be embedded. You can watch it here:
https://dl.acm.org/doi/10.1145/3397271.3401250#sec-supp
(Link will open in new window)
 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at SIGIR 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers