01/07/2020

How Self-Attention Improves Rare Class Performance in a Question-Answering Dialogue Agent

Adam Stiff, Qi Song, Eric Fosler-Lussier

Abstract: Contextualized language modeling using deep Transformer networks has been applied to a variety of natural language processing tasks with remarkable success. However, we find that these models are not a panacea for a question-answering dialogue agent corpus task, which has hundreds of classes in a long-tailed frequency distribution, with only thousands of data points. Instead, we find substantial improvements in recall and accuracy on rare classes from a simple one-layer RNN with multi-headed self-attention and static word embeddings as inputs. While much research has used attention weights to illustrate what input is important for a task, the complexities of our dialogue corpus offer a unique opportunity to examine how the model represents what it attends to, and we offer a detailed analysis of how that contributes to improved performance on rare classes. A particularly interesting phenomenon we observe is that the model picks up implicit meanings by splitting different aspects of the semantics of a single word across multiple attention heads.
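
The model described in the abstract can be pictured with a short sketch: static word embeddings feeding a one-layer RNN whose states are pooled by multi-headed self-attention before classification. This is a hypothetical PyTorch illustration, not the authors' implementation; the embedding source, the choice of a bidirectional GRU, the hidden size, number of heads, class count, and mean pooling over the attended states are all assumptions made only to keep the example concrete.

    import torch
    import torch.nn as nn

    class AttentiveRNNClassifier(nn.Module):
        """Sketch of a one-layer RNN with multi-headed self-attention
        over static word embeddings (illustrative hyperparameters)."""
        def __init__(self, vocab_size=10000, embed_dim=300, hidden_dim=128,
                     num_heads=4, num_classes=300):
            super().__init__()
            # Static (frozen) word embeddings; in practice these would be
            # initialized from pretrained vectors.
            self.embedding = nn.Embedding(vocab_size, embed_dim)
            self.embedding.weight.requires_grad = False
            # One-layer bidirectional GRU over the embedded question.
            self.rnn = nn.GRU(embed_dim, hidden_dim, num_layers=1,
                              batch_first=True, bidirectional=True)
            # Multi-headed self-attention over the RNN outputs.
            self.attn = nn.MultiheadAttention(embed_dim=2 * hidden_dim,
                                              num_heads=num_heads,
                                              batch_first=True)
            self.classifier = nn.Linear(2 * hidden_dim, num_classes)

        def forward(self, token_ids):
            x = self.embedding(token_ids)            # (batch, seq, embed_dim)
            h, _ = self.rnn(x)                       # (batch, seq, 2*hidden_dim)
            attended, weights = self.attn(h, h, h)   # queries = keys = values
            pooled = attended.mean(dim=1)            # pool over time steps
            return self.classifier(pooled), weights

    # Example forward pass on a batch of two 12-token questions.
    model = AttentiveRNNClassifier()
    logits, attn_weights = model(torch.randint(0, 10000, (2, 12)))
    print(logits.shape)        # torch.Size([2, 300])
    print(attn_weights.shape)  # torch.Size([2, 12, 12])

Returning the per-head (here head-averaged) attention weights alongside the logits is what enables the kind of analysis the abstract describes, i.e. inspecting how different heads carve up the semantics of individual words.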

The talk and the paper were presented at the SIGDIAL 2020 virtual conference.
