How Does Selective Mechanism Improve Self-Attention Networks?

04/07/2020

Xinwei Geng, Longyue Wang, Xing Wang, Bing Qin, Ting Liu, Zhaopeng Tu

Keywords: NLP tasks, natural language inference, semantic role labelling, machine translation

Abstract: Self-attention networks (SANs) with a selective mechanism have produced substantial improvements in various NLP tasks by concentrating on a subset of input words. However, the underlying reasons for their strong performance have not been well explained. In this paper, we bridge the gap by assessing the strengths of selective SANs (SSANs), which are implemented with a flexible and universal Gumbel-Softmax. Experimental results on several representative NLP tasks, including natural language inference, semantic role labelling, and machine translation, show that SSANs consistently outperform the standard SANs. Through well-designed probing experiments, we empirically validate that the improvement of SSANs can be attributed in part to mitigating two commonly cited weaknesses of SANs: word order encoding and structure modeling. Specifically, the selective mechanism improves SANs by paying more attention to content words that contribute to the meaning of the sentence.
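The abstract says the selective mechanism is implemented with Gumbel-Softmax but gives no further detail on this page. The following is a minimal sketch of the general idea, not the authors' implementation: a relaxed Gumbel-Softmax sample acts as a soft per-word gate that rescales ordinary attention weights. The function names, the two-way [skip, select] parameterisation, the selector scores, and the temperature `tau` are all assumptions made for illustration.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_softmax(logits, tau=1.0, rng=random):
    """Draw one relaxed (soft) sample from the categorical given by logits."""
    noisy = []
    for logit in logits:
        # Clamp the uniform draw so that both logarithms stay finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))  # Gumbel(0, 1) noise
        noisy.append((logit + gumbel) / tau)
    return softmax(noisy)


def selective_attention(att_logits, sel_logits, tau=0.5, rng=random):
    """Gate the attention of one query over n keys (illustrative only).

    att_logits: raw attention scores, one per key.
    sel_logits: hypothetical selector scores, one per key; each is treated
                as the logit of "select" against a fixed 0.0 for "skip".
    """
    gates = [gumbel_softmax([0.0, s], tau, rng)[1] for s in sel_logits]
    weights = softmax(att_logits)
    gated = [w * g for w, g in zip(weights, gates)]
    total = sum(gated) + 1e-12  # guard against all gates being closed
    return [x / total for x in gated], gates


if __name__ == "__main__":
    random.seed(0)
    # Three words; the middle one has a strongly negative selector score.
    weights, gates = selective_attention([1.0, 2.0, 0.5], [4.0, -4.0, 4.0])
    print("gates:  ", [round(g, 3) for g in gates])
    print("weights:", [round(w, 3) for w in weights])
```

Lower values of `tau` push each gate closer to a hard 0/1 decision, while higher values keep the gates soft; in a trainable model the gates would be computed with an autodiff library so that gradients flow through the relaxed sample.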

The talk and the corresponding paper were published at the ACL 2020 virtual conference.

Similar Papers

25/07/2020

Symmetric regularization based BERT for pair-wise semantic reasoning

Weidi Xu, Xingyi Cheng, Kunlong Chen, Taifeng Wang

Keywords: BERT, natural language inference

8:58

14/06/2020

More Grounded Image Captioning by Distilling Image-Text Matching Model

Yuanen Zhou, Meng Wang, Daqing Liu, Zhenzhen Hu, Hanwang Zhang

Keywords: grounded image captioning, image-text matching, visual grounding, cross-task knowledge distillation

1:01

04/07/2020

Feature Projection for Improved Text Classification

Qi Qin, Wenpeng Hu, Bing Liu

Keywords: text classification, classification, sentiment classification, BERT classification

10:57

16/11/2020

On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment

Zirui Wang, Zachary C. Lipton, Yulia Tsvetkov

Keywords: multilingual models, meta-learning algorithm, multilingual representations, negative interference

12:03

02/02/2021

Have We Solved The Hard Problem? It’s Not Easy! Contextual Lexical Contrast as a Means to Probe Neural Coherence

Wenqiang Lei, Yisong Miao, Runpeng Xie, Bonnie Webber, Meichun Liu, Tat-Seng Chua, Nancy F. Chen

18:55

14/09/2020

On Saliency Maps and Adversarial Robustness

Puneet Mangla, Vedant Singh, Vineeth Balasubramanian

Keywords: adversarial robustness, saliency maps, deep neural networks

17:29

02/02/2021

Merging Statistical Feature via Adaptive Gate for Improved Text Classification

Xianming Li, Zongxi Li, Haoran Xie, Qing Li

14:56

03/05/2021

Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition

Yangming Li, Lemao Liu, Shuming Shi

Keywords: negative sampling, unlabeled entity problem, named entity recognition

4:49

14/06/2020

Attention-Guided Hierarchical Structure Aggregation for Image Matting

Yu Qiao, Yuhao Liu, Xin Yang, Dongsheng Zhou, Mingliang Xu, Qiang Zhang, Xiaopeng Wei

Keywords: image matting, attention, hierarchical, aggregation, appearance cues

0:59

26/04/2020

Self-Adversarial Learning with Comparative Discrimination for Text Generation

Wangchunshu Zhou, Tao Ge, Ke Xu, Furu Wei, Ming Zhou

Keywords: adversarial learning, text generation

9:16

06/12/2021

Can contrastive learning avoid shortcut solutions?

Joshua Robinson, Li Sun, Ke Yu, Kayhan Batmanghelich, Stefanie Jegelka, Suvrit Sra

Keywords: self-supervised learning, contrastive learning

12:45

01/07/2020

Joint Training with Semantic Role Labeling for Better Generalization in Natural Language Inference

Cemil Cengiz, Deniz Yuret

4:38

16/11/2020

COD3S: Diverse Generation with Discrete Semantic Signatures

Nathaniel Weir, João Sedoc, Benjamin Van Durme

Keywords: causal generation, cods, neural models, seqseqs

7:09

08/12/2020

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš

13:01

16/11/2020

Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning

Yuning Mao, Yanru Qu, Yiqing Xie, Xiang Ren, Jiawei Han

Keywords: single-document summarization, single-document sds, multi-document summarization, multi-document mds

10:58

13/04/2021

Improving adversarial robustness via unlabeled out-of-domain data

Zhun Deng, Linjun Zhang, Amirata Ghorbani, James Zou

3:01

19/08/2021

Modelling General Properties of Nouns by Selectively Averaging Contextualised Embeddings

Na Li, Zied Bouraoui, Jose Camacho-Collados, Luis Espinosa-Anke, Qing Gu, Steven Schockaert

Keywords: Natural Language Processing, Natural Language Semantics

14:09

04/07/2020

Improving Candidate Generation for Low-resource Cross-lingual Entity Linking

Shuyan Zhou, Shruti Rijhwani, John Wieting, Jaime Carbonell, Graham Neubig

Keywords: Candidate Generation, Low-resource Linking, Cross-lingual linking, Cross-lingual XEL

12:03

16/11/2020

An Unsupervised Sentence Embedding Method by Mutual Information Maximization

Yan Zhang, Ruidan He, Zuozhu Liu, Kwan Hui Lim, Lidong Bing

Keywords: sentence-pair tasks, clustering, semantic search, downstream tasks

12:22

16/11/2020

An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction

Bhargavi Paranjape, Mandar Joshi, John Thickstun, Hannaneh Hajishirzi, Luke Zettlemoyer

Keywords: language understanding, semi-supervised setting, complex models, explainer

11:44

08/12/2020

An analysis of language models for metaphor recognition

Arthur Neidlein, Philip Wiesenbach, Katja Markert

13:52

02/02/2021

Adversarial Training Reduces Information and Improves Transferability

Matteo Terzi, Alessandro Achille, Marco Maggipinto, Gian Antonio Susto

19:54

04/07/2020

Syntax-Aware Opinion Role Labeling with Dependency Graph Convolutional Networks

Bo Zhang, Yue Zhang, Rui Wang, Zhenghua Li, Min Zhang

Keywords: Syntax-Aware Labeling, Opinion labeling, ORL, opinion task

11:47

14/06/2020

Harmonizing Transferability and Discriminability for Adapting Object Detectors

Chaoqi Chen, Zebiao Zheng, Xinghao Ding, Yue Huang, Qi Dou

Keywords: unsupervised domain adaptation, cross-domain object detection, transfer learning, deep learning, hierarchical transferability calibration

1:01

03/05/2021

Variational Information Bottleneck for Effective Low-Resource Fine-Tuning

Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson

Keywords: variational information bottleneck, biases, robust, over-fitting, large-scale pre-trained language models, NLP, transfer learning

5:07

05/12/2020

AMR quality rating with a lightweight CNN

Juri Opitz

14:54

03/05/2021

FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders

Pengyu Cheng, Weituo Hao, Siyang Yuan, Shijing Si, Lawrence Carin

Keywords: Mutual Information, Pretrained Text Encoders, Contrastive Learning, Fairness

4:43

01/07/2020

Learning Probabilistic Sentence Representations from Paraphrases

Mingda Chen, Kevin Gimpel

5:00

19/04/2021

Elastic weight consolidation for better bias inoculation

James Thorne, Andreas Vlachos

6:17

04/07/2020

Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation

Arya D. McCarthy, Xian Li, Jiatao Gu, Ning Dong

Keywords: Variational Translation, posterior collapse, auxiliary task, uncertainty

11:00

25/07/2020

Training effective neural CLIR by bridging the translation gap

Hamed Bonab, Sheikh Muhammad Sarwar, James Allan

Keywords: cross-lingual word embedding, cross-lingual information retrieval, neural CLIR, translation gap

15:33

06/12/2021

Generalized and Discriminative Few-Shot Object Detection via SVD-Dictionary Enhancement

Aming Wu, Suqi Zhao, Cheng Deng, Wei Liu

Keywords: machine learning, vision

9:04

06/12/2021

Joint Semantic Mining for Weakly Supervised RGB-D Salient Object Detection

Jingjing Li, Wei Ji, Qi Bi, Cheng Yan, Miao Zhang, Yongri Piao, Huchuan Lu, Li Cheng

Keywords: vision

9:03

14/06/2020

On Vocabulary Reliance in Scene Text Recognition

Zhaoyi Wan, Jielei Zhang, Liang Zhang, Jiebo Luo, Cong Yao

Keywords: scene text recognition, text spotting, document analysis, OCR, scene text detection, sequence recognition, language and vision

1:00

04/07/2020

Moving Down the Long Tail of Word Sense Disambiguation with Gloss Informed Bi-encoders

Terra Blevins, Luke Zettlemoyer

Keywords: Word Disambiguation, Word WSD, WSD, sense disambiguation

11:18

04/07/2020

Modelling Context and Syntactical Features for Aspect-based Sentiment Analysis

Minh Hieu Phan, Philip O. Ogunbona

Keywords: Modelling Context, Aspect-based Analysis, aspect extraction, aspect classification

11:45

16/11/2020

Unsupervised Adaptation of Question Answering Systems via Generative Self-training

Steven Rennie, Etienne Marcheret, Neil Mallinar, David Nahamoo, Vaibhava Goel

Keywords: question-answering tasks, self-supervised tasks, word masking, sentence entailment

13:14

30/11/2020

Scale-Aware Polar Representation for Arbitrarily-Shaped Text Detection

Yanguang Bi, Zhiqiang Hu

9:56

02/02/2021

TextGAIL: Generative Adversarial Imitation Learning for Text Generation

Qingyang Wu, Lei Li, Zhou Yu

16:41

16/11/2020

Avoiding the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training

Joe Stacey, Pasquale Minervini, Haim Dubossarsky, Sebastian Riedel, Tim Rocktäschel

Keywords: neural networks, adversarial training, sentence representations, NLI models

7:28