Speakers Fill Lexical Semantic Gaps with Context

16/11/2020

Tiago Pimentel, Rowan Hall Maudslay, Damian Blasi, Ryan Cotterell

Keywords: bert-based ambiguity, human annotation, lexical ambiguity, ambiguous words

Abstract: Lexical ambiguity is widespread in language, allowing for the reuse of economical word forms and therefore making language more efficient. If ambiguous words cannot be disambiguated from context, however, this gain in efficiency might make language less clear, resulting in frequent miscommunication. For a language to be clear and efficiently encoded, we posit that the lexical ambiguity of a word type should correlate with how much information context provides about it, on average. To investigate whether this is the case, we operationalise the lexical ambiguity of a word as the entropy of meanings it can take, and provide two ways to estimate this: one which requires human annotation (using WordNet), and one which does not (using BERT), making it readily applicable to a large number of languages. We validate these measures by showing that, on six high-resource languages, there are significant Pearson correlations between our BERT-based estimate of ambiguity and the number of synonyms a word has in WordNet (e.g. $\rho = 0.40$ in English). We then test our main hypothesis, that a word's lexical ambiguity should negatively correlate with its contextual uncertainty, and find significant correlations on all 18 typologically diverse languages we analyse. This suggests that, in the presence of ambiguity, speakers compensate by making contexts more informative.
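
To make the two quantities in the abstract concrete, here is a minimal Python sketch. It is not the authors' WordNet or BERT estimator: it assumes each word already comes with a list of observed sense labels, computes the Shannon entropy of that empirical sense distribution as a stand-in for lexical ambiguity, and then computes a Pearson correlation against a second per-word score. The function names, the toy sense labels, and the contextual-uncertainty numbers are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def sense_entropy(sense_labels):
    """Shannon entropy (in bits) of the empirical distribution over sense labels."""
    counts = Counter(sense_labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


if __name__ == "__main__":
    # Hypothetical toy data: sense labels observed for each word type.
    observed_senses = {
        "bank": ["river", "finance", "finance", "river"],
        "set": ["group", "place", "tennis", "adjust"],
        "oxygen": ["gas", "gas", "gas", "gas"],
    }
    # Hypothetical per-word contextual uncertainty scores (made-up numbers).
    contextual_uncertainty = {"bank": 3.1, "set": 2.4, "oxygen": 5.0}

    words = sorted(observed_senses)
    ambiguity = [sense_entropy(observed_senses[w]) for w in words]
    uncertainty = [contextual_uncertainty[w] for w in words]

    for w, h in zip(words, ambiguity):
        print(f"{w}: ambiguity = {h:.2f} bits")
    print(f"Pearson correlation: {pearson(ambiguity, uncertainty):.2f}")
```

Under the hypothesis stated in the abstract, one would expect this correlation to be negative on real data; the toy numbers above are constructed by hand and demonstrate only the mechanics.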

The talk and the corresponding paper were published at the EMNLP 2020 virtual conference.

Similar Papers

14/06/2020

More Grounded Image Captioning by Distilling Image-Text Matching Model

Yuanen Zhou, Meng Wang, Daqing Liu, Zhenzhen Hu, Hanwang Zhang

Keywords: grounded image captioning, image-text matching, visual grounding, cross-task knowledge distillation

1:01

16/11/2020

Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank

Eleftheria Briakou, Marine Carpuat

Keywords: detecting content, cross-lingual nlp, machine problem, annotation

11:06

08/12/2020

An analysis of language models for metaphor recognition

Arthur Neidlein, Philip Wiesenbach, Katja Markert

13:52

19/04/2021

WER-BERT: Automatic WER estimation with BERT in a balanced ordinal classification paradigm

Akshay Krishna Sheshadri, Anvesh Rao Vijjini, Sukhdeep Kharbanda

11:45

02/02/2021

Have We Solved The Hard Problem? It’s Not Easy! Contextual Lexical Contrast as a Means to Probe Neural Coherence

Wenqiang Lei, Yisong Miao, Runpeng Xie, Bonnie Webber, Meichun Liu, Tat-Seng Chua, Nancy F. Chen

18:55

19/04/2021

Disambiguatory signals are stronger in word-initial positions

Tiago Pimentel, Ryan Cotterell, Brian Roark

11:35

16/11/2020

CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models

Nikita Nangia, Clara Vania, Rasika Bhalerao, Samuel R. Bowman

Keywords: nlp tasks, pretrained models, masked models, mlms

10:56

16/11/2020

Compositional and Lexical Semantics in RoBERTa, BERT and DistilBERT: A Case Study on CoQA

Ieva Staliūnaitė, Ignacio Iacobacci

Keywords: nlp tasks, conversational task, semantic labeling, contextualized embeddings

11:23

16/11/2020

Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data

Lingkai Kong, Haoming Jiang, Yuchen Zhuang, Jie Lyu, Tuo Zhao, Chao Zhang

Keywords: augmented training, in-distribution calibration, text classification, expectation error

11:47

06/12/2021

Relative Uncertainty Learning for Facial Expression Recognition

Yuhang Zhang, Chengrui Wang, Weihong Deng

8:12

04/07/2020

CluBERT: A Cluster-Based Approach for Learning Sense Distributions in Multiple Languages

Tommaso Pasini, Federico Scozzafava, Bianca Scarlini

Keywords: English tasks, disambiguation, multilingual tasks, CluBERT

12:17

16/11/2020

Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings

Vaibhav Kumar, Tenzin Bhotia, Vaibhav Kumar, Tanmoy Chakraborty

Keywords: word embeddings, semantic words, coreference resolution, post-processing methods

11:56

25/07/2020

Symmetric regularization based BERT for pair-wise semantic reasoning

Weidi Xu, Xingyi Cheng, Kunlong Chen, Taifeng Wang

Keywords: BERT, natural language inference

8:58

19/04/2021

PolyLM: Learning about polysemy through language modeling

Alan Ansell, Felipe Bravo-Marquez, Bernhard Pfahringer

11:40

08/12/2020

Multi-task Learning of Spoken Language Understanding by Integrating N-Best Hypotheses with Hierarchical Attention

Mingda Li, Xinyue Liu, Weitong Ruan, Luca Soldaini, Wael Hamza, Chengwei Su

14:43

16/11/2020

Mind Your Inflections! Improving NLP for Non-Standard Englishes with Base-Inflection Encoding

Samson Tan, Shafiq Joty, Lav Varshney, Min-Yen Kan

Keywords: comprehension, fine-tuning models, downstream tasks, nlp systems

10:22

02/02/2021

Improving Commonsense Causal Reasoning by Adversarial Training and Data Augmentation

Ieva Staliūnaitė, Philip John Gorinski, Ignacio Iacobacci

16:40

02/02/2021

MASKER: Masked Keyword Regularization for Reliable Text Classification

Seung Jun Moon, Sangwoo Mo, Kimin Lee, Jaeho Lee, Jinwoo Shin

15:05

16/11/2020

On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment

Zirui Wang, Zachary C. Lipton, Yulia Tsvetkov

Keywords: multilingual models, meta-learning algorithm, multilingual representations, negative interference

12:03

08/12/2020

Contextualized Word Embeddings Encode Aspects of Human-Like Word Sense Knowledge

Sathvik Nair, Mahesh Srinivasan, Stephan Meylan

14:58

04/07/2020

WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge

Hongming Zhang, Xinran Zhao, Yangqiu Song

Keywords: Deep Knowledge, Answering Challenge, WinoWhy, commonsense reasoning

11:58

16/11/2020

Avoiding the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training

Joe Stacey, Pasquale Minervini, Haim Dubossarsky, Sebastian Riedel, Tim Rocktäschel

Keywords: neural networks, adversarial training, sentence representations, nli models

7:28

04/07/2020

Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing

Clara Meister, Elizabeth Salesky, Ryan Cotterell

Keywords: label smoothing, language tasks, Generalized Regularization

12:03

19/04/2021

Stereotype and skew: Quantifying gender bias in pre-trained and fine-tuned language models

Daniel Vassimon Manela, David Errington, Thomas Fisher, Boris Breugel, Pasquale Minervini

6:54

02/02/2021

Constructing a Fair Classifier with Generated Fair Data

Taeuk Jang, Feng Zheng, Xiaoqian Wang

15:58

08/12/2020

Assessing Polyseme Sense Similarity through Co-predication Acceptability and Contextualised Embedding Distance

Janosch Haber, Massimo Poesio

14:07

13/04/2021

Improving adversarial robustness via unlabeled out-of-domain data

Zhun Deng, Linjun Zhang, Amirata Ghorbani, James Zou

3:01

06/12/2021

Characterizing the risk of fairwashing

Ulrich Aïvodji, Hiromi Arai, Sébastien Gambs, Satoshi Hara

Keywords: fairness

14:19

06/12/2021

An Uncertainty Principle is a Price of Privacy-Preserving Microdata

John Abowd, Robert Ashmead, Ryan Cumings-Menon, Simson Garfinkel, Daniel Kifer, Philip Leclerc, William Sexton, Ashley Simpson, Christine Task, Pavel Zhuravlev

Keywords: privacy

12:13

16/11/2020

UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation

Jian Guan, Minlie Huang

Keywords: open-ended generation, story generation, evaluating generation, constructing samples

11:26

08/12/2020

Knowledge Aware Emotion Recognition in Textual Conversations via Multi-Task Incremental Transformer

Duzhen Zhang, Xiuyi Chen, Shuang Xu, Bo Xu

14:58

06/12/2020

Fairness without Demographics through Adversarially Reweighted Learning

Preethi Lahoti, Alex Beutel, Jilin Chen, Kang Lee, Flavien Prost, Nithum Thain, Xuezhi Wang, Ed Chi

3:21

02/02/2021

The Gap on Gap: Tackling the Problem of Differing Data Distributions in Bias-Measuring Datasets

Vid Kocijan, Oana-Maria Camburu, Thomas Lukasiewicz

15:13

03/05/2021

Variational Information Bottleneck for Effective Low-Resource Fine-Tuning

Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson

Keywords: variational information bottleneck, biases, robust, over-fitting, large-scale pre-trained language models, NLP, Transfer learning

5:07

04/07/2020

Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging

Nasser Zalmout, Nizar Habash

Keywords: Joint features, joint modeling, Lemmatization, Normalization

12:25

04/07/2020

Improving Image Captioning Evaluation by Considering Inter References Variance

Yanzhi Yi, Hangyu Deng, Jinglu Hu

Keywords: Image Evaluation, Evaluating captions, system-level tasks, BERTScore

11:31

03/05/2021

Active Contrastive Learning of Audio-Visual Video Representations

Shuang Ma, Zhaoyang Zeng, Daniel McDuff, Yale Song

Keywords: video recognition, audio-visual representation, self-supervised learning, active learning, contrastive representation learning

5:22

04/07/2020

Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics

Nitika Mathur, Timothy Baldwin, Trevor Cohn

Keywords: judging metrics, assessment, pairwise ranking, thresholding

11:39

16/11/2020

Understanding Neural Abstractive Summarization Models via Uncertainty

Jiacheng Xu, Shrey Desai, Greg Durrett

Keywords: analyzing models, seqseq models, summarization decoders, pegasus

6:52

06/12/2020

Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies

Itai Gat, Idan Schwartz, Alex Schwing, Tamir Hazan

3:18