Improving Bilingual Lexicon Induction for Low Frequency Words

16/11/2020

Jiaji Huang, Xingyu Cai, Kenneth Church

Keywords: monolingual task, bilingual induction, low-frequency regime, hubness

Abstract: This paper designs a Monolingual Lexicon Induction task and observes that two factors accompany the degraded accuracy of bilingual lexicon induction for rare words: first, a diminishing margin between similarities in the low-frequency regime, and second, exacerbated hubness at low frequency. Based on these observations, we further propose two methods to address these two factors, respectively. The larger issue is hubness; addressing it improves induction accuracy significantly, especially for low-frequency words.

The talk and the accompanying paper were published at the EMNLP 2020 virtual conference.

Similar Papers

03/05/2021

Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition

Yangming Li, Lemao Liu, Shuming Shi

Keywords: Negative Sampling, Unlabeled Entity Problem, Named Entity Recognition

4:49

19/04/2021

Stereotype and skew: Quantifying gender bias in pre-trained and fine-tuned language models

Daniel Vassimon Manela, David Errington, Thomas Fisher, Boris Breugel, Pasquale Minervini

6:54

01/07/2020

Are All Languages Created Equal in Multilingual BERT?

Shijie Wu, Mark Dredze

7:45

06/12/2021

PortaSpeech: Portable and High-Quality Generative Text-to-Speech

Yi Ren, Jinglin Liu, Zhou Zhao

Keywords: generative model

10:15

04/07/2020

Moving Down the Long Tail of Word Sense Disambiguation with Gloss Informed Bi-encoders

Terra Blevins, Luke Zettlemoyer

Keywords: Word Disambiguation, Word WSD, WSD, sense disambiguation

11:18

19/04/2021

Multilingual neural machine translation with deep encoder and multiple shallow decoders

Xiang Kong, Adithya Renduchintala, James Cross, Yuqing Tang, Jiatao Gu, Xian Li

10:26

19/04/2021

Disfluency correction using unsupervised and semi-supervised learning

Nikhil Saini, Drumil Trivedi, Shreya Khare, Tejas Dhamecha, Preethi Jyothi, Samarth Bharadwaj, Pushpak Bhattacharyya

7:13

04/07/2020

Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation

Biao Zhang, Philip Williams, Ivan Titov, Rico Sennrich

Keywords: Massively Translation, Zero-Shot Translation, neural translation, NMT

11:47

04/07/2020

Improving Candidate Generation for Low-resource Cross-lingual Entity Linking

Shuyan Zhou, Shruti Rijhwani, John Wieting, Jaime Carbonell, Graham Neubig

Keywords: Candidate Generation, Low-resource Linking, Cross-lingual linking, Cross-lingual XEL

12:03

02/02/2021

On the Importance of Word Order Information in Cross-lingual Sequence Labeling

Zihan Liu, Genta I Winata, Samuel Cahyawijaya, Andrea Madotto, Zhaojiang Lin, Pascale Fung

15:22

02/02/2021

Have We Solved The Hard Problem? It’s Not Easy! Contextual Lexical Contrast as a Means to Probe Neural Coherence

Wenqiang Lei, Yisong Miao, Runpeng Xie, Bonnie Webber, Meichun Liu, Tat-Seng Chua, Nancy F. Chen

18:55

01/07/2020

Joint Training with Semantic Role Labeling for Better Generalization in Natural Language Inference

Cemil Cengiz, Deniz Yuret

4:38

02/02/2021

DropLoss for Long-Tail Instance Segmentation

Ting-I Hsieh, Esther Robb, Hwann-Tzong Chen, Jia-Bin Huang

18:56

04/07/2020

On the Encoder-Decoder Incompatibility in Variational Text Modeling and Beyond

Chen Wu, Prince Zizhuang Wang, William Yang Wang

Keywords: Encoder-Decoder Incompatibility, Variational Modeling, text modeling, probability estimation

6:51

25/07/2020

Training effective neural CLIR by bridging the translation gap

Hamed Bonab, Sheikh Muhammad Sarwar, James Allan

Keywords: cross-lingual word embedding, cross-lingual information retrieval, neural CLIR, translation gap

15:33

03/08/2020

Adapting Text Embeddings for Causal Inference

Victor Veitch, Dhanya Sridhar, David Blei

8:51

16/11/2020

Compositional and Lexical Semantics in RoBERTa, BERT and DistilBERT: A Case Study on CoQA

Ieva Staliūnaitė, Ignacio Iacobacci

Keywords: NLP tasks, conversational task, semantic labeling, contextualized embeddings

11:23

08/12/2020

An analysis of language models for metaphor recognition

Arthur Neidlein, Philip Wiesenbach, Katja Markert

13:52

08/12/2020

Is it Great or Terrible? Preserving Sentiment in Neural Machine Translation of Arabic Reviews

Hadeel Saadany, Constantin Orasan

14:35

19/04/2021

WER-BERT: Automatic WER estimation with BERT in a balanced ordinal classification paradigm

Akshay Krishna Sheshadri, Anvesh Rao Vijjini, Sukhdeep Kharbanda

11:45

16/11/2020

On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment

Zirui Wang, Zachary C. Lipton, Yulia Tsvetkov

Keywords: multilingual models, meta-learning algorithm, multilingual representations, negative interference

12:03

16/11/2020

If beam search is the answer, what was the question?

Clara Meister, Ryan Cotterell, Tim Vieira

Keywords: language tasks, beam search, decoding, maximum decoding

12:18

03/05/2021

Taking Notes on the Fly Helps Language Pre-Training

Qiyu Wu, Chen Xing, Yatao Li, Guolin Ke, Di He, Tie-Yan Liu

Keywords: Natural Language Processing, Pre-training

5:21

16/11/2020

Are All Good Word Vector Spaces Isomorphic?

Ivan Vulić, Sebastian Ruder, Anders Søgaard

Keywords: aligning spaces, monolingual training, vector spaces, non-isomorphic spaces

12:22

02/02/2021

Effective Slot Filling via Weakly-Supervised Dual-Model Learning

Jue Wang, Ke Chen, Lidan Shou, Sai Wu, Gang Chen

18:02

16/11/2020

From Zero to Hero: On the Limitations of Zero-Shot Language Transfer with Multilingual Transformers

Anne Lauscher, Vinit Ravishankar, Ivan Vulić, Goran Glavaš

Keywords: zero-shot transfer, downstream transfer, resource-lean scenarios, POS tagging

11:45

03/05/2021

Active Contrastive Learning of Audio-Visual Video Representations

Shuang Ma, Zhaoyang Zeng, Daniel McDuff, Yale Song

Keywords: video recognition, audio-visual representation, self-supervised learning, active learning, contrastive representation learning

5:22

14/06/2020

Real-World Person Re-Identification via Degradation Invariance Learning

Yukun Huang, Zheng-Jun Zha, Xueyang Fu, Richang Hong, Liang Li

Keywords: disentangled representation learning, person re-identification, generative adversarial network, image degradation, self-supervised learning

1:01

04/07/2020

How Does Selective Mechanism Improve Self-Attention Networks?

Xinwei Geng, Longyue Wang, Xing Wang, Bing Qin, Ting Liu, Zhaopeng Tu

Keywords: NLP tasks, natural inference, semantic labelling, machine translation

11:43

16/11/2020

Iterative Domain-Repaired Back-Translation

Hao-Ran Wei, Zhirui Zhang, Boxing Chen, Weihua Luo

Keywords: domain-specific translation, domain adaptation, back-translation method, out-of-domain systems

11:35

03/05/2021

On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines

Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow

Keywords: BERT, transfer learning, pretrained language model, fine-tuning stability

3:01

14/09/2020

A Deep Dive into Multilingual Hate Speech Classification

Sai Saketh Aluru, Binny Mathew, Punyajoy Saha, Animesh Mukherjee

Keywords: hate speech, multilingual, classification, BERT, embeddings

14:20

19/08/2021

Bipartite Matching for Crowd Counting with Point Supervision

Hao Liu, Qiang Zhao, Yike Ma, Feng Dai

Keywords: Computer Vision, Perception, Video, Deep Learning

12:46

06/12/2021

Label-Imbalanced and Group-Sensitive Classification under Overparameterization

Ganesh Ramachandra Kini, Orestis Paraskevas, Samet Oymak, Christos Thrampoulidis

Keywords: machine learning, fairness

14:10

02/02/2021

Learning from Noisy Labels with Complementary Loss Functions

Deng-Bao Wang, Yong Wen, Lujia Pan, Min-Ling Zhang

14:00

19/08/2021

Focus on Interaction: A Novel Dynamic Graph Model for Joint Multiple Intent Detection and Slot Filling

Zeyuan Ding, Zhihao Yang, Hongfei Lin, Jian Wang

Keywords: Natural Language Processing, Dialogue

12:36

06/12/2020

Autoencoders that don't overfit towards the Identity

Harald Steck

3:22

06/12/2021

Mixability made efficient: Fast online multiclass logistic regression

Rémi Jézéquel, Pierre Gaillard, Alessandro Rudi

Keywords: online learning

13:11

30/11/2020

DEAL: Difficulty-aware Active Learning for Semantic Segmentation

Shuai Xie, Zunlei Feng, Ying Chen, Songtao Sun, Chao Ma, Mingli Song

9:41

19/04/2021

Machine translationese: Effects of algorithmic bias on linguistic complexity in machine translation

Eva Vanmassenhove, Dimitar Shterionov, Matthew Gwilliam

11:19