Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank

Abstract: Detecting fine-grained differences in content conveyed in different languages matters for cross-lingual NLP and multilingual corpora analysis, but it is a challenging machine learning problem since annotation is expensive and hard to scale. This work improves the prediction and annotation of fine-grained semantic divergences. We introduce a training strategy for multilingual BERT models by learning to rank synthetic divergent examples of varying granularity. We evaluate our models on the Rationalized English-French Semantic Divergences, a new dataset released with this work, consisting of English-French sentence-pairs annotated with semantic divergence classes and token-level rationales. Learning to rank helps detect fine-grained sentence-level divergences more accurately than a strong sentence-level similarity model, while token-level predictions have the potential of further distinguishing between coarse and fine-grained divergences.

16/11/2020

Sentiment, Syntax, Probe, BERT, Hyperbolic

5:10

02/02/2021

Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank

Eleftheria Briakou, Marine Carpuat

Comments

Similar Papers

Compositional and Lexical Semantics in RoBERTa, BERT and DistilBERT: A Case Study on CoQA

Ieva Staliūnaitė, Ignacio Iacobacci

Keywords Abstract Paper

nlp tasks, conversational task, semantic labeling, contextualized embeddings

Refining Language Models with Compositional Explanations

Huihan Yao, Ying Chen, Qinyuan Ye and Xisen Jin, Xiang Ren

Keywords Abstract Paper

machine learning, fairness, language

Probing for Referential Information in Language Models

Ionut-Teodor Sorodoc, Kristina Gulordava, Gemma Boleda

Keywords Abstract Paper

Probing, probe tasks, Language Models, LSTM architectures

Probing BERT in Hyperbolic Spaces

Boli Chen, Yao Fu, Guangwei Xu and Pengjun Xie, Chuanqi Tan, Mosha Chen, Liping Jing

Keywords Abstract Paper

Sentiment, Syntax, Probe, BERT, Hyperbolic

Improving Commonsense Causal Reasoning by Adversarial Training and Data Augmentation

Ieva Staliūnaitė, Philip John Gorinski, Ignacio Iacobacci

Keywords Abstract Paper

On Learning Universal Representations Across Languages

Xiangpeng Wei, Rongxiang Weng, Yue Hu and Luxi Xing, Heng Yu, Weihua Luo

Keywords Abstract Paper

hierarchical contrastive learning, cross-lingual pretraining, universal representation learning

Have We Solved The Hard Problem? It’s Not Easy! Contextual Lexical Contrast as a Means to Probe Neural Coherence

Wenqiang Lei, Yisong Miao, Runpeng Xie and Bonnie Webber, Meichun Liu, Tat-Seng Chua, Nancy F. Chen

Keywords Abstract Paper

LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool

Uma Roy, Noah Constant, Rami Al-Rfou and Aditya Barua, Aaron Phillips, Yinfei Yang

Keywords Abstract Paper

language-agnostic retrieval, cross-lingual tasks, cross-lingual retrieval, alignment

A Bilingual Generative Transformer for Semantic Sentence Embedding

John Wieting, Graham Neubig, Taylor Berg-Kirkpatrick

Keywords Abstract Paper

source separation, semantic encoding, data distributions, unsupervised evaluations

On the Linguistic Representational Power of Neural Machine Translation Models

Yonatan Belinkov, Nadir Durrani, Fahim Dalvi and Hassan Sajjad, James Glass

Keywords Abstract Paper

Linguistic Models, natural processing, artificial intelligence, translating languages

Interpretability for morphological inflection: From character-level predictions to subword-level rules

Tatyana Ruzsics, Olga Sozinova, Ximena Gutierrez-Vasques, Tanja Samardzic

Keywords Abstract Paper

Syntactic Structure Distillation Pretraining for Bidirectional Encoders

Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried and Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Keywords Abstract Paper

bert pretraining, structured tasks, natural understanding, textual learners

Bayesian Methods for Semi-supervised Text Annotation

Kristian Miok, Gregor Pirs, Marko Robnik-Sikonja

Keywords Abstract Paper

Native-like Expression Identification by Contrasting Native and Proficient Second Language Speakers

Oleksandr Harust, Yugo Murawaki, Sadao Kurohashi

Keywords Abstract Paper

Fake it Till You Make it: Self-Supervised Semantic Shifts for Monolingual Word Embedding Tasks

Maurício Gruppi, Pin-Yu Chen, Sibel Adali

Keywords Abstract Paper

Probing Pretrained Language Models for Lexical Semantics

Ivan Vulić, Edoardo Maria Ponti, Robert Litschko and Goran Glavaš, Anna Korhonen

Keywords Abstract Paper

lexical tasks, pretrained models, lms, lexical strategies

Towards Robustifying NLI Models Against Lexical Dataset Biases

Xiang Zhou, Mohit Bansal

Keywords Abstract Paper

Natural Inference, data augmentation, Robustifying Models, deep models

Residual Energy-Based Models for Text Generation

Yuntian Deng, Anton Bakhtin, Myle Ott and Arthur Szlam, Marc'Aurelio Ranzato

Keywords Abstract Paper

energy-based models, text generation

Correlation-Guided Representation for Multi-Label Text Classification

Qian-Wen Zhang, Ximing Zhang, Zhao Yan and Ruifang Liu, Yunbo Cao, Min-Ling Zhang

Keywords Abstract Paper

Machine Learning, Multi-instance; Multi-label; Multi-view learning, Classification, Text Classification

Cross-Linguistic Syntactic Evaluation of Word Prediction Models

Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou and Natalia Talmina, Tal Linzen

Keywords Abstract Paper

Cross-Linguistic Syntax, Syntax, Cross-Linguistic Models, neural models

Modeling Content Importance for Summarization with Pre-trained Language Models

Liqiang Xiao, Lu Wang, Hao He, Yaohui Jin

Keywords Paper

Huihan Yao, Ying Chen, Qinyuan Ye and
Xisen Jin, Xiang Ren

Keywords Paper

Keywords Paper

Boli Chen, Yao Fu, Guangwei Xu and
Pengjun Xie, Chuanqi Tan, Mosha Chen, Liping Jing

Keywords Paper

Keywords Paper

Xiangpeng Wei, Rongxiang Weng, Yue Hu and
Luxi Xing, Heng Yu, Weihua Luo

Keywords Paper

Wenqiang Lei, Yisong Miao, Runpeng Xie and
Bonnie Webber, Meichun Liu, Tat-Seng Chua, Nancy F. Chen

Keywords Paper

Uma Roy, Noah Constant, Rami Al-Rfou and
Aditya Barua, Aaron Phillips, Yinfei Yang

Keywords Paper

Keywords Paper

Yonatan Belinkov, Nadir Durrani, Fahim Dalvi and
Hassan Sajjad, James Glass

Keywords Paper

Keywords Paper

Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried and
Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Ivan Vulić, Edoardo Maria Ponti, Robert Litschko and
Goran Glavaš, Anna Korhonen

Keywords Paper

Keywords Paper

Yuntian Deng, Anton Bakhtin, Myle Ott and
Arthur Szlam, Marc'Aurelio Ranzato

Keywords Paper

Qian-Wen Zhang, Ximing Zhang, Zhao Yan and
Ruifang Liu, Yunbo Cao, Min-Ling Zhang

Keywords Paper

Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou and
Natalia Talmina, Tal Linzen

Keywords Paper

Keywords Paper

Keywords Paper

Alex Warstadt, Yian Zhang, Xiaocheng Li and
Haokun Liu, Samuel R. Bowman

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Xisen Jin, Zhongyu Wei, Junyi Du and
Xiangyang Xue, Xiang Ren

Keywords Paper

Keywords Paper

Xiyue Zhang, Xiaoning Du, Xiaofei Xie and
Lei Ma, Yang Liu, Meng Sun

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Seung Jun Moon, Sangwoo Mo, Kimin Lee and
Jaeho Lee, Jinwoo Shin

Keywords Paper

Keywords Paper

Yuanen Zhou, Meng Wang, Daqing Liu and
Zhenzhen Hu, Hanwang Zhang

Keywords Paper