BLiMP: The Benchmark of Linguistic Minimal Pairs for English

Abstract: We introduce The Benchmark of Linguistic Minimal Pairs (BLiMP), a challenge set for evaluating the linguistic knowledge of language models (LMs) on major grammatical phenomena in English. BLiMP consists of 67 individual datasets, each containing 1,000 minimal pairs, that is, pairs of minimally different sentences that contrast in grammatical acceptability and isolate a specific phenomenon in syntax, morphology, or semantics. We generate the data according to linguist-crafted grammar templates, and aggregate human agreement with the labels is 96.4%. We evaluate n-gram, LSTM, and Transformer (GPT-2 and Transformer-XL) LMs by observing whether they assign a higher probability to the acceptable sentence in each minimal pair. We find that state-of-the-art models identify morphological contrasts related to agreement reliably, but they struggle with some subtle semantic and syntactic phenomena, such as negative polarity items and extraction islands.
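The evaluation criterion described in the abstract (a model is counted correct on a pair when it assigns a higher probability to the acceptable sentence) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the add-one smoothed bigram model, the four-sentence training corpus, the two example pairs, and the function names `train_bigram_lm` and `minimal_pair_accuracy` are all assumptions made up for this sketch.

```python
import math
from collections import Counter


def train_bigram_lm(corpus, alpha=1.0):
    """Build an add-alpha smoothed bigram LM; return a sentence log-probability scorer.

    Toy stand-in for the LMs named in the abstract (hypothetical helper).
    """
    unigrams = Counter()  # counts of each token as a bigram context
    bigrams = Counter()   # counts of (previous, current) token pairs
    vocab = set()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            unigrams[prev] += 1
            bigrams[(prev, cur)] += 1
    vocab_size = len(vocab)

    def log_prob(sentence):
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = bigrams[(prev, cur)] + alpha
            denominator = unigrams[prev] + alpha * vocab_size
            total += math.log(numerator / denominator)
        return total

    return log_prob


def minimal_pair_accuracy(log_prob, pairs):
    """Fraction of (acceptable, unacceptable) pairs where the acceptable sentence scores higher."""
    pairs = list(pairs)
    correct = sum(log_prob(good) > log_prob(bad) for good, bad in pairs)
    return correct / len(pairs)


# Invented toy data: a tiny corpus and two subject-verb agreement pairs.
corpus = [
    "the cat sleeps",
    "the cats sleep",
    "the dog sleeps",
    "the dogs sleep",
]
pairs = [
    ("the cats sleep", "the cats sleeps"),
    ("the dog sleeps", "the dog sleep"),
]

score = train_bigram_lm(corpus)
accuracy = minimal_pair_accuracy(score, pairs)
print(f"minimal-pair accuracy: {accuracy:.2f}")
```

Any function that maps a sentence to a log-probability could be passed in place of `score`, so the same accuracy computation would apply to a neural LM. Both sentences in each toy pair have the same number of tokens, which keeps the comparison from being driven by sentence length.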

04/07/2020

03/05/2021

Alex Warstadt, Alicia Parrish, Haokun Liu, Anhad Mohananey, Wei Peng, Sheng-Fu Wang, Samuel Bowman

Similar Papers

Cross-Linguistic Syntactic Evaluation of Word Prediction Models

Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou, Natalia Talmina, Tal Linzen

Keywords: Cross-Linguistic Syntax, Syntax, Cross-Linguistic Models, neural models

SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP

Katsuki Chousa, Masaaki Nagata, Masaaki Nishino

Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment

Forrest Davis, Marten van Schijndel

Keywords: production, Recurrent Always, language models, RNN LMs

Deep subjecthood: Higher-order grammatical features in multilingual BERT

Isabel Papadimitriou, Ethan A. Chi, Richard Futrell, Kyle Mahowald

Have We Solved The Hard Problem? It’s Not Easy! Contextual Lexical Contrast as a Means to Probe Neural Coherence

Wenqiang Lei, Yisong Miao, Runpeng Xie, Bonnie Webber, Meichun Liu, Tat-Seng Chua, Nancy F. Chen

Self-Attention with Cross-Lingual Position Representation

Liang Ding, Longyue Wang, Dacheng Tao

Keywords: natural tasks, WMT'17 tasks, Cross-Lingual Representation, Position encoding

What determines the order of adjectives in English? Comparing efficiency-based theories using dependency treebanks

Richard Futrell, William Dyer, Greg Scontras

Keywords: efficiency-based theories, order adjectives, information locality, integration cost

Probing BERT in Hyperbolic Spaces

Boli Chen, Yao Fu, Guangwei Xu, Pengjun Xie, Chuanqi Tan, Mosha Chen, Liping Jing

Keywords: Sentiment, Syntax, Probe, BERT, Hyperbolic

On Learning Universal Representations Across Languages

Xiangpeng Wei, Rongxiang Weng, Yue Hu, Luxi Xing, Heng Yu, Weihua Luo

Keywords: hierarchical contrastive learning, cross-lingual pretraining, universal representation learning

Probing for Referential Information in Language Models

Ionut-Teodor Sorodoc, Kristina Gulordava, Gemma Boleda

Keywords: Probing, probe tasks, Language Models, LSTM architectures

A Bilingual Generative Transformer for Semantic Sentence Embedding

John Wieting, Graham Neubig, Taylor Berg-Kirkpatrick

Keywords: source separation, semantic encoding, data distributions, unsupervised evaluations

Investigating Cross-Linguistic Adjective Ordering Tendencies with a Latent-Variable Model

Jun Yen Leung, Guy Emerson, Ryan Cotterell

Keywords: multi-lingual ordering, corpus-driven model, latent-variable model, statistical model

Understanding the effects of word-level linguistic annotations in under-resourced neural machine translation

Víctor M. Sánchez-Cartagena, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez

Contextual and Non-Contextual Word Embeddings: an in-depth Linguistic Investigation

Alessio Miaschi, Felice Dell’Orletta

X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models

Zhengbao Jiang, Antonios Anastasopoulos, Jun Araki, Haibo Ding, Graham Neubig

Keywords: factual retrieval, language models, lms, probing methods

A Sentiment-annotated Dataset of English Causal Connectives

Marta Andersson, Murathan Kurfalı, Robert Östling

Unsupervised Parsing via Constituency Tests

Steven Cao, Nikita Kitaev, Dan Klein

Keywords: unsupervised parsing, constituency test, grammaticality decisions, unsupervised parser

The Secret is in the Spectra: Predicting Cross-lingual Task Performance with Spectral Similarity Measures

Haim Dubossarsky, Ivan Vulić, Roi Reichart, Anna Korhonen

Keywords: cross-lingual tasks, large-scale study, bli, parsing

Compositional and Lexical Semantics in RoBERTa, BERT and DistilBERT: A Case Study on CoQA

Ieva Staliūnaitė, Ignacio Iacobacci

Keywords: nlp tasks, conversational task, semantic labeling, contextualized embeddings

Multilingual AMR-to-Text Generation

Angela Fan, Claire Gardent

Keywords: multilingual generation, cross-lingual embeddings, pretraining, multilingual models

LAReQA: Language-Agnostic Answer Retrieval from a Multilingual Pool

Uma Roy, Noah Constant, Rami Al-Rfou, Aditya Barua, Aaron Phillips, Yinfei Yang

