Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-Trained Language Models

Abstract: Recent works show that pre-trained language models (PTLMs), such as BERT, possess certain commonsense and factual knowledge. They suggest that it is promising to use PTLMs as "neural knowledge bases" via predicting masked words. Surprisingly, we find that this may not work for numerical commonsense knowledge (e.g., a bird usually has two legs). In this paper, we investigate whether and to what extent we can induce numerical commonsense knowledge from PTLMs, as well as the robustness of this process. To study this, we introduce a novel probing task with a diagnostic dataset, NumerSense, containing 13.6k masked-word-prediction probes (10.5k for fine-tuning and 3.1k for testing). Our analysis reveals that: (1) BERT and its stronger variant RoBERTa perform poorly on the diagnostic dataset prior to any fine-tuning; (2) fine-tuning with distant supervision brings some improvement; (3) the best supervised model still performs poorly compared with human performance (54.06% vs. 96.3% in accuracy).
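To make the idea of a masked-word-prediction probe concrete, here is a minimal sketch of how one might be scored. It is an illustration under stated assumptions, not the authors' implementation: the candidate list `NUMBER_WORDS`, the `[MASK]` placeholder, the helper names, and the toy scorer standing in for a real PTLM are all hypothetical.

```python
from typing import Callable, List, Tuple

# Assumed candidate vocabulary; the abstract does not list the number words used.
NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"]

# Hypothetical interface: maps a filled-in sentence to a plausibility score.
Scorer = Callable[[str], float]


def rank_candidates(probe: str, score: Scorer, mask: str = "[MASK]") -> List[str]:
    """Fill the mask with each number word and sort by descending score."""
    scored = [(score(probe.replace(mask, word)), word) for word in NUMBER_WORDS]
    return [word for _, word in sorted(scored, key=lambda pair: -pair[0])]


def top1_accuracy(probes: List[Tuple[str, str]], score: Scorer) -> float:
    """Fraction of probes whose top-ranked number word equals the gold answer."""
    hits = sum(rank_candidates(text, score)[0] == gold for text, gold in probes)
    return hits / len(probes)


def toy_score(sentence: str) -> float:
    """Stand-in for a PTLM: it prefers "four legs" regardless of the animal."""
    return 1.0 if "four legs" in sentence else 0.0


probes = [
    ("A bird usually has [MASK] legs.", "two"),
    ("A dog usually has [MASK] legs.", "four"),
]
accuracy = top1_accuracy(probes, toy_score)
```

With this toy scorer the bird probe is answered "four" and the dog probe is answered correctly, so `accuracy` is 0.5. In practice `toy_score` would be replaced by the probability a masked language model assigns to each candidate word.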

16/11/2020

Bill Yuchen Lin, Seyeon Lee, Rahul Khanna, Xiang Ren

Similar Papers

On the Sentence Embeddings from Pre-trained Language Models

Bohan Li, Hao Zhou, Junxian He, Mingxuan Wang, Yiming Yang, Lei Li

Keywords: natural processing, semantic task, semantic tasks, pre-trained representations

A pairwise probe for understanding BERT fine-tuning on machine reading comprehension

Jie Cai, Zhengzhou Zhu, Ping Nie, Qian Liu

Keywords: machine reading comprehension, pairwise, fine-tune, BERT

Linguistic Profiling of a Neural Language Model

Alessio Miaschi, Dominique Brunato, Felice Dell’Orletta, Giulia Venturi

On Losses for Modern Language Models

Stéphane Aroca-Ouellette, Frank Rudzicz

Keywords: pre-training, masked modelling, next prediction, nsp

What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models

Allyson Ettinger

Keywords: Pre-training, NLP tasks, inference, role-based prediction

Classifier Probes May Just Learn from Linear Context Features

Jenny Kunz, Marco Kuhlmann

Evaluating neural model robustness for machine comprehension

Winston Wu, Dustin Arendt, Svitlana Volkova

Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction

Masahiro Kaneko, Masato Mita, Shun Kiyono, Jun Suzuki, Kentaro Inui

Keywords: Grammatical Correction, GEC, Encoder-Decoder Models, Pre-trained Models

Syntactic Structure Distillation Pretraining for Bidirectional Encoders

Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried, Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Keywords: bert pretraining, structured tasks, natural understanding, textual learners

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš

Noninvasive Self-attention for Side Information Fusion in Sequential Recommendation

Chang Liu, Xiaoguang Li, Guohao Cai, Zhenhua Dong, Hong Zhu, Lifeng Shang

Feature Adaptation of Pre-Trained Language Models across Languages and Domains with Robust Self-Training

Hai Ye, Qingyu Tan, Ruidan He, Juntao Li, Hwee Tou Ng, Lidong Bing

Keywords: unsupervised adaptation, self-training, pre-trained models, bert

Incorporating BERT into Neural Machine Translation

Jinhua Zhu, Yingce Xia, Lijun Wu, Di He, Tao Qin, Wengang Zhou, Houqiang Li, Tieyan Liu

Keywords: BERT, Neural Machine Translation

MASKER: Masked Keyword Regularization for Reliable Text Classification

Seung Jun Moon, Sangwoo Mo, Kimin Lee, Jaeho Lee, Jinwoo Shin

MPNet: Masked and Permuted Pre-training for Language Understanding

Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu

Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT

Zhiyong Wu, Yun Chen, Ben Kao, Qun Liu

Keywords: Analyzing BERT, linguistic tasks, dependency parsing, probing tasks

On Position Embeddings in BERT

Wang Benyou, Lifeng Shang, Christina Lioma, Xin Jiang, Hao Yang, Qun Liu, Jakob Simonsen

Keywords: pretrained language model, Position Embedding, BERT

Bayesian Methods for Semi-supervised Text Annotation

Kristian Miok, Gregor Pirs, Marko Robnik-Sikonja

DagoBERT: Generating Derivational Morphology with a Pretrained Language Model

Valentin Hofmann, Janet Pierrehumbert, Hinrich Schütze

Keywords: full finetuning, derivation generation, pretrained models, plms

Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work?

Yada Pruksachatkun, Jason Phang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman

Keywords: Intermediate-Task Learning, natural tasks, data-rich task, intermediate-task training

BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance

Timo Schick, Hinrich Schütze

Keywords: NLP, rare task, BERTRAM, Word Embeddings

