Subword pooling makes a difference

19/04/2021

Judit Ács, Ákos Kádár, Andras Kornai

Abstract: Contextual word representations have become standard in modern natural language processing systems. These models use subword tokenization to handle large vocabularies and unknown words. Using such systems at the word level requires a way of pooling the multiple subwords that correspond to a single word. In this paper we investigate how the choice of subword pooling affects downstream performance on three tasks: morphological probing, POS tagging and NER, in 9 typologically diverse languages. We compare pooling strategies in two massively multilingual models, mBERT and XLM-RoBERTa. For morphological tasks, the widely used ‘choose the first subword’ is the worst strategy, and the best results are obtained by using attention over the subwords. For POS tagging, both of these strategies perform poorly, and the best choice is to use a small LSTM over the subwords. The same strategy works best for NER, and we show that mBERT is better than XLM-RoBERTa in all 9 languages. We publicly release all code, data and the full result tables at https://github.com/juditacs/subword-choice.
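To make the pooling idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (their code is in the repository linked above). The function names, the `word_ids` alignment list and the fixed `query` vector are assumptions made for illustration; in a trained model the attention scores would come from learned parameters, and the LSTM strategy is not shown.

```python
import math
from typing import Dict, List, Sequence

Vector = List[float]


def group_by_word(subword_vectors: Sequence[Sequence[float]],
                  word_ids: Sequence[int]) -> List[List[Vector]]:
    """Collect the subword vectors that belong to each word.

    word_ids[i] is the index of the word that subword i came from
    (a hypothetical alignment, as a tokenizer might provide).
    """
    groups: Dict[int, List[Vector]] = {}
    for vec, wid in zip(subword_vectors, word_ids):
        groups.setdefault(wid, []).append(list(vec))
    return [groups[w] for w in sorted(groups)]


def pool_first(group: List[Vector]) -> Vector:
    """'Choose the first subword': keep only the first vector."""
    return group[0]


def pool_last(group: List[Vector]) -> Vector:
    """Keep only the last subword vector."""
    return group[-1]


def pool_mean(group: List[Vector]) -> Vector:
    """Average the subword vectors dimension by dimension."""
    n = len(group)
    return [sum(column) / n for column in zip(*group)]


def pool_attention(group: List[Vector], query: Sequence[float]) -> Vector:
    """Softmax-weighted sum of subword vectors.

    Each subword is scored by its dot product with `query`, which
    stands in for learned attention parameters (an assumption).
    """
    scores = [sum(q * x for q, x in zip(query, vec)) for vec in group]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(group[0])
    return [sum(w * vec[i] for w, vec in zip(weights, group))
            for i in range(dim)]


# Toy input: three subword vectors, the first two forming one word.
toy_vectors = [[1.0, 0.0], [0.0, 2.0], [4.0, 4.0]]
toy_groups = group_by_word(toy_vectors, [0, 0, 1])
word_vectors = [pool_mean(g) for g in toy_groups]
```

Each function maps a variable-length group of subword vectors to a single word-level vector, so the strategies can be swapped without changing the downstream classifier.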

The talk and the corresponding paper were published at the EACL 2021 virtual conference.

Similar Papers

19/08/2021

Correlation-Guided Representation for Multi-Label Text Classification

Qian-Wen Zhang, Ximing Zhang, Zhao Yan, Ruifang Liu, Yunbo Cao, Min-Ling Zhang

Keywords: Machine Learning, Multi-instance; Multi-label; Multi-view learning, Classification, Text Classification

11:13

12/07/2020

Context Aware Local Differential Privacy

Jayadev Acharya, Kallista Bonawitz, Peter Kairouz, Daniel Ramage, Ziteng Sun

Keywords: Privacy-preserving Statistics and Machine Learning

14:51

04/07/2020

On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation

Wei Zhao, Goran Glavaš, Maxime Peyrard, Yang Gao, Robert West, Steffen Eger

Keywords: Evaluation encoders, zero-shot transfer, supervised tasks, web-scale systems

12:19

01/07/2020

Joint Training with Semantic Role Labeling for Better Generalization in Natural Language Inference

Cemil Cengiz, Deniz Yuret

4:38

08/12/2020

Detecting Urgency Status of Crisis Tweets: A Transfer Learning Approach for Low Resource Languages

Efsun Sarioglu Kayi, Linyong Nan, Bohan Qu, Mona Diab, Kathleen McKeown

14:37

02/02/2021

Non-Autoregressive Coarse-to-Fine Video Captioning

Bang Yang, Yuexian Zou, Fenglin Liu, Can Zhang

18:21

04/07/2020

Masked Language Model Scoring

Julian Salazar, Davis Liang, Toan Q. Nguyen, Katrin Kirchhoff

Keywords: Masked Scoring, NLP tasks, domain adaptation, language scoring

11:24

07/09/2020

From Saturation to Zero-Shot Visual Relationship Detection Using Local Context

Nikolaos Gkanatsios, Vassilis Pitsikalis, Petros Maragos

Keywords: Visual Relationship Detection, Scene Graph Generation, Zero-shot Classification, Local Context, Language Bias

7:17

08/12/2020

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš

13:01

06/12/2021

Low-Rank Subspaces in GANs

Jiapeng Zhu, Ruili Feng, Yujun Shen, Deli Zhao, Zheng-Jun Zha, Jingren Zhou, Qifeng Chen

Keywords: generative model

11:41

02/11/2020

Multi-task regularization based on infrequent classes for audio captioning

Emre Çakır, Konstantinos Drossos, Tuomas Virtanen

16:13

04/07/2020

Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs

Dong Bok Lee, Seanie Lee, Woo Tae Jeong, Donghwan Kim, Sung Ju Hwang

Keywords: question answering, QA, Information-Maximizing VAEs

11:40

06/12/2020

Multi-label Contrastive Predictive Coding

Jiaming Song, Stefano Ermon

3:10

03/05/2021

PMI-Masking: Principled masking of correlated spans

Yoav Levine, Barak Lenz, Opher Lieber, Omri Abend, Kevin Leyton-Brown, Moshe Tennenholtz, Yoav Shoham

Keywords: BERT, pointwise mutual information, Language modeling

12:19

26/04/2020

Data-dependent Gaussian Prior Objective for Language Generation

Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao

Keywords: Gaussian Prior Objective, Language Generation

14:27

14/06/2020

Equalization Loss for Long-Tailed Object Recognition

Jingru Tan, Changbao Wang, Buyu Li, Quanquan Li, Wanli Ouyang, Changqing Yin, Junjie Yan

Keywords: long tail, object detection, lvis, object recognition

1:00

06/12/2021

LLC: Accurate, Multi-purpose Learnt Low-dimensional Binary Codes

Aditya Kusupati, Matthew Wallingford, Vivek Ramanujan, Raghav Somani, Jae Sung Park, Krishna Pillutla, Prateek Jain, Sham Kakade, Ali Farhadi

Keywords: machine learning, representation learning

15:05

02/02/2021

LightXML: Transformer with Dynamic Negative Sampling for High-Performance Extreme Multi-label Text Classification

Ting Jiang, Deqing Wang, Leilei Sun, Huayi Yang, Zhengyang Zhao, Fuzhen Zhuang

16:28

05/01/2021

SubICap: Towards Subword-Informed Image Captioning

Naeha Sharif, Mohammed Bennamoun, Wei Liu, Syed Afaq Ali Shah

4:34

03/08/2020

Adapting Text Embeddings for Causal Inference

Victor Veitch, Dhanya Sridhar, David Blei

8:51

07/09/2020

On Modality Bias in the TVQA Dataset

Thomas Winterbottom, Sarah Xiao, Alistair McLean, Noura Al Moubayed

Keywords: Multimodality, Unimodal Bias, Dataset Bias, TVQA, Video-QA, BERT, Bilinear Pooling, TVQA+

10:02

06/12/2020

Understanding Anomaly Detection with Deep Invertible Networks through Hierarchies of Distributions and Features

Robin Schirrmeister, Yuxuan Zhou, Tonio Ball, Dan Zhang

3:21

26/04/2020

Adversarially Robust Representations with Smooth Encoders

Taylan Cemgil, Sumedh Ghaisas, Krishnamurthy (Dj) Dvijotham, Pushmeet Kohli

Keywords: Adversarial Learning, Robust Representations, Variational AutoEncoder, Wasserstein Distance, Variational Inference

5:16

06/12/2021

Making a (Counterfactual) Difference One Rationale at a Time

Mitchell Plyler, Michael Green, Min Chi

Keywords: theory, generative model, language, interpretability

13:57

16/11/2020

If beam search is the answer, what was the question?

Clara Meister, Ryan Cotterell, Tim Vieira

Keywords: language tasks, beam search, decoding, maximum decoding

12:18

06/12/2021

Joint Semantic Mining for Weakly Supervised RGB-D Salient Object Detection

Jingjing Li, Wei Ji, Qi Bi, Cheng Yan, Miao Zhang, Yongri Piao, Huchuan Lu, Li Cheng

Keywords: vision

9:03

16/11/2020

Form2Seq : A Framework for Higher-Order Form Structure Extraction

Milan Aggarwal, Hiresh Gupta, Mausoom Sarkar, Balaji Krishnamurthy

Keywords: document extraction, semantic task, image resolution, structure extraction

11:26

14/06/2020

Suppressing Uncertainties for Large-Scale Facial Expression Recognition

Kai Wang, Xiaojiang Peng, Jianfei Yang, Shijian Lu, Yu Qiao

Keywords: emotion recognition, self-cure network, uncertainties

1:01

22/11/2021

Feature Fusion Vision Transformer for Fine-Grained Visual Categorization

Jun Wang, Xiaohan Yu, Yongsheng Gao

Keywords: Fine-grained visual categorization, Vision transformer, Self-attention, Feature Fusion

3:02

02/02/2021

LIREx: Augmenting Language Inference with Relevant Explanations

Xinyan Zhao, V.G. Vinod Vydiswaran

18:56

02/02/2021

Adaptive Beam Search Decoding for Discrete Keyphrase Generation

Xiaoli Huang, Tongge Xu, Lvan Jiao, Yueran Zu, Youmin Zhang

14:36

07/09/2020

Object Detection as a Positive-Unlabeled Problem

Yuewei Yang, Kevin Liang, Lawrence Carin

Keywords: object detections, positive unlabeled learning

8:54

02/02/2021

MASKER: Masked Keyword Regularization for Reliable Text Classification

Seung Jun Moon, Sangwoo Mo, Kimin Lee, Jaeho Lee, Jinwoo Shin

15:05

19/04/2021

Framing word sense disambiguation as a multi-label problem for model-agnostic knowledge integration

Simone Conia, Roberto Navigli

6:38

04/07/2020

BPE-Dropout: Simple and Effective Subword Regularization

Ivan Provilkov, Dmitrii Emelianenko, Elena Voita

Keywords: open problem, machine translation, subword segmentation, training

9:33

16/11/2020

T3: Tree-Autoencoder Constrained Adversarial Text Generation for Targeted Attack

Boxin Wang, Hengzhi Pei, Boyuan Pan, Qian Chen, Shuohang Wang, Bo Li

Keywords: adversarial generation, nlp tasks, sentiment analysis, qa

11:59

02/02/2021

Train a One-Million-Way Instance Classifier for Unsupervised Visual Representation Learning

Yu Liu, Lianghua Huang, Pan Pan, Bin Wang, Yinghui Xu, Rong Jin

15:15

02/02/2021

FILTER: An Enhanced Fusion Method for Cross-lingual Language Understanding

Yuwei Fang, Shuohang Wang, Zhe Gan, Siqi Sun, Jingjing Liu

17:39

02/02/2021

Unsupervised Domain Adaptation for Semantic Segmentation by Content Transfer

Suhyeon Lee, Junhyuk Hyun, Hongje Seong, Euntai Kim

15:27

06/12/2021

Few-Shot Object Detection via Association and DIscrimination

Yuhang Cao, Jiaqi Wang, Ying Jin, Tong Wu, Kai Chen, Ziwei Liu, Dahua Lin

Keywords: deep learning, machine learning, vision

10:31