Variational Information Bottleneck for Effective Low-Resource Fine-Tuning

03/05/2021

Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson

Keywords: variational information bottleneck, biases, robust, over-fitting, large-scale pre-trained language models, NLP, Transfer learning

Abstract: While large-scale pretrained language models have obtained impressive results when fine-tuned on a wide variety of tasks, they still often suffer from overfitting in low-resource scenarios. Since such models are general-purpose feature extractors, many of these features are inevitably irrelevant for a given target task. We propose to use Variational Information Bottleneck (VIB) to suppress irrelevant features when fine-tuning on low-resource target tasks, and show that our method successfully reduces overfitting. Moreover, we show that our VIB model finds sentence representations that are more robust to biases in natural language inference datasets, and thereby obtains better generalization to out-of-domain datasets. Evaluation on seven low-resource datasets in different tasks shows that our method significantly improves transfer learning in low-resource scenarios, surpassing prior work. It also improves generalization on 13 out of 15 out-of-domain natural language inference benchmarks. Our code is publicly available at https://github.com/rabeehk/vibert.
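
As a rough illustration of the idea in the abstract, the sketch below computes a loss of the generic VIB form: a task loss plus a weighted KL term that penalizes information kept in the bottleneck. It assumes a diagonal-Gaussian bottleneck with a standard-normal prior. The function names, the prior, and the default `beta` are hypothetical choices for illustration and are not taken from the authors' repository.

```python
import math
import random


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar)
    )


def sample_bottleneck(mu, logvar, rng):
    """Reparameterized sample z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, logvar)
    ]


def cross_entropy(logits, label):
    """Negative log-likelihood of `label` under softmax(logits)."""
    top = max(logits)  # subtract the max for numerical stability
    log_norm = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_norm - logits[label]


def vib_loss(logits, label, mu, logvar, beta=1e-3):
    """Task loss plus beta-weighted compression term (beta is an assumed value)."""
    return cross_entropy(logits, label) + beta * kl_to_standard_normal(mu, logvar)
```

In this formulation a larger `beta` pushes the bottleneck distribution closer to the prior, discarding more of the input features. The KL term is zero when `mu` is all zeros and `logvar` is all zeros.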

Talk and the respective paper are published at the ICLR 2021 virtual conference.

Similar Papers

16/11/2020

Visually Grounded Compound PCFGs

Yanpeng Zhao, Ivan Titov

Keywords: exploiting groundings, language understanding, gradient estimates, fully-differentiable learning

12:24

26/04/2020

Learning The Difference That Makes A Difference With Counterfactually-Augmented Data

Divyansh Kaushik, Eduard Hovy, Zachary Lipton

Keywords: humans in the loop, annotation artifacts, text classification, sentiment analysis, natural language inference

4:25

14/06/2020

Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing

Vedika Agarwal, Rakshith Shetty, Mario Fritz

Keywords: robustness, vqa, causality, gan, dataset, evaluation, automated semantic scene editing, data augmentation, invariance, covariance

1:00

02/02/2021

MASKER: Masked Keyword Regularization for Reliable Text Classification

Seung Jun Moon, Sangwoo Mo, Kimin Lee, Jaeho Lee, Jinwoo Shin

15:05

03/05/2021

Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition

Yangming Li, Lemao Liu, Shuming Shi

Keywords: Negative Sampling, Unlabeled Entity Problem, Named Entity Recognition

4:49

08/12/2020

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš

13:01

19/04/2021

Generative text modeling through short run inference

Bo Pang, Erik Nijkamp, Tian Han, Ying Nian Wu

7:55

16/11/2020

An Unsupervised Sentence Embedding Method by Mutual Information Maximization

Yan Zhang, Ruidan He, Zuozhu Liu, Kwan Hui Lim, Lidong Bing

Keywords: sentence-pair tasks, clustering, semantic search, downstream tasks

12:22

04/07/2020

Evaluating and Enhancing the Robustness of Neural Network-based Dependency Parsing Models with Adversarial Examples

Xiaoqing Zheng, Jiehang Zeng, Yi Zhou, Cho-Jui Hsieh, Minhao Cheng, Xuanjing Huang

Keywords: semantic tasks, sentiment analysis, question answering, reading comprehension

11:57

04/07/2020

Max-Margin Incremental CCG Parsing

Miloš Stanojević, Mark Steedman

Keywords: Incremental parsing, human processing, ASR, MT

11:39

16/11/2020

An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models

Lifu Tu, Garima Lalwani, Spandana Gella, He He

Keywords: generalization, natural inference, paraphrase identification, pre-trained models

11:55

04/07/2020

Are we Estimating or Guesstimating Translation Quality?

Shuo Sun, Francisco Guzmán, Lucia Specia

Keywords: Estimating Quality, quality estimation, machine translation, QE task

5:56

14/06/2020

On Vocabulary Reliance in Scene Text Recognition

Zhaoyi Wan, Jielei Zhang, Liang Zhang, Jiebo Luo, Cong Yao

Keywords: scene text recognition, text spotting, document analysis, ocr, scene text detection, sequence recognition, language and vision

1:00

03/05/2021

Deconstructing the Regularization of BatchNorm

Yann Dauphin, Ekin Cubuk

Keywords: understanding neural networks, batch normalization, regularization, deep learning

5:09

04/07/2020

Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance

Prasetya Ajie Utama, Nafise Sadat Moosavi, Iryna Gurevych

Keywords: Debiasing Models, natural tasks, NLU tasks, debiasing methods

11:09

06/12/2021

How Should Pre-Trained Language Models Be Fine-Tuned Towards Adversarial Robustness?

Xinshuai Dong, Anh Tuan Luu, Min Lin, Shuicheng Yan, Hanwang Zhang

Keywords: robustness, adversarial robustness and security, language

10:26

08/12/2020

Emergent Communication Pretraining for Few-Shot Machine Translation

Yaoyiran Li, Edoardo Maria Ponti, Ivan Vulić, Anna Korhonen

14:42

01/07/2020

Joint Training with Semantic Role Labeling for Better Generalization in Natural Language Inference

Cemil Cengiz, Deniz Yuret

4:38

16/11/2020

On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment

Zirui Wang, Zachary C. Lipton, Yulia Tsvetkov

Keywords: multilingual models, meta-learning algorithm, multilingual representations, negative interference

12:03

02/02/2021

IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization

Wenxuan Zhou, Bill Yuchen Lin, Xiang Ren

16:25

16/11/2020

Does my multimodal model learn cross-modal interactions? It's harder to tell than you might think!

Jack Hessel, Lillian Lee

Keywords: modeling interactions, multimodal tasks, visual answering, multimodal learning

12:02

03/05/2021

FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders

Pengyu Cheng, Weituo Hao, Siyang Yuan, Shijing Si, Lawrence Carin

Keywords: Mutual Information, Pretrained Text Encoders, Contrastive Learning, Fairness

4:43

08/12/2020

An analysis of language models for metaphor recognition

Arthur Neidlein, Philip Wiesenbach, Katja Markert

13:52

16/11/2020

Syntactic Structure Distillation Pretraining for Bidirectional Encoders

Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried, Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom

Keywords: bert pretraining, structured tasks, natural understanding, textual learners

12:23

08/12/2020

Model-agnostic Methods for Text Classification with Inherent Noise

Kshitij Tayal, Rahul Ghosh, Vipin Kumar

8:46

16/11/2020

Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data

Lingkai Kong, Haoming Jiang, Yuchen Zhuang, Jie Lyu, Tuo Zhao, Chao Zhang

Keywords: augmented training, in-distribution calibration, text classification, expectation error

11:47

02/02/2021

How Linguistically Fair Are Multilingual Pre-Trained Language Models?

Monojit Choudhury, Amit Deshpande

17:57

04/07/2020

Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence

Xiaoyu Shen, Ernie Chang, Hui Su, Cheng Niu, Dietrich Klakow

Keywords: Neural Generation, Segmentation, data-to-text tasks, neural model

9:09

06/12/2021

Uncertainty Quantification and Deep Ensembles

Rahul Rahaman, Alexandre Thiery

Keywords: deep learning, machine learning

14:40

16/11/2020

If beam search is the answer, what was the question?

Clara Meister, Ryan Cotterell, Tim Vieira

Keywords: language tasks, beam search, decoding, maximum decoding

12:18

16/11/2020

Towards Better Context-aware Lexical Semantics: Adjusting Contextualized Representations through Static Anchors

Qianchu Liu, Diana McCarthy, Anna Korhonen

Keywords: transformation, contextualized models, dynamic embeddings, post-processing technique

6:53

19/08/2021

Modelling General Properties of Nouns by Selectively Averaging Contextualised Embeddings

Na Li, Zied Bouraoui, Jose Camacho-Collados, Luis Espinosa-Anke, Qing Gu, Steven Schockaert

Keywords: Natural Language Processing, Natural Language Semantics

14:09

12/07/2020

Aligned Cross Entropy for Non-Autoregressive Machine Translation

Marjan Ghazvininejad, Vladimir Karpukhin, Luke Zettlemoyer, Omer Levy

Keywords: Applications - Language, Speech and Dialog

14:43

06/12/2021

True Few-Shot Learning with Language Models

Ethan Perez, Douwe Kiela, Kyunghyun Cho

Keywords: language, few shot learning

15:04

06/12/2021

Relative Uncertainty Learning for Facial Expression Recognition

Yuhang Zhang, Chengrui Wang, Weihong Deng

8:12

14/06/2020

Towards Accurate Scene Text Recognition With Semantic Reasoning Networks

Deli Yu, Xuan Li, Chengquan Zhang, Tao Liu, Junyu Han, Jingtuo Liu, Errui Ding

Keywords: scene text recognition, global semantic reasoning, strong semantic context, parallel decoding/inference, parallel visual attention, efficient decoder

1:01

04/07/2020

Sources of Transfer in Multilingual Named Entity Recognition

David Mueller, Nicholas Andrews, Mark Dredze

Keywords: Multilingual Recognition, polyglot recognition, multilingual transfer, naive models

12:23

16/11/2020

On the Sentence Embeddings from Pre-trained Language Models

Bohan Li, Hao Zhou, Junxian He, Mingxuan Wang, Yiming Yang, Lei Li

Keywords: natural processing, semantic task, semantic tasks, pre-trained representations

9:11

16/11/2020

What Have We Achieved on Text Summarization?

Dandan Huang, Leyang Cui, Sen Yang, Guangsheng Bao, Kun Wang, Jun Xie, Yue Zhang

Keywords: text summarization, deep learning, automatic summarizers, summarization systems

11:20

04/07/2020

SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization

Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao, Tuo Zhao

Keywords: NLP, generalization, NLP tasks, SMART

11:43