Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data

16/11/2020

Lingkai Kong, Haoming Jiang, Yuchen Zhuang, Jie Lyu, Tuo Zhao, Chao Zhang

Keywords: augmented training, in-distribution calibration, text classification, expectation error

Abstract: Fine-tuned pre-trained language models can suffer from severe miscalibration for both in-distribution and out-of-distribution (OOD) data due to over-parameterization. To mitigate this issue, we propose a regularized fine-tuning method. Our method introduces two types of regularization for better calibration: (1) On-manifold regularization, which generates pseudo on-manifold samples through interpolation within the data manifold. Augmented training with these pseudo samples imposes a smoothness regularization to improve in-distribution calibration. (2) Off-manifold regularization, which encourages the model to output uniform distributions for pseudo off-manifold samples to address the over-confidence issue for OOD data. Our experiments demonstrate that the proposed method outperforms existing calibration methods for text classification in terms of expectation calibration error, misclassification detection, and OOD detection on six datasets. Our code can be found at https://github.com/Lingkai-Kong/Calibrated-BERT-Fine-Tuning.
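The quantities named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (their code is at the repository linked above). It assumes the interpolation is a simple mixup-style convex combination of two feature vectors and their one-hot labels, that the off-manifold term is a KL divergence from the uniform distribution, and that the abstract's "expectation calibration error" is the usual equal-width-bin expected calibration error. The function names `interpolate`, `uniform_kl_penalty`, and `expected_calibration_error` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def interpolate(x_a, y_a, x_b, y_b, lam):
    """Assumed on-manifold step: convex combination of two samples and their labels."""
    x_mix = lam * x_a + (1.0 - lam) * x_b
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return x_mix, y_mix


def uniform_kl_penalty(probs, eps=1e-12):
    """Assumed off-manifold term: mean KL(uniform || probs) over the rows of probs.

    The value is zero when every row is uniform and grows as predictions
    become more peaked, so minimizing it discourages over-confidence.
    """
    k = probs.shape[-1]
    u = 1.0 / k
    return float(np.mean(np.sum(u * (np.log(u) - np.log(probs + eps)), axis=-1)))


def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # mask.mean() is the fraction of samples in the bin
    return float(ece)
```

In a training loop, a penalty like `uniform_kl_penalty` would be added to the classification loss for the pseudo off-manifold samples only, while the interpolated pairs from `interpolate` would be treated as extra training examples with soft labels. How the pseudo samples are generated and how the terms are weighted are details of the paper that this sketch does not attempt to reproduce.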

This is an embedded video. The talk and the respective paper were published at the EMNLP 2020 virtual conference.

Similar Papers

12/07/2020

Educating Text Autoencoders: Latent Representation Guidance via Denoising

Tianxiao Shen, Jonas Mueller, Regina Barzilay, Tommi Jaakkola

Keywords: Deep Learning - Generative Models and Autoencoders

17:06

26/04/2020

Residual Energy-Based Models for Text Generation

Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato

Keywords: energy-based models, text generation

4:59

02/02/2021

IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization

Wenxuan Zhou, Bill Yuchen Lin, Xiang Ren

16:25

26/04/2020

Adversarially Robust Representations with Smooth Encoders

Taylan Cemgil, Sumedh Ghaisas, Krishnamurthy (Dj) Dvijotham, Pushmeet Kohli

Keywords: Adversarial Learning, Robust Representations, Variational AutoEncoder, Wasserstein Distance, Variational Inference

5:16

02/02/2021

MASKER: Masked Keyword Regularization for Reliable Text Classification

Seung Jun Moon, Sangwoo Mo, Kimin Lee, Jaeho Lee, Jinwoo Shin

15:05

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords: Cross-Lingual Classification, sentiment classification, unsupervised system, classification

12:23

04/07/2020

Learning to Faithfully Rationalize by Construction

Sarthak Jain, Sarah Wiegreffe, Yuval Pinter, Byron C. Wallace

Keywords: NLP, neural classification, training, automatic evaluations

11:55

06/12/2021

Refining Language Models with Compositional Explanations

Huihan Yao, Ying Chen, Qinyuan Ye, Xisen Jin, Xiang Ren

Keywords: machine learning, fairness, language

13:17

05/12/2020

Towards a better understanding of label smoothing in neural machine translation

Yingbo Gao, Weiyue Wang, Christian Herold, Zijian Yang, Hermann Ney

13:37

26/04/2020

Improving Neural Language Generation with Spectrum Control

Lingxiao Wang, Jing Huang, Kevin Huang, Ziniu Hu, Guangtao Wang, Quanquan Gu

4:58

06/12/2021

Improved Regularization and Robustness for Fine-tuning in Neural Networks

Dongyue Li, Hongyang Zhang

Keywords: deep learning, machine learning, robustness, vision, transfer learning

12:03

12/07/2020

On Variational Learning of Controllable Representations for Text without Supervision

Peng Xu, Jackie Chi Kit Cheung, Yanshuai Cao

Keywords: Representation Learning

14:51

18/07/2021

Delving into Deep Imbalanced Regression

Yuzhe Yang, Kaiwen Zha, Yingcong Chen, Hao Wang, Dina Katabi

Keywords: Applications

16:37

16/11/2020

Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data

Shachar Rosenman, Alon Jacovi, Yoav Goldberg

Keywords: data process, re collection, sota models, tacred

5:55

16/11/2020

Masking as an Efficient Alternative to Finetuning for Pretrained Language Models

Mengjie Zhao, Tao Lin, Fei Mi, Martin Jaggi, Hinrich Schütze

Keywords: masking bert, nlp tasks, downstream tasks, masking

12:40

14/06/2020

ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection

Yuxin Wang, Hongtao Xie, Zheng-Jun Zha, Mengting Xing, Zilong Fu, Yongdong Zhang

Keywords: scene text detection, arbitrary shapes, false-positive suppression, large scale variance

1:01

06/12/2021

Learning to Generate Visual Questions with Noisy Supervision

Shen Kai, Lingfei Wu, Siliang Tang, Yueting Zhuang, Zhen He, Zhuoye Ding, Yun Xiao, Bo Long

Keywords: generative model

14:54

03/05/2021

Supervised Contrastive Learning for Pre-trained Language Model Fine-tuning

Beliz Gunel, Jingfei Du, Alexis Conneau, Veselin Stoyanov

Keywords: supervised contrastive learning, pre-trained language model fine-tuning, natural language understanding, generalization, few-shot learning, robustness

4:44

03/05/2021

Explaining the Efficacy of Counterfactually Augmented Data

Divyansh Kaushik, Amrith Setlur, Eduard H Hovy, Zachary Lipton

Keywords: sentiment analysis, text classification, natural language inference, annotation artifacts, humans in the loop

5:11

13/04/2021

Improving adversarial robustness via unlabeled out-of-domain data

Zhun Deng, Linjun Zhang, Amirata Ghorbani, James Zou

3:01

18/07/2021

On Linear Identifiability of Learned Representations

Geoffrey Roeder, Luke Metz, Durk Kingma

Keywords: Deep Learning, Embedding and Representation learning

5:11

14/06/2020

Forward and Backward Information Retention for Accurate Binary Neural Networks

Haotong Qin, Ruihao Gong, Xianglong Liu, Mingzhu Shen, Ziran Wei, Fengwei Yu, Jingkuan Song

Keywords: model compression, binary neural networks, deep learning, quantization, computer vision

1:00

04/07/2020

Addressing Posterior Collapse with Mutual Information for Improved Variational Neural Machine Translation

Arya D. McCarthy, Xian Li, Jiatao Gu, Ning Dong

Keywords: Variational Translation, posterior collapse, auxiliary task, uncertainty

11:00

26/08/2020

Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation

Yuxuan Song, Ning Miao, Hao Zhou, Lantao Yu, Mingxuan Wang, Lei Li

12:32

06/12/2021

Using Random Effects to Account for High-Cardinality Categorical Features and Repeated Measures in Deep Neural Networks

Giora Simchoni, Saharon Rosset

Keywords: deep learning, machine learning, vision

13:33

05/01/2021

Intra-Class Part Swapping for Fine-Grained Image Classification

Lianbo Zhang, Shaoli Huang, Wei Liu

4:43

12/07/2020

Variable-Bitrate Neural Compression via Bayesian Arithmetic Coding

Yibo Yang, Robert Bamler, Stephan Mandt

Keywords: Deep Learning - General

15:08

06/12/2020

Boundary thickness and robustness in learning models

Yaoqing Yang, Rajiv Khanna, Yaodong Yu, Amir Gholami, Kurt Keutzer, Joseph Gonzalez, Kannan Ramchandran, Michael W Mahoney

3:09

12/07/2020

When are Non-Parametric Methods Robust?

Robi Bhattacharjee, Kamalika Chaudhuri

Keywords: Learning Theory

15:17

26/04/2020

Towards Hierarchical Importance Attribution: Explaining Compositional Semantics for Neural Sequence Models

Xisen Jin, Zhongyu Wei, Junyi Du, Xiangyang Xue, Xiang Ren

Keywords: natural language processing, interpretability

4:58

06/12/2021

Beware of the Simulated DAG! Causal Discovery Benchmarks May Be Easy to Game

Alexander Reisach, Christof Seiler, Sebastian Weichwald

Keywords: optimization, graph learning, causality

14:13

06/12/2021

Grounding inductive biases in natural images: invariance stems from variations in data

Diane Bouchacourt, Mark Ibrahim, Ari Morcos

Keywords: machine learning, transformers

14:19

06/12/2021

SmoothMix: Training Confidence-calibrated Smoothed Classifiers for Certified Robustness

Jongheon Jeong, Sejun Park, Minkyu Kim, Heung-Chang Lee, Do-Guk Kim, Jinwoo Shin

Keywords: deep learning, machine learning, robustness, adversarial robustness and security

12:23

03/05/2021

Representation Learning for Sequence Data with Deep Autoencoding Predictive Components

Junwen Bai, Weiran Wang, Yingbo Zhou, Caiming Xiong

Keywords: Unsupervised Learning, Mutual Information, Masked Reconstruction, Sequence Data

5:08

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

06/12/2021

Local Explanation of Dialogue Response Generation

Yi-Lin Tuan, Connor Pryor, Wenhu Chen, Lise Getoor, William Yang Wang

Keywords: machine learning

13:14

16/11/2020

Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks

Denis Emelin, Ivan Titov, Rico Sennrich

Keywords: word disambiguation, nmt, prediction errors, adversarial strategy

12:57

06/12/2021

Relative Uncertainty Learning for Facial Expression Recognition

Yuhang Zhang, Chengrui Wang, Weihong Deng

8:12

06/12/2021

Improving Deep Learning Interpretability by Saliency Guided Training

Aya Abdelsalam Ismail, Hector Corrada Bravo, Soheil Feizi

Keywords: deep learning, transformers, vision, language, interpretability

10:45

16/11/2020

Do Explicit Alignments Robustly Improve Multilingual Encoders?

Shijie Wu, Mark Dredze

Keywords: multilingual, unsupervised encoders, cross-lingual representation, contrastive objective

7:14