Joint energy-based model training for better calibrated natural language understanding models

19/04/2021

Tianxing He, Bryan McCann, Caiming Xiong, Ehsan Hosseini-Asl

Abstract: In this work, we explore joint energy-based model (EBM) training during the finetuning of pretrained text encoders (e.g., RoBERTa) for natural language understanding (NLU) tasks. Our experiments show that EBM training can help the model reach better calibration that is competitive with strong baselines, with little or no loss in accuracy. We discuss three variants of energy functions (namely scalar, hidden, and sharp-hidden) that can be defined on top of a text encoder, and compare them in experiments. Because text data is discrete, we adopt noise contrastive estimation (NCE) to train the energy-based model. To make NCE training more effective, we train an auto-regressive noise model with the masked language model (MLM) objective.
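
The ingredients named in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. The abstract names the three energy variants without defining them, so the definitions below are assumptions based on the variant names. The function names, the toy inputs, and the noise log-probabilities (which stand in for scores from a trained noise model) are all hypothetical.

```python
import math

def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

# ASSUMPTION: these three definitions are illustrative readings of the
# variant names; the abstract itself does not spell them out.
def energy_scalar(features, weights, bias=0.0):
    """Scalar: a separate linear head maps encoder features to one energy."""
    return sum(w * f for w, f in zip(weights, features)) + bias

def energy_hidden(class_logits):
    """Hidden: negative log-sum-exp of the classifier logits."""
    return -log_sum_exp(class_logits)

def energy_sharp_hidden(class_logits):
    """Sharp-hidden: negative of the largest classifier logit."""
    return -max(class_logits)

def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def nce_loss(data_energy, data_noise_logp, noise_energy, noise_noise_logp):
    """Binary NCE loss for an energy model scored against a noise model.

    -E(x) is treated as the model's (self-normalized) log-probability.
    The logit that x is real data is: -E(x) - log p_noise(x) - log K,
    where K is the number of noise samples per data sample.
    """
    k = len(noise_energy) / len(data_energy)
    log_k = math.log(k)
    # Real samples should be classified as data.
    data_term = sum(
        log_sigmoid(-e - lp - log_k)
        for e, lp in zip(data_energy, data_noise_logp)
    ) / len(data_energy)
    # Noise samples should be classified as noise: log(1 - sigmoid(z)).
    noise_term = sum(
        log_sigmoid(e + lp + log_k)
        for e, lp in zip(noise_energy, noise_noise_logp)
    ) / len(noise_energy)
    return -(data_term + k * noise_term)

# Toy usage with made-up logits and noise log-probabilities.
real_logits = [[1.0, 3.0, 2.0], [0.5, 0.1, 2.5]]
fake_logits = [[0.2, 0.1, 0.3], [0.0, 0.4, 0.1]]
loss = nce_loss(
    [energy_hidden(l) for l in real_logits], [-4.0, -5.0],
    [energy_hidden(l) for l in fake_logits], [-3.0, -3.5],
)
print(f"NCE loss on toy batch: {loss:.4f}")
```

Under these assumed definitions, the hidden and sharp-hidden variants reuse the classifier logits, so they add no parameters on top of the encoder, whereas the scalar variant needs its own output head.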

The talk and the corresponding paper were published at the EACL 2021 virtual conference.

Similar Papers

06/12/2020

Latent Template Induction with Gumbel-CRFs

Yao Fu, Chuanqi Tan, Bin Bi, Mosha Chen, Yansong Feng, Alexander Rush

3:14

08/12/2020

Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks

Lichao Sun, Congying Xia, Wenpeng Yin, Tingting Liang, Philip Yu, Lifang He

9:52

26/04/2020

Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech

David Harwath, Wei-Ning Hsu, James Glass

Keywords: visually-grounded speech, self-supervised learning, discrete representation learning, vision and language, vision and speech, hierarchical representation learning

13:42

04/07/2020

Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation

Bei Li, Hui Liu, Ziyang Wang, Yufan Jiang, Tong Xiao, Jingbo Zhu, Tongran Liu, Changliang Li

Keywords: Context-Aware Translation, document-level translation, document-level NMT, document-level

6:42

03/05/2021

Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis

Rafael Valle, Kevin J Shih, Ryan Prenger, Bryan Catanzaro

Keywords: normalizing flows, deep learning, Text to speech synthesis

5:11

14/06/2020

Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition

Canjie Luo, Yuanzhi Zhu, Lianwen Jin, Yongpan Wang

Keywords: data augmentation, text recognition, joint training

0:59

18/07/2021

LTL2Action: Generalizing LTL Instructions for Multi-Task RL

Pashootan Vaezipoor, Andrew C Li, Rodrigo A Toro Icarte, Sheila McIlraith

Keywords: Reinforcement Learning and Planning

5:07

16/11/2020

Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models

Isabel Papadimitriou, Dan Jurafsky

Keywords: analyzing structure, encoding structure, natural acquisition, transfer learning

11:44

16/11/2020

Data Weighted Training Strategies for Grammatical Error Correction

Jared Lichtarge, Chris Alberti, Shankar Kumar

Keywords: neural nmt, neural, example scoring, gec

10:22

04/07/2020

Using Context in Neural Machine Translation Training Objectives

Danielle Saunders, Felix Stahlberg, Bill Byrne

Keywords: Neural training, NMT training, document-level training, NMT objective

6:48

16/11/2020

Controllable Meaning Representation to Text Generation: Linearization and Data Augmentation Strategies

Chris Kedzie, Kathleen McKeown

Keywords: natural generation, training, data augmentation, neural models

11:16

16/11/2020

Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision

Hao Tan, Mohit Bansal

Keywords: speaking, writing, text-only self-supervision, pure-language tasks

11:59

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords: Cross-Lingual Classification, sentiment classification, unsupervised system, classification

12:23

16/11/2020

Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation

Minki Kang, Moonsu Han, Sung Ju Hwang

Keywords: self-supervised pre-training, question answering, task, reinforcement learning

12:00

26/04/2020

FreeLB: Enhanced Adversarial Training for Natural Language Understanding

Chen Zhu, Yu Cheng, Zhe Gan, Siqi Sun, Tom Goldstein, Jingjing Liu

5:26

04/07/2020

Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation

Junliang Guo, Linli Xu, Enhong Chen

Keywords: Non-Autoregressive Translation, natural tasks, non-autoregressive translation (NAT), non-autoregressive

10:47

03/05/2021

$i$-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning

Kibok Lee, Yian Zhu, Kihyuk Sohn, Chun-Liang Li, Jinwoo Shin, Honglak Lee

Keywords: self-supervised learning, unsupervised representation learning, data augmentation, MixUp, contrastive representation learning

5:04

03/05/2021

InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

Boxin Wang, Shuohang Wang, Yu Cheng, Zhe Gan, Ruoxi Jia, Bo Li, Jingjing Liu

Keywords: adversarial training, QA, NLI, BERT, information theory, adversarial robustness

5:21

30/11/2020

Watch, read and lookup: learning to spot signs from multiple supervisors

Liliane Momeni, Gul Varol, Samuel Albanie, Triantafyllos Afouras, Andrew Zisserman

9:58

18/07/2021

Latent Space Energy-Based Model of Symbol-Vector Coupling for Text Generation and Classification

Bo Pang, Ying Nian Wu

Keywords: Algorithms, Unsupervised Learning

5:17

03/05/2021

A Distributional Approach to Controlled Text Generation

Muhammad Khalifa, Hady Elsahar, Marc Dymetman

Keywords: Exponential Families, Information Geometry, Energy-Based Models, Bias in Language Models, Pretrained Language Models, Controlled NLG

15:03

16/11/2020

Learning Which Features Matter: RoBERTa Acquires a Preference for Linguistic Generalizations (Eventually)

Alex Warstadt, Yian Zhang, Xiaocheng Li, Haokun Liu, Samuel R. Bowman

Keywords: self-supervised tasks, language understanding, ambiguous tasks, finetuning

12:04

06/12/2021

Referring Transformer: A One-step Approach to Multi-task Visual Grounding

Muchen Li, Leonid Sigal

Keywords: transformers, vision

7:54

16/11/2020

Data Rejuvenation: Exploiting Inactive Training Examples for Neural Machine Translation

Wenxiang Jiao, Xing Wang, Shilin He, Irwin King, Michael Lyu, Zhaopeng Tu

Keywords: data rejuvenation, neural models, nmt models, identification model

11:56

03/05/2021

SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing

Tao Yu, Rui Zhang, Alex Polozov, Christopher Meek, Ahmed H Awadallah

5:11

04/07/2020

Unsupervised Word Translation with Adversarial Autoencoder

Tasnim Mohiuddin, Shafiq Joty

Keywords: Unsupervised Translation, machine translation, transfer learning, word task

14:56

02/02/2021

TaLNet: Voice Reconstruction from Tongue and Lip Articulation with Transfer Learning from Text-to-Speech Synthesis

Jing-Xuan Zhang, Korin Richmond, Zhen-Hua Ling, Lirong Dai

19:58

06/12/2020

Uncertainty-aware Self-training for Few-shot Text Classification

Subhabrata Mukherjee, Ahmed Awadallah

3:16

16/11/2020

Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models

Ethan Wilcox, Peng Qian, Richard Futrell, Ryosuke Kohita, Roger Levy, Miguel Ballesteros

Keywords: learning outcomes, syntactic representations, neural models, n-gram baseline

11:29

06/12/2020

Unsupervised Data Augmentation for Consistency Training

Qizhe Xie, Zihang Dai, Eduard Hovy, Thang Luong, Quoc V Le

3:29

02/02/2021

Learning a Few-shot Embedding Model with Contrastive Learning

Chen Liu, Yanwei Fu, Chengming Xu, Siqian Yang, Jilin Li, Chengjie Wang, Li Zhang

15:02

06/12/2020

Counterfactual Vision-and-Language Navigation: Unravelling the Unseen

Amin Parvaneh, Ehsan Abbasnejad, Damien Teney, Javen Qinfeng Shi, Anton van den Hengel

3:17

02/11/2020

Self-supervised classification for detecting anomalous sounds

Ritwik Giri, Srikanth V. Tenneti, Fangzhou Cheng, Karim Helwani, Umut Isik, Arvindh Krishnaswamy

13:28

26/04/2020

Improving Neural Language Generation with Spectrum Control

Lingxiao Wang, Jing Huang, Kevin Huang, Ziniu Hu, Guangtao Wang, Quanquan Gu

4:58

16/11/2020

Improving AMR Parsing with Sequence-to-Sequence Pre-training

Dongqin Xu, Junhui Li, Muhua Zhu, Min Zhang, Guodong Zhou

Keywords: abstract parsing, amr parsing, sequence-to-sequence parsing, machine translation

11:42

16/11/2020

DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks

Bosheng Ding, Linlin Liu, Lidong Bing, Canasai Kruengkrai, Thien Hai Nguyen, Shafiq Joty, Luo Si, Chunyan Miao

Keywords: machine learning, generalization, low-resource tasks, named recognition

11:09

03/05/2021

Multi-timescale Representation Learning in LSTM Language Models

Shivangi Mahto, Vy Vo, Javier Turek, Alexander Huth

Keywords: LSTM, timescales, Language Model

4:57

18/07/2021

EfficientTTS: An Efficient and High-Quality Text-to-Speech Architecture

Chenfeng Miao, Liang Shuang, Zhengchen Liu, Chen Minchuan, Jun Ma, Shaojun Wang, Jing Xiao

Keywords: Applications, Audio and Speech Processing

5:13

18/07/2021

Function Contrastive Learning of Transferable Meta-Representations

Waleed Gondal, Shruti Joshi, Nasim Rahaman, Stefan Bauer, Manuel Wuthrich, Bernhard Schölkopf

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

5:46

06/12/2021

Improved Regularization and Robustness for Fine-tuning in Neural Networks

Dongyue Li, Hongyang Zhang

Keywords: deep learning, machine learning, robustness, vision, transfer learning

12:03