Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models

Abstract: Recent models for unsupervised representation learning of text have employed a number of techniques to improve contextual word representations but have put little focus on discourse-level representations. We propose Conpono, an inter-sentence objective for pretraining language models that models discourse coherence and the distance between sentences. Given an anchor sentence, our model is trained to predict the text k sentences away using a sampled-softmax objective where the candidates consist of neighboring sentences and sentences randomly sampled from the corpus. On the discourse representation benchmark DiscoEval, our model improves over the previous state-of-the-art by up to 13% and on average 4% absolute across 7 tasks. Our model is the same size as BERT-Base, but outperforms the much larger BERT-Large model and other more recent approaches that incorporate discourse. We also show that Conpono yields gains of 2%-6% absolute even for tasks that do not explicitly evaluate discourse: textual entailment (RTE), common sense reasoning (COPA) and reading comprehension (ReCoRD).
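The objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes sentences have already been encoded into fixed-size vectors, that a candidate is scored by its dot product with the anchor, and that the function names `build_candidates` and `sampled_softmax_loss` and the `window` and `num_random` parameters are hypothetical choices made for illustration only.

```python
import math
import random


def build_candidates(doc, anchor_idx, k, window, corpus, num_random, rng):
    """Collect candidates for one anchor: in-document neighbors plus random sentences.

    Returns (candidates, target_index), where target_index points at the
    sentence exactly k positions away from the anchor. All parameter names
    here are illustrative assumptions, not taken from the paper.
    """
    # Offsets of neighboring sentences that actually exist in the document.
    offsets = [j for j in range(-window, window + 1)
               if j != 0 and 0 <= anchor_idx + j < len(doc)]
    if k not in offsets:
        raise ValueError("target offset k must fall inside the window and the document")
    neighbors = [doc[anchor_idx + j] for j in offsets]
    # Negatives sampled at random from the rest of the corpus.
    randoms = rng.sample(corpus, num_random)
    return neighbors + randoms, offsets.index(k)


def sampled_softmax_loss(anchor_vec, candidate_vecs, target_index):
    """Negative log-probability of the true candidate under a softmax over scores.

    Scores are dot products between the anchor and each candidate vector
    (an assumed scoring function for this sketch).
    """
    scores = [sum(a * c for a, c in zip(anchor_vec, cand)) for cand in candidate_vecs]
    # Log-sum-exp with max subtraction for numerical stability.
    max_score = max(scores)
    log_norm = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
    return log_norm - scores[target_index]


# Toy usage: five-sentence document, anchor at position 2, target one sentence ahead.
doc = ["s0", "s1", "s2", "s3", "s4"]
corpus = ["r0", "r1", "r2"]
candidates, target_index = build_candidates(
    doc, anchor_idx=2, k=1, window=2, corpus=corpus, num_random=2, rng=random.Random(0)
)

# With identical scores for all four candidates, the loss equals log(4).
uniform_loss = sampled_softmax_loss([1.0, 0.0], [[0.0, 1.0]] * 4, target_index=0)
```

Mixing neighboring sentences with random ones means the model has to separate the true target both from unrelated text and from nearby sentences at the wrong distance, which is how the abstract frames modeling coherence and sentence distance together.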

16/11/2020



Dan Iter, Kelvin Guu, Larry Lansing, Dan Jurafsky


Similar Papers

Visually Grounded Compound PCFGs

Yanpeng Zhao, Ivan Titov


exploiting groundings, language understanding, gradient estimates, fully-differentiable learning

Syntactic Structure Distillation Pretraining for Bidirectional Encoders

Adhiguna Kuncoro, Lingpeng Kong, Daniel Fried, Dani Yogatama, Laura Rimell, Chris Dyer, Phil Blunsom


bert pretraining, structured tasks, natural understanding, textual learners

Fine-grained Information Status Classification Using Discourse Context-Aware BERT

Yufang Hou


Variational Information Bottleneck for Effective Low-Resource Fine-Tuning

Rabeeh Karimi Mahabadi, Yonatan Belinkov, James Henderson


variational information bottleneck, biases, robust, over-fitting, large-scale pre-trained language models, NLP, Transfer learning

Compositional and Lexical Semantics in RoBERTa, BERT and DistilBERT: A Case Study on CoQA

Ieva Staliūnaitė, Ignacio Iacobacci


nlp tasks, conversational task, semantic labeling, contextualized embeddings

Federated Learning for Spoken Language Understanding

Zhiqi Huang, Fenglin Liu, Yuexian Zou


Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš


Sparsity Makes Sense: Word Sense Disambiguation Using Sparse Contextualized Word Representations

Gábor Berend


fine-grained disambiguation, part-of-speech tagging, sparse representations, task-specific models

PortaSpeech: Portable and High-Quality Generative Text-to-Speech

Yi Ren, Jinglin Liu, Zhou Zhao


generative model

The Curious Case of Neural Text Degeneration

Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi


generation, text, NLG, NLP, natural language, natural language generation, language model, neural, neural language model

Cross-Linguistic Syntactic Evaluation of Word Prediction Models

Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou, Natalia Talmina, Tal Linzen


Cross-Linguistic Syntax, Syntax, Cross-Linguistic Models, neural models

oLMpics - On what Language Model Pre-training Captures

Alon Talmor, Yanai Elazar, Yoav Goldberg, Jonathan Berant


symbolic tasks, reasoning tasks, zero-shot evaluation, pre-training

English intermediate-task training improves zero-shot cross-lingual transfer too

Jason Phang, Iacer Calixto, Phu Mon Htut, Yada Pruksachatkun, Haokun Liu, Clara Vania, Katharina Kann, Samuel R. Bowman


Text Classification by Contrastive Learning and Cross-lingual Data Augmentation for Alzheimer’s Disease Detection

Zhiqiang Guo, Zhaoci Liu, Zhenhua Ling, Shijin Wang, Lingjing Jin, Yunxia Li


Event Extraction as Machine Reading Comprehension

Jian Liu, Yubo Chen, Kang Liu, Wei Bi, Xiaojiang Liu


event extraction, ee, information task, classification task

Disfluency correction using unsupervised and semi-supervised learning

Nikhil Saini, Drumil Trivedi, Shreya Khare, Tejas Dhamecha, Preethi Jyothi, Samarth Bharadwaj, Pushpak Bhattacharyya


Evidence weighted tree ensembles for text classification

Md Zahidul Islam, Jixue Liu, Jiuyong Li, Lin Liu, Wei Kang


decision semantics, ensemble methods, text classification

Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech

Yoonhyung Lee, Joongbo Shin, Kyomin Jung


VAE, non-autoregressive, speech synthesis, text-to-speech

Modeling Content Importance for Summarization with Pre-trained Language Models

Liqiang Xiao, Lu Wang, Hao He, Yaohui Jin


modeling importance, summarization, statistical methods, information theory

DialogBERT: Discourse-Aware Response Generation via Learning to Recover and Rank Utterances

Xiaodong Gu, Kang Min Yoo, Jung-Woo Ha


Improving Multilingual Models with Language-Clustered Vocabularies

Hyung Won Chung, Dan Garrette, Kiat Chuan Tan, Jason Riesa


