Leveraging Pre-trained Checkpoints for Sequence Generation Tasks

04/07/2020

Sascha Rothe, Shashi Narayan and Aliaksei Severyn

Keywords: Sequence Generation Tasks, Natural Language Processing, Natural Language Understanding tasks, Sequence Generation

Abstract: Unsupervised pre-training of large neural models has recently revolutionized Natural Language Processing. By warm-starting from publicly released checkpoints, NLP practitioners have pushed the state of the art on multiple benchmarks while saving significant amounts of compute time. So far the focus has been mainly on Natural Language Understanding tasks. In this paper, we demonstrate the efficacy of pre-trained checkpoints for Sequence Generation. We developed a Transformer-based sequence-to-sequence model that is compatible with publicly available pre-trained BERT, GPT-2 and RoBERTa checkpoints and conducted an extensive empirical study on the utility of initializing our model, both encoder and decoder, with these checkpoints. Our models achieve new state-of-the-art results on Machine Translation, Text Summarization, Sentence Splitting, and Sentence Fusion.
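
As a rough illustration of the warm-starting idea in the abstract, the toy sketch below copies parameters from a released checkpoint into both the encoder and the decoder of a sequence-to-sequence model, and leaves any parameter without a counterpart at its random initialization. It is not the authors' implementation: the parameter names, the flat name-to-values dictionaries, and the helper functions `init_params` and `warm_start` are all hypothetical and chosen only for illustration.

```python
import random


def init_params(names_to_sizes, seed=0):
    """Randomly initialise a flat parameter dict (name -> list of floats)."""
    rng = random.Random(seed)
    return {name: [rng.gauss(0.0, 0.02) for _ in range(size)]
            for name, size in names_to_sizes.items()}


def warm_start(model_params, checkpoint, prefix):
    """Copy checkpoint values into model_params where name and size match.

    `prefix` maps checkpoint names into the model's namespace
    (for example "encoder." or "decoder."). Returns the copied names.
    """
    copied = []
    for ckpt_name, values in checkpoint.items():
        target = prefix + ckpt_name
        if target in model_params and len(model_params[target]) == len(values):
            model_params[target] = list(values)
            copied.append(target)
    return copied


# Hypothetical toy "checkpoint" standing in for a released encoder-only model.
checkpoint = {"layer0.self_attn": [1.0] * 4, "layer0.ffn": [2.0] * 8}

# Toy seq2seq model: the decoder has an extra cross-attention block
# that has no counterpart in the checkpoint.
seq2seq = init_params({
    "encoder.layer0.self_attn": 4,
    "encoder.layer0.ffn": 8,
    "decoder.layer0.self_attn": 4,
    "decoder.layer0.cross_attn": 4,
    "decoder.layer0.ffn": 8,
})

# Initialise both sides from the same checkpoint.
copied = (warm_start(seq2seq, checkpoint, "encoder.")
          + warm_start(seq2seq, checkpoint, "decoder."))

# Parameters that were not covered by the checkpoint keep their random init.
fresh = sorted(set(seq2seq) - set(copied))
print(fresh)
```

In this toy setup only the decoder's cross-attention block stays randomly initialized, since the stand-in checkpoint has no parameter of that name; everything else is copied and would then be fine-tuned on the generation task.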

Talk and the respective paper are published at the ACL 2020 virtual conference.

Similar Papers

05/12/2020

Heads-up! Unsupervised constituency parsing via self-attention heads

Bowen Li, Taeuk Kim, Reinald Kim Amplayo, Frank Keller

16/11/2020

POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training

Yizhe Zhang, Guoyin Wang, Chunyuan Li, Zhe Gan, Chris Brockett, Bill Dolan

Keywords: language learning, free-form generation, hard-constrained generation, hard-constrained tasks

06/12/2021

DOBF: A Deobfuscation Pre-Training Objective for Programming Languages

Marie-Anne Lachaux, Baptiste Roziere, Marc Szafraniec, Guillaume Lample

Keywords: self-supervised learning

26/04/2020

Reducing Transformer Depth on Demand with Structured Dropout

Angela Fan, Edouard Grave, Armand Joulin

Keywords: reduction, regularization, pruning, dropout, transformer

04/07/2020

Deep Contextualized Self-training for Low Resource Dependency Parsing

Guy Rotman, Roi Reichart

Keywords: Low Parsing, sequence tasks, Deep Self-training, Neural parsing

08/12/2020

Data-Efficient Paraphrase Generation to Bootstrap Intent Classification and Slot Labeling for New Features in Task-Oriented Dialog Systems

Shailza Jolly, Tobias Falke, Caglar Tirkaz, Daniil Sorokin

14/06/2020

OrigamiNet: Weakly-Supervised, Segmentation-Free, One-Step, Full Page Text Recognition by learning to unfold

Mohamed Yousef, Tom E. Bishop

Keywords: text recognition, weakly supervised, handwriting recognition, convolutional neural network fully convolutional, ctc

04/07/2020

Roles and Utilization of Attention Heads in Transformer-based Neural Language Models

Jae-young Jo, Sung-Hyon Myaeng

Keywords: Transformer-based Models, natural tasks, downstream tasks, probing tasks

04/07/2020

Harvesting and Refining Question-Answer Pairs for Unsupervised QA

Zhongli Li, Wenhui Wang, Li Dong, Furu Wei, Ke Xu

Keywords: Unsupervised QA, Question Answering, Question QA, QA

04/07/2020

IMoJIE: Iterative Memory-Based Joint Open Information Extraction

Keshav Kolluru, Samarth Aggarwal, Vipul Rathore, Mausam, Soumen Chakrabarti

Keywords: Iterative Extraction, Open Extraction, IMoJIE, Iterative

16/11/2020

Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining

Chengyu Wang, Minghui Qiu, Jun Huang, Xiaofeng He

Keywords: nlp tasks, fine-tuning, learning process, multi-domain tasks

19/08/2021

Automatically Paraphrasing via Sentence Reconstruction and Round-trip Translation

Zilu Guo, Zhongqiang Huang, Kenny Q. Zhu, Guandan Chen, Kaibo Zhang, Boxing Chen, Fei Huang

Keywords: Natural Language Processing, Machine Translation, Natural Language Generation, NLP Applications and Tools

26/04/2020

A Neural Dirichlet Process Mixture Model for Task-Free Continual Learning

Soochan Lee, Junsoo Ha, Dongsu Zhang, Gunhee Kim

Keywords: continual learning, task-free, task-agnostic

04/07/2020

Curriculum Learning for Natural Language Understanding

Benfeng Xu, Licheng Zhang, Zhendong Mao, Quan Wang, Hongtao Xie, Yongdong Zhang

Keywords: Curriculum Learning, Natural Understanding, natural tasks, NLU tasks

30/11/2020

Learn more, forget less: Cues from human brain

Arijit Patra, Tapabrata Chakraborti

04/07/2020

Uncertainty-Aware Curriculum Learning for Neural Machine Translation

Yikai Zhou, Baosong Yang, Derek F. Wong, Yu Wan, Lidia S. Chao

Keywords: Neural Translation, assessment difficulty, translation tasks, Uncertainty-Aware Learning

04/07/2020

Language to Network: Conditional Parameter Adaptation with Natural Language Descriptions

Tian Jin, Zhun Liu, Shengjia Yan, Alexandre Eichenberger, Louis-Philippe Morency

Keywords: Transfer learning, computer tasks, fine-tuning, Conditional Adaptation

12/07/2020

Evolving Machine Learning Algorithms From Scratch

Esteban Real, Chen Liang, David So, Quoc Le

Keywords: Transfer, Multitask and Meta-learning

04/07/2020

MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification

Jiaao Chen, Zichao Yang, Diyi Yang

Keywords: Semi-Supervised Classification, text classification, data augmentation, supervision

12/07/2020

Retrieval Augmented Language Model Pre-Training

Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Mingwei Chang

Keywords: Applications - Language, Speech and Dialog

19/10/2020

AutoADR: Automatic model design for ad relevance

Yiren Chen, Yaming Yang, Hong Sun, Yujing Wang, Yu Xu, Wei Shen, Rong Zhou, Yunhai Tong, Jing Bai, Ruofei Zhang

Keywords: neural architecture search, knowledge distillation, ad relevance

04/07/2020

Lexically Constrained Neural Machine Translation with Levenshtein Transformer

Raymond Hendy Susanto, Shamil Chollampatt, Liling Tan

Keywords: Lexically Translation, neural translation, Levenshtein Transformer, beam decoding

04/07/2020

Logical Natural Language Generation from Open-Domain Tables

Wenhu Chen, Jianshu Chen, Yu Su, Zhiyu Chen, William Yang Wang

Keywords: Logical Generation, neural NLG, surface-level realizations, logical inference

04/07/2020

Learning a Multi-Domain Curriculum for Neural Machine Translation

Wei Wang, Ye Tian, Jiquan Ngiam, Yinfei Yang, Isaac Caswell, Zarana Parekh

Keywords: Neural Translation, data selection, machine translation, multi-domain curriculum

08/12/2020

Dynamic Curriculum Learning for Low-Resource Neural Machine Translation

Chen Xu, Bojie Hu, Yufan Jiang, Kai Feng, Zeyang Wang, Shen Huang, Qi Ju, Tong Xiao, Jingbo Zhu

01/07/2020

Re-translation versus Streaming for Simultaneous Translation

Naveen Arivazhagan, Colin Cherry, Wolfgang Macherey, George Foster

05/04/2021

CODE: Compiler-based Neuron-aware Ensemble training

Ettore M. G. Trainiti, Thanapon Noraset, David Demeter, Doug Downey, Simone Campanoni

Keywords: Neuroscience and Cognitive Science -> Memory, Neuroscience and Cognitive Science -> Plasticity and Adaptation

26/04/2020

CLN2INV: Learning Loop Invariants with Continuous Logic Networks

Gabriel Ryan, Justin Wong, Jianan Yao, Ronghui Gu, Suman Jana

Keywords: loop invariants, deep learning, logic learning

26/04/2020

Compositional Language Continual Learning

Yuanpeng Li, Liang Zhao, Kenneth Church, Mohamed Elhoseiny

Keywords: Compositionality, Continual Learning, Lifelong Learning, Sequence to Sequence Modeling

12/07/2020

Improving Transformer Optimization Through Better Initialization

Xiao Shi Huang, Felipe Perez, Jimmy Ba, Maksims Volkovs

Keywords: Sequential, Network, and Time-Series Modeling

12/08/2020

Updates-Leak: Data Set Inference and Reconstruction Attacks in Online Learning

Ahmed Salem, Apratim Bhattacharya, Michael Backes, Mario Fritz, Yang Zhang

16/11/2020

Autoregressive Knowledge Distillation through Imitation Learning

Alexander Lin, Jeremy Wohlwend, Howard Chen, Tao Lei

Keywords: natural tasks, knowledge distillation, exposure problem, prototypical tasks

26/04/2020

Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model

Wenhan Xiong, Jingfei Du, William Yang Wang, Veselin Stoyanov

16/11/2020

Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting

Sanyuan Chen, Yutai Hou, Yiming Cui, Wanxiang Che, Ting Liu, Xiangzhan Yu

Keywords: pretraining, pretraining tasks, learning tasks, fine-tuning bert-large

04/07/2020

MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization

Canwen Xu, Jiaxin Pei, Hongtao Wu, Yiyu Liu, Chenliang Li

Keywords: Classification, Question Answering, Summarization, Natural Processing

08/12/2020

SentiX: A Sentiment-Aware Pre-Trained Model for Cross-Domain Sentiment Analysis

Jie Zhou, Junfeng Tian, Rui Wang, Yuanbin Wu, Wenming Xiao, Liang He

08/12/2020

Have Your Text and Use It Too! End-to-End Neural Data-to-Text Generation with Semantic Fidelity

Hamza Harkous, Isabel Groves, Amir Saffari

16/11/2020

Self-Paced Learning for Neural Machine Translation

Yu Wan, Baosong Yang, Derek F. Wong, Yikai Zhou, Lidia S. Chao, Haibo Zhang, Boxing Chen

Keywords: neural, curriculum learning, translation tasks, nmt

03/05/2021

Dataset Meta-Learning from Kernel Ridge-Regression

Timothy Nguyen, Zhourong Chen, Jaehoon Lee

Keywords: dataset corruption, infinite-width networks, neural kernels, kernel-ridge regression, dataset compression, dataset distillation, meta-learning

16/11/2020

How Much Knowledge Can You Pack Into the Parameters of a Language Model?

Adam Roberts, Colin Raffel, Noam Shazeer

Keywords: fine-tuning models, neural models, open-domain systems, model size
