Jointly improving language understanding and generation with quality-weighted weak supervision of automatic labeling

Abstract: Neural natural language generation (NLG) and understanding (NLU) models are data-hungry and require massive amounts of annotated data to be competitive. Recent frameworks address this bottleneck with generative models that synthesize weak labels at scale, where a small number of training labels are expert-curated and the rest of the data is automatically annotated. We follow that approach by automatically constructing large-scale weakly labeled data with a fine-tuned GPT-2, and employ a semi-supervised framework to jointly train the NLG and NLU models. The proposed framework adapts the parameter updates to the models according to the estimated label quality. On both the E2E and Weather benchmarks, we show that this weakly supervised training paradigm is effective in low-resource scenarios with as few as 10 data instances, and that it outperforms benchmark systems on both datasets when 100% of the training data is used.
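
The abstract does not say how label quality is estimated or how the updates are scaled. As a rough illustration only, the sketch below assumes each training example carries a hypothetical quality score in [0, 1] (1.0 for expert-curated labels, lower for automatically generated ones) and uses it to scale that example's gradient step in a toy logistic-regression model. The function names, the scoring scheme, and the toy model are all assumptions for illustration and are not taken from the paper.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def weighted_sgd_step(weights, features, label, quality, lr=0.1):
    """One SGD step on the logistic loss, scaled by a label-quality score.

    `quality` is a hypothetical score in [0, 1]; a score of 0 means the
    example is ignored, a score of 1 means a full-size update.
    """
    pred = sigmoid(sum(w * x for w, x in zip(weights, features)))
    # Gradient of the log loss w.r.t. each weight is (pred - label) * x.
    return [
        w - lr * quality * (pred - label) * x
        for w, x in zip(weights, features)
    ]


def train(examples, dim, epochs=50, lr=0.1):
    """Train on (features, label, quality) triples with quality-scaled updates."""
    weights = [0.0] * dim
    for _ in range(epochs):
        for features, label, quality in examples:
            weights = weighted_sgd_step(weights, features, label, quality, lr)
    return weights


if __name__ == "__main__":
    # One trusted example and one conflicting, low-quality weak label.
    data = [([1.0], 1, 1.0), ([1.0], 0, 0.1)]
    print(train(data, dim=1))
```

In this toy setting the conflicting weak label only slightly counteracts the trusted one, because its update is scaled down by its low quality score.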

18/07/2021

Ernie Chang, Vera Demberg, Alex Marin

Similar Papers

LogME: Practical Assessment of Pre-trained Models for Transfer Learning

Kaichao You, Yong Liu, Jianmin Wang, Mingsheng Long

Algorithms, Multitask, Transfer, and Meta Learning

Best Practices for Data-Efficient Modeling in NLG: How to Train Production-Ready Neural Models with Less Data

Ankit Arun, Soumya Batra, Vikas Bhardwaj, Ashwini Challa, Pinar Donmez, Peyman Heidari, Hakan Inan, Shashank Jain, Anuj Kumar, Shawn Mei, Karthik Mohan, Michael White

Improving Molecular Design by Stochastic Iterative Target Augmentation

Kevin Yang, Wengong Jin, Kyle Swanson, Regina Barzilay, Tommi Jaakkola

Deep Learning - Generative Models and Autoencoders

Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation

Yuxuan Song, Ning Miao, Hao Zhou, Lantao Yu, Mingxuan Wang, Lei Li

Co-Tuning for Transfer Learning

Kaichao You, Zhi Kou, Mingsheng Long, Jianmin Wang

Automatic Mixed-Precision Quantization Search of BERT

Changsheng Zhao, Ting Hua, Yilin Shen, Qian Lou, Hongxia Jin

Machine Learning, Deep Learning, NLP Applications and Tools, Text Classification

Weakly Supervised Semantic Segmentation for Large-Scale Point Cloud

Yachao Zhang, Zonghao Li, Yuan Xie, Yanyun Qu, Cuihua Li, Tao Mei

Automated embedding size search in deep recommender systems

Haochen Liu, Xiangyu Zhao, Chong Wang, Xiaobing Liu, Jiliang Tang

embedding, recommender system, AutoML

Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning

Alexandre Tamborrino, Nicola Pellicanò, Baptiste Pannier, Pascal Voitot, Louise Naudin

Commonsense Reasoning, common tasks, plausibility task, pre-training phase

Open Compound Domain Adaptation

Ziwei Liu, Zhongqi Miao, Xingang Pan, Xiaohang Zhan, Dahua Lin, Stella X. Yu, Boqing Gong

domain adaptation, compound domains, open world, curriculum learning, visual memory

Few-Shot NLG with Pre-Trained Language Model

Zhiyu Chen, Harini Eavani, Wenhu Chen, Yinyin Liu, William Yang Wang

natural generation, NLG, real-world applications, content selection

dS^2LBI: Exploring Structural Sparsity on Deep Network via Differential Inclusion Paths

Yanwei Fu, Chen Liu, Donghao Li, Xinwei Sun, Jinshan Zeng, Yuan Yao

Deep Learning - Algorithms

Learning with Labeling Induced Abstentions

Kareem Amin, Giulia DeSalvo, Afshin Rostamizadeh

machine learning, active learning

Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model

Qizhou Wang, Bo Han, Tongliang Liu, Gang Niu, Jian Yang, Chen Gong

FLAML: A Fast and Lightweight AutoML Library

Chi Wang, Qingyun Wu, Markus Weimer, Erkang Zhu

Automatic Unsupervised Outlier Model Selection

Yue Zhao, Ryan Rossi, Leman Akoglu

machine learning, self-supervised learning, meta learning, clustering

To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging

Kasturi Bhattacharjee, Miguel Ballesteros, Rishita Anubhai, Smaranda Muresan, Jie Ma, Faisal Ladhak, Yaser Al-Onaizan

learning representations, downstream tasks, cross-view cvt, sequence tasks

Adversarial AutoAugment

Xinyu Zhang, Qiang Wang, Jian Zhang, Zhao Zhong

Automatic Data Augmentation, Adversarial Learning, Reinforcement Learning

EvidentialMix: Learning With Combined Open-Set and Closed-Set Noisy Labels

Ragav Sachdeva, Filipe R. Cordeiro, Vasileios Belagiannis, Ian Reid, Gustavo Carneiro

Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification

Nan Lu, Shida Lei, Gang Niu, Issei Sato, Masashi Sugiyama

Algorithms, Semi-Supervised Learning
