Text Generation by Learning from Demonstrations

Abstract: Current approaches to text generation largely rely on autoregressive models and maximum likelihood estimation. This paradigm leads to (i) diverse but low-quality samples due to mismatched learning objective and evaluation metric (likelihood vs. quality) and (ii) exposure bias due to mismatched history distributions (gold vs. model-generated). To alleviate these problems, we frame text generation as an offline reinforcement learning (RL) problem with expert demonstrations (i.e., the reference), where the goal is to maximize quality given model-generated histories. We propose GOLD (generation by off-policy learning from demonstrations): an easy-to-optimize algorithm that learns from the demonstrations by importance weighting. Intuitively, GOLD upweights confident tokens and downweights unconfident ones in the reference during training, avoiding optimization issues faced by prior RL approaches that rely on online data collection. According to both automatic and human evaluation, models trained by GOLD outperform those trained by MLE and policy gradient on summarization, question generation, and machine translation. Further, our models are less sensitive to decoding algorithms and alleviate exposure bias.
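
The importance weighting described in the abstract can be illustrated with a small sketch. It assumes a simplified form in which each reference token's log-likelihood is scaled by the model's own probability of that token, treated as a constant. The function names, the `min_weight` floor, and the example probabilities are hypothetical and are not taken from the paper.

```python
import math


def mle_nll(ref_token_probs):
    """Standard MLE loss: every reference token counts equally."""
    return sum(-math.log(p) for p in ref_token_probs)


def weighted_nll(ref_token_probs, min_weight=0.1):
    """Importance-weighted loss over one reference sequence (illustrative).

    ref_token_probs: probabilities the model assigns to each reference token.
    min_weight: hypothetical floor so that no token is ignored entirely.
    """
    total = 0.0
    for p in ref_token_probs:
        # The weight is the model's confidence in the reference token.
        # In a real training loop it would be detached from the gradient.
        weight = max(min_weight, p)
        total += -weight * math.log(p)
    return total


# Hypothetical per-token probabilities for a three-token reference.
probs = [0.9, 0.05, 0.6]
print("MLE loss:     ", mle_nll(probs))
print("Weighted loss:", weighted_nll(probs))
```

Under these assumptions, the low-confidence token (probability 0.05) contributes far less to the weighted loss than to the MLE loss, which is the downweighting behaviour the abstract describes.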

16/11/2020

Richard Pang, He He

Similar Papers

An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction

Bhargavi Paranjape, Mandar Joshi, John Thickstun, Hannaneh Hajishirzi, Luke Zettlemoyer

Keywords: language understanding, semi-supervised setting, complex models, explainer

Active Imitation Learning with Noisy Guidance

Kianté Brantley, Hal Daumé III, Amr Sharaf

Keywords: Active Learning, structured tasks, sequence tasks, Imitation algorithms

DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks

Bosheng Ding, Linlin Liu, Lidong Bing, Canasai Kruengkrai, Thien Hai Nguyen, Shafiq Joty, Luo Si, Chunyan Miao

Keywords: machine learning, generalization, low-resource tasks, named recognition

Seeing without Looking: Contextual Rescoring of Object Detections for AP Maximization

Lourenço V. Pato, Renato Negrinho, Pedro M. Q. Aguiar

Keywords: object detection, context, rescoring, average precision, non-maximum suppression

Neural Text Generation With Unlikelihood Training

Sean Welleck, Ilia Kulikov, Stephen Roller, Emily Dinan, Kyunghyun Cho, Jason Weston

Keywords: language modeling, machine learning

Low-Resource Generation of Multi-hop Reasoning Questions

Jianxing Yu, Wei Liu, Shuang Qiu, Qinliang Su, Kai Wang, Xiaojun Quan, Jian Yin

Keywords: Low-Resource Questions, generating questions, machine comprehension, multi-hop model

IQ-Learn: Inverse soft-Q Learning for Imitation

Divyansh Garg, Shuvam Chakraborty, Chris Cundy, Jiaming Song, Stefano Ermon

Keywords: optimization, reinforcement learning and planning, adversarial robustness and security

SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards

Siddharth Reddy, Anca D. Dragan, Sergey Levine

Keywords: Imitation Learning, Reinforcement Learning

Task-Robust Model-Agnostic Meta-Learning

Liam Collins, Aryan Mokhtari, Sanjay Shakkottai

Robustness and scalability under heavy tails, without strong convexity

Matthew Holland

Robust Overfitting may be mitigated by properly learned smoothening

Tianlong Chen, Zhenyu Zhang, Sijia Liu, Shiyu Chang, Zhangyang Wang

Keywords: Robust Overfitting, Adversarial Training, Adversarial Robustness

Self-supervised Adversarial Robustness for the Low-label, High-data Regime

Sven Gowal, Po-Sen Huang, Aaron van den Oord, Timothy A Mann, Pushmeet Kohli

Keywords: self-supervised, adversarial training, robustness

Self-Adversarial Learning with Comparative Discrimination for Text Generation

Wangchunshu Zhou, Tao Ge, Ke Xu, Furu Wei, Ming Zhou

Keywords: adversarial learning, text generation

Efficient Iterative Amortized Inference for Learning Symmetric and Disentangled Multi-Object Representations

Patrick Emami, Pan He, Sanjay Ranka, Anand Rangarajan

Keywords: Deep Learning, Embedding and Representation learning

Supervising the Transfer of Reasoning Patterns in VQA

Corentin Kervadec, Christian Wolf, Grigory Antipov, Moez Baccouche, Madiha Nadri

Keywords: theory, deep learning, vision

Straight to the Gradient: Learning to Use Novel Tokens for Neural Text Generation

Xiang Lin, Simeng Han, Shafiq Joty

Keywords: Applications, Natural Language Processing

Deciphering and Optimizing Multi-Task Learning: a Random Matrix Approach

Malik Tiomoko, Hafiz Tiomoko Ali, Romain Couillet

Keywords: Transfer Learning, Random Matrix Theory, Multi Task Learning

MetaAugment: Sample-Aware Data Augmentation Policy Learning

Fengwei Zhou, Jiawei Li, Chuanlong Xie, Fei Chen, Lanqing Hong, Rui Sun, Zhenguo Li

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

Paul Barde, Julien Roy, Wonseok Jeon, Joelle Pineau, Chris Pal, Derek Nowrouzezahrai

Adversarially robust estimate and risk analysis in linear regression

Yue Xing, Ruizhi Zhang, Guang Cheng

Deep Reinforcement Learning at the Edge of the Statistical Precipice

Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc Bellemare
