Data-dependent Gaussian Prior Objective for Language Generation

26/04/2020

Zuchao Li, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Zhuosheng Zhang, Hai Zhao

Keywords: Gaussian Prior Objective, Language Generation

Abstract: For typical sequence prediction problems such as language generation, maximum likelihood estimation (MLE) has commonly been adopted, as it encourages the predicted sequence most consistent with the ground-truth sequence to have the highest probability of occurring. However, MLE focuses on once-to-all matching between the predicted sequence and the gold standard, and consequently treats all incorrect predictions as equally incorrect. We refer to this drawback as "negative diversity ignorance" in this paper. Treating all incorrect predictions as equal unfairly downplays the nuance of these sequences' detailed token-wise structure. To counteract this, we augment the MLE loss with an extra Kullback-Leibler divergence term derived by comparing a data-dependent Gaussian prior with the detailed training prediction. The proposed data-dependent Gaussian prior objective (D2GPo) is defined over a prior topological order of tokens and differs fundamentally from the data-independent Gaussian prior (L2 regularization) commonly adopted to smooth MLE training. Experimental results show that the proposed method makes effective use of a more detailed prior in the data and improves performance in typical language generation tasks, including supervised and unsupervised machine translation, text summarization, storytelling, and image captioning.
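
As a rough illustration of the objective described in the abstract, the following NumPy sketch adds a KL term to a single-token MLE loss. It is not the authors' implementation. Ranking tokens by cosine similarity of their embeddings to the gold token, the normalized Gaussian over ranks, the KL direction KL(q || p), and the names `rank_by_similarity`, `gaussian_prior`, `d2gpo_loss`, `sigma`, and `lam` are all assumptions made here for illustration.

```python
import numpy as np


def rank_by_similarity(embeddings, gold):
    """Rank every token by cosine similarity to the gold token (rank 0 = closest).

    Assumption: the "topological order" is approximated by embedding similarity.
    """
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = unit @ unit[gold]
    order = np.argsort(-sims)              # token ids, most similar first
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(order))   # invert: rank of each token id
    return ranks


def gaussian_prior(ranks, sigma=1.0):
    """Normalized Gaussian density over ranks, centered on the gold token."""
    log_q = -0.5 * (ranks / sigma) ** 2
    log_q = log_q - np.log(np.exp(log_q).sum())   # max(log_q) is 0, so this is stable
    return np.exp(log_q), log_q


def d2gpo_loss(logits, gold, ranks, lam=0.1, sigma=1.0):
    """Negative log-likelihood plus lam * KL(q || p) for one prediction step."""
    shifted = logits - logits.max()
    log_p = shifted - np.log(np.exp(shifted).sum())   # log-softmax of the model
    nll = -log_p[gold]
    q, log_q = gaussian_prior(ranks, sigma)
    kl = np.sum(q * (log_q - log_p))
    return nll + lam * kl, nll, kl


# Hypothetical 4-token vocabulary with 2-dimensional embeddings.
embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
ranks = rank_by_similarity(embeddings, gold=0)
total, nll, kl = d2gpo_loss(np.array([2.0, 1.0, 0.0, -1.0]), 0, ranks, lam=0.5)
print(total, nll, kl)
```

With `lam=0` the loss reduces to plain MLE. A larger `sigma` spreads more prior mass onto tokens ranked further from the gold token, so near misses are penalized less than distant ones.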

The talk and the corresponding paper were published at the ICLR 2020 virtual conference.

Similar Papers

02/02/2021

Adaptive Prior-Dependent Correction Enhanced Reinforcement Learning for Natural Language Generation

Wei Cheng, Ziyan Luo, Qiyue Yin

13:53

16/11/2020

F2-Softmax: Diversifying Neural Text Generation via Frequency Factorized Softmax

Byung-Ju Choi, Jimin Hong, David Park, Sang Wan Lee

Keywords: neural generation, sub-optimal generation, learning model, mefmax

11:37

03/05/2021

PMI-Masking: Principled masking of correlated spans

Yoav Levine, Barak Lenz, Opher Lieber, Omri Abend, Kevin Leyton-Brown, Moshe Tennenholtz, Yoav Shoham

Keywords: BERT, pointwise mutual information, Language modeling

12:19

16/11/2020

An Information Bottleneck Approach for Controlling Conciseness in Rationale Extraction

Bhargavi Paranjape, Mandar Joshi, John Thickstun, Hannaneh Hajishirzi, Luke Zettlemoyer

Keywords: language understanding, semi-supervised setting, complex models, explainer

11:44

06/12/2021

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

Yichong Leng, Xu Tan, Linchen Zhu, Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiangyang Li, Edward Lin, Tie-Yan Liu

13:44

03/05/2021

Text Generation by Learning from Demonstrations

Richard Pang, He He

Keywords: learning from demonstrations, nlp, text generation

5:21

04/07/2020

Learning to Faithfully Rationalize by Construction

Sarthak Jain, Sarah Wiegreffe, Yuval Pinter, Byron C. Wallace

Keywords: NLP, neural classification, training, automatic evaluations

11:55

06/12/2021

Towards Calibrated Model for Long-Tailed Visual Recognition from Prior Perspective

Zhengzhuo Xu, Zenghao Chai, Chun Yuan

Keywords: theory, machine learning

4:23

06/12/2020

Multi-label Contrastive Predictive Coding

Jiaming Song, Stefano Ermon

3:10

26/04/2020

Adversarially Robust Representations with Smooth Encoders

Taylan Cemgil, Sumedh Ghaisas, Krishnamurthy (Dj) Dvijotham, Pushmeet Kohli

Keywords: Adversarial Learning, Robust Representations, Variational AutoEncoder, Wasserstein Distance, Variational Inference

5:16

13/04/2021

CONTRA: Contrarian statistics for controlled variable selection

Mukund Sudarshan, Aahlad Puli, Lakshmi Subramanian, Sriram Sankararaman, Rajesh Ranganath

3:33

03/05/2021

Contrastive Learning with Adversarial Perturbations for Conditional Text Generation

Seanie Lee, Dong Bok Lee, Sung Ju Hwang

Keywords: contrastive learning, conditional text generation

4:51

13/04/2021

On the high accuracy limitation of adaptive property estimation

Yanjun Han

3:06

08/12/2020

Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity

Anne Lauscher, Ivan Vulić, Edoardo Maria Ponti, Anna Korhonen, Goran Glavaš

13:01

26/04/2020

Residual Energy-Based Models for Text Generation

Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato

Keywords: energy-based models, text generation

4:59

02/02/2021

Multi-Objective Submodular Maximization by Regret Ratio Minimization with Theoretical Guarantee

Chao Feng, Chao Qian

15:19

04/07/2020

Pointwise Paraphrase Appraisal is Potentially Problematic

Hannah Chen, Yangfeng Ji, David Evans

Keywords: Pointwise Appraisal, binary problem, paraphrase identification, paraphrase models

9:54

16/11/2020

DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks

Bosheng Ding, Linlin Liu, Lidong Bing, Canasai Kruengkrai, Thien Hai Nguyen, Shafiq Joty, Luo Si, Chunyan Miao

Keywords: machine learning, generalization, low-resource tasks, named recognition

11:09

06/12/2021

OpenMatch: Open-Set Semi-supervised Learning with Open-set Consistency Regularization

Kuniaki Saito, Donghyun Kim, Kate Saenko

Keywords: semi-supervised learning

11:12

26/10/2020

Stochastic Fairness and Language-Theoretic Fairness in Planning in Nondeterministic Domains

Benjamin Aminof, Giuseppe De Giacomo, Sasha Rubin

Keywords: Reasoning about Actions, Planning in AI, Nondeterminism, Fairness, FOND

9:07

19/04/2021

Framing word sense disambiguation as a multi-label problem for model-agnostic knowledge integration

Simone Conia, Roberto Navigli

6:38

26/04/2020

The Curious Case of Neural Text Degeneration

Ari Holtzman, Jan Buys, Li Du, Maxwell Forbes, Yejin Choi

Keywords: generation, text, NLG, NLP, natural language, natural language generation, language model, neural, neural language model

4:57

12/07/2020

On the consistency of top-k surrogate losses

Forest Yang, Sanmi Koyejo

Keywords: Learning Theory

15:54

06/12/2020

Recursive Inference for Variational Autoencoders

Minyoung Kim, Vladimir Pavlovic

3:24

19/04/2021

Elastic weight consolidation for better bias inoculation

James Thorne, Andreas Vlachos

6:17

06/12/2021

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism

Paria Rashidinejad, Banghua Zhu, Cong Ma, Jiantao Jiao, Stuart Russell

Keywords: theory, reinforcement learning and planning, bandits

12:21

18/07/2021

Order-Agnostic Cross Entropy for Non-Autoregressive Machine Translation

Cunxiao Du, Zhaopeng Tu, Jing Jiang

Keywords: Applications, Natural Language Processing

17:21

03/05/2021

On Learning Universal Representations Across Languages

Xiangpeng Wei, Rongxiang Weng, Yue Hu, Luxi Xing, Heng Yu, Weihua Luo

Keywords: hierarchical contrastive learning, cross-lingual pretraining, universal representation learning

3:51

06/12/2020

Extrapolation Towards Imaginary 0-Nearest Neighbour and Its Improved Convergence Rate

Akifumi Okuno, Hidetoshi Shimodaira

3:14

16/11/2020

BERT-ATTACK: Adversarial Attack Against BERT Using BERT

Linyang Li, Ruotian Ma, Qipeng Guo, Xiangyang Xue, Xipeng Qiu

Keywords: adversarial attacks, downstream tasks, calculation, gradient-based methods

11:36

18/07/2021

On Lower Bounds for Standard and Robust Gaussian Process Bandit Optimization

Xu Cai, Jonathan Scarlett

Keywords: Applications, Natural Language Processing, Applications, Network Analysis, Reinforcement Learning and Planning, Bandits

4:19

26/04/2020

Language GANs Falling Short

Massimo Caccia, Lucas Caccia, William Fedus, Hugo Larochelle, Joelle Pineau, Laurent Charlin

Keywords: NLP, GAN, MLE, adversarial, text generation, temperature

4:29

19/10/2020

Robust normalized squares maximization for unsupervised domain adaptation

Wenju Zhang, Xiang Zhang, Qing Liao, Wenjing Yang, Long Lan, Zhigang Luo

Keywords: transfer learning, image classification, domain adaptation

6:23

26/04/2020

Is a Good Representation Sufficient for Sample Efficient Reinforcement Learning?

Simon S. Du, Sham M. Kakade, Ruosong Wang, Lin F. Yang

Keywords: reinforcement learning, function approximation, lower bound, representation

4:55

14/06/2020

Seeing without Looking: Contextual Rescoring of Object Detections for AP Maximization

Lourenço V. Pato, Renato Negrinho, Pedro M. Q. Aguiar

Keywords: object detection, context, rescoring, average precision, non-maximum suppression

1:00

06/12/2021

Bayesian decision-making under misspecified priors with applications to meta-learning

Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miro Dudik, Robert E Schapire

Keywords: meta learning, bandits

14:58

06/12/2020

A Unified View of Label Shift Estimation

Saurabh Garg, Yifan Wu, Sivaraman Balakrishnan, Zachary Lipton

3:18

04/07/2020

Slot-consistent NLG for Task-oriented Dialogue Systems with Iterative Rectification Network

Yangming Li, Kaisheng Yao, Libo Qin, Wanxiang Che, Xiaolong Li, Ting Liu

Keywords: Task-oriented Systems, natural generation, natural NLG, NLG

10:53

08/12/2020

SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP

Katsuki Chousa, Masaaki Nagata, Masaaki Nishino

14:39

03/05/2021

Selective Classification Can Magnify Disparities Across Groups

Erik Jones, Shiori Sagawa, Pang Wei Koh, Ananya Kumar, Percy Liang

Keywords: log-concavity, group disparities, selective classification, robustness

5:24