06/12/2020

ColdGANs: Taming Language GANs with Cautious Sampling Strategies

Thomas Scialom, Paul-Alexis Dray, Sylvain Lamprier, Benjamin Piwowarski, Jacopo Staiano

Keywords:

Abstract: Training regimes based on Maximum Likelihood Estimation (MLE) suffer from known limitations, often leading to poorly generated text sequences that lack of coherence, factualness, and are prone to repetitions. At the root of these limitations is the mismatch between training and inference, i.e. the so-called exposure bias. Another problem lies in considering only the reference text as correct, while in practice several alternative formulations could be as good. Generative Adversarial Networks (GANs) could mitigate those limitations. Nonetheless, the discrete nature of text has hindered their application to language generation: the approaches proposed so far, based on Reinforcement Learning, have been shown to under-perform MLE. In this context, the exploration is known to be critical, while surprisingly being under-studied. In this work, we show how the most popular sampling method results in unstable training for language GANs. We propose alternative exploration strategies that we named Cold-GANs. By forcing the sampling to be close to the distribution mode, the learning dynamic becomes smoother. We report experimental results obtained on three tasks: unconditional text generation, question generation, and abstractive summarization. For the first time, to the best of our knowledge, the proposed language GANs compare favorably to MLE, and obtain improvements over the state-of-the-art on the considered tasks.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at NeurIPS 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers