01/07/2020

Meeting the 2020 Duolingo Challenge on a Shoestring

Tadashi Nomoto

Keywords:

Abstract: What is given below is a brief description of the two systems, called gFCONV and c-VAE, which we built in a response to the 2020 Duolingo Challenge. Both are neural models that aim at disrupting a sentence representation the encoder generates with an eye on increasing the diversity of sentences that emerge out of the process. Importantly, we decided not to turn to external sources for extra ammunition, curious to know how far we can go while confining ourselves to the data released by Duolingo. gFCONV works by taking over a pre-trained sequence model, and intercepting the output its encoder produces on its way to the decoder. c-VAE is a conditional variational auto-encoder, seeking the diversity by blurring the representation that the encoder derives. Experiments on a corpus constructed out of the public dataset from Duolingo, containing some 4 million pairs of sentences, found that gFCONV is a consistent winner over c-VAE though both suffered heavily from a low recall.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at ACL Workshops virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd

Similar Papers