Non-autoregressive Translation with Disentangled Context Transformer

12/07/2020

Jungo Kasai, James Cross, Marjan Ghazvininejad, Jiatao Gu

Keywords: Applications - Language, Speech and Dialog

Abstract: State-of-the-art neural machine translation models generate a translation from left to right and every step is conditioned on the previously generated tokens. The sequential nature of this generation process causes fundamental latency in inference since we cannot generate multiple tokens in each sentence in parallel. We propose an attention-masking based model, called Disentangled Context (DisCo) transformer, that simultaneously generates all tokens given different contexts. The DisCo transformer is trained to predict every output token given an arbitrary subset of the other reference tokens. We also develop the parallel easy-first inference algorithm, which iteratively refines every token in parallel and reduces the number of required iterations. Our extensive experiments on 7 translation directions with varying data sizes demonstrate that our model achieves competitive, if not better, performance compared to the state of the art in non-autoregressive machine translation while significantly reducing decoding time on average.
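The training objective described in the abstract, predicting each output token from an arbitrary subset of the other reference tokens, can be pictured as a per-position attention mask. The sketch below is only an illustration under assumptions not stated on this page: the function name `sample_context_mask`, the uniform choice of subset size, and the nested-list mask representation are all hypothetical and are not taken from the authors' implementation.

```python
import random


def sample_context_mask(length, rng):
    """Build a length x length boolean mask.

    mask[i][j] is True when output position i may attend to reference
    token j. A position never attends to itself, so the diagonal stays False.
    """
    mask = [[False] * length for _ in range(length)]
    for i in range(length):
        # Candidate context: every reference token except the one being predicted.
        others = [j for j in range(length) if j != i]
        # Assumption: the subset size is drawn uniformly from 0..len(others).
        k = rng.randint(0, len(others))
        for j in rng.sample(others, k):
            mask[i][j] = True
    return mask


if __name__ == "__main__":
    mask = sample_context_mask(5, random.Random(0))
    for row in mask:
        # "x" marks a visible reference token, "." a hidden one.
        print("".join("x" if allowed else "." for allowed in row))
```

Each row is the context that one output position is allowed to see, so every position gets its own, independently sampled view of the reference. The diagonal is always empty because a token is never given itself as context.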

The talk and the accompanying paper were published at the ICML 2020 virtual conference.

Similar Papers

04/07/2020

Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation

Qiu Ran, Yankai Lin, Peng Li, Jie Zhou

Keywords: Non-Autoregressive Translation, Non-Autoregressive, inference process, multi-modality problem

8:34

03/05/2021

Discovering Non-monotonic Autoregressive Orderings with Variational Inference

Xuanlin Li, Brandon Trabucco, Dong Huk Park, Michael Luo, Sheng Shen, Trevor Darrell, Yang Gao

Keywords: reinforcement learning, computer vision, natural language processing, optimization, variational inference, unsupervised learning

4:56

16/11/2020

SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup

Rongzhi Zhang, Yue Yu, Chao Zhang

Keywords: low-resource tasks, active labeling, mixup, sequence mixup

11:16

12/07/2020

An EM Approach to Non-autoregressive Conditional Sequence Generation

Zhiqing Sun, Yiming Yang

Keywords: Applications - Language, Speech and Dialog

12:13

18/07/2021

You Only Sample (Almost) Once: Linear Cost Self-Attention Via Bernoulli Sampling

Zhanpeng Zeng, Yunyang Xiong, Sathya Ravi, Shailesh Acharya, Glenn Fung, Vikas Singh

Keywords: Applications, Natural Language Processing

5:16

03/05/2021

Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation

Jungo Kasai, Nikolaos Pappas, Hao Peng, James Cross, Noah Smith

Keywords: Machine Translation, Sequence Modeling, Natural Language Processing

5:04

12/07/2020

Aligned Cross Entropy for Non-Autoregressive Machine Translation

Marjan Ghazvininejad, Vladimir Karpukhin, Luke Zettlemoyer, Omer Levy

Keywords: Applications - Language, Speech and Dialog

14:43

06/12/2021

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

Yichong Leng, Xu Tan, Linchen Zhu, Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiangyang Li, Edward Lin, Tie-Yan Liu

13:44

26/04/2020

CLN2INV: Learning Loop Invariants with Continuous Logic Networks

Gabriel Ryan, Justin Wong, Jianan Yao, Ronghui Gu, Suman Jana

Keywords: loop invariants, deep learning, logic learning

5:12

18/07/2021

Attention is not all you need: pure attention loses rank doubly exponentially with depth

Yihe Dong, Jean-Baptiste Cordonnier, Andreas Loukas

Keywords: Deep Learning, Architectures

12:36

30/11/2020

Fast and Differentiable Message Passing on Pairwise Markov Random Fields

Zhiwei Xu, Thalaiyasingam Ajanthan, Richard Hartley

9:41

19/08/2021

Improving Text Generation with Dynamic Masking and Recovering

Zhidong Liu, Junhui Li, Muhua Zhu

Keywords: Natural Language Processing, Machine Translation, Natural Language Generation

13:44

08/12/2020

Infusing Sequential Information into Conditional Masked Translation Model with Self-Review Mechanism

Pan Xie, Zhi Cui, Xiuying Chen, XiaoHui Hu, Jianwei Cui, Bin Wang

6:43

26/04/2020

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators

Kevin Clark, Minh-Thang Luong, Quoc V. Le, Christopher D. Manning

Keywords: Natural Language Processing, Representation Learning

5:12

16/11/2020

Iterative Refinement in the Continuous Space for Non-Autoregressive Neural Machine Translation

Jason Lee, Raphael Shu, Kyunghyun Cho

Keywords: non-autoregressive translation, translation, machine translation, inference procedure

11:44

06/12/2021

Drop-DTW: Aligning Common Signal Between Sequences While Dropping Outliers

Mikita Dvornik, Isma Hadji, Konstantinos Derpanis, Animesh Garg, Allan Jepson

Keywords: representation learning

13:34

16/11/2020

Language Model Prior for Low-Resource Neural Machine Translation

Christos Baziotis, Barry Haddow, Alexandra Birch

Keywords: neural translation, neural tm, knowledge distillation, training time

11:16

16/11/2020

Accurate Word Alignment Induction from Neural Machine Translation

Yun Chen, Yang Liu, Guanhua Chen, Xin Jiang, Qun Liu

Keywords: transformer, attention mechanism, word methods, shift-att

11:47

18/07/2021

Massively Parallel and Asynchronous Tsetlin Machine Architecture Supporting Almost Constant-Time Scaling

Kuruge Darshana Abeyrathna, Bimal Bhattarai, Morten Goodwin, Saeed Rahimi Gorji, Ole-Christoffer Granmo, Lei Jiao, Rupsa Saha, Rohan Kumar Yadav

Keywords: Reinforcement Learning and Planning, Bandits

5:14

01/07/2020

Improving Autoregressive NMT with Non-Autoregressive Model

Long Zhou, Jiajun Zhang, Chengqing Zong

8:21

15/11/2020

Testing Consensus Implementations using Communication Closure

Cezara Drăgoi, Constantin Enea, Burcu Kulahcioglu Ozkan, Rupak Majumdar, Filip Niksic

Keywords: Distributed consensus, Communication closure, Randomized testing

15:19

12/07/2020

Imputer: Sequence Modelling via Imputation and Dynamic Programming

William Chan, Chitwan Saharia, Geoffrey Hinton, Mohammad Norouzi, Navdeep Jaitly

Keywords: Sequential, Network, and Time-Series Modeling

10:50

06/12/2021

A Minimalist Approach to Offline Reinforcement Learning

Scott Fujimoto, Shixiang (Shane) Gu

Keywords: reinforcement learning and planning, generative model

8:31

12/07/2020

Learning to Encode Position for Transformer with Continuous Dynamical Model

Xuanqing Liu, Hsiang-Fu Yu, Inderjit Dhillon, Cho-Jui Hsieh

Keywords: Applications - Language, Speech and Dialog

12:05

18/07/2021

Dataset Condensation with Differentiable Siamese Augmentation

Bo Zhao, Hakan Bilen

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

5:02

07/09/2020

From Quantized DNNs to Quantizable DNNs

Kunyuan Du, Ya Zhang, Haibing Guan

Keywords: Quantized DNNs, Dynamic Bit-width

4:05

04/07/2020

Cross-Lingual Unsupervised Sentiment Classification with Multi-View Transfer Learning

Hongliang Fei, Ping Li

Keywords: Cross-Lingual Classification, sentiment classification, unsupervised system, classification

12:23

26/04/2020

Understanding Knowledge Distillation in Non-autoregressive Machine Translation

Chunting Zhou, Jiatao Gu, Graham Neubig

Keywords: knowledge distillation, non-autoregressive neural machine translation

4:55

12/07/2020

Variable-Bitrate Neural Compression via Bayesian Arithmetic Coding

Yibo Yang, Robert Bamler, Stephan Mandt

Keywords: Deep Learning - General

15:08

06/12/2021

Align before Fuse: Vision and Language Representation Learning with Momentum Distillation

Junnan Li, Ramprasaath Selvaraju, Akhilesh Gotmare, Shafiq Joty, Caiming Xiong, Steven Chu Hong Hoi

Keywords: transformers, vision, representation learning

9:40

04/07/2020

IMoJIE: Iterative Memory-Based Joint Open Information Extraction

Keshav Kolluru, Samarth Aggarwal, Vipul Rathore, Mausam, Soumen Chakrabarti

Keywords: Iterative Extraction, Open Extraction, IMoJIE, Iterative

9:31

06/12/2021

Deep Reinforcement Learning at the Edge of the Statistical Precipice

Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron Courville, Marc Bellemare

Keywords: reinforcement learning and planning

19:36

06/12/2021

Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning

Jongjin Park, Younggyo Seo, Chang Liu, Li Zhao, Tao Qin, Jinwoo Shin, Tie-Yan Liu

Keywords: reinforcement learning and planning, causality

12:12

06/12/2021

Self-Supervised Learning of Event-Based Optical Flow with Spiking Neural Networks

Jesse Hagenaars, Federico Paredes-Valles, Guido de Croon

Keywords: deep learning, optimization, self-supervised learning

13:28

14/06/2020

OrigamiNet: Weakly-Supervised, Segmentation-Free, One-Step, Full Page Text Recognition by learning to unfold

Mohamed Yousef, Tom E. Bishop

Keywords: text recognition, weakly supervised, handwriting recognition, convolutional neural network, fully convolutional, CTC

1:00

06/12/2020

Make One-Shot Video Object Segmentation Efficient Again

Tim Meinhardt, Laura Leal-Taixé

3:17

26/04/2020

Adversarially Robust Representations with Smooth Encoders

Taylan Cemgil, Sumedh Ghaisas, Krishnamurthy (Dj) Dvijotham, Pushmeet Kohli

Keywords: Adversarial Learning, Robust Representations, Variational AutoEncoder, Wasserstein Distance, Variational Inference

5:16

02/02/2021

Adversarial Turing Patterns from Cellular Automata

Nurislam Tursynbek, Ilya Vilkoviskiy, Maria Sindeeva, Ivan Oseledets

14:50

06/12/2021

Not All Images are Worth 16x16 Words: Dynamic Transformers for Efficient Image Recognition

Yulin Wang, Rui Huang, Shiji Song, Zeyi Huang, Gao Huang

Keywords: transformers

7:20

06/12/2021

Self-Supervised Multi-Object Tracking with Cross-input Consistency

Favyen Bastani, Songtao He, Samuel Madden

Keywords: self-supervised learning

14:59