Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation

16/11/2020

Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation

Pei Zhang, Boxing Chen, Niyu Ge, Kai Fan

Keywords: document-level translation, document-level systems, context-aware architecture, transformer

Abstract Paper Similar Papers

Abstract: Many document-level neural machine translation (NMT) systems have explored the utility of context-aware architecture, usually requiring an increasing number of parameters and computational complexity. However, few attention is paid to the baseline model. In this paper, we research extensively the pros and cons of the standard transformer in document-level translation, and find that the auto-regressive property can simultaneously bring both the advantage of the consistency and the disadvantage of error accumulation. Therefore, we propose a surprisingly simple long-short term masking self-attention on top of the standard transformer to both effectively capture the long-range dependence and reduce the propagation of errors. We examine our approach on the two publicly available document-level datasets. We can achieve a strong result in BLEU and capture discourse phenomena.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at EMNLP 2020 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

04/07/2020

Evaluating Robustness to Input Perturbations for Neural Machine Translation

Xing Niu, Prashant Mathur, Georgiana Dinu, Yaser Al-Onaizan

Keywords Paper

Neural Translation, Neural models, subword methods, relative degradation

0

0

0

0

6:55

03/05/2021

Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation

Biao Zhang, Ankur Bapna, Rico Sennrich, Orhan Firat

Keywords Paper

multilingual transformer, multilingual translation, language-specific modeling, conditional computation

0

0

0

0

15:04

04/07/2020

A Transformer-based Approach for Source Code Summarization

Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

Keywords Paper

Source Summarization, summarization, ablation studies, Transformer-based Approach

0

0

0

0

6:14

08/12/2020

Best Practices for Data-Efficient Modeling in NLG:How to Train Production-Ready Neural Models with Less Data

Ankit Arun, Soumya Batra, Vikas Bhardwaj and
Ashwini Challa, Pinar Donmez, Peyman Heidari, Hakan Inan, Shashank Jain, Anuj Kumar, Shawn Mei, Karthik Mohan, Michael White

Keywords Paper

0

0

0

0

15:01

22/11/2021

Mode-Guided Feature Augmentation for Domain Generalization

Muhammad Haris Khan, Syed Muhammad talha Zaidi, Salman Khan, Fahad Shahbaz Khan

Keywords Paper

out-of-domain robustness, domain generalization, domain adaptation, convolutional neural networks, data augmentation, feature augmentation, subspace similarity, covariate shift, in-domain generalization, robust objective function

0

0

0

0

2:56

05/01/2021

Meta Module Network for Compositional Visual Reasoning

Wenhu Chen, Zhe Gan, Linjie Li and
Yu Cheng, William Wang, Jingjing Liu

Keywords Paper

0

0

0

0

5:13

26/04/2020

Multiplicative Interactions and Where to Find Them

Siddhant M. Jayakumar, Wojciech M. Czarnecki, Jacob Menick and
Jonathan Schwarz, Jack Rae, Simon Osindero, Yee Whye Teh, Tim Harley, Razvan Pascanu

Keywords Paper

multiplicative interactions, hypernetworks, attention

0

0

0

0

5:34

26/04/2020

Reducing Transformer Depth on Demand with Structured Dropout

Angela Fan, Edouard Grave, Armand Joulin

Keywords Paper

reduction, regularization, pruning, dropout, transformer

0

0

0

0

5:01

04/07/2020

Obtaining Faithful Interpretations from Compositional Neural Networks

Sanjay Subramanian, Ben Bogin, Nitish Gupta and
Tomer Wolfson, Sameer Singh, Jonathan Berant, Matt Gardner

Keywords Paper

vision, abstract process, Compositional Networks, Neural networks

0

0

0

1

11:21

02/02/2021

i-Algebra: Towards Interactive Interpretability of Deep Neural Networks

Xinyang Zhang, Ren Pang, Shouling Ji and
Fenglong Ma, Ting Wang

Keywords Paper

0

0

0

0

18:38

02/02/2021

Lexically Constrained Neural Machine Translation with Explicit Alignment Guidance

Guanhua Chen, Yun Chen, Victor O.K. Li

Keywords Paper

0

0

0

0

15:33

26/04/2020

Variational Template Machine for Data-to-Text Generation

Rong Ye, Wenxian Shi, Hao Zhou and
Zhongyu Wei, Lei Li

Keywords Paper

0

0

0

0

4:55

02/02/2021

Adversarial Turing Patterns from Cellular Automata

Nurislam Tursynbek, Ilya Vilkoviskiy, Maria Sindeeva, Ivan Oseledets

Keywords Paper

0

0

0

0

14:50

04/07/2020

It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information

Emanuele Bugliarello, Sabrina J. Mielke, Antonios Anastasopoulos and
Ryan Cotterell, Naoaki Okazaki

Keywords Paper

Measuring Difficulty, generation, asymmetric difficulty, machine difficulty

0

0

0

0

6:52

26/04/2020

A Probabilistic Formulation of Unsupervised Text Style Transfer

Junxian He, Xinyi Wang, Graham Neubig, Taylor Berg-Kirkpatrick

Keywords Paper

unsupervised text style transfer, deep latent sequence model

0

0

0

0

5:02

08/12/2020

Meet Changes with Constancy: Learning Invariance in Multi-Source Translation

Jianfeng Liu, Ling Luo, Xiang Ao and
Yan Song, Haoran Xu, Jian Ye

Keywords Paper

0

0

0

0

13:35

14/09/2020

Squeezing Correlated Neurons for Resource-Efficient Deep Neural Networks

Elbruz Ozen, Alex Orailoglu

Keywords Paper

deep learning, information redundancy, pruning

0

0

0

0

14:48

04/07/2020

Location Attention for Extrapolation to Longer Sequences

Yann Dubois, Gautier Dagan, Dieuwke Hupkes, Elia Bruni

Keywords Paper

Extrapolation, natural processing, generalization, Lookup task

0

0

0

0

11:02

06/12/2020

Incorporating BERT into Parallel Sequence Decoding with Adapters

Junliang Guo, Zhirui Zhang, Linli Xu and
Hao-Ran Wei, Boxing Chen, Enhong Chen

Keywords Paper

0

0

0

0

3:17

06/12/2021

A Framework to Learn with Interpretation

Jayneel Parekh, Pavlo Mozharovskyi, Florence d'Alché-Buc

Keywords Paper

deep learning, interpretability

0

0

0

0

14:05

04/07/2020

Improving Multimodal Named Entity Recognition via Entity Span Detection with Unified Multimodal Transformer

Jianfei Yu, Jing Jiang, Li Yang, Rui Xia

Keywords Paper

Multimodal Recognition, Multimodal MNER, Multimodal, MNER

1

0

0

0

12:52

07/09/2020

Advancing weakly supervised cross-domain alignment with optimal transport

Siyang Yuan, Ke Bai, Liqun Chen and
Yizhe Zhang, Chenyang Tao, Chunyuan Li, Guoyin Wang, Ricardo Henao, Lawrence Carin Duke

Keywords Paper

Optimal Transport, Cross Domain Alignment

0

0

0

0

10:04

06/12/2021

NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM

Connor Holmes, Minjia Zhang, Yuxiong He, Bo Wu

Keywords Paper

optimization, transformers, language

0

0

0

0

10:53

19/08/2021

Improving Stylized Neural Machine Translation with Iterative Dual Knowledge Transfer

Xuanxuan Wu, Jian Liu, Xinjie Li and
Jinan Xu, Yufeng Chen, Yujie Zhang, Hui Huang

Keywords Paper

Natural Language Processing, Machine Translation, Natural Language Generation

0

0

0

0

12:35

05/12/2020

A cascade approach to neural abstractive summarization with content selection and fusion

Logan Lebanoff, Franck Dernoncourt, Doo Soon Kim and
Walter Chang, Fei Liu

Keywords Paper

0

0

0

0

9:58

16/11/2020

Uncertainty-Aware Semantic Augmentation for Neural Machine Translation

Xiangpeng Wei, Heng Yu, Yue Hu and
Rongxiang Weng, Luxi Xing, Weihua Luo

Keywords Paper

sequence-to-sequence task, nmt, inference, translation tasks

0

0

0

0

11:11

14/06/2020

Resolution Adaptive Networks for Efficient Inference

Le Yang, Yizeng Han, Xi Chen and
Shiji Song, Jifeng Dai, Gao Huang

Keywords Paper

adaptive inference, efficient deep learning, multi-scale feature learning, budgeted batch classification

0

0

0

0

0:59

04/07/2020

Generalizing Natural Language Analysis through Span-relation Representations

Zhengbao Jiang, Wei Xu, Jun Araki, Graham Neubig

Keywords Paper

Natural Analysis, Natural processing, dependency parsing, semantic labeling

0

0

0

0

8:30

02/02/2021

DenserNet: Weakly Supervised Visual Localization Using Multi-Scale Feature Aggregation

Dongfang Liu, Yiming Cui, Liqi Yan and
Christos Mousas, Baijian Yang, Yingjie Chen

Keywords Paper

0

0

0

0

16:15

06/12/2020

Evolving Normalization-Activation Layers

Hanxiao Liu, Andy Brock, Karen Simonyan, Quoc V Le

Keywords Paper

0

0

0

0

2:32

02/02/2021

IsoBN: Fine-Tuning BERT with Isotropic Batch Normalization

Wenxuan Zhou, Bill Yuchen Lin, Xiang Ren

Keywords Paper

0

0

0

0

16:25

14/06/2020

Cascaded Human-Object Interaction Recognition

Tianfei Zhou, Wenguan Wang, Siyuan Qi and
Haibin Ling, Jianbing Shen

Keywords Paper

human-object interaction recognition, cascade reasoning, fine-grained relation segmentation

0

0

0

0

1:01

19/10/2020

Flexible IR pipelines with capreolus

Andrew Yates, Kevin Martin Jose, Xinyu Zhang, Jimmy Lin

Keywords Paper

neural information retrieval, retrieval pipeline, ad hoc ranking

0

0

0

0

10:00

14/06/2020

DUNIT: Detection-Based Unsupervised Image-to-Image Translation

Deblina Bhattacharjee, Seungryong Kim, Guillaume Vizier, Mathieu Salzmann

Keywords Paper

style transfer, object detection, unsupervised image to image translation, domain adaptation

0

0

0

0

1:01

22/11/2021

Logsig-RNN: a novel network for robust and efficient skeleton-based action recognition

Hao Ni, Shujian Liao, Weixin Yang and
Kevin Schlegel, Terry J Lyons

Keywords Paper

skeleton-based action recognition, recurrent neural network, log-signature

0

0

0

0

2:58

16/11/2020

Distilling Multiple Domains for Neural Machine Translation

Anna Currey, Prashant Mathur, Georgiana Dinu

Keywords Paper

translation, neural translation, multi-domain model, high-resource conditions

0

0

0

0

12:15

02/02/2021

Meta-Transfer Learning for Low-Resource Abstractive Summarization

Yi-Syuan Chen, Hong-Han Shuai

Keywords Paper

0

0

0

0

19:10

16/11/2020

Tractable Lexical-Functional Grammar

Jürgen Wedekind, Ronald M. Kaplan

Keywords Paper

recognition, emptiness, tractable computation, constraint-based formalisms

0

0

0

0

11:27

14/06/2020

Explorable Super Resolution

Yuval Bahat, Tomer Michaeli

Keywords Paper

super-resolution, gan, editing

0

0

0

0

4:56

03/05/2021

The Unreasonable Effectiveness of Patches in Deep Convolutional Kernels Methods

Louis THIRY, Michael Arbel, Eugene Belilovsky, Edouard Oyallon

Keywords Paper

image classification, convolutional kernel methods

0

0

0

0

6:45