Future-Guided Incremental Transformer for Simultaneous Translation

02/02/2021

Shaolei Zhang, Yang Feng, Liangyou Li

Abstract: Simultaneous translation (ST) begins translating while the source sentence is still being read, and is used in many online scenarios. The wait-k policy is simple and has achieved good results in ST. However, it has two weaknesses: slow training caused by the recalculation of hidden states, and a lack of future source information to guide training. To address the slow training, we propose an incremental Transformer with an average embedding layer (AEL) that accelerates the calculation of hidden states during training. For future-guided training, we use a conventional Transformer as the teacher of the incremental Transformer, and try to implicitly embed some future information in the model through knowledge distillation. We conducted experiments on Chinese-English and German-English simultaneous translation tasks and compared against the wait-k policy to evaluate the proposed method. Our method increases training speed by about 28 times on average across different k and implicitly embeds some predictive ability in the model, achieving better translation quality than the wait-k baseline.
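
The abstract refers to the wait-k policy without spelling it out. As a rough illustration only, the sketch below assumes the common formulation of wait-k, in which the model first reads k source tokens and then alternates between writing one target token and reading one more source token until the source is exhausted. The function name `wait_k_visible_source` and its parameters are hypothetical and do not come from the paper.

```python
from typing import List


def wait_k_visible_source(k: int, src_len: int, tgt_len: int) -> List[int]:
    """Return, for each target position, how many source tokens are visible.

    Assumes the usual wait-k schedule: target token t (0-indexed) is
    generated after reading min(k + t, src_len) source tokens.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return [min(k + t, src_len) for t in range(tgt_len)]


if __name__ == "__main__":
    # With k=3 and a 6-token source, the decoder sees 3 tokens for the
    # first output, then one more per step until the source is exhausted.
    print(wait_k_visible_source(k=3, src_len=6, tgt_len=7))
    # → [3, 4, 5, 6, 6, 6, 6]
```

Because each target position sees a different source prefix, a standard Transformer would have to recompute the hidden states for every prefix during training, which is the recalculation cost the abstract describes.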

The video of this talk is available at: https://slideslive.com/38948467

The talk and the respective paper were published at the AAAI 2021 virtual conference.

Similar Papers

02/02/2021

Adaptive Teaching of Temporal Logic Formulas to Preference-based Learners

Zhe Xu, Yuxin Chen, Ufuk Topcu

19:42

03/05/2021

Contrastive Learning with Adversarial Perturbations for Conditional Text Generation

Seanie Lee, Dong Bok Lee, Sung Ju Hwang

Keywords: contrastive learning, conditional text generation

4:51

12/07/2020

Teaching with Limited Information on the Learner's Behaviour

Ferdinando Cicalese, Francisco Sergio de Freitas Filho, Eduardo Laber, Marco Molinaro

Keywords: Learning Theory

15:07

16/11/2020

Plug and Play Autoencoders for Conditional Text Generation

Florian Mai, Nikolaos Pappas, Ivan Montero, Noah A. Smith, James Henderson

Keywords: conditional tasks, style transfer, style tasks, text autoencoders

9:23

18/07/2021

Self-supervised and Supervised Joint Training for Resource-rich Machine Translation

Yong Cheng, Wei Wang, Lu Jiang, Wolfgang Macherey

Keywords: Applications, Natural Language Processing

5:21

04/07/2020

Distilling Knowledge Learned in BERT for Text Generation

Yen-Chun Chen, Zhe Gan, Yu Cheng, Jingzhou Liu, Jingjing Liu

Keywords: Text Generation, language tasks, language generation, generation tasks

10:41

18/07/2021

CARTL: Cooperative Adversarially-Robust Transfer Learning

Dian Chen, Hongxin Hu, Qian Wang, Li Yinli, Cong Wang, Chao Shen, Qi Li

Keywords: Algorithms, Adversarial Examples

19:32

16/11/2020

Adversarial Self-Supervised Data-Free Distillation for Text Classification

Xinyin Ma, Yongliang Shen, Gongfan Fang, Chen Chen, Chenghao Jia, Weiming Lu

Keywords: nlp tasks, nlp, compressing models, text generation

9:36

02/02/2021

Learning to Reweight with Deep Interactions

Yang Fan, Yingce Xia, Lijun Wu, Shufang Xie, Weiqing Liu, Jiang Bian, Tao Qin, Xiang-Yang Li

14:06

12/07/2020

Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training

Xuxi Chen, Wuyang Chen, Tianlong Chen, Ye Yuan, Chen Gong, Kewei Chen, Zhangyang Wang

Keywords: Supervised Learning

7:05

01/07/2020

Distill, Adapt, Distill: Training Small, In-Domain Models for Neural Machine Translation

Mitchell Gordon, Kevin Duh

8:59

19/08/2021

MRD-Net: Multi-Modal Residual Knowledge Distillation for Spoken Question Answering

Chenyu You, Nuo Chen, Yuexian Zou

Keywords: Natural Language Processing, Question Answering, Sentiment Analysis and Text Mining, Speech

12:23

06/12/2021

Iterative Teacher-Aware Learning

Luyao Yuan, Dongruo Zhou, Junhong Shen, Jingdong Gao, Jeffrey L Chen, Quanquan Gu, Ying Nian Wu, Song-Chun Zhu

Keywords: theory, optimization, reinforcement learning and planning, machine learning

6:40

16/11/2020

Self-Paced Learning for Neural Machine Translation

Yu Wan, Baosong Yang, Derek F. Wong, Yikai Zhou, Lidia S. Chao, Haibo Zhang, Boxing Chen

Keywords: neural, curriculum learning, translation tasks, nmt

6:03

01/07/2020

Training and Inference Methods for High-Coverage Neural Machine Translation

Michael Yang, Yixin Liu, Rahul Mayuranath

7:17

02/02/2021

Improving Tree-Structured Decoder Training for Code Generation via Mutual Learning

Binbin Xie, Jinsong Su, Yubin Ge, Xiang Li, Jianwei Cui, Junfeng Yao, Bin Wang

15:57

18/07/2021

Training Data Subset Selection for Regression with Controlled Generalization Error

Durga S, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

Keywords: Algorithms, Online Learning, Supervised Learning

4:15

03/05/2021

MixKD: Towards Efficient Distillation of Large-scale Language Models

Kevin Liang, Weituo Hao, Dinghan Shen, Yufan Zhou, Weizhu Chen, Changyou Chen, Lawrence Carin

Keywords: Representation Learning, Natural Language Processing

3:52

06/12/2020

Training Stronger Baselines for Learning to Optimize

Tianlong Chen, Weiyi Zhang, Zhou Jingyang, Shiyu Chang, Sijia Liu, Lisa Amini, Zhangyang Wang

3:18

16/11/2020

Q-learning with Language Model for Edit-based Unsupervised Summarization

Ryosuke Kohita, Akifumi Wachi, Yang Zhao, Ryuki Tachibana

Keywords: abstractive text summarization, unsupervised summarization, unsupervised summarizers, unsupervised methods

12:32

12/07/2020

Improving Transformer Optimization Through Better Initialization

Xiao Shi Huang, Felipe Perez, Jimmy Ba, Maksims Volkovs

Keywords: Sequential, Network, and Time-Series Modeling

14:52

03/05/2021

Few-Shot Bayesian Optimization with Deep Kernel Surrogates

Martin Wistuba, Josif Grabocka

Keywords: automl, bayesian optimization, metalearning, few-shot learning

5:18

16/11/2020

TeaForN: Teacher-Forcing with N-grams

Sebastian Goodman, Nan Ding, Radu Soricut

Keywords: machine benchmark, news benchmarks, sequence models, teacher-forcing

12:02

02/02/2021

Automatic Curriculum Learning With Over-repetition Penalty for Dialogue Policy Learning

Yangyang Zhao, Zhenyu Wang, Zhenhua Huang

15:41

14/09/2020

Companion Guided Soft Margin for Face Recognition

Yingcheng Su, Yichao Wu, Zhenmao Li, Ken Chen, Ding Liang, Xiaolin Hu, Junjie Yan

Keywords: face recognition, companion guided soft margin, sample-wise adaptive margin

15:44

06/12/2021

Stylized Dialogue Generation with Multi-Pass Dual Learning

Jinpeng Li, Yingce Xia, Rui Yan, Hongda Sun, Dongyan Zhao, Tie-Yan Liu

3:16

04/07/2020

Uncertainty-Aware Curriculum Learning for Neural Machine Translation

Yikai Zhou, Baosong Yang, Derek F. Wong, Yu Wan, Lidia S. Chao

Keywords: Neural Translation, assessment difficulty, translation tasks, Uncertainty-Aware Learning

8:20

19/10/2020

Ensembled CTR prediction via knowledge distillation

Jieming Zhu, Jinyang Liu, Weiqi Li, Jincai Lai, Xiuqiang He, Liang Chen, Zibin Zheng

Keywords: model ensemble, knowledge distillation, ctr prediction, online advertising, recommender systems

9:26

04/07/2020

Unsupervised Word Translation with Adversarial Autoencoder

Tasnim Mohiuddin, Shafiq Joty

Keywords: Unsupervised Translation, machine translation, transfer learning, word task

14:56

16/11/2020

Lifelong Language Knowledge Distillation

Yung-Sung Chuang, Shang-Yu Su, Yun-Nung Chen

Keywords: lll tasks, sequence generation, text tasks, lifelong

11:46

25/07/2020

Balancing reinforcement learning training experiences in interactive information retrieval

Limin Chen, Zhiwen Tang, Grace Hui Yang

Keywords: deep reinforcement learning, interactive IR, dynamic search

10:22

19/04/2021

Active learning for sequence tagging with deep pre-trained models and Bayesian uncertainty estimates

Artem Shelmanov, Dmitri Puzyrev, Lyubov Kupriyanova, Denis Belyakov, Daniil Larionov, Nikita Khromov, Olga Kozlova, Ekaterina Artemova, Dmitry V. Dylov, Alexander Panchenko

11:47

03/05/2021

FastSpeech 2: Fast and High-Quality End-to-End Text to Speech

Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

Keywords: end-to-end, non-autoregressive generation, speech synthesis, one-to-many mapping, text to speech

7:01

06/12/2020

Information-theoretic Task Selection for Meta-Reinforcement Learning

Ricardo Luna Gutierrez, Matteo Leonetti

2:57

04/07/2020

Empowering Active Learning to Jointly Optimize System and User Demands

Ji-Ung Lee, Christian M. Meyer, Iryna Gurevych

Keywords: educational application, Active Learning, end-user application, active approach

12:00

12/07/2020

Countering Language Drift with Seeded Iterated Learning

Yuchen Lu, Soumye Singhal, Florian Strub, Aaron Courville, Olivier Pietquin

Keywords: Deep Learning - Algorithms

14:25

04/07/2020

Jointly Masked Sequence-to-Sequence Model for Non-Autoregressive Neural Machine Translation

Junliang Guo, Linli Xu, Enhong Chen

Keywords: Non-Autoregressive Translation, natural tasks, non-autoregressive translation (NAT), non-autoregressive

10:47

19/08/2021

Solving Math Word Problems with Teacher Supervision

Zhenwen Liang, Xiangliang Zhang

Keywords: Machine Learning Applications, Applications of Supervised Learning, Question Answering

14:41

16/11/2020

Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection

Shaolei Wang, Zhongyuan Wang, Wanxiang Che, Ting Liu

Keywords: disfluency detection, self-supervised techniques, unsupervised paradigm, noisy training

9:57

03/05/2021

Meta-learning with negative learning rates

Alberto Bernacchia

Keywords: Meta-learning

5:19