Distilling Multiple Domains for Neural Machine Translation

16/11/2020

Anna Currey, Prashant Mathur, Georgiana Dinu

Keywords: translation, neural translation, multi-domain model, high-resource conditions

Abstract: Neural machine translation achieves impressive results in high-resource conditions, but performance often suffers when the input domain is low-resource. The standard practice of adapting a separate model for each domain of interest does not scale well in terms of either quality (brittleness under domain shift) or cost (added maintenance and inference complexity). In this paper, we propose a framework for training a single multi-domain neural machine translation model that can translate several domains without increasing inference time or memory usage. We show that this model can improve translation on both high- and low-resource domains over strong multi-domain baselines. In addition, our proposed model is effective when domain labels are unknown during training and is robust under noisy data conditions.

The talk and the corresponding paper were published at the EMNLP 2020 virtual conference.

Similar Papers

06/12/2020

MetaPerturb: Transferable Regularizer for Heterogeneous Tasks and Architectures

Jeong Un Ryu, JWoong Shin, Hae Beom Lee, Sung Ju Hwang

3:32

14/06/2020

On the Acceleration of Deep Learning Model Parallelism With Staleness

An Xu, Zhouyuan Huo, Heng Huang

Keywords: layer-wise staleness, asynchronous model parallelism, convolutional neural networks

1:01

06/12/2021

Task-Agnostic Undesirable Feature Deactivation Using Out-of-Distribution Data

Dongmin Park, Hwanjun Song, Minseok Kim, Jae-Gil Lee

Keywords: deep learning, machine learning

14:30

02/02/2021

Step-Ahead Error Feedback for Distributed Training with Compressed Gradient

An Xu, Zhouyuan Huo, Heng Huang

18:26

03/05/2021

Go with the flow: Adaptive control for Neural ODEs

Mathieu Chalvidal, Matthew Ricci, Rufin VanRullen, Thomas Serre

Keywords: Neural ODEs, Normalizing flows, Hypernetworks, Optimal Control Theory

5:03

12/07/2020

Confidence-Aware Learning for Deep Neural Networks

Sangheum Hwang, Jooyoung Moon, Jihyo Kim, Younghak Shin

Keywords: Deep Learning - Algorithms

14:05

26/04/2020

Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee

Wei Hu, Zhiyuan Li, Dingli Yu

Keywords: deep learning theory, regularization, noisy labels

5:13

26/04/2020

Effect of Activation Functions on the Training of Overparametrized Neural Nets

Abhishek Panigrahi, Abhishek Shetty, Navin Goyal

Keywords: activation functions, deep learning theory, neural networks

5:13

02/02/2021

Amata: An Annealing Mechanism for Adversarial Training Acceleration

Nanyang Ye, Qianxiao Li, Xiao-Yun Zhou, Zhanxing Zhu

14:30

26/04/2020

AugMix: A Simple Data Processing Method to Improve Robustness and Uncertainty

Dan Hendrycks, Norman Mu, Ekin Dogus Cubuk, Barret Zoph, Justin Gilmer, Balaji Lakshminarayanan

Keywords: robustness, uncertainty

4:38

03/05/2021

Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch

Aojun Zhou, Yukun Ma, Junnan Zhu, Jianbo Liu, Zhijie Zhang, Kun Yuan, Wenxiu Sun, Hongsheng Li

Keywords: sparsity, efficient training and inference

5:09

06/12/2021

Intrinsic Dimension, Persistent Homology and Generalization in Neural Networks

Tolga Birdal, Aaron Lou, Leonidas Guibas, Umut Simsekli

Keywords: theory, deep learning, optimization

14:38

06/12/2020

Top-KAST: Top-K Always Sparse Training

Sid Jayakumar, Razvan Pascanu, Jack Rae, Simon Osindero, Erich Elsen

3:18

14/09/2020

Squeezing Correlated Neurons for Resource-Efficient Deep Neural Networks

Elbruz Ozen, Alex Orailoglu

Keywords: deep learning, information redundancy, pruning

14:48

16/11/2020

Transformer Based Multi-Source Domain Adaptation

Dustin Wright, Isabelle Augenstein

Keywords: unsupervised adaptation, cnns, rnns, domain classifiers

11:30

18/07/2021

Training Adversarially Robust Sparse Networks via Bayesian Connectivity Sampling

Ozan Özdenizci, Robert Legenstein

Keywords: Algorithms, Adversarial Examples

6:27

14/06/2020

RL-CycleGAN: Reinforcement Learning Aware Simulation-to-Real

Kanishka Rao, Chris Harris, Alex Irpan, Sergey Levine, Julian Ibarz, Mohi Khansari

Keywords: robotics, sim2real, cyclegan, reinforcement learning, grasping, q-learning

4:55

18/07/2021

Nondeterminism and Instability in Neural Network Optimization

Cecilia Summers, Michael J Dinneen

Keywords: Deep Learning, Optimization for Deep Networks

5:12

06/12/2021

CAPE: Encoding Relative Positions with Continuous Augmented Positional Embeddings

Tatiana Likhomanenko, Qiantong Xu, Gabriel Synnaeve, Ronan Collobert, Alex Rogozhnikov

Keywords: deep learning, transformers

13:30

04/07/2020

Improved Natural Language Generation via Loss Truncation

Daniel Kang, Tatsunori Hashimoto

Keywords: Natural Generation, optimization, estimation, distinguishability

10:35

08/12/2020

Optimizing Transformer for Low-Resource Neural Machine Translation

Ali Araabi, Christof Monz

10:02

06/12/2021

Scalable Neural Data Server: A Data Recommender for Transfer Learning

Tianshi Cao, Sasha (Alexandre) Doubov, David Acuna, Sanja Fidler

Keywords: machine learning, vision, transfer learning

12:54

14/06/2020

Closed-Loop Matters: Dual Regression Networks for Single Image Super-Resolution

Yong Guo, Jian Chen, Jingdong Wang, Qi Chen, Jiezhang Cao, Zeshuai Deng, Yanwu Xu, Mingkui Tan

Keywords: computer vision, image super-resolution, dual regression scheme, closed-loop

1:01

04/08/2021

Adversarially Robust Low Dimensional Representations

Pranjal Awasthi, Vaggos Chatziafratis, Xue Chen, Aravindan Vijayaraghavan

20:19

06/12/2020

Semi-Supervised Neural Architecture Search

Renqian Luo, Xu Tan, Rui Wang, Tao Qin, Enhong Chen, Tie-Yan Liu

3:20

19/08/2021

A Survey on Low-Resource Neural Machine Translation

Rui Wang, Xu Tan, Renqian Luo, Tao Qin, Tie-Yan Liu

Keywords: Natural language processing, General, General

13:42

12/07/2020

Automated Synthetic-to-Real Generalization

Wuyang Chen, Zhiding Yu, Zhangyang Wang, Anima Anandkumar

Keywords: Transfer, Multitask and Meta-learning

9:24

04/07/2020

Variational Neural Machine Translation with Normalizing Flows

Hendra Setiawan, Matthias Sperber, Udhyakumar Nallasamy, Matthias Paulik

Keywords: Variational Translation, Variational VNMT, Variational, generation translations

7:09

06/12/2021

Adversarial Robustness with Semi-Infinite Constrained Learning

Alexander Robey, Luiz Chamon, George J. Pappas, Hamed Hassani, Alejandro Ribeiro

Keywords: theory, deep learning, optimization, robustness, adversarial robustness and security

14:48

04/07/2020

Location Attention for Extrapolation to Longer Sequences

Yann Dubois, Gautier Dagan, Dieuwke Hupkes, Elia Bruni

Keywords: Extrapolation, natural processing, generalization, Lookup task

11:02

14/06/2020

Gradually Vanishing Bridge for Adversarial Domain Adaptation

Shuhao Cui, Shuhui Wang, Junbao Zhuo, Chi Su, Qingming Huang, Qi Tian

Keywords: bridge, domain adaptation, adversarial learning

1:01

14/06/2020

DeepDeform: Learning Non-Rigid RGB-D Reconstruction With Semi-Supervised Data

Aljaž Božič, Michael Zollhöfer, Christian Theobalt, Matthias Nießner

Keywords: non-rigid reconstruction, non-rigid tracking, dataset, benchmark, correspondence prediction, heatmap network, rgb-d, single camera, least squares optimization

1:00

08/12/2020

Meet Changes with Constancy: Learning Invariance in Multi-Source Translation

Jianfeng Liu, Ling Luo, Xiang Ao, Yan Song, Haoran Xu, Jian Ye

13:35

03/05/2021

Share or Not? Learning to Schedule Language-Specific Capacity for Multilingual Translation

Biao Zhang, Ankur Bapna, Rico Sennrich, Orhan Firat

Keywords: multilingual transformer, multilingual translation, language-specific modeling, conditional computation

15:04

06/12/2021

Attention over Learned Object Embeddings Enables Complex Visual Reasoning

David Ding, Felix Hill, Adam Santoro, Malcolm Reynolds, Matt Botvinick

Keywords: deep learning, transformers, vision

18:51

03/05/2021

Revisiting Locally Supervised Learning: an Alternative to End-to-end Training

Yulin Wang, Zanlin Ni, Shiji Song, Le Yang, Gao Huang

Keywords: Deep learning, Locally supervised training

5:03

19/04/2021

Zero-shot neural passage retrieval via domain-targeted synthetic question generation

Ji Ma, Ivan Korotkov, Yinfei Yang, Keith Hall, Ryan McDonald

12:47

26/04/2020

MACER: Attack-free and Scalable Robust Training via Maximizing Certified Radius

Runtian Zhai, Chen Dan, Di He, Huan Zhang, Boqing Gong, Pradeep Ravikumar, Cho-Jui Hsieh, Liwei Wang

Keywords: Adversarial Robustness, Provable Adversarial Defense, Randomized Smoothing, Robustness Certification

5:10

06/12/2020

Learning Invariances in Neural Networks from Training Data

Greg Benton, Marc Finzi, Pavel Izmailov, Andrew Wilson

3:03

06/12/2021

Network-to-Network Regularization: Enforcing Occam's Razor to Improve Generalization

Rohan Ghosh, Mehul Motani

Keywords: theory, deep learning, machine learning

14:07