Finding Sparse Structures for Domain Specific Neural Machine Translation

02/02/2021

Jianze Liang, Chengqi Zhao, Mingxuan Wang, Xipeng Qiu, Lei Li

Abstract: Neural machine translation often adopts the fine-tuning approach to adapt to specific domains. However, unrestricted fine-tuning can easily degrade performance on the general domain and over-fit to the target domain. To mitigate this issue, we propose Prune-Tune, a novel domain adaptation method based on gradual pruning. It learns tiny domain-specific sub-networks during fine-tuning on new domains. Prune-Tune alleviates both the over-fitting and the degradation problems without modifying the model. Furthermore, Prune-Tune can sequentially learn a single network that contains multiple disjoint domain-specific sub-networks, one for each domain. Experimental results show that Prune-Tune outperforms several strong competitors on the target-domain test set without sacrificing quality on the general domain, in both single-domain and multi-domain settings. The source code and data are available at https://github.com/ohlionel/Prune-Tune.
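
The idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the cubic sparsity schedule, the magnitude-based pruning criterion, and the function names `sparsity_at`, `general_mask`, and `finetune_step` are all assumptions made for illustration. The sketch prunes low-magnitude weights of a general-domain model, then lets only the freed slots change during fine-tuning, so the retained general-domain weights stay untouched.

```python
def sparsity_at(step, total_steps, final_sparsity):
    """Assumed gradual schedule: cubic ramp from 0 up to final_sparsity."""
    t = min(max(step / total_steps, 0.0), 1.0)
    return final_sparsity * (1.0 - (1.0 - t) ** 3)


def general_mask(weights, sparsity):
    """Return 1 for weights kept by the general sub-network, 0 for pruned slots.

    Assumption: the smallest-magnitude weights are the ones pruned.
    """
    n_prune = int(round(sparsity * len(weights)))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


def finetune_step(weights, grads, keep_mask, lr):
    """Apply one SGD step to the freed slots only; kept weights are frozen."""
    return [w if keep else w - lr * g
            for w, g, keep in zip(weights, grads, keep_mask)]


# Toy general-domain weights (hypothetical values).
weights = [0.9, -0.05, 0.4, 0.01, -0.7, 0.02]

# At the end of the schedule, half of the weights are pruned.
mask = general_mask(weights, sparsity_at(10, 10, 0.5))
pruned_weights = [w * m for w, m in zip(weights, mask)]

# Fine-tune on the target domain: only the pruned (free) slots move.
grads = [1.0] * len(weights)
adapted = finetune_step(pruned_weights, grads, mask, lr=0.1)
```

In this toy setting the three largest-magnitude weights form the general sub-network and keep their values after the fine-tuning step, while the three freed slots become the domain-specific sub-network. Supporting several domains would amount to assigning each domain its own disjoint subset of the freed slots.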

The video of this talk cannot be embedded. You can watch it here:

https://slideslive.com/38948417

The talk and the corresponding paper were published at the AAAI 2021 virtual conference.

Similar Papers

06/12/2021

GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training

Chen Zhu, Renkun Ni, Zheng Xu, Kezhi Kong, W. Ronny Huang, Tom Goldstein

Keywords Paper

deep learning, transformers, vision

13:17

08/12/2020

Meet Changes with Constancy: Learning Invariance in Multi-Source Translation

Jianfeng Liu, Ling Luo, Xiang Ao, Yan Song, Haoran Xu, Jian Ye

Keywords Paper

13:35

05/04/2021

Pipelined Backpropagation at Scale: Training Large Models without Batches

Atli Kosson, Vitaliy Chiley, Abhi Venigalla, Joel Hestness, Urs Koster

Keywords Paper

4:14

04/07/2020

Deep Contextualized Self-training for Low Resource Dependency Parsing

Guy Rotman, Roi Reichart

Keywords Paper

Low Parsing, sequence tasks, Deep Self-training, Neural parsing

11:41

06/12/2020

Model Fusion via Optimal Transport

Sidak Pal Singh, Martin Jaggi

Keywords Paper

3:10

16/11/2020

Transformer Based Multi-Source Domain Adaptation

Dustin Wright, Isabelle Augenstein

Keywords Paper

unsupervised adaptation, cnns, rnns, domain classifiers

11:30

06/12/2021

Improving Compositionality of Neural Networks by Decoding Representations to Inputs

Mike Wu, Noah Goodman, Stefano Ermon

Keywords Paper

deep learning, machine learning, adversarial robustness and security, generative model

12:36

03/05/2021

Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation

Jungo Kasai, Nikolaos Pappas, Hao Peng, James Cross, Noah Smith

Keywords Paper

Machine Translation, Sequence Modeling, Natural Language Processing

5:04

03/05/2021

Conditional Generative Modeling via Learning the Latent Space

Sameera Ramasinghe, Kanchana Ranasinghe, Salman Khan, Nick Barnes, Stephen Gould

Keywords Paper

Generative Modeling, Conditional Generation, Multimodal Spaces

4:57

04/07/2020

Variational Neural Machine Translation with Normalizing Flows

Hendra Setiawan, Matthias Sperber, Udhyakumar Nallasamy, Matthias Paulik

Keywords Paper

Variational Translation, Variational VNMT, Variational, generation translations

7:09

26/04/2020

Reducing Transformer Depth on Demand with Structured Dropout

Angela Fan, Edouard Grave, Armand Joulin

Keywords Paper

reduction, regularization, pruning, dropout, transformer

5:01

06/12/2021

Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems

Subhabrata Dutta, Tanya Gautam, Soumen Chakrabarti, Tanmoy Chakraborty

Keywords Paper

deep learning, transformers

11:54

02/02/2021

Any-Precision Deep Neural Networks

Haichao Yu, Haoxiang Li, Humphrey Shi, Thomas S. Huang, Gang Hua

Keywords Paper

14:26

14/06/2020

Computing the Testing Error Without a Testing Set

Ciprian A. Corneanu, Sergio Escalera, Aleix M. Martinez

Keywords Paper

deep learning, algebraic topology, generalization, object recognition, facial analysis, semantic segmentation

4:43

26/04/2020

Domain Adaptive Multibranch Networks

Róger Bermúdez-Chacón, Mathieu Salzmann, Pascal Fua

Keywords Paper

Domain Adaptation, Computer Vision

5:26

18/07/2021

Sparsifying Networks via Subdifferential Inclusion

Sagar Verma, Jean-Christophe Pesquet

Keywords Paper

Optimization, Convex Optimization

5:10

26/04/2020

Generalization through Memorization: Nearest Neighbor Language Models

Urvashi Khandelwal, Omer Levy, Dan Jurafsky, Luke Zettlemoyer, Mike Lewis

Keywords Paper

language models, k-nearest neighbors

4:56

16/11/2020

Distilling Multiple Domains for Neural Machine Translation

Anna Currey, Prashant Mathur, Georgiana Dinu

Keywords Paper

translation, neural translation, multi-domain model, high-resource conditions

12:15

14/09/2020

Squeezing Correlated Neurons for Resource-Efficient Deep Neural Networks

Elbruz Ozen, Alex Orailoglu

Keywords Paper

deep learning, information redundancy, pruning

14:48

18/07/2021

f-Domain Adversarial Learning: Theory and Algorithms

David Acuna, Guojun Zhang, Marc Law, Sanja Fidler

Keywords Paper

Algorithms, Multitask, Transfer, and Meta Learning

5:17

03/05/2021

Auxiliary Task Update Decomposition: The Good, the Bad and the Neutral

Lucio Dery, Yann Dauphin, David Grangier

Keywords Paper

multitask learning, deeplearning, pre-training, gradient decomposition

5:22

19/10/2020

Flexible IR pipelines with capreolus

Andrew Yates, Kevin Martin Jose, Xinyu Zhang, Jimmy Lin

Keywords Paper

neural information retrieval, retrieval pipeline, ad hoc ranking

10:00

12/07/2020

Operation-Aware Soft Channel Pruning using Differentiable Masks

Minsoo Kang, Bohyung Han

Keywords Paper

Applications - Computer Vision

14:56

03/05/2021

Meta-Learning with Neural Tangent Kernels

Yufan Zhou, Zhenyi Wang, Jiayi Xian, Changyou Chen, Jinhui Xu

Keywords Paper

neural tangent kernel, meta-learning

3:54

16/11/2020

Reproducible and Efficient Benchmarks for Hyperparameter Optimization of Neural Machine Translation Systems

Xuan Zhang, Kevin Duh

Keywords Paper

hyperparameter selection, neural systems, automatic optimization, nmt

11:38

03/05/2021

Dataset Meta-Learning from Kernel Ridge-Regression

Timothy Nguyen, Zhourong Chen, Jaehoon Lee

Keywords Paper

dataset corruption, infinite-width networks, neural kernels, kernel-ridge regression, dataset compression, dataset distillation, meta-learning

4:59

03/05/2021

Generalized Multimodal ELBO

Thomas Sutter, Imant Daunhawer, Julia E Vogt

Keywords Paper

self-supervised, generative learning, ELBO, VAE, Multimodal

5:15

04/07/2020

Harvesting and Refining Question-Answer Pairs for Unsupervised QA

Zhongli Li, Wenhui Wang, Li Dong, Furu Wei, Ke Xu

Keywords Paper

Unsupervised QA, Question Answering, Question QA, QA

10:28

18/07/2021

Learning Neural Network Subspaces

Mitchell Wortsman, Maxwell Horton, Carlos Guestrin, Ali Farhadi, Mohammad Rastegari

Keywords Paper

Deep Learning, Applications, Dialog- or Communication-Based Learning, Algorithms, Representation Learning

5:07

06/12/2021

Learning Transferable Adversarial Perturbations

Krishna kanth Nakka, Mathieu Salzmann

Keywords Paper

deep learning, optimization, adversarial robustness and security

12:00

12/07/2020

Incremental Sampling Without Replacement for Sequence Models

Kensen Shi, David Bieber, Charles Sutton

Keywords Paper

Sequential, Network, and Time-Series Modeling

14:46

12/07/2020

Evolving Machine Learning Algorithms From Scratch

Esteban Real, Chen Liang, David So, Quoc Le

Keywords Paper

Transfer, Multitask and Meta-learning

15:01

05/04/2021

CODE: Compiler-based Neuron-aware Ensemble training

Ettore M. G. Trainiti, Thanapon Noraset, David Demeter, Doug Downey, Simone Campanoni

Keywords Paper

Neuroscience and Cognitive Science -> Memory, Neuroscience and Cognitive Science -> Plasticity and Adaptation

18:48

05/01/2021

G2D: Generate to Detect Anomaly

Masoud Pourreza, Bahram Mohammadi, Mostafa Khaki, Samir Bouindour, Hichem Snoussi, Mohammad Sabokrou

Keywords Paper

5:12

03/05/2021

In Search of Lost Domain Generalization

Ishaan Gulrajani, David Lopez-Paz

Keywords Paper

reproducible research, domain generalization

5:38

03/05/2021

Meta Back-Translation

Hieu Pham, Xinyi Wang, Yiming Yang, Graham Neubig

Keywords Paper

back translation, machine translation, meta learning

5:07

03/05/2021

Neural Pruning via Growing Regularization

Huan Wang, Can Qin, Yulun Zhang, Yun Fu

Keywords Paper

deep neural network pruning, regularization, Hessian matrix, model compression

6:15

06/12/2021

On Calibration and Out-of-Domain Generalization

Yoav Wald, Amir Feder, Daniel Greenfeld, Uri Shalit

Keywords Paper

machine learning, domain adaptation, causality

11:00

22/11/2021

FFNB: Forgetting-Free Neural Blocks for Deep Continual Learning

Hichem Sahbi, Haoming Zhan

Keywords Paper

Continual and incremental learning, lifelong learning, catastrophic interference, catastrophic forgetting, dynamic neural networks, visual recognition

3:05