Reproducible and Efficient Benchmarks for Hyperparameter Optimization of Neural Machine Translation Systems

16/11/2020

Xuan Zhang, Kevin Duh

Keywords: hyperparameter selection, neural systems, automatic optimization, nmt

Abstract: Hyperparameter selection is a crucial part of building neural machine translation (NMT) systems across both academia and industry. Fine-grained adjustments to a model's architecture or training recipe can mean the difference between a positive and negative research result or between a state-of-the-art and underperforming system. While recent literature has proposed methods for automatic hyperparameter optimization (HPO), there has been limited work on applying these methods to NMT, due in part to the high costs associated with experiments that train large numbers of model variants. To facilitate research in this space, we introduce a lookup-based approach that uses a library of pre-trained models for fast, low-cost HPO experimentation. Our contributions include (1) the release of a large collection of trained NMT models covering a wide range of hyperparameters, (2) the proposal of targeted metrics for evaluating HPO methods on NMT, and (3) a reproducible benchmark of several HPO methods against our model library, including novel graph-based and multiobjective methods.
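
As a rough illustration of the lookup-based idea in the abstract, the minimal Python sketch below runs a random-search HPO loop against a table of precomputed scores instead of training any model. Everything in it is an assumption made for illustration: the names `MODEL_LIBRARY`, `evaluate`, and `random_search`, the two hyperparameters, and the score values are hypothetical placeholders, not the authors' released data or code.

```python
import random

# Toy "model library": each key is a hyperparameter configuration
# (num_layers, embedding_size) and each value is a precomputed dev score.
# All numbers are illustrative placeholders, not results from the paper.
MODEL_LIBRARY = {
    (2, 256): 20.0,
    (2, 512): 21.5,
    (4, 256): 22.0,
    (4, 512): 23.5,
    (6, 256): 22.5,
    (6, 512): 24.0,
}


def evaluate(config):
    """Return the stored score for a configuration; no training is run."""
    return MODEL_LIBRARY[config]


def random_search(budget, seed=0):
    """Sample `budget` distinct configurations and return the best one found."""
    rng = random.Random(seed)
    candidates = rng.sample(sorted(MODEL_LIBRARY), k=budget)
    best = max(candidates, key=evaluate)
    return best, evaluate(best)


best_config, best_score = random_search(budget=3)
print(best_config, best_score)
```

Because each evaluation is a dictionary lookup, the same search can be repeated with many seeds and budgets at negligible cost, which is what would make comparisons between HPO methods cheap and reproducible in a setup like this.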

The talk and the corresponding paper were published at the EMNLP 2020 virtual conference.

Similar Papers

06/12/2021

GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training

Chen Zhu, Renkun Ni, Zheng Xu, Kezhi Kong, W. Ronny Huang, Tom Goldstein

Keywords: deep learning, transformers, vision

13:17

05/04/2021

Pipelined Backpropagation at Scale: Training Large Models without Batches

Atli Kosson, Vitaliy Chiley, Abhi Venigalla, Joel Hestness, Urs Koster

4:14

05/04/2021

Pipelined Backpropagation at Scale: Training Large Models without Batches

Atli Kosson, Vitaliy Chiley, Abhi Venigalla, Joel Hestness, Urs Koster

18:00

06/12/2021

NAS-Bench-x11 and the Power of Learning Curves

Shen Yan, Colin White, Yash Savani, Frank Hutter

Keywords: deep learning

14:03

18/07/2021

Meta-learning Hyperparameter Performance Prediction with Neural Processes

Ying WEI, Peilin Zhao, Junzhou Huang

Keywords: Algorithms, Supervised Learning

5:07

12/07/2020

Small Data, Big Decisions: Model Selection in the Small-Data Regime

Jorg Bornschein, Francesco Visin, Simon Osindero

Keywords: Deep Learning - General

11:47

14/06/2020

Computing the Testing Error Without a Testing Set

Ciprian A. Corneanu, Sergio Escalera, Aleix M. Martinez

Keywords: deep learning, algebraic topology, generalization, object recognition, facial analysis, semantic segmentation

4:43

19/08/2021

A Survey on Low-Resource Neural Machine Translation

Rui Wang, Xu Tan, Renqian Luo, Tao Qin, Tie-Yan Liu

Keywords: Natural language processing, General

13:42

13/04/2021

Faster & more reliable tuning of neural networks: Bayesian optimization with importance sampling

Setareh Ariafar, Zelda Mariet, Dana Brooks, Jennifer Dy, Jasper Snoek

3:01

12/07/2020

Evolving Machine Learning Algorithms From Scratch

Esteban Real, Chen Liang, David So, Quoc Le

Keywords: Transfer, Multitask and Meta-learning

15:01

03/05/2021

Dataset Condensation with Gradient Matching

Bo ZHAO, Konda Reddy Mopuri, Hakan Bilen

Keywords: dataset condensation, image generation, data-efficient learning

15:09

12/07/2020

Learning to Rank Learning Curves

Martin Wistuba, Tejaswini Pedapati

Keywords: Transfer, Multitask and Meta-learning

15:34

08/12/2020

Dynamic Curriculum Learning for Low-Resource Neural Machine Translation

Chen Xu, Bojie Hu, Yufan Jiang, Kai Feng, Zeyang Wang, Shen Huang, Qi Ju, Tong Xiao, Jingbo Zhu

13:28

03/05/2021

Dataset Meta-Learning from Kernel Ridge-Regression

Timothy Nguyen, Zhourong Chen, Jaehoon Lee

Keywords: dataset corruption, infinite-width networks, neural kernels, kernel-ridge regression, dataset compression, dataset distillation, meta-learning

4:59

26/04/2020

Once for All: Train One Network and Specialize it for Efficient Deployment

Han Cai, Chuang Gan, Tianzhe Wang, Zhekai Zhang, Song Han

Keywords: Efficient Deep Learning, Specialized Neural Network Architecture, AutoML

4:53

02/02/2021

Any-Precision Deep Neural Networks

Haichao Yu, Haoxiang Li, Humphrey Shi, Thomas S. Huang, Gang Hua

14:26

06/12/2020

Efficient Algorithms for Device Placement of DNN Graph Operators

Jakub Tarnawski, Amar Phanishayee, Nikhil Devanur, Divya Mahajan, Fanny Nina Paravecino

3:20

18/07/2021

Opening the Blackbox: Accelerating Neural Differential Equations by Regularizing Internal Solver Heuristics

Avik Pal, Yingbo Ma, Viral Shah, Christopher Rackauckas

Keywords: Deep Learning

5:11

02/02/2021

Provable Benefits of Overparameterization in Model Compression: From Double Descent to Pruning Neural Networks

Xiangyu Chang, Yingcong Li, Samet Oymak, Christos Thrampoulidis

18:14

14/09/2020

Active Learning for Hierarchical Multi-Label Classification

Felipe Kenji Nakano, Ricardo Cerri, Vens Celin

15:42

19/08/2021

Hardware-Aware Neural Architecture Search: Survey and Taxonomy

Hadjer Benmeziane, Kaoutar El Maghraoui, Hamza Ouarnoughi, Smail Niar, Martin Wistuba, Naigang Wang

Keywords: Machine learning, General

14:12

06/12/2021

MEST: Accurate and Fast Memory-Economic Sparse Training Framework on the Edge

Geng Yuan, Xiaolong Ma, Wei Niu, Zhengang Li, Zhenglun Kong, Ning Liu, Yifan Gong, Zheng Zhan, Chaoyang He, Qing Jin, Siyue Wang, Minghai Qin, Bin Ren, Yanzhi Wang, Sijia Liu, Xue Lin

Keywords: deep learning, reinforcement learning and planning

15:00

06/12/2021

BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer

Haoping Bai, Meng Cao, Ping Huang, Jiulong Shan

Keywords: deep learning, optimization

4:12

04/07/2020

Deep Contextualized Self-training for Low Resource Dependency Parsing

Guy Rotman, Roi Reichart

Keywords: Low Parsing, sequence tasks, Deep Self-training, Neural parsing

11:41

06/12/2021

Learning Transferable Adversarial Perturbations

Krishna kanth Nakka, Mathieu Salzmann

Keywords: deep learning, optimization, adversarial robustness and security

12:00

16/11/2020

On the Sparsity of Neural Machine Translation Models

Yong Wang, Longyue Wang, Victor Li, Zhaopeng Tu

Keywords: nmt architectures, over-parameterization, underutilization resources, redundant parameters

6:18

03/05/2021

Complex Query Answering with Neural Link Predictors

Erik Arakelyan, Daniel Daza, Pasquale Minervini, Michael Cochez

Keywords: neural link prediction, complex query answering

15:28

06/12/2021

Differentiable Optimization of Generalized Nondecomposable Functions using Linear Programs

Zihang Meng, Lopamudra Mukherjee, Yichao Wu, Vikas Singh, Sathya Narayanan Ravi

Keywords: deep learning, optimization

13:21

14/06/2020

F-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation

Konstantin Sofiiuk, Ilia Petrov, Olga Barinova, Anton Konushin

Keywords: interactive segmentation, interactive, instance segmentation, segmentation, backpropagating refinement, refinement

4:56

03/05/2021

Theoretical bounds on estimation error for meta-learning

James Lucas, Mengye Ren, Irene Raissa KAMENI KAMENI, Toniann Pitassi, Richard Zemel

Keywords: meta learning, minimax risk, few-shot, lower bounds, learning theory

4:46

18/07/2021

Sparsifying Networks via Subdifferential Inclusion

Sagar Verma, Jean-Christophe Pesquet

Keywords: Optimization, Convex Optimization

5:10

18/07/2021

Training Data Subset Selection for Regression with Controlled Generalization Error

Durga S, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

Keywords: Algorithms, Online Learning, Algorithms, Supervised Learning

4:15

06/12/2021

Near-Optimal Multi-Perturbation Experimental Design for Causal Structure Learning

Scott Sussex, Caroline Uhler, Andreas Krause

Keywords: causality

14:14

03/05/2021

Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch

Aojun Zhou, Yukun Ma, Junnan Zhu, Jianbo Liu, Zhijie Zhang, Kun Yuan, Wenxiu Sun, Hongsheng Li

Keywords: sparsity, efficient training and inference

5:09

03/05/2021

Meta-Learning with Neural Tangent Kernels

Yufan Zhou, Zhenyi Wang, Jiayi Xian, Changyou Chen, Jinhui Xu

Keywords: neural tangent kernel, meta-learning

3:54

02/02/2021

Improving Model Robustness by Adaptively Correcting Perturbation Levels with Active Queries

Kun-Peng Ning, Lue Tao, Songcan Chen, Sheng-Jun Huang

16:10

06/12/2021

Meta-learning to Improve Pre-training

Aniruddh Raghu, Jonathan Lorraine, Simon Kornblith, Matthew McDermott, David Duvenaud

Keywords: deep learning, optimization, graph learning, meta learning

12:57

02/02/2021

Meta-Transfer Learning for Low-Resource Abstractive Summarization

Yi-Syuan Chen, Hong-Han Shuai

19:10

06/12/2021

Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers

Julian Katz-Samuels, Blake Mason, Kevin Jamieson, Rob Nowak

Keywords: theory, machine learning, bandits, kernel methods, active learning

7:41

02/02/2021

Finding Sparse Structures for Domain Specific Neural Machine Translation

Jianze Liang, Chengqi Zhao, Mingxuan Wang, Xipeng Qiu, Lei Li

14:45