Meta-learning to Improve Pre-training

06/12/2021

Meta-learning to Improve Pre-training

Aniruddh Raghu, Jonathan Lorraine, Simon Kornblith, Matthew McDermott, David Duvenaud

Keywords: deep learning, optimization, graph learning, meta learning

Abstract Paper Similar Papers

Abstract: Pre-training (PT) followed by fine-tuning (FT) is an effective method for training neural networks, and has led to significant performance improvements in many domains. PT can incorporate various design choices such as task and data reweighting strategies, augmentation policies, and noise models, all of which can significantly impact the quality of representations learned. The hyperparameters introduced by these strategies therefore must be tuned appropriately. However, setting the values of these hyperparameters is challenging. Most existing methods either struggle to scale to high dimensions, are too slow and memory-intensive, or cannot be directly applied to the two-stage PT and FT learning process. In this work, we propose an efficient, gradient-based algorithm to meta-learn PT hyperparameters. We formalize the PT hyperparameter optimization problem and propose a novel method to obtain PT hyperparameter gradients by combining implicit differentiation and backpropagation through unrolled optimization. We demonstrate that our method improves predictive performance on two real-world domains. First, we optimize high-dimensional task weighting hyperparameters for multitask pre-training on protein-protein interaction graphs and improve AUROC by up to 3.9%. Second, we optimize a data augmentation neural network for self-supervised PT with SimCLR on electrocardiography data and improve AUROC by up to 1.9%.

0

0

0

0

Share

This is an embedded video. Talk and the respective paper are published at NeurIPS 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment

no comments yet

Similar Papers

06/12/2021

Adversarial Reweighting for Partial Domain Adaptation

Xiang Gu, Xi Yu, yan yang and
Jian Sun, Zongben Xu

Keywords Paper

domain adaptation

0

0

0

1

9:03

13/04/2021

On the importance of hyperparameter optimization for model-based reinforcement learning

Baohe Zhang, Raghu Rajan, Luis Pineda and
Nathan Lambert, André Biedenkapp, Kurtland Chua, Frank Hutter, Roberto Calandra

Keywords Paper

0

0

0

0

2:59

12/07/2020

Extrapolation for Large-batch Training in Deep Learning

Tao LIN, Lingjing Kong, Sebastian Stich, Martin Jaggi

Keywords Paper

Deep Learning - Algorithms

0

0

0

0

13:21

18/07/2021

Improving Generalization in Meta-learning via Task Augmentation

Huaxiu Yao, Long-Kai Huang, Linjun Zhang and
Ying WEI, Li Tian, James Zou, Junzhou Huang, Zhenhui (Jessie) Li

Keywords Paper

Algorithms, Multitask, Transfer, and Meta Learning

0

0

0

0

8:27

12/07/2020

Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data

Felipe Petroski Such, Aditya Rawal, Joel Lehman and
Kenneth Stanley, Jeffrey Clune

Keywords Paper

Transfer, Multitask and Meta-learning

0

0

0

0

7:25

13/04/2021

When MAML can adapt fast and how to assist when it cannot

Sébastien M. R. Arnold, Shariq Iqbal, Fei Sha

Keywords Paper

0

0

0

0

3:00

06/12/2020

Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians

Juhan Bae, Roger Grosse

Keywords Paper

0

0

0

0

3:20

06/12/2021

Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems

Subhabrata Dutta, Tanya Gautam, Soumen Chakrabarti, Tanmoy Chakraborty

Keywords Paper

deep learning, transformers

0

0

0

0

11:54

18/07/2021

XOR-CD: Linearly Convergent Constrained Structure Generation

Fan Ding, Jianzhu Ma, Jinbo Xu, Yexiang Xue

Keywords Paper

Probabilistic Methods

0

0

0

0

5:14

02/02/2021

GLISTER: Generalization based Data Subset Selection for Efficient and Robust Learning

Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, Rishabh Iyer

Keywords Paper

0

0

0

0

19:14

03/05/2021

Meta-Learning with Neural Tangent Kernels

Yufan Zhou, Zhenyi Wang, Jiayi Xian and
Changyou Chen, Jinhui Xu

Keywords Paper

neural tangent kernel, meta-learning

0

0

0

0

3:54

03/05/2021

DDPNOpt: Differential Dynamic Programming Neural Optimizer

Guan-Horng Liu, Tianrong Chen, Evangelos Theodorou

Keywords Paper

differential dynamica programming, trajectory optimization, deep learning training, optimal control

0

0

0

0

10:02

06/12/2020

Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search

Houwen Peng, Hao Du, Hongyuan Yu and
QI LI, Jing Liao, Jianlong Fu

Keywords Paper

0

0

0

0

3:12

02/02/2021

Improving Model Robustness by Adaptively Correcting Perturbation Levels with Active Queries

Kun-Peng Ning, Lue Tao, Songcan Chen, Sheng-Jun Huang

Keywords Paper

0

1

0

0

16:10

18/07/2021

Model Performance Scaling with Multiple Data Sources

Tatsunori Hashimoto

Keywords Paper

Algorithms, Supervised Learning

0

0

0

1

4:50

02/02/2021

High Dimensional Level Set Estimation with Bayesian Neural Network

Huong Ha, Sunil Gupta, Santu Rana, Svetha Venkatesh

Keywords Paper

0

0

0

0

19:14

13/04/2021

Faster & more reliable tuning of neural networks: Bayesian optimization with importance sampling

Setareh Ariafar, Zelda Mariet, Dana Brooks and
Jennifer Dy, Jasper Snoek

Keywords Paper

0

0

0

0

3:01

06/12/2021

Scalable Rule-Based Representation Learning for Interpretable Classification

Zhuo Wang, Wei Zhang, Ning Liu, Jianyong Wang

Keywords Paper

optimization, machine learning, representation learning, interpretability

0

0

0

0

14:52

02/02/2021

Distribution Adaptive INT8 Quantization for Training CNNs

Kang Zhao, Sida Huang, Pan Pan and
Yinghan Li, Yingya Zhang, Zhenyu Gu, Yinghui Xu

Keywords Paper

0

0

0

0

16:42

06/12/2020

Training Stronger Baselines for Learning to Optimize

Tianlong Chen, Weiyi Zhang, Zhou Jingyang and
Shiyu Chang, Sijia Liu, Lisa Amini, Zhangyang Wang

Keywords Paper

0

0

0

0

3:18

06/12/2020

Learning to Execute Programs with Instruction Pointer Attention Graph Neural Networks

David Bieber, Charles Sutton, Hugo Larochelle, Daniel Tarlow

Keywords Paper

0

0

0

0

3:20

03/05/2021

Understanding the effects of data parallelism and sparsity on neural network training

Namhoon Lee, Thalaiyasingam Ajanthan, Philip Torr, Martin Jaggi

Keywords Paper

sparsity, neural network training, data parallelism

0

0

0

0

4:52

06/12/2020

Memory-Efficient Learning of Stable Linear Dynamical Systems for Prediction and Control

Giorgos Mamakoukas, Orest Xherija, Todd Murphey

Keywords Paper

Optimization -> Non-Convex Optimization, Optimization -> Stochastic Optimization

0

0

0

0

3:13

03/05/2021

Neurally Augmented ALISTA

Freya Behrens, Jonathan Sauder, Peter Jung

Keywords Paper

learned ISTA, unrolled algorithms, compressed sensing, sparse reconstruction

0

0

0

0

5:18

02/02/2021

Evolutionary Approach for AutoAugment Using the Thermodynamical Genetic Algorithm

Akira Terauchi, Naoki Mori

Keywords Paper

0

0

0

0

17:42

18/07/2021

Training Data Subset Selection for Regression with Controlled Generalization Error

Durga S, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

Keywords Paper

, Algorithms, Online Learning, Algorithms, Supervised Learning

0

0

0

0

4:15

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

Keywords Paper

0

0

0

0

11:27

13/04/2021

Learning to defend by learning to attack

Haoming Jiang, Zhehui Chen, Yuyang Shi and
Bo Dai, Tuo Zhao

Keywords Paper

0

0

0

0

2:58

06/12/2021

$(\textrm{Implicit})^2$: Implicit Layers for Implicit Representations

Zhichun Huang, Shaojie Bai, J. Zico Kolter

Keywords Paper

deep learning, representation learning

1

0

0

1

12:23

26/04/2020

Model-based reinforcement learning for biological sequence design

Christof Angermueller, David Dohan, David Belanger and
Ramya Deshpande, Kevin Murphy, Lucy Colwell

Keywords Paper

reinforcement learning, blackbox optimization, molecule design

0

0

0

0

6:10

19/08/2021

DACBench: A Benchmark Library for Dynamic Algorithm Configuration

Theresa Eimer, André Biedenkapp, Maximilian Reimer and
Steven Adriansen, Frank Hutter, Marius Lindauer

Keywords Paper

Heuristic Search and Game Playing, Evaluation and Analysis, Heuristic Search and Machine Learning, Meta-Reasoning and Meta-Heuristics

0

0

0

0

13:51

18/07/2021

Sparsifying Networks via Subdifferential Inclusion

Sagar Verma, Jean-Christophe Pesquet

Keywords Paper

Optimization, Convex Optimization

0

0

0

0

5:10

06/12/2020

Bayesian Optimization for Iterative Learning

Vu Nguyen, Sebastian Schulze, Michael A Osborne

Keywords Paper

0

0

0

0

3:19

26/04/2020

Meta-Learning Acquisition Functions for Transfer Learning in Bayesian Optimization

Michael Volpp, Lukas P. Fröhlich, Kirsten Fischer and
Andreas Doerr, Stefan Falkner, Frank Hutter, Christian Daniel

Keywords Paper

Transfer Learning, Meta Learning, Bayesian Optimization, Reinforcement Learning

0

0

0

0

5:05

16/11/2020

A Diagnostic Study of Explainability Techniques for Text Classification

Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein

Keywords Paper

downstream tasks, machine learning, explainability techniques, diverse techniques

0

0

0

0

11:24

30/11/2020

Large-Scale Cross-Domain Few-Shot Learning

Jiechao Guan, Manli Zhang, Zhiwu Lu

Keywords Paper

0

0

0

0

7:26

26/04/2020

Strategies for Pre-training Graph Neural Networks

Weihua Hu, Bowen Liu, Joseph Gomes and
Marinka Zitnik, Percy Liang, Vijay Pande, Jure Leskovec

Keywords Paper

Pre-training, Transfer learning, Graph Neural Networks

0

0

0

0

4:56

13/04/2021

Stochastic polyak step-size for SGD: An adaptive learning rate for fast convergence

Nicolas Loizou, Sharan Vaswani, Issam Hadj Laradji, Simon Lacoste-Julien

Keywords Paper

0

0

0

0

3:30

18/07/2021

Optimization Planning for 3D ConvNets

Zhaofan Qiu, Ting Yao, Chong-Wah Ngo, Tao Mei

Keywords Paper

Applications, Activity and Event Recognition

0

0

0

0

5:13

18/07/2021

ActNN: Reducing Training Memory Footprint via 2-Bit Activation Compressed Training

Jianfei Chen, Lianmin Zheng, Zhewei Yao and
Dequan Wang, Ion Stoica, Michael Mahoney, Joseph E Gonzalez

Keywords Paper

Algorithms, Large Scale Learning

0

0

0

0

18:54