Rapid Learning or Feature Reuse? Towards Understanding the Effectiveness of MAML

26/04/2020

Aniruddh Raghu, Maithra Raghu, Samy Bengio, Oriol Vinyals

Keywords: deep learning analysis, representation learning, meta-learning, few-shot learning

Abstract: An important research direction in machine learning has centered on developing meta-learning algorithms to tackle few-shot learning. An especially successful algorithm has been Model-Agnostic Meta-Learning (MAML), a method that consists of two optimization loops, with the outer loop finding a meta-initialization from which the inner loop can efficiently learn new tasks. Despite MAML's popularity, a fundamental open question remains: is the effectiveness of MAML due to the meta-initialization being primed for rapid learning (large, efficient changes in the representations), or due to feature reuse, with the meta-initialization already containing high-quality features? We investigate this question via ablation studies and analysis of the latent representations, finding that feature reuse is the dominant factor. This leads to the ANIL (Almost No Inner Loop) algorithm, a simplification of MAML where we remove the inner loop for all but the (task-specific) head of the underlying neural network. ANIL matches MAML's performance on benchmark few-shot image classification and RL and offers computational improvements over MAML. We further study the precise contributions of the head and body of the network, showing that performance on the test tasks is entirely determined by the quality of the learned features, and we can remove even the head of the network (the NIL algorithm). We conclude with a discussion of the rapid learning vs. feature reuse question for meta-learning algorithms more broadly.
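The difference between the MAML and ANIL inner loops described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy model with a linear body and a linear head, a squared-error loss, and plain gradient descent, and it omits the outer (meta-initialization) loop entirely. The names `inner_loop`, `adapt_body`, `body0`, and `head0` are hypothetical and chosen only for this example.

```python
import numpy as np

def inner_loop(body, head, x, y, lr=0.01, steps=5, adapt_body=True):
    """Adapt a toy two-layer linear model to one task's support set (x, y).

    adapt_body=True  -> MAML-style inner loop: body and head are both updated.
    adapt_body=False -> ANIL-style inner loop: only the head is updated.
    """
    body, head = body.copy(), head.copy()
    n = len(x)
    for _ in range(steps):
        feats = x @ body                 # features from the body, shape (n, d_feat)
        err = feats @ head - y           # prediction error, shape (n,)
        # Gradients of 0.5 * mean(err ** 2) with respect to head and body.
        grad_head = feats.T @ err / n
        grad_body = x.T @ np.outer(err, head) / n
        head -= lr * grad_head
        if adapt_body:
            body -= lr * grad_body
    return body, head

# Toy data standing in for a meta-initialization and one task's support set.
rng = np.random.default_rng(0)
body0 = rng.normal(size=(4, 3))
head0 = rng.normal(size=3)
x = rng.normal(size=(10, 4))
y = rng.normal(size=10)

anil_body, anil_head = inner_loop(body0, head0, x, y, adapt_body=False)
maml_body, maml_head = inner_loop(body0, head0, x, y, adapt_body=True)
```

With `adapt_body=False` the body returned is identical to the initialization, so any task-specific change lives in the head alone; this corresponds to the feature-reuse view in which the meta-initialized features are used as they are.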

The talk and the corresponding paper were published at the ICLR 2020 virtual conference.

Similar Papers

03/05/2021

Meta-Learning with Neural Tangent Kernels

Yufan Zhou, Zhenyi Wang, Jiayi Xian, Changyou Chen, Jinhui Xu

Keywords: neural tangent kernel, meta-learning

3:54

18/07/2021

Training Data Subset Selection for Regression with Controlled Generalization Error

Durga S, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

Keywords: Algorithms, Online Learning, Algorithms, Supervised Learning

4:15

03/05/2021

On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning

Ren Wang, Kaidi Xu, Sijia Liu, Pin-Yu Chen, Lily Weng, Chuang Gan, Meng Wang

5:12

03/05/2021

Few-Shot Bayesian Optimization with Deep Kernel Surrogates

Martin Wistuba, Josif Grabocka

Keywords: automl, bayesian optimization, metalearning, few-shot learning

5:18

06/12/2021

Towards Sample-efficient Overparameterized Meta-learning

Yue Sun, Adhyyan Narang, Ibrahim Gulluk, Samet Oymak, Maryam Fazel

Keywords: theory, machine learning, meta learning, representation learning, few shot learning

13:54

06/12/2020

Modeling and Optimization Trade-off in Meta-learning

Katelyn Gao, Ozan Sener

3:21

30/11/2020

Regularizing Meta-Learning via Gradient Dropout

Hung-Yu Tseng, Yi-Wen Chen, Yi-Hsuan Tsai, Sifei Liu, Yen-Yu Lin, Ming-Hsuan Yang

3:21

18/07/2021

Provable Meta-Learning of Linear Representations

Nilesh Tripuraneni, Chi Jin, Michael Jordan

Keywords: Theory, Statistical Learning Theory

5:09

04/08/2021

Adversarially Robust Low Dimensional Representations

Pranjal Awasthi, Vaggos Chatziafratis, Xue Chen, Aravindan Vijayaraghavan

20:19

06/12/2020

A Theoretical Framework for Target Propagation

Alexander Meulemans, Francesco Carzaniga, Johan Suykens, João Sacramento, Benjamin F. Grewe

3:20

06/12/2021

Adaptive Proximal Gradient Methods for Structured Neural Networks

Jihun Yun, Aurelie Lozano, Eunho Yang

Keywords: deep learning, optimization, machine learning

10:46

02/02/2021

Fast and Scalable Adversarial Training of Kernel SVM via Doubly Stochastic Gradients

Huimin Wu, Zhengmian Hu, Bin Gu

14:04

02/02/2021

Physarum Powered Differentiable Linear Programming Layers and Applications

Zihang Meng, Sathya N. Ravi, Vikas Singh

16:57

18/07/2021

Sparsifying Networks via Subdifferential Inclusion

Sagar Verma, Jean-Christophe Pesquet

Keywords: Optimization, Convex Optimization

5:10

26/04/2020

Learning to Learn by Zeroth-Order Oracle

Yangjun Ruan, Yuanhao Xiong, Sashank Reddi, Sanjiv Kumar, Cho-Jui Hsieh

Keywords: learning to learn, zeroth-order optimization, black-box adversarial attack

4:48

18/07/2021

Efficient Iterative Amortized Inference for Learning Symmetric and Disentangled Multi-Object Representations

Patrick Emami, Pan He, Sanjay Ranka, Anand Rangarajan

Keywords: Deep Learning, Embedding and Representation learning

5:10

02/02/2021

Deterministic Mini-batch Sequencing for Training Deep Neural Networks

Subhankar Banerjee, Shayok Chakraborty

16:00

03/05/2021

Auxiliary Task Update Decomposition: The Good, the Bad and the Neutral

Lucio Dery, Yann Dauphin, David Grangier

Keywords: multitask learning, deep learning, pre-training, gradient decomposition

5:22

06/12/2021

Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems

Subhabrata Dutta, Tanya Gautam, Soumen Chakrabarti, Tanmoy Chakraborty

Keywords: deep learning, transformers

11:54

14/06/2020

Conditional Channel Gated Networks for Task-Aware Continual Learning

Davide Abati, Jakub Tomczak, Tijmen Blankevoort, Simone Calderara, Rita Cucchiara, Babak Ehteshami Bejnordi

Keywords: continual learning, channel gating, conditional computation, incremental learning, lifelong learning, hard attention

5:01

14/06/2020

Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking

Jin Gao, Weiming Hu, Yan Lu

Keywords: online learning, visual tracking, continual learning, recursive least-squares estimation, deep learning, memory retention, recursive learning, mini-batch sgd, normal equation, mlp layer

5:01

12/07/2020

Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources

Yun Yun Tsai, Pin-Yu Chen, Tsung-Yi Ho

Keywords: Transfer, Multitask and Meta-learning

14:20

06/12/2021

Self-Supervised Learning of Event-Based Optical Flow with Spiking Neural Networks

Jesse Hagenaars, Federico Paredes-Valles, Guido de Croon

Keywords: deep learning, optimization, self-supervised learning

13:28

12/07/2020

Online Multi-Kernel Learning with Graph-Structured Feedback

Pouya M Ghari, Yanning Shen

Keywords: General Machine Learning Techniques

12:57

18/07/2021

Offline Meta-Reinforcement Learning with Advantage Weighting

Eric Mitchell, Rafael Rafailov, Xue Bin Peng, Sergey Levine, Chelsea Finn

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

5:08

12/07/2020

Learning What to Defer for Maximum Independent Sets

Sungsoo Ahn, Younggyo Seo, Jinwoo Shin

Keywords: Reinforcement Learning - Deep RL

14:47

06/12/2021

Fast Axiomatic Attribution for Neural Networks

Robin Hesse, Simone Schaub-Meyer, Stefan Roth

Keywords: deep learning, interpretability

14:49

30/11/2020

Large-Scale Cross-Domain Few-Shot Learning

Jiechao Guan, Manli Zhang, Zhiwu Lu

7:26

12/07/2020

Learning To Stop While Learning To Predict

Xinshi Chen, Hanjun Dai, Yu Li, Xin Gao, Le Song

Keywords: Transfer, Multitask and Meta-learning

14:33

14/06/2020

Improved Few-Shot Visual Classification

Peyman Bateni, Raghav Goyal, Vaden Masrani, Frank Wood, Leonid Sigal

Keywords: meta-learning, few-shot classification, transfer learning, Mahalanobis metric, Bregman divergences

1:01

06/12/2021

Algorithmic stability and generalization of an unsupervised feature selection algorithm

Xinxing Wu, Qiang Cheng

Keywords: deep learning

12:41

06/12/2020

Functional Regularization for Representation Learning: A Unified Theoretical Perspective

Siddhant Garg, Yingyu Liang

3:19

19/08/2021

Cross-Domain Few-Shot Classification via Adversarial Task Augmentation

Haoqing Wang, Zhi-Hong Deng

Keywords: Computer Vision, Recognition, Adversarial Machine Learning, Deep Learning

10:39

05/01/2021

Representation Learning With Statistical Independence to Mitigate Bias

Ehsan Adeli, Qingyu Zhao, Adolf Pfefferbaum, Edith V. Sullivan, Li Fei-Fei, Juan Carlos Niebles, Kilian M. Pohl

4:33

13/04/2021

Neural function modules with sparse arguments: A dynamic approach to integrating information across layers

Alex Lamb, Anirudh Goyal, Agnieszka Słowik, Michael Mozer, Philippe Beaudoin, Yoshua Bengio

3:01

13/04/2021

A theoretical characterization of semi-supervised learning with self-training for gaussian mixture models

Samet Oymak, Talha Cihad Gulcu

2:59

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

13/04/2021

Communication efficient primal-dual algorithm for nonconvex nonsmooth distributed optimization

Congliang Chen, Jiawei Zhang, Li Shen, Peilin Zhao, Zhiquan Luo

3:01

03/05/2021

Neural Pruning via Growing Regularization

Huan Wang, Can Qin, Yulun Zhang, Yun Fu

Keywords: deep neural network pruning, regularization, Hessian matrix, model compression

6:15

13/04/2021

On the generalization properties of adversarial training

Yue Xing, Qifan Song, Guang Cheng

3:05