12/07/2020

Learning Attentive Meta-Transfer

Jaesik Yoon, Gautam Singh, Sungjin Ahn

Keywords: Sequential, Network, and Time-Series Modeling

Abstract: Meta-transfer learning seeks to improve the efficiency of learning a new task via both meta-learning and transfer learning in a setting with a stream of evolving tasks. While standard attention has been effective in a variety of settings, we question its effectiveness in improving meta-transfer learning because the tasks being learned are dynamic and the amount of available context information can be very small. In this paper, using a recently proposed meta-transfer learning model, Sequential Neural Processes (SNP), we first show empirically that it suffers from an underfitting problem similar to that observed in the functions inferred by Neural Processes. However, we further demonstrate that, unlike in the meta-learning setting, standard attention mechanisms are ineffective in meta-transfer learning. To resolve this, we propose a new attention mechanism, Recurrent Memory Reconstruction (RMR), and demonstrate that providing an imaginary context that is recurrently updated and reconstructed with interaction is crucial to achieving effective attention for meta-transfer learning. Furthermore, by incorporating RMR into SNP, we propose Attentive Sequential Neural Processes (ASNP) and demonstrate on various tasks that ASNP significantly outperforms the baselines.
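
To make the abstract's contrast more concrete, below is a minimal numpy sketch of (i) standard scaled dot-product cross-attention from target inputs to the observed context points, and (ii) a toy, recurrently updated "imaginary context" memory whose slots are attended to alongside the real context. This is only an illustrative approximation: the shapes, the memory update rule, and names such as update_memory and mem_k are assumptions made for this sketch, not the paper's actual RMR or ASNP implementation.

```python
# Hypothetical sketch (not the authors' code): standard cross-attention over
# context points, plus a toy recurrently updated "imaginary context" memory
# that is attended to alongside the real context.
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q, k, v):
    """q: (T, d) queries, k: (N, d) keys, v: (N, h) values -> (T, h)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def update_memory(mem_v, r_ctx, W_m, W_r):
    """Toy recurrence: fold a summary of the current context into the memory
    values so information persists even when the next context is tiny."""
    summary = r_ctx.mean(axis=0, keepdims=True)        # (1, h)
    return np.tanh(mem_v @ W_m + summary @ W_r)        # (M, h)

rng = np.random.default_rng(0)
d, h, C, T, M = 2, 8, 3, 5, 4
x_ctx = rng.normal(size=(C, d))    # context inputs at the current timestep
r_ctx = rng.normal(size=(C, h))    # their encoded representations
x_tgt = rng.normal(size=(T, d))    # target inputs to predict at
mem_k = rng.normal(size=(M, d))    # (assumed) learned keys for memory slots
mem_v = np.zeros((M, h))           # imaginary-context values, carried over time
W_m = rng.normal(size=(h, h)) * 0.1
W_r = rng.normal(size=(h, h)) * 0.1

for t in range(3):                                     # stream of evolving tasks
    mem_v = update_memory(mem_v, r_ctx, W_m, W_r)
    keys = np.concatenate([x_ctx, mem_k], axis=0)      # real + imaginary keys
    vals = np.concatenate([r_ctx, mem_v], axis=0)      # real + imaginary values
    r_tgt = cross_attention(x_tgt, keys, vals)         # (T, h) per-target codes
```

Even this crude recurrence illustrates the motivation stated in the abstract: when the context at the current timestep is very small, attention restricted to the real context has little to draw on, whereas a persistent, recurrently updated memory gives the attention additional slots carrying information from earlier tasks.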
