Multimodal Few-Shot Learning with Frozen Language Models

06/12/2021

Maria Tsimpoukelli, Jacob L Menick, Serkan Cabi, S. M. Ali Eslami, Oriol Vinyals, Felix Hill

Keywords: vision, few shot learning

Abstract: When trained at sufficient scale, auto-regressive language models exhibit the notable ability to learn a new language task after being prompted with just a few examples. Here, we present a simple, yet effective, approach for transferring this few-shot learning ability to a multimodal setting (vision and language). Using aligned image and caption data, we train a vision encoder to represent each image as a sequence of continuous embeddings, such that a pre-trained, frozen language model presented with this prefix generates the appropriate caption. The resulting system is a multimodal few-shot learner, with the surprising ability to learn a variety of new tasks when conditioned on examples, represented as a sequence of any number of interleaved image and text embeddings. We demonstrate that it can rapidly learn words for new objects and novel visual categories, do visual question-answering with only a handful of examples, and make use of outside knowledge, by measuring a single model on a variety of established and new benchmarks.
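
As a rough illustration of the interleaving described in the abstract, the following Python sketch builds a few-shot prompt as one sequence of image and text embeddings. Everything in it is an assumption made for illustration: `toy_vision_encoder`, `toy_text_embedder`, `build_prompt`, `EMBED_DIM`, `PREFIX_LEN`, and the example captions are hypothetical stand-ins, not the authors' implementation, and no actual language model is run.

```python
from typing import List, Sequence, Tuple

Vector = List[float]

EMBED_DIM = 4   # hypothetical embedding width shared by both modalities
PREFIX_LEN = 2  # hypothetical number of embeddings produced per image


def toy_vision_encoder(image: Sequence[float]) -> List[Vector]:
    """Stand-in for the trainable vision encoder: one image -> PREFIX_LEN vectors."""
    mean = sum(image) / len(image)
    return [[mean + i] * EMBED_DIM for i in range(PREFIX_LEN)]


def toy_text_embedder(text: str) -> List[Vector]:
    """Stand-in for the frozen language model's token embedding lookup."""
    return [[float(len(token))] * EMBED_DIM for token in text.split()]


def build_prompt(
    support: Sequence[Tuple[Sequence[float], str]],
    query_image: Sequence[float],
    query_text: str,
) -> List[Vector]:
    """Interleave image and text embeddings into one sequence for the frozen model."""
    sequence: List[Vector] = []
    for image, text in support:
        sequence.extend(toy_vision_encoder(image))  # image prefix embeddings
        sequence.extend(toy_text_embedder(text))    # caption token embeddings
    # The query image and a partial caption come last; the frozen model
    # would be asked to continue the text from here.
    sequence.extend(toy_vision_encoder(query_image))
    sequence.extend(toy_text_embedder(query_text))
    return sequence


# Two made-up support examples followed by a query image.
support = [([0.1, 0.2], "This is a dax."), ([0.5, 0.7], "This is a blicket.")]
prompt = build_prompt(support, [0.3, 0.3], "This is a")
print(len(prompt), "embeddings of width", EMBED_DIM)
```

In this sketch only the vision encoder would carry trainable parameters; the text embedder stands for the language model whose weights stay fixed, which is the division of labor the abstract describes.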

The talk and the respective paper are published at the NeurIPS 2021 virtual conference.

Similar Papers

16/11/2020

Learning to Represent Image and Text with Denotation Graph

Bowen Zhang, Hexiang Hu, Vihan Jain, Eugene Ie, Fei Sha

Keywords: cross-modal retrieval, referring expression, compositional recognition, pre-training

10:59

06/12/2021

Referring Transformer: A One-step Approach to Multi-task Visual Grounding

Muchen Li, Leonid Sigal

Keywords: transformers, vision

7:54

03/05/2021

$i$-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning

Kibok Lee, Yian Zhu, Kihyuk Sohn, Chun-Liang Li, Jinwoo Shin, Honglak Lee

Keywords: self-supervised learning, unsupervised representation learning, data augmentation, MixUp, contrastive representation learning

5:04

06/12/2020

Uncertainty-aware Self-training for Few-shot Text Classification

Subhabrata Mukherjee, Ahmed Awadallah

3:16

25/07/2020

Leveraging adversarial training in self-learning for cross-lingual text classification

Xin Dong, Yaxin Zhu, Yupeng Zhang, Zuohui Fu, Dongkuan Xu, Sen Yang, Gerard Melo

Keywords: multilingual, semantics, text classification, cross-lingual

9:19

06/12/2021

Think Big, Teach Small: Do Language Models Distil Occam’s Razor?

Gonzalo Jaimovitch-Lopez, David Castellano Falcón, Cesar Ferri, José Hernández-Orallo

Keywords: machine learning, interpretability, few shot learning

12:12

14/06/2020

TransMatch: A Transfer-Learning Scheme for Semi-Supervised Few-Shot Learning

Zhongjie Yu, Lin Chen, Zhongwei Cheng, Jiebo Luo

Keywords: few-shot learning, semi-supervised learning, meta-learning

1:01

16/11/2020

Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision

Hao Tan, Mohit Bansal

Keywords: speaking, writing, text-only self-supervision, pure-language tasks

11:59

04/07/2020

Shaping Visual Representations with Language for Few-Shot Classification

Jesse Mu, Percy Liang, Noah Goodman

Keywords: Few-Shot Classification, human learning, supervision, machine models

6:59

05/12/2020

Systematic generalization on gSCAN with language conditioned embedding

Tong Gao, Qi Huang, Raymond Mooney

14:19

05/01/2021

Towards Contextual Learning in Few-Shot Object Classification

Mathieu Page Fortin, Brahim Chaib-draa

4:57

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

16/11/2020

Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining

Chengyu Wang, Minghui Qiu, Jun Huang, Xiaofeng He

Keywords: nlp tasks, fine-tuning, learning process, multi-domain tasks

9:58

16/11/2020

Learning from Task Descriptions

Orion Weller, Nicholas Lourie, Matt Gardner, Matthew Peters

Keywords: task-oriented evaluation, systematic generalization, machine systems, nlp systems

11:48

04/07/2020

Curriculum Learning for Natural Language Understanding

Benfeng Xu, Licheng Zhang, Zhendong Mao, Quan Wang, Hongtao Xie, Yongdong Zhang

Keywords: Curriculum Learning, Natural Understanding, natural tasks, NLU tasks

9:41

12/07/2020

Retrieval Augmented Language Model Pre-Training

Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, Mingwei Chang

Keywords: Applications - Language, Speech and Dialog

14:44

04/07/2020

Multi-agent Communication meets Natural Language: Synergies between Functional and Structural Language Learning

Angeliki Lazaridou, Anna Potapenko, Olivier Tieleman

Keywords: Multi-agent Communication, natural learning, visual task, Functional Learning

11:44

06/12/2020

OOD-MAML: Meta-Learning for Few-Shot Out-of-Distribution Detection and Classification

Taewon Jeong, Heeyoung Kim

3:16

06/12/2020

Learning Sparse Prototypes for Text Generation

Junxian He, Taylor Berg-Kirkpatrick, Graham Neubig

3:22

16/11/2020

Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks

Trapit Bansal, Rishikesh Jha, Tsendsuren Munkhdalai, Andrew McCallum

Keywords: nlp applications, fine-tuning, meta-learning problem, supervised tasks

11:49

03/05/2021

Learning Associative Inference Using Fast Weight Memory

Imanol Schlag, Tsendsuren Munkhdalai, Jürgen Schmidhuber

Keywords: fast weights, memory-augmented neural networks, tensor product

4:29

02/02/2021

Improving Tree-Structured Decoder Training for Code Generation via Mutual Learning

Binbin Xie, Jinsong Su, Yubin Ge, Xiang Li, Jianwei Cui, Junfeng Yao, Bin Wang

15:57

16/11/2020

Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models

Ethan Wilcox, Peng Qian, Richard Futrell, Ryosuke Kohita, Roger Levy, Miguel Ballesteros

Keywords: learning outcomes, syntactic representations, neural models, n-gram baseline

11:29

14/06/2020

Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-Training

Weituo Hao, Chunyuan Li, Xiujun Li, Lawrence Carin, Jianfeng Gao

Keywords: vision-and-language navigation, large-scale pretraining, cross modality understanding, self-supervised learning

0:59

06/12/2020

Learning to Learn Variational Semantic Memory

Xiantong Zhen, Yingjun Du, Huan Xiong, Qiang Qiu, Cees Snoek, Ling Shao

3:24

03/05/2021

Parrot: Data-Driven Behavioral Priors for Reinforcement Learning

Avi Singh, Huihan Liu, Gaoyue Zhou, Albert Yu, Nicholas Rhinehart, Sergey Levine

Keywords: reinforcement learning, imitation learning

14:21

02/02/2021

Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training

Peng Shi, Patrick Ng, Zhiguo Wang, Henghui Zhu, Alexander Hanbo Li, Jun Wang, Cicero Nogueira dos Santos, Bing Xiang

15:15

14/06/2020

Learning Representations by Predicting Bags of Visual Words

Spyros Gidaris, Andrei Bursuc, Nikos Komodakis, Patrick Pérez, Matthieu Cord

Keywords: representation learning, self-supervised learning, unsupervised learning, discrete representations, bag of visual words, image understanding, deep learning, convolutional neural networks

1:01

19/04/2021

Active learning for sequence tagging with deep pre-trained models and Bayesian uncertainty estimates

Artem Shelmanov, Dmitri Puzyrev, Lyubov Kupriyanova, Denis Belyakov, Daniil Larionov, Nikita Khromov, Olga Kozlova, Ekaterina Artemova, Dmitry V. Dylov, Alexander Panchenko

11:47

02/02/2021

SALNet: Semi-supervised Few-Shot Text Classification with Attention-based Lexicon Construction

Ju-Hyoung Lee, Sang-Ki Ko, Yo-Sub Han

15:28

22/11/2021

Class-Balanced Distillation for Long-Tailed Visual Recognition

Ahmet Iscen, Andre Araujo, Boqing Gong, Cordelia Schmid

Keywords: Long tailed recognition, dataset imbalance

3:02

02/02/2021

Generalising without Forgetting for Lifelong Person Re-Identification

Guile Wu, Shaogang Gong

17:10

14/09/2020

Partial Label Learning via Self-Paced Curriculum Strategy

Gengyu Lyu, Songhe Feng, Yi Jin, Yidong Li

Keywords: partial-label learning, self-paced learning strategy, curriculum learning strategy, instructor-student-collaborative

6:46

07/09/2020

RODEO: Replay for Online Object Detection

Manoj Acharya, Tyler Hayes, Christopher Kanan

Keywords: streaming learning, continual learning, object detection, lifelong learning, catastrophic forgetting, product quantization

8:16

02/02/2021

Towards Semantics-Enhanced Pre-Training: Can Lexicon Definitions Help Learning Sentence Meanings?

Xuancheng Ren, Xu Sun, Houfeng Wang, Qun Liu

16:04

06/12/2020

Leap-Of-Thought: Teaching Pre-Trained Models to Systematically Reason Over Implicit Knowledge

Alon Talmor, Oyvind Tafjord, Peter Clark, Yoav Goldberg, Jonathan Berant

3:28

19/04/2021

Cross-lingual visual pre-training for multimodal machine translation

Ozan Caglayan, Menekse Kuyu, Mustafa Sercan Amac, Pranava Madhyastha, Erkut Erdem, Aykut Erdem, Lucia Specia

6:16

16/11/2020

Visually Grounded Continual Learning of Compositional Phrases

Xisen Jin, Junyi Du, Arka Sadhu, Ram Nevatia, Xiang Ren

Keywords: visually task, continual phrases, visually-grounded task, compositional generalization

10:50

04/07/2020

Language to Network: Conditional Parameter Adaptation with Natural Language Descriptions

Tian Jin, Zhun Liu, Shengjia Yan, Alexandre Eichenberger, Louis-Philippe Morency

Keywords: Transfer learning, computer tasks, fine-tuning, Conditional Adaptation

5:42

12/07/2020

Word-Level Speech Recognition With a Letter to Word Encoder

Ronan Collobert, Awni Hannun, Gabriel Synnaeve

Keywords: Applications - Language, Speech and Dialog

14:53