Self-Attention Between Datapoints: Going Beyond Individual Input-Output Pairs in Deep Learning

06/12/2021

Jannik Kossen, Neil Band, Clare Lyle, Aidan Gomez, Thomas Rainforth, Yarin Gal

Keywords: deep learning, transformers

Abstract: We challenge a common assumption underlying most supervised deep learning: that a model makes a prediction depending only on its parameters and the features of a single input. To this end, we introduce a general-purpose deep learning architecture that takes as input the entire dataset instead of processing one datapoint at a time. Our approach uses self-attention to reason about relationships between datapoints explicitly, which can be seen as realizing non-parametric models using parametric attention mechanisms. However, unlike conventional non-parametric models, we let the model learn end-to-end from the data how to make use of other datapoints for prediction. Empirically, our models solve cross-datapoint lookup and complex reasoning tasks unsolvable by traditional deep learning models. We show highly competitive results on tabular data, early results on CIFAR-10, and give insight into how the model makes use of the interactions between points.
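
As a rough illustration of what "self-attention between datapoints" means, the sketch below treats each row of a small dataset as one token and lets every row attend to every other row. It is a minimal single-head version under stated assumptions: the names `attention_between_datapoints`, `W_q`, `W_k`, and `W_v` are hypothetical, the weights are random rather than learned, and none of this is taken from the authors' implementation.

```python
import math
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_between_datapoints(X, W_q, W_k, W_v):
    """Single-head self-attention where the 'sequence' axis is the dataset axis.

    X has one row per datapoint. Returns (outputs, attention weights);
    weights[i][j] is how much datapoint i attends to datapoint j.
    """
    Q, K, V = matmul(X, W_q), matmul(X, W_k), matmul(X, W_v)
    scale = math.sqrt(len(K[0]))
    scores = [[sum(q * k for q, k in zip(q_row, k_row)) / scale for k_row in K]
              for q_row in Q]
    weights = [softmax(row) for row in scores]
    return matmul(weights, V), weights


def rand_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
n, d, h = 5, 3, 4  # datapoints, input features, hidden size (arbitrary choices)
X = rand_matrix(n, d)
W_q, W_k, W_v = rand_matrix(d, h), rand_matrix(d, h), rand_matrix(d, h)

out, A = attention_between_datapoints(X, W_q, W_k, W_v)

# Perturb a *different* datapoint (row 3) and recompute the output for row 0.
X_mod = [row[:] for row in X]
X_mod[3] = [v + 1.0 for v in X_mod[3]]
out_mod, _ = attention_between_datapoints(X_mod, W_q, W_k, W_v)
print("row 0 output changed:", out[0] != out_mod[0])
```

The point of the perturbation at the end is that the representation of datapoint 0 depends on datapoint 3, which is exactly what a model that processes one input at a time cannot express.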

The talk and the corresponding paper were published at the NeurIPS 2021 virtual conference.

Similar Papers

06/12/2021

Deep Learning Through the Lens of Example Difficulty

Robert Baldock, Hartmut Maennel, Behnam Neyshabur

Keywords: deep learning

12:54

19/08/2021

Abductive Learning with Ground Knowledge Base

Le-Wen Cai, Wang-Zhou Dai, Yu-Xuan Huang, Yu-Feng Li, Stephen Muggleton, Yuan Jiang

Keywords: Knowledge Representation and Reasoning, Diagnosis and Abductive Reasoning, Knowledge Aided Learning, Weakly Supervised Learning

12:59

13/04/2021

Contrastive learning of strong-mixing continuous-time stochastic processes

Bingbin Liu, Pradeep Ravikumar, Andrej Risteski

2:57

12/07/2020

A Sample Complexity Separation between Non-Convex and Convex Meta-Learning

Nikunj Umesh Saunshi, Yi Zhang, Mikhail Khodak, Sanjeev Arora

Keywords: Deep Learning - Theory

15:03

03/05/2021

Learning explanations that are hard to vary

Giambattista Parascandolo, Alexander Neitz, Antonio Orvieto, Luigi Gresele, Bernhard Schoelkopf

Keywords: invariances, gradient alignment, consistency

5:16

06/12/2021

A Mathematical Framework for Quantifying Transferability in Multi-source Transfer Learning

Xinyi Tong, Xiangxiang Xu, Shao-Lun Huang, Lizhong Zheng

Keywords: theory, deep learning, machine learning, vision, transfer learning

13:27

13/04/2021

Nonparametric estimation of heterogeneous treatment effects: From theory to learning algorithms

Alicia Curth, Mihaela van der Schaar

3:01

06/12/2020

Functional Regularization for Representation Learning: A Unified Theoretical Perspective

Siddhant Garg, Yingyu Liang

3:19

06/12/2020

Continuous Meta-Learning without Tasks

James Harrison, Apoorva Sharma, Chelsea Finn, Marco Pavone

3:09

22/11/2021

One-Shot Deep Model for End-to-End Multi-Person Activity Recognition

Shuhei Tarashima

Keywords: Group Activity Recognition, Action Recognition, Multi-Object Tracking, Multi-task Learning

2:50

26/04/2020

Weakly Supervised Disentanglement with Guarantees

Rui Shu, Yining Chen, Abhishek Kumar, Stefano Ermon, Ben Poole

Keywords: disentanglement, theory of disentanglement, representation learning, generative models

4:42

18/07/2021

On Recovering from Modeling Errors Using Testing Bayesian Networks

Haiying Huang, Adnan Darwiche

Keywords: Probabilistic Methods, Graphical Models

5:09

03/05/2021

Evaluating the Disentanglement of Deep Generative Models through Manifold Topology

Sharon Zhou, Eric Zelikman, Fred Lu, Andrew Ng, Gunnar E Carlsson, Stefano Ermon

Keywords: generative models, disentanglement, evaluation

5:06

26/04/2020

Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distribution Tasks

Hae Beom Lee, Hayeon Lee, Donghyun Na, Saehoon Kim, Minseop Park, Eunho Yang, Sung Ju Hwang

Keywords: meta-learning, few-shot learning, Bayesian neural network, variational inference, learning to learn, imbalanced and out-of-distribution tasks for few-shot learning

13:46

18/07/2021

The Impact of Record Linkage on Learning from Feature Partitioned Data

Richard Nock, Stephen J Hardy, Wilko Henecka, Hamish Ivey-Law, Jakub Nabaglo, Giorgio Patrini, Guillaume Smith, Brian Thorne

Keywords: Theory, Statistical Learning Theory

6:02

19/08/2021

Deep Descriptive Clustering

Hongjing Zhang, Ian Davidson

Keywords: Machine Learning, Clustering, Explainable/Interpretable Machine Learning, Constraints and Data Mining; Constraints and Machine Learning

15:12

06/12/2021

An online passive-aggressive algorithm for difference-of-squares classification

Lawrence Saul

Keywords: machine learning, online learning

14:00

03/05/2021

Theoretical bounds on estimation error for meta-learning

James Lucas, Mengye Ren, Irene Raissa KAMENI KAMENI, Toniann Pitassi, Richard Zemel

Keywords: meta learning, minimax risk, few-shot, lower bounds, learning theory

4:46

04/07/2020

Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks

Yiping Song, Zequn Liu, Wei Bi, Rui Yan, Ming Zhang

Keywords: Few-shot Tasks, open-domain systems, generative models, meta-learning framework

11:43

13/04/2021

On data efficiency of meta-learning

Maruan Al-Shedivat, Liam Li, Eric Xing, Ameet Talwalkar

3:24

19/08/2021

Learning CNF Theories Using MDL and Predicate Invention

Arcchit Jain, Clément Gautrais, Angelika Kimmig, Luc De Raedt

Keywords: Machine Learning, Relational Learning, Constraints and Data Mining; Constraints and Machine Learning

15:00

30/11/2020

Large-Scale Cross-Domain Few-Shot Learning

Jiechao Guan, Manli Zhang, Zhiwu Lu

7:26

06/12/2021

Leveraging Recursive Gumbel-Max Trick for Approximate Inference in Combinatorial Spaces

Kirill Struminsky, Artyom Gadetsky, Denis Rakitin, Danil Karpushkin, Dmitry Vetrov

Keywords: deep learning, optimization

14:26

12/07/2020

Provable Representation Learning for Imitation Learning via Bi-level Optimization

Sanjeev Arora, Simon Du, Sham Kakade, Yuping Luo, Nikunj Umesh Saunshi

Keywords: Reinforcement Learning - Theory

15:04

02/02/2021

Fine-grained Generalization Analysis of Vector-Valued Learning

Liang Wu, Antoine Ledent, Yunwen Lei, Marius Kloft

13:54

02/02/2021

LRSC: Learning Representations for Subspace Clustering

Changsheng Li, Chen Yang, Bo Liu, Ye Yuan, Guoren Wang

15:09

14/09/2020

Partial Label Learning via Self-Paced Curriculum Strategy

Gengyu Lyu, Songhe Feng, Yi Jin, Yidong Li

Keywords: partial-label learning, self-paced learning strategy, curriculum learning strategy, instructor-student-collaborative

6:46

02/02/2021

Task Cooperation for Semi-Supervised Few-Shot Learning

Han-Jia Ye, Xin-Chun Li, De-Chuan Zhan

16:06

03/05/2021

Meta-GMVAE: Mixture of Gaussian VAE for Unsupervised Meta-Learning

Dong Bok Lee, Dongchan Min, Seanie Lee, Sung Ju Hwang

Keywords: Unsupervised Learning, Variational Autoencoders, Unsupervised Meta-learning, Meta-Learning

13:31

19/08/2021

Abductive Knowledge Induction from Raw Data

Wang-Zhou Dai, Stephen Muggleton

Keywords: Knowledge Representation and Reasoning, Diagnosis and Abductive Reasoning, Leveraging Knowledge and Learning, Knowledge Aided Learning, Neuro-Symbolic Methods

15:07

18/07/2021

Dash: Semi-Supervised Learning with Dynamic Thresholding

Yi Xu, Lei Shang, Jinxing Ye, Qi Qian, Yufeng Li, Baigui Sun, Hao Li, Rong Jin

Keywords: Algorithms, Semi-Supervised Learning

15:24

12/07/2020

Task Understanding from Confusing Multi-task Data

Xin Su, Yizhou Jiang, Shangqi Guo, Feng Chen

Keywords: General Machine Learning Techniques

15:29

16/11/2020

Deep Weighted MaxSAT for Aspect-based Opinion Extraction

Meixi Wu, Wenya Wang, Sinno Jialin Pan

Keywords: nlp tasks, training process, logic programs, satisfiability problem

11:36

06/12/2020

Meta-learning from Tasks with Heterogeneous Attribute Spaces

Tomoharu Iwata, Atsutoshi Kumagai

Keywords: Algorithms -> Unsupervised Learning, Applications -> Robotics

3:19

06/12/2020

Self-Supervised Relational Reasoning for Representation Learning

Massimiliano Patacchiola, Amos Storkey

2:55

03/05/2021

One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks

Atish Agarwala, Abhimanyu Das, Brendan Juba, Rina Panigrahy, Vatsal Sharan, Xin Wang, Qiuyi Zhang

Keywords: deep learning theory, multi-task learning

5:18

06/12/2021

USCO-Solver: Solving Undetermined Stochastic Combinatorial Optimization Problems

Guangmo Tong

Keywords: optimization

15:00

03/05/2021

On the Dynamics of Training Attention Models

Haoye Lu, Yongyi Mao, Amiya Nayak

5:09

06/12/2021

Towards Sample-efficient Overparameterized Meta-learning

Yue Sun, Adhyyan Narang, Ibrahim Gulluk, Samet Oymak, Maryam Fazel

Keywords: theory, machine learning, meta learning, representation learning, few shot learning

13:54

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27