For self-supervised learning, Rationality implies generalization, provably

Abstract: We prove a new upper bound on the generalization gap of classifiers that are obtained by first using self-supervision to learn a representation $r$ of the training~data, and then fitting a simple (e.g., linear) classifier $g$ to the labels. Specifically, we show that (under the assumptions described below) the generalization gap of such classifiers tends to zero if $\mathsf{C}(g) \ll n$, where $\mathsf{C}(g)$ is an appropriately-defined measure of the simple classifier $g$'s complexity, and $n$ is the number of training samples. We stress that our bound is independent of the complexity of the representation $r$. We do not make any structural or conditional-independence assumptions on the representation-learning task, which can use the same training dataset that is later used for classification. Rather, we assume that the training procedure satisfies certain natural noise-robustness (adding small amount of label noise causes small degradation in performance) and rationality (getting the wrong label is not better than getting no label at all) conditions that widely hold across many standard architectures. We also conduct an extensive empirical study of the generalization gap and the quantities used in our assumptions for a variety of self-supervision based algorithms, including SimCLR, AMDIM and BigBiGAN, on the CIFAR-10 and ImageNet datasets. We show that, unlike standard supervised classifiers, these algorithms display small generalization gap, and the bounds we prove on this gap are often non vacuous.

06/12/2020

For self-supervised learning, Rationality implies generalization, provably

Yamini Bansal, Gal Kaplun, Boaz Barak

Comments

Similar Papers

Minimax Classification with 0-1 Loss and Performance Guarantees

Santiago Mazuelas, Andrea Zanoni, Aritz Pérez

Keywords Abstract Paper

SuperLoss: A Generic Loss for Robust Curriculum Learning

Thibault Castells, Philippe Weinzaepfel, Jerome Revaud

Keywords Abstract Paper

, Probabilistic Methods -> MCMC

RATT: Leveraging Unlabeled Data to Guarantee Generalization

Saurabh Garg, Sivaraman Balakrishnan, Zico Kolter, Zachary Lipton

Keywords Abstract Paper

Probabilistic Methods, Graphical Models, Theory, Computational Complexity, Theory, Models of Learning and Generalization

Auxiliary Task Reweighting for Minimum-data Learning

Baifeng Shi, Judy Hoffman, Kate Saenko and Trevor Darrell, Huijuan Xu

Keywords Abstract Paper

Single Layer Predictive Normalized Maximum Likelihood for Out-of-Distribution Detection

Koby Bibas, Meir Feder, Tal Hassner

Keywords Abstract Paper

theory, deep learning, machine learning

A theoretical characterization of semi-supervised learning with self-training for gaussian mixture models

Samet Oymak, Talha Cihad Gulcu

Keywords Abstract Paper

Subspace Fitting Meets Regression: The Effects of Supervision and Orthonormality Constraints on Double Descent of Generalization Errors

Yehuda Dar, Paul Mayer, Lorenzo Luzi, Richard Baraniuk

Keywords Abstract Paper

Unsupervised and Semi-Supervised Learning

On the generalization properties of adversarial training

Yue Xing, Qifan Song, Guang Cheng

Keywords Abstract Paper

Training Data Subset Selection for Regression with Controlled Generalization Error

Durga S, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

Keywords Abstract Paper

, Algorithms, Online Learning, Algorithms, Supervised Learning

Dash: Semi-Supervised Learning with Dynamic Thresholding

Yi Xu, Lei Shang, Jinxing Ye and Qi Qian, Yufeng Li, Baigui Sun, Hao Li, rong jin

Keywords Abstract Paper

Algorithms, Semi-Supervised Learning

Training Neural Networks for and by Interpolation

Leonard Berrada, M. Pawan Kumar, Andrew Zisserman

Keywords Abstract Paper

Deep Learning - General

Shape your Space: A Gaussian Mixture Regularization Approach to Deterministic Autoencoders

Amrutha Saseendran, Kathrin Skubch, Stefan Falkner, Margret Keuper

Keywords Abstract Paper

generative model

Efficient Iterative Amortized Inference for Learning Symmetric and Disentangled Multi-Object Representations

Patrick Emami, Pan He, Sanjay Ranka, Anand Rangarajan

Keywords Abstract Paper

Deep Learning, Embedding and Representation learning

Regularization via Structural Label Smoothing

Weizhi Li, Gautam Dasarathy, Visar Berisha

Keywords Abstract Paper

Self-supervised Adversarial Robustness for the Low-label, High-data Regime

Sven Gowal, Po-Sen Huang, Aaron v den and Timothy A Mann, Pushmeet Kohli

Keywords Abstract Paper

self-supervised, adversarial training, robustness

Examining and Combating Spurious Features under Distribution Shift

Chunting Zhou, Xuezhe Ma, Paul Michel, Graham Neubig

Keywords Abstract Paper

Deep Learning, Embedding and Representation learning

Robust Disentanglement of a Few Factors at a Time

Benjamin Estermann, Markus Marks, Mehmet Fatih Yanik

Keywords Abstract Paper

Stopping criterion for active learning based on deterministic generalization bounds

Hideaki Ishibashi, Hideitsu Hino

Keywords Abstract Paper

EvidentialMix: Learning With Combined Open-Set and Closed-Set Noisy Labels

Ragav Sachdeva, Filipe R. Cordeiro, Vasileios Belagiannis and Ian Reid, Gustavo Carneiro

Keywords Abstract Paper

Meta-learning with negative learning rates

Alberto Bernacchia

Keywords Abstract Paper

Meta-learning

Continual Learning with Node-Importance based Adaptive Group Sparse Regularization

Sangwon Jung, Hongjoon Ahn, Sungmin Cha, Taesup Moon

Keywords Abstract Paper

Fast Axiomatic Attribution for Neural Networks

Keywords Paper

Keywords Paper

Keywords Paper

Baifeng Shi, Judy Hoffman, Kate Saenko and
Trevor Darrell, Huijuan Xu

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Yi Xu, Lei Shang, Jinxing Ye and
Qi Qian, Yufeng Li, Baigui Sun, Hao Li, rong jin

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Sven Gowal, Po-Sen Huang, Aaron v den and
Timothy A Mann, Pushmeet Kohli

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Ragav Sachdeva, Filipe R. Cordeiro, Vasileios Belagiannis and
Ian Reid, Gustavo Carneiro

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Eric Mitchell, Rafael Rafailov, Xue Bin Peng and
Sergey Levine, Chelsea Finn

Keywords Paper

Atal Sahu, Aritra Dutta, Ahmed M. Abdelmoniem and
Trambak Banerjee, Marco Canini, Panos Kalnis

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Keywords Paper

Qizhe Xie, Zihang Dai, Eduard Hovy and
Thang Luong, Quoc V Le

Keywords Paper

Neha Wadia, Daniel Duckworth, Samuel Schoenholz and
Ethan Dyer, Jascha Sohl-Dickstein

Keywords Paper

Tianlong Chen, Weiyi Zhang, Zhou Jingyang and
Shiyu Chang, Sijia Liu, Lisa Amini, Zhangyang Wang

Keywords Paper

Keywords Paper

Keywords Paper

Hao Zhu, Yang Yuan, Guosheng Hu and
Xiang Wu, Neil Robertson

Keywords Paper