Deep Learning on a Data Diet: Finding Important Examples Early in Training

Abstract: Recent success in deep learning has partially been driven by training increasingly overparametrized networks on ever larger datasets. It is therefore natural to ask: how much of the data is superfluous, which examples are important for generalization, and how do we find them? In this work, we make the striking observation that, in standard vision datasets, simple scores averaged over several weight initializations can be used to identify important examples very early in training. We propose two such scores—the Gradient Normed (GraNd) and the Error L2-Norm (EL2N) scores—and demonstrate their efficacy on a range of architectures and datasets by pruning significant fractions of training data without sacrificing test accuracy. In fact, using EL2N scores calculated a few epochs into training, we can prune half of the CIFAR10 training set while slightly improving test accuracy. Furthermore, for a given dataset, EL2N scores from one architecture or hyperparameter configuration generalize to other configurations. Compared to recent work that prunes data by discarding examples that are rarely forgotten over the course of training, our scores use only local information early in training. We also use our scores to detect noisy examples and study training dynamics through the lens of important examples—we investigate how the data distribution shapes the loss surface and identify subspaces of the model’s data representation that are relatively stable over training.
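
The abstract names the EL2N score without defining it. The following is a minimal sketch, assuming the score is the L2 norm of the error vector (softmax output minus the one-hot label) averaged over several independently initialized runs, and assuming pruning keeps the highest-scoring examples. All function names and the `keep_fraction` parameter are hypothetical and are not taken from the authors' code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def el2n_score(logits, label):
    """L2 norm of (softmax(logits) - one_hot(label)) for one example.

    Assumption: this is the per-example, per-run quantity behind EL2N.
    """
    probs = softmax(logits)
    return math.sqrt(
        sum((p - (1.0 if k == label else 0.0)) ** 2 for k, p in enumerate(probs))
    )


def mean_scores(logits_per_run, labels):
    """Average the score over runs; logits_per_run[r][i] holds the
    logits of example i produced by run (weight initialization) r."""
    n_runs = len(logits_per_run)
    return [
        sum(el2n_score(run[i], y) for run in logits_per_run) / n_runs
        for i, y in enumerate(labels)
    ]


def keep_highest(scores, keep_fraction):
    """Indices of the examples to keep (assumed: the highest scores)."""
    n_keep = int(len(scores) * keep_fraction)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])
```

In this sketch, `keep_highest(mean_scores(logits_per_run, labels), 0.5)` would return the indices of the half of the training set with the largest average error norm, with the logits collected a few epochs into training.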

03/05/2021

06/12/2021

Mansheej Paul, Surya Ganguli, Gintare Karolina Dziugaite

Similar Papers

When Do Curricula Work?

Xiaoxia (Shirley) Wu, Ethan Dyer, Behnam Neyshabur

Empirical Investigation, Understanding Deep Learning, Curriculum Learning

Revisiting Training Strategies and Generalization Performance in Deep Metric Learning

Karsten Roth, Timo Milbich, Samrath Sinha, Prateek Gupta, Bjorn Ommer, Joseph Paul Cohen

Applications - Computer Vision

EvidentialMix: Learning With Combined Open-Set and Closed-Set Noisy Labels

Ragav Sachdeva, Filipe R. Cordeiro, Vasileios Belagiannis, Ian Reid, Gustavo Carneiro

Once-for-All Adversarial Training: In-Situ Tradeoff between Robustness and Accuracy for Free

Haotao Wang, Tianlong Chen, Shupeng Gui, TingKuei Hu, Ji Liu, Zhangyang Wang

Dataset Meta-Learning from Kernel Ridge-Regression

Timothy Nguyen, Zhourong Chen, Jaehoon Lee

dataset corruption, infinite-width networks, neural kernels, kernel-ridge regression, dataset compression, dataset distillation, meta-learning

Fast Axiomatic Attribution for Neural Networks

Robin Hesse, Simone Schaub-Meyer, Stefan Roth

deep learning, interpretability

GLISTER: Generalization based Data Subset Selection for Efficient and Robust Learning

Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, Rishabh Iyer

Dynamic Curriculum Learning for Low-Resource Neural Machine Translation

Chen Xu, Bojie Hu, Yufan Jiang, Kai Feng, Zeyang Wang, Shen Huang, Qi Ju, Tong Xiao, Jingbo Zhu

Training Stronger Baselines for Learning to Optimize

Tianlong Chen, Weiyi Zhang, Zhou Jingyang, Shiyu Chang, Sijia Liu, Lisa Amini, Zhangyang Wang

Dataset Condensation with Gradient Matching

Bo ZHAO, Konda Reddy Mopuri, Hakan Bilen

dataset condensation, image generation, data-efficient learning

Curriculum learning by optimizing learning dynamics

Tianyi Zhou, Shengjie Wang, Jeff Bilmes

Rethinking Curriculum Learning with Incremental Labels and Adaptive Compensation

Madan Ravi Ganesh, Jason Corso

label smoothing, curriculum learning, incremental labels, adaptive compensation, negative mining

Learning perturbation sets for robust machine learning

Eric Wong, Zico Kolter

conditional variational autoencoder, adversarial examples, perturbation sets, robust machine learning

Self-Damaging Contrastive Learning

Ziyu Jiang, Tianlong Chen, Bobak Mortazavi, Zhangyang Wang

Algorithms, Unsupervised Learning

Memory Efficient Meta-Learning with Large Images

John Bronskill, Daniela Massiceti, Massimiliano Patacchiola, Katja Hofmann, Sebastian Nowozin, Richard Turner

optimization, machine learning, vision, meta learning, transfer learning, few shot learning

Small-GAN: Speeding up GAN Training using Core-Sets

Samrath Sinha, Han Zhang, Anirudh Goyal, Yoshua Bengio, Hugo Larochelle, Augustus Odena

Deep Learning - General

Uncertainty-Aware Curriculum Learning for Neural Machine Translation

Yikai Zhou, Baosong Yang, Derek F. Wong, Yu Wan, Lidia S. Chao

Neural Translation, assessment difficulty, translation tasks, Uncertainty-Aware Learning

FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training

Yonggan Fu, Haoran You, Yang Zhao, Yue Wang, Chaojian Li, Kailash Gopalakrishnan, Zhangyang Wang, Yingyan Lin

Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel

Stanislav Fort, Gintare Karolina Dziugaite, Mansheej Paul, Sepideh Kharaghani, Dan Roy, Surya Ganguli

Model Performance Scaling with Multiple Data Sources

Tatsunori Hashimoto

Algorithms, Supervised Learning

Rethinking Softmax Cross-Entropy Loss for Adversarial Robustness

Tianyu Pang, Kun Xu, Yinpeng Dong, Chao Du, Ning Chen, Jun Zhu

Trustworthy Machine Learning, Adversarial Robustness, Training Objective, Sample Density
