Model Performance Scaling with Multiple Data Sources

Abstract: Real-world machine learning systems are often trained on a mix of data sources with varying cost and quality. Understanding how the size and composition of a training dataset affect model performance is critical both for advancing our understanding of generalization and for designing more effective data collection policies. We show that a simple scaling law predicts the loss incurred by a model even under varying dataset composition. Our work extends recent observations of log-linear scaling laws for generalization error in the i.i.d. setting and uses them to cast model performance prediction as a learning problem. Using the theory of optimal experimental design, we derive a simple rational function approximation to generalization error that can be fitted using a few model training runs. Our approach achieves highly accurate ($r^2 \approx 0.9$) predictions of model performance under substantial extrapolation on two standard supervised learning tasks and remains accurate ($r^2 \approx 0.83$) on more challenging machine translation and question answering tasks, where many baselines perform worse than random.
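
To make the i.i.d. starting point concrete, here is a minimal sketch of fitting a log-linear scaling law, log(loss) = intercept - alpha * log(n), from a few small training runs and extrapolating to a larger dataset. It covers only the single-source case; it is not the rational function approximation for multiple data sources that the abstract describes, which is not specified here. The function names, the synthetic losses, and the constants 5.0 and 0.3 are illustrative assumptions, not values from the paper.

```python
import numpy as np


def fit_log_linear(n, loss):
    """Least-squares fit of log(loss) = intercept - alpha * log(n)."""
    slope, intercept = np.polyfit(np.log(n), np.log(loss), deg=1)
    return -slope, intercept


def predict_loss(n, alpha, intercept):
    """Predicted loss at dataset size n under the fitted log-linear law."""
    return np.exp(intercept - alpha * np.log(n))


# Synthetic losses standing in for a few small training runs (assumed, not
# taken from the paper): loss = 5.0 * n^(-0.3).
sizes = np.array([1_000.0, 2_000.0, 4_000.0, 8_000.0])
losses = 5.0 * sizes ** (-0.3)

alpha, intercept = fit_log_linear(sizes, losses)

# Extrapolate to a dataset eight times larger than the largest observed run.
extrapolated = predict_loss(64_000.0, alpha, intercept)
print(f"alpha={alpha:.3f}, predicted loss at n=64000: {extrapolated:.4f}")
```

With real measurements the fit would be noisy, and a mixture of sources would call for a model that also depends on dataset composition, which is the gap the paper addresses.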

18/07/2021

Tatsunori Hashimoto

Similar Papers

Training Data Subset Selection for Regression with Controlled Generalization Error

Durga S, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

Algorithms, Online Learning, Algorithms, Supervised Learning

Training Over-parameterized Models with Non-decomposable Objectives

Harikrishna Narasimhan, Aditya Menon

optimization, machine learning, fairness

Adversarial Regression with Doubly Non-negative Weighting Matrices

Tam Le, Truyen Nguyen, Makoto Yamada, Jose Blanchet, Viet Anh Nguyen

machine learning

Extrapolation for Large-batch Training in Deep Learning

Tao LIN, Lingjing Kong, Sebastian Stich, Martin Jaggi

Deep Learning - Algorithms

Learn to expect the unexpected: Probably approximately correct domain generalization

Vikas Garg, Adam Tauman Kalai, Katrina Ligett, Steven Wu

GLISTER: Generalization based Data Subset Selection for Efficient and Robust Learning

Krishnateja Killamsetty, Durga Sivasubramanian, Ganesh Ramakrishnan, Rishabh Iyer

Robust Generalization despite Distribution Shift via Minimum Discriminating Information

Tobias Sutter, Andreas Krause, Daniel Kuhn

optimization, machine learning

Critical parameters for scalable distributed learning with large batches and asynchronous updates

Sebastian Stich, Amirkeivan Mohtashami, Martin Jaggi

Good classifiers are abundant in the interpolating regime

Ryan Theisen, Jason Klusowski, Michael Mahoney

When Do Curricula Work?

Xiaoxia (Shirley) Wu, Ethan Dyer, Behnam Neyshabur

Empirical Investigation, Understanding Deep Learning, Curriculum Learning

A Distribution-dependent Analysis of Meta Learning

Mikhail Konobeev, Ilja Kuzborskij, Csaba Szepesvari

Theory, Statistical Learning Theory

SuperLoss: A Generic Loss for Robust Curriculum Learning

Thibault Castells, Philippe Weinzaepfel, Jerome Revaud

Probabilistic Methods -> MCMC

Field-wise Learning for Multi-field Categorical Data

Zhibin Li, Jian Zhang, Yongshun Gong, Yazhou Yao, Qiang Wu

RATT: Leveraging Unlabeled Data to Guarantee Generalization

Saurabh Garg, Sivaraman Balakrishnan, Zico Kolter, Zachary Lipton

Probabilistic Methods, Graphical Models, Theory, Computational Complexity, Theory, Models of Learning and Generalization

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

Adaptive Sampling for Minimax Fair Classification

Shubhanshu Shekhar, Greg Fields, Mohammad Ghavamzadeh, Tara Javidi

deep learning, machine learning, fairness

Model-based Adversarial Meta-Reinforcement Learning

Zichuan Lin, Garrett Thomas, Guangwen Yang, Tengyu Ma

DORO: Distributional and Outlier Robust Optimization

Runtian Zhai, Chen Dan, Zico Kolter, Pradeep Ravikumar

Probabilistic Methods, Robust statistics

Precise Tradeoffs in Adversarial Training for Linear Regression

Adel Javanmard, Mahdi Soltanolkotabi, Hamed Hassani

Adversarial learning and robustness, High-dimensional statistics, Regression

Shape your Space: A Gaussian Mixture Regularization Approach to Deterministic Autoencoders

Amrutha Saseendran, Kathrin Skubch, Stefan Falkner, Margret Keuper

generative model

Theoretical bounds on estimation error for meta-learning

James Lucas, Mengye Ren, Irene Raissa KAMENI KAMENI, Toniann Pitassi, Richard Zemel
