Approximate Cross-Validation in High Dimensions with Guarantees

Abstract: Leave-one-out cross-validation (LOOCV) can be particularly accurate among cross-validation (CV) variants for machine learning assessment tasks -- e.g., assessing methods' error or variability. But it is expensive to re-fit a model $N$ times for a dataset of size $N$. Previous work has shown that approximations to LOOCV can be both fast and accurate -- when the unknown parameter is of small, fixed dimension. But these approximations incur a running time roughly cubic in dimension -- and we show that, besides computational issues, their accuracy dramatically deteriorates in high dimensions. Authors have suggested many potential and seemingly intuitive solutions, but these methods have not yet been systematically evaluated or compared. We find that all but one perform so poorly as to be unusable for approximating LOOCV. Crucially, though, we are able to show, both empirically and theoretically, that one approximation can perform well in high dimensions -- in cases where the high-dimensional parameter exhibits sparsity. Under interpretable assumptions, our theory demonstrates that the problem can be reduced to working within an empirically recovered (small) support. This procedure is straightforward to implement, and we prove that its running time and error depend on the (small) support size even when the full parameter dimension is large.
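The idea of working within an empirically recovered support can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes squared loss with an L1 penalty, a plain ISTA solver, and synthetic data. The function names `lasso_ista` and `approx_loo_residuals` are hypothetical. With squared loss, a single Newton step on the recovered support gives the leave-one-out residual in closed form, provided the support and signs are held fixed, so only one Hessian of size |support| x |support| is formed rather than one of size D x D.

```python
import numpy as np


def lasso_ista(X, y, lam, n_iter=5000):
    """Hypothetical helper: ISTA for 0.5*||X @ theta - y||^2 + lam*||theta||_1."""
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        z = theta - step * (X.T @ (X @ theta - y))
        # Soft-thresholding produces exact zeros, which define the support.
        theta = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return theta


def approx_loo_residuals(X, y, theta):
    """Newton-step leave-one-out residuals, restricted to the support of theta.

    Assumes the support is non-empty, smaller than N, and that the
    restricted Hessian is invertible.
    """
    support = np.flatnonzero(theta)
    XS = X[:, support]                      # N x |S| restricted design
    resid = X @ theta - y                   # full-data residuals
    H = XS.T @ XS                           # |S| x |S| Hessian on the support
    # Leverage h_n = x_n^T H^{-1} x_n for every data point n.
    h = np.einsum("ij,ji->i", XS, np.linalg.solve(H, XS.T))
    return resid / (1.0 - h), support


# Synthetic high-dimensional problem (assumed for illustration): D >> N, sparse truth.
rng = np.random.default_rng(0)
N, D = 40, 200
X = rng.standard_normal((N, D))
y = X[:, :3] @ np.array([3.0, -2.0, 1.5]) + 0.1 * rng.standard_normal(N)

lam = 0.5 * np.max(np.abs(X.T @ y))         # below the value that zeroes everything
theta_hat = lasso_ista(X, y, lam)
loo_resid, support = approx_loo_residuals(X, y, theta_hat)
print("support size:", support.size)
print("approximate LOOCV mean squared error:", np.mean(loo_resid ** 2))
```

The sketch never re-fits the model N times. For losses other than squared loss, the Newton step is only an approximation and the Hessian would carry per-point curvature weights.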

12/07/2020

William Stephenson, Tamara Broderick

Similar Papers

Estimating the Error of Randomized Newton Methods: A Bootstrap Approach

Miles Lopes, Jessie X.T. Chen

Probabilistic Inference - Approximate, Monte Carlo, and Spectral Methods

Robust Meta-learning for Mixed Linear Regression with Small Batches

Weihao Kong, Raghav Somani, Sham Kakade, Sewoong Oh

Approximate Cross-Validation with Low-Rank Data in High Dimensions

Will Stephenson, Madeleine Udell, Tamara Broderick

Random Shuffling Beats SGD Only After Many Epochs on Ill-Conditioned Problems

Itay Safran, Ohad Shamir

optimization, machine learning

Practical and Rigorous Uncertainty Bounds for Gaussian Process Regression

Christian Fiedler, Carsten W. Scherer, Sebastian Trimpe

Exponential convergence rates of classification errors on learning with SGD and random features

Shingo Yashima, Atsushi Nitanda, Taiji Suzuki

Reanalyzing the most probable sentence problem: A case study in explicating the role of entropy in algorithmic complexity

Eric Corlett, Gerald Penn

Quasi-Newton Solver for Robust Non-Rigid Registration

Yuxin Yao, Bailin Deng, Weiwei Xu, Juyong Zhang

non-rigid registration, robust estimator, quasi-newton, welsch's function, mm algorithm, l-bfgs, deformation graph.

Online Convex Optimization in the Random Order Model

Dan Garber, Gal Korcia, Kfir Levy

Online Learning, Active Learning, and Bandits

Non-Stationary Bandits with Intermediate Observations

Claire Vernade, András György, Timothy Mann

Online Learning, Active Learning, and Bandits

Calibration and Consistency of Adversarial Surrogate Losses

Pranjal Awasthi, Natalie Frank, Anqi Mao, Mehryar Mohri, Yutao Zhong

theory, optimization, machine learning, robustness, adversarial robustness and security

Multi-Step Budgeted Bayesian Optimization with Unknown Evaluation Costs

Raul Astudillo, Daniel Jiang, Maximilian Balandat, Eytan Bakshy, Peter Frazier

optimization, reinforcement learning and planning, machine learning

Calibrated Reliable Regression using Maximum Mean Discrepancy

Peng Cui, Wenbo Hu, Jun Zhu

Perturbation Based Learning for Structured NLP tasks with Application to Dependency Parsing

Amichay Doitch, Ram Yazdi, Tamir Hazan, Roi Reichart

Structured tasks, Dependency Parsing, NLP, sampling

Matrix Completion with Quantified Uncertainty through Low Rank Gaussian Copula

Yuxuan Zhao, Madeleine Udell

Uncertainty Quantification and Deep Ensembles

Rahul Rahaman, Alexandre Thiery

deep learning, machine learning

Generalization Guarantees for Sparse Kernel Approximation with Entropic Optimal Features

Liang Ding, Rui Tuo, Shahin Shahrampour

General Machine Learning Techniques

Iteratively Reweighted Least Squares for Basis Pursuit with Global Linear Convergence Rate

Christian Kümmerle, Claudio Mayrink Verdun, Dominik Stöger

theory, optimization, machine learning

Bayesian Pseudocoresets

Dionysis Manousakas, Zuheng Xu, Cecilia Mascolo, Trevor Campbell

Robustness and scalability under heavy tails, without strong convexity

Matthew Holland

Knapsack Secretary with Bursty Adversary

Thomas Kesselheim, Marco Molinaro

Beyond worst-case, secretary problem, random order, online algorithms, knapsack

Deciphering and Optimizing Multi-Task Learning: a Random Matrix Approach
