Coresets for Classification – Simplified and Strengthened

06/12/2021

Tung Mai, Cameron Musco, Anup Rao

Keywords: machine learning, active learning

Abstract: We give relative error coresets for training linear classifiers with a broad class of loss functions, including the logistic loss and hinge loss. Our construction achieves $(1\pm \epsilon)$ relative error with $\tilde O(d \cdot \mu_y(X)^2/\epsilon^2)$ points, where $\mu_y(X)$ is a natural complexity measure of the data matrix $X \in \mathbb{R}^{n \times d}$ and label vector $y \in \{-1,1\}^n$, introduced by Munteanu et al. (2018). Our result is based on subsampling data points with probabilities proportional to their $\ell_1$ Lewis weights. It significantly improves on existing theoretical bounds and performs well in practice, outperforming uniform subsampling along with other importance sampling methods. Our sampling distribution does not depend on the labels, so it can be used for active learning. It also does not depend on the specific loss function, so a single coreset can be used in multiple training scenarios.
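The sampling step described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names (`l1_lewis_weights`, `sample_coreset`, `logistic_loss`), the fixed-point iteration used to approximate the $\ell_1$ Lewis weights, the iteration count, and the $1/(m p_i)$ importance-sampling rescaling are all assumptions made here for illustration. The paper's exact construction and constants may differ.

```python
import numpy as np

def l1_lewis_weights(X, n_iter=30):
    """Approximate the l1 Lewis weights of the rows of X.

    Assumption: a standard fixed-point iteration
    w_i <- sqrt(x_i^T (X^T W^{-1} X)^{-1} x_i), with W = diag(w).
    X is assumed to have full column rank.
    """
    n, d = X.shape
    w = np.full(n, d / n)  # uniform start; Lewis weights sum to d
    for _ in range(n_iter):
        M = X.T @ (X / w[:, None])  # X^T W^{-1} X
        # tau_i = x_i^T M^{-1} x_i for every row x_i
        tau = np.einsum("ij,ij->i", X @ np.linalg.inv(M), X)
        w = np.sqrt(np.maximum(tau, 1e-12))  # floor guards against zero rows
    return w

def sample_coreset(X, m, rng):
    """Draw m row indices with probability proportional to the Lewis weights.

    Only X is used, never the labels. The returned scale factors
    1 / (m * p_i) make the weighted loss an unbiased estimate of the full loss.
    """
    w = l1_lewis_weights(X)
    p = w / w.sum()
    idx = rng.choice(X.shape[0], size=m, replace=True, p=p)
    scale = 1.0 / (m * p[idx])
    return idx, scale

def logistic_loss(X, y, beta, weights=None):
    """Sum of (optionally weighted) logistic losses log(1 + exp(-y_i x_i^T beta))."""
    losses = np.logaddexp(0.0, -y * (X @ beta))  # numerically stable form
    return losses.sum() if weights is None else (weights * losses).sum()

# Hypothetical synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 5))
y = np.where(rng.random(2000) < 0.5, -1.0, 1.0)
beta = rng.standard_normal(5)

idx, scale = sample_coreset(X, m=300, rng=rng)
full = logistic_loss(X, y, beta)
approx = logistic_loss(X[idx], y[idx], beta, weights=scale)
print(f"full loss: {full:.2f}, coreset estimate: {approx:.2f}")
```

Because `sample_coreset` looks only at `X`, labels are needed only for the sampled rows, which is what makes the scheme usable for active learning. The same indices and scale factors can be reused with a different loss, such as the hinge loss, without resampling.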

The talk and the corresponding paper were published at the NeurIPS 2021 virtual conference.

Similar Papers

06/12/2020

Exact expressions for double descent and implicit regularization via surrogate random design

Michal Derezinski, Feynman Liang, Michael W Mahoney

3:24

18/07/2021

Learning from Biased Data: A Semi-Parametric Approach

Patrice Bertail, Stephan Clémençon, Yannick Guyonvarch, Nathan NOIRY

Keywords: Applications, Fairness, Accountability, and Transparency, Theory, Algorithms, Clustering; Applications, Hardware and Systems; Applications, Privacy, Anonymity, and Security

5:09

06/12/2021

Learning the optimal Tikhonov regularizer for inverse problems

Giovanni Alberti, Ernesto De Vito, Matti Lassas, Luca Ratti, Matteo Santacesaria

Keywords: self-supervised learning, graph learning

15:03

18/07/2021

Active Covering

Heinrich Jiang, Afshin Rostamizadeh

Keywords: Algorithms, Active Learning

4:47

06/12/2020

Task-Robust Model-Agnostic Meta-Learning

Liam Collins, Aryan Mokhtari, Sanjay Shakkottai

3:17

26/04/2020

Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks

Ziwei Ji, Matus Telgarsky

Keywords: neural tangent kernel, polylogarithmic width, test error, gradient descent, classification

5:04

12/07/2020

Subspace Fitting Meets Regression: The Effects of Supervision and Orthonormality Constraints on Double Descent of Generalization Errors

Yehuda Dar, Paul Mayer, Lorenzo Luzi, Richard Baraniuk

Keywords: Unsupervised and Semi-Supervised Learning

15:39

18/07/2021

Discriminative Complementary-Label Learning with Weighted Loss

Yi Gao, Min-Ling Zhang

Keywords: Probabilistic Methods, Probabilistic Methods, MCMC, Algorithms, Supervised Learning

20:39

03/05/2021

Initialization and Regularization of Factorized Neural Layers

Misha Khodak, Neil Tenenholtz, Lester Mackey, Nicolo Fusi

Keywords: matrix factorization, knowledge distillation, multi-head attention, model compression

4:25

18/07/2021

PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization

Zhize Li, Hongyan Bao, Xiangliang Zhang, Peter Richtarik

Keywords: Optimization

11:53

06/12/2021

An Improved Analysis of Gradient Tracking for Decentralized Machine Learning

Anastasiia Koloskova, Tao Lin, Sebastian Stich

Keywords: optimization, machine learning

7:22

06/12/2020

A random matrix analysis of random Fourier features: beyond the Gaussian kernel, a precise phase transition, and the corresponding double descent

Zhenyu Liao, Romain Couillet, Michael W Mahoney

3:26

06/12/2021

Fast Axiomatic Attribution for Neural Networks

Robin Hesse, Simone Schaub-Meyer, Stefan Roth

Keywords: deep learning, interpretability

14:49

06/12/2020

Domain Adaptation with Conditional Distribution Matching and Generalized Label Shift

Remi Tachet des Combes, Han Zhao, Yu-Xiang Wang, Geoffrey Gordon

3:19

04/08/2021

Query complexity of least absolute deviation regression via robust uniform convergence

Xue Chen, Michal Derezinski

19:41

03/05/2021

For self-supervised learning, Rationality implies generalization, provably

Yamini Bansal, Gal Kaplun, Boaz Barak

Keywords: Representation learning, Self-supervised learning, Generalization Bounds, Deep Learning Theory

7:23

06/12/2020

Sample Efficient Reinforcement Learning via Low-Rank Matrix Estimation

Devavrat Shah, Dogyoon Song, Zhi Xu, Yuzhe Yang

3:22

12/07/2020

Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime

Stéphane d'Ascoli, Maria Refinetti, Giulio Biroli, Florent Krzakala

Keywords: Deep Learning - Theory

15:11

09/07/2020

Bessel Smoothing and Multi-Distribution Property Estimation

Yi Hao, Ping Li

Keywords: Distribution learning/testing, High-dimensional statistics, Information theory

14:48

02/02/2021

Infinite Gaussian Mixture Modeling with an Improved Estimation of the Number of Clusters

Avi Matza, Yuval Bistritz

20:14

06/12/2021

A Minimalist Approach to Offline Reinforcement Learning

Scott Fujimoto, Shixiang (Shane) Gu

Keywords: reinforcement learning and planning, generative model

8:31

06/12/2020

Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions

Matthew Faw, Rajat Sen, Karthikeyan Shanmugam, Constantine Caramanis, Sanjay Shakkottai

3:24

06/12/2021

Representer Point Selection via Local Jacobian Expansion for Post-hoc Classifier Explanation of Deep Neural Networks and Ensemble Models

Yi Sui, Ga Wu, Scott Sanner

Keywords: deep learning, optimization, machine learning, vision

10:29

04/08/2021

Exponential Weights Algorithms for Selective Learning

Mingda Qiao, Gregory Valiant

12:52

18/07/2021

DORO: Distributional and Outlier Robust Optimization

Runtian Zhai, Chen Dan, Zico Kolter, Pradeep Ravikumar

Keywords: Probabilistic Methods, Robust statistics

5:06

19/04/2021

Keep learning: Self-supervised meta-learning for learning from inference

Akhil Kedia, Sai Chetan Chinthakindi

11:27

02/02/2021

MetaAugment: Sample-Aware Data Augmentation Policy Learning

Fengwei Zhou, Jiawei Li, Chuanlong Xie, Fei Chen, Lanqing Hong, Rui Sun, Zhenguo Li

18:19

18/07/2021

Training Data Subset Selection for Regression with Controlled Generalization Error

Durga S, Rishabh Iyer, Ganesh Ramakrishnan, Abir De

Keywords: Algorithms, Online Learning, Algorithms, Supervised Learning

4:15

06/12/2020

A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions

Wei Deng, Guang Lin, Faming Liang

3:26

06/12/2020

Sharper Generalization Bounds for Pairwise Learning

Yunwen Lei, Antoine Ledent, Marius Kloft

3:20

06/12/2020

Fourier Sparse Leverage Scores and Approximate Kernel Learning

Tamas Erdelyi, Cameron Musco, Christopher Musco

3:25

06/12/2021

Bridging Explicit and Implicit Deep Generative Models via Neural Stein Estimators

Qitian Wu, Rui Gao, Hongyuan Zha

Keywords: generative model

12:51

26/08/2020

A Rule for Gradient Estimator Selection, with an Application to Variational Inference

Tomas Geffner, Justin Domke

8:36

09/07/2020

Privately Learning Thresholds: Closing the Exponential Gap

Haim Kaplan, Katrina Ligett, Yishay Mansour, Moni Naor, Uri Stemmer

Keywords: Privacy, fairness, PAC learning

14:44

26/08/2020

Deep Active Learning: Unified and Principled Method for Query and Training

Changjian Shui, Fan Zhou, Christian Gagné, Boyu Wang

12:12

03/05/2021

Local Convergence Analysis of Gradient Descent Ascent with Finite Timescale Separation

Tanner Fiez, Lillian J Ratliff

Keywords: equilibrium, gradient descent-ascent, continuous games, game theory, theory, convergence, generative adversarial networks

5:10

18/07/2021

Fast margin maximization via dual acceleration

Ziwei Ji, Nati Srebro, Matus Telgarsky

Keywords: Optimization, Convex Optimization

4:50

06/12/2021

Determinantal point processes based on orthogonal polynomials for sampling minibatches in SGD

Rémi Bardenet, Subhroshekhar Ghosh, Meixia LIN

Keywords: optimization, machine learning

14:51

20/07/2020

A type of generalization error induced by initialization in deep neural networks

Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

17:33

18/07/2021

Model Performance Scaling with Multiple Data Sources

Tatsunori Hashimoto

Keywords: Algorithms, Supervised Learning

4:50