Robust Gaussian Covariance Estimation in Nearly-Matrix Multiplication Time

06/12/2020

Jerry Li, Guanghao Ye

Abstract: Robust covariance estimation is the following well-studied problem in high-dimensional statistics: given $N$ samples from a $d$-dimensional Gaussian $\mathcal{N}(\boldsymbol{0}, \Sigma)$, of which an $\varepsilon$-fraction have been arbitrarily corrupted, output $\widehat{\Sigma}$ minimizing the total variation distance between $\mathcal{N}(\boldsymbol{0}, \Sigma)$ and $\mathcal{N}(\boldsymbol{0}, \widehat{\Sigma})$. This corresponds to learning $\Sigma$ in a natural affine-invariant variant of the Frobenius norm known as the Mahalanobis norm. Previous work of Cheng et al. demonstrated an algorithm that, given $N = \widetilde{\Omega}(d^2 / \varepsilon^2)$ samples, achieves a near-optimal error of $O(\varepsilon \log (1 / \varepsilon))$ and runs in time $\widetilde{O}(T(N, d) \log \kappa / \mathrm{poly} (\varepsilon))$, where $T(N, d)$ is the time it takes to multiply a $d \times N$ matrix by its transpose, and $\kappa$ is the condition number of $\Sigma$. When $\varepsilon$ is relatively small, this polynomial dependence on $1/\varepsilon$ in the runtime is prohibitively large. In this paper, we demonstrate a novel algorithm that achieves the same statistical guarantees but runs in time $\widetilde{O} (T(N, d) \log \kappa)$. In particular, our runtime has no dependence on $\varepsilon$. When $\Sigma$ is reasonably conditioned, our runtime matches that of the fastest algorithm for covariance estimation without outliers, up to poly-logarithmic factors, showing that we can get robustness essentially "for free."
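To make the error measure in the abstract concrete, the following is a minimal sketch, not the authors' algorithm. It assumes the usual definition of the Mahalanobis error, $\|\Sigma^{-1/2} \widehat{\Sigma} \Sigma^{-1/2} - I\|_F$, and computes the non-robust baseline $\frac{1}{N} X X^\top$, which is the $d \times N$ matrix-times-transpose product whose cost the abstract calls $T(N, d)$. The function names `empirical_covariance` and `mahalanobis_error` are hypothetical and chosen only for illustration.

```python
import numpy as np


def empirical_covariance(samples):
    """Return (1/N) X X^T for a d x N matrix X whose columns are zero-mean samples."""
    x = np.asarray(samples, dtype=float)
    n = x.shape[1]
    # This product is the operation whose running time the abstract denotes T(N, d).
    return (x @ x.T) / n


def mahalanobis_error(sigma, sigma_hat):
    """Return ||sigma^{-1/2} sigma_hat sigma^{-1/2} - I||_F (assumed definition)."""
    sigma = np.asarray(sigma, dtype=float)
    sigma_hat = np.asarray(sigma_hat, dtype=float)
    d = sigma.shape[0]
    # A = sigma^{-1} sigma_hat - I. For symmetric inputs with sigma positive
    # definite, tr(A^2) equals the squared Frobenius norm above, so no matrix
    # square root is needed.
    a = np.linalg.solve(sigma, sigma_hat) - np.eye(d)
    return float(np.sqrt(max(np.trace(a @ a), 0.0)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 5, 20000
    sigma = np.diag(np.arange(1.0, d + 1.0))
    # Draw uncorrupted samples from N(0, sigma); columns are samples.
    x = np.linalg.cholesky(sigma) @ rng.standard_normal((d, n))
    print(mahalanobis_error(sigma, empirical_covariance(x)))
```

The error is unchanged if both matrices are replaced by $A \Sigma A^\top$ and $A \widehat{\Sigma} A^\top$ for any invertible $A$, which is the affine invariance the abstract refers to. The sketch uses clean samples only; the plain empirical covariance carries no robustness guarantee once an $\varepsilon$-fraction of the columns is corrupted.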

The talk and the corresponding paper were published at the NeurIPS 2020 virtual conference.

Similar Papers

06/12/2021

List-Decodable Mean Estimation in Nearly-PCA Time

Ilias Diakonikolas, Daniel Kane, Daniel Kongsgaard, Jerry Li, Kevin Tian

Keywords: theory, clustering

14:21

18/07/2021

Consistent regression when oblivious outliers overwhelm

Tommaso d'Orsi, Gleb Novikov, David Steurer

Keywords: Theory, Game Theory and Computational Economics, Theory, Theory, Computational Complexity

4:42

06/12/2020

The Flajolet-Martin Sketch Itself Preserves Differential Privacy: Private Counting with Minimal Space

Adam Smith, Shuang Song, Abhradeep Guha Thakurta

3:17

03/05/2021

Faster Binary Embeddings for Preserving Euclidean Distances

Jinjie Zhang, Rayan Saab

Keywords: Binary Embeddings, Sigma Delta Quantization, Johnson-Lindenstrauss Transforms

4:32

06/12/2020

On Adaptive Distance Estimation

Yeshwanth Cherapanamjeri, Jelani Nelson

3:16

06/12/2021

Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers

Tommaso d'Orsi, Chih-Hung Liu, Rajai Nasser, Gleb Novikov, David Steurer, Stefan Tiegel

Keywords: optimization

10:44

06/12/2020

Federated Principal Component Analysis

Andreas Grammenos, Rodrigo Mendoza Smith, Jon Crowcroft, Cecilia Mascolo

3:07

18/07/2021

Streaming and Distributed Algorithms for Robust Column Subset Selection

Shuli Jiang, Dongyu Li, Irene Mengze Li, Arvind Mahankali, David Woodruff

Keywords: Algorithms, Deep Learning, Generative Models, Deep Learning, Predictive Models; Deep Learning, Recurrent Networks

7:26

12/07/2020

On Efficient Low Distortion Ultrametric Embedding

Vincent Cohen-Addad, Karthik C. S., Guillaume Lagarde

Keywords: Unsupervised and Semi-Supervised Learning

16:37

09/07/2020

A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates

Zhixian Lei, Kyle Luh, Prayaag Venkat, Fred Zhang

Keywords: High-dimensional statistics, Adversarial learning and robustness

15:00

26/08/2020

One Sample Stochastic Frank-Wolfe

Mingrui Zhang, Zebang Shen, Aryan Mokhtari, Hamed Hassani, Amin Karbasi

6:05

04/08/2021

Near-Optimal Entrywise Sampling of Numerically Sparse Matrices

Vladimir Braverman, Robert Krauthgamer, Aditya R Krishnan, Shay Sapir

16:59

06/12/2020

Task-Robust Model-Agnostic Meta-Learning

Liam Collins, Aryan Mokhtari, Sanjay Shakkottai

3:17

18/07/2021

Meta Learning for Support Recovery in High-dimensional Precision Matrix Estimation

Qian Zhang, Yilin Zheng, Jean Honorio

Keywords: Algorithms, Meta-Learning, Algorithms, Few-Shot Learning; Algorithms, Multitask and Transfer Learning, Theory, Statistical Learning Theory

5:03

18/07/2021

First-Order Methods for Wasserstein Distributionally Robust MDP

Julien Grand-Clement, Christian Kroer

Keywords: Theory, RL, Decisions and Control Theory

5:18

06/12/2021

Optimal Sketching for Trace Estimation

Shuli Jiang, Hai Pham, David Woodruff, Richard Zhang

Keywords: machine learning

15:14

06/12/2020

Truncated Linear Regression in High Dimensions

Constantinos Daskalakis, Dhruv Rohatgi, Emmanouil Zampetakis

3:17

06/12/2021

A Comprehensively Tight Analysis of Gradient Descent for PCA

Zhiqiang Xu, Ping Li

Keywords: optimization

4:37

12/07/2020

Robust One-Bit Recovery via ReLU Generative Networks: Near-Optimal Statistical Rate and Global Landscape Analysis

Shuang Qiu, Xiaohan Wei, Zhuoran Yang

Keywords: Optimization - Non-convex

15:06

09/07/2020

Balancing Gaussian vectors in high dimension

Paxton M Turner, Raghu Meka, Philippe Rigollet

Keywords: Combinatorial optimization, Approximation algorithms, Concentration inequalities, High-dimensional statistics, Stochastic optimization

13:39

06/12/2021

Cardinality constrained submodular maximization for random streams

Paul Liu, Aviad Rubinstein, Jan Vondrak, Junyao Zhao

Keywords: optimization

14:11

06/12/2021

Better Algorithms for Individually Fair $k$-Clustering

Maryam Negahbani, Deeparnab Chakrabarty

Keywords: theory, self-supervised learning, clustering, fairness

14:02

06/12/2021

Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems

Suhas Kowshik, Dheeraj Nagaraj, Prateek Jain, Praneeth Netrapalli

Keywords: theory

14:43

09/07/2020

Better Algorithms for Estimating Non-Parametric Models in Crowd-Sourcing and Rank Aggregation

Allen X Liu, Ankur Moitra

Keywords: Matrix/tensor estimation, Learning with algebraic or combinatorial structure, Ranking and preference learning

14:09

06/12/2020

Robust Sub-Gaussian Principal Component Analysis and Width-Independent Schatten Packing

Arun Jambulapati, Jerry Li, Kevin Tian

3:22

12/07/2020

Closing the convergence gap of SGD without replacement

Shashank Rajput, Anant Gupta, Dimitris Papailiopoulos

Keywords: Optimization - Convex

12:45

09/07/2020

Efficient and robust algorithms for adversarial linear contextual bandits

Gergely Neu, Julia Olkhovskaya

Keywords: Bandit problems, Online learning

9:53

22/06/2020

Algorithms for heavy-tailed statistics: Regression, covariance estimation, and beyond

Yeshwanth Cherapanamjeri, Samuel B. Hopkins, Tarun Kathuria, Prasad Raghavendra, Nilesh Tripuraneni

Keywords: Sum-of-squares, Algorithms, Heavy-Tailed Estimation

20:29

06/12/2021

The Complexity of Sparse Tensor PCA

Davin Choo, Tommaso d'Orsi

15:10

08/07/2020

Deterministic Sparse Fourier Transform with an $\ell_\infty$ Guarantee

Yi Li, Vasileios Nakos

Keywords: Fourier sparse recovery, derandomization, incoherent matrices

19:52

18/07/2021

PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization

Zhize Li, Hongyan Bao, Xiangliang Zhang, Peter Richtarik

Keywords: Optimization

11:53

06/12/2020

Efficient active learning of sparse halfspaces with arbitrary bounded noise

Chicheng Zhang, Jie Shen, Pranjal Awasthi

3:20

06/12/2020

Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses

Raef Bassily, Vitaly Feldman, Cristóbal Guzmán, Kunal Talwar

3:11

06/12/2020

Online Robust Regression via SGD on the l1 loss

Scott Pesme, Nicolas Flammarion

3:17

12/07/2020

Optimal Statistical Guarantees for Adversarially Robust Gaussian Classification

Chen Dan, Yuting Wei, Pradeep Ravikumar

Keywords: Learning Theory

14:36

06/12/2021

Beyond Smoothness: Incorporating Low-Rank Analysis into Nonparametric Density Estimation

Robert A Vandermeulen, Antoine Ledent

Keywords: theory

12:58

18/07/2021

Improving Ultrametrics Embeddings Through Coresets

Vincent Cohen-Addad, Rémi de Joannis de Verclos, Guillaume Lagarde

Keywords: Algorithms, Clustering

5:19

06/12/2021

Tight High Probability Bounds for Linear Stochastic Approximation with Fixed Stepsize

Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Kevin Scaman, Hoi-To Wai

Keywords: machine learning

12:53

09/07/2020

Private Mean Estimation of Heavy-Tailed Distributions

Gautam Kamath, Vikrant Singhal, Jonathan Ullman

Keywords: Privacy, fairness, Distribution learning/testing

13:24

06/12/2021

Nearly Horizon-Free Offline Reinforcement Learning

Tongzheng Ren, Jialian Li, Bo Dai, Simon Du, Sujay Sanghavi

Keywords: theory, optimization, reinforcement learning and planning

8:44