02/02/2021

Accelerated Combinatorial Search for Outlier Detection with Provable Bound on Sub-Optimality

Guihong Wan, Haim Schweitzer

Keywords:

Abstract: Outliers negatively affect the accuracy of data analysis. In this paper we are concerned with their influence on the accuracy of Principal Component Analysis (PCA). Algorithms that attempt to detect outliers and remove them from the data prior to applying PCA are sometimes called Robust PCA, or Robust Subspace Recovery algorithms. We propose a new algorithm for outlier detection that combines two ideas. The first is "chunk recursive elimination" that was used effectively to accelerate feature selection, and the second is combinatorial search, in a setting similar to A*. Our main result is showing how to combine these two ideas. One variant of our algorithm is guaranteed to compute an optimal solution according to some natural criteria, but its running time makes it impractical for large datasets. Other variants are much faster and come with provable bounds on sub-optimality. Experimental results show the effectiveness of the proposed approach.

The video of this talk cannot be embedded. You can watch it here:
https://slideslive.com/38948160
(Link will open in new window)
 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at AAAI 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers