Simple and Scalable Sparse k-means Clustering via Feature Ranking

06/12/2020

Zhiyue Zhang, Kenneth Lange, Jason Xu

Abstract: Clustering, a fundamental activity in unsupervised learning, is notoriously difficult when the feature space is high-dimensional. Fortunately, in many realistic scenarios, only a handful of features are relevant in distinguishing clusters. This has motivated the development of sparse clustering techniques that typically rely on k-means within outer algorithms of high computational complexity. Current techniques also require careful tuning of shrinkage parameters, further limiting their scalability. In this paper, we propose a novel framework for sparse k-means clustering that is intuitive, simple to implement, and competitive with state-of-the-art algorithms. We show that our algorithm enjoys consistency and convergence guarantees. Our core method readily generalizes to several task-specific algorithms such as clustering on subsets of attributes and in partially observed data settings. We showcase these contributions thoroughly via simulated experiments and real data benchmarks, including a case study on protein expression in trisomic mice.

The talk and the corresponding paper were published at the NeurIPS 2020 virtual conference.

Similar Papers

14/09/2020

An efficient K-means clustering algorithm for tall data

Marco Capó, Aritz Pérez, Jose A. Lozano

14:46

18/07/2021

Differentially-Private Clustering of Easy Instances

Edith Cohen, Haim Kaplan, Yishay Mansour, Uri Stemmer, Eliad Tsfadia

Keywords: Social Aspects of Machine Learning, Privacy, Anonymity, and Security

4:58

14/09/2020

Online Binary Incomplete Multi-view Clustering

Longqi Yang, Liangliang Zhang, Yuhua Tang

3:04

06/12/2020

Higher-Order Spectral Clustering of Directed Graphs

Steinar Laenen, He Sun

3:22

03/05/2021

MetaNorm: Learning to Normalize Few-Shot Batches Across Domains

Yingjun Du, Xiantong Zhen, Ling Shao, Cees G Snoek

Keywords: batch normalization, Meta-learning, few-shot domain generalization

5:48

06/12/2020

Provable Overlapping Community Detection in Weighted Graphs

Jimit Majmudar, Stephen Vavasis

3:09

06/12/2021

Robust and Fully-Dynamic Coreset for Continuous-and-Bounded Learning (With Outliers) Problems

Zixiu Wang, Yiwen Guo, Hu Ding

Keywords: optimization, machine learning, adversarial robustness and security, clustering

8:38

19/08/2021

Fast Multi-label Learning

Xiuwen Gong, Dong Yuan, Wei Bao

Keywords: Machine Learning, Multi-instance; Multi-label; Multi-view learning

15:18

12/07/2020

On hyperparameter tuning in general clustering problems

Xinjie Fan, Yuguang Yue, Purnamrita Sarkar, Y. X. Rachel Wang

Keywords: Unsupervised and Semi-Supervised Learning

12:53

19/08/2021

Details (Don't) Matter: Isolating Cluster Information in Deep Embedded Spaces

Lukas Miklautz, Lena G. M. Bauer, Dominik Mautz, Sebastian Tschiatschek, Christian Böhm, Claudia Plant

Keywords: Machine Learning, Deep Learning, Explainable/Interpretable Machine Learning, Clustering

14:37

18/07/2021

Learn2Hop: Learned Optimization on Rough Landscapes

Amil Merchant, Luke Metz, Samuel Schoenholz, Ekin Cubuk

Keywords: Applications, Others

5:19

19/04/2021

Exploring the limits of few-shot link prediction in knowledge graphs

Dora Jambor, Komal Teru, Joelle Pineau, William L. Hamilton

7:05

18/07/2021

A large-scale benchmark for few-shot program induction and synthesis

Ferran Alet, Javier Lopez-Contreras, James Koppel, Maxwell Nye, Armando Solar-Lezama, Tomas Lozano-Perez, Leslie Kaelbling, Josh Tenenbaum

Keywords: Algorithms, Multitask, Transfer, and Meta Learning

5:07

26/08/2020

Entropy Weighted Power k-Means Clustering

Saptarshi Chakraborty, Debolina Paul, Swagatam Das, Jason Xu

15:20

26/08/2020

Dependent randomized rounding for clustering and partition systems with knapsack constraints

David Harris, Thomas Pensyl, Aravind Srinivasan, Khoa Trinh

16:49

26/04/2020

What Can Neural Networks Reason About?

Keyulu Xu, Jingling Li, Mozhi Zhang, Simon S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka

Keywords: reasoning, deep learning theory, algorithmic alignment, graph neural networks

6:00

02/02/2021

Automated Clustering of High-dimensional Data with a Feature Weighted Mean Shift Algorithm

Saptarshi Chakraborty, Debolina Paul, Swagatam Das

20:09

06/12/2020

Adversarial Learning for Robust Deep Clustering

Xu Yang, Cheng Deng, Kun Wei, Junchi Yan, Wei Liu

3:23

04/08/2021

Adversarially Robust Low Dimensional Representations

Pranjal Awasthi, Vaggos Chatziafratis, Xue Chen, Aravindan Vijayaraghavan

20:19

12/07/2020

Efficient Continuous Pareto Exploration in Multi-Task Learning

Pingchuan Ma, Tao Du, Wojciech Matusik

Keywords: Transfer, Multitask and Meta-learning

14:56

19/08/2021

Fine-grained Generalization Analysis of Structured Output Prediction

Waleed Mustafa, Yunwen Lei, Antoine Ledent, Marius Kloft

Keywords: Machine Learning, Learning Theory, Structured Prediction

15:46

14/09/2020

Simple, Scalable, and Stable Variational Deep Clustering

Lele Cao, Sahar Asadi, Wenfei Zhu, Christian Schmidli, Michael Sjöberg

Keywords: deep clustering, deep embedding, variational deep clustering, gaussian mixture model, user profiling

13:54

06/12/2021

Efficient Bayesian network structure learning via local Markov boundary search

Ming Gao, Bryon Aragam

Keywords: theory, generative model, graph learning

3:31

06/12/2020

BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits

Mo Tiwari, Martin Zhang, James J Mayclin, Sebastian Thrun, Chris Piech, Ilan Shomorony

3:16

26/04/2020

Self-labelling via simultaneous clustering and representation learning

Asano YM., Rupprecht C., Vedaldi A.

Keywords: self-supervision, feature representation learning, clustering

4:57

06/12/2021

Solving Soft Clustering Ensemble via $k$-Sparse Discrete Wasserstein Barycenter

Ruizhe Qin, Mengying Li, Hu Ding

Keywords: clustering

12:11

26/04/2020

Minimizing FLOPs to Learn Efficient Sparse Representations

Biswajit Paria, Chih-Kuan Yeh, Ian E.H. Yen, Ning Xu, Pradeep Ravikumar, Barnabás Póczos

Keywords: sparse embeddings, deep representations, metric learning, regularization

4:41

14/06/2020

SGAS: Sequential Greedy Architecture Search

Guohao Li, Guocheng Qian, Itzel C. Delgadillo, Matthias Müller, Ali Thabet, Bernard Ghanem

Keywords: neural architecture search, degenerate search-evaluation correlation, cnn, gcn, image classification, point cloud classification, node classification on biological graphs, greedy search

1:01

13/04/2021

ATOL: Measure vectorization for automatic topologically-oriented learning

Martin Royer, Frederic Chazal, Clément Levrard, Yuhei Umeda, Yuichi Ike

3:05

14/06/2020

SSRNet: Scalable 3D Surface Reconstruction Network

Zhenxing Mi, Yiming Luo, Wenbing Tao

Keywords: deep learning, 3d surface reconstruction, scalable, point cloud, large scale

1:01

19/08/2021

Knowledge-based Residual Learning

Guanjie Zheng, Chang Liu, Hua Wei, Porter Jenkins, Chacha Chen, Tao Wen, Zhenhui Li

Keywords: Data Mining, Classification, Mining Spatial, Temporal Data, Theoretical Foundation of Data Mining

12:50

03/05/2021

Contextual Transformation Networks for Online Continual Learning

Quang Pham, Chenghao Liu, Doyen Sahoo, Steven Hoi

Keywords: Continual Learning

4:48

22/11/2021

EBJR: Energy-Based Joint Reasoning for Adaptive Inference

Mohammad Akbari, Amin Banitalebi-Dehkordi, Yong Zhang

Keywords: joint inference, energy-based models, adaptive inference, classification, regression

2:48

06/12/2021

Deep Conditional Gaussian Mixture Model for Constrained Clustering

Laura Manduchi, Kieran Chin-Cheong, Holger Michel, Sven Wellmann, Julia Vogt

Keywords: machine learning, robustness, generative model, clustering, representation learning

14:14

03/05/2021

Degree-Quant: Quantization-Aware Training for Graph Neural Networks

Shyam Tailor, Javier Fernandez-Marques, Nic Lane

Keywords: Graph neural networks, benchmark, quantization

5:01

06/12/2020

Kernel Methods Through the Roof: Handling Billions of Points Efficiently

Giacomo Meanti, Luigi Carratino, Lorenzo Rosasco, Alessandro Rudi

3:28

04/07/2020

SEEK: Segmented Embedding of Knowledge Graphs

Wentao Xu, Shun Zheng, Liang He, Bin Shao, Jian Yin, Tie-Yan Liu

Keywords: Segmented Graphs, knowledge embedding, artificial intelligence, recommendation

12:01

19/08/2021

Graph Filter-based Multi-view Attributed Graph Clustering

Zhiping Lin, Zhao Kang

Keywords: Machine Learning, Clustering, Multi-instance; Multi-label; Multi-view learning, Clustering, Unsupervised Learning

13:23

06/12/2020

Top-KAST: Top-K Always Sparse Training

Sid Jayakumar, Razvan Pascanu, Jack Rae, Simon Osindero, Erich Elsen

3:18

06/12/2021

Adversarial Attacks on Graph Classifiers via Bayesian Optimisation

Xingchen Wan, Henry Kenlay, Robin Ru, Arno Blaas, Michael A Osborne, Xiaowen Dong

Keywords: deep learning, machine learning, robustness, adversarial robustness and security, graph learning

14:12