Fast and Accurate $k$-means++ via Rejection Sampling

06/12/2020

Vincent Cohen-Addad, Silvio Lattanzi, Ashkan Norouzi-Fard, Christian Sohler, Ola Svensson

Abstract: $k$-means++ \cite{arthur2007k} is a widely used clustering algorithm that is easy to implement, has strong theoretical guarantees, and performs well empirically. Despite its wide adoption, $k$-means++ can be slow on large datasets, so a natural question is whether more efficient algorithms with similar guarantees exist. In this paper, we present such a near-linear-time algorithm for $k$-means++ seeding. Interestingly, our algorithm obtains the same theoretical guarantees as $k$-means++ and significantly improves on earlier results on fast $k$-means++ seeding. Moreover, we show empirically that our algorithm is significantly faster than $k$-means++ and obtains solutions of equivalent quality.

The talk and the corresponding paper were published at the NeurIPS 2020 virtual conference.

Similar Papers

12/07/2020

How to Solve Fair k-Center in Massive Data Models

Ashish Chiplunkar, Sagar Kale, Sivaramakrishnan Natarajan Ramamoorthy

Keywords: Fairness, Equity, Justice, and Safety

13:45

06/12/2020

Efficient Clustering Based On A Unified View Of $K$-means And Ratio-cut

Shenfei Pei, Feiping Nie, Rong Wang, Xuelong Li

3:16

19/08/2021

DEHB: Evolutionary Hyperband for Scalable, Robust and Efficient Hyperparameter Optimization

Noor Awad, Neeratyoy Mallik, Frank Hutter

Keywords: Machine Learning, Evolutionary Learning

15:09

26/08/2020

Entropy Weighted Power k-Means Clustering

Saptarshi Chakraborty, Debolina Paul, Swagatam Das, Jason Xu

15:20

06/12/2021

Sparse Spiking Gradient Descent

Nicolas Perez-Nieves, Dan Goodman

Keywords: deep learning, optimization

14:54

06/12/2021

Cardinality constrained submodular maximization for random streams

Paul Liu, Aviad Rubinstein, Jan Vondrak, Junyao Zhao

Keywords: optimization

14:11

18/07/2021

Multiplying Matrices Without Multiplying

Davis Blalock, John Guttag

Keywords: Optimization, Convex Optimization, Algorithms, Sparsity and Compressed Sensing; Applications, Information Retrieval; Applications, Signal Processing, Algorithms, Others

5:27

12/07/2020

Moniqua: Modulo Quantized Communication in Decentralized SGD

Yucheng Lu, Christopher De Sa

Keywords: Optimization - Large Scale, Parallel and Distributed

14:57

02/02/2021

Automated Clustering of High-dimensional Data with a Feature Weighted Mean Shift Algorithm

Saptarshi Chakraborty, Debolina Paul, Swagatam Das

20:09

09/07/2020

A Greedy Anytime Algorithm for Sparse PCA

Dan Vilenchik, Adam Soffer, Guy Holtzman

Keywords: Non-convex optimization, Combinatorial optimization, Computational complexity, High-dimensional statistics, Unsupervised and semi-supervised learning

15:31

06/12/2020

Improved Guarantees for k-means++ and k-means++ Parallel

Konstantin Makarychev, Aravind Reddy, Liren Shan

3:20

06/12/2020

Hypersolvers: Toward Fast Continuous-Depth Models

Michael Poli, Stefano Massaroli, Atsushi Yamashita, Hajime Asama, Jinkyoo Park

3:16

06/12/2021

Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

Dylan J Foster, Akshay Krishnamurthy

Keywords: theory, reinforcement learning and planning, bandits, online learning

19:34

06/12/2020

Convergence of Meta-Learning with Task-Specific Adaptation over Partial Parameters

Kaiyi Ji, Jason Lee, Yingbin Liang, H. Vincent Poor

3:11

14/09/2020

An efficient K-means clustering algorithm for tall data

Marco Capó, Aritz Pérez, Jose A. Lozano

14:46

06/12/2020

Approximate Cross-Validation with Low-Rank Data in High Dimensions

Will Stephenson, Madeleine Udell, Tamara Broderick

3:02

02/02/2021

A Sharp Leap from Quantified Boolean Formula to Stochastic Boolean Satisfiability Solving

Pei-Wei Chen, Yu-Ching Huang, Jie-Hong R. Jiang

18:58

12/07/2020

Streaming k-Submodular Maximization under Noise subject to Size Constraint

Lan N. Nguyen, My T. Thai

Keywords: Optimization - General

14:52

06/12/2021

Better Algorithms for Individually Fair $k$-Clustering

Maryam Negahbani, Deeparnab Chakrabarty

Keywords: theory, self-supervised learning, clustering, fairness

14:02

26/04/2020

Accelerating SGD with momentum for over-parameterized learning

Chaoyue Liu, Mikhail Belkin

Keywords: SGD, acceleration, momentum, stochastic, over-parameterized, Nesterov

4:50

13/04/2021

On the faster alternating least-squares for CCA

Zhiqiang Xu, Ping Li

2:55

06/12/2020

A Catalyst Framework for Minimax Optimization

Junchi Yang, Siqi Zhang, Negar Kiyavash, Niao He

3:01

06/12/2020

Kernel Methods Through the Roof: Handling Billions of Points Efficiently

Giacomo Meanti, Luigi Carratino, Lorenzo Rosasco, Alessandro Rudi

3:28

03/05/2021

Practical Massively Parallel Monte-Carlo Tree Search Applied to Molecular Design

Xiufeng Yang, Tanuj Aasawat, Kazuki Yoshizoe

Keywords: molecular design, Upper Confidence bound applied to Trees (UCT), parallel Monte Carlo Tree Search (MCTS)

4:59

06/12/2020

BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits

Mo Tiwari, Martin Zhang, James J Mayclin, Sebastian Thrun, Chris Piech, Ilan Shomorony

3:16

06/12/2021

Robust and Fully-Dynamic Coreset for Continuous-and-Bounded Learning (With Outliers) Problems

Zixiu Wang, Yiwen Guo, Hu Ding

Keywords: optimization, machine learning, adversarial robustness and security, clustering

8:38

06/12/2020

Sliding Window Algorithms for k-Clustering Problems

Michele Borassi, Alessandro Epasto, Silvio Lattanzi, Sergei Vassilvitskii, Morteza Zadimoghaddam

3:16

22/11/2021

EBJR: Energy-Based Joint Reasoning for Adaptive Inference

Mohammad Akbari, Amin Banitalebi-Dehkordi, Yong Zhang

Keywords: joint inference, energy-based models, adaptive inference, classification, regression

2:48

12/07/2020

On Efficient Low Distortion Ultrametric Embedding

Vincent Cohen-Addad, Karthik C. S., Guillaume Lagarde

Keywords: Unsupervised and Semi-Supervised Learning

16:37

19/08/2021

Fine-grained Generalization Analysis of Structured Output Prediction

Waleed Mustafa, Yunwen Lei, Antoine Ledent, Marius Kloft

Keywords: Machine Learning, Learning Theory, Structured Prediction

15:46

06/12/2020

Belief Propagation Neural Networks

Jonathan Kuck, Shuvam Chakraborty, Hao Tang, Rachel Luo, Jiaming Song, Ashish Sabharwal, Stefano Ermon

3:22

14/09/2020

Incremental Sensitivity Analysis for Kernelized Models

Hadar Sivan, Moshe Gabel, Assaf Schuster

14:54

15/06/2020

NVTraverse: In NVRAM data structures, the destination is more important than the journey

Michal Friedman, Naama Ben-David, Yuanhao Wei, Guy E. Blelloch, Erez Petrank

Keywords: Non-blocking, Lock-free, Concurrent Data Structures, Non-volatile Memory

16:56

03/05/2021

RMSprop converges with proper hyper-parameter

Naichen Shi, Dawei Li, Mingyi Hong, Ruoyu Sun

Keywords: convergence, hyperparameter, RMSprop

10:12

06/12/2020

Fast Adaptive Non-Monotone Submodular Maximization Subject to a Knapsack Constraint

Georgios Amanatidis, Federico Fusco, Philip Lazos, Stefano Leonardi, Rebecca Reiffenhäuser

3:24

14/09/2020

Online Binary Incomplete Multi-view Clustering

Longqi Yang, Liangliang Zhang, Yuhua Tang

3:04

06/12/2020

Provably Efficient Online Hyperparameter Optimization with Population-Based Bandits

Jack Parker-Holder, Vu Nguyen, Stephen J Roberts

3:22

06/12/2021

Speedy Performance Estimation for Neural Architecture Search

Robin Ru, Clare Lyle, Lisa Schut, Miroslav Fil, Mark van der Wilk, Yarin Gal

Keywords: deep learning

13:22

03/05/2021

Fast Geometric Projections for Local Robustness Certification

Aymeric Fromherz, Klas Leino, Matt Fredrikson, Bryan Parno, Corina Pasareanu

Keywords: verification, robustness, safety

11:54

12/07/2020

Boosting Frank-Wolfe by Chasing Gradients

Cyrille Combettes, Sebastian Pokutta

Keywords: Optimization - Convex

16:15