Near-Optimal Algorithms for Explainable k-Medians and k-Means

18/07/2021

Kostya Makarychev, Liren Shan

Keywords: Algorithms, Unsupervised Learning

Abstract: We consider the problem of explainable $k$-medians and $k$-means introduced by Dasgupta, Frost, Moshkovitz, and Rashtchian (ICML 2020). In this problem, the goal is to find a \emph{threshold decision tree} that partitions the data into $k$ clusters and minimizes the $k$-medians or $k$-means objective. The resulting clustering is easy to interpret because every decision node of a threshold tree splits the data into two groups based on a single feature. We propose a new algorithm for this problem that is $\tilde O(\log k)$-competitive with $k$-medians with the $\ell_1$ norm and $\tilde O(k)$-competitive with $k$-means. This improves on the previous guarantees of $O(k)$ and $O(k^2)$ by Dasgupta et al. (2020). We also provide a new algorithm that is $O(\log^{3/2} k)$-competitive for $k$-medians with the $\ell_2$ norm. Our first algorithm is near-optimal: Dasgupta et al. (2020) showed a lower bound of $\Omega(\log k)$ for $k$-medians; in this work, we prove a lower bound of $\tilde\Omega(k)$ for $k$-means. We also provide a lower bound of $\Omega(\log k)$ for $k$-medians with the $\ell_2$ norm.
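
To make the objects in the abstract concrete, here is a minimal Python sketch of what a threshold tree clustering looks like and how its $k$-medians cost under the $\ell_1$ norm could be evaluated. It is not the algorithm proposed in the paper: the tree below is built by hand, and the names `Node`, `Leaf`, `assign`, and `k_medians_l1_cost`, as well as the toy points, are hypothetical and chosen only for illustration. The sketch relies on the standard fact that the coordinate-wise median of a cluster minimizes its total $\ell_1$ distance.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List, Sequence, Union

Point = Sequence[float]


@dataclass
class Leaf:
    cluster: int  # label given to every point that reaches this leaf


@dataclass
class Node:
    feature: int      # index of the single coordinate tested at this node
    threshold: float  # points with x[feature] <= threshold go to the left child
    left: Union["Node", Leaf]
    right: Union["Node", Leaf]


def assign(tree: Union[Node, Leaf], x: Point) -> int:
    """Follow single-feature threshold tests from the root down to a leaf."""
    while isinstance(tree, Node):
        tree = tree.left if x[tree.feature] <= tree.threshold else tree.right
    return tree.cluster


def k_medians_l1_cost(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Sum of l1 distances from each point to its cluster's coordinate-wise median."""
    clusters: Dict[int, List[Point]] = {}
    for x, c in zip(points, labels):
        clusters.setdefault(c, []).append(x)
    cost = 0.0
    for members in clusters.values():
        center = [median(col) for col in zip(*members)]
        cost += sum(abs(a - b) for x in members for a, b in zip(x, center))
    return cost


# Hand-built tree with k = 3 leaves (illustrative only, not learned from data).
points = [(0, 0), (1, 0), (0, 5), (1, 5), (10, 0), (10, 1)]
tree = Node(0, 5.0, Node(1, 2.5, Leaf(0), Leaf(1)), Leaf(2))

labels = [assign(tree, x) for x in points]
print(labels)                             # [0, 0, 1, 1, 2, 2]
print(k_medians_l1_cost(points, labels))  # 3.0
```

Each cluster here can be described by at most two threshold tests (for example, "feature 0 is at most 5 and feature 1 is at most 2.5"), which is the sense in which the clustering is explainable. A competitive ratio such as those in the abstract compares the cost of the tree-induced clustering against the cost of the best unconstrained clustering.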

The talk and the corresponding paper were published at the ICML 2021 virtual conference.

Similar Papers

06/12/2021

Nearly-Tight and Oblivious Algorithms for Explainable Clustering

Buddhima Gamlath, Xinrui Jia, Adam Polak, Ola Svensson

Keywords: optimization, clustering, interpretability

12:31

04/08/2021

Approximation Algorithms for Socially Fair Clustering

Yury Makarychev, Ali Vakilian

16:31

18/07/2021

Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity

Zhang Zihan, Yuan Zhou, Xiangyang Ji

Keywords: Reinforcement Learning and Planning

5:03

06/12/2021

Coresets for Decision Trees of Signals

Ibrahim Jubran, Ernesto Evgeniy Sanches Shayda, Ilan I Newman, Dan Feldman

Keywords: machine learning

14:50

09/07/2020

How to trap a gradient flow

Dan Mikulincer, Sebastien Bubeck

Keywords: Non-convex optimization

15:01

09/07/2020

An O(m/eps^3.5)-Cost Algorithm for Semidefinite Programs with Diagonal Constraints

Swati Padmanabhan, Yin Tat Lee

Keywords: Convex optimization, Approximation algorithms, Combinatorial optimization

12:34

06/12/2021

Better Algorithms for Individually Fair $k$-Clustering

Maryam Negahbani, Deeparnab Chakrabarty

Keywords: theory, self-supervised learning, clustering, fairness

14:02

18/07/2021

Meta Learning for Support Recovery in High-dimensional Precision Matrix Estimation

Qian Zhang, Yilin Zheng, Jean Honorio

Keywords: Algorithms, Meta-Learning, Algorithms, Few-Shot Learning; Algorithms, Multitask and Transfer Learning, Theory, Statistical Learning Theory

5:03

18/07/2021

Dimensionality Reduction for the Sum-of-Distances Metric

Zhili Feng, Praneeth Kacham, David Woodruff

Keywords: Neuroscience and Cognitive Science, Deep Learning, Biologically Plausible Deep Networks; Neuroscience and Cognitive Science, Connectomics; Neuroscience and Cog, Algorithms, Dimensionality Reduction

17:12

18/07/2021

Streaming and Distributed Algorithms for Robust Column Subset Selection

Shuli Jiang, Dongyu Li, Irene Mengze Li, Arvind Mahankali, David Woodruff

Keywords: Algorithms, Deep Learning, Generative Models, Deep Learning, Predictive Models; Deep Learning, Recurrent Networks

7:26

06/12/2020

Universal guarantees for decision tree induction via a higher-order splitting criterion

Guy Blanc, Neha Gupta, Jane Lange, Li-Yang Tan

2:53

18/07/2021

Randomized Dimensionality Reduction for Facility Location and Single-Linkage Clustering

Shyam Narayanan, Sandeep Silwal, Piotr Indyk, Or Zamir

Keywords: Algorithms, Dimensionality Reduction

5:00

26/10/2020

Solving K-MDPs

Jonathan Ferrer-Mestres, Thomas G. Dietterich, Olivier Buffet, Iadine Chadès

Keywords: state abstraction, interpretability, MDP, computational sustainability

10:13

06/12/2021

Breaking the Sample Complexity Barrier to Regret-Optimal Model-Free Reinforcement Learning

Gen Li, Laixi Shi, Yuxin Chen, Yuantao Gu, Yuejie Chi

Keywords: theory, reinforcement learning and planning

15:32

06/12/2020

Extrapolation Towards Imaginary 0-Nearest Neighbour and Its Improved Convergence Rate

Akifumi Okuno, Hidetoshi Shimodaira

3:14

08/07/2020

The Online Min-Sum Set Cover Problem

Dimitris Fotakis, Loukas Kavouras, Grigorios Koumoutsos, Stratis Skoulakis, Manolis Vardas

Keywords: Online Algorithms, Competitive Analysis, Min-Sum Set Cover

25:10

09/07/2020

How Good is SGD with Random Shuffling?

Itay M Safran, Ohad Shamir

Keywords: Convex optimization

11:50

06/12/2021

A Faster Maximum Cardinality Matching Algorithm with Applications in Machine Learning

Nathaniel Lahn, Sharath Raghvendra, Jiacheng Ye

Keywords: optimization, machine learning, graph learning

14:49

12/07/2020

Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: Joint Gradient Estimation and Tracking

Haoran Sun, Songtao Lu, Mingyi Hong

Keywords: Optimization - Non-convex

13:56

03/08/2020

Exponentially faster shortest paths in the congested clique

Michal Dory, Merav Parter

Keywords: congested clique, shortest paths, near-additive emulator

23:50

18/07/2021

Towards Tight Bounds on the Sample Complexity of Average-reward MDPs

Yujia Jin, Aaron Sidford

Keywords: Theory, RL, Decisions and Control Theory

5:05

18/07/2021

Randomized Algorithms for Submodular Function Maximization with a $k$-System Constraint

Shuang Cui, Kai Han, Tianshuai Zhu, Jing Tang, Benwei Wu, He Huang

Keywords: Optimization

4:48

06/12/2021

Optimal Uniform OPE and Model-based Offline Reinforcement Learning in Time-Homogeneous, Reward-Free and Task-Agnostic Settings

Ming Yin, Yu-Xiang Wang

Keywords: theory, reinforcement learning and planning

8:46

18/07/2021

Regularized Submodular Maximization at Scale

Ehsan Kazemi, Shervin Minaee, Moran Feldman, Amin Karbasi

Keywords: Optimization, Combinatorial Optimization

5:17

06/12/2021

Dimensionality Reduction for Wasserstein Barycenter

Zachary Izzo, Sandeep Silwal, Samson Zhou

Keywords: machine learning

11:10

09/07/2020

Locally Private Hypothesis Selection

Sivakanth Gopi, Gautam Kamath, Janardhan D Kulkarni, Aleksandar Nikolov, Steven Wu, Huanyu Zhang

Keywords: Privacy, fairness, Distribution learning/testing

14:58

06/12/2021

Improved Coresets and Sublinear Algorithms for Power Means in Euclidean Spaces

Vincent Cohen-Addad, David Saulpic, Chris Schwiegelshohn

Keywords: clustering

16:06

06/12/2021

Lower Bounds and Optimal Algorithms for Smooth and Strongly Convex Decentralized Optimization Over Time-Varying Networks

Dmitry Kovalev, Elnur Gasanov, Alexander Gasnikov, Peter Richtarik

Keywords: optimization

15:02

13/04/2021

Learning-to-rank with partitioned preference: Fast estimation for the Plackett-Luce model

Jiaqi Ma, Xinyang Yi, Weijing Tang, Zhe Zhao, Lichan Hong, Ed Chi, Qiaozhu Mei

3:03

06/12/2021

An Online Riemannian PCA for Stochastic Canonical Correlation Analysis

Zihang Meng, Rudrasis Chakraborty, Vikas Singh

Keywords: optimization, fairness

14:14

03/08/2020

High Dimensional Discrete Integration over the Hypergrid

Raj Kumar Maity, Arya Mazumdar, Soumyabrata Pal

8:46

06/12/2021

Nearly Horizon-Free Offline Reinforcement Learning

Tongzheng Ren, Jialian Li, Bo Dai, Simon Du, Sujay Sanghavi

Keywords: theory, optimization, reinforcement learning and planning

8:44

04/08/2021

Breaking The Dimension Dependence in Sparse Distribution Estimation under Communication Constraints

Wei-Ning Chen, Peter Kairouz, Ayfer Ozgur

15:28

06/12/2021

Instance-Dependent Bounds for Zeroth-order Lipschitz Optimization with Error Certificates

Francois Bachoc, Tom Cesari, Sébastien Gerchinovitz

Keywords: theory, optimization

14:51

12/07/2020

Near-optimal sample complexity bounds for learning Latent $k$-polytopes and applications to Ad-Mixtures

Chiranjib Bhattacharyya, Ravindran Kannan

Keywords: Learning Theory

15:04

06/12/2020

A Novel Approach for Constrained Optimization in Graphical Models

Sara Rouhani, Tahrima Rahman, Vibhav Gogate

3:21

12/07/2020

Input-Sparsity Low Rank Approximation in Schatten Norm

Yi Li, David Woodruff

Keywords: Unsupervised and Semi-Supervised Learning

15:36

22/06/2020

The Karger-Stein algorithm is optimal for k-cut

Anupam Gupta, Euiwoong Lee, Jason Li

Keywords: Graph Algorithms, Minimum Cut, Randomized Algorithms

26:15

22/06/2020

Coresets for clustering in Euclidean spaces: Importance sampling is nearly optimal

Lingxiao Huang, Nisheeth K. Vishnoi

Keywords: Coresets, k-means, Importance sampling, Dimension reduction, Clustering, k-median

19:23

18/07/2021

PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization

Zhize Li, Hongyan Bao, Xiangliang Zhang, Peter Richtarik

Keywords: Optimization

11:53