Efficient distributed algorithms for the k-nearest neighbors problem

Abstract: The K-nearest neighbors problem is a basic problem in machine learning with numerous applications. In this problem, given a (training) set of n data points with labels and a query point q, we want to assign a label to q based on the labels of the K points nearest to q. We study this problem in the k-machine model, a model for distributed large-scale data. In this model, we assume that the n points are distributed (in a balanced fashion) among the k machines, and the goal is to compute an answer when a query point is given to a machine, using a small number of communication rounds. Our main result is a randomized algorithm in the k-machine model that runs in O(log K) communication rounds with high success probability (regardless of the number of machines k and the number of points n). The message complexity of the algorithm is small, taking only O(k log K) messages. Our bounds are essentially the best possible for comparison-based algorithms. We also implemented our algorithm and show that it performs well in practice.
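
To make the problem setup concrete, here is a minimal Python sketch of K-nearest-neighbor labeling when the points are split across machines. It is only a naive baseline, not the algorithm from the paper: every machine sends its K local candidates to a coordinator, which costs on the order of kK messages rather than the O(k log K) stated above. The names `local_candidates`, `knn_label`, and `shards`, the use of Euclidean distance, and majority voting are all assumptions made for illustration.

```python
import heapq
import math
from collections import Counter


def local_candidates(shard, q, K):
    """One machine's work: its K points closest to q, as (distance, label) pairs."""
    return heapq.nsmallest(K, ((math.dist(x, q), label) for x, label in shard))


def knn_label(shards, q, K):
    """Naive baseline (hypothetical): gather K candidates from every machine.

    The global K nearest neighbors are always among the union of the
    per-machine K nearest, so merging the candidates gives the exact answer.
    """
    candidates = []
    for shard in shards:  # each shard stands in for one of the k machines
        candidates.extend(local_candidates(shard, q, K))
    nearest = heapq.nsmallest(K, candidates)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]  # majority label among the K nearest


# Toy data: 6 labeled 2-D points spread evenly over k = 3 machines.
shards = [
    [((0.0, 0.0), "a"), ((5.0, 5.0), "b")],
    [((1.0, 0.0), "a"), ((6.0, 5.0), "b")],
    [((0.0, 2.0), "b"), ((7.0, 5.0), "b")],
]

print(knn_label(shards, q=(0.5, 0.0), K=3))
```

The sketch simulates the machines with a plain loop, so it says nothing about communication rounds; it only shows what answer a distributed algorithm for this problem has to reproduce.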

06/12/2021

Reza Fathi, Anisur Rahaman Molla, Gopal Pandurangan

Similar Papers

Active clustering for labeling training data
Quentin Lutz, Elie de Panafieu, Maya Stein, Alex Scott
Keywords: clustering, active learning

On learning sparse vectors from mixture of responses

Efficiently learning structured distributions from untrusted batches
Sitan Chen, Jerry Li, Ankur Moitra
Keywords: sum-of-squares, federated learning, VC complexity, Robust statistics

Robust Density Estimation from Batches: The Best Things in Life are (Nearly) Free
Ayush Jain, Alon Orlitsky
Keywords: Theory, Statistical Learning Theory

Adversarially Robust Low Dimensional Representations
Pranjal Awasthi, Vaggos Chatziafratis, Xue Chen, Aravindan Vijayaraghavan

Large deviations for the perceptron model and consequences for active learning
Hugo Cui, Luca Saglietti, Lenka Zdeborova

Theoretical bounds on estimation error for meta-learning
James Lucas, Mengye Ren, Irene Raissa KAMENI KAMENI, Toniann Pitassi, Richard Zemel
Keywords: meta learning, minimax risk, few-shot, lower bounds, learning theory

Faster Random k-CNF Satisfiability
Andrea Lincoln, Adam Yedidia
Keywords: Random k-SAT, Average-Case, Algorithms

Minimax Rate for Learning From Pairwise Comparisons in the BTL Model
Julien Hendrickx, Alex Olshevsky, Venkatesh Saligrama

The Complexity of Dynamic Data Race Prediction
Keywords: Complexity, Data Race Prediction

Source Identification for Mixtures of Product Distributions
Spencer Gordon, Bijan H Mazaheri, Yuval Rabani, Leonard Schulman

Constructing a provably adversarially-robust classifier from a high accuracy one
Grzegorz Gluch, Rüdiger Urbanke

Online k-means clustering
Vincent Cohen-Addad, Benjamin Guedj, Varun Kanade, Guy Rom

Extrapolation Towards Imaginary 0-Nearest Neighbour and Its Improved Convergence Rate
Akifumi Okuno, Hidetoshi Shimodaira

Learning Online Algorithms with Distributional Advice
Ilias Diakonikolas, Vasilis Kontonis, Christos Tzamos, Ali Vakilian, Nikos Zarifis

Few-Shot Learning via Learning the Representation, Provably
Simon Du, Wei Hu, Sham M Kakade, Jason Lee, Qi Lei
Keywords: statistical learning theory, representation learning

Semi-bandit Optimization in the Dispersed Setting
Travis Dick, Wesley Pegden, Maria-Florina Balcan

Fuzzy Clustering with Similarity Queries
Wasim Huleihel, Arya Mazumdar, Soumyabrata Pal

Neural Active Learning with Performance Guarantees
Zhilei Wang, Pranjal Awasthi, Christoph Dann, Ayush Sekhari, Claudio Gentile
Keywords: deep learning, active learning

Expert Learning through Generalized Inverse Multiobjective Optimization: Models, Insights and Algorithms
Chaosheng Dong, Bo Zeng

Robust Unsupervised Learning via L-statistic Minimization
Andreas Maurer, Daniela Angela Parletta, Andrea Paudice, Massimiliano Pontil
Keywords: Theory, Statistical Learning Theory

Adaptive sampling for fast constrained maximization of submodular functions
Francesco Quinzan, Vanja Doskoc, Andreas Göbel, Tobias Friedrich

Learning based distributed tracking
Hao WU, Junhao Gan, Rui Zhang

