Abstract:
Recently, there has been increased interest in using machine learning techniques to improve classical algorithms.
In this paper we study when it is possible to construct compact, composable sketches
for weighted sampling and for estimating statistics that are
functions of data frequencies. Such structures are now central components of
large-scale data analytics and machine learning pipelines. However, many common
functions, such as thresholds and
$p$-th frequency moments with $p>2$, are known to require polynomial-size
sketches in the worst case. We explore performance beyond the
worst case under two different types of assumptions. The first is
having access to noisy \emph{advice} on item frequencies. This
continues the line of work of Hsu et al.~(ICLR 2019), who assume
predictions are provided by a machine learning model.
The second is providing guaranteed performance for a restricted class of
input frequency distributions that are better aligned with what is
observed in practice. This extends the work on heavy hitters under Zipfian distributions in the seminal paper of Charikar et al.~(ESA 2002).
Surprisingly, we show analytically and empirically that ``in practice'' small polylogarithmic-size sketches provide
accurate estimates of ``hard'' functions.