Serving DNNs like Clockwork: Performance Predictability from the Bottom Up

04/11/2020

Arpan Gujarati, Reza Karimi, Safya Alzayat, Wei Hao, Antoine Kaufmann, Ymir Vigfusson, Jonathan Mace

Abstract: Machine learning inference is becoming a core building block for interactive web applications. As a result, the underlying model serving systems on which these applications depend must consistently meet low latency targets. Existing model serving architectures use well-known reactive techniques to alleviate common-case sources of latency, but cannot effectively curtail tail latency caused by unpredictable execution times. Yet the underlying execution times are not fundamentally unpredictable; on the contrary, we observe that inference using Deep Neural Network (DNN) models has deterministic performance. Here, starting with the predictable execution times of individual DNN inferences, we adopt a principled design methodology to successively build a fully distributed model serving system that achieves predictable end-to-end performance. We evaluate our implementation, Clockwork, using production trace workloads, and show that Clockwork can support thousands of models while simultaneously meeting 100 ms latency targets for 99.997% of requests. We further demonstrate that Clockwork exploits predictable execution times to achieve tight request-level service-level objectives (SLOs) as well as a high degree of request-level performance isolation.

Talk and the respective paper are published at the OSDI 2020 virtual conference.

Similar Papers

26/04/2020

GenDICE: Generalized Offline Estimation of Stationary Values

Ruiyi Zhang, Bo Dai, Lihong Li, Dale Schuurmans

Keywords: Off-policy Policy Evaluation, Reinforcement Learning, Stationary Distribution Correction Estimation, Fenchel Dual

15:37

03/05/2021

MetaNorm: Learning to Normalize Few-Shot Batches Across Domains

Yingjun Du, Xiantong Zhen, Ling Shao, Cees G Snoek

Keywords: batch normalization, Meta-learning, few-shot domain generalization

5:48

06/12/2021

Speedy Performance Estimation for Neural Architecture Search

Robin Ru, Clare Lyle, Lisa Schut, Miroslav Fil, Mark van der Wilk, Yarin Gal

Keywords: deep learning

13:22

18/07/2021

Gaussian Process-Based Real-Time Learning for Safety Critical Applications

Armin Lederer, Alejandro Ordóñez Conejo, Korbinian Maier, Wenxin Xiao, Jonas Umlauft, Sandra Hirche

Keywords: Probabilistic Methods, Gaussian Processes and Bayesian non-parametrics

4:59

15/06/2020

Inductive sequentialization of asynchronous programs

Bernhard Kragl, Constantin Enea, Thomas A. Henzinger, Suha Orhun Mutluergil, Shaz Qadeer

Keywords: movers, layers, verification, abstraction, invariants, induction, concurrency, refinement, asynchrony, reduction

14:40

06/12/2020

Transfer Learning via $\ell_1$ Regularization

Masaaki Takada, Hironori Fujisawa

3:00

26/08/2020

Adaptive, Distribution-Free Prediction Intervals for Deep Networks

Danijel Kivaranovic, Kory D. Johnson, Hannes Leeb

16:48

11/08/2020

OmniMon: Re-architecting network telemetry with resource efficiency and full accuracy

Qun Huang, Haifeng Sun, Patrick P. C. Lee, Wei Bai, Feng Zhu, Yungang Bao

Keywords: Distributed systems, Network measurement

20:10

15/11/2020

Perfectly Parallel Fairness Certification of Neural Networks

Caterina Urban, Maria Christakis, Valentin Wüstholz, Fuyuan Zhang

Keywords: Abstract Interpretation, Fairness, Static Analysis, Neural Networks

15:18

15/11/2020

DiffStream: Differential Output Testing for Stream Processing Programs

Konstantinos Kallas, Filip Niksic, Caleb Stanford, Rajeev Alur

Keywords: runtime verification, differential testing, stream processing

15:50

15/11/2020

Foundations of Empirical Memory Consistency Testing

Jake Kirkham, Tyler Sorensen, Esin Tureci, Margaret Martonosi

Keywords: autotuning, conformance testing, memory consistency, GPUs, OpenCL

14:58

09/07/2020

Noise-tolerant, Reliable Active Classification with Comparison Queries

Max Hopkins, Shachar Lovett, Daniel Kane, Gaurav Mahajan

Keywords: Active learning, Classification, Learning with algebraic or combinatorial structure, PAC learning

15:23

06/12/2021

BulletTrain: Accelerating Robust Neural Network Training via Boundary Example Mining

Weizhe Hua, Yichi Zhang, Chuan Guo, Zhiru Zhang, G. Edward Suh

Keywords: deep learning, machine learning, robustness, adversarial robustness and security

6:36

12/07/2020

Overfitting in adversarially robust deep learning

Eric Wong, Leslie Rice, Zico Kolter

Keywords: Adversarial Examples

14:44

18/07/2021

GNNAutoScale: Scalable and Expressive Graph Neural Networks via Historical Embeddings

Matthias Fey, Jan Lenssen, Frank Weichert, Jure Leskovec

Keywords: Algorithms, Networks and Relational Learning

5:18

04/08/2021

Group testing and local search: is there a computational-statistical gap?

Fotis Iliopoulos, Ilias Zadik

17:50

06/12/2021

Efficient Training of Retrieval Models using Negative Cache

Erik Lindgren, Sashank Reddi, Ruiqi Guo, Sanjiv Kumar

Keywords: deep learning, machine learning

10:41

12/09/2020

WOLED: A tool for Online Learning Weighted Answer Set Rules for Temporal Reasoning Under Uncertainty

Nikos Katzouris, Alexander Artikis

Keywords: KR related tools and systems-General, Case studies for KR systems-General, Applications that combine KR with machine learning-General

15:59

02/02/2021

Any-Precision Deep Neural Networks

Haichao Yu, Haoxiang Li, Humphrey Shi, Thomas S. Huang, Gang Hua

14:26

06/12/2020

Optimal Robustness-Consistency Trade-offs for Learning-Augmented Online Algorithms

Alexander Wei, Fred Zhang

3:22

13/04/2021

Critical parameters for scalable distributed learning with large batches and asynchronous updates

Sebastian Stich, Amirkeivan Mohtashami, Martin Jaggi

3:00

06/12/2021

Channel Permutations for N:M Sparsity

Jeff Pool, Chong Yu

Keywords: optimization

12:41

26/08/2020

Sequential no-Substitution k-Median-Clustering

Tom Hess, Sivan Sabato

14:49

06/12/2020

Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes

Mengdi Xu, Wenhao Ding, Jiacheng Zhu, Zuxin Liu, Baiming Chen, Ding Zhao

3:21

04/08/2021

Adversarially Robust Low Dimensional Representations

Pranjal Awasthi, Vaggos Chatziafratis, Xue Chen, Aravindan Vijayaraghavan

20:19

12/07/2020

Optimal Continual Learning has Perfect Memory and is NP-hard

Jeremias Knoblauch, Hisham Husain, Tom Diethe

Keywords: Deep Learning - Theory

14:58

18/07/2021

Stochastic Sign Descent Methods: New Algorithms and Better Theory

Mher Safaryan, Peter Richtarik

Keywords: Optimization, Distributed and Parallel Optimization

5:12

23/06/2021

Perceus: Garbage Free Reference Counting with Reuse

Alex Reinking, Ningning Xie, Leonardo de Moura, Daan Leijen

Keywords: Reference Counting, Algebraic Effects, Handlers

24:39

23/06/2021

Incremental Whole-Program Analysis in Datalog with Lattices

Tamás Szabó, Sebastian Erdweg, Gábor Bergmann

Keywords: Static Analysis, Incremental Computing, Datalog

22:53

19/10/2020

Autonomous predictive modeling via reinforcement learning

Udayan Khurana, Horst Samulowitz

Keywords: reinforcement learning, data science automation, automated machine learning

4:21

06/12/2021

Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures

Yuan Cao, Quanquan Gu, Mikhail Belkin

Keywords: deep learning, machine learning

13:47

18/07/2021

Robust Unsupervised Learning via L-statistic Minimization

Andreas Maurer, Daniela Angela Parletta, Andrea Paudice, Massimiliano Pontil

Keywords: Theory, Statistical Learning Theory

5:03

13/04/2021

Prediction with finitely many errors almost surely

Changlong Wu, Narayana Santhanam

2:56

02/02/2021

Meta-Learning Framework with Applications to Zero-Shot Time-Series Forecasting

Boris N. Oreshkin, Dmitri Carpov, Nicolas Chapados, Yoshua Bengio

17:41

02/02/2021

Asynchronous Optimization Methods for Efficient Training of Deep Neural Networks with Guarantees

Vyacheslav Kungurtsev, Malcolm Egan, Bapi Chatterjee, Dan Alistarh

19:56

06/12/2021

Efficient First-Order Contextual Bandits: Prediction, Allocation, and Triangular Discrimination

Dylan J Foster, Akshay Krishnamurthy

Keywords: theory, reinforcement learning and planning, bandits, online learning

19:34

26/08/2020

ASAP: Architecture Search, Anneal and Prune

Asaf Noy, Niv Nayman, Tal Ridnik, Nadav Zamir, Sivan Doveh, Itamar Friedman, Raja Giryes, Lihi Zelnik

11:59

18/07/2021

Meta-learning Hyperparameter Performance Prediction with Neural Processes

Ying Wei, Peilin Zhao, Junzhou Huang

Keywords: Algorithms, Supervised Learning

5:07

19/08/2021

Fine-grained Generalization Analysis of Structured Output Prediction

Waleed Mustafa, Yunwen Lei, Antoine Ledent, Marius Kloft

Keywords: Machine Learning, Learning Theory, Structured Prediction

15:46

26/04/2020

N-BEATS: Neural basis expansion analysis for interpretable time series forecasting

Boris N. Oreshkin, Dmitri Carpov, Nicolas Chapados, Yoshua Bengio

Keywords: time series forecasting, deep learning

4:52