Accelerating SLIDE Deep Learning on Modern CPUs: Vectorization, Quantizations, Memory Optimizations, and More

05/04/2021

Shabnam Daghaghi, Nicholas Meisburger, Mengnan Zhao, Anshumali Shrivastava

Abstract: Deep learning implementations on CPUs (Central Processing Units) are gaining traction. Enhanced AI capabilities on commodity x86 architectures are commercially appealing because they reuse existing hardware and are easy to virtualize. A notable work in this direction is the SLIDE system. SLIDE is a C++ implementation of sparse, hash-table-based back-propagation, which was shown to be significantly faster than GPUs in training neural models with hundreds of millions of parameters. In this paper, we argue that SLIDE's current implementation is sub-optimal and does not exploit several opportunities available in modern CPUs. In particular, we show how SLIDE's computations allow for a unique possibility of vectorization via AVX-512 (Advanced Vector Extensions 512). Furthermore, we highlight opportunities for different kinds of memory optimization and quantization. Combining all of them, we obtain up to a 7x speedup in the computations on the same hardware. Our experiments focus on large (hundreds of millions of parameters) recommendation and NLP models. Our work highlights several novel perspectives and opportunities for implementing randomized algorithms for deep learning on modern CPUs.

The video of this talk cannot be embedded. You can watch it here:

https://slideslive.com/38952707

The talk and the accompanying paper were published at the MLSys 2021 virtual conference.

Similar Papers

26/04/2020

Chameleon: Adaptive Code Optimization for Expedited Deep Neural Network Compilation

Byung Hoon Ahn, Prannoy Pilligundla, Amir Yazdanbakhsh, Hadi Esmaeilzadeh

Keywords: Reinforcement Learning, Learning to Optimize, Combinatorial Optimization, Compilers, Code Optimization, Neural Networks, ML for Systems, Learning for Systems

4:55

06/12/2020

Efficient Algorithms for Device Placement of DNN Graph Operators

Jakub Tarnawski, Amar Phanishayee, Nikhil Devanur, Divya Mahajan, Fanny Nina Paravecino

3:20

12/07/2020

Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning

Aleksei Petrenko, Zhehui Huang, Tushar Kumar, Gaurav Sukhatme, Vladlen Koltun

Keywords: Reinforcement Learning - Deep RL

15:56

11/08/2020

A computational approach to packet classification

Alon Rashelbach, Ori Rottenstreich, Mark Silberstein

Keywords: Neural Networks, Virtual Switches, Packet Classification

16:56

06/12/2021

Practical, Provably-Correct Interactive Learning in the Realizable Setting: The Power of True Believers

Julian Katz-Samuels, Blake Mason, Kevin Jamieson, Rob Nowak

Keywords: theory, machine learning, bandits, kernel methods, active learning

7:41

05/04/2021

Pipelined Backpropagation at Scale: Training Large Models without Batches

Atli Kosson, Vitaliy Chiley, Abhi Venigalla, Joel Hestness, Urs Koster

18:00

03/05/2021

Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning

Shauharda Khadka, Estelle Aflalo, Mattias Marder, Avrech Ben-David, Santiago Miret, Shie Mannor, Tamir Hazan, Hanlin Tang, Somdeb Majumdar

Keywords: Evolutionary Algorithms, Device Placement, Memory Mapping, Reinforcement Learning

5:49

03/05/2021

Reducing the Computational Cost of Deep Generative Models with Binary Neural Networks

Thomas Bird, Friso Kingma, David Barber

Keywords: generative, binary, optimization, compression

5:14

18/07/2021

Learn2Hop: Learned Optimization on Rough Landscapes

Amil Merchant, Luke Metz, Samuel Schoenholz, Ekin Cubuk

Keywords: Applications, Others

5:19

26/04/2020

Jelly Bean World: A Testbed for Never-Ending Learning

Emmanouil Antonios Platanios, Abulhair Saparov, Tom Mitchell

5:02

05/04/2021

Horizontally Fused Training Array: An Effective Hardware Utilization Squeezer for Training Novel Deep Learning Models

Shang Wang, Peiming Yang, Yuxuan Zheng, Xin Li, Gennady Pekhimenko

Keywords: Theory -> Statistical Physics of Learning, Optimization -> Non-Convex Optimization

4:46

06/12/2020

Kernel Methods Through the Roof: Handling Billions of Points Efficiently

Giacomo Meanti, Luigi Carratino, Lorenzo Rosasco, Alessandro Rudi

3:28

04/11/2020

Rammer: Enabling Holistic Deep Learning Compiler Optimizations with rTasks

Lingxiao Ma, Zhiqiang Xie, Zhi Yang, Jilong Xue, Youshan Miao, Wei Cui, Wenxiang Hu, Fan Yang, Lintao Zhang, Lidong Zhou

16:30

12/07/2020

Evolving Machine Learning Algorithms From Scratch

Esteban Real, Chen Liang, David So, Quoc Le

Keywords: Transfer, Multitask and Meta-learning

15:01

26/04/2020

Watch the Unobserved: A Simple Approach to Parallelizing Monte Carlo Tree Search

Anji Liu, Jianshu Chen, Mingze Yu, Yu Zhai, Xuewen Zhou, Ji Liu

Keywords: parallel Monte Carlo Tree Search (MCTS), Upper Confidence bound for Trees (UCT), Reinforcement Learning (RL)

14:43

15/11/2020

Shiftry: RNN Inference in 2KB of RAM

Aayan Kumar, Vivek Seshadri, Rahul Sharma

Keywords: Programming language, Fixed-point, Memory management, Machine learning, Embedded devices, Compiler, IoT device

16:06

06/12/2020

Counterexample-Guided Learning of Monotonic Neural Networks

Aishwarya Sivaraman, Golnoosh Farnadi, Todd Millstein, Guy Van den Broeck

3:22

05/04/2021

A Learned Performance Model for Tensor Processing Units

Sam Kaufman, Mangpo Phothilimthana, Yanqi Zhou, Charith Mendis, Sudip Roy, Amit Sabne, Mike Burrows

19:05

13/04/2021

Faster & more reliable tuning of neural networks: Bayesian optimization with importance sampling

Setareh Ariafar, Zelda Mariet, Dana Brooks, Jennifer Dy, Jasper Snoek

3:01

14/09/2020

Squeezing Correlated Neurons for Resource-Efficient Deep Neural Networks

Elbruz Ozen, Alex Orailoglu

Keywords: deep learning, information redundancy, pruning

14:48

06/12/2021

BatchQuant: Quantized-for-all Architecture Search with Robust Quantizer

Haoping Bai, Meng Cao, Ping Huang, Jiulong Shan

Keywords: deep learning, optimization

4:12

06/12/2020

AdaTune: Adaptive Tensor Program Compilation Made Efficient

Menghao Li, Minjia Zhang, Chi Wang, Mingqin Li

3:16

02/02/2021

Fitting the Search Space of Weight-sharing NAS with Graph Convolutional Networks

Xin Chen, Lingxi Xie, Jun Wu, Longhui Wei, Yuhui Xu, Qi Tian

15:02

23/08/2020

Time-aware user embeddings as a service

Martin Pavlovski, Jelena Gligorijevic, Ivan Stojkovic, Shubham Agrawal, Shabhareesh Komirishetty, Djordje Gligorijevic, Narayan Bhamidipati, Zoran Obradovic

Keywords: sequential models, user representation, neural embeddings

19:42

26/04/2020

PC-DARTS: Partial Channel Connections for Memory-Efficient Architecture Search

Yuhui Xu, Lingxi Xie, Xiaopeng Zhang, Xin Chen, Guo-Jun Qi, Qi Tian, Hongkai Xiong

Keywords: Neural Architecture Search, DARTS, Regularization, Normalization

4:40

06/12/2020

Hypersolvers: Toward Fast Continuous-Depth Models

Michael Poli, Stefano Massaroli, Atsushi Yamashita, Hajime Asama, Jinkyoo Park

3:16

15/06/2020

Learning fast and precise numerical analysis

Jingxuan He, Gagandeep Singh, Markus Püschel, Martin Vechev

Keywords: Abstract interpretation, Performance optimization, Machine learning, Numerical domains

14:20

02/02/2021

Classifying Sequences of Extreme Length with Constant Memory Applied to Malware Detection

Edward Raff, William Fleshman, Richard Zak, Hyrum S. Anderson, Bobby Filar, Mark McLean

19:55

06/12/2021

Differentiable Optimization of Generalized Nondecomposable Functions using Linear Programs

Zihang Meng, Lopamudra Mukherjee, Yichao Wu, Vikas Singh, Sathya Narayanan Ravi

Keywords: deep learning, optimization

13:21

15/06/2020

CARAT: A case for virtual memory through compiler- and runtime-based address translation

Brian Suchy, Simone Campanoni, Nikos Hardavellas, Peter Dinda

Keywords: memory management, virtual memory

15:39

04/11/2020

A Tensor Compiler for Unified Machine Learning Prediction Serving

Supun Nakandala, Karla Saur, Gyeong-In Yu, Konstantinos Karanasos, Carlo Curino, Markus Weimer, Matteo Interlandi

19:56

19/01/2020

Disentanglement in Nested-Parallel Programs

Sam Westrick, Rohan Yadav, Matthew Fluet, Umut A. Acar

Keywords: memory management, disentanglement, parallel computing, functional programming, data race

21:33

14/06/2020

F-BRS: Rethinking Backpropagating Refinement for Interactive Segmentation

Konstantin Sofiiuk, Ilia Petrov, Olga Barinova, Anton Konushin

Keywords: interactive segmentation, interactive, instance segmentation, segmentation, backpropagating refinement, refinement

4:56

06/12/2021

LSH-SMILE: Locality Sensitive Hashing Accelerated Simulation and Learning

Chonghao Sima, Yexiang Xue

Keywords: deep learning

14:48

12/07/2020

Searching to Exploit Memorization Effect in Learning with Noisy Labels

Quanming Yao, Hansi Yang, Bo Han, Gang Niu, James Kwok

Keywords: Transfer, Multitask and Meta-learning

12:25

03/05/2021

LambdaNetworks: Modeling long-range Interactions without Attention

Irwan Bello

Keywords: attention, neural networks, image classification, deep learning, vision, transformer

9:59

02/02/2021

A Scalable Reasoning and Learning Approach for Neural-Symbolic Stream Fusion

Danh Le-Phuoc, Thomas Eiter, Anh Le-Tuan

18:49