05/04/2021

sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data

Guanhua Wang, Zhuang Liu, Brandon Hsieh, Siyuan Zhuang, Joseph Gonzalez, Trevor Darrell, Ion Stoica

Keywords:

Abstract: Convolutional Neural Networks (ConvNets) enable computers to excel on vision learning tasks such as image classification, object detection. Recently, real-time inference on live data is becoming more and more important. From a system perspective, it requires fast inference on each single, incoming data item (e.g. 1 image). Two main-stream distributed model serving paradigms – data parallelism and model parallelism – are not necessarily desirable here, because we cannot further split a single input data piece via data parallelism, and model parallelism introduces huge communication overhead. To achieve live data inference with low latency, we propose sensAI, a novel and generic approach that decouples a CNN model into disconnected subnets, each is responsible for predicting certain class(es). We call this new model distribution paradigm as class parallelism. Experimental results show that, sensAI achieves up to 18x faster inference on single input data item with no or negligible accuracy loss on CIFAR-10, CIFAR-100 and ImageNet-1K datasets.

The video of this talk cannot be embedded. You can watch it here:
https://slideslive.com/38952769
(Link will open in new window)
 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at MLSYS 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers