05/04/2021

Accelerate Inference of CNNs for Video Analysis While Preserving Exactness Exploiting Activation Sparsity

Toshiaki Wakatsuki, Sekitoshi Kanai, Yasuhiro Fujiwara

Abstract: This paper proposes a range-bound-aware convolution layer that accelerates inference of rectified linear unit (ReLU)-based convolutional neural networks (CNNs) on video streams. Because video analysis systems must process each frame in real time, the computational cost of CNN inference must be reduced. Several techniques heuristically skip computation for the current frame and reuse the results of the previous frame when the two frames are sufficiently similar. However, these techniques trade accuracy for efficiency, which can be unacceptable for critical applications such as surveillance systems. In contrast, our method reduces the computational cost of convolution layers followed by ReLU while producing exactly the same inference results as the original model. We exploit both the temporal similarity of video frames and the activation sparsity of ReLU-based CNNs so that only computations guaranteed to be redundant are skipped. We experimentally confirm that our method accelerates widely used pre-trained CNNs with both CPU and GPU implementations.
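
To make the idea in the abstract concrete, here is a minimal sketch of how an upper bound can license an exact skip. It is an illustration under stated assumptions, not the authors' algorithm: the function names, the 1D convolution, and the particular bound (the L1 norm of the kernel times the largest per-element change between frames) are all hypothetical choices made for this example. The reasoning is that if the previous frame's pre-activation plus the largest possible increase is still non-positive, ReLU must output zero, so the dot product need not be evaluated. The sketch also assumes the previous frame's pre-activations are fully known and uses integer data, so floating-point rounding is not addressed.

```python
def conv1d_preact(x, w, b):
    """Pre-activations of a 'valid' 1D convolution (cross-correlation form)."""
    k = len(w)
    return [sum(wj * x[i + j] for j, wj in enumerate(w)) + b
            for i in range(len(x) - k + 1)]


def relu_conv1d_skipping(x_cur, x_prev, z_prev, w, b):
    """ReLU(conv(x_cur)) that skips outputs provably equal to zero.

    Assumption (illustrative, not from the paper): z_prev holds the exact
    pre-activations for x_prev. Since z_cur[i] - z_prev[i] is a dot product
    of w with a patch of (x_cur - x_prev), it is at most
    ||w||_1 * max|x_cur - x_prev|.
    """
    k = len(w)
    l1_norm = sum(abs(wj) for wj in w)
    max_change = max(abs(c - p) for c, p in zip(x_cur, x_prev))
    margin = l1_norm * max_change  # upper bound on any pre-activation increase

    out, skipped = [], 0
    for i in range(len(x_cur) - k + 1):
        if z_prev[i] + margin <= 0:
            # Upper bound is non-positive, so ReLU output is exactly zero.
            out.append(0)
            skipped += 1
        else:
            z = sum(wj * x_cur[i + j] for j, wj in enumerate(w)) + b
            out.append(max(z, 0))
    return out, skipped


# Two similar "frames" with integer values (hypothetical data).
w, b = [1, -2, 1], -3
x_prev = [0, 5, 0, 5, 0, 9, 9, 9]
x_cur = [0, 6, 0, 5, -1, 9, 9, 10]

z_prev = conv1d_preact(x_prev, w, b)
out, skipped = relu_conv1d_skipping(x_cur, x_prev, z_prev, w, b)
reference = [max(z, 0) for z in conv1d_preact(x_cur, w, b)]
```

In this toy case `out` matches `reference` element for element while half of the dot products are skipped. The bound used here is deliberately coarse; a tighter bound would skip more outputs at the price of more bookkeeping.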

The talk video is available here:
https://slideslive.com/38952753
The talk and the corresponding paper were published at the MLSys 2021 virtual conference.
