Stochastic Precision Ensemble: Self-Knowledge Distillation for Quantized Deep Neural Networks

02/02/2021

Yoonho Boo, Sungho Shin, Jungwook Choi, Wonyong Sung

Abstract: Quantized deep neural networks (QDNNs) have been actively studied for deployment on edge devices. Recent studies employ knowledge distillation (KD) to improve the performance of quantized networks. In this study, we propose stochastic precision ensemble training for QDNNs (SPEQ). SPEQ is a KD training scheme in which the teacher is formed by sharing the model parameters of the student network. We obtain the teacher's soft labels by randomly changing the bit precision of the activations at each layer of the forward-pass computation. The student model is trained with these soft labels to reduce the activation quantization noise. The cosine similarity loss is employed, instead of the KL divergence, for KD training. Because the teacher model changes continuously through random bit-precision assignment, training exploits the effect of stochastic ensemble KD. SPEQ outperforms existing quantization training methods on various tasks, such as image classification, question answering, and transfer learning, without the need for cumbersome teacher networks.
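The training signal described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the uniform quantizer, the bias-free toy MLP, the candidate bit widths, applying the cosine loss to softmax outputs, and every function name below are assumptions made for illustration. No gradients are computed here; the sketch only shows how a teacher that shares the student's weights yields soft labels under randomly drawn per-layer activation precision.

```python
import math
import random


def quantize_activation(x, bits, clip=1.0):
    """Clamp x to [0, clip] and round it to one of 2**bits uniform levels (assumed quantizer)."""
    levels = (1 << bits) - 1
    x = min(max(x, 0.0), clip)
    return round(x * levels / clip) * clip / levels


def forward(weights, inputs, bits_per_layer):
    """Bias-free MLP forward pass; hidden layer i is quantized to bits_per_layer[i]."""
    h = inputs
    last = len(weights) - 1
    for i, w in enumerate(weights):
        h = [sum(wij * hj for wij, hj in zip(row, h)) for row in w]
        if i < last:  # the output logits are left unquantized
            h = [quantize_activation(v, bits_per_layer[i]) for v in h]
    return h


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cosine_loss(p, q):
    """1 - cosine similarity between two probability vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return 1.0 - dot / norm


def self_distillation_loss(weights, inputs, student_bits, candidate_bits, rng):
    """Cosine loss between the fixed-precision student and a random-precision teacher."""
    n_hidden = len(weights) - 1
    # Teacher: same weights, bit precision drawn independently for each layer.
    teacher_bits = [rng.choice(candidate_bits) for _ in range(n_hidden)]
    teacher_probs = softmax(forward(weights, inputs, teacher_bits))
    # Student: same weights, one fixed low precision for every layer.
    student_probs = softmax(forward(weights, inputs, [student_bits] * n_hidden))
    return cosine_loss(student_probs, teacher_probs), teacher_bits


# Toy usage with made-up weights (2 inputs -> 3 hidden -> 3 hidden -> 2 outputs).
weights = [
    [[0.5, -0.2], [0.1, 0.8], [0.3, 0.3]],
    [[0.6, 0.2, -0.1], [0.4, 0.4, 0.4], [-0.3, 0.9, 0.2]],
    [[1.0, -1.0, 0.5], [-0.5, 0.7, 0.2]],
]
rng = random.Random(0)
loss, teacher_bits = self_distillation_loss(weights, [0.9, 0.4], 2, [2, 4, 8], rng)
print(f"teacher bits per layer: {teacher_bits}, cosine loss: {loss:.6f}")
```

Calling `self_distillation_loss` repeatedly with the same `rng` draws a new teacher precision pattern each time, which is the sense in which the teacher acts as a stochastic ensemble. In real training, the teacher's soft labels would presumably be treated as constants when backpropagating through the student.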

Talk video: https://slideslive.com/38949351

The talk and the corresponding paper were published at the AAAI 2021 virtual conference.

Similar Papers

18/07/2021

Zero-Shot Knowledge Distillation from a Decision-Based Black-Box Model

Zi Wang

Keywords: Deep Learning

5:08

02/02/2021

Data-Free Knowledge Distillation with Soft Targeted Transfer Set Synthesis

Zi Wang

14:19

19/04/2021

Annealing knowledge distillation

Aref Jafari, Mehdi Rezagholizadeh, Pranav Sharma, Ali Ghodsi

12:38

03/05/2021

SEED: Self-supervised Distillation For Visual Representation

Jacob Zhiyuan Fang, Jianfeng Wang, Lijuan Wang, Lei Zhang, 'YZ' Yezhou Yang, Zicheng Liu

Keywords: Representation Learning, Self Supervised Learning, Knowledge Distillation

5:09

30/11/2020

Introspective Learning by Distilling Knowledge from Online Self-explanation

Jindong Gu, Zhiliang Wu, Volker Tresp

10:18

19/08/2021

Comparing Kullback-Leibler Divergence and Mean Squared Error Loss in Knowledge Distillation

Taehyeon Kim, Jaehoon Oh, Nak Yil Kim, Sangwook Cho, Se-Young Yun

Keywords: Machine Learning, Classification, Deep Learning

12:43

06/12/2021

Unsupervised Representation Transfer for Small Networks: I Believe I Can Distill On-the-Fly

Hee Min Choi, Hyoa Kang, Dokwan Oh

Keywords: self-supervised learning, representation learning

3:35

02/02/2021

Robust Knowledge Transfer via Hybrid Forward on the Teacher-Student Model

Liangchen Song, Jialian Wu, Ming Yang, Qian Zhang, Yuan Li, Junsong Yuan

16:09

14/06/2020

Neural Networks Are More Productive Teachers Than Human Raters: Active Mixup for Data-Efficient Knowledge Distillation From a Blackbox Model

Dongdong Wang, Yandong Li, Liqiang Wang, Boqing Gong

Keywords: blackbox knowledge distillation, data-efficient learning, active learning, mixup

4:59

22/11/2021

Semi-Online Knowledge Distillation

Zhiqiang Liu, Yanxia Liu, Chengkai Huang

Keywords: Knowledge Distillation, Model Compression

3:00

03/05/2021

Knowledge distillation via softmax regression representation learning

Jing Yang, Brais Martinez, Adrian Bulat, Georgios Tzimiropoulos

4:56

06/12/2021

Online Learning Of Neural Computations From Sparse Temporal Feedback

Lukas Braun, Tim Vogels

Keywords: deep learning, online learning

15:04

14/06/2020

Online Knowledge Distillation via Collaborative Learning

Qiushan Guo, Xinjiang Wang, Yichao Wu, Zhipeng Yu, Ding Liang, Xiaolin Hu, Ping Luo

Keywords: knowledge distillation, collaborative learning, transfer learning, deep neural network

4:37

18/07/2021

On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting

Shunta Akiyama, Taiji Suzuki

Keywords: Theory, Deep learning Theory

5:17

06/12/2020

Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions

Stefano Sarao Mannelli, Eric Vanden-Eijnden, Lenka Zdeborová

Keywords: Algorithms -> Model Selection and Structure Learning; Algorithms -> Representation Learning; Theory -> Computational Complexity; Reinforcement Learning and Planning -> Markov Decision Processes

3:22

03/05/2021

Rethinking Soft Labels for Knowledge Distillation: A Bias–Variance Tradeoff Perspective

Helong Zhou, Liangchen Song, Jiajie Chen, Ye Zhou, Guoli Wang, Junsong Yuan, Qian Zhang

Keywords: teacher-student model, soft labels, Knowledge distillation

2:20

02/02/2021

Learning to Reweight with Deep Interactions

Yang Fan, Yingce Xia, Lijun Wu, Shufang Xie, Weiqing Liu, Jiang Bian, Tao Qin, Xiang-Yang Li

14:06

02/02/2021

ALP-KD: Attention-Based Layer Projection for Knowledge Distillation

Peyman Passban, Yimeng Wu, Mehdi Rezagholizadeh, Qun Liu

18:53

26/04/2020

Contrastive Representation Distillation

Yonglong Tian, Dilip Krishnan, Phillip Isola

Keywords: Knowledge Distillation, Representation Learning, Contrastive Learning, Mutual Information

4:55

14/06/2020

Few Sample Knowledge Distillation for Efficient Network Compression

Tianhong Li, Jianguo Li, Zhuang Liu, Changshui Zhang

Keywords: efficient network compression, few samples, knowledge distillation

1:01

06/12/2021

MixACM: Mixup-Based Robustness Transfer via Distillation of Activated Channel Maps

Awais Muhammad, Fengwei Zhou, Chuanlong Xie, Jiawei Li, Sung-Ho Bae, Zhenguo Li

Keywords: deep learning, optimization, robustness, adversarial robustness and security

12:51

06/12/2020

Knowledge Distillation in Wide Neural Networks: Risk Bound, Data Efficiency and Imperfect Teacher

Guangda Ji, Zhanxing Zhu

3:19

02/02/2021

Teacher Guided Neural Architecture Search for Face Recognition

Xiaobo Wang

13:54

14/06/2020

Search to Distill: Pearls Are Everywhere but Not the Eyes

Yu Liu, Xuhui Jia, Mingxing Tan, Raviteja Vemulapalli, Yukun Zhu, Bradley Green, Xiaogang Wang

Keywords: neural architecture search, knowledge distillation, nas, neural architecture

4:26

22/11/2021

Adaptive Distillation: Aggregating Knowledge from Multiple Paths for Efficient Distillation

Sumanth Chennupati, Mohammad Mahdi Kamani, Zhongwei Cheng, Lin Chen

Keywords: Knowledge Distillation, Multitask Learning, Model Compression, Adaptive Distillation, Efficient Training

3:07

22/11/2021

Teacher-Class Network: A Neural Network Compression Mechanism

Shaiq Munir Malik, Fnu Mohbat, Muhammad Umair Haider, Muhammad Musab Rasheed, Murtaza Taj

Keywords: model compression, knowledge distillation, teacher-student network

3:17

14/06/2020

Distilling Cross-Task Knowledge via Relationship Matching

Han-Jia Ye, Su Lu, De-Chuan Zhan

Keywords: knowledge distillation, model reuse, knowledge transfer, cross-task learning, embedding learning

4:54

06/12/2021

Iterative Teacher-Aware Learning

Luyao Yuan, Dongruo Zhou, Junhong Shen, Jingdong Gao, Jeffrey L Chen, Quanquan Gu, Ying Nian Wu, Song-Chun Zhu

Keywords: theory, optimization, reinforcement learning and planning, machine learning

6:40

14/06/2020

Heterogeneous Knowledge Distillation Using Information Flow Modeling

Nikolaos Passalis, Maria Tzelepi, Anastasios Tefas

Keywords: neural network distillation, lightweight learning, information flow

1:00

14/06/2020

Dreaming to Distill: Data-Free Knowledge Transfer via DeepInversion

Hongxu Yin, Pavlo Molchanov, Jose M. Alvarez, Zhizhong Li, Arun Mallya, Derek Hoiem, Niraj K. Jha, Jan Kautz

Keywords: model inversion, data-free distillation, transfer, pruning, compression, incremental learning, continual learning, efficient, image synthesis, explainable ai

4:57

12/07/2020

Student Specialization in Deep Rectified Networks With Finite Width and Input Dimension

Yuandong Tian

Keywords: Deep Learning - Theory

15:31

06/12/2021

Learning curves of generic features maps for realistic datasets with a teacher-student model

Bruno Loureiro, Cedric Gerbelot, Hugo Cui, Sebastian Goldt, Florent Krzakala, Marc Mezard, Lenka Zdeborová

Keywords: deep learning, machine learning, kernel methods

12:59

05/01/2021

Effectiveness of Arbitrary Transfer Sets for Data-Free Knowledge Distillation

Gaurav Kumar Nayak, Konda Reddy Mopuri, Anirban Chakraborty

5:00

02/02/2021

Collaborative Group Learning

Shaoxiong Feng, Hongshen Chen, Xuancheng Ren, Zhuoye Ding, Kan Li, Xu Sun

17:58

06/12/2020

Black-Box Ripper: Copying black-box models using generative evolutionary algorithms

Antonio Barbalau, Adrian Cosma, Radu Tudor Ionescu, Marius Popescu

3:18

30/11/2020

Fully Supervised and Guided Distillation for One-Stage Detectors

Deyu Wang, Dongchao Wen, Junjie Liu, Wei Tao, Tse-Wei Chen, Kinya Osa, Masami Kato

7:14

02/02/2021

Cross-Layer Distillation with Semantic Calibration

Defang Chen, Jian-Ping Mei, Yuan Zhang, Can Wang, Zhe Wang, Yan Feng, Chun Chen

17:05

06/12/2021

Spatial Ensemble: a Novel Model Smoothing Mechanism for Student-Teacher Framework

Tengteng Huang, Yifan Sun, Xun Wang, Haotian Yao, Chi Zhang

10:07

08/12/2020

Query Distillation: BERT-based Distillation for Ensemble Ranking

Wangshu Zhang, Junhong Liu, Zujie Wen, Yafang Wang, Gerard de Melo

15:01

06/12/2021

Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data

Gongfan Fang, Yifan Bao, Jie Song, Xinchao Wang, Donglin Xie, Chengchao Shen, Mingli Song

Keywords: machine learning, vision, privacy

5:35