Abstract:
Knowledge-distillation-based deep model compression has been actively pursued to improve the performance of a specified student architecture by distilling knowledge from a deeper teacher network. Among the various methods, attention-based knowledge distillation has shown great promise on large datasets. However, this approach is limited by its reliance on hand-designed attention functions such as the absolute sum. We address this shortcoming by proposing trainable attention mechanisms that improve performance when distilling knowledge from teacher to student. We also show that efficient use of dense connections between attention modules further improves the student's performance. When applied to the ResNet50 (teacher) and MobileNetv1 (student) pair on the ImageNet dataset, our approach reduces the Top-1 error rate by 9.6% over the previous state-of-the-art method.
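To make the distinction between hand-designed and trainable attention concrete, the following is a minimal PyTorch sketch (not the paper's implementation) that contrasts an absolute-sum attention map with a hypothetical trainable attention module inside a generic attention-transfer loss; the names TrainableAttention and attention_transfer_loss, and the specific normalization choices, are illustrative assumptions.

```python
# Illustrative sketch only: hand-designed vs. trainable attention maps
# for an attention-transfer style distillation loss (assumed formulation).
import torch
import torch.nn as nn
import torch.nn.functional as F

def absolute_sum_attention(feat: torch.Tensor) -> torch.Tensor:
    """Hand-designed attention: sum of absolute activations over channels,
    flattened and L2-normalized per sample (assumed normalization)."""
    att = feat.abs().sum(dim=1)                  # (N, H, W)
    return F.normalize(att.flatten(1), dim=1)    # (N, H*W)

class TrainableAttention(nn.Module):
    """Hypothetical trainable attention: a 1x1 conv learns how to pool
    channels into a spatial attention map (illustrative design only)."""
    def __init__(self, channels: int):
        super().__init__()
        self.pool = nn.Conv2d(channels, 1, kernel_size=1, bias=False)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        att = self.pool(feat).squeeze(1)          # (N, H, W)
        return F.normalize(att.flatten(1), dim=1)

def attention_transfer_loss(student_feat, teacher_feat,
                            student_att_fn, teacher_att_fn):
    """Match student and teacher attention maps (teacher side detached)."""
    s = student_att_fn(student_feat)
    t = teacher_att_fn(teacher_feat).detach()
    return F.mse_loss(s, t)

if __name__ == "__main__":
    # Dummy teacher/student feature maps with matching spatial size.
    teacher_feat = torch.randn(4, 256, 14, 14)
    student_feat = torch.randn(4, 64, 14, 14)
    att_s = TrainableAttention(64)
    att_t = TrainableAttention(256)
    loss = attention_transfer_loss(student_feat, teacher_feat, att_s, att_t)
    loss.backward()
```

In this sketch, replacing absolute_sum_attention with the learnable modules is what makes the attention function trainable; how the paper parameterizes and densely connects its attention modules is described in the main text.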