22/11/2021

Make Baseline Model Stronger: Embedded Knowledge Distillation in Weight-Sharing Based Ensemble Network

Shuchang LYU, Qi Zhao, Yujing Ma, Lijiang Chen

Keywords: knowledge distillation, ensemble learning, high-efficiency network

Abstract: Recently, many notable convolutional neural networks have achieved strong performance with compact and efficient structures. To further improve performance, previous methods either introduce more computation or design complex modules. In this paper, we propose an elegant weight-sharing based ensemble network with embedded knowledge distillation (EKD-FWSNet) to enhance the generalization ability of baseline models without adding computation or complex modules. Specifically, we first design an auxiliary branch alongside the baseline model, then set branch points and shortcut connections between the two branches to construct different forward paths. In this way, we form a weight-sharing ensemble network with multiple output predictions. Furthermore, we integrate the information from the diverse posterior probabilities and intermediate feature maps and transfer it to the baseline model through a knowledge distillation strategy. Extensive image classification experiments on the CIFAR-10/100 and tiny-ImageNet datasets demonstrate that the proposed EKD-FWSNet helps numerous baseline models improve accuracy by a large margin (sometimes more than 4%). We also conduct extended experiments on remote sensing datasets (AID, NWPU-RESISC45, UC-Merced) and achieve state-of-the-art results.
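
The training scheme outlined in the abstract (an auxiliary branch sharing weights with the baseline, an ensembled prediction, and distillation of both posteriors and intermediate features back into the baseline) can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' released code: the module and function names (ToyBaseline, AuxBranch, ekd_loss), the single branch point, and the specific loss weighting are assumptions for the example.

```python
# Illustrative sketch of a weight-sharing ensemble with embedded distillation.
# All names and hyperparameters here are hypothetical, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyBaseline(nn.Module):
    """Stand-in baseline split into two stages so a branch point can be set."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(1))
        self.fc = nn.Linear(32, num_classes)

    def forward(self, x):
        f1 = self.stage1(x)                 # branch point: intermediate feature
        f2 = self.stage2(f1)
        logits = self.fc(f2.flatten(1))
        return f1, f2, logits

class AuxBranch(nn.Module):
    """Auxiliary branch starting at the branch point; shares the baseline head."""
    def __init__(self, baseline):
        super().__init__()
        self.stage2_aux = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                                        nn.AdaptiveAvgPool2d(1))
        self.fc = baseline.fc               # weight sharing with the baseline head

    def forward(self, f1):
        f2_aux = self.stage2_aux(f1)
        return f2_aux, self.fc(f2_aux.flatten(1))

def ekd_loss(student_logits, teacher_logits, student_feat, teacher_feat,
             labels, T=4.0, alpha=0.5, beta=0.1):
    """Cross-entropy + soft-label KD + feature distillation (one plausible mix)."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits.detach() / T, dim=1),
                  reduction="batchmean") * T * T
    feat = F.mse_loss(student_feat, teacher_feat.detach())
    return ce + alpha * kd + beta * feat

# Usage: ensemble the two heads and distil the result into the baseline path.
baseline = ToyBaseline()
aux = AuxBranch(baseline)
x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
f1, f2, logits_base = baseline(x)
f2_aux, logits_aux = aux(f1)
ensemble_logits = (logits_base + logits_aux) / 2    # integrated posterior
loss = ekd_loss(logits_base, ensemble_logits, f2, f2_aux, y)
loss.backward()
```

Because the auxiliary branch reuses the baseline classifier and only adds an alternative forward path during training, the baseline model keeps its original structure and cost at inference time, which is the property the abstract emphasizes.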

The talk and the respective paper are published at the BMVC 2021 virtual conference.

