22/11/2021

Make Baseline Model Stronger: Embedded Knowledge Distillation in Weight-Sharing Based Ensemble Network

Shuchang LYU, Qi Zhao, Yujing Ma, Lijiang Chen

Keywords: knowledge distillation, ensemble learning, high-efficiency network

Abstract: Recently, many notable convolutional neural networks have achieved strong performance with compact and efficient structures. To further improve performance, previous methods either introduce more computation or design complex modules. In this paper, we propose an elegant weight-sharing based ensemble network with embedded knowledge distillation (EKD-FWSNet) to enhance the generalization ability of baseline models without adding computation or complex modules. Specifically, we first design an auxiliary branch alongside the baseline model, then set branch points and shortcut connections between the two branches to construct different forward paths. In this way, we form a weight-sharing ensemble network with multiple output predictions. Furthermore, we integrate the information from the diverse posterior probabilities and intermediate feature maps and transfer it to the baseline model through a knowledge distillation strategy. Extensive image classification experiments on the CIFAR-10/100 and tiny-ImageNet datasets demonstrate that the proposed EKD-FWSNet helps numerous baseline models improve accuracy by a large margin (sometimes more than 4%). We also conduct extended experiments on remote sensing datasets (AID, NWPU-RESISC45, UC-Merced) and achieve state-of-the-art results.
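
The training scheme outlined in the abstract (an auxiliary branch sharing weights with the baseline, an ensembled prediction, and distillation of both posteriors and intermediate features back into the baseline) can be sketched roughly as follows. This is a minimal illustrative sketch, not the authors' released code: the module and function names (ToyBaseline, AuxBranch, ekd_loss), the single branch point, and the specific loss weighting are assumptions for the example.

```python
# Illustrative sketch of a weight-sharing ensemble with embedded distillation.
# All names and hyperparameters here are hypothetical, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyBaseline(nn.Module):
    """Stand-in baseline split into two stages so a branch point can be set."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.stage1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                                    nn.AdaptiveAvgPool2d(1))
        self.fc = nn.Linear(32, num_classes)

    def forward(self, x):
        f1 = self.stage1(x)                 # branch point: intermediate feature
        f2 = self.stage2(f1)
        logits = self.fc(f2.flatten(1))
        return f1, f2, logits

class AuxBranch(nn.Module):
    """Auxiliary branch starting at the branch point; shares the baseline head."""
    def __init__(self, baseline):
        super().__init__()
        self.stage2_aux = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                                        nn.AdaptiveAvgPool2d(1))
        self.fc = baseline.fc               # weight sharing with the baseline head

    def forward(self, f1):
        f2_aux = self.stage2_aux(f1)
        return f2_aux, self.fc(f2_aux.flatten(1))

def ekd_loss(student_logits, teacher_logits, student_feat, teacher_feat,
             labels, T=4.0, alpha=0.5, beta=0.1):
    """Cross-entropy + soft-label KD + feature distillation (one plausible mix)."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  F.softmax(teacher_logits.detach() / T, dim=1),
                  reduction="batchmean") * T * T
    feat = F.mse_loss(student_feat, teacher_feat.detach())
    return ce + alpha * kd + beta * feat

# Usage: ensemble the two heads and distil the result into the baseline path.
baseline = ToyBaseline()
aux = AuxBranch(baseline)
x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
f1, f2, logits_base = baseline(x)
f2_aux, logits_aux = aux(f1)
ensemble_logits = (logits_base + logits_aux) / 2    # integrated posterior
loss = ekd_loss(logits_base, ensemble_logits, f2, f2_aux, y)
loss.backward()
```

Because the auxiliary branch reuses the baseline classifier and only adds an alternative forward path during training, the baseline model keeps its original structure and cost at inference time, which is the property the abstract emphasizes.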

The talk and the respective paper are published at the BMVC 2021 virtual conference.

