07/09/2020

SD-MTCNN: Self-Distilled Multi-Task CNN

Ankit Jha, Awanish Kumar, Biplab Banerjee, Vinay Namboodiri

Keywords: self-distillation, multi-task learning

Abstract: Multi-task learning (MTL) with convolutional neural networks (CNNs) trains a single network for multiple correlated tasks in concert. For accuracy-critical applications, performance is often boosted by resorting to a deeper network, which also increases model complexity. Such burdensome models are, however, difficult to deploy on mobile or edge devices. To strike a trade-off between the performance and complexity of CNNs in the MTL setting, we introduce the novel paradigm of self-distillation within the network. Different from traditional knowledge distillation (KD), which trains a Student to follow a cumbersome Teacher, our self-distilled multi-task CNN model, SD-MTCNN, aims at distilling knowledge from the deeper CNN layers into the shallow layers. Specifically, we follow a hard-sharing MTL setup in which all tasks share a generic feature encoder on top of which separate task-specific decoders are attached. Under this premise, SD-MTCNN distills the more abstract decoder features into the encoded feature space, which improves multi-task performance across different parts of the network. We validate SD-MTCNN on three benchmark datasets: CityScapes, NYUv2, and Mini-Taskonomy, and the results confirm the improved generalization capability of self-distilled multi-task CNNs in comparison to the literature and baselines.
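The abstract describes the setup only at a high level. A minimal, hypothetical PyTorch sketch of a hard-sharing MTL model with a self-distillation term of this kind is given below. The module names, channel sizes, and the use of an MSE loss between the shared encoder features and each decoder's deeper features are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): hard-sharing MTL with a
# self-distillation term that pulls the shared encoder features toward
# the more abstract, deeper features inside each task decoder.
# Module names, shapes, and the MSE distillation loss are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedEncoder(nn.Module):
    def __init__(self, in_ch=3, feat_ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)  # shared (shallow) feature map

class TaskDecoder(nn.Module):
    def __init__(self, feat_ch=64, out_ch=1):
        super().__init__()
        self.deep = nn.Sequential(  # deeper, task-specific layers
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.head = nn.Conv2d(feat_ch, out_ch, 1)

    def forward(self, shared_feat):
        deep_feat = self.deep(shared_feat)           # "teacher" features
        return self.head(deep_feat), deep_feat

def sd_mtl_loss(encoder, decoders, task_losses, x, targets, lam=0.1):
    """Total loss = sum of per-task losses + lam * self-distillation terms."""
    shared = encoder(x)
    total = 0.0
    for name, dec in decoders.items():
        pred, deep_feat = dec(shared)
        total = total + task_losses[name](pred, targets[name])
        # Self-distillation: match the shared (student) features to the
        # detached deeper (teacher) features of this task's decoder.
        total = total + lam * F.mse_loss(shared, deep_feat.detach())
    return total
```

In this sketch the distillation gradient only flows into the shared encoder (the decoder features are detached), which is one plausible way to realize "distilling deeper features into the shallow layers"; the paper should be consulted for the actual loss formulation and weighting.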

