Abstract:
Interpretability is widely regarded as essential for deploying deep neural networks. Saliency-based methods are among the most prevalent interpretability approaches because they generate intuitive, per-instance heatmaps that highlight the parts of an input image most important to the network's decision for a particular classification target. However, heatmaps generated by existing methods either contain too little information to represent objects (perturbation-based methods) or cannot effectively locate objects of multiple classes (activation-based methods). To address this issue, we design a two-stage framework for visualizing the interpretability of deep neural networks, called Activation Optimized with Perturbation (AOP), which optimizes the activation maps generated by general activation-based methods with the help of perturbation-based methods. To obtain better explanations for different types of images, we further present an instance of the AOP framework, Smooth Integrated Gradient-based Class Activation Map (SIGCAM), which introduces a weighted GradCAM that uses the feature maps as weight coefficients and employs I-GOS to optimize the base mask generated by the weighted GradCAM. Experimental results on commonly used benchmarks, including deletion and insertion tests on ImageNet-1k and pointing game tests on COCO2017, show that the proposed AOP and SIGCAM significantly outperform current state-of-the-art methods by generating higher-quality image-based saliency maps.
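
The deletion test named above can be illustrated with a minimal sketch. It assumes the common formulation in which pixels are removed in order of decreasing saliency while the class score is tracked, so that a lower area under the curve indicates a better saliency map. The function names (`deletion_curve`, `auc`), the zero baseline, and the toy `score_fn` are assumptions made for illustration and are not the evaluation code used in this work.

```python
import numpy as np

def deletion_curve(image, saliency, score_fn, steps=10, baseline=0.0):
    """Class scores recorded while the most salient pixels are replaced by `baseline`."""
    h, w = saliency.shape
    order = np.argsort(-saliency.ravel(), kind="stable")  # most salient pixel first
    perturbed = image.astype(float).copy()
    flat = perturbed.reshape(h * w, -1)  # view on `perturbed`: one row per pixel
    scores = [score_fn(perturbed)]
    chunk = int(np.ceil(h * w / steps))
    for start in range(0, h * w, chunk):
        flat[order[start:start + chunk]] = baseline  # delete the next batch of pixels
        scores.append(score_fn(perturbed))
    return np.array(scores)

def auc(scores):
    """Trapezoidal area under the curve with the x-axis normalised to [0, 1]."""
    return float(np.sum((scores[1:] + scores[:-1]) / 2) / (len(scores) - 1))

# Toy setup (hypothetical): the "object" is the top-left 2x2 patch, and the
# stand-in classifier score is the mean intensity of that patch.
image = np.zeros((4, 4))
image[:2, :2] = 1.0
score_fn = lambda img: float(img[:2, :2].mean())

good_saliency = image.copy()   # highlights the object
bad_saliency = 1.0 - image     # highlights the background

good_auc = auc(deletion_curve(image, good_saliency, score_fn, steps=4))
bad_auc = auc(deletion_curve(image, bad_saliency, score_fn, steps=4))
```

In this toy case the map that highlights the object yields the smaller deletion AUC, because the score collapses as soon as the object pixels are removed. An insertion test would run the same loop in reverse, starting from the baseline image and restoring the most salient pixels first, with a higher AUC being better.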