Abstract:
Current deep learning models for image classification are trained on mini-batches. We show that exploiting the similarity between samples within each mini-batch can significantly improve robustness to input perturbations, a consideration often neglected in the computer vision community. To this end, we dynamically construct a similarity graph over the mini-batch samples and aggregate information across it with an attention module. Beyond the added robustness, this approach also improves performance on diverse image-based object and scene classification tasks.
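
To make the mechanism concrete, the following is a minimal sketch of the general idea and not the implementation used in this work. It builds a similarity graph over the feature vectors of one mini-batch and lets each sample attend to its neighbours. Several choices are assumptions made purely for illustration, since the abstract does not specify them: cosine similarity as the edge score, a k-nearest-neighbour graph, a fixed softmax over similarities standing in for a learned attention module, and a residual mixing weight. The names `batch_graph_attention`, `k`, `temperature`, and `mix` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def batch_graph_attention(features, k=2, temperature=1.0, mix=0.5):
    """Aggregate each sample's features with those of its most similar
    samples in the same mini-batch.

    Illustrative assumptions: cosine similarity, a k-nearest-neighbour
    graph, and a fixed (non-learned) softmax attention over similarities.
    """
    n = len(features)
    out = []
    for i in range(n):
        # Score sample i against every other sample in the mini-batch.
        sims = [(cosine(features[i], features[j]), j) for j in range(n) if j != i]
        # Keep the k most similar samples as graph neighbours.
        neighbours = sorted(sims, reverse=True)[:k]
        if not neighbours:
            # A mini-batch of one sample has no neighbours to attend to.
            out.append(list(features[i]))
            continue
        # Numerically stable softmax over neighbour similarities.
        top = max(s for s, _ in neighbours)
        exps = [math.exp((s - top) / temperature) for s, _ in neighbours]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Attention-weighted average of the neighbours' features.
        dim = len(features[i])
        agg = [
            sum(w * features[j][d] for w, (_, j) in zip(weights, neighbours))
            for d in range(dim)
        ]
        # Residual combination of the sample's own and aggregated features.
        out.append([(1 - mix) * x + mix * a for x, a in zip(features[i], agg)])
    return out


# Toy mini-batch: two samples near each axis of a 2-D feature space.
batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
refined = batch_graph_attention(batch, k=1)
```

With `k=1`, each sample is averaged with its single nearest neighbour, so the first sample moves toward the second. In a trainable model the fixed softmax would presumably be replaced by learned query, key, and value projections, and the graph would be rebuilt for every mini-batch, which is what "dynamically" suggests.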