19/08/2021

Text-based Person Search via Multi-Granularity Embedding Learning

Chengji Wang, Zhiming Luo, Yaojin Lin, Shaozi Li

Keywords: Computer Vision, Language and Vision, Recognition

Abstract: Most existing text-based person search methods highly depend on exploring the corresponding relations between the regions of the image and the words in the sentence. However, these methods correlated image regions and words in the same semantic granularity. It 1) results in irrelevant corresponding relations between image and text, 2) causes an ambiguity embedding problem. In this study, we propose a novel multi-granularity embedding learning model for text-based person search. It generates multi-granularity embeddings of partial person bodies in a coarse-to-fine manner by revisiting the person image at different spatial scales. Specifically, we distill the partial knowledge from image scrips to guide the model to select the semantically relevant words from the text description. It can learn discriminative and modality-invariant visual-textual embeddings. In addition, we integrate the partial embeddings at each granularity and perform multi-granularity image-text matching. Extensive experiments validate the effectiveness of our method, which can achieve new state-of-the-art performance by the learned discriminative partial embeddings.

 0
 0
 0
 0
This is an embedded video. Talk and the respective paper are published at IJCAI 2021 virtual conference. If you are one of the authors of the paper and want to manage your upload, see the question "My papertalk has been externally embedded..." in the FAQ section.

Comments

Post Comment
no comments yet
code of conduct: tbd Characters remaining: 140

Similar Papers