19/08/2021

Local Representation is Not Enough: Soft Point-Wise Transformer for Descriptor and Detector of Local Features

Zihao Wang, Xueyi Li, Zhen Li

Keywords: Computer Vision, 2D and 3D Computer Vision, Recognition

Abstract: Significant progress has been made on descriptors and detectors of local features, but several challenging limitations remain, such as insufficient localization accuracy and non-discriminative description, especially in regions with repetitive or blank texture, which have not been well addressed. Coarse feature representation and a limited receptive field are considered the main causes of these limitations. To address them, we propose a novel Soft Point-Wise Transformer for Descriptor and Detector, which simultaneously mines long-range intrinsic and cross-scale dependencies of local features. Furthermore, our model uses distinct transformers based on soft point-wise attention, substantially reducing memory and computational complexity, especially for high-resolution feature maps. In addition, a multi-level decoder is constructed to ensure high detection accuracy and discriminative description. Extensive experiments demonstrate that our model outperforms existing state-of-the-art methods on image matching and visual localization benchmarks.

The talk and the respective paper were published at the IJCAI 2021 virtual conference.

