30/11/2020

Bidirectional Pyramid Networks for Semantic Segmentation

Dong Nie, Jia Xue, Xiaofeng Ren

Abstract: Semantic segmentation is a fundamental problem in computer vision that has attracted a lot of attention. Recent efforts have been devoted to network architecture innovations for efficient semantic segmentation that can run in real-time for autonomous driving and other applications. Information flow between scales is crucial because accurate segmentation needs both large context and fine detail. However, most existing approaches still rely on pretrained backbone models (e.g. ResNet on ImageNet). In this work, we propose to open up the backbone and design a simple yet effective multiscale network architecture, Bidirectional Pyramid Network (BPNet). BPNet takes the shape of a pyramid: information flows from bottom (high-resolution, small receptive field) to top (low-resolution, large receptive field), and from top to bottom, in a systematic manner, at every step of the processing. More importantly, fusion needs to be efficient; this is done through an add-and-multiply module with learned weights. We also apply a unary-pairwise attention mechanism to balance position sensitivity and context aggregation. Auxiliary loss is applied at multiple steps of the pyramid bottom. The resulting network achieves high accuracy with efficiency, without the need of pre-training. On the standard Cityscapes dataset, we achieve test mIoU 76.3 with 5.1M parameters and 36 fps (on Nvidia 2080 Ti), competitive with the state-of-the-art real-time models. Meanwhile, our design is general and can be used to build heavier networks: a ResNet-101 equivalent version of BPNet achieves mIoU 81.9 on Cityscapes, competitive with the best published results. We further demonstrate the flexibility of BPNet on a prostate MRI segmentation task, achieving the state of the art with a 45x speed-up.
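
The abstract does not give the exact form of the add-and-multiply fusion module, so the following is only a minimal sketch of one plausible reading: a low-resolution feature is upsampled and combined with a high-resolution feature through a weighted sum plus a weighted element-wise product. The function names (`upsample_nearest`, `add_and_multiply`), the 1-D toy features, the fixed scalar weights `w_add` and `w_mul`, and the formula itself are assumptions made for illustration; in a real network the weights would be trainable parameters and the features would be multi-channel maps.

```python
from typing import List

Vector = List[float]


def upsample_nearest(x: Vector, factor: int = 2) -> Vector:
    """Repeat each element `factor` times (1-D nearest-neighbour upsampling)."""
    return [v for v in x for _ in range(factor)]


def add_and_multiply(fine: Vector, coarse: Vector,
                     w_add: float, w_mul: float) -> Vector:
    """Fuse two same-length feature vectors.

    Assumed form (hypothetical, not taken from the paper):
        out = w_add * (fine + coarse) + w_mul * (fine * coarse)
    """
    if len(fine) != len(coarse):
        raise ValueError("features must have the same length before fusion")
    return [w_add * (f + c) + w_mul * (f * c) for f, c in zip(fine, coarse)]


# Toy top-down step: the coarse level is upsampled to the fine level's
# resolution, then fused into it.
fine = [1.0, 2.0, 3.0, 4.0]   # high resolution, small receptive field
coarse = [0.5, 2.0]           # low resolution, large receptive field
fused = add_and_multiply(fine, upsample_nearest(coarse), w_add=1.0, w_mul=0.5)
print(fused)  # [1.75, 3.0, 8.0, 10.0]
```

The additive term passes both inputs through unchanged, while the multiplicative term lets one scale act as a gate on the other; which of the two dominates is controlled by the weights.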

The video of this talk cannot be embedded. You can watch it here:
https://accv2020.github.io/miniconf/poster_817.html
The talk and the respective paper are published at the ACCV 2020 virtual conference.
