MVT: Multi-view Vision Transformer for 3D Object Recognition

22/11/2021

Shuo Chen, Tan Yu, Ping Li

Keywords: 3D object recognition, Transformer-based methods

Abstract: Inspired by the great success of CNNs in image recognition, view-based methods apply CNNs to the projected views of a 3D object and achieve excellent performance. Nevertheless, multi-view CNN models cannot model the communication between patches from different views, which limits their effectiveness in 3D object recognition. Inspired by the recent success of the vision Transformer in image recognition, we propose a Multi-view Vision Transformer (MVT) for 3D object recognition. Since each patch feature in a Transformer block has a global receptive field, it naturally achieves communication between patches from different views. Meanwhile, it carries much less inductive bias than its CNN counterparts. Considering both effectiveness and efficiency, we develop a global-local structure for our MVT. Our experiments on two public benchmarks, ModelNet40 and ModelNet10, demonstrate the competitive performance of our MVT.
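
The cross-view communication described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes single-head self-attention with identity query/key/value projections, random patch features, and hypothetical sizes (3 views, 4 patches per view, 8-dimensional features), and it does not model the global-local structure. It only shows that once patches from all views are placed in one token sequence, each patch attends to patches from every other view.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections (a simplification)."""
    dim = len(tokens[0])
    scale = 1.0 / math.sqrt(dim)
    weights = []
    for q in tokens:
        # Scaled dot product of this query against every token in the sequence.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in tokens]
        weights.append(softmax(scores))
    # Each output is the attention-weighted average of all token features.
    outputs = [
        [sum(w * v[d] for w, v in zip(row, tokens)) for d in range(dim)]
        for row in weights
    ]
    return outputs, weights


# Hypothetical sizes, chosen only for illustration.
NUM_VIEWS, PATCHES_PER_VIEW, DIM = 3, 4, 8
rng = random.Random(0)
views = [
    [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(PATCHES_PER_VIEW)]
    for _ in range(NUM_VIEWS)
]

# Flatten all views into one token sequence so attention spans every view.
tokens = [patch for view in views for patch in view]
outputs, weights = self_attention(tokens)

# Attention mass that the first patch of view 0 places on patches of other views.
cross_view_mass = sum(weights[0][PATCHES_PER_VIEW:])
print(f"cross-view attention mass for token 0: {cross_view_mass:.3f}")
```

A per-view CNN would process each entry of `views` separately, so the cross-view mass has no counterpart there until features are pooled across views.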

The talk and the corresponding paper were published at the BMVC 2021 virtual conference.

Similar Papers

14/06/2020

Gate-Shift Networks for Video Action Recognition

Swathikiran Sudhakaran, Sergio Escalera, Oswald Lanz

Keywords: action recognition, video representation learning, spatio-temporal interactions, video classification

1:00

07/09/2020

A Simple and Scalable Shape Representation for 3D Reconstruction

Mateusz Michalkiewicz, Eugene Belilovsky, Mahsa Baktashmotlagh, Anders Eriksson

Keywords: shape from x, 3d reconstruction from a single image, implicit shape representation, deep level sets

9:57

14/06/2020

Deep Grouping Model for Unified Perceptual Parsing

Zhiheng Li, Wenxuan Bao, Jiayang Zheng, Chenliang Xu

Keywords: perceptual grouping, hierarchical graph, bottom-up segmentation, interpretability, unified perceptual parsing, graph neural network, interactive-segmentation, weakly-supervised segmentation

0:57

22/11/2021

Rethinking Token-Mixing MLP for MLP-based Vision Backbone

Tan Yu, Xu Li, Yunfeng Cai, Mingming Sun, Ping Li

Keywords: vision backbone, MLP, image recognition

1:59

03/05/2021

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby

Keywords: computer vision, image recognition, large-scale training, self-attention, transformer

15:27

30/11/2020

Depth-Adapted CNN for RGB-D cameras

Zongwei Wu, Guillaume Allibert, Christophe Stolz, Cedric Demonceaux

9:43

05/01/2021

Hierarchical Generative Adversarial Networks for Single Image Super-Resolution

Weimin Chen, Yuqing Ma, Xianglong Liu, Yi Yuan

4:46

14/06/2020

Deep Facial Non-Rigid Multi-View Stereo

Ziqian Bai, Zhaopeng Cui, Jamal Ahmed Rahim, Xiaoming Liu, Ping Tan

Keywords: multi-view 3d, non-rigid reconstruction, face reconstruction, deep learning

1:01

02/02/2021

Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection

Jiajun Deng, Shaoshuai Shi, Peiwei Li, Wengang Zhou, Yanyong Zhang, Houqiang Li

16:42

12/07/2020

Informative Dropout for Robust Representation Learning: A Shape-bias Perspective

Baifeng Shi, Dinghuai Zhang, Qi Dai, Jingdong Wang, Zhanxing Zhu, Yadong Mu

Keywords: Accountability, Transparency and Interpretability

14:58

06/12/2021

MLP-Mixer: An all-MLP Architecture for Vision

Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy

Keywords: deep learning, machine learning, transformers, vision, transfer learning

11:18

05/01/2021

Phase-Wise Parameter Aggregation for Improving SGD Optimization

Takumi Kobayashi

4:36

22/11/2021

Two Heads are Better than One: Geometric-Latent Attention for Point Cloud Classification and Segmentation

Hanz Cuevas, Antonio-Javier Gallego, Robert B Fisher

Keywords: point cloud, point cloud segmentation, 3D, point cloud classification, shape classification, deep 3D segmentation, deep learning, scene understanding, 3D scene understanding, light network, local aggregation, gnn

2:57

05/01/2021

SWAG: Superpixels Weighted by Average Gradients for Explanations of CNNs

Thomas Hartley, Kirill Sidorov, Christopher Willis, David Marshall

4:58

06/12/2021

Efficient Training of Visual Transformers with Small Datasets

Yahui Liu, Enver Sangineto, Wei Bi, Nicu Sebe, Bruno Lepri, Marco Nadai

Keywords: robustness, transformers, vision

8:23

14/06/2020

Improving Convolutional Networks With Self-Calibrated Convolutions

Jiang-Jiang Liu, Qibin Hou, Ming-Ming Cheng, Changhu Wang, Jiashi Feng

Keywords: self-calibrated, feature transformation, image classification, network architecture, convolutional neural networks

1:00

22/11/2021

Median Pixel Difference Convolutional Network for Robust Face Recognition

Jiehua Zhang, Zhuo Su, Li Liu

Keywords: face recognition, noise robustness, efficient CNN

3:03

20/07/2020

Butterfly-Net2: Simplified Butterfly-Net and Fourier Transform Initialization

Zhongshu Xu, Yingzhou Li, Xiuyuan Cheng

13:42

03/05/2021

Attentional Constellation Nets for Few-Shot Learning

Weijian Xu, Yifan Xu, Huaijin Wang, Zhuowen Tu

Keywords: few-shot learning, constellation models

5:10

02/02/2021

Learning Comprehensive Motion Representation for Action Recognition

Mingyu Wu, Boyuan Jiang, Donghao Luo, Junchi Yan, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Xiaokang Yang

15:15

02/02/2021

Explicitly Modeled Attention Maps for Image Classification

Andong Tan, Duc Tam Nguyen, Maximilian Dax, Matthias Nießner, Thomas Brox

16:59

14/06/2020

Erasing Integrated Learning: A Simple Yet Effective Approach for Weakly Supervised Object Localization

Jinjie Mai, Meng Yang, Wenfeng Luo

Keywords: weakly supervised, object localization, adversarial erasing

5:00

30/11/2020

Frequency Attention Network: Blind Noise Removal for Real Images

Hongcheng Mo, Jianfei Jiang, Qin Wang, Dong Yin, Pengyu Dong, Jingjun Tian

6:47

02/02/2021

MVFNet: Multi-View Fusion Network for Efficient Video Recognition

Wenhao Wu, Dongliang He, Tianwei Lin, Fu Li, Chuang Gan, Errui Ding

14:02

14/06/2020

On Translation Invariance in CNNs: Convolutional Layers Can Exploit Absolute Spatial Location

Osman Semih Kayhan, Jan C. van Gemert

Keywords: inductive prior, equivariance, translation invariance, shift invariance, data efficiency, convolution, boundary effects, padding

0:59

06/12/2021

Robust Contrastive Learning Using Negative Samples with Diminished Semantics

Songwei Ge, Shlok Mishra, Chun-Liang Li, Haohan Wang, David Jacobs

Keywords: robustness, self-supervised learning, contrastive learning

10:00

14/06/2020

Cascaded Deep Video Deblurring Using Temporal Sharpness Prior

Jinshan Pan, Haoran Bai, Jinhui Tang

Keywords: video deblurring, deep convolutional neural network, motion blur estimation, optical flow, temporal sharpness prior, image restoration

0:53

14/06/2020

Adaptive Fractional Dilated Convolution Network for Image Aesthetics Assessment

Qiuyu Chen, Wei Zhang, Ning Zhou, Peng Lei, Yi Xu, Yu Zheng, Jianping Fan

Keywords: image aesthetics assessment, kernel embedding, adaptive convolution, parameter-free, aspect ratio

1:01

07/09/2020

ViewSynth: Learning Local Features from Depth using View Synthesis

Jisan Mahmud, Rajat Vikram Singh, Peri Akiva, Spondon Kundu, Kuan-Chuan Peng, Jan-Michael Frahm

Keywords: viewpoint invariant representation learning, depth representation learning, view synthesis, correspondence learning

10:00

06/12/2020

Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification

Yulin Wang, Kangchen Lv, Rui Huang, Shiji Song, Le Yang, Gao Huang

3:23

07/09/2020

Making a Case for 3D Convolutions for Object Segmentation in Videos

Sabarinath Mahadevan, Ali Athar, Aljosa Osep, Laura Leal-Taixé, Bastian Leibe, Sebastian Hennen

Keywords: object tracking, video segmentation, video object segmentation, video scene understanding, object segmentation

8:16

06/12/2020

DVERGE: Diversifying Vulnerabilities for Enhanced Robust Generation of Ensembles

Huanrui Yang, Jingyang Zhang, Hongliang Dong, Nathan Inkawhich, Andrew Gardner, Andrew Touchet, Wesley Wilkes, Heath Berry, Helen Li

3:25

03/05/2021

Group Equivariant Generative Adversarial Networks

Neel Dey, Antong Chen, Soheil Ghafurian

Keywords: Geometric Deep Learning, Generative Adversarial Networks, Group Equivariance

5:06

14/06/2020

ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

Qilong Wang, Banggu Wu, Pengfei Zhu, Peihua Li, Wangmeng Zuo, Qinghua Hu

Keywords: channel attention, efficient, adaptive 1d convolution, deep cnns, image classification, object detection, instance segmentation

0:57

06/12/2021

Adaptive Denoising via GainTuning

Sreyas Mohan, Joshua L Vincent, Ramon Manzorro, Peter Crozier, Carlos Fernandez-Granda, Eero P Simoncelli

Keywords: deep learning

15:08

14/06/2020

Rethinking Depthwise Separable Convolutions: How Intra-Kernel Correlations Lead to Improved MobileNets

Daniel Haase, Manuel Amthor

Keywords: cnns, depthwise separable convolutions, mobilenet, efficient neural networks, efficientnet, resnet, imagenet, cifar, fine-grained, orthonormal loss

1:01

06/12/2021

Container: Context Aggregation Networks

Peng Gao, Jiasen Lu, Hongsheng Li, Roozbeh Mottaghi, Aniruddha Kembhavi

Keywords: deep learning, self-supervised learning, transformers, vision, language

8:50

05/01/2021

Learning of Low-Level Feature Keypoints for Accurate and Robust Detection

Suwichaya Suwanwimolkul, Satoshi Komorita, Kazuyuki Tasaka

5:01

05/01/2021

Spatial Context-Aware Self-Attention Model for Multi-Organ Segmentation

Hao Tang, Xingwei Liu, Kun Han, Xiaohui Xie, Xuming Chen, Huang Qian, Yong Liu, Shanlin Sun, Narisu Bai

5:01

12/07/2020

Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data

Marc Finzi, Samuel Stanton, Pavel Izmailov, Andrew Wilson

Keywords: Deep Learning - General

14:05