MoNet3D: Towards Accurate Monocular 3D Object Localization in Real Time

Abstract: Monocular multi-object detection and localization in 3D space has proven to be a challenging task. MoNet3D is a novel and effective framework that predicts the 3D position of each object in a monocular image and draws a 3D bounding box around it. The method incorporates prior knowledge of the spatial geometric correlation between neighboring objects into the deep neural network training process to improve the accuracy of 3D object localization. Experiments on the KITTI data set show that the accuracy of predicting an object's depth and horizontal coordinate in 3D space reaches 96.25% and 94.74%, respectively. The method also runs in real time, processing images at 27.85 FPS. Our demo and code will be published on GitHub when the paper is accepted.

06/12/2021

structured representation, 3D representation, 3D Gaussians, image generation, image synthesis, image editing, controlled generation, GANs

2:49

14/06/2020

semi-supervised, instance segmentation, saliency, propagation, message passing, multiple instance learning, partial-supervised, generalization

1:01

14/06/2020

3d face dataset, face prediction, riggable model, 3d morphable model, dynamic details, deep neural network, displacement map

1:00

30/11/2020

autolabeling, differentiable rendering, pose and shape optimization, curriculum learning, object detection, autonomous driving, 3d shape modeling

4:59

14/06/2020

6d pose estimation, 3d instance segmentation, 3d semantic segmentation, 3d keypoint, 3d scene understanding, vision for robotics, 3d single view, rgbd, 3d computer vision

1:01

30/11/2020

Martin Rünz, Kejie Li, Meng Tang, Lingni Ma, Chen Kong, Tanner Schmidt, Ian Reid, Lourdes Agapito, Julian Straub, Steven Lovegrove, and Richard Newcombe

reconstruction, shape embedding, 3d vision, object detection, shape prior, object representation, monocular, sdf, pointcloud, inference

1:01

14/06/2020

3d object detection, edge-aware pointnet, instance segmentation, unsupervised clustering, cascaded modules, semantic segmentation, amodal bounding box detection

0:51

14/06/2020

Sparse Steerable Convolutions: An Efficient Learning of SE(3)-Equivariant Features for Estimation and Tracking of Object Poses in 3D Space

single image view synthesis, view synthesis, differentiable rendering, point cloud, convolutional neural networks, generative networks

4:58

06/12/2020

Canonical 3D Deformer Maps: Unifying parametric and non-parametric methods for dense weakly-supervised category reconstruction

Wanli Peng, Hao Pan, He Liu, Yi Sun

pose estimation, pose, neural rendering, zero-shot, shape learning, 3d reconstruction, datasets, generative models, multi-view, robotics

1:01

22/11/2021

Qian Chen, Ze Liu, Yi Zhang, Keren Fu, Qijun Zhao, and Hongwei Du

14:04