Single Multi-feature Detector for Amodal 3D Object Detection in RGB-D Images
This paper targets fast, high-accuracy amodal 3D object detection in RGB-D images, which requires a compact 3D bounding box around the whole object even when the object is only partially observed. To avoid time-consuming proposal pre-extraction, we propose a single end-to-end deep neural network framework that hierarchically incorporates appearance and geometric features, from the 2.5D representation up to 3D objects. Depth information reduces the output space of 3D bounding boxes to a manageable set of 3D anchor boxes of different sizes on multiple feature layers. At prediction time, the network convolutionally predicts category scores and adjustments to the location, size, and orientation of each 3D anchor box, taking multi-scale 2D features into account. Experiments on the challenging SUN RGB-D dataset show that our algorithm outperforms the state of the art by 10.2 in mAP and is 88x faster than Deep Sliding Shapes. In addition, experiments suggest that our algorithm, even with a smaller input image size, performs comparably to the state of the art on the NYUV2 dataset while being 454x faster.
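To make the anchor-box idea concrete, the following is a minimal sketch of how predicted adjustments might be applied to one 3D anchor box. The abstract does not specify the regression parameterization, so everything here is an assumption for illustration: the `Box3D` type, the `decode_anchor` function, the SSD-style encoding (center offsets scaled by the anchor size, log-scale size offsets), and the additive yaw offset are all hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    """Hypothetical 3D box: center, size, and rotation about the vertical axis."""
    cx: float
    cy: float
    cz: float
    w: float
    h: float
    l: float
    yaw: float  # radians


def decode_anchor(anchor, deltas):
    """Apply predicted adjustments to a 3D anchor box.

    Assumed encoding (not from the paper): center offsets are expressed
    in units of the anchor size, size offsets are log-scale factors,
    and the orientation offset is added to the anchor yaw.
    """
    dx, dy, dz, dw, dh, dl, dyaw = deltas
    # Wrap the adjusted angle into [-pi, pi).
    yaw = (anchor.yaw + dyaw + math.pi) % (2.0 * math.pi) - math.pi
    return Box3D(
        cx=anchor.cx + dx * anchor.w,
        cy=anchor.cy + dy * anchor.h,
        cz=anchor.cz + dz * anchor.l,
        w=anchor.w * math.exp(dw),
        h=anchor.h * math.exp(dh),
        l=anchor.l * math.exp(dl),
        yaw=yaw,
    )


# Usage: an anchor 3 units in front of the camera, shifted and widened.
anchor = Box3D(cx=1.0, cy=0.5, cz=3.0, w=2.0, h=1.0, l=4.0, yaw=0.0)
refined = decode_anchor(anchor, (0.5, 0.0, 0.0, math.log(2.0), 0.0, 0.0, 0.1))
```

Under this assumed encoding, all-zero adjustments leave the anchor unchanged, so the network only has to learn small corrections around each anchor rather than regress absolute 3D boxes.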