Local Grid Rendering Networks for 3D Object Detection in Point Clouds
The performance of 3D object detection models over point clouds highly depends on their capability of modeling local geometric patterns. Conventional point-based models exploit local patterns through a symmetric function (e.g. max pooling) or based on graphs, which easily leads to loss of fine-grained geometric structures. Regarding capturing spatial patterns, CNNs are powerful but it would be computationally costly to directly apply convolutions on point data after voxelizing the entire point clouds to a dense regular 3D grid. In this work, we aim to improve performance of point-based models by enhancing their pattern learning ability through leveraging CNNs while preserving computational efficiency. We propose a novel and principled Local Grid Rendering (LGR) operation to render the small neighborhood of a subset of input points into a low-resolution 3D grid independently, which allows small-size CNNs to accurately model local patterns and avoids convolutions over a dense grid to save computation cost. With the LGR operation, we introduce a new generic backbone called LGR-Net for point cloud feature extraction with simple design and high efficiency. We validate LGR-Net for 3D object detection on the challenging ScanNet and SUN RGB-D datasets. It advances state-of-the-art results significantly by 5.5 and 4.5 mAP, respectively, with only slight increased computation overhead.
READ FULL TEXT