A Unified BEV Model for Joint Learning of 3D Local Features and Overlap Estimation
Pairwise point cloud registration is a critical task for many applications and heavily depends on finding correct correspondences between the two point clouds. However, low overlap between the input point clouds makes registration prone to failure, producing spurious overlap estimates and mismatched correspondences, especially in scenes where non-overlapping regions contain similar structures. In this paper, we present a unified bird's-eye view (BEV) model for joint learning of 3D local features and overlap estimation to support both pairwise registration and loop closure. Feature description is performed on the BEV representation by a sparse UNet-like network, and 3D keypoints are extracted by a detection head for 2D locations and a regression head for heights. For overlap detection, a cross-attention module exchanges contextual information between the input point clouds, followed by a classification head that estimates the overlapping region. We evaluate our unified model extensively on the KITTI and Apollo-SouthBay datasets. The experiments demonstrate that our method significantly outperforms existing methods on overlap prediction, especially in scenes with small overlap. Registration precision also achieves top performance on both datasets in terms of translation and rotation errors. Source code will be released soon.
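The abstract's keypoint parameterization (a 2D BEV location from a detection head plus a regressed height) rests on a BEV discretization of the raw point cloud. The following is a minimal, hypothetical sketch of that discretization step; all names, grid parameters, and the mean-height stand-in for the regression head are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of BEV discretization: project 3D points onto a 2D
# grid, keeping the per-cell heights, so a keypoint can later be expressed
# as a 2D cell location plus a height. Grid parameters are assumptions.

def points_to_bev(points, cell_size=0.5, grid_extent=10.0):
    """Map (x, y, z) points into a BEV grid keyed by (row, col).

    Each occupied cell stores the list of point heights falling into it;
    a real model would rasterize these into feature channels for the
    sparse UNet-like backbone.
    """
    n_cells = int(2 * grid_extent / cell_size)
    grid = {}
    for x, y, z in points:
        if abs(x) >= grid_extent or abs(y) >= grid_extent:
            continue  # drop points outside the BEV extent
        row = int((x + grid_extent) / cell_size)
        col = int((y + grid_extent) / cell_size)
        grid.setdefault((row, col), []).append(z)
    return n_cells, grid

def cell_keypoint(grid, cell):
    """Illustrative 3D keypoint: detected BEV cell + a height.

    Here the height is simply the mean of the cell's point heights,
    standing in for the paper's learned regression head.
    """
    heights = grid[cell]
    return (*cell, sum(heights) / len(heights))
```

This separation mirrors the abstract's design choice: the expensive feature learning runs on the dense 2D BEV plane, while the third dimension is recovered cheaply per keypoint.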