Next-Best-View Prediction for Active Stereo Cameras and Highly Reflective Objects
Depth acquisition with the active stereo camera is a challenging task for highly reflective objects. When setup permits, multi-view fusion can provide increased levels of depth completion. However, due to the slow acquisition speed of high-end active stereo cameras, collecting a large number of viewpoints for a single scene is generally not practical. In this work, we propose a next-best-view framework to strategically select camera viewpoints for completing depth data on reflective objects. In particular, we explicitly model the specular reflection of reflective surfaces based on the Phong reflection model and a photometric response function. Given the object CAD model and grayscale image, we employ an RGB-based pose estimator to obtain current pose predictions from the existing data, which is used to form predicted surface normal and depth hypotheses, and allows us to then assess the information gain from a subsequent frame for any candidate viewpoint. Using this formulation, we implement an active perception pipeline which is evaluated on a challenging real-world dataset. The evaluation results demonstrate that our active depth acquisition method outperforms two strong baselines for both depth completion and object pose estimation performance.
READ FULL TEXT