SPARC: Sparse Render-and-Compare for CAD model alignment in a single RGB image

10/03/2022
by   Florian Langer, et al.

Estimating 3D shapes and poses of static objects from a single image has important applications for robotics, augmented reality and digital content creation. Often this is done through direct mesh predictions, which produce unrealistic, overly tessellated shapes, or by formulating shape prediction as a retrieval task followed by CAD model alignment. Directly predicting CAD model poses from 2D image features is difficult and inaccurate. Some works, such as ROCA, regress normalised object coordinates and use those for computing poses. While this can produce more accurate pose estimates, predicting normalised object coordinates is susceptible to systematic failure. Leveraging efficient transformer architectures, we demonstrate that a sparse, iterative, render-and-compare approach is more accurate and robust than relying on normalised object coordinates. For this we combine 2D image information, including sparse depth and surface normal values which we estimate directly from the image, with 3D CAD model information in early fusion. In particular, we reproject points sampled from the CAD model in an initial, random pose and compute their depth and surface normal values. This combined information is the input to a pose prediction network, SPARC-Net, which we train to predict a 9 DoF CAD model pose update. The CAD model is reprojected again and the next pose update is predicted. Our alignment procedure converges after just 3 iterations, improving the state-of-the-art performance on the challenging real-world dataset ScanNet from 25.0 released at https://github.com/florianlanger/SPARC .
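
The iterative reproject-and-update loop described above can be illustrated with a minimal sketch. This is not the released SPARC code: the names `Pose9DoF`, `project`, `refine` and `predict_update` are hypothetical, the pinhole camera model and the way an update is composed with the current pose (left-multiplied rotation, additive translation, multiplicative scale) are assumptions, and surface normals are omitted for brevity. `predict_update` stands in for a trained network such as SPARC-Net.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Pose9DoF:
    """Hypothetical 9 DoF pose: 3 rotation, 3 translation, 3 scale parameters."""
    R: np.ndarray  # 3x3 rotation matrix
    t: np.ndarray  # translation, shape (3,)
    s: np.ndarray  # per-axis scale, shape (3,)


def project(points, pose, K):
    """Move CAD points into the camera frame and project them (pinhole model assumed).

    Returns pixel coordinates of shape (N, 2) and per-point depth of shape (N,).
    """
    cam = (points * pose.s) @ pose.R.T + pose.t  # scale, rotate, translate
    depth = cam[:, 2]
    uv = (cam @ K.T)[:, :2] / depth[:, None]     # perspective division
    return uv, depth


def refine(points, pose, predict_update, K, n_iters=3):
    """Repeat: reproject sampled points, predict a pose update, apply it.

    predict_update(uv, depth) -> (dR, dt, ds) is a placeholder for a learned
    pose-update network; the composition rule below is an assumption.
    """
    for _ in range(n_iters):
        uv, depth = project(points, pose, K)
        dR, dt, ds = predict_update(uv, depth)
        pose = Pose9DoF(R=dR @ pose.R, t=pose.t + dt, s=pose.s * ds)
    return pose
```

In a real system `predict_update` would also receive the image-derived depth and surface normal values at the reprojected pixel locations, so that the network can compare the CAD model's rendering against the image evidence.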


