MonoScene: Monocular 3D Semantic Scene Completion

12/01/2021
by Anh-Quan Cao, et al.

MonoScene proposes a 3D Semantic Scene Completion (SSC) framework, where the dense geometry and semantics of a scene are inferred from a single monocular RGB image. Unlike the SSC literature, which relies on 2.5D or 3D input, we solve the complex problem of 2D-to-3D scene reconstruction while jointly inferring its semantics. Our framework relies on successive 2D and 3D UNets bridged by a novel 2D-3D feature projection inspired by optics, and introduces a 3D context relation prior to enforce spatio-semantic consistency. Along with architectural contributions, we introduce novel global scene and local frustum losses. Experiments show we outperform the literature on all metrics and datasets while hallucinating plausible scenery even beyond the camera field of view. Our code and trained models are available at https://github.com/cv-rits/MonoScene
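The abstract does not detail how the 2D-3D feature projection works. As a rough illustration of the general idea of bridging a 2D feature map and a 3D voxel grid, the sketch below projects voxel centers through a pinhole camera and copies the 2D feature found at each projected pixel. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `lift_features`, the nearest-pixel sampling, and the convention that voxel centers are already expressed in camera coordinates are all hypothetical choices made for this example.

```python
import numpy as np

def lift_features(feat2d, centers, K):
    """Assign each 3D point the 2D feature at its projected pixel.

    Hypothetical helper for illustration only.
    feat2d:  (H, W, C) image feature map.
    centers: (N, 3) voxel centers in camera coordinates (assumed convention).
    K:       (3, 3) pinhole intrinsics matrix.
    Returns (N, C) lifted features and an (N,) boolean visibility mask.
    """
    H, W, C = feat2d.shape
    uvw = centers @ K.T                      # homogeneous image coordinates
    depth = uvw[:, 2]
    in_front = depth > 0                     # points behind the camera are invalid
    safe = np.where(in_front, depth, 1.0)    # avoid dividing by zero or negative depth
    cols = np.round(uvw[:, 0] / safe).astype(int)
    rows = np.round(uvw[:, 1] / safe).astype(int)
    visible = in_front & (cols >= 0) & (cols < W) & (rows >= 0) & (rows < H)

    lifted = np.zeros((centers.shape[0], C), dtype=feat2d.dtype)
    lifted[visible] = feat2d[rows[visible], cols[visible]]  # nearest-pixel sampling
    return lifted, visible

# Toy usage: a 3x3 single-channel feature map and three points.
K = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])
feat2d = np.arange(9, dtype=np.float64).reshape(3, 3, 1)
centers = np.array([[0.0, 0.0, 2.0],    # projects to the central pixel
                    [0.0, 0.0, -1.0],   # behind the camera
                    [10.0, 0.0, 1.0]])  # outside the image
lifted, visible = lift_features(feat2d, centers, K)
```

Points that fall outside the image or behind the camera receive zero features here; a completion network would have to fill in those regions from context, which is one way to read the claim about hallucinating scenery beyond the field of view.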


research
04/08/2021

Semantic Scene Completion via Integrating Instances and Scene in-the-Loop

Semantic Scene Completion aims at reconstructing a complete 3D scene wit...
research
02/27/2023

OccDepth: A Depth-Aware Method for 3D Semantic Scene Completion

3D Semantic Scene Completion (SSC) can provide dense geometric and seman...
research
03/24/2023

FishDreamer: Towards Fisheye Semantic Completion via Unified Image Outpainting and Segmentation

This paper raises the new task of Fisheye Semantic Completion (FSC), whe...
research
03/12/2021

3D Semantic Scene Completion: a Survey

Semantic Scene Completion (SSC) aims to jointly estimate the complete ge...
research
03/24/2023

StereoScene: BEV-Assisted Stereo Matching Empowers 3D Semantic Scene Completion

3D semantic scene completion (SSC) is an ill-posed task that requires in...
research
08/18/2020

DeepLiDARFlow: A Deep Learning Architecture For Scene Flow Estimation Using Monocular Camera and Sparse LiDAR

Scene flow is the dense 3D reconstruction of motion and geometry of a sc...
research
02/23/2023

VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

Humans can easily imagine the complete 3D geometry of occluded objects a...
