Attention-based Multi-modal Fusion Network for Semantic Scene Completion

03/31/2020
by Siqi Li, et al.

This paper presents an end-to-end 3D convolutional network, the attention-based multi-modal fusion network (AMFNet), for the semantic scene completion (SSC) task of inferring the occupancy and semantic labels of a volumetric 3D scene from single-view RGB-D images. Compared with previous methods, which use only the semantic features extracted from RGB-D images, the proposed AMFNet learns to perform effective 3D scene completion and semantic segmentation simultaneously by leveraging the experience of inferring 2D semantic segmentation from RGB-D images as well as the reliable depth cues in the spatial dimension. This is achieved by employing a multi-modal fusion architecture boosted from 2D semantic segmentation and a 3D semantic completion network empowered by residual attention blocks. We validate our method on both the synthetic SUNCG-RGBD dataset and the real NYUv2 dataset, and the results show that our method achieves gains of 2.5 [...] over the state-of-the-art method on the synthetic SUNCG-RGBD dataset and the real NYUv2 dataset.
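
The abstract mentions residual attention blocks without giving their exact form. As a rough illustration only, the sketch below uses the common residual attention formulation, out = (1 + sigmoid(mask)) * feature, applied per voxel. The function names, the flat-list voxel layout, and the choice of this particular formulation are assumptions made for the example, not details confirmed by the paper.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def residual_attention(features, mask_logits):
    """Re-weight voxel features with a residual attention mask.

    Hypothetical layout: both arguments are flat lists holding one value
    per voxel. Each output is feature * (1 + sigmoid(mask_logit)), so a
    strongly negative mask leaves the feature nearly unchanged instead of
    suppressing it to zero.
    """
    if len(features) != len(mask_logits):
        raise ValueError("features and mask_logits must have the same length")
    return [f * (1.0 + sigmoid(m)) for f, m in zip(features, mask_logits)]


if __name__ == "__main__":
    # Two voxels: a neutral mask logit (0.0) and a strongly negative one.
    print(residual_attention([2.0, 4.0], [0.0, -50.0]))
```

The residual term (the "1 +") is what distinguishes this from plain gating: the attention branch can only amplify features, so stacking such blocks does not progressively attenuate the signal.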


