RFBNet: Deep Multimodal Networks with Residual Fusion Blocks for RGB-D Semantic Segmentation

06/29/2019
by   Liuyuan Deng, et al.

Signals from RGB and depth data carry complementary information about the scene. Conventional RGB-D semantic segmentation methods adopt a two-stream fusion structure, which uses two modality-specific encoders to extract features from the RGB and depth data. These methods have no explicit mechanism to model the interdependencies between the encoders. This letter proposes a novel bottom-up interactive fusion structure, which introduces an interaction stream to bridge the modality-specific encoders. The interaction stream progressively aggregates modality-specific features from the encoders and computes complementary features for the encoders. To instantiate this structure, the letter proposes a residual fusion block (RFB) to formulate the interdependencies of the encoders. The RFB consists of two residual units and one fusion unit with a gate mechanism. It learns complementary features for the modality-specific encoders and extracts modality-specific features as well as cross-modal features. Based on the RFB, the letter presents deep multimodal networks for RGB-D semantic segmentation, called RFBNet. Experiments conducted on two datasets demonstrate the effectiveness of modeling the interdependencies and show that RFBNet outperforms state-of-the-art methods.
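To make the data flow described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes features are plain lists of floats and replaces every convolution with a single scalar weight. The function names, the parameter dictionary keys, and the exact gating formula are all hypothetical; the abstract only states that an RFB has two residual units and one gated fusion unit, and that an interaction stream aggregates features bottom-up and returns complementary features to both encoders.

```python
import math


def sigmoid(x):
    """Logistic function used for the (assumed) gates."""
    return 1.0 / (1.0 + math.exp(-x))


def residual_unit(x, weight):
    """Toy residual unit: y = x + relu(weight * x), element-wise.

    A scalar weight stands in for the convolutions a real unit would use.
    """
    return [xi + max(0.0, weight * xi) for xi in x]


def residual_fusion_block(rgb, depth, interaction, params):
    """One hypothetical RFB step over same-length feature vectors.

    Returns updated RGB features, updated depth features, and the new
    interaction-stream features.
    """
    # Two residual units, one per modality.
    r = residual_unit(rgb, params["w_rgb"])
    d = residual_unit(depth, params["w_depth"])

    # Gated fusion unit (assumed form): each modality is weighted by its
    # own sigmoid gate and added to the incoming interaction features.
    g_r = [sigmoid(params["g_rgb"] * v) for v in r]
    g_d = [sigmoid(params["g_depth"] * v) for v in d]
    fused = [
        i + gr * rv + gd * dv
        for i, gr, rv, gd, dv in zip(interaction, g_r, r, g_d, d)
    ]

    # Complementary features are fed back to each encoder residually.
    rgb_out = [x + params["c_rgb"] * f for x, f in zip(rgb, fused)]
    depth_out = [x + params["c_depth"] * f for x, f in zip(depth, fused)]
    return rgb_out, depth_out, fused


def run_bottom_up(rgb, depth, stage_params):
    """Chain RFBs so the interaction stream aggregates stage by stage."""
    interaction = [0.0] * len(rgb)
    for params in stage_params:
        rgb, depth, interaction = residual_fusion_block(
            rgb, depth, interaction, params
        )
    return rgb, depth, interaction


if __name__ == "__main__":
    stage = {"w_rgb": 1.0, "w_depth": 1.0, "g_rgb": 0.0,
             "g_depth": 0.0, "c_rgb": 0.1, "c_depth": 0.1}
    print(run_bottom_up([1.0, -2.0], [0.5, 0.0], [stage, stage]))
```

In this sketch, setting `c_rgb` and `c_depth` to zero reduces the structure to two independent encoders with a passive fusion branch, which is one way to see what the feedback from the interaction stream adds.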


research
03/09/2022

CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers

The performance of semantic segmentation of RGB images can be advanced b...
research
06/17/2023

Residual Spatial Fusion Network for RGB-Thermal Semantic Segmentation

Semantic segmentation plays an important role in widespread applications...
research
07/17/2023

Variational Probabilistic Fusion Network for RGB-T Semantic Segmentation

RGB-T semantic segmentation has been widely adopted to handle hard scene...
research
08/03/2016

Learning Common and Specific Features for RGB-D Semantic Segmentation with Deconvolutional Networks

In this paper, we tackle the problem of RGB-D semantic segmentation of i...
research
10/06/2022

Robust Double-Encoder Network for RGB-D Panoptic Segmentation

Perception is crucial for robots that act in real-world environments, as...
research
03/02/2023

Delivering Arbitrary-Modal Semantic Segmentation

Multimodal fusion can make semantic segmentation more robust. However, f...
research
08/11/2018

Self-Supervised Model Adaptation for Multimodal Semantic Segmentation

Learning to reliably perceive and understand the scene is an integral en...
