OctopusNet: A Deep Learning Segmentation Network for Multi-modal Medical Images

06/05/2019
by   Yu Chen, et al.

Deep learning models, such as the fully convolutional network (FCN), have been widely used in 3D biomedical segmentation and have achieved state-of-the-art performance. Multiple imaging modalities are often used together for disease diagnosis and quantification. Two approaches are widely used in the literature to fuse multiple modalities in segmentation networks: early fusion, which stacks the modalities as different input channels, and late fusion, which combines the segmentation results from the individual modalities at the very end. Both strategies are prone to cross-modal interference, because the input modalities can vary widely in appearance and intensity. To address this problem, we propose a novel deep learning architecture, OctopusNet, to better leverage and fuse the information contained in multiple modalities. The proposed framework employs a separate encoder for each modality for feature extraction and exploits a hyper-fusion decoder to fuse the extracted features while avoiding feature explosion. We evaluate OctopusNet on two publicly available datasets, ISLES-2018 and MRBrainS-2013. The experimental results show that our framework outperforms the commonly used feature-fusion approaches and yields state-of-the-art segmentation accuracy.
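The difference between early fusion and the per-modality-encoder design can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the array shapes, the `toy_encoder` function, and its per-modality weights are illustrative assumptions standing in for full convolutional encoders.

```python
import numpy as np

# Two hypothetical co-registered modalities (e.g. CT perfusion and MRI),
# each a small 3D volume of shape (depth, height, width). Shapes are
# illustrative only, not the datasets' actual dimensions.
rng = np.random.default_rng(0)
ct = rng.standard_normal((4, 8, 8))
mri = rng.standard_normal((4, 8, 8))

# Early fusion: stack modalities as input channels and feed them to ONE
# shared encoder, so both modalities see the same learned filters.
early_input = np.stack([ct, mri], axis=0)  # shape (2, 4, 8, 8)

def toy_encoder(x, weight):
    """Stand-in for a convolutional encoder: a single per-modality scaling.
    In OctopusNet each modality would get its own full encoder network."""
    return x * weight

# OctopusNet-style: a SEPARATE encoder per modality (separate parameters),
# with the extracted features fused afterwards at the decoder input.
feat_ct = toy_encoder(ct, weight=0.5)    # modality-specific parameters
feat_mri = toy_encoder(mri, weight=2.0)  # different parameters per modality
fused = np.concatenate([feat_ct[None], feat_mri[None]], axis=0)

print(early_input.shape)  # (2, 4, 8, 8)
print(fused.shape)        # (2, 4, 8, 8)
```

The key design point is that in the second scheme each modality passes through its own parameters before any mixing occurs, so a modality with a very different intensity distribution cannot interfere with the filters learned for the other one; the hyper-fusion decoder then merges the per-modality feature maps without simply concatenating every scale (which would cause feature explosion).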
