
Deep Multimodal Fusion by Channel Exchanging

11/10/2020
by Yikai Wang, et al.

Deep multimodal fusion, which uses multiple sources of data for classification or regression, has shown a clear advantage over its unimodal counterpart in various applications. Yet current methods, including aggregation-based and alignment-based fusion, still fail to balance the trade-off between inter-modal fusion and intra-modal processing, which creates a bottleneck for further performance improvement. To this end, this paper proposes the Channel-Exchanging-Network (CEN), a parameter-free multimodal fusion framework that dynamically exchanges channels between sub-networks of different modalities. Specifically, the channel exchanging process is self-guided by individual channel importance, measured by the magnitude of the Batch-Normalization (BN) scaling factor during training. The validity of this exchanging process is also guaranteed by sharing convolutional filters while keeping separate BN layers across modalities, which, as an added benefit, allows our multimodal architecture to be almost as compact as a unimodal network. Extensive experiments on semantic segmentation via RGB-D data and image translation through multi-domain input verify the effectiveness of CEN compared to current state-of-the-art methods. Detailed ablation studies have also been carried out, which affirm the advantage of each component we propose. Our code is available at https://github.com/yikaiw/CEN.
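The exchange step described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository for that); it only shows the idea of replacing channels whose BN scaling factor has a small magnitude. The function name `exchange_channels`, the `threshold` value, and the choice to fill a weak channel with the mean of the same channel from the other modalities are all assumptions made for illustration.

```python
import numpy as np


def exchange_channels(features, gammas, threshold=0.02):
    """Illustrative channel exchange between modalities (hypothetical helper).

    features:  list of M arrays shaped (C, H, W), one per modality,
               taken after each modality's own BN layer.
    gammas:    list of M arrays shaped (C,), the BN scaling factors
               of each modality's BN layer.
    threshold: assumed cut-off; channels with |gamma| below it are
               treated as unimportant and replaced.
    """
    num_modalities = len(features)
    if num_modalities < 2:
        raise ValueError("channel exchange needs at least two modalities")

    stacked = np.stack(features)  # shape (M, C, H, W)
    exchanged = []
    for m in range(num_modalities):
        # Mean of the same channels taken from every other modality.
        others_mean = np.delete(stacked, m, axis=0).mean(axis=0)
        # Per-channel mask: True where this modality's channel is weak.
        weak = np.abs(np.asarray(gammas[m])) < threshold
        # Broadcast the (C,) mask over the spatial dimensions.
        exchanged.append(np.where(weak[:, None, None], others_mean, stacked[m]))
    return exchanged
```

With two modalities (for example RGB and depth), a weak RGB channel is simply overwritten by the depth channel at the same index, and vice versa. No learnable parameters are introduced by this step, which is consistent with the abstract's description of the framework as parameter-free.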

12/04/2021

Channel Exchanging Networks for Multimodal and Multitask Dense Image Prediction

Multimodal fusion and multitask learning are two vital topics in machine...
08/11/2021

Learning Deep Multimodal Feature Representation with Asymmetric Multi-layer Fusion

We propose a compact and effective framework to fuse multimodal features...
04/19/2022

Multimodal Token Fusion for Vision Transformers

Many adaptations of transformers have emerged to address the single-moda...
10/16/2020

Deep-HOSeq: Deep Higher Order Sequence Fusion for Multimodal Sentiment Analysis

Multimodal sentiment analysis utilizes multiple heterogeneous modalities...
03/20/2023

IMF: Interactive Multimodal Fusion Model for Link Prediction

Link prediction aims to identify potential missing triples in knowledge ...
03/29/2022

Balanced Multimodal Learning via On-the-fly Gradient Modulation

Multimodal learning helps to comprehensively understand the world, by in...
11/16/2022

A Unified Multimodal De- and Re-coupling Framework for RGB-D Motion Recognition

Motion recognition is a promising direction in computer vision, but the ...

Code Repositories

CEN

[NeurIPS 2020] Code release for "Deep Multimodal Fusion by Channel Exchanging"