Dynamic Fusion for Multimodal Data

11/10/2019
by Gaurav Sahu, et al.

Effective fusion of data from multiple modalities, such as video, speech, and text, is challenging because multimodal data is heterogeneous. In this paper, we propose dynamic fusion techniques that model context from different modalities efficiently. Instead of defining a deterministic fusion operation, such as concatenation, for the network, we let the network decide "how" to combine the given multimodal features. We propose two networks: 1) a transfusion network, which learns to compress information from different modalities while preserving the context, and 2) a GAN-based network, which regularizes the learned latent space given context from complementary modalities. A quantitative evaluation on the tasks of machine translation and emotion recognition suggests that such adaptive networks are able to model context better than all existing methods.
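
To make the contrast between a fixed fusion operation and a learned one concrete, here is a minimal Python sketch. It is an illustration only and does not reproduce the transfusion network or the GAN-based network from the paper: the `GatedFusion` class, its gating formula, the feature values, and the random (untrained) weights are all assumptions chosen for the example. In a real model the gate weights would be learned jointly with the downstream task.

```python
import math
import random


def concat_fusion(a, b):
    """Fixed fusion: always concatenate the two feature vectors."""
    return list(a) + list(b)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class GatedFusion:
    """Hypothetical learned fusion (not the paper's architecture).

    Computes z = g * a + (1 - g) * b, where the gate
    g = sigmoid(W [a; b] + bias) depends on both inputs, so the mix
    between modalities can change from example to example.
    """

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        # One weight row per output dimension, over the concatenated input.
        # Weights are random here; training would adjust them.
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(2 * dim)]
                        for _ in range(dim)]
        self.bias = [0.0] * dim

    def gate(self, a, b):
        x = concat_fusion(a, b)
        return [sigmoid(sum(w * v for w, v in zip(row, x)) + b_i)
                for row, b_i in zip(self.weights, self.bias)]

    def __call__(self, a, b):
        g = self.gate(a, b)
        return [g_i * a_i + (1.0 - g_i) * b_i
                for g_i, a_i, b_i in zip(g, a, b)]


speech = [0.2, -1.0, 0.5]  # made-up speech features
text = [1.0, 0.3, -0.4]    # made-up text features

fused_fixed = concat_fusion(speech, text)         # 6 values, order is fixed
fused_learned = GatedFusion(dim=3)(speech, text)  # 3 values, mix set by the gate
```

The fixed version doubles the feature size and treats both modalities identically for every input, whereas the gated version keeps the original size and lets input-dependent weights decide how much each modality contributes per dimension.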


research
03/04/2022

MM-DFN: Multimodal Dynamic Fusion Network for Emotion Recognition in Conversations

Emotion Recognition in Conversations (ERC) has considerable prospects fo...
research
04/08/2021

Multimodal Fusion Refiner Networks

Tasks that rely on multi-modal information typically include a fusion mo...
research
03/15/2022

Modular and Parameter-Efficient Multimodal Fusion with Prompting

Recent research has made impressive progress in large-scale multimodal p...
research
05/22/2018

Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment

Multimodal affective computing, learning to recognize and interpret huma...
research
07/17/2023

Clarifying the Half Full or Half Empty Question: Multimodal Container Classification

Multimodal integration is a key component of allowing robots to perceive...
research
11/09/2019

M3ER: Multiplicative Multimodal Emotion Recognition Using Facial, Textual, and Speech Cues

We present M3ER, a learning-based method for emotion recognition from mu...
research
09/28/2021

Neural Dependency Coding inspired Multimodal Fusion

Information integration from different modalities is an active area of r...
