Low Rank Fusion based Transformers for Multimodal Sequences

07/04/2020
by Saurav Sahay, et al.

Our senses work individually yet in a coordinated fashion to express our emotional intentions. In this work, we experiment with modeling modality-specific sensory signals that attend to our latent multimodal emotional intentions, and vice versa, using low-rank multimodal fusion and multimodal transformers. The low-rank factorization of multimodal fusion across the modalities helps represent approximate multiplicative latent signal interactions. Motivated by the work of <cit.> and <cit.>, we present our transformer-based cross-fusion architecture, which avoids over-parameterizing the model. The low-rank fusion helps represent the latent signal interactions, while the modality-specific attention helps focus on relevant parts of the signal. We present two methods for multimodal sentiment and emotion recognition, report results on the CMU-MOSEI, CMU-MOSI, and IEMOCAP datasets, and show that our models have fewer parameters, train faster, and perform comparably to many larger fusion-based architectures.
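To make the idea of low-rank fusion concrete, here is a minimal NumPy sketch of how a multiplicative (outer-product) fusion of several modality vectors can be computed through per-modality low-rank factors instead of one large weight tensor. This is an illustration under stated assumptions, not the authors' implementation: the function name `low_rank_fusion`, the three modalities, their dimensions, the rank, and the random weights are all hypothetical.

```python
import numpy as np


def low_rank_fusion(inputs, factors):
    """Fuse modality vectors using per-modality low-rank factors.

    inputs:  list of 1-D arrays, one per modality, shapes (d_m,)
    factors: list of arrays, one per modality, shapes (rank, d_m + 1, d_out)
    Returns a fused vector of shape (d_out,).
    """
    fused = None
    for z, w in zip(inputs, factors):
        # Append a constant 1 so lower-order (unimodal, bimodal) terms survive.
        z1 = np.concatenate([z, [1.0]])
        # Project this modality once per rank component: shape (rank, d_out).
        proj = np.einsum("d,rdo->ro", z1, w)
        # Element-wise product across modalities gives the multiplicative interaction.
        fused = proj if fused is None else fused * proj
    # Sum over the rank components.
    return fused.sum(axis=0)


# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
dims, rank, d_out = [5, 3, 6], 2, 4
inputs = [rng.normal(size=d) for d in dims]
factors = [rng.normal(size=(rank, d + 1, d_out)) for d in dims]

h_low_rank = low_rank_fusion(inputs, factors)

# Reference: build the full weight tensor implied by the factors and apply it
# to the explicit outer product of the (1-augmented) modality vectors.
za, zv, zt = [np.concatenate([z, [1.0]]) for z in inputs]
outer = np.einsum("i,j,k->ijk", za, zv, zt)
w_full = np.einsum("rio,rjo,rko->ijko", *factors)
h_full = np.einsum("ijk,ijko->o", outer, w_full)

n_low_rank = sum(w.size for w in factors)
n_full = w_full.size
print(np.allclose(h_low_rank, h_full), n_low_rank, n_full)
```

The two computations agree because the weight tensor is a sum of `rank` separable terms, so the outer product never has to be materialized. The factored form stores a number of weights that grows with the sum of the modality dimensions, while the full tensor grows with their product, which is the source of the parameter savings that low-rank fusion aims for.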


