Multi-attention Recurrent Network for Human Communication Comprehension

02/03/2018
by Amir Zadeh, et al.

Human face-to-face communication is a complex multimodal signal. We use words (language modality), gestures (vision modality), and changes in tone (acoustic modality) to convey our intentions. Humans easily process and understand face-to-face communication; however, comprehending this form of communication remains a significant challenge for Artificial Intelligence (AI). AI must understand each modality and the interactions between them that shape human communication. In this paper, we present a novel neural architecture for understanding human communication called the Multi-attention Recurrent Network (MARN). The main strength of our model comes from discovering interactions between modalities through time using a neural component called the Multi-attention Block (MAB) and storing them in the hybrid memory of a recurrent component called the Long-short Term Hybrid Memory (LSTHM). We perform extensive comparisons on six publicly available datasets for multimodal sentiment analysis, speaker trait recognition, and emotion recognition. MARN shows state-of-the-art performance on all the datasets.
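The abstract's core idea can be sketched in a few lines: at each time step, per-modality hidden states are concatenated, several softmax attentions over that concatenation pick out different cross-modal interactions, and the attended signal is reduced to a cross-modal code for the recurrent memory. The NumPy sketch below is only illustrative: the dimensions are invented, and random linear maps stand in for the learned attention and reduction networks described in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Illustrative sizes (not from the paper): hidden states from three
# per-modality recurrent components, concatenated into one vector.
d_l, d_v, d_a = 8, 4, 4                 # language / vision / acoustic
d = d_l + d_v + d_a
K = 3                                   # number of attentions in the MAB

# Concatenated per-modality hidden states at one time step.
h = rng.standard_normal(d)

# K softmax attention distributions over the concatenated dimensions,
# each free to highlight a different cross-modal interaction; a random
# linear map stands in for the learned attention network.
W_att = rng.standard_normal((K, d))
attentions = softmax(W_att * h)         # (K, d), each row sums to 1

# Element-wise weighting of the hidden states by each attention, then a
# (here: random) reduction to a cross-modal code z, which the recurrent
# hybrid memory would store alongside its usual cell state.
attended = attentions * h               # (K, d)
W_z = rng.standard_normal((6, K * d))
z = W_z @ attended.reshape(-1)          # cross-modal code, shape (6,)
```

Note the design choice this mirrors: because each of the K attentions is a separate distribution over all concatenated dimensions, different attentions can specialize in different subsets of modalities rather than forcing one fused view.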

Related research

05/04/2023: SI-LSTM: Speaker Hybrid Long-short Term Memory and Cross Modal Attention for Emotion Recognition in Conversation
Emotion Recognition in Conversation (ERC) across modalities is of vital ...

10/22/2020: MTGAT: Multimodal Temporal Graph Attention Networks for Unaligned Human Multimodal Language Sequences
Human communication is multimodal in nature; it is through multiple moda...

11/23/2018: Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors
Humans convey their intentions through the usage of both verbal and nonv...

11/11/2020: Improving Multimodal Accuracy Through Modality Pre-training and Attention
Training a multimodal network is challenging and it requires complex arc...

04/14/2021: The MuSe 2021 Multimodal Sentiment Analysis Challenge: Sentiment, Emotion, Physiological-Emotion, and Stress
Multimodal Sentiment Analysis (MuSe) 2021 is a challenge focusing on the...

12/11/2021: Multimodal neural networks better explain multivoxel patterns in the hippocampus
The human hippocampus possesses "concept cells", neurons that fire when ...

04/29/2020: Interpretable Multimodal Routing for Human Multimodal Language
The human language has heterogeneous sources of information, including t...
