Multi-modal embeddings using multi-task learning for emotion recognition

09/10/2020
by Aparna Khare, et al.

General-purpose embeddings such as word2vec, GloVe and ELMo have been very successful in natural language tasks. These embeddings are typically extracted from models trained on general tasks such as skip-gram prediction and natural language generation. In this paper, we extend this work from natural language understanding to multi-modal architectures that use audio, visual and textual information for machine learning tasks. The embeddings in our network are extracted from the encoder of a transformer model trained with multi-task learning. We use person identification and automatic speech recognition as the training tasks in our embedding generation framework. We tune and evaluate the embeddings on the downstream task of emotion recognition and demonstrate that, on the CMU-MOSEI dataset, the embeddings can be used to improve over previous state-of-the-art results.
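
To make the multi-task idea concrete, here is a minimal sketch of how losses from two task heads that share one encoder might be combined into a single training objective. The abstract does not say how the paper weights or combines its tasks, so the weighted sum, the equal weights, the function names, and the toy logits below are all assumptions for illustration. Speech recognition is also reduced here to a single-token prediction for brevity.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of class `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def multitask_loss(task_logits, task_targets, task_weights):
    """Weighted sum of per-task cross-entropy losses (assumed combination rule)."""
    total = 0.0
    for task, logits in task_logits.items():
        total += task_weights[task] * cross_entropy(logits, task_targets[task])
    return total


# Hypothetical outputs of two heads fed by the same shared encoder state.
logits = {
    "person_id": [2.0, 0.5, -1.0],   # scores over 3 made-up speaker identities
    "asr": [0.1, 0.2, 3.0, -0.5],    # scores over a 4-token toy vocabulary
}
targets = {"person_id": 0, "asr": 2}
weights = {"person_id": 0.5, "asr": 0.5}  # assumption: equal task weighting

loss = multitask_loss(logits, targets, weights)
print(f"combined multi-task loss: {loss:.4f}")
```

Because both heads read from the same encoder, minimizing a combined objective of this kind pushes the encoder toward representations useful for both tasks, which is the property a downstream task such as emotion recognition would then rely on.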

11/20/2020

Self-Supervised learning with cross-modal transformers for emotion recognition

Emotion recognition is a challenging task due to limited availability of...
06/08/2021

Efficient Speech Emotion Recognition Using Multi-Scale CNN and Attention

Emotion recognition from speech is a challenging task. Recent advances ...
02/20/2019

Audio-Linguistic Embeddings for Spoken Sentences

We propose spoken sentence embeddings which capture both acoustic and li...
11/13/2020

Multi-Modal Emotion Detection with Transfer Learning

Automated emotion detection in speech is a challenging task due to the c...
11/02/2020

Multimodal Continuous Emotion Recognition using Deep Multi-Task Learning with Correlation Loss

In this study, we focus on continuous emotion recognition using body mot...
07/30/2021

Perceiver IO: A General Architecture for Structured Inputs & Outputs

The recently-proposed Perceiver model obtains good results on several do...
06/27/2021

Multi-Modal Chorus Recognition for Improving Song Search

We discuss a novel task, Chorus Recognition, which could potentially ben...