Learning Affective Correspondence between Music and Image

03/30/2019
by   Gaurav Verma, et al.
0

We introduce the problem of learning affective correspondence between audio (music) and visual data (images). For this task, a music clip and an image are considered similar (having true correspondence) if they have similar emotion content. In order to estimate this crossmodal, emotion-centric similarity, we propose a deep neural network architecture that learns to project the data from the two modalities to a common representation space, and performs a binary classification task of predicting the affective correspondence (true or false). To facilitate the current study, we construct a large scale database containing more than 3,500 music clips and 85,000 images with three emotion classes (positive, neutral, negative). The proposed approach achieves 61.67% accuracy for the affective correspondence prediction task on this database, outperforming two relevant and competitive baselines. We also demonstrate that our network learns modality-specific representations of emotion (without explicitly being trained with emotion labels), which are useful for emotion recognition in individual modalities.

READ FULL TEXT
research
08/03/2021

EMOPIA: A Multi-Modal Pop Piano Dataset For Emotion Recognition and Emotion-based Music Generation

While there are many music datasets with emotion labels in the literatur...
research
04/13/2021

Comparison and Analysis of Deep Audio Embeddings for Music Emotion Recognition

Emotion is a complicated notion present in music that is hard to capture...
research
08/06/2020

Learnable Graph Inception Network for Emotion Recognition

Analyzing emotion from verbal and non-verbal behavioral cues is critical...
research
02/20/2022

Enhancing Affective Representations of Music-Induced EEG through Multimodal Supervision and latent Domain Adaptation

The study of Music Cognition and neural responses to music has been inva...
research
10/10/2018

A Multimodal Approach towards Emotion Recognition of Music using Audio and Lyrical Content

We propose MoodNet - A Deep Convolutional Neural Network based architect...
research
06/27/2021

Use of Variational Inference in Music Emotion Recognition

This work was developed aiming to employ Statistical techniques to the f...

Please sign up or login with your details

Forgot password? Click here to reset