Emotion Embedding Spaces for Matching Music to Stories

11/26/2021
by Minz Won, et al.

Content creators often use music to enhance their stories, as it can be a powerful tool to convey emotion. In this paper, our goal is to help creators find music that matches the emotion of their story. We focus on text-based stories that can be auralized (e.g., books), use multiple sentences as input queries, and automatically retrieve matching music. We formalize this task as a cross-modal text-to-music retrieval problem. Both the music and text domains have existing datasets with emotion labels, but mismatched emotion vocabularies prevent us from using mood or emotion annotations directly for matching. To address this challenge, we propose and investigate several emotion embedding spaces, both manually defined (e.g., valence/arousal) and data-driven (e.g., Word2Vec and metric learning). Our experiments show that these embedding spaces bridge the gap between modalities and facilitate cross-modal retrieval. Our method can leverage the well-established valence-arousal space, but it can also achieve the same goal via data-driven embedding spaces. By leveraging data-driven embeddings, our approach has the potential to generalize to other retrieval tasks that require broader or completely different vocabularies.
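To make the idea of a shared emotion embedding space concrete, here is a minimal sketch of retrieval through a manually defined valence/arousal space. It is not the authors' implementation: the label names, the coordinate values, the track identifiers, and the `rank_tracks` helper are all hypothetical and chosen only for illustration. The point is that two mismatched vocabularies (text emotions and music moods) become comparable once both are mapped to points in the same space.

```python
import math

# Hypothetical (valence, arousal) coordinates in [-1, 1].
# Illustrative values only; not taken from the paper or from any lexicon.
TEXT_EMOTION_VA = {
    "joy": (0.8, 0.5),
    "anger": (-0.6, 0.8),
    "sadness": (-0.7, -0.4),
    "fear": (-0.6, 0.6),
}

# The music side uses a different vocabulary, mapped into the same space.
MUSIC_MOOD_VA = {
    "happy": (0.8, 0.4),
    "aggressive": (-0.5, 0.9),
    "melancholic": (-0.6, -0.5),
    "calm": (0.5, -0.7),
}

# Hypothetical catalogue: (track id, mood tag).
TRACKS = [
    ("track_a", "happy"),
    ("track_b", "aggressive"),
    ("track_c", "melancholic"),
    ("track_d", "calm"),
]


def rank_tracks(text_emotion, tracks=TRACKS):
    """Return track ids sorted by Euclidean distance to the query emotion."""
    query = TEXT_EMOTION_VA[text_emotion]
    scored = [
        (math.dist(query, MUSIC_MOOD_VA[mood]), track_id)
        for track_id, mood in tracks
    ]
    return [track_id for _, track_id in sorted(scored)]


if __name__ == "__main__":
    # A story passage labeled "sadness" retrieves the nearest mood first.
    print(rank_tracks("sadness"))
```

A data-driven variant would keep the same nearest-neighbor step but replace the two hand-written dictionaries with learned vectors (for example, word embeddings of the labels), which is what lets the approach extend to vocabularies that have no agreed valence/arousal coordinates.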

research
08/24/2023

Emotion-Aligned Contrastive Learning Between Images and Music

Traditional music search engines rely on retrieval methods that match na...
research
08/22/2020

Emotion-Based End-to-End Matching Between Image and Music in Valence-Arousal Space

Both images and music can convey rich semantics and are widely used to i...
research
03/19/2023

Textless Speech-to-Music Retrieval Using Emotion Similarity

We introduce a framework that recommends music based on the emotions of ...
research
12/14/2021

Cross-modal Music Emotion Recognition Using Composite Loss-based Embeddings

Most music emotion recognition approaches use one-way classification or ...
research
10/30/2020

Multimodal Metric Learning for Tag-based Music Retrieval

Tag-based music retrieval is crucial to browse large-scale music librari...
research
08/26/2022

MuLan: A Joint Embedding of Music Audio and Natural Language

Music tagging and content-based retrieval systems have traditionally bee...
research
04/14/2021

Continual learning in cross-modal retrieval

Multimodal representations and continual learning are two areas closely ...