Multimodal Metric Learning for Tag-based Music Retrieval

10/30/2020
by Minz Won, et al.

Tag-based music retrieval is crucial for browsing large-scale music libraries efficiently. Hence, automatic music tagging has been actively explored, mostly as a classification task, which has an inherent limitation: a fixed vocabulary. Metric learning, on the other hand, enables flexible vocabularies by using pretrained word embeddings as side information. Moreover, metric learning has already proven suitable for cross-modal retrieval tasks in other domains (e.g., text-to-image) by jointly learning a multimodal embedding space. In this paper, we investigate three ideas to successfully introduce multimodal metric learning for tag-based music retrieval: elaborate triplet sampling, acoustic and cultural music information, and domain-specific word embeddings. Our experimental results show that the proposed ideas enhance the retrieval system both quantitatively and qualitatively. Furthermore, we release the MSD500, a subset of the Million Song Dataset (MSD) containing 500 cleaned tags, 7 manually annotated tag categories, and user taste profiles.
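To make the idea of a jointly learned embedding space concrete, the following is a minimal sketch of a triplet objective and a tag-to-song ranking step. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the loss, so the cosine-similarity hinge form, the `margin` value, and the names `cosine_similarity`, `triplet_loss`, and `retrieve` are all hypothetical. The vectors stand in for embeddings that a trained model would produce for tags and songs.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.4):
    """Hinge-style triplet loss (assumed form, hypothetical margin).

    anchor:   embedding of a tag
    positive: embedding of a song annotated with that tag
    negative: embedding of a song not annotated with that tag
    The loss is zero once the positive is more similar to the anchor
    than the negative by at least `margin`.
    """
    sim_pos = cosine_similarity(anchor, positive)
    sim_neg = cosine_similarity(anchor, negative)
    return max(0.0, margin - sim_pos + sim_neg)


def retrieve(tag_vec, song_vecs, top_k=2):
    """Rank song names by similarity to a tag embedding, best first."""
    ranked = sorted(
        song_vecs,
        key=lambda name: cosine_similarity(tag_vec, song_vecs[name]),
        reverse=True,
    )
    return ranked[:top_k]


# Toy 2-D embeddings, made up for illustration only.
tag = [1.0, 0.0]
songs = {"song_a": [1.0, 0.1], "song_b": [0.0, 1.0], "song_c": [0.9, 0.5]}

well_separated = triplet_loss(tag, songs["song_a"], songs["song_b"])
violated = triplet_loss(tag, songs["song_b"], songs["song_a"])
top_songs = retrieve(tag, songs)
```

Because retrieval only needs a vector for the query, any tag with a word embedding can be used at query time, which is what frees this setup from the fixed vocabulary of a classifier.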


research
09/16/2020

Multilingual Music Genre Embeddings for Effective Cross-Lingual Music Item Annotation

Annotating music items with music genres is crucial for music recommenda...
research
11/26/2022

Toward Universal Text-to-Music Retrieval

This paper introduces effective design choices for text-to-music retriev...
research
09/17/2019

Multi-Task Music Representation Learning from Multi-Label Embeddings

This paper presents a novel approach to music representation learning. T...
research
08/09/2020

Metric Learning vs Classification for Disentangled Music Representation Learning

Deep representation learning offers a powerful paradigm for mapping inpu...
research
11/15/2022

Music Similarity Calculation of Individual Instrumental Sounds Using Metric Learning

The criteria for measuring music similarity are important for developing...
research
11/21/2016

Sampled Image Tagging and Retrieval Methods on User Generated Content

Traditional image tagging and retrieval algorithms have limited value as...
research
11/26/2021

Emotion Embedding Spaces for Matching Music to Stories

Content creators often use music to enhance their stories, as it can be ...
