Contrastive Learning for Cross-modal Artist Retrieval

08/12/2023
by Andres Ferraro, et al.
Music retrieval and recommendation applications often rely on content features encoded as embeddings, which provide vector representations of items in a music dataset. Numerous complementary embeddings can be derived from processing items originally represented in several modalities, e.g., audio signals, user interaction data, or editorial data. However, data of any given modality might not be available for all items in any music dataset. In this work, we propose a method based on contrastive learning to combine embeddings from multiple modalities and explore the impact of the presence or absence of embeddings from diverse modalities in an artist similarity task. Experiments on two datasets suggest that our contrastive method outperforms single-modality embeddings and baseline algorithms for combining modalities, both in terms of artist retrieval accuracy and coverage. Improvements with respect to other methods are particularly significant for less popular query artists. We demonstrate that our method successfully combines complementary information from diverse modalities and is more robust to missing modality data (i.e., it better handles the retrieval of artists with different modality embeddings than the query artist's).
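As a rough illustration of the general idea of contrastive alignment between modalities, the sketch below computes a symmetric InfoNCE-style loss between two sets of artist embeddings. The abstract does not state which loss, temperature, or architecture the authors use, so the loss form, the `temperature` value, the function names, and the toy "audio" and "usage" vectors are all assumptions made for illustration, not the paper's method.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i];
    every other row of `positives` acts as a negative."""
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        logits = [dot(ai, pj) / temperature for pj in p]
        m = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[i]
    return total / len(a)

def symmetric_loss(x, y, temperature=0.1):
    """Average the loss over both retrieval directions (x to y and y to x)."""
    return 0.5 * (info_nce(x, y, temperature) + info_nce(y, x, temperature))

# Toy data (hypothetical): three artists, two modalities already projected
# into a shared 2-D space. Row i in each list refers to the same artist.
audio = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
usage_aligned = [[0.9, 0.1], [0.1, 0.9], [-0.9, 0.1]]
# Same vectors, but paired with the wrong artists.
usage_shuffled = [usage_aligned[1], usage_aligned[2], usage_aligned[0]]

aligned_loss = symmetric_loss(audio, usage_aligned)
shuffled_loss = symmetric_loss(audio, usage_shuffled)
print(aligned_loss < shuffled_loss)
```

In this toy setting the loss is lower when each artist's two modality embeddings point in similar directions than when the pairing is shuffled, which is the property a contrastive objective of this kind rewards. A trained system would apply such a loss to the outputs of learned projection networks rather than to fixed vectors.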

research
02/12/2019

Cross-Modal Music Retrieval and Applications: An Overview of Key Methodologies

There has been a rapid growth of digitally available music data, includi...
research
06/02/2021

Exploring modality-agnostic representations for music classification

Music information is often conveyed or recorded across multiple data mod...
research
04/01/2021

Enriched Music Representations with Multiple Cross-modal Contrastive Learning

Modeling various aspects that make a music piece unique is a challenging...
research
09/21/2023

Self-Supervised Contrastive Learning for Robust Audio-Sheet Music Retrieval Systems

Linking sheet music images to audio recordings remains a key problem for...
research
04/21/2021

Deep Music Retrieval for Fine-Grained Videos by Exploiting Cross-Modal-Encoded Voice-Overs

Recently, the witness of the rapidly growing popularity of short videos ...
research
05/07/2020

COBRA: Contrastive Bi-Modal Representation Algorithm

There are a wide range of applications that involve multi-modal data, su...
research
03/08/2022

Mutual Contrastive Learning to Disentangle Whole Slide Image Representations for Glioma Grading

Whole slide images (WSI) provide valuable phenotypic information for his...
