DisCover: Disentangled Music Representation Learning for Cover Song Identification

07/19/2023
by Jiahao Xun, et al.

In the field of music information retrieval (MIR), cover song identification (CSI) is a challenging task that aims to identify cover versions of a query song from a massive collection. Existing works still suffer from high intra-song variances and inter-song correlations, due to the entangled nature of version-specific and version-invariant factors in their modeling. In this work, we set the goal of disentangling version-specific and version-invariant factors, which could make it easier for the model to learn invariant music representations for unseen query songs. We analyze the CSI task from a disentanglement view with the causal graph technique, and identify the intra-version and inter-version effects that bias invariant learning. To block these effects, we propose the disentangled music representation learning framework (DisCover) for CSI. DisCover consists of two critical components: (1) the Knowledge-guided Disentanglement Module (KDM) and (2) the Gradient-based Adversarial Disentanglement Module (GADM), which block intra-version and inter-version biased effects, respectively. KDM minimizes the mutual information between the learned representations and version-variant factors that are identified with prior domain knowledge. GADM identifies version-variant factors by simulating the representation transitions between intra-song versions, and exploits adversarial distillation for effect blocking. Extensive comparisons with best-performing methods and in-depth analysis demonstrate the effectiveness of DisCover and the necessity of disentanglement for CSI.
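To make the retrieval setting in the abstract concrete, the following is a minimal sketch of how a CSI system might rank a collection once each track has been mapped to a fixed-length representation. It is not the DisCover method: the abstract does not state which similarity measure is used, so cosine similarity is an assumption here, and the names `cosine_similarity`, `rank_collection`, and the toy embeddings are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_collection(query, collection):
    """Return (track_id, score) pairs sorted from most to least similar to the query.

    `collection` maps a track id to its embedding. The embeddings are assumed
    to come from some trained encoder, which is outside this sketch.
    """
    scored = [
        (track_id, cosine_similarity(query, embedding))
        for track_id, embedding in collection.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy, made-up embeddings purely for illustration.
collection = {
    "track_a": [1.0, 0.0],
    "track_b": [0.0, 1.0],
    "track_c": [1.0, 1.0],
}
ranking = rank_collection([1.0, 0.1], collection)
```

In this setting, a representation that still encodes version-specific factors would tend to pull different versions of the same song apart in the ranking, which is the problem the disentanglement modules are meant to address.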

research
03/21/2023

ByteCover3: Accurate Cover Song Identification on Short Queries

Deep learning based methods have become a paradigm for cover song identi...
research
10/27/2020

ByteCover: Cover Song Identification via Multi-Loss Training

We present in this paper ByteCover, which is a new feature learning meth...
research
11/02/2021

Multi-input Architecture and Disentangled Representation Learning for Multi-dimensional Modeling of Music Similarity

In the context of music information retrieval, similarity-based approach...
research
11/25/2019

Bridging Disentanglement with Independence and Conditional Independence via Mutual Information for Representation Learning

Existing works on disentangled representation learning usually lie on a ...
research
08/09/2023

Pareto Invariant Representation Learning for Multimedia Recommendation

Multimedia recommendation involves personalized ranking tasks, where mul...
research
06/15/2023

CoverHunter: Cover Song Identification with Refined Attention and Alignments

Abstract: Cover song identification (CSI) focuses on finding the same mu...
research
03/14/2022

Disentangled Representation Learning for Text-Video Retrieval

Cross-modality interaction is a critical component in Text-Video Retriev...
