Learning Music Representations with wav2vec 2.0

10/27/2022
by   Alessandro Ragano, et al.

Learning general-purpose music representations offers the flexibility to fine-tune on several downstream tasks using smaller datasets. The wav2vec 2.0 speech representation model has shown promising results in many downstream speech tasks, but has been less effective when adapted to music. In this paper, we evaluate whether pre-training wav2vec 2.0 directly on music data is a better solution than fine-tuning the speech model. We illustrate that, when the model is pre-trained on music data, the discrete latent representations are able to encode the semantic meaning of musical concepts such as pitch and instrument. Our results show that fine-tuning wav2vec 2.0 pre-trained on music data achieves promising results on music classification tasks that are competitive with prior work on audio representations. In addition, the results are superior to those of the model pre-trained on speech, demonstrating that wav2vec 2.0 pre-trained on music data can be a promising music representation model.

