CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval

04/21/2023
by Shangda Wu, et al.

We introduce CLaMP: Contrastive Language-Music Pre-training, which learns cross-modal representations between natural language and symbolic music using a music encoder and a text encoder trained jointly with a contrastive loss. To pre-train CLaMP, we collected a large dataset of 1.4 million music-text pairs. CLaMP employs text dropout as a data augmentation technique and bar patching to efficiently represent music data, reducing sequence length to less than 10% of its original length. In addition, we developed a masked music model pre-training objective to enhance the music encoder's comprehension of musical context and structure. CLaMP integrates textual information to enable semantic search and zero-shot classification for symbolic music, surpassing the capabilities of previous models. To support the evaluation of semantic search and music classification, we publicly release WikiMusicText (WikiMT), a dataset of 1010 lead sheets in ABC notation, each accompanied by a title, artist, genre, and description. Compared to state-of-the-art models that require fine-tuning, zero-shot CLaMP demonstrates comparable or superior performance on score-oriented datasets. Our models and code are available at https://github.com/microsoft/muzic/tree/main/clamp.
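The core training objective described above is a CLIP-style symmetric contrastive loss between music and text embeddings. Below is a minimal sketch of such an objective in PyTorch; the function name, temperature value, and embedding shapes are illustrative assumptions and do not reproduce CLaMP's actual encoders or hyperparameters.

```python
import torch
import torch.nn.functional as F

def contrastive_loss(music_emb, text_emb, temperature=0.07):
    """CLIP-style symmetric contrastive loss over a batch of
    music-text pairs. Sketch only: encoder architectures, projection
    heads, and the learned temperature of CLaMP are not shown.

    music_emb, text_emb: tensors of shape (batch, dim), where row i
    of each tensor comes from the same music-text pair.
    """
    # L2-normalize both modalities so dot products are cosine similarities.
    music_emb = F.normalize(music_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Pairwise similarity logits; matching pairs lie on the diagonal.
    logits = music_emb @ text_emb.t() / temperature

    # Each music embedding should match the text at the same index.
    targets = torch.arange(logits.size(0), device=logits.device)

    # Symmetric cross-entropy: music-to-text and text-to-music.
    loss_m2t = F.cross_entropy(logits, targets)
    loss_t2m = F.cross_entropy(logits.t(), targets)
    return (loss_m2t + loss_t2m) / 2
```

The same normalized embeddings support the zero-shot uses the abstract lists: ranking music against a free-text query for semantic search, or against a set of label prompts for zero-shot classification.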
