Composer Style Classification of Piano Sheet Music Images Using Language Model Pretraining

07/29/2020
by TJ Tsai, et al.
This paper studies composer style classification of piano sheet music images. Previous approaches to the composer classification task have been limited by a scarcity of data. We address this issue in two ways: (1) we recast the problem to be based on raw sheet music images rather than a symbolic music format, and (2) we propose an approach that can be trained on unlabeled data. Our approach first converts the sheet music image into a sequence of musical "words" based on the bootleg feature representation, and then feeds the sequence into a text classifier. We show that it is possible to significantly improve classifier performance by first training a language model on a set of unlabeled data, initializing the classifier with the pretrained language model weights, and then finetuning the classifier on a small amount of labeled data. We train AWD-LSTM, GPT-2, and RoBERTa language models on all piano sheet music images in IMSLP. We find that transformer-based architectures outperform CNN and LSTM models, and pretraining boosts classification accuracy for the GPT-2 model from 46% to 70% on a 9-way classification task. The trained model can also be used as a feature extractor that projects piano sheet music into a feature space that characterizes compositional style.
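The first stage of the approach, turning a sheet music image into a sequence of "words" for a text classifier, can be illustrated with a minimal sketch. The abstract does not spell out the encoding, so the following is an assumption: each column of a binary, bootleg-style feature matrix is packed into a single integer, and each distinct integer is treated as one vocabulary entry. The function names `columns_to_words` and `build_vocab` and the toy 4-row matrix are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def columns_to_words(columns: Sequence[Sequence[int]]) -> List[int]:
    """Pack each binary column into one integer "word".

    Assumption: bit i of the word is set when row i of the column is active.
    """
    words = []
    for col in columns:
        word = 0
        for row, bit in enumerate(col):
            if bit:
                word |= 1 << row
        words.append(word)
    return words


def build_vocab(words: Sequence[int]) -> Dict[int, int]:
    """Assign a token id to each distinct word, in order of first appearance."""
    vocab: Dict[int, int] = {}
    for w in words:
        if w not in vocab:
            vocab[w] = len(vocab)
    return vocab


# Toy input: three columns of a hypothetical 4-row binary feature matrix.
toy_columns = [[1, 0, 0, 0], [0, 1, 1, 0], [1, 0, 0, 0]]
toy_words = columns_to_words(toy_columns)
toy_vocab = build_vocab(toy_words)
toy_ids = [toy_vocab[w] for w in toy_words]  # token ids a text model could consume
```

A sequence of token ids like `toy_ids` is the kind of input that a language model could be pretrained on without labels and that a classifier could later be finetuned on, which is the workflow the abstract describes.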

