JS Fake Chorales: a Synthetic Dataset of Polyphonic Music with Human Annotation

07/21/2021
by Omar Peracha, et al.

High-quality datasets for learning-based modelling of polyphonic symbolic music remain less readily accessible at scale than in other domains, such as language modelling or image classification. In particular, datasets that reveal how humans respond to the given music samples are rare. The issue of scale persists as a general hindrance to breakthroughs in the field, while the lack of listener evaluation is especially relevant to the generative modelling problem space, where clear objective metrics correlating strongly with qualitative success remain elusive. We propose the JS Fake Chorales, a dataset of 500 pieces generated by a new learning-based algorithm, provided in MIDI form. We take consecutive outputs from the algorithm and avoid cherry-picking in order to validate the potential to further scale this dataset on demand. We conduct an online experiment for human evaluation, designed to be as fair to the listener as possible, and find that respondents were on average only 7% better than random guessing at distinguishing JS Fake Chorales from real chorales composed by JS Bach. Furthermore, we make anonymised data collected from the experiments available along with the MIDI samples, such as the respondents' musical experience and how long they took to submit their response for each sample. Finally, we conduct ablation studies to demonstrate the effectiveness of using the synthetic pieces for research in polyphonic music modelling, and find that we can improve on state-of-the-art validation set loss for the canonical JSB Chorales dataset, using a known algorithm, by simply augmenting the training set with the JS Fake Chorales.


