DiffRoll: Diffusion-based Generative Music Transcription with Unsupervised Pretraining Capability

10/11/2022
by Kin Wai Cheuk, et al.

In this paper, we propose a novel generative approach, DiffRoll, to tackle automatic music transcription (AMT). Instead of treating AMT as a discriminative task in which the model is trained to convert spectrograms into piano rolls, we treat it as a conditional generative task in which the model is trained to generate realistic-looking piano rolls from pure Gaussian noise, conditioned on spectrograms. This new AMT formulation enables DiffRoll to transcribe, generate, and even inpaint music. Owing to its classifier-free nature, DiffRoll can also be trained on unpaired datasets where only piano rolls are available. Our experiments show that DiffRoll outperforms its discriminative counterpart by 19 percentage points (ppt.), and our ablation studies indicate that it outperforms similar existing methods by 4.8 ppt. Source code and a demonstration are available at https://sony.github.io/DiffRoll/.
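The two ideas in the abstract, noising a piano roll toward Gaussian noise and training without a spectrogram when only piano rolls are available, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the constant placeholder value of -1, the drop probability, and the list-of-lists data layout are all assumptions made for illustration, and the noising formula is the standard forward step used by denoising diffusion models in general.

```python
import math
import random


def noise_piano_roll(roll, alpha_bar, rng):
    """Standard diffusion forward step (assumed here, not taken from the abstract):
    x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps, with eps ~ N(0, 1).
    Returns the noisy roll and the noise that was added."""
    eps = [[rng.gauss(0.0, 1.0) for _ in row] for row in roll]
    noisy = [
        [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
         for x, e in zip(row, eps_row)]
        for row, eps_row in zip(roll, eps)
    ]
    return noisy, eps


def make_condition(spectrogram, shape, drop_prob, rng, placeholder=-1.0):
    """Classifier-free style conditioning (hypothetical helper).
    If the sample is unpaired (spectrogram is None), or the condition is
    randomly dropped, return a constant placeholder of the given shape."""
    if spectrogram is None or rng.random() < drop_prob:
        rows, cols = shape
        return [[placeholder] * cols for _ in range(rows)]
    return spectrogram


rng = random.Random(0)
roll = [[0.0, 1.0], [1.0, 0.0]]          # toy piano roll (pitch x time)
spec = [[0.2, 0.4, 0.1], [0.9, 0.3, 0.5]]  # toy spectrogram

noisy, eps = noise_piano_roll(roll, alpha_bar=0.5, rng=rng)
paired_cond = make_condition(spec, (2, 3), drop_prob=0.1, rng=rng)
unpaired_cond = make_condition(None, (2, 3), drop_prob=0.1, rng=rng)
```

A denoising network would then be trained on pairs of `noisy` and the chosen condition. Because unpaired samples always receive the placeholder, piano-roll-only data can go through the same training loop as paired data.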

research
03/15/2023

Generating symbolic music using diffusion models

Probabilistic Denoising Diffusion models have emerged as simple yet very...
research
06/08/2023

Simple and Controllable Music Generation

We tackle the task of conditional music generation. We introduce MusicGe...
research
06/21/2020

Feel The Music: Automatically Generating A Dance For An Input Song

We present a general computational approach that enables a machine to ge...
research
03/14/2023

DiffuseRoll: Multi-track multi-category music generation based on diffusion model

Recent advancements in generative models have shown remarkable progress ...
research
04/07/2022

Genre-conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music

Lyrics transcription of polyphonic music is challenging not only because...
research
01/28/2022

Dual Learning Music Composition and Dance Choreography

Music and dance have always co-existed as pillars of human activities, c...
research
09/04/2019

Towards Interpretable Polyphonic Transcription with Invertible Neural Networks

We explore a novel way of conceptualising the task of polyphonic music t...
