Unaligned Supervision For Automatic Music Transcription in The Wild

04/28/2022
by Ben Maman, et al.

Multi-instrument Automatic Music Transcription (AMT), the decoding of a musical recording into semantic musical content, is one of the holy grails of Music Information Retrieval. Current AMT approaches are restricted to piano and (some) guitar recordings because training data are difficult to collect. To overcome this barrier, previous AMT approaches attempt to employ musical scores in the form of a digitized version of the same song or piece. The scores are typically aligned to the audio using audio features and strenuous human intervention to generate training labels. We introduce NoteEM, a method for simultaneously training a transcriber and aligning the scores to their corresponding performances in a fully automated process. Using this unaligned supervision scheme, complemented by pseudo-labels and pitch-shift augmentation, our method enables training on in-the-wild recordings with unprecedented accuracy and instrumental variety. Using only synthetic data and unaligned supervision, we report state-of-the-art (SOTA) note-level accuracy on the MAPS dataset, and large favorable margins on cross-dataset evaluations. We also demonstrate robustness and ease of use: we report comparable results when training on a small, easily obtainable, self-collected dataset, and we propose alternative labels for the MusicNet dataset, which we show to be more accurate. Our project page is available at https://benadar293.github.io
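
The abstract does not spell out how scores are aligned to performances, so the following is only a minimal sketch of one generic way the alignment step could work, not the authors' implementation. It assumes a transcriber that outputs per-frame pitch probabilities and a score given as an ordered list of pitch sets, and it finds the monotonic frame-to-event assignment with the highest likelihood by dynamic programming. The names `frame_cost` and `align_score_to_frames`, the independent per-pitch likelihood, and the "stay or advance by one event" constraint are all illustrative assumptions.

```python
import math
from typing import List, Sequence, Set


def frame_cost(probs: Sequence[float], active: Set[int], eps: float = 1e-6) -> float:
    """Negative log-likelihood of one frame under one score event.

    Assumes each pitch is an independent on/off variable (a simplification).
    """
    cost = 0.0
    for pitch, p in enumerate(probs):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        cost -= math.log(p) if pitch in active else math.log(1.0 - p)
    return cost


def align_score_to_frames(
    frame_probs: Sequence[Sequence[float]],
    score_events: Sequence[Set[int]],
) -> List[int]:
    """Return, for every frame, the index of the score event assigned to it.

    The path starts at event 0, ends at the last event, and at each frame
    either stays on the current event or advances to the next one.
    """
    n_frames, n_events = len(frame_probs), len(score_events)
    if n_events == 0 or n_frames < n_events:
        raise ValueError("need at least one event and one frame per event")

    inf = math.inf
    total = [[inf] * n_events for _ in range(n_frames)]
    advanced = [[False] * n_events for _ in range(n_frames)]
    total[0][0] = frame_cost(frame_probs[0], score_events[0])

    for t in range(1, n_frames):
        for n in range(min(t + 1, n_events)):
            stay = total[t - 1][n]
            advance = total[t - 1][n - 1] if n > 0 else inf
            advanced[t][n] = advance < stay
            total[t][n] = frame_cost(frame_probs[t], score_events[n]) + min(stay, advance)

    # Backtrack from the last frame and last event to recover the path.
    n = n_events - 1
    path = [n]
    for t in range(n_frames - 1, 0, -1):
        if advanced[t][n]:
            n -= 1
        path.append(n)
    path.reverse()
    return path


# Toy input: two pitches, five frames, and a score with two single-note events.
toy_probs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.1, 0.7]]
toy_score = [{0}, {1}]
print(align_score_to_frames(toy_probs, toy_score))
```

In a scheme that trains and aligns simultaneously, the frame-level labels recovered this way could serve as training targets for the transcriber, whose updated predictions would then be used to realign the score. That alternation is described here only at the level of detail the abstract provides.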


