ReconVAT: A Semi-Supervised Automatic Music Transcription Framework for Low-Resource Real-World Data

07/11/2021
by Kin Wai Cheuk, et al.

Most current supervised automatic music transcription (AMT) models generalize poorly: they have trouble transcribing real-world music recordings from musical genres that are not represented in the labelled training data. In this paper, we propose a semi-supervised framework, ReconVAT, which addresses this issue by leveraging the large amount of available unlabelled music recordings. ReconVAT combines a reconstruction loss with virtual adversarial training. When combined with existing U-net models for AMT, ReconVAT achieves competitive results on common benchmark datasets such as MAPS and MusicNet. For example, in the few-shot setting for the string part version of MusicNet, ReconVAT achieves an F1-score of 61.0, an improvement of 22.2 over the supervised baseline model. Our proposed framework also demonstrates the potential of continual learning on new data, which could be useful in real-world applications where new data is constantly becoming available.
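
To make the virtual adversarial training idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy linear-sigmoid classifier for a single note in a single frame instead of the U-net described above, and it uses the analytic gradient of the Bernoulli KL divergence where a real model would use backpropagation. All function names and the parameters `eps`, `xi` and `n_power` are hypothetical choices for illustration. No labels are needed, which is why the same loss can be computed on unlabelled recordings.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Toy stand-in for a transcription model: probability that a note is active."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def bernoulli_kl(p, q, tiny=1e-12):
    """KL divergence KL(Bernoulli(p) || Bernoulli(q))."""
    return (p * math.log((p + tiny) / (q + tiny))
            + (1 - p) * math.log((1 - p + tiny) / (1 - q + tiny)))

def normalize(v):
    """Scale v to unit L2 norm (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(vi * vi for vi in v)) or 1.0
    return [vi / norm for vi in v]

def vat_perturbation(w, b, x, eps=1.0, xi=1e-2, n_power=1, rng=None):
    """Approximate the perturbation of norm eps that changes the prediction most."""
    rng = rng or random.Random(0)
    p = predict(w, b, x)  # clean prediction, treated as a fixed target
    d = normalize([rng.gauss(0.0, 1.0) for _ in x])  # random starting direction
    for _ in range(n_power):
        q = predict(w, b, [xj + xi * dj for xj, dj in zip(x, d)])
        # For this toy model, the gradient of KL(p || q) with respect to
        # the perturbation is (q - p) * w.
        d = normalize([(q - p) * wj for wj in w])
    return [eps * dj for dj in d]

def vat_loss(w, b, x, r_adv):
    """Divergence between the clean and the adversarially perturbed prediction."""
    p = predict(w, b, x)
    q = predict(w, b, [xj + rj for xj, rj in zip(x, r_adv)])
    return bernoulli_kl(p, q)

# Usage: an unlabelled feature vector is enough to compute the loss.
w, b = [1.0, -2.0, 0.5], 0.0
x = [1.0, 0.5, 0.0]
r_adv = vat_perturbation(w, b, x, eps=0.5)
loss = vat_loss(w, b, x, r_adv)
```

In a semi-supervised setup this term would be added, with some weighting, to the supervised loss on labelled data; the weighting and the reconstruction term are not shown here because the text above does not specify them.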

research
11/23/2021

Music Classification: Beyond Supervised Learning, Towards Real-world Applications

Music classification is a music information retrieval (MIR) task to clas...
research
10/05/2021

Hypernetworks for Continual Semi-Supervised Learning

Learning from data sequentially arriving, possibly in a non i.i.d. way, ...
research
02/10/2022

Semi-Supervised Convolutive NMF for Automatic Music Transcription

Automatic Music Transcription, which consists in transforming an audio r...
research
03/16/2020

A semi-supervised sparse K-Means algorithm

We consider the problem of data clustering with unidentified feature qua...
research
02/20/2022

Towards Automatic Transcription of Polyphonic Electric Guitar Music: A New Dataset and a Multi-Loss Transformer Model

In this paper, we propose a new dataset named EGDB, that contains trans...
research
08/09/2020

Cosine-Distance Virtual Adversarial Training for Semi-Supervised Speaker-Discriminative Acoustic Embeddings

In this paper, we propose a semi-supervised learning (SSL) technique for...
research
12/14/2018

Semi-Supervised Monaural Singing Voice Separation With a Masking Network Trained on Synthetic Mixtures

We study the problem of semi-supervised singing voice separation, in whi...
