Onsets and Frames: Dual-Objective Piano Transcription

10/30/2017

We consider the problem of transcribing polyphonic piano music with an emphasis on generalizing to unseen instruments. We use deep neural networks and propose a novel approach that predicts onsets and frames using both CNNs and LSTMs. The model first predicts pitch onset events and then uses those predictions to condition framewise pitch predictions. During inference, we restrict the predictions of the framewise detector: a new note is not allowed to start unless the onset detector also agrees that an onset for that pitch is present in the frame. We focus on improving onsets and offsets together rather than either in isolation, because we believe this correlates better with human musical perception. This technique results in over a 100% relative improvement in note with offset score on the MAPS dataset.
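
The inference-time restriction described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name gate_frames_by_onsets, the (time, pitch) array layout, and the 0.5 threshold are all assumptions made for illustration. It shows only the gating rule, in which a pitch may become active in a frame only if the onset detector fires for that pitch in that frame, and an already active pitch stays active while the frame detector keeps predicting it.

```python
import numpy as np


def gate_frames_by_onsets(frame_probs, onset_probs, threshold=0.5):
    """Gate framewise pitch predictions with onset predictions.

    Hypothetical helper, not taken from the paper's code.
    frame_probs, onset_probs: arrays of shape (num_frames, num_pitches)
    holding per-frame probabilities. The 0.5 threshold is an assumption.
    Returns a boolean piano roll of the same shape.
    """
    frames = np.asarray(frame_probs) >= threshold
    onsets = np.asarray(onset_probs) >= threshold

    gated = np.zeros(frames.shape, dtype=bool)
    # Pitches that were active in the previous frame.
    active = np.zeros(frames.shape[1], dtype=bool)

    for t in range(frames.shape[0]):
        # A pitch is active now if the frame detector predicts it AND
        # either it was already sounding or an onset is detected here.
        active = frames[t] & (active | onsets[t])
        gated[t] = active

    return gated


if __name__ == "__main__":
    # Toy input: 4 frames, 2 pitches.
    frame_probs = [[0.9, 0.9], [0.9, 0.9], [0.1, 0.9], [0.9, 0.1]]
    onset_probs = [[0.8, 0.1], [0.1, 0.1], [0.1, 0.7], [0.1, 0.1]]
    print(gate_frames_by_onsets(frame_probs, onset_probs).astype(int))
```

In the toy input, the second pitch is predicted by the frame detector from the first frame onward, but it is only accepted from the third frame, where the onset detector also fires. The first pitch reappears in the last frame without an onset and is therefore suppressed.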
