Multi-modal Conditional Bounding Box Regression for Music Score Following

05/10/2021
by Florian Henkel, et al.

This paper addresses the problem of sheet-image-based on-line audio-to-score alignment, also known as score following. Drawing inspiration from object detection, it proposes a conditional neural network architecture that, at each point in time of a given musical performance, directly predicts the (x, y) coordinates of the matching positions in a complete score sheet image. Experiments are conducted on a synthetic polyphonic piano benchmark dataset, and the new method is compared with several existing approaches from the literature for sheet-image-based score following as well as with an Optical Music Recognition baseline. The proposed approach achieves new state-of-the-art results and, by applying impulse responses as a data augmentation technique, significantly improves alignment performance on a set of real-world piano recordings.
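As a rough illustration of impulse-response augmentation in general, the sketch below convolves a dry signal with an impulse response and rescales the result to the original peak level. It is not the authors' pipeline: the function name `apply_impulse_response`, the 22050 Hz sample rate, the peak normalization, and the synthetic decaying-noise impulse response are all assumptions made for this example. In practice a measured room impulse response would presumably replace the toy one.

```python
import numpy as np


def apply_impulse_response(audio, impulse_response):
    """Convolve a mono signal with an impulse response.

    The output is trimmed to the input length and rescaled so that its
    peak matches the peak of the dry input (an assumed normalization).
    """
    audio = np.asarray(audio, dtype=np.float64)
    ir = np.asarray(impulse_response, dtype=np.float64)

    # Full linear convolution, then drop the reverberation tail so the
    # augmented signal stays aligned with the original sample indices.
    wet = np.convolve(audio, ir, mode="full")[: len(audio)]

    # Match the peak level of the dry signal, guarding against silence.
    peak_in = np.max(np.abs(audio))
    peak_out = np.max(np.abs(wet))
    if peak_out > 0:
        wet *= peak_in / peak_out
    return wet


# Hypothetical inputs: one second of noise standing in for a recording,
# and a toy impulse response made of exponentially decaying noise.
rng = np.random.default_rng(0)
sample_rate = 22050  # assumed value, not taken from the paper
dry = rng.standard_normal(sample_rate)

t = np.arange(int(0.3 * sample_rate))
toy_ir = rng.standard_normal(t.size) * np.exp(-t / (0.05 * sample_rate))
toy_ir[0] = 1.0  # direct sound

augmented = apply_impulse_response(dry, toy_ir)
```

Trimming the tail keeps the augmented audio sample-aligned with the dry signal, so any existing audio-to-score annotations would still apply under this assumption.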

research
07/21/2020

Learning to Read and Follow Music in Complete Score Sheet Images

This paper addresses the task of score following in sheet music given as...
research
04/21/2020

MIDI-Sheet Music Alignment Using Bootleg Score Synthesis

MIDI-sheet music alignment is the task of finding correspondences betwee...
research
05/06/2022

Musical Score Following and Audio Alignment

Real-time tracking of the position of a musical performance on a musical...
research
11/13/2017

Audio-to-score alignment of piano music using RNN-based automatic music transcription

We propose a framework for audio-to-score alignment on piano performance...
research
08/05/2017

Detecting Noteheads in Handwritten Scores with ConvNets and Bounding Box Regression

Noteheads are the interface between the written score and music. Each no...
research
10/16/2019

Audio-Conditioned U-Net for Position Estimation in Full Sheet Images

The goal of score following is to track a musical performance, usually i...
research
04/22/2020

Using Cell Phone Pictures of Sheet Music To Retrieve MIDI Passages

This article investigates a cross-modal retrieval problem in which a use...
