Detecting Noteheads in Handwritten Scores with ConvNets and Bounding Box Regression
Noteheads are the interface between the written score and music. Each notehead on the page signifies one note to be played, and detecting noteheads is thus an unavoidable step for Optical Music Recognition. Noteheads are clearly distinct objects, however, the variety of music notation handwriting makes noteheads harder to identify, and while handwritten music notation symbol classification is a well-studied task, symbol detection has usually been limited to heuristics and rule-based systems instead of machine learning methods better suited to deal with the uncertainties in handwriting. We present ongoing work on a simple notehead detector using convolutional neural networks for pixel classification and bounding box regression that achieves a detection f-score of 0.97 on binary score images in the MUSCIMA++ dataset, does not require staff removal, and is applicable to a variety of handwriting styles and levels of musical complexity.
READ FULL TEXT