Late multimodal fusion for image and audio music transcription

04/06/2022
by María Alfaro-Contreras et al.

Music transcription, the conversion of music sources into a structured digital format, is a key problem in Music Information Retrieval (MIR). In computational terms, the MIR community addresses it along two lines of research, depending on the input: music documents, the domain of Optical Music Recognition (OMR), and audio recordings, the domain of Automatic Music Transcription (AMT). Because these inputs differ in nature, the two fields have developed modality-specific frameworks. However, their recent formulation as sequence labeling tasks leads to a common output representation, which enables research on a combined paradigm. In this respect, multimodal image and audio music transcription poses the challenge of effectively combining the information conveyed by the image and audio modalities. In this work, we explore this question at a late-fusion level: we study four combination approaches to merge, for the first time, the hypotheses of end-to-end OMR and AMT systems in a lattice-based search space. The results, obtained for a series of performance scenarios in which the corresponding single-modality models yield different error rates, show that these approaches can be beneficial. In addition, two of the four strategies considered significantly improve on the corresponding unimodal standard recognition frameworks.

