Attention as a Perspective for Learning Tempo-invariant Audio Queries

09/15/2018, by Matthias Dorfer, et al.

Current models for audio–sheet music retrieval via multimodal embedding space learning use convolutional neural networks with a fixed-size window for the input audio. Depending on the tempo of a query performance, this window captures more or less musical content, while notehead density in the score is largely tempo-independent. In this work we address this disparity with a soft attention mechanism that allows the model to encode only those parts of an audio excerpt that are most relevant for producing efficient query codes. Empirical results on classical piano music indicate that attention is beneficial for retrieval performance and exhibits intuitively appealing behavior.
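The soft attention described above can be sketched as attention-weighted pooling: each audio frame embedding receives a relevance score, scores are normalized with a softmax, and the excerpt is summarized as the weighted sum of its frames. The function name, dimensions, and learned score vector `w` below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def soft_attention_pool(frames, w):
    """Collapse a variable-length sequence of frame embeddings (T, D)
    into a single fixed-size query code (D,) via soft attention."""
    scores = frames @ w                    # (T,) relevance score per frame
    scores -= scores.max()                 # subtract max for numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()  # softmax attention weights
    return alpha @ frames                  # (D,) attention-weighted sum

rng = np.random.default_rng(0)
T, D = 40, 8                               # e.g. 40 audio frames, 8-dim embeddings
frames = rng.normal(size=(T, D))           # stand-in for CNN frame embeddings
w = rng.normal(size=D)                     # stand-in for a learned scoring vector
code = soft_attention_pool(frames, w)
print(code.shape)                          # fixed-size code regardless of T
```

Because the attention weights sum to one, the pooled code keeps the same dimensionality whether the query spans few frames (fast tempo) or many (slow tempo), which is the tempo-invariance argument made in the abstract.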

Related research

- 06/26/2019, Learning Soft-Attention Models for Tempo-invariant Audio-Sheet Music Retrieval: Connecting large libraries of digitized audio recordings to their corres...
- 09/21/2023, Passage Summarization with Recurrent Models for Audio-Sheet Music Retrieval: Many applications of cross-modal music retrieval are related to connecti...
- 05/26/2021, Exploiting Temporal Dependencies for Cross-Modal Music Piece Identification: This paper addresses the problem of cross-modal musical piece identifica...
- 04/24/2021, MusCaps: Generating Captions for Music Audio: Content-based music information retrieval has seen rapid progress with t...
- 01/15/2020, Deep Learning for MIR Tutorial: Deep Learning has become state of the art in visual computing and contin...
- 09/09/2022, MATT: A Multiple-instance Attention Mechanism for Long-tail Music Genre Classification: Imbalanced music genre classification is a crucial task in the Music Inf...
- 06/30/2011, A Comprehensive Trainable Error Model for Sung Music Queries: We propose a model for errors in sung queries, a variant of the hidden M...
