Revisiting the Onsets and Frames Model with Additive Attention

04/14/2021
by Kin Wai Cheuk, et al.

Recent advances in automatic music transcription (AMT) have achieved highly accurate polyphonic piano transcription results by incorporating onset and offset detection. The existing literature, however, focuses mainly on leveraging deep and complex models to achieve state-of-the-art (SOTA) accuracy, without understanding model behaviour. In this paper, we conduct a comprehensive examination of the Onsets-and-Frames AMT model and pinpoint the essential components that contribute to strong AMT performance. We do this by exploiting a modified additive attention mechanism. The experimental results suggest that extending the attention mechanism beyond a moderate temporal context does not benefit the model, and that rule-based post-processing is largely responsible for the SOTA performance. We also demonstrate that the onsets are the most significant attentive feature regardless of model complexity. The findings encourage AMT research to place more weight on both a robust onset detector and an effective post-processor.
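
For readers unfamiliar with the term, the following is a minimal sketch of standard additive attention restricted to a local temporal window, written with NumPy. It is not the modified mechanism used in the paper, whose details are not given in the abstract. The function names, the parameter names (`W_q`, `W_k`, `v`, `window`), and the choice to use the same frame features as both query and keys are assumptions made for illustration only.

```python
import numpy as np


def additive_attention(query, keys, W_q, W_k, v):
    """Standard additive attention for one query over a set of keys.

    Shapes (all hypothetical, for illustration):
      query: (d,), keys: (n, d), W_q: (d_a, d), W_k: (d_a, d), v: (d_a,)
    Returns the context vector (d,) and the attention weights (n,).
    """
    # score_i = v^T tanh(W_q q + W_k k_i)
    scores = np.tanh(keys @ W_k.T + W_q @ query) @ v
    # Numerically stable softmax over the keys.
    scores = scores - scores.max()
    weights = np.exp(scores)
    weights = weights / weights.sum()
    # Context is the attention-weighted sum of the keys.
    context = weights @ keys
    return context, weights


def local_additive_attention(features, W_q, W_k, v, window):
    """Apply additive attention at every frame over +/- `window` frames.

    features: (n_frames, d) array of per-frame features.
    A larger `window` corresponds to a wider temporal context.
    """
    n_frames = features.shape[0]
    contexts = np.empty_like(features)
    for t in range(n_frames):
        lo = max(0, t - window)
        hi = min(n_frames, t + window + 1)
        contexts[t], _ = additive_attention(
            features[t], features[lo:hi], W_q, W_k, v
        )
    return contexts


# Example with random data: 10 frames, 4 features, attention size 3.
rng = np.random.default_rng(0)
features = rng.standard_normal((10, 4))
W_q = rng.standard_normal((3, 4))
W_k = rng.standard_normal((3, 4))
v = rng.standard_normal(3)
contexts = local_additive_attention(features, W_q, W_k, v, window=2)
```

Varying `window` in a sketch like this is one way to picture what "temporal context" means in the abstract: each frame only attends to its neighbours within that range.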


