Multitrack Music Transcription with a Time-Frequency Perceiver

06/19/2023
by   Wei-Tsung Lu, et al.

Multitrack music transcription aims to transcribe a music audio input into the musical notes of multiple instruments simultaneously. It is a very challenging task that typically requires a more complex model to achieve satisfactory results. In addition, prior works mostly focus on transcribing regular instruments while neglecting vocals, which are usually the most important signal source when present in a piece of music. In this paper, we propose a novel deep neural network architecture, Perceiver TF, to model the time-frequency representation of audio input for multitrack transcription. Perceiver TF augments the Perceiver architecture by introducing a hierarchical expansion with an additional Transformer layer to model temporal coherence. Accordingly, our model inherits the better scalability of Perceiver, allowing it to handle the transcription of many instruments in a single model. In experiments, we train a Perceiver TF to model 12 instrument classes as well as vocals in a multi-task learning manner. Our results demonstrate that the proposed system outperforms state-of-the-art counterparts (e.g., MT3 and SpecTNT) on various public datasets.
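The abstract only names the building blocks, so the following is a minimal sketch of how a Perceiver-style block with an added temporal attention step might be arranged, not the authors' implementation. Everything in it is an assumption made for illustration: the function name `perceiver_tf_block`, the tensor shapes, the single-head attention without learned projections, and the ordering of the three attention steps. It shows a small latent array reading the frequency axis of each frame through cross-attention, followed by self-attention over time for each latent.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Scaled dot-product attention.
    # q: (..., Nq, D), k and v: (..., Nk, D) -> output: (..., Nq, D)
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v


def perceiver_tf_block(spec, latents):
    # Hypothetical block (illustrative only, not the paper's code).
    # spec:    (T, F, D) time-frequency features, T frames, F frequency bins
    # latents: (K, D)    small latent array shared by all frames, K << F
    T = spec.shape[0]
    q = np.broadcast_to(latents, (T,) + latents.shape)  # (T, K, D)

    # 1) Cross-attention: latents read the F spectral tokens of each frame.
    z = q + attention(q, spec, spec)                    # (T, K, D)

    # 2) Self-attention among the K latents inside each frame.
    z = z + attention(z, z, z)                          # (T, K, D)

    # 3) Assumed temporal step: each latent attends across the T frames.
    zt = np.swapaxes(z, 0, 1)                           # (K, T, D)
    zt = zt + attention(zt, zt, zt)                     # (K, T, D)
    return np.swapaxes(zt, 0, 1)                        # (T, K, D)


rng = np.random.default_rng(0)
spec = rng.standard_normal((16, 64, 8))    # 16 frames, 64 bins, 8 channels
latents = rng.standard_normal((4, 8))      # 4 latents
out = perceiver_tf_block(spec, latents)
print(out.shape)  # (16, 4, 8)
```

In this sketch the cost of the cross-attention step grows with K times F per frame rather than F squared, which is the kind of scalability the Perceiver design is known for; a real model would add learned projections, multiple heads, feed-forward layers, and normalization.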


research
12/09/2021

Music demixing with the sliCQ transform

Music source separation is the task of extracting an estimate of one or ...
research
11/04/2021

MT3: Multi-Task Multitrack Music Transcription

Automatic Music Transcription (AMT), inferring musical notes from raw au...
research
01/25/2020

The impact of Audio input representations on neural network based music transcription

This paper thoroughly analyses the effect of different input representat...
research
05/29/2022

Modeling Beats and Downbeats with a Time-Frequency Transformer

Transformer is a successful deep neural network (DNN) architecture that ...
research
01/15/2019

Classical Music Generation in Distinct Dastgahs with AlimNet ACGAN

In this paper AlimNet (With respect to great musician, Alim Qasimov) an ...
research
11/18/2020

Vertical-Horizontal Structured Attention for Generating Music with Chords

In this paper, we propose a lightweight music-generating model based on ...
research
11/26/2020

Real-time error correction and performance aid for MIDI instruments

Making a slight mistake during live music performance can easily be spot...
