High-dimensional sequence transduction

We investigate the problem of transforming an input sequence into a high-dimensional output sequence in order to transcribe polyphonic audio music into symbolic notation. We introduce a probabilistic model based on a recurrent neural network that is able to learn realistic output distributions given the input, and we devise an efficient algorithm to search for the global mode of that distribution. The resulting method produces musically plausible transcriptions even under high levels of noise and drastically outperforms previous state-of-the-art approaches on five datasets of synthesized sounds and real recordings, approximately halving the test error rate.

research
06/27/2012

Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription

We investigate the problem of modeling symbolic sequences of polyphonic ...
research
08/08/2023

Dual input neural networks for positional sound source localization

In many signal processing applications, metadata may be advantageously u...
research
10/03/2021

Music Playlist Title Generation: A Machine-Translation Approach

We propose a machine-translation approach to automatically generate a pl...
research
07/23/2021

Using Deep Learning Techniques and Inferential Speech Statistics for AI Synthesised Speech Recognition

The recent developments in technology have rewarded us with amazing aud...
research
08/19/2019

An Efficient Algorithm to Test Potentially Bipartiteness of Graphical Degree Sequences

As a partial answer to a question of Rao, a deterministic and customizab...
research
04/04/2016

Recurrent Neural Networks for Polyphonic Sound Event Detection in Real Life Recordings

In this paper we present an approach to polyphonic sound event detection...
research
10/11/2018

Piano Genie

We present Piano Genie, an intelligent controller which allows non-music...
