Differentiable Dictionary Search: Integrating Linear Mixing with Deep Non-Linear Modelling for Audio Source Separation

11/28/2022
by Lukas Samuel Martak, et al.

This paper describes several improvements to Differentiable Dictionary Search (DDS), a method for signal decomposition that we recently formulated. The fundamental idea of DDS is to use normalizing flows, a class of powerful deep invertible density estimators, to model the dictionary in a linear decomposition method such as NMF. This creates a bijection between the space of dictionary elements and the associated probability space, which allows a differentiable search through the dictionary space, guided by the estimated densities. Because the initial formulation was a proof of concept with some practical limitations, we present several steps towards making it scalable, aiming to improve both the computational complexity of the method and its signal decomposition capabilities. As a testbed for experimental evaluation, we choose the task of frame-level piano transcription, where the signal is to be decomposed into sources whose activity is attributed to individual piano notes. To highlight the impact of improved non-linear modelling of sources, we compare variants of our method to a linear overcomplete NMF baseline. Experimental results show that even in the absence of additional constraints, our models produce increasingly sparse and precise decompositions, according to two pertinent evaluation measures.
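
To make the linear side of the comparison concrete, the following is a minimal sketch of estimating non-negative activations for a fixed dictionary, as in an NMF-style decomposition. It is an illustration under stated assumptions, not the authors' baseline: the abstract does not specify the cost function or update rule, so the sketch assumes the squared Euclidean error with Lee-Seung multiplicative updates. The names `nmf_activations`, `matmul`, `transpose`, and the toy matrices are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def nmf_activations(V, W, n_iter=200, eps=1e-12):
    """Estimate non-negative H such that V ~= W H, keeping W fixed.

    Assumed update (squared Euclidean error, multiplicative form):
        H <- H * (W^T V) / (W^T W H)
    """
    n_atoms, n_frames = len(W[0]), len(V[0])
    # A strictly positive start keeps every entry non-negative throughout.
    H = [[1.0] * n_frames for _ in range(n_atoms)]
    Wt = transpose(W)
    WtV = matmul(Wt, V)   # constant numerator term
    WtW = matmul(Wt, W)   # constant Gram matrix of the dictionary
    for _ in range(n_iter):
        WtWH = matmul(WtW, H)
        H = [[h * num / (den + eps) for h, num, den in zip(h_row, n_row, d_row)]
             for h_row, n_row, d_row in zip(H, WtV, WtWH)]
    return H


# Toy data: 3 frequency bins, 2 dictionary atoms, 2 time frames.
W = [[1.0, 0.0],
     [0.5, 0.5],
     [0.0, 1.0]]
H_true = [[2.0, 1.0],
          [1.0, 3.0]]
V = matmul(W, H_true)          # noiseless mixture of the two atoms
H_est = nmf_activations(V, W)  # recovered activations
```

In this sketch the columns of `W` are fixed templates. The idea described in the abstract would instead let each dictionary element be produced by an invertible density model and refine it by gradient steps, which is not shown here.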

research
11/28/2022

Probabilistic Modelling of Signal Mixtures with Differentiable Dictionaries

We introduce a novel way to incorporate prior information into (semi-) s...
research
09/29/2018

Generalized Multichannel Variational Autoencoder for Underdetermined Source Separation

This paper deals with a multichannel audio source separation problem und...
research
07/04/2019

Blind Audio Source Separation with Minimum-Volume Beta-Divergence NMF

Considering a mixed signal composed of various audio sources and recorde...
research
04/05/2019

Unsupervised Low Latency Speech Enhancement with RT-GCC-NMF

In this paper, we present RT-GCC-NMF: a real-time (RT), two-channel blin...
research
09/20/2010

Fast Sparse Decomposition by Iterative Detection-Estimation

Finding sparse solutions of underdetermined systems of linear equations ...
research
10/27/2016

Sparse Signal Subspace Decomposition Based on Adaptive Over-complete Dictionary

This paper proposes a subspace decomposition method based on an over-com...
research
10/25/2022

Search for Concepts: Discovering Visual Concepts Using Direct Optimization

Finding an unsupervised decomposition of an image into individual object...
