Examining the Mapping Functions of Denoising Autoencoders in Music Source Separation

The goal of this work is to investigate what music source separation approaches based on neural networks learn from the data. We examine the mapping functions of neural networks that are based on the denoising autoencoder (DAE) model and conditioned on the mixture magnitude spectra. To approximate the mapping functions, we propose an algorithm inspired by knowledge distillation, which we call the neural couplings algorithm (NCA). The NCA yields a matrix that expresses the mapping from the mixture to the target source magnitude information. Using the NCA, we examine the mapping functions of three fundamental DAE models in music source separation: one with a single-layer encoder and decoder, one with a multi-layer encoder and a single-layer decoder, and one using skip-filtering (SF) connections with a single encoding and decoding layer. We first train these models on realistic data to estimate the singing voice magnitude spectra from the corresponding mixture. We then use the optimized models and test spectral data as input to the NCA. Our experimental findings show that approaches based on the DAE model learn scalar filtering operators, whose mapping functions have a predominantly diagonal structure, which limits how much of the inter-frequency structure of music data they exploit. In contrast, skip-filtering connections are shown to help the DAE model learn filtering operators that exploit richer inter-frequency structure.
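
The distinction between a predominantly diagonal mapping and one with inter-frequency structure can be illustrated with a small NumPy sketch. This is not the NCA, whose details are not given in the abstract; it swaps in a plain least-squares linear fit as a stand-in for "a matrix that expresses the mapping". The data is synthetic, and the two "models", the helper names `fit_linear_mapping` and `diagonal_energy_ratio`, and the diagonal-energy measure are all assumptions made for illustration.

```python
import numpy as np


def fit_linear_mapping(mixture_mag, estimate_mag):
    """Least-squares matrix C such that mixture_mag @ C ~= estimate_mag.

    Both inputs have shape (frames, freq_bins). This is an ordinary
    linear fit used for illustration, not the NCA from the paper.
    """
    C, *_ = np.linalg.lstsq(mixture_mag, estimate_mag, rcond=None)
    return C


def diagonal_energy_ratio(C):
    """Fraction of the squared energy of C that lies on its main diagonal."""
    return float(np.sum(np.diag(C) ** 2) / np.sum(C ** 2))


rng = np.random.default_rng(0)
frames, n_bins = 400, 16

# Synthetic non-negative "mixture magnitude spectra" (assumption: random data).
X = np.abs(rng.normal(size=(frames, n_bins)))

# Hypothetical model A: one scalar gain per frequency bin (a scalar filter).
gains = rng.uniform(0.1, 0.9, size=n_bins)
Y_scalar = X * gains

# Hypothetical model B: each output bin also draws on its two neighbours.
M = np.diag(gains) + 0.3 * np.eye(n_bins, k=1) + 0.3 * np.eye(n_bins, k=-1)
Y_mixed = X @ M

C_scalar = fit_linear_mapping(X, Y_scalar)
C_mixed = fit_linear_mapping(X, Y_mixed)

print("diagonal energy, scalar filter:", diagonal_energy_ratio(C_scalar))
print("diagonal energy, mixing filter:", diagonal_energy_ratio(C_mixed))
```

Under these assumptions, the fitted matrix for the per-bin gain model is essentially diagonal, while the matrix for the neighbour-mixing model carries a visible share of off-diagonal energy. A real analysis would replace the synthetic arrays with test mixture spectra and the outputs of a trained model.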

research
09/02/2017

A Recurrent Encoder-Decoder Approach with Skip-filtering Connections for Monaural Singing Voice Separation

The objective of deep learning methods based on encoder-decoder architec...
research
07/05/2018

Denoising Auto-encoder with Recurrent Skip Connections and Residual Regression for Music Source Separation

Convolutional neural networks with skip connections have shown good perf...
research
07/30/2018

Harmonic-Percussive Source Separation with Deep Neural Networks and Phase Recovery

Harmonic/percussive source separation (HPSS) consists in separating the ...
research
02/12/2020

Content Based Singing Voice Extraction From a Musical Mixture

We present a deep learning based methodology for extracting the singing ...
research
08/12/2020

Channel-wise Subband Input for Better Voice and Accompaniment Separation on High Resolution Music

This paper presents a new input format, channel-wise subband input (CWS)...
research
08/22/2017

Bitwise Source Separation on Hashed Spectra: An Efficient Posterior Estimation Scheme Using Partial Rank Order Metrics

This paper proposes an efficient bitwise solution to the single-channel ...
research
05/06/2019

Investigating kernel shapes and skip connections for deep learning-based harmonic-percussive separation

In this paper we propose an efficient deep learning encoder-decoder netw...
