Two-Step Sound Source Separation: Training on Learned Latent Targets

10/22/2019
by Efthymios Tzinis, et al.

In this paper, we propose a two-step training procedure for source separation via a deep neural network. In the first step, we learn a transform (and its inverse) to a latent space where masking-based separation performance using oracle masks is optimal. In the second step, we train a separation module that operates on the previously learned space. To do so, we use a scale-invariant signal-to-distortion ratio (SI-SDR) loss function that works in the latent space, and we prove that it lower-bounds the SI-SDR in the time domain. We run various sound separation experiments showing that this approach obtains better performance than systems that learn the transform and the separation module jointly. The proposed methodology is general enough to apply to a large class of end-to-end neural network separation systems.
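
For readers unfamiliar with the metric named in the abstract, the following is a minimal sketch of SI-SDR computed in the time domain, using the commonly cited definition (project the estimate onto the target, then compare the energy of the projection with the energy of the residual). It is not the authors' code and does not implement their latent-space loss; the function name `si_sdr`, the `eps` stabilizer, and the synthetic signals are assumptions made for illustration only.

```python
import numpy as np


def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB for two 1-D signals (illustrative sketch)."""
    estimate = np.asarray(estimate, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    # Optimal scaling of the target that best explains the estimate.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = scale * target
    # Everything in the estimate not explained by the scaled target.
    residual = estimate - projection
    ratio = (np.dot(projection, projection) + eps) / (np.dot(residual, residual) + eps)
    return 10.0 * np.log10(ratio)


# Hypothetical data: a random "clean" signal plus a small amount of noise.
rng = np.random.default_rng(0)
target = rng.standard_normal(16000)
noisy = target + 0.1 * rng.standard_normal(16000)

value = si_sdr(noisy, target)
# Rescaling the estimate should leave the metric (almost exactly) unchanged.
rescaled_value = si_sdr(3.0 * noisy, target)
```

Because the target is rescaled to fit the estimate before the ratio is taken, multiplying the estimate by a constant does not change the result, which is the scale invariance the metric's name refers to.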

research
10/05/2018

End-to-end Networks for Supervised Single-channel Speech Separation

The performance of single channel source separation algorithms has impro...
research
11/22/2022

Latent Iterative Refinement for Modular Source Separation

Traditional source separation approaches train deep neural network model...
research
07/31/2023

Deep Learning Meets Adaptive Filtering: A Stein's Unbiased Risk Estimator Approach

This paper revisits two prominent adaptive filtering algorithms through ...
research
03/07/2021

HTMD-Net: A Hybrid Masking-Denoising Approach to Time-Domain Monaural Singing Voice Separation

The advent of deep learning has led to the prevalence of deep neural net...
research
10/13/2021

Deep Metric Learning with Locality Sensitive Angular Loss for Self-Correcting Source Separation of Neural Spiking Signals

Neurophysiological time series, such as electromyographic signal and int...
research
10/11/2021

Unsupervised Source Separation via Bayesian Inference in the Latent Domain

State of the art audio source separation models rely on supervised data-...
research
08/23/2019

Incremental Binarization On Recurrent Neural Networks For Single-Channel Source Separation

This paper proposes a Bitwise Gated Recurrent Unit (BGRU) network for th...
