Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation

02/13/2015
by Po-Sen Huang, et al.

Monaural source separation is important for many real-world applications. It is challenging because, with only a single channel of information available and no additional constraints, infinitely many solutions are possible. In this paper, we explore joint optimization of masking functions and deep recurrent neural networks for monaural source separation tasks, including monaural speech separation, monaural singing voice separation, and speech denoising. Jointly optimizing the deep recurrent neural networks with an extra masking layer enforces a reconstruction constraint. Moreover, we explore a discriminative training criterion for the neural networks to further enhance separation performance. We evaluate the proposed system on the TSP, MIR-1K, and TIMIT datasets for speech separation, singing voice separation, and speech denoising, respectively. Our approaches achieve 2.30–4.98 dB SDR gain compared to NMF models in the speech separation task and 2.30–2.48 dB GNSDR gain and 4.32–5.42 dB GSIR gain compared to existing models in the singing voice separation task, and they outperform NMF and DNN baselines in the speech denoising task.
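The abstract does not give the formulas, so the following is only a minimal sketch of how a masking layer can enforce a reconstruction constraint and how a discriminative criterion can be written. It assumes a common soft time-frequency mask (each source's share of the summed output magnitudes) and a squared-error objective with a penalty weight `gamma`; the function names, `eps`, and the `gamma` value are illustrative assumptions, not the authors' code.

```python
# Sketch under stated assumptions: names, eps, and gamma are hypothetical.

def soft_mask_layer(y1_hat, y2_hat, mixture, eps=1e-12):
    """Turn two raw network outputs into masked source estimates.

    Each argument is a flat list of magnitudes for one spectral frame.
    The mask for source 1 is its share of the total predicted magnitude,
    and source 2 gets the remainder, so the two estimates sum to the
    mixture in every bin (the reconstruction constraint).
    """
    s1, s2 = [], []
    for a, b, x in zip(y1_hat, y2_hat, mixture):
        m1 = abs(a) / (abs(a) + abs(b) + eps)  # soft mask in [0, 1]
        s1.append(m1 * x)
        s2.append((1.0 - m1) * x)
    return s1, s2


def squared_error(u, v):
    """Sum of squared differences between two equal-length lists."""
    return sum((p - q) ** 2 for p, q in zip(u, v))


def discriminative_loss(s1, s2, y1, y2, gamma=0.05):
    """Reward closeness to the correct target and, weighted by gamma,
    distance from the other source's target (assumed form)."""
    return (squared_error(s1, y1) - gamma * squared_error(s1, y2)
            + squared_error(s2, y2) - gamma * squared_error(s2, y1))


mixture = [1.0, 2.0, 0.5]
s1, s2 = soft_mask_layer([0.9, 0.2, 0.1], [0.1, 1.8, 0.4], mixture)
```

Because the mask is applied inside the model, the estimates `s1` and `s2` add back up to `mixture` by construction, and the loss is computed on the masked estimates rather than on the raw outputs. Setting `gamma=0` reduces the objective to a plain sum of squared errors.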
