Improving Deep Attractor Network by BGRU and GMM for Speech Separation

08/07/2023
by   Rawad Melhem, et al.

Deep Attractor Network (DANet) is a state-of-the-art technique in the field of speech separation. It uses Bidirectional Long Short-Term Memory (BLSTM), but the complexity of the DANet model is very high. In this paper, a simplified yet powerful DANet model is proposed that uses a Bidirectional Gated Recurrent Unit (BGRU) network instead of BLSTM. A Gaussian Mixture Model (GMM), rather than k-means, is applied in DANet as the clustering algorithm to reduce complexity and to increase learning speed and accuracy. The metrics used in this paper are Signal-to-Distortion Ratio (SDR), Signal-to-Interference Ratio (SIR), Signal-to-Artifacts Ratio (SAR), and the Perceptual Evaluation of Speech Quality (PESQ) score. Two-speaker mixture datasets from the TIMIT corpus were prepared to evaluate the proposed model. The system achieved an SDR of 12.3 dB and a PESQ score of 2.94, both better than the original DANet model. Other improvements were 20.7 training, respectively. The model was also applied to mixed Arabic speech signals, and the results were better than those for English.
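To make the BLSTM-to-BGRU simplification concrete, the sketch below counts the parameters of a single bidirectional recurrent layer under the standard formulations, in which an LSTM cell has four gate-like weight blocks and a GRU cell has three. The layer sizes are hypothetical values chosen for illustration and are not taken from the paper, and the function name `recurrent_params` is likewise an assumption. The count covers one recurrent layer only, so it should not be compared with model-level figures reported in the paper.

```python
def recurrent_params(input_size, hidden_size, gates, bidirectional=True):
    """Count the weights and biases of one recurrent layer.

    Each gate-like block has an input matrix (hidden x input), a recurrent
    matrix (hidden x hidden), and one bias vector (hidden).
    Assumption: a single bias per block; some libraries keep two, which
    changes the totals but not the LSTM-to-GRU ratio.
    """
    per_direction = gates * (hidden_size * (input_size + hidden_size) + hidden_size)
    return per_direction * (2 if bidirectional else 1)


# Hypothetical sizes, for illustration only (not from the paper).
INPUT_SIZE = 129
HIDDEN_SIZE = 300

blstm = recurrent_params(INPUT_SIZE, HIDDEN_SIZE, gates=4)  # LSTM: 4 blocks
bgru = recurrent_params(INPUT_SIZE, HIDDEN_SIZE, gates=3)   # GRU: 3 blocks

print(f"BLSTM layer: {blstm} parameters")
print(f"BGRU layer:  {bgru} parameters")
print(f"BGRU / BLSTM ratio: {bgru / blstm:.2f}")
```

Under these assumptions the recurrent layer itself shrinks by a quarter regardless of the sizes chosen. The saving for a complete model depends on its other layers, which this sketch does not cover.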


research
02/19/2019

Low-Latency Deep Clustering For Speech Separation

This paper proposes a low algorithmic latency adaptation of the deep clu...
research
12/25/2019

Utterance-level Permutation Invariant Training with Latency-controlled BLSTM for Single-channel Multi-talker Speech Separation

Utterance-level permutation invariant training (uPIT) has achieved promi...
research
01/15/2019

Orthonormal Embedding-based Deep Clustering for Single-channel Speech Separation

Deep clustering is a deep neural network-based speech separation algorit...
research
02/02/2019

FurcaNet: An end-to-end deep gated convolutional, long short-term memory, deep neural networks for single channel speech separation

Deep gated convolutional networks have been proved to be very effective ...
research
08/23/2020

Independent Vector Analysis with Deep Neural Network Source Priors

This paper studies the density priors for independent vector analysis (I...
research
08/09/2022

Recycling an anechoic pre-trained speech separation deep neural network for binaural dereverberation of a single source

Reverberation results in reduced intelligibility for both normal and hea...
research
02/08/2021

Speaker and Direction Inferred Dual-channel Speech Separation

Most speech separation methods, trying to separate all channel sources s...
