SAGRNN: Self-Attentive Gated RNN for Binaural Speaker Separation with Interaural Cue Preservation

09/02/2020
by   Ke Tan, et al.
0

Most existing deep learning based binaural speaker separation systems focus on producing a monaural estimate for each of the target speakers, and thus do not preserve the interaural cues, which are crucial for human listeners to perform sound localization and lateralization. In this study, we address talker-independent binaural speaker separation with interaural cues preserved in the estimated binaural signals. Specifically, we extend a newly-developed gated recurrent neural network for monaural separation by additionally incorporating self-attention mechanisms and dense connectivity. We develop an end-to-end multiple-input multiple-output system, which directly maps from the binaural waveform of the mixture to those of the speech signals. The experimental results show that our proposed approach achieves significantly better separation performance than a recent binaural separation approach. In addition, our approach effectively preserves the interaural cues, which improves the accuracy of sound localization.

READ FULL TEXT
research
02/16/2020

Real-time binaural speech separation with preserved spatial cues

Deep learning speech separation algorithms have achieved great success i...
research
04/17/2021

MIMO Self-attentive RNN Beamformer for Multi-speaker Speech Separation

Recently, our proposed recurrent neural network (RNN) based all deep lea...
research
03/13/2023

Online Binaural Speech Separation of Moving Speakers With a Wavesplit Network

Binaural speech separation in real-world scenarios often involves moving...
research
12/07/2022

MIMO-DBnet: Multi-channel Input and Multiple Outputs DOA-aware Beamforming Network for Speech Separation

Recently, many deep learning based beamformers have been proposed for mu...
research
04/25/2019

Divide and Conquer: A Deep CASA Approach to Talker-independent Monaural Speaker Separation

We address talker-independent monaural speaker separation from the persp...
research
04/24/2023

Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters

In a multi-channel separation task with multiple speakers, we aim to rec...

Please sign up or login with your details

Forgot password? Click here to reset