Multi-channel Speech Separation Using Spatially Selective Deep Non-linear Filters

04/24/2023
by   Kristina Tesch, et al.
0

In a multi-channel separation task with multiple speakers, we aim to recover all individual speech signals from the mixture. In contrast to single-channel approaches, which rely on the different spectro-temporal characteristics of the speech signals, multi-channel approaches should additionally utilize the different spatial locations of the sources for a more powerful separation especially when the number of sources increases. To enhance the spatial processing in a multi-channel source separation scenario, in this work, we propose a deep neural network (DNN) based spatially selective filter (SSF) that can be spatially steered to extract the speaker of interest by initializing a recurrent neural network layer with the target direction. We compare the proposed SSF with a common end-to-end direct separation (DS) approach trained using utterance-wise permutation invariant training (PIT), which only implicitly learns to perform spatial filtering. We show that the SSF has a clear advantage over a DS approach with the same underlying network architecture when there are more than two speakers in the mixture, which can be attributed to a better use of the spatial information. Furthermore, we find that the SSF generalizes much better to additional noise sources that were not seen during training.

READ FULL TEXT

page 1

page 3

research
11/04/2022

Spatially Selective Deep Non-linear Filters for Speaker Extraction

In a scenario with multiple persons talking simultaneously, the spatial ...
research
04/28/2020

Neural Speech Separation Using Spatially Distributed Microphones

This paper proposes a neural network based speech separation method usin...
research
03/13/2023

Multi-Microphone Speaker Separation by Spatial Regions

We consider the task of region-based source separation of reverberant mu...
research
04/18/2021

Many-Speakers Single Channel Speech Separation with Optimal Permutation Training

Single channel speech separation has experienced great progress in the l...
research
01/02/2020

Temporal-Spatial Neural Filter: Direction Informed End-to-End Multi-channel Target Speech Separation

Target speech separation refers to extracting the target speaker's speec...
research
05/31/2023

UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-determined Training Mixtures

In reverberant conditions with multiple concurrent speakers, each microp...
research
09/02/2020

SAGRNN: Self-Attentive Gated RNN for Binaural Speaker Separation with Interaural Cue Preservation

Most existing deep learning based binaural speaker separation systems fo...

Please sign up or login with your details

Forgot password? Click here to reset