Spatially Selective Deep Non-linear Filters for Speaker Extraction

11/04/2022
by Kristina Tesch, et al.

When multiple people talk simultaneously, the spatial characteristics of the signals are the most distinctive feature for extracting the target signal. In this work, we develop a deep joint spatial-spectral non-linear filter that can be steered toward an arbitrary target direction. To this end, we propose a simple and effective conditioning mechanism that sets the initial state of the filter's recurrent layers based on the target direction. We show that this scheme is more effective than the baseline approach and increases the flexibility of the filter at no performance cost. As we demonstrate in this paper, the resulting spatially selective non-linear filters can also be used to separate an arbitrary number of speakers and enable very accurate multi-speaker localization.
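To make the conditioning idea concrete, the following is a minimal sketch of steering a recurrent layer through its initial state. It is not the authors' implementation. Every name here (`SteerableRNN`, `direction_to_one_hot`, `num_bins`) is hypothetical, the weights are random rather than trained, the plain tanh recurrent cell is a stand-in for whatever recurrent layers the actual filter uses, and quantizing the angle into a one-hot vector is an assumption made for illustration.

```python
import math
import random


def direction_to_one_hot(angle_deg, num_bins):
    """Quantize an angle in degrees into one of num_bins equal sectors."""
    bin_width = 360.0 / num_bins
    index = int((angle_deg % 360.0) // bin_width) % num_bins
    one_hot = [0.0] * num_bins
    one_hot[index] = 1.0
    return one_hot


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.5):
    """Untrained placeholder weights, drawn uniformly from [-scale, scale]."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class SteerableRNN:
    """Toy tanh RNN whose initial hidden state depends on the target direction."""

    def __init__(self, input_size, hidden_size, num_bins, seed=0):
        rng = random.Random(seed)
        self.num_bins = num_bins
        self.w_in = random_matrix(hidden_size, input_size, rng)
        self.w_rec = random_matrix(hidden_size, hidden_size, rng)
        # Assumption: a linear map from the one-hot direction to the state.
        self.w_dir = random_matrix(hidden_size, num_bins, rng)

    def initial_state(self, angle_deg):
        """Map the target direction to the initial hidden state h_0."""
        return matvec(self.w_dir, direction_to_one_hot(angle_deg, self.num_bins))

    def forward(self, frames, angle_deg):
        """Run the RNN over a list of feature frames, starting from h_0(angle)."""
        h = self.initial_state(angle_deg)
        outputs = []
        for x in frames:
            pre = [a + b for a, b in zip(matvec(self.w_in, x), matvec(self.w_rec, h))]
            h = [math.tanh(v) for v in pre]
            outputs.append(h)
        return outputs
```

In this sketch the input frames are identical for every steering direction; only the initial state changes, so calling `forward` on the same frames with two different angles yields different hidden-state sequences. In a trained model, that dependence on the initial state is what would let one set of weights serve arbitrary target directions.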


