Multi-Microphone Speaker Separation by Spatial Regions

03/13/2023
by   Julian Wechsler, et al.

We consider the task of region-based source separation of reverberant multi-microphone recordings. We assume pre-defined spatial regions with a single active source per region. The objective is to estimate the signals from the individual spatial regions as captured by a reference microphone while retaining a correspondence between signals and spatial regions. We propose a data-driven approach using a modified version of a state-of-the-art network, where different layers model spatial and spectro-temporal information. The network is trained to enforce a fixed mapping of regions to network outputs. Using speech from LibriMix, we construct a data set specifically designed to contain the region information. Additionally, we train the network with permutation invariant training. We show that both training methods result in a fixed mapping of regions to network outputs, achieve comparable performance, and that the networks exploit spatial information. The proposed network outperforms a baseline network by 1.5 dB in scale-invariant signal-to-distortion ratio.
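The abstract contrasts two training schemes (a fixed mapping of regions to network outputs, and permutation invariant training) and reports results in scale-invariant signal-to-distortion ratio (SI-SDR). The following is a minimal NumPy sketch of that contrast, not the authors' implementation: it uses the common SI-SDR definition, and the function names `si_sdr`, `fixed_mapping_score`, and `pit_score` are hypothetical, chosen here for illustration only. The network itself and the exact training loss used in the paper are not shown in this abstract and are not reproduced.

```python
import itertools

import numpy as np


def si_sdr(estimate, target, eps=1e-8):
    """Scale-invariant SDR in dB between two 1-D signals (common definition)."""
    # Remove the mean so the measure ignores DC offsets.
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target; the optimal scale makes the
    # measure invariant to the gain of the estimate.
    scale = np.dot(estimate, target) / (np.dot(target, target) + eps)
    projection = scale * target
    residual = estimate - projection
    ratio = (np.sum(projection ** 2) + eps) / (np.sum(residual ** 2) + eps)
    return 10.0 * np.log10(ratio)


def fixed_mapping_score(estimates, targets):
    """Mean SI-SDR with output i tied to region i (no reordering allowed)."""
    return float(np.mean([si_sdr(e, t) for e, t in zip(estimates, targets)]))


def pit_score(estimates, targets):
    """Best mean SI-SDR over all output-to-region permutations.

    Returns the score and the permutation, where permutation[i] is the
    index of the target assigned to output i.
    """
    best_score, best_perm = -np.inf, None
    for perm in itertools.permutations(range(len(targets))):
        score = np.mean([si_sdr(estimates[i], targets[p])
                         for i, p in enumerate(perm)])
        if score > best_score:
            best_score, best_perm = float(score), perm
    return best_score, best_perm
```

In this sketch, a fixed-mapping objective would maximize `fixed_mapping_score` directly, so a swapped output is penalized, whereas a permutation-invariant objective would maximize `pit_score`, which accepts any ordering. The abstract reports that both schemes nonetheless end up with a fixed mapping of regions to outputs. The exhaustive search over permutations grows factorially with the number of regions, which is acceptable only for small region counts.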

