Convolutive Transfer Function Invariant SDR training criteria for Multi-Channel Reverberant Speech Separation

11/30/2020
by Christoph Boeddeker, et al.

Time-domain training criteria have proven very effective for separating single-channel non-reverberant speech mixtures. Likewise, mask-based beamforming has shown impressive performance in multi-channel reverberant speech enhancement and source separation. Here, we propose to combine neural-network-supported multi-channel source separation with a time-domain training objective. As the objective, we propose a loss based on the convolutive transfer function invariant Signal-to-Distortion Ratio (CI-SDR). While this is a well-known evaluation metric (BSS Eval), it has not been used as a training objective before. To show its effectiveness, we evaluate the system on LibriSpeech-based reverberant mixtures. On this task, the proposed system approaches the error rate obtained on single-source non-reverberant input, i.e., LibriSpeech test_clean, with a difference of only 1.2 percentage points. It thereby outperforms, by a large margin, both a conventional system based on permutation invariant training and alternative objectives such as the Scale-Invariant Signal-to-Distortion Ratio (SI-SDR).
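The difference between the two objectives named above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `si_sdr` and `ci_sdr`, the short `filter_length=8`, and the plain least-squares solve are assumptions chosen for readability (BSS Eval style metrics typically use a much longer filter, and a training loss would be the negated value computed in a framework with automatic differentiation). SI-SDR lets the target differ from the reference only by a scalar gain, whereas CI-SDR lets it differ by a short FIR filter, so a reverberation-like filtering of the reference is not counted as distortion.

```python
import numpy as np


def si_sdr(reference, estimate):
    """Scale-invariant SDR in dB: the target is the reference times one gain."""
    scale = np.dot(estimate, reference) / np.dot(reference, reference)
    target = scale * reference
    distortion = estimate - target
    return 10 * np.log10(np.sum(target**2) / np.sum(distortion**2))


def ci_sdr(reference, estimate, filter_length=8):
    """Convolutive-transfer-function-invariant SDR in dB (illustrative sketch).

    The target is the reference passed through the FIR filter (with
    `filter_length` taps, an assumed value) that best explains the estimate
    in the least-squares sense.
    """
    n = len(reference)
    # Column k holds the reference delayed by k samples.
    delayed = np.zeros((n, filter_length))
    for k in range(filter_length):
        delayed[k:, k] = reference[: n - k]
    # Least-squares filter taps: minimize ||estimate - delayed @ taps||^2.
    taps, *_ = np.linalg.lstsq(delayed, estimate, rcond=None)
    target = delayed @ taps
    distortion = estimate - target
    return 10 * np.log10(np.sum(target**2) / np.sum(distortion**2))
```

For an estimate that equals a filtered copy of the reference plus a little noise, `ci_sdr` stays high while `si_sdr` drops, because the scalar gain cannot absorb the filtering. Since the single-gain target is the special case of a one-tap filter, `ci_sdr` is never lower than `si_sdr` for the same pair of signals.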

research
11/15/2021

Monaural source separation: From anechoic to reverberant environments

Impressive progress in neural network-based single-channel speech source...
research
11/11/2020

Surrogate Source Model Learning for Determined Source Separation

We propose to learn surrogate functions of universal speech priors for d...
research
11/06/2018

SDR - half-baked or well done?

In speech enhancement and source separation, signal-to-noise ratio is a ...
research
03/13/2023

Multi-Microphone Speaker Separation by Spatial Regions

We consider the task of region-based source separation of reverberant mu...
research
11/20/2019

Demystifying TasNet: A Dissecting Approach

In recent years time domain speech separation has excelled over frequenc...
research
10/29/2021

SA-SDR: A novel loss function for separation of meeting style data

Many state-of-the-art neural network-based source separation systems use...
research
04/07/2022

Heterogeneous Target Speech Separation

We introduce a new paradigm for single-channel target source separation ...
