SA-SDR: A novel loss function for separation of meeting style data

10/29/2021
by Thilo von Neumann, et al.

Many state-of-the-art neural network-based source separation systems use the averaged Signal-to-Distortion Ratio (SDR) as a training objective function. The basic SDR is, however, undefined if the network reconstructs the reference signal perfectly or if the reference signal contains silence, e.g., when a two-output separator processes a single-speaker recording. Many modifications to the plain SDR have been proposed that trade off the robustness of the loss against distortion of its value. We propose to switch from a mean over the SDRs of each individual output channel to a global SDR over all output channels at the same time, which we call source-aggregated SDR (SA-SDR). This makes the loss robust against silence and perfect reconstruction as long as at least one reference signal is not silent. We experimentally show that our proposed SA-SDR is more stable than, and preferable to, other well-known modifications when processing meeting-style data, which typically contains many silent or single-speaker regions.
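The difference between the two objectives can be illustrated with a minimal NumPy sketch. It assumes that the averaged SDR is the mean of per-channel energy ratios in dB and that SA-SDR sums signal and error energies over all channels before taking the ratio, which is one reading of the description above. The function names `averaged_sdr` and `sa_sdr` and the toy signals are illustrative and are not taken from the paper.

```python
import numpy as np


def averaged_sdr(reference, estimate):
    """Mean of per-channel SDRs in dB; inputs have shape (channels, samples)."""
    signal_energy = np.sum(reference ** 2, axis=-1)
    error_energy = np.sum((reference - estimate) ** 2, axis=-1)
    # A silent reference channel gives log10(0) = -inf; silence the warning.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.mean(10 * np.log10(signal_energy / error_energy))


def sa_sdr(reference, estimate):
    """Single SDR in dB with energies aggregated over all channels."""
    signal_energy = np.sum(reference ** 2)
    error_energy = np.sum((reference - estimate) ** 2)
    return 10 * np.log10(signal_energy / error_energy)


# Toy single-speaker case: channel 0 is active, channel 1 is silent.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
reference = np.zeros((2, 8000))
reference[0] = np.sin(2 * np.pi * 440.0 * t)
estimate = reference + 0.01 * rng.standard_normal(reference.shape)

avg_value = averaged_sdr(reference, estimate)  # not finite: silent channel
sa_value = sa_sdr(reference, estimate)         # finite: one active channel

print("averaged SDR:", avg_value)
print("SA-SDR:", sa_value)
```

In this toy case the silent second reference makes the per-channel SDR of that channel diverge, so the average is not finite, while the aggregated ratio stays finite because the first reference carries energy. For training, the negated value would typically serve as the loss.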


research
11/06/2019

The sound of my voice: speaker representation loss for target voice separation

Research on content and style representations has been widely studied in...
research
11/30/2020

Convolutive Transfer Function Invariant SDR training criteria for Multi-Channel Reverberant Speech Separation

Time-domain training criteria have proven to be very effective for the s...
research
06/24/2019

Single-Channel Speech Separation with Auxiliary Speaker Embeddings

We present a novel source separation model to decompose a single-channel ...
research
11/09/2020

Guided Source Separation

State-of-the-art separation of desired signal components from a mixture ...
research
03/13/2023

Multi-Microphone Speaker Separation by Spatial Regions

We consider the task of region-based source separation of reverberant mu...
research
12/12/2018

Separation of water and fat signal in whole-body gradient echo scans using convolutional neural networks

Purpose: To perform and evaluate water and fat signal separation of whol...
research
10/13/2021

Deep Metric Learning with Locality Sensitive Angular Loss for Self-Correcting Source Separation of Neural Spiking Signals

Neurophysiological time series, such as electromyographic signal and int...
