Training Noisy Single-Channel Speech Separation With Noisy Oracle Sources: A Large Gap and A Small Step

10/23/2020
by   Matthew Maciejewski, et al.

As the performance of single-channel speech separation systems has improved, there has been a desire to move to more challenging conditions than the clean, near-field speech that initial systems were developed on. When training deep learning separation models, a need for ground truth leads to training on synthetic mixtures. As such, training in noisy conditions requires either using noise synthetically added to clean speech, preventing the use of in-domain data for a noisy-condition task, or training using mixtures of noisy speech, requiring the network to additionally separate the noise. We demonstrate the relative inseparability of noise and that this noisy speech paradigm leads to significant degradation of system performance. We also propose an SI-SDR-inspired training objective that tries to exploit the inseparability of noise to implicitly partition the signal and discount noise separation errors, enabling the training of better separation systems with noisy oracle sources.
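The abstract does not spell out the proposed objective, so it is not reproduced here. As background only, the following is a minimal sketch of the standard scale-invariant signal-to-distortion ratio (SI-SDR) that the objective is described as being inspired by. The function name `si_sdr`, the `eps` stabilizer, and the pure-Python list representation of signals are assumptions made for illustration and are not taken from the paper.

```python
import math


def si_sdr(estimate, reference, eps=1e-8):
    """Standard SI-SDR in dB between two equal-length 1-D signals.

    Illustrative sketch only: this is the conventional metric, not the
    modified training objective proposed in the paper.
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    n = len(reference)

    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]

    # Optimal scaling of the reference: this projection is what makes
    # the measure invariant to the gain of the estimate.
    ref_energy = sum(r * r for r in ref)
    alpha = sum(e * r for e, r in zip(est, ref)) / (ref_energy + eps)

    # Energy of the scaled target and of everything left over.
    target_energy = alpha * alpha * ref_energy
    residual_energy = sum((e - alpha * r) ** 2 for e, r in zip(est, ref))

    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))
```

In training, the negative of this quantity is commonly used as a loss. When the reference itself is a noisy oracle source, as in the setting the abstract describes, the residual term also penalizes errors in reproducing the reference's noise, which is the kind of error the proposed objective aims to discount.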

research
10/22/2019

WHAMR!: Noisy and Reverberant Single-Channel Speech Separation

While significant advances have been made in recent years in the separat...
research
11/06/2018

Building Corpora for Single-Channel Speech Separation Across Multiple Domains

To date, the bulk of research on single-channel speech separation has be...
research
05/07/2021

A Benchmarking on Cloud based Speech-To-Text Services for French Speech and Background Noise Effect

This study presents a large scale benchmarking on cloud based Speech-To-...
research
12/20/2017

Limits for Rumor Spreading in stochastic populations

Biological systems can share and collectively process information to yie...
research
11/15/2022

Reverberation as Supervision for Speech Separation

This paper proposes reverberation as supervision (RAS), a novel unsuperv...
research
07/02/2019

WHAM!: Extending Speech Separation to Noisy Environments

Recent progress in separating the speech signals from multiple overlappi...
research
12/02/2020

Improved MVDR Beamforming Using LSTM Speech Models to Clean Spatial Clustering Masks

Spatial clustering techniques can achieve significant multi-channel nois...
