WHAMR!: Noisy and Reverberant Single-Channel Speech Separation

10/22/2019
by Matthew Maciejewski, et al.

While significant advances have been made in recent years in the separation of overlapping speech signals, studies have been largely constrained to mixtures of clean, near-field speech, not representative of many real-world scenarios. Although the WHAM! dataset introduced noise to the ubiquitous wsj0-2mix dataset, it did not include the addition of reverberation, generally present in indoor recordings outside of recording studios. The spectral smearing caused by reverberation can result in significant performance degradation for standard deep learning-based speech separation systems, which rely on spectral structure and the sparsity of speech signals to tease apart sources. To address this, we introduce WHAMR!, an augmented version of WHAM! with synthetic reverberated sources, and provide a thorough baseline analysis of current techniques as well as novel cascaded architectures on the newly introduced conditions.
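The conditions described above, reverberated sources mixed with noise, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the sine-wave "sources", the exponentially decaying random impulse response, and the function names `synthetic_rir`, `convolve`, and `mix` are hypothetical and are not the simulation procedure used to build WHAMR!. It only shows the general idea that reverberation can be modeled as convolution of a dry signal with a room impulse response, which extends each source past its original end before the sources and noise are summed.

```python
import math
import random


def synthetic_rir(length, decay, seed=0):
    """Toy room impulse response (assumption): direct path plus a decaying noise tail."""
    rng = random.Random(seed)
    rir = [rng.gauss(0.0, 1.0) * math.exp(-decay * n) for n in range(length)]
    rir[0] = 1.0  # direct path arrives first at full amplitude
    return rir


def convolve(signal, kernel):
    """Full linear convolution; output length is len(signal) + len(kernel) - 1."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def mix(sources, noise):
    """Sum equal-length sources and noise sample by sample."""
    return [sum(samples) for samples in zip(*sources, noise)]


# Two short "dry" signals: hypothetical stand-ins for clean, near-field utterances.
sample_rate = 8000
num_samples = 400
dry1 = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(num_samples)]
dry2 = [math.sin(2 * math.pi * 330 * t / sample_rate) for t in range(num_samples)]

# Reverberate each source with its own toy impulse response.
rir1 = synthetic_rir(200, decay=0.02, seed=1)
rir2 = synthetic_rir(200, decay=0.02, seed=2)
wet1 = convolve(dry1, rir1)
wet2 = convolve(dry2, rir2)

# Add background noise (white Gaussian noise here, purely as a placeholder).
noise_rng = random.Random(3)
noise = [0.1 * noise_rng.gauss(0.0, 1.0) for _ in range(len(wet1))]
mixture = mix([wet1, wet2], noise)
```

In this sketch the reverberant signals are longer than the dry ones, and energy from each source persists after the dry signal has ended. That overlap in time is one simple way to picture the smearing that makes reverberant mixtures harder to separate.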

research
10/23/2020

Training Noisy Single-Channel Speech Separation With Noisy Oracle Sources: A Large Gap and A Small Step

As the performance of single-channel speech separation systems has impro...
research
07/02/2019

WHAM!: Extending Speech Separation to Noisy Environments

Recent progress in separating the speech signals from multiple overlappi...
research
11/06/2018

Building Corpora for Single-Channel Speech Separation Across Multiple Domains

To date, the bulk of research on single-channel speech separation has be...
research
05/25/2023

Towards Solving Cocktail-Party: The First Method to Build a Realistic Dataset with Ground Truths for Speech Separation

Speech separation is very important in real-world applications such as h...
research
10/20/2021

Adapting Speech Separation to Real-World Meetings Using Mixture Invariant Training

The recently-proposed mixture invariant training (MixIT) is an unsupervi...
research
01/22/2019

Speech Separation Using Gain-Adapted Factorial Hidden Markov Models

We present a new probabilistic graphical model which generalizes factori...
research
05/20/2020

SADDEL: Joint Speech Separation and Denoising Model based on Multitask Learning

Speech data collected in real-world scenarios often encounters two issue...
