A scalable noisy speech dataset and online subjective test framework

09/17/2019
by Chandan K A Reddy, et al.

Background noise is a major source of quality impairments in Voice over Internet Protocol (VoIP) and Public Switched Telephone Network (PSTN) calls. Recent work shows the efficacy of deep learning for noise suppression, but the datasets have been relatively small compared to those used in other domains (e.g., ImageNet), and the associated evaluations have been narrower in scope. To better facilitate deep learning research in Speech Enhancement, we present a noisy speech dataset (MS-SNSD) that can scale to arbitrary sizes depending on the number of speakers, noise types, and Speech to Noise Ratio (SNR) levels desired. We show that increasing the dataset size improves noise suppression performance, as expected. In addition, we provide an open-source methodology for evaluating results subjectively at scale using crowdsourcing, with a reference algorithm to normalize the results. To demonstrate the dataset and evaluation framework, we apply them to several noise suppressors, compare the subjective Mean Opinion Score (MOS) with objective quality measures such as SNR, PESQ, POLQA, and ViSQOL, and show why MOS is still required. To our knowledge, our subjective MOS evaluation is the first large-scale evaluation of Speech Enhancement algorithms.
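The scaling idea in the abstract (every clean utterance can be paired with every noise type at every SNR level) rests on one basic operation: mixing speech and noise at a target SNR. The following is a minimal sketch of that operation, not the MS-SNSD generation script itself. The function names, the synthetic sine "speech", the Gaussian "noise", and the SNR grid are all assumptions chosen for illustration.

```python
import math
import random


def mean_power(signal):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Return (noisy, scaled_noise) so that speech power / noise power = snr_db.

    Hypothetical helper: the noise is looped to the speech length, then scaled
    by a single gain before being added to the speech.
    """
    if not speech or not noise:
        raise ValueError("speech and noise must be non-empty")
    # Loop the noise clip so it covers the whole utterance.
    looped = [noise[i % len(noise)] for i in range(len(speech))]
    noise_power = mean_power(looped)
    if noise_power == 0:
        raise ValueError("noise clip is silent")
    # Solve 10*log10(P_speech / (gain^2 * P_noise)) = snr_db for gain.
    gain = math.sqrt(mean_power(speech) / (noise_power * 10 ** (snr_db / 10)))
    scaled = [gain * n for n in looped]
    noisy = [s + n for s, n in zip(speech, scaled)]
    return noisy, scaled


def measured_snr_db(speech, scaled_noise):
    """SNR in dB actually achieved by a mix, for checking the gain."""
    return 10 * math.log10(mean_power(speech) / mean_power(scaled_noise))


# Synthetic stand-ins for real recordings (assumptions, not dataset content).
random.seed(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [random.gauss(0.0, 1.0) for _ in range(4000)]
snr_levels_db = [0, 10, 20, 30, 40]  # illustrative grid, not taken from the paper

results = {}
for snr_db in snr_levels_db:
    noisy, scaled = mix_at_snr(speech, noise, snr_db)
    results[snr_db] = (len(noisy), measured_snr_db(speech, scaled))
```

With S clean utterances, N noise clips, and K SNR levels, looping this mix over all combinations yields S × N × K noisy files, which is why such a dataset can grow to an arbitrary size.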
