Task-aware Warping Factors in Mask-based Speech Enhancement

08/27/2021
by Qiongqiong Wang, et al.

This paper proposes two task-aware warping factors for mask-based speech enhancement (SE). One controls the balance between speech maintenance and noise removal during training; the other controls the SE power applied for a specific downstream task during testing. The aim is to alleviate the problem that SE systems trained to improve speech quality often fail to improve downstream tasks such as automatic speaker verification (ASV) and automatic speech recognition (ASR), because these tasks do not share the same objectives. The proposed dual-warping-factor approach can be applied to any mask-based SE method, and it allows a single SE system to handle multiple tasks without task-dependent training. Its effectiveness has been confirmed on the SITW dataset for ASV evaluation and on the LibriSpeech dataset for ASR and speech quality evaluations at 0-20 dB. We show that different warping values are necessary for a single SE system to achieve optimal performance on the three tasks. With task-dependent warping factors, speech quality was improved by an 84.7 [...] EER reduction, and ASR had a 52.2 [...]. The effectiveness of the task-dependent warping factors was also cross-validated on the VoxCeleb-1 test set for ASV and the LibriSpeech dev-clean set for ASV and quality evaluations. The proposed method is highly effective and easy to apply in practice.
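The abstract does not say how the test-time warping factor enters the mask. As a minimal sketch, assuming (purely for illustration) that the factor acts as an exponent on a mask with values in [0, 1], the idea of adjusting SE power per task without retraining might look like the following. The names `warp_mask`, `enhance`, and `beta` are hypothetical and do not come from the paper.

```python
import numpy as np

def warp_mask(mask, beta):
    """Hypothetical test-time warping: raise a [0, 1] mask to the power beta.

    Assumption (not from the paper): the warping is a simple exponent.
    beta = 0 turns enhancement off (mask of ones), beta = 1 keeps the
    estimated mask unchanged, and beta > 1 suppresses more aggressively.
    """
    mask = np.clip(mask, 0.0, 1.0)
    return mask ** beta

def enhance(noisy_mag, mask, beta):
    """Apply the warped mask element-wise to a noisy magnitude spectrogram."""
    return noisy_mag * warp_mask(mask, beta)

if __name__ == "__main__":
    # Toy magnitude spectrogram and mask (frequency bins x frames).
    noisy_mag = np.array([[2.0, 4.0], [1.0, 3.0]])
    mask = np.array([[0.25, 1.0], [0.81, 0.04]])
    # One trained SE model, several SE strengths chosen at test time.
    for beta in (0.0, 0.5, 1.0):
        print(beta, enhance(noisy_mag, mask, beta))
```

Under this assumption, a single mask estimator serves every downstream task, and only the scalar `beta` is tuned per task on held-out data.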
