Improving Speech Enhancement via Event-based Query

02/20/2023
by Yifei Xin, et al.

Existing deep-learning-based speech enhancement (SE) methods either use blind end-to-end training or explicitly incorporate speaker embeddings or phonetic information into the SE network to improve speech quality. In this paper, we treat speech and noise as different types of sound events and propose an event-based query method for SE. Specifically, representative speech embeddings that can discriminate speech from noise are first pre-trained on the sound event detection (SED) task. The embeddings are then clustered into fixed golden speech queries that help the SE network enhance the speech in noisy audio. The golden speech queries can be obtained offline and generalize to different SE datasets and networks. Therefore, little extra complexity is introduced and no per-speaker enrollment is needed. Experimental results show that the proposed method yields significant gains over the baselines and that the golden queries generalize well to different datasets.
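The clustering step can be illustrated with a small sketch. The abstract does not name the clustering algorithm, so plain k-means is used here purely as an assumption, and `cluster_queries`, `num_queries`, and the toy 2-D embeddings are hypothetical names and data chosen for illustration, not the authors' implementation. The sketch shows how a pool of pre-computed speech embeddings might be reduced offline to a small, fixed set of query vectors.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cluster_queries(embeddings, num_queries, num_iters=20, seed=0):
    """Reduce speech embeddings to `num_queries` fixed query vectors.

    Assumption: plain k-means stands in for whatever clustering the
    paper actually uses. Runs offline, once, over the embedding pool.
    """
    rng = random.Random(seed)
    # Initialise centroids from randomly chosen embeddings.
    centroids = [list(v) for v in rng.sample(embeddings, num_queries)]
    for _ in range(num_iters):
        # Assignment step: group each embedding with its nearest centroid.
        groups = [[] for _ in range(num_queries)]
        for v in embeddings:
            nearest = min(
                range(num_queries),
                key=lambda i: squared_distance(v, centroids[i]),
            )
            groups[nearest].append(v)
        # Update step: move each centroid to the mean of its group;
        # an empty group keeps its previous centroid.
        centroids = [
            mean_vector(g) if g else c for g, c in zip(groups, centroids)
        ]
    return centroids


# Toy 2-D "embeddings" forming two well-separated groups (made-up data).
toy_embeddings = [
    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0], [11.0, 11.0],
]
golden_queries = sorted(cluster_queries(toy_embeddings, num_queries=2))
```

Because the resulting query vectors are fixed once computed, they could be stored and reused across SE networks and datasets without any per-speaker step, which matches the offline, enrollment-free property the abstract describes.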

