Speech Denoising in the Waveform Domain with Self-Attention

02/15/2022
by Zhifeng Kong, et al.

In this work, we present CleanUNet, a causal speech denoising model that operates directly on the raw waveform. The model is based on an encoder-decoder architecture combined with several self-attention blocks that refine its bottleneck representations, which is crucial for obtaining good results. The model is optimized with a set of losses defined over both the waveform and multi-resolution spectrograms. The proposed method outperforms state-of-the-art models in denoised speech quality according to various objective and subjective evaluation metrics.
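The abstract describes the architecture and training objective only at a high level. The sketch below is a minimal PyTorch illustration, not the authors' implementation: the layer count, channel width, kernel sizes, attention configuration, and STFT resolutions are illustrative assumptions, and the convolutions here use symmetric padding for brevity rather than the strictly causal padding the paper's model requires (only the bottleneck attention is causally masked).

```python
# Minimal sketch of a CleanUNet-style denoiser: a U-Net style encoder-decoder
# on the raw waveform, a self-attention bottleneck, and a combined
# waveform L1 + multi-resolution STFT loss. All hyperparameters are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CleanUNetSketch(nn.Module):
    def __init__(self, channels=64, depth=4, attn_layers=2, attn_heads=4):
        super().__init__()
        self.encoder = nn.ModuleList()
        self.decoder = nn.ModuleList()
        in_ch = 1
        for _ in range(depth):
            # Strided 1-D convolutions downsample the waveform in the encoder.
            self.encoder.append(nn.Sequential(
                nn.Conv1d(in_ch, channels, kernel_size=4, stride=2, padding=1),
                nn.GELU(),
            ))
            # Mirror-image transposed convolutions upsample in the decoder;
            # the outermost decoder layer (built first) has no activation.
            self.decoder.insert(0, nn.Sequential(
                nn.ConvTranspose1d(channels, in_ch, kernel_size=4, stride=2, padding=1),
                nn.GELU() if in_ch != 1 else nn.Identity(),
            ))
            in_ch = channels

        # Self-attention blocks refine the bottleneck representation.
        layer = nn.TransformerEncoderLayer(
            d_model=channels, nhead=attn_heads,
            dim_feedforward=4 * channels, batch_first=True)
        self.bottleneck = nn.TransformerEncoder(layer, num_layers=attn_layers)

    def forward(self, noisy):                      # noisy: (batch, 1, time)
        skips, x = [], noisy
        for enc in self.encoder:
            x = enc(x)
            skips.append(x)
        # Causal mask so each bottleneck frame attends only to past frames.
        t = x.shape[-1]
        mask = torch.triu(torch.full((t, t), float("-inf"), device=x.device), 1)
        x = self.bottleneck(x.transpose(1, 2), mask=mask).transpose(1, 2)
        for dec in self.decoder:
            x = dec(x + skips.pop())               # U-Net skip connection
        return x


def waveform_and_stft_loss(est, ref,
                           resolutions=((512, 128), (1024, 256), (2048, 512))):
    """L1 waveform loss plus spectral terms at several STFT resolutions."""
    loss = F.l1_loss(est, ref)
    for n_fft, hop in resolutions:
        win = torch.hann_window(n_fft, device=est.device)
        s_est = torch.stft(est.squeeze(1), n_fft, hop,
                           window=win, return_complex=True).abs()
        s_ref = torch.stft(ref.squeeze(1), n_fft, hop,
                           window=win, return_complex=True).abs()
        # Spectral convergence + log-magnitude terms, as commonly used in
        # multi-resolution STFT losses.
        loss = loss + torch.norm(s_est - s_ref) / (torch.norm(s_ref) + 1e-8)
        loss = loss + F.l1_loss(torch.log(s_est + 1e-8), torch.log(s_ref + 1e-8))
    return loss


if __name__ == "__main__":
    model = CleanUNetSketch()
    noisy = torch.randn(2, 1, 16000)               # two 1-second clips at 16 kHz
    clean = torch.randn(2, 1, 16000)
    denoised = model(noisy)
    print(denoised.shape, waveform_and_stft_loss(denoised, clean).item())
```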


Related research

09/12/2023  CleanUNet 2: A Hybrid Speech Denoising Model on Waveform and Spectrogram
06/23/2020  Real Time Speech Enhancement in the Waveform Domain
10/01/2019  State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions
09/03/2020  Dense CNN with Self-Attention for Time-Domain Speech Enhancement
02/06/2022  On Using Transformers for Speech-Separation
04/22/2022  FAIR4Cov: Fused Audio Instance and Representation for COVID-19 Detection
04/12/2022  A Post Auto-regressive GAN Vocoder Focused on Spectrum Fracture
