Speech Enhancement Using Multi-Stage Self-Attentive Temporal Convolutional Networks

02/24/2021
by   Ju Lin, et al.
0

Multi-stage learning is an effective technique to invoke multiple deep-learning modules sequentially. This paper applies multi-stage learning to speech enhancement by using a multi-stage structure, where each stage comprises a self-attention (SA) block followed by stacks of temporal convolutional network (TCN) blocks with doubling dilation factors. Each stage generates a prediction that is refined in a subsequent stage. A fusion block is inserted at the input of later stages to re-inject original information. The resulting multi-stage speech enhancement system, in short, multi-stage SA-TCN, is compared with state-of-the-art deep-learning speech enhancement methods using the LibriSpeech and VCTK data sets. The multi-stage SA-TCN system's hyper-parameters are fine-tuned, and the impact of the SA block, the fusion block and the number of stages are determined. The use of a multi-stage SA-TCN system as a front-end for automatic speech recognition systems is investigated as well. It is shown that the multi-stage SA-TCN systems perform well relative to other state-of-the-art systems in terms of speech enhancement and speech recognition scores.

READ FULL TEXT

page 1

page 3

page 6

page 8

research
12/12/2021

Improving Speech Recognition on Noisy Speech via Speech Enhancement with Multi-Discriminators CycleGAN

This paper presents our latest investigations on improving automatic spe...
research
10/25/2021

Multichannel Speech Enhancement without Beamforming

Deep neural networks are often coupled with traditional spatial filters,...
research
06/22/2021

Learning to Inference with Early Exit in the Progressive Speech Enhancement

In real scenarios, it is often necessary and significant to control the ...
research
05/09/2019

Block-Online Multi-Channel Speech Enhancement Using DNN-Supported Relative Transfer Function Estimates

This paper addresses the problem of block-online processing for multi-ch...
research
03/03/2020

Phonetic Feedback for Speech Enhancement With and Without Parallel Speech Data

While deep learning systems have gained significant ground in speech enh...
research
03/22/2020

A Time-domain Monaural Speech Enhancement with Recursive Learning

In this paper, we propose a type of neural network with recursive learni...

Please sign up or login with your details

Forgot password? Click here to reset