Time-Frequency Attention for Monaural Speech Enhancement

11/15/2021
by Qiquan Zhang, et al.

Most speech enhancement studies do not consider the energy distribution of speech in the time-frequency (T-F) representation, which is important for accurate prediction of the mask or spectra. In this paper, we present a simple yet effective T-F attention (TFA) module that produces a 2-D attention map to assign differentiated weights to the spectral components of the T-F representation. To validate the proposed TFA module, we use the residual temporal convolution network (ResTCN) as the backbone network and conduct extensive experiments on two commonly used training targets. The experiments show that the TFA module significantly improves performance on five objective evaluation metrics with negligible parameter overhead, and that the proposed ResTCN with the TFA module (ResTCN+TFA) consistently outperforms the other baselines by a large margin.
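The idea of a 2-D attention map that reweights each T-F bin can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the map is computed, so the sketch assumes it is formed as the outer product of a per-frame weight and a per-frequency weight, each derived from mean energy. The function name `tf_attention` and the scalar parameters `w_t`, `b_t`, `w_f`, `b_f` are hypothetical stand-ins for what would be learned layers in a real model.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, mapping real values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def tf_attention(spec, w_t=1.0, b_t=0.0, w_f=1.0, b_f=0.0):
    """Apply an illustrative 2-D T-F attention map to a magnitude spectrogram.

    spec: array of shape (T, F), non-negative magnitudes
          (T time frames, F frequency bins).
    w_t, b_t, w_f, b_f: hypothetical scalar parameters standing in for
          learned layers; they are assumptions, not taken from the paper.
    Returns (weighted_spec, att_map), both of shape (T, F).
    """
    energy = spec ** 2

    # Summarize energy along each axis (global average pooling).
    time_desc = energy.mean(axis=1)  # shape (T,): mean energy per frame
    freq_desc = energy.mean(axis=0)  # shape (F,): mean energy per frequency bin

    # Squash each descriptor into 1-D attention weights.
    time_att = sigmoid(w_t * time_desc + b_t)  # shape (T,)
    freq_att = sigmoid(w_f * freq_desc + b_f)  # shape (F,)

    # Combine the two 1-D weights into a 2-D map, one weight per T-F bin.
    att_map = np.outer(time_att, freq_att)  # shape (T, F)

    # Reweight the spectral components element-wise.
    return spec * att_map, att_map


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    spec = np.abs(rng.standard_normal((100, 257)))  # 100 frames, 257 bins
    weighted, att_map = tf_attention(spec)
    print(weighted.shape, att_map.shape)
```

In this sketch the map has the same shape as the input, so it can be applied before or after any layer that preserves the (T, F) layout. In a trained model the scalar parameters would be replaced by learned layers.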


research
04/08/2021

Phoneme-based Distribution Regularization for Speech Enhancement

Existing speech enhancement methods mainly separate speech from noises a...
research
04/07/2019

VoiceID Loss: Speech Enhancement for Speaker Verification

In this paper, we propose VoiceID loss, a novel loss function for traini...
research
02/03/2021

Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses

Deep complex U-Net structure and convolutional recurrent network (CRN) s...
research
09/24/2022

Speech Enhancement with Perceptually-motivated Optimization and Dual Transformations

To address the monaural speech enhancement problem, numerous research st...
research
09/01/2021

Embedding and Beamforming: All-neural Causal Beamformer for Multichannel Speech Enhancement

The spatial covariance matrix has been considered to be significant for ...
research
06/30/2022

GLD-Net: Improving Monaural Speech Enhancement by Learning Global and Local Dependency Features with GLD Block

For monaural speech enhancement, contextual information is important for...
research
10/24/2022

TridentSE: Guiding Speech Enhancement with 32 Global Tokens

In this paper, we present TridentSE, a novel architecture for speech enh...
