GLD-Net: Improving Monaural Speech Enhancement by Learning Global and Local Dependency Features with GLD Block

06/30/2022
by Xinmeng Xu, et al.

For monaural speech enhancement, contextual information is important for accurate speech estimation. However, commonly used convolutional neural networks (CNNs) are weak at capturing temporal context, since their building blocks process only one local neighborhood at a time. To address this problem, we draw on human auditory perception to introduce a two-stage trainable reasoning mechanism, referred to as the global-local dependency (GLD) block. GLD blocks capture long-term dependencies among time-frequency bins at both the global and local levels of the noisy spectrogram, helping to detect correlations among the speech part, the noise part, and the whole noisy input. In addition, we construct a monaural speech enhancement network called GLD-Net, which adopts an encoder-decoder architecture and consists of a speech object branch, an interference branch, and a global noisy branch. The speech features extracted at the global and local levels are efficiently reasoned over and aggregated in each branch. We compare the proposed GLD-Net with existing state-of-the-art methods on the WSJ0 and DEMAND datasets. The results show that GLD-Net outperforms the state-of-the-art methods in terms of PESQ and STOI.
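
The contrast between local and global context drawn in the abstract can be illustrated with a toy sketch. This is not the GLD block: the abstract does not specify its internals, so the functions `local_context`, `global_context`, and `fuse`, the scalar per-frame features, and the additive fusion are all hypothetical choices made only for illustration. The sketch shows that a convolution-like local operation at one frame ignores a change in a distant frame, while an attention-like global operation responds to it.

```python
import math

def local_context(frames, radius=1):
    """Average each frame with its neighbours within `radius` (a conv-like local op)."""
    n = len(frames)
    out = []
    for t in range(n):
        lo, hi = max(0, t - radius), min(n, t + radius + 1)
        window = frames[lo:hi]
        out.append(sum(window) / len(window))
    return out

def global_context(frames):
    """Softmax self-attention over all frames, using scalar features (assumption)."""
    out = []
    for q in frames:
        scores = [q * k for k in frames]           # similarity of this frame to every frame
        m = max(scores)                            # subtract max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append(sum(w * v for w, v in zip(weights, frames)) / z)
    return out

def fuse(frames, radius=1):
    """Add local and global context; one simple, assumed way to aggregate the two."""
    local = local_context(frames, radius)
    glob = global_context(frames)
    return [l + g for l, g in zip(local, glob)]

# Toy per-frame features; `perturbed` differs from `frames` only in the last frame.
frames = [0.1, 0.4, 0.2, 0.3, 0.5, 0.1]
perturbed = frames[:-1] + [2.0]

local_unchanged = local_context(frames)[0] == local_context(perturbed)[0]
global_shift = abs(global_context(frames)[0] - global_context(perturbed)[0])
```

Here `local_unchanged` is true because frame 0 only sees its immediate neighbour, whereas `global_shift` is nonzero because the attention weights for frame 0 depend on every frame, including the perturbed one.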


