Neural Speech Enhancement with Very Low Algorithmic Latency and Complexity via Integrated Full- and Sub-Band Modeling

04/18/2023
by   Zhong-Qiu Wang, et al.
0

We propose FSB-LSTM, a novel long short-term memory (LSTM) based architecture that integrates full- and sub-band (FSB) modeling, for single- and multi-channel speech enhancement in the short-time Fourier transform (STFT) domain. The model maintains an information highway to flow an over-complete input representation through multiple FSB-LSTM modules. Each FSB-LSTM module consists of a full-band block to model spectro-temporal patterns at all frequencies and a sub-band block to model patterns within each sub-band, where each of the two blocks takes a down-sampled representation as input and returns an up-sampled discriminative representation to be added to the block input via a residual connection. The model is designed to have a low algorithmic complexity, a small run-time buffer and a very low algorithmic latency, at the same time producing a strong enhancement performance on a noisy-reverberant speech enhancement task even if the hop size is as low as 2 ms.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/23/2022

FullSubNet+: Channel Attention FullSubNet with Complex Spectrograms for Speech Enhancement

Previously proposed FullSubNet has achieved outstanding performance in D...
research
10/29/2020

FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement

This paper proposes a full-band and sub-band fusion model, named as Full...
research
11/17/2020

Ultra-Lightweight Speech Separation via Group Communication

Model size and complexity remain the biggest challenges in the deploymen...
research
04/10/2019

Audio-noise Power Spectral Density Estimation Using Long Short-term Memory

We propose a method using a long short-term memory (LSTM) network to est...
research
09/24/2022

Speech Enhancement with Perceptually-motivated Optimization and Dual Transformations

To address the monaural speech enhancement problem, numerous research st...
research
04/21/2022

STFT-Domain Neural Speech Enhancement with Very Low Algorithmic Latency

Deep learning based speech enhancement in the short-term Fourier transfo...
research
02/19/2019

Low-Latency Deep Clustering For Speech Separation

This paper proposes a low algorithmic latency adaptation of the deep clu...

Please sign up or login with your details

Forgot password? Click here to reset