Ripple sparse self-attention for monaural speech enhancement

05/15/2023
by Qiquan Zhang, et al.

The use of the Transformer represents a recent success in speech enhancement. However, self-attention, its core component, suffers from quadratic complexity, which is computationally prohibitive for long speech recordings. Moreover, it allows each time frame to attend to all time frames, neglecting the strong local correlations of speech signals. This study presents a simple yet effective sparse self-attention for speech enhancement, called ripple attention, which simultaneously performs fine-grained modeling of local dependencies and coarse-grained modeling of global dependencies. Specifically, we employ local band attention to enable each frame to attend to its closest neighboring frames within a window at fine granularity, while employing dilated attention outside the window to model global dependencies at coarse granularity. We evaluate the efficacy of ripple attention for speech enhancement on two commonly used training objectives. Extensive experimental results consistently confirm the superior performance of the ripple attention design over standard full self-attention, blockwise attention, and dual-path attention (SepFormer) in terms of speech quality and intelligibility.
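To make the described attention pattern concrete, below is a minimal PyTorch sketch of a ripple-style sparse attention mask: a fine-grained local band of half-width w around each frame, combined with coarse-grained dilated positions at stride d outside the band. The names w, d, ripple_mask, and ripple_attention are illustrative assumptions, not identifiers or hyperparameter values from the paper.

```python
# A minimal sketch of ripple sparse attention, assuming a local window
# half-width w and a dilation rate d (names and defaults are illustrative;
# the paper's actual symbols and settings may differ).
import torch

def ripple_mask(n_frames: int, w: int, d: int) -> torch.Tensor:
    """Boolean [n_frames, n_frames] mask; True means frame j is visible from frame i."""
    idx = torch.arange(n_frames)
    dist = (idx[None, :] - idx[:, None]).abs()      # |i - j| for every frame pair
    local = dist <= w                               # fine-grained local band
    dilated = (dist > w) & (dist % d == 0)          # coarse-grained dilated frames
    return local | dilated

def ripple_attention(q, k, v, w: int = 8, d: int = 4):
    """Masked scaled dot-product attention over [batch, n_frames, dim] tensors."""
    n, dim = q.shape[-2], q.shape[-1]
    scores = q @ k.transpose(-2, -1) / dim ** 0.5   # full attention scores
    mask = ripple_mask(n, w, d).to(q.device)
    scores = scores.masked_fill(~mask, float("-inf"))  # block disallowed frame pairs
    return scores.softmax(dim=-1) @ v
```

Note that this dense-mask formulation still materializes the full n-by-n score matrix, so it only illustrates the sparsity pattern; an efficient implementation would compute scores only for the allowed positions in order to realize a sub-quadratic cost.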


