PCNN: A Lightweight Parallel Conformer Neural Network for Efficient Monaural Speech Enhancement

07/28/2023
by   Xinmeng Xu, et al.

Convolutional neural networks (CNNs) and Transformers have been widely successful in multimedia applications. However, more work is needed to combine these two architectures effectively for speech enhancement. This paper aims to unify the two architectures and presents a Parallel Conformer for speech enhancement. In particular, the CNN and the self-attention (SA) in the Transformer are fully exploited for local format patterns and global structure representations, respectively. To address the small receptive field of the CNN and the high computational complexity of SA, we design a multi-branch dilated convolution (MBDC) module and a self-channel-time-frequency attention (Self-CTFA) module. MBDC contains three convolutional layers with different dilation rates, so that features are processed from local to non-local scales. Experimental results show that our method outperforms state-of-the-art methods on most evaluation criteria while using the fewest model parameters among the compared methods.
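
To illustrate the idea behind MBDC, the following is a minimal pure-Python sketch of three parallel 1-D convolutions with different dilation rates. It is not the authors' implementation: the kernel size of 3, the dilation rates (1, 2, 4), the element-wise sum used to fuse the branches, and the function names are all assumptions made for illustration, since the abstract only states that three convolutional layers with different dilation rates are used.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one dilated convolution."""
    return dilation * (kernel_size - 1) + 1


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded ('same' length) 1-D dilated convolution in pure Python."""
    k = len(kernel)
    span = dilation * (k - 1)  # distance between the first and last tap
    pad = span // 2
    padded = [0.0] * pad + list(signal) + [0.0] * (span - pad)
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def multi_branch(signal, kernel, dilations=(1, 2, 4)):
    """Run one branch per dilation rate and fuse the outputs.

    The dilation rates and the element-wise sum are hypothetical choices.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    kernel = [1.0, 1.0, 1.0]
    for d in (1, 2, 4):
        print(d, receptive_field(len(kernel), d), dilated_conv1d(impulse, kernel, d))
    print(multi_branch(impulse, kernel))
```

With a kernel of size 3, each branch spans 3, 5, and 9 input samples respectively, so the same number of weights covers progressively wider context. That is the sense in which stacking or paralleling different dilation rates moves from local to non-local processing without enlarging the kernel.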


research
08/04/2023

Efficient Monaural Speech Enhancement using Spectrum Attention Fusion

Speech enhancement is a demanding task in automated speech processing pi...
research
04/06/2022

FFC-SE: Fast Fourier Convolution for Speech Enhancement

Fast Fourier convolution (FFC) is the recently proposed neural operator ...
research
12/07/2022

Selector-Enhancer: Learning Dynamic Selection of Local and Non-local Attention Operation for Speech Enhancement

Attention mechanisms, such as local and non-local attention, play a fund...
research
06/30/2021

DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement

Single-channel speech enhancement (SE) is an important task in speech pr...
research
12/11/2021

U-shaped Transformer with Frequency-Band Aware Attention for Speech Enhancement

The state-of-the-art speech enhancement has limited performance in speec...
research
06/01/2023

A Multi-dimensional Deep Structured State Space Approach to Speech Enhancement Using Small-footprint Models

We propose a multi-dimensional structured state space (S4) approach to s...
research
07/25/2020

Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement

This paper investigates different trade-offs between the number of model...
