Boosting Objective Scores of Speech Enhancement Model through MetricGAN Post-Processing

06/18/2020
by Szu-Wei Fu, et al.

The Transformer architecture has outperformed recurrent neural networks in many natural language processing applications. This study therefore applies a modified Transformer to the speech enhancement task. Specifically, positional encoding may not be necessary for this task and is replaced by convolutional layers. To further improve the PESQ scores of the enhanced speech, the Transformer, pre-trained with an L_1 loss, is fine-tuned within the MetricGAN framework. The proposed MetricGAN stage can be treated as a general post-processing module that further boosts the objective scores of interest. The experiments use the data sets provided by the organizers of the Deep Noise Suppression (DNS) challenge. Experimental results demonstrate that the proposed system outperforms the challenge baseline by a large margin in both subjective and objective evaluations.
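
The abstract does not spell out the fine-tuning objective, so the following is only a minimal sketch of how MetricGAN-style losses are commonly described, not the authors' implementation. It assumes a discriminator that predicts a normalized metric score for an (enhanced, clean) pair, and a generator (here, the pre-trained enhancement model) that is pushed toward a target score of 1. The function names and the PESQ normalization range of -0.5 to 4.5 are assumptions made for illustration.

```python
def normalize_pesq(pesq, lo=-0.5, hi=4.5):
    """Map a raw PESQ score to [0, 1]. The range is an assumption, not from the source."""
    return (pesq - lo) / (hi - lo)


def discriminator_loss(d_clean, d_enhanced, q_enhanced):
    """Squared error between predicted and true normalized scores.

    d_clean:    discriminator output for the (clean, clean) pair; its target is 1.0
    d_enhanced: discriminator output for the (enhanced, clean) pair
    q_enhanced: true normalized metric score of the enhanced speech
    """
    return (d_clean - 1.0) ** 2 + (d_enhanced - q_enhanced) ** 2


def generator_loss(d_enhanced, target=1.0):
    """Push the predicted score of the enhanced speech toward the target score."""
    return (d_enhanced - target) ** 2


if __name__ == "__main__":
    q = normalize_pesq(2.0)  # hypothetical raw PESQ of an enhanced utterance
    print(discriminator_loss(d_clean=0.9, d_enhanced=0.6, q_enhanced=q))
    print(generator_loss(d_enhanced=0.6))
```

Because the generator is updated only through the discriminator's predicted score, a non-differentiable metric such as PESQ can serve as the training target under this formulation, which is what lets the stage act as a post-processing step on an already trained model.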

10/13/2019

Transformer with Gaussian weighted self-attention for speech enhancement

The Transformer architecture recently replaced recurrent neural networks...
08/20/2017

An evaluation of intrusive instrumental intelligibility metrics

Instrumental intelligibility metrics are commonly used as an alternative...
10/13/2019

T-GSA: Transformer with Gaussian-weighted self-attention for speech enhancement

Transformer neural networks (TNN) demonstrated state-of-art performance ...
06/20/2019

A Monaural Speech Enhancement Method for Robust Small-Footprint Keyword Spotting

Robustness against noise is critical for keyword spotting (KWS) in real-...
09/01/2021

Embedding and Beamforming: All-neural Causal Beamformer for Multichannel Speech Enhancement

The spatial covariance matrix has been considered to be significant for ...
06/23/2022

Efficient Transformer-based Speech Enhancement Using Long Frames and STFT Magnitudes

The SepFormer architecture shows very good results in speech separation....
06/16/2021

DCCRN+: Channel-wise Subband DCCRN with SNR Estimation for Speech Enhancement

Deep complex convolution recurrent network (DCCRN), which extends CRN wi...