Know Your Enemy, Know Yourself: A Unified Two-Stage Framework for Speech Enhancement

10/27/2021
by   Wenzhe Liu, et al.

Traditional spectral-subtraction-type single-channel speech enhancement (SE) algorithms often need to estimate interference components, including noise and/or reverberation, before subtracting them, whereas deep neural network-based SE methods often aim to realize an end-to-end target mapping. In this paper, we show that both denoising and dereverberation can be unified into a common problem by introducing a two-stage paradigm: interference component estimation followed by speech recovery. In the first stage, we propose to explicitly extract the magnitude of the interference components, which serves as prior information. In the second stage, guided by this estimated magnitude prior, we can expect to better recover the target speech. In addition, we propose a transform module to facilitate the interaction between the interference components and the desired speech modalities. Meanwhile, a temporal fusion module is designed to model long-term dependencies without ignoring short-term details. We conduct experiments on the WSJ0-SI84 corpus, and the results on both denoising and dereverberation tasks show that our approach outperforms previous advanced systems and achieves state-of-the-art performance in terms of many objective metrics.
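
To make the two-stage idea concrete, here is a minimal NumPy sketch that first estimates an interference magnitude and then uses it as a prior when recovering the speech spectrum. It is not the authors' model: the learned networks, the transform module, and the temporal fusion module are replaced by classical spectral subtraction, and every function name, the `n_noise_frames` parameter, and the `floor` parameter are assumptions made for illustration only.

```python
import numpy as np


def estimate_interference_magnitude(mix_mag, n_noise_frames=5):
    """Stage 1 stand-in (assumption): average the first few frames,
    which are assumed to contain interference only."""
    return mix_mag[:, :n_noise_frames].mean(axis=1, keepdims=True)


def recover_speech(mix_spec, interference_mag, floor=0.05):
    """Stage 2 stand-in (assumption): subtract the magnitude prior from the
    mixture magnitude, apply a spectral floor, and reuse the mixture phase."""
    mix_mag = np.abs(mix_spec)
    phase = np.angle(mix_spec)
    # The floor keeps the result non-negative and limits over-subtraction.
    clean_mag = np.maximum(mix_mag - interference_mag, floor * mix_mag)
    return clean_mag * np.exp(1j * phase)


def two_stage_enhance(mix_spec, n_noise_frames=5):
    """Run both stages on a complex spectrogram of shape (freq_bins, frames)."""
    prior = estimate_interference_magnitude(np.abs(mix_spec), n_noise_frames)
    enhanced = recover_speech(mix_spec, prior)
    return enhanced, prior


# Usage with a synthetic complex spectrogram (placeholder data, not real speech).
rng = np.random.default_rng(0)
mix_spec = rng.normal(size=(4, 10)) + 1j * rng.normal(size=(4, 10))
enhanced, prior = two_stage_enhance(mix_spec)
```

The sketch only mirrors the ordering described in the abstract: the interference magnitude is estimated explicitly first, and the recovery step is conditioned on that estimate rather than mapping the mixture to clean speech in one pass.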

09/05/2021

A Two-stage Complex Network using Cycle-consistent Generative Adversarial Networks for Speech Enhancement

Cycle-consistent generative adversarial networks (CycleGAN) have shown t...
02/16/2022

DBT-Net: Dual-branch federative magnitude and phase estimation with attention-in-attention transformer for monaural speech enhancement

The decoupling-style concept begins to ignite in the speech enhancement ...
03/14/2022

MDNet: Learning Monaural Speech Enhancement from Deep Prior Gradient

While traditional statistical signal processing model-based methods can ...
04/06/2020

WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-end Speech Enhancement

Due to the simple design pipeline, end-to-end (E2E) neural models for sp...
02/02/2018

Monaural Speech Enhancement using Deep Neural Networks by Maximizing a Short-Time Objective Intelligibility Measure

In this paper we propose a Deep Neural Network (DNN) based Speech Enhanc...
04/30/2022

Taylor, Can You Hear Me Now? A Taylor-Unfolding Framework for Monaural Speech Enhancement

While the deep learning techniques promote the rapid development of the ...
12/20/2018

A unified convolutional beamformer for simultaneous denoising and dereverberation

This paper proposes a method for estimating a convolutional beamformer t...