MDNet: Learning Monaural Speech Enhancement from Deep Prior Gradient

03/14/2022
by Andong Li, et al.

Traditional model-based methods from statistical signal processing can derive optimal estimators, but only under specific statistical assumptions. Current learning-based methods raise the performance upper bound via deep neural networks, but at the cost of high encapsulation and inadequate interpretability. Standing at the intersection of model-based and learning-based methods, we propose a model-driven approach based on the maximum a posteriori (MAP) framework, termed MDNet, for single-channel speech enhancement. Specifically, the original problem is formulated as joint posterior estimation w.r.t. the speech and noise components. Rather than imposing manual assumptions on the prior terms, we model the prior distributions with networks, so that they can be learned from training data. The framework adopts an unfolding structure, and in each step the target parameters are progressively estimated through explicit gradient descent operations. In addition, another network serves as a fusion module to further refine the previous speech estimate. Experiments are conducted on the WSJ0-SI84 and Interspeech2020 DNS-Challenge datasets, and quantitative results show that the proposed approach outperforms previous state-of-the-art baselines.
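The unfolded gradient-descent idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the additive model y = s + n on real-valued arrays, the quadratic data-fidelity term, the fixed step size, and the function name `unfolded_map_estimate`. In MDNet the prior gradients are produced by trained networks; here they are replaced by simple hand-written stand-ins (gradients of Gaussian priors), and the fusion module is omitted.

```python
import numpy as np


def unfolded_map_estimate(y, grad_prior_s, grad_prior_n, num_steps=5, step_size=0.1):
    """Run a fixed number of gradient steps on an assumed MAP objective.

    Assumed objective (illustrative only):
        0.5 * ||y - s - n||^2 - log p(s) - log p(n)
    grad_prior_s / grad_prior_n return the gradient of the negative log-prior;
    in a learned system these callables would be neural networks.
    """
    s = y.copy()              # initialise the speech estimate with the mixture
    n = np.zeros_like(y)      # initialise the noise estimate with zeros
    for _ in range(num_steps):
        residual = s + n - y  # gradient of the data-fidelity term w.r.t. s and n
        s_next = s - step_size * (residual + grad_prior_s(s))
        n_next = n - step_size * (residual + grad_prior_n(n))
        s, n = s_next, n_next  # update both estimates from the same iterate
    return s, n


# Hypothetical stand-ins for the learned prior gradients (Gaussian priors).
def toy_grad_prior_s(s):
    return 0.1 * s


def toy_grad_prior_n(n):
    return 1.0 * n


mixture = np.ones(4)
speech_est, noise_est = unfolded_map_estimate(mixture, toy_grad_prior_s, toy_grad_prior_n)
print(speech_est, noise_est)
```

Each loop iteration corresponds to one unfolded stage. Swapping the two stand-in callables for trainable modules would give the general shape of a learned-prior scheme, though the actual MDNet architecture may differ in its details.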

research
03/14/2022

TaylorBeamformer: Learning All-Neural Beamformer for Multi-Channel Speech Enhancement from Taylor's Approximation Theory

While existing end-to-end beamformers achieve impressive performance in ...
research
10/27/2021

Know Your Enemy, Know Yourself: A Unified Two-Stage Framework for Speech Enhancement

Traditional spectral subtraction-type single channel speech enhancement ...
research
07/28/2020

Neural Kalman Filtering for Speech Enhancement

Statistical signal processing based speech enhancement methods adopt exp...
research
08/31/2018

Single-Microphone Speech Enhancement and Separation Using Deep Learning

The cocktail party problem comprises the challenging task of understandi...
research
03/04/2022

Integrating Statistical Uncertainty into Neural Network-Based Speech Enhancement

Speech enhancement in the time-frequency domain is often performed by es...
research
03/19/2023

A model is worth tens of thousands of examples

Traditional signal processing methods relying on mathematical data gener...
research
04/30/2022

Taylor, Can You Hear Me Now? A Taylor-Unfolding Framework for Monaural Speech Enhancement

While the deep learning techniques promote the rapid development of the ...
