Integrating Uncertainty into Neural Network-based Speech Enhancement

by   Huajian Fang, et al.

Supervised masking approaches in the time-frequency domain aim to employ deep neural networks to estimate a multiplicative mask to extract clean speech. This leads to a single estimate for each input without any guarantees or measures of reliability. In this paper, we study the benefits of modeling uncertainty in clean speech estimation. Prediction uncertainty is typically categorized into aleatoric uncertainty and epistemic uncertainty. The former refers to inherent randomness in data, while the latter describes uncertainty in the model parameters. In this work, we propose a framework to jointly model aleatoric and epistemic uncertainties in neural network-based speech enhancement. The proposed approach captures aleatoric uncertainty by estimating the statistical moments of the speech posterior distribution and explicitly incorporates the uncertainty estimate to further improve clean speech estimation. For epistemic uncertainty, we investigate two Bayesian deep learning approaches: Monte Carlo dropout and Deep ensembles to quantify the uncertainty of the neural network parameters. Our analyses show that the proposed framework promotes capturing practical and reliable uncertainty, while combining different sources of uncertainties yields more reliable predictive uncertainty estimates. Furthermore, we demonstrate the benefits of modeling uncertainty on speech enhancement performance by evaluating the framework on different datasets, exhibiting notable improvement over comparable models that fail to account for uncertainty.


page 1

page 7

page 8

page 10


Uncertainty Estimation in Deep Speech Enhancement Using Complex Gaussian Mixture Models

Single-channel deep speech enhancement approaches often estimate a singl...

Integrating Statistical Uncertainty into Neural Network-Based Speech Enhancement

Speech enhancement in the time-frequency domain is often performed by es...

A General Framework for quantifying Aleatoric and Epistemic uncertainty in Graph Neural Networks

Graph Neural Networks (GNN) provide a powerful framework that elegantly ...

A Bayesian Convolutional Neural Network for Robust Galaxy Ellipticity Regression

Cosmic shear estimation is an essential scientific goal for large galaxy...

Bayesian Inductive Learner for Graph Resiliency under uncertainty

In the quest to improve efficiency, interdependence and complexity are b...

dugMatting: Decomposed-Uncertainty-Guided Matting

Cutting out an object and estimating its opacity mask, known as image ma...

Prediction Uncertainty Estimation for Hate Speech Classification

As a result of social network popularity, in recent years, hate speech p...

Please sign up or login with your details

Forgot password? Click here to reset