Frequency bin-wise single channel speech presence probability estimation using multiple DNNs

02/23/2023
by Shuai Tao, et al.

In this work, we propose a frequency bin-wise method to estimate the single-channel speech presence probability (SPP) with multiple deep neural networks (DNNs) in the short-time Fourier transform domain. Conventional DNN-based SPP estimators typically take all frequency bins simultaneously as input features, which makes high model complexity inevitable. To reduce the model complexity and the requirements on the training data, we take a single frequency bin and some of its neighboring frequency bins into account to train separate gated recurrent units (GRUs). In addition, the noisy speech and the a posteriori probability SPP representation are used to train our model. The experiments were performed on the Deep Noise Suppression challenge dataset. The experimental results show that the speech detection accuracy can be improved when we employ the frequency bin-wise model. Finally, we also demonstrate that our proposed method outperforms most of the state-of-the-art SPP estimation methods in terms of speech detection accuracy and model complexity.
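
The bin-wise input described above can be illustrated with a small sketch. This is not the authors' implementation: the function name `bin_wise_features`, the use of plain magnitudes as features, the symmetric neighborhood of `num_neighbors` bins on each side, and the clamping of indices at the spectrum edges are all assumptions made for illustration, since the abstract does not specify them. The sketch only shows how each frequency bin could receive its own small feature sequence, which would then be fed to that bin's separate recurrent model.

```python
from typing import List


def bin_wise_features(spec: List[List[float]], k: int, num_neighbors: int) -> List[List[float]]:
    """For every frame, collect the value of bin k plus num_neighbors bins on each side.

    spec is a frames x bins magnitude spectrogram (hypothetical input layout).
    Indices past the spectrum edges are clamped to the nearest valid bin,
    which is an assumed edge-handling rule, not one taken from the paper.
    """
    num_bins = len(spec[0])
    idx = [min(max(k + d, 0), num_bins - 1)
           for d in range(-num_neighbors, num_neighbors + 1)]
    return [[frame[i] for i in idx] for frame in spec]


# Toy magnitude spectrogram: 3 frames x 5 frequency bins.
spec = [
    [0.1, 0.2, 0.3, 0.4, 0.5],
    [1.0, 1.1, 1.2, 1.3, 1.4],
    [2.0, 2.1, 2.2, 2.3, 2.4],
]

# One feature sequence per frequency bin; each would train its own model.
features_per_bin = [bin_wise_features(spec, k, 1) for k in range(len(spec[0]))]
```

Each entry of `features_per_bin` has shape frames x (2 * `num_neighbors` + 1), so the per-bin input dimension stays small and independent of the total number of frequency bins.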

research
07/02/2018

Waveform to Single Sinusoid Regression to Estimate the F0 Contour from Noisy Speech Using Recurrent Deep Neural Networks

The fundamental frequency (F0) represents pitch in speech that determine...
research
05/08/2018

A Regression Model of Recurrent Deep Neural Networks for Noise Robust Estimation of the Fundamental Frequency Contour of Speech

The fundamental frequency (F0) contour of speech is a key aspect to repr...
research
04/10/2019

Audio-noise Power Spectral Density Estimation Using Long Short-term Memory

We propose a method using a long short-term memory (LSTM) network to est...
research
03/23/2022

Deep Frequency Filtering for Domain Generalization

Improving the generalization capability of Deep Neural Networks (DNNs) i...
research
02/20/2023

A DNN based Normalized Time-frequency Weighted Criterion for Robust Wideband DoA Estimation

Deep neural networks (DNNs) have greatly benefited direction of arrival ...
research
08/07/2019

Pitch-Synchronous Single Frequency Filtering Spectrogram for Speech Emotion Recognition

Convolutional neural networks (CNN) are widely used for speech emotion r...
research
03/02/2021

Open Range Pitch Tracking for Carrier Frequency Difference Estimation from HF Transmitted Speech

In this paper we investigate the task of detecting carrier frequency dif...
