Enhanced Factored Three-Way Restricted Boltzmann Machines for Speech Detection

11/01/2016
by Pengfei Sun, et al.

In this letter, we propose enhanced factored three-way restricted Boltzmann machines (EFTW-RBMs) for speech detection. The proposed model incorporates conditional feature learning by multiplying in the dynamical state of a third unit, which modulates the visible-hidden node pairs. Instead of recursively stacking previous speech frames as the third unit, correlation-related weighting coefficients are assigned to the contextual neighboring frames. Specifically, a threshold function is designed to capture long-term features and blend in the globally stored speech structure. A factored low-rank approximation is introduced to reduce the number of parameters in the three-dimensional interaction tensor, on which a non-negativity constraint is imposed to address the sparsity characteristic. Validation using the area under the ROC curve (AUC) and the signal distortion ratio (SDR) shows that our approach outperforms several existing 1D and 2D (i.e., time-domain and time-frequency-domain) speech detection algorithms in various noisy environments.
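
To make the factored low-rank idea concrete, the following is a minimal sketch of a generic factored three-way interaction, not the authors' implementation. It assumes the standard factorization W[i][j][k] = sum_f Wv[i][f] * Wh[j][f] * Wz[k][f], omits bias terms, and mimics the non-negativity constraint simply by drawing weights from a non-negative range. The names `Wv`, `Wh`, `Wz`, the layer sizes, and the context vector `z` (standing in for the weighted neighboring frames) are all hypothetical choices made for illustration.

```python
import math
import random

random.seed(0)

# Hypothetical sizes: visible units, hidden units, context (third) units, factors.
n_v, n_h, n_z, n_f = 6, 4, 5, 3

def rand_matrix(rows, cols):
    # Non-negative entries stand in for the non-negativity constraint (assumption).
    return [[random.uniform(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

Wv, Wh, Wz = rand_matrix(n_v, n_f), rand_matrix(n_h, n_f), rand_matrix(n_z, n_f)

def project(x, W):
    """Project vector x onto each factor: result[f] = sum_i x[i] * W[i][f]."""
    return [sum(x[i] * W[i][f] for i in range(len(x))) for f in range(len(W[0]))]

def factored_energy(v, h, z):
    """Three-way energy using only the factor matrices (biases omitted)."""
    pv, ph, pz = project(v, Wv), project(h, Wh), project(z, Wz)
    return -sum(a * b * c for a, b, c in zip(pv, ph, pz))

def full_tensor_energy(v, h, z):
    """Same energy, computed from the explicit n_v x n_h x n_z tensor."""
    total = 0.0
    for i in range(n_v):
        for j in range(n_h):
            for k in range(n_z):
                w_ijk = sum(Wv[i][f] * Wh[j][f] * Wz[k][f] for f in range(n_f))
                total += w_ijk * v[i] * h[j] * z[k]
    return -total

def hidden_probs(v, z):
    """p(h_j = 1 | v, z): the context z modulates the visible-to-hidden input."""
    pv, pz = project(v, Wv), project(z, Wz)
    probs = []
    for j in range(n_h):
        activation = sum(Wh[j][f] * pv[f] * pz[f] for f in range(n_f))
        probs.append(1.0 / (1.0 + math.exp(-activation)))
    return probs

# Toy inputs: a visible frame, a binary hidden state, and a context vector.
v = [random.random() for _ in range(n_v)]
h = [random.choice([0, 1]) for _ in range(n_h)]
z = [random.random() for _ in range(n_z)]

e_factored = factored_energy(v, h, z)
e_full = full_tensor_energy(v, h, z)
p_h = hidden_probs(v, z)

# Parameter counts: full tensor versus three factor matrices.
n_params_full = n_v * n_h * n_z
n_params_factored = n_f * (n_v + n_h + n_z)
```

In this toy setting the two energy functions agree up to floating-point error, while the parameter count drops from n_v * n_h * n_z to n_f * (n_v + n_h + n_z). The saving grows quickly with realistic layer sizes, which is the motivation the abstract gives for the factorization.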


