Speech Separation Using Gain-Adapted Factorial Hidden Markov Models

01/22/2019
by Martin H. Radfar, et al.

We present a new probabilistic graphical model that generalizes factorial hidden Markov models (FHMM) for single-channel speech separation (SCSS), in which the goal is to separate two speech signals X(t) and V(t) from a single recording of their mixture Y(t)=X(t)+V(t) using trained models of the speakers' speech signals. Current techniques assume that the data used in the training and test phases of the separation model have the same loudness. In this paper, we introduce GFHMM, a gain-adapted FHMM, which extends SCSS to the general case Y(t)=g_xX(t)+g_vV(t), where g_x and g_v are unknown gain factors. GFHMM consists of two independent-state HMMs, which model spectral patterns, and a hidden node, which models the gain difference. We present a novel inference method that combines the Viterbi algorithm with quadratic optimization and adds minimal computational overhead. Experimental results on 180 mixtures with gain differences from 0 to 15 dB show that the proposed technique significantly outperforms FHMM and its memoryless counterpart, vector quantization (VQ)-based SCSS.
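
The gain-mismatched mixing model Y(t)=g_xX(t)+g_vV(t) can be illustrated with a small sketch. It is not the authors' code and does not perform any separation. It only builds a mixture at a chosen gain difference. Several things here are assumptions: the abstract does not define "gain difference", so the sketch takes it to be the RMS level ratio of the two scaled sources in dB; g_x is fixed to 1; the sources are synthetic Gaussian noise rather than speech; and the function names are hypothetical.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def mix_with_gain_difference(x, v, diff_db):
    """Return (y, g_x, g_v) with y[t] = g_x * x[t] + g_v * v[t].

    Assumption: g_x is fixed to 1 and g_v is chosen so that the scaled
    sources differ by diff_db dB in RMS level.
    """
    g_x = 1.0
    # Solve 20 * log10(g_x * rms(x) / (g_v * rms(v))) = diff_db for g_v.
    g_v = g_x * rms(x) / (rms(v) * 10.0 ** (diff_db / 20.0))
    y = [g_x * a + g_v * b for a, b in zip(x, v)]
    return y, g_x, g_v


def level_difference_db(x, v, g_x, g_v):
    """RMS level difference in dB between the two scaled sources."""
    return 20.0 * math.log10((g_x * rms(x)) / (g_v * rms(v)))


# Synthetic stand-ins for the two sources (not speech).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(1000)]
v = [rng.gauss(0.0, 0.5) for _ in range(1000)]

# Build a mixture at the largest gain difference quoted in the abstract.
y, g_x, g_v = mix_with_gain_difference(x, v, 15.0)
```

In the setting the abstract describes, only y would be observed. Both the sources and the gains g_x and g_v would have to be inferred from it.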
