Novel Hybrid DNN Approaches for Speaker Verification in Emotional and Stressful Talking Environments

12/26/2021
by Ismail Shahin, et al.

In this work, we conducted an empirical comparative study of text-independent speaker verification performance in emotional and stressful talking environments. We combined deep models with shallow architectures to obtain novel hybrid classifiers. Four distinct hybrid models were evaluated: deep neural network-hidden Markov model (DNN-HMM), deep neural network-Gaussian mixture model (DNN-GMM), Gaussian mixture model-deep neural network (GMM-DNN), and hidden Markov model-deep neural network (HMM-DNN). All models were based on newly implemented architectures. The study used three distinct speech datasets: a private Arabic dataset and two public English databases, namely Speech Under Simulated and Actual Stress (SUSAS) and the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS). The test results showed that the proposed HMM-DNN improved verification performance in emotional and stressful environments and outperformed all other hybrid models in terms of the equal error rate (EER) and area under the curve (AUC) evaluation metrics. The average resulting verification system based on the three datasets yielded EERs of 7.19 and GMM-DNN, respectively. Furthermore, the DNN-GMM model had the lowest computational complexity of all the hybrid models in both talking environments, whereas the HMM-DNN model required the most training time. The findings also showed that, when average emotional and stressful performances were compared, the EER and AUC values depended on the database.
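The two evaluation metrics named in the abstract, EER and AUC, can be illustrated with a minimal sketch. It is not the authors' evaluation code: the score lists are made-up toy values, the function names are hypothetical, and it assumes that a higher score means the claimed speaker is more likely genuine. The EER is taken at the threshold where the false acceptance and false rejection rates are closest, and the AUC is computed as the fraction of genuine/impostor pairs that are ranked correctly.

```python
def far_frr(genuine, impostor, threshold):
    """Return (false acceptance rate, false rejection rate).

    A trial is accepted when its score is >= threshold (an assumption
    of this sketch, not something stated in the abstract).
    """
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by sweeping every observed score as a threshold."""
    thresholds = sorted(set(genuine) | set(impostor))
    # Pick the operating point where FAR and FRR are closest to each other.
    far, frr = min(
        (far_frr(genuine, impostor, t) for t in thresholds),
        key=lambda rates: abs(rates[0] - rates[1]),
    )
    return (far + frr) / 2


def auc(genuine, impostor):
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    wins = sum((g > i) + 0.5 * (g == i) for g in genuine for i in impostor)
    return wins / (len(genuine) * len(impostor))


# Toy verification scores (hypothetical, for illustration only).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]

eer = equal_error_rate(genuine_scores, impostor_scores)
area = auc(genuine_scores, impostor_scores)
print(f"EER = {eer:.4f}, AUC = {area:.4f}")
```

A lower EER and a higher AUC both indicate better separation between genuine and impostor trials, which is the sense in which the abstract ranks the four hybrid models.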


