Speaker Verification in Emotional Talking Environments based on Three-Stage Framework

03/31/2018
by Ismail Shahin, et al.

This work introduces, implements, and evaluates a three-stage speaker verification framework intended to improve speaker verification performance, which degrades in emotional talking environments. The framework consists of three cascaded stages: gender identification, followed by emotion identification, followed by speaker verification. It has been evaluated on two distinct and independent emotional speech datasets: our collected dataset and the Emotional Prosody Speech and Transcripts dataset. Our results show that speaker verification based on both gender and emotion cues outperforms speaker verification based on gender cues only, on emotion cues only, and on neither gender nor emotion cues. The average speaker verification performance achieved with the proposed framework is very close to that attained by human listeners in subjective assessment.
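
The cascade described above can be illustrated with a minimal Python sketch. It shows only the control flow: pick the most likely gender, then the most likely emotion among that gender's emotion models, then score the claimed speaker's model for that emotion against a threshold. The abstract does not specify the features, the model types, or the decision rule, so everything here is an assumption made for illustration: the class `ThreeStageVerifier`, the helper `closeness_to`, the scorer interface (higher score means better match), and the threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]
# Assumption: every model is a function returning a score, higher = better match.
Scorer = Callable[[Features], float]


@dataclass
class ThreeStageVerifier:
    """Hypothetical container for the three cascaded stages."""
    gender_models: Dict[str, Scorer]               # gender -> scorer
    emotion_models: Dict[str, Dict[str, Scorer]]   # gender -> emotion -> scorer
    speaker_models: Dict[Tuple[str, str], Scorer]  # (speaker, emotion) -> scorer
    threshold: float = 0.0                         # assumed accept/reject cutoff

    def verify(self, features: Features, claimed_speaker: str) -> Tuple[str, str, bool]:
        # Stage 1: gender identification (best-scoring gender model).
        gender = max(self.gender_models, key=lambda g: self.gender_models[g](features))
        # Stage 2: emotion identification, restricted to that gender's models.
        emotions = self.emotion_models[gender]
        emotion = max(emotions, key=lambda e: emotions[e](features))
        # Stage 3: speaker verification using the identified emotion.
        score = self.speaker_models[(claimed_speaker, emotion)](features)
        return gender, emotion, score >= self.threshold


def closeness_to(target: float) -> Scorer:
    """Toy scorer: negative distance between the feature mean and a target."""
    def score(features: Features) -> float:
        mean = sum(features) / len(features)
        return -abs(mean - target)
    return score


# Toy models standing in for trained ones; the numbers carry no meaning.
verifier = ThreeStageVerifier(
    gender_models={"female": closeness_to(1.0), "male": closeness_to(-1.0)},
    emotion_models={
        "female": {"neutral": closeness_to(1.0), "angry": closeness_to(2.0)},
        "male": {"neutral": closeness_to(-1.0), "angry": closeness_to(-2.0)},
    },
    speaker_models={
        ("spk1", "neutral"): closeness_to(1.0),
        ("spk1", "angry"): closeness_to(2.0),
    },
    threshold=-0.5,
)

accepted = verifier.verify([2.0, 2.0], "spk1")
rejected = verifier.verify([-1.0, -1.0], "spk1")
```

In this toy setup the first utterance is routed through the "female" and "angry" models and accepted, while the second is routed through "male" and "neutral" and rejected because the claimed speaker's model scores below the threshold. A mistake in an early stage propagates to the later ones, which is the main design trade-off of any cascade of this kind.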
