Vocal Style Factorization for Effective Speaker Recognition in Affective Scenarios

05/13/2023
by   Morgan Sandler, et al.

The accuracy of automated speaker recognition is negatively impacted by changes in emotion in a person's speech. In this paper, we hypothesize that speaker identity is composed of various vocal style factors that may be learned from unlabeled data and recombined using a neural network architecture to generate holistic speaker identity representations for affective scenarios. To this end, we propose the E-Vector architecture, composed of a 1-D CNN for learning speaker identity features and a vocal style factorization technique for determining vocal styles. Experiments conducted on the MSP-Podcast dataset demonstrate that the proposed architecture improves speaker recognition accuracy in the affective domain over state-of-the-art baseline ECAPA-TDNN speaker recognition models. For instance, the true match rate at a false match rate of 1
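
The abstract reports results as a true match rate (TMR) at a fixed false match rate (FMR). As a minimal sketch of how such a figure is commonly computed from verification scores, the Python snippet below picks the lowest threshold whose FMR on impostor pairs does not exceed a target, then measures the fraction of genuine pairs accepted at that threshold. The function name `tmr_at_fmr`, the strict "score > threshold" acceptance rule, and the toy scores are illustrative assumptions; this is not the authors' evaluation code.

```python
import numpy as np


def tmr_at_fmr(genuine_scores, impostor_scores, target_fmr):
    """Return (tmr, fmr, threshold) for a target false match rate.

    Hypothetical helper: a pair is accepted when its score is strictly
    greater than the threshold (higher score = more similar).
    """
    genuine = np.asarray(genuine_scores, dtype=float)
    # Sort impostor scores from highest to lowest.
    impostor = np.sort(np.asarray(impostor_scores, dtype=float))[::-1]

    # Largest number of impostor pairs we may accept without exceeding the target.
    allowed = int(np.floor(target_fmr * impostor.size + 1e-9))

    if allowed >= impostor.size:
        threshold = -np.inf  # every pair may be accepted
    else:
        # Using the (allowed + 1)-th highest impostor score as the threshold
        # means at most `allowed` impostor scores lie strictly above it.
        threshold = impostor[allowed]

    fmr = float(np.mean(impostor > threshold))
    tmr = float(np.mean(genuine > threshold))
    return tmr, fmr, threshold


if __name__ == "__main__":
    # Toy similarity scores, made up purely for illustration.
    genuine = [0.9, 0.8, 0.7, 0.4]
    impostor = [0.1, 0.2, 0.3, 0.5, 0.6, 0.05, 0.15, 0.25, 0.35, 0.45]
    tmr, fmr, thr = tmr_at_fmr(genuine, impostor, target_fmr=0.1)
    print(f"threshold={thr:.2f}  FMR={fmr:.2f}  TMR={tmr:.2f}")
```

With tied impostor scores the realized FMR can fall below the target, which is why the function returns the measured FMR alongside the TMR rather than assuming the target was met exactly.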

