Analysis of Deep Feature Loss based Enhancement for Speaker Verification

02/01/2020
by   Saurabh Kataria, et al.
0

Data augmentation is conventionally used to inject robustness in Speaker Verification systems. Several recently organized challenges focus on handling novel acoustic environments. Deep learning based speech enhancement is a modern solution for this. Recently, a study proposed to optimize the enhancement network in the activation space of a pre-trained auxiliary network. This methodology, called deep feature loss, greatly improved over the state-of-the-art conventional x-vector based system on a children speech dataset called BabyTrain. This work analyzes various facets of that approach and asks few novel questions in that context. We first search for optimal number of auxiliary network activations, training data, and enhancement feature dimension. Experiments reveal the importance of Signal-to-Noise Ratio filtering that we employ to create a large, clean, and naturalistic corpus for enhancement network training. To counter the "mismatch" problem in enhancement, we find enhancing front-end (x-vector network) data helpful while harmful for the back-end (Probabilistic Linear Discriminant Analysis (PLDA)). Importantly, we find enhanced signals contain complementary information to original. Established by combining them in front-end, this gives  40 improvement over the baseline. We also do an ablation study to remove a noise class from x-vector data augmentation and, for such systems, we establish the utility of enhancement regardless of whether it has seen that noise class itself during training. Finally, we design several dereverberation schemes to conclude ineffectiveness of deep feature loss enhancement scheme for this task.

READ FULL TEXT
research
10/25/2019

Feature Enhancement with Deep Feature Losses for Speaker Verification

Speaker Verification still suffers from the challenge of generalization ...
research
04/07/2019

VoiceID Loss: Speech Enhancement for Speaker Verification

In this paper, we propose VoiceID loss, a novel loss function for traini...
research
10/25/2019

Unsupervised Feature Enhancement for speaker verification

The task of making speaker verification systems robust to adverse scenar...
research
11/14/2022

The Potential of Neural Speech Synthesis-based Data Augmentation for Personalized Speech Enhancement

With the advances in deep learning, speech enhancement systems benefited...
research
11/19/2018

Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition

In this work, we present an analysis of a DNN-based autoencoder for spee...
research
12/10/2021

Learning-based personal speech enhancement for teleconferencing by exploiting spatial-spectral features

Teleconferencing is becoming essential during the COVID-19 pandemic. How...
research
06/27/2022

Extended U-Net for Speaker Verification in Noisy Environments

Background noise is a well-known factor that deteriorates the accuracy a...

Please sign up or login with your details

Forgot password? Click here to reset