Feature Enhancement with Deep Feature Losses for Speaker Verification

by Saurabh Kataria, et al.

Speaker verification still struggles to generalize to novel adverse environments. We build on recent advances in deep-learning-based speech enhancement and propose a feature-domain, supervised denoising solution. We propose to use a Deep Feature Loss, which optimizes the enhancement network in the hidden activation space of a pre-trained auxiliary speaker embedding network. We experimentally verify the approach on simulated and real data. A simulated testing setup is created using various noise types at different SNR levels. For evaluation on real data, we choose the BabyTrain corpus, which consists of recordings of children in uncontrolled environments. We observe consistent gains in every condition over the state-of-the-art augmented Factorized-TDNN x-vector system. On the BabyTrain corpus, we observe relative gains of 10.38 respectively.
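The Deep Feature Loss idea can be illustrated with a minimal sketch. It assumes a toy auxiliary network of two random ReLU layers and a mean absolute difference per layer. The abstract specifies neither choice, and the names `aux_layers`, `hidden_activations`, and `deep_feature_loss` are hypothetical. The point is only that the loss compares enhanced and clean features in the hidden activation space of a frozen network rather than directly in the feature domain.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(weights, v):
    # weights is a list of rows; returns the matrix-vector product.
    return [sum(w * x for w, x in zip(row, v)) for row in weights]

def hidden_activations(layers, features):
    """Run the frozen auxiliary network and collect every hidden activation."""
    activations = []
    h = features
    for weights in layers:
        h = relu(linear(weights, h))
        activations.append(h)
    return activations

def deep_feature_loss(layers, enhanced, clean):
    """Sum over layers of the mean absolute activation difference.

    The L1-style distance is an assumption made for this sketch.
    """
    loss = 0.0
    pairs = zip(hidden_activations(layers, enhanced),
                hidden_activations(layers, clean))
    for act_enh, act_cln in pairs:
        loss += sum(abs(e - c) for e, c in zip(act_enh, act_cln)) / len(act_enh)
    return loss

# Hypothetical stand-in for a pre-trained speaker embedding network:
# two fixed 4x4 layers with random weights (never updated).
rng = random.Random(0)
aux_layers = [[[rng.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
              for _ in range(2)]

clean = [0.5, -0.2, 0.1, 0.9]                    # toy clean feature frame
noisy = [c + rng.gauss(0, 0.3) for c in clean]   # toy "enhanced" output

loss_identical = deep_feature_loss(aux_layers, clean, clean)
loss_noisy = deep_feature_loss(aux_layers, noisy, clean)
```

In a real training loop, the enhancement network's output would take the place of `noisy`, and gradients of this loss would update only the enhancement network while the auxiliary network stays fixed.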




Analysis of Deep Feature Loss based Enhancement for Speaker Verification

Data augmentation is conventionally used to inject robustness in Speaker...

Unsupervised Feature Enhancement for speaker verification

The task of making speaker verification systems robust to adverse scenar...

Single Channel Far Field Feature Enhancement For Speaker Verification In The Wild

We investigated an enhancement and a domain adaptation approach to make ...

How to Leverage DNN-based speech enhancement for multi-channel speaker verification?

Speaker verification (SV) suffers from unsatisfactory performance in far...

Extended U-Net for Speaker Verification in Noisy Environments

Background noise is a well-known factor that deteriorates the accuracy a...

Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition

In this work, we present an analysis of a DNN-based autoencoder for spee...
