Analysis of DNN Speech Signal Enhancement for Robust Speaker Recognition

11/19/2018
by   Ondrej Novotny, et al.
0

In this work, we present an analysis of a DNN-based autoencoder for speech enhancement, dereverberation and denoising. The target application is a robust speaker verification (SV) system. We start our approach by carefully designing a data augmentation process to cover wide range of acoustic conditions and obtain rich training data for various components of our SV system. We augment several well-known databases used in SV with artificially noised and reverberated data and we use them to train a denoising autoencoder (mapping noisy and reverberated speech to its clean version) as well as an x-vector extractor which is currently considered as state-of-the-art in SV. Later, we use the autoencoder as a preprocessing step for text-independent SV system. We compare results achieved with autoencoder enhancement, multi-condition PLDA training and their simultaneous use. We present a detailed analysis with various conditions of NIST SRE 2010, 2016, PRISM and with re-transmitted data. We conclude that the proposed preprocessing can significantly improve both i-vector and x-vector baselines and that this technique can be used to build a robust SV system for various target domains.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
11/07/2018

On the use of DNN Autoencoder for Robust Speaker Recognition

In this paper, we present an analysis of a DNN-based autoencoder for spe...
research
11/02/2022

Analysis of Noisy-target Training for DNN-based speech enhancement

Deep neural network (DNN)-based speech enhancement usually uses a clean ...
research
10/25/2019

Unsupervised Feature Enhancement for speaker verification

The task of making speaker verification systems robust to adverse scenar...
research
02/01/2020

Analysis of Deep Feature Loss based Enhancement for Speaker Verification

Data augmentation is conventionally used to inject robustness in Speaker...
research
06/29/2020

Data augmentation versus noise compensation for x- vector speaker recognition systems in noisy environments

The explosion of available speech data and new speaker modeling methods ...
research
03/13/2020

End-to-end Recurrent Denoising Autoencoder Embeddings for Speaker Identification

Speech 'in-the-wild' is a handicap for speaker recognition systems due t...
research
10/25/2019

Feature Enhancement with Deep Feature Losses for Speaker Verification

Speaker Verification still suffers from the challenge of generalization ...

Please sign up or login with your details

Forgot password? Click here to reset