Extended U-Net for Speaker Verification in Noisy Environments

06/27/2022
by Ju-ho Kim, et al.

Background noise is a well-known factor that degrades the accuracy and reliability of speaker verification (SV) systems by reducing speech intelligibility. Various studies have used separately pretrained enhancement models as the front-end module of the SV system in noisy environments, and these methods remove noise effectively. However, the denoising process of independent enhancement models that are not tailored to the SV task can also distort the speaker information contained in utterances. We argue that, to alleviate this issue, the enhancement network and the speaker embedding extractor should be fully jointly trained for SV tasks under noisy conditions. We therefore propose a U-Net-based integrated framework that simultaneously optimizes speaker identification and feature enhancement losses. Moreover, we analyze the structural limitations of using U-Net directly for noisy SV tasks and further propose Extended U-Net to reduce these drawbacks. We evaluate the models on the noise-synthesized VoxCeleb1 test set and on the VOiCES development set, which was recorded in various noisy scenarios. The experimental results demonstrate that the U-Net-based fully joint training framework is more effective than the baseline, and that Extended U-Net achieves state-of-the-art performance compared with recently proposed compensation systems.
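
As a rough illustration of what "simultaneously optimizing speaker identification and feature enhancement losses" can look like, the sketch below sums two toy losses into one objective. The abstract does not name the loss functions or how they are weighted, so everything here is an assumption: softmax cross-entropy stands in for the speaker identification loss, mean squared error stands in for the feature enhancement loss, and the names `cross_entropy`, `mse`, `joint_loss`, and `weight` are hypothetical.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one utterance (assumed speaker identification loss)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def mse(enhanced, clean):
    """Mean squared error between enhanced and clean features (assumed enhancement loss)."""
    return sum((e - c) ** 2 for e, c in zip(enhanced, clean)) / len(clean)


def joint_loss(logits, target, enhanced, clean, weight=1.0):
    """Weighted sum of both losses; `weight` is a hypothetical hyperparameter."""
    return cross_entropy(logits, target) + weight * mse(enhanced, clean)


# Toy values only; they are not taken from the paper.
loss = joint_loss(
    logits=[2.0, 0.5, -1.0],   # speaker logits from the embedding extractor
    target=0,                  # index of the true speaker
    enhanced=[0.9, 0.1, 0.4],  # features produced by the enhancement network
    clean=[1.0, 0.0, 0.5],     # clean reference features
)
```

In a jointly trained system, the gradient of such a combined objective would flow through both the enhancement network and the embedding extractor, which is the property the abstract contrasts with a separately pretrained front end.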

research
09/14/2023

Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models

Background noise considerably reduces the accuracy and reliability of sp...
research
10/03/2021

PL-EESR: Perceptual Loss Based END-TO-END Robust Speaker Representation Extraction

Speech enhancement aims to improve the perceptual quality of the speech ...
research
10/25/2019

Unsupervised Feature Enhancement for speaker verification

The task of making speaker verification systems robust to adverse scenar...
research
06/28/2023

Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information

Previously, Target Speaker Extraction (TSE) has yielded outstanding perf...
research
10/25/2019

Feature Enhancement with Deep Feature Losses for Speaker Verification

Speaker Verification still suffers from the challenge of generalization ...
research
02/01/2020

Analysis of Deep Feature Loss based Enhancement for Speaker Verification

Data augmentation is conventionally used to inject robustness in Speaker...
research
06/23/2022

Speaker-Independent Microphone Identification in Noisy Conditions

This work proposes a method for source device identification from speech...
