DeepAI AI Chat
Log In Sign Up

Robust Speech Representation Learning via Flow-based Embedding Regularization

by   Woo Hyun Kang, et al.

Over the recent years, various deep learning-based methods were proposed for extracting a fixed-dimensional embedding vector from speech signals. Although the deep learning-based embedding extraction methods have shown good performance in numerous tasks including speaker verification, language identification and anti-spoofing, their performance is limited when it comes to mismatched conditions due to the variability within them unrelated to the main task. In order to alleviate this problem, we propose a novel training strategy that regularizes the embedding network to have minimum information about the nuisance attributes. To achieve this, our proposed method directly incorporates the information bottleneck scheme into the training process, where the mutual information is estimated using the main task classifier and an auxiliary normalizing flow network. The proposed method was evaluated on different speech processing tasks and showed improvement over the standard training strategy in all experimentation.


page 1

page 2

page 3

page 4


Disentangled speaker and nuisance attribute embedding for robust speaker verification

Over the recent years, various deep learning-based embedding methods hav...

Anti-Spoofing Using Transfer Learning with Variational Information Bottleneck

Recent advances in sophisticated synthetic speech generated from text-to...

SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing

Voice anti-spoofing systems are crucial auxiliaries for automatic speake...

Optimization of data-driven filterbank for automatic speaker verification

Most of the speech processing applications use triangular filters spaced...

UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training

Self-supervised learning (SSL) is a long-standing goal for speech proces...

Time-Domain Based Embeddings for Spoofed Audio Representation

Anti-spoofing is the task of speech authentication. That is, identifying...

Post Training in Deep Learning with Last Kernel

One of the main challenges of deep learning methods is the choice of an ...