Deep Neural Network Embeddings with Gating Mechanisms for Text-Independent Speaker Verification

03/28/2019
by Lanhua You, et al.

In this paper, gating mechanisms are applied in deep neural network (DNN) training for x-vector-based text-independent speaker verification. First, a gated convolutional neural network (GCNN) is employed to model the frame-level embedding layers. Compared with the time-delay DNN (TDNN), the GCNN can obtain more expressive frame-level representations through a carefully designed memory cell and gating mechanisms. Moreover, we propose a novel gated-attention statistics pooling strategy in which the attention scores are shared with the output gate. Gated-attention statistics pooling combines gating and attention mechanisms in one framework, so it can capture more useful information in the temporal pooling layer. Experiments are carried out using the NIST SRE16 and SRE18 evaluation datasets. The results demonstrate the effectiveness of the GCNN and show that the proposed gated-attention statistics pooling further improves performance.
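As a rough illustration of how attention and gating might be combined in a statistics pooling layer, the sketch below reuses one score per frame both as a softmax attention weight and as a sigmoid output gate. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the function name `gated_attention_stats_pooling`, the single scoring vector `w`, and the way the shared score drives both the gate and the attention weight. It should not be read as the authors' method.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function written with tanh to avoid overflow."""
    return 0.5 * (1.0 + math.tanh(0.5 * x))


def gated_attention_stats_pooling(frames, w):
    """Pool T frame-level vectors into one utterance-level vector.

    frames: list of T vectors, each of dimension D (frame-level outputs).
    w:      hypothetical scoring vector of dimension D (an assumption).
    Returns a list of length 2*D: weighted mean followed by weighted std.
    """
    # One scalar score per frame.
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in frames]

    # Attention weights over time (sum to 1).
    alphas = softmax(scores)

    # Assumption: the output gate is driven by the same scores.
    gates = [sigmoid(s) for s in scores]
    gated = [[g * hi for hi in h] for g, h in zip(gates, frames)]

    dim = len(frames[0])
    mean = [sum(a * h[d] for a, h in zip(alphas, gated)) for d in range(dim)]
    var = [
        sum(a * (h[d] - mean[d]) ** 2 for a, h in zip(alphas, gated))
        for d in range(dim)
    ]
    # Floor the variance so the square root stays well defined.
    std = [math.sqrt(max(v, 1e-12)) for v in var]
    return mean + std


# Usage: three 2-dimensional frames and a zero scoring vector, which gives
# uniform attention weights and a gate value of 0.5 for every frame.
pooled = gated_attention_stats_pooling(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]], [0.0, 0.0]
)
```

In a trained system the scoring vector would be learned together with the rest of the network, and the scores would more likely come from a small sub-network than from a single dot product.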


