Crossed-Time Delay Neural Network for Speaker Recognition

05/31/2020
by   Liang Chen, et al.

Time Delay Neural Network (TDNN) is a well-performing structure for DNN-based speaker recognition systems. In this paper we introduce a novel structure, the Crossed-Time Delay Neural Network (CTDNN), to enhance the performance of the current TDNN. Inspired by the multi-filter setting of the convolution layer in convolutional neural networks, we set multiple time delay units, each with a different context size, at the bottom layer and construct a multilayer parallel network. The proposed CTDNN gives significant improvements over the original TDNN on both speaker verification and identification tasks. On the VoxCeleb1 dataset it achieves a 2.6% absolute Equal Error Rate improvement in the verification experiment. Under the few-shot condition CTDNN reaches 90.4% identification accuracy, which doubles the identification accuracy of the original TDNN. We also compare the proposed CTDNN with another new variant of TDNN, FTDNN; the comparison shows that our model has a 36% absolute identification accuracy improvement under the few-shot condition and can better handle training with a larger batch in a shorter training time, which makes better use of computational resources.
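The idea of several time delay units with different context sizes at the bottom layer can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the context sizes (1, 3, 5), the feature and output dimensions, the ReLU activation, the center-cropping used to align the branches, and the concatenation used to merge them are all assumptions made for illustration, and the function names are hypothetical.

```python
import random

def init_unit(context, feat_dim, out_dim, rng):
    """Random weights of shape [context][feat_dim][out_dim] plus a zero bias."""
    weights = [[[rng.uniform(-0.1, 0.1) for _ in range(out_dim)]
                for _ in range(feat_dim)]
               for _ in range(context)]
    return weights, [0.0] * out_dim

def time_delay_unit(frames, weights, bias):
    """Slide a window of len(weights) frames over the sequence (no padding)."""
    context = len(weights)
    outputs = []
    for t in range(len(frames) - context + 1):
        window = frames[t:t + context]
        out = []
        for k in range(len(bias)):
            s = bias[k]
            for c in range(context):
                for d, x in enumerate(window[c]):
                    s += weights[c][d][k] * x
            out.append(max(0.0, s))  # ReLU (assumed activation)
        outputs.append(out)
    return outputs

def crossed_bottom_layer(frames, units):
    """Run every unit on the same input, then merge the parallel branches.

    Assumption: branches are center-cropped to the shortest one and
    concatenated along the feature axis.
    """
    branches = [time_delay_unit(frames, w, b) for w, b in units]
    min_len = min(len(branch) for branch in branches)
    merged = []
    for t in range(min_len):
        row = []
        for branch in branches:
            offset = (len(branch) - min_len) // 2  # keep window centers aligned
            row.extend(branch[t + offset])
        merged.append(row)
    return merged

rng = random.Random(0)
feat_dim, out_dim = 4, 3
# 20 frames of 4-dimensional toy features (stand-ins for acoustic features)
frames = [[rng.uniform(-1.0, 1.0) for _ in range(feat_dim)] for _ in range(20)]
# Hypothetical context sizes for the parallel time delay units
units = [init_unit(c, feat_dim, out_dim, rng) for c in (1, 3, 5)]
merged = crossed_bottom_layer(frames, units)
print(len(merged), len(merged[0]))  # prints "16 9"
```

With odd context sizes, the center-crop keeps every branch's window centered on the same input frame, so each merged row describes one time step at three different temporal resolutions.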
