Optimization of data-driven filterbank for automatic speaker verification

07/21/2020
by Susanta Sarangi, et al.

Most speech processing applications use triangular filters spaced on the mel scale for feature extraction. In this paper, we propose a new data-driven filter design method that optimizes filter parameters from given speech data. First, we introduce a frame-selection-based approach for developing a speech-signal-based frequency warping scale. Then, we propose a new method for computing the filter frequency responses using principal component analysis (PCA). The main advantage of the proposed method over recently introduced deep-learning-based methods is that it requires only a very limited amount of unlabeled speech data. We demonstrate that the proposed filterbank has more speaker-discriminative power than the commonly used mel filterbank and the existing data-driven filterbank. We conduct automatic speaker verification (ASV) experiments on different corpora using various classifier back-ends. We show that the acoustic features created with the proposed filterbank outperform existing mel-frequency cepstral coefficients (MFCCs) and speech-signal-based frequency cepstral coefficients (SFCCs) in most cases. In experiments with VoxCeleb1 and the popular i-vector back-end, we observe a 9.75% relative improvement in equal error rate (EER) over MFCCs. Similarly, the relative improvement is 4.43% [...] obtain further improvement using fusion of the proposed method with the standard MFCC-based approach.
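
For readers unfamiliar with the metric, the following is a minimal sketch of how an equal error rate and a relative improvement between two systems might be computed from verification scores. It is not the authors' evaluation code: the function names `equal_error_rate` and `relative_improvement`, the threshold sweep, and the example numbers are all assumptions made for illustration.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER from trial scores (hypothetical helper, not from the paper).

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where false-accept and false-reject rates are closest.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    # Use every observed score as a candidate threshold.
    thresholds = np.sort(np.concatenate([target, nontarget]))
    # False-accept rate: fraction of non-target trials accepted.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    # False-reject rate: fraction of target trials rejected.
    frr = np.array([(target < t).mean() for t in thresholds])
    idx = np.argmin(np.abs(far - frr))
    return 0.5 * (far[idx] + frr[idx])


def relative_improvement(baseline_eer, proposed_eer):
    """Relative EER reduction in percent: 100 * (baseline - proposed) / baseline."""
    return 100.0 * (baseline_eer - proposed_eer) / baseline_eer


if __name__ == "__main__":
    # Made-up scores purely for illustration.
    eer = equal_error_rate([1.0, 3.0], [0.0, 2.0])
    print(f"EER: {eer:.2f}")
    # Made-up EER values: a drop from 4.0 to 3.0 is a 25% relative improvement.
    print(f"Relative improvement: {relative_improvement(4.0, 3.0):.2f}%")
```

Under this definition, a relative improvement is measured against the baseline system's EER, so the same absolute reduction counts for more when the baseline EER is already low.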

research
11/24/2022

A new Speech Feature Fusion method with cross gate parallel CNN for Speaker Recognition

In this paper, a new speech feature fusion method is proposed for speake...
research
05/03/2023

Improved Vocal Effort Transfer Vector Estimation for Vocal Effort-Robust Speaker Verification

Despite the maturity of modern speaker verification technology, its perf...
research
02/10/2022

Learnable Nonlinear Compression for Robust Speaker Verification

In this study, we focus on nonlinear compression methods in spectral fea...
research
07/24/2015

The SYSU System for the Interspeech 2015 Automatic Speaker Verification Spoofing and Countermeasures Challenge

Many existing speaker verification systems are reported to be vulnerable...
research
02/20/2021

Learnable MFCCs for Speaker Verification

We propose a learnable mel-frequency cepstral coefficient (MFCC) fronten...
research
12/07/2021

Robust Speech Representation Learning via Flow-based Embedding Regularization

Over the recent years, various deep learning-based methods were proposed...
research
03/07/2021

An Optimized Signal Processing Pipeline for Syllable Detection and Speech Rate Estimation

Syllable detection is an important speech analysis task with application...