Detection of Doctored Speech: Towards an End-to-End Parametric Learn-able Filter Approach

06/27/2022
by   Rohit Arora, et al.

Automatic Speaker Verification (ASV) systems have potential in biometric applications for logical access control and authentication, and much is at stake if an ASV system is compromised. Our preliminary work presents a comparative analysis of the state-of-the-art wavelet-based and MFCC-based spoof detection techniques developed in (Novoselov et al., 2016) and (Alam et al., 2016a), respectively. The results on ASVspoof 2015 justify our preference for wavelet-based features over MFCC features. The experiments on the ASVspoof 2019 database expose the limited reliability of traditional handcrafted features and give us further reason to move towards end-to-end (E2E) deep neural networks and more recent techniques. We use the Sincnet architecture as our baseline. By replacing the Sinc layer with Wavelet Scattering and Continuous Wavelet Transform (CWT) layers, we obtain E2E deep learning models that we call WSTnet and CWTnet, respectively. The fusion model achieved 62 and our Sincnet baseline when evaluated on the modern spoofing attacks in ASVspoof 2019. However, the final scale distribution and the number of scales used in CWTnet are far from optimal for the task at hand. To address this, we replaced the CWT layer in our CWTnet architecture with a Wavelet Deconvolution (WD) layer (Khan and Yener, 2018). Like the CWT layer, the WD layer computes the Discrete-Continuous Wavelet Transform, but it also optimizes the scale parameter using back-propagation. The WDnet model achieved 26 CWTnet and Sincnet models respectively when evaluated over ASVspoof 2019 dataset. This suggests that WDnet extracts more generalized features than CWTnet, since it focuses only on the most important and relevant frequency regions.
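
The idea of treating the wavelet scale as a trainable parameter can be illustrated with a toy sketch. It is not the WD layer of (Khan and Yener, 2018) nor the WDnet model: the Mexican-hat wavelet, the energy objective, the function names, and all constants are assumptions chosen for illustration, and a finite-difference gradient stands in for back-propagation. The sketch only shows a single scale drifting towards the frequency region where the input carries energy.

```python
import math

def ricker(scale, half_width=32):
    """Sample a Mexican-hat (Ricker) wavelet at the given scale (assumed wavelet)."""
    taps = []
    for n in range(-half_width, half_width + 1):
        t = n / scale
        taps.append((1.0 - t * t) * math.exp(-0.5 * t * t) / math.sqrt(scale))
    return taps

def filter_energy(signal, scale):
    """Mean squared response of the signal to the wavelet at this scale."""
    taps = ricker(scale)
    k = len(taps)
    total = 0.0
    count = 0
    for start in range(len(signal) - k + 1):
        y = sum(taps[j] * signal[start + j] for j in range(k))
        total += y * y
        count += 1
    return total / count

def tune_scale(signal, scale, lr=0.1, steps=100, eps=1e-4):
    """Gradient ascent on the response energy with respect to the scale.

    The gradient is a central finite difference, used here as a stand-in
    for the back-propagated gradient a deep learning framework would supply.
    """
    for _ in range(steps):
        grad = (filter_energy(signal, scale + eps)
                - filter_energy(signal, scale - eps)) / (2.0 * eps)
        scale = max(0.5, scale + lr * grad)  # keep the scale positive
    return scale

# Toy input: a sinusoid with a period of 20 samples (hypothetical data).
signal = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
tuned = tune_scale(signal, scale=2.0)
print(f"initial scale 2.0 -> tuned scale {tuned:.2f}")
```

In a real network the objective would be the spoof-detection loss rather than response energy, and one scale would be learned per filter in the layer.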


