Speech Replay Detection with x-Vector Attack Embeddings and Spectral Features

09/23/2019
by Jennifer Williams, et al.

We present our system submission to the ASVspoof 2019 Challenge Physical Access (PA) task. The objective of this challenge was to develop a countermeasure that classifies speech audio as either bona fide or intercepted and replayed. The target prediction was a value indicating whether a speech segment was bona fide (positive values) or "spoofed" (negative values). Our system used convolutional neural networks (CNNs) and a representation of the speech audio that combined x-vector attack embeddings with signal processing features. The x-vector attack embeddings were created from mel-frequency cepstral coefficients (MFCCs) using a time-delay neural network (TDNN). These embeddings jointly modeled 27 different environments and 9 types of attacks from the labeled data. We also used sub-band spectral centroid magnitude coefficients (SCMCs) as features. During training, we included an additive Gaussian noise layer to augment the data and make our system more robust to previously unseen attack examples. We report system performance using the tandem detection cost function (tDCF) and equal error rate (EER). Our approach performed better than both of the challenge baselines. Our results suggest that the x-vector attack embeddings can help regularize the CNN predictions even when environments or attacks are more challenging.
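The additive Gaussian noise augmentation mentioned above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not state the framework, the noise standard deviation, or the feature dimensions, so the function name `gaussian_noise_layer`, the value `stddev=0.1`, and the batch shape below are all assumptions chosen for illustration. The sketch only shows the general idea of perturbing input features with zero-mean noise during training and leaving them untouched at inference time.

```python
import numpy as np


def gaussian_noise_layer(features, stddev=0.1, training=True, rng=None):
    """Add zero-mean Gaussian noise to a feature batch, in training mode only.

    `stddev` is a hypothetical value; the abstract does not report one.
    """
    features = np.asarray(features, dtype=np.float64)
    if not training or stddev <= 0:
        # At inference time the layer is an identity mapping.
        return features.copy()
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.normal(loc=0.0, scale=stddev, size=features.shape)
    return features + noise


# Hypothetical batch: 4 utterances, 20 coefficients, 50 frames (assumed shape).
batch = np.zeros((4, 20, 50))
noisy = gaussian_noise_layer(batch, stddev=0.1, rng=np.random.default_rng(0))
clean = gaussian_noise_layer(batch, training=False)
```

Because the noise is resampled on every call, each training epoch would see a slightly different version of the same example, which is one common motivation for using such a layer as data augmentation.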


