Phase perturbation improves channel robustness for speech spoofing countermeasures

06/06/2023
by   Yongyi Zang, et al.

In this paper, we address the problem of channel robustness in speech countermeasure (CM) systems, which are used to distinguish synthetic speech from natural human speech. On the basis of two hypotheses, we propose perturbing phase information during the training of time-domain CM systems. First, communication networks often employ lossy compression codecs that encode only magnitude information, and therefore heavily alter phase information. Second, state-of-the-art CM systems rely on phase information to identify spoofed speech. We therefore believe that the phase information loss induced by lossy compression codecs degrades CM performance on unseen channels. We first establish the dependence of time-domain CM systems on phase information by perturbing phase at evaluation time, which leads to strong degradation. We then demonstrate that perturbing phase during training leads to a significant performance improvement, whereas perturbing magnitude leads to further degradation.
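
The abstract does not spell out the exact perturbation scheme, so the following is only a minimal sketch of what phase perturbation of a time-domain training example could look like. It assumes a uniform random offset per frequency bin, applied over a whole-signal DFT rather than a short-time transform, and it uses a naive DFT so that it runs without third-party libraries. The names `dft`, `idft_real`, `perturb_phase`, and `max_shift` are hypothetical and are not taken from the paper.

```python
import cmath
import math
import random


def dft(x):
    """Naive O(n^2) discrete Fourier transform (adequate for short toy signals)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft_real(spec):
    """Inverse DFT that returns the real part of each reconstructed sample."""
    n = len(spec)
    return [
        (sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def perturb_phase(signal, max_shift, rng):
    """Add a random offset in [-max_shift, max_shift] radians to each bin's phase.

    Magnitudes are left untouched. The DC bin (and the Nyquist bin for even
    lengths) is not modified, and conjugate symmetry is re-imposed so that
    the perturbed signal stays real-valued.
    """
    n = len(signal)
    spec = dft(signal)
    out = list(spec)
    for k in range(1, (n + 1) // 2):
        shift = rng.uniform(-max_shift, max_shift)  # assumed perturbation law
        out[k] = spec[k] * cmath.exp(1j * shift)    # rotate phase, keep magnitude
        out[n - k] = out[k].conjugate()             # mirror bin for a real output
    return idft_real(out)


# Toy "utterance": two sinusoids, 32 samples.
rng = random.Random(0)
clean = [
    math.sin(2 * math.pi * 3 * t / 32) + 0.5 * math.sin(2 * math.pi * 7 * t / 32)
    for t in range(32)
]
augmented = perturb_phase(clean, math.pi / 4, rng)
```

In this sketch `augmented` has the same magnitude spectrum as `clean` but a different waveform, which is the property a phase-only augmentation needs. A practical training pipeline would more likely use a fast FFT or short-time transform from a numerical library, and the perturbation range would be a tuning choice.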

research
04/03/2021

An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems

Spoofing countermeasure (CM) systems are critical in speaker verificatio...
research
03/21/2022

Phase-Aware Spoof Speech Detection Based on Res2Net with Phase Network

The spoof speech detection (SSD) is the essential countermeasure for aut...
research
07/26/2021

UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021

In this paper, we present UR-AIR system submission to the logical access...
research
02/24/2022

Phase Continuity: Learning Derivatives of Phase Spectrum for Speech Enhancement

Modern neural speech enhancement models usually include various forms of...
research
03/08/2019

A Deep Generative Model of Speech Complex Spectrograms

This paper proposes an approach to the joint modeling of the short-time ...
research
08/11/2021

On The Compensation Between Magnitude and Phase in Speech Separation

Deep neural network (DNN) based end-to-end optimization in the complex t...
