When Automatic Voice Disguise Meets Automatic Speaker Verification

09/15/2020
by Linlin Zheng, et al.

Voice disguise is the technique of transforming a voice in order to hide the speaker's real identity. Automatic voice disguise (AVD), which modifies the spectral and temporal characteristics of a voice with various algorithms, is easy to carry out with publicly available software. AVD poses a great threat to both human listening and automatic speaker verification (ASV). In this paper, we find that ASV is not only a victim of AVD but can also be a tool to defeat some simple types of AVD. First, three types of AVD, namely pitch scaling, vocal tract length normalization (VTLN), and voice conversion (VC), are introduced as representative methods. State-of-the-art ASV methods are then used to objectively evaluate the impact of AVD on ASV in terms of equal error rate (EER). Moreover, we propose an approach that restores a disguised voice to its original version by minimizing a function of ASV scores with respect to the restoration parameters. Experiments are conducted on disguised voices from VoxCeleb, a dataset recorded in real-world noisy scenarios. The results show that, for voice disguise by pitch scaling, the proposed approach obtains an EER of around 7 [...] EER of a recently proposed baseline that uses the ratio of fundamental frequencies. The proposed approach generalizes well to restoring the disguise with nonlinear frequency warping in VTLN, reducing its EER from 34.3 [...] is difficult to restore the source speakers in VC with our approach, where more complex forms of restoration functions or other paralinguistic cues might be necessary to restore the nonlinear transform in VC. Finally, contrastive visualization of ASV features with and without restoration illustrates the role of the proposed approach in an intuitive way.
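The restoration idea in the abstract (choosing restoration parameters that minimize a function of ASV scores) can be illustrated with a toy sketch for pitch scaling. Everything here is an assumption made for illustration and not the authors' method: a voice is reduced to a short list of fundamental-frequency values in Hz, `toy_asv_score` is a hypothetical stand-in for a real ASV system, the objective is simply the negated score, and the search is a plain grid search over semitone shifts.

```python
import math


def pitch_scale(f0_track, semitones):
    """Scale every F0 value (in Hz) by the given number of semitones."""
    factor = 2.0 ** (semitones / 12.0)
    return [f * factor for f in f0_track]


def toy_asv_score(test_f0, enrolled_f0):
    """Hypothetical stand-in for an ASV score (higher = more similar).

    Uses the negated mean absolute log-F0 difference; a real system
    would compare speaker embeddings instead.
    """
    diffs = [abs(math.log(t) - math.log(e)) for t, e in zip(test_f0, enrolled_f0)]
    return -sum(diffs) / len(diffs)


def restore(disguised_f0, enrolled_f0, candidates):
    """Grid search for the restoration parameter.

    The objective being minimized is the negated toy ASV score of the
    restored voice against the enrolled voice (an assumed objective).
    """
    def objective(semitones):
        restored = pitch_scale(disguised_f0, semitones)
        return -toy_asv_score(restored, enrolled_f0)

    best = min(candidates, key=objective)
    return best, pitch_scale(disguised_f0, best)


# Toy data: an enrolled voice and the same voice disguised by +5 semitones.
enrolled = [110.0, 120.0, 115.0, 125.0]
disguised = pitch_scale(enrolled, 5.0)

# Candidate restoration shifts from -12 to +12 semitones in 0.5 steps.
candidates = [0.5 * k for k in range(-24, 25)]
best_shift, restored = restore(disguised, enrolled, candidates)
```

In this toy setup the search selects a shift of -5 semitones, which undoes the disguise. With a real ASV back end, the test utterance would differ from the enrollment utterance and the score surface would be noisier, so the grid resolution and the choice of objective would matter more.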


11/17/2020

Optimizing voice conversion network with cycle consistency loss of speaker identity

We propose a novel training scheme to optimize voice conversion network ...
08/15/2019

Speaker Verification Using Simple Temporal Features and Pitch Synchronous Cepstral Coefficients

Speaker verification is the process by which a speaker's claim of identit...
08/05/2019

V2S attack: building DNN-based voice conversion from automatic speaker verification

This paper presents a new voice impersonation attack using voice convers...
05/18/2020

Defending Your Voice: Adversarial Attack on Voice Conversion

Substantial improvements have been achieved in recent years in voice con...
10/30/2018

Generative Adversarial Networks for Unpaired Voice Transformation on Impaired Speech

This paper focuses on using voice conversion (VC) to improve the speech ...
06/16/2021

Voicy: Zero-Shot Non-Parallel Voice Conversion in Noisy Reverberant Environments

Voice Conversion (VC) is a technique that aims to transform the non-ling...
04/25/2020

Active Voice Authentication

Active authentication refers to a new mode of identity verification in w...