Real-time Detection of AI-Generated Speech for DeepFake Voice Conversion

08/24/2023
by   Jordan J. Bird, et al.

Generative AI in the speech domain now enables voice cloning and real-time voice conversion from one individual to another, with growing implications. This technology poses a significant ethical threat and could lead to breaches of privacy and misrepresentation; there is therefore an urgent need for real-time detection of AI-generated speech for DeepFake Voice Conversion. To address these emerging issues, this study generates the DEEP-VOICE dataset, comprising real human speech from eight well-known figures and their speech converted to one another using Retrieval-based Voice Conversion. The task is framed as a binary classification problem of whether speech is real or AI-generated; statistical analysis of temporal audio features through t-testing reveals significantly different distributions between the two classes. Hyperparameter optimisation is implemented for machine learning models to identify the source of speech. Following the training of 208 individual machine learning models over 10-fold cross validation, it is found that the Extreme Gradient Boosting model can achieve an average classification accuracy of 99.3% and can classify speech in real-time, at around 0.004 milliseconds given one second of speech. All data generated for this study is released publicly for future research on AI speech detection.
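
To make the t-testing step concrete, the following is a minimal sketch of a Welch two-sample t-test comparing one audio feature between real and AI-generated speech. It is not the authors' code: the feature values are synthetic Gaussian draws, and the helper name `welch_t` and the sample sizes are assumptions chosen purely for illustration.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    # Squared standard error of each sample mean (sample variance / n).
    se_a = statistics.variance(a) / len(a)
    se_b = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (
        se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1)
    )
    return t, df


# Synthetic stand-ins for one temporal feature extracted per second of audio.
# Assumption: the two classes differ in mean; real data would come from the dataset.
rng = random.Random(0)
real = [rng.gauss(0.0, 1.0) for _ in range(500)]
fake = [rng.gauss(0.5, 1.0) for _ in range(500)]

t_stat, dof = welch_t(real, fake)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof:.1f}")
```

A large absolute t statistic suggests the feature is distributed differently across the two classes. In practice a statistics library would also be used to obtain the p-value, and the test would be repeated for each extracted feature.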
