AI-Synthesized Voice Detection Using Neural Vocoder Artifacts

04/25/2023
by   Chengzhe Sun, et al.

Advances in AI-synthesized human voices have created a growing threat of impersonation and disinformation, making it crucial to develop methods for detecting synthetic human voices. This study proposes a new approach that identifies synthetic human voices by detecting the artifacts that vocoders leave in audio signals. Most DeepFake audio synthesis models use a neural vocoder, a neural network that generates waveforms from temporal-frequency representations such as mel-spectrograms. If we can identify neural vocoder processing in an audio sample, we can determine that the sample is synthesized. To detect synthetic human voices, we introduce a multi-task learning framework in which a binary-class RawNet2 model shares its feature extractor with a vocoder identification module. Treating vocoder identification as a pretext task constrains the feature extractor to focus on vocoder artifacts and to provide discriminative features for the final binary classifier. Our experiments show that the improved RawNet2 model based on vocoder identification achieves high classification performance on the binary real-versus-synthetic task.
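The multi-task idea in the abstract can be illustrated with a minimal sketch: one shared feature vector feeds two classification heads, and the training loss combines the binary (real vs. synthetic) loss with the vocoder identification loss. This is not the authors' implementation. The function names, the linear heads, the weighting parameter `lam`, and the three-class vocoder head are all hypothetical, and the real model uses a RawNet2 feature extractor rather than a hand-written feature list.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    # Negative log-likelihood of the target class index.
    return -math.log(softmax(logits)[target])

def linear(features, weights):
    # weights holds one row per output class; returns one logit per class.
    return [sum(w * f for w, f in zip(row, features)) for row in weights]

def multitask_loss(features, binary_head, vocoder_head,
                   is_fake, vocoder_id, lam=0.5):
    # Both heads read the SAME shared features, so gradients from the
    # vocoder task would also shape the feature extractor.
    # lam is an assumed weighting, not a value taken from the paper.
    binary_loss = cross_entropy(linear(features, binary_head), is_fake)
    vocoder_loss = cross_entropy(linear(features, vocoder_head), vocoder_id)
    return lam * binary_loss + (1.0 - lam) * vocoder_loss

# Toy usage: a 3-dimensional shared feature, 2 binary classes,
# and 3 hypothetical vocoder classes.
features = [0.2, -0.1, 0.4]
binary_head = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
vocoder_head = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
loss = multitask_loss(features, binary_head, vocoder_head,
                      is_fake=1, vocoder_id=2)
```

With all-zero head weights, both heads predict uniformly, so the combined loss is the weighted average of log 2 and log 3. In a real training loop the same scalar would be backpropagated through both heads and the shared extractor.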

