Detection of AI Synthesized Hindi Speech

03/07/2022
by   Karan Bhatia, et al.

Recent advances in generative speech models have made it possible to generate highly realistic speech signals. Artificially synthesized signals such as speech clones or deepfakes may seem exciting at first, but left unchecked they could lead to a digital dystopia. One of the primary focuses of audio forensics is validating the authenticity of speech. Although some solutions have been proposed for English speech, the detection of synthetic Hindi speech has not received much attention. Here, we propose an approach for discriminating AI-synthesized Hindi speech from actual human speech. We exploit the Bicoherence Phase, Bicoherence Magnitude, Mel Frequency Cepstral Coefficients (MFCC), Delta Cepstral, and Delta Square Cepstral as discriminating features for machine learning models. We also extend the study to deep neural networks, specifically VGG16 and a custom-built CNN, for extensive experiments. We obtained an accuracy of 99.83
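As a rough illustration of two of the features named above, the sketch below derives delta and second-order delta coefficients from a sequence of cepstral frames. It assumes the widely used regression formula with edge frames repeated as padding, and it assumes that "Delta Square Cepstral" means the delta of the delta; the abstract confirms neither. The function names `delta` and `stack_features` and the `width` parameter are hypothetical, and the MFCC frames are taken as already computed.

```python
from typing import List

# One list of cepstral coefficients (e.g. MFCCs) per analysis frame.
Frames = List[List[float]]


def delta(frames: Frames, width: int = 2) -> Frames:
    """Regression-based delta features (assumed formula, not from the paper).

    d[t] = sum_{n=1..width} n * (c[t+n] - c[t-n]) / (2 * sum_{n=1..width} n^2)
    Frames beyond either end are replaced by the nearest edge frame.
    """
    if width < 1:
        raise ValueError("width must be >= 1")
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out: Frames = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(frames: Frames, width: int = 2) -> Frames:
    """Concatenate static, delta, and second-order delta coefficients per frame."""
    first = delta(frames, width)
    second = delta(first, width)  # assumption: "delta square" = delta of delta
    return [c + d1 + d2 for c, d1, d2 in zip(frames, first, second)]


if __name__ == "__main__":
    # Toy input: three frames of two coefficients each, not real MFCCs.
    toy = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0]]
    for row in stack_features(toy):
        print(row)
```

Each stacked row could then feed a classifier alongside the bicoherence features; how the paper actually combines them is not stated in the abstract.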


