Using Deep Learning Techniques and Inferential Speech Statistics for AI Synthesised Speech Recognition

07/23/2021
by Arun Kumar Singh, et al.

Recent developments in technology have rewarded us with amazing audio synthesis models like TACOTRON and WAVENETS. On the other hand, these models pose greater threats, such as speech clones and deep fakes, that may go undetected. To tackle these alarming situations, there is an urgent need for models that can discriminate synthesized speech from actual human speech and also identify the source of such synthesis. Here, we propose a model based on a Convolutional Neural Network (CNN) and a Bidirectional Recurrent Neural Network (BiRNN) that achieves both of these objectives. The temporal dependencies present in AI synthesized speech are exploited using the Bidirectional RNN and the CNN. The model outperforms the state-of-the-art approaches by classifying the AI synthesized audio from real human speech with an error rate of 1.9
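
To make the CNN plus BiRNN idea above concrete, here is a minimal forward-pass sketch in plain Python. It is not the authors' implementation: the layer sizes, the single convolution layer, the simple tanh recurrence, the two softmax heads (real vs. synthesized, and source of synthesis), and every function name are assumptions chosen for illustration, and the weights are random and untrained.

```python
import math
import random


def conv1d(frames, kernels, bias):
    """Valid 1-D convolution over time, followed by ReLU.

    frames: T x F feature sequence; kernels: C kernels, each K x F.
    Returns a (T - K + 1) x C sequence.
    """
    k = len(kernels[0])
    out = []
    for t in range(len(frames) - k + 1):
        row = []
        for c, kern in enumerate(kernels):
            s = bias[c]
            for dt in range(k):
                for f, w in enumerate(kern[dt]):
                    s += w * frames[t + dt][f]
            row.append(max(0.0, s))
        out.append(row)
    return out


def rnn_pass(seq, w_in, w_rec, reverse=False):
    """Simple tanh RNN over seq; returns the final hidden state."""
    h = [0.0] * len(w_rec)
    steps = reversed(seq) if reverse else seq
    for x in steps:
        h = [math.tanh(sum(w_in[j][i] * x[i] for i in range(len(x)))
                       + sum(w_rec[j][i] * h[i] for i in range(len(h))))
             for j in range(len(h))]
    return h


def linear(x, w):
    """Matrix-vector product without a bias term."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def softmax(z):
    """Numerically stable softmax."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def rand_matrix(rng, rows, cols, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)]
            for _ in range(rows)]


def build_model(n_features, n_channels=4, kernel=3, hidden=5,
                n_sources=3, seed=0):
    """Random, untrained weights; all sizes are illustrative assumptions."""
    rng = random.Random(seed)
    return {
        "kernels": [rand_matrix(rng, kernel, n_features)
                    for _ in range(n_channels)],
        "bias": [0.0] * n_channels,
        "fw_in": rand_matrix(rng, hidden, n_channels),
        "fw_rec": rand_matrix(rng, hidden, hidden),
        "bw_in": rand_matrix(rng, hidden, n_channels),
        "bw_rec": rand_matrix(rng, hidden, hidden),
        "real_fake": rand_matrix(rng, 2, 2 * hidden),
        "source": rand_matrix(rng, n_sources, 2 * hidden),
    }


def forward(model, frames):
    """CNN features -> forward and backward RNN -> two softmax heads."""
    feats = conv1d(frames, model["kernels"], model["bias"])
    h_fw = rnn_pass(feats, model["fw_in"], model["fw_rec"])
    h_bw = rnn_pass(feats, model["bw_in"], model["bw_rec"], reverse=True)
    h = h_fw + h_bw  # concatenate the two directions
    return (softmax(linear(h, model["real_fake"])),
            softmax(linear(h, model["source"])))


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical input: 20 frames of 8 spectral features each.
    frames = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(20)]
    p_real_fake, p_source = forward(build_model(n_features=8), frames)
    print(p_real_fake, p_source)
```

The convolution summarizes short local windows of frames, while the two recurrent passes read the resulting sequence in opposite directions so that the final representation reflects both past and future context. A trained system would learn these weights from labeled real and synthesized speech.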


research
03/07/2022

Detection of AI Synthesized Hindi Speech

The recent advancements in generative artificial speech models have made...
research
09/13/2022

Deep Speech Synthesis from Articulatory Representations

In the articulatory synthesis task, speech is synthesized from input fea...
research
09/03/2020

Detection of AI-Synthesized Speech Using Cepstral Bispectral Statistics

Digital technology has made possible unimaginable applications come true...
research
01/02/2017

Vid2speech: Speech Reconstruction from Silent Video

Speechreading is a notoriously difficult task for humans to perform. In ...
research
02/21/2020

AutoFoley: Artificial Synthesis of Synchronized Sound Tracks for Silent Videos with Deep Learning

In movie productions, the Foley Artist is responsible for creating an ov...
research
02/26/2018

Deep Feed-forward Sequential Memory Networks for Speech Synthesis

The Bidirectional LSTM (BLSTM) RNN based speech synthesis system is amon...
research
12/09/2012

High-dimensional sequence transduction

We investigate the problem of transforming an input sequence into a high...
