Synthesized Speech Detection Using Convolutional Transformer-Based Spectrogram Analysis

05/03/2022
by Emily R. Bartusiak, et al.

Synthesized speech is common today due to the prevalence of virtual assistants, easy-to-use tools for generating and modifying speech signals, and remote work practices. Synthesized speech can also be used for nefarious purposes, including creating a purported speech signal and attributing it to someone who did not speak its content. We need methods to detect whether a speech signal is synthesized. In this paper, we analyze speech signals in the form of spectrograms with a Compact Convolutional Transformer (CCT) for synthesized speech detection. A CCT utilizes a convolutional layer that introduces inductive biases and shared weights into a network, allowing a transformer architecture to perform well with fewer training samples. The CCT uses an attention mechanism to incorporate information from all parts of a signal under analysis. We demonstrate that our CCT approach, trained on both genuine and synthesized human voice signals, successfully differentiates between genuine and synthesized speech signals.
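The pipeline the abstract describes (spectrogram, convolutional layer with shared weights, attention over all parts of the signal, binary decision) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: every function name, the frame and kernel sizes, the single attention head, the mean pooling step, and the random untrained weights are assumptions chosen for illustration, so the printed score carries no meaning about the input.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude STFT spectrogram, shape (freq_bins, frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # (frames, freq_bins)
    return np.log1p(magnitude).T

def conv_tokenize(spec, kernels, stride=4):
    """Slide the same kernels over every patch (shared weights) and
    apply ReLU; each patch position yields one token."""
    _, kh, kw = kernels.shape
    height, width = spec.shape
    tokens = []
    for r in range(0, height - kh + 1, stride):
        for c in range(0, width - kw + 1, stride):
            patch = spec[r:r + kh, c:c + kw]
            response = np.tensordot(kernels, patch, axes=([1, 2], [0, 1]))
            tokens.append(np.maximum(0.0, response))
    return np.stack(tokens)  # (n_tokens, n_filters)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product attention: every token attends
    to every other token, so all parts of the signal contribute."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    weights = softmax(q @ k.T / np.sqrt(k.shape[1]))
    return weights @ v, weights

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical input: a 1-second, 16 kHz sine tone standing in for speech.
rng = np.random.default_rng(0)
signal = np.sin(2 * np.pi * 440.0 * np.arange(16000) / 16000.0)

spec = spectrogram(signal)
kernels = rng.standard_normal((16, 8, 8)) * 0.1   # untrained, assumed sizes
tokens = conv_tokenize(spec, kernels)
w_q, w_k, w_v = (rng.standard_normal((16, 16)) * 0.1 for _ in range(3))
attended, _ = self_attention(tokens, w_q, w_k, w_v)

# Mean pooling and a linear head are simplifications assumed here.
pooled = attended.mean(axis=0)
w_out = rng.standard_normal(16) * 0.1
print("score (untrained, illustrative only):", sigmoid(pooled @ w_out))
```

In a trained model, the kernels, attention projections, and output weights would be learned from labeled genuine and synthesized speech; the sketch only shows how data flows through the stages.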


research
10/14/2022

Transformer-Based Speech Synthesizer Attribution in an Open Set Scenario

Speech synthesis methods can create realistic-sounding speech, which may...
research
06/03/2021

An Improved Model for Voicing Silent Speech

In this paper, we present an improved model for voicing silent speech, w...
research
09/03/2020

Detection of AI-Synthesized Speech Using Cepstral Bispectral Statistics

Digital technology has made possible unimaginable applications come true...
research
03/07/2022

Detection of AI Synthesized Hindi Speech

The recent advancements in generative artificial speech models have made...
research
05/03/2021

Full-Reference Speech Quality Estimation with Attentional Siamese Neural Networks

In this paper, we present a full-reference speech quality prediction mod...
research
07/04/2017

Hidden-Markov-Model Based Speech Enhancement

The goal of this contribution is to use a parametric speech synthesis sy...
research
07/05/2022

Ultra-Low-Bitrate Speech Coding with Pretrained Transformers

Speech coding facilitates the transmission of speech over low-bandwidth ...
