Single and Multi-Speaker Cloned Voice Detection: From Perceptual to Learned Features

07/15/2023
by Sarah Barrington, et al.

Synthetic-voice cloning technologies have advanced significantly in recent years, giving rise to a range of potential harms. From small- and large-scale financial fraud to disinformation campaigns, these harms make reliable methods for distinguishing real from synthesized voices imperative. We describe three techniques for distinguishing a real voice from a cloned voice designed to impersonate a specific person. The three approaches differ in their feature-extraction stage: they range from low-dimensional perceptual features, which offer high interpretability but lower accuracy, through generic spectral features, to end-to-end learned features, which offer less interpretability but higher accuracy. We show the efficacy of these approaches when trained on a single speaker's voice and when trained on multiple voices. The learned features consistently yield an equal error rate between 0% and 4% and are reasonably robust to adversarial laundering.
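
The equal error rate quoted above is the operating point at which a detector wrongly flags real voices as often as it misses cloned ones. The following is a minimal sketch of how such a figure could be computed from detector scores. It is not the authors' evaluation code: the function name `equal_error_rate`, the convention that a higher score means "more likely cloned", and the simple threshold sweep are all assumptions made for illustration.

```python
def equal_error_rate(scores, labels):
    """Approximate the equal error rate (EER) of a binary detector.

    Assumed conventions (illustrative, not taken from the paper):
      labels: 1 = cloned voice, 0 = real voice
      scores: higher value = detector believes "cloned" more strongly
    """
    cloned = [s for s, y in zip(scores, labels) if y == 1]
    real = [s for s, y in zip(scores, labels) if y == 0]
    if not cloned or not real:
        raise ValueError("need at least one real and one cloned example")

    # Candidate thresholds: every observed score, plus one above the maximum
    # so that the "flag nothing" operating point is also considered.
    candidates = sorted(set(scores)) + [max(scores) + 1.0]

    best_gap, eer = None, None
    for t in candidates:
        # Real voices wrongly flagged as cloned at threshold t.
        false_positive_rate = sum(s >= t for s in real) / len(real)
        # Cloned voices wrongly passed as real at threshold t.
        false_negative_rate = sum(s < t for s in cloned) / len(cloned)
        gap = abs(false_positive_rate - false_negative_rate)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            # Where the two rates do not cross exactly, report their mean.
            eer = (false_positive_rate + false_negative_rate) / 2
    return eer


if __name__ == "__main__":
    # Toy scores, invented purely to exercise the function.
    print(equal_error_rate([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1]))
```

With finite data the two error rates rarely match exactly, so this sketch picks the threshold where they are closest and averages them; interpolating along the ROC curve is a common alternative.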

