DeID-VC: Speaker De-identification via Zero-shot Pseudo Voice Conversion

09/09/2022
by   Ruibin Yuan, et al.

The widespread adoption of speech-based online services raises security and privacy concerns regarding the data that they use and share. If the data were compromised, attackers could exploit user speech to bypass speaker verification systems or even impersonate users. To mitigate this, we propose DeID-VC, a speaker de-identification system that converts a real speaker to pseudo speakers, thus removing or obfuscating the speaker-dependent attributes from a spoken voice. The key components of DeID-VC include a Variational Autoencoder (VAE) based Pseudo Speaker Generator (PSG) and a voice conversion Autoencoder (AE) under zero-shot settings. With the help of PSG, DeID-VC can assign unique pseudo speakers at the speaker level or even at the utterance level. In addition, two novel learning objectives are added to bridge the gap between training and inference of zero-shot voice conversion. We present our experimental results with word error rate (WER) and equal error rate (EER), along with three subjective metrics to evaluate the generated output of DeID-VC. The results show that our method substantially improves intelligibility (WER 10% lower) and de-identification effectiveness (EER 5% higher). Code and listening demo: https://github.com/a43992899/DeID-VC
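
For readers unfamiliar with the two objective metrics named in the abstract, the following is a minimal sketch of how WER and EER are commonly computed. It is not the authors' evaluation code: the function names `word_error_rate` and `equal_error_rate` are hypothetical, and the sketch assumes that transcripts from some speech recognizer and similarity scores from some speaker verification system are already available. In a de-identification setting, a lower WER suggests the converted speech stays intelligible, while a higher EER suggests the verifier has more trouble linking the converted speech to the original speaker.

```python
from typing import Sequence


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def equal_error_rate(genuine: Sequence[float], impostor: Sequence[float]) -> float:
    """Approximate EER by sweeping thresholds over the observed scores.

    genuine:  verifier scores for same-speaker trials (higher = more similar)
    impostor: verifier scores for different-speaker trials
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")
    best_gap, eer = float("inf"), 1.0
    for threshold in sorted(set(genuine) | set(impostor)):
        # False acceptance: impostor trial scored at or above the threshold.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        # False rejection: genuine trial scored below the threshold.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy inputs, invented purely for illustration.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
    print(equal_error_rate([0.9, 0.8, 0.3, 0.7], [0.1, 0.2, 0.75, 0.4]))
```

The threshold sweep is a simple approximation; evaluation toolkits typically interpolate along the ROC curve to locate the crossing point more precisely.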

research
03/18/2022

DGC-vector: A new speaker embedding for zero-shot voice conversion

Recently, more and more zero-shot voice conversion algorithms have been ...
research
05/18/2020

Design Choices for X-vector Based Speaker Anonymization

The recently proposed x-vector based anonymization scheme converts any i...
research
11/09/2020

Speaker De-identification System using Autoencoders and Adversarial Training

The fast increase of web services and mobile apps, which collect persona...
research
03/30/2022

Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion

Traditional studies on voice conversion (VC) have made progress with par...
research
05/30/2019

Speaker Anonymization Using X-vector and Neural Waveform Models

The social media revolution has produced a plethora of web services to w...
research
11/08/2018

Who Do I Sound Like? Showcasing Speaker Recognition Technology by YouTube Voice Search

The popularization of science can often be disregarded by scientists as ...
research
11/15/2022

Improved disentangled speech representations using contrastive learning in factorized hierarchical variational autoencoder

By utilizing the fact that speaker identity and content vary on differen...
