Speaker Anonymization Using X-vector and Neural Waveform Models

by Fuming Fang, et al.

The social media revolution has produced a plethora of web services to which users can easily upload and share multimedia documents. Despite the popularity and convenience of such services, the sharing of such inherently personal data, including speech data, raises obvious security and privacy concerns. In particular, a user's speech data may be acquired and used with speech synthesis systems to produce high-quality speech utterances that reflect the user's speaker identity. These utterances may then be used to attack speaker verification systems. One way to mitigate these concerns is to conceal speaker identities before speech data is shared. For this purpose, we present a new approach to speaker anonymization. The idea is to extract linguistic and speaker identity features from an utterance and then to use these with neural acoustic and waveform models to synthesize anonymized speech. The original speaker identity, in the form of timbre, is suppressed and replaced with that of an anonymous pseudo identity. The approach exploits state-of-the-art x-vector speaker representations. These are used to derive anonymized pseudo speaker identities through the combination of multiple, random speaker x-vectors. Experimental results show that the proposed approach is effective in concealing speaker identities. It increases the equal error rate of a speaker verification system while maintaining high-quality anonymized speech.
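The derivation of a pseudo speaker identity from multiple random x-vectors can be illustrated with a minimal sketch. The abstract only says the x-vectors are "combined"; the plain averaging used here is an assumption, as are the function names, the pool size, the vector dimension, and the number of vectors selected. Random Gaussian vectors stand in for real x-vectors, which would normally come from a trained speaker-embedding extractor.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pseudo_xvector(pool, n_select, rng):
    """Average n_select x-vectors drawn at random (without replacement)
    from pool. Averaging is an assumed form of "combination"."""
    chosen = rng.sample(pool, n_select)
    dim = len(chosen[0])
    return [sum(vec[i] for vec in chosen) / n_select for i in range(dim)]


# Toy data: random vectors as stand-ins for real x-vectors (hypothetical sizes).
rng = random.Random(0)
DIM = 16
pool = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(200)]
original = [rng.gauss(0.0, 1.0) for _ in range(DIM)]

# The pseudo identity replaces the original x-vector before synthesis.
anonymized = pseudo_xvector(pool, n_select=50, rng=rng)
print(round(cosine_similarity(original, anonymized), 3))
```

In a full system the resulting vector, together with the linguistic features, would condition the acoustic and waveform models. The similarity printed here only shows how far the pseudo identity sits from the original in this toy embedding space.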


