NWPU-ASLP System for the VoicePrivacy 2022 Challenge

09/24/2022
by Jixun Yao, et al.

This paper presents the NWPU-ASLP speaker anonymization system for the VoicePrivacy 2022 Challenge. Our submission does not involve an additional Automatic Speaker Verification (ASV) model or an x-vector pool. The system consists of four modules: a feature extractor, an acoustic model, an anonymization module, and a neural vocoder. First, the feature extractor extracts the Phonetic Posteriorgram (PPG) and pitch from the input speech signal. Then, we reserve a pseudo speaker ID in a speaker look-up table (LUT); this ID is fed into a speaker encoder to generate a pseudo speaker embedding that does not correspond to any real speaker. To ensure that the pseudo speaker is distinguishable, we further average the embeddings of randomly selected speakers and concatenate the result, with weighting, with the pseudo speaker embedding to generate the anonymized speaker embedding. Finally, the acoustic model outputs the anonymized mel-spectrogram conditioned on the anonymized speaker embedding, and a modified version of HifiGAN transforms the mel-spectrogram into the anonymized speech waveform. Experimental results demonstrate the effectiveness of the proposed anonymization system.
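The embedding-mixing step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how many speakers are averaged or how the weighting is applied, so the function name `anonymize_embedding`, the parameters `num_selected` and `weight`, and the choice to scale the two halves by `weight` and `1 - weight` before concatenation are all assumptions made for illustration.

```python
import random


def anonymize_embedding(speaker_embeddings, pseudo_embedding,
                        num_selected=3, weight=0.5, seed=None):
    """Hypothetical sketch of the anonymized speaker embedding.

    speaker_embeddings: list of equal-length float lists (assumed to be
        embeddings of speakers from the look-up table).
    pseudo_embedding: float list for the reserved pseudo speaker ID.
    num_selected, weight: assumed hyperparameters, not from the paper.
    """
    rng = random.Random(seed)
    # Randomly pick distinct speaker embeddings.
    selected = rng.sample(speaker_embeddings, num_selected)
    dim = len(pseudo_embedding)
    # Element-wise average of the selected embeddings.
    averaged = [sum(vec[i] for vec in selected) / num_selected
                for i in range(dim)]
    # Assumed weighting: scale each part, then concatenate.
    return ([weight * x for x in averaged]
            + [(1.0 - weight) * x for x in pseudo_embedding])


if __name__ == "__main__":
    table = [[float(i + j) for j in range(4)] for i in range(10)]
    pseudo = [0.1, 0.2, 0.3, 0.4]
    print(anonymize_embedding(table, pseudo, seed=0))
```

Under these assumptions the output has twice the dimensionality of a single embedding, so an acoustic model consuming it would need a matching input size.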


