Vocoder drift compensation by x-vector alignment in speaker anonymisation

07/17/2023
by Michele Panariello, et al.

In the most popular x-vector-based approaches to speaker anonymisation, the bulk of the anonymisation can stem from vocoding rather than from the core anonymisation function, which substitutes the original speaker's x-vector with that of a fictitious pseudo-speaker. This phenomenon, known as vocoder drift, can impede the design of better anonymisation systems because it limits fine-grained control over the x-vector space. The work reported in this paper explores the origin of vocoder drift and shows that it is due to the mismatch between the substituted x-vector and the original representations of linguistic content, intonation and prosody. Also reported is an original approach to vocoder drift compensation. While anonymisation performance degrades as expected, compensation reduces vocoder drift substantially, offers improved control over the x-vector space and lays a foundation for the design of better anonymisation functions in the future.
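To make the notion of drift concrete, the sketch below shows one plausible way to quantify it: compare the pseudo-speaker x-vector fed to the vocoder with the x-vector re-extracted from the vocoded speech. The use of cosine distance, the 512-dimensional embeddings and the function names are assumptions made for illustration; the abstract does not state which metric or extractor the authors use, and the vectors here are synthetic stand-ins rather than real embeddings.

```python
import numpy as np


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def vocoder_drift(target_xvector, reextracted_xvector):
    """Hypothetical drift measure (an assumption, not the paper's definition).

    target_xvector: pseudo-speaker x-vector given to the vocoder.
    reextracted_xvector: x-vector extracted from the vocoded output.
    A value near 0 means the output speech carries the intended identity.
    """
    return cosine_distance(target_xvector, reextracted_xvector)


# Synthetic stand-ins: the "re-extracted" vector is the target plus noise,
# mimicking a vocoder that shifts the speaker identity away from the target.
rng = np.random.default_rng(0)
target = rng.standard_normal(512)
reextracted = target + 0.5 * rng.standard_normal(512)

drift = vocoder_drift(target, reextracted)
print(f"drift = {drift:.3f}")
```

Under this assumed measure, a compensation method would aim to push the drift towards zero, so that the x-vector chosen by the anonymisation function is the one actually realised in the output speech.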

research
05/18/2020

Design Choices for X-vector Based Speaker Anonymization

The recently proposed x-vector based anonymization scheme converts any i...
research
05/30/2023

Language-independent speaker anonymization using orthogonal Householder neural network

Speaker anonymization aims to conceal a speaker's identity while preserv...
research
11/19/2015

Compressing Word Embeddings

Recent methods for learning vector space representations of words have s...
research
05/12/2017

Monaural Audio Speaker Separation with Source Contrastive Estimation

We propose an algorithm to separate simultaneously speaking persons from...
research
08/21/2023

Multi-GradSpeech: Towards Diffusion-based Multi-Speaker Text-to-speech Using Consistent Diffusion Models

Despite imperfect score-matching causing drift in training and sampling ...
research
06/28/2022

A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion

Typically, singing voice conversion (SVC) depends on an embedding vector...
research
06/12/2020

Improved Fixed-Budget Results via Drift Analysis

Fixed-budget theory is concerned with computing or bounding the fitness ...