Many-to-Many Voice Conversion based Feature Disentanglement using Variational Autoencoder

07/11/2021
by   Manh Luong, et al.

Voice conversion is a challenging task that transforms the voice characteristics of a source speaker into those of a target speaker without changing the linguistic content. Recently, many works on many-to-many voice conversion (VC) based on Variational Autoencoders (VAEs) have achieved good results. However, these methods lack the ability to disentangle speaker identity from linguistic content, which is needed to perform well in unseen-speaker scenarios. In this paper, we propose a new method based on feature disentanglement to tackle many-to-many voice conversion. The method can disentangle speaker identity and linguistic content from utterances, and it can convert from many source speakers to many target speakers with a single autoencoder network. Moreover, it naturally handles unseen target speakers. We perform both objective and subjective evaluations to show that the proposed method is competitive with other state-of-the-art models in terms of naturalness and target speaker similarity.

research
05/26/2020

Adversarial Contrastive Predictive Coding for Unsupervised Learning of Disentangled Representations

In this work we tackle disentanglement of speaker and content related va...
research
10/20/2022

DisC-VC: Disentangled and F0-Controllable Neural Voice Conversion

Voice conversion is a task to convert a non-linguistic feature of a give...
research
03/04/2020

A Robust Speaker Clustering Method Based on Discrete Tied Variational Autoencoder

Recently, the speaker clustering model based on aggregation hierarchy cl...
research
09/15/2023

Controllable Residual Speaker Representation for Voice Conversion

Recently, there have been significant advancements in voice conversion, ...
research
10/27/2016

Voice Conversion using Convolutional Neural Networks

The human auditory system is able to distinguish the vocal source of tho...
research
03/04/2021

crank: An Open-Source Software for Nonparallel Voice Conversion Based on Vector-Quantized Variational Autoencoder

In this paper, we present an open-source software for developing a nonpa...
research
11/23/2022

Voice-preserving Zero-shot Multiple Accent Conversion

Most people who have tried to learn a foreign language would have experi...
