Voice-preserving Zero-shot Multiple Accent Conversion

11/23/2022
by Mumin Jin, et al.

Most people who have tried to learn a foreign language have had difficulty understanding or speaking with a native speaker's accent. For native speakers, understanding or speaking a new accent is likewise difficult. An accent conversion system that changes a speaker's accent while preserving that speaker's voice identity, such as timbre and pitch, has potential applications in communication, language learning, and entertainment. Existing accent conversion models tend to change the speaker identity and the accent at the same time. Here, we use adversarial learning to disentangle accent-dependent features while retaining other acoustic characteristics. What sets our work apart from existing accent conversion models is the ability to convert an unseen speaker's utterance to multiple accents while preserving the speaker's original voice identity. Subjective evaluations show that our model generates audio that sounds closer to the target accent while still sounding like the original speaker.

research
10/24/2020

GAZEV: GAN-Based Zero-Shot Voice Conversion over Non-parallel Speech Corpus

Non-parallel many-to-many voice conversion is recently attracting huge ...
research
09/05/2023

Evaluating Methods for Ground-Truth-Free Foreign Accent Conversion

Foreign accent conversion (FAC) is a special application of voice conver...
research
04/10/2020

Generating Multilingual Voices Using Speaker Space Translation Based on Bilingual Speaker Data

We present progress towards bilingual Text-to-Speech which is able to tr...
research
07/11/2021

Many-to-Many Voice Conversion based Feature Disentanglement using Variational Autoencoder

Voice conversion is a challenging task which transforms the voice charac...
research
05/18/2020

Defending Your Voice: Adversarial Attack on Voice Conversion

Substantial improvements have been achieved in recent years in voice con...
research
10/20/2022

DisC-VC: Disentangled and F0-Controllable Neural Voice Conversion

Voice conversion is a task to convert a non-linguistic feature of a give...
research
06/28/2022

A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion

Typically, singing voice conversion (SVC) depends on an embedding vector...
