Data Augmentation for Diverse Voice Conversion in Noisy Environments

05/18/2023
by Avani Tanna, et al.

Voice conversion (VC) models have demonstrated impressive few-shot conversion quality on the clean, native speech populations they are trained on. However, when the accents of the source or target speech, the background noise conditions, or the microphone characteristics differ from those seen in training, conversion quality is not guaranteed. These problems are often left unexamined in VC research, which frustrates users who try to apply pretrained VC models to their own data. We are interested in accent-preserving voice conversion for name pronunciation from self-recorded examples, a domain in which all three of these mismatches are present, and we posit that higher performance in this domain correlates with VC models that are more usable by otherwise frustrated users. We demonstrate that existing state-of-the-art (SOTA) encoder-decoder VC models can be made robust to these variations and endowed with natural denoising capabilities by using more diverse data and simple data augmentation techniques in pretraining.
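
The abstract does not say which augmentations were used. As an illustration only, one common "simple" augmentation for noise robustness is to mix a noise recording into clean speech at a chosen signal-to-noise ratio (SNR). The sketch below assumes audio is a plain list of float samples; the function names `rms` and `mix_at_snr` are hypothetical and are not taken from the paper.

```python
import math
import random


def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(clean, noise, snr_db):
    """Return clean + scaled noise so that the mixture has the given SNR in dB.

    Illustrative helper (an assumption, not the paper's method).
    """
    noise = list(noise)
    if len(noise) < len(clean):
        # Tile the noise so it covers the whole utterance.
        noise = noise * (len(clean) // len(noise) + 1)
    noise = noise[:len(clean)]

    clean_rms, noise_rms = rms(clean), rms(noise)
    if clean_rms == 0.0 or noise_rms == 0.0:
        # Nothing to scale against; return the clean signal unchanged.
        return list(clean)

    # Solve snr_db = 20 * log10(clean_rms / (gain * noise_rms)) for gain.
    gain = clean_rms / (noise_rms * 10 ** (snr_db / 20.0))
    return [c + gain * n for c, n in zip(clean, noise)]


# Example: a 220 Hz tone at 16 kHz mixed with Gaussian noise at a random SNR.
rng = random.Random(0)
clean = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
noisy = mix_at_snr(clean, noise, snr_db=rng.uniform(0.0, 20.0))
```

Drawing the SNR at random for each training example, as in the usage lines, exposes a model to a range of noise levels rather than a single fixed condition. Whether the authors did this is not stated in the abstract.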

research
09/15/2022

Non-Parallel Voice Conversion for ASR Augmentation

Automatic speech recognition (ASR) needs to be robust to speaker differe...
research
12/27/2022

Voice conversion with limited data and limitless data augmentations

Applying changes to an input speech signal to change the perceived speak...
research
09/22/2021

Noisy-to-Noisy Voice Conversion Framework with Denoising Model

In a conventional voice conversion (VC) framework, a VC model is often t...
research
04/13/2019

Unsupervised Singing Voice Conversion

We present a deep learning method for singing voice conversion. The prop...
research
06/30/2022

An Evaluation of Three-Stage Voice Conversion Framework for Noisy and Reverberant Conditions

This paper presents a new voice conversion (VC) framework capable of dea...
research
05/21/2023

DualVC: Dual-mode Voice Conversion using Intra-model Knowledge Distillation and Hybrid Predictive Coding

Voice conversion is an increasingly popular technology, and the growing ...
research
12/22/2016

Robustness of Voice Conversion Techniques Under Mismatched Conditions

Most of the existing studies on voice conversion (VC) are conducted in a...
