An Evaluation of Three-Stage Voice Conversion Framework for Noisy and Reverberant Conditions

06/30/2022
by Yeonjong Choi, et al.
This paper presents a new voice conversion (VC) framework capable of dealing with both additive noise and reverberation, together with an evaluation of its performance. Several VC studies have focused on real-world conditions in which speech data are corrupted by background noise and reverberation. For the more practical case where no clean target dataset is available, one possible approach is zero-shot VC, but its performance tends to be worse than that of VC trained on a sufficient amount of target speech data. To leverage a large amount of noisy-reverberant target speech data, we propose a three-stage VC framework consisting of a denoising process using a pretrained denoising model, a dereverberation process using a dereverberation model, and a VC process using a nonparallel VC model based on a variational autoencoder. The experimental results show that 1) noise and reverberation additively cause significant degradation in VC performance, 2) the proposed method alleviates the adverse effects of both noise and reverberation and significantly outperforms the baseline trained directly on the noisy-reverberant speech data, and 3) the potential degradation introduced by the denoising and dereverberation stages still has a noticeable adverse effect on VC performance.
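The three stages named in the abstract can be pictured as a simple chain of processing steps. The sketch below only illustrates that ordering (denoising, then dereverberation, then VC). The function names `denoise`, `dereverberate`, and `convert_voice` are hypothetical placeholders that pass the signal through unchanged; they are not the models used in the paper, and the abstract does not say how the stages are connected internally.

```python
from typing import Callable, List, Sequence

Waveform = List[float]
Stage = Callable[[Waveform], Waveform]


def make_pipeline(stages: Sequence[Stage]) -> Stage:
    """Chain stages so that each one consumes the previous stage's output."""
    def run(wave: Waveform) -> Waveform:
        for stage in stages:
            wave = stage(wave)
        return wave
    return run


# Hypothetical placeholders: each returns a copy of its input unchanged.
# A real system would call a pretrained denoising model, a dereverberation
# model, and a VAE-based nonparallel VC model here (assumption, not the
# authors' code).
def denoise(wave: Waveform) -> Waveform:
    return list(wave)


def dereverberate(wave: Waveform) -> Waveform:
    return list(wave)


def convert_voice(wave: Waveform) -> Waveform:
    return list(wave)


# Stage order follows the order in which the abstract lists the processes.
three_stage_vc = make_pipeline([denoise, dereverberate, convert_voice])
```

Keeping each stage as an independent callable makes it easy to drop or swap one of them, for example to compare against a baseline that skips denoising and dereverberation.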

research
11/13/2021

Direct Noisy Speech Modeling for Noisy-to-Noisy Voice Conversion

Beyond the conventional voice conversion (VC) where the speaker informat...
research
02/11/2019

A Vocoder-free WaveNet Voice Conversion with Non-Parallel Data

In a typical voice conversion system, vocoder is commonly used for speec...
research
05/18/2023

Data Augmentation for Diverse Voice Conversion in Noisy Environments

Voice conversion (VC) models have demonstrated impressive few-shot conve...
research
09/22/2021

Noisy-to-Noisy Voice Conversion Framework with Denoising Model

In a conventional voice conversion (VC) framework, a VC model is often t...
research
11/27/2018

Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion

This paper presents a refinement framework of WaveNet vocoders for varia...
research
10/14/2021

Toward Degradation-Robust Voice Conversion

Any-to-any voice conversion technologies convert the vocal timbre of an ...
research
02/17/2022

Multi-Channel Speech Denoising for Machine Ears

This work describes a speech denoising system for machine ears that aims...
