HiFi-VC: High Quality ASR-Based Voice Conversion

03/31/2022
by   A. Kashkin, et al.
0

The goal of voice conversion (VC) is to convert input voice to match the target speaker's voice while keeping text and prosody intact. VC is usually used in entertainment and speaking-aid systems, as well as applied for speech data generation and augmentation. The development of any-to-any VC systems, which are capable of generating voices unseen during model training, is of particular interest to both researchers and the industry. Despite recent progress, any-to-any conversion quality is still inferior to natural speech. In this work, we propose a new any-to-any voice conversion pipeline. Our approach uses automated speech recognition (ASR) features, pitch tracking, and a state-of-the-art waveform prediction model. According to multiple subjective and objective evaluations, our method outperforms modern baselines in terms of voice quality, similarity and consistency.

READ FULL TEXT
research
07/20/2021

On Prosody Modeling for ASR+TTS based Voice Conversion

In voice conversion (VC), an approach showing promising results in the l...
research
12/05/2019

Towards Robust Neural Vocoding for Speech Generation: A Survey

Recently, neural vocoders have been widely used in speech synthesis task...
research
05/24/2023

Iteratively Improving Speech Recognition and Voice Conversion

Many existing works on voice conversion (VC) tasks use automatic speech ...
research
09/14/2019

Bootstrapping non-parallel voice conversion from speaker-adaptive text-to-speech

Voice conversion (VC) and text-to-speech (TTS) are two tasks that share ...
research
11/12/2021

AC-VC: Non-parallel Low Latency Phonetic Posteriorgrams Based Voice Conversion

This paper presents AC-VC (Almost Causal Voice Conversion), a phonetic p...
research
10/27/2022

FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion

Voice conversion (VC) can be achieved by first extracting source content...
research
04/30/2018

Collapsed speech segment detection and suppression for WaveNet vocoder

In this paper, we propose a technique to alleviate quality degradation c...

Please sign up or login with your details

Forgot password? Click here to reset