crank: An Open-Source Software for Nonparallel Voice Conversion Based on Vector-Quantized Variational Autoencoder

03/04/2021
by Kazuhiro Kobayashi, et al.

In this paper, we present crank, open-source software for developing nonparallel voice conversion (VC) systems. Although we released sprocket, an open-source VC software package based on the Gaussian mixture model, for the last VC Challenge, it cannot be applied directly to an arbitrary speech corpus because modeling its statistical conversion function requires parallel utterances from the source and target speakers. To address this issue, we developed new open-source VC software that lets users model the conversion function from a nonparallel speech corpus alone. The software is built on a vector-quantized variational autoencoder (VQVAE). To make it quick to examine the effectiveness of recent techniques in this research field, crank also supports several representative extensions of autoencoder-based VC: hierarchical architectures, cyclic architectures, generative adversarial networks, speaker adversarial training, and neural vocoders. Moreover, it can automatically compute objective measures such as mel-cepstrum distortion and a pseudo mean opinion score based on MOSNet. We describe the representative functions implemented in crank and briefly compare them through objective evaluations.
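As an illustration of one of the objective measures mentioned above, mel-cepstrum distortion (MCD) is commonly defined per frame as (10 / ln 10) * sqrt(2 * sum_d (c_d - c'_d)^2) and then averaged over frames. The following is a minimal sketch of that textbook definition, not crank's own implementation. The function name `mel_cepstrum_distortion` is hypothetical, and the sketch assumes the two sequences are already time-aligned and that the 0th (energy) coefficient is excluded, which is a common convention.

```python
import math


def mel_cepstrum_distortion(ref, conv):
    """Mean MCD in dB between two time-aligned mel-cepstrum sequences.

    ref, conv: lists of frames, each frame a list of coefficients.
    Assumption: frames are already aligned (e.g. by DTW done elsewhere)
    and the 0th coefficient (energy) is skipped.
    """
    if not ref or len(ref) != len(conv):
        raise ValueError("sequences must be non-empty with equal frame counts")
    # Constant factor that converts the Euclidean distance to decibels.
    scale = 10.0 / math.log(10.0) * math.sqrt(2.0)
    total = 0.0
    for r, c in zip(ref, conv):
        if len(r) != len(c):
            raise ValueError("frames must have the same number of coefficients")
        # Squared distance over coefficients 1..D (0th coefficient skipped).
        sq_dist = sum((a - b) ** 2 for a, b in zip(r[1:], c[1:]))
        total += scale * math.sqrt(sq_dist)
    return total / len(ref)


if __name__ == "__main__":
    reference = [[0.0, 1.0, 0.0], [0.0, 0.5, 0.5]]
    converted = [[5.0, 0.0, 0.0], [1.0, 0.5, 0.5]]
    print(f"MCD: {mel_cepstrum_distortion(reference, converted):.3f} dB")
```

Lower values indicate that the converted features are closer to the reference. In practice the reference and converted utterances usually differ in length, so an alignment step would be needed before this calculation.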


