Speak Like a Dog: Human to Non-human creature Voice Conversion

06/09/2022
by Kohei Suzuki, et al.

This paper proposes a new voice conversion (VC) task, converting human speech into dog-like speech while preserving linguistic information, as an example of human to non-human creature voice conversion (H2NH-VC). Whereas most VC studies deal with human-to-human conversion, H2NH-VC aims to convert human speech into non-human creature-like speech. H2NH-VC has to rely on non-parallel VC, because it is impossible to collect a parallel dataset in which non-human creatures speak human language. In this study, we use dogs as an example of a non-human creature target domain and define the "speak like a dog" task. To clarify the possibilities and characteristics of this task, we conducted a comparative experiment using existing representative non-parallel VC methods, varying the acoustic features (Mel-cepstral coefficients and Mel-spectrograms), the network architectures (five different kernel-size settings), and the training criteria (variational autoencoder (VAE)-based and generative adversarial network-based). The converted voices were evaluated using mean opinion scores (dog-likeness, sound quality, and intelligibility) and character error rate (CER). The experiment showed that using Mel-spectrograms improved the dog-likeness of the converted speech, but that preserving linguistic information remains challenging. Challenges and limitations of the current VC methods for H2NH-VC are highlighted.
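For readers unfamiliar with the CER metric mentioned above, the following is a minimal sketch of how a character error rate is commonly computed: the character-level Levenshtein (edit) distance between a reference transcript and a transcript of the converted speech, divided by the reference length. The abstract does not say how the transcripts were obtained or which tool computed CER, so the function names `edit_distance` and `character_error_rate` and the example strings are illustrative assumptions, not the authors' implementation.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the first i-1 chars of ref and first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis is i deletions
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = edit distance / number of reference characters (hypothetical helper)."""
    if not ref:
        raise ValueError("reference transcript must be non-empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Illustrative strings only; not data from the paper.
    print(character_error_rate("hello", "hallo"))
```

A lower CER means more of the linguistic content survived conversion. Note that CER can exceed 1.0 when the hypothesis contains many inserted characters.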


