Effects of Convolutional Autoencoder Bottleneck Width on StarGAN-based Singing Technique Conversion

08/19/2023
by   Tung-Cheng Su, et al.

Singing technique conversion (STC) is the task of converting a singing voice from one technique to another while leaving the singer identity, melody, and linguistic content intact. Previous STC studies, and singing voice conversion research in general, have used convolutional autoencoders (CAEs) for conversion, but how the bottleneck width of the CAE affects synthesis quality has not been thoroughly evaluated. To this end, we constructed a GAN-based multi-domain STC system that combines the WORLD vocoder representation with a CAE architecture. We varied the bottleneck width of the CAE and evaluated the conversion results subjectively. The model was trained on a Mandarin dataset featuring four singers and four singing techniques: chest voice, falsetto, raspy voice, and whistle voice. The results show that a wider bottleneck corresponds to better articulation clarity but does not necessarily lead to higher likeness to the target technique. Among the four techniques, the whistle voice is the easiest target for conversion, while the other three techniques, when used as the source, produce more convincing conversion results than the whistle voice does.
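To make the notion of bottleneck width concrete, the sketch below computes the shape of the code at the narrowest layer of a convolutional encoder and how strongly it compresses the input. It is only an illustration, not the architecture from the paper. It assumes that "width" means the number of channels in the bottleneck layer, which the abstract does not define, and the input size (128 frames by 36 features), the layer settings, and the function names are all hypothetical.

```python
def conv_out_len(length, kernel, stride, padding):
    """Output length along one axis of a convolution (standard formula)."""
    return (length + 2 * padding - kernel) // stride + 1


def bottleneck_shape(n_frames, n_features, layers, width):
    """Shape (channels, frames, features) of the code at the bottleneck.

    layers: list of (kernel, stride, padding), applied to both axes.
    width:  number of channels in the bottleneck layer (assumed meaning).
    """
    t, f = n_frames, n_features
    for kernel, stride, padding in layers:
        t = conv_out_len(t, kernel, stride, padding)
        f = conv_out_len(f, kernel, stride, padding)
    return (width, t, f)


def compression_ratio(n_frames, n_features, layers, width):
    """Input values per bottleneck value; larger means a tighter bottleneck."""
    c, t, f = bottleneck_shape(n_frames, n_features, layers, width)
    return (n_frames * n_features) / (c * t * f)


# Hypothetical encoder: two stride-2 layers, single-channel input.
layers = [(4, 2, 1), (4, 2, 1)]
for width in (4, 8, 16):
    shape = bottleneck_shape(128, 36, layers, width)
    ratio = compression_ratio(128, 36, layers, width)
    print(width, shape, ratio)
```

Under these assumed settings, doubling the width halves the compression ratio. That is one way to read the reported trade-off: a wider bottleneck can pass more detail from the source through to the output, which may help articulation clarity without guaranteeing a closer match to the target technique.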
