Voice Conversion with Conditional SampleRNN

08/24/2018
by   Cong Zhou, et al.

Here we present a novel approach to conditioning the SampleRNN generative model for voice conversion (VC). Conventional methods for VC modify the perceived speaker identity by converting between source and target acoustic features. Our approach focuses on preserving voice content and depends on the generative network to learn voice style. We first train a multi-speaker SampleRNN model conditioned on linguistic features, pitch contour, and speaker identity using a multi-speaker speech corpus. Voice-converted speech is generated using linguistic features and pitch contour extracted from the source speaker, and the target speaker identity. We demonstrate that our system is capable of many-to-many voice conversion without requiring parallel data, enabling broad applications. Subjective evaluation demonstrates that our approach outperforms conventional VC methods.
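The conditioning scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the three conditioning signals are encoded or combined, so the one-hot speaker code, the frame-wise concatenation, the toy feature values, and every name below (`one_hot`, `build_conditioning`, `NUM_SPEAKERS`) are assumptions made for illustration. The sketch only shows the central idea that conversion reuses the source speaker's linguistic features and pitch contour while swapping in the target speaker's identity.

```python
from typing import List, Sequence


def one_hot(index: int, size: int) -> List[float]:
    """Encode a speaker index as a one-hot vector (an assumed encoding)."""
    if not 0 <= index < size:
        raise ValueError("speaker index out of range")
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_conditioning(
    linguistic: Sequence[Sequence[float]],
    pitch: Sequence[float],
    speaker_id: int,
    num_speakers: int,
) -> List[List[float]]:
    """Concatenate linguistic features, pitch, and speaker code per frame."""
    if len(linguistic) != len(pitch):
        raise ValueError("linguistic features and pitch must have equal length")
    speaker_code = one_hot(speaker_id, num_speakers)
    return [list(frame) + [f0] + speaker_code
            for frame, f0 in zip(linguistic, pitch)]


# Toy values standing in for features extracted from a source utterance.
source_linguistic = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
source_pitch = [120.0, 125.0, 0.0]  # hypothetical F0 in Hz; 0.0 marks unvoiced

NUM_SPEAKERS = 4
SOURCE_SPEAKER, TARGET_SPEAKER = 0, 2

# Training: each utterance is paired with its own speaker's identity.
training_cond = build_conditioning(
    source_linguistic, source_pitch, SOURCE_SPEAKER, NUM_SPEAKERS)

# Conversion: same content and pitch, but the target speaker's identity.
conversion_cond = build_conditioning(
    source_linguistic, source_pitch, TARGET_SPEAKER, NUM_SPEAKERS)
```

In this sketch the two conditioning sequences differ only in the speaker portion of each frame, which is why no parallel recordings of source and target speakers are needed: the generative model is left to supply the target voice style.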

04/15/2020

F0-consistent many-to-many non-parallel voice conversion via conditional autoencoder

Non-parallel many-to-many voice conversion remains an interesting but ch...
10/25/2021

Controllable and Interpretable Singing Voice Decomposition via Assem-VC

We propose a singing decomposition system that encodes time-aligned ling...
10/30/2018

Generative Adversarial Networks for Unpaired Voice Transformation on Impaired Speech

This paper focuses on using voice conversion (VC) to improve the speech ...
01/31/2021

High Fidelity Speech Regeneration with Application to Speech Enhancement

Speech enhancement has seen great improvement in recent years mainly thr...
08/06/2020

Unsupervised Cross-Domain Singing Voice Conversion

We present a wav-to-wav generative model for the task of singing voice c...
05/19/2020

Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech

Accent conversion (AC) transforms a non-native speaker's accent into a n...
02/16/2021

Axial Residual Networks for CycleGAN-based Voice Conversion

We propose a novel architecture and improved training objectives for non...