Speaker Adaptation with Continuous Vocoder-based DNN-TTS

08/02/2021
by   Ali Raheem Mandeel, et al.

Traditional vocoder-based statistical parametric speech synthesis can be advantageous in applications that require low computational complexity. Recent neural vocoders, which can produce highly natural speech, still cannot meet the requirement of real-time synthesis. In this paper, we experiment with our earlier continuous vocoder, in which the excitation is modeled with two one-dimensional parameters: continuous F0 and Maximum Voiced Frequency. We show on the data of 9 speakers that an average voice can be trained for DNN-TTS, and that speaker adaptation is feasible with 400 utterances (about 14 minutes). Objective experiments indicate that the quality of speaker adaptation with Continuous Vocoder-based DNN-TTS is similar to that of speaker adaptation with a WORLD vocoder-based baseline.
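
As a rough illustration of the excitation representation named in the abstract, the sketch below pairs two one-dimensional per-frame tracks (continuous F0 and Maximum Voiced Frequency) into one excitation vector per frame, and works out the average utterance length implied by the adaptation set. The function names and all frame values are hypothetical and chosen only for illustration. This is not the authors' implementation, and the abstract does not specify frame rates or value ranges.

```python
def stack_excitation(cont_f0_hz, mvf_hz):
    """Pair two one-dimensional per-frame tracks into (F0, MVF) frames.

    Both inputs are plain sequences with one value per frame.
    """
    if len(cont_f0_hz) != len(mvf_hz):
        raise ValueError("both tracks must have the same number of frames")
    return [(float(f), float(m)) for f, m in zip(cont_f0_hz, mvf_hz)]


def mean_utterance_seconds(total_minutes, n_utterances):
    """Average utterance duration in seconds for a data set of a given size."""
    return total_minutes * 60.0 / n_utterances


# Hypothetical frame values, for illustration only (not taken from the paper).
f0_track = [120.0, 122.5, 125.0, 123.0]       # continuous F0, in Hz
mvf_track = [4000.0, 4200.0, 3900.0, 4100.0]  # Maximum Voiced Frequency, in Hz

excitation = stack_excitation(f0_track, mvf_track)
print(len(excitation), "frames,", len(excitation[0]), "excitation values each")

# Figures from the abstract: 400 utterances, about 14 minutes in total.
print(mean_utterance_seconds(14, 400), "seconds per utterance on average")
```

Since the abstract gives the total duration only approximately, the resulting average of roughly two seconds per utterance is also approximate.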


research
03/05/2018

Linear networks based speaker adaptation for speech synthesis

Speaker adaptation methods aim to create fair quality synthesis speech v...
research
06/01/2022

AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation

Speaker adaptation in text-to-speech synthesis (TTS) is to finetune a pr...
research
11/15/2018

Effect of data reduction on sequence-to-sequence neural TTS

Recent speech synthesis systems based on sampling from autoregressive ne...
research
08/16/2021

GC-TTS: Few-shot Speaker Adaptation with Geometric Constraints

Few-shot speaker adaptation is a specific Text-to-Speech (TTS) system th...
research
05/24/2022

TDASS: Target Domain Adaptation Speech Synthesis Framework for Multi-speaker Low-Resource TTS

Recently, synthesizing personalized speech by text-to-speech (TTS) appli...
research
04/07/2022

Detecting Vocal Fatigue with Neural Embeddings

Vocal fatigue refers to the feeling of tiredness and weakness of voice d...
research
06/12/2020

Neural voice cloning with a few low-quality samples

In this paper, we explore the possibility of speech synthesis from low q...
