FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis

06/29/2021
by Taejun Bak, et al.

Methods for modeling and controlling prosody with acoustic features have been proposed for neural text-to-speech (TTS) models. Prosodic speech can be generated by conditioning on acoustic features. However, speech synthesized with a large pitch-shift scale suffers from degraded audio quality and deformed speaker characteristics. To address this problem, we propose a feed-forward Transformer-based TTS model designed according to the source-filter theory. This model, called FastPitchFormant, has a unique structure that handles text and acoustic features in parallel. Because each feature is modeled separately, the model's tendency to learn the relationship between the two features can be mitigated.
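
The source-filter theory the abstract refers to can be illustrated with a classical signal-processing toy. The sketch below is not the FastPitchFormant model and makes no claim about its architecture; it only shows the underlying idea that a pitch-carrying source (an impulse train) and a timbre-carrying filter (a single formant resonator) can be specified independently. All function names and parameter values (700 Hz formant, 100 Hz bandwidth, 16 kHz sample rate) are illustrative assumptions.

```python
import math

def impulse_train(f0_hz, sample_rate, n_samples):
    """Source: one unit impulse per pitch period (a crude glottal excitation)."""
    period = round(sample_rate / f0_hz)
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]

def resonator(signal, formant_hz, bandwidth_hz, sample_rate):
    """Filter: a two-pole resonator standing in for one vocal-tract formant."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)  # pole radius
    a1 = 2.0 * r * math.cos(2.0 * math.pi * formant_hz / sample_rate)
    a2 = -r * r
    out, y1, y2 = [], 0.0, 0.0
    for x in signal:
        y = x + a1 * y1 + a2 * y2  # y[n] = x[n] + a1*y[n-1] + a2*y[n-2]
        out.append(y)
        y2, y1 = y1, y
    return out

def synthesize(f0_hz, formant_hz=700.0, bandwidth_hz=100.0,
               sample_rate=16000, n_samples=1600):
    """Excite the fixed filter with a source whose pitch is f0_hz."""
    source = impulse_train(f0_hz, sample_rate, n_samples)
    return resonator(source, formant_hz, bandwidth_hz, sample_rate)

def pulse_spacing(f0_hz, sample_rate=16000, n_samples=1600):
    """Distance in samples between the first two source impulses."""
    source = impulse_train(f0_hz, sample_rate, n_samples)
    idx = [n for n, v in enumerate(source) if v]
    return idx[1] - idx[0]

low = synthesize(100.0)   # original pitch
high = synthesize(200.0)  # pitch shifted up one octave, same filter
```

In this toy, shifting the pitch changes only the spacing of the source impulses, while the filter coefficients, and hence the response to each impulse, stay the same. That separation is the property the abstract appeals to when it argues for modeling the two kinds of features in parallel.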


04/25/2018

Speaker-independent raw waveform model for glottal excitation

Recent speech technology research has seen a growing interest in using W...
11/19/2018

Limitations of Source-Filter Coupling In Phonation

The coupling of vocal fold (source) and vocal tract (filter) is one of t...
03/18/2022

A^3T: Alignment-Aware Acoustic and Text Pretraining for Speech Synthesis and Editing

Recently, speech representation learning has improved many speech-relate...
04/07/2018

A comparison of recent waveform generation and acoustic modeling methods for neural-network-based speech synthesis

Recent advances in speech synthesis suggest that limitations such as the...
11/28/2018

UFANS: U-shaped Fully-Parallel Acoustic Neural Structure For Statistical Parametric Speech Synthesis With 20X Faster

Neural networks with Auto-regressive structures, such as Recurrent Neura...
10/22/2020

The NTU-AISG Text-to-speech System for Blizzard Challenge 2020

We report our NTU-AISG Text-to-speech (TTS) entry systems for the Blizza...
04/03/2022

On incorporating social speaker characteristics in synthetic speech

In our previous work, we derived the acoustic features, that contribute ...