A High Quality Text-To-Speech System Composed of Multiple Neural Networks

12/05/1998
by Orhan Karaali, et al.

While neural networks have been employed to handle several different text-to-speech tasks, ours is the first system to use neural networks throughout, for both linguistic and acoustic processing. We divide the text-to-speech task into three subtasks: a linguistic module that maps text to a linguistic representation, an acoustic module that maps the linguistic representation to speech, and a video module that maps the linguistic representation to animated images. The linguistic module employs a letter-to-sound neural network and a postlexical neural network. The acoustic module employs a duration neural network and a phonetic neural network. The video module employs a visual neural network, which runs in parallel with the acoustic module to drive a talking head. Because each neural network can be retrained on the characteristics of different voices and languages, our system offers a degree of adaptability and naturalness heretofore unavailable.
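
The module layout described above can be sketched as a simple pipeline. This is only an illustrative sketch: every class, field, and function name is hypothetical, the networks are replaced by trivial stub functions, and the assumption that the phonetic network consumes the duration network's output is ours, not something the abstract states.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Phones = List[str]  # hypothetical stand-in for the linguistic representation


@dataclass
class LinguisticModule:
    letter_to_sound: Callable[[str], Phones]  # text -> lexical phones
    postlexical: Callable[[Phones], Phones]   # lexical -> postlexical phones

    def __call__(self, text: str) -> Phones:
        return self.postlexical(self.letter_to_sound(text))


@dataclass
class AcousticModule:
    duration: Callable[[Phones], List[float]]
    # Assumption: the phonetic network sees both phones and durations.
    phonetic: Callable[[Phones, List[float]], List[float]]

    def __call__(self, phones: Phones) -> List[float]:
        return self.phonetic(phones, self.duration(phones))


@dataclass
class TextToSpeech:
    linguistic: LinguisticModule
    acoustic: AcousticModule
    visual: Callable[[Phones], List[str]]  # drives the talking head

    def synthesize(self, text: str) -> Tuple[List[float], List[str]]:
        phones = self.linguistic(text)
        # Acoustic and visual paths both start from the same representation.
        return self.acoustic(phones), self.visual(phones)


def build_stub_system() -> TextToSpeech:
    """Wire the pipeline with placeholder functions instead of trained networks."""
    return TextToSpeech(
        linguistic=LinguisticModule(
            letter_to_sound=lambda text: [c for c in text.lower() if c.isalpha()],
            postlexical=lambda phones: list(phones),
        ),
        acoustic=AcousticModule(
            duration=lambda phones: [0.1] * len(phones),
            phonetic=lambda phones, durations: [0.0] * len(phones),
        ),
        visual=lambda phones: ["frame:" + p for p in phones],
    )
```

Calling `build_stub_system().synthesize("Hi")` returns a pair of placeholder audio values and placeholder frame labels. Keeping each network behind a plain callable mirrors the abstract's point that individual networks can be retrained for a new voice or language without changing the surrounding pipeline.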
