Prosodic Representation Learning and Contextual Sampling for Neural Text-to-Speech

11/04/2020
by   Sri Karlapati, et al.
0

In this paper, we introduce Kathaka, a model trained with a novel two-stage training process for neural speech synthesis with contextually appropriate prosody. In Stage I, we learn a prosodic distribution at the sentence level from mel-spectrograms available during training. In Stage II, we propose a novel method to sample from this learnt prosodic distribution using the contextual information available in text. To do this, we use BERT on text, and graph-attention networks on parse trees extracted from text. We show a statistically significant relative improvement of 13.2% in naturalness over a strong baseline when compared to recordings. We also conduct an ablation study on variations of our sampling technique, and show a statistically significant improvement over the baseline in each case.

READ FULL TEXT
research
06/20/2023

eCat: An End-to-End Model for Multi-Speaker TTS Many-to-Many Fine-Grained Prosody Transfer

We present eCat, a novel end-to-end multispeaker model capable of: a) ge...
research
06/27/2022

CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer

In this paper, we present CopyCat2 (CC2), a novel model capable of: a) s...
research
09/19/2023

End-to-End Speech Recognition Contextualization with Large Language Models

In recent years, Large Language Models (LLMs) have garnered significant ...
research
10/24/2021

Discrete acoustic space for an efficient sampling in neural text-to-speech

We present an SVQ-VAE architecture using a split vector quantizer for NT...
research
07/31/2023

VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design

Single-stage text-to-speech models have been actively studied recently, ...
research
06/29/2022

Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody

Generating expressive and contextually appropriate prosody remains a cha...

Please sign up or login with your details

Forgot password? Click here to reset