Prosody Transfer in Neural Text to Speech Using Global Pitch and Loudness Features

11/21/2019
by   Siddharth Gururani, et al.
0

This paper presents a simple yet effective method to achieve prosody transfer from a reference speech signal to synthesized speech. The main idea is to incorporate well-known acoustic correlates of prosody such as pitch and loudness contours of the reference speech into a modern neural text-to-speech (TTS) synthesizer such as Tacotron2 (TC2). More specifically, a small set of acoustic features are extracted from the reference audio and then used to condition a TC2 synthesizer. The trained model is evaluated using subjective listening tests and novel objective evaluations of prosody transfer are proposed. Listening tests show that the synthesized speech is rated as highly natural and that prosody is successfully transferred from the reference speech signal to the synthesized signal.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
03/24/2018

Towards End-to-End Prosody Transfer for Expressive Speech Synthesis with Tacotron

We present an extension to the Tacotron speech synthesis architecture th...
research
03/07/2023

Do Prosody Transfer Models Transfer Prosody?

Some recent models for Text-to-Speech synthesis aim to transfer the pros...
research
06/07/2020

Analysis and Synthesis of Hypo and Hyperarticulated Speech

This paper focuses on the analysis and synthesis of hypo and hyperarticu...
research
04/05/2022

AILTTS: Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech

The quality of end-to-end neural text-to-speech (TTS) systems highly dep...
research
08/07/2020

Controllable Neural Prosody Synthesis

Speech synthesis has recently seen significant improvements in fidelity,...
research
05/21/2020

Pitchtron: Towards audiobook generation from ordinary people's voices

In this paper, we explore prosody transfer for audiobook generation unde...
research
06/10/2022

Going Beyond the Cookie Theft Picture Test: Detecting Cognitive Impairments using Acoustic Features

Standardized tests play a crucial role in the detection of cognitive imp...

Please sign up or login with your details

Forgot password? Click here to reset