Visualization and Interpretation of Latent Spaces for Controlling Expressive Speech Synthesis through Audio Analysis

03/27/2019
by Noé Tits, et al.

The field of Text-to-Speech has seen major improvements in recent years thanks to deep learning techniques, and producing realistic speech is now possible. As a consequence, research on controlling expressiveness, which allows speech to be generated in different styles or manners, has attracted increasing attention. Systems able to control style have been developed and show impressive results. However, the control parameters often consist of latent variables and remain difficult to interpret. In this paper, we analyze and compare different latent spaces and obtain an interpretation of their influence on expressive speech. This will make it possible to build controllable speech synthesis systems with understandable behaviour.

research
09/17/2020

Hierarchical Multi-Grained Generative Model for Expressive Speech Synthesis

This paper proposes a hierarchical generative model with a multi-grained...
research
02/23/2018

Do WaveNets Dream of Acoustic Waves?

Various sources have reported the WaveNet deep learning architecture bei...
research
10/14/2019

The Theory behind Controllable Expressive Speech Synthesis: a Cross-disciplinary Approach

As part of the Human-Computer Interaction field, Expressive speech synth...
research
04/07/2022

Unsupervised Quantized Prosody Representation for Controllable Speech Synthesis

In this paper, we propose a novel prosody disentangle method for prosodi...
research
05/10/2021

Learning Robust Latent Representations for Controllable Speech Synthesis

State-of-the-art Variational Auto-Encoders (VAEs) for learning disentang...
research
11/24/2022

Prosody-controllable spontaneous TTS with neural HMMs

Spontaneous speech has many affective and pragmatic functions that are i...
research
06/15/2021

STAN: A stuttering therapy analysis helper

Stuttering is a complex speech disorder identified by repetitions, prol...
