Style Equalization: Unsupervised Learning of Controllable Generative Sequence Models

10/06/2021
by Jen-Hao Rick Chang, et al.

Controllable generative sequence models that can extract and replicate the style of specific examples enable many applications, including narrating audiobooks in different voices, auto-completing and auto-correcting handwritten text, and generating missing training samples for downstream recognition tasks. However, typical training algorithms for these controllable sequence generative models suffer from a training-inference mismatch: during training, the same sample serves as both the content and the style input, whereas at inference different samples are given. In this paper, we tackle this training-inference mismatch encountered during unsupervised learning of controllable generative sequence models. By introducing a style transformation module that we call style equalization, we enable training with different content and style samples and thereby mitigate the mismatch. To demonstrate its generality, we applied style equalization to text-to-speech and text-to-handwriting synthesis on three datasets. Our models achieve state-of-the-art style replication, with a mean style opinion score similar to that of the real data. Moreover, the proposed method enables style interpolation between sequences and generates novel styles.
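To make the mismatch concrete: conventional training conditions the decoder on the style of the very sample it must reconstruct, while style equalization lets a *different* sample supply the style, then transforms its embedding toward the content sample's style. The following is only a toy numpy sketch of that conditioning idea, not the paper's architecture; `style_embedding` and `equalize` are hypothetical stand-ins for the learned encoder and transformation module.

```python
import numpy as np

rng = np.random.default_rng(0)

def style_embedding(x):
    # Toy "style encoder": summarize a variable-length sequence
    # of feature vectors by its mean feature vector.
    return x.mean(axis=0)

def equalize(style_ref, delta):
    # Hypothetical style-equalization transform: shift the reference
    # style embedding by an estimated style difference. (In the paper
    # this is a learned module; here it is a plain addition.)
    return style_ref + delta

# Content sample A and a *different* style sample B (toy sequences
# of lengths 10 and 12 with 4-dimensional features).
x_a = rng.normal(size=(10, 4))
x_b = rng.normal(size=(12, 4))

s_a = style_embedding(x_a)
s_b = style_embedding(x_b)

# During training, the transform is given the style difference, so the
# decoder is conditioned on A's style even though B was the style input.
# At inference, the difference is dropped and B's style is used directly.
s_equalized = equalize(s_b, s_a - s_b)

assert np.allclose(s_equalized, s_a)
```

The point of the sketch is only the data flow: content and style come from different samples during training, and the equalization step removes the style discrepancy so reconstruction of the content sample remains a valid training signal.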

Related research

- Data Incubation – Synthesizing Missing Data for Handwriting Recognition (10/13/2021): In this paper, we demonstrate how a generative model can be used to buil...
- PromptTTS: Controllable Text-to-Speech with Text Descriptions (11/22/2022): Using a text description as prompt to guide the generation of text or im...
- STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech (03/17/2021): Previous works on neural text-to-speech (TTS) have been addressed on lim...
- Towards Controllable and Photorealistic Region-wise Image Manipulation (08/19/2021): Adaptive and flexible image editing is a desirable function of modern ge...
- Unsupervised Learning of Sequence Representations by Autoencoders (04/03/2018): Traditional machine learning models have problems with handling sequence...
- Unsupervised Controllable Text Formalization (09/10/2018): We propose a novel framework for controllable natural language transform...
- On the Role of Style in Parsing Speech with Neural Models (10/08/2020): The differences in written text and conversational speech are substantia...
