Controlling High-Dimensional Data With Sparse Input

03/14/2023
by Dan-Andrei Iliescu, et al.

We address the problem of human-in-the-loop control for generating highly structured data. This task is challenging because existing generative models lack an efficient interface through which users can modify the output. Users must either manually explore a non-interpretable latent space or laboriously annotate the data with conditioning labels. To solve this, we introduce a novel framework in which an encoder maps a sparse, human-interpretable control space onto the latent space of a generative model. We apply this framework to the task of controlling prosody in text-to-speech synthesis. We propose a model, called Multiple-Instance CVAE (MICVAE), that is specifically designed to encode sparse prosodic features and output complete waveforms. We show empirically that MICVAE displays desirable qualities of a sparse human-in-the-loop control mechanism: efficiency, robustness, and faithfulness. With even a very small number of input values (~4), MICVAE enables users to improve the quality of the output significantly, in terms of listener preference (4:1).
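
The idea of an encoder that maps a sparse control space onto a latent space can be illustrated with a minimal sketch. This is not the MICVAE architecture, which the abstract does not specify. It assumes a fixed-length control sequence, a binary mask marking which positions the user set, and an untrained random linear map standing in for a learned encoder. The names `encode_sparse`, `controls`, and `weights` are hypothetical.

```python
import random


def encode_sparse(controls, length, weights):
    """Map a sparse set of control values to a latent vector.

    controls: dict {position: value} for the few positions the user set.
    length:   total number of positions in the control sequence.
    weights:  latent_dim rows, each with 2 * length coefficients
              (a stand-in for a trained encoder; assumption).
    """
    # Dense value vector, with zeros where the user gave no input.
    values = [controls.get(i, 0.0) for i in range(length)]
    # Binary mask so the encoder can tell "unset" apart from "set to 0".
    mask = [1.0 if i in controls else 0.0 for i in range(length)]
    features = values + mask
    # Linear encoder: one dot product per latent dimension.
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


random.seed(0)
length, latent_dim = 10, 3
weights = [[random.gauss(0.0, 1.0) for _ in range(2 * length)]
           for _ in range(latent_dim)]

# The user specifies only 4 of the 10 positions.
controls = {0: 0.5, 3: -1.0, 6: 0.0, 9: 2.0}
z = encode_sparse(controls, length, weights)
print(len(z))
```

The mask is the design choice worth noting: without it, a position the user deliberately set to zero would be indistinguishable from one left unspecified. In a real system the latent vector would then be passed to a generative decoder.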

research
07/10/2019

Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech

This paper proposed a novel approach for the detection and reconstructio...
research
02/27/2020

Deep Meditations: Controlled navigation of latent space

We introduce a method which allows users to creatively explore and navig...
research
10/20/2022

Diffusion Models already have a Semantic Latent Space

Diffusion models achieve outstanding generative performance in various d...
research
03/30/2017

Autocomplete 3D Sculpting

Digital sculpting is a popular means to create 3D models but remains a c...
research
06/17/2021

Learning Perceptual Manifold of Fonts

Along the rapid development of deep learning techniques in generative mo...
research
09/30/2019

Imagine That! Leveraging Emergent Affordances for Tool Synthesis in Reaching Tasks

In this paper we investigate an artificial agent's ability to perform ta...
research
04/28/2022

Oracle Guided Image Synthesis with Relative Queries

Isolating and controlling specific features in the outputs of generative...
