Dual Adversarial Inference for Text-to-Image Synthesis

08/14/2019
by Qicheng Lao, et al.

Synthesizing images from a given text description involves two types of information: the content, which is explicitly described in the text (e.g., color, composition, etc.), and the style, which is usually not well described in the text (e.g., location, quantity, size, etc.). Previous work, however, typically treats synthesis as generating images from the content alone, without learning meaningful style representations. In this paper, we aim to learn two variables that are disentangled in the latent space, representing content and style respectively. We achieve this by augmenting current text-to-image synthesis frameworks with a dual adversarial inference mechanism. Through extensive experiments, we show that our model learns, in an unsupervised manner, style representations that correspond to meaningful information present in the image but not well described in the text. The new framework also improves the quality of synthesized images when evaluated on the Oxford-102, CUB, and COCO datasets.
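The abstract does not spell out the mechanism, so the following is only a toy sketch of one plausible reading, assuming an adversarial-inference setup in which each latent variable gets its own discriminator over joint (image, latent) pairs. The functions `toy_generator`, `toy_encoder`, and `joint_pairs` are hypothetical stand-ins, not the authors' networks, and plain Python lists stand in for images and latent codes.

```python
import random


def sample_style(dim, rng):
    """Draw a style code from a standard normal prior (an assumed prior)."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def toy_generator(z_content, z_style):
    """Hypothetical stand-in for a generator: the 'image' is just the
    concatenation of the content code and the style code."""
    return list(z_content) + list(z_style)


def toy_encoder(image, content_dim):
    """Hypothetical stand-in for an inference network: splits an image
    back into (content estimate, style estimate)."""
    return image[:content_dim], image[content_dim:]


def joint_pairs(real_image, z_content, z_style, content_dim):
    """Build the (image, latent) pairs that two separate discriminators
    would compare, one for style and one for content (assumed design)."""
    fake_image = toy_generator(z_content, z_style)
    content_hat, style_hat = toy_encoder(real_image, content_dim)
    return {
        "style": {
            "generated": (fake_image, list(z_style)),
            "inferred": (real_image, style_hat),
        },
        "content": {
            "generated": (fake_image, list(z_content)),
            "inferred": (real_image, content_hat),
        },
    }


rng = random.Random(0)
z_c = [1.0, 2.0]              # content code, e.g. from a text encoder (assumed)
z_s = sample_style(3, rng)    # style code sampled from the prior
real = [0.5] * 5              # placeholder "real image"
pairs = joint_pairs(real, z_c, z_s, content_dim=2)
```

In this reading, each discriminator would be trained to tell the "generated" pair from the "inferred" pair, which pushes the inferred style and content codes toward the distributions the generator consumes. Losses and network architectures are deliberately left out because the abstract does not describe them.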


