StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

12/10/2016
by Han Zhang, et al.

Synthesizing high-quality images from text descriptions is a challenging problem in computer vision and has many practical applications. Samples generated by existing text-to-image approaches roughly reflect the meaning of the given descriptions, but they lack necessary details and vivid object parts. In this paper, we propose Stacked Generative Adversarial Networks (StackGAN) to generate 256×256 photo-realistic images conditioned on text descriptions. We decompose the hard problem into more manageable sub-problems through a sketch-refinement process. The Stage-I GAN sketches the primitive shape and colors of the object based on the given text description, yielding low-resolution Stage-I images. The Stage-II GAN takes the Stage-I results and the text description as inputs and generates high-resolution images with photo-realistic details. Through this refinement process it can rectify defects in the Stage-I results and add compelling details. To improve the diversity of the synthesized images and stabilize the training of the conditional GAN, we introduce a novel Conditioning Augmentation technique that encourages smoothness in the latent conditioning manifold. Extensive experiments and comparisons with state-of-the-art methods on benchmark datasets demonstrate that the proposed method achieves significant improvements in generating photo-realistic images conditioned on text descriptions.
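
The abstract names Conditioning Augmentation without describing its mechanics. As a rough illustration only, the sketch below assumes a common formulation: the text embedding is mapped to a mean and a log-variance, the conditioning vector is sampled as mean + std * noise, and a KL term toward a standard normal acts as a regularizer. The function names, the vector size, and the example values are all hypothetical, and the mean and log-variance are hard-coded here rather than produced by a learned layer.

```python
import math
import random


def conditioning_augmentation(mu, log_var, rng):
    """Sample c = mu + sigma * eps with eps ~ N(0, I) (reparameterization).

    mu and log_var stand in for the outputs of a learned layer applied
    to a text embedding (assumption; hard-coded in this sketch).
    """
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


# Hypothetical 4-dimensional conditioning statistics for one caption.
rng = random.Random(0)
mu = [0.5, -1.0, 0.0, 2.0]
log_var = [0.0, -2.0, 0.0, -4.0]

# The same caption yields a different conditioning vector on each draw,
# which is where the added diversity would come from.
samples = [conditioning_augmentation(mu, log_var, rng) for _ in range(3)]
kl = kl_to_standard_normal(mu, log_var)
```

Under this assumption, nearby points around the mean all map to the same caption, which is one way to read "smoothness in the latent conditioning manifold"; the KL term would be added to the generator loss with a weighting factor chosen by the practitioner.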

research
10/19/2017

StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks

Although Generative Adversarial Networks (GANs) have shown remarkable su...
research
07/06/2022

Text to Image Synthesis using Stacked Conditional Variational Autoencoders and Conditional Generative Adversarial Networks

Synthesizing a realistic image from textual description is a major chall...
research
03/04/2021

Robustness Evaluation of Stacked Generative Adversarial Networks using Metamorphic Testing

Synthesising photo-realistic images from natural language is one of the ...
research
07/28/2021

CRD-CGAN: Category-Consistent and Relativistic Constraints for Diverse Text-to-Image Generation

Generating photo-realistic images from a text description is a challengi...
research
09/22/2018

Parametric Synthesis of Text on Stylized Backgrounds using PGGANs

We describe a novel method of generating high-resolution real-world imag...
research
07/02/2020

PerceptionGAN: Real-world Image Construction from Provided Text through Perceptual Understanding

Generating an image from a provided descriptive text is quite a challeng...
research
08/04/2022

TIC: Text-Guided Image Colorization

Image colorization is a well-known problem in computer vision. However, ...
