PerceptionGAN: Real-world Image Construction from Provided Text through Perceptual Understanding

07/02/2020
by   Kanish Garg, et al.

Generating an image from a descriptive text is a challenging task because of the difficulty of incorporating perceptual information (object shapes, colors, and their interactions) while maintaining high relevance to the provided text. Current methods first generate an initial low-resolution image, which typically has irregular object shapes, colors, and interactions between objects. This initial image is then refined by conditioning on the text. However, these methods mainly address the problem of using the text representation efficiently during refinement, while the success of the refinement process depends heavily on the quality of the initially generated image, as pointed out in the DM-GAN paper. Hence, we propose a method to produce well-initialized images by incorporating perceptual understanding into the discriminator module. Improving the perceptual information at the first stage itself leads to significant improvement in the final generated image. In this paper, we apply our approach to the StackGAN architecture and show that the perceptual information included in the initial image is improved while modeling the image distribution at multiple stages. Finally, we generate realistic multi-colored images conditioned on text; these images are of good quality and contain improved basic perceptual information. More importantly, the proposed method can be integrated into the pipeline of other state-of-the-art text-to-image generation models to generate initial low-resolution images. We also improve the refinement process in StackGAN by augmenting the third generator-discriminator pair in the StackGAN architecture. Our experimental analysis and comparison with the state of the art on the large but sparse MS COCO dataset further validate the usefulness of the proposed approach.
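The core idea above is to make the first-stage discriminator "perceptually aware": instead of judging only real vs. fake, its training signal also reflects how far the generated image's perceptual features are from those of real images. The paper does not give code, so the following is a minimal numpy sketch under stated assumptions: `extract_features` is a stand-in feature extractor (simple average pooling, whereas a real model would use learned discriminator feature maps), and `lam` is a hypothetical weighting hyperparameter, not a value from the paper.

```python
import numpy as np

def extract_features(image, pool=4):
    """Stand-in perceptual feature extractor: average-pools the image into
    coarse patches. A real model would use learned convolutional features."""
    h, w = image.shape[:2]
    h2, w2 = h // pool, w // pool
    cropped = image[:h2 * pool, :w2 * pool]
    return cropped.reshape(h2, pool, w2, pool, -1).mean(axis=(1, 3))

def perceptual_loss(real_img, fake_img):
    """Mean squared distance between the pooled feature maps of a real
    and a generated image."""
    fr = extract_features(real_img)
    ff = extract_features(fake_img)
    return float(np.mean((fr - ff) ** 2))

def discriminator_loss(d_real, d_fake, real_img, fake_img, lam=0.1):
    """Standard adversarial binary cross-entropy, augmented with a weighted
    perceptual term (lam is an illustrative hyperparameter)."""
    eps = 1e-8  # numerical stability for the logs
    adv = -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))
    return adv + lam * perceptual_loss(real_img, fake_img)
```

Intuition for the design: the adversarial term alone only asks "does this look real overall?", whereas the added perceptual term penalizes mismatched coarse structure (shapes, color layout) directly, which is exactly the information the paper argues must be present in the initial low-resolution image before refinement can succeed.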

Related research:

- 10/15/2021: Multi-Tailed, Multi-Headed, Spatial Dynamic Memory refined Text-to-Image Synthesis
- 12/10/2016: StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
- 04/02/2019: DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-to-Image Synthesis
- 05/30/2023: Real-World Image Variation by Aligning Diffusion Inversion Chain
- 09/03/2022: DSE-GAN: Dynamic Semantic Evolution Generative Adversarial Network for Text-to-Image Generation
- 06/18/2023: GAN-based Image Compression with Improved RDO Process
- 06/19/2021: One-to-many Approach for Improving Super-Resolution
