Controllable Text-to-Image Generation

09/16/2019
by Bowen Li, et al.

In this paper, we propose a novel controllable text-to-image generative adversarial network (ControlGAN), which can synthesise high-quality images and control parts of the image generation according to natural language descriptions. To achieve this, we introduce a word-level spatial and channel-wise attention-driven generator that disentangles different visual attributes and allows the model to focus on generating and manipulating the subregions that correspond to the most relevant words. We also propose a word-level discriminator that provides fine-grained supervisory feedback by correlating words with image regions, which facilitates training a generator that can manipulate specific visual attributes without affecting the generation of other content. Furthermore, we adopt a perceptual loss to reduce the randomness involved in image generation and to encourage the generator to manipulate only the attributes required by the modified text. Extensive experiments on benchmark datasets demonstrate that our method outperforms the existing state of the art and can effectively manipulate synthetic images using natural language descriptions.
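
The abstract names word-level spatial and channel-wise attention but gives no formulas. The following is only a minimal NumPy sketch of one plausible form of the two mechanisms, not the authors' implementation. The function names, the projection matrices `U` and `F`, the tensor shapes, and the random inputs are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(v, w, U):
    """Assumed form: each image region attends over the words.

    v: (C, N) visual features, C channels and N = H*W regions
    w: (D, L) word features, L words of dimension D
    U: (C, D) hypothetical projection of words into the channel space
    """
    w_proj = U @ w                     # (C, L)
    scores = v.T @ w_proj              # (N, L) region-word similarity
    beta = softmax(scores, axis=1)     # per region, weights over words
    context = w_proj @ beta.T          # (C, N) word context per region
    return context, beta

def channel_attention(v, w, F):
    """Assumed form: each feature channel attends over the words.

    F: (N, D) hypothetical projection of words into the spatial space
    """
    w_proj = F @ w                     # (N, L)
    scores = v @ w_proj                # (C, L) channel-word similarity
    alpha = softmax(scores, axis=1)    # per channel, weights over words
    context = alpha @ w_proj.T         # (C, N) word context per channel
    return context, alpha

# Toy sizes and random inputs, chosen only to exercise the shapes.
rng = np.random.default_rng(0)
C, N, D, L = 8, 16, 6, 5
v = rng.normal(size=(C, N))
w = rng.normal(size=(D, L))
U = rng.normal(size=(C, D))
F = rng.normal(size=(N, D))

sp_ctx, beta = spatial_attention(v, w, U)
ch_ctx, alpha = channel_attention(v, w, F)
```

In this sketch the spatial weights `beta` tie each region to the words most similar to it, while the channel weights `alpha` tie each feature channel to words. This is one way the disentangling of visual attributes described in the abstract could be realised.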

research
10/23/2020

Lightweight Generative Adversarial Networks for Text-Guided Image Manipulation

We propose a novel lightweight generative adversarial network for effici...
research
11/28/2017

AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks

In this paper, we propose an Attentional Generative Adversarial Network ...
research
05/07/2019

Spatially Constrained Generative Adversarial Networks for Conditional Image Generation

Image generation has raised tremendous attention in both academic and in...
research
04/26/2021

CAGAN: Text-To-Image Generation with Combined Attention GANs

Generating images according to natural language descriptions is a challe...
research
11/07/2020

Text-to-Image Generation Grounded by Fine-Grained User Attention

Localized Narratives is a dataset with detailed natural language descrip...
research
10/29/2018

Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language

This paper addresses the problem of manipulating images using natural la...
research
11/05/2020

DTGAN: Dual Attention Generative Adversarial Networks for Text-to-Image Generation

Most existing text-to-image generation methods adopt a multi-stage modul...
