CAGAN: Text-To-Image Generation with Combined Attention GANs

04/26/2021
by Henning Schulze, et al.

Generating images from natural language descriptions is a challenging task. In this work, we propose the Combined Attention Generative Adversarial Network (CAGAN) to generate photo-realistic images from textual descriptions. CAGAN uses two attention models: word attention, which draws different sub-regions conditioned on the related words, and squeeze-and-excitation attention, which captures non-linear interactions among channels. With spectral normalisation to stabilise training, CAGAN improves the state of the art on the Inception Score (IS) and Fréchet Inception Distance (FID) on the CUB dataset, and on the FID on the more challenging COCO dataset. Furthermore, we demonstrate that judging a model by a single evaluation metric can be misleading: an additional model that adds local self-attention scores a higher IS, outperforming the state of the art on the CUB dataset, but generates unrealistic images through feature repetition.
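
To make the squeeze-and-excitation attention named in the abstract more concrete, here is a minimal pure-Python sketch. It assumes the commonly used formulation (global average pooling per channel, a bottleneck layer with ReLU, then a sigmoid gate that rescales each channel). It is not taken from the CAGAN implementation: the function name `squeeze_excite`, the weight matrices `w1` and `w2`, the reduction ratio `r`, and the random demo inputs are all hypothetical and chosen only for illustration.

```python
import math
import random


def squeeze_excite(x, w1, w2):
    """Apply a squeeze-and-excitation gate to a feature map (illustrative only).

    x  : list of C channels, each an H x W list of lists of floats
    w1 : (C // r) x C weight matrix (hypothetical reduction layer)
    w2 : C x (C // r) weight matrix (hypothetical expansion layer)
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: bottleneck with ReLU, so channels interact non-linearly.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    # Sigmoid maps each channel's score to a gate in (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: reweight every value in a channel by that channel's gate.
    out = [[[g * v for v in row] for row in ch] for ch, g in zip(x, gates)]
    return out, gates


# Demo with made-up sizes: 4 channels of 2 x 2, reduction ratio 2.
random.seed(0)
C, H, W, r = 4, 2, 2, 2
demo_x = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
demo_w1 = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // r)]
demo_w2 = [[random.uniform(-1, 1) for _ in range(C // r)] for _ in range(C)]
demo_out, demo_gates = squeeze_excite(demo_x, demo_w1, demo_w2)
```

In a trained network the two weight matrices would be learned, and the gate lets the model emphasise or suppress whole channels based on global context rather than on local pixel values.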

research
11/09/2015

Generating Images from Captions with Attention

Motivated by the recent progress in generative models, we introduce a mo...
research
09/16/2019

Controllable Text-to-Image Generation

In this paper, we propose a novel controllable text-to-image generative ...
research
05/21/2018

Self-Attention Generative Adversarial Networks

In this paper, we propose the Self-Attention Generative Adversarial Netw...
research
10/20/2019

LinesToFacePhoto: Face Photo Generation from Lines with Conditional Self-Attention Generative Adversarial Network

In this paper, we explore the task of generating photo-realistic face im...
research
02/04/2019

Realistic Image Generation using Region-phrase Attention

The Generative Adversarial Network (GAN) has recently been applied to ge...
research
02/16/2015

DRAW: A Recurrent Neural Network For Image Generation

This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural ...
research
09/06/2018

Describing a Knowledge Base

We aim to automatically generate natural language descriptions about an ...
