CPGAN: Full-Spectrum Content-Parsing Generative Adversarial Networks for Text-to-Image Synthesis

12/18/2019
by Jiadong Liang, et al.

Typical methods for text-to-image synthesis seek to design an effective generative architecture that models the text-to-image mapping directly, which is difficult because of the cross-modality translation the task involves. In this paper we circumvent this problem by thoroughly parsing the content of both the input text and the synthesized image to model text-to-image consistency at the semantic level. In particular, we design a memory structure that parses the textual content by exploring the semantic correspondence between each word in the vocabulary and its various visual contexts across relevant training images during text encoding. The synthesized image, in turn, is parsed to learn its semantics in an object-aware manner. Moreover, we customize a conditional discriminator that models the fine-grained correlations between words and image sub-regions to enforce cross-modality semantic alignment between the input text and the synthesized image. Our model thus performs full-spectrum, content-oriented parsing at a deep semantic level, and is referred to as the Content-Parsing Generative Adversarial Network (CPGAN). Extensive experiments on the COCO dataset show that CPGAN significantly advances the state-of-the-art performance.
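The abstract stays high-level, so the snippet below is only a minimal, hypothetical sketch of the kind of fine-grained word/region correlation such a conditional discriminator could compute: each caption word attends over image sub-regions, and the per-word grounding scores are pooled into a single text-image consistency score. The function name word_region_alignment, the feature shapes, and the temperature gamma are illustrative assumptions, not the authors' actual CPGAN implementation.

# Hypothetical sketch (not the authors' code): fine-grained word/region
# alignment in the spirit of CPGAN's conditional discriminator.
import torch
import torch.nn.functional as F

def word_region_alignment(word_feats, region_feats, gamma=5.0):
    """Score text-image consistency from word and region features.

    word_feats:   (B, T, D) word embeddings of the input caption
    region_feats: (B, R, D) features of R image sub-regions
    gamma:        temperature sharpening the attention over regions
    Returns a (B,) consistency score per text-image pair.
    """
    # Cosine similarity between every word and every region: (B, T, R)
    w = F.normalize(word_feats, dim=-1)
    r = F.normalize(region_feats, dim=-1)
    sim = torch.bmm(w, r.transpose(1, 2))

    # Each word attends to the regions that best depict it.
    attn = F.softmax(gamma * sim, dim=-1)            # (B, T, R)
    context = torch.bmm(attn, region_feats)          # (B, T, D)

    # Per-word grounding score, averaged over the caption.
    word_scores = F.cosine_similarity(word_feats, context, dim=-1)  # (B, T)
    return word_scores.mean(dim=-1)                  # (B,)

# Toy usage with random tensors standing in for real text/image encoders.
if __name__ == "__main__":
    words = torch.randn(2, 12, 256)    # batch of 2 captions, 12 words each
    regions = torch.randn(2, 49, 256)  # 7x7 grid of image sub-regions
    print(word_region_alignment(words, regions))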


08/12/2022
Layout-Bridging Text-to-Image Synthesis
The crux of text-to-image synthesis stems from the difficulty of preserv...

08/20/2022
Vision-Language Matching for Text-to-Image Synthesis via Generative Adversarial Networks
Text-to-image synthesis aims to generate a photo-realistic and semantic ...

04/11/2019
FTGAN: A Fully-trained Generative Adversarial Networks for Text to Face Generation
As a sub-domain of text-to-image synthesis, text-to-face generation has ...

08/21/2020
Toward Quantifying Ambiguities in Artistic Images
It has long been hypothesized that perceptual ambiguities play an import...

04/22/2022
Recurrent Affine Transformation for Text-to-image Synthesis
Text-to-image synthesis aims to generate natural images conditioned on t...

02/12/2020
Image-to-Image Translation with Text Guidance
The goal of this paper is to embed controllable factors, i.e., natural l...

10/29/2018
Text-Adaptive Generative Adversarial Networks: Manipulating Images with Natural Language
This paper addresses the problem of manipulating images using natural la...
