MGD-GAN: Text-to-Pedestrian generation through Multi-Grained Discrimination

10/02/2020
by Shengyu Zhang, et al.

In this paper, we investigate the problem of text-to-pedestrian synthesis, which has many potential applications in art, design, and video surveillance. Existing methods for text-to-bird/flower synthesis are still far from solving this fine-grained image generation problem, due to the complex structure and heterogeneous appearance that pedestrians naturally exhibit. To this end, we propose the Multi-Grained Discrimination enhanced Generative Adversarial Network (MGD-GAN), which capitalizes on a human-part-based Discriminator (HPD) and a self-cross-attended (SCA) global Discriminator to capture the coherence of the complex body structure. A fine-grained word-level attention mechanism is employed in the HPD module to enforce diversified appearance and vivid details. In addition, two pedestrian generation metrics, named Pose Score and Pose Variance, are devised to evaluate the generation quality and diversity, respectively. We conduct extensive experiments and ablation studies on the caption-annotated pedestrian dataset, CUHK Person Description Dataset. The substantial improvements across the various metrics demonstrate the efficacy of MGD-GAN in the text-to-pedestrian synthesis scenario.
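To give a rough idea of what a word-level attention mechanism computes, the following is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not specify how attention is computed inside the HPD module, so the dot-product scoring, the function name `word_level_attention`, and the toy two-dimensional features are all assumptions made for illustration. Each image-region feature scores every word embedding, the scores are normalized with a softmax, and the result is a per-region weighted sum of word embeddings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def word_level_attention(region_feats, word_embs):
    """Attend from each image region over all caption words.

    Hypothetical helper (assumed dot-product scoring).
    Returns (contexts, weights), where weights[i][j] is how strongly
    region i attends to word j, and contexts[i] is the corresponding
    weighted sum of word embeddings.
    """
    dim = len(word_embs[0])
    contexts, weights = [], []
    for region in region_feats:
        # Similarity of this region to every word.
        scores = [dot(region, word) for word in word_embs]
        alpha = softmax(scores)
        # Word-context vector for this region.
        context = [
            sum(a * word[d] for a, word in zip(alpha, word_embs))
            for d in range(dim)
        ]
        contexts.append(context)
        weights.append(alpha)
    return contexts, weights


# Toy example: two regions, three words, 2-D features (all values made up).
regions = [[1.0, 0.0], [0.0, 1.0]]
words = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
contexts, weights = word_level_attention(regions, words)
```

In this toy setup each region places its largest weight on the word whose embedding points in the same direction, which is the behavior a part-level discriminator could use to check that, for example, a described garment appears in the matching body region.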


research
09/24/2021

Fine-Grained Image Generation from Bangla Text Description using Attentional Generative Adversarial Network

Generating fine-grained, realistic images from text has many application...
research
01/12/2021

Fine-grained Semantic Constraint in Image Synthesis

In this paper, we propose a multi-stage and high-resolution model for im...
research
11/11/2022

HumanDiffusion: a Coarse-to-Fine Alignment Diffusion Framework for Controllable Text-Driven Person Image Generation

Text-driven person image generation is an emerging and challenging task ...
research
02/27/2019

Object-driven Text-to-Image Synthesis via Adversarial Training

In this paper, we propose Object-driven Attentive Generative Adversarial...
research
04/30/2022

Look Closer to Supervise Better: One-Shot Font Generation via Component-Based Discriminator

Automatic font generation remains a challenging research issue due to th...
research
10/14/2021

Towards Using Clothes Style Transfer for Scenario-aware Person Video Generation

Clothes style transfer for person video generation is a challenging task...
research
06/20/2023

Data-Driven but Privacy-Conscious: Pedestrian Dataset De-identification via Full-Body Person Synthesis

The advent of data-driven technology solutions is accompanied by an incr...
