StyleGAN-Human: A Data-Centric Odyssey of Human Generation

04/25/2022
by Jianglin Fu, et al.

Unconditional human image generation is an important task in vision and graphics, which enables various applications in the creative industry. Existing studies in this field mainly focus on "network engineering" such as designing new components and objective functions. This work takes a data-centric perspective and investigates multiple critical aspects in "data engineering", which we believe would complement the current practice. To facilitate a comprehensive study, we collect and annotate a large-scale human image dataset with over 230K samples capturing diverse poses and textures. Equipped with this large dataset, we rigorously investigate three essential factors in data engineering for StyleGAN-based human generation, namely data size, data distribution, and data alignment. Extensive experiments reveal several valuable observations w.r.t. these aspects: 1) Large-scale data, more than 40K images, are needed to train a high-fidelity unconditional human generation model with vanilla StyleGAN. 2) A balanced training set helps improve the generation quality for rare face poses compared to its long-tailed counterpart, whereas simply balancing the clothing texture distribution does not effectively bring an improvement. 3) Human GAN models that use body centers for alignment outperform models trained using face centers or pelvis points as alignment anchors. In addition, a model zoo and human editing applications are demonstrated to facilitate future research in the community.
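As an illustration of the data-distribution factor, the sketch below shows one common way to rebalance a long-tailed attribute such as face pose: give each sample a sampling weight inversely proportional to the size of its bin. This is not necessarily the procedure used by the authors; the function name `balancing_weights`, the bin names, and the toy counts are all hypothetical.

```python
from collections import Counter


def balancing_weights(bin_labels):
    """Return one sampling weight per sample so that each bin
    receives the same total probability mass (weights sum to 1)."""
    counts = Counter(bin_labels)
    n_bins = len(counts)
    # A sample in a bin of size c gets weight 1 / (n_bins * c),
    # so every bin sums to 1 / n_bins regardless of its size.
    return [1.0 / (n_bins * counts[label]) for label in bin_labels]


# Hypothetical long-tailed toy set: many frontal faces, few profiles.
labels = ["frontal"] * 6 + ["three_quarter"] * 3 + ["profile"] * 1
weights = balancing_weights(labels)

# Total probability mass assigned to each bin after reweighting.
per_bin = Counter()
for label, weight in zip(labels, weights):
    per_bin[label] += weight

print(dict(per_bin))
```

Weights of this kind could be handed to a weighted sampler (for example PyTorch's `torch.utils.data.WeightedRandomSampler`) so that rare poses are drawn as often as common ones during training. An alternative is to subsample the dataset into an explicitly balanced subset.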


