TediGAN: Text-Guided Diverse Face Image Generation and Manipulation

12/06/2020
by   Weihao Xia, et al.

In this work, we propose TediGAN, a novel framework for multi-modal image generation and manipulation with textual descriptions. The proposed method consists of three components: a StyleGAN inversion module, visual-linguistic similarity learning, and instance-level optimization. The inversion module maps real images into the latent space of a well-trained StyleGAN. The visual-linguistic similarity module learns text-image matching by mapping images and text into a common embedding space. The instance-level optimization preserves identity during manipulation. Our model can produce diverse and high-quality images at an unprecedented resolution of 1024×1024. Using a control mechanism based on style mixing, TediGAN inherently supports image synthesis with multi-modal inputs, such as sketches or semantic labels, with or without instance guidance. To facilitate text-guided multi-modal synthesis, we propose Multi-Modal CelebA-HQ, a large-scale dataset consisting of real face images and corresponding semantic segmentation maps, sketches, and textual descriptions. Extensive experiments on the introduced dataset demonstrate the superior performance of the proposed method. Code and data are available at https://github.com/weihaox/TediGAN.
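The visual-linguistic similarity component described above maps images and text into a common embedding space and scores how well they match. A minimal sketch of that idea is below, assuming hypothetical linear projections (`W_img`, `W_txt`) and random stand-in features in place of the paper's trained encoders; the actual TediGAN encoders and loss are more involved.

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def visual_linguistic_matching_loss(image_emb, text_emb):
    # Illustrative matching loss: push a paired image/text embedding
    # toward cosine similarity 1 in the common space.
    return 1.0 - cosine_similarity(image_emb, text_emb)

rng = np.random.default_rng(0)
W_img = rng.standard_normal((512, 256))  # hypothetical image-side projection
W_txt = rng.standard_normal((300, 256))  # hypothetical text-side projection

img_feat = rng.standard_normal(512)      # stand-in for an image feature
txt_feat = rng.standard_normal(300)      # stand-in for a text feature

img_emb = img_feat @ W_img               # map both modalities into the
txt_emb = txt_feat @ W_txt               # shared embedding space

loss = visual_linguistic_matching_loss(img_emb, txt_emb)
print(loss)
```

In training, minimizing such a loss over matched image-text pairs (with mismatched pairs pushed apart) is what lets a textual description steer the inverted StyleGAN latent code.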


Related research:

- 04/18/2021 — Towards Open-World Text-Guided Face Image Generation and Manipulation: The existing text-guided image synthesis methods can only produce limite...
- 03/05/2023 — Text2Face: A Multi-Modal 3D Face Model: We present the first 3D morphable modelling approach, whereby 3D face sh...
- 03/29/2022 — AnyFace: Free-style Text-to-Face Synthesis and Manipulation: Existing text-to-image synthesis methods generally are only applicable t...
- 10/10/2021 — Identity-Guided Face Generation with Multi-modal Contour Conditions: Recent face generation methods have tried to synthesize faces based on t...
- 09/21/2023 — TextCLIP: Text-Guided Face Image Generation And Manipulation Without Adversarial Training: Text-guided image generation aimed to generate desired images conditione...
- 03/11/2021 — Diverse Semantic Image Synthesis via Probability Distribution Modeling: Semantic image synthesis, translating semantic layouts to photo-realisti...
- 03/28/2020 — Semantically Multi-modal Image Synthesis: In this paper, we focus on semantically multi-modal image synthesis (SMI...
