Multi-Modal Face Stylization with a Generative Prior

05/29/2023
by Mengtian Li, et al.

In this work, we introduce a new approach for artistic face stylization. Although existing methods achieve impressive results on this task, there is still room for improvement in generating high-quality stylized faces with diverse styles and accurate facial reconstruction. Our proposed framework, MMFS, supports multi-modal face stylization by integrating StyleGAN into an encoder-decoder architecture and leveraging its strengths. Specifically, we use the mid-resolution and high-resolution layers of StyleGAN as the decoder to generate high-quality faces, while aligning its low-resolution layer with the encoder to extract and preserve input facial details. We also introduce a two-stage training strategy. In the first stage, we train the encoder to align its feature maps with StyleGAN and enable faithful reconstruction of input faces. In the second stage, the entire network is fine-tuned on artistic data for stylized face generation. To apply the fine-tuned model to zero-shot and one-shot stylization tasks, we train an additional mapping network from the large-scale Contrastive Language-Image Pre-training (CLIP) space to the latent w+ space of the fine-tuned StyleGAN. Qualitative and quantitative experiments show that our framework achieves superior face stylization performance in both one-shot and zero-shot stylization tasks, outperforming state-of-the-art methods by a large margin.
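The training procedure described in the abstract can be summarized as a schedule of which components are updated in each phase. The following is only a minimal sketch, not the authors' code: the component names, the phase names, and the `data` labels for the first and last phases are hypothetical, and keeping the encoder and decoder frozen while the CLIP mapping network is trained is an assumption the abstract does not state.

```python
from dataclasses import dataclass

# Hypothetical component names; the abstract names roles, not identifiers.
COMPONENTS = ("encoder", "stylegan_decoder", "clip_mapping_network")


@dataclass(frozen=True)
class Phase:
    name: str             # hypothetical label for the phase
    data: str             # kind of training data (partly assumed, see below)
    trainable: frozenset  # components that receive gradient updates


def training_schedule():
    """Return the training phases in the order the abstract describes them."""
    return [
        # Stage 1: only the encoder is trained, so that its feature maps
        # align with StyleGAN and input faces are reconstructed faithfully.
        # The "real faces" data label is an assumption.
        Phase("stage_1_reconstruction", "real faces", frozenset({"encoder"})),
        # Stage 2: the entire encoder-decoder network is fine-tuned on
        # artistic data for stylized face generation.
        Phase("stage_2_stylization", "artistic data",
              frozenset({"encoder", "stylegan_decoder"})),
        # Additional step: a mapping network from CLIP space to the w+ space
        # of the fine-tuned StyleGAN. Freezing everything else here, and the
        # "CLIP embeddings" data label, are assumptions.
        Phase("clip_mapping", "CLIP embeddings",
              frozenset({"clip_mapping_network"})),
    ]


def frozen_components(phase):
    """List the components that are not updated during the given phase."""
    return sorted(set(COMPONENTS) - phase.trainable)


if __name__ == "__main__":
    for phase in training_schedule():
        print(phase.name, "| trains:", sorted(phase.trainable),
              "| frozen:", frozen_components(phase))
```

In a real implementation, each phase would correspond to enabling gradients only for the listed components before running that phase's optimization loop.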


