MagiCapture: High-Resolution Multi-Concept Portrait Customization

09/13/2023
by Junha Hyung et al.

Large-scale text-to-image models such as Stable Diffusion can generate high-fidelity, photorealistic portrait images. An active line of research personalizes these models to synthesize specific subjects or styles from a small set of reference images. However, despite their plausible results, these personalization methods often fall short of realism and are not yet commercially viable. This is particularly noticeable in portrait generation, where any unnatural artifact in a human face is easily discernible because of our inherent sensitivity to faces. To address this, we introduce MagiCapture, a personalization method that integrates subject and style concepts to generate high-resolution portrait images from just a few subject and style references. For instance, given a handful of random selfies, our fine-tuned model can generate high-quality portraits in specific styles, such as passport or profile photos. The main challenge in this task is the absence of ground truth for the composed concepts, which degrades the quality of the final output and shifts the identity of the source subject. To address these issues, we present a novel Attention Refocusing loss coupled with auxiliary priors, both of which facilitate robust learning in this weakly supervised setting. Our pipeline also includes post-processing steps that ensure highly realistic outputs. MagiCapture outperforms other baselines in both quantitative and qualitative evaluations and also generalizes to non-human objects.
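The abstract does not spell out the Attention Refocusing loss, but losses of this family typically penalize cross-attention mass that lands in the wrong spatial region: tokens for the learned concept should attend inside a subject mask, and all other tokens should attend outside it. Below is a minimal, hypothetical numpy sketch of that idea; the function name, array shapes, and the simple sum-based penalty are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def attention_refocusing_loss(attn, mask, concept_token_idx):
    """Toy attention-refocusing penalty (illustrative, not the paper's loss).

    attn:              (num_tokens, num_pixels) cross-attention map,
                       each row sums to 1 over spatial locations.
    mask:              (num_pixels,) binary subject mask (1 = subject region).
    concept_token_idx: indices of the tokens representing the learned concept.
    """
    is_concept = np.zeros(attn.shape[0], dtype=bool)
    is_concept[concept_token_idx] = True

    # Concept tokens should not "leak" attention outside the subject mask.
    leak = attn[is_concept][:, mask == 0].sum()
    # Non-concept tokens should not "intrude" into the subject region.
    intrude = attn[~is_concept][:, mask == 1].sum()
    return leak + intrude

# Two tokens over a 4-pixel image; the first two pixels are the subject.
mask = np.array([1, 1, 0, 0])
good = np.array([[0.5, 0.5, 0.0, 0.0],   # concept token attends inside mask
                 [0.0, 0.0, 0.5, 0.5]])  # background token attends outside
bad = good[::-1]                          # attention regions swapped
print(attention_refocusing_loss(good, mask, [0]))  # 0.0
print(attention_refocusing_loss(bad, mask, [0]))   # 2.0
```

In the paper's actual pipeline such a penalty would be computed on the diffusion model's cross-attention maps during fine-tuning, with the mask obtained from a face-segmentation prior; this sketch only illustrates the masking logic on static arrays.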


Related research

08/07/2023 · AvatarVerse: High-quality Stable 3D Avatar Creation from Text and Pose
Creating expressive, diverse and high-quality 3D avatars from highly cus...

11/14/2022 · Arbitrary Style Guidance for Enhanced Diffusion-Based Text-to-Image Generation
Diffusion-based text-to-image generation models like GLIDE and DALLE-2 h...

11/21/2022 · DreamArtist: Towards Controllable One-Shot Text-to-Image Generation via Contrastive Prompt-Tuning
Large-scale text-to-image generation models have achieved remarkable pro...

12/08/2022 · Multi-Concept Customization of Text-to-Image Diffusion
While generative models produce high-quality images of concepts learned ...

09/04/2023 · StyleAdapter: A Single-Pass LoRA-Free Model for Stylized Image Generation
This paper presents a LoRA-free method for stylized image generation tha...

10/11/2022 · Style-Guided Inference of Transformer for High-resolution Image Synthesis
Transformer is eminently suitable for auto-regressive image synthesis wh...

08/19/2023 · DUAW: Data-free Universal Adversarial Watermark against Stable Diffusion Customization
Stable Diffusion (SD) customization approaches enable users to personali...
