StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing

03/28/2023
by Senmao Li, et al.

A significant research effort is focused on exploiting the impressive capabilities of pretrained diffusion models for image editing. Existing methods either finetune the model or invert the image into the latent space of the pretrained model. However, they suffer from two problems: (1) unsatisfactory results in selected regions and unexpected changes in non-selected regions; (2) they require careful text-prompt editing, where the prompt must include all visual objects in the input image. To address this, we propose two improvements: (1) optimizing only the input of the value linear network in the cross-attention layers is sufficiently powerful to reconstruct a real image; (2) we propose attention regularization to preserve the object-like attention maps after editing, enabling accurate style editing without invoking significant structural changes. We further improve the editing technique used for the unconditional branch of classifier-free guidance, as well as the conditional branch as used by P2P. Extensive prompt-editing experiments on a variety of images demonstrate, both qualitatively and quantitatively, that our method has superior editing capabilities compared to existing and concurrent works.
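The first improvement rests on a structural property of cross-attention: the attention map is computed from queries (image features) and keys (text embedding) only, so optimizing the embedding fed to the value linear network changes the attended content while leaving the attention map, and hence the spatial structure, untouched. A minimal NumPy sketch of this property (all shapes, names, and random inputs are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(q, k, v_input, W_v):
    # The attention map A depends only on queries and keys,
    # never on the input to the value projection.
    A = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return A, A @ (v_input @ W_v)

rng = np.random.default_rng(0)
d, n_img, n_txt = 8, 4, 3
q = rng.standard_normal((n_img, d))   # image-feature queries
k = rng.standard_normal((n_txt, d))   # text-embedding keys (fixed)
W_v = rng.standard_normal((d, d))     # value linear network (frozen)

v_orig = rng.standard_normal((n_txt, d))  # original value-branch embedding
v_opt = rng.standard_normal((n_txt, d))   # "optimized" embedding (stand-in)

A1, out1 = cross_attention(q, k, v_orig, W_v)
A2, out2 = cross_attention(q, k, v_opt, W_v)

print(np.allclose(A1, A2))        # attention maps identical
print(np.allclose(out1, out2))    # outputs differ
```

The attention regularization of the second improvement would then penalize deviation of the edited attention map from the original one (e.g. a squared difference between the two maps); since the value-branch optimization above never perturbs the map in the first place, the two mechanisms are complementary.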


