LDEdit: Towards Generalized Text Guided Image Manipulation via Latent Diffusion Models

10/05/2022
by   Paramanand Chandramouli, et al.

Research in vision-language models has developed rapidly of late, enabling natural language-based interfaces for image generation and manipulation. Many existing text-guided manipulation techniques are restricted to specific classes of images, and often require fine-tuning to transfer to a different style or domain. Nevertheless, generic image manipulation using a single model with flexible text inputs is highly desirable. Recent work addresses this task by guiding generative models trained on generic image datasets using pretrained vision-language encoders. While promising, this approach requires expensive optimization for each input. In this work, we propose an optimization-free method for the task of generic image manipulation from text prompts. Our approach exploits recent Latent Diffusion Models (LDM) for text-to-image generation to achieve zero-shot text-guided manipulation. We employ a deterministic forward diffusion in a lower-dimensional latent space, and the desired manipulation is achieved by simply providing the target text to condition the reverse diffusion process. We refer to our approach as LDEdit. We demonstrate the applicability of our method on semantic image manipulation and artistic style transfer. Our method can accomplish image manipulation on diverse domains and enables editing multiple attributes in a straightforward fashion. Extensive experiments demonstrate the benefit of our approach over competing baselines.
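The encode-then-decode idea in the abstract can be illustrated with a small toy sketch. It is not the authors' implementation: it assumes a DDIM-style deterministic update (the abstract only says "deterministic forward diffusion"), it replaces the LDM noise predictor with a hypothetical `toy_eps` function that treats the conditioning vector as the clean latent, and it assumes the forward pass uses some source conditioning `cond_src`, which the abstract does not specify. The image autoencoder is omitted, so `z0` stands in for an already-encoded latent, and all names and schedule values are illustrative.

```python
import math


def make_alpha_bars(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta) for a linear beta schedule (assumed)."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def toy_eps(z, alpha_bar, cond):
    """Hypothetical stand-in for a text-conditioned noise predictor.

    It pretends the clean latent equals `cond`; a real LDM would use a
    trained network conditioned on a text embedding instead.
    """
    scale, sigma = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(zi - scale * ci) / sigma for zi, ci in zip(z, cond)]


def ddim_step(z, ab_from, ab_to, cond):
    """One deterministic DDIM-style update between two noise levels."""
    eps = toy_eps(z, ab_from, cond)
    out = []
    for zi, ei in zip(z, eps):
        z0_pred = (zi - math.sqrt(1.0 - ab_from) * ei) / math.sqrt(ab_from)
        out.append(math.sqrt(ab_to) * z0_pred + math.sqrt(1.0 - ab_to) * ei)
    return out


def encode(z0, alpha_bars, t_enc, cond):
    """Deterministic forward diffusion: step the latent toward more noise."""
    z = list(z0)
    for t in range(t_enc):
        z = ddim_step(z, alpha_bars[t], alpha_bars[t + 1], cond)
    return z


def decode(z_t, alpha_bars, t_enc, cond):
    """Deterministic reverse diffusion under the given conditioning."""
    z = list(z_t)
    for t in range(t_enc, 0, -1):
        z = ddim_step(z, alpha_bars[t], alpha_bars[t - 1], cond)
    return z


def edit(z0, alpha_bars, t_enc, cond_src, cond_tgt):
    """Encode with the source conditioning, decode with the target one."""
    z_t = encode(z0, alpha_bars, t_enc, cond_src)
    return decode(z_t, alpha_bars, t_enc, cond_tgt)


if __name__ == "__main__":
    alpha_bars = make_alpha_bars()
    latent = [1.0, -2.0, 0.5, 3.0]
    source, target = [0.0] * 4, [2.0] * 4
    print(edit(latent, alpha_bars, 40, source, target))
```

In this toy setting, decoding with the same conditioning used for encoding reproduces the input latent up to floating-point error, while swapping in a different target vector shifts the result toward that target. No per-input optimization loop appears anywhere, which is the property the abstract emphasizes; how faithfully a trained model inverts in practice is a separate question this sketch does not address.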


