User-friendly Image Editing with Minimal Text Input: Leveraging Captioning and Injection Techniques

06/05/2023
by   Sunwoo Kim, et al.

Recent text-driven image editing with diffusion models has shown remarkable success. However, existing methods assume that the user's description sufficiently grounds the context of the source image, such as objects, background, style, and their relations. This assumption is unsuitable for real-world applications because users must manually engineer text prompts to find the optimal description for each image. From the user's standpoint, prompt engineering is labor-intensive, and users prefer to provide a target word for editing rather than a full sentence. To address this problem, we first demonstrate the importance of a detailed text description of the source image by dividing prompts into three categories based on their level of semantic detail. We then propose simple yet effective methods that combine prompt generation frameworks, making the prompt engineering process more user-friendly. Extensive qualitative and quantitative experiments demonstrate the importance of prompts in text-driven image editing and show that our method performs comparably to ground-truth prompts.

research
04/10/2023

Towards Real-time Text-driven Image Manipulation with Unconditional Diffusion Models

Recent advances in diffusion models enable many powerful instruments for...
research
03/23/2023

SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field

Despite the great success in 2D editing using user-friendly tools, such ...
research
01/25/2023

Towards Arbitrary Text-driven Image Manipulation via Space Alignment

The recent GAN inversion methods have been able to successfully invert t...
research
03/02/2016

LevelMerge: Collaborative Game Level Editing by Merging Labeled Graphs

Game level editing is the process of constructing a full game level star...
research
11/29/2021

Blended Diffusion for Text-driven Editing of Natural Images

Natural language offers a highly intuitive interface for image editing. ...
research
08/08/2019

Editing Text in the Wild

In this paper, we are interested in editing text in natural images, whic...
research
04/27/2018

Customized Image Narrative Generation via Interactive Visual Question Generation and Answering

Image description task has been invariably examined in a static manner w...
