A Unified Prompt-Guided In-Context Inpainting Framework for Reference-based Image Manipulations

05/19/2023
by   Chenjie Cao, et al.

Recent advancements in Text-to-Image (T2I) generative models have yielded impressive results in generating high-fidelity images consistent with text prompts. However, there is growing interest in exploring the potential of these models for more diverse reference-based image manipulation tasks that require spatial understanding and visual context. Previous approaches have achieved this by incorporating additional control modules or by fine-tuning the generative models on each task until convergence. In this paper, we take a different perspective. We conjecture that current large-scale T2I generative models already possess the capability to perform these tasks, but that this capability is not fully activated within the standard generation process. To unlock it, we introduce a unified Prompt-Guided In-Context inpainting (PGIC) framework, which leverages large-scale T2I models to re-formulate and solve reference-guided image manipulations. In the PGIC framework, the reference and the masked target are stitched together as a new input to the generative model, so that filling the masked regions produces the final results. Furthermore, we show that the self-attention modules in T2I models are well suited to establishing spatial correlations and can efficiently address challenging reference-guided manipulations. These large T2I models can be effectively driven by task-specific prompts with minimal training cost, or even with frozen backbones. We evaluate the effectiveness of the proposed PGIC framework across various tasks, including reference-guided image inpainting, faithful inpainting, outpainting, local super-resolution, and novel view synthesis. Our results show that PGIC achieves significantly better performance while requiring less computation than other fine-tuning based approaches.
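
The core mechanism described above, stitching the reference and the masked target into one canvas so that an off-the-shelf T2I inpainting model completes the masked region in context, can be sketched in a few lines. The snippet below is an illustrative assumption rather than the authors' released code: the checkpoint, the prompt wording, and the `pgic_style_inpaint` helper are hypothetical stand-ins for the task-specific prompting used in PGIC.

```python
# Minimal sketch of the PGIC-style setup (not the official implementation):
# place the reference on the left, the masked target on the right, and let a
# pretrained T2I inpainting model fill only the target half under a task prompt.
from PIL import Image
import torch
from diffusers import StableDiffusionInpaintPipeline

def pgic_style_inpaint(reference: Image.Image, target: Image.Image,
                       target_mask: Image.Image, prompt: str) -> Image.Image:
    # Assumes reference/target are already 512x512 (multiples of 8 in general).
    w, h = reference.size

    # Stitch reference (left) and masked target (right) into one canvas.
    canvas = Image.new("RGB", (2 * w, h))
    canvas.paste(reference, (0, 0))
    canvas.paste(target, (w, 0))

    # The inpainting mask covers only the masked region of the target half,
    # so the reference half stays intact and acts as in-context guidance.
    full_mask = Image.new("L", (2 * w, h), 0)
    full_mask.paste(target_mask, (w, 0))

    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
    ).to("cuda")
    result = pipe(prompt=prompt, image=canvas, mask_image=full_mask,
                  height=h, width=2 * w).images[0]

    # Crop out the completed target half as the final output.
    return result.crop((w, 0, 2 * w, h))

# Example usage with a hypothetical task prompt:
# out = pgic_style_inpaint(ref_img, tgt_img, tgt_mask,
#                          prompt="fill the right view so it matches the left view")
```

In this formulation, the self-attention layers of the inpainting backbone attend across the whole stitched canvas, which is what lets the reference half inform the filled region without any extra control module.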


