TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition

07/24/2023
by Shilin Lu, et al.
Text-driven diffusion models have exhibited impressive generative capabilities, enabling various image editing tasks. In this paper, we propose TF-ICON, a novel Training-Free Image COmpositioN framework that harnesses the power of text-driven diffusion models for cross-domain image-guided composition. This task aims to seamlessly integrate user-provided objects into a specific visual context. Current diffusion-based methods often involve costly instance-based optimization or finetuning of pretrained models on customized datasets, which can potentially undermine their rich prior. In contrast, TF-ICON can leverage off-the-shelf diffusion models to perform cross-domain image-guided composition without requiring additional training, finetuning, or optimization. Moreover, we introduce the exceptional prompt, which contains no information, to facilitate text-driven diffusion models in accurately inverting real images into latent representations, forming the basis for compositing. Our experiments show that equipping Stable Diffusion with the exceptional prompt outperforms state-of-the-art inversion methods on various datasets (CelebA-HQ, COCO, and ImageNet), and that TF-ICON surpasses prior baselines in versatile visual domains. Code is available at https://github.com/Shilin-LU/TF-ICON
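
To make the idea of inverting an image into a latent under an information-free prompt concrete, here is a minimal sketch. It is not the TF-ICON implementation: the abstract does not specify the inversion procedure or how the exceptional prompt is constructed, so the sketch assumes a deterministic DDIM-style update, replaces Stable Diffusion with a hypothetical toy noise predictor (`toy_eps`), and models the prompt as a conditioning argument that is simply ignored. All function names and the noise schedule are illustrative assumptions.

```python
import numpy as np

def toy_eps(x, alpha_bar_t, cond=None):
    # Hypothetical stand-in for a diffusion model's noise predictor.
    # `cond` is ignored, mimicking a prompt that carries no information.
    return 0.05 * x

def ddim_step(x, a_from, a_to, cond=None):
    # Deterministic DDIM-style update between two noise levels.
    # a_from and a_to are cumulative alpha values (alpha-bar).
    eps = toy_eps(x, a_from, cond)
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps

def invert(x0, alpha_bar):
    # Walk from the clean image toward the noisy latent.
    x = x0
    for a_from, a_to in zip(alpha_bar[:-1], alpha_bar[1:]):
        x = ddim_step(x, a_from, a_to)
    return x

def reconstruct(x_T, alpha_bar):
    # Walk back from the latent toward the image.
    x = x_T
    rev = alpha_bar[::-1]
    for a_from, a_to in zip(rev[:-1], rev[1:]):
        x = ddim_step(x, a_from, a_to)
    return x

# Assumed schedule: alpha-bar decays linearly from 1.0 (clean) to 0.1 (noisy).
alpha_bar = np.linspace(1.0, 0.1, 51)
rng = np.random.default_rng(0)
image = rng.standard_normal((4, 4))  # toy "image"

latent = invert(image, alpha_bar)
recon = reconstruct(latent, alpha_bar)
rel_err = np.linalg.norm(recon - image) / np.linalg.norm(image)
print(f"relative reconstruction error: {rel_err:.4f}")
```

The round trip is only approximate, because each inversion step evaluates the noise predictor at the current state rather than at the next one. Inversion accuracy in this sense is the quantity the abstract's comparison against other inversion methods concerns.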
