Shape-Guided Diffusion with Inside-Outside Attention

12/01/2022
by   Dong Huk Park, et al.

Shape can specify key object constraints, yet existing text-to-image diffusion models ignore this cue and synthesize objects that are incorrectly scaled, cut off, or replaced with background content. We propose a training-free method, Shape-Guided Diffusion, which uses a novel Inside-Outside Attention mechanism to constrain the cross-attention (and self-attention) maps such that prompt tokens (and pixels) referring to the inside of the shape cannot attend outside the shape, and vice versa. To demonstrate the efficacy of our method, we propose a new image editing task where the model must replace an object specified by its mask and a text prompt. We curate a new ShapePrompts benchmark based on MS-COCO and achieve state-of-the-art results in shape faithfulness, text alignment, and realism according to both quantitative metrics and human preferences. Our data and code will be made available at https://shape-guided-diffusion.github.io.
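
The inside-outside constraint described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `inside_outside_cross_attention`, its parameters, and the choice to mask the attention logits before the softmax are all assumptions made for illustration. It only shows the idea that a pixel inside the shape attends to "inside" prompt tokens and a pixel outside the shape attends to "outside" tokens.

```python
import numpy as np

def inside_outside_cross_attention(scores, pixel_inside, token_inside):
    """Hypothetical sketch of a shape-constrained cross-attention map.

    scores:       (num_pixels, num_tokens) pre-softmax attention logits.
    pixel_inside: (num_pixels,) bool, True where the pixel lies inside the shape mask.
    token_inside: (num_tokens,) bool, True for prompt tokens describing the object.
    Returns a (num_pixels, num_tokens) attention map whose rows sum to 1.
    """
    scores = np.asarray(scores, dtype=float)
    pixel_inside = np.asarray(pixel_inside, dtype=bool)
    token_inside = np.asarray(token_inside, dtype=bool)

    # Every pixel needs at least one token it may attend to.
    if token_inside.all() or not token_inside.any():
        raise ValueError("need at least one inside and one outside token")

    # A (pixel, token) pair is allowed only when both are inside or both are outside.
    allowed = pixel_inside[:, None] == token_inside[None, :]

    # Assumption: forbid a pair by setting its logit to -inf before the softmax.
    masked = np.where(allowed, scores, -np.inf)

    # Numerically stable softmax over the token axis.
    masked = masked - masked.max(axis=1, keepdims=True)
    weights = np.exp(masked)
    return weights / weights.sum(axis=1, keepdims=True)


# Toy usage: 4 pixels (first two inside the shape), 3 tokens (first one is the object).
attn = inside_outside_cross_attention(
    np.zeros((4, 3)),
    pixel_inside=[True, True, False, False],
    token_inside=[True, False, False],
)
```

The same boolean comparison could plausibly be applied to self-attention by using the pixel mask on both axes, so that inside pixels attend only to inside pixels; that extension is likewise an assumption rather than a detail stated here.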

research
07/24/2023

TF-ICON: Diffusion-Based Training-Free Cross-Domain Image Composition

Text-driven diffusion models have exhibited impressive generative capabi...
research
12/06/2022

Diffusion-SDF: Text-to-Shape via Voxelized Diffusion

With the rising industrial attention to 3D virtual modeling technology, ...
research
04/17/2023

MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

Despite the success in large-scale text-to-image generation and text-con...
research
11/29/2021

Blended Diffusion for Text-driven Editing of Natural Images

Natural language offers a highly intuitive interface for image editing. ...
research
02/25/2023

Directed Diffusion: Direct Control of Object Placement through Attention Guidance

Text-guided diffusion models such as DALLE-2, IMAGEN, and Stable Diffusi...
research
03/20/2023

Localizing Object-level Shape Variations with Text-to-Image Diffusion Models

Text-to-image models give rise to workflows which often begin with an ex...
research
09/08/2023

MaskDiffusion: Boosting Text-to-Image Consistency with Conditional Mask

Recent advancements in diffusion models have showcased their impressive ...
