SyncDiffusion: Coherent Montage via Synchronized Joint Diffusions

06/08/2023
by Yuseung Lee, et al.

The remarkable capabilities of pretrained image diffusion models have been utilized not only for generating fixed-size images but also for creating panoramas. However, naive stitching of multiple images often results in visible seams. Recent techniques have attempted to address this issue by performing joint diffusions in multiple windows and averaging latent features in overlapping regions. These approaches focus on seamless montage generation, yet they often yield incoherent outputs by blending different scenes within a single image. To overcome this limitation, we propose SyncDiffusion, a plug-and-play module that synchronizes multiple diffusions through gradient descent from a perceptual similarity loss. Specifically, we compute the gradient of the perceptual loss using the predicted denoised images at each denoising step, providing meaningful guidance for achieving coherent montages. Our experimental results demonstrate that our method produces significantly more coherent outputs compared to previous methods (66.35% in our user study) while still maintaining fidelity (as assessed by GIQA) and compatibility with the input prompt (as measured by CLIP score).
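
As a rough illustration of the synchronization idea described in the abstract, the following minimal sketch takes one gradient-descent step that pulls each window's noisy latent toward an anchor window, using a similarity loss computed on the predicted denoised images. It is a toy under stated assumptions, not the authors' implementation: `toy_feature` (mean intensity) is a hypothetical stand-in for a real perceptual metric, the noise prediction is held fixed so the gradient can be written by hand, and the names `sync_step`, `weight`, and `anchor` are invented for illustration.

```python
import math

def predict_x0(x_t, eps, alpha_bar):
    """Estimate the clean image from a noisy latent and a noise prediction."""
    s, n = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(x - n * e) / s for x, e in zip(x_t, eps)]

def toy_feature(x0):
    """Hypothetical stand-in for a perceptual embedding: mean intensity."""
    return sum(x0) / len(x0)

def toy_loss(latents, eps_preds, alpha_bar, anchor=0):
    """Sum of squared feature distances between each window and the anchor."""
    feats = [toy_feature(predict_x0(x, e, alpha_bar))
             for x, e in zip(latents, eps_preds)]
    return sum((f - feats[anchor]) ** 2 for f in feats)

def sync_step(latents, eps_preds, alpha_bar, weight, anchor=0):
    """One gradient-descent step on the noisy latents of every window."""
    x0s = [predict_x0(x, e, alpha_bar) for x, e in zip(latents, eps_preds)]
    target = toy_feature(x0s[anchor])
    updated = []
    for x_t, x0 in zip(latents, x0s):
        diff = toy_feature(x0) - target
        # Assumption: eps is treated as a constant, so for loss = diff**2
        # d(loss)/d(x_t[j]) = 2 * diff / (len(x_t) * sqrt(alpha_bar)).
        grad = 2.0 * diff / (len(x_t) * math.sqrt(alpha_bar))
        updated.append([x - weight * grad for x in x_t])
    return updated

# Three tiny "windows" of four pixels each, with zero predicted noise.
latents = [[0.0] * 4, [1.0] * 4, [-2.0, 0.0, 2.0, 4.0]]
eps_preds = [[0.0] * 4 for _ in latents]
before = toy_loss(latents, eps_preds, alpha_bar=0.5)
synced = sync_step(latents, eps_preds, alpha_bar=0.5, weight=0.5)
after = toy_loss(synced, eps_preds, alpha_bar=0.5)
```

In this toy setting the anchor window is left unchanged and the loss shrinks after the step. A real pipeline would presumably replace `toy_feature` with a learned perceptual metric and obtain the gradient by automatic differentiation before continuing with the usual denoising update.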

research
05/30/2023

Real-World Image Variation by Aligning Diffusion Inversion Chain

Recent diffusion model advancements have enabled high-fidelity images to...
research
03/29/2020

Structure-Preserving Super Resolution with Gradient Guidance

Structures matter in single image super resolution (SISR). Recent studie...
research
09/18/2023

Gradpaint: Gradient-Guided Inpainting with Diffusion Models

Denoising Diffusion Probabilistic Models (DDPMs) have recently achieved ...
research
05/15/2020

Enhancing Perceptual Loss with Adversarial Feature Matching for Super-Resolution

Single image super-resolution (SISR) is an ill-posed problem with an ind...
research
05/29/2023

Text-Only Image Captioning with Multi-Context Data Generation

Text-only Image Captioning (TIC) is an approach that aims to construct a...
research
05/05/2021

Perceptual Gradient Networks

Many applications of deep learning for image generation use perceptual l...
research
01/13/2020

180-degree Outpainting from a Single Image

Presenting context images to a viewer's peripheral vision is one of the ...
