DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

08/25/2022
by   Nataniel Ruiz, et al.

Large text-to-image models achieved a remarkable leap in the evolution of AI, enabling high-quality and diverse synthesis of images from a given text prompt. However, these models lack the ability to mimic the appearance of subjects in a given reference set and synthesize novel renditions of them in different contexts. In this work, we present a new approach for "personalization" of text-to-image diffusion models (specializing them to users' needs). Given as input just a few images of a subject, we fine-tune a pretrained text-to-image model (Imagen, although our method is not limited to a specific model) such that it learns to bind a unique identifier with that specific subject. Once the subject is embedded in the output domain of the model, the unique identifier can then be used to synthesize fully-novel photorealistic images of the subject contextualized in different scenes. By leveraging the semantic prior embedded in the model with a new autogenous class-specific prior preservation loss, our technique enables synthesizing the subject in diverse scenes, poses, views, and lighting conditions that do not appear in the reference images. We apply our technique to several previously-unassailable tasks, including subject recontextualization, text-guided view synthesis, appearance modification, and artistic rendering (all while preserving the subject's key features). Project page: https://dreambooth.github.io/
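The abstract names a class-specific prior preservation loss but does not give its form. As a rough illustration only, the sketch below assumes the objective is a reconstruction error on the few subject images plus a weighted reconstruction error on class samples generated by the pretrained model itself. The function names, the plain mean-squared-error terms, and the `prior_weight` parameter are hypothetical choices for this example, not details taken from the text above.

```python
from typing import Sequence


def mean_squared_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Average squared difference between two equal-length vectors."""
    if len(pred) != len(target) or not pred:
        raise ValueError("pred and target must be non-empty and the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(
    subject_pred: Sequence[float],
    subject_target: Sequence[float],
    prior_pred: Sequence[float],
    prior_target: Sequence[float],
    prior_weight: float = 1.0,  # hypothetical weighting; not given in the abstract
) -> float:
    """Assumed form: subject reconstruction term + weighted prior term.

    subject_* stand in for the model output and target on a subject image
    prompted with the unique identifier. prior_* stand in for the model output
    and target on a class image that the frozen pretrained model generated.
    """
    subject_term = mean_squared_error(subject_pred, subject_target)
    prior_term = mean_squared_error(prior_pred, prior_target)
    return subject_term + prior_weight * prior_term


if __name__ == "__main__":
    # Toy vectors in place of image tensors.
    print(combined_loss([1.0, 2.0], [1.0, 4.0], [0.0, 0.0], [1.0, 1.0], prior_weight=0.5))
```

Under this assumption, the second term discourages the fine-tuned model from drifting away from what it already knows about the subject's class, which is one plausible reading of how the semantic prior is preserved while the identifier is bound to the subject.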
