Subject-driven Text-to-Image Generation via Apprenticeship Learning

04/01/2023
by Wenhu Chen, et al.

Recent text-to-image generation models like DreamBooth have made remarkable progress in generating highly customized images of a target subject by fine-tuning an “expert model” for that subject from a few examples. However, this process is expensive, since a new expert model must be learned for each subject. In this paper, we present SuTI, a Subject-driven Text-to-Image generator that replaces subject-specific fine-tuning with in-context learning. Given a few demonstrations of a new subject, SuTI can instantly generate novel renditions of the subject in different scenes, without any subject-specific optimization. SuTI is powered by apprenticeship learning, where a single apprentice model is learned from data generated by a massive number of subject-specific expert models. Specifically, we mine millions of image clusters from the Internet, each centered around a specific visual subject. We use these clusters to train a massive number of expert models, each specialized in a different subject. The apprentice model SuTI then learns to mimic the behavior of these experts through the proposed apprenticeship learning algorithm. SuTI can generate high-quality, customized subject-specific images 20x faster than optimization-based SoTA methods. On the challenging DreamBench and DreamBench-v2 benchmarks, our human evaluation shows that SuTI significantly outperforms existing approaches such as InstructPix2Pix, Textual Inversion, Imagic, Prompt2Prompt, and Re-Imagen, while performing on par with DreamBooth.

research
05/17/2023

FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention

Diffusion models excel at text-to-image generation, especially in subjec...
research
05/24/2023

BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing

Subject-driven text-to-image generation models create novel renditions o...
research
08/25/2022

DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

Large text-to-image models achieved a remarkable leap in the evolution o...
research
06/22/2023

DreamEdit: Subject-driven Image Editing

Subject-driven image generation aims at generating images containing cus...
research
03/09/2023

Cones: Concept Neurons in Diffusion Models for Customized Generation

Human brains respond to semantic features of presented stimuli with diff...
research
03/23/2023

DreamBooth3D: Subject-Driven Text-to-3D Generation

We present DreamBooth3D, an approach to personalize text-to-3D generativ...
research
09/29/2022

Re-Imagen: Retrieval-Augmented Text-to-Image Generator

Research on text-to-image generation has witnessed significant progress ...
