InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning

04/06/2023
by Jing Shi, et al.

Recent advances in personalized image generation allow a pre-trained text-to-image model to learn a new concept from a set of images. However, existing personalization approaches usually require heavy test-time finetuning for each concept, which is time-consuming and difficult to scale. We propose InstantBooth, a novel approach built upon pre-trained text-to-image models that enables instant text-guided image personalization without any test-time finetuning. We achieve this with several major components. First, we learn the general concept of the input images by converting them to a textual token with a learnable image encoder. Second, to keep the fine details of the identity, we learn a rich visual feature representation by introducing a few adapter layers to the pre-trained model. We train our components only on text-image pairs, without using paired images of the same concept. Compared to test-time finetuning-based methods like DreamBooth and Textual Inversion, our model generates competitive results on unseen concepts in terms of language-image alignment, image fidelity, and identity preservation, while being 100 times faster.
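
The two components named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the linear "encoder", the `<concept>` placeholder token, the averaging over input images, and the scalar gate on the adapter are all assumptions made for illustration, and plain Python lists stand in for tensors.

```python
import random

EMBED_DIM = 4  # toy embedding size (assumption)


def linear(vec, weight):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def encode_concept(image_features, weight):
    """Hypothetical image encoder: project each image's feature vector
    into the text-embedding space and average over the input images,
    producing a single token embedding for the concept."""
    projected = [linear(f, weight) for f in image_features]
    return [sum(col) / len(projected) for col in zip(*projected)]


def inject_concept(tokens, embeddings, concept_embedding,
                   placeholder="<concept>"):
    """Swap the placeholder token's embedding for the concept embedding.
    The placeholder name is hypothetical."""
    return [concept_embedding if tok == placeholder else emb
            for tok, emb in zip(tokens, embeddings)]


def gated_adapter(hidden, visual, gate):
    """Hypothetical adapter: add visual features to a hidden state,
    scaled by a gate. With gate == 0 the hidden state is unchanged."""
    return [h + gate * v for h, v in zip(hidden, visual)]


rng = random.Random(0)
weight = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(EMBED_DIM)]
image_features = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(2)]

tokens = ["a", "<concept>", "dog"]
text_embeddings = [[0.0] * EMBED_DIM for _ in tokens]

concept = encode_concept(image_features, weight)
prompt = inject_concept(tokens, text_embeddings, concept)
```

In this sketch only `weight` and the gate would be trained; the text embeddings and the hidden state stand in for the frozen pre-trained model. A gate of zero leaves the hidden state untouched, which is one common way to add layers to a pre-trained network without disturbing it at the start of training.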
