P+: Extended Textual Conditioning in Text-to-Image Generation

03/16/2023
by Andrey Voynov, et al.

We introduce an Extended Textual Conditioning space in text-to-image models, referred to as P+. This space consists of multiple textual conditions, derived from per-layer prompts, each corresponding to a layer of the denoising U-Net of the diffusion model. We show that the extended space provides greater disentangling and control over image synthesis. We further introduce Extended Textual Inversion (XTI), where the images are inverted into P+ and represented by per-layer tokens. We show that XTI is more expressive and precise, and converges faster, than the original Textual Inversion (TI). The extended inversion method does not involve any noticeable trade-off between reconstruction and editability and induces more regular inversions. We conduct a series of extensive experiments to analyze and understand the properties of the new space, and to showcase the effectiveness of our method for personalizing text-to-image models. Furthermore, we utilize the unique properties of this space to achieve previously unattainable results in object-style mixing using text-to-image models. Project page: https://prompt-plus.github.io
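As a rough illustration of the difference between a single shared condition and per-layer conditions, the following is a minimal toy sketch, not the authors' implementation. The names `encode`, `broadcast_prompt`, `per_layer_prompts`, and `toy_denoiser` are hypothetical, the "encoder" is a trivial stand-in for a real text encoder, and the "denoiser" is a scalar stand-in for a U-Net with four conditioned layers.

```python
from typing import List, Sequence

Embedding = List[float]


def encode(prompt: str) -> Embedding:
    """Hypothetical stand-in for a text encoder (assumption, not a real model)."""
    return [float(len(prompt)), float(prompt.count(" "))]


def broadcast_prompt(prompt: str, num_layers: int) -> List[Embedding]:
    """Standard conditioning: every layer receives the same embedding."""
    shared = encode(prompt)
    return [list(shared) for _ in range(num_layers)]


def per_layer_prompts(prompts: Sequence[str]) -> List[Embedding]:
    """Extended conditioning in the spirit of P+: layer i gets prompts[i]."""
    return [encode(p) for p in prompts]


def toy_denoiser(x: float, conditions: Sequence[Embedding]) -> float:
    """Toy stand-in for a U-Net: each 'layer' mixes in its own condition."""
    for cond in conditions:
        x = 0.5 * x + sum(cond) / len(cond)
    return x


# One prompt shared by all four layers.
shared = broadcast_prompt("red cube", num_layers=4)

# A different prompt for the last two layers (illustrative choice only).
mixed = per_layer_prompts(["red cube", "red cube", "green lizard", "green lizard"])

out_shared = toy_denoiser(0.0, shared)
out_mixed = toy_denoiser(0.0, mixed)
```

Repeating one prompt for every layer reproduces the shared-condition case, so the extended space contains the standard one. Which layers receive which prompt here is arbitrary and is not meant to reflect the paper's findings.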


research
05/25/2023

ProSpect: Expanded Conditioning for the Personalization of Attribute-aware Image Generation

Personalizing generative models offers a way to guide image generation w...
research
04/12/2023

Gradient-Free Textual Inversion

Recent works on personalized text-to-image generation usually learn to b...
research
02/09/2023

Is This Loss Informative? Speeding Up Textual Inversion with Deterministic Objective Evaluation

Text-to-image generation models represent the next step of evolution in ...
research
11/23/2022

Inversion-Based Style Transfer with Diffusion Models

The artistic style within a painting is the means of expression, which i...
research
08/02/2022

An Image is Worth One Word: Personalizing Text-to-Image Generation using Textual Inversion

Text-to-image models offer unprecedented freedom to guide creation throu...
research
05/22/2023

LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On

The rapidly evolving fields of e-commerce and metaverse continue to seek...
research
05/24/2023

A Neural Space-Time Representation for Text-to-Image Personalization

A key aspect of text-to-image personalization methods is the manner in w...
