LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On

05/22/2023
by Davide Morelli, et al.

The rapidly evolving fields of e-commerce and the metaverse continue to seek innovative approaches to enhance the consumer experience. At the same time, recent advances in diffusion models have enabled generative networks to create remarkably realistic images. In this context, image-based virtual try-on, which consists of generating a novel image of a target model wearing a given in-shop garment, has yet to capitalize on the potential of these powerful generative solutions. This work introduces LaDI-VTON, the first Latent Diffusion textual Inversion-enhanced model for the Virtual Try-ON task. The proposed architecture relies on a latent diffusion model extended with a novel additional autoencoder module that exploits learnable skip connections to enhance the generation process while preserving the model's characteristics. To effectively maintain the texture and details of the in-shop garment, we propose a textual inversion component that maps the visual features of the garment to the CLIP token embedding space and thus generates a set of pseudo-word token embeddings capable of conditioning the generation process. Experimental results on the Dress Code and VITON-HD datasets demonstrate that our approach outperforms the competitors by a consistent margin, achieving a significant milestone for the task. Source code and trained models are publicly available at: https://github.com/miccunifi/ladi-vton.
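
The textual inversion idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the single linear layer, the random weights, the toy dimensions, and the helper names (`garment_to_pseudo_tokens`, `splice_into_prompt`) are all assumptions made for illustration, and no real CLIP model is involved. It only shows the data flow: one garment feature vector becomes several pseudo-word token embeddings, which then take the place of a placeholder token in a prompt's embedding sequence.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """Random (out_dim x in_dim) weights standing in for a trained mapping (assumption)."""
    return [[rng.gauss(0.0, 0.02) for _ in range(in_dim)] for _ in range(out_dim)]


def apply_linear(weights, vec):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def garment_to_pseudo_tokens(visual_feat, weights, num_tokens, token_dim):
    """Map one garment feature vector to num_tokens pseudo-word embeddings."""
    flat = apply_linear(weights, visual_feat)  # length: num_tokens * token_dim
    return [flat[i * token_dim:(i + 1) * token_dim] for i in range(num_tokens)]


def splice_into_prompt(prompt_embeds, placeholder_index, pseudo_tokens):
    """Replace the placeholder token's embedding with the pseudo-word embeddings."""
    return (prompt_embeds[:placeholder_index]
            + pseudo_tokens
            + prompt_embeds[placeholder_index + 1:])


rng = random.Random(0)
FEAT_DIM, TOKEN_DIM, NUM_TOKENS = 8, 4, 3  # toy sizes, far smaller than a real encoder

weights = make_linear(FEAT_DIM, NUM_TOKENS * TOKEN_DIM, rng)
garment_feat = [rng.random() for _ in range(FEAT_DIM)]  # stand-in for visual features
pseudo = garment_to_pseudo_tokens(garment_feat, weights, NUM_TOKENS, TOKEN_DIM)

# Stand-in for the token embeddings of a 5-token prompt whose last token is a placeholder.
prompt = [[0.0] * TOKEN_DIM for _ in range(5)]
conditioned = splice_into_prompt(prompt, 4, pseudo)
```

In a real system the conditioned embedding sequence would be passed through the text encoder and used to condition the diffusion model; the sketch stops before that step.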

research
08/11/2023

Taming the Power of Diffusion Models for High-Quality Virtual Try-On with Appearance Flow

Virtual try-on is a critical image synthesis task that aims to transfer ...
research
03/27/2023

Zero-Shot Composed Image Retrieval with Textual Inversion

Composed Image Retrieval (CIR) aims to retrieve a target image based on ...
research
04/18/2022

Dress Code: High-Resolution Multi-Category Virtual Try-On

Image-based virtual try-on strives to transfer the appearance of a cloth...
research
01/30/2023

ArchiSound: Audio Generation with Diffusion

The recent surge in popularity of diffusion models for image generation ...
research
03/16/2023

P+: Extended Textual Conditioning in Text-to-Image Generation

We introduce an Extended Textual Conditioning space in text-to-image mod...
research
04/04/2023

Multimodal Garment Designer: Human-Centric Latent Diffusion Models for Fashion Image Editing

Fashion illustration is used by designers to communicate their vision an...
research
04/12/2021

Cloth Interactive Transformer for Virtual Try-On

2D image-based virtual try-on has attracted increased attention from the...
