CompoDiff: Versatile Composed Image Retrieval With Latent Diffusion

03/21/2023
by   Geonmo Gu, et al.

This paper proposes CompoDiff, a novel diffusion-based model that solves Composed Image Retrieval (CIR) with latent diffusion, and presents a newly created dataset of 18 million triplets, each consisting of a reference image, a condition, and the corresponding target image, for training the model. CompoDiff not only achieves a new zero-shot state of the art on CIR benchmarks such as FashionIQ but also enables more versatile CIR by accepting various conditions, such as negative text and image mask conditions, which existing CIR methods do not support. In addition, CompoDiff features lie in the intact CLIP embedding space, so they can be used directly by any existing model that exploits the CLIP space. The code, the training dataset, and the pre-trained weights are available at https://github.com/navervision/CompoDiff
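Because the composed features are said to live in the CLIP embedding space, retrieval can in principle reduce to a nearest-neighbour search over pre-computed image embeddings. The sketch below illustrates only that final lookup step and is not taken from the CompoDiff repository: the function names `l2_normalize` and `retrieve_top_k`, the 512-dimensional feature size, and the random vectors standing in for real CLIP embeddings and for a composed query feature are all assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def retrieve_top_k(query_feature, gallery_features, k=5):
    """Return indices and cosine scores of the k gallery items closest to the query.

    query_feature:    shape (d,), a hypothetical composed feature in CLIP space.
    gallery_features: shape (n, d), pre-computed image embeddings (assumed).
    """
    q = l2_normalize(np.asarray(query_feature, dtype=np.float64))
    g = l2_normalize(np.asarray(gallery_features, dtype=np.float64))
    scores = g @ q                      # cosine similarity per gallery item
    k = min(k, scores.shape[0])
    order = np.argsort(-scores, kind="stable")[:k]  # highest score first
    return order, scores[order]


# Toy data: random vectors stand in for real embeddings (assumption).
rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 512))
# Pretend the composed query lands very close to gallery item 42.
query = gallery[42] + 0.01 * rng.normal(size=512)

indices, scores = retrieve_top_k(query, gallery, k=3)
```

In a real pipeline the gallery would hold embeddings from a CLIP image encoder and the query would come from the trained model; for large galleries an approximate nearest-neighbour index would likely replace the brute-force matrix product.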


research
03/27/2023

Zero-Shot Composed Image Retrieval with Textual Inversion

Composed Image Retrieval (CIR) aims to retrieve a target image based on ...
research
02/06/2023

Pic2Word: Mapping Pictures to Words for Zero-shot Composed Image Retrieval

In Composed Image Retrieval (CIR), a user combines a query image with te...
research
08/22/2023

Composed Image Retrieval using Contrastive Learning and Task-oriented CLIP-based Features

Given a query composed of a reference image and a relative caption, the ...
research
06/12/2023

Zero-shot Composed Text-Image Retrieval

In this paper, we consider the problem of composed image retrieval (CIR)...
research
03/01/2023

Unlimited-Size Diffusion Restoration

Recently, using diffusion models for zero-shot image restoration (IR) ha...
research
05/27/2023

FACTUAL: A Benchmark for Faithful and Consistent Textual Scene Graph Parsing

Textual scene graph parsing has become increasingly important in various...
research
11/05/2021

Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences for Image-Text Retrieval

Matching model is essential for Image-Text Retrieval framework. Existing...
