UIGR: Unified Interactive Garment Retrieval

04/06/2022
by Xiao Han, et al.

Interactive garment retrieval (IGR) aims to retrieve a target garment image based on a reference garment image along with user feedback on what to change on the reference garment. Two IGR tasks have been studied extensively: text-guided garment retrieval (TGR) and visually compatible garment retrieval (VCR). In TGR, the user feedback indicates which semantic attributes to change while the garment category is preserved; in VCR, the category is the only thing explicitly changed, with an implicit requirement that the style be preserved. Despite the similarity between these two tasks and the practical need for an efficient system that tackles both, they have never been unified and modeled jointly. In this paper, we propose a Unified Interactive Garment Retrieval (UIGR) framework to unify TGR and VCR. To this end, we first contribute a large-scale benchmark suited for both problems. We further propose a strong baseline architecture that integrates TGR and VCR in one model. Extensive experiments suggest that unifying the two tasks in one framework is not only more efficient, since it requires only a single model, but also leads to better performance. Code and datasets are available at https://github.com/BrandonHanx/CompFashion.
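
To make the task setup concrete, the following is a minimal toy sketch of how a single retrieval routine could serve both TGR-style and VCR-style queries. It is not the architecture proposed in the paper: the hand-made 3-dimensional embeddings, the additive fusion in `compose`, and all names (`GALLERY`, `REFERENCE`, `TGR_FEEDBACK`, `VCR_FEEDBACK`, `retrieve`) are assumptions made purely for illustration.

```python
import math

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def compose(reference, feedback):
    """Hypothetical fusion: add the feedback embedding to the reference embedding."""
    return normalize([r + f for r, f in zip(reference, feedback)])

def retrieve(reference, feedback, gallery, k=1):
    """Rank gallery items by cosine similarity to the composed query."""
    query = compose(reference, feedback)
    ranked = sorted(gallery, key=lambda name: -dot(query, normalize(gallery[name])))
    return ranked[:k]

# Toy embedding layout (an assumption): [is_top, is_skirt, color],
# where color is -1 for blue and +1 for red.
GALLERY = {
    "red_top": [1, 0, 1],
    "blue_top": [1, 0, -1],
    "red_skirt": [0, 1, 1],
    "blue_skirt": [0, 1, -1],
}
REFERENCE = GALLERY["blue_top"]

# TGR-style feedback: change an attribute (blue -> red), keep the category.
TGR_FEEDBACK = [0, 0, 2]
# VCR-style feedback: change the category (top -> skirt), keep the style.
VCR_FEEDBACK = [-1, 1, 0]

print(retrieve(REFERENCE, TGR_FEEDBACK, GALLERY))  # ['red_top']
print(retrieve(REFERENCE, VCR_FEEDBACK, GALLERY))  # ['blue_skirt']
```

In this sketch the two tasks differ only in the feedback vector, which is the intuition behind handling both with one model. A real system would learn the image and feedback encoders and the fusion function from data.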


