Compositional Learning of Image-Text Query for Image Retrieval

06/19/2020
by Muhammad Umer Anwaar, et al.

In this paper, we investigate the problem of retrieving images from a database based on a multi-modal (image-text) query. Specifically, the query text requests some modification of the query image, and the task is to retrieve images with the desired modifications. For instance, a user of an e-commerce platform wants to buy a dress that looks similar to her friend's dress, but is white and has a ribbon sash. In this case, we would like the algorithm to retrieve dresses that apply the desired modifications to the query dress. We propose an autoencoder-based model, ComposeAE, to learn the composition of the image and text query for retrieving images. We adopt a deep metric learning approach and learn a metric that pushes the composition of the source image and text query closer to the target images. We also propose a rotational symmetry constraint on the optimization problem. Our approach outperforms the state-of-the-art method TIRG <cit.> on three benchmark datasets: MIT-States, Fashion200k and Fashion IQ. To ensure a fair comparison, we introduce strong baselines by enhancing the TIRG method. To ensure reproducibility of the results, we publish our code here: <https://anonymous.4open.science/r/d1babc3c-0e72-448a-8594-b618bae876dc/>.
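
The metric-learning idea in the abstract can be illustrated with a small sketch. This is not the ComposeAE model: the `compose` function below is a hypothetical stand-in (add the two feature vectors and normalize), the feature vectors are made up, and the loss is a generic in-batch softmax cross-entropy, assumed here only to show what "pushing the composed query closer to the target image" can mean in practice. All function names are illustrative.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(u, v):
    """Inner product, used as the similarity between unit vectors."""
    return sum(a * b for a, b in zip(u, v))

def compose(image_feat, text_feat):
    """Hypothetical stand-in composition: add the features, then normalize.
    The learned composition in ComposeAE is more involved than this."""
    return normalize([a + b for a, b in zip(image_feat, text_feat)])

def batch_metric_loss(queries, targets):
    """Softmax cross-entropy over in-batch similarities: composed query i
    should score higher against target i than against the other targets."""
    total = 0.0
    for i, q in enumerate(queries):
        scores = [dot(q, t) for t in targets]
        log_norm = math.log(sum(math.exp(s) for s in scores))
        total += log_norm - scores[i]
    return total / len(queries)

def retrieve(query, database, k=1):
    """Return indices of the k database items most similar to the query."""
    order = sorted(range(len(database)),
                   key=lambda j: dot(query, database[j]),
                   reverse=True)
    return order[:k]

# Toy features (assumed, not taken from any dataset).
q0 = compose([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
q1 = compose([0.0, 0.0, 1.0], [0.0, 1.0, 0.0])
t0 = normalize([1.0, 1.0, 0.0])
t1 = normalize([0.0, 1.0, 1.0])

matched_loss = batch_metric_loss([q0, q1], [t0, t1])
swapped_loss = batch_metric_loss([q0, q1], [t1, t0])
top = retrieve(q0, [normalize([1.0, 0.0, 0.0]), t0, normalize([0.0, 0.0, 1.0])])
```

In this toy setup the loss is lower when each composed query is paired with its own target than when the targets are swapped, which is the behaviour a learned metric is trained to produce. At retrieval time the same similarity is used to rank the database.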


