StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

03/29/2021
by Aneeshan Sain, et al.

Sketch-based image retrieval (SBIR) is a cross-modal matching problem that is typically solved by learning a joint embedding space in which the semantic content shared between the photo and sketch modalities is preserved. However, a fundamental challenge in SBIR has been largely ignored so far: sketches are drawn by humans, and considerable style variation exists among different users. An effective SBIR model needs to account for this style diversity explicitly and, crucially, to generalise to unseen user styles. To this end, a novel style-agnostic SBIR model is proposed. Unlike existing models, it employs a cross-modal variational autoencoder (VAE) to explicitly disentangle each sketch into a semantic content part shared with the corresponding photo and a style part unique to the sketcher. Importantly, to make our model dynamically adaptable to any unseen user style, we propose to meta-train our cross-modal VAE by adding two style-adaptive components: a set of feature transformation layers in its encoder and a regulariser on the disentangled semantic content latent code. With this meta-learning framework, our model not only disentangles the cross-modal shared semantic content for SBIR, but also adapts the disentanglement to any unseen user style, making the SBIR model truly style-agnostic. Extensive experiments show that our style-agnostic model yields state-of-the-art performance for both category-level and instance-level SBIR.
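
As a rough illustration of why retrieving on a disentangled content code is style-agnostic, the sketch below ranks photos using only the content part of a sketch's latent code. It is not the authors' implementation: the latent layout (content dimensions first, style dimensions last), the function names `split_latent` and `rank_photos`, and the use of cosine similarity are all assumptions made for this example, and the encoder that would produce the codes is omitted entirely.

```python
import numpy as np

def l2_normalise(x, eps=1e-12):
    """Scale vectors along the last axis to unit length."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)

def split_latent(z, content_dim):
    """Split a latent code into (content, style) parts.

    Assumed layout (hypothetical): the first `content_dim` entries hold
    semantic content, the remaining entries hold sketcher-specific style.
    """
    return z[..., :content_dim], z[..., content_dim:]

def rank_photos(sketch_latent, photo_embeddings, content_dim):
    """Return photo indices sorted from best to worst match.

    Only the content part of the sketch code is compared with the photo
    embeddings, so the style part cannot influence the ranking.
    """
    content, _style = split_latent(sketch_latent, content_dim)
    query = l2_normalise(content)
    gallery = l2_normalise(photo_embeddings)
    scores = gallery @ query          # cosine similarity per photo
    return np.argsort(-scores)        # highest similarity first

# Two sketches of the same object drawn in different (made-up) styles.
photos = np.eye(3)                                    # toy photo embeddings
content = np.array([0.1, 0.9, 0.2])                   # shared semantic content
sketch_a = np.concatenate([content, [5.0, -3.0]])     # style of user A
sketch_b = np.concatenate([content, [-7.0, 2.0]])     # style of user B

print(rank_photos(sketch_a, photos, content_dim=3))
print(rank_photos(sketch_b, photos, content_dim=3))
```

Both calls produce the same ranking because the style entries are discarded before matching. In practice the hard part is learning an encoder for which this split actually holds, which is what the abstract's disentanglement and meta-training address.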


research
05/28/2017

Cross-modal Subspace Learning for Fine-grained Sketch-based Image Retrieval

Sketch-based image retrieval (SBIR) is challenging due to the inherent d...
research
03/25/2023

Zero-Shot Everything Sketch-Based Image Retrieval, and in Explainable Style

This paper studies the problem of zero-shot sketch-based image retrieva...
research
07/29/2020

Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image Retrieval

Sketch as an image search query is an ideal alternative to text in captu...
research
04/20/2021

CrossATNet - A Novel Cross-Attention Based Framework for Sketch-Based Image Retrieval

We propose a novel framework for cross-modal zero-shot learning (ZSL) in...
research
04/25/2022

SceneTrilogy: On Scene Sketches and its Relationship with Text and Photo

We for the first time extend multi-modal scene understanding to include ...
research
07/26/2021

Towards the Unseen: Iterative Text Recognition by Distilling from Errors

Visual text recognition is undoubtedly one of the most extensively resea...
research
07/10/2016

Learning to Sketch Human Facial Portraits using Personal Styles by Case-Based Reasoning

This paper employs case-based reasoning (CBR) to capture the personal st...
