What Can Human Sketches Do for Object Detection?

03/27/2023
by Pinaki Nath Chowdhury, et al.

Sketches are highly expressive, inherently capturing subjective and fine-grained visual cues. The exploration of such innate properties of human sketches has, however, been limited to image retrieval. In this paper, for the first time, we cultivate the expressiveness of sketches for the fundamental vision task of object detection. The end result is a sketch-enabled object detection framework that detects based on what you sketch: that “zebra” (e.g., one that is eating the grass) in a herd of zebras (instance-aware detection), and only the part (e.g., “head” of a “zebra”) that you desire (part-aware detection). We further dictate that our model works without (i) knowing which category to expect at testing (zero-shot) and (ii) requiring additional bounding boxes (as per fully supervised) or class labels (as per weakly supervised). Instead of devising a model from the ground up, we show an intuitive synergy between foundation models (e.g., CLIP) and existing sketch models built for sketch-based image retrieval (SBIR), which can already elegantly solve the task: CLIP provides model generalisation, and SBIR bridges the (sketch→photo) gap. In particular, we first perform independent prompting on both the sketch and photo branches of an SBIR model to build highly generalisable sketch and photo encoders on the back of the generalisation ability of CLIP. We then devise a training paradigm to adapt the learned encoders for object detection, such that the region embeddings of detected boxes are aligned with the sketch and photo embeddings from SBIR. Evaluated on standard object detection datasets such as PASCAL-VOC and MS-COCO, our framework outperforms both supervised (SOD) and weakly-supervised object detectors (WSOD) on zero-shot setups. Project Page: <https://pinakinathc.github.io/sketch-detect>
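The abstract states that region embeddings of detected boxes are aligned with the sketch and photo embeddings from SBIR. One way to picture what such an alignment could make possible at query time is to rank candidate boxes by their similarity to a query sketch embedding. The snippet below is only a toy illustration under stated assumptions, not the authors' implementation: the function names, the use of cosine similarity, the 3-dimensional embeddings, and the box coordinates are all hypothetical, and in practice the embeddings would come from the learned encoders and detector.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in vector]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    a_unit, b_unit = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_unit, b_unit))


def rank_regions_by_sketch(sketch_embedding, region_embeddings, boxes):
    """Return (box, score) pairs sorted by descending similarity to the sketch.

    Assumption: each region embedding already lives in the same space as the
    sketch embedding, which is what the alignment described above aims for.
    """
    if len(region_embeddings) != len(boxes):
        raise ValueError("need exactly one embedding per box")
    scored = [
        (box, cosine_similarity(sketch_embedding, embedding))
        for box, embedding in zip(boxes, region_embeddings)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: one query sketch and three candidate regions.
sketch = [1.0, 0.0, 0.5]
regions = [
    [0.9, 0.1, 0.4],    # close to the sketch
    [0.0, 1.0, 0.0],    # orthogonal to the sketch
    [-1.0, 0.0, -0.5],  # opposite direction
]
boxes = [(10, 20, 110, 220), (150, 30, 260, 200), (300, 40, 380, 190)]

ranked = rank_regions_by_sketch(sketch, regions, boxes)
best_box, best_score = ranked[0]
print(best_box, round(best_score, 3))
```

In this toy setup the first box ranks highest because its embedding points in nearly the same direction as the sketch embedding. Instance-aware or part-aware behaviour would depend entirely on how well the real encoders separate such regions, which this sketch does not model.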


