Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

by Chenchen Zhu, et al.

Few-shot object detection is an important and long-standing problem because real-world data follows an inherent long-tail distribution. Its performance is largely limited by the scarcity of data for novel classes. The semantic relation between novel classes and base classes, however, is constant regardless of data availability. In this work, we investigate using this semantic relation together with the visual information, and we introduce explicit relation reasoning into the learning of novel object detection. Specifically, we represent each class concept by a semantic embedding learned from a large corpus of text. The detector is trained to project the image representations of objects into this embedding space. We also identify the problems of trivially using the raw embeddings with a heuristic knowledge graph and propose to augment the embeddings with a dynamic relation graph. As a result, our few-shot detector, termed SRR-FSD, is robust and stable to variation in the number of shots of novel objects. Experiments show that SRR-FSD achieves competitive results at higher shots and, more importantly, significantly better performance at lower explicit and implicit shots. The proposed benchmark protocol, with implicit shots removed from the pretrained classification dataset, can serve as a more realistic setting for future research.
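The idea of classifying in a semantic embedding space can be illustrated with a small sketch. It is not the SRR-FSD implementation. It assumes that class scores are dot products between the projected visual feature and each class embedding, and that the dynamic relation graph behaves like a parameter-free attention step over the class embeddings. The abstract states neither detail. All names (`augment_embeddings`, `classify`, `P`, `W`) and the random stand-in data are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(M, v):
    return [dot(row, v) for row in M]


def augment_embeddings(W):
    """Assumed stand-in for a dynamic relation graph.

    Each class embedding attends to all class embeddings (scaled dot-product
    similarity) and adds the weighted mixture back as a residual. A trained
    model would use learned parameters here; this version has none.
    """
    d = len(W[0])
    augmented = []
    for w_i in W:
        weights = softmax([dot(w_i, w_j) / math.sqrt(d) for w_j in W])
        mixed = [sum(a * w_j[k] for a, w_j in zip(weights, W)) for k in range(d)]
        augmented.append([x + y for x, y in zip(w_i, mixed)])
    return augmented


def classify(visual_feat, P, W):
    """Project a visual feature into the embedding space and score classes."""
    z = matvec(P, visual_feat)          # visual feature -> embedding space
    logits = [dot(w, z) for w in W]     # similarity to each class embedding
    return softmax(logits)


# Random stand-ins: real class embeddings would come from a text corpus,
# and P would be learned during detector training.
random.seed(0)
num_classes, embed_dim, visual_dim = 5, 4, 8
W = [[random.gauss(0, 1) for _ in range(embed_dim)] for _ in range(num_classes)]
P = [[random.gauss(0, 1) for _ in range(visual_dim)] for _ in range(embed_dim)]
feat = [random.gauss(0, 1) for _ in range(visual_dim)]

W_aug = augment_embeddings(W)
probs = classify(feat, P, W_aug)
```

In this sketch the class embeddings do not depend on how many training images a class has, so a novel class with few shots still has a well-defined target in the embedding space. Only the projection `P` has to be learned from visual data.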




