Object Relation Detection Based on One-shot Learning

07/16/2018
by Li Zhou, et al.

Detecting relations among objects, such as "cat on sofa" and "person ride horse", is a crucial task in image understanding and helps bridge the semantic gap between images and natural language. Despite the remarkable progress of deep learning in detecting and recognizing individual objects, localizing and recognizing the relations between objects remains challenging because of the combinatorial complexity of the possible object relations. Inspired by recent advances in one-shot learning, we propose a simple yet effective Semantics Induced Learner (SIL) model for this task. Learning in a one-shot manner enables a detection model to adapt effectively and robustly to a huge number of object relations with diverse appearance. In addition, the SIL combines bottom-up and top-down attention mechanisms, enabling attention at both the visual and the semantic level. Within our proposed model, the bottom-up mechanism, which is based on Faster R-CNN, proposes object regions, and the top-down mechanism selects and integrates visual features according to semantic information. Experiments on two large-scale data sets for object relation detection demonstrate the effectiveness of our framework over other state-of-the-art methods.
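The abstract does not give the details of the top-down mechanism, so the following is only a minimal sketch of the general idea of selecting and integrating region features according to a semantic signal. It assumes dot-product scoring followed by a softmax, which is a common attention formulation and not necessarily the one used in SIL. The names `top_down_attention`, `region_features`, and `semantic_query` are hypothetical, and the toy vectors stand in for features that a region proposer such as Faster R-CNN would supply.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_down_attention(region_features, semantic_query):
    """Weight region features by dot-product similarity to a semantic query.

    ASSUMPTION: dot-product + softmax attention is an illustrative choice,
    not the mechanism confirmed by the abstract.

    region_features: list of feature vectors, one per proposed region.
    semantic_query:  vector of the same dimension encoding relation semantics.
    Returns (weights, attended), where attended is the weighted feature sum.
    """
    scores = [
        sum(f * q for f, q in zip(feat, semantic_query))
        for feat in region_features
    ]
    weights = softmax(scores)
    dim = len(semantic_query)
    attended = [
        sum(w * feat[d] for w, feat in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, attended


# Toy example: three regions with 2-D features and a query along the first axis.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [2.0, 0.0]
weights, attended = top_down_attention(regions, query)
```

In this toy setup the first region aligns best with the query and therefore receives the largest weight, so the integrated feature leans towards it.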


research
02/27/2017

Visual Translation Embedding Network for Visual Relation Detection

Visual relations, such as "person ride bike" and "bike next to car", off...
research
06/05/2019

Baby steps towards few-shot learning with multiple semantics

Learning from one or few visual examples is one of the key capabilities ...
research
09/26/2020

Few-shot Object Detection with Self-adaptive Attention Network for Remote Sensing Images

In remote sensing field, there are many applications of object detection...
research
03/19/2021

ClawCraneNet: Leveraging Object-level Relation for Text-based Video Segmentation

Text-based video segmentation is a challenging task that segments out th...
research
01/10/2019

Multi-Granularity Reasoning for Social Relation Recognition from Images

Discovering social relations in images can make machines better interpre...
research
09/28/2019

Meta R-CNN : Towards General Solver for Instance-level Low-shot Learning

Resembling the rapid learning capability of human, low-shot learning emp...
research
01/06/2021

RethNet: Object-by-Object Learning for Detecting Facial Skin Problems

Semantic segmentation is a hot topic in computer vision where the most c...
