ConsNet: Learning Consistency Graph for Zero-Shot Human-Object Interaction Detection

08/14/2020
by Ye Liu, et al.

We consider the problem of Human-Object Interaction (HOI) detection, which aims to locate and recognize HOI instances in the form of <human, action, object> in images. Most existing works treat HOIs as individual interaction categories and thus cannot handle the long-tail distribution and polysemy of action labels. We argue that multi-level consistencies among objects, actions and interactions are strong cues for generating semantic representations of rare or previously unseen HOIs. Leveraging the compositional and relational peculiarities of HOI labels, we propose ConsNet, a knowledge-aware framework that explicitly encodes the relations among objects, actions and interactions into an undirected graph called a consistency graph, and exploits Graph Attention Networks (GATs) to propagate knowledge among HOI categories as well as their constituents. Our model takes visual features of candidate human-object pairs and word embeddings of HOI labels as inputs, maps them into a visual-semantic joint embedding space, and obtains detection results by measuring their similarities. We extensively evaluate our model on the challenging V-COCO and HICO-DET datasets, and the results validate that our approach outperforms state-of-the-art methods under both fully-supervised and zero-shot settings.
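
The two ideas named in the abstract, an undirected graph over objects, actions and interactions, and detection by similarity in a joint embedding space, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names (`build_consistency_graph`, `cosine`, `rank_hois`), the choice to connect each interaction only to its own action and object, the use of cosine similarity, and the toy vectors are all assumptions made for illustration. The GAT propagation step is omitted entirely.

```python
import math


def build_consistency_graph(hoi_labels):
    """Build an undirected adjacency map over object, action and HOI nodes.

    Assumption: each HOI node is linked only to its own action and object.
    The actual edge definition in ConsNet may differ.
    """
    adj = {}

    def link(a, b):
        # Store the edge in both directions so the graph is undirected.
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    for action, obj in hoi_labels:
        hoi = ("hoi", action, obj)
        link(hoi, ("action", action))
        link(hoi, ("object", obj))
    return adj


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_hois(visual_embedding, semantic_embeddings):
    """Rank HOI labels by similarity between a visual and each semantic embedding."""
    return sorted(
        semantic_embeddings,
        key=lambda label: cosine(visual_embedding, semantic_embeddings[label]),
        reverse=True,
    )


if __name__ == "__main__":
    # Hypothetical labels and hand-made 2-D embeddings, for illustration only.
    graph = build_consistency_graph(
        [("ride", "bicycle"), ("ride", "horse"), ("feed", "horse")]
    )
    print(sorted(graph[("action", "ride")]))
    print(rank_hois([0.9, 0.1], {"ride horse": [1.0, 0.0], "feed horse": [0.0, 1.0]}))
```

In this sketch, "ride" is shared by two interactions, so the action node becomes a path between them. That shared structure is what would let knowledge about a frequent interaction reach a rare or unseen one once a propagation step is added.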


research
09/02/2020

Zero-Shot Human-Object Interaction Recognition via Affordance Graphs

We propose a new approach for Zero-Shot Human-Object Interaction Recogni...
research
05/28/2017

Care about you: towards large-scale human-centric visual relationship detection

Visual relationship detection aims to capture interactions between pairs...
research
07/28/2016

SEMBED: Semantic Embedding of Egocentric Action Videos

We present SEMBED, an approach for embedding an egocentric object intera...
research
01/13/2020

Classifying All Interacting Pairs in a Single Shot

In this paper, we introduce a novel human interaction detection approach...
research
12/05/2019

Zero-Shot Generation of Human-Object Interaction Videos

Generation of videos of complex scenes is an important open problem in c...
research
01/07/2020

Visual-Semantic Graph Attention Network for Human-Object Interaction Detection

In scene understanding, machines benefit from not only detecting individ...
research
03/08/2017

Deep Variation-structured Reinforcement Learning for Visual Relationship and Attribute Detection

Despite progress in visual perception tasks such as image classification...
