FGAHOI: Fine-Grained Anchors for Human-Object Interaction Detection

01/08/2023
by   Shuailei Ma, et al.

Human-Object Interaction (HOI) detection, an important problem in computer vision, requires locating human-object pairs and identifying the interactive relationships between them. An HOI instance spans a greater range in space, scale, and task than an individual object instance, making its detection more susceptible to noisy backgrounds. To alleviate the disturbance of noisy backgrounds on HOI detection, it is necessary to use the input image information to generate fine-grained anchors, which are then leveraged to guide the detection of HOI instances. However, this is challenging for two reasons: (i) how to extract pivotal features from images with complex background information remains an open question, and (ii) how to semantically align the extracted features with the query embeddings is also difficult. In this paper, a novel end-to-end transformer-based framework (FGAHOI) is proposed to alleviate these problems. FGAHOI comprises three dedicated components: multi-scale sampling (MSS), hierarchical spatial-aware merging (HSAM), and a task-aware merging mechanism (TAM). MSS extracts features of humans, objects, and interaction areas from noisy backgrounds for HOI instances of various scales. HSAM and TAM then semantically align and merge the extracted features and query embeddings, first from the hierarchical spatial perspective and then from the task perspective. In addition, a novel Stage-wise Training Strategy is designed to reduce the training pressure caused by the overly complex tasks that FGAHOI performs. Finally, we propose two ways to measure the difficulty of HOI detection and a novel dataset, HOI-SDC, targeting two challenges of HOI instance detection: Uneven Distributed Area in Human-Object Pairs and Long Distance Visual Modeling of Human-Object Pairs.


research
12/16/2021

QAHOI: Query-Based Anchors for Human-Object Interaction Detection

Human-object interaction (HOI) detection as a downstream of object detec...
research
03/28/2022

MSTR: Multi-Scale Transformer for End-to-End Human-Object Interaction Detection

Human-Object Interaction (HOI) detection is the task of identifying a se...
research
04/17/2023

DETR-based Layered Clothing Segmentation and Fine-Grained Attribute Recognition

Clothing segmentation and fine-grained attribute recognition are challen...
research
03/29/2023

MuRAL: Multi-Scale Region-based Active Learning for Object Detection

Obtaining large-scale labeled object detection dataset can be costly and...
research
09/27/2020

Human-Object Interaction Detection:A Quick Survey and Examination of Methods

Human-object interaction detection is a relatively new task in the world...
research
04/14/2018

HyperFusion-Net: Densely Reflective Fusion for Salient Object Detection

Salient object detection (SOD), which aims to find the most important re...
research
03/09/2021

QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information

We propose a simple, intuitive yet powerful method for human-object inte...
