PPDM: Parallel Point Detection and Matching for Real-time Human-Object Interaction Detection

by Yue Liao, et al.

We propose a single-stage Human-Object Interaction (HOI) detection method that outperforms all existing methods on the HICO-DET dataset while running at 37 fps on a single Titan XP GPU. It is the first real-time HOI detection method. Conventional HOI detection methods are composed of two stages, i.e., human-object proposal generation and proposal classification. Their effectiveness and efficiency are limited by this sequential and separate architecture. In this paper, we propose a Parallel Point Detection and Matching (PPDM) HOI detection framework. In PPDM, an HOI is defined as a point triplet <human point, interaction point, object point>. The human and object points are the centers of their detection boxes, and the interaction point is the midpoint of the human and object points. PPDM contains two parallel branches, namely a point detection branch and a point matching branch. The point detection branch predicts the three points. Simultaneously, the point matching branch predicts two displacements from the interaction point to its corresponding human and object points. A human point and an object point that originate from the same interaction point are considered a matched pair. In our novel parallel architecture, the interaction points implicitly provide context and regularization for human and object detection. Isolated detection boxes that are unlikely to form meaningful HOI triplets are suppressed, which increases the precision of HOI detection. Moreover, the matching between human and object detection boxes is applied only around a limited number of filtered candidate interaction points, which saves considerable computational cost. Additionally, we build a new application-oriented database named HOI-A, which serves as a good supplement to the existing datasets. The source code and the dataset will be made publicly available to facilitate the development of HOI detection.
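The matching step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the function names (`midpoint`, `nearest`, `match_triplets`), the toy coordinates, and the plain nearest-center rule are all assumptions made for illustration, and the paper's actual matching may also use scores or thresholds that the abstract does not specify.

```python
import math

def midpoint(p, q):
    """Interaction point: midpoint of a human center and an object center."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)

def nearest(target, candidates):
    """Index of the candidate center closest to target (Euclidean distance)."""
    return min(range(len(candidates)),
               key=lambda i: math.dist(target, candidates[i]))

def match_triplets(interaction_points, human_disp, object_disp,
                   human_centers, object_centers):
    """For each interaction point, add the two predicted displacements and
    pick the nearest detected human and object centers (assumed rule)."""
    pairs = []
    for (ix, iy), (hdx, hdy), (odx, ody) in zip(
            interaction_points, human_disp, object_disp):
        h_idx = nearest((ix + hdx, iy + hdy), human_centers)
        o_idx = nearest((ix + odx, iy + ody), object_centers)
        pairs.append((h_idx, o_idx))
    return pairs

# Toy data (hypothetical): two humans, two objects, two interactions.
human_centers = [(10.0, 10.0), (50.0, 40.0)]
object_centers = [(30.0, 10.0), (50.0, 80.0)]
interaction_points = [midpoint(human_centers[0], object_centers[0]),
                      midpoint(human_centers[1], object_centers[1])]

# Slightly noisy displacements, standing in for the matching branch output.
human_disp = [(-9.0, 1.0), (1.0, -19.0)]
object_disp = [(9.0, -1.0), (-1.0, 19.0)]

pairs = match_triplets(interaction_points, human_disp, object_disp,
                       human_centers, object_centers)
print(pairs)
```

Because matching runs once per candidate interaction point rather than over every human-object pair, its cost in this sketch grows with the number of interaction points kept after filtering, which is the saving the abstract refers to.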




Mining the Benefits of Two-stage and One-stage HOI Detection

Two-stage methods have dominated Human-Object Interaction (HOI) detectio...

QAHOI: Query-Based Anchors for Human-Object Interaction Detection

Human-object interaction (HOI) detection as a downstream of object detec...

Scale-Aware Trident Networks for Object Detection

Scale variation is one of the key challenges in object detection. In thi...

Real-time 3D object proposal generation and classification under limited processing resources

The task of detecting 3D objects is important to various robotic applica...

Human-Object Interaction Detection via Disentangled Transformer

Human-Object Interaction Detection tackles the problem of joint localiza...

Improving Human-Object Interaction Detection via Phrase Learning and Label Composition

Human-Object Interaction (HOI) detection is a fundamental task in high-l...

CenterAtt: Fast 2-stage Center Attention Network

In this technical report, we introduce the methods of HIKVISION_LiDAR_De...
