Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning

03/02/2023
by Bo Wan, et al.

Human-object interaction (HOI) detection plays a crucial role in human-centric scene understanding and serves as a fundamental building block for many vision tasks. One generalizable and scalable strategy for HOI detection is weak supervision, that is, learning from image-level annotations only. This is inherently challenging because human-object associations are ambiguous, the search space for detecting HOIs is large, and the training signal is highly noisy. A promising way to address these challenges is to exploit knowledge from large-scale pretrained models (e.g., CLIP), but a direct knowledge distillation strategy does not perform well in the weakly supervised setting. In contrast, we develop a CLIP-guided HOI representation that incorporates this prior knowledge at both the image level and the HOI instance level, and we adopt a self-taught mechanism to prune incorrect human-object associations. Experimental results on HICO-DET and V-COCO show that our method outperforms previous work by a sizable margin, demonstrating the efficacy of our HOI representation.
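To make the weak-supervision setting concrete, the following is a minimal sketch of how image-level labels alone might supervise per-pair HOI predictions. It assumes a generic multiple-instance formulation (max pooling over candidate human-object pairs followed by binary cross-entropy); this is an illustrative assumption, not necessarily the aggregation or loss used in the paper. The function names `image_level_scores` and `bce_loss` and the toy scores are hypothetical.

```python
import math


def image_level_scores(pair_scores):
    """Max-pool per-pair HOI probabilities into one score per HOI class.

    pair_scores: list of candidate human-object pairs, each a list of
    per-class probabilities (assumed to come from some pair classifier).
    """
    num_classes = len(pair_scores[0])
    return [max(pair[c] for pair in pair_scores) for c in range(num_classes)]


def bce_loss(image_scores, image_labels, eps=1e-7):
    """Binary cross-entropy between pooled scores and image-level labels."""
    total = 0.0
    for p, y in zip(image_scores, image_labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(image_labels)


# Hypothetical toy input: 3 candidate human-object pairs, 2 HOI classes.
pair_scores = [
    [0.9, 0.1],
    [0.2, 0.3],
    [0.1, 0.05],
]
image_labels = [1, 0]  # only image-level annotation: class 0 is present

scores = image_level_scores(pair_scores)
loss = bce_loss(scores, image_labels)
print(scores, round(loss, 4))
```

In this sketch only the highest-scoring pair per class receives a gradient signal, which illustrates why ambiguous human-object associations and noisy supervision are a concern in this setting: the label never says which pair is responsible for the interaction.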

research
03/09/2023

Weakly-Supervised HOI Detection from Interaction Labels Only and Language/Vision-Language Priors

Human-object interaction (HOI) detection aims to extract interacting hum...
research
10/07/2021

Weakly Supervised Human-Object Interaction Detection in Video via Contrastive Spatiotemporal Regions

We introduce the task of weakly supervised learning for detecting human ...
research
03/03/2017

Bridging Saliency Detection to Weakly Supervised Object Detection Based on Self-paced Curriculum Learning

Weakly-supervised object detection (WOD) is a challenging problem in co...
research
07/26/2019

Distill-to-Label: Weakly Supervised Instance Labeling Using Knowledge Distillation

Weakly supervised instance labeling using only image-level labels, in li...
research
10/05/2018

Weakly Supervised Object Detection in Artworks

We propose a method for the weakly supervised detection of objects in pa...
research
09/10/2023

Exploiting CLIP for Zero-shot HOI Detection Requires Knowledge Distillation at Multiple Levels

In this paper, we investigate the task of zero-shot human-object interac...
research
08/29/2018

Interact as You Intend: Intention-Driven Human-Object Interaction Detection

The recent advances in instance-level detection tasks lay strong foundat...
