Weakly-Supervised Temporal Action Detection for Fine-Grained Videos with Hierarchical Atomic Actions

07/24/2022
by Zhi Li, et al.

Action understanding has entered the era of fine granularity, since most human behaviors in real life differ only in minor ways. To detect these fine-grained actions accurately and label-efficiently, we tackle, for the first time, the problem of weakly-supervised fine-grained temporal action detection in videos. Previous weakly-supervised models for general action detection are not designed to capture the subtle differences between fine-grained actions, and therefore do not perform well in the fine-grained setting. We propose to model actions as combinations of reusable atomic actions, which are discovered automatically from data through self-supervised clustering, in order to capture both the commonality and the individuality of fine-grained actions. The learnt atomic actions, represented by visual concepts, are further mapped to fine and coarse action labels by leveraging the semantic label hierarchy. Our approach constructs a visual representation hierarchy of four levels: clip level, atomic action level, fine action class level, and coarse action class level, with supervision at each level. Extensive experiments on two large-scale fine-grained video datasets, FineAction and FineGym, show the benefit of the proposed weakly-supervised model for fine-grained action detection, which achieves state-of-the-art results.
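
The four-level hierarchy described above (clip, atomic action, fine class, coarse class) can be illustrated with a small sketch. This is not the authors' implementation: plain k-means stands in for the self-supervised clustering, the `atomic_to_fine` weights are random placeholders for a learned mapping, and all function names, shapes, and the toy `fine_to_coarse` label hierarchy are assumptions made for illustration only.

```python
import numpy as np

def assign_clips(features, centers):
    """Assign each clip feature to its nearest center (its 'atomic action')."""
    dists = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(features, k, iters=20, seed=0):
    """Plain k-means, used here only as a stand-in for self-supervised clustering."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        assign = assign_clips(features, centers)
        for j in range(k):
            members = features[assign == j]
            if len(members) > 0:  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def video_hierarchy(clip_features, centers, atomic_to_fine, fine_to_coarse):
    """Build clip -> atomic -> fine -> coarse representations for one video."""
    # Level 1 -> 2: each clip is mapped to one atomic action.
    assign = assign_clips(clip_features, centers)
    # Video-level atomic-action histogram (fraction of clips per atomic action).
    atomic_hist = np.bincount(assign, minlength=len(centers)) / len(assign)
    # Level 2 -> 3: hypothetical linear map from atomic actions to fine classes.
    logits = atomic_hist @ atomic_to_fine
    exp = np.exp(logits - logits.max())
    fine_scores = exp / exp.sum()
    # Level 3 -> 4: sum fine-class scores within each coarse parent class.
    coarse_scores = np.zeros(fine_to_coarse.max() + 1)
    np.add.at(coarse_scores, fine_to_coarse, fine_scores)
    return assign, atomic_hist, fine_scores, coarse_scores

# Toy data: 200 training clips and one 30-clip video, 8-dim features (all synthetic).
rng = np.random.default_rng(0)
train_clips = rng.normal(size=(200, 8))
centers = kmeans(train_clips, k=4)
video_clips = rng.normal(size=(30, 8))
atomic_to_fine = rng.normal(size=(4, 6))        # placeholder for learned weights
fine_to_coarse = np.array([0, 0, 0, 1, 1, 1])   # toy hierarchy: 6 fine, 2 coarse classes

assign, atomic_hist, fine_scores, coarse_scores = video_hierarchy(
    video_clips, centers, atomic_to_fine, fine_to_coarse
)
```

In this sketch the fine and coarse scores are video-level quantities, which is the level at which labels are typically available under weak supervision; the per-clip `assign` array is what a detector would then use to localize actions in time.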


