Uncertainty-Aware Weakly Supervised Action Detection from Untrimmed Videos

07/21/2020
by Anurag Arnab, et al.

Despite recent advances in video classification, progress in spatio-temporal action recognition has lagged behind. A major contributing factor has been the prohibitive cost of annotating videos frame by frame. In this paper, we present a spatio-temporal action recognition model that is trained with only video-level labels, which are significantly easier to annotate. Our method uses per-frame person detectors, trained on large image datasets, within a Multiple Instance Learning (MIL) framework. The standard MIL assumption is that each bag contains at least one instance with the specified label. We show how to apply our method in cases where this assumption is invalid, using a novel probabilistic variant of MIL in which we estimate the uncertainty of each prediction. Furthermore, we report the first weakly-supervised results on the AVA dataset and state-of-the-art results among weakly-supervised methods on UCF101-24.
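
To make the Multiple Instance Learning idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a bag is the set of person detections in a video clip, that each detection receives a probability for one action class, and that the bag score is the maximum over instances (a common MIL aggregation). The uncertainty term is also an assumption: it uses a generic learned loss-attenuation form, exp(-s) * loss + s, where s is a hypothetical predicted log-variance. The abstract does not specify the formulation the paper actually uses. All function names are illustrative.

```python
import math


def bag_probability(instance_probs):
    """Max-pooling MIL aggregation: the bag is as positive as its most confident instance."""
    return max(instance_probs)


def binary_cross_entropy(p, label, eps=1e-7):
    """Cross-entropy between a predicted probability p and a 0/1 label."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def mil_loss(instance_probs, bag_label):
    """Standard MIL loss: only the video-level (bag) label supervises the instances."""
    return binary_cross_entropy(bag_probability(instance_probs), bag_label)


def uncertainty_weighted_mil_loss(instance_probs, bag_label, log_variance):
    """Assumed attenuation form: exp(-s) * loss + s, with s a predicted log-variance.

    A large s down-weights a bag whose label may not apply to any instance,
    while the additive s term penalises claiming uncertainty everywhere.
    """
    return math.exp(-log_variance) * mil_loss(instance_probs, bag_label) + log_variance


# A bag where one detection clearly performs the labelled action.
clean_bag = [0.1, 0.2, 0.9]
# A bag labelled positive in which no detection matches the label,
# so the standard MIL assumption is violated.
noisy_bag = [0.05, 0.1, 0.08]

print(mil_loss(clean_bag, 1))
print(mil_loss(noisy_bag, 1))
print(uncertainty_weighted_mil_loss(noisy_bag, 1, log_variance=1.0))
```

In this sketch the noisy bag incurs a large standard MIL loss, and a positive log-variance reduces its contribution. This illustrates, under the stated assumptions, why estimating uncertainty can help when a bag contains no instance with the specified label.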

research
02/09/2020

Weakly-Supervised Multi-Person Action Recognition in 360° Videos

The recent development of commodity 360° cameras have enabled a single ...
research
06/05/2020

WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos

Online action detection in untrimmed videos aims to identify an action a...
research
05/02/2019

Large-scale weakly-supervised pre-training for video action recognition

Current fully-supervised video datasets consist of only a few hundred th...
research
12/03/2017

Multimodal Visual Concept Learning with Weakly Supervised Techniques

Despite the availability of a huge amount of video data accompanied by d...
research
05/06/2019

Spatio-Temporal Action Localization in a Weakly Supervised Setting

Enabling computational systems with the ability to localize actions in v...
research
07/09/2018

Step-by-step Erasion, One-by-one Collection: A Weakly Supervised Temporal Action Detector

Weakly supervised temporal action detection is a Herculean task in under...
research
11/02/2022

Learning a Condensed Frame for Memory-Efficient Video Class-Incremental Learning

Recent incremental learning for action recognition usually stores repres...
