Weakly-Supervised Action Localization with Expectation-Maximization Multi-Instance Learning

03/31/2020
by Zhekun Luo, et al.

The weakly-supervised action localization problem requires training a model to localize action segments in a video given only video-level action labels. It can be solved under the Multiple Instance Learning (MIL) framework, where a bag (video) contains multiple instances (action segments). Since only the bag's label is known, the main challenge is to determine which key instances within the bag trigger the bag's label. Most previous models use an attention-based approach: they use attention to generate the bag's representation from its instances and then train it via bag-level classification. In this work, we explicitly model the key-instance assignment as a hidden variable and adopt an Expectation-Maximization (EM) framework. We derive two pseudo-label generation schemes to model the E and M steps and iteratively optimize the likelihood lower bound. We also show that previous attention-based models implicitly violate the MIL assumption that instances in negative bags should be uniformly negative. In comparison, our EM-MIL approach models the MIL assumptions more accurately. Our model achieves state-of-the-art performance on two standard benchmarks, THUMOS14 and ActivityNet1.2, and is better at detecting relatively complete action boundaries in videos containing multiple actions.
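
The alternation between inferring hidden key instances and refitting the model can be illustrated with a generic hard-EM loop for MIL. The sketch below is not the authors' EM-MIL model and does not reproduce their two pseudo-label schemes. It assumes a single logistic instance scorer, synthetic 2-D bags instead of video features, and hard 0/1 assignments. All names (`make_bags`, `e_step`, `m_step`, `fit_em_mil`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(x):
    # Clip to keep np.exp from overflowing on extreme scores.
    return 1.0 / (1.0 + np.exp(-np.clip(x, -50.0, 50.0)))


def make_bags(rng, n_bags=40, bag_size=20, n_key=5, dim=2, shift=3.0):
    """Synthetic bags (an assumption for illustration, not a video benchmark).

    Odd-numbered bags are positive and hide n_key shifted key instances.
    """
    bags, labels, key_masks = [], [], []
    for i in range(n_bags):
        x = rng.normal(size=(bag_size, dim))
        mask = np.zeros(bag_size, dtype=bool)
        if i % 2 == 1:
            mask[rng.choice(bag_size, size=n_key, replace=False)] = True
            x[mask] += shift
        bags.append(x)
        labels.append(i % 2)
        key_masks.append(mask)
    return bags, np.array(labels), key_masks


def e_step(bags, labels, w, b):
    """Assign hard pseudo-labels z to every instance from the current scorer."""
    z = []
    for x, y in zip(bags, labels):
        if y == 0:
            # MIL assumption: every instance in a negative bag is negative.
            z.append(np.zeros(len(x)))
        else:
            p = sigmoid(x @ w + b)
            zi = (p > 0.5).astype(float)
            # MIL assumption: a positive bag has at least one key instance.
            zi[np.argmax(p)] = 1.0
            z.append(zi)
    return z


def m_step(bags, z, w, b, lr=0.5, steps=300):
    """Refit the logistic scorer to the pseudo-labels by gradient descent."""
    X = np.vstack(bags)
    t = np.concatenate(z)
    for _ in range(steps):
        g = sigmoid(X @ w + b) - t  # gradient of the log-loss w.r.t. the logits
        w = w - lr * (X.T @ g) / len(t)
        b = b - lr * g.mean()
    return w, b


def fit_em_mil(bags, labels, rounds=10):
    w = np.zeros(bags[0].shape[1])
    b = 0.0
    # Initialisation: every instance inherits its bag's label.
    z = [np.full(len(x), float(y)) for x, y in zip(bags, labels)]
    for _ in range(rounds):
        w, b = m_step(bags, z, w, b)
        z = e_step(bags, labels, w, b)
    return w, b, z
```

Calling `fit_em_mil(*make_bags(np.random.default_rng(0))[:2])` returns the scorer parameters and the final per-instance assignments. In this toy setting the negative bags act as an anchor: because their instances are always labeled negative, background instances in positive bags are pushed toward the negative class over successive rounds.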


research
03/29/2022

ASM-Loc: Action-aware Segment Modeling for Weakly-Supervised Temporal Action Localization

Weakly-supervised temporal action localization aims to recognize and loc...
research
07/20/2020

MINI-Net: Multiple Instance Ranking Network for Video Highlight Detection

We address the weakly supervised video highlight detection problem for l...
research
09/07/2020

Sparse Network Inversion for Key Instance Detection in Multiple Instance Learning

Multiple Instance Learning (MIL) involves predicting a single label for ...
research
12/15/2018

Weakly supervised segment annotation via expectation kernel density estimation

Since the labelling for the positive images/videos is ambiguous in weakl...
research
04/08/2019

Weakly Supervised Person Re-identification: Cost-effective Learning with A New Benchmark

Person re-identification (ReID) benefits greatly from the accurate annot...
research
06/18/2019

A Weakly Supervised Learning Based Clustering Framework

A weakly supervised learning based clustering framework is proposed in t...
research
04/20/2022

Interventional Multi-Instance Learning with Deconfounded Instance-Level Prediction

When applying multi-instance learning (MIL) to make predictions for bags...
