Temporal Action Segmentation with High-level Complex Activity Labels

08/15/2021
by Guodong Ding, et al.

Over the past few years, success in action recognition on short trimmed videos has prompted further investigation into the temporal segmentation of actions in long untrimmed videos. Recently, supervised approaches have achieved excellent performance in segmenting complex human actions in untrimmed videos. However, besides action labels, such approaches also require the start and end points of each action, which are expensive and tedious to collect. In this paper, we aim to learn action segments using only high-level activity labels as input. In the setting where no action-level supervision is provided, Hungarian matching is commonly used to find a mapping between predicted segments and ground-truth actions in order to evaluate the model and report its performance. On the one hand, we show that with high-level supervision we can generalize the Hungarian matching from the current video and activity levels to the global level; this extended global-level matching allows actions to be shared across activities. On the other hand, we propose a novel action discovery framework that automatically discovers constituent actions in videos through the activity classification task. Specifically, we define a finite number of prototypes that form a dual representation of a video sequence; these collectively learned prototypes are regarded as the discovered actions. This classification setting endows our approach with the capability of discovering actions that are potentially shared across multiple complex activities. Extensive experiments demonstrate that the discovered actions are helpful for both temporal action segmentation and activity recognition.
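As background for the evaluation protocol mentioned above: when no action-level supervision is given, discovered segment IDs carry no fixed meaning, so they are mapped to ground-truth action labels via Hungarian matching before computing accuracy. The sketch below is illustrative only (function names and data are hypothetical, not from the paper) and uses SciPy's `linear_sum_assignment` to find the label mapping that maximizes frame-level overlap.

```python
# Illustrative sketch of Hungarian-matching evaluation for unsupervised
# temporal action segmentation. All names and data here are hypothetical.
import numpy as np
from scipy.optimize import linear_sum_assignment

def hungarian_match_accuracy(pred, gt, n_clusters, n_classes):
    """Map each discovered cluster ID to a ground-truth action label so that
    total frame-level overlap is maximized, then return frame accuracy."""
    # Overlap matrix: overlap[c, k] = number of frames with cluster c and label k.
    overlap = np.zeros((n_clusters, n_classes), dtype=np.int64)
    for c, k in zip(pred, gt):
        overlap[c, k] += 1
    # Hungarian matching maximizes total overlap (minimize negated overlap).
    rows, cols = linear_sum_assignment(-overlap)
    mapping = dict(zip(rows, cols))
    remapped = np.array([mapping.get(c, -1) for c in pred])
    return float((remapped == np.asarray(gt)).mean())

# Frame-wise cluster assignments vs. ground-truth labels for one toy video.
pred = [0, 0, 1, 1, 2, 2]
gt   = [1, 1, 0, 0, 2, 2]
acc = hungarian_match_accuracy(pred, gt, n_clusters=3, n_classes=3)
# Here clusters 0->1, 1->0, 2->2 gives perfect overlap, so acc == 1.0.
```

In the video- or activity-level protocol, this matching is computed per video or per activity; the global-level matching the paper proposes instead computes one assignment over all videos, which is what permits a single discovered action to map to the same ground-truth action across different activities.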


Related research

12/02/2016 · Unsupervised Human Action Detection by Action Matching
04/09/2019 · Action Recognition from Single Timestamp Supervision in Untrimmed Videos
07/21/2015 · Every Moment Counts: Dense Detailed Labeling of Actions in Complex Videos
03/26/2022 · Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos
09/30/2017 · Unsupervised Segmentation of Action Segments in Egocentric Videos using Gaze
03/11/2016 · Watch-n-Patch: Unsupervised Learning of Actions and Relations
06/10/2022 · ProActive: Self-Attentive Temporal Point Process Flows for Activity Sequences
