TAEC: Unsupervised Action Segmentation with Temporal-Aware Embedding and Clustering

by Wei Lin, et al.
IG Farben Haus
Max Planck Society
TU Graz

Temporal action segmentation in untrimmed videos has gained increased attention recently. However, annotating action classes and frame-wise boundaries is extremely time-consuming and cost-intensive, especially on large-scale datasets. To address this issue, we propose an unsupervised approach for learning action classes from untrimmed video sequences. In particular, we propose a temporal embedding network that combines relative time prediction, feature reconstruction, and sequence-to-sequence learning to preserve the spatial layout and sequential nature of the video features. A two-step clustering pipeline on these embedded feature representations then allows us to enforce temporal consistency within, as well as across, videos. Based on the identified clusters, we decode the video into coherent temporal segments that correspond to semantically meaningful action classes. Our evaluation on three challenging datasets shows the impact of each component and demonstrates state-of-the-art results in unsupervised action segmentation.
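To make two of the ingredients named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes that relative time prediction regresses each frame's normalized position in its video, that the reconstruction term is a mean squared error, and that the two terms are summed with a hypothetical weight `recon_weight`. The sequence-to-sequence component and the two-step clustering are omitted. The helper `labels_to_segments` is likewise only an illustration of turning per-frame cluster labels into contiguous segments; the decoding used in the paper may differ. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def relative_time_targets(num_frames: int) -> List[float]:
    """Normalized position of each frame in its video, from 0.0 to 1.0."""
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if num_frames == 1:
        return [0.0]
    return [t / (num_frames - 1) for t in range(num_frames)]


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equally long sequences."""
    if len(pred) != len(target) or not pred:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(
    time_pred: Sequence[float],
    features: Sequence[Sequence[float]],
    reconstruction: Sequence[Sequence[float]],
    recon_weight: float = 1.0,  # assumed weighting, not taken from the paper
) -> float:
    """Relative time regression loss plus weighted feature reconstruction loss."""
    time_loss = mse(time_pred, relative_time_targets(len(time_pred)))
    # Flatten the per-frame feature vectors so one MSE covers all dimensions.
    flat_feat = [x for frame in features for x in frame]
    flat_rec = [x for frame in reconstruction for x in frame]
    return time_loss + recon_weight * mse(flat_rec, flat_feat)


def labels_to_segments(labels: Sequence[int]) -> List[Tuple[int, int, int]]:
    """Collapse per-frame labels into (start, end_exclusive, label) runs."""
    segments: List[Tuple[int, int, int]] = []
    start = 0
    for i in range(1, len(labels) + 1):
        # Close the current run at the end of the sequence or on a label change.
        if i == len(labels) or labels[i] != labels[start]:
            segments.append((start, i, labels[start]))
            start = i
    return segments
```

With perfect time predictions and a perfect reconstruction, `combined_loss` evaluates to zero; any deviation in either term increases it. In practice the two terms would be computed on the outputs of a trained network rather than on plain lists.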




Unsupervised learning of action classes with continuous temporal embedding

The task of temporally detecting and segmenting actions in untrimmed vid...

Action Shuffle Alternating Learning for Unsupervised Action Segmentation

This paper addresses unsupervised action segmentation. Prior work captur...

Joint Visual-Temporal Embedding for Unsupervised Learning of Actions in Untrimmed Sequences

Understanding the structure of complex activities in videos is one of th...

Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

Action segmentation refers to inferring boundaries of semantically consi...

MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation

Temporally locating and classifying action segments in long untrimmed vi...

Unsupervised Shot Boundary Detection for Temporal Segmentation of Long Capsule Endoscopy Videos

Physicians use Capsule Endoscopy (CE) as a non-invasive and non-surgical...

Leveraging triplet loss for unsupervised action segmentation

In this paper, we propose a novel fully unsupervised framework that lear...
