MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation

03/05/2019
by   Yazan Abu Farha, et al.
6

Temporally locating and classifying action segments in long untrimmed videos is of particular interest to many applications like surveillance and robotics. While traditional approaches follow a two-step pipeline, by generating frame-wise probabilities and then feeding them to high-level temporal models, recent approaches use temporal convolutions to directly classify the video frames. In this paper, we introduce a multi-stage architecture for the temporal action segmentation task. Each stage features a set of dilated temporal convolutions to generate an initial prediction that is refined by the next one. This architecture is trained using a combination of a classification loss and a proposed smoothing loss that penalizes over-segmentation errors. Extensive evaluation shows the effectiveness of the proposed model in capturing long-range dependencies and recognizing action segments. Our model achieves state-of-the-art results on three challenging datasets: 50Salads, Georgia Tech Egocentric Activities (GTEA), and the Breakfast dataset.

READ FULL TEXT

page 5

page 6

page 7

research
11/16/2016

Temporal Convolutional Networks for Action Segmentation and Detection

The ability to identify and temporally segment fine-grained human action...
research
08/16/2022

Temporal Action Localization with Multi-temporal Scales

Temporal action localization plays an important role in video analysis, ...
research
05/26/2022

Efficient U-Transformer with Boundary-Aware Loss for Action Segmentation

Action classification has made great progress, but segmenting and recogn...
research
08/29/2016

Temporal Convolutional Networks: A Unified Approach to Action Segmentation

The dominant paradigm for video-based action segmentation is composed of...
research
03/15/2018

Temporal Human Action Segmentation via Dynamic Clustering

We present an effective dynamic clustering algorithm for the task of tem...
research
05/22/2017

TricorNet: A Hybrid Temporal Convolutional and Recurrent Network for Video Action Segmentation

Action segmentation as a milestone towards building automatic systems to...
research
03/09/2023

TAEC: Unsupervised Action Segmentation with Temporal-Aware Embedding and Clustering

Temporal action segmentation in untrimmed videos has gained increased at...

Please sign up or login with your details

Forgot password? Click here to reset