ASFormer: Transformer for Action Segmentation

10/16/2021
by Fangqiu Yi, et al.

Algorithms for the action segmentation task typically use temporal models to predict which action is occurring at each frame of a minute-long daily activity. Recent studies have shown the potential of the Transformer in modeling the relations among elements in sequential data. However, there are several major concerns when directly applying the Transformer to the action segmentation task, such as the lack of inductive biases with small training sets, the difficulty of processing long input sequences, and the limited ability of the decoder architecture to use temporal relations among multiple action segments to refine the initial predictions. To address these concerns, we design an efficient Transformer-based model for the action segmentation task, named ASFormer, with three distinctive characteristics: (i) We explicitly bring in local connectivity inductive priors because of the high locality of features. This constrains the hypothesis space within a reliable scope, and helps the model learn a proper target function with small training sets. (ii) We apply a pre-defined hierarchical representation pattern that efficiently handles long input sequences. (iii) We carefully design the decoder to refine the initial predictions from the encoder. Extensive experiments on three public datasets demonstrate the effectiveness of our method. Code is available at <https://github.com/ChinaYi/ASFormer>.
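
To make characteristics (i) and (ii) concrete, the following is a minimal sketch and not the authors' implementation. It restricts self-attention to a local window around each frame and stacks layers whose window follows a pre-defined schedule. The doubling schedule, the function names (`local_window`, `local_attention`, `hierarchical_encoder`), and the use of raw features as queries, keys, and values are all assumptions made for illustration; the abstract does not specify them.

```python
import math

def local_window(t, num_frames, window):
    """Frames that frame t may attend to: up to window // 2 on each side."""
    half = window // 2
    return range(max(0, t - half), min(num_frames, t + half + 1))

def local_attention(features, window):
    """Single-head dot-product self-attention limited to a local window.

    features: list of per-frame feature vectors (lists of floats).
    Assumption: queries, keys and values are the raw features; a real
    model would apply learned projections first.
    """
    num_frames, dim = len(features), len(features[0])
    out = []
    for t in range(num_frames):
        idx = list(local_window(t, num_frames, window))
        scores = [
            sum(a * b for a, b in zip(features[t], features[j])) / math.sqrt(dim)
            for j in idx
        ]
        # Numerically stable softmax over the frames inside the window only.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        out.append([
            sum(w / total * features[j][d] for w, j in zip(weights, idx))
            for d in range(dim)
        ])
    return out

def hierarchical_encoder(features, num_layers):
    """Stack local-attention layers with a pre-defined, growing window."""
    x = features
    for layer in range(num_layers):
        window = 2 ** (layer + 1)  # assumed schedule: 2, 4, 8, ...
        x = local_attention(x, window)
    return x
```

Under these assumptions each layer costs on the order of T × window operations for T frames instead of T², which is what makes long inputs tractable, while the growing window lets deeper layers see progressively longer temporal context.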
