Action Segmentation Using 2D Skeleton Heatmaps

by Syed Waleed Hyder, et al.

This paper presents a 2D skeleton-based action segmentation method with applications in fine-grained human activity recognition. State-of-the-art methods take sequences of 3D skeleton coordinates directly as inputs and apply Graph Convolutional Networks (GCNs) for spatiotemporal feature learning. In contrast, our main idea is to use sequences of 2D skeleton heatmaps as inputs and employ Temporal Convolutional Networks (TCNs) to extract spatiotemporal features. Despite lacking 3D information, our approach achieves comparable or superior performance and better robustness against missing keypoints than previous methods on action segmentation datasets. Moreover, we further improve performance by using both 2D skeleton heatmaps and RGB videos as inputs. To the best of our knowledge, this is the first work to utilize 2D skeleton heatmap inputs and the first to explore 2D skeleton+RGB fusion for action segmentation.
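To make the notion of a 2D skeleton heatmap input concrete, the sketch below renders one frame of 2D keypoints into one heatmap channel per joint. It is an illustration only, not the paper's implementation: the Gaussian rendering, the `sigma` value, the confidence weighting, and the function name `keypoints_to_heatmaps` are all assumptions, since the abstract does not specify how the heatmaps are built.

```python
import numpy as np


def keypoints_to_heatmaps(keypoints, confidences, height, width, sigma=2.0):
    """Render 2D keypoints as per-joint Gaussian heatmaps (hypothetical helper).

    keypoints:   (K, 2) array of (x, y) pixel coordinates.
    confidences: (K,) array of detector scores; a score <= 0 marks a missing joint.
    Returns an array of shape (K, height, width).
    """
    keypoints = np.asarray(keypoints, dtype=np.float32)
    confidences = np.asarray(confidences, dtype=np.float32)

    # Pixel coordinate grids: ys[i, j] = i (row), xs[i, j] = j (column).
    ys, xs = np.mgrid[0:height, 0:width]

    heatmaps = np.zeros((len(keypoints), height, width), dtype=np.float32)
    for k, ((x, y), c) in enumerate(zip(keypoints, confidences)):
        if c <= 0:
            # Assumption: a missing keypoint simply leaves its channel empty.
            continue
        squared_dist = (xs - x) ** 2 + (ys - y) ** 2
        heatmaps[k] = c * np.exp(-squared_dist / (2.0 * sigma ** 2))
    return heatmaps


# Example: two joints in a 16x16 frame, the second one missing.
frame_heatmaps = keypoints_to_heatmaps(
    keypoints=[[3.0, 4.0], [10.0, 2.0]],
    confidences=[1.0, 0.0],
    height=16,
    width=16,
)
```

Under these assumptions, stacking the per-frame outputs along a leading time axis would give a (T, K, H, W) sequence of the kind a temporal model could consume. A missing keypoint yields an all-zero channel rather than an invalid coordinate, which is one plausible reason a heatmap representation could tolerate missing detections.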




