SiamParseNet: Joint Body Parsing and Label Propagation in Infant Movement Videos

by Haomiao Ni, et al.

General movement assessment (GMA) of infant movement videos (IMVs) is an effective method for the early detection of cerebral palsy (CP) in infants. Automated body parsing is a crucial step towards computer-aided GMA, in which infant body parts are segmented and tracked over time for movement analysis. However, acquiring fully annotated data for video-based body parsing is particularly expensive because of the large number of frames in IMVs. In this paper, we propose a semi-supervised body parsing model, termed SiamParseNet (SPN), that jointly learns single-frame body parsing and label propagation between frames. The Siamese-structured SPN consists of a shared feature encoder followed by two separate branches: one for intra-frame body part segmentation and one for inter-frame label propagation. The two branches are trained jointly, taking pairs of frames from the same video as input. We propose an adaptive training process that alternates between two training modes: one that uses input pairs of only labeled frames, and one that uses inputs of both labeled and unlabeled frames. During testing, we employ a multi-source inference mechanism, in which the final result for a test frame is obtained either via the segmentation branch or via propagation from a nearby key frame. We conduct extensive experiments on a partially labeled IMV dataset, where SPN outperforms all prior methods, demonstrating the effectiveness of the proposed approach.
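The alternating pair selection described in the abstract can be illustrated with a minimal sketch. It covers only the data-sampling side, not the network itself. The names `Frame`, `sample_pair`, and `training_modes` are hypothetical, as is the fixed switching interval. The sketch also assumes that a "mixed" pair means one labeled and one unlabeled frame. The abstract specifies neither the schedule nor the pair composition.

```python
import random
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """Hypothetical frame record: which video it belongs to and whether it is annotated."""
    video_id: str
    index: int
    labeled: bool


def sample_pair(frames, mode, rng):
    """Draw two distinct frames from the same video.

    mode == "labeled": both frames are labeled.
    mode == "mixed":   one labeled and one unlabeled frame (an assumption).
    """
    # Group frames by video so that a pair never spans two videos.
    videos = defaultdict(list)
    for frame in frames:
        videos[frame.video_id].append(frame)

    # Collect, per video, the pools the two frames may be drawn from.
    candidates = []
    for video_frames in videos.values():
        labeled = [f for f in video_frames if f.labeled]
        unlabeled = [f for f in video_frames if not f.labeled]
        if mode == "labeled" and len(labeled) >= 2:
            candidates.append((labeled, labeled))
        elif mode == "mixed" and labeled and unlabeled:
            candidates.append((labeled, unlabeled))
    if not candidates:
        raise ValueError(f"no video supports mode {mode!r}")

    first_pool, second_pool = rng.choice(candidates)
    first = rng.choice(first_pool)
    # Exclude the first frame so the pair is always two different frames.
    second = rng.choice([f for f in second_pool if f != first])
    return first, second


def training_modes(num_steps, switch_every):
    """Yield the mode for each step, switching every `switch_every` steps.

    The fixed interval is a placeholder schedule, not the paper's rule.
    """
    for step in range(num_steps):
        yield "labeled" if (step // switch_every) % 2 == 0 else "mixed"
```

A training loop would iterate over `training_modes(...)`, call `sample_pair(frames, mode, rng)` with a seeded `random.Random`, and feed the two frames to the shared encoder. In a real pipeline, the unlabeled frame in a mixed pair would receive no direct segmentation loss, since it has no ground truth.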


