Slim DensePose: Thrifty Learning from Sparse Annotations and Motion Cues

06/13/2019
by Natalia Neverova, et al.

DensePose supersedes traditional landmark detectors by densely mapping image pixels to body surface coordinates. This power, however, comes at the cost of greatly increased annotation time, as supervising the model requires manually labeling hundreds of points per pose instance. In this work, we thus seek methods to significantly slim down the DensePose annotations, proposing more efficient data collection strategies. In particular, we demonstrate that if annotations are collected in video frames, their efficacy can be multiplied for free by using motion cues. To explore this idea, we introduce DensePose-Track, a dataset of videos where selected frames are annotated in the traditional DensePose manner. Then, building on geometric properties of the DensePose mapping, we use the video dynamics to propagate ground-truth annotations in time as well as to learn from Siamese equivariance constraints. Having performed an exhaustive empirical evaluation of various data annotation and learning strategies, we demonstrate that doing so can deliver significantly improved pose estimation results over strong baselines. However, despite what is suggested by some recent works, we show that merely synthesizing motion patterns by applying geometric transformations to isolated frames is significantly less effective, and that motion cues help much more when they are extracted from videos.
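To make the idea of propagating annotations in time concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not say how motion is estimated, so the sketch assumes a dense optical flow field between two consecutive frames has already been computed by some external method. The function name `propagate_annotations` and its argument layout are hypothetical. Each annotated point is moved along the flow, and its label (for example, its surface coordinates) is carried over unchanged.

```python
import numpy as np

def propagate_annotations(points_xy, labels, flow):
    """Move annotated points from frame t to frame t+1 along a dense flow field.

    points_xy: (N, 2) array of (x, y) pixel positions in frame t.
    labels:    (N, K) array of per-point labels (e.g. surface coordinates).
    flow:      (H, W, 2) array, flow[y, x] = (dx, dy) displacement to frame t+1
               (assumed to be precomputed; how it is obtained is not specified here).
    """
    points_xy = np.asarray(points_xy, dtype=float)
    labels = np.asarray(labels)
    h, w = flow.shape[:2]

    # Nearest-pixel lookup of the flow at each annotated point.
    cols = np.clip(np.rint(points_xy[:, 0]).astype(int), 0, w - 1)
    rows = np.clip(np.rint(points_xy[:, 1]).astype(int), 0, h - 1)
    moved = points_xy + flow[rows, cols]

    # Keep only the points that still land inside the next frame.
    inside = (
        (moved[:, 0] >= 0) & (moved[:, 0] <= w - 1)
        & (moved[:, 1] >= 0) & (moved[:, 1] <= h - 1)
    )
    return moved[inside], labels[inside]


# Toy example: every pixel moves 2 px to the right in a 4x6 frame.
flow = np.zeros((4, 6, 2))
flow[..., 0] = 2.0
points = np.array([[1.0, 1.0], [5.0, 2.0]])
labels = np.array([[0.1, 0.2], [0.3, 0.4]])
new_points, new_labels = propagate_annotations(points, labels, flow)
```

In the toy example the second point leaves the frame and is dropped, so only the first annotation survives the propagation. A practical pipeline would likely also need to filter out unreliable flow (for example near occlusions), which this sketch does not attempt.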


