Dynamic Kernel Distillation for Efficient Pose Estimation in Videos

08/24/2019
by Xuecheng Nie, et al.

Existing video-based human pose estimation methods apply large networks to every frame of a video to localize body joints, which incurs high computational cost and rarely meets the low-latency requirements of realistic applications. To address this issue, we propose a novel Dynamic Kernel Distillation (DKD) model that enables small networks to estimate human poses in videos, significantly improving efficiency. In particular, DKD introduces a lightweight distillator that distills pose kernels online by leveraging temporal cues from the previous frame in a one-shot feed-forward manner. DKD then reduces body joint localization to a matching procedure between the pose kernels and the current frame, which can be computed efficiently with a simple convolution. In this way, DKD quickly transfers pose knowledge from one frame to provide compact guidance for body joint localization in the following frame, which makes small networks usable for video-based pose estimation. To facilitate training, DKD uses a temporally adversarial training strategy that introduces a temporal discriminator to help generate temporally coherent pose kernels and pose estimation results over a long range. Experiments on the Penn Action and Sub-JHMDB benchmarks demonstrate the superior efficiency of DKD, specifically a 10x reduction in FLOPs and a 2x speedup over the previous best model, together with state-of-the-art accuracy.
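The matching step described in the abstract, where pose kernels are convolved with the current frame to localize joints, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `match_pose_kernels` and `locate_joints`, the tensor shapes, the zero padding, and the argmax readout are all assumptions made for illustration, and the kernels are simply passed in rather than produced by a distillator from the previous frame.

```python
import numpy as np


def match_pose_kernels(features, kernels):
    """Cross-correlate one kernel per joint with a frame's feature map.

    features: (C, H, W) feature map of the current frame (assumed shape).
    kernels:  (K, C, kh, kw) pose kernels, one per joint, with odd kh and kw.
    Returns heatmaps of shape (K, H, W), using zero padding to keep H and W.
    """
    _, _, kh, kw = kernels.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(features, ((0, 0), (ph, ph), (pw, pw)))
    # windows has shape (C, H, W, kh, kw): one patch per spatial position.
    windows = np.lib.stride_tricks.sliding_window_view(
        padded, (kh, kw), axis=(1, 2)
    )
    # Sum over channels and kernel offsets for every joint and position.
    return np.einsum("chwij,kcij->khw", windows, kernels)


def locate_joints(heatmaps):
    """Return the (row, col) of the peak response for each joint, shape (K, 2)."""
    num_joints, height, width = heatmaps.shape
    flat_peaks = heatmaps.reshape(num_joints, -1).argmax(axis=1)
    return np.stack(np.unravel_index(flat_peaks, (height, width)), axis=1)


if __name__ == "__main__":
    # Toy input: random features and random kernels, for shape checking only.
    rng = np.random.default_rng(0)
    frame_features = rng.standard_normal((8, 16, 16))
    pose_kernels = rng.standard_normal((13, 8, 3, 3))
    heatmaps = match_pose_kernels(frame_features, pose_kernels)
    print(heatmaps.shape, locate_joints(heatmaps).shape)
```

In this sketch the per-frame cost is a single convolution-like pass plus an argmax, which is the kind of lightweight computation the abstract attributes to the matching procedure; how the real model encodes frames and generates kernels is not specified here.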


