A new way of video compression via forward-referencing using deep learning

by S. M. A. K. Rajin, et al.
Charles Sturt University
Federation University Australia

To exploit the high temporal correlation between video frames of the same scene, the current frame is predicted from already-encoded reference frames using block-based motion estimation and compensation. While this approach efficiently exploits the translational motion of moving objects, it performs poorly under other types of affine motion and under object occlusion/deocclusion. Recently, deep learning has been used to model the high-level structure of human pose in specific actions from short videos and then to generate virtual frames at future times by predicting the pose with a generative adversarial network (GAN). Modelling the high-level structure of human pose can therefore exploit semantic correlation by predicting human actions and determining their trajectories. Video surveillance applications stand to benefit, as large volumes of stored surveillance data can be compressed by estimating human pose trajectories and generating future frames through semantic correlation. This paper explores a new way of video coding that models human pose from the already-encoded frames and uses the frame generated for the current time as an additional forward-referencing frame. The proposed approach is expected to overcome the limitations of traditional backward-referencing frames by predicting the blocks that contain moving objects with lower residuals. Experimental results show that the proposed approach can achieve on average up to 2.83 dB PSNR gain and 25.93% bitrate savings for high motion video sequences.
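The idea of adding a generated frame to the reference list can be illustrated with a minimal sketch of full-search block matching. This is not the authors' codec: the function names (`sad`, `get_block`, `best_match`), the 4x4 block size, the ±2 search range and the use of the sum of absolute differences (SAD) as the matching cost are all assumptions chosen for illustration. The generated frame is simply treated as one more entry in the list of reference frames, and the encoder keeps whichever candidate leaves the smallest residual.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def best_match(current, references, top, left, size=4, search=2):
    """Full-search block matching over several reference frames.

    `references` is a list of frames, e.g. [backward_ref, generated_ref];
    the second entry stands in for a hypothetical pose-generated frame.
    Returns (ref_index, dy, dx, cost) for the lowest-SAD candidate.
    """
    height, width = len(current), len(current[0])
    target = get_block(current, top, left, size)
    best = None
    for ref_index, ref in enumerate(references):
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = top + dy, left + dx
                # Skip candidates that fall outside the frame.
                if y < 0 or x < 0 or y + size > height or x + size > width:
                    continue
                cost = sad(target, get_block(ref, y, x, size))
                if best is None or cost < best[3]:
                    best = (ref_index, dy, dx, cost)
    return best
```

When the motion between the backward reference and the current frame is not a pure translation, no displaced block in the backward reference matches well, whereas a well-predicted generated frame can supply a block with a much smaller residual. In that case `best_match` returns the index of the generated frame, which is the behaviour the abstract expects for blocks containing moving objects.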




