Video Primal Sketch: A Unified Middle-Level Representation for Video

02/10/2015
by   Zhi Han, et al.
0

This paper presents a middle-level video representation named Video Primal Sketch (VPS), which integrates two regimes of models: i) sparse coding model using static or moving primitives to explicitly represent moving corners, lines, feature points, etc., ii) FRAME /MRF model reproducing feature statistics extracted from input video to implicitly represent textured motion, such as water and fire. The feature statistics include histograms of spatio-temporal filters and velocity distributions. This paper makes three contributions to the literature: i) Learning a dictionary of video primitives using parametric generative models; ii) Proposing the Spatio-Temporal FRAME (ST-FRAME) and Motion-Appearance FRAME (MA-FRAME) models for modeling and synthesizing textured motion; and iii) Developing a parsimonious hybrid model for generic video representation. Given an input video, VPS selects the proper models automatically for different motion patterns and is compatible with high-level action representations. In the experiments, we synthesize a number of textured motion; reconstruct real videos using the VPS; report a series of human perception experiments to verify the quality of reconstructed videos; demonstrate how the VPS changes over the scale transition in videos; and present the close connection between VPS and high-level action models.

READ FULL TEXT

page 1

page 3

page 4

page 5

page 6

page 10

page 12

page 14

research
04/25/2019

DynamoNet: Dynamic Action and Motion Network

In this paper, we are interested in self-supervised learning the motion ...
research
04/07/2019

Self-supervised Spatio-temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics

We address the problem of video representation learning without human-an...
research
10/10/2021

Sketch Me A Video

Video creation has been an attractive yet challenging task for artists t...
research
05/06/2023

LEO: Generative Latent Image Animator for Human Video Synthesis

Spatio-temporal coherency is a major challenge in synthesizing high qual...
research
04/30/2022

RADNet: A Deep Neural Network Model for Robust Perception in Moving Autonomous Systems

Interactive autonomous applications require robustness of the perception...
research
07/29/2019

Seeing Things in Random-Dot Videos

The human visual system correctly groups features and interprets videos ...
research
04/22/2021

Hierarchical Motion Understanding via Motion Programs

Current approaches to video analysis of human motion focus on raw pixels...

Please sign up or login with your details

Forgot password? Click here to reset