Video Interpolation and Prediction with Unsupervised Landmarks

09/06/2019
by Kevin J. Shih, et al.

Prediction and interpolation for long-range video data involve the complex task of modeling motion trajectories for each visible object, occlusions and dis-occlusions, and appearance changes due to viewpoint and lighting. Optical-flow-based techniques generalize but are suitable only for short temporal ranges. Many methods instead project the video frames to a low-dimensional latent space, achieving long-range predictions. However, these latent representations are often non-interpretable and therefore difficult to manipulate. This work poses video prediction and interpolation as unsupervised latent structure inference followed by temporal prediction in this latent space. The latent representations capture foreground semantics without explicit supervision such as keypoints or poses. Further, as each landmark can be mapped to a coordinate indicating where a semantic part is positioned, we can reliably interpolate within the coordinate domain to achieve predictable motion interpolation. Given an image decoder capable of mapping these landmarks back to the image domain, we are able to achieve high-quality long-range video interpolation and extrapolation by operating on the landmark representation space.
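
The idea of interpolating in the landmark coordinate domain can be illustrated with a minimal sketch. It assumes each frame is summarized by a fixed-length list of (x, y) landmark coordinates and uses plain linear interpolation between two such sets; the abstract does not specify the interpolation scheme, so the function name `interpolate_landmarks`, the data layout, and the choice of linear blending are all illustrative assumptions rather than the authors' method. Mapping the resulting coordinates back to images would require the image decoder mentioned above, which is not shown here.

```python
from typing import List, Tuple

# Hypothetical layout: one (x, y) coordinate per landmark, in a fixed order.
Landmarks = List[Tuple[float, float]]


def interpolate_landmarks(
    start: Landmarks, end: Landmarks, num_steps: int
) -> List[Landmarks]:
    """Return `num_steps` landmark sets strictly between `start` and `end`.

    Assumption: linear blending of corresponding landmarks. This is only a
    sketch of "interpolation within the coordinate domain".
    """
    if len(start) != len(end):
        raise ValueError("start and end must contain the same number of landmarks")
    if num_steps < 0:
        raise ValueError("num_steps must be non-negative")

    frames: List[Landmarks] = []
    for i in range(1, num_steps + 1):
        # Blend weight in (0, 1); the endpoints themselves are excluded.
        t = i / (num_steps + 1)
        frame = [
            ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(start, end)
        ]
        frames.append(frame)
    return frames


# Usage: two landmarks moving between two key frames, three in-between frames.
first = [(0.0, 0.0), (1.0, 1.0)]
last = [(1.0, 0.0), (1.0, -1.0)]
in_between = interpolate_landmarks(first, last, num_steps=3)
```

Because each landmark is a coordinate tied to a semantic part, the intermediate sets describe a predictable path for that part, which is the property the abstract relies on.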


