Deep motion estimation for parallel inter-frame prediction in video compression

12/11/2019
by   André Nortje, et al.

Standard video codecs rely on optical flow to guide inter-frame prediction: pixels from reference frames are moved via motion vectors to predict target video frames. We propose to learn binary motion codes that are encoded based on an input video sequence. These codes are not limited to 2D translations, but can capture complex motion (warping, rotation and occlusion). Our motion codes are learned as part of a single neural network which also learns to compress and decode them. This approach supports parallel video frame decoding instead of the sequential motion estimation and compensation of flow-based methods. We also introduce 3D dynamic bit assignment to adapt to object displacements caused by motion, yielding additional bit savings. By replacing the optical flow-based block-motion algorithms found in an existing video codec with our learned inter-frame prediction model, our approach outperforms the standard H.264 and H.265 video codecs at low bitrates.
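For readers unfamiliar with the flow-based baseline that the abstract contrasts against, the following is a minimal sketch of block-wise motion compensation: each block of the predicted frame is copied from the reference frame at a position offset by that block's motion vector. It is not the authors' learned model and not the algorithm of any particular codec. The function name `motion_compensate`, the 8-pixel block size, the (dy, dx) vector layout and the border clamping are all assumptions made for illustration, and frame dimensions are assumed to be multiples of the block size.

```python
import numpy as np


def motion_compensate(reference, motion_vectors, block=8):
    """Predict a frame by copying displaced blocks from `reference`.

    Hypothetical helper for illustration only.
    reference:      2D array (H, W); H and W assumed to be multiples of `block`.
    motion_vectors: integer array (H // block, W // block, 2) holding one
                    (dy, dx) displacement per block.
    """
    h, w = reference.shape
    predicted = np.zeros_like(reference)
    for by in range(0, h, block):
        for bx in range(0, w, block):
            dy, dx = motion_vectors[by // block, bx // block]
            # Clamp the source position so the block stays inside the frame
            # (an assumed border policy; real codecs pad the reference instead).
            sy = int(np.clip(by + dy, 0, h - block))
            sx = int(np.clip(bx + dx, 0, w - block))
            predicted[by:by + block, bx:bx + block] = \
                reference[sy:sy + block, sx:sx + block]
    return predicted


# Usage: a 16x16 frame split into four 8x8 blocks; the top-left block is
# predicted from the bottom-right block of the reference.
reference = np.arange(256, dtype=np.uint8).reshape(16, 16)
vectors = np.zeros((2, 2, 2), dtype=int)
vectors[0, 0] = (8, 8)
predicted = motion_compensate(reference, vectors)
```

Because each block here is described by a single 2D translation, this scheme cannot express warping, rotation or occlusion directly, which is the limitation the learned motion codes are meant to address.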


research
10/08/2018

Inter-BMV: Interpolation with Block Motion Vectors for Fast Semantic Segmentation on Video

Models optimized for accuracy on single images are often prohibitively s...
research
10/05/2021

Self-Supervised Learning of Perceptually Optimized Block Motion Estimates for Video Compression

Block based motion estimation is integral to inter prediction processes ...
research
07/17/2020

Can Learned Frame-Prediction Compete with Block-Motion Compensation for Video Coding?

Given recent advances in learned video prediction, we investigate whethe...
research
11/16/2018

Learned Video Compression

We present a new algorithm for video coding, learned end-to-end for the ...
research
07/11/2023

Offline and Online Optical Flow Enhancement for Deep Video Compression

Video compression relies heavily on exploiting the temporal redundancy b...
research
01/25/2022

Semantically Video Coding: Instill Static-Dynamic Clues into Structured Bitstream for AI Tasks

Traditional media coding schemes typically encode image/video into a sem...
research
06/15/2022

VCT: A Video Compression Transformer

We show how transformers can be used to vastly simplify neural video com...
