End-to-end Neural Video Coding Using a Compound Spatiotemporal Representation

08/05/2021
by Haojie Liu, et al.

Recent years have witnessed rapid advances in learnt video coding. Most algorithms have relied solely on vector-based motion representation and resampling (e.g., optical-flow-based bilinear sampling) to exploit inter-frame redundancy. Despite the great success of adaptive kernel-based resampling (e.g., adaptive convolutions and deformable convolutions) in video prediction for uncompressed videos, integrating such approaches with rate-distortion optimization for inter-frame coding has been less successful. Recognizing that each resampling solution offers unique advantages in regions with different motion and texture characteristics, we propose a hybrid motion compensation (HMC) method that adaptively combines the predictions generated by these two approaches. Specifically, we generate a compound spatiotemporal representation (CSTR) through a recurrent information aggregation (RIA) module using information from the current and multiple past frames. We further design a one-to-many decoder pipeline that generates multiple predictions from the CSTR, including vector-based resampling, adaptive kernel-based resampling, compensation mode selection maps and texture enhancements, and combines them adaptively to achieve more accurate inter prediction. Experiments show that our proposed inter coding system provides better motion-compensated prediction and is more robust to occlusions and complex motions. Together with a jointly trained intra coder and residual coder, the overall learnt hybrid coder yields state-of-the-art coding efficiency in the low-delay scenario, compared to the traditional H.264/AVC and H.265/HEVC as well as recently published learning-based methods, in terms of both PSNR and MS-SSIM metrics.
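To make the idea of hybrid motion compensation concrete, here is a minimal NumPy sketch of blending a vector-based prediction with an adaptive kernel-based prediction through a per-pixel mode selection map. It is an illustration only, not the authors' network: the function names, the single-channel frame, the convex blend `m * p_vec + (1 - m) * p_ker`, and the additive texture term are all assumptions, since the abstract does not specify how the predictions are combined.

```python
import numpy as np


def bilinear_warp(ref, flow):
    """Vector-based resampling: sample ref at (x + dx, y + dy) bilinearly.

    ref:  (H, W) reference frame.
    flow: (H, W, 2) per-pixel displacement, flow[..., 0] = dx, flow[..., 1] = dy.
    """
    h, w = ref.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    # Clamp sampling positions to the frame (border replication).
    x = np.clip(xs + flow[..., 0], 0, w - 1)
    y = np.clip(ys + flow[..., 1], 0, h - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)
    wx = x - x0
    wy = y - y0
    top = (1 - wx) * ref[y0, x0] + wx * ref[y0, x1]
    bottom = (1 - wx) * ref[y1, x0] + wx * ref[y1, x1]
    return (1 - wy) * top + wy * bottom


def adaptive_kernel_resample(ref, kernels):
    """Kernel-based resampling: each output pixel is a weighted sum of its
    k x k neighbourhood, with a separate kernel per pixel.

    kernels: (H, W, k, k) per-pixel weights, k odd.
    """
    h, w, k, _ = kernels.shape
    r = k // 2
    padded = np.pad(ref, r, mode="edge")
    out = np.zeros((h, w), dtype=float)
    for i in range(k):
        for j in range(k):
            out += kernels[:, :, i, j] * padded[i:i + h, j:j + w]
    return out


def hybrid_prediction(ref, flow, kernels, mode_map, texture=None):
    """Blend the two predictions per pixel (assumed convex combination).

    mode_map: (H, W) values in [0, 1]; 1 selects the vector-based prediction,
              0 selects the kernel-based prediction.
    texture:  optional (H, W) additive enhancement (assumed to be additive).
    """
    p_vec = bilinear_warp(ref, flow)
    p_ker = adaptive_kernel_resample(ref, kernels)
    pred = mode_map * p_vec + (1.0 - mode_map) * p_ker
    if texture is not None:
        pred = pred + texture
    return pred
```

In a learnt codec, `flow`, `kernels`, `mode_map` and `texture` would all be outputs of decoder branches fed by a shared latent representation; here they are plain arrays supplied by the caller so that the blending step can be inspected in isolation.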

research
07/09/2020

Neural Video Coding using Multiscale Motion Compensation and Spatiotemporal Context Model

Over the past two decades, traditional block-based video coding has made...
research
08/06/2020

Optical Flow and Mode Selection for Learning-based Video Coding

This paper introduces a new method for inter-frame coding based on two c...
research
02/07/2022

Motion-Plane-Adaptive Inter Prediction in 360-Degree Video Coding

Inter prediction is one of the key technologies enabling the high compre...
research
09/13/2020

Improving Deep Video Compression by Resolution-adaptive Flow Coding

In the learning based video compression approaches, it is an essential i...
research
12/26/2021

Learning Cross-Scale Prediction for Efficient Neural Video Compression

In this paper, we present the first neural video codec that can compete ...
research
02/08/2018

Texture Segmentation Based Video Compression Using Convolutional Neural Networks

There has been a growing interest in using different approaches to impro...
research
12/31/2018

Deep Frame Prediction for Video Coding

We propose a novel frame prediction method using a deep neural network (...
