Learning for Video Compression with Recurrent Auto-Encoder and Recurrent Probability Model

by Ren Yang, et al.

The past few years have witnessed increasing interest in applying deep learning to video compression. However, existing approaches compress each video frame with only a limited number of reference frames, which limits their ability to fully exploit the temporal correlation among video frames. To overcome this shortcoming, this paper proposes a Recurrent Learned Video Compression (RLVC) approach with a Recurrent Auto-Encoder (RAE) and a Recurrent Probability Model (RPM). Specifically, the RAE employs recurrent cells in both the encoder and the decoder, so that temporal information from a large range of frames can be used to generate latent representations and reconstruct compressed outputs. Furthermore, the proposed RPM network recurrently estimates the Probability Mass Function (PMF) of the current latent representation, conditioned on the distribution of previous latent representations. Because consecutive frames are correlated, the conditional cross entropy can be lower than the independent cross entropy, thus reducing the bit-rate. Experiments show that our approach achieves state-of-the-art learned video compression performance in terms of both PSNR and MS-SSIM. Moreover, our approach outperforms the default Low-Delay P (LDP) setting of x265 in PSNR, and also outperforms the SSIM-tuned x265 and the slowest setting of x265 in MS-SSIM.
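The bit-rate saving claimed for the RPM rests on a standard information-theoretic fact: when consecutive latents are correlated, the conditional entropy H(x_t | x_{t-1}) is at most the marginal entropy H(x_t), so an entropy coder driven by a conditional PMF can spend fewer bits per symbol. The sketch below is not the RPM network itself; it is a minimal NumPy illustration with a hypothetical joint distribution over binary latents, chosen only to make the effect visible.

```python
import numpy as np

# Hypothetical joint PMF of two consecutive binary latent symbols
# (rows: x_{t-1}, cols: x_t). Strongly correlated: the current symbol
# repeats the previous one 90% of the time.
joint = np.array([[0.45, 0.05],
                  [0.05, 0.45]])

p_prev = joint.sum(axis=1)   # marginal of x_{t-1}
p_curr = joint.sum(axis=0)   # marginal of x_t

def entropy(p):
    """Shannon entropy in bits, ignoring zero-probability entries."""
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Bits per symbol if each latent is coded with its marginal PMF
# (the "independent" cross entropy in the abstract's terms).
h_marginal = entropy(p_curr)

# H(x_t | x_{t-1}) = H(x_{t-1}, x_t) - H(x_{t-1}): bits per symbol
# if the coder conditions on the previous latent, as the RPM does.
h_conditional = entropy(joint.ravel()) - entropy(p_prev)

print(f"independent coding: {h_marginal:.3f} bits/symbol")
print(f"conditional coding: {h_conditional:.3f} bits/symbol")
```

With this distribution the marginal coder needs 1 bit per symbol while the conditional coder needs roughly 0.47, i.e. the correlation between consecutive latents more than halves the rate; the actual saving in RLVC depends on how correlated the real latent representations are.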



