An Emerging Coding Paradigm VCM: A Scalable Coding Approach Beyond Feature and Signal

01/09/2020
by Sifeng Xia, et al.

In this paper, we study a new problem arising from the emerging MPEG standardization effort Video Coding for Machine (VCM), which aims to bridge the gap between visual feature compression and classical video coding. VCM is committed to addressing the requirement of compact signal representation for both machine and human vision in a more or less scalable way. To this end, we leverage the strengths of predictive and generative models to support advanced compression techniques for machine and human vision tasks simultaneously, with visual features serving as a bridge that connects signal-level and task-level compact representations in a scalable manner. Specifically, we employ a conditional deep generation network to reconstruct video frames under the guidance of a learned motion pattern. A predictive model learns to extract a sparse motion pattern, and a generative model then uses this feature representation to generate the appearance of the to-be-coded frames, relying on the appearance of the coded key frames. Meanwhile, the sparse motion pattern is compact and highly effective for high-level vision tasks, e.g. action recognition. Experimental results demonstrate that our method yields much better reconstruction quality than traditional video codecs (0.0063 gain in SSIM), as well as state-of-the-art action recognition performance over highly compressed videos (9.4 coding signal for both human and machine vision.
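To give a sense of what the reported SSIM gain measures, the following is a minimal sketch of the structural similarity index computed over a whole frame as a single window. It is an illustration only: the function name `global_ssim`, the flattened-list input, and the 8-bit `data_range` default are assumptions made here, and published evaluations typically average SSIM over local sliding windows rather than using one global window.

```python
def global_ssim(x, y, data_range=255.0):
    """Single-window SSIM between two equally sized grayscale frames.

    x, y: flat sequences of pixel intensities (hypothetical input format).
    data_range: dynamic range of the pixel values (255 assumes 8-bit frames).
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("frames must be non-empty and the same size")

    n = len(x)
    # Standard stabilising constants from the SSIM definition.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    # Means, variances, and covariance over the whole frame.
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    original = [0, 50, 100, 150]
    decoded = [10, 60, 90, 160]
    print(global_ssim(original, original))  # identical frames score 1.0
    print(global_ssim(original, decoded))   # distorted frame scores below 1.0
```

SSIM is bounded above by 1, which is reached only for identical frames, so a difference in the third decimal place between two codecs at comparable bitrates is measured against that ceiling.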


research
01/09/2020

Towards Coding for Human and Machine Vision: A Scalable Image Coding Approach

The past decades have witnessed the rapid development of image and video...
research
06/19/2023

LVVC: A Learned Versatile Video Coding Framework for Efficient Human-Machine Vision

Almost all digital videos are coded into compact representations before ...
research
02/02/2021

Human-Machine Collaborative Video Coding Through Cuboidal Partitioning

Video coding algorithms encode and decode an entire video frame while fe...
research
10/18/2021

Video Coding for Machine: Compact Visual Representation Compression for Intelligent Collaborative Analytics

Video Coding for Machines (VCM) is committed to bridging to an extent se...
research
09/11/2018

Temporal-Spatial Mapping for Action Recognition

Deep learning models have enjoyed great success for image related comput...
research
01/16/2013

Deep Predictive Coding Networks

The quality of data representation in deep learning methods is directly ...
research
09/23/2011

Latent Semantic Learning with Structured Sparse Representation for Human Action Recognition

This paper proposes a novel latent semantic learning method for extracti...
