Cross Modal Compression: Towards Human-comprehensible Semantic Compression

09/06/2022
by Jiguo Li, et al.

Traditional image/video compression aims to reduce transmission and storage cost while keeping signal fidelity as high as possible. However, with the increasing demand for machine analysis and semantic monitoring in recent years, semantic fidelity, rather than signal fidelity, is becoming an emerging concern in image/video compression. Building on recent advances in cross modal translation and generation, in this paper we propose cross modal compression (CMC), a semantic compression framework for visual data that transforms highly redundant visual data (such as images and videos) into a compact, human-comprehensible domain (such as text, sketches, semantic maps, or attributes) while preserving the semantics. Specifically, we first formulate the CMC problem as a rate-distortion optimization problem. Second, we investigate its relationship with traditional image/video compression and recent feature compression frameworks, showing how CMC differs from these prior frameworks. We then propose a novel paradigm for CMC to demonstrate its effectiveness. Qualitative and quantitative results show that the proposed CMC achieves encouraging reconstruction results at an ultrahigh compression ratio, with better compression performance than the widely used JPEG baseline.
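
To make the terms in the abstract concrete, the following is a minimal sketch of how a compression ratio, a bitrate in bits per pixel, and a generic Lagrangian rate-distortion cost can be computed. It is not the paper's formulation: the image size, the caption, the function names, and the `D + lambda * R` form of the cost are all assumptions chosen for illustration.

```python
def compression_ratio(raw_bytes: int, compressed_bytes: int) -> float:
    """Ratio of uncompressed size to compressed size (higher is more compact)."""
    if compressed_bytes <= 0:
        raise ValueError("compressed_bytes must be positive")
    return raw_bytes / compressed_bytes


def bits_per_pixel(compressed_bytes: int, width: int, height: int) -> float:
    """Bitrate of a compressed image, normalized by its pixel count."""
    return 8 * compressed_bytes / (width * height)


def rd_cost(rate: float, distortion: float, lam: float) -> float:
    """Generic Lagrangian rate-distortion cost D + lam * R.

    This is the textbook form, assumed here for illustration; the paper's
    own objective (with a semantic distortion term) may differ.
    """
    return distortion + lam * rate


# Hypothetical example: a 256x256 RGB image stored as raw 8-bit samples.
width, height, channels = 256, 256, 3
raw_bytes = width * height * channels

# Hypothetical compact representation: a short text caption, stored as UTF-8.
caption = "a small yellow bird with a black crown perched on a branch"
caption_bytes = len(caption.encode("utf-8"))

print("raw bytes:", raw_bytes)
print("caption bytes:", caption_bytes)
print("compression ratio:", compression_ratio(raw_bytes, caption_bytes))
print("bits per pixel:", bits_per_pixel(caption_bytes, width, height))
```

Under these assumed sizes, replacing the raw pixels with a caption of a few dozen bytes gives a ratio in the thousands, which is the regime the abstract calls ultrahigh. Whether the reconstruction preserves the semantics is what the distortion term would have to measure.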


research
04/07/2019

Image and Video Compression with Neural Networks: A Review

In recent years, the image and video coding technologies have advanced b...
research
09/30/2019

Diachronic Cross-modal Embeddings

Understanding the semantic shifts of multimodal information is only poss...
research
04/30/2018

Cross-Modal Retrieval in the Cooking Context: Learning Semantic Text-Image Embeddings

Designing powerful tools that support cooking activities has rapidly gai...
research
01/23/2019

"Is this an example image?" -- Predicting the Relative Abstractness Level of Image and Text

Successful multimodal search and retrieval requires the automatic unders...
research
05/28/2018

Deep Generative Models for Distribution-Preserving Lossy Compression

We propose and study the problem of distribution-preserving lossy compre...
research
04/03/2023

Crossword: A Semantic Approach to Data Compression via Masking

The traditional methods for data compression are typically based on the ...
research
05/04/2023

Semantically Structured Image Compression via Irregular Group-Based Decoupling

Image compression techniques typically focus on compressing rectangular ...
