Semantic Text Compression for Classification

09/19/2023
by Emrecan Kutay, et al.

We study semantic compression for text, where the meaning contained in the text, rather than its exact wording, is conveyed to a source decoder, e.g., for classification. The main motivation for recovering the meaning without requiring exact reconstruction is the potential resource savings, both in storage and in conveying the information to another node. To this end, we propose semantic quantization and compression approaches for text that use sentence embeddings and a semantic distortion metric to preserve the meaning. Our results demonstrate that the proposed semantic approaches yield substantial (orders of magnitude) savings in the number of bits required for message representation, at the expense of a very modest accuracy loss compared to the semantic-agnostic baseline. Comparing the proposed approaches, we observe that the resource savings enabled by semantic quantization can be further amplified by semantic clustering. Importantly, we observe that the proposed methodology generalizes well, producing excellent results on many benchmark text classification datasets spanning a diverse array of contexts.
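To make the idea of semantic quantization concrete, the following is a minimal sketch and not the authors' implementation. It assumes that sentence embeddings are already available as rows of a NumPy array, that cosine similarity stands in for the semantic distortion metric, and that the codebook is learned with spherical k-means. The names `build_codebook`, `quantize`, and `bits_per_message` are hypothetical, and the random vectors only stand in for real embeddings.

```python
import math
import numpy as np

def build_codebook(embeddings, k, iters=20, seed=0):
    """Spherical k-means: learn k unit-norm centroids (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Initialise centroids from k distinct data points (fancy indexing copies).
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Assign each vector to the centroid with the highest cosine similarity.
        labels = np.argmax(x @ centroids.T, axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                mean = members.mean(axis=0)
                centroids[j] = mean / np.linalg.norm(mean)
    return centroids

def quantize(embedding, codebook):
    """Return the index of the codeword closest in cosine similarity."""
    v = embedding / np.linalg.norm(embedding)
    return int(np.argmax(codebook @ v))

def bits_per_message(k):
    """Bits needed to send one codeword index with a fixed-length code."""
    return math.ceil(math.log2(k))

if __name__ == "__main__":
    # Assumption: 384-dimensional float32 vectors stand in for sentence embeddings.
    rng = np.random.default_rng(1)
    embeddings = rng.normal(size=(200, 384)).astype(np.float32)
    codebook = build_codebook(embeddings, k=16)
    index = quantize(embeddings[0], codebook)
    print("codeword index:", index)
    print("bits for raw embedding:", embeddings.shape[1] * 32)
    print("bits for codeword index:", bits_per_message(16))
```

In this sketch only the codeword index is transmitted, and the decoder would classify the corresponding centroid instead of the original text. The gap between the two printed bit counts illustrates where savings of this kind can come from; it does not reproduce the figures reported in the paper.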


