IDEALEM: Statistical Similarity Based Data Reduction

11/16/2019
by Dongeun Lee, et al.

Many applications, such as scientific simulation, sensing, and power grid monitoring, generate massive amounts of data that should be compressed before storage and transmission. These data, composed mostly of floating-point values, are known to be difficult to compress losslessly. A few lossy compression methods have been proposed for this seemingly incompressible data. Unfortunately, they are all designed to minimize the Euclidean distance between the original data and the decompressed data, which fundamentally limits compression performance. We recently proposed a new class of lossy compression based on statistical similarity, called IDEALEM, which is also available as a software package. IDEALEM reduces data volume much more than state-of-the-art compression methods while preserving the unique patterns of the data. IDEALEM can operate in two different modes depending on the stationarity of the input data. This paper presents compression performance analyses of these two modes and investigates the difference between two transform techniques targeted at non-stationary data. It also discusses the data reconstruction quality of IDEALEM using spectral analysis and shows that important frequency components in the application domain are well preserved. We expand the capability of IDEALEM by adding a new min/max check that helps preserve significant patterns that last only briefly and were previously hard to capture. This min/max check also accelerates the encoding process significantly. Experiments show that IDEALEM preserves significant patterns in the original data while reducing encoding time.
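The general idea of reduction by statistical similarity, together with a min/max pre-check, can be illustrated with a minimal sketch. This is not the IDEALEM implementation: the abstract does not name the similarity measure, so the sketch assumes a two-sample Kolmogorov-Smirnov statistic, and the function names (`ks_statistic`, `encode`, `decode`) and parameters (`block_size`, `d_max`, `range_tol`) are hypothetical. A block that is statistically similar to an already stored block is replaced by that block's index; a block whose minimum or maximum differs too much is stored as new without running the similarity test at all.

```python
import bisect
import math


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    for x in sa + sb:
        fa = bisect.bisect_right(sa, x) / len(sa)
        fb = bisect.bisect_right(sb, x) / len(sb)
        d = max(d, abs(fa - fb))
    return d


def encode(data, block_size=32, d_max=0.3, range_tol=0.1):
    """Return (dictionary, symbols): stored blocks and one dictionary index per block.

    Assumption of this sketch: a trailing partial block is simply dropped.
    """
    dictionary, symbols = [], []
    for start in range(0, len(data) - block_size + 1, block_size):
        block = data[start:start + block_size]
        lo, hi = min(block), max(block)
        match = None
        for idx, ref in enumerate(dictionary):
            ref_lo, ref_hi = min(ref), max(ref)
            span = (ref_hi - ref_lo) or 1.0  # avoid a zero tolerance for constant blocks
            # Cheap min/max check: skip the costlier similarity test when the
            # extremes differ, so short-lived spikes are not averaged away.
            if abs(lo - ref_lo) > range_tol * span or abs(hi - ref_hi) > range_tol * span:
                continue
            if ks_statistic(block, ref) <= d_max:
                match = idx
                break
        if match is None:
            dictionary.append(list(block))
            match = len(dictionary) - 1
        symbols.append(match)
    return dictionary, symbols


def decode(dictionary, symbols):
    """Rebuild a sequence by substituting each index with its stored block."""
    out = []
    for idx in symbols:
        out.extend(dictionary[idx])
    return out


if __name__ == "__main__":
    base = [math.sin(i) for i in range(32)]
    spike = list(base)
    spike[5] = 100.0  # brief anomaly that the min/max check should keep
    stored, indices = encode(base + base + spike + base)
    print(len(stored), indices)
```

In this sketch the reconstruction matches the original distribution of each block rather than its exact values, which is why a Euclidean-distance metric would not be the right way to judge it. The thresholds are illustrative and would need tuning for real data.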
