Adaptive Encoding Strategies for Erasing-Based Lossless Floating-Point Compression

08/23/2023
by Ruiyuan Li, et al.

Lossless floating-point time series compression is crucial for a wide range of critical scenarios. Nevertheless, compressing time series losslessly is challenging because of the complex underlying layouts of floating-point values. The state-of-the-art erasing-based compression algorithm Elf demonstrates impressive performance. We explore the encoding strategies of Elf in depth and find that there is still much room for improvement. In this paper, we propose Elf*, which employs a set of optimizations for leading zeros, center bits, and the sharing condition. Specifically, we develop a dynamic programming algorithm with a set of pruning strategies to compute the adaptive approximation rules efficiently. We theoretically prove that the adaptive approximation rules are globally optimal. We further extend Elf* to Streaming Elf*, i.e., SElf*, which achieves almost the same compression ratio as Elf* while enjoying even higher efficiency in streaming scenarios. We compare Elf* and SElf* with 8 competitors using 22 datasets. The results demonstrate that SElf* achieves a 9.2% relative compression ratio improvement over the best streaming competitor while maintaining similar efficiency, and that Elf* ranks among the most competitive batch compressors. All source code is publicly released.
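The abstract does not spell out what "erasing-based" means, so the following is only a rough illustration of the general idea, not the Elf or Elf* encoding itself. It assumes each value has a known number of decimal places, zeroes as many low mantissa bits as it can while the value still rounds back to the original, and then XORs consecutive erased values so that the result has long runs of trailing zeros. The function names (`erase`, `xor_stream`, `trailing_zeros`) are hypothetical, and recovery by `round` is a simplification chosen for brevity.

```python
import struct

def to_bits(x: float) -> int:
    """Reinterpret a double as its 64-bit IEEE 754 pattern."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def from_bits(b: int) -> float:
    """Reinterpret a 64-bit pattern as a double."""
    return struct.unpack(">d", struct.pack(">Q", b))[0]

def trailing_zeros(b: int) -> int:
    """Number of trailing zero bits in a 64-bit pattern (64 for zero)."""
    return 64 if b == 0 else (b & -b).bit_length() - 1

def erase(x: float, decimals: int) -> int:
    """Zero low mantissa bits while rounding to `decimals` places still
    recovers x. Assumption: x has at most `decimals` decimal places."""
    bits = to_bits(x)
    best = bits
    for k in range(1, 53):  # the mantissa has 52 explicit bits
        candidate = (bits >> k) << k
        if round(from_bits(candidate), decimals) != x:
            break
        best = candidate
    return best

def xor_stream(values, decimals):
    """XOR each erased value with the previous erased value."""
    prev = 0
    out = []
    for v in values:
        cur = erase(v, decimals)
        out.append(cur ^ prev)
        prev = cur
    return out

if __name__ == "__main__":
    series = [3.17, 3.25, 3.18]
    for raw, x in zip(series, xor_stream(series, 2)):
        print(raw, trailing_zeros(to_bits(raw)), trailing_zeros(x))
```

In this sketch the XORed patterns keep at least as many trailing zeros as the less-erased of the two operands, which is what leaves a compact run of "center bits" between the leading and trailing zeros for an encoder to store.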


research
06/28/2023

Erasing-based lossless compression method for streaming floating-point time series

There are a prohibitively large number of floating-point time series dat...
research
11/05/2020

Datasets for Benchmarking Floating-Point Compressors

Compression of floating-point data, both lossy and lossless, is a topic ...
research
11/01/2019

LFZip: Lossy compression of multivariate floating-point time series data via improved prediction

Time series data compression is emerging as an important problem with th...
research
08/07/2023

Lossless preprocessing of floating point data to enhance compression

Data compression algorithms typically rely on identifying repeated seque...
research
03/08/2023

Change a Bit to save Bytes: Compression for Floating Point Time-Series Data

The number of IoT devices is expected to continue its dramatic growth in...
research
01/23/2023

Efficient Encoders for Streaming Sequence Tagging

A naive application of state-of-the-art bidirectional encoders for strea...
research
02/05/2022

DSSIM: a structural similarity index for floating-point data

Data visualization is a critical component in terms of interacting with ...
