Log In Sign Up

Improving Prediction-Based Lossy Compression Dramatically Via Ratio-Quality Modeling

by   Sian Jin, et al.

Error-bounded lossy compression is one of the most effective techniques for scientific data reduction. However, the traditional trial-and-error approach used to configure lossy compressors for finding the optimal trade-off between reconstructed data quality and compression ratio is prohibitively expensive. To resolve this issue, we develop a general-purpose analytical ratio-quality model based on the prediction-based lossy compression framework, which can effectively foresee the reduced data quality and compression ratio, as well as the impact of the lossy compressed data on post-hoc analysis quality. Our analytical model significantly improves the prediction-based lossy compression in three use-cases: (1) optimization of predictor by selecting the best-fit predictor; (2) memory compression with a target ratio; and (3) in-situ compression optimization by fine-grained error-bound tuning of various data partitions. We evaluate our analytical model on 10 scientific datasets, demonstrating its high accuracy (93.47 computational cost (up to 18.7X lower than the trial-and-error approach) for estimating the compression ratio and the impact of lossy compression on post-hoc analysis quality. We also verified the high efficiency of our ratio-quality model using different applications across the three use-cases. In addition, the experiment demonstrates that our modeling based approach reduces the time to store the 3D Reverse Time Migration data by up to 3.4X over the traditional solution using 128 CPU cores from 8 compute nodes.


Adaptive Configuration of In Situ Lossy Compression for Cosmology Simulations via Fine-Grained Rate-Quality Modeling

Extreme-scale cosmological simulations have been widely used by today's ...

Exploring Autoencoder-Based Error-Bounded Compression for Scientific Data

Error-bounded lossy compression is becoming an indispensable technique f...

SZ3: A Modular Framework for Composing Prediction-Based Error-Bounded Lossy Compressors

Today's scientific simulations require a significant reduction of data v...

SCI: A spectrum concentrated implicit neural compression for biomedical data

Massive collection and explosive growth of the huge amount of medical da...

Understanding GPU-Based Lossy Compression for Extreme-Scale Cosmological Simulations

To help understand our universe better, researchers and scientists curre...

Optimizing Huffman Decoding for Error-Bounded Lossy Compression on GPUs

More and more HPC applications require fast and effective compression te...

PhishZip: A New Compression-based Algorithm for Detecting Phishing Websites

Phishing has grown significantly in the past few years and is predicted ...