SDC Resilient Error-bounded Lossy Compressor

10/07/2020
by Sihuan Li, et al.

Lossy compression is one of the most important strategies for coping with the large volumes of data produced by scientific applications; however, little work has been done to make it resilient against silent data corruptions (SDC). SDC is becoming non-negligible for two reasons: exascale computing demands complex scientific simulations that produce vast volumes of data, and some instruments and devices (such as interplanetary space probes) need to transfer large amounts of data in an error-prone environment. In this paper, we propose an SDC-resilient error-bounded lossy compressor built upon the SZ compression framework. Specifically, we adopt a new independent-block-wise model that decomposes the entire dataset into many independent sub-blocks, each of which is compressed separately. We then design and implement a series of error detection and correction strategies based on SZ. We are the first to extend algorithm-based fault tolerance (ABFT) to lossy compression. Our proposed solution incurs negligible execution overhead in the absence of soft errors. When soft errors do occur, it keeps the decompressed data within the user-specified error bound, with only a very limited degradation of the compression ratio.
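
The block-wise idea can be illustrated with a minimal sketch. This is not the SZ algorithm or the authors' implementation. It assumes plain uniform quantization in place of SZ's prediction and encoding stages, and a simple per-block integer sum as a stand-in for an ABFT-style checksum. The names `compress`, `decompress`, and `quantize_block` are hypothetical.

```python
import math


def quantize_block(values, error_bound):
    """Map each value to an integer code so that |v - 2*eb*code| <= eb."""
    return [round(v / (2 * error_bound)) for v in values]


def compress(data, error_bound, block_size):
    """Split data into independent blocks; store codes plus a checksum per block."""
    blocks = []
    for start in range(0, len(data), block_size):
        codes = quantize_block(data[start:start + block_size], error_bound)
        # Assumption: an integer sum serves as the illustrative checksum.
        blocks.append({"codes": codes, "checksum": sum(codes)})
    return blocks


def decompress(blocks, error_bound):
    """Reconstruct values and report indices of blocks whose checksum mismatches."""
    values, corrupted = [], []
    for index, block in enumerate(blocks):
        if sum(block["codes"]) != block["checksum"]:
            corrupted.append(index)  # detection only; no correction in this sketch
        values.extend(2 * error_bound * code for code in block["codes"])
    return values, corrupted


if __name__ == "__main__":
    data = [math.sin(0.1 * i) for i in range(32)]
    blocks = compress(data, error_bound=1e-3, block_size=8)
    blocks[2]["codes"][3] ^= 1 << 4  # simulate a single bit flip in one block
    restored, corrupted = decompress(blocks, error_bound=1e-3)
    print("blocks flagged as corrupted:", corrupted)
```

Because each block carries its own codes and checksum, a bit flip affects only the block it lands in, and the remaining blocks still decode within the error bound. A full scheme would also need a way to correct or recompute the flagged block, which this sketch leaves out.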
