CEAZ: Accelerating Parallel I/O via Hardware-Algorithm Co-Design of Efficient and Adaptive Lossy Compression

06/24/2021
by   Chengming Zhang, et al.

As supercomputers continue to grow toward exascale, the amount of data that must be saved or transmitted is exploding. To address this, many previous works have studied error-bounded lossy compressors as a way to reduce data size and improve I/O performance. However, little work has been done on effectively offloading lossy compression onto FPGA-based SmartNICs to reduce the compression overhead. In this paper, we propose a hardware-algorithm co-design of an efficient and adaptive lossy compressor for scientific data on FPGAs (called CEAZ) to accelerate parallel I/O. Our contribution is fourfold: (1) We propose an efficient Huffman coding approach that can adaptively update Huffman codewords online based on codewords generated offline (from a variety of representative scientific datasets). (2) We derive a theoretical analysis that supports precise control of the compression ratio under an error-bounded compression mode, enabling accurate offline generation of Huffman codewords. This also helps us create a fixed-ratio compression mode for consistent throughput. (3) We develop an efficient compression pipeline by adapting cuSZ's dual-quantization algorithm to our hardware use case. (4) We evaluate CEAZ on five real-world datasets with both a single FPGA board and 128 nodes of the Bridges-2 supercomputer. Experiments show that CEAZ outperforms the second-best FPGA-based lossy compressor by 2X in throughput and 9.6X in compression ratio. It also improves MPI_File_write and MPI_Gather throughputs by up to 25.8X and 24.8X, respectively.
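To make the dual-quantization idea in contribution (3) more concrete, here is a minimal one-dimensional sketch in Python. It is not the CEAZ hardware pipeline and not the cuSZ code. It only assumes the general scheme as commonly described: each value is first prequantized to an integer multiple of twice the error bound, and the integers are then delta-encoded against their predecessor, which yields small symbols suitable for Huffman coding. The function names `dual_quantize` and `reconstruct` are hypothetical.

```python
def dual_quantize(data, error_bound):
    """Prequantize each value to an integer, then delta-encode the integers.

    Assumption: a 1D previous-value predictor stands in for the real predictor.
    """
    step = 2.0 * error_bound
    # Prequantization: the only lossy step; error per value is at most error_bound.
    prequant = [round(x / step) for x in data]
    deltas = []
    prev = 0
    for q in prequant:
        # Postquantization: exact integer difference from the predicted value.
        deltas.append(q - prev)
        prev = q
    return deltas


def reconstruct(deltas, error_bound):
    """Invert the delta encoding and scale back to floating-point values."""
    step = 2.0 * error_bound
    out = []
    acc = 0
    for d in deltas:
        acc += d
        out.append(acc * step)
    return out


if __name__ == "__main__":
    data = [0.0, 0.11, 0.19, 0.32, 0.30]
    eb = 0.01
    codes = dual_quantize(data, eb)
    restored = reconstruct(codes, eb)
    print(codes)
    print(max(abs(a - b) for a, b in zip(data, restored)))
```

Because the prediction operates on already-quantized integers, each delta depends only on the input values and not on previously reconstructed values. This removes the loop-carried dependence on reconstructed data that a classic predict-then-quantize loop has, which is the property usually cited as making the scheme friendly to parallel and pipelined hardware.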

research
05/27/2021

cuSZ(x): Optimizing Error-Bounded Lossy Compression for Scientific Data on GPUs

Error-bounded lossy compression is a critical technique for significantl...
research
07/19/2020

cuSZ: An Efficient GPU-Based Error-Bounded Lossy Compression Framework for Scientific Data

Error-bounded lossy compression is a state-of-the-art data reduction tec...
research
11/10/2017

In-Depth Exploration of Single-Snapshot Lossy Compression Techniques for N-Body Simulations

In situ lossy compression allowing user-controlled data loss can signifi...
research
11/18/2021

COMET: A Novel Memory-Efficient Deep Learning Training Framework by Using Error-Bounded Lossy Compression

Training wide and deep neural networks (DNNs) require large amounts of s...
research
05/03/2018

Polynomial data compression for large-scale physics experiments

The new generation research experiments will introduce huge data surge t...
research
06/23/2018

Optimizing Lossy Compression Rate-Distortion from Automatic Online Selection between SZ and ZFP

With ever-increasing volumes of scientific data produced by HPC applicat...
research
01/13/2021

ZipLine: In-Network Compression at Line Speed

Network appliances continue to offer novel opportunities to offload proc...
