RawArray: A Simple, Fast, and Extensible Archival Format for Numeric Data

11/30/2021
by David S. Smith, et al.

Raw data sizes are growing and proliferating in scientific research, driven by the success of data-hungry computational methods such as machine learning. The preponderance of proprietary and shoehorned data formats slows computation and makes it harder to reproduce research and to port methods to new platforms. Here we present the RawArray format: a simple, fast, and extensible format for archival storage of multidimensional numeric arrays on disk. A RawArray file is a simple concatenation of a header array and a data array. The header comprises seven or more 64-bit unsigned integers. The array data can be anything. Arbitrary user metadata can be appended to a RawArray file if desired, for example to store measurement details, color palettes, or geolocation data. We present benchmarks showing a factor of 2–3× speedup over HDF5 for a range of array sizes and a speedup of up to 20× in reading the common deep learning datasets MNIST and CIFAR10.
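To make the "header array plus data array plus optional trailing metadata" layout concrete, here is a minimal Python sketch of a writer and reader. The abstract only states that the header holds seven or more 64-bit unsigned integers; the specific field order used below (magic, flags, element type, element size, payload size in bytes, number of dimensions, then one integer per dimension), the little-endian byte order, the magic value, and the names `write_ra` and `read_ra` are all assumptions made for illustration and should be checked against the full text before use.

```python
import io
import struct

# Assumed magic value: the ASCII bytes "rawarray" read as one 64-bit integer.
MAGIC = int.from_bytes(b"rawarray", "little")


def write_ra(stream, dims, payload, eltype, elbyte, metadata=b""):
    """Write header integers, then raw array bytes, then optional metadata."""
    # Six fixed fields plus one integer per dimension (hypothetical layout).
    header = [MAGIC, 0, eltype, elbyte, len(payload), len(dims), *dims]
    stream.write(struct.pack(f"<{len(header)}Q", *header))
    stream.write(payload)
    stream.write(metadata)  # anything after the payload is user metadata


def read_ra(stream):
    """Read back the header, the array bytes, and any trailing metadata."""
    magic, flags, eltype, elbyte, nbytes, ndims = struct.unpack(
        "<6Q", stream.read(6 * 8)
    )
    if magic != MAGIC:
        raise ValueError("not a file in the assumed layout")
    dims = struct.unpack(f"<{ndims}Q", stream.read(8 * ndims))
    payload = stream.read(nbytes)
    metadata = stream.read()  # whatever remains after the array data
    return {
        "flags": flags,
        "eltype": eltype,
        "elbyte": elbyte,
        "dims": dims,
        "payload": payload,
        "metadata": metadata,
    }


if __name__ == "__main__":
    # Round-trip a 2x3 array of 32-bit floats through an in-memory buffer.
    data = struct.pack("<6f", 0.0, 1.0, 2.0, 3.0, 4.0, 5.0)
    buf = io.BytesIO()
    write_ra(buf, (2, 3), data, eltype=3, elbyte=4, metadata=b'{"unit": "mm"}')
    buf.seek(0)
    print(read_ra(buf)["dims"])
```

Under this assumed layout a one-dimensional array needs exactly seven header integers, which is consistent with the "seven or more" in the abstract. Because the header records the payload length, a reader can stop after the array data and ignore the metadata entirely, or read to the end of the file to recover it.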

