Exploring Object Stores for High-Energy Physics Data Storage

07/15/2021
by Javier López-Gómez, et al.

Over the last two decades, ROOT TTree has been used to store over one exabyte of High-Energy Physics (HEP) events. The TTree columnar on-disk layout has proven ideal for analyses of HEP data, which typically require access to many events but only a subset of the information stored for each of them. Future colliders, and particularly the HL-LHC, will bring an increase of at least one order of magnitude in the volume of generated data. Therefore, the use of modern storage hardware, such as low-latency high-bandwidth NVMe devices and distributed object stores, becomes more important. However, TTree was not designed to optimally exploit modern hardware and may become a bottleneck for data retrieval. The ROOT RNTuple I/O system aims to overcome TTree's limitations and to provide improved efficiency on modern storage systems. In this paper, we extend RNTuple with a backend that uses Intel DAOS as the underlying storage, demonstrating that the RNTuple architecture can accommodate high-performance object stores. From the user's perspective, data can be accessed with minimal changes to the code, that is, by replacing a filesystem path with a DAOS URI. Our performance evaluation shows that the new backend can be used for realistic analyses, while outperforming the compatibility solution provided by the DAOS project.
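
The idea that user code only needs to swap a filesystem path for a DAOS URI can be illustrated with a minimal sketch of dispatching on the location string. This is not the RNTuple implementation or its API: the function name `select_backend`, the backend labels, and the example locations are hypothetical and chosen only for illustration. The sketch assumes a DAOS location is recognizable by a `daos` URI scheme.

```python
from urllib.parse import urlparse


def select_backend(location: str) -> str:
    """Return a (hypothetical) backend label for a storage location string."""
    # Assumption: object-store locations carry a "daos" URI scheme.
    scheme = urlparse(location).scheme
    if scheme == "daos":
        return "daos"
    # Anything else is treated as an ordinary filesystem path.
    return "file"


# Only the location string differs between the two calls; the calling
# code stays the same, which is the point the abstract makes.
print(select_backend("/data/events.root"))
print(select_backend("daos://example-pool/example-container"))
```

In a design like this, the choice of storage system is confined to a single string, so analysis code written against files would not need restructuring to read from an object store.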

