A Serverless Engine for High Energy Physics Distributed Analysis

06/02/2022
by Jacek Kuśnierz, et al.

Over the last decade, the Large Hadron Collider (LHC) at CERN has generated an unprecedented volume of data for the High-Energy Physics (HEP) field. Scientific collaborations analysing these data very often require computing power beyond a single machine. This need has traditionally been met by running analyses in distributed environments using stateful, managed batch computing systems. While this approach has been effective so far, current estimates of the field's future computing needs present large scaling challenges. A managed approach may not be the only viable way to tackle them, and serverless architectures could provide an interesting alternative with even larger scaling potential. This work describes a novel approach to running real HEP scientific applications through a distributed serverless computing engine. The engine is built upon ROOT, a well-established HEP data analysis software, and distributes its computations to a large pool of concurrent executions on the Amazon Web Services (AWS) Lambda serverless platform. With the developed tool, physicists can access datasets stored at CERN (including those under restricted access policies) and process them on remote infrastructures outside of their typical environment. The serverless functions running the analysis are monitored at runtime to gather performance metrics for both data- and computation-intensive workloads.

research
02/03/2022

Astronomical data organization, management and access in Scientific Data Lakes

The data volumes stored in telescope archives is constantly increasing d...
research
12/23/2013

Early Observations on Performance of Google Compute Engine for Scientific Computing

Although Cloud computing emerged for business applications in industry, ...
research
07/24/2023

Prototyping a ROOT-based distributed analysis workflow for HL-LHC: the CMS use case

The challenges expected for the next era of the Large Hadron Collider (L...
research
11/10/2015

BOAT: a cross-platform software for data analysis and numerical computing with arbitrary-precision

BOAT is a free cross-platform software for statistical data analysis and...
research
07/12/2018

SciTokens: Capability-Based Secure Access to Remote Scientific Data

The management of security credentials (e.g., passwords, secret keys) fo...
research
07/20/2020

GPU coprocessors as a service for deep learning inference in high energy physics

In the next decade, the demands for computing in large scientific experi...
research
01/16/2019

Fundamentals of effective cloud management for the new NASA Astrophysics Data System

The new NASA Astrophysics Data System (ADS) is designed with a serviceor...
