An array-oriented Python interface for FastJet

02/08/2022
by Aryan Roy, et al.

Analysis of high-energy physics (HEP) data is an iterative process in which the results of one step often inform the next. In an exploratory analysis, it is common to perform one computation on a collection of events, then view the results (often with histograms) to decide what to try next. Awkward Array is a Scikit-HEP Python package that enables data analysis with array-at-a-time operations, implementing cuts as slices, combinatorics as composable functions, and so on. However, most C++ HEP libraries, such as FastJet, have an imperative, one-particle-at-a-time interface, which would be inefficient in Python and goes against the grain of the array-at-a-time logic of scientific Python. Therefore, we developed fastjet, a pip-installable Python package that provides the FastJet C++ binaries, the classic (particle-at-a-time) Python interface, and a new array-oriented interface for use with Awkward Array. The new interface streamlines interoperability with scientific Python software beyond HEP, such as machine learning. In one case, adopting this library along with other array-oriented tools accelerated HEP analysis code by a factor of 20. The package was designed to be easily integrated with libraries in the Scikit-HEP ecosystem, including Uproot (file I/O), hist (histogramming), Vector (Lorentz vectors), and Coffea (high-level glue). We discuss the design of the fastjet Python library, including how the classic interface is integrated with the array-oriented interface and with the Vector library for Lorentz vector operations. The new interface was developed as open source.
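
To make the contrast between particle-at-a-time and array-at-a-time logic concrete, here is a minimal sketch that uses only NumPy. It is not the fastjet or Awkward Array API: the toy data, the flat-content-plus-offsets layout, and the helper names `count_loop` and `count_array` are assumptions chosen for illustration. Both functions count, per event, the particles passing a transverse-momentum cut.

```python
import numpy as np

# Hypothetical toy data: particle transverse momenta (GeV) for three events,
# stored as one flat array plus per-event offsets (a jagged-array layout).
pt = np.array([25.0, 8.0, 40.0, 12.0, 5.0, 60.0])
offsets = np.array([0, 3, 3, 6])  # event 0: 3 particles, event 1: none, event 2: 3


def count_loop(pt, offsets, threshold):
    """Particle-at-a-time: explicit Python loops over events and particles."""
    counts = []
    for start, stop in zip(offsets[:-1], offsets[1:]):
        n = 0
        for i in range(start, stop):
            if pt[i] > threshold:
                n += 1
        counts.append(n)
    return counts


def count_array(pt, offsets, threshold):
    """Array-at-a-time: one boolean mask, then per-event sums via offsets."""
    mask = pt > threshold
    # Running total with a leading zero, so that the count for each event
    # is the difference of the totals at its stop and start offsets.
    csum = np.concatenate(([0], np.cumsum(mask)))
    return (csum[offsets[1:]] - csum[offsets[:-1]]).tolist()


loop_counts = count_loop(pt, offsets, 10.0)
array_counts = count_array(pt, offsets, 10.0)

# The cut itself is a single slice of the flat array.
selected_pt = pt[pt > 10.0]
```

The two functions return the same counts, but the second performs the comparison and the reduction in compiled NumPy code rather than in a Python loop. A jagged-array library such as Awkward Array manages the offsets bookkeeping itself, so the cut can be written as a slice on the per-event array directly.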
