MLPF: Efficient machine-learned particle-flow reconstruction using graph neural networks

In general-purpose particle detectors, the particle flow algorithm may be used to reconstruct a coherent particle-level view of the event by combining information from the calorimeters and the trackers, significantly improving the detector resolution for jets and the missing transverse momentum. In view of the planned high-luminosity upgrade of the CERN Large Hadron Collider, it is necessary to revisit existing reconstruction algorithms and ensure that both the physics and computational performance are sufficient in a high-pileup environment. Recent developments in machine learning may offer a prospect for efficient event reconstruction based on parametric models. We introduce MLPF, an end-to-end trainable machine-learned particle flow algorithm for reconstructing particle flow candidates based on parallelizable, computationally efficient, scalable graph neural networks and a multi-task objective. We report the physics and computational performance of the MLPF algorithm on a synthetic dataset of ttbar events in HL-LHC running conditions, including the simulation of multiple interaction effects, and discuss potential next steps and considerations towards ML-based reconstruction in a general purpose particle detector.





1 Introduction

Reconstruction algorithms at general-purpose high-energy particle detectors aim to provide a coherent, well-calibrated physics interpretation of the collision event. Variants of the particle-flow (PF) algorithm have been used at the PETRA Behrend and others (1982), ALEPH Buskulic and others (1995), CMS Sirunyan and others (2017) and ATLAS Aaboud and others (2017) experiments to reconstruct a particle-level interpretation of high-multiplicity hadronic collision events, given individual detector elements such as tracks and calorimeter clusters from a multi-layered, heterogeneous, irregular-geometry detector. The PF algorithm generally correlates tracks and calorimeter clusters from detector layers such as the electromagnetic calorimeter (ECAL), hadron calorimeter (HCAL) and others to reconstruct charged and neutral hadron candidates as well as photons, electrons, and muons with an optimized efficiency and resolution. Existing PF reconstruction implementations are optimized using simulation for each specific experiment because detailed detector characteristics and geometry must be considered for the best possible physics performance.

Recently, there has been significant interest in adapting the PF reconstruction approach for future high-luminosity experimental conditions at the CERN Large Hadron Collider (LHC), as well as for proposed future collider experiments like the Future Circular Collider (FCC). While reconstruction algorithms are often based on an imperative, rule-based approach, the use of supervised machine learning (ML) to define reconstruction parametrically based on data and simulation samples may improve the physics reach of the experiments while offering a modern computing solution that could scale better with the expected progress on ML-specific computing infrastructures, e.g., at high-performance computer centers. In addition to potentially improving the physics performance, one of the motivations for developing ML-based reconstruction is an improved computational performance over standard algorithms in a high-luminosity configuration, which ultimately would allow a more detailed reconstruction to be deployed given a fixed computing budget, as ML algorithms are well-suited to emerging highly parallel computing architectures.

ML-based reconstruction approaches have been proposed for various tasks, including PF Duarte and Vlimant (2020). The clustering of energy deposits in detectors with a realistic, irregular-geometry detector using graph neural networks has been first proposed in Ref. Qasim et al. (2019). The ML-based reconstruction of overlapping signals without a regular grid was further developed in Ref. Kieseler (2020), where an optimization scheme for reconstructing a variable number of particles based on a potential function using an object condensation approach was proposed. The clustering of energy deposits from particle decays with potential overlaps is an essential input to PF reconstruction. In Ref. Di Bello et al. (2020), various ML models including GNNs and computer-vision models have been studied for reconstructing neutral hadrons from multi-layered granular calorimeter images and tracking information. In particle gun samples, the ML-based approaches achieved a significant improvement in neutral hadron energy resolution over the default algorithm, an important step towards a fully parametric, simulation-driven reconstruction using ML.

In this paper, we build on the previous ML-based reconstruction approaches by extending the ML-based PF algorithm to reconstruct particle candidates in events with a large number of simultaneous pileup (PU) collisions. In Section 2, we propose a benchmark dataset that has the main components for a particle-level reconstruction of charged and neutral hadrons with PU. In Section 3, we build on the existing ML-based reconstruction and propose a GNN-based machine-learned particle-flow (MLPF) algorithm where the runtime scales approximately linearly with the input size. Furthermore, in Section 4, we characterize the performance of the MLPF model on the benchmark dataset in terms of hadron reconstruction efficiency, fake rate and resolution, comparing it to the baseline PF reconstruction, while also demonstrating using synthetic data that MLPF reconstruction can be computationally efficient and scalable. Finally, in Section 5 we discuss some potential issues and next steps for ML-based PF reconstruction.

2 Physics simulation

We use pythia Sjöstrand et al. (2006, 2008) and delphes de Favereau et al. (2014) from the HepSim software repository Chekanov (2015) to generate a particle-level dataset of 50,000 top quark-antiquark ($t\bar{t}$) events produced in proton-proton collisions at 14 TeV, overlaid with minimum bias events corresponding to a PU of 200 on average. The dataset consists of detector hits as the input, generator particles as the ground truth and reconstructed particles from delphes for additional validation. The delphes model corresponds to a CMS-like detector with a multi-layered charged particle tracker, an electromagnetic calorimeter and a hadron calorimeter.

Although this simplified simulation does not include important physics effects such as pair production, bremsstrahlung, nuclear interactions, electromagnetic showering or a detailed detector simulation, it allows the study of overall per-particle reconstruction properties for charged and neutral hadrons in a high-PU environment. Different reconstruction approaches can be developed and compared on this simplified dataset, where the expected performance is straightforward to assess, including from the aspect of computational complexity.

The inputs to PF are charged particle tracks and calorimeter clusters. We use these high-level detector inputs (elements), rather than low-level tracker hits or unclustered calorimeter hits, to closely follow how PF is implemented in existing reconstruction chains, where successive reconstruction steps are decoupled, such that each step can be optimized and characterized individually. In this toy dataset, tracks are characterized by transverse momentum ($p_T$) (see footnote 1), charge, and the pseudorapidity and azimuthal angle coordinates on the inner ($\eta$, $\phi$) and outer ($\eta_\mathrm{outer}$, $\phi_\mathrm{outer}$) surfaces of the tracker. The track $\eta$ and $\phi$ coordinates are additionally smeared with a 1% Gaussian resolution to model a finite tracker resolution. Calorimeter clusters are characterized by electromagnetic or hadron energy $E$ and $\eta$, $\phi$ coordinates. In this simulation, an event has $N$ detector inputs on average.

Footnote 1: As is common for collider physics, we use a Cartesian coordinate system with the $z$ axis oriented along the beam axis, the $x$ axis on the horizontal plane, and the $y$ axis oriented upward. The $x$ and $y$ axes define the transverse plane, while the $z$ axis identifies the longitudinal direction. The azimuthal angle $\phi$ is computed with respect to the $x$ axis. The polar angle $\theta$ is used to compute the pseudorapidity $\eta = -\log(\tan(\theta/2))$. The transverse momentum ($p_T$) is the projection of the particle momentum on the ($x$, $y$) plane. We fix units such that $c = \hbar = 1$.
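The coordinate convention can be made concrete with a short sketch that converts a Cartesian momentum vector into the transverse momentum, pseudorapidity and azimuthal angle used throughout the text. The function name `kinematics` is hypothetical and is not taken from the MLPF code.

```python
import math


def kinematics(px, py, pz):
    """Return (pt, eta, phi) for a momentum vector, with z along the beam axis.

    The pseudorapidity is undefined for a momentum exactly along the beam
    (pt == 0); this sketch does not handle that case.
    """
    pt = math.hypot(px, py)               # projection on the transverse (x, y) plane
    phi = math.atan2(py, px)              # azimuthal angle, measured from the x axis
    theta = math.atan2(pt, pz)            # polar angle, measured from the z axis
    eta = -math.log(math.tan(theta / 2))  # pseudorapidity
    return pt, eta, phi


if __name__ == "__main__":
    print(kinematics(3.0, 4.0, 0.0))
```

A particle emitted perpendicular to the beam has a pseudorapidity of zero, and the magnitude of the pseudorapidity grows as the direction approaches the beam axis.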

The targets for PF reconstruction are stable generator-level particles that are associated to at least one detector element, as particles that leave no detector hits are not reconstructable. Generator particles are characterized by a particle identification (PID) which may take one of the following categorical values: charged hadron, neutral hadron, photon, electron, or muon. In case multiple generator particles all deposit their energy completely to a single calorimeter cluster, we treat them as reconstructable only in aggregate. In this case, the generator particles are merged by adding the momenta and assigning the merged particle the PID of the highest-energy sub-particle. In addition, charged hadrons outside the tracker acceptance are indistinguishable from neutral hadrons; we therefore relabel such generated charged hadrons as neutral hadrons. We also set a lower energy threshold on reconstructable neutral hadrons, based on the delphes rule-based PF reconstruction, and ignore neutral hadrons that do not pass this threshold. A single event from the dataset is visualized in Fig. 1, demonstrating the input multiplicity and particle distribution in the event. We show the differential distributions of the generator-level particles in the simulated dataset in Fig. 2.
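The merging rule for generator particles that are fully absorbed by the same calorimeter cluster can be sketched as follows. This is a minimal illustration, not the authors' preprocessing code: the dictionary layout, the `cluster` key and the function name `merge_by_cluster` are assumptions made for the example.

```python
from collections import defaultdict


def merge_by_cluster(particles):
    """Merge generator particles that deposit all their energy in one cluster.

    Each particle is a dict with keys pid, px, py, pz, e and cluster, where
    `cluster` is the index of the single absorbing cluster, or None if the
    particle is not fully contained in one cluster (hypothetical layout).
    """
    groups = defaultdict(list)
    merged = []
    for p in particles:
        if p["cluster"] is None:
            merged.append(dict(p))  # left untouched
        else:
            groups[p["cluster"]].append(p)
    for cluster, group in groups.items():
        leader = max(group, key=lambda p: p["e"])  # highest-energy sub-particle
        merged.append({
            "pid": leader["pid"],                  # PID taken from the leader
            "px": sum(p["px"] for p in group),     # momenta are added
            "py": sum(p["py"] for p in group),
            "pz": sum(p["pz"] for p in group),
            "e": sum(p["e"] for p in group),
            "cluster": cluster,
        })
    return merged


if __name__ == "__main__":
    gen = [
        {"pid": "neutral hadron", "px": 1.0, "py": 0.0, "pz": 2.0, "e": 10.0, "cluster": 0},
        {"pid": "photon", "px": 0.5, "py": 0.5, "pz": 1.0, "e": 5.0, "cluster": 0},
        {"pid": "charged hadron", "px": 2.0, "py": 1.0, "pz": 0.0, "e": 3.0, "cluster": None},
    ]
    for p in merge_by_cluster(gen):
        print(p)
```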

Figure 1: A simulated event from the MLPF dataset with 200 PU interactions. The input tracks are shown in gray, with the trajectory curvature being defined by the inner and outer $\eta$, $\phi$ coordinates. Electromagnetic (hadron) calorimeter clusters are shown in blue (orange), with the size corresponding to cluster energy for visualization purposes. We also show the locations of the generator particles (all types) with red cross markers. The radii and thus the $x$, $y$-coordinates of the tracker, ECAL and HCAL surfaces are arbitrary for visualization purposes.
Figure 2: The $p_T$ (upper) and $\eta$ (lower) distributions of the generator particles in the simulated dataset, split by particle type.

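We also store the PF candidates reconstructed by delphes for comparison purposes. The delphes rule-based PF algorithm is described in detail in Ref. de Favereau et al. (2014), identifying charged and neutral hadrons based on track and calorimeter cluster overlaps and energy subtraction. Photons, electrons and muons are identified by delphes based on the generator particle associated to the corresponding track or calorimeter cluster. Each event is now fully characterized by the set of generator particles $Y = \{y_i\}$ (target vectors) and the set of detector inputs $X = \{x_i\}$ (input vectors), with

$$y_i = [\mathrm{PID}, p_T, E, \eta, \phi, q], \quad \mathrm{PID} \in \{\text{charged hadron}, \text{neutral hadron}, \gamma, e^\pm, \mu^\pm\},$$

$$x_i = [\mathrm{type}, p_T, E_\mathrm{ECAL}, E_\mathrm{HCAL}, \eta, \phi, \eta_\mathrm{outer}, \phi_\mathrm{outer}, q], \quad \mathrm{type} \in \{\text{track}, \text{cluster}\}.$$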


For input tracks, only the type, $p_T$, $\eta$, $\phi$, $\eta_\mathrm{outer}$, $\phi_\mathrm{outer}$, and $q$ features are filled. Similarly, for input clusters, only the type, $E_\mathrm{ECAL}$, $E_\mathrm{HCAL}$, $\eta$ and $\phi$ entries are filled. Unfilled features for both tracks and clusters are set to zero. In future iterations of MLPF, it may be beneficial to represent input elements of different types with separate data matrices to improve the computational efficiency of the model.
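The shared, zero-filled feature layout for tracks and clusters can be illustrated with a small sketch. The column order, the integer type codes and the helper names below are assumptions for the example and may differ from the published dataset.

```python
# Assumed column order of one input row (hypothetical, for illustration only).
FEATURES = ["type", "pt", "e_ecal", "e_hcal", "eta", "phi",
            "eta_outer", "phi_outer", "charge"]
PADDING, TRACK, CLUSTER = 0, 1, 2  # hypothetical integer encoding of the type


def track_row(pt, eta, phi, eta_outer, phi_outer, charge):
    """Row for a track: calorimeter energy columns stay at zero."""
    row = dict.fromkeys(FEATURES, 0.0)
    row.update(type=TRACK, pt=pt, eta=eta, phi=phi,
               eta_outer=eta_outer, phi_outer=phi_outer, charge=charge)
    return [row[name] for name in FEATURES]


def cluster_row(e_ecal, e_hcal, eta, phi):
    """Row for a calorimeter cluster: track-only columns stay at zero."""
    row = dict.fromkeys(FEATURES, 0.0)
    row.update(type=CLUSTER, e_ecal=e_ecal, e_hcal=e_hcal, eta=eta, phi=phi)
    return [row[name] for name in FEATURES]


def pad_event(rows, n):
    """Zero-pad (or truncate) an event to exactly n rows."""
    empty = [0.0] * len(FEATURES)
    return [list(r) for r in rows[:n]] + [list(empty) for _ in range(n - len(rows))]


if __name__ == "__main__":
    event = [track_row(12.3, 0.4, 1.1, 0.45, 1.3, -1.0),
             cluster_row(0.0, 25.0, 0.5, 1.2)]
    for row in pad_event(event, 4):
        print(row)
```

Keeping both element types in one matrix makes batching simple at the cost of storing zeros, which is the trade-off the text mentions when it suggests separate matrices per element type.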

Functionally, the detector is modelled in simulation by a function $S(Y) = X$ that produces a set of detector signals $X$ from the generator-level inputs $Y$ for an event. Reconstruction imperfectly approximates the inverse of that function, $R(X) \approx S^{-1}(X) = Y$. In the following section, we approximate the reconstruction as set-to-set translation and implement a baseline MLPF reconstruction using graph neural networks.

3 ML-based PF reconstruction

For a given set of detector inputs $X$, we want to predict a set of particle candidates $Y'$ that closely approximates the target generator particle set $Y$. The target and predicted sets may have a different number of elements, depending on the quality of the prediction. For use in ML using gradient descent, this requires a computationally efficient set-to-set metric $||Y - Y'|| \in \mathbb{R}$ to be used as the loss function.

We simplify the problem numerically by first zero-padding the target set $Y$ such that $|Y| = |X|$. Following Ref. Kieseler (2020), this turns the problem of predicting a variable number of particles into a multi-classification prediction by adding an additional "no particle" class to the classes already defined by the target PID. Since the target set now has a predefined size, we may compute the loss function which approximates reconstruction quality element-by-element:

$$||Y - Y'|| \equiv \sum_{j \in \mathrm{event}} L(y_j, y'_j), \qquad L(y_j, y'_j) \equiv \mathrm{CLS}(c_j, c'_j) + \alpha\, \mathrm{REG}(p_j, p'_j),$$

where the target values and predictions $y_j = [c_j; p_j]$, $y'_j = [c'_j; p'_j]$ are decomposed such that the multi-classification is encapsulated in the scores $c'_j$ and one-hot encoded classes $c_j$, while the momentum and charge regression values are in $p_j$ and $p'_j$. We use CLS to denote the multi-classification loss (e.g. categorical cross-entropy), while REG denotes the regression loss (e.g. mean-squared error) for the momentum components weighted appropriately by a coefficient $\alpha$. This per-particle loss function serves as a baseline optimization target for the ML training. Further physics improvements may be reached by extending the loss to take into account event-level quantities, either by using an energy flow distance as proposed in Ref. Komiske et al. (2019a, b); Romao et al. (2020), or using a generative adversarial network (GAN) setup by optimizing the reconstruction network in tandem with a classifier that is trained to distinguish between the target and reconstructed events, given the detector inputs.
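The element-by-element loss described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: class index 0 stands for "no particle", the classification term is a softmax cross-entropy, the regression term is a mean-squared error, and the regression term is not masked for padded elements. The function names and the default weight `alpha=1.0` are hypothetical.

```python
import math


def cls_loss(target_class, scores):
    """Categorical cross-entropy of softmax(scores) against a class index."""
    m = max(scores)  # subtract the maximum for numerical stability
    log_norm = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_norm - scores[target_class]


def reg_loss(target, pred):
    """Mean-squared error over the regressed momentum and charge components."""
    return sum((t - p) ** 2 for t, p in zip(target, pred)) / len(target)


def event_loss(targets, preds, alpha=1.0):
    """Sum of per-element losses CLS + alpha * REG over one zero-padded event.

    targets: list of (class_index, regression_values)
    preds:   list of (class_scores, regression_values), same length and order
    """
    total = 0.0
    for (c, p), (scores, p_pred) in zip(targets, preds):
        total += cls_loss(c, scores) + alpha * reg_loss(p, p_pred)
    return total


if __name__ == "__main__":
    targets = [(1, [1.0, 0.5]), (0, [0.0, 0.0])]  # second element is padding
    preds = [([0.0, 3.0, 0.0, 0.0, 0.0, 0.0], [0.9, 0.4]),
             ([2.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.1])]
    print(event_loss(targets, preds, alpha=0.5))
```

Because the target set is padded to the size of the input set, the loss is a plain sum over aligned pairs and needs no set matching step.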

Furthermore, for PF reconstruction, the target generator particles are often geometrically and energetically close to well-identifiable detector inputs. In physics terms, a charged hadron is reconstructed based on a track, while a neutral hadron candidate can always be associated to at least one primary source cluster, with additional corrections taken from other nearby detector inputs. Therefore, we may choose to preprocess the inputs such that for a given arbitrary ordering of the detector inputs (sets of vectors are represented as matrices with some arbitrary ordering for ML training), the target set is arranged such that if a target particle can be associated to a single detector input, it is arranged to be in the same location in the sequence. This data preprocessing step speeds up model convergence, but does not introduce any additional assumptions to the model.

3.1 Graph neural network implementation

Figure 3: Functional overview of the end-to-end trainable MLPF setup with GNNs. The event is represented as a set of detector elements $x_i$. The set is transformed into a graph by the graph building step, which is implemented here using a locality-sensitive hashing (LSH) approximation of k-nearest neighbors (kNN). The graph nodes are then encoded using a message passing step, implemented using graph convolutional nets. The encoded elements are decoded to the output feature vectors $y'_i$ using pointwise feedforward networks.

Given the set of detector inputs for the event, $X = \{x_i\}$, we adopt a message passing approach for reconstructing the PF candidates $Y' = \{y'_i\}$. First, we need to construct a trainable graph adjacency matrix $A$ for the given set of input elements, represented with the graph building block in Fig. 3. The input set is heterogeneous, containing elements of different types (tracks, ECAL clusters, HCAL clusters) in different feature spaces. Therefore, defining a static neighborhood graph in the feature space in advance is not straightforward. A generic approach to learnable graph construction using k-nearest neighbors (kNN) in an embedding space, known as GravNet, has been proposed in Ref. Qasim et al. (2019), where the authors demonstrated that a learnable, dynamically-generated graph structure significantly improves the physics performance of an ML-based reconstruction algorithm for calorimeter clustering.

However, naive kNN graph implementations have $O(N^2)$ time complexity: for each set element out of $N$, we must order the other $N - 1$ elements by distance and pick the $k$ closest. For reconstruction, given equivalent physics performance, both computational efficiency (a low overall runtime) and scalability (subquadratic time and memory scaling with the input size) are desirable.

We build on the GravNet approach Qasim et al. (2019) by using an approximate kNN graph construction algorithm based on locality-sensitive hashing (LSH) to improve the time complexity of the graph building algorithm. The LSH approach has been recently proposed Kitaev et al. (2020) for approximating and thus speeding up ML models that take into account element-to-element relations using an optimizable matrix known as self-attention Vaswani et al. (2017). The method divides the input into bins using a hash function, such that nearby elements are likely to be assigned to the same bin. The bins contain only a small number of elements, such that constructing a kNN graph in the bin is fast.

In the kNN+LSH approach, the input elements $x_i$ are projected into a $d_K$-dimensional embedding space by a trainable, elementwise feed-forward network $\mathrm{FFN}(x_i | w) = z_i \in \mathbb{R}^{d_K}$. As in Ref. Kitaev et al. (2020), we now assign each element into one of $d_B$ bins indexed by integers $b_i$ using $h(z_i) = b_i \in [1, \dots, d_B]$, where $h(x)$ is a hash function that assigns nearby $x$ to the same bin with a high probability. We define the hash function as

$$h(x) = \arg\max\, [xP; -xP],$$

where $[u; v]$ denotes the concatenation of two vectors $u$ and $v$ and $P$ is a random projection matrix of size $[d_K, d_B/2]$ drawn from the normal distribution at initialization.
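The random-projection hash can be sketched in plain Python as follows. This follows the argmax-over-concatenated-projections scheme of Kitaev et al. (2020); the function names, the fixed seed and the 0-based bin indices are choices made for this illustration, and the embedding vector is taken as given rather than produced by a trained network.

```python
import random


def make_projection(d_k, n_bins, seed=0):
    """Random matrix of shape [d_k, n_bins / 2] drawn from a unit normal."""
    if n_bins % 2:
        raise ValueError("n_bins must be even")
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n_bins // 2)] for _ in range(d_k)]


def lsh_bin(z, proj):
    """Bin index of embedding z: argmax over the concatenation [zP; -zP]."""
    zp = [sum(z[i] * proj[i][j] for i in range(len(z)))
          for j in range(len(proj[0]))]
    concat = zp + [-v for v in zp]
    return max(range(len(concat)), key=concat.__getitem__)


if __name__ == "__main__":
    proj = make_projection(d_k=4, n_bins=8, seed=1)
    z = [0.3, -1.2, 0.7, 2.0]
    print(lsh_bin(z, proj), lsh_bin([1.01 * v for v in z], proj))
```

The hash depends only on the direction of the embedded vector, so rescaling an embedding by a positive factor leaves its bin unchanged, while flipping its sign moves it to the opposite half of the bin range.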

We now build kNN graphs based on the embedded elements $z_i$ in each of the LSH bins, such that the full sparse graph adjacency $A_{ij}$ in the input set $X$ is defined by the sum of the subgraphs. The embedding function can be optimized with backpropagation and gradient descent using the values of the nonzero elements of $A_{ij}$. Overall, this graph building approach has $O(N \log N)$ time complexity and does not require the allocation of an $N^2$ matrix at any point. The LSH step generates disjoint subgraphs in the full event graph. This is motivated by physics, as we expect subregions of the detector to be reconstructable approximately independently. The existing PF algorithm in the CMS detector employs a similar approach by producing disjoint PF blocks as an intermediate step of the algorithm Sirunyan and others (2017).
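Building the kNN graph separately inside each bin can be sketched as below. The sparse adjacency is stored as a dictionary keyed by index pairs, so no dense matrix over all elements is ever allocated. The edge weight `exp(-d^2)` is an assumption made for illustration (a distance-dependent weight in the spirit of GravNet), and the function names are hypothetical.

```python
import math
from collections import defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two embedded points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def binned_knn(points, bins, k):
    """Sparse adjacency {(i, j): weight}; neighbors are searched per bin only."""
    members = defaultdict(list)
    for i, b in enumerate(bins):
        members[b].append(i)
    adjacency = {}
    for idx in members.values():       # each bin is an independent subgraph
        for i in idx:
            dists = sorted((sq_dist(points[i], points[j]), j)
                           for j in idx if j != i)
            for d2, j in dists[:k]:    # keep the k closest elements in the bin
                adjacency[(i, j)] = math.exp(-d2)
    return adjacency


if __name__ == "__main__":
    pts = [[0.0], [0.1], [0.3], [5.0], [5.1]]
    print(sorted(binned_knn(pts, bins=[0, 0, 0, 1, 1], k=1)))
```

With a fixed bin size, the neighbor search per element touches only the elements of its own bin, so the cost of this step grows with the number of bins rather than with the square of the event size.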

Having built the graph dynamically, we now use a variant of message passing Gilmer et al. (2017) to create hidden encoded states of the input elements taking into account the graph structure. As a first baseline, we use a variant of graph convolutional network (GCN) that combines local and global node-level information Kipf and Welling (2017); Wu et al. (2019); Xin et al. (2020). This choice is motivated by implementation and evaluation efficiency in establishing a baseline. This message passing step is represented in Fig. 3 by the GCN block. Finally, we decode the encoded nodes $h_i$ to the target outputs with an elementwise feed-forward network that combines the hidden state with the original input element $x_i$ using a skip connection.

We have a joint graph building, but separate graph convolution and decoding layers for the multi-classification and the momentum and charge regression subtasks. This allows each subtask to be retrained separately in addition to a combined end-to-end training should the need arise. The classification and regression losses are combined with constant empirical weights such that they have an approximately equal contribution to the full training loss. It may be beneficial to use specific multi-task training strategies such as gradient surgery Yu et al. (2020) to further improve the performance across all subtasks.

The multi-classification prediction outputs for each node are converted to particle probabilities with the softmax operation. We choose the PID with the highest probability for the reconstructed particle candidate, while ensuring that the probability meets a threshold that matches a fake rate working point defined by the baseline delphes PF reconstruction algorithm.
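The PID selection step can be sketched as a softmax followed by a thresholded argmax. The class ordering, the threshold value of 0.5 and the fallback to "no particle" when the threshold is not met are assumptions for this example; in the text the threshold is tuned to match the fake rate working point of the delphes baseline.

```python
import math

# Hypothetical class ordering; index 0 is the added "no particle" class.
CLASSES = ["no particle", "charged hadron", "neutral hadron",
           "photon", "electron", "muon"]


def softmax(scores):
    """Convert raw per-class scores to probabilities that sum to one."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def select_pid(scores, threshold=0.5):
    """Return the index of the most probable class, or 0 if it is below threshold."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    if best != 0 and probs[best] < threshold:
        return 0  # assumed fallback: reject the candidate
    return best


if __name__ == "__main__":
    print(CLASSES[select_pid([0.0, 5.0, 0.0, 0.0, 0.0, 0.0])])
    print(CLASSES[select_pid([0.0, 1.0, 1.0, 0.0, 0.0, 0.0])])
```

Raising the threshold trades efficiency for a lower fake rate, which is how a working point comparable to the rule-based baseline can be chosen.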

The predicted graph structure is an intermediate step in the model and is not used in the loss function explicitly—we only optimize the model with respect to reconstruction quality. However, using the graph structure in the loss function when a known ground truth is available may further improve the optimization process. In addition, access to the predicted graph structure may be helpful in evaluating the interpretability of the model.

The set of networks for graph building, message passing and decoding has been implemented with TensorFlow 2.3 and can be trained end-to-end using gradient descent. The inputs are zero-padded to a fixed number of elements, with the LSH bin size, the corresponding number of bins, and the number of nearest neighbors $k$ fixed in advance. We use two hidden layers for each encoding and decoding net with 256 units each, with two successive graph convolutions between the encoding and decoding steps. Exponential linear activations (ELU) Clevert et al. (2016) were used between hidden layers, and linear activations were used for the outputs. Overall, the model has approximately 1.5 million trainable weights and 25,000 constant weights for the random projections. For optimization, we used the Adam algorithm Kingma and Ba (2015) for 300 epochs, with separate sets of events used for training and testing. The events are processed in minibatches of five simultaneous events per graphics processing unit (GPU); we train for approximately 24 hours on five RTX 2070S GPUs using data parallelism. We report the results of the multi-task learning problem in the next section. The code and dataset to reproduce the training are made available on the Zenodo platform Pata et al. (2021a, b).

4 Results

In the model assessment, we focus on the charged and neutral hadron performance in the simulation events that were not used for training. In typical PF reconstruction, charged hadrons are reconstructed based on tracking information, while neutral hadrons are reconstructed from HCAL clusters not matched to tracks. In Fig. 4, we see that both the baseline rule-based PF in delphes and MLPF models generally predict the charged and neutral particle multiplicity with a high degree of correlation, suggesting that the multi-classification model is appropriate for reconstructing variable-multiplicity events. We note that the particle multiplicities from the MLPF model generally correlate better with the generator-level target than the rule-based PF.

Figure 4: True and predicted particle multiplicity for MLPF and delphes PF for charged hadrons (upper) and neutral hadrons (lower). Both models show a high degree of correlation between the generated and predicted particle multiplicity, with the MLPF model reconstructing the charged and neutral particle multiplicity with better resolution.

In Fig. 5, we compare the per-particle multi-classification confusion matrix for both reconstruction methods. We see overall a similar classification performance, with the neutral hadron identification efficiency being around 0.9 for both, while the MLPF algorithm has a slightly higher efficiency (0.906 for MLPF versus 0.888 for the rule-based PF, as listed in Table 1). Improved Monte Carlo generation, subsampling, or weighting may further improve reconstruction performance for particles or kinematic configurations that occur rarely in a physical simulation. In this set of results, we apply no weighting on the events or particles in the event.

Figure 5: Particle identification confusion matrices with gen-level particles as the ground truth, with the baseline delphes PF (upper) and MLPF (lower). The rows have been normalized to unit probability, corresponding to normalizing the dataset according to the generated PID.

In Fig. 6, we see that the $\eta$-dependent charged hadron efficiency (true positive rate) for the MLPF model is somewhat higher than for the rule-based PF baseline, while the fake rate (false positive rate) is zero for both, as the delphes simulation includes no fake tracks. From Fig. 7, we observe a similar result for the energy-dependent efficiency and fake rate of neutral hadrons. Both algorithms exhibit a turn-on at low energies and show a constant behaviour at high energies, with MLPF being comparable or slightly better than the rule-based PF baseline.

Figure 6: The efficiency of reconstructing charged hadron candidates as a function of the generator particle pseudorapidity $\eta$. The MLPF model has uniformly higher efficiency. The fake rate is zero for both models, since the simulation does not contain fake tracks.
Figure 7: The efficiency (upper) and fake rate (lower) of reconstructing neutral hadron candidates as a function of the generator particle energy. The MLPF model shows comparable performance to the delphes PF benchmark, with a somewhat lower fake rate at a similar efficiency.
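The efficiency and fake rate quoted in these figures can be computed from aligned lists of true and predicted class labels, as in the sketch below. The definitions used here are assumptions for illustration: efficiency is the fraction of true particles of a given type that are predicted with that type, and fake rate is the fraction of predicted candidates of that type whose aligned target is of a different type (including "no particle"). The function name is hypothetical.

```python
def efficiency_and_fake_rate(true_pids, pred_pids, pid):
    """Return (efficiency, fake_rate) for one particle type.

    true_pids and pred_pids are aligned element by element; a value that
    differs from `pid` (for example 0 for "no particle") counts as a miss.
    """
    n_true = sum(t == pid for t in true_pids)
    n_pred = sum(p == pid for p in pred_pids)
    n_match = sum(t == pid and p == pid for t, p in zip(true_pids, pred_pids))
    efficiency = n_match / n_true if n_true else float("nan")
    fake_rate = (n_pred - n_match) / n_pred if n_pred else float("nan")
    return efficiency, fake_rate


if __name__ == "__main__":
    true_pids = [1, 1, 2, 0]
    pred_pids = [1, 0, 2, 2]
    print(efficiency_and_fake_rate(true_pids, pred_pids, pid=1))
    print(efficiency_and_fake_rate(true_pids, pred_pids, pid=2))
```

To reproduce curves like those in the figures, the same counts would be accumulated in bins of the generator-level pseudorapidity or energy rather than over the whole sample.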

Furthermore, we see in Figs. 9 and 8 that the energy ($p_T$) and angular resolution of the MLPF algorithm are generally comparable to the baseline for neutral (charged) hadrons.

Overall, these results demonstrate that formulating PF reconstruction as a multi-task ML problem of simultaneously identifying charged and neutral hadrons in a high-PU environment and predicting their momentum may offer comparable or improved physics performance over hand-written algorithms in the presence of sufficient simulation samples and careful optimization. The performance characteristics for the baseline and the proposed MLPF model are summarized in Table 1.

Figure 8: The $p_T$ and $\eta$ resolution of the delphes PF benchmark and the MLPF model for charged hadrons. The $p_T$ resolution is comparable for both algorithms, with the angular resolution being driven by the smearing of the track coordinates.

We also characterize the computational performance of the GNN-based MLPF algorithm. In Fig. 10, we see that the average inference time scales roughly linearly with the input size, which is necessary for scalable reconstruction at high PU. We also note that the GNN-based MLPF algorithm runs natively on a GPU, with the current runtime at around 50 ms/event on a consumer-grade GPU for a full 200 PU event. The algorithm may be relatively simple to port efficiently to any computing architecture that supports common ML frameworks like TensorFlow without significant investment. This includes GPUs and potentially even field-programmable gate arrays or ML-specific processors such as the GraphCore intelligence processing units Mohan et al. (2020) through specialized ML compilers Duarte et al. (2018); Iiyama and others (2021); Heintz et al. (2020). These coprocessing accelerators can be integrated into existing CPU-based experimental software frameworks as a scalable service that grows to meet the transient demand Duarte et al. (2019); Krupa et al. (2020); Rankin et al. (2020).

Figure 9: The energy and $\eta$ resolution of the delphes PF benchmark and the MLPF model for neutral hadrons. Both reconstruction algorithms show comparable performance.
Figure 10: Average runtime of the MLPF GNN model with a varying input event size (upper) and the inference time reduction with increasing batch size (lower). For a simulated event equivalent to 200 PU collisions, we see a runtime of around 50 ms, which scales approximately linearly with respect to the input event size. We see a weak dependence on batch size, with batching having a minor positive effect for low-pileup events. The runtime for each event size is averaged over 100 randomly generated events over three independent runs. The timing tests were done using an Nvidia RTX 2060S GPU and an Intel i7-10700@2.9GHz CPU. We assume a linear scaling between PU and the number of detector elements.
                          Charged hadrons           Neutral hadrons
Metric                    Rule-based PF   MLPF      Rule-based PF   MLPF
Efficiency                0.903           0.952*    0.888           0.906*
Fake rate                 0               0         0.191           0.069*
$p_T$ ($E$) resolution    0.211           0.137*    0.351           0.324*
$\eta$ resolution         0.245*          0.250     0.05*           0.059
$N$ resolution            0.009           0.004*    0.032           0.013*

Table 1: Particle reconstruction efficiency and fake rate, multiplicity $N$, $p_T$ ($E$) and $\eta$ resolutions for charged (neutral) hadrons, comparing the rule-based PF baseline and the proposed MLPF method. Values marked with an asterisk indicate better performance.

5 Conclusion and outlook

We have proposed an algorithm for MLPF reconstruction in a high-pileup environment for a general-purpose multilayered particle detector based on transforming input sets of detector elements to the output set of reconstructed particles. The MLPF implementation with GNNs is based on graph building with an LSH approximation for kNN, dubbed kNN+LSH, and message passing using graph convolutions. Based on a benchmark particle-level dataset generated using pythia 8 and delphes 3, the MLPF GNN reconstruction offers comparable physics performance for charged and neutral hadrons to the baseline rule-based PF algorithm in delphes, demonstrating that a purely parametric ML-based PF reconstruction can reach the physics performance of existing reconstruction algorithms, while allowing for greater portability across various computing architectures at a possibly reduced cost. The inference time empirically scales approximately linearly with the input size, which is useful for efficient evaluation in the high-luminosity phase of the LHC. In addition, the ML-based reconstruction model may offer useful features for downstream physics analysis like per-particle probabilities for different reconstruction interpretations, uncertainty estimates, and optimizable particle-level reconstruction for rare processes including displaced signatures.

The MLPF model can be further improved with a more physics-motivated optimization criterion, i.e. a loss function that takes into account event-level, in addition to particle-level differences. While we have shown that a per-particle loss function already converges to an adequate physics performance overall, improved event-based losses such as the object condensation approach or energy flow may be useful. In addition, an event-based loss may be defined using an adversarial classifier that is trained to distinguish the target particles from the reconstructed particles.

Reconstruction algorithms need to adapt to changing experimental conditions—this may be addressed in MLPF by a periodic retraining on simulation that includes up-to-date running condition data such as the beam-spot location, dead channels, and latest calibrations. In a realistic MLPF training, care must be taken that the reconstruction qualities of rare particles and particles in the low-statistics tails of distributions are not adversely affected and that the reconstruction performance remains uniform. This may be addressed with detailed simulations and weighting schemes. In addition, for a reliable physics result, the interpretability of the reconstruction is essential. The reconstructed graph structure can provide information about causal relations between the input detector elements and the reconstructed particle candidates.

In order to develop a usable ML-based PF reconstruction algorithm, a realistic high-pileup simulated dataset that includes detailed interactions with the detector material needs to be used for the ML model optimization. To evaluate the reconstruction performance, efficiencies, fake rates, and resolutions for all particle types need to be studied in detail as a function of particle kinematics and detector conditions. Furthermore, high-level derived quantities such as pileup-dependent jet and missing transverse momentum resolutions must be assessed for a more complete characterization of the reconstruction performance. With ongoing work in ML-based track and calorimeter cluster reconstruction upstream of PF, and ML-based reconstruction of high-level objects including jets and jet classification probabilities downstream of PF, care must be taken that the various steps are optimized and interfaced coherently.

Finally, the MLPF algorithm is inherently parallelizable and can take advantage of hardware acceleration of GNNs via GPUs, FPGAs or emerging ML-specific processors. Current experimental software frameworks can easily integrate coprocessing accelerators as a scalable service. By harnessing heterogeneous computing and parallelizable, efficient ML, the burgeoning computing demand for event reconstruction tasks in the high-luminosity LHC era can be met while maintaining or even surpassing the current physics performance.

We thank our colleagues in the CMS Collaboration, including Josh Bendavid, Kenichi Hatakeyama, Lindsey Gray, and Jan Kieseler, for helpful feedback on this work. J. P. was supported by the Prime NSF Tier2 award 1624356 and the U.S. Department of Energy (DOE), Office of Science, Office of High Energy Physics under Award No. DE-SC0011925 while at Caltech, and is currently supported by the Mobilitas Pluss Grant No. MOBTP187 of the Estonian Research Council. J. D. is supported by the DOE, Office of Science, Office of High Energy Physics Early Career Research program under Award No. DE-SC0021187 and by the DOE, Office of Advanced Scientific Computing Research under Award No. DE-SC0021396 (FAIR4HEP). M. P. is supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant Agreement No. 772369). J-R. V. is partially supported by the same ERC grant and by the DOE, Office of Science, Office of High Energy Physics under Award No. DE-SC0011925, DE-SC0019227, and DE-AC02-07CH11359. We are grateful to Caltech and the Kavli Foundation for their support of undergraduate student research in cross-cutting areas of machine learning and domain sciences. This work was mainly conducted at “iBanks,” the AI GPU cluster at Caltech, and on the NICPB GPU resources, supported by European Regional Development Fund through the CoE program grant TK133. We acknowledge Nvidia, SuperMicro and the Kavli Foundation for their support of iBanks. Part of this work was also performed using the Pacific Research Platform Nautilus HyperCluster supported by NSF awards CNS-1730158, ACI-1540112, ACI-1541349, OAC-1826967, the University of California Office of the President, and the University of California San Diego’s California Institute for Telecommunications and Information Technology/Qualcomm Institute. Thanks to CENIC for the 100 Gbps networks.


  • M. Aaboud et al. (2017) Jet reconstruction and performance using particle flow with the ATLAS detector. Eur. Phys. J. C 77, pp. 466. External Links: 1703.10485, Document Cited by: §1.
  • H.J. Behrend et al. (1982) An analysis of the charged and neutral energy flow in $e^+e^-$ hadronic annihilation at 34 GeV, and a determination of the QCD effective coupling constant. Phys. Lett. B 113, pp. 427. External Links: Document Cited by: §1.
  • D. Buskulic et al. (1995) Performance of the ALEPH detector at LEP. Nucl. Instrum. Meth. A 360, pp. 481. External Links: Document Cited by: §1.
  • S.V. Chekanov (2015) HepSim: a repository with predictions for high-energy physics experiments. Adv. High Energy Phys. 2015, pp. 136093. External Links: 1403.1886, Document Cited by: §2.
  • D. Clevert, T. Unterthiner, and S. Hochreiter (2016) Fast and accurate deep network learning by exponential linear units (ELUs). In 4th International Conference on Learning Representations, External Links: 1511.07289 Cited by: §3.1.
  • J. de Favereau, C. Delaere, P. Demin, A. Giammanco, V. Lemaître, A. Mertens, and M. Selvaggi (2014) DELPHES 3, a modular framework for fast simulation of a generic collider experiment. JHEP 02, pp. 057. External Links: 1307.6346, Document Cited by: §2, §2.
  • F. A. Di Bello, S. Ganguly, E. Gross, M. Kado, M. Pitt, L. Santi, and J. Shlomi (2020) Towards a computer vision particle flow. External Links: 2003.08863 Cited by: §1.
  • J. Duarte, S. Han, P. Harris, S. Jindariani, E. Kreinar, B. Kreis, J. Ngadiuba, M. Pierini, R. Rivera, N. Tran, and Z. Wu (2018) Fast inference of deep neural networks in FPGAs for particle physics. JINST 13 (07), pp. P07027. External Links: Document, 1804.06913 Cited by: §4.
  • J. Duarte, P. Harris, S. Hauck, B. Holzman, S. Hsu, S. Jindariani, S. Khan, B. Kreis, B. Lee, M. Liu, V. Lončar, J. Ngadiuba, K. Pedro, B. Perez, M. Pierini, D. Rankin, N. Tran, M. Trahms, A. Tsaris, C. Versteeg, T. W. Way, D. Werran, and Z. Wu (2019) FPGA-accelerated machine learning inference as a service for particle physics computing. Comput. Softw. Big Sci. 3, pp. 13. External Links: 1904.08986, Document Cited by: §4.
  • J. Duarte and J. Vlimant (2020) Graph neural networks for particle tracking and reconstruction. In Artificial Intelligence for Particle Physics, Note: Submitted to Int. J. Mod. Phys. A External Links: 2012.01249 Cited by: §1.
  • J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl (2017) Neural message passing for quantum chemistry. In Proceedings of the 34th International Conference on Machine Learning, D. Precup and Y. W. Teh (Eds.), Vol. 70, pp. 1263. External Links: 1704.01212, Link Cited by: §3.1.
  • A. Heintz, V. Razavimaleki, J. Duarte, G. DeZoort, I. Ojalvo, S. Thais, M. Atkinson, M. Neubauer, L. Gray, S. Jindariani, N. Tran, P. Harris, D. Rankin, T. Aarrestad, V. Loncar, M. Pierini, S. Summers, J. Ngadiuba, M. Liu, E. Kreinar, and Z. Wu (2020) Accelerated charged particle tracking with graph neural networks on FPGAs. In 3rd Machine Learning and the Physical Sciences Workshop at the 34th Annual Conference on Neural Information Processing Systems, External Links: 2012.01563, Link Cited by: §4.
  • Y. Iiyama et al. (2021) Distance-weighted graph neural networks on FPGAs for real-time particle reconstruction in high energy physics. Front. Big Data 3, pp. 44. External Links: 2008.03601, Document, ISSN 2624-909X Cited by: §4.
  • J. Kieseler (2020) Object condensation: one-stage grid-free multi-object reconstruction in physics detectors, graph and image data. Eur. Phys. J. C 80, pp. 886. External Links: 2002.03605, Document Cited by: §1, §3.
  • D. P. Kingma and J. Ba (2015) Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations, Y. Bengio and Y. LeCun (Eds.), External Links: 1412.6980 Cited by: §3.1.
  • T. N. Kipf and M. Welling (2017) Semi-supervised classification with graph convolutional networks. In 5th International Conference on Learning Representations, External Links: 1609.02907, Link Cited by: §3.1.
  • N. Kitaev, Ł. Kaiser, and A. Levskaya (2020) Reformer: the efficient transformer. In 8th International Conference on Learning Representations, External Links: 2001.04451, Link Cited by: §3.1, §3.1.
  • P. T. Komiske, E. M. Metodiev, and J. Thaler (2019a) Energy flow networks: deep sets for particle jets. JHEP 01, pp. 121. External Links: 1810.05165, Document Cited by: §3.
  • P. T. Komiske, E. M. Metodiev, and J. Thaler (2019b) Metric space of collider events. Phys. Rev. Lett. 123, pp. 041801. External Links: 1902.02346, Document Cited by: §3.
  • J. Krupa, K. Lin, M. A. Flechas, J. Dinsmore, J. Duarte, P. Harris, S. Hauck, B. Holzman, S. Hsu, T. Klijnsma, M. Liu, K. Pedro, N. Suaysom, M. Trahms, and N. Tran (2020) GPU coprocessors as a service for deep learning inference in high energy physics. Note: Submitted to Mach. Learn.: Sci. Technol. External Links: 2007.10359 Cited by: §4.
  • L. R. M. Mohan, A. Marshall, S. Maddrell-Mander, D. O’Hanlon, K. Petridis, J. Rademacker, V. Rege, and A. Titterton (2020) Studying the potential of Graphcore IPUs for applications in particle physics. External Links: 2008.09210 Cited by: §4.
  • J. Pata, J. M. Duarte, and A. Tepper (2021a) jpata/particleflow: MLPF paper software release. Zenodo. External Links: Document, Link Cited by: §3.1.
  • J. Pata, J. M. Duarte, J. Vlimant, M. Pierini, and M. Spiropulu (2021b) Simulated particle-level dataset of $t\bar{t}$ with PU 200 using Pythia8+Delphes3 for machine learned particle flow (MLPF). Zenodo. External Links: Document Cited by: §3.1.
  • S. R. Qasim, J. Kieseler, Y. Iiyama, and M. Pierini (2019) Learning representations of irregular particle-detector geometry with distance-weighted graph networks. Eur. Phys. J. C 79, pp. 608. External Links: 1902.07987, Document Cited by: §1, §3.1, §3.1.
  • D. S. Rankin, J. Krupa, P. Harris, M. A. Flechas, B. Holzman, T. Klijnsma, K. Pedro, N. Tran, S. Hauck, S. Hsu, M. Trahms, K. Lin, Y. Lou, T. Ho, J. Duarte, and M. Liu (2020) FPGAs-as-a-service toolkit (FaaST). In 2020 IEEE/ACM International Workshop on Heterogeneous High-performance Reconfigurable Computing (H2RC), External Links: 2010.08556, Document Cited by: §4.
  • M. C. Romao, N. F. Castro, J. G. Milhano, R. Pedro, and T. Vale (2020) Use of a generalized energy mover’s distance in the search for rare phenomena at colliders. External Links: 2004.09360 Cited by: §3.
  • A.M. Sirunyan et al. (2017) Particle-flow reconstruction and global event description with the CMS detector. JINST 12, pp. P10003. External Links: 1706.04965, Document Cited by: §1, §3.1.
  • T. Sjöstrand, S. Mrenna, and P. Z. Skands (2006) PYTHIA 6.4 physics and manual. JHEP 05, pp. 026. External Links: hep-ph/0603175, Document Cited by: §2.
  • T. Sjöstrand, S. Mrenna, and P. Z. Skands (2008) A brief introduction to PYTHIA 8.1. Comput. Phys. Commun. 178, pp. 852. External Links: 0710.3820, Document Cited by: §2.
  • A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin (2017) Attention is all you need. In Advances in Neural Information Processing Systems, I. Guyon, U. V. Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett (Eds.), Vol. 30, pp. 5998. External Links: 1706.03762, Link Cited by: §3.1.
  • F. Wu, T. Zhang, A. H. de Souza Jr., C. Fifty, T. Yu, and K. Q. Weinberger (2019) Simplifying graph convolutional networks. In Proceedings of the 36th International Conference on Machine Learning, K. Chaudhuri and R. Salakhutdinov (Eds.), Vol. 97, pp. 6861. External Links: 1902.07153, Link Cited by: §3.1.
  • X. Xin, A. Karatzoglou, I. Arapakis, and J. M. Jose (2020) Graph highway networks. External Links: 2004.04635 Cited by: §3.1.
  • T. Yu, S. Kumar, A. Gupta, S. Levine, K. Hausman, and C. Finn (2020) Gradient surgery for multi-task learning. In Advances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M.F. Balcan, and H. Lin (Eds.), Vol. 33. External Links: 2001.06782, Link Cited by: §3.1.