The event data models of LHC experiments are very complex and slow to read. Experiments tolerate this because input I/O time is small compared to the time spent in reconstruction. Experiments do, however, care about data volume, because storing it requires a large amount of expensive disk.
For the analysis case, the situation is different, since the data model is often much simpler. The data volume differs as well: the analysis phase works with a smaller data set, which is often read from fast local storage such as SSD (NVMe). Processing each event then costs little CPU, which allows users to iterate over the events many times quickly.
ROOT IO is an incredibly flexible format. It can easily store the complex objects that correspond to the experiment’s data. At the same time, ROOT has high overheads for the serialization of simple objects.
2 Bulk IO
The typical mechanism for iterating through data in a TTree is a handwritten for-loop. ROOT uses the API shown in Listing 2 to read objects from a branch (a TTree is a structure that contains one or more TBranches). This function runs in two steps. First, it searches the underlying storage medium for the basket in which the event is located and reads that basket into a memory buffer. The TBasket is the data structure that represents the in-memory buffer. ROOT decompresses the buffer and places the uncompressed data in so-called “kernel” space. In the second step, once the basket is in memory, GetEntry deserializes the requested event from the kernel-space buffer and copies it to a user-space buffer.
The Bulk IO interface is a set of APIs built into the existing ROOT IO framework. The user can choose between the regular APIs and the Bulk IO APIs. We implement Bulk IO for three common use cases: TBranch, TTreeReader and RDataFrame. We discuss our interface design and integration in this section.
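The two steps described above can be sketched with a small toy model that does not depend on ROOT. Everything in it (ToyBasket, ToyBranch, AppendBigEndian, SumAllEntries, BasketLoads) is a hypothetical name introduced for illustration, not ROOT's implementation; decompression is omitted, and entries are assumed to be 32-bit IEEE floats stored big-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <utility>
#include <vector>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats");

// Hypothetical stand-in for a basket: consecutive entries, 4 big-endian bytes each.
struct ToyBasket {
    std::int64_t first_entry = 0;     // index of the first entry in this basket
    std::vector<std::uint8_t> bytes;  // serialized payload
    std::int64_t size() const { return static_cast<std::int64_t>(bytes.size() / 4); }
};

// Serialize one float as 4 big-endian bytes (used to build the toy data).
inline void AppendBigEndian(std::vector<std::uint8_t>& out, float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    for (int shift = 24; shift >= 0; shift -= 8)
        out.push_back(static_cast<std::uint8_t>((bits >> shift) & 0xFFu));
}

// Hypothetical stand-in for a branch of float entries.
class ToyBranch {
public:
    void AddBasket(const std::vector<float>& values) {
        ToyBasket b;
        b.first_entry = entries_;
        for (float v : values) AppendBigEndian(b.bytes, v);
        entries_ += static_cast<std::int64_t>(values.size());
        baskets_.push_back(std::move(b));
    }

    // Returns the number of bytes read, or 0 if `entry` is out of range.
    int GetEntry(std::int64_t entry, float& user_value) {
        if (entry < 0 || entry >= entries_) return 0;
        // Step 1: make the basket that holds `entry` the current in-memory buffer.
        if (current_ < 0 || entry < baskets_[current_].first_entry ||
            entry >= baskets_[current_].first_entry + baskets_[current_].size()) {
            current_ = Locate(entry);
            ++basket_loads_;
        }
        // Step 2: deserialize this single entry and copy it to the caller's variable.
        const ToyBasket& b = baskets_[current_];
        const std::uint8_t* p = b.bytes.data() + 4 * (entry - b.first_entry);
        const std::uint32_t bits = (std::uint32_t(p[0]) << 24) | (std::uint32_t(p[1]) << 16) |
                                   (std::uint32_t(p[2]) << 8) | std::uint32_t(p[3]);
        std::memcpy(&user_value, &bits, sizeof bits);
        return 4;
    }

    std::int64_t GetEntries() const { return entries_; }
    int BasketLoads() const { return basket_loads_; }

private:
    // Linear search for the basket containing `entry` (caller checked the range).
    int Locate(std::int64_t entry) const {
        for (std::size_t i = 0; i < baskets_.size(); ++i)
            if (entry < baskets_[i].first_entry + baskets_[i].size())
                return static_cast<int>(i);
        return -1;
    }

    std::vector<ToyBasket> baskets_;
    int current_ = -1;  // index of the basket currently "in memory"
    std::int64_t entries_ = 0;
    int basket_loads_ = 0;
};

// The handwritten loop: one GetEntry call, and one deserialization, per event.
inline double SumAllEntries(ToyBranch& branch) {
    double sum = 0.0;
    float value = 0.0f;
    for (std::int64_t i = 0; i < branch.GetEntries(); ++i)
        if (branch.GetEntry(i, value) > 0) sum += value;
    return sum;
}
```

In this model the basket lookup runs once per basket when entries are read in order, while the deserialization and copy run once per event; that per-event cost is what becomes significant for simple data types.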
3.1 Bulk IO in TBranch
Listing 3.1 shows the Bulk IO API in TBranch, which takes two input arguments: entry and user_buf. The entry argument is the index of the event that the function is going to read. The user_buf argument is a user-space TBuffer passed to the function by reference. At the end of the function call, user_buf should contain the whole basket of data that contains the requested event.
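The calling pattern this implies can be sketched with a toy model that does not use ROOT. ToyBulkBranch and SumBulk are hypothetical names, a std::vector<float> stands in for the user-space TBuffer, and values are kept already deserialized to keep the sketch short. The return convention (number of entries copied, or -1 when the entry is out of range) is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for a branch of floats grouped into baskets.
class ToyBulkBranch {
public:
    void AddBasket(std::vector<float> values) {
        first_entry_.push_back(entries_);
        entries_ += static_cast<std::int64_t>(values.size());
        baskets_.push_back(std::move(values));
    }

    // Copies the whole basket that contains `entry` into `user_buf`.
    // Returns the number of entries copied, or -1 if `entry` is out of range
    // (assumed convention for this sketch).
    int GetBulkEntries(std::int64_t entry, std::vector<float>& user_buf) const {
        if (entry < 0 || entry >= entries_) return -1;
        for (std::size_t i = 0; i < baskets_.size(); ++i) {
            const std::int64_t end =
                first_entry_[i] + static_cast<std::int64_t>(baskets_[i].size());
            if (entry < end) {
                user_buf = baskets_[i];  // one copy for the whole basket
                return static_cast<int>(baskets_[i].size());
            }
        }
        return -1;
    }

    std::int64_t GetEntries() const { return entries_; }

private:
    std::vector<std::vector<float>> baskets_;
    std::vector<std::int64_t> first_entry_;  // first entry index of each basket
    std::int64_t entries_ = 0;
};

// Basket-by-basket loop: one call per basket, then a tight loop over the buffer.
// Starting at entry 0 and advancing by the returned count keeps every call
// aligned with the start of a basket, so user_buf[0] is always `entry`.
inline double SumBulk(const ToyBulkBranch& branch) {
    std::vector<float> user_buf;
    double sum = 0.0;
    std::int64_t entry = 0;
    while (entry < branch.GetEntries()) {
        const int n = branch.GetBulkEntries(entry, user_buf);
        if (n <= 0) break;
        for (int i = 0; i < n; ++i) sum += user_buf[i];
        entry += n;
    }
    return sum;
}
```

Compared with a per-event loop, the number of library calls drops from one per event to one per basket, and the inner loop touches only a contiguous user-owned buffer.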
All tests were conducted on a desktop with a 4-core Intel i5 at 3.2 GHz. A TTree with 100 million float values was read with different APIs. We tested three use cases: GetBulkEntries, TTreeReaderFast and RDataSource.
Figure 1 shows the time spent iterating over all events in the TTree with GetEntry and GetBulkEntries. Figure 2 compares the read time of TTreeReader and TTreeReaderFast. As shown in the figures, Bulk IO is more than 10 times faster than GetEntry and TTreeReader. Bulk IO takes a similar time to read events in both use cases. The TTreeReader interface takes more than 3 times as long to read events as GetEntry, due to the overheads of TTreeReader itself (TTreeReader internally calls GetEntry).
Figure 3 shows the results of Bulk IO in RDataFrame. In the figure, standard RDF shows the performance of regular RDataFrame function calls. Bulk RDF and Bulk RDS show the results of the Bulk APIs. The difference is that the Bulk RDS test detaches RDataSource from the RDataFrame stack and runs directly through RDS function calls. As shown in Figure 3, Bulk RDS outperforms standard RDF by more than a factor of 2. In addition, RDataFrame has extra overheads compared to RDataSource (RDataFrame internally relies on RDataSource); therefore Bulk RDF runs slower than Bulk RDS, but still outperforms standard RDF.
This work was supported by the National Science Foundation under Grant ACI-1450323. This research was done using resources provided by the Holland Computing Center of the University of Nebraska.
-  Brun R and Rademakers F “ROOT - An object oriented data analysis framework”, Nucl. Instr. Meth. Phys. Res. 389 (1997) 81-86
-  Guiraud E, Naumann A and Piparo D “RDataFrame: functional chains for ROOT data analyses”, (2017) doi: 10.5281/zenodo.260230. url: https://doi.org/10.5281/zenodo.260230.
-  Bockelman B, Zhang Z and Pivarski J “Optimizing ROOT IO For Analysis”, J. Phys.: Conf. Ser., 1085 (2018) 032012
-  McKinney W “pandas: a Foundational Python Library for Data Analysis and Statistics”, PyHPC 2011 : Python for High Performance and Scientific Computing, (2011)