Realization of Random Forest for Real-Time Evaluation through Tree Framing

10/27/2020
by Katharina Morik, et al.

The optimization of learning has always been of particular concern for big data analytics. However, the ongoing integration of machine learning models into everyday life also demands that their evaluation be extremely fast and happen in real time. Moreover, in the Internet of Things, the computing facilities that run the learned model are restricted, so the implementation of the model must take the characteristics of the executing platform into account. Although there exist some heuristics that optimize the code, principled approaches for the fast execution of learned models are rare. In this paper, we introduce a method that optimizes the execution of Decision Trees (DT). Decision Trees form the basis of many ensemble methods, such as Random Forests (RF) or Extremely Randomized Trees (ET). For these methods to work best, trees should be as large as possible, which challenges the data and instruction caches of modern CPUs and thus demands a more careful memory layout. Based on a probabilistic view of decision tree execution, we optimize the two most common implementation schemes of decision trees. We discuss the advantages and disadvantages of both implementations and present a theoretically well-founded memory layout which maximizes locality during execution in both cases. The method is applied to three computer architectures, namely ARM (RISC), PPC (Extended RISC) and Intel (CISC), and is automatically adapted to the specific architecture by a code generator. We perform over 1800 experiments on several real-world data sets and report an average speed-up of 2 to 4 across all three architectures when the proposed memory layout is used. Moreover, we find that our implementation outperforms sklearn, which was used to train the models, by a factor of 1500.
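To make the "two most common implementation schemes" concrete, the sketch below shows both in C: a traversal over an explicit node array (the "native" scheme) and the same logic unrolled into branching code, as a code generator might emit it. The struct fields, field names, and the toy tree values are illustrative assumptions only; they do not reproduce the paper's optimized, cache-aware memory layout.

```c
#include <stdint.h>

/* Scheme 1: "native" traversal over an explicit node array.
 * Each node stores the split feature, the threshold and the child
 * indices; leaves carry the prediction. Layout is illustrative. */
typedef struct {
    uint32_t feature;   /* index of the feature to test          */
    float    threshold; /* split threshold                       */
    int32_t  left;      /* index of left child, -1 marks a leaf  */
    int32_t  right;     /* index of right child                  */
    float    value;     /* prediction stored at leaves           */
} Node;

static float predict_native(const Node *nodes, const float *x) {
    int32_t i = 0;
    while (nodes[i].left != -1) {                 /* inner node  */
        i = (x[nodes[i].feature] <= nodes[i].threshold)
                ? nodes[i].left
                : nodes[i].right;
    }
    return nodes[i].value;                        /* reached a leaf */
}

/* Scheme 2: the same (toy) tree compiled into an if-else cascade,
 * the kind of code a per-tree code generator would produce. */
static float predict_ifelse(const float *x) {
    if (x[2] <= 0.5f) {
        if (x[0] <= 1.3f) return 0.0f;
        else              return 1.0f;
    } else {
        return 1.0f;
    }
}
```

In the array scheme, performance hinges on where frequently visited nodes land in memory (data cache); in the generated-code scheme, it hinges on how the branches are ordered in the binary (instruction cache). The paper's contribution is a probabilistic model of which paths are taken, used to choose the node or code ordering in both cases.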

Related research

06/19/2018
Forest Packing: Fast, Parallel Decision Forests
Machine learning has an emerging critical role in high-performance compu...

05/15/2023
Fast Inference of Tree Ensembles on ARM Devices
With the ongoing integration of Machine Learning models into everyday li...

11/10/2020
PACSET (Packed Serialized Trees): Reducing Inference Latency for Tree Ensemble Deployment
We present methods to serialize and deserialize tree ensembles that opti...

10/27/2020
Decision Tree and Random Forest Implementations for Fast Filtering of Sensor Data
With increasing capabilities of energy efficient systems, computational ...

10/16/2020
Emergent and Unspecified Behaviors in Streaming Decision Trees
Hoeffding trees are the state-of-the-art methods in decision tree learni...

07/26/2022
Single MCMC Chain Parallelisation on Decision Trees
Decision trees are highly famous in machine learning and usually acquire...
