GPU-acceleration for Large-scale Tree Boosting

06/26/2017
by Huan Zhang, et al.

In this paper, we present a novel massively parallel algorithm for accelerating decision tree building on GPUs (Graphics Processing Units), a crucial step in training Gradient Boosted Decision Trees (GBDT) and random forests. Previous GPU-based tree building algorithms rely on parallel multi-scan or radix sort to find the exact tree split, and thus suffer from scalability and performance issues. We show that using a histogram-based algorithm to find the best split approximately is more efficient and scalable on GPUs. By identifying the differences between classical GPU-based image histogram construction and feature histogram construction in decision tree training, we develop a fast feature histogram building kernel for GPUs, with carefully designed computation and memory access sequences that reduce atomic update conflicts and maximize GPU utilization. Our algorithm can be used as a drop-in replacement for histogram construction in popular tree boosting systems to improve their scalability. As an example, when training GBDT on the epsilon dataset, our method on a mainstream GPU is 7-8 times faster than the histogram-based algorithm on CPU in LightGBM and 25 times faster than the exact split finding algorithm in XGBoost on a dual-socket 28-core Xeon server, while achieving similar prediction accuracy.
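To make the histogram-based idea concrete, here is a minimal CPU sketch in Python. It is not the paper's GPU kernel. It assumes feature values have already been quantized into integer bin indices, and it scores splits with a regularized second-order gain of the kind used in common boosting libraries; the abstract does not state the gain criterion, so that choice is an assumption. The function names `build_histogram` and `best_split` and the parameter `lam` are hypothetical.

```python
from typing import List, Tuple


def build_histogram(
    bins: List[int], grads: List[float], hess: List[float], num_bins: int
) -> Tuple[List[float], List[float]]:
    """Accumulate per-bin gradient and hessian sums for one feature.

    On a GPU this loop is the part that runs in parallel; concurrent
    updates to the same bin are what require atomic additions.
    """
    grad_hist = [0.0] * num_bins
    hess_hist = [0.0] * num_bins
    for b, g, h in zip(bins, grads, hess):
        grad_hist[b] += g
        hess_hist[b] += h
    return grad_hist, hess_hist


def best_split(
    grad_hist: List[float], hess_hist: List[float], lam: float = 1.0
) -> Tuple[int, float]:
    """Scan bin boundaries and return (bin index, gain) of the best split.

    A split at bin b sends bins 0..b to the left child and the rest to the
    right child. The gain formula is an assumed illustration, not taken
    from the source. Returns (-1, 0.0) if no split has positive gain.
    """
    total_g = sum(grad_hist)
    total_h = sum(hess_hist)
    parent_score = total_g * total_g / (total_h + lam)

    best_bin, best_gain = -1, 0.0
    left_g = left_h = 0.0
    # The last bin is skipped: splitting there leaves the right child empty.
    for b in range(len(grad_hist) - 1):
        left_g += grad_hist[b]
        left_h += hess_hist[b]
        right_g = total_g - left_g
        right_h = total_h - left_h
        gain = (
            left_g * left_g / (left_h + lam)
            + right_g * right_g / (right_h + lam)
            - parent_score
        )
        if gain > best_gain:
            best_bin, best_gain = b, gain
    return best_bin, best_gain
```

The split search touches only `num_bins` entries per feature rather than every sorted sample, which is why the histogram approach finds an approximate rather than exact split; the cost is concentrated in `build_histogram`, the step the paper targets with its GPU kernel.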

research
06/29/2018

XGBoost: Scalable GPU Accelerated Learning

We describe the multi-GPU gradient boosting algorithm implemented in the...
research
11/06/2017

Fast Integral Histogram Computations on GPU for Real-Time Video Analytics

In many Multimedia content analytics frameworks feature likelihood maps ...
research
11/20/2012

Tera-scale Astronomical Data Analysis and Visualization

We present a high-performance, graphics processing unit (GPU)-based fram...
research
05/18/2023

Unbiased Gradient Boosting Decision Tree with Unbiased Feature Importance

Gradient Boosting Decision Tree (GBDT) has achieved remarkable success i...
research
05/23/2018

GPU Accelerated Cascade Hashing Image Matching for Large Scale 3D Reconstruction

Image feature point matching is a key step in Structure from Motion(SFM)...
research
12/20/2018

Efficient logic architecture in training gradient boosting decision tree for high-performance and edge computing

This study proposes a logic architecture for the high-speed and power ef...
research
11/23/2022

SketchBoost: Fast Gradient Boosted Decision Tree for Multioutput Problems

Gradient Boosted Decision Tree (GBDT) is a widely-used machine learning ...
