ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R

08/18/2015
by Marvin N. Wright, et al.

We introduce the C++ application and R package ranger, a fast implementation of random forests for high-dimensional data. Ensembles of classification, regression, and survival trees are supported. We describe the implementation, provide examples, validate the package against a reference implementation, and compare runtime and memory usage with other implementations. The new software scales best with the number of features, samples, trees, and features tried for splitting. Finally, we show that ranger is the fastest and most memory-efficient implementation of random forests for analyzing data on the scale of a genome-wide association study.


