Fréchet random forests

06/04/2019
by Louis Capitaine, et al.

Random forests are a statistical learning method widely used in many areas of scientific research, mainly for their ability to learn complex relationships between input and output variables and their capacity to handle high-dimensional data. However, data are increasingly complex: repeated measures of omics, images leading to shapes, curves, and so on. The random forest method is not specifically tailored to such data. In this paper, we introduce Fréchet trees and Fréchet random forests, which handle data whose input and output variables take values in general metric spaces (which can be unordered). To this end, a new way of splitting the nodes of trees is introduced, and the prediction procedures of trees and forests are generalized. The out-of-bag error and variable importance score of random forests are then naturally adapted. Finally, the method is studied in the special case of regression on curve shapes, through both a simulation study and a real dataset from an HIV vaccine trial.
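To make the idea of predicting in a general metric space concrete, the following is a minimal sketch of a Fréchet mean, the point that minimises the sum of squared distances to a set of observations. It is not taken from the paper: the names `frechet_medoid`, `frechet_variance` and `sup_distance` are hypothetical, and the search is restricted to the observed points themselves (a medoid), which is a simplifying assumption that needs nothing beyond a distance function.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def frechet_medoid(points: Sequence[T], dist: Callable[[T, T], float]) -> T:
    """Return the observed point minimising the sum of squared distances
    to all observations (a Frechet mean restricted to the sample)."""
    return min(points, key=lambda c: sum(dist(c, p) ** 2 for p in points))


def frechet_variance(points: Sequence[T], dist: Callable[[T, T], float]) -> float:
    """Mean squared distance of the observations to their medoid."""
    center = frechet_medoid(points, dist)
    return sum(dist(center, p) ** 2 for p in points) / len(points)


def sup_distance(f: Sequence[float], g: Sequence[float]) -> float:
    """Illustrative metric on curves sampled on a common grid:
    the largest pointwise gap between the two curves."""
    return max(abs(a - b) for a, b in zip(f, g))


# Four toy curves sampled at three time points; the last one is an outlier.
curves = [
    (0.0, 1.0, 2.0),
    (0.0, 1.5, 2.5),
    (0.0, 1.2, 2.1),
    (5.0, 5.0, 5.0),
]
center = frechet_medoid(curves, sup_distance)
spread = frechet_variance(curves, sup_distance)
```

Only `dist` has to change to move from curves to another kind of object, which is the property that makes a metric-space formulation attractive. How the paper actually computes node representatives and splits is described in the full text, not here.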

