An information criterion for automatic gradient tree boosting

An information-theoretic approach is proposed for learning the complexity of classification and regression trees and the number of trees in gradient tree boosting. The optimism (test loss minus training loss) of the greedy leaf-splitting procedure is shown to be the maximum of a Cox-Ingersoll-Ross process, from which a generalization-error-based information criterion is formed. The proposed procedure allows fast local model selection without cross-validation-based hyperparameter tuning, and hence efficient and automatic comparison among the large number of models considered during each boosting iteration. Relative to xgboost, speedups in numerical experiments range from around 10 to about 1400, at similar predictive power measured in terms of test loss.
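To make the idea concrete, the following is a minimal illustrative sketch in Python, not taken from the paper or the agtboost package: it approximates the expected maximum of a Cox-Ingersoll-Ross process by Monte Carlo and uses it as a stand-in optimism term that penalizes a candidate split's training-loss reduction. The CIR parameters a, b, sigma, the time horizon, and the penalty scale below are placeholders, not the values derived in the paper.

    import numpy as np

    def expected_max_cir(a=2.0, b=1.0, sigma=2.0, horizon=1.0,
                         n_steps=1000, n_paths=2000, seed=0):
        """Monte-Carlo estimate of E[max_t X_t] for a CIR process
        dX_t = a (b - X_t) dt + sigma sqrt(X_t) dW_t on [0, horizon],
        using a full-truncation Euler scheme. Parameter values are
        placeholders for illustration only."""
        rng = np.random.default_rng(seed)
        dt = horizon / n_steps
        x = np.full(n_paths, float(b))   # start each path at the long-run mean
        running_max = x.copy()
        for _ in range(n_steps):
            dw = rng.normal(scale=np.sqrt(dt), size=n_paths)
            x = x + a * (b - x) * dt + sigma * np.sqrt(np.maximum(x, 0.0)) * dw
            x = np.maximum(x, 0.0)       # keep the process non-negative
            np.maximum(running_max, x, out=running_max)
        return running_max.mean()

    # Hypothetical split criterion: accept a split only if its training-loss
    # reduction exceeds a scaled estimate of the expected optimism.
    optimism = expected_max_cir()
    def adjusted_split_gain(train_loss_reduction, scale=1.0):
        return train_loss_reduction - scale * optimism

The sketch only illustrates the principle of replacing cross-validation with an analytically motivated optimism penalty; the paper derives the appropriate CIR parameters and the exact form of the criterion.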


Related research

agtboost: Adaptive and Automatic Gradient Tree Boosting Computations (08/28/2020)
agtboost is an R package implementing fast gradient tree boosting comput...

Gradient boosting with extreme-value theory for wildfire prediction (10/18/2021)
This paper details the approach of the team Kohrrelation in the 2021 Ext...

TF Boosted Trees: A scalable TensorFlow based framework for gradient boosting (10/31/2017)
TF Boosted Trees (TFBT) is a new open-sourced framework for the distrib...

Gradient and Newton Boosting for Classification and Regression (08/09/2018)
Boosting algorithms enjoy large popularity due to their high predictive ...

A Comparative Analysis of XGBoost (11/05/2019)
XGBoost is a scalable ensemble technique based on gradient boosting that...

Trees-Based Models for Correlated Data (02/16/2021)
This paper presents a new approach for trees-based regression, such as s...

Infinitesimal gradient boosting (04/26/2021)
We define infinitesimal gradient boosting as a limit of the popular tree...
