Instance-Based Uncertainty Estimation for Gradient-Boosted Regression Trees

05/23/2022
by Jonathan Brophy, et al.

We propose Instance-Based Uncertainty estimation for Gradient-boosted regression trees (IBUG), a simple method for extending any GBRT point predictor to produce probabilistic predictions. IBUG computes a non-parametric distribution around a prediction using the k-nearest training instances, where distance is measured with a tree-ensemble kernel. The runtime of IBUG depends on the number of training examples at each leaf in the ensemble, and can be improved by sampling trees or training instances. Empirically, IBUG achieves similar or better performance than the previous state-of-the-art across 22 benchmark regression datasets. IBUG can also achieve improved probabilistic performance by using different base GBRT models, and can model the posterior distribution of a prediction more flexibly than competing methods. Finally, we find that previous methods suffer from poor probabilistic calibration on some datasets, which can be mitigated using a scalar factor tuned on the validation data.
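The k-nearest-instance idea described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define the tree-ensemble kernel, so the sketch assumes that the affinity between two instances is the number of trees in which they fall into the same leaf, and that the spread of the prediction is estimated from the variance of the k highest-affinity training targets. The function names `leaf_affinity` and `knn_uncertainty`, and the representation of each instance as a list of per-tree leaf indices, are hypothetical.

```python
from statistics import pvariance


def leaf_affinity(train_leaves, test_leaves):
    """Assumed kernel: for each training instance, count the trees in which
    it lands in the same leaf as the test instance."""
    return [
        sum(a == b for a, b in zip(row, test_leaves))
        for row in train_leaves
    ]


def knn_uncertainty(train_leaves, train_targets, test_leaves, point_prediction, k):
    """Return the point prediction, a variance estimate, and the targets of
    the k highest-affinity training instances (hypothetical helper)."""
    affinity = leaf_affinity(train_leaves, test_leaves)
    # Sort training indices by affinity, highest first; ties keep input order.
    order = sorted(range(len(affinity)), key=lambda i: affinity[i], reverse=True)
    neighbor_targets = [train_targets[i] for i in order[:k]]
    # Population variance of the neighbors' targets serves as the spread.
    variance = pvariance(neighbor_targets)
    return point_prediction, variance, neighbor_targets


# Toy data: 4 training instances, 3 trees; each entry is a leaf index per tree.
train_leaves = [[0, 1, 2], [0, 1, 3], [4, 1, 2], [5, 6, 7]]
train_targets = [1.0, 2.0, 3.0, 10.0]
test_leaves = [0, 1, 2]

mean, var, targets = knn_uncertainty(
    train_leaves, train_targets, test_leaves, point_prediction=2.1, k=3
)
print(mean, var, targets)
```

In this toy example the fourth training instance shares no leaf with the test instance, so its outlying target does not inflate the variance estimate. The leaf indices would in practice come from a trained GBRT model; obtaining them is outside the scope of this sketch.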


research
06/03/2021

Probabilistic Gradient Boosting Machines for Large-Scale Probabilistic Regression

Gradient Boosting Machines (GBM) are hugely popular for solving tabular ...
research
09/11/2020

TREX: Tree-Ensemble Representer-Point Explanations

How can we identify the training examples that contribute most to the pr...
research
06/08/2022

Bayesian additive regression trees for probabilistic programming

Bayesian additive regression trees (BART) is a non-parametric method to ...
research
11/16/2015

Binary Classifier Calibration using an Ensemble of Near Isotonic Regression Models

Learning accurate probabilistic models from data is crucial in many prac...
research
05/28/2019

Evaluating and Calibrating Uncertainty Prediction in Regression Tasks

Predicting not only the target but also an accurate measure of uncertain...
research
07/25/2021

Relational Boosted Regression Trees

Many tasks use data housed in relational databases to train boosted regr...
