Influential Observations in Bayesian Regression Tree Models

03/26/2022
by   Matthew T. Pratola, et al.
0

BCART (Bayesian Classification and Regression Trees) and BART (Bayesian Additive Regression Trees) are popular Bayesian regression models widely applicable in modern regression problems. Their popularity is intimately tied to the ability to flexibly model complex responses depending on high-dimensional inputs while simultaneously being able to quantify uncertainties. This ability to quantify uncertainties is key, as it allows researchers to perform appropriate inferential analyses in settings that have generally been too difficult to handle using the Bayesian approach. However, surprisingly little work has been done to evaluate the sensitivity of these modern regression models to violations of modeling assumptions. In particular, we will consider influential observations, which one reasonably would imagine to be common – or at least a concern – in the big-data setting. In this paper, we consider both the problem of detecting influential observations and adjusting predictions to not be unduly affected by such potentially problematic data. We consider two detection diagnostics for Bayesian tree models, one an analogue of Cook's distance and the other taking the form of a divergence measure, and then propose an importance sampling algorithm to re-weight previously sampled posterior draws so as to remove the effects of influential data in a computationally efficient manner. Finally, our methods are demonstrated on real-world data where blind application of the models can lead to poor predictions and inference.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
08/18/2022

Visualizations for Bayesian Additive Regression Trees

Tree-based regression and classification has become a standard tool in m...
research
04/05/2022

GP-BART: a novel Bayesian additive regression trees approach using Gaussian processes

The Bayesian additive regression trees (BART) model is an ensemble metho...
research
06/29/2018

Fully Nonparametric Bayesian Additive Regression Trees

Bayesian Additive Regression Trees (BART) is fully Bayesian approach to ...
research
11/02/2022

Variational Hierarchical Mixtures for Learning Probabilistic Inverse Dynamics

Well-calibrated probabilistic regression models are a crucial learning c...
research
06/16/2016

The Effect of Heteroscedasticity on Regression Trees

Regression trees are becoming increasingly popular as omnibus predicting...
research
12/02/2020

Spatial Multivariate Trees for Big Data Bayesian Regression

High resolution geospatial data are challenging because standard geostat...
research
01/31/2023

On the Stability of General Bayesian Inference

We study the stability of posterior predictive inferences to the specifi...

Please sign up or login with your details

Forgot password? Click here to reset