A new BART prior for flexible modeling with categorical predictors

11/08/2022
by   Sameer K. Deshpande, et al.
0

Default implementations of Bayesian Additive Regression Trees (BART) represent categorical predictors using several binary indicators, one for each level of each categorical predictor. Regression trees built with these indicators partition the levels using a “remove one a time strategy.” Unfortunately, the vast majority of partitions of the levels cannot be built with this strategy, severely limiting BART's ability to “borrow strength” across groups of levels. We overcome this limitation with a new class of regression tree and a new decision rule prior that can assign multiple levels to both the left and right child of a decision node. Motivated by spatial applications with areal data, we introduce a further decision rule prior that partitions the areas into spatially contiguous regions by deleting edges from random spanning trees of a suitably defined network. We implemented our new regression tree priors in the flexBART package, which, compared to existing implementations, often yields improved out-of-sample predictive performance without much additional computational burden. We demonstrate the efficacy of flexBART using examples from baseball and the spatiotemporal modeling of crime.

READ FULL TEXT
research
06/12/2017

Random Forests, Decision Trees, and Categorical Predictors: The "Absent Levels" Problem

One of the advantages that decision trees have over many other models is...
research
09/30/2013

MPBART - Multinomial Probit Bayesian Additive Regression Trees

This article proposes Multinomial Probit Bayesian Additive Regression Tr...
research
04/15/2020

Exploiting Categorical Structure Using Tree-Based Methods

Standard methods of using categorical variables as predictors either end...
research
06/12/2020

Bayesian Additive Regression Trees with Model Trees

Bayesian Additive Regression Trees (BART) is a tree-based machine learni...
research
02/16/2017

Tree Ensembles with Rule Structured Horseshoe Regularization

We propose a new Bayesian model for flexible nonlinear regression and cl...
research
02/28/2007

Consumer Profile Identification and Allocation

We propose an easy-to-use methodology to allocate one of the groups whic...
research
08/15/2022

Flexible Bayesian Multiple Comparison Adjustment Using Dirichlet Process and Beta-Binomial Model Priors

Researchers frequently wish to assess the equality or inequality of grou...

Please sign up or login with your details

Forgot password? Click here to reset