Tree-Values: selective inference for regression trees

06/15/2021
by   Anna C. Neufeld, et al.

We consider conducting inference on the output of the Classification and Regression Tree (CART) [Breiman et al., 1984] algorithm. A naive approach to inference that does not account for the fact that the tree was estimated from the data will not achieve standard guarantees, such as Type 1 error rate control and nominal coverage. Thus, we propose a selective inference framework for conducting inference on a fitted CART tree. In a nutshell, we condition on the fact that the tree was estimated from the data. We propose a test for the difference in the mean response between a pair of terminal nodes that controls the selective Type 1 error rate, and a confidence interval for the mean response within a single terminal node that attains the nominal selective coverage. Efficient algorithms for computing the necessary conditioning sets are provided. We apply these methods in simulation and to a dataset involving the association between portion control interventions and caloric intake.
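The inflation of the Type 1 error rate under the naive approach can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes a single split on one feature rather than a full CART tree, Gaussian noise with known variance, and a plain two-sample z-test. The function names (`best_split`, `naive_p_value`, `naive_rejection_rate`) are hypothetical. The response is pure noise, so every rejection is a false positive.

```python
import math
import random


def best_split(y, min_leaf=5):
    """Return the split index k that minimizes the within-leaf sum of squares.

    The observations are assumed to be already ordered by the feature,
    so the two leaves are y[:k] and y[k:].
    """
    n = len(y)
    total = sum(y)
    best_idx, best_gain = None, -math.inf
    left_sum = sum(y[:min_leaf - 1])
    for k in range(min_leaf, n - min_leaf + 1):
        left_sum += y[k - 1]  # left_sum now equals sum(y[:k])
        right_sum = total - left_sum
        # Maximizing this quantity is equivalent to minimizing the
        # total within-leaf sum of squared errors.
        gain = left_sum ** 2 / k + right_sum ** 2 / (n - k)
        if gain > best_gain:
            best_idx, best_gain = k, gain
    return best_idx


def naive_p_value(y, k, sigma=1.0):
    """Two-sided z-test for equal leaf means, ignoring how k was chosen."""
    n = len(y)
    mean_left = sum(y[:k]) / k
    mean_right = sum(y[k:]) / (n - k)
    z = (mean_left - mean_right) / (sigma * math.sqrt(1 / k + 1 / (n - k)))
    return math.erfc(abs(z) / math.sqrt(2))


def naive_rejection_rate(n_sims=2000, n=50, alpha=0.05, sigma=1.0, seed=0):
    """Fraction of null datasets in which the naive test rejects at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        # Global null: the response is unrelated to the feature, so the
        # feature ordering is just an arbitrary ordering of i.i.d. noise.
        y = [rng.gauss(0.0, sigma) for _ in range(n)]
        k = best_split(y)
        if naive_p_value(y, k, sigma) < alpha:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    print("naive rejection rate at alpha = 0.05:", naive_rejection_rate())
```

Because the split is chosen to maximize the separation between the two leaves, the naive test rejects far more often than the nominal level in this setting. A selective test would instead compute the p-value conditional on the event that this particular split was selected.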
