
An Efficient Post-Selection Inference on High-Order Interaction Models

by S. Suzumura, et al.
Nagoya Institute of Technology
The University of Tokyo

Finding statistically significant high-order interaction features in predictive modeling is an important but challenging task. The difficulty lies in the fact that, for recent applications with high-dimensional covariates, the number of possible high-order interaction features is extremely large. Identifying statistically significant features from such a huge pool of candidates is highly challenging both computationally and statistically. To address this problem, we consider a two-stage algorithm in which we first select a set of high-order interaction features by marginal screening, and then make statistical inferences on the regression model fitted only with the selected features. Such statistical inference is called post-selection inference (PSI), and it is receiving increasing attention in the literature. One of the seminal recent advances in the PSI literature is the work of Lee et al., in which the authors presented an algorithmic framework for computing exact sampling distributions in PSI. A main challenge in applying their approach to our high-order interaction models is that PSI in general depends not only on the selected features but also on the unselected features, which makes it hard to apply to our extremely high-dimensional high-order interaction models. The goal of this paper is to overcome this difficulty by introducing a novel, efficient method for PSI. Our key idea is to exploit the underlying tree structure among high-order interaction features and to develop a tree-pruning method that quickly identifies a group of unselected features guaranteed to have no influence on PSI. The experimental results indicate that the proposed method allows us to reliably identify statistically significant high-order interaction features at reasonable computational cost.
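To make the tree structure concrete, here is a minimal sketch of the first stage only (marginal screening), not of the PSI computation itself. It rests on assumptions that the abstract does not state: covariates are binary (0/1), an interaction feature is the product of its covariates, the screening score is |x^T y|, and the top k features are kept. Under these assumptions a child in the tree is active on a subset of its parent's samples, so the larger of the parent's positive and negative partial sums of y bounds the score of every descendant. The function name `screen_interactions` and its parameters are hypothetical.

```python
import heapq


def screen_interactions(X, y, k, max_order):
    """Return the k interaction features with the largest |x^T y|.

    X is a list of rows of 0/1 covariates and y a list of responses.
    Features are explored depth-first; a subtree is skipped when no
    descendant can beat the current k-th best score.
    """
    p = len(X[0])
    heap = []  # min-heap of (score, covariate index tuple)

    def visit(items, rows):
        # rows: samples on which every covariate in `items` equals 1
        score = abs(sum(y[i] for i in rows))
        if len(heap) < k:
            heapq.heappush(heap, (score, items))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, items))
        if len(items) == max_order:
            return
        # Any descendant is active on a subset of `rows`, so its score
        # cannot exceed the larger one-sided partial sum of y.
        pos = sum(y[i] for i in rows if y[i] > 0)
        neg = -sum(y[i] for i in rows if y[i] < 0)
        if len(heap) == k and max(pos, neg) <= heap[0][0]:
            return  # prune the whole subtree
        for j in range(items[-1] + 1, p):
            visit(items + (j,), [i for i in rows if X[i][j] == 1])

    for j in range(p):
        visit((j,), [i for i in range(len(y)) if X[i][j] == 1])
    return sorted(heap, reverse=True)
```

Calling `screen_interactions(X, y, k=5, max_order=3)` returns `(score, indices)` pairs sorted by decreasing score. The pruning test only skips subtrees that cannot change the result, so the selected scores match those of an exhaustive search over all interactions up to `max_order`.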



