Detection of Cooperative Interactions in Logistic Regression Models
An important problem in the field of bioinformatics is to identify interactive effects among profiled variables for outcome prediction. In this paper, a logistic regression model with pairwise interactions among a set of binary covariates is considered. Modeling the structure of the interactions by a graph, our goal is to recover the interaction graph from independently identically distributed (i.i.d.) samples of the covariates and the outcome. When viewed as a feature selection problem, a simple quantity called influence is proposed as a measure of the marginal effects of the interaction terms on the outcome. For the case when the underlying interaction graph is known to be acyclic, it is shown that a simple algorithm that is based on a maximum-weight spanning tree with respect to the plug-in estimates of the influences not only has strong theoretical performance guarantees, but can also outperform generic feature selection algorithms for recovering the interaction graph from i.i.d. samples of the covariates and the outcome. Our results can also be extended to the model that includes both individual effects and pairwise interactions via the help of an auxiliary covariate.
READ FULL TEXT