Shapley value confidence intervals for variable selection in regression models

01/27/2020
by Daniel Fryer, et al.

Multiple linear regression is a commonly used inferential and predictive process in which a single response variable is modeled via an affine combination of multiple explanatory covariates. The coefficient of determination is often used to measure the explanatory power of the chosen combination of covariates. A ranking of the explanatory contribution of each individual covariate is often sought in order to draw inferences about the importance of each covariate with respect to the response phenomenon. A recent method for obtaining such a ranking is the game-theoretic Shapley value decomposition of the coefficient of determination. Such a decomposition has the desirable efficiency, monotonicity, and equal treatment properties. Under an elliptical assumption, we obtain the asymptotic normality of the Shapley values. We then use this result to construct confidence intervals and hypothesis tests for these quantities. Monte Carlo studies of our results are provided. We find that our asymptotic confidence intervals are computationally superior to competing bootstrap methods and are able to improve upon the performance of such intervals. Analyses of housing and real estate data are used to demonstrate the applicability of our methodology.
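
To make the decomposition described in the abstract concrete, the following is a minimal sketch of a Shapley value decomposition of the coefficient of determination, computed by brute force over all covariate subsets. It is not the authors' implementation and does not cover their asymptotic confidence intervals. The function names `r_squared` and `shapley_r2` and the simulated data are assumptions made purely for illustration, and the R^2 of the empty covariate set is taken to be 0.

```python
import itertools
import math

import numpy as np


def r_squared(X, y, subset):
    """R^2 of an OLS fit of y on the columns in `subset` plus an intercept."""
    if len(subset) == 0:
        return 0.0  # assumption: the intercept-only model explains nothing
    A = np.column_stack([np.ones(len(y)), X[:, list(subset)]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)


def shapley_r2(X, y):
    """Shapley share of the full-model R^2 attributed to each covariate."""
    d = X.shape[1]
    phi = np.zeros(d)
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for size in range(d):
            # Standard Shapley weight for a coalition of this size.
            weight = (
                math.factorial(size)
                * math.factorial(d - size - 1)
                / math.factorial(d)
            )
            for S in itertools.combinations(others, size):
                # Marginal gain in R^2 from adding covariate j to coalition S.
                gain = r_squared(X, y, S + (j,)) - r_squared(X, y, S)
                phi[j] += weight * gain
    return phi


# Simulated data (illustrative only): the third covariate is pure noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 2.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=200)

phi = shapley_r2(X, y)
print("Shapley values:", phi)
print("Sum of values: ", phi.sum())
print("Full-model R^2:", r_squared(X, y, (0, 1, 2)))
```

The last two printed numbers should agree up to floating-point error, which is the efficiency property mentioned in the abstract: the Shapley values sum to the full-model coefficient of determination. The enumeration visits every subset, so its cost grows exponentially with the number of covariates and it is practical only for a small number of them.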

research
11/27/2018

Higher-order approximate confidence intervals

We derive accurate confidence intervals based on higher-order approximat...
research
03/14/2019

On confidence intervals centered on bootstrap smoothed estimators

We assess the performance, in terms of coverage probability and expected...
research
07/14/2023

Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models

Statistical inference of the high-dimensional regression coefficients is...
research
06/21/2023

Qini Curves for Multi-Armed Treatment Rules

Qini curves have emerged as an attractive and popular approach for evalu...
research
09/24/2019

Double-estimation-friendly inference for high-dimensional misspecified models

All models may be wrong—but that is not necessarily a problem for infere...
research
11/23/2022

Shapley Curves: A Smoothing Perspective

Originating from cooperative game theory, Shapley values have become one...
research
11/29/2017

Valid Inference Corrected for Outlier Removal

Ordinary least square (OLS) estimation of a linear regression model is w...
