Ultra-marginal Feature Importance
Scientists frequently prioritize learning from data rather than training the best possible model; however, research in machine learning often prioritizes the latter. Marginal feature importance methods, such as marginal contribution feature importance (MCI), attempt to break this trend by providing a useful framework for quantifying the relationships in data in an interpretable fashion. In this work, we introduce ultra-marginal feature importance (UMFI), which generalizes the framework of MCI while aiming to improve its performance and runtime. To do so, we prove that UMFI can be computed directly by applying preprocessing methods from the AI fairness literature to remove dependencies in the feature set. We show on real and simulated data that UMFI performs at least as well as MCI, with significantly better performance in the presence of correlated interactions and unrelated features, while substantially reducing the exponential runtime of MCI to super-linear.
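To make the idea concrete, the sketch below illustrates one way the UMFI recipe described in the abstract could look in code. It is a rough illustration under stated assumptions, not the authors' exact procedure: the helper names (`remove_dependence`, `umfi_scores`) are hypothetical, linear residualization stands in for the fairness-style dependency-removal preprocessing (the paper also discusses other removal methods, such as optimal transport), and the random-forest scorer is an arbitrary choice. Each feature is scored by the gain in cross-validated performance when it is added back to a copy of the remaining features that has had its information removed; one pass per feature is what yields the super-linear, rather than exponential, runtime.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

def remove_dependence(X_rest, x_f):
    # Strip the linear dependence on x_f from every remaining feature by
    # residualizing each column of X_rest on x_f. Linear residualization is
    # one simple stand-in for the fairness-preprocessing step; it only
    # removes linear dependencies.
    reg = LinearRegression().fit(x_f.reshape(-1, 1), X_rest)
    return X_rest - reg.predict(x_f.reshape(-1, 1))

def umfi_scores(X, y, cv=5):
    # Hypothetical UMFI sketch: for each feature f, remove f's information
    # from the other features, then score f by the change in cross-validated
    # performance when f is added back.
    n, d = X.shape
    scores = np.empty(d)
    for f in range(d):
        rest = np.delete(X, f, axis=1)
        X_clean = remove_dependence(rest, X[:, f])
        # Performance without f (with dependencies on f removed) ...
        base = cross_val_score(
            RandomForestRegressor(n_estimators=100, random_state=0),
            X_clean, y, cv=cv).mean()
        # ... versus performance with f added back in.
        with_f = cross_val_score(
            RandomForestRegressor(n_estimators=100, random_state=0),
            np.column_stack([X_clean, X[:, f]]), y, cv=cv).mean()
        scores[f] = max(with_f - base, 0.0)  # clip: importance is non-negative
    return scores
```

In use, `umfi_scores(X, y)` returns one non-negative score per column of `X`, and features can then be ranked by score; note that the loop trains a fixed number of models per feature, in contrast to the exponentially many feature subsets evaluated by MCI.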