SHAP@k: Efficient and Probably Approximately Correct (PAC) Identification of Top-k Features

07/10/2023
by Sanjay Kariyappa, et al.

The SHAP framework provides a principled method to explain the predictions of a model by computing feature importance. Motivated by applications in finance, we introduce the Top-k Identification Problem (TkIP), where the objective is to identify the k features with the highest SHAP values. While any method that computes SHAP values with uncertainty estimates (such as KernelSHAP and SamplingSHAP) can be trivially adapted to solve TkIP, doing so is highly sample-inefficient. The goal of our work is to improve the sample efficiency of existing methods in the context of solving TkIP. Our key insight is that TkIP can be framed as an Explore-m problem, a well-studied problem related to multi-armed bandits (MAB). This connection enables us to improve sample efficiency by leveraging two techniques from the MAB literature: (1) a better stopping condition that identifies when PAC (Probably Approximately Correct) guarantees have been met, so that sampling can stop, and (2) a greedy sampling scheme that judiciously allocates samples between different features. By adopting these methods, we develop KernelSHAP@k and SamplingSHAP@k to efficiently solve TkIP, offering an average improvement of 5× in sample efficiency and runtime across most common credit-related datasets.
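
The two ingredients named in the abstract, a PAC stopping condition and a greedy sampling scheme, can be illustrated with a minimal sketch in the spirit of LUCB-style Explore-m algorithms from the bandit literature. This is not the authors' implementation of KernelSHAP@k or SamplingSHAP@k. The function `top_k_pac`, the callback `sample_fn`, the parameters `epsilon`, `delta`, `init_samples`, and `batch`, the normal-approximation confidence intervals, and the toy data are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist


def top_k_pac(sample_fn, n_features, k, epsilon=0.05, delta=0.05,
              init_samples=50, batch=20, max_rounds=10_000, seed=0):
    """Illustrative top-k identification with a PAC-style stopping rule.

    sample_fn(i, n, rng) is a hypothetical callback returning n noisy
    estimates of feature i's SHAP value (e.g. sampled marginal contributions).
    """
    assert 0 < k < n_features and max_rounds >= 1
    rng = np.random.default_rng(seed)
    # Normal-approximation quantile with a crude union bound over features
    # (an assumption of this sketch, not a claim about the paper's bounds).
    z = NormalDist().inv_cdf(1.0 - delta / (2.0 * n_features))
    samples = [list(sample_fn(i, init_samples, rng)) for i in range(n_features)]

    for _ in range(max_rounds):
        means = np.array([np.mean(s) for s in samples])
        half = np.array([z * np.std(s, ddof=1) / np.sqrt(len(s)) for s in samples])

        # Split features into the current top-k guess and the rest.
        order = np.argsort(-means)
        high, low = order[:k], order[k:]

        # Weakest member of the top-k set and strongest challenger outside it.
        h = high[np.argmin(means[high] - half[high])]
        l = low[np.argmax(means[low] + half[low])]

        # Stopping condition: the challenger's upper bound exceeds the
        # weakest member's lower bound by at most epsilon.
        if (means[l] + half[l]) - (means[h] - half[h]) <= epsilon:
            break

        # Greedy allocation: spend new samples only on the two critical features.
        for i in (h, l):
            samples[i].extend(sample_fn(i, batch, rng))

    total = sum(len(s) for s in samples)
    return sorted(int(i) for i in high), total


# Toy stand-in for a SHAP sampler: fixed "true" values plus Gaussian noise.
TRUE_VALUES = np.array([0.9, 0.7, 0.5, 0.1, 0.05, 0.0])


def toy_sample_fn(i, n, rng):
    return rng.normal(TRUE_VALUES[i], 0.5, size=n)


top, used = top_k_pac(toy_sample_fn, n_features=6, k=3)
print("top-k features:", top, "samples used:", used)
```

In this sketch, sampling stops as soon as the boundary between the top-k set and the remaining features is resolved to within `epsilon`, rather than waiting until every individual estimate is tight. Extra samples go only to the two features that currently decide that boundary.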


