
CS-Shapley: Class-wise Shapley Values for Data Valuation in Classification

by Stephanie Schoch, et al.

Data valuation, or the valuation of individual datum contributions, has seen growing interest in machine learning due to its demonstrable efficacy for tasks such as noisy label detection. In particular, because of the Shapley value's desirable axiomatic properties, several Shapley value approximation methods have been proposed. In these methods, the value function is typically defined as the predictive accuracy over the entire development set. However, this limits the ability to differentiate between training instances that are helpful or harmful to their own classes. Intuitively, instances that harm their own classes may be noisy or mislabeled and should receive a lower valuation than helpful instances. In this work, we propose CS-Shapley, a Shapley value with a new value function that discriminates between training instances' in-class and out-of-class contributions. Our theoretical analysis shows that the proposed value function is (essentially) the unique function that satisfies two desirable properties for evaluating data values in classification. Further, our experiments on two benchmark evaluation tasks (data removal and noisy label detection) and four classifiers demonstrate the effectiveness of CS-Shapley over existing methods. Lastly, we evaluate the "transferability" of data values estimated from one classifier to others, and our results suggest that Shapley-based data valuation is transferable across different models.
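To make the baseline setup concrete, the following is a minimal sketch of the conventional data Shapley value that the abstract contrasts against, where the value function is accuracy over the whole development set. It is not the CS-Shapley value function, which the abstract does not spell out. The toy 1-D dataset, the 1-nearest-neighbour classifier, the choice to score the empty training set as 0, and the names `accuracy` and `shapley_values` are all assumptions made for illustration. The values are computed exactly by enumerating permutations, which is only feasible for a handful of points.

```python
from itertools import permutations

def accuracy(train, dev):
    """Value function: 1-NN accuracy on the whole development set.

    Assumption: an empty training set scores 0.0.
    """
    if not train:
        return 0.0
    correct = 0
    for x, y in dev:
        # Predict the label of the closest training point.
        _, pred = min(train, key=lambda p: abs(p[0] - x))
        correct += (pred == y)
    return correct / len(dev)

def shapley_values(train, dev):
    """Exact Shapley values: average marginal gain over all orderings."""
    n = len(train)
    values = [0.0] * n
    count = 0
    for order in permutations(range(n)):
        subset = []
        prev = accuracy(subset, dev)
        for i in order:
            subset.append(train[i])
            cur = accuracy(subset, dev)
            values[i] += cur - prev  # marginal contribution of point i
            prev = cur
        count += 1
    return [v / count for v in values]

# Hypothetical toy data: (feature, label). The last point sits in the
# class-0 region but carries label 1, i.e. it is mislabeled.
train = [(0.0, 0), (1.0, 0), (4.0, 1), (5.0, 1), (0.5, 1)]
dev = [(0.2, 0), (0.8, 0), (4.2, 1), (4.8, 1)]

values = shapley_values(train, dev)
for point, value in zip(train, values):
    print(point, round(value, 4))
```

In this toy run the mislabeled point receives the lowest value, yet the value is still positive, because the point also helps classify class-1 development instances whenever no other class-1 training point is present. A single overall-accuracy score blends the in-class and out-of-class effects together in this way, which is the limitation the abstract describes.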



