CS-Shapley: Class-wise Shapley Values for Data Valuation in Classification

11/13/2022
by Stephanie Schoch, et al.

Data valuation, the task of quantifying how much each individual training instance contributes to a model, has seen growing interest in machine learning because of its demonstrated efficacy for tasks such as noisy label detection. In particular, because the Shapley value has desirable axiomatic properties, several Shapley value approximation methods have been proposed. In these methods, the value function is typically defined as the predictive accuracy over the entire development set. However, this limits the ability to differentiate between training instances that are helpful or harmful to their own classes. Intuitively, instances that harm their own classes may be noisy or mislabeled and should receive a lower valuation than helpful instances. In this work, we propose CS-Shapley, a Shapley value with a new value function that discriminates between training instances' in-class and out-of-class contributions. Our theoretical analysis shows that the proposed value function is (essentially) the unique function that satisfies two desirable properties for evaluating data values in classification. Further, our experiments on two benchmark evaluation tasks (data removal and noisy label detection) and four classifiers demonstrate the effectiveness of CS-Shapley over existing methods. Lastly, we evaluate the "transferability" of data values estimated from one classifier to others, and our results suggest that Shapley-based data valuation is transferable across different models.
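To make the setup concrete, the following is a minimal sketch of the conventional baseline the abstract describes: a Shapley value whose value function is accuracy over the whole development set, computed exactly by enumerating permutations on a toy problem. It is not the CS-Shapley value function, whose formula the abstract does not give. The 1-nearest-neighbour classifier, the toy data, and all function names (`predict_1nn`, `accuracy`, `shapley_values`, `class_split_accuracy`) are assumptions made for illustration. The last helper only shows how development-set accuracy can be split into an in-class and an out-of-class part, the distinction that CS-Shapley builds on.

```python
from itertools import permutations

def predict_1nn(train, x):
    """Label of the nearest training point, or None if the training set is empty."""
    if not train:
        return None
    return min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(train, dev):
    """Fraction of dev points classified correctly by 1-NN on `train`."""
    if not dev:
        return 0.0
    correct = sum(1 for x, y in dev if predict_1nn(train, x) == y)
    return correct / len(dev)

def shapley_values(train, dev, value_fn):
    """Exact Shapley values: average marginal gain over all orderings.

    Only feasible for a handful of points; practical methods approximate
    this average, for example by sampling permutations.
    """
    n = len(train)
    totals = [0.0] * n
    num_orders = 0
    for order in permutations(range(n)):
        subset = []
        prev = value_fn(subset, dev)
        for i in order:
            subset.append(train[i])
            cur = value_fn(subset, dev)
            totals[i] += cur - prev  # marginal contribution of point i
            prev = cur
        num_orders += 1
    return [t / num_orders for t in totals]

def class_split_accuracy(train, dev, label):
    """Accuracy on dev points of class `label` and on all other dev points.

    Illustrative split only (assumption): each part is normalised by its
    own size, which need not match the normalisation used in the paper.
    """
    in_class = [(x, y) for x, y in dev if y == label]
    out_class = [(x, y) for x, y in dev if y != label]
    return accuracy(train, in_class), accuracy(train, out_class)

# Toy 1-D data as (feature, label); the last training point is deliberately mislabeled.
train = [(0.0, 0), (1.0, 0), (4.0, 1), (5.0, 1), (0.5, 1)]
dev = [(0.2, 0), (0.8, 0), (4.2, 1), (4.8, 1)]

values = shapley_values(train, dev, accuracy)
for point, value in zip(train, values):
    print(point, round(value, 3))

# The mislabeled point on its own: perfect on its stated class, wrong on the other.
print(class_split_accuracy([train[4]], dev, label=1))
```

With a single overall accuracy as the value function, the two numbers returned by `class_split_accuracy` are folded into one score, so a point's effect on its own class and on other classes cannot be told apart.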

research
02/24/2022

Threading the Needle of On and Off-Manifold Value Functions for Shapley Explanations

A popular explainable AI (XAI) approach to quantify feature importance o...
research
08/03/2020

QPLEX: Duplex Dueling Multi-Agent Q-Learning

We explore value-based multi-agent reinforcement learning (MARL) in the ...
research
12/19/2013

Avoiding Confusion between Predictors and Inhibitors in Value Function Approximation

In reinforcement learning, the goal is to seek rewards and avoid punishm...
research
01/19/2023

Shapley Values with Uncertain Value Functions

We propose a novel definition of Shapley values with uncertain value fun...
research
12/31/2019

Numerical approximation of the value of a stochastic differential game with asymmetric information

We consider a convexity constrained Hamilton-Jacobi-Bellman-type obstacl...
research
01/28/2022

Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error

In this work, we study the use of the Bellman equation as a surrogate ob...
research
09/20/2022

Learning Acceptance Regions for Many Classes with Anomaly Detection

Set-valued classification, a new classification paradigm that aims to id...
