Towards Efficient Data Valuation Based on the Shapley Value

02/27/2019
by Ruoxi Jia, et al.
"How much is my data worth?" is an increasingly common question posed by organizations and individuals alike. An answer to this question could allow, for instance, fairly distributing profits among multiple data contributors and determining prospective compensation when data breaches happen. In this paper, we study the problem of data valuation by utilizing the Shapley value, a popular notion of value that originated in cooperative game theory. The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value. However, computing the Shapley value exactly often requires time exponential in the number of data points. To meet this challenge, we propose a repertoire of efficient algorithms for approximating the Shapley value. We also demonstrate the value of each training instance for various benchmark datasets.
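
To make the cost gap concrete, here is a minimal sketch contrasting exact Shapley computation, which enumerates every subset of data points, with the standard permutation-sampling estimator. This is a generic textbook baseline, not necessarily any of the algorithms proposed in the paper. The names `exact_shapley`, `permutation_shapley`, and `toy_utility` are hypothetical, and the capped-sum utility is an assumed stand-in for a real model-performance score.

```python
import itertools
import math
import random
from typing import Callable, FrozenSet, List

Utility = Callable[[FrozenSet[int]], float]

# Hypothetical toy utility: each data point has a "weight", and the utility
# of a subset is the total weight capped at 4.0 (diminishing returns).
WEIGHTS = [1.0, 2.0, 3.0]


def toy_utility(subset: FrozenSet[int]) -> float:
    return min(sum(WEIGHTS[i] for i in subset), 4.0)


def exact_shapley(n: int, utility: Utility) -> List[float]:
    """Exact Shapley values; evaluates the utility on exponentially many subsets."""
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):  # k = size of the subset that point i joins
            weight = (
                math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            )
            for combo in itertools.combinations(others, k):
                s = frozenset(combo)
                values[i] += weight * (utility(s | {i}) - utility(s))
    return values


def permutation_shapley(
    n: int, utility: Utility, num_permutations: int, seed: int = 0
) -> List[float]:
    """Monte Carlo estimate: average marginal contributions over random orderings."""
    rng = random.Random(seed)
    values = [0.0] * n
    order = list(range(n))
    for _ in range(num_permutations):
        rng.shuffle(order)
        prefix: FrozenSet[int] = frozenset()
        prev = utility(prefix)
        for i in order:
            prefix = prefix | {i}
            cur = utility(prefix)
            values[i] += cur - prev  # marginal contribution of i in this ordering
            prev = cur
    return [v / num_permutations for v in values]
```

Calling `exact_shapley(3, toy_utility)` and `permutation_shapley(3, toy_utility, 2000)` should give similar values that each sum to the utility of the full dataset. The exact routine becomes infeasible beyond a few dozen points, while the sampling estimator trades accuracy for a cost of `n` utility evaluations per sampled permutation.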
