Rewarding High-Quality Data via Influence Functions

08/30/2019
by Adam Richardson, et al.

We consider a crowdsourcing data acquisition scenario, such as federated learning, where a Center collects data points from a set of rational Agents with the aim of training a model. For linear regression models, we show how a payment structure can be designed to incentivize the agents to provide high-quality data as early as possible, based on a characterization of the influence that data points have on the loss function of the model. Our contributions can be summarized as follows: (a) we prove theoretically that this scheme ensures truthful data reporting as a game-theoretic equilibrium, and further demonstrate its robustness against mixtures of truthful and heuristic data reports; (b) we design a procedure by which the influence computation can be efficiently approximated and processed sequentially in batches over time; (c) we develop a theory that allows correcting for the difference between the influence and the overall change in loss; and (d) we evaluate our approach on real datasets, confirming our theoretical findings.


research
12/31/2020

FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging

Influence functions approximate the 'influences' of training data-points...
research
10/20/2020

Model-specific Data Subsampling with Influence Functions

Model selection requires repeatedly evaluating models on a given dataset...
research
10/03/2022

Understanding Influence Functions and Datamodels via Harmonic Analysis

Influence functions estimate effect of individual data points on predict...
research
05/05/2020

Interpreting Deep Models through the Lens of Data

Identification of input data points relevant for the classifier (i.e. se...
research
06/22/2022

Federated Latent Class Regression for Hierarchical Data

Federated Learning (FL) allows a number of agents to participate in trai...
research
01/26/2021

Robustness of Iteratively Pre-Conditioned Gradient-Descent Method: The Case of Distributed Linear Regression Problem

This paper considers the problem of multi-agent distributed linear regre...
research
05/01/2023

Scalable Data Point Valuation in Decentralized Learning

Existing research on data valuation in federated and swarm learning focu...
