Computing the Value of Data: Towards Applied Data Minimalism

07/29/2019
by Michaela Regneri, et al.

We present an approach to computing the monetary value of individual data points in the context of an automated decision system. The proposed method enables us to explore and implement a paradigm of data minimalism for large-scale machine learning systems. Data-minimalistic implementations enhance scalability while maintaining or even optimizing a system's performance. Using two types of recommender systems, we first demonstrate how much of the data is ineffective in both settings. We then present a general account of computing data value via sensitivity analysis and show how, in theory, individual data points can be priced according to their informational contribution to automated decisions. We further apply this method to lab-scale recommender systems and outline further steps towards commercial data-minimalistic applications.
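
To make the idea of pricing data by its contribution to a decision more concrete, here is a minimal leave-one-out sketch. It is not the authors' method: the popularity-based recommender, the function names (`top_k`, `leave_one_out_values`, `price`), the toy interactions, and the budget of 100 are all assumptions chosen for illustration. Each data point is scored by how many recommendation slots change when it is removed, and a fixed budget is then split in proportion to those scores.

```python
from collections import Counter


def top_k(interactions, k=2):
    """Toy decision system (hypothetical): recommend the k most consumed items."""
    counts = Counter(item for _, item in interactions)
    # Sort by descending count, then by item name so ties break deterministically.
    ranked = sorted(counts, key=lambda item: (-counts[item], item))
    return ranked[:k]


def leave_one_out_values(interactions, k=2):
    """Score each data point by how much the decision changes without it."""
    baseline = top_k(interactions, k)
    values = []
    for i in range(len(interactions)):
        reduced = interactions[:i] + interactions[i + 1:]
        result = top_k(reduced, k)
        # Sensitivity: number of recommendation slots that differ from baseline.
        changed = sum(a != b for a, b in zip(baseline, result))
        # A removal can also shorten the list; count missing slots as changes.
        changed += abs(len(baseline) - len(result))
        values.append(changed)
    return values


def price(values, budget=100.0):
    """Split an assumed budget across data points in proportion to their value."""
    total = sum(values)
    if total == 0:
        return [0.0] * len(values)
    return [budget * v / total for v in values]


# Toy (user, item) interactions, invented for this example.
interactions = [
    ("u1", "A"), ("u2", "A"), ("u3", "A"),
    ("u1", "B"), ("u2", "B"),
    ("u3", "C"), ("u4", "C"),
]
values = leave_one_out_values(interactions, k=2)
prices = price(values)
```

In this toy setting only the two interactions with item B change the top-2 recommendation when removed, so they share the whole budget and the remaining five points are priced at zero. This mirrors, on a very small scale, the observation that much of the data can be ineffective for a given decision.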

research
11/18/2021

Beyond NDCG: behavioral testing of recommender systems with RecList

As with most Machine Learning systems, recommender systems are typically...
research
07/17/2018

Analyzing Hypersensitive AI: Instability in Corporate-Scale Machine Learning

Predictive geometric models deliver excellent results for many Machine L...
research
07/23/2023

Scalable solution to crossed random effects model with random slopes

The crossed random-effects model is widely used in applied statistics, f...
research
04/21/2017

Scatteract: Automated extraction of data from scatter plots

Charts are an excellent way to convey patterns and trends in data, but t...
research
08/13/2021

Incremental Learning for Personalized Recommender Systems

Ubiquitous personalized recommender systems are built to achieve two see...
research
09/10/2019

Distributed Equivalent Substitution Training for Large-Scale Recommender Systems

We present Distributed Equivalent Substitution (DES) training, a novel d...
research
01/29/2023

Neural Relation Graph for Identifying Problematic Data

Diagnosing and cleaning datasets are crucial for building robust machine...
