Human Uncertainty and Ranking Error -- The Secret of Successful Evaluation in Predictive Data Mining

08/17/2017
by   Kevin Jasberg, et al.
0

One of the most crucial issues in data mining is to model human behaviour in order to provide personalisation, adaptation and recommendation. This usually involves implicit or explicit knowledge, either by observing user interactions, or by asking users directly. But these sources of information are always subject to the volatility of human decisions, making utilised data uncertain to a particular extent. In this contribution, we elaborate on the impact of this human uncertainty when it comes to comparative assessments of different data mining approaches. In particular, we reveal two problems: (1) biasing effects on various metrics of model-based prediction and (2) the propagation of uncertainty and its thus induced error probabilities for algorithm rankings. For this purpose, we introduce a probabilistic view and prove the existence of those problems mathematically, as well as provide possible solution strategies. We exemplify our theory mainly in the context of recommender systems along with the metric RMSE as a prominent example of precision quality measures.

READ FULL TEXT

page 1

page 2

page 6

page 7

page 8

page 9

research
09/24/2012

Mining Social Data to Extract Intellectual Knowledge

Social data mining is an interesting phe-nomenon which colligates differ...
research
09/19/2017

CASP-DM: Context Aware Standard Process for Data Mining

We propose an extension of the Cross Industry Standard Process for Data ...
research
12/18/2019

Replication in Data Grids: Metrics and Strategies

We focus in this report on two main axes. The first is dedicated to the ...
research
04/29/2015

Information-theoretic Interestingness Measures for Cross-Ontology Data Mining

Community annotation of biological entities with concepts from multiple ...
research
02/16/2018

Dealing with Uncertainties in User Feedback: Strategies Between Denying and Accepting

Latest research revealed a considerable lack of reliability within user ...
research
04/25/2019

Computational Approaches to Access Probabilistic Population Codes for Higher Cognition an Decision-Making

In recent years, research unveiled more and more evidence for the so-cal...

Please sign up or login with your details

Forgot password? Click here to reset