Data inaccuracy quantification and uncertainty propagation for bibliometric indicators

03/29/2023
by Paul Donner, et al.

This study introduces an approach for estimating the uncertainty in bibliometric indicator values that is caused by data errors. The approach uses Bayesian regression models, estimated from empirical data samples, to predict error-free data. Direct Monte Carlo simulation (drawing predicted data from the estimated regression models a large number of times for the same input data) yields probability distributions for indicator values, and these distributions quantify the uncertainty due to data errors. The study demonstrates how uncertainty in base quantities, such as a unit's number of publications of certain document types or the number of citations of a publication, can be propagated along a measurement model into final indicator values. The method can be used to estimate the uncertainty of indicator values for any source of error whose error distribution is known. It is demonstrated with simple synthetic examples for instructive purposes and with real bibliometric research evaluation data to show its possible application in practice.
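
The propagation step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the error model here is a made-up stand-in for draws from the estimated Bayesian regression models, and the names `DELTAS`, `WEIGHTS`, `P_CITABLE`, `simulate_indicator`, and `summarize`, as well as the citation counts, are hypothetical. The sketch assumes that each recorded citation count may be off by a small amount, that each publication may turn out not to be a countable document type, and that the indicator is the mean number of citations per publication.

```python
import random
import statistics

# Assumed, made-up error model (NOT taken from the paper):
# a recorded citation count deviates from the true count by DELTAS[i]
# with probability WEIGHTS[i].
DELTAS = [-1, 0, 1, 2]
WEIGHTS = [0.05, 0.80, 0.10, 0.05]
# Assumed probability that a recorded publication really has a document
# type that counts toward the indicator.
P_CITABLE = 0.95


def simulate_indicator(observed_citations, n_draws=10_000, seed=42):
    """Draw the indicator (mean citations per publication) many times."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        corrected = []
        for count in observed_citations:
            # Uncertainty in the publication count: drop the item if this
            # draw says it is not a countable document type.
            if rng.random() >= P_CITABLE:
                continue
            # Uncertainty in the citation count: draw one plausible
            # error-free value for this publication.
            delta = rng.choices(DELTAS, weights=WEIGHTS)[0]
            corrected.append(max(0, count + delta))
        if corrected:  # skip the rare draw in which every item was dropped
            draws.append(sum(corrected) / len(corrected))
    return draws


def summarize(draws):
    """Median and central 95% interval of the simulated indicator values."""
    cuts = statistics.quantiles(draws, n=40)  # cut points at 2.5%, ..., 97.5%
    return {
        "median": statistics.median(draws),
        "lower": cuts[0],
        "upper": cuts[-1],
    }


# Hypothetical recorded citation counts for one unit's publications.
observed = [0, 3, 12, 5, 1, 7, 0, 24, 2, 9]
draws = simulate_indicator(observed)
summary = summarize(draws)
print(summary)
```

Each pass through the outer loop produces one plausible error-free data set and one indicator value, so the spread of `draws` reflects only the assumed data errors. In a real application, the two hard-coded error distributions would presumably be replaced by draws from regression models fitted to empirical samples.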

research
03/24/2018

A multilevel Monte Carlo method for high-dimensional uncertainty quantification of low-frequency electromagnetic devices

This work addresses uncertainty quantification of electromagnetic device...
research
12/06/2017

Normalization of zero-inflated data: An empirical analysis of a new indicator family and its use with altmetrics data

Recently, two new indicators (Equalized Mean-based Normalized Proportion...
research
08/29/2019

Vectorized Uncertainty Propagation and Input Probability Sensitivity Analysis

In this article we construct a theoretical and computational process for...
research
05/26/2020

Improving Regression Uncertainty Estimates with an Empirical Prior

While machine learning models capable of producing uncertainty estimates...
research
11/12/2017

Bayesian linear regression models with flexible error distributions

This work introduces a novel methodology based on finite mixtures of Stu...
research
01/10/2022

Efficient forecasting and uncertainty quantification for large scale account level Monte Carlo models of debt recovery

We consider the problem of forecasting debt recovery from large portfoli...
research
10/02/2018

A flexible sequential Monte Carlo algorithm for parametric constrained regression

An algorithm is proposed that enables the imposition of shape constraint...
