Understanding and Reducing Crater Counting Errors in Citizen Science Data and the Need for Standardisation

09/06/2022
by P. D. Tar et al.

Citizen science has become a popular tool for preliminary data processing tasks, such as identifying and counting lunar impact craters in modern high-resolution imagery. However, use of such data requires that citizen science products are understandable and reliable. Contamination and missing data can reduce the usefulness of datasets, so it is important that such effects are quantified. This paper presents a method, based upon a newly developed quantitative pattern recognition system (Linear Poisson Models), for estimating levels of contamination within MoonZoo citizen science crater data. Evidence will show that it is possible to remove the effects of contamination, with reference to some agreed-upon ground truth, resulting in estimated crater counts which are highly repeatable. However, it will also be shown that correcting for missing data is currently more difficult to achieve. The techniques are tested on MoonZoo citizen science crater annotations from the Apollo 17 site, and also on undergraduate and expert results from the same region.
