Opening practice: supporting Reproducibility and Critical spatial data science

by Chris Brunsdon, et al.

This paper reflects on a number of recent trends towards a more open and reproducible approach to geographic and spatial data science. In particular, it considers trends towards Big Data and the impacts these are having on spatial data analysis and modelling. It identifies a turn in academia towards coding as a core analytic tool, and away from proprietary software tools offering 'black boxes' in which the internal workings of the analysis are not revealed. It argues that this closed form of software is problematic, and considers a number of ways in which issues identified in spatial data analysis (such as the Modifiable Areal Unit Problem, MAUP) could be overlooked when working with closed tools, leading to problems of interpretation and possibly to inappropriate actions and policies based on these. In addition, this paper considers the role that reproducible and open spatial science may play in such an approach, taking into account the issues raised. It highlights the dangers of failing to account for the geographical properties of data, now that all data are spatial (they are collected somewhere), and the problems of a desire for n=all observations in data science, and it identifies the need for a critical approach. This is one in which openness, transparency, sharing and reproducibility provide a mantra for defensible and robust spatial data science.


Big Issues for Big Data: challenges for critical spatial data analytics

In this paper we consider some of the issues of working with big data an...

Starting with data: advancing spatial data science by building and sharing high-quality datasets

Spatial data science has emerged in recent years as an interdisciplinary...

Computational and informatics advances for reproducible data analysis in neuroimaging

The reproducibility of scientific research has become a point of critica...

Opinionated practices for teaching reproducibility: motivation, guided instruction and practice

In the data science courses at the University of British Columbia, we de...

The Right Tools for the Job: The Case for Spatial Science Tool-Building

This paper was presented as the 8th annual Transactions in GIS plenary a...

A Variability-Aware Design Approach to the Data Analysis Modeling Process

The massive amount of current data has led to many different forms of da...

Creating optimal conditions for reproducible data analysis in R with 'fertile'

The advancement of scientific knowledge increasingly depends on ensuring...