Elements and Principles of Data Analysis

03/18/2019
by Stephanie C. Hicks, et al.

The data revolution has led to an increased interest in the practice of data analysis. As a result, there has been a proliferation of "data science" training programs. Because data science has previously been defined as an intersection of already-established fields or a union of emerging technologies, the following problems arise: (1) there is little agreement about what data science is; (2) data science becomes secondary to established fields in a university setting; and (3) it is difficult to have discussions about what it means to learn data science, to teach data science courses, and to be a data scientist. To address these problems, we propose to define the field from first principles, based on the activities of people who analyze data, with a language and taxonomy for describing a data analysis in a manner that spans disciplines. Here, we describe the elements and principles of data analysis. This leads to two insights: it suggests a formal mechanism for evaluating data analyses based on objective characteristics, and it provides a framework for teaching students how to build data analyses. We argue that the elements and principles of data analysis lay the foundational framework for a more general theory of data science.
