Goals, Process, and Challenges of Exploratory Data Analysis: An Interview Study

11/01/2019
by Kanit Wongsuphasawat, et al.

How do analysis goals and context affect exploratory data analysis (EDA)? To investigate this question, we conducted semi-structured interviews with 18 data analysts. We characterize two common exploration goals: profiling (assessing data quality) and discovery (gaining new insights). Though the EDA literature primarily emphasizes discovery, we observe that discovery reliably occurs only in the context of open-ended analyses, whereas all participants engage in profiling across all of their analyses. We describe the process and challenges of EDA highlighted by our interviews. We find that analysts must perform repetitive tasks (e.g., examine numerous variables), yet they may have limited time or lack the domain knowledge to explore data. Analysts also often have to consult other stakeholders and oscillate between exploration and other tasks, such as acquiring and wrangling additional data. Based on these observations, we identify design opportunities for exploratory analysis tools, such as augmenting exploration with automation and guidance.

research
03/27/2019

The Landscape of R Packages for Automated Exploratory Data Analysis

The increasing availability of large but noisy data sets with a large nu...
research
11/08/2017

Exploration in NetHack with Secret Discovery

Roguelike games generally feature exploration problems as a critical, ye...
research
03/12/2019

SmartEDA: An R Package for Automated Exploratory Data Analysis

This paper introduces SmartEDA, which is an R package for performing Exp...
research
10/26/2020

PoliWAM: An Exploration of a Large Scale Corpus of Political Discussions on WhatsApp Messenger

WhatsApp Messenger is one of the most popular channels for spreading inf...
research
11/16/2015

How much does your data exploration overfit? Controlling bias via information usage

Modern data is messy and high-dimensional, and it is often not clear a p...
research
07/29/2021

Interactive Region-of-Interest Discovery using Exploratory Feedback

In this paper, we propose a geospatial data management framework called ...
research
02/23/2022

Exploratory Methods for Relation Discovery in Archival Data

In this article we propose a holistic approach to discover relations in ...
