DataPilot: Utilizing Quality and Usage Information for Subset Selection during Visual Data Preparation

03/02/2023
by Arpit Narechania, et al.

Selecting relevant data subsets from large, unfamiliar datasets can be difficult. We address this challenge by modeling and visualizing two kinds of auxiliary information: (1) quality - the validity and appropriateness of data required to perform certain analytical tasks; and (2) usage - the historical utilization characteristics of data across multiple users. Through a design study with 14 data workers, we integrate this information into a visual data preparation and analysis tool, DataPilot. DataPilot presents visual cues about "the good, the bad, and the ugly" aspects of data and provides graphical user interface controls as interaction affordances, guiding users to perform subset selection. Through a study with 36 participants, we investigate how DataPilot helps users navigate a large, unfamiliar tabular dataset, prepare a relevant subset, and build a visualization dashboard. We find that users selected smaller, effective subsets with higher quality and usage, and with greater success and confidence.

research
08/06/2021

Lumos: Increasing Awareness of Analytic Behavior during Visual Data Analysis

Visual data analysis tools provide people with the agency and flexibilit...
research
06/07/2018

Anchored in a Data Storm: How Anchoring Bias Can Affect User Strategy, Confidence, and Decisions in Visual Analytics

Cognitive biases have been shown to lead to faulty decision-making. Rece...
research
07/05/2021

An Analytical Survey on Recent Trends in High Dimensional Data Visualization

Data visualization is the process by which data of any size or dimension...
research
05/30/2023

Model averaging approaches to data subset selection

Model averaging is a useful and robust method for dealing with model unc...
research
07/29/2020

Selection-Bias-Corrected Visualization via Dynamic Reweighting

The collection and visual analysis of large-scale data from complex syst...
research
07/19/2021

Propagating Visual Designs to Numerous Plots and Dashboards

In the process of developing an infrastructure for providing visualizati...
research
04/24/2021

Exploring Multi-dimensional Data via Subset Embedding

Multi-dimensional data exploration is a classic research topic in visual...
