AccuStripes: Adaptive Binning for the Visual Comparison of Univariate Data Distributions

07/19/2022
by Anja Heim, et al.

Understanding and comparing distributions of data (e.g., regarding their modes, shapes, or outliers) is a common challenge in many scientific disciplines. Typically, this challenge is addressed using side-by-side comparisons of histograms or density plots. However, comparing multiple density plots is mentally demanding, and uniform histograms often represent distributions imprecisely because missing values, outliers, or modes are hidden by bins of equal size. In this paper, we present a novel type of overview visualization for the comparison of univariate data distributions: AccuStripes (i.e., accumulated stripes), a visual metaphor that encodes accumulations of data distributions according to adaptive binning, using color-coded stripes of irregular width. We provide detailed insights into the challenges of binning. Specifically, we explore different adaptive binning concepts for computing bin boundaries, such as Bayesian Blocks binning and Jenks Natural Breaks binning, in terms of their ability to represent the datasets as accurately as possible. In addition, we discuss issues that arise in designing comparative visualizations of distributions: to allow for a comparison of many distributions, their accumulated representations are plotted below each other in a stacked mode. Based on our findings, we propose three different layouts for the comparative visualization of multiple distributions. We investigate the usefulness of AccuStripes with a statistical evaluation of the binning methods: using a similarity metric from cluster analysis, we show which binning method statistically yields the best grouping results. Through a user study, we evaluate which binning strategy visually represents the distribution in the most intuitive form, and we investigate which layout lets the user compare many distributions with the least effort.
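
To make the idea of adaptive binning concrete, the following is a minimal sketch of natural-breaks style binning in the spirit of Jenks Natural Breaks, one of the methods named above. It is not the authors' implementation. The function name `jenks_breaks`, the edge convention (the lowest value of each bin, followed by the overall maximum), and the plain dynamic program with O(k·n²) running time are all assumptions made for illustration.

```python
from typing import List, Sequence


def jenks_breaks(values: Sequence[float], n_bins: int) -> List[float]:
    """Hypothetical helper: return n_bins + 1 edges that minimize the
    total within-bin sum of squared deviations (natural-breaks criterion)."""
    xs = sorted(values)
    n = len(xs)
    if not 1 <= n_bins <= n:
        raise ValueError("n_bins must be between 1 and len(values)")

    # Prefix sums give the squared deviation of any slice xs[i:j] in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def ssd(i: int, j: int) -> float:
        """Sum of squared deviations from the mean for xs[i:j], with j > i."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[k][j]: lowest cost of splitting the first j values into k bins.
    cost = [[inf] * (n + 1) for _ in range(n_bins + 1)]
    # split[k][j]: start index of the last bin in that optimal split.
    split = [[0] * (n + 1) for _ in range(n_bins + 1)]
    cost[0][0] = 0.0
    for k in range(1, n_bins + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + ssd(i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    split[k][j] = i

    # Walk back through the stored split points to recover each bin start.
    starts = []
    j = n
    for k in range(n_bins, 0, -1):
        i = split[k][j]
        starts.append(i)
        j = i
    starts.reverse()

    # Assumed edge convention: lowest value of each bin, then the maximum.
    return [xs[i] for i in starts] + [xs[-1]]


if __name__ == "__main__":
    data = [1, 2, 3, 10, 11, 12, 50, 51, 52]
    print(jenks_breaks(data, 3))
```

On data with three well-separated groups, the edges fall at the start of each group rather than at equally spaced positions, which is what produces bins of irregular width. Bayesian Blocks follows a different criterion and is not covered by this sketch.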


