
Bivariate Separable-Dimension Glyphs can Improve Visual Analysis of Holistic Features

We introduce the cause of the inefficiency of bivariate glyphs by defining the corresponding error. To recommend efficient and perceptually accurate bivariate-glyph design, we present an empirical study of five bivariate glyphs based on three psychophysics principles: integral-separable dimensions, visual hierarchy, and pre-attentive pop out, to choose one integral pair (length_y-length_x), three separable pairs (length-color, length-texture, length_y-length_y), and one redundant pair (length_y-color/length_x). Twenty participants performed four tasks: reading numerical values, estimating ratios, comparing two points, and looking for extreme values among a subset of points belonging to the same sub-group. The most surprising result was that length-texture was among the most effective methods, suggesting that local spatial frequency features can lead to global pattern detection that facilitates visual search in complex 3D structure. Our results also reveal the following: length-color bivariate glyphs led to the most accurate answers and the least task execution time, while length_y-length_x (integral) dimensions were among the worst and are not recommended; they achieved high performance only when pop-up color was added.



1 Introduction

Bivariate glyph visualization is a common form of visual design in which a data set is depicted by two visual variables, often chosen from a set of perception-independent graphical dimensions of shape, color, texture, size, orientation, curvature, and so on [1]. While a multitude of glyph techniques and design guidelines have been developed and compared in two dimensions (2D) [2] [3] [4], there is a dearth of three-dimensional (3D) glyph design principles. One reason is that 3D glyph design is exceptionally challenging because human judgments of metric 3D shapes and relationships contain large errors relative to the actual structure of the observed scene [5] [6]. Often only structural properties in 3D left invariant by affine mappings are reliably perceived, such as the parallelism of lines/planes and relative distances in parallel directions. As a result, 3D glyphs must be designed with great care to convey relationships and patterns, as 2D principles often do not apply [7].

Fig. 1: Large-magnitude-range vectors are encoded in digit-exponent bivariate glyphs of scientific notation (a)-(e) and linear glyphs (f). The bivariate glyph design (e) was ten times more accurate than the linear glyphs (f) for quantitative discrimination tasks but was not efficient for comparing two vector magnitudes [8].

Imagine visual search in a 3D large-magnitude-range vector field, where the differences between the smallest vector magnitude and the largest magnitude reach . Three-dimensional bivariate glyph scene of (co-centric cylinders) carrying parallel line lengths of the exponent and digit of scientific notation (aka splitVectors) (Figure 1 (e)) achieved up to ten times better accuracy than a single direct depiction of linear magnitude mapping (Figure 1 (f)) for quantitative discrimination tasks requiring participants to read quantities at a single sampling site or visually compute ratios of two vector magnitudes [8]. However, this bivariate splitVectors glyph also increases task completion time for an apparently simple comparison task between two vector magnitudes in 3D. This last result goes well with Ware’s design recommendation [9]: comparison is a holistic recognition task, and since a single-size linear glyph is a holistic representation, we should always use direct linear encoding.

In this work, we challenge this consensus that holistic data should be represented holistically and argue that this bivariate splitVectors gives viewers a correspondence challenge that does not arise when linear encoding is used - the need to relate these two quantitative values of exponent and digit to their visual features would hamper its efficiency. To use the glyphs in Figure 1 (e), one must determine which one is exponent and which is digit at each sampling location. We suggest that in this case, if the correspondence errors account for the temporal costs with co-centric pairs, then techniques preventing this type of error can be as effective as a holistic direct depiction without time-consuming correspondence search.

To test this prediction and to improve the comparison task completion time without reducing accuracy, we took inspiration from three distinct visual search theories: integral and separable dimensions [10], feature binding and search [11] [12], and monotonicity [1]. Visual variables that are separable (i.e., manipulated and perceived independently) would initially be considered problematic for encoding holistic data because of the known feature-binding challenges [13] involved in achieving integrated numerical readings by combining two visual features. Our method utilizes the fact that binding between separable variables is not always successful and a viewer can thus adopt a sequential task-driven viewing strategy based on visual hierarchy theory [12] to obtain the gross regional distribution of larger exponents. After this, a lower-order visual comparison within the same exponent can be achieved, and no binding is needed as long as the correspondence between the two visual features can be easily understood. With these two steps, judging large or small or perceiving quantities accurately from separable variables may be no more time-consuming than with single linear glyphs.

There is compelling evidence that separable dimensions are not processed holistically but are broken down into their component parts and processed independently. Reducing correspondence error is influenced by the choices of separable dimensions. According to Treisman [11] and Wolfe [12], the initial preattentive phase is the major step towards improved comprehension, more important than the attentive phase. We select the “most recognizable” features of the size, color, and texture dimensions (Figure 2). Size and color are preattentive and permit visual selection at a glance, at least in 2D. We purposefully select texture patterns by varying local spatial frequency, i.e., the amount of dark on white. The texture selection is inspired by the effectiveness of spatial frequency variation in “texture stitching” [14] for showing boundaries from continuous flow fields and the fact that spatial frequency attracts attention automatically [12]. Compared to the continuous random noise in Urness et al. [14], ours is for discrete quantities and thus uses regular scale variations (Figure 2 (d)). Coupling with integral and separable dimensions, we anticipated that preattentive pop-up features in separable dimensions might reduce the correspondence errors compared to integral dimensions. Following this logic, we hypothesize that highly distinguishable separable dimension pairs would erase the costs associated with the correspondence errors.

We tested this hypothesis in an experiment with four tasks using four dimension pairs to compare against the length_y-length_y (separable) in Zhao et al. [8]: length_y-length_x (integral), length_y-color/length_x (redundant and separable), length-color (separable), and length-texture (separable). We predicted that separable dimensions with more preattentive features would reduce the task completion time and might be more efficient than other bivariate glyphs without losing accuracy.

Fig. 2: Quantitative bivariate glyph composition grammar. Here a magnitude of 440 is encoded using two values 4.4 (digit) and 2 (exponent) terms. (a)-(e) shows how these two values are drawn. In this bivariate representation, the exponents are integers and the digits are continuous bounded to [1, 10). The texts in blue are not part of the glyphs.
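The digit-exponent split described in this caption can be sketched in a few lines of Python. This is only an illustration: the function name `split_magnitude` is hypothetical, and taking the floor of the base-10 logarithm is an assumed implementation, since the paper does not show how the split is computed.

```python
import math
from typing import Tuple


def split_magnitude(magnitude: float) -> Tuple[float, int]:
    """Split a positive magnitude into (digit, exponent) such that
    magnitude == digit * 10**exponent and 1 <= digit < 10."""
    if magnitude <= 0:
        raise ValueError("magnitude must be positive")
    exponent = math.floor(math.log10(magnitude))
    digit = magnitude / 10 ** exponent
    # Guard against floating-point rounding near exact powers of ten.
    if digit >= 10:
        digit /= 10
        exponent += 1
    elif digit < 1:
        digit *= 10
        exponent -= 1
    return digit, exponent


# The caption's example: 440 is drawn as digit 4.4 and exponent 2.
digit, exponent = split_magnitude(440)
```

The two returned values are what each bivariate glyph maps to its two visual dimensions.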

This work makes the following contributions:

  • Empirically validates that bivariate glyphs encoded by highly distinguishable separable dimensions reduce correspondence errors and incur a temporal cost similar to that of single glyphs.

  • Is the first to explain that visual comparison can be optimized by optimizing viewing behavior: we explain the benefits of the global “gist” structure constraints of “spatial structural” pop out, which expands the widely accepted “feature” pop-out theory.

  • Offers a rank order of separable variables for 3D glyph design and shows that the separable pairs length-color and length-texture were among the most effective and efficient glyph encodings.

2 Quantitative Study Method

Here we describe a quantitative lab-based empirical study with participants with broad scientific backgrounds. Results include quantitative studies of the efficiency of these new glyphs and a surprising result: length-texture achieved high efficiency and effectiveness for most tasks.

2.1 Theoretical Foundations in Perception and Vision Sciences

At least three perceptual and vision science theories have inspired our work: integral and separable dimensions [10, 15, 16], preattentive features [17, 18], and monotonicity [1].

Terminology. To avoid confusion, we adapt terms from Treisman and Gelade [13] in vision science to visualization. We use “visual dimension” to refer to the complete range of variation that is separately analyzed by some functionally independent perceptual subsystem, and “visual feature” to refer to a particular value on a dimension. Thus color, texture, and size are visual dimensions; gray-scale, spatial-frequency, and length are the features on those dimensions. Our “visual dimension” is thus most similar to Bertin’s “visual variables” [19] in the visualization domain. The term dimension is synergistic with the Euclidean geometry coordinate system for us in the long term to define these axes in visualization design space.

Integral and separable dimensions. Garner and Felfoldy’s seminal work on integral and separable dimensions [10] has inspired many visualization design guidelines. Ware [9] suggested a continuum from more integral to more separable pairs: (red-green) - (yellow-blue), - , color - shape/size/orientation, motion - shape/size/orientation, motion - color, and group position - color. His subsequent award-winning bivariate study [1] using hue-size, hue-luminance, and hue-texton (texture) supports the idea that the more separable dimensions of hue-texton lead to higher accuracy. Our work differs from Ware’s texture selection in two aspects: whether or not the texture encodes spatial frequency, and the dependencies between the two variables to be represented. Our texture uses the amount of black and white to show local spatial frequency, in contrast to pure shape variation in textons. We anticipate that ours will be more effective because spatial frequency is an attentive feature [12]. Also, Ware’s work focuses on finding relationships between two independent data variables, and thus his tasks are analytical; in contrast, ours demands two dependent variables to form a bivariate encoding decomposed from a holistic data point for quantitative comparison tasks. No existing work has studied whether or not the separable features facilitate holistic comparisons and whether or not the comparison is scalable to large numbers of 3D vector magnitudes.

Treisman and Gelade’s feature-integration theory of attention [13] showed that the extent of difference between target and distractors for a given feature affects search time. This theory may explain why was time consuming: the similarity of the two lengths may make them interfere with each other in the comparison, thus introducing temporal cost. What we “see” depends on our goals and expectations. Wolfe et al. propose the theory of “guided search” [20, 21], a first attempt to incorporate users’ goals into viewing, suggesting that a flexible feature map is activated based on users’ goals. Wolfe et al. further suggest that color, texture, size, and spatial frequency are among the most effective features in attracting users’ attention.

Building on this research, our current study shows that viewers can be task-driven and adopt optimal viewing strategies to be more efficient. To our knowledge, no existing visualization work has studied how viewers’ strategies in visual search influence bivariate visualization of two dependent variables. While Ware has recommended holistic representations for holistic attributes, our empirical study results suggest the opposite: that separable pairs can be as efficient as holistic representations.

Preattentive and Attentive Feature Ranking. Human visual processing can be faster when it is preattentive, i.e., perceived before it is given focused attention [11]. The idea of pop-out highlighting of an object is compelling because it captures the user’s attention against a background of other objects (e.g., for showing spatial highlights [22]). Visual dimensions such as size (length and width), orientation, and color (hue, saturation, lightness) can generate pop-out effects [11] [23]. Healey and Enns [24], in their comprehensive review, further describe the fact that these visual dimensions do not pop out at the same speed. Hue has higher priority than shape and texture [25].

Visual features can also be responsible for different attention speeds, and color and size (length and spatial frequency) are among those that guide attention [12]. For visualizing quantitative data, Cleveland and McGill [16] and Mackinlay [15] leveraged the ranking of visual dimensions and suggested that position and size are quantitative and can be compared. Casner [26] extends Mackinlay’s APT by incorporating user tasks to guide visualization generation. Demiralp et al. [27] evaluated a crowdsourcing method to study subjective perceptual distances of 2D bivariate pairs of shape-color, shape-size, and size-color. When adopted in 3D glyph design, these studies further suggest that the most important data attributes should be displayed with the most salient visual features, to avoid situations where secondary data values mask the information the viewer wants to see.

Monotonicity. Quantitative data encoding must normally be monotonic, and various researchers have recommended a coloring sequence that increases monotonically in luminance [28]. In addition, the visual system mostly uses luminance variation to determine shape information [29]. There has been much debate about the proper design of a color sequence for displaying quantitative data, mostly in 2D [30] and 3D shape volume variations [31]. Our primary requirement is that users be able to read large or small exponents at a glance. We chose a sequence with monotonic luminance and mapped the higher-luminance values to the higher exponents. We do not claim that this color sequence is optimal, only that it is a reasonable solution to the design problem [30].

2.2 Bivariate Glyphs

We chose five bivariate glyphs to examine the comparison task efficiency of separable-integral pairs in this study.

length_y-length_x (integral) (Figure 2(a)). Sizes (lengths) encode digit and exponent, shown as the diagonal and height of the cylinder glyphs.

length_y-color/length_x (redundant and separable) (Figure 2(b)). Compared to length_y-length_x, this glyph adds a redundant color (luminance and hue variations) dimension to the exponent; the four sequential colors are chosen from ColorBrewer [30].

length-color (separable) (Figure 2(c)). This glyph maps four exponents to color. Pilot testing showed that correspondence errors in this case would be the lowest among these five glyph types.

length-texture (separable) (Figure 2(d)). Texture represents exponents. The percentage of black color (Bertin [19]) is used to represent the exponential terms 0 (0%), 1 (30%), 2 (60%) and 3 (90%), wrapped around the cylinders in five segments to make it visible from any viewpoint.

length_y-length_y (splitVectors [8], separable) (Figure 2(e)). This glyph uses the splitVectors glyph [8] as the baseline and maps both digit and exponent to lengths. The glyphs are semi-transparent so that the inner cylinder showing the digit terms is legible.

Feather-like fishbone legends are added at each location when the visual variable length is used. The tick-mark band is depicted as the subtle light-gray lines around each cylinder. Distances between neighboring lines show a unit length legible at certain distance (Figure 2, the third row).
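As a rough illustration of the length-texture encoding, the exponent-to-texture mapping can be written as a lookup table. The black-coverage percentages and the five segments come from the glyph description in this section; the names `BLACK_FRACTION`, `texture_for_exponent`, and `segment_start_angles`, as well as the equal angular spacing of the segments, are assumptions made for this sketch.

```python
# Fraction of black in the texture for each exponent, as given in the text:
# 0 -> 0%, 1 -> 30%, 2 -> 60%, 3 -> 90%.
BLACK_FRACTION = {0: 0.0, 1: 0.3, 2: 0.6, 3: 0.9}

# The texture is wrapped around each cylinder in five segments.
SEGMENTS = 5


def texture_for_exponent(exponent: int) -> float:
    """Return the fraction of black used to draw the given exponent."""
    if exponent not in BLACK_FRACTION:
        raise ValueError("exponent must be one of 0, 1, 2, 3")
    return BLACK_FRACTION[exponent]


def segment_start_angles(segments: int = SEGMENTS) -> list:
    """Start angle in degrees of each texture segment.
    Equal spacing around the cylinder is an assumption of this sketch."""
    return [i * 360.0 / segments for i in range(segments)]
```

Because the mapping is monotonic, a darker glyph always denotes a larger exponent.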

2.3 Hypotheses

Given the analysis above and recommendations in the literature, we arrived at the following working hypotheses:

  • H1. (Overall). The length-color glyph can lead to the most accurate answers.

    Several reasons lead to this conjecture. Color and length are separable dimensions. Colors can be detected quickly, so length and color are highly distinguishable; compared to the redundant length_y-color/length_x, length-color reduces density since its glyphs are generally smaller.

  • H2. (Integral-separable, objective). Among the three separable dimensions, may lead to the greatest speed and accuracy and would be more effective than .

    The hypothesis could be supported because color and length are highly separable.

  • H3. (Integral-separable, subjective). Among the three separable dimensions, length-color will lead to greater user confidence than the other separable dimensions, length-texture and length_y-length_y.

  • H4. (Redundant encoding, objective). The redundant encoding will reduce time and improve accuracy compared to length_y-length_x.

  • H5. (Redundant encoding, subjective). The redundant encoding will lead to higher user confidence than length_y-length_x.

2.4 Tasks

(a) Task type 1 (MAG): What is the magnitude of the vector at point A? (answer: 636.30)

(b) Task type 2 (RATIO): What is the ratio of the magnitude between the vectors at points A and B? (answer: 3.60)

(c) Task type 3 (COMP): Which magnitude is larger, point A or point B? (answer: A)

(d) Task type 4 (MAX): Which point has the maximum magnitude when exponent is X? (X: 0, answer: the point with magnitude 9.89)
Fig. 3: The four task types. The callouts show the task-relevant glyph design using one example encoding type.

Participants performed the following four tasks. They had unlimited time for the first three tasks and 30 seconds to answer each question for the last task.

Task 1 (MAG): magnitude reading (Figure 3(a)). What is the magnitude at point A? One vector was marked by a red triangle labeled “A”, and participants were asked to report the magnitude of that vector. This task requires precise numerical input.

Task 2 (RATIO): ratio estimation (Figure 3(b)). What is the ratio of magnitudes of points A and B? Two vectors were marked with two red triangles labeled “A” and “B”, and participants were asked to estimate the ratio of the magnitudes of these two vectors. The ratio judgment is the most challenging quantitative task. Participants can either compare the glyph shapes or decipher each vector magnitude and compute the ratio mentally.

Task 3 (COMP): comparison (Figure 3(c)). Which magnitude is larger, point A or B? Two vectors were marked with red triangles and labeled “A” and “B”. Participants selected their answer by directly clicking the “A” or “B” answer button. This task is a simple comparison between two values with a binary choice of larger or smaller.

Task 4 (MAX): identifying the extreme value (Figure 3(d)). Which point has the maximum magnitude when the exponent is X? X in the study was a number from 0 to 3. Participants needed first to locate points with exponent X and then select the largest one of that group. Compared to Task 3, this is a complex comparison task requiring participants to find the extreme among many vectors.
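The two-step structure of the MAX task (first filter by exponent, then pick the largest value) can be sketched as follows. The function names are hypothetical, and the sample magnitudes are made up for illustration, except 9.89, which is the example answer shown in Fig. 3(d).

```python
import math
from typing import List, Optional


def exponent_of(magnitude: float) -> int:
    """Base-10 exponent of a positive magnitude in scientific notation."""
    return math.floor(math.log10(magnitude))


def max_within_exponent(magnitudes: List[float], target: int) -> Optional[int]:
    """Index of the largest magnitude whose exponent equals `target`,
    or None when no magnitude has that exponent."""
    candidates = [
        (m, i) for i, m in enumerate(magnitudes) if exponent_of(m) == target
    ]
    if not candidates:
        return None
    return max(candidates)[1]


# Illustrative data only: with X = 0 the answer is the point with 9.89.
sample = [3.2, 9.89, 12.5, 440.0, 0.7]
answer_index = max_within_exponent(sample, 0)
```

The filter step corresponds to the coarse search over exponents, and the `max` step to the finer comparison of digits within one exponent group.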

2.5 Empirical Study Design

2.5.1 Design and Order of Trials

We used a within-subject design with one independent variable of bivariate quantitative glyphs (five types) and compared their efficiency in four tasks. Dependent variables include relative error or accuracy and task completion time. We also collected participants’ confidence levels and preferences in a post-questionnaire.

Table I shows that participants are assigned to five blocks in a Latin-square order, and within one block the order of the five glyph types is the same. Participants perform four subtasks with randomly selected datasets for each encoding on each task type. Thus, each participant performed 80 subtasks (4 tasks × 4 datasets × 5 bivariate glyphs).

Block Participant Bivariate-dimension
1 P1, P6, P11, P16 , , LC , LT, LCL
2 P2, P7, P12, P17 , LC, LCL, , LT
3 P3, P8, P13, P18 LC, LCL, LCT, ,
4 P4, P9, P14, P19 LT, , , LCL, LC
5 P5, P10, P15, P20 LCL, LT, , LC,
TABLE I: Experimental design: 20 participants are assigned to one of the five blocks and use all five bivariate glyphs. Here, : , : , : , : , and : .
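The block assignment and the trial counts can be checked with a short sketch. The participant-to-block rule follows the participant column of Table I; the cyclic Latin square below is a generic construction and is not the glyph ordering actually used in Table I, and the labels G1 to G5 are placeholders rather than the paper's abbreviations.

```python
def block_of(participant: int, n_blocks: int = 5) -> int:
    """Block number (1-based) for participant P<participant>,
    following the participant column of Table I."""
    return (participant - 1) % n_blocks + 1


def cyclic_latin_square(conditions):
    """Generic cyclic Latin square: each condition appears once per row
    and once per column. Not the specific ordering of Table I."""
    n = len(conditions)
    return [[conditions[(r + c) % n] for c in range(n)] for r in range(n)]


TASKS, DATASETS, GLYPHS, PARTICIPANTS = 4, 4, 5, 20
trials_per_participant = TASKS * DATASETS * GLYPHS      # 80
total_trials = trials_per_participant * PARTICIPANTS    # 1600

# Placeholder labels for the five glyph types.
square = cyclic_latin_square(["G1", "G2", "G3", "G4", "G5"])
```

These counts agree with the 80 subtasks per participant stated in the study design.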

2.5.2 Data Selection

We selected the data carefully to avoid introducing a confounding factor of dataset. We generated the data by randomly sampling some quantum physics simulation results and produced 1000 samples within 3D box size of . There are 445 to 455 sampling locations in each selected data region.

We selected the data satisfying the following conditions: (1) the answers must be at locations where some context information is available, i.e., not too close to the boundary of the testing data; (2) to avoid learning effects, no data sample was shown twice to the same participant; (3) since the data must cover a broad range of measurements, we selected the task-relevant data from each exponential term of 0 to 3 to have a balanced design for task types MAG, RATIO, and MAX.

For task 1 (MAG, What is the magnitude at point A?), point A was in the range of the center of the bounding box in each data sample. In addition, the experiment had four trials for each variable pair with one instance of the exponent values of 0, 1, 2 or 3 being used.

For task 2 (RATIO, What is the ratio of the magnitudes of points A and B?) points A and B are again randomly selected; the choice of exponents is the same as task 1 as well. Thus the ratios were always larger than 1.

For task 3 (COMP, Which magnitude is larger, point A or point B?), the points must again be in the range of the center of the bounding box. The magnitude of one point is around 0.2 times, and the magnitude of the other point around 0.5 times, the maximum magnitude in the data sample used for the corresponding trial.

For task 4 (MAX, Which point has maximum magnitude when the exponent is X?), we select samples in which the minimum magnitude has exponent 0 and the maximum has exponent 3.

2.5.3 Participants

We diversified the participant pool as much as possible, since all tasks can be carried out by those with some science background. Twenty participants (15 male and 5 female) of mean age 23.3 (standard deviation = 4.02) participated in the study, with ten in computer science, three in engineering, two in chemistry, one in physics, one in linguistics, one in business administration, one double-major in computer science and math, and one double-major in biology and psychology. The five female participants were placed one in each of the five blocks (Table I). On average, participants spent about 40 minutes on the computer-based tasks.

2.5.4 Procedure, Interaction, and Environment

Participants were greeted and completed an Institutional Review Board (IRB) informed consent form (which described the procedure, risks and benefits of the study) and a demographic survey. All participants had normal or corrected-to-normal vision and passed the Ishihara color-blindness test. We showed glyph examples and trained the participants with one trial for each of the five glyphs per task. They could ask questions during the training but were told they could not do so during the formal study. Participants practiced until they fully understood the glyphs and tasks.

Participants sat in front of a BenQ GTG XL 2720Z, gamma-corrected display with resolution 1920 × 1080. The distance between the participants and the display was about . The minimum visual angle of task-associated glyphs was in the default view where all data points were visible and filled the screen. Participants could zoom in and out and press “H” to go back to the default view. After the formal study, participants filled in a post-questionnaire asking how these glyphs supported their tasks and were interviewed for their comments.

3 Quantitative Results

This section describes results of the quantitative study by participants who are knowledgeable about engineering or scientific domains.

3.1 Overview

We collected 1600 data points (80 from each of the 20 participants), and there were 400 data points from each of the four tasks. All hypotheses but H2 are supported.

Our results clearly demonstrate the benefits of separable dimensions for comparison. The glyph was the most efficient approach and had the least error. For the comparison tasks (COMP and MAX) in this study, , and were most efficient for simple two-point comparison (Figure 3(c)) and were most accurate for group comparisons (Figure 3(d)). We also compared the results of , , and with the linear approach in Zhao et al. [8] and found that the separable dimensions achieved the same level of temporal accuracy as the direct linear glyph. A most surprising result was that was highly accurate and efficient, with performance similar to that of .

3.2 Analysis Approaches

Table II and Figure 4 show the F and p values computed with SAS one-way repeated measures analysis of variance for task completion time (log-transformed to obtain a normal distribution), the Friedman test of accuracy, and repeated measures logistic regression on confidence levels. Post-hoc analyses are adjusted by Bonferroni correction. All error bars represent 95% confidence intervals.

We evaluated effect sizes using Cohen’s d for log(time) and error, and Cramér’s V for accuracy, to understand the practical significance [32]. We used Cohen’s benchmarks for “small”, “medium”, and “large” effects.
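For readers unfamiliar with these quantities, the sketch below shows one common way to compute Cohen's d and a Bonferroni adjustment. It is not the SAS analysis used in the study: the pooled-standard-deviation form of d is an assumed variant, the function names are hypothetical, and the sample numbers are invented for illustration.

```python
import math
from statistics import mean, stdev


def cohens_d(a, b):
    """Cohen's d using a pooled standard deviation (one common variant;
    the exact variant used in the study is not stated)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (
        na + nb - 2
    )
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)


def bonferroni(p_values):
    """Bonferroni adjustment: multiply each p value by the number of
    comparisons and cap the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented numbers, for illustration only.
d = cohens_d([2, 4, 6], [1, 3, 5])
adjusted = bonferroni([0.01, 0.04, 0.5])
```

With five glyph types there are at most 10 pairwise post-hoc comparisons (5 choose 2), so an unadjusted p value would be multiplied by 10 if all pairs were tested.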

We removed one trial from task MAG (because the answer was out of the data range) and fourteen trials from task MAX (nine because participants didn’t answer within the 30 seconds time frame and five more at one participant’s request due to erroneous input).

Task Variables Significance ES
MAG log(time) F =3.38 p=0.01 d=0.73
Error F, d=0.15
Conf. 6.85,
RATIO log(time) F 3.67 p 0.008 d=0.72
Error ,
Conf. 1.39, p 0.85
COMP log(time) F 8.74 p 0.0001 d=1.02
Accuracy 0.45, p 0.98 V=0.03
Conf. 10.81 p 0.03
MAX Error F=3.90 p=0.006 d=0.50
Conf. 40.72 p 0.0001
TABLE II: Summary statistics by task. Significant main effects and high effect sizes are in bold, and medium effect sizes are in italic. Conf.: confidence; ES: effect size.
Fig. 4: Task completion time (log) and error or accuracy by task: (a) Task 1 (MAG), (b) Task 2 (RATIO), (c) Task 3 (COMP), (d) Task 4 (MAX). The vertical axis shows the accuracy or relative error. Same letters represent the same post-hoc analysis group. Colors label the glyph types. All error bars represent confidence intervals.

3.3 Bivariate Glyph Types vs. Time and Relative Error or Accuracy

Completion Time. We observed a significant main effect of glyph type on task completion time for all three timed tasks MAG, RATIO, and COMP, and the effect sizes were large (Figure 4 and Table II). In the MAG tasks, was in a separate, most efficient group, followed by the and group (Figure 3(a)). In the RATIO tasks, , , and are the most efficient group (Figure 3(b)); in the COMP tasks, , , and are in the most efficient group (Figure 3(c)). In these three timed tasks, was always in the least efficient group.

In the COMP tasks, since there are the same number of sample data, we perform a one-way t-test in SAS’s Mixed procedure to study the effect of glyphs on task completion time (log) with the direct glyph in the previous study [8]. Our post-hoc analysis showed that and were in the same group as the direct encoding solutions with the least temporal cost.

Relative Error or Accuracy. We adopted the error metric for quantitative data of Cleveland and McGill [16] for task types MAG, RATIO, and MAX. This metric calculates the absolute difference between the user’s answer and the true value using the formula $\log_2(|\text{judged} - \text{true}| + 1/8)$, where the base-2 logarithm was appropriate for relative error judgments and the added $1/8$ prevented distortion of the results towards the lower end of the error scale, since some of the absolute errors were close to 0.
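Cleveland and McGill's error measure is usually written as log2(|judged − true| + 1/8). A minimal sketch under that assumption follows; the function name `log_abs_error` is hypothetical, and how the study scaled participants' answers before applying the metric is not stated here.

```python
import math


def log_abs_error(judged: float, true: float, offset: float = 1 / 8) -> float:
    """Log-scaled absolute error in the style of Cleveland and McGill.
    The offset keeps the logarithm finite when the answer is exact."""
    return math.log2(abs(judged - true) + offset)


# An exact answer gives log2(1/8) = -3, the lowest possible score.
exact = log_abs_error(5.0, 5.0)

# An absolute error of 0.875 gives log2(0.875 + 0.125) = 0.
off_by_some = log_abs_error(4.0, 3.125)
```

Lower values mean smaller errors, and the offset bounds the score from below at -3.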

We did not observe differences in the effect of glyph types on relative errors for the first two quantitative tasks of MAG and RATIO. For the two comparison tasks, the glyph type was a significant main effect on accuracy for the COMP tasks and relative error for the MAX tasks. All glyph methods fell into the same group in the COMP tasks and , , and were in the same and most accurate group in the MAX tasks; this suggests that these three methods scale well to larger datasets, because the differences between MAX and COMP are the total number of values to be searched.

3.4 Subjective Confidence and Preference

Participants ranked their confidence levels after each trial during the computer-based study. Preferences were collected in the post-questionnaire. Both data were on a scale of 1 (least confident or preferred) to 7 (most confident or preferred). Significant effects of the glyph types on confidence were observed in the two comparison tasks of COMP and MAX, but not in quantitative tasks of MAG and RATIO. was the top preferred glyph for all tasks followed by and then . The two length-based regardless orthogonal or parallel were least preferred. The confidence levels followed a similar trend as the preferences (Figure 5(b)).

Fig. 5: Subjective confidence (a) and preference (b). Error bars show 95% confidence intervals. Letters (A, B, C) indicate the grouping.

4 Discussion

Fig. 6: Large-magnitude-range contours encoded with our bivariate glyphs: (a) represents the most effective encoding (); (b)-(f) show a sub-region of (a) for a visual comparison of these methods. We observed that the glyph can reveal scene spatial structures just as good as the glyphs.

This section discusses the design knowledge that we can gain from the experiment and factors that influence our design.

4.1 Hypothesis Testing

H1. (Overall). The length-color glyphs would lead to the most accurate answers. [Supported]

We confirmed H1: this design option of length-color led to the best performance for task completion time and accuracy in nearly all tasks (Table 4). It is not surprising that length-color was most accurate. Originally, we thought the effect would be influenced by the amount of information and occlusion. This turns out not to be the case, since with the most occlusions and without pop-out also performed well. This leads us to think that scene guidance, in addition to feature guidance (see Section 4.2), might be among the factors that explain the cognitive benefits of quantitative encoding.

We might also compare the coloring effect with that of Healey, Booth, and Enns [23]. Their single-variate study showed that color was strongly influenced by the surroundings of the stimulus glyph and caused a significant interference effect when participants had to judge heights of glyphs or density patterns. We did not observe such effects here because the colors are discrete and can be easily distinguished.

H2. (Integral-separable, objective). The separable dimension may lead to greater speed and accuracy than the other separable dimensions, and . [Partially supported]

We only partially confirmed H2, in that the general order of these three separable visual variable pairs was largely confirmed. However, the efficiency and effectiveness of these glyphs are very much task dependent.

One of the most interesting results is that resulted in high accuracy and efficiency in nearly all tasks: functioned just as well as the with comparable subjective confidence levels. One possible explanation is that the black/white texture scales on a regular grid may lead to global spatial frequency variation, which attracts attention [12] and thus directly contributes to discrimination of the global and spatial pattern differences.

We tested this conjecture through our observations with some new quantum physics simulation datasets from our collaborators, as shown in Figure 6. We can easily discriminate the boundaries between the adjacent magnitude variations in the (Figure 6 (e)) and (Figure 6 (d)) glyphs, and these two share a similar effect.

was not as bad as we originally thought for handling correspondence errors, especially for the quantitative reading tasks of MAG and RATIO. belonged to the same efficient post-hoc group as and for the RATIO tasks and these three were also most efficient for MAG. RATIO and MAG are the only two quantitative tasks. In contrast, the glyphs did lengthen task completion time and increase errors.

As expected, dropped to the least efficient or most error-prone groups for both comparison tasks of COMP and MAX. This result replicated the former study results in Zhao et al. [8] by showing that harms comparison efficiency or effectiveness. We think that was an effective and efficient glyph for quantitative tasks because the same type used in the glyph perhaps reduced the cognitive load and also because scales of parallel lines are preserved in 3D.

It is worth noting that the only difference among these four tasks was that the first two (MAG and RATIO) involved visual discrimination (knowing precise values or how much larger) and COMP involved visual detection (larger or higher). For MAG and RATIO, a long time may have been spent on mentally synthesizing the numerical values. Our results further confirmed that visual discrimination and visual detection are fundamentally different comparison tasks, as shown in Borgo et al. [33].

The relative errors or accuracy were task-dependent and perhaps depend on set size. No significant main effect on relative errors or accuracy was found in any of the three tasks (MAG, RATIO, and COMP). Note that none of these three tasks required initial visual search, and target answers were labeled. Wolfe called this type of task-driven search with a known target a guided task [20]. was most accurate in all task types. We thought at first that error may be related to so-called proximity, i.e., the perceptual closeness of visual variable choices to the tasks. The coloring was perhaps more direct. However, since the participants commented that they read those quantities, we think the lack of observed differences could well be due to similar mental computation costs. When the search-space set size increases for the MAX tasks, the search becomes time-consuming and none of the length pairs ( and ) was effective.

H3. (Integral-separable, subjective). may lead to better user confidence than the other separable dimensions and . [Supported]

This hypothesis H3 is supported. The strong preference for shown in participants’ feedback clearly shows that the visual distinctness of those colors improved clarity. Seeing the focus information helped participants cognitively relate the power distribution to the color distribution, resulting in close data-mapping proximity.

It is also worth noting that the preference for may correlate more with whether or not the glyphs let participants separate the data into several subgroups and less with the integral-separable dimensions. Evidence for this idea is that participants liked all separable dimensions ( and ) and the mixed-variable pair (), but not the (separable) or . One may suggest, consistent with previous results [34] (with simpler data), that people preferred colorful scenes though the colors did not improve performance.

H4. (Redundant encoding, objective). The redundant encoding () may reduce time and improve accuracy more than . [Supported]

We also confirmed hypothesis H4. We were surprised by the large performance gain with the redundant encoding of mapping and to the exponents in splitVectors. With redundant encoding, the relative error was significantly reduced and task completion time was much shorter (significantly shorter for MAG and COMP tasks). While Ware [9] confirmed that redundancy encoding was for integrating with the encoded dimension, in our case, where color and size are separable, we suggest that the redundancy works because participants can use either size or color in different task conditions. When the integral dimensions of are less accurate, adding a more separable color dimension can be a compensating mechanism that produces more visually separable glyphs and aids participants in their tasks.
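
To make the idea of redundant encoding concrete, here is a minimal sketch that splits a magnitude into a scientific-notation mantissa and exponent and then maps the exponent to two visual channels at once. Several points are assumptions rather than the authors' implementation: that the split follows scientific notation as in the SplitVectors encoding [8], that the mantissa drives length_y while the exponent drives both length_x and color, and every function name. Only the orange color for exponent 3 echoes the discussion of Figure 6; the other palette entries are placeholders.

```python
def split_magnitude(value: float, digits: int = 3) -> tuple:
    """Split a positive magnitude into (mantissa, exponent), base 10."""
    if value <= 0:
        raise ValueError("magnitude must be positive")
    # Formatting in scientific notation avoids floor(log10(x)) rounding issues.
    mantissa_text, exponent_text = f"{value:.{digits}e}".split("e")
    return float(mantissa_text), int(exponent_text)


# Hypothetical discrete palette: one color per exponent.
# Only exponent 3 -> orange is mentioned in the text; the rest are placeholders.
EXPONENT_COLORS = {0: "blue", 1: "green", 2: "yellow", 3: "orange"}


def redundant_glyph(value: float, unit_length: float = 1.0) -> dict:
    """Map one magnitude to glyph parameters with a redundantly coded exponent."""
    mantissa, exponent = split_magnitude(value)
    if exponent not in EXPONENT_COLORS:
        raise ValueError("exponent outside the range covered by this sketch")
    return {
        "length_y": mantissa * unit_length,   # assumed: mantissa -> height
        "length_x": exponent * unit_length,   # assumed: exponent -> width
        "color": EXPONENT_COLORS[exponent],   # same exponent, second channel
    }


print(split_magnitude(3200.0))
print(redundant_glyph(3200.0))
```

Because the exponent appears in both size and color, a viewer could rely on whichever channel is easier to read in a given task, which is the compensating mechanism suggested above.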

Since we can also consider that is a redundant encoding with and did better than in some cases, adding a more separable dimension to the integral encoding may help improve task completion time and accuracy.

H5. (Redundant encoding, subjective). The redundant encoding may lead to higher user confidence levels than the encoding without this redundancy (). [Supported]

We also confirmed hypothesis H5, that colored integral dimensions were highly preferred to integral dimensions alone. As we have explained, we think the reason is that adding color to the integral encoding improved the separability of the structures.

4.2 Feature Guidance vs. Scene Guidance

Taking into account all results, we think an important part of the answer to correspondence error is guidance of attention. In most task-driven viewing, attention is not deployed randomly to objects. It is guided to some objects or locations over others by two broad methods: feature guidance and scene guidance.

Feature guidance refers to guidance by visual features. In a 3D scene, these features are limited to a relatively small subset of visual dimensions: color, size, texture, orientation, shape, blur, shininess, and so on. These features have been broadly studied in 3D glyph design (see reviews by Healey and Enns [24] and Borgo et al. [2]). In this study, for example, the MAX task of searching for the largest value in the power of 3 in Figure 6 guides attention to the orange color, the very dark texture, the fat cylinders, or the longest outer cylinder, depending on the glyph type.

Fig. 7: Contours of a simulation dataset, with panels (a)-(d) each showing one glyph type. Size from this viewpoint can guide visual grouping, and size in 3D must take advantage of knowledge of the layout of the scene [35].

Working with quantum physicists, we have noticed that the structure and content of the scene strongly constrain the possible location of meaningful structures, guided by so-called “scene guidance” constraints [36]. Scientific data are not random and are typically structured. If we return to the MAX search task in Figure 6, we note that the chunks of darker or lighter texture patterns and colors on these regular contour structures strongly influence our quick detection. This is a structural and physical constraint that can be utilized effectively by viewers. This observation, coupled with the empirical study results, may suggest an interesting hypothesis: adding scene structure guidance would speed up quantitative discrimination, improve the accuracy of comparison tasks, and reduce the perceived data complexity.

Another structure-forming cue is size itself. To find large magnitudes in Figure 7, our collaborator suggested that cylinder bases of the same size helped locate and group glyphs belonging to the same magnitude. This observation agrees with the most recent literature, which shows that guidance by size in 3D must take advantage of knowledge of the layout of the scene [35].

Though feature guidance can be pre-attentive and features are detected within a fraction of a second, scene guidance is probably just about as fast, though the precise experiments have not been done. Scene ‘gist’ can be extracted from complex images after very brief exposures [36] [37]. This does not mean that a viewer instantly knows where the smallest magnitude is located for the MAX tasks. However, with a fraction of a second’s exposure, a viewer will know enough about the spatial layout of the scene to guide his or her attention toward the vector groups in the regions of interest.

One future direction for understanding the efficiency and effectiveness of scene guidance is an eye-tracking study that gives viewers a flash of our spatial structures and then lets them see the display only in a narrow range around the point of fixation, in order to demonstrate that this brief preview guides attention and the eyes effectively.

4.3 Regularity Influences Texture Glyph Perception

The most intriguing result is perhaps that the spatial frequencies enabled by texture in achieved as good results as the pair. We also believe that the effective glyph was influenced by the regularity of glyph positions in our quantum physics datasets, which helped form scene guidance. The black-on-white texture can only guide attention when these glyphs are placed on a grid (e.g., Figure 1) or along contours (e.g., Figure 6). Ware’s work [1] is heading in this direction, and it is intriguing to note that Ware’s texton pattern uses discretized shapes (similar to the discrete coloring) that do not lead to spatial frequency variation. It would be an interesting direction to study how texture can help viewers perceive spatial structures and thereby influence feature or scene guidance.

4.4 Use Our Results in Visualization Tools

One limitation of this work is that we measured only a subset of tasks crucial to showing structures and omitted all tasks relevant to orientation. However, one may argue that the vectors naturally encode orientation. When orientation is considered, we could address the multiple-channel mappings in two ways. The first solution is to use the to encode the quantitative glyphs and color to encode the orientations if we cluster the vectors by orientations. The second solution is to treat magnitude and orientation as two data facets and use multiple views to display them separately, with one view showing magnitude and the other showing orientation (following Munzner’s multiform design recommendations [38]).

5 Conclusion

This work shows that correspondence computation is necessary for retrieving information visually and that viewers’ strategies can play an important role. Our results showed that with the separable pairs fall into the same group as the linear ones. Our findings, in general, suggest that the distinguishable separable dimensions will perform better, as we hypothesized. Our empirical study results provide the following recommendations for designing 3D bivariate glyphs.

  • Highly separable pairs can be used for quantitative holistic data comparisons as long as these glyphs are structure forming. We recommend using and .

  • Texture-based glyphs () that introduce spatial-frequency variation are recommended.

  • Both integral and separable bivariate glyphs have similar accuracy when the tasks are guided (i.e., the target location is known). They only influence accuracy when the target is unknown and when the search space increases.

  • A 3D glyph scene would shorten task completion time when it supports structural and feature guidance.

  • The redundant encoding () greatly improved on the performance of integral dimensions () by adding separable and preattentive color features.

Empirical study data and results can be found online at

Acknowledgments


The work is supported in part by NSF IIS-1302755, NSF CNS-1531491, and NIST-70NANB13H181. The user study was funded by NSF grants with the UMBC IRB approval number Y17JC37120. Non-user-study design work was supported by a grant from NIST-70NANB13H181. The authors would like to thank Katrina Avery for her excellent editorial support and all participants for their time and contributions.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Certain commercial products are identified in this paper in order to specify the experimental procedure adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that the products identified are necessarily the best available for the purpose.

J. Chen is the corresponding author.

References


  • [1] C. Ware, “Quantitative texton sequences for legible bivariate maps,” IEEE Transactions on Visualization and Computer Graphics, vol. 15, no. 6, pp. 1523–1529, 2009.
  • [2] R. Borgo, J. Kehrer, D. H. Chung, E. Maguire, R. S. Laramee, H. Hauser, M. Ward, and M. Chen, “Glyph-based visualization: Foundations, design guidelines, techniques and applications,” Eurographics State of the Art Reports, pp. 39–63, 2013.
  • [3] J. Fuchs, P. Isenberg, A. Bezerianos, and D. Keim, “A systematic review of experimental studies on data glyphs,” IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 7, pp. 1863–1879, 2017.
  • [4] M. O. Ward, “Multivariate data glyphs: Principles and practice,” in Handbook of Data Visualization.    Springer Berlin Heidelberg, pp. 179–198, 2008.
  • [5] J. T. Todd, J. S. Tittle, and J. F. Norman, “Distortions of three-dimensional space in the perceptual analysis of motion and stereo,” Perception, vol. 24, no. 1, pp. 75–86, 1995.
  • [6] J. T. Todd and J. F. Norman, “The visual perception of 3-D shape from multiple cues: Are observers capable of perceiving metric structure?” Perception & Psychophysics, vol. 65, no. 1, pp. 31–47, 2003.
  • [7] T. Ropinski, S. Oeltze, and B. Preim, “Survey of glyph-based visualization techniques for spatial multivariate medical data,” Computers & Graphics, vol. 35, no. 2, pp. 392–401, 2011.
  • [8] H. Zhao, G. W. Bryant, W. Griffin, J. E. Terrill, and J. Chen, “Validation of SplitVectors encoding for quantitative visualization of large-magnitude-range vector fields,” IEEE Transactions on Visualization and Computer Graphics, vol. 23, no. 6, pp. 1691–1705, 2017.
  • [9] C. Ware, Information Visualization: Perception for Design.    Elsevier, 2012.
  • [10] W. R. Garner and G. L. Felfoldy, “Integrality of stimulus dimensions in various types of information processing,” Cognitive Psychology, vol. 1, no. 3, pp. 225–241, 1970.
  • [11] A. Treisman and S. Gormican, “Feature analysis in early vision: evidence from search asymmetries,” Psychological Review, vol. 95, no. 1, pp. 15–48, 1988.
  • [12] J. M. Wolfe and T. S. Horowitz, “What attributes guide the deployment of visual attention and how do they do it?” Nature Reviews Neuroscience, vol. 5, no. 6, pp. 1–7, 2004.
  • [13] A. M. Treisman and G. Gelade, “A feature-integration theory of attention,” Cognitive Psychology, vol. 12, no. 1, pp. 97–136, 1980.
  • [14] T. Urness, V. Interrante, I. Marusic, E. Longmire, and B. Ganapathisubramani, “Effectively visualizing multi-valued flow data using color and texture,” IEEE Visualization, pp. 115–121, 2003.
  • [15] J. Mackinlay, “Automating the design of graphical presentations of relational information,” ACM Transactions on Graphics, vol. 5, no. 2, pp. 110–141, 1986.
  • [16] W. S. Cleveland and R. McGill, “Graphical perception: Theory, experimentation, and application to the development of graphical methods,” Journal of the American Statistical Association, vol. 79, no. 387, pp. 531–554, 1984.
  • [17] C. G. Healey and J. T. Enns, “Large datasets at a glance: Combining textures and colors in scientific visualization,” IEEE Transactions on Visualization and Computer Graphics, vol. 5, no. 2, pp. 145–167, 1999.
  • [18] C. G. Healey, K. S. Booth, and J. T. Enns, “Visualizing real-time multivariate data using preattentive processing,” ACM Transactions on Modeling and Computer Simulation, vol. 5, no. 3, pp. 190–221, 1995.
  • [19] J. Bertin, Semiology of Graphics: Diagrams, Networks, Maps.    University of Wisconsin Press, 1967.
  • [20] J. M. Wolfe, “Guided search 4.0,” Integrated Models of Cognitive Systems, pp. 99–119, 2007.
  • [21] J. Wolfe, M. Cain, K. Ehinger, and T. Drew, “Guided search 5.0: Meeting the challenge of hybrid search and multiple-target foraging,” Journal of Vision, vol. 15, no. 12, pp. 1106–1106, 2015.
  • [22] H. Strobelt, D. Oelke, B. C. Kwon, T. Schreck, and H. Pfister, “Guidelines for effective usage of text highlighting techniques,” IEEE Transactions on Visualization and Computer Graphics, vol. 22, no. 1, pp. 489–498, 2016.
  • [23] C. G. Healey, K. S. Booth, and J. T. Enns, “High-speed visual estimation using preattentive processing,” ACM Transactions on Computer-Human Interaction, vol. 3, no. 2, pp. 107–135, 1996.
  • [24] C. Healey and J. Enns, “Attention and visual memory in visualization and computer graphics,” IEEE Transactions on Visualization and Computer Graphics, vol. 18, no. 7, pp. 1170–1188, 2012.
  • [25] T. C. Callaghan, “Interference and dominance in texture segregation: Hue, geometric form, and line orientation,” Perception & Psychophysics, vol. 46, no. 4, pp. 299–311, 1989.
  • [26] S. M. Casner, “Task-analytic approach to the automated design of graphic presentations,” ACM Transactions on Graphics, vol. 10, no. 2, pp. 111–151, 1991.
  • [27] Ç. Demiralp, M. S. Bernstein, and J. Heer, “Learning perceptual kernels for visualization design,” IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 12, pp. 1933–1942, 2014.
  • [28] B. E. Rogowitz and A. D. Kalvin, “The ‘Which Blair Project’: A quick visual method for evaluating perceptual color maps,” IEEE Visualization, pp. 183–191, 2001.
  • [29] J. P. O’Shea, M. Agrawala, and M. S. Banks, “The influence of shape cues on the perception of lighting direction,” Journal of Vision, vol. 10, no. 12, pp. 1–21, 2010.
  • [30] M. Harrower and C. A. Brewer, “ColorBrewer.org: An online tool for selecting colour schemes for maps,” The Cartographic Journal, vol. 40, no. 1, pp. 27–37, 2003.
  • [31] C. Zhang, T. Schultz, K. Lawonn, E. Eisemann, and A. Vilanova, “Glyph-based comparative visualization for diffusion tensor fields,” IEEE Transactions on Visualization and Computer Graphics, vol. 22, no. 1, pp. 797–806, 2016.
  • [32] J. Cohen, Statistical power analysis for the behavioral sciences.    New York: Academic Press, 1988.
  • [33] R. Borgo, J. Dearden, and M. W. Jones, “Order of magnitude markers: An empirical study on large magnitude number detection,” IEEE Transactions on Visualization and Computer Graphics, vol. 20, no. 12, pp. 2261–2270, 2014.
  • [34] A. Forsberg, J. Chen, and D. H. Laidlaw, “Comparing 3D vector field visualization methods: A user study,” IEEE Transactions on Visualization and Computer Graphics, vol. 15, no. 6, pp. 1219–1226, 2009.
  • [35] M. P. Eckstein, K. Koehler, L. E. Welbourne, and E. Akbas, “Humans, but not deep neural networks, often miss giant targets in scenes,” Current Biology, vol. 27, 2017.
  • [36] I. Biederman, “On processing information from a glance at a scene,” ACM SIGGRAPH Workshop on User-oriented Design of Interactive Graphics Systems, 1977.
  • [37] A. Oliva, “Gist of the scene,” Neurobiology of attention, vol. 696, no. 64, pp. 251–258, 2005.
  • [38] T. Munzner, Visualization Analysis and Design.    A K Peters Visualization Series. CRC Press, 2014.