## 1 Statistical 3D Volume Rendering

We employ statistical volume rendering frameworks [SakhaeeEntezari2017, LiuLevineBremer2012, AthawaleJohnsonEntezari2021] for 3D visualizations of the Red Sea eddy simulation ensemble. In a statistical volume rendering framework, per-voxel uncertainty is characterized using a probability distribution, which is estimated from the ensemble members. The probability distributions are then propagated through the direct volume rendering pipeline to derive likely (expected) visualizations for the ensemble.
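As a rough illustration of the estimation step, the following sketch computes per-voxel summaries from an ensemble stored as a NumPy array. The function name `summarize_ensemble`, the array layout `(members, z, y, x)`, and the choice to fit the uniform model from the per-voxel minimum and maximum are assumptions made for this example; the cited frameworks may estimate their parameters differently.

```python
import numpy as np

def summarize_ensemble(ensemble, num_quantiles=5):
    """Estimate per-voxel statistics from an array of shape (members, z, y, x).

    Hypothetical helper for illustration only.
    """
    ensemble = np.asarray(ensemble, dtype=np.float64)

    # Mean field and Gaussian model: sample mean and (unbiased) variance.
    mean = ensemble.mean(axis=0)
    variance = ensemble.var(axis=0, ddof=1)

    # Uniform model (assumption): center and width of the observed value range.
    lo = ensemble.min(axis=0)
    hi = ensemble.max(axis=0)
    uniform_center = 0.5 * (lo + hi)
    uniform_width = hi - lo

    # Nonparametric model: equally spaced quantiles, shape (num_quantiles, z, y, x).
    probs = np.linspace(0.0, 1.0, num_quantiles)
    quantiles = np.quantile(ensemble, probs, axis=0)

    return {
        "mean": mean,
        "variance": variance,
        "uniform_center": uniform_center,
        "uniform_width": uniform_width,
        "quantiles": quantiles,
    }
```

Each summary stores a fixed number of values per voxel regardless of the number of ensemble members, which is what makes a reduced representation of this kind attractive for rendering.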

We derive the expected visualizations for four noise models, namely, uniform [SakhaeeEntezari2017], Gaussian, Gaussian mixture [LiuLevineBremer2012], and nonparametric models [AthawaleJohnsonEntezari2021], as shown in Fig. 1. The visualizations are derived for the velocity magnitude ensemble with members over the domain 40E-50E and 10N-20N for the time step . Fig. 1a visualizes a single ensemble member using arrow glyphs color-mapped by velocity magnitude. High velocity magnitude is generally observed near the vortex rim. The transfer function shown in Fig. 1c maps the regions with relatively high, moderate, and low velocity magnitudes to red, blue, and yellow, respectively. The same transfer function is used for all statistical renderings in Fig. 1.
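To make the notion of an expected visualization concrete, the sketch below evaluates the expected value of a single transfer-function channel (for example, opacity) at one voxel under a uniform and a Gaussian noise model. It uses simple numerical integration over a piecewise-linear lookup table. The function names, the sampling counts, and the truncation of the Gaussian at four standard deviations are assumptions for illustration; the cited frameworks may instead use closed-form expressions.

```python
import numpy as np

def expected_tf_uniform(tf_positions, tf_values, a, b, samples=256):
    """Approximate E[TF(X)] for X ~ Uniform(a, b) with the midpoint rule."""
    if b <= a:  # degenerate interval: no uncertainty at this voxel
        return float(np.interp(a, tf_positions, tf_values))
    step = (b - a) / samples
    x = a + step * (np.arange(samples) + 0.5)
    return float(np.mean(np.interp(x, tf_positions, tf_values)))

def expected_tf_gaussian(tf_positions, tf_values, mu, sigma, samples=513):
    """Approximate E[TF(X)] for X ~ Normal(mu, sigma^2), truncated at 4 sigma."""
    if sigma <= 0:  # degenerate distribution: no uncertainty at this voxel
        return float(np.interp(mu, tf_positions, tf_values))
    x = np.linspace(mu - 4.0 * sigma, mu + 4.0 * sigma, samples)
    weights = np.exp(-0.5 * ((x - mu) / sigma) ** 2)
    weights /= weights.sum()  # normalize the discrete weights
    return float(np.sum(weights * np.interp(x, tf_positions, tf_values)))

# Hypothetical opacity ramp over normalized velocity magnitude in [0, 1].
positions = np.array([0.0, 1.0])
opacity = np.array([0.0, 1.0])
print(expected_tf_uniform(positions, opacity, 0.2, 0.6))
print(expected_tf_gaussian(positions, opacity, 0.5, 0.05))
```

In a renderer, a computation of this kind would replace the usual single transfer-function lookup per sample, so that the classified value reflects the whole per-voxel distribution rather than one data value.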

The expected visualizations derived using the uniform, Gaussian, Gaussian mixture, and nonparametric statistical models (Fig. 1 (d-j)) appear significantly different from the mean-field visualization (Fig. 1b). The mean statistics are highly sensitive to outlier members and therefore do not reliably reconstruct the expected vortical features of the ensemble. In contrast, the distribution-based models display reconstructions with relatively high resilience to the outlier members and indicate the presence of eddies in the regions indicated by , , and (see Fig. 1d). As can be inferred from the statistical renderings, the eddy denoted by can be observed across all distribution models, thus indicating a high likelihood of its presence/position. The eddy indicated by is clearly seen in the uniform, Gaussian, and Gaussian mixture (ordered) models, but not in the remaining noise models. The eddy denoted by exhibits a high level of uncertainty regarding its presence/position, as no noise model shows a clear vortical structure in the same region.

Note that the statistical summarizations using the uniform and Gaussian noise models consume only twice the amount of memory needed for the mean-field statistical approach since they store mean and width/variance per voxel. We use four Gaussians for uncertainty modeling with Gaussian mixtures (see [LiuLevineBremer2012]), which means they consume a multiple of the memory needed for the mean field, since a mean, variance, and weight are stored per Gaussian. Quantile interpolation consumes memory proportional to the number of quantiles (see [AthawaleJohnsonEntezari2021] for more details). The reduced data representation allows for statistical volume rendering at interactive frame rates. (Refer to the supplementary video for the data interaction demo.)

Fig. 2a visualizes a box-plot-like view for the velocity magnitude ensemble. Specifically, we derive the lower quartile (lower 25%), middle quartile (central 50%), and upper quartile (upper 25%) at each voxel of the dataset and visualize each quartile with the uniform statistics. The quartile view [AthawaleJohnsonEntezari2021] gives us insight into variations in features across the three populations. Fig. 2b analyzes the effects of sample size on nonparametric statistical renderings. The dotted boxes in Fig. 2b illustrate the features with relatively high sensitivity to underlying data. Fig. 3 depicts how visualizations evolve for the time steps for the mean and parametric statistics.

## 2 Statistical 2D Morse Complex Summary Maps

Morse and Morse-Smale complexes are topological descriptors that provide abstract representations of the gradient flow behavior of scalar fields [EdelsbrunnerHarerZomorodian2001]. We study the variability of Morse complexes for the Red Sea ensemble members using probabilistic maps [AthawaleJohnsonWang2019] to extract the expected vortex structures as well as to gain insight into the positional variability of expected vortex structures. For our analysis, we use an ensemble of members, in which each member corresponds to a 2D slice perpendicular to the z-axis () for time step . We again analyze the eddies over the domain 40E-50E and 10N-20N. Each ensemble member represents a velocity vector field, and Morse complexes are computed from the negation of velocity magnitudes of each ensemble member to focus on local minima of the vector fields. The probabilistic map computation comprises three steps: persistence simplification for each member, local maxima association across simplified members via labeling, and Morse complex visualization.

### Persistence simplification.

Persistent homology is a tool in topological data analysis for quantifying the significance of topological features. It is widely used for data de-noising through persistence simplification [EdelsbrunnerLetscherZomorodian2002]. We employ persistence simplification to obtain a common label set across all ensemble members, guided by persistence graphs and spaghetti plots in Fig. 4. In particular, at the selected simplification scale (dotted red line) in Fig. 4a, of () members agree on the number of maxima () after simplification.
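The idea behind persistence simplification can be illustrated on a 1D signal, where each local maximum is paired with the value at which its superlevel-set component merges into a component with a higher peak. The sketch below is a simplified 1D analogue written for this explanation, not the 2D Morse complex computation used in this work; the function names are hypothetical.

```python
import math

def maxima_persistence_1d(values):
    """Return {index_of_local_maximum: persistence} for a 1D sequence."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    parent = [None] * n  # None marks vertices not yet added to the filtration
    peak = {}            # component root -> index of the component's highest vertex
    persistence = {}

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Sweep from the highest value downward, merging neighboring components.
    for i in order:
        parent[i] = i
        peak[i] = i
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] is not None:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                if values[peak[ri]] > values[peak[rj]]:
                    keep, drop = ri, rj
                else:
                    keep, drop = rj, ri
                dying = peak[drop]
                if dying != i:  # i only extends a component; it is not a maximum
                    persistence[dying] = values[dying] - values[i]
                parent[drop] = keep

    # The global maximum never merges into a higher component.
    persistence[peak[find(order[0])]] = math.inf
    return persistence

def count_maxima(values, threshold):
    """Number of maxima that survive simplification at the given scale."""
    return sum(1 for p in maxima_persistence_1d(values).values() if p > threshold)
```

Plotting `count_maxima` against the threshold for every member gives a curve comparable in spirit to a persistence graph: a plateau shared by most members suggests a simplification scale at which they agree on the number of maxima.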

We illustrate three ensemble members in Figs. 5a-c, respectively. For each ensemble member, its corresponding simplified Morse complex contains 2-cells that highlight vortical features of ocean eddies (white boxes). The mean field Morse complex in Fig. 5d, however, does not give any insight into the structural uncertainty, that is, the variabilities of these features across the ensemble. The spaghetti plots of the simplified Morse complexes in Fig. 4b do not display the topological consistency of 1-cells, thereby indicating the high variability of simulations. For the simulations with high variability, we benefit from the k-means and Morse mapping labeling strategies for deriving associations among local maxima of ensemble members, as demonstrated below.

### Labeling.

In Fig. 6, we compare the three labeling strategies proposed in [AthawaleJohnsonWang2019]. As illustrated in Fig. 6d, the number of mandatory maxima [DavidJosephJulien2014] is small () since ensemble members have large variations. Simplifying each ensemble member to have maxima will miss most of the features of interest (Fig. 6e). The Morse mapping (Fig. 6a) and the k-means clustering (Fig. 6b-c) strategies, on the other hand, provide reasonable results. In the k-means clustering, we set since we simplified each ensemble member to contain maxima based on the analysis of persistence graphs. Morse mapping is more flexible than k-means because it does not require the same number of maxima across the ensemble.
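As a minimal sketch of the k-means labeling idea, the code below clusters the 2D positions of local maxima pooled from all simplified members, so that maxima falling in the same cluster receive the same label. It is a plain Lloyd iteration written for illustration; the function name, the random initialization, and the toy coordinates are assumptions, and the procedure in [AthawaleJohnsonWang2019] may differ in its details.

```python
import numpy as np

def kmeans_labels(points, k, iterations=50, seed=0):
    """Cluster maxima positions with Lloyd's algorithm; returns (labels, centers)."""
    points = np.asarray(points, dtype=np.float64)
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct input points.
    centers = points[rng.choice(len(points), size=k, replace=False)]
    labels = np.zeros(len(points), dtype=int)
    for _ in range(iterations):
        # Assign every maximum to its nearest center.
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Move each center to the mean of its assigned maxima.
        new_centers = np.array([
            points[labels == c].mean(axis=0) if np.any(labels == c) else centers[c]
            for c in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Hypothetical maxima positions pooled from three members (two maxima each).
maxima = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (8.0, 8.0), (8.1, 7.9), (7.9, 8.2)]
labels, centers = kmeans_labels(maxima, k=2)
```

Because k is fixed in advance, this strategy presumes that every member has been simplified to the same number of maxima, which is the restriction that Morse mapping avoids.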

### Probabilistic map.

We visualize the probabilistic map using color blending [AthawaleJohnsonWang2019] for both k-means clustering and Morse mapping labeling strategies. Both visualizations in Fig. 7 highlight the positional uncertainty of 2-cell boundaries that is invisible in the mean field of Fig. 5d. However, the expected 2-cell boundaries (black contours) using Morse mapping appear to be more spatially stable than those obtained via k-means clustering. The expected 2-cell boundaries extract the expected eddy positions for the ensemble dataset. Figs. 8a-c visualize our entropy-based exploration of the probabilistic map for lower entropy thresholds of , , and , respectively. Figs. 8d-f carve out regions in the domain where at least 80%, 70%, and 60% of the ensemble members, respectively, agree in their gradient destinations. Thus, the shared features denoting the eddy structures across the ensemble are discoverable in Figs. 8d-f. In Fig. 9, the probabilistic map is again visualized for the lower entropy threshold of . The gradient flows originating at the query selections in Fig. 9 have the highest probability of terminating at the local maxima with green, yellow, gray, and pink labels, respectively.
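The quantities explored in Figs. 8 and 9 can be sketched as follows, assuming each member has already been converted into a label image in which every pixel stores the label of the local maximum where its gradient flow terminates. The array layout and the function names are assumptions made for this example, not the interface of the code used in this work.

```python
import numpy as np

def probabilistic_map(label_maps, num_labels):
    """probs[l, y, x]: fraction of members whose flow from (y, x) ends at label l."""
    label_maps = np.asarray(label_maps)  # shape (members, height, width)
    return np.stack([(label_maps == l).mean(axis=0) for l in range(num_labels)])

def entropy_map(probs):
    """Per-pixel Shannon entropy (in bits) of the label distribution."""
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(probs > 0, probs * np.log2(probs), 0.0)
    return -terms.sum(axis=0)

def agreement_mask(probs, min_fraction):
    """Pixels where at least min_fraction of the members share a destination."""
    return probs.max(axis=0) >= min_fraction
```

Low entropy and high agreement both mark pixels where the members share a gradient destination, whereas high-entropy pixels tend to lie along the uncertain 2-cell boundaries.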

## 3 Implementation

For the statistical volume visualizations, the renderings are performed on a machine with an Nvidia Quadro P6000 GPU with 24 GB of memory. We integrated the fragment shaders for our statistical frameworks into the Voreen volume rendering engine (http://voreen.uni-muenster.de) for direct volume rendering (DVR) of ensemble data. For the statistical Morse complex summary maps, we extend the Python code for topological data analysis available at https://pypi.org/project/topopy/. We provide a demo of our techniques in action in a supplementary video.

## 4 Conclusion

We demonstrate the effectiveness of statistical visualization techniques for aggregate analysis of the Red Sea eddy simulation dataset. Specifically, we illustrate applications of statistical volume rendering [SakhaeeEntezari2017, LiuLevineBremer2012, AthawaleJohnsonEntezari2021] and statistical Morse complex summary maps [AthawaleJohnsonWang2019] to extract the likely (expected) eddy positions as well as their variability. The distribution-based data representation in the case of statistical volume rendering allows for the exploration of the large-scale Red Sea eddy simulation ensemble in 3D at interactive frame rates. Additionally, the distribution-based statistics show increased robustness to noise compared to the mean statistics and allow uncertainty to be integrated into the visualizations produced by both statistical techniques.