1 Preliminaries
This section describes the context of reservoir simulation, introduces our formal setting and the metrics which are extended by our work.
1.1 Darcytype porous media simulation
There are multiple models for simulating flow in porous media. Though our viscous finger analysis framework is not limited to a specific simulation model, we introduce here Darcytype simulations, for which the physics is governed by quantities averaged over control volumes. We consider diphasic flow with oil and water.
Eq. 1 describes mass conservation, where is the oil and water phase; is the porosity of the medium; is the mass density of phase ; is the saturation of phase (it stands for the volume fraction of phase ); is the well source term (injection/production) of phase ; and is the velocity of phase . If denotes the total volume, then the mass of component is given by .
(1) 
Darcy’s law is an equation that describes fluid flow in porous media, determined experimentally by H. Darcy in 1856 for one phase [27], and which can be derived from the Stokes equations [104]. Its extension to multiphasic flow is given in Eq. 2, where is the velocity of phase ; is the absolute
permeability tensor of the porous medium;
is the viscosity of ; is the acceleration of gravity; is the pressure of phase ; is the relative permeability of phase . In our model, is a function of water saturation.(2) 
Furthermore, as shown in Eq. 3, oil saturation can be simply expressed in terms of water saturation, and water pressure can be expressed in terms of oil pressure, with being the capillary pressure, a function of water saturation.
(3) 
In this model, the unknowns are the saturations and pressures . The system formed by Eq. 1, 2, and 3 can then be solved numerically to yield the evolution of fluid in porous media under Darcy’s approximation. Moreover, models exist [56, 98, 11] for expressing as a function of , which can be obtained experimentally through centrifugal fan experiments. Relative permeabilities and , also functions of , are more elusive. Numerous models have been proposed in the literature in various contexts [23, 24, 18, 53, 12, 1, 37], and there is a number of methods for building them from interpretation of lab experiments [66, 60, 30, 74, 46]. Their correct definition, however, is key to a realistic description of flow in porous media, and can be quite difficult to obtain depending on the recovery mechanism, especially in processes involving severe viscous fingering patterns (in which case Darcy’s law can become approximate) or when dealing with an extra fluid phase, like an injected gas phase [4], notably because of the limited availability of experimental measurements. In the remainder of this work, relative permeabilities are considered as an input parameter of simulations.
Most of reservoir simulators are based on finite volumes discretizations of Eq. 1, 2, 3 on a gridded 2D or 3D model, in which independent variables are constant in each grid block. These quantities must be determined at each timestep by solving the sets of nonlinear conservation equations. The results shown in the experiments section were obtained in the 2D case with an inhouse research reservoir simulator [70, 49] using an IMPES scheme (IMplicit Pressure, Explicit Saturation) [17], which separately computes saturation with an explicit time approximation, and pressure with an implicit one. At every timestep, scalar data defined on control volumes is updated. As there are multiple variables, the simulator outputs multiple fields, like phase pressures and saturations. The pressure field is very diffusive, and in the diphasic case the saturation is constrained by Eq. 3. Thus, a good indicator of the simulation state is the scalar field of water saturation , which we will use as input data in the following.
1.2 Persistence diagrams
This subsection describes our formal setting. It contains definitions adapted from [100]. An introduction to TDA can be found in [32].
Input data: for each time step, the input saturation data is considered as a piecewise linear (PL) scalar field defined on a PL manifold with in our application. Scalar values are given at the vertices of
and linearly interpolated
elsewhere.Critical points: if is an isovalue, the sublevel set of , noted , is the preimage of the open interval under : . Symmetrically, the surlevel set is . These two objects serve as segmentation tools in multiple analysis tasks [9, 14, 8]. The points where the topology of differs from that of for are called the critical points of
. They can be classified according to their
index : 0 for minima, 1 for 1saddles, for saddles, for maxima.Persistence diagrams: the set of critical points of can be visually represented by a topological abstraction called the persistence diagram [33, 19] (Fig. 1). Specifically, the topological Elder Rule [32] states that critical points can be arranged in a set of pairs, such that each critical point appears in only one pair with and . Such a pairing indicates that a topological feature of (connected component, cycle, void, etc.) created at critical point dies at the critical point . For example, as the value increases, if two connected components of meet at a saddle of , the youngest of the two (the one with the highest minimal value, ) dies at the advantage of the oldest (the one with the lowest minimal value). Critical points and form a persistence pair.
A classical representation of the persistence diagram embeds each pair as a point in the 2D plane at coordinate . The height of the pair is called the persistence and denotes the lifespan of the topological feature created at and destroyed at . In 3D, the persistence of pairs linking critical points of index , and respectively denotes the lifespan of connected components, voids and noncollapsible cycles of . In the following, we will focus on persistence pairs (involving maxima).
The interest of this visual representation in practice is that it quickly hints at the distribution and relative importance of critical points. Small oscillations due to noise in the input data are typically represented by pairs with low persistence, in the vicinity of the diagonal. In contrast, the most prominent topological features are associated with large vertical bars (Fig. 1, b).
1.3 Metrics between Persistence diagrams
Metrics have been defined to evaluate the distance between scalar fields . The norm is a classical example.
In the context of TDA, multiple metrics [19, 15] have been introduced to compare two persistence diagrams and .
Critical point pairs in persistence diagrams can be associated with a pointwise distance, noted inspired by the norm. Given two persistence pairs and , can be defined as:
(4) 
The Wasserstein distance [61, 50], noted , between persistence diagrams and can then be defined as:
(5) 
where is the set of all possible assignments mapping each persistence pair to a persistence pair with identical critical indices or to its diagonal projection, noted – which corresponds to the removal of the corresponding feature from the assignment, with a cost . It is illustrated in Fig. 2. In practice, the Wasserstein distance is computed by solving a variant of the assignment problem [64, 63, 52, 92].
In the applications, this pointwise distance can be finetuned to better account for the layout of critical points in the geometrical domain , as done in applications such as feature tracking [92], resulting in the following lifted pointwise distance:
(6) 
with and in the 2D case, and where stands for the coordinates of the extremum of the persistence pair in the geometrical domain . Coordinates are taken at extrema rather than saddles or midpoints because extrema usually bear more meaning in the applications, but this can be adapted to the specificity of applicative cases.
In the above equation, the coefficients , , and need to be properly tuned for the target application. A geometrically lifted version of the Wasserstein distance, noted , can then be introduced as:
(7) 
2 Analysis framework
This section describes the problem of representing viscous fingers appearing in timevarying saturation fields, comparing them across simulations, and our approach for addressing this problem. In the following, we will note each time step of the reference groundtruth acquisition and each time step of a simulation run . Then, the goal of our framework is to efficiently compute relevant similarity measures, to rank simulation runs in order of increasing distance to the acquisition, so as to present to the experts the most plausible simulations for further inspection (Fig. Ranking Viscous Finger Simulations to an Acquired Ground Truth with Topologyaware Matchings).
Note that, in the rest of the paper, we will consider only one groundtruth acquisition data set, as the acquisition process (further detailed in Sec. 3.1) is long (several months) and involves expensive instrumentations (the process requires the acquisition machinery to operate in a high pressure environment to replicate the reservoir conditions). Because of the rarity of such acquired data, the number of available acquired time steps is in practice significantly lower than the number of simulated time steps . The simulator is thus set up to output additional time steps corresponding to a specific set of volumes of injected water, which were recorded for each time step . This physical criterion allows us to reliably match in time acquired and simulated time steps. The correspondence between simulation and acquisition timesteps is then given and reliable.
2.1 Feature representation
As discussed in the introduction, trying to reproduce the viscous fingering phenomenon with Darcytype simulation software is very challenging because the fingering geometry greatly varies when one modifies input parameters, even slightly. In particular, the input parameters considered here are the relative permeabilities . When comparing a simulation to an acquired groundtruth, this great geometrical variability challenges traditional image based distances, either pointwise based ( norm) or morphing based [26]. Moreover, the raw geometry of the viscous fingers can be insufficient in practice to identify all plausible simulations. Indeed, two geometrically different simulations can be deemed equally plausible by the experts if they share more abstract similarities, involving the number of fingers, their prominence and their progress in the porous medium. Thus, a proper feature representation, capable of abstracting these informations, is required to correctly represent the viscous fingering. Fig. 3 illustrates the extent to which the geometry of fingers may vary across simulations and how clearly distinct simulations can be judged as equally plausible by the experts.
The water saturation scalar field allows one to visually identify fingers, because they form a clear, sharp frontier with the background (as the geometric domain was initially filled with oil). The first step for identifying fingers then consists in extracting a sublevel set of water saturation, for an isovalue chosen properly, to extract the geometric domain where fingers are effectively present. In our use case, based on discussions with experts, we set in practice this isovalue parameter once for all to . Let be that subpart of . The same workflow can be applied on acquired Xray images.
To compare a simulation to an acquired groundtruth, a naive strategy consists in estimating overlaps between the sublevel sets of saturation of the simulation and the acquisition, for a given timestep, and use the area of such an overlap as a measure of likeliness. However, this purely geometric approach appears to be inadequate in practice due to the important variability in the number and shape of fingers, which then would not be accounted for (see Fig. 3).
A natural way of characterizing fingers while taking their shape into consideration is to provide with a descriptive scalar field, for instance a geodesic distance from the injection point. Here, the injection point is the left boundary of the domain, so the scalar field can simply be the geometrical coordinate. Local maxima of this new scalar field would then correspond to the tips of viscous fingers, and saddles to valleys between fingers. Since they correspond to finger tips, maxima of the geometrical coordinate provide a useful information to represent the progress of each finger in the porous medium. Moreover, in this setting, the persistence of the pair involving each maximum directly represents the length of the corresponding finger, which can be used as a reliable measure of importance given this application, to distinguish the main fingers from noise. The persistence diagram directly captures this information, in a robust and hierarchical setting. Fig. 4 illustrates the correspondence between fingers in the domain and pairs of critical points in persistence diagrams. In this context, persistence diagrams seem to be a promising feature representation for viscous fingers, since they efficiently describe their number, progress through the porous medium as well as their prominence.
2.2 Metrics between timevarying persistence diagrams
Considering that viscous fingers are captured by persistence diagrams, computing the similarity of a simulation with respect to a ground truth would require, in a first step, to compute distances between persistence diagrams. As outlined in the introduction, metrics have been introduced for this purpose, notably the (2)Wasserstein distance in the birthdeath space, noted (Eq. 5).
A drawback of is that it does not take into consideration geometrical information, other than coming from the birthdeath space. Fig. 5 illustrates this limitation. To avoid this problem, a lifted adaptation of , noted , including geometrical components can be considered (Eq. 7). It is subject to input parameters indicating the importance given to each geometrical component. Note that, the Earth mover’s distance [57], noted , which is an alternative of interest too for our application, is a special case of , for . It is similar to , but it only operates on the geometrical space instead of the birthdeath space. Thus, the lifted Wasserstein distance can be interpreted as a blend between the distances in the diagram birthdeath space and in the geometrical domain.
In practice, an important characteristic of a viscous fingering simulation run is the moment when the longest finger arrives at the right boundary, called
breakthrough time. Correctly predicting this event is essential because once it is reached, it means a preferential path has been formed, allowing water to easily flow through, impacting production. Thus, the position of local maxima (i.e. fingertips) is more important than the position of saddles (i.e. finger branchings).Then, given a timestep , to compare the persistence diagrams coming from a simulation and the acquisition , metrics should be more sensitive to the advancement of fingertips, then to the global extent of fingers, and lastly to their location in the domain. Thus, at this point, we propose to select the following metrics:

The Earth mover’s distance for local maxima:

The 2Wasserstein distance:

The 2Wasserstein distance, lifted to include geometrical information (the position of critical points): . As in this application, the advancement of fingertips is much more important than their vertical position in the domain, we only consider the coordinate of critical points. Thus, lifting coefficients (cf. Eq. 6) are ( being the extent of the geometrical domain), , and ( being the range of the scalar function). The values of these lifting parameters have been adjusted empirically based on discussions with experts.
Characterizing the evolution of fingers through time raises the necessity to integrate these metrics, as they are intended to evaluate the proximity between persistence diagrams for a single timestep . Thus, to measure the distance from a timevarying simulation to the timevarying acquired ground truth , we introduce timeintegrated versions, based on the norm, of the above metrics:
As suggested by the experts, the displacement speed of the saturation front is key to predicting breakthrough time. They suggested to match in priority simulations which display compatible fronts in terms of velocity during the experiment. Given fingers are captured by persistence diagrams, a possibility for appreciating their evolution with respect to that suggestion would be to compute the sequence of distances between diagrams in successive time steps. In other words, for each couple of consecutive time steps and , compute a distance between and (subsection 1.3), and integrate for all time steps. Here the chaotic behavior displayed by fingers when input simulation parameters change need not be taken into account: we are considering a unique simulation run, which has temporal coherence, therefore it is easier to choose a fitting metric. As shown in Fig. 6, a working solution is the 2Wasserstein distance, lifted to give more importance to the coordinate of maxima: , with lifting coefficients , , ( is the geometrical extent; is the scalar range). Because of the variability in the number of fingers, however, considering the difference of traveled distances alone could be problematic, for many little fingers going slow could compare close to few fast fingers. We then consider the mean traveled distance per finger. Thus, if (resp. ) denotes the number of fingers in the acquisition (resp. simulation) at timestep , we propose to evaluate the velocityoriented difference by:
Then, given an ensemble of timevarying viscous fingering simulations, each run can be compared to the reference acquired groundtruth and runs can be ranked in increasing order of distance to and presented to the experts for further visual inspection.
2.3 Insitu deployment
Doing feature extraction and comparisons can be problematic for very large numbers of simulations, in terms of data movement. Fortunately, computing the metrics we just presented does not require to have all timesteps available at once, and hence may be done in a progressive fashion.
We propose, within our framework, to implement the computation of metrics comparing the acquired reference to the simulation insitu, that is, without storing timesteps to the disk first. Precomputed persistence diagrams for the acquisition are first loaded in memory. Whenever the simulation attains a time for which there is a corresponding acquisition time step, the saturation scalar field is passed to our analysis pipeline, which applies a threshold, extracts the persistence diagram, and computes the per time step distance to the acquisition diagram (for instance ) . The distance can then be accumulated as the simulation unfolds. The insitu application of our pipeline is optional: timesteps can still be saved to the disk and the pipeline applied postmortem if desired.2.4 Visual interface
Each metric previously mentioned naturally produces a ranking of simulations, from the most to the less plausible ones. We propose a way to visually inspect those rankings with a lightweight HTML+Javascript application, as illustrated in Fig. 7. Note that we use the same interface for two tasks, to allow experts to (i) visually explore the rankings generated by our framework and to (ii) produce a groundtruth reference ranking (for the quantitative evaluation, Sec. 3.1). This visual interface offers linked views of the saturation scalar fields, to visually compare simulations runs, for a given time step which can be interactively selected. If needed (in particular to generate a reference ranking, cf. Sec. 3.1), the experts can interactively modify the suggested ranking by displacing a selected run up or down the ranking, either by unit or long jumps (typically skipping or positions, useful for the a priori reference ranking).
3 Case study
This section exposes our experimental setting, details a complete viscous fingering use case and summarizes the results of our approach in terms of performance and quality, compared to classical methods.
3.1 Experimental protocol
The behavior of a slab, initially filled with oil and water at connate water saturation, then subject to a water injection in reservoir conditions is captured through Xrays: Xray images are processed in order to be converted to maps of the fluid saturations within the slab. 2D simulations are then launched with varying input parameters in order to match the simulation results to the experimental measurements and to the fluid saturation maps derived from Xray images. The resemblance of fingers can be taken into account manually by experts, involving an interpretation of Xrays and an assessment of likeliness according to their expertise. A reference ranking of simulations is then produced by the experts with the help of our visual interface (Sec. 2.4), and is compared to the rankings generated by the metrics proposed in our framework (Sec. 2.2). The performance and quality of our approach are then evaluated.
Acquisition: The detailed experimental setup is that of [29, 34], further described in [90, 89, 88]. We consider slabs of Bentheimer sandstone (30902.45 cm), with porosity of approximately 23% and absolute permeability of 2.5 Darcy (when ). The slab is coated with two epoxy layers. Three grooves are cut into the first layer on the extreme faces, and connected to injection and production rails. It is mounted vertically in a 2D Xray scanning rig (Fig. 8).
Slabs first undergo cleaning and calibration processes. A tracer test validates the homogeneous behavior of the slab, then oil is injected to reach initial conditions. The system is then aged at 50°C and ambient pressure for a month, to get closer to field conditions. The water injection rate is kept constant at 3 cm^{3}/h, which corresponds to the velocity in fields far from wells. One of the fluids is doped with an Xray absorbing chemical for increasing the contrast. The scanner is equipped with an Xray source (40 to 60 kV at maximum 0.4 mA) and a camera capturing a slice of 0.511.5 cm. The camera moves in horizontal rows along the slab. A scan for a 3030 cm image takes 45 min, during which the fluid has moved by about 0.1 to 0.2 mm. The captured images, which are noisy and exhibit severe vertical and horizontal artifacts, are filtered [88], and manually segmented by an expert (Fig. 10) to differentiate fingers from the background, hence forming a reference finger geometry .
Simulations: The input parameters of the 2D simulations are relative permeabilities ( and ). In our model, they are a function of water saturation . We consider relative permeabilities in the form of simple Corey curves (Fig. 9, Eq. 8, [11]), subject to parameters (oil relative permeability endpoint), (residual oil saturation), and power law exponents and . Other quantities like (connate water saturation) and (water relative permeability endpoint) are determined by measurement.
(8) 
The parameters of these curves, , , and , were randomly sampled and selected using the algorithm by Wootton, Sergent, PhanTanLuu [79] to ensure a good initial covering of the space. The geometrical domain is discretized on a regular grid of 290890 blocks. 200 runs were launched on 400 simulation nodes (2 MPI ranks per run), then timesteps for which there was a corresponding Xray image were saved (8 available segmented images).
Expert ground truth: Images of the water saturation field were captured for each simulation at available Xray timesteps, for experts to manually form a reference ranking. During this process, experts would quickly discard runs deemed too far from the Xray image (because fingers are advancing too slow, too fast, or in a too diffusive fashion). Then, they would closely look at the shape and advancement of fingers when comparing two close runs. We use our lightweight web based visual interface (Sec. 2.4, Fig. 7), to alleviate this tedious process. Note that images from all simulations were necessary for the experts to form the reference ranking, so the corresponding simulated timesteps were saved to the disk. Once this reference is formed, later analyses can be done insitu.
3.2 Insitu performance
In this section, we evaluate the quantitative gains of using our approach insitu (Sec. 2.3), in terms of time and storage. Tab. 1 provides a CPU time comparison of our analysis pipeline based on the lifted 2Wasserstein metric (Sec. 2.2), for the two different strategies: (i) insitu, where the analysis is run on the fly during the simulation and without data storage and (ii) postmortem, where selected time steps are stored to disk to be analyzed after the simulation has finished. Persistence diagrams are computed using the algorithm by Gueunet et al. [42], and Wasserstein distances are computed using the exact approach by Soler et al. [92], both available in the Topology ToolKit (TTK, [100]). The insitu implementation is based on Catalyst [3], which is called by the simulation code at selected time steps to run a python script instantiating our analysis pipeline.
Step  CPU time (s)  Detailed CPU time (s)  

insitu  postmortem  
Simulation iteration  3.096  3.096  
Time step storage  0.076  
Catalyst analysis  1.111  0.063  Persistence diagram  
0.002  Distance  
0.001  Distance storage  
1.046  Catalyst overhead  
Data transfer  0.021  Lustre to workstation  
Data conversion  0.246  .unrst to .vtk  
Paraview analysis  2.189  0.084  Persistence diagram  
0.002  Distance  
2.085  Paraview overhead  
Analysis time  1.111  2.532  
Total processing  4.207  5.628 
The numbers are given for a single simulation timestep, therefore at the finest possible time resolution (about ten thousand timesteps are required to complete a run). Figures are averages on the timesteps of a typical run. Insitu computations are done on a supercomputer (among the of TOP500 Nov. 2018) with Xeon(R) E52680v3 processors, the postprocess is done on a local workstation with a Xeon(R) E52640v3 processor, so there is a difference in performance obtained for computing persistence diagrams.
Ideally, overheads due to different data layouts and conversions (lines “Catalyst overhead” and “Paraview overhead”, Tab. 1) in the simulator and VTK/ParaView would not enter into account (if the simulator were to directly output a VTK data array). We are left with two unneeded stages in the postmortem approach: time step writes and data transfer. Selecting timesteps from simulations, this amounts to s. of IO time versus approximately ms. necessary to write the doubles in the insitu case (representing the distance estimations), which is times faster.
In terms of data storage, the postmortem strategy requires to store and potentially transfer GiB ( MiB per timestep) of data, versus KiB for doubles (representing the distance estimations), which is times lighter.
Thus, overall, the insitu instantiation of our framework reduces data movement by orders of magnitude, while dividing by the time required to analyze a timestep (line “Analysis time”, Tab. 1).
3.3 Ranking quality
In this section, we evaluate quantitatively the relevance of the rankings obtained with each of the metrics discussed in Sec. 2.2, and compare them to rankings obtained with overlap methods, traditionally used for associating geometrical subdomains [78, 10, 9, 91, 82]. Let be the acquired finger geometry (Sec. 3.1) and be the sublevel set of the simulated water saturation at time . The overlap between and is the volume of divided by the volume of . From this we can define a distance:
Integrating the overlap between and for a single simulation and comparing it to the integrated overlap for the acquisition yields a velocityoriented version:
At this point, we need to compare different rankings to the reference ranking constituted by experts. Let and be two rankings of simulations. One of the most commonly used methods for computing a degree of similarity between and is Kendall’s [25, 39]: for all couples and :
(9) 
It corresponds to the number of pairs for which and in have the same ordering as and in minus the number of pairs for which the orderings in and are different. In other words, it is the difference between the number of concordant pairs and the number of discordant pairs. The closer this number is to in absolute value, the more compatible the rankings, being close to indicates that the two rankings are in reverse order.
Over the 200 examined runs, many were quickly discarded by experts during the manual ranking, because they were too far from the acquisition. Thus, as the order of the poorest runs is not important to the experts, we also compute the similarity with the reference ranking for the best (top 50 and top 25) identified runs according to each method. This enables us to focus the evaluation of the performance of our framework in separating plausible from nonplausible runs. Resulting Kendall coefficients are exposed in Tab. 2.
Method  

All  0.37  0.25  0.26  0.15  0.12  0.41 
Top 50  0.22  0.46  0.66  0.47  0.29  0.46 
Top 25  0.13  0.29  0.84  0.70  0.13  0.42 
Observing lines 2 and 3 in Tab. 2, we can first note that the overlap method (column “”) does not perform well. This behavior was expected because of the very chaotic geometry of fingers. The Wasserstein method, which is the traditional reference metric for comparing persistence diagrams, is shown in column “”. The Earth mover’s distance method (column “”) only takes the geometrical information of extrema into account, regardless of their persistence. It seems to perform better than , which is unexpected, because can wrongly associate smallscale details to largescale ones. The lifted Wasserstein method, which includes persistence information and favors a geometrical direction, is shown in column “”. As it achieves the best overall Kendall coefficients, it seems that manages to combine the advantages of both and , not just being a simple interpolation between the two. Lastly, metrics based on the distances traveled by fingers (columns and ) do not appear to be able to produce relevant rankings.
Method  
too diffuse  0  4  0  5  0  7 
too slow  0  0  0  0  17  0 
common  0  10  21  18  0  11 
appreciation  poor  good  best  poor  wrong  wrong 
3.4 Expert feedback
In this section we expose a qualitative appreciation, collected from experts, and a discussion of ranking results. We show in Tab. 3 the qualitative appreciation of rankings. The poor performance of velocitybased metrics ( and ) was unexpected. Looking at the rankings, we see that aberrant runs are considered close to the ground truth by these two metrics. There are three types of aberrant runs: too slow, too fast, and too diffusive (i.e. whose finger tips grow large and do not form a very sharp frontier with the background, as illustrated in Fig. 11). gives a good score to runs that are too slow, and , on the contrary, scores highly runs that are too diffusive.
The number of slow runs is counted in Tab. 3. The approach, based on overlaps in consecutive timesteps, does not discard them. The reason for this is that in simulations, the water saturation front is very smooth though in the acquisition fingers display a quite dendritic structure. Thus, the overlap between successive timesteps of smooth fingers going slow compares close to the overlap between successive timesteps of thin fingers going fast. The number of runs which exhibit a very diffusive behavior is also counted. These diffuse fingers seem to give trouble to the metric (and also to and ). This is because in the set of available simulations, among all which are diffuse some inevitably end up at the exact same advancement as the acquisition when the threshold stage (Sec. 2.1) is applied. Taking into account the number of fingers () or considering their branching events () is apparently insufficient to discard them. Note that the metric (and even ) were well appreciated by the experts because the top simulations in their rankings display fingers whose tips are quite close to the acquisition near breakthrough time, though for , there seems to be a higher distance variability. As for the basic overlap method , it fails to identify the real best simulations, though it does not incorrectly bring out aberrant runs either (be it too diffuse or too slow). Its ranking, though, feels random to the experts. Overall, the best performing metric seems to be , as confirmed quantitatively and qualitatively.
Taking a step back, the approach we proposed is appealing to experts because it allows them to include geometrical information into their parametric studies, in an autonomous and systematic way (instead of manually inspecting and checking runs). Using the based ranking results, we present in Fig. 12 all permeability curves and those yielding the best simulation runs. We see no clear pattern arising, either visually or numerically with respect to or Haussdorff [65] distances between curves. This confirms the wellknown difficulty of calibrating curves.
4 Conclusion
In this application paper, we presented a framework for enabling the automatic comparison and ranking of simulation runs to an acquired ground truth. We presented a set of metrics specifically adapted to this task in the case of viscous fingering in porous media. After evaluation, we identified the best fitting approach (the metric, which computes a geometrically tuned Wasserstein distance between simulation and acquisition persistence diagrams, on a pertimestep basis). This quantitative measurement method supplements the expert’s, and allows them to automatically form a subjective ranking close to one they would have manually produced. We demonstrated the possibility and showed the advantage of implementing the computation of this metric insitu, speeding up the analysis pipeline by a factor of 2.3 and reducing data movement by orders of magnitude. We proposed a lightweight web interface to explore automatically generated rankings and manually edit them. As with the best metric , there are still some diffuse runs in the ranked best fifty, we believe it could be further enhanced, for instance by considering the sharpness of the water saturation front, or by augmented the persistence diagram with the individual volume of fingers. Besides, though in our experimental setting, simulations took place in a 2D domain, nothing in our approach is restrictive to this case. In future work, our framework could be experimented with simulation models other than Darcy’s. Moreover, it could also be applied to 3D cases. Our overall approach would be usable asis in these scenarios, but its metaparameters (water saturation threshold and lifting coefficients) would likely need to be adjusted. Automatically optimizing the values of these metaparameters is also a promising direction for future work. Furthermore, the metrics introduced in this work have been mostly motivated empirically based on interactions with domain experts. In the future, we will consider a theoretical investigation of the stability of these metrics (based on stability results on the Bottleneck [19] and Wasserstein [20] metrics). Further, ways to capture finger merging events, for instance by considering richer topological structures such as Reeb Graphs, could also be considered, as this seems to happen in acquired images.
On another note, on the set of 200 simulations, we were not able to identify a regime of bestmatching input parameters. The combination of our metric with production and pressure data (at injectors/producers) in a followup study would be interesting in this regard. Trying to understand the influence of the space of input parameters, here curves, proves quite challenging. In our study, only four parameters (power law exponents and endpoints) were sampled, yielding a fourdimensional space, but the number of sampled parameter may be significantly higher. In particular, this study may be extended to permeability curves other than based on simple Corey power laws. We think it would be insightful to develop a visual interface for exploring such spaces of model parameters. We hope to see, in future studies, how accounting for the geometrical and topological quality of a modeled phenomenon can be used to infer or restrict model parameters.
Acknowledgements.
This work is partially supported by the Bpifrance grant “AVIDO” (Programme d’Investissements d’Avenir, reference P1120172661376/DOS0021427), by the French National Association for Research and Technology (ANRT), in the framework of the LIP6  Total SA CIFRE partnership reference 2016/0010 and by the European Commission grant H2020FETHPC “VESTEC” (ref. 800904). The authors would like to thank the anonymous reviewers for their thoughtful remarks and suggestions.References
 [1] F. O. Alpak, L. W. Lake, S. M. Embid, et al. Validation of a modified carmankozeny equation to model twophase relative permeabilities. In SPE Annual Technical Conference and Exhibition. Society of Petroleum Engineers, 1999.
 [2] U. Ayachit, A. Bauer, E. P. Duque, G. Eisenhauer, N. Ferrier, J. Gu, K. E. Jansen, B. Loring, Z. Lukić, S. Menon, et al. Performance analysis, design considerations, and applications of extremescale in situ infrastructures. In SuperComputing, 2016.
 [3] U. Ayachit, A. Bauer, B. Geveci, P. O’Leary, K. Moreland, N. Fabian, and J. Mauldin. Paraview catalyst: Enabling in situ data analysis and visualization. In Proc. of the First Workshop on In Situ Infrastructures for Enabling ExtremeScale Analysis and Vis., pp. 25–29. ACM, 2015.
 [4] L. Baker et al. Threephase relative permeability correlations. In SPE Enhanced Oil Recovery Symposium, 1988.
 [5] J. C. Bennett, H. Abbasi, P. T. Bremer, R. Grout, A. Gyulassy, T. Jin, S. Klasky, H. Kolla, M. Parashar, V. Pascucci, P. Pebay, D. Thompson, H. Yu, F. Zhang, and J. Chen. Combining insitu and intransit processing to enable extremescale scientific analysis. In SC, 2012.
 [6] H. Bhatia, A. G. Gyulassy, V. Lordi, J. E. Pask, V. Pascucci, and P.T. Bremer. Topoms: Comprehensive topological exploration for molecular and condensedmatter systems. J. of Comp. Chem., 2018.
 [7] S. Biasotti, D. Giorgio, M. Spagnuolo, and B. Falcidieno. Reeb graphs for shape analysis and applications. TCS, 2008.
 [8] A. Bock, H. Doraiswamy, A. Summers, and C. Silva. Topoangler: Interactive topologybased extraction of fishes. IEEE TVCG, 2018.
 [9] P. Bremer, G. Weber, J. Tierny, V. Pascucci, M. Day, and J. Bell. Interactive exploration and analysis of large scale simulations using topologybased data segmentation. IEEE TVCG, 2011.
 [10] P. T. Bremer, G. Weber, V. Pascucci, M. Day, and J. Bell. Analyzing and tracking burning structures in lean premixed hydrogen flames. IEEE TVCG, 2010.
 [11] R. H. Brooks and A. T. Corey. Hydraulic properties of porous media and their relation to drainage design. Transactions of the ASAE, 1964.
 [12] F. M. Carlson et al. Simulation of relative permeability hysteresis to the nonwetting phase. In SPE annual technical conference and exhibition. Society of Petroleum Engineers, 1981.
 [13] H. Carr, J. Snoeyink, and U. Axen. Computing contour trees in all dimensions. Computational Geometry, 24(2):75–94, 2003.
 [14] H. Carr, J. Snoeyink, and M. van de Panne. Simplifying flexible isosurfaces using local geometric measures. In IEEE VIS, 2004.
 [15] F. Chazal, D. CohenSteiner, M. Glisse, L. J. Guibas, and S. Oudot. Proximity of Persistence Modules and their Diagrams. In SoCG, 2009.
 [16] C.C. Chen and H.T. Chu. Similarity measurement between images. In 29th Annual International Computer Software and Applications Conference (COMPSAC’05), vol. 2, pp. 41–42. IEEE, 2005.
 [17] Z. Chen, G. Huan, and B. Li. An improved IMPES method for twophase flow in porous media. Transport in porous media, 2004.
 [18] G. L. Chierici et al. Novel relations for drainage and imbibition relative permeabilities. Soc. of Petroleum Engineers Journal, 1984.
 [19] D. CohenSteiner, H. Edelsbrunner, and J. Harer. Stability of persistence diagrams. In Symp. on Comp. Geom., pp. 263–271, 2005.
 [20] D. CohenSteiner, H. Edelsbrunner, J. Harer, and Y. Mileyko. Lipschitz functions have Lstable persistence. Foundations of Computational Mathematics, 2010.
 [21] D. CohenSteiner, H. Edelsbrunner, and D. Morozov. Vines and vineyards by updating persistence in linear time. In Symp. on Comp. Geom., 2006.
 [22] D. A. S. C. A. Committee. Synergistic challenges in dataintensive science and exascale computing. Technical report, DoE Advanced Scientific Computing Advisory Committee, Data Subcommittee, 2013.
 [23] A. T. Corey et al. The interrelation between gas and oil relative permeabilities. Producers monthly, 1954.
 [24] A. T. Corey, C. Rathjens, et al. Effect of stratification on relative permeability. Journal of Petroleum Technology, 8(12):69–71, 1956.
 [25] C. Croux and C. Dehon. Influence functions of the spearman and kendall correlation measures. Stat. Methods & Applications, 2010.
 [26] M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In Proc. of NIPS, 2013.
 [27] H. Darcy. Les fontaines publiques de la ville de Dijon. 1856.
 [28] L. De Floriani, U. Fugacci, F. Iuricich, and P. Magillo. Morse complexes for shape segmentation and homological analysis: discrete models and algorithms. Comp. Grap. For., 2015.
 [29] R. de Loubens, G. Vaillant, M. Regaieg, J. Yang, A. Moncorgé, C. Fabbri, G. Darche, et al. Numerical modeling of unstable waterfloods and tertiary polymer floods into highly viscous oils. SPE Journal, 2018.
 [30] D. A. DiCarlo, S. Akshay, M. Blunt, et al. Threephase relative permeability of waterwet, oilwet, and mixedwet sandpacks. SPE Journal, 5(01):82–91, 2000.
 [31] H. Edelsbrunner and J. Harer. Persistent homology – a survey. American Mathematical Society, 2008.
 [32] H. Edelsbrunner and J. Harer. Computational Topology: An Introduction. American Mathematical Society, 2009.
 [33] H. Edelsbrunner, D. Letscher, and A. Zomorodian. Topological persistence and simplification. Disc. Compu. Geom., 2002.
 [34] C. Fabbri, R. De Loubens, A. Skauge, P. Ormehaug, B. Vik, M. Bourgeois, D. Morel, G. Hamon, et al. Comparison of historymatched water flood, tertiary polymer flood relative permeabilities and evidence of hysteresis during tertiary polymer flood in very viscous oils. In SPE Asia Pacific Enhanced Oil Recovery Conference, 2015.
 [35] G. Favelier, N. Faraj, B. Summa, and J. Tierny. Persistence atlas for critical point variability in ensembles. IEEE TVCG, 2019.
 [36] G. Favelier, C. Gueunet, and J. Tierny. Visualizing ensembles of viscous fingers. In IEEE Scientific Visualization Contest, 2016.
 [37] D. H. Fenwick, M. J. Blunt, et al. Network modeling of threephase flow in porous media. SPE Journal, 3(01):86–96, 1998.
 [38] C. H. Gao et al. Advances of polymer flood in heavy oil recovery. In SPE heavy oil conference and exhibition, 2011.

[39]
A. Göktas and Ö. Isçi.
A comparison of the most commonly used measures of association for doubly ordered square contingency tables via simulation.
Metodoloski zvezki, 8(1):17, 2011.  [40] D. Guenther, R. AlvarezBoto, J. ContrerasGarcia, J.P. Piquemal, and J. Tierny. Characterizing molecular interactions in chemical systems. IEEE TVCG, 20(12):2476–2485, 2014.
 [41] C. Gueunet, P. Fortin, J. Jomier, and J. Tierny. Contour forests: Fast multithreaded augmented contour trees. In IEEE LDAV, 2016.
 [42] C. Gueunet, P. Fortin, J. Jomier, and J. Tierny. Taskbased augmented merge trees with Fibonacci heaps. In IEEE LDAV, 2017.
 [43] A. Gyulassy, P. T. Bremer, B. Hamann, and V. Pascucci. A practical approach to morsesmale complex computation: Scalability and generality. IEEE TVCG, 14(6):1619–1626, 2008.
 [44] A. Gyulassy, D. Guenther, J. A. Levine, J. Tierny, and V. Pascucci. Conforming morsesmale complexes. IEEE TVCG, 2014.
 [45] A. Gyulassy, A. Knoll, K. Lau, B. Wang, P. Bremer, M. Papka, L. A. Curtiss, and V. Pascucci. Interstitial and interlayer ion diffusion geometry extraction in graphitic nanosphere battery materials. IEEE TVCG, 2015.
 [46] J. Hagoort et al. Oil recovery by gravity drainage. Society of Petroleum Engineers Journal, 20(03):139–150, 1980.
 [47] C. Heine, H. Leitte, M. Hlawitschka, F. Iuricich, L. De Floriani, G. Scheuermann, H. Hagen, and C. Garth. A survey of topologybased methods in visualization. Comp. Grap. For., 2016.
 [48] G. M. Homsy. Viscous fingering in porous media. Annual review of fluid mechanics, 19(1):271–311, 1987.
 [49] S. Jaure, A. Moncorge, R. de Loubens, et al. Reservoir simulation prototyping platform for high performance computing. In SPE Large Scale Computing and Big Data Challenges in Reservoir Simulation Conference and Exhibition. Society of Petroleum Engineers, 2014.
 [50] L. Kantorovich. On the translocation of masses. AS USSR, 1942.
 [51] J. Kasten, J. Reininghaus, I. Hotz, and H. Hege. Twodimensional timedependent vortex regions based on the acceleration magnitude. IEEE TVCG, 2011.
 [52] M. Kerber, D. Morozov, and A. Nigmetov. Geometry helps to compare persistence diagrams. J. Exp. Algorithmics, 2017.
 [53] J. Killough et al. Reservoir simulation with historydependent saturation functions. Society of Petroleum Engineers Journal, 1976.
 [54] A. Landge, V. Pascucci, A. Gyulassy, J. Bennett, H. Kolla, J. Chen, and T. Bremer. Insitu feature extraction of large scale combustion simulations using segmented merge trees. In SuperComputing, 2014.
 [55] H. Lavenant, S. Claici, E. Chien, and J. Solomon. Dynamical optimal transport on discrete surfaces. In SIGGRAPH Asia 2018 Technical Papers, p. 250. ACM, 2018.
 [56] M. Leverett et al. Capillary behavior in porous solids. Transactions of the AIME, 142(01):152–169, 1941.
 [57] E. Levina and P. Bickel. The earthmover’s distance is the mallows distance: some insights from statistics. In IEEE ICCV, 2001.
 [58] T. Luciani, A. Burks, C. Sugiyama, J. Komperda, and G. E. Marai. Detailsfirst, show context, overview last: Supporting exploration of viscous fingers in largescale ensemble simulations. IEEE TVCG, 25(1):1225–1235, 2019.
 [59] J. Lukasczyk, G. Aldrich, M. Steptoe, G. Favelier, C. Gueunet, J. Tierny, R. Maciejewski, B. Hamann, and H. Leitte. Viscous fingering: A topological visual analytic approach. In Applied Mechanics and Materials, vol. 869, pp. 9–19. Trans Tech Publ, 2017.
 [60] D. J. MacAllister, K. C. Miller, S. K. Graham, C.T. Yang, et al. Application of xray ct scanning to determine gas/water relative permeabilities. SPE formation evaluation, 8(03):184–188, 1993.
 [61] G. Monge. Mémoire sur la théorie des déblais et des remblais. Académie Royale des Sciences de Paris, 1781.
 [62] K. Moreland, R. Oldfield, P. Marion, S. Jourdain, N. Podhorszki, V. Vishwanath, N. Fabian, C. Docan, M. Parashar, M. Hereld, M. E. Papka, and S. Klasky. Examples of in transit visualization. In Proc. of the 2nd Int. Workshop on Petascale Data Analytics: Challenges and Opportunities, PDAC ’11, pp. 1–6. ACM, 2011.
 [63] D. Morozov. Dionysus. http://www.mrzv.org/software/dionysus, 2010. Accessed: 20160915.
 [64] J. Munkres. Algorithms for the assignment and transportation problems. J. of the Society for Industrial and Applied Mathematics, 1957.
 [65] J. Munkres. Topology. Pearson Education, 2014.
 [66] M. Oak, L. Baker, D. Thomas, et al. Threephase relative permeability of berea sandstone. Journal of Petroleum Technology, 1990.
 [67] P. O’Leary, J. Ahrens, S. Jourdain, S. Wittenburg, D. H. Rogers, and M. Petersen. Cinema imagebased in situ analysis and visualization of mpasocean simulations. Parallel Computing, 55:43–48, 2016.
 [68] V. Pascucci, G. Scorzelli, P. T. Bremer, and A. Mascarenhas. Robust online computation of Reeb graphs: simplicity and speed. ToG, 2007.
 [69] V. Pascucci, X. Tricoche, H. Hagen, and J. Tierny. Topological Data Analysis and Visualization: Theory, Algorithms and Applications. Springer, 2010.
 [70] L. Patacchini, R. De Loubens, A. Moncorge, A. Trouillaud, et al. Fourfluidphase, fully implicit simulation of surfactant flooding. SPE Reservoir Evaluation & Engineering, 17(02):271–285, 2014.
 [71] M. Rasquin, P. Marion, V. Vishwanath, B. Matthews, M. Hereld, K. Jansen, R. Loy, A. Bauer, M. Zhou, O. Sahni, J. Fu, N. Liu, C. Carothers, M. Shephard, M. Papka, K. Kumaran, and B. Geveci. Electronic poster: Covisualization of full data and in situ data extracts from unstructured grid cfd at 160k cores. In Proc. of the 2011 Companion on HPC Networking, Storage and Analysis Companion, SC ’11 Companion, pp. 103–104. ACM, 2011.
 [72] G. Reeb. Sur les points singuliers d’une forme de Pfaff complètement intégrable ou d’une fonction numérique. Acad. des Sci., 1946.
 [73] A. Riaz, G.Q. Tang, H. A. Tchelepi, and A. R. Kovscek. Forced imbibition in natural porous media: Comparison between experiments and continuum models. Physical Review E, 75(3):036305, 2007.
 [74] J. Richardson, J. Kerver, J. Hafford, J. Osoba, et al. Laboratory determination of relative permeability. J. of Petroleum Tech., 1952.
 [75] M. Rivi, L. Calori, G. Muscianisi, and V. Slavnic. Insitu visualization: Stateoftheart and some use cases. 2011.
 [76] V. Robins, P. Wood, and A. Sheppard. Theory and algorithms for constructing discrete morse complexes from grayscale digital images. IEEE Trans. on Pat. Ana. and Mach. Int., 2011.
 [77] P. G. Saffman and G. I. Taylor. The penetration of a fluid into a porous medium or heleshaw cell containing a more viscous liquid. Proceedings of the Royal Society of London. Series A. Mathematical and Physical Sciences, 245(1242):312–329, 1958.
 [78] H. Saikia and T. Weinkauf. Global feature tracking and similarity estimation in timedependent scalar fields. Comp.. Graph. For., 2017.
 [79] J. Santiago, M. ClaeysBruno, and M. Sergent. Construction of spacefilling designs using wsp algorithm for high dimensional spaces. Chemometrics and Intelligent Laboratory Systems, 113:26–31, 2012.
 [80] J. Sharma, S. B. Inwood, A. Kovscek, et al. Experiments and analysis of multiscale viscous fingering during forced imbibition. SPE Journal, 2012.
 [81] N. Shivashankar, P. Pranav, V. Natarajan, R. van de Weygaert, E. P. Bos, and S. Rieder. Felix: A topology based framework for visual exploration of cosmic filaments. IEEE TVCG, 2016.
 [82] D. Silver. Objectoriented visualization. IEEE Comput. Graph. Appl., 15(3):54–62, May 1995.
 [83] D. Silver and X. Wang. Volume tracking. In Visualization ’96. Proceedings., pp. 157–164, Oct 1996.
 [84] D. Silver and X. Wang. Tracking and visualizing turbulent 3d features. IEEE TVCG, 3(2):129–141, Apr 1997.
 [85] D. Silver and X. Wang. Tracking scalar features in unstructured data sets. In Visualization ’98. Proceedings, pp. 79–86, Oct 1998.
 [86] D. Silver and X. Wang. Visualizing evolving scalar phenomena. Future Generation Computer Systems, 15(1):99 – 108, 1999.
 [87] A. Skauge, T. Horgen, B. Noremark, and B. Vik. Experimental studies of unstable displacement in carbonate and sandstone material. In IOR 201116th European Symposium on Improved Oil Recovery, 2011.
 [88] A. Skauge, P. A. Ormehaug, T. Gurholt, B. Vik, I. Bondino, G. Hamon, et al. 2d visualisation of unstable waterflood and polymer flood for displacement of heavy oil. In Improved Oil Recovery, 2012.
 [89] A. Skauge, K. Sorbie, P. A. Ormehaug, and T. Skauge. Experimental and numerical modeling studies of viscous unstable displacement. In European Symposium on Improved Oil Recovery, 2009.
 [90] T. Skauge, B. F. Vik, P. A. Ormehaug, B. K. Jatten, V. Kippe, I. Skjevrak, D. C. Standnes, K. Uleberg, A. Skauge, et al. Polymer flood at adverse mobility ratio in 2d flow by xray visualization. In SPE EOR Conference at Oil and Gas West Asia, 2014.
 [91] B. S. Sohn and C. Bajaj. Timevarying contour topology. IEEE TVCG, 12(1):14–25, 2006.
 [92] M. Soler, M. Plainchault, B. Conche, and J. Tierny. Lifted wasserstein matcher for fast and robust topology tracking. In IEEE Symposium on Large Data Analysis and Visualization, 2018.
 [93] M. Soler, M. Plainchault, B. Conche, and J. Tierny. Topologically controlled lossy compression. In 2018 IEEE Pacific Visualization Symposium (PacificVis), pp. 46–55. IEEE, 2018.
 [94] J. Solomon, F. de Goes, G. Peyré, M. Cuturi, A. Butscher, A. Nguyen, T. Du, and L. J. Guibas. Convolutional wasserstein distances: efficient optimal transportation on geometric domains. ACM ToG, 2015.
 [95] J. Solomon, G. Peyré, V. G. Kim, and S. Sra. Entropic metric alignment for correspondence problems. ACM Trans. Graph., 2016.
 [96] S. W. Son, Z. Chen, W. Hendrix, A. Agrawal, W. k. Liao, and A. Choudhary. Data compression for the exascale computing era  survey. Supercomputing Frontiers and Innovations, 1(2), 2014.
 [97] T. Sousbie. The persistent cosmic web and its filamentary structure: Theory and implementations. Royal Astronomical Society, 2011.
 [98] J. Thomeer et al. Introduction of a pore geometrical factor defined by the capillary pressure curve. Journal of Petroleum Technology, 1960.
 [99] J. Tierny and H. Carr. Jacobi fiber surfaces for bivariate Reeb space computation. IEEE TVCG, 23(1):960–969, 2016.
 [100] J. Tierny, G. Favelier, J. A. Levine, C. Gueunet, and M. Michaux. The Topology ToolKit. IEEE TVCG, 24(1):832–842, 2017.
 [101] J. Tierny, A. Gyulassy, E. Simon, and V. Pascucci. Loop surgery for volumetric meshes: Reeb graphs reduced to contour trees. IEEE TVCG, 15(6):1177–1184, 2009.
 [102] M. Trojer, M. L. Szulczewski, and R. Juanes. Stabilizing fluidfluid displacements in porous media through wettability alteration. Physical Review Applied, 3(5):054008, 2015.
 [103] T. Tsuji, F. Jiang, and K. T. Christensen. Characterization of immiscible fluid displacement processes with various capillary numbers and viscosity ratios in 3d natural sandstone. Adv. in Water Resou., 2016.
 [104] S. Whitaker. Flow in porous media i: A theoretical derivation of darcy’s law. Transport in porous media, 1(1):3–25, 1986.
 [105] H. Yu, C. Wang, R. W. Grout, J. H. Chen, and K. L. Ma. In situ visualization for largescale combustion simulations. IEEE Computer Graphics and Applications, 30(3):45–57, May 2010.
Comments
There are no comments yet.