The capture and recovery of 3D motion in a transparent medium is a complex and challenging task, with important applications in different fields of science and technology: Observations of fluid motion and fluid-structure interaction form the basis of experimental fluid dynamics. Measuring and understanding flow and turbulence patterns is important for aero- and hydrodynamics in the automotive, aeronautic and ship-building industries, e.g. to design efficient shapes and to test the elasticity of components. Biological sciences are another application field, for instance behavioral studies about aquatic organisms that live in flowing water, or the analysis of the flight dynamics of birds and insects [1, 2].
A state-of-the-art technology to capture fluid motion in the laboratory is particle image velocimetry (PIV) [3, 4]. The underlying idea is to inject tracer particles into the fluid, which densely cover the illuminated volume of interest, and to record them with high-speed cameras from multiple viewpoints. Figure 1 shows the basic setup: 3D particle positions and a dense motion field are recovered from a set of input images from two consecutive time steps.
Due to computational limitations, early variants were restricted to 2D, by illuminating only a thin slice of the volume, thus neglecting interactions between different slices. More recent methods operate on the full 3D volume, but divide the problem in two independent steps. First, they recover the 3D particle distribution in the volume using all views at a single time step, by means of tomographic PIV (TomoPIV) , or sparse representations . Second, they employ dense correspondence estimation between consecutive particle volumes to obtain the 3D motion of the fluid. The latter step is often simply an exhaustive matching of large enough local 3D windows [8, 9, 10] or, more recently, a variational flow estimation , with suitable priors on the motion field to limit the size of the matching window. We argue that any such two-step approach is sub-optimal, for several reasons. At each time step, the same particles are reconstructed without looking at the images at the other time step, effectively halving the available data. E.g., ignoring that a particle strongly supported by the later frame “must have come from somewhere”. Similarly, valuable information is lost when discarding the input images after the first step. E.g., one cannot check whether a weak particle that unnaturally distorts the motion field “should be there at all”; particularly since physical limitations in practice restrict camera placement to narrow baselines, such that the initial reconstruction is ambiguous. Moreover, to achieve a resolution similar to the original images the volume must be discretized at high resolution, making the computation memory-hungry and expensive. Also, it is well documented that large 3D matching windows are needed for good performance, which undermines the benefits of high resolution.
In this work we propose a joint energy model for the reconstruction of the particles and the 3D motion field, so as to capture the inherent mutual dependencies. The model uses the full information – all available images from both time steps at full resolution – to solve the problem. We opt for a hybrid Lagrangian/Eulerian approach: particles are modeled individually, while the motion field is represented on a dense grid. Recovering explicit particle locations and intensities avoids the need for a costly 3D parameterization of voxel occupancy, as well as the use of a large matching window. Instead, it directly compares evidence for single particles in the images. In our experience, this yields significantly higher accuracy.
To represent the motion field, we opt for a trilinear finite element basis. Modeling the 3D motion densely allows us to incorporate physical priors that account for incompressibility and viscosity of the observed fluid . This can be done efficiently, at a much lower voxel resolution than would be required for particle reconstruction, due to the smoothness of the 3D motion field [10, 11].
We model our problem in a variational setting. To better resolve particle ambiguities, we add a prior to our energy that encourages sparsity. In order to overcome weak minima of the non-convex energy, we include a proposal generation step that detects putative particles in the residual images, and alternates with the energy minimization. For the optimization itself, we can rely on the very efficient inertial Proximal Alternating Linearized Minimization (iPALM) [12, 13]. It is guaranteed to converge to a stationary point of the energy and has a low memory footprint, so that we can reconstruct large volumes.
, which both address the problem sequentially with two independent steps, we achieve particle reconstructions with higher precision and recall, at all tested particle densities; and greatly improved motion estimates. The estimated fluid flow visually appears on par with state-of-the-art techniques like, which use tracking over multiple time steps and highly engineered post-processing [15, 16].
2 Related Work
. PIV outputs a dense velocity field (Eulerian view), while PTV computes a flow vector at each particle location (Lagrangian view). The first method to operate in 3D was 3D-PTV, where individual particles are detected in different views, triangulated and tracked over time. Yet, as the particle density increases the particles quickly start to overlap in the images, leading to ambiguities. Therefore, 3D-PTV is only recommended for densities ppp (particles per pixel). To handle higher densities,  introduced Tomo-PIV. They first employ a tomographic reconstruction (e.g. MART) 
per time step, to obtain a 3D voxel space of occupancy probabilities. Cross-correlation with large local 3D windows () [8, 9, 10] then yields the flow. Effectively, this amounts to matching particle constellations, assuming constant flow in large windows, which smoothes the output to low effective resolution. Recently, a new particle tracking method Shake-the-Box (StB) was introduced . It builds on the idea of iterative particle reconstruction (IPR) , where triangulation is performed iteratively, with a local position and intensity refinement after every triangulation step.
None of the above methods accounts for the physics of (incompressible) fluids during reconstruction. In StB 
, a post-process interpolates sparse tracks to a regular grid, at that step (but not during reconstruction), physical constraints can be included. combine TomoPIV with variational 3D flow estimation and account for physical constraints, with a regularizer derived from the stationary Stokes equations. However, their data term requires a huge, high-resolution intensity volume, and a local window of voxels for matching, which lowers spatial resolution, albeit less than earlier local matching.  proposed a similar approach for dye-injected two-media fluids. Their aim are visually pleasing, rather than physically correct results, computed for relatively small volumes ( voxels). We note that dye leads to data that is very different from tracer particles, e.g., it produces structures that can be matched more easily, but does not evenly cover the volume. Dalitz et al.  use compressive sensing to jointly recover the location and motion of a sparse, time-varying signal. Results are only shown for small grids (), and the physics of fluids is not considered. Barbu et al.  introduce a joint approach for 3D reconstruction and flow estimation, however, without considering physical properties of the problem. Their purely Eulerian, voxel-based setup limits the method to small volume sizes, i.e., the method is only evaluated on a volume of voxels and a rather low seeding density of 0.02 ppp. Recently, Xiong et al.  propose a joint formulation for their single-camera PIV setup. The volume is illuminated by rainbow-colored light planes that encode depth information. This permits the use of only a single camera with the drawback of lower depth accuracy and limited particle density. Voxel occupancy probabilities are recovered on a discrete 3D grid, similar to tomographic reconstruction. In order to handle the ill-conditioned problem from a single camera, constraints on particle sparsity and motion consistency (including physical constraints) are incorporated in the optimization. The method operates on a “thin” maximum volume of . The single-camera setup does not allow a direct comparison with standard 3D PIV/PTV, but can certainly not compete in terms of depth resolution. In contrast, by separating the representation of particles and motion field, our hybrid Lagrangian/Eulerian approach allows for sub-pixel accurate particle reconstruction and large fluid volumes.
Volumetric fluid flow is also related to variational scene-flow estimation, especially methods that parameterize the scene in 3D space [24, 25]. Like those, we search for the geometry and motion of a dynamic scene and exploit multiple views, yet our goal is a dense reconstruction in a given volume, rather than a pixel-wise motion field. Scene flow has undergone an evolution similar to the one for 3D fluid flow. Early methods started with a fixed, precomputed geometry estimate [26, 27], with a notable exception . Later work moved to a joint reconstruction of geometry and motion [24, 29, 25]. Likewise, [6, 11] precompute the 3D tracer particles  before estimating their motion. The present paper is, to our knowledge, the first in multi-camera PIV to jointly determine the explicit particle geometry and (physically-constrained) motion of the fluid. Hence, our model can be seen either as an extension of the flow algorithm  to simultaneously reconstruct the particles, or an extension of TomoPIV with sparse representations [30, 7], to simultaneously reconstruct the particles at both time steps, and the associated motion field.
Several scene flow methods [31, 32, 33] overcome the large state space by sampling geometry and motion proposals, and perform discrete optimization over those samples. In a similar spirit, we employ IPR to generate discrete particle proposals, but then combine them with a continuous, variational optimization scheme. We note that discrete labeling does not suit our task: The volumetric setting would require a large number of labels (3D vectors), and enormous amounts of memory. And it does not lend itself to sub-voxel accurate inference.
To set the scene, we restate the goal of our method: densely predict the 3D flow field in a fluid seeded with tracer particles, from multiple 2D views acquired at two adjacent time steps.
We aim to do this in a direct, integrated fashion. The joint particle reconstruction and motion estimation is formulated as a hybrid Lagrangian/Eulerian model, where we recover individual particles and keep track of their position and appearance, but reconstruct a continuous 3D motion field in the entire domain. A dense sampling of the motion field makes it technically and numerically easier to adhere to physical constraints like incompressibility. In contrast, modeling particles explicitly takes advantage of the low particle density in PIV data. Practical densities are around particles per pixel (ppp) in the images. Depending on the desired voxel resolution, this corresponds to lower volumetric density. Our complete pipeline is depicted in Fig. 3. It alternates between generating particle proposals (Sec. 3.2) based on the current residual images (starting from the raw input images), and energy minimization to update all particles and motion vectors (Sec. 3.3). The correspondent energy model is described in Sec. 3.1. In the process, particle locations and flow estimates are progressively refined and provide a better initialization for further particle proposals.
Particle triangulation is highly ambiguous, so the proposal generator will inevitably introduce many spurious “ghost” particles (Fig. 3). A sparsity term in the energy reduces the influence of low intensity particles that usually correspond to such ghosts, while true particles, given the preliminary flow estimate, receive additional support from the data of the second time step. In later iterations, already reconstructed particles vanish in the residual images. This allows for a refined localization of remaining particles, as particle overlaps are resolved.
Notation and Preliminaries. The scene is observed by calibrated cameras , recording the images at time . Parameterizing the scene with 3D entities obviates the need for image rectification. In our formulation we do not commit to a particular camera model and instead define a generic projection operator per camera. We note that fluid experiments typically need sophisticated models to deal with refraction (air-glass and glass-water), or an optical transfer function derived from volume self-calibration. Both are not the focus of this work and can be found in the respective literature, e.g. [34, 35].
The dependency on time is denoted via superscript , and omitted when possible. We denote the set of particles , composed of a set of intensities , and positions , where each is located in the rectangular domain . The 3D motion field at position , between times and , is . The set contains motion vectors located at a finite set of positions . If we let these locations coincide with the particle positions, we would arrive at a fully Lagrangian design, also referred to as smoothed particle hydrodynamics [36, 37]. In this work, we prefer a fixed set and represent the functional by trilinear interpolation, i.e. we opt for an Eulerian description of the motion. Our model is, thus, similar to the so-called particle in cell design . W.l.o.g. we assume , i.e., we set up a regular grid of vertices of size , which induce a voxel covering of size of the whole domain. Each grid vertex is associated with a trilinear basis function: . The elements now represent the coefficients of our motion field function that is given by:
3.1 Energy Model
With the definitions above, we can write the energy
with a data term , a smoothness term operating on the motion field, and a sparsity prior operating on the intensities of the particles.
Data Term. To compute the data term, the images of all cameras at both time steps are predicted from the particles’ positions and intensities, and the 3D motion field. penalizes deviations between predicted and observed images:
Following an additive (in terms of particles) image formation model, we integrate over the image plane of camera ; denotes the Iverson bracket. We model individual particles
as Gaussian blobs with variance. Particles do not exhibit strong shape or material variations. Their distance to the light source does influence the observed intensity. But since it changes smoothly and the cameras record with high frame-rate, assuming constant intensity is a valid approximation for our two-frame model.
In practice, the projection in (3) can be assumed to be almost orthographic, with little perspective distortion of the particles. The depth range of the volume is small compared to the distance from the camera. Hence, we assume that particles remain Gaussian after a projection into the image. In that regime, and omitting constant terms, the expression for a projected particle simplifies to
When computing the derivatives of (3) w.r.t. the set of parameters, we do not integrate the particle blobs over the whole image, but restrict the area of influence of (4) to a radius of , covering % of its total intensity.
Sparsity Term. The majority of the generated candidate particles do not correspond to true particles. To remove the influence of the huge set of low-intensity ghost particles one can exploit the expected sparsity of the solution, e.g. . In other words, we aim to reconstruct the observed scenes with few, bright particle, by introducing the following energy term:
Here, denotes the indicator function of the set . Popular sparsity-inducing norms are either the - or -norm (, respectively ). We have investigated both choices and prefer the stricter -norm for the final model. The -norm counts the number of non-zero intensities and rapidly discards particles that fall below a certain intensity threshold (modulated by in (2)). While the -norm only gradually reduces the intensities of weakly supported particles.
Smoothness Term. To define a suitable smoothness prior we follow  and employ a quadratic regularizer per component of the flow gradient, plus a term that enforces a divergence-free motion field:
It has been shown in  that (6) has a physical interpretation, in that the stationary Stokes equations emerge as the Euler-Lagrange equations of the energy (6), including an additional force field. Thus, (6) models the incompressibility of the fluid, while represents its viscosity.  also suggest a variant in which the hard divergence constraint is replaced with a soft penalty:
This version simplifies the numerical optimization, trading off speed for accuracy. For adequate (large) , the results are similar to the hard constraint in (6). Eq. (6) requires the computation of divergence and gradients of the 3D motion field. Following the definition (1) of the flow field, both entities are linear in the coefficients and constant per voxel . A valid discretization of the divergence operator can be achieved via the divergence theorem:
where we let denote the outward-pointing normal of voxel at position and the unit vector in direction . The final sum considers pairs of corner vertices of voxel , adjacent in direction . The definition of the per-voxel gradient follows from (1) in a similar manner.
3.2 Particle Initialization
To obtain an initial set of 3D particle locations we employ a direct detect-and-triangulate strategy like IPR  and iteratively triangulate putative particles, in alternation with the minimization of energy (2). Particle triangulation is extremely ambiguous and not decidable with local cues (Fig. 3). Instead, all plausible correspondences are instantiated. One can interpret the process as a proposal generator for the set of particles, which interacts with the sparsity constraint (5). This proposal generator creates new candidate particles where image evidence remains unexplained. The sparsity prior ensures that only “good” particles survive and contribute to the data costs, whereas those of low intensity that are inconsistent with our model become “zero-intensity” particles. Particles of low intensity are uncommon in reality. In each iteration the set of zero-intensity particles are actively discarded from
to reduce the workload. Note that this does not change the energy of the current solution. After the first particles and a coarse motion field have been reconstructed, a better initialization is available to spawn further particles, in locations suggested by the residual maps between predicted and observed images. Particles that contribute to the data are retained in the subsequent optimization and help to refine the motion field, etc. The procedure is inspired by the heuristic, yet highly efficient, iterative approach of. They also refine particle candidates triangulated from residual images. Other than theirs, our updated particle locations follow from a joint spatio-temporal objective, and thus also integrate information from the second time step. In more detail, each round of triangulation proceeds as follows: first, detect peaks in 2D image space for all cameras at time step . In the first iteration this is done in the raw inputs, then in the residual images . Peaks are found by non-maximum suppression with a kernel, followed by sub-pixel refinement of all peaks with intensity above a threshold . We treat one of the cameras, , as reference and compute the entry and exit points to for a ray passing through each peak. Reprojecting the entry and exit into other views yields epipolar line segments, along which we scan for (putatively) matching peaks (Fig. 3). Whenever we find peaks in all views that can be triangulated with a reprojection error below a tolerance , we generate a new candidate particle. Its initial intensity is set as a function of the intensity in the reference view and the number of candidates: if proposals are generated at a peak in the reference image, we set for each of them.
3.3 Energy Minimization
Our optimization is embedded in a two-fold coarse-to-fine scheme. On the one hand, we start with a larger value for , so as to increase the particles’ basins of attraction and improve convergence. During optimization, we progressively reduce until we reach , meaning that a particle blob covers approximately the same area as in the input images. On the other hand, we also start at a coarser grid and refine the grid resolution along with .
To minimize the non-convex and non-smooth energy (2) for a given , we employ PALM , in its inertial variant . Because our energy function is semi-algebraic [12, 39], it satisfies the Kurdyka-Lojasiewicz property , therefore the sequence generated by PALM globally converges to a critical point of the energy. The key idea of PALM is to split the variables into blocks, such that the problem is decomposed into one smooth function on the entire variable set, and a sum of non-smooth functions in which each block is treated separately. We start by arranging the locations and intensities of the particles into two separate vectors and . Similarly, we stack the coefficients of the trilinear basis . With these groups, we split the energy functional into a smooth part and two non-smooth functions, for the intensities and for the motion vectors :
For notation convenience, we define . The algorithm then alternates the steps of a proximal forward-backward scheme: take an explicit step w.r.t. one block of variables on the smooth part of the energy function, then take a backward (proximal) step on the non-smooth part w.r.t. the same variables. That is, we alternate steps of the form
with a suitable step size for each block of variables. Here and in the following, the placeholder variable can stand for , or , as required.
A key property is that, throughout the iterations, the partial gradient of function w.r.t. a variable block must be globally Lipschitz-continuous with some modulus at the current solution:
Note that Lipschitz continuity of the gradient of has to be verified only locally, at the current solution. This property allows for a back-tracking approach to determine the Lipschitz constant, e.g. . Algorithm 1 provides pseudo-code for our scheme to minimize the energy (9). To accelerate convergence we apply extrapolation (lines 4/8/12). These inertial steps, c.f. [13, 42], significantly reduce the number of iterations in the algorithm, while leaving the computational cost per step practically untouched. It is further convenient to not only reduce the step sizes (lines 7/11/15 in Alg. 1), but also to increase them, as long as (12) is fulfilled, to make the steps per iteration as large as possible.
One last thing needs to be explained, namely how we find the solution of the proximal steps on the intensities and flow vectors . The former can be solved point-wise, leading to the following 1D-problem:
which admits for a closed-form solution for both norms ():
The proximal step for the flow vector , , requires the projection of onto the space of divergence-free 3D vector fields. Given , the solution is independent of the step size , which we omit in the following. We construct the Lagrangian by introducing the multiplier , a scalar vector field whose physical meaning is the pressure in the fluid :
To prevent confusion, we introduce as matrix notation for the linear divergence operator in (8). The KKT conditions of the Lagrangian yield a linear equation system. Simplification with the Schur complement leads to a Poisson system, which we solve for the pressure to get the divergence-free solution:
Again interpreted physically, the divergence of the motion field is removed by subtracting the gradient of the resulting pressure field. For our problem of fluid flow estimation, it is not necessary to exactly solve the Poisson system in every iteration. Instead, we keep track of the pressure field during optimization, and warm-start the proximal step. In this way, a few (10-20) iterations of preconditioned conjugate gradient descent suffice to update .
If we replace the hard divergence constraint with the soft penalty from (7), we add to the smooth function in (9). Then only the proximal step on the intensities is needed in Alg. 1. We conclude by noting that accelerating the projection step is in itself an active research area in fluid simulation [43, 44, 45].
There is no other measurement technique that could deliver ground truth for fluid flow. We follow the standard practice and generate datasets for quantitative evaluation via direct numerical simulations (DNS) of turbulent flow, using the Johns Hopkins Turbulence Database (JHTDB) [46, 47]. This allows us to render realistic input images with varying particle densities and flow magnitudes, together with ground truth vectors on a regular grid. We evaluate how our approach performs with different smoothness terms, particle densities, initialization methods, particle sizes and temporal sampling rates. Additionally, we show results on “test case D” of the 4 International PIV Challenge . We quantitatively compare to the best performing method  and refer to the supplementary material for further comparisons as well as additional visualizations, also with the experimental setup of Fig. 1.
Simulated dataset. We follow the guidelines of the 4 International PIV Challenge  for the setup of our own dataset: Randomly sampled particles are rendered to four symmetric cameras of resolution pixels, with viewing angles of w.r.t. the -plane of the volume, respectively w.r.t. the -plane. If not otherwise specified, particles are rendered as Gaussian blobs with and varying intensity. We sample 12 datasets from 6 non-overlapping spatial locations and 2 temporal locations of the forced isotropic turbulence simulation of the JHTDB. Using the same discretization level as , 4 voxels for each DNS grid point, each dataset corresponds to a volume size of . For our flow fields with flow magnitudes up to voxels, we use pyramid levels with downsampling factor . At every level we alternate between triangulation of candidate particles and minimization of the energy function (at most iteration per level).
The effective resolution of the reconstructed flow field is determined by the particle density. At a standard density of ppp and a depth range of voxels, we get a density of particles per voxel. This suggests to estimate the flow on a coarser grid. We empirically found a particle density of per voxel to still deliver good results. Hence, we operate on a subsampled voxel grid of 10-times lower resolution per dimension in all our experiments, to achieve a notable speed-up and memory saving. The computed flow is then upsampled to the target resolution, with barely any loss of accuracy.
We always require a 2D intensity peak in all four cameras to instantiate a candidate particle. We start with a strict threshold of for the triangulation error, as suggested in , which is relaxed to in later iterations. The idea is to first recover particles with strong support, and gradually add more ambiguous ones, as the residual images become less cluttered. We set for our dataset. Since corresponds to the viscosity it should be adapted for other fluids. The sparsity weight is set dependent on the intensity threshold used for particle generation, which in turn is set to of the average particle intensity, as recommended in .
Regularization. Our framework allows us to plug in different smoothness terms. Following , we show results for hard () and soft divergence regularization (). Average endpoint error (AEE) and average angular error (AAE), as well as the average absolute divergence (AAD) are displayed in Tab. 1. Compared to our default regularizer , removing the divergence constraint (), increases the error by . With the soft constraint at high , the results are equal to those of . We also compare to the method  (original code, kindly provided by the authors). Our joint model improves the performance by over that recent baseline, on both error metrics. In Fig. 4 we visually compare our results (with hard divergence constraint) to those of . The figure shows the flow in -direction in one particular -slice of the volume. Our method recovers a lot finer details, and is clearly closer to the ground truth.
Particle Density & Initialization Method. There is a trade-off for choosing the seeding density: A higher density raises the observable spatial resolution, but at the same time makes matching more ambiguous. This causes false positives, commonly called “ghost particles”. Very high densities are challenging for all known reconstruction techniques. The additive image formation model of Eq. (3) also suggests an upper bound on the maximal allowed particle density. Tab. 2 reports results for varying particle densities. We measure recall (fraction of reconstructed ground truth particles) and precision (fraction of reconstructed particles that coincide with true particles to pixel).
To provide an upper bound, we initialize our method with ground truth particle locations at time step 0 and optimize only for the flow estimation. We also evaluate a sequential version of our method, in which we separate energy (2) into particle reconstruction and subsequent motion field estimation. In addition to our proposed IPR-like triangulation method, we initialize particles with a popular volumetric tomography method (MART) . MART creates a full, high-resolution voxel grid of intensities (with, hopefully, lower intensities for ghost particles and very low ones in empty space). To extract a set of sub-voxel accurate 3D particle locations we perform peak detection, similar to the 2D case for triangulation. Since MART always returns the same particle set we run it only once, but increase the number of iterations for the minimizer from to .
Starting from a perfect particle reconstruction (true particles) the flow estimate improves with increasing particle density. Remarkably, our proposed iterative triangulation approach achieves results comparable to the ground truth initialization, up to a very high particle density of ; which suggests that it is able to resolve all particle ambiguities. In contrast, MART and the sequential baseline struggle with increasing particle density, which supports our claim that joint energy minimization can better reconstruct the particles.
|ppp||IPR joint||IPR sequential||MART||true particles|
Particle Size. For the above experiments, we have rendered the particles into the images as Gaussian blobs with fixed , and the same is done when re-rendering for the data term, respectively, proposal generator. We now test the influence of particle size on the reconstruction, by varying . Tab. 4 shows results with hard divergence constraint and fixed particle density , for varying . For small enough particles, size does not matter, very large particles lead to more occlusions and degrade the flow. Furthermore, we verify the sensitivity of the method to unequal particle size. To that end, we draw an individual
for each particle from the normal distribution, while still using a fixed during inference. As expected, the mismatch between actual and rendered particles causes slightly larger errors.
Temporal Sampling. To quantify the stability of our method to different flow magnitudes we modify the time interval between the two discrete time steps and summarize the results in Tab. 4, together with the respective maximum flow magnitude . For lower frame rate (1.25x and 1.5x), and thus larger magnitudes, we set our pyramid downsampling factor to .
PIV Challenge. Unfortunately, no ground truth is provided for the data of the 4 PIV Challenge , such that we cannot run a quantitative evaluation on that dataset. However, Schanz et al.  kindly provided us results for their method, StB, for snapshot 10. StB was the best-performing method in the challenge with an endpoint error of 0.24 voxels (compared to errors 0.3 for all competitors). The average endpoint difference between our approach and StB is 0.14 voxels. In Fig. 5 both results appear to be visually comparable, yet, note that StB includes a tracking procedure that requires data of multiple time steps ( for the given particle density ). We show additional visualizations and a qualitative comparison with the ground truth in the supplementary material.
We have presented the first variational model that jointly solves sparse particle reconstruction and dense 3D fluid motion estimation in PIV data for the common multi-camera setup. The sparse particle representation allows us to utilize the high-resolution image data for matching, while keeping memory consumption low enough to process large volumes. Densely modeling the fluid motion in 3D enables the direct use of physically motivated regularizers, in our case viscosity and incompressibility. The proposed joint optimization captures the mutual dependencies of particle reconstruction and flow estimation. This yields results that are clearly superior to traditional, sequential methods [6, 11]; and, using only 2 frames, competes with the best available multi-frame methods, which require sequences of 15-30 timesteps.
In this document we provide further visualizations and more detailed descriptions of our evaluated datasets. In Section 0.A, we show results of our method on an experiment in water. Section 0.B gives qualitative results on the 4 PIV Challenge . In Section 0.C we provide details on our own simulated dataset.
Appendix 0.A Experimental Data
We test our approach on experimental data provided by LaVision (see  or Package 9 in http://fluid.irisa.fr/data-eng.htm). The experiment captures the wake flow behind a cylinder, which forms a so-called Karman vortex street (see Fig. 6 and 7). Data is provided in form of a tomographic reconstruction of the particle volume. In order to run our method, we backproject the particle volume to four arbitrary camera views (we take the same as for our simulated dataset) and use those backprojected images as input to our method. Since particle densities allow it and the provided reference flow is of low resolution, we downsample the input volume by a factor of 2 from to and render to 2D images of dimensions with particle size . Note, that since those camera locations do not necessarily coincide with the original camera locations, ghost particles in the reconstructed volume may lead to wrong particles in the backprojected image. However, as results in Fig. 8 indicate, our algorithm is able to recover a detailed flow field that correponds with the reference flow despite these deflections in the data. In addition to our own result, we show results provided by LaVision. The provided reference flow field was estimated with TomoPIV, using a final interrogation volume size of and an overlap of , i.e. one flow vector at every 12 voxel locations. It can be seen in Fig. 8 that our method recovers much finer details of the flow, due to the avoidance of large interrogation volumes. In order to cope with flow magnitues up to 15.5 voxel we chose 16 pyramid levels and a pyramid downsampling factor of . Note, that in the resulting flow field the cylinder is positioned to the right of the volume and the general flow direction is towards the left. Effects of the Karman vortex street can be primarly seen in the z component of the flow (periodically alternating flow directions with decreasing magnitude from right to left). Additional visualizations are available in the attached video.
Appendix 0.B PIV Challenge
Since no ground truth is provided for testcase D of the 4 PIV Challenge  and it is no longer possible to submit to the challenge, a quantitative evaluation of our approach on this dataset is infeasible. However, additional to the quantitative comparison with StB in the main paper, we show a qualitative comparison of our method with the ground truth and the best performing method StB  in Fig. 9. Additionally, we show a streamline visualization of our results in Fig. 10. Additional visualizations for our comparison with StB are available in the attached video.
Appendix 0.C Simulated Dataset
To quantitatively evaluate our method, we use the forced isotropic turbulence dataset of the Johns Hopkins Turbulence Database (JHTDB) [46, 47], which is generated using direct numerical simulations. We sample 12 non-overlapping datasets of size voxels, with a discretization level of 4 voxels per DNS grid point (as done also by ). Datasets are sampled at and at 6 different spatial locations. Standard temporal difference between two consecutive timesteps is . For each dataset we sample 480.000 seeding particles at random locations within the volume and ground truth flow vectors at each DNS grid location. In Fig. 11-14 we visualize the flow fields of the 12 resulting datasets. Quantitative evaluation on these datasets for various parameter settings, as well as for different particle densities and temporal sampling rates, are provided in the main paper. Additional visualization (incl. results of our method) for one of the datasets are shown in the attached video.
-  Michalec, F.G., Schmitt, F., Souissi, S., Holzner, M.: Characterization of intermittency in zooplankton behaviour in turbulence. The European Physical Journal E 38(10) (2015)
-  Wu, Z., Hristov, N., Hedrick, T., Kunz, T., Betke, M.: Tracking a large number of objects from multiple views. ICCV (2009)
-  Adrian, R., Westerweel, J.: Particle Image Velocimetry. Cambridge University Press (2011)
-  Raffel, M., Willert, C.E., Wereley, S., Kompenhans, J.: Particle image velocimetry: a practical guide. Springer (2013)
-  Michaelis, D., Poelma, C., Scarano, F., Westerweel, J., Wieneke, B.: A 3d time-resolved cylinder wake survey by tomographic piv. In: 12th Int’l Symp. on Flow Visualization. (2006)
-  Elsinga, G.E., Scarano, F., Wieneke, B., Oudheusden, B.W.: Tomographic particle image velocimetry. Experiments in Fluids 41(6) (2006)
-  Petra, S., Schröder, A., Schnörr, C.: 3D tomography from few projections in experimental fluid dynamics. In: Imaging Measurement Methods for Flow Analysis: Results of the DFG Priority Programme 1147. (2009)
-  Champagnat, F., Plyer, A., Le Besnerais, G., Leclaire, B., Davoust, S., Le Sant, Y.: Fast and accurate PIV computation using highly parallel iterative correlation maximization. Experiments in Fluids 50(4) (2011) 1169
-  Discetti, S., Astarita, T.: Fast 3D PIV with direct sparse cross-correlations. Experiments in Fluids 53(5) (2012)
-  Cheminet, A., Leclaire, B., Champagnat, F., Plyer, A., Yegavian, R., Le Besnerais, G.: Accuracy assessment of a lucas-kanade based correlation method for 3D PIV. In: Int’l Symp. Applications of Laser Techniques to Fluid Mechanics. (2014)
-  Lasinger, K., Vogel, C., Schindler, K.: Volumetric flow estimation for incompressible fluids using the stationary stokes equations. ICCV (2017)
-  Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Mathematical Programming 146(1) (2014) 459–494
-  Pock, T., Sabach, S.: Inertial proximal alternating linearized minimization (iPALM) for nonconvex and nonsmooth problems. SIAM J Imaging Sci 9(4) (2016) 1756–1787
-  Schanz, D., Gesemann, S., Schröder, A.: Shake-The-Box: Lagrangian particle tracking at high particle image densities. Experiments in Fluids 57(5) (2016)
-  Schneiders, J.F., Scarano, F.: Dense velocity reconstruction from tomographic PTV with material derivatives. Experiments in Fluids 57(9) (2016)
-  Gesemann, S., Huhn, F., Schanz, D., Schröder, A.: From noisy particle tracks to velocity, acceleration and pressure fields using B-splines and penalties. Int’l Symp. on Applications of Laser Techniques to Fluid Mechanics (2016)
-  Maas, H.G., Gruen, A., Papantoniou, D.: Particle tracking velocimetry in three-dimensional flows – part 1: Photogrammetric determination of particle coordinates. Experiments in Fluids 15(2) (1993)
-  Atkinson, C., Soria, J.: An efficient simultaneous reconstruction technique for tomographic particle image velocimetry. Experiments in Fluids 47(4) (2009)
-  Wieneke, B.: Iterative reconstruction of volumetric particle distribution. Measurement Science & Technology 24(2) (2013)
-  Gregson, J., Ihrke, I., Thuerey, N., Heidrich, W.: From capture to simulation: connecting forward and inverse problems in fluids. ACM ToG 33(4) (2014)
-  Dalitz, R., Petra, S., Schnörr, C.: Compressed motion sensing. SSVM (2017)
-  Barbu, I., Herzet, C., Mémin, E.: Joint Estimation of Volume and Velocity in TomoPIV. In: 10th Int’l Symp. on Particle Image Velocimetry - PIV13. (2013)
-  Xiong, J., Idoughi, R., Aguirre-Pablo, A.A., Aljedaani, A.B., Dun, X., Fu, Q., Thoroddsen, S.T., Heidrich, W.: Rainbow particle imaging velocimetry for dense 3d fluid velocity imaging. ACM Trans. Graph. 36(4) (July 2017) 36:1–36:14
-  Basha, T., Moses, Y., Kiryati, N.: Multi-view scene flow estimation: a view centered variational approach. CVPR (2010)
-  Vogel, C., Schindler, K., Roth, S.: 3D scene flow estimation with a rigid motion prior. ICCV (2011)
-  Wedel, A., Brox, T., Vaudrey, T., Rabe, C., Franke, U., Cremers, D.: Stereoscopic scene flow computation for 3d motion understanding. IJCV 95(1) (2011) 29–51
-  Rabe, C., Müller, T., Wedel, A., Franke, U.: Dense, robust, and accurate motion field estimation from stereo image sequences in real-time. ECCV (2010)
-  Huguet, F., Devernay, F.: A variational method for scene flow estimation from stereo sequences. ICCV (2007)
-  Valgaerts, L., Bruhn, A., Zimmer, H., Weickert, J., Stoll, C., Theobalt, C.: Joint estimation of motion, structure and geometry from stereo sequences. ECCV (2010)
-  Petra, S., Schröder, A., Wieneke, B., Schnörr, C.: On sparsity maximization in tomographic particle image reconstruction. DAGM (2008)
-  Vogel, C., Schindler, K., Roth, S.: Piecewise rigid scene flow. ICCV (2013)
-  Vogel, C., Schindler, K., Roth, S.: 3D scene flow estimation with a piecewise rigid scene model. IJCV 115(1) (2015) 1–28
-  Menze, M., Geiger, A.: Object scene flow for autonomous vehicles. CVPR (2015)
-  Wieneke, B.: Volume self-calibration for 3d particle image velocimetry. Experiments in fluids 45(4) (2008) 549–556
-  Schanz, D., Gesemann, S., Schröder, A., Wieneke, B., Novara, M.: Non-uniform optical transfer functions in particle imaging: calibration and application to tomographic reconstruction. Measurement Science & Technology 24(2) (2012)
-  Monaghan, J.J.: Smoothed particle hydrodynamics. Reports on Progress in Physics 68(8) (2005) 1703
-  Adams, B., Pauly, M., Keiser, R., Guibas, L.J.: Adaptively sampled particle fluids. ACM SIGGRAPH (2007)
-  Zhu, Y., Bridson, R.: Animating sand as a fluid. ACM ToG 24(3) (2005) 965–972
-  Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods. Mathematical Programming 137(1) (2013) 91–129
-  Bolte, J., Daniilidis, A., Lewis, A.: The Lojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems. SIAM J Optimiz 17(4) (2007) 1205–1223
-  Bertsekas, D.P., Tsitsiklis, J.N.: Parallel and Distributed Computation: Numerical Methods. Prentice-Hall (1989)
-  Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci 2(1) (2009) 183–202
-  Dodd, M.S., Ferrante, A.: A fast pressure-correction method for incompressible two-fluid flows. J Comput Phys 273 (2014) 416–434
-  Ladický, L., Jeong, S., Solenthaler, B., Pollefeys, M., Gross, M.: Data-driven fluid simulations using regression forests. ACM ToG 34(6) (2015)
-  Tompson, J., Schlachter, K., Sprechmann, P., Perlin, K.: Accelerating eulerian fluid simulation with convolutional networks. CoRR abs/1607.03597 (2016)
-  Li, Y., Perlman, E., Wan, M., Yang, Y., Meneveau, C., Burns, R., Chen, S., Szalay, A., Eyink, G.: A public turbulence database cluster and applications to study Lagrangian evolution of velocity increments in turbulence. Journal of Turbulence 9 (2008)
-  Perlman, E., Burns, R., Li, Y., Meneveau, C.: Data exploration of turbulence simulations using a database cluster. In: ACM/IEEE Conf. on Supercomputing. (2007)
-  Kähler, C.J., Astarita, T., Vlachos, P.P., Sakakibara, J., Hain, R., Discetti, S., La Foy, R., Cierpka, C.: Main results of the 4th International PIV Challenge. Experiments in Fluids 57(6) (2016)