Incremental 3D reconstruction from a sparse point cloud is gaining interest in the computer vision community as incremental Structure from Motion algorithms are consolidating. This is clearly true for those applications where a rough, but dense, surface represents a sufficient and effective representation of the scene, e.g, for traversability analysis in unmanned vehicle navigation. Furthermore, in real-time applications, the map of the environment needs to be updated online, and the surface has to be estimated incrementally.
Most of the existing algorithms [10, 13, 8, 9] bootstrap the reconstruction of a mesh surface from the 3D Delaunay triangulation of a sparse point cloud. Indeed, the 3D Delaunay triangulation is very powerful: the Delaunay property, i.e., no point of the triangulation is inside the sphere circumscribing any tetrahedron, avoids as much as possible the resulting tetrahedra to have a degenerate shape ; it is self-adaptive, i.e., the more the points are dense the more the tetrahedra are small; it is very fast to compute, and to update against point removal or addition; off-the-shelf libraries, such as CGAL , enable a very simple and efficient management of it.
As soon as a Delaunay triangulation is available, several approaches exist to extract a surface taking into account the visibility of each point. The simplest algorithm is the Space Carving : it initializes all the tetrahedra as matter, then it marks as free space the tetrahedra intersected by the camera-to-point viewing rays, i.e., the lines from the camera center to the observed 3D points in the triangulation. The boundary between free space and matter represents the final surface of the scene. Pan et al.  improve upon this simple procedure by proposing an online probabilistic Space Carving, but this is not an incremental approach: they start from scratch every time new points are added. Lovi et al.  present the first incremental Space Carving algorithm which runs real-time, but, as for the previous methods, the estimated surface is not guaranteed to be manifold
Several reasons lead to enforce the manifold property as explained in . Most Computer Graphics algorithms need the manifold property, for instance smoothing with Laplace-Beltrami operator , or the linear mesh parametrization . Moreover the manifold property enables surface evolution in mesh-based Multi-View Stereo, as in [17, 1].the manifold property enables a photometric refinement by surface evolution such as with the high accurate Multi-View Stereo mesh-based algorithm as in [17, 1]. With these approaches is hard to estimate the surface evolving flow in the presence of non manifold vertices: indeed they compute for each vertex the gradient minimizing the reprojection error, by summing-up the contribution of the incident facets; if the vertex is not manifold, this gradient does not converge. As a further proof of this,  needs to manually fix the surface estimated via s-t cut. As in , it is possible to fix the mesh as a post-processing step, but reconstructing directly a manifold as in the proposed paper, enables the design of a fully automatic pipeline which do not need human intervention.
In literature, the only algorithm reconstructing a manifold incrementally was proposed by Litvinov and Lhuiller [8, 9]. In their work, the authors bootstrap from the Space Carving procedure and, by taking into account the number of intersections of each tetrahedron with the viewing rays, they reconstruct a surface keeping the manifold property valid. The main limitation is that Litvinov and Lhuiller insert a point into the Delaunay triangulation only when its position is definitive, then they cannot move the point position anymore even in the case they could refine their estimate. The main reason of Litvinov and Lhuiller design choice has to be ascribed to the computational cost of updating the visibility information along the viewing rays incident to each moved point, and the computational cost of updating part of the Delaunay triangulation, which in turn induces a new manifold reconstruction iteration step.
Indeed, the very common approach to deal with a point moving in the triangulation, is to remove it and add it back in the new position  (Fig. 1). When we remove a point (the point A in Fig. 1(a)) and we want to keep the Delaunay property, we have to remove all the tetrahedra incident to that point (light red triangles in Fig. 1(b)); then, we add a new set of tetrahedra to triangulate the resulting hole (dark green triangles in 1(c)). When we add a new point into the triangulation (the point B in Fig. 1(d)), a set of tetrahedra would conflict with it, i.e., the Delaunay property is broken (light red triangles in Fig. 1(d)); so, we remove this set of tetrahedra again (red triangles in Fig. 1(e)) and we add a new connected set that re-triangulate the hole (dark green triangles in Fig. 1(f)). Whenever a set of tetrahedra is replaced, we have to transfer conveniently the information about the visibility (matter or free space) of the removed tetrahedra to the new one. In addition to this, we have to update the visibility of the tetrahedra crossed by a visibility ray from one camera to the moved point. For these reasons the update of the point position is computational demanding.
To complete the overview of the incremental reconstruction methods from sparse data, we mention here another very different approach was proposed by Hoppe et al.  who label the tetrahedra with a random field, and extract the surface via graph-cuts by minimizing a visibility-consistent energy function. This incremental algorithm is effective and handles the moving points, but the manifold property of the reconstructed surface is not yet guaranteed.
In this paper we propose, to the best of our knowledge, the first manifold 3D reconstruction algorithm from sparse data which deals with dynamic point changes. In particular, we show that in this setting the algorithm by Lovi et al.  provides a feasible solution, but it is very inefficient and we propose a novel efficient policy to handle the visibility update of Delaunay tetrahedra with moving points.
In Section 2 we summarize a slightly modified version of the approach of  we use to reconstruct a manifold surface. In Section 3 we describe Lovi’s approach  and our proposal to deal with moving points, together with a complexity analysis that explains why our approach is more efficient. In Section 4 we show the experimental results on the publicly available dataset KITTI , while, in Section 5 we point out some future works and in the conclusion of the paper.
2 Manifold Reconstruction
In this paper we reconstruct a manifold surface that represents the observed scene. A surface is manifold if and only if the neighborhood of each point is homeomorphic to a disk. In the discrete case, the points are the vertexes of a mesh, and the neighborhood is represented by the incident triangles (or polygons); a surface is manifold if each vertex is regular, i.e., if and only if the edges opposite to form a closed path without loops (see  for more details).
2.1 Incremental manifold extraction with tetrahedra weighting
In this section we briefly summarize our variation on the method originally proposed in  enhanced by a weighting scheme that avoids the creation of most visual artifact in the final mesh (more discussion about visual artifacts in ). In our Space Carving algorithm, a weight roughly represents how many rays intersect a tetrahedron, and in the following, a tetrahedron belongs to free space if its weight is higher than a threshold (in our case ).
2.1.1 Sparse point cloud
The input of our algorithm is a sparse 3D point cloud, estimated incrementally by assuming the camera poses to be known. For each keyframe, i.e., every frames, we extract Edge-point features, i.e. 2D points laying on the image edges ; these points represent measurements of 3D points. Frame-by-frame we track these features with the Kanade-Lucas-Tomasi tracker, and, at each keyframe, we estimate the new positions of the 3D points with the new measurements available from the tracking. New estimates are obtained by triangulating the 2D tracked points and by minimizing the reprojection error with a Gauss-Newton algorithm. Once a new estimate of a 3D point is available, we add it to the Delaunay triangulation, i.e., to the reconstruction; then we update its position according to the new measurements: this update induces the motion of the points inside the triangulation.
2.1.2 3D reconstruction
The reconstruction of the surface bootstraps from the manifold partitioning the 3D triangulation between the set of outside tetrahedra, i.e., the manifold subset of the free space (not all the free space tetrahedra will be part of the manifold), and the complementary set of inside tetrahedra, i.e. the remaining tetrahedra that represent the matter together with the free space tetrahedra which would invalidate the manifold property.
Let be the initial manifold. This initial manifold is obtained with the following steps. Point Insertion: add all the 3D points estimated up to time and build thir 3D Delaunay triangulation. Ray tracing and tetrahedra weighting: for each viewing ray, the algorithm traverses the triangulation and adds a weight to the intersected tetrahedra, a weight to the neighbors and a weight to the neighbors of their neighbors. Such weighting scheme acts as a smoother of the visibility and avoids the creation of visual artifacts; it is the main difference between our algorithm and the algorithms proposed in [8, 9]. Growing: initialize a queue starting from the tetrahedron with the higher weight. Then: (a) pick the tetrahedron with highest weight from and add it to only if the resulting surface between and remains manifold; (b) if inserted add the neighboring tetrahedra to the queue , otherwise discard it; continue iteratively until is empty.
Once the system is initialized, a new set of points is estimated at each frame, named keyframes, where and is the inverse of the keyframe rate. The insertion of a point causes the removal of the set of tetrahedra breaking the Delaunay property, and, the surface is not guaranteed to be manifold anymore. To avoid this, the authors in  define a list of tetrahedra and apply the Shrinking procedure, i.e., the inverse of Growing: they subtract iteratively from the tetrahedra keeping the manifoldness valid. After this process, it is likely that . Whenever the point is not added to the triangulation, i.e., is dropped. Once all points in have been added (or dropped), the growing process runs similarly to the initialization procedure, but the queue is initialized with the tetrahedra such that .
3 Reconstructing a manifold with moving points
As previously described, Litvinov and Lhuiller  algorithm adds points to the triangulation only when their 3D position is completely defined; by doing this, there are no changes in the Delaunay triangulation, induced by moving points. This results in a restriction if we would like to refine the estimation of the position of a point 3D position after its insertion.
Only Lovi et al.  presents an incremental Space Carving algorithm which deals with moving points, but their method does not enforce the manifold property. In this paper we verify the approach of Lovi et al.  to be very inefficient for manifold reconstruction, and we present a different approach to deal with moving points that leads to a significant faster computation.
3.1 The straightforward approach
The simplest way to deal with moving points while reconstructing a manifold surface, is to apply a straightforward modification to the so called Refinement Event Handler by Lovi et al. in . The Refinement Event Handler algorithm assumes that, for each tetrahedron in the Delaunay triangulation a list of the intersecting viewing rays is stored. In our voting schema an intersecting ray is each ray that increase the weight of the tetrahedron.
Let be a point that moves to position , the algorithm in  moves the point by removing point and adding as a new point, according to the classical approach of , then for each point they apply the following steps. Rays collection: collect in a set all the rays stored into the tetrahedra incident to , i.e., those affected by the removal (e.g., the light red triangle in Fig. 1(a)). Vertex removal: remove the vertex and its neighboring tetrahedra from the triangulation (Fig. 1(b)); then re-triangulate the hole left by the deleted tetrahedra (Fig. 1(c)). New point insertion: insert the new point into the triangulation and add to the set all the rays stored in the conflicting tetrahedra (Fig. 1(d-f)). Rays removal: for each tetrahedron of the entire triangulation remove the rays ending in . Ray casting: cast one ray for each ray in .
In our case, whenever the 3D estimate of a point moves, we apply the Refinement Event Handler, before point addition and region growing, if and only if the point is inside the shrinked volume (see Section 2), otherwise we do not move the point (this second case happens very rarely ).
The number of rays involved in space carving algorithms is where and represent respectively the number of frames and the number of points in the triangulation , and the number of tetrahedra in a 3D triangulation grows quadratically with the number of points (). In Table 1 we reported the complexities for each of the previous stage; since our implementation exploits the CGAL  3D triangulation data structure, the complexity of a single Ray casting, i.e., a cast of a single ray, is in the general case, but we bound the size of the viewing ray, to avoid to include too far uncertain 3D points estimates, so the final complexity becomes (see [19, p.94]).
From the analysis in the table is quite clear that this straightforward solution is not scalable, especially for the dependence between the number of rays and the number of processed frames.
|New points insertion|
|Backward ray casting||-||-|
3.1.2 Forgetting Heuristic
Lovi et al.  proposed a forgetting heuristic to limit the number of rays stored in each tetrahedron to a fixed number , thus making the complexity independent from the number of the processed frames. However, we show in Section 4 that, when the points are moving, the reconstruction is very inefficient even with this heuristic.
3.2 The efficient approach
Our contribution in this paper is an approach to deal with moving points, different from the straightforward variation of . Indeed in our proposal, we avoid storing the list of rays inside each tetrahedron, and we just store the weight associated with it. This allows the incremental reconstruction algorithm of Section 2.1, and, at the same time, we are able to bound the temporal complexity.
The main difficulty in the proposed approach is updating coherently the weights whenever a point moves, i.e., when the point is removed from the triangulation and added as a new point. As soon as the point is removed from the triangulation, we perform a backward ray casting with negative weights for each viewing camera such that the influence of the point is neglected. Then we remove the point, and we add a new vertex in the new position. Finally, we perform the ray casting from each viewing camera to the new point.
During both point removal and addition, we have to remove a set of connected tetrahedra from the triangulation and add a new one. Let be the set of removed off tetrahedra and the set of the new ones; their associated weights are respectively and . The weights are known, while are those to be computed for the new tetrahedra, without recasting the visibility rays related to those tetrahedra.
Different approaches are possible: Mean value: ; Weighted mean: let be the Euclidean distances between the centroids of the -th tetrahedron of and the -th of ; then ; Minimum distance: such that .
Among these, the third solution gives a non-smooth outcome and, even if this seems counter-intuitive, it results to be more suitable for our purposes. The main reason is that it preserves the discontinuity between matter and free space. For instance in Fig. 2(a) we depict a 2D triangulation where we want to add a new point position; in Fig. 2 (b), (c) and (d) we show the results of weights update after point addition with, respectively, Mean value, Weighted Mean value and Minimum distance approaches. It is clear that only Fig. 2(d) preserved the discontinuity, while in other cases becomes hard to distinguish between matter (lower weights) and free space (higher weights).
2D example of moving point addition in the new position after point removal (a brighter region corresponds to a higher weight, i.e., higher probability to be carved).
In case of very sparse data, the centroids of big tetrahedra, together with the associated visibility information, can be far from the newly added or moved points, and our update policy might lead to results far from the ideal solution, i.e., the straightforward approach discussed in Section 3.1. Our algorithm overcomes this issue thanks to the use of the (so called) Steiner points added to the triangulation before the actual reconstruction is performed; this idea was already introduced in . We add Steiner points to the Delaunay triangulation every 5m along each axis so that they cover all the space that can be represented. The use of Steiner points limits the creation of very big tetrahedra, the visibility information becomes always local, and the update policy avoids drifts. Indeed, experimental results show good accuracy on varied scenes, even when lack of textures induces very sparse data.
The complexity of the steps of our algorithm are reported in Table 1. The main difference with respect to the straightforward algorithm is the replacement of the Rays removal to the weight update and backward casting which are the key of the gaining in computational complexity. The proposed algorithm is thus , so, in principle, the dependence with still remains and results in a non scalable solution.
3.2.2 Window Heuristic
We are able to bound the complexity of our algorithm to thanks to the following heuristic: instead of backward casting all the rays connecting the moving point to all the viewing cameras, we consider only the most recent cameras. In this case the complexity of the ray casting becomes , where is the (constant) size of the window (in our case ), so the final complexity is .
4 Experimental validation
To evaluate our approach, we tested the system on four different sequences of the KITTI dataset  on a 4 Core i7-2630QM CPU at 2.2Ghz (6M Cache), with 6GB of DDR3 SDRAM. The video stream was captured by a Point Grey Flea 2, which records gray scale images at fps. The vehicle pose are estimated through a RTK-GPS and they are the initial input of our system together with the video stream.
Among all the sequences we choose the 0095 (268 frames) and 0104 (313 frames) since they depict two different urban scenarios: the former shows a narrow environment where the building façades are close to the camera, the latter captures a wide road. We also tested our approach on sequences 03 (801 frames) and 04 (271 frames) from the odometry dataset: these videos provide a varied landscape mixing natural (trees and bushes) and man-made (houses, cars) features.
To provide a quantitative evaluation we compared the reconstructed meshes with the very accurate point clouds measured by the Velodyne HDL-64E sensor in the KITTI dataset through the CloudCompare tool . This tool computes the reconstruction error as the average of the distances between each Velodyne point and the nearest triangle in the reconstructed mesh.
We evaluated the performance and the accuracy of the Lovi’s approach against our three different updating policy. As explained previously, no manifold incremental reconstruction approach deals with moving points, so a fair comparison results to be between the straightforward approach of Lovi, applied to manifold reconstruction (Section 3.1) and our updating policies. In Fig. 3 we show an example of the reconstruction results before and after the red points has been moved in the Delaunay triangulation (see also the video at http://youtu.be/_-q9sKjcOC0).
Fig. 4 shows the results of the comparison where we applied the window heuristic (Section 3.2.2) to all the algorithms. In the case of Lovi’s algorithm we applied the forgetting heuristic with and , where is the number of viewing rays stored for each tetrahedron. Fig. 4(a) shows that the accuracy of the proposed approach, i.e, moving point management through minimum distance weight updates, is comparable with respect to Lovi’s proposal outcomes, where the algorithm with stores more information, so it performs better. We compared our approach with respect to Lovi’s method instead of the other incremental reconstruction algorithm presented in ; the reasons are twofold. In  Litvinov and Lhuiller does not deal with moving points, which is the main point addressed in this paper. Moreover, Litvinov and Lhuiller point out in  that the ideal solution for a manifold reconstruction algorithm is represented by the manifold including as much as free space tetrahedra as possible. Since the solution provided by Lovi et al. coincides with the (non-manifold) mesh containing all the free space tetrahedra, a reconstruction accuracy similar to Lovi’s suggests that the reconstruction is near to the ideal solution. In some cases our algorithm reaches even better accuracy, this is due to the smoothing effect induced by our heuristic.
In Fig. 4(b) we report the time performance of the algorithms. Let and be the overall processing time with and without moving points, and be the number of the total points moves, e.g., if one point moves three times, . The overhead introduced in the whole reconstruction process for each move of each point has been computed as . The performance of the different update schema we presented in Section 3.2 is very similar since the steps involved are basically the same: for each update on the Delaunay data structure, we iterate over the old tetrahedra to collect the weights, then we iterate over the new tetrahedra to set the new weights. As expected by Section 3.2, our algorithm clearly outperforms Lovi’s approach. Our updating schema is very efficient for two reasons. First, we only need to update locally the visibility, while Lovi’s approach casts a ray for each visibility ray stored inside the tetrahedra. Second, when we remove a point (first step of moving point management), we perform a ray casting backward to update only the convenient tetrahedra, instead of iterating over the whole triangulation to remove the visibility rays involving the point moved as in .
|(a) Absolute errors (m).||(b) Per-point overhead (s).|
5 Conclusion and future work
In this paper we have shown that manifold reconstruction from sparse data with moving points is not a trivial task. To keep the Delaunay property valid when a point moves inside the Delaunay triangulation, we have to remove it and add a new point in the new position. This induces the removal of a set of tetrahedra, with the associated visibility information; then, we have to add a new set of tetrahedra with coherent visibility information; finally we have to update the visibility information in all the tetrahedra affected by the point move.
Existing solutions successfully applied for classic Space Carving, result to be inefficient and slow when applied in the manifold reconstruction setting. In this setting, we investigated different approaches to handle visibility information propagation, by updating the weight for each tetrahedron, which roughly represents the number of ray intersections, and we proposed an efficient algorithm to conveniently update it. We tested our system with the KITTI dataset and it clearly outperforms the existing approach of Lovi et al.  for incremental manifold reconstruction.
Future works would include a photometric refinement of the manifold extracted incrementally, and an evaluation of the manifold quality on-the-fly, relying on the uncertainty information carried by the estimation of 3D points. A natural extension could also deal with the reconstruction of non-rigid shapes whose 3D points are moving.
This work has been partially funded by the SINOPIAE project, from the Italian Ministry of University and Research and Regione Lombardia, and the MEP: Maps for Easy Paths project funded by Politecnico di Milano under the POLISOCIAL program.
-  Delaunoy, A., Prados, E., Piracés, P.G.I., Pons, J.P., Sturm, P.: Minimizing the multi-view stereo reprojection error for triangular surface meshes. In: BMVC 2008-British Machine Vision Conference. pp. 1–10. BMVA (2008)
-  Devillers, O., Teillaud, M.: Perturbations and vertex removal in a 3d delaunay triangulation. In: 14th ACM-Siam Symp. on Algorithms. pp. 313–319 (2003)
Geiger, A., Lenz, P., Urtasun, R.: Are we ready for autonomous driving? the kitti vision benchmark suite. In: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. pp. 3354–3361. IEEE (2012)
-  Girardeau-Montaut, D.: Cloud compare, (last access mar, 22 2015), http://www.cloudcompare.org/
-  Hoppe, C., Klopschitz, M., Donoser, M., Bischof, H.: Incremental surface extraction from sparse structure-from-motion point clouds. Proc. BMVC (2013)
-  Kutulakos, K.N., Seitz, S.M.: A theory of shape by space carving. International Journal of Computer Vision 38(3), 199–218 (2000)
-  Lhuillier, M.: 2-manifold tests for 3d delaunay triangulation-based surface reconstruction. Journal of Mathematical Imaging and Vision pp. 1–8 (2014)
-  Litvinov, V., Lhuillier, M.: Incremental solid modeling from sparse and omnidirectional structure-from-motion data. In: BMVC (2013)
-  Litvinov, V., Lhuillier, M.: Incremental solid modeling from sparse structure-from-motion data with improved visual artifacts removal. In: International Conference on Pattern Recognition (ICPR) (2014)
-  Lovi, D.I., Birkbeck, N., Cobzas, D., Jagersand, M.: Incremental free-space carving for real-time 3d reconstruction. In: Fifth International Symposium on 3D Data Processing Visualization and Transmission(3DPVT) (2010)
-  Maur, P.: Delaunay triangulation in 3d. Tech. rep., Dep. of Computer Science and Engineering, University of West Bohemia, Pilsen, Czech Republic (2002)
-  Meyer, M., Desbrun, M., Schröder, P., Barr, A.H.: Discrete differential-geometry operators for triangulated 2-manifolds. In: Visualization and mathematics III, pp. 35–57. Springer (2003)
-  Pan, Q., Reitmayr, G., Drummond, T.: Proforma: Probabilistic feature-based on-line rapid model acquisition. In: BMVC. pp. 1–11 (2009)
-  Rhein, S., Lu, G., Sorensen, S., Mahoney, A.R., Eicken, H., Ray, G.C., Kambhamettu, C.: Iterative reconstruction of large scenes using heterogeneous feature tracking. In: Computer Vision and Pattern Recognition Workshops (CVPRW), 2013 IEEE Conference on. pp. 407–412. IEEE (2013)
-  Saboret, L., Alliez, P., Lévy, B.: Planar parameterization of triangulated surface meshes. In: CGAL User and Reference Manual. CGAL Editorial Board, 4.4 edn. (2000)
-  The CGAL Project: CGAL User and Reference Manual. CGAL Editorial Board, 4.5 edn. (2014)
-  Vu, H.H., Labatut, P., Pons, J.P., Keriven, R.: High accuracy and visibility-consistent dense multiview stereo. Pattern Analysis and Machine Intelligence, IEEE Transactions on 34(5), 889–901 (2012)
-  Wu, C.: Towards linear-time incremental structure from motion. In: 3D Vision-3DV 2013, 2013 International Conference on. pp. 127–134. IEEE (2013)
-  Yu, S.: Automatic 3d modeling of environments: a sparse approach from images taken by a catadioptric camera. Ph.D. thesis, Université Blaise Pascal-Clermont-Ferrand II (2013)