Fast Geometric Surface based Segmentation of Point Cloud from Lidar Data

05/06/2020 ∙ by Aritra Mukherjee, et al.

Mapping the environment has been an important task for robot navigation and Simultaneous Localization And Mapping (SLAM). Lidar provides a fast and accurate 3D point cloud map of the environment which helps in map building. However, processing millions of points in the point cloud becomes a computationally expensive task. In this paper, a methodology is presented to generate segmented surfaces in real time, which can then be used in modeling 3D objects. First, an algorithm is proposed for efficient map building from single-shot data of a spinning Lidar. It is based on fast meshing and sub-sampling, and exploits the physical design and working principle of the spinning Lidar sensor. The generated mesh surfaces are then segmented by estimating the normals and considering their homogeneity. The segmented surfaces can serve as proposals for predicting geometrically accurate models of objects in the robot's activity environment. The proposed methodology is compared with some popular point cloud segmentation methods to highlight its efficacy in terms of accuracy and speed.




1 Introduction

Mapping the environment in 3D is a sub-task for many robotic applications and is a primary part of Simultaneous Localization And Mapping (SLAM). For 3D structural data sensing, popular sensors are stereo vision, structured light, TOF (Time Of Flight) cameras and Lidar. Stereo vision can extract RGBD data, but its accuracy is highly dependent on the presence of textural variance in the scene. Structured light and TOF cameras can accurately extract depth information and RGBD data respectively, but they are mostly suitable for indoor use at short range. Lidar is the primary choice for sensing accurate depth in outdoor scenarios over long ranges. Our focus in this work is on unsupervised geometric surface segmentation in three dimensions based on Lidar data. We would like to emphasize that models built from such segmentation can be very useful for tasks like autonomous vehicle driving.

According to Nguyen et al. [13], the classic approaches to point cloud segmentation can be grouped into edge based methods [3], region based methods [17, 1, 9, 11], attribute based methods [6, 8, 4, 19], model based methods [16], and graph based methods [10, 2]. Vo et al. [17] proposed a new octree-based region growing method with refinement of segments, and Bassier et al. [1] improved it with a Conditional Random Field. In [9, 11], variants of region growing methods operating on range images generated from 3D point clouds are reported. Ioannou et al. [8] used Difference of Normals (DoN) as a multiscale saliency feature for the segmentation of unorganized point clouds. Himmelsbach et al. [7] treated point clouds in cylindrical coordinates and used the distribution of the points to fit lines to the point cloud in segments; the ground surface was then recognized by thresholding the slope of those line segments. In an attempt to recognize the ground surface, Moosmann et al. [12] built an undirected graph and characterized slope changes by mutual comparison of local plane normals. Zermas et al. [18] presented a fast instance-level Lidar point cloud segmentation algorithm consisting of a deterministic iterative multiple plane fitting technique for the fast extraction of ground points, followed by a point cloud clustering methodology named Scan Line Run (SLR). On the other hand, among supervised methods, PointNet [14] takes a sliding window approach to segment large clouds under the assumption that a small window can express the contextual information of the segment it belongs to. Landrieu et al. [10] introduced superpoint graphs, a novel point cloud representation with rich edge features encoding the contextual relationship between object parts in 3D point clouds, followed by deep learning on large-scale point clouds without major sacrifice in fine details.

In most of these works, the neighbourhood of a point used to estimate its normal is determined by a tree based search. This search is time consuming, and the resulting accuracy is limited for the sparse point clouds provided by a Lidar in robotic applications. This observation motivates us to develop a fast mesh generation procedure that provides near-accurate normals on sparse point clouds. Moreover, processing Lidar data in real time requires considerable computational resources, limiting deployability on small outdoor robots; hence a fast method is also in demand.

2 Proposed Methodology

Figure 1: Block diagram of the entire system; the first stage can be merged with Lidar scanning by modifying the Lidar firmware

The overall process consists of three steps, as shown in Figure 1. First, the point cloud is sensed by the Lidar and sub-sampled while the mesh is created simultaneously. Second, surface normals are calculated at the node points using the mesh. Finally, surface segmentation is done by labelling the points on the basis of spatial continuity of the smoothly distributed normals.

Figure 2: (a) A schematic showing the formation of point cloud by Lidar and (b) the resultant point cloud
Figure 3: (a) A schematic showing the formation of mesh on subsampled (factor 2) cloud with the neighbour definition of a point and (b) the normal formation from the neighbours

The proposed methodology segments surfaces from point clouds obtained by spinning Lidars only. A spinning Lidar works on the principle of spinning a vertical array of divergent laser distance sensors, and thus extracts the point cloud in spherical coordinates. The point cloud consists of a set of coordinates (φ, θ, r), where φ is the fixed vertical angle of a sensor measured from the plane perpendicular to the spinning axis, θ is the variable horizontal angle due to the spinning of the array, and r is the distance measured by the laser sensor. This form of representation is exploited by our methodology to structure the data in an ordered form, the only caveat being that it must operate on the data of a single spin. By varying the factor of sub-sampling along θ, the horizontal density of the point cloud can be varied. Figure 2 shows the operational procedure of a spinning Lidar and the resultant point cloud for an object with multiple surfaces. Note that not every point during the sweep is considered for mesh construction, as noisy points that are too close to each other horizontally produce erroneous normals. Sub-sampling rectifies this error by skipping points uniformly during the spin.
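As a concrete illustration, the conversion from the Lidar's native spherical representation (φ, θ, r) to Cartesian coordinates, together with uniform horizontal sub-sampling, can be sketched as follows. This is a minimal sketch; the struct names and the fixed-stride sub-sampling are our own assumptions, not the paper's code.

```cpp
#include <cmath>
#include <vector>

// One Lidar return in the sensor's native spherical frame:
// phi   - fixed vertical angle of the firing sensor (radians)
// theta - variable horizontal angle due to spinning (radians)
// r     - measured distance
struct Polar { double phi, theta, r; };
struct Point { double x, y, z; };

// Convert a single return to Cartesian coordinates.
Point toCartesian(const Polar& p) {
    return { p.r * std::cos(p.phi) * std::cos(p.theta),
             p.r * std::cos(p.phi) * std::sin(p.theta),
             p.r * std::sin(p.phi) };
}

// Keep every k-th column of the sweep: uniform sub-sampling along theta,
// which thins the horizontal density of the cloud as described above.
std::vector<Polar> subsample(const std::vector<Polar>& sweep, int k) {
    std::vector<Polar> out;
    for (std::size_t j = 0; j < sweep.size(); j += k)
        out.push_back(sweep[j]);
    return out;
}
```

A sub-sampling factor of 2, for example, halves the number of columns retained from one spin.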

Mesh Construction: The significant novelty of this work is the fast mesh generation process that enables quick realization of the subsequent steps. Mesh construction proceeds simultaneously with data acquisition, and the connections are made during sampling in the following manner. Let a point be denoted as p(i, j). Let the range of i be [0, M−1], corresponding to the M vertical sensors in the array, and the range of j be [0, N−1], where N is the number of times the sensor array is sampled uniformly during a single spin. The corresponding distance of the point from the Lidar sensor is r(i, j). Let the topmost sensor in the array correspond to i = 0, with the count proceeding from top to bottom of the vertical sensor array, and let the first shot during the spin correspond to j = 0. The mesh is constructed by joining neighbouring points in the following manner: p(i, j) is joined with p(i+1, j), p(i, j+1) and p(i+1, j+1) for all points with i in [0, M−2] and j in [0, N−2]. The points from the last vertical sensor, i.e., corresponding to i = M−1, are joined with their immediate horizontal neighbour only; thus, p(M−1, j) is joined with p(M−1, j+1), where j varies from 0 to N−2. For points corresponding to j = N−1, p(i, N−1) is joined with p(i+1, N−1), p(i, 0) and p(i+1, 0). The point p(M−1, N−1) is joined with p(M−1, 0) to create the whole cylindrical connected mesh. For every pair, the joining is done only if both points have an r within the range of the Lidar. If all the neighbours of a point are present, then six of them are connected by the meshing technique instead of all eight; this ensures that there are no overlapping surface triangles. Figure 3(a) shows the connectivity of a point which has six valid neighbours on the mesh. The mesh is stored in a map of vectors M, where each point is mapped to the vector containing its existent neighbours in an ordered fashion. The computational complexity of the meshing stage is O(n), where n is the number of sub-sampled points. As the meshing is performed on the fly while the sensor spins, the absolute time depends on the angular frequency and the sub-sampling factor.

Normal Estimation: The structured mesh created in the previous step enables fast computation of the normal at a point. A point forms vectors with its neighbours, and pairs of these vectors are taken in an ordered fashion. The normal at the point is estimated by averaging the resultant vectors formed by the cross products of those pairs. The ordering is established during the mesh construction stage itself. For a point p, vectors are formed with the existing neighbours stored in M in an anti-clockwise fashion. A normal can be estimated for p if its corresponding entry in M has at least two neighbours. From the neighbour vector of p obtained from M, let v1 be formed by joining p with its first neighbour, v2 by joining p with its second neighbour, and so on until v6 is formed by joining p with its sixth neighbour. Then the cross products of existing consecutive vectors are computed. In general, if every neighbour exists, v1×v2, v2×v3, etc. are computed, ending with v6×v1 to complete the circle. This arrangement is illustrated in Figure 3(b). The normal is estimated by averaging each component of the resultant vectors individually, and the normals are stored in the map N. Due to the inherent nature of the meshing technique, points from disconnected objects sometimes get connected to the mesh. To mitigate the effect of a different surface contributing to the normal estimation, a weighted average is used: the weight of a vector formed by the cross product of an ordered pair is inversely proportional to the sum of the lengths of the vectors in the pair.

for each p in the sub-sampled cloud P do
       if L(p) = 0 then
             l ← l + 1; L(p) ← l; push p onto stack S;
             while S is not empty do
                    q ← pop S;
                    for each neighbour q′ in M(q) corresponding to q do
                          if L(q′) = 0 then
                                get n(q), n(q′) from N;
                                if |n_x(q) − n_x(q′)| < t_x, |n_y(q) − n_y(q′)| < t_y, |n_z(q) − n_z(q′)| < t_z then
                                      L(q′) ← l; push q′ onto S;
                                end if
                          end if
                    end for
             end while
             search next unlabelled p in P
       end if
end for
Result: label map L
Algorithm 1: Surface segmentation on normal distribution

Segmentation by Surface Homogeneity: Based on the normal at a point, as computed in the previous step, we now propagate surface labels. A label map L is used for this purpose; it stores the label of each point p by assigning a label l ∈ ℕ, where L(p) = 0 denotes that the point is yet to be labelled. The criterion for assigning the label of p to its neighbour depends on the absolute differences of their normal components. Three thresholds (t_x, t_y, t_z) are set empirically depending on the type of environment. Segment labels are propagated by a depth-first search approach as described in Algorithm 1. Two neighbouring points receive the same label provided the absolute difference of each corresponding component of their normals is within the component-wise threshold. The computations of the mesh and the normals, as discussed earlier, generate the mesh map M and the normal map N respectively; Algorithm 1 then uses M and N to label the whole sub-sampled point cloud in an inductive fashion. Due to sub-sampling, not all points of the original cloud receive a label. This is resolved by assigning to each such point the label of its nearest labelled point along the horizontal sweep. An optional post-processing step may eliminate segments with too few points.

3 Experiments, Results and Analysis

The proposed methodology is implemented in C++ on a Linux based system with 8 GB DDR4 RAM and a 7th generation Intel i5 processor. Experiments were performed on a synthetic dataset which simulates Lidar point clouds. The methodology is compared with the standard region growing algorithm of the Point Cloud Library [15] and a region growing algorithm combined with merging for organized point cloud data [19], and it excels in terms of both speed and accuracy.

Figure 4: (a) Synthetic scene with the scanned point cloud overlaid (b) Point cloud with distance color coded from blue (least) to red (highest) (c) Mesh and normals with a sub-sampling factor of 5 (d) Point cloud segment surface ground truth (e) A detailed look at the mesh and normals (f) Point cloud segmented by the proposed methodology

We have used the software “BlenSor” [5] to simulate the output of the Velodyne 32E Lidar in a designed environment. BlenSor can roughly model the scenery with the desired sensor parameters, and it provides an exact measurement ground truth annotated over the model rather than over the point cloud. We have included primitive 3D model structures such as cylinders, spheres, cubes, cones and combined objects, placed in different physically possible orientations, to simulate a real environment. Different percentages of occlusion and crowding levels are included in the model environments to test the scene-complexity independence of the proposed solution. In total we have 32 different environments with increasing numbers of object types, occlusion, complexity of geometry and pose, and crowding levels. Gaussian noise with zero mean and a fixed variance is used as the noise model for the Lidar. With the Velodyne 32E Lidar, the spinning array contains 32 sensors, which fixes the range of i, while the angular resolution chosen for the horizontal sweep determines the number of sensor firings in one spin and hence the range of j. The output at different stages of our methodology is shown in Figure 4.

| Method                           | Mean time (ms) | Max time (ms) | Min time (ms) |
| Proposed (sub-sampling factor 5)  | 54.33  | 63   | 41  |
| Proposed (sub-sampling factor 10) | 35.06  | 48   | 28  |
| Proposed (sub-sampling factor 15) | 25.53  | 32   | 20  |
| Region Growing [15]               | 275.13 | 1507 | 134 |
| Region Growing with Merging [19]  | 368.46 | 1691 | 129 |
Table 1: Comparison of execution times (all units in milliseconds) of the competing methods.

Performance of Proposed Methodology: The input scene and point cloud, along with the output at different stages, are given in Figure 4. It should be mentioned that the different colors of the mesh are due to separation of the ground plane on the basis of normals and are rendered for better visualization only. Performance is evaluated using the precision, recall and F1 score metrics. As the segments may lack semantic labels, edge-based comparisons were performed by overlapping dilated edge points with the ground truth. We vary the sub-sampling factor from 5 to 15 in steps of 5; the factor determines how many firings of the horizontal sweep separate consecutive mesh columns. The thresholds for checking normal homogeneity are determined empirically for scenes with standard objects on a flat surface. Our methodology is compared with [15] and [19], with the tuning parameters of those methods kept at their default settings. Table 1 shows the execution times of the different methods and clearly reveals that the proposed method is much faster. In Table 2, we present the accuracy values for the competing approaches; this table indicates that our solution is also considerably more accurate. Overall, the results demonstrate that we are successful in providing a fast yet accurate solution to this complex problem.

| Method                           | F1 score | Precision | Recall |
| Proposed (sub-sampling factor 5)  | 0.7406 | 0.7616 | 0.7224 |
| Proposed (sub-sampling factor 10) | 0.7147 | 0.7330 | 0.6998 |
| Proposed (sub-sampling factor 15) | 0.6910 | 0.7190 | 0.6673 |
| Region Growing [15]               | 0.3509 | 0.4752 | 0.2804 |
| Region Growing with Merging [19]  | 0.3614 | 0.3849 | 0.3430 |
Table 2: Comparison of accuracy of the competing methods.
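The F1 score reported in Table 2 is the harmonic mean of precision and recall; a one-line helper (our own illustration, not from the paper's code) makes the relationship explicit:

```cpp
#include <cmath>

// F1 score as the harmonic mean of precision and recall.
double f1(double precision, double recall) {
    return 2.0 * precision * recall / (precision + recall);
}
```

For instance, f1(0.7616, 0.7224) reproduces the proposed method's F1 at sub-sampling factor 5 up to rounding of the reported precision and recall.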

4 Conclusion

In this work we have presented an unsupervised surface segmentation algorithm which is fast, accurate, and robust to noise, occlusion and different orientations of surfaces with respect to the Lidar. This work serves as the first step towards mapping environments with geometric primitive modelling in SLAM applications for unmanned ground vehicles. In the future, a supervised classifier can be utilized for segment formation on data collected by a Lidar in a real environment. Thereafter, the surface segments will enable model generation of 3D objects.


  • [1] M. Bassier, M. Bonduel, B. Van Genechten, and M. Vergauwen (2017) Segmentation of large unstructured point clouds using octree-based region growing and conditional random fields. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 42 (2W8), pp. 25–30. Cited by: §1.
  • [2] Y. Ben-Shabat, T. Avraham, M. Lindenbaum, and A. Fischer (2018) Graph based over-segmentation methods for 3d point clouds. Computer Vision and Image Understanding 174, pp. 12–23. Cited by: §1.
  • [3] B. Bhanu, S. Lee, C. Ho, and T. Henderson (1986) Range data processing: representation of surfaces by edges. In Proceedings of the Eighth International Conference on Pattern Recognition, pp. 236–238. Cited by: §1.
  • [4] C. Feng, Y. Taguchi, and V. R. Kamat (2014) Fast plane extraction in organized point clouds using agglomerative hierarchical clustering. In 2014 IEEE International Conference on Robotics and Automation (ICRA), pp. 6218–6225. Cited by: §1.
  • [5] M. Gschwandtner, R. Kwitt, A. Uhl, and W. Pree (2011) BlenSor: blender sensor simulation toolbox. In International Symposium on Visual Computing, pp. 199–208. Cited by: §3.
  • [6] T. Hackel, J. D. Wegner, and K. Schindler (2016) Fast semantic segmentation of 3d point clouds with strongly varying density. ISPRS annals of the photogrammetry, remote sensing and spatial information sciences 3 (3), pp. 177–184. Cited by: §1.
  • [7] M. Himmelsbach, F. V. Hundelshausen, and H. Wuensche (2010) Fast segmentation of 3d point clouds for ground vehicles. In 2010 IEEE Intelligent Vehicles Symposium, pp. 560–565. Cited by: §1.
  • [8] Y. Ioannou, B. Taati, R. Harrap, and M. Greenspan (2012) Difference of normals as a multi-scale operator in unorganized point clouds. In 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission, pp. 501–508. Cited by: §1.
  • [9] X. Y. Jiang, U. Meier, and H. Bunke (1996) Fast range image segmentation using high-level segmentation primitives. In Proceedings Third IEEE Workshop on Applications of Computer Vision. WACV’96, pp. 83–88. Cited by: §1.
  • [10] L. Landrieu and M. Simonovsky (2018) Large-scale point cloud semantic segmentation with superpoint graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4558–4567. Cited by: §1.
  • [11] M. Li and D. Yin (2017) A fast segmentation method of sparse point clouds. In 2017 29th Chinese Control And Decision Conference (CCDC), pp. 3561–3565. Cited by: §1.
  • [12] F. Moosmann, O. Pink, and C. Stiller (2009) Segmentation of 3d lidar data in non-flat urban environments using a local convexity criterion. In 2009 IEEE Intelligent Vehicles Symposium, pp. 215–220. Cited by: §1.
  • [13] A. Nguyen and B. Le (2013) 3D point cloud segmentation: a survey. In 2013 6th IEEE conference on robotics, automation and mechatronics (RAM), pp. 225–230. Cited by: §1.
  • [14] C. R. Qi, H. Su, K. Mo, and L. J. Guibas (2017) Pointnet: deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652–660. Cited by: §1.
  • [15] R. B. Rusu and S. Cousins (2011) 3d is here: point cloud library (pcl). In 2011 IEEE international conference on robotics and automation, pp. 1–4. Cited by: Table 1, Table 2, §3, §3.
  • [16] F. Tarsha-Kurdi, T. Landes, and P. Grussenmeyer (2007) Hough-transform and extended ransac algorithms for automatic detection of 3d building roof planes from lidar data. In ISPRS Workshop on Laser Scanning 2007 and SilviLaser 2007, Vol. 36, pp. 407–412. Cited by: §1.
  • [17] A. Vo, L. Truong-Hong, D. F. Laefer, and M. Bertolotto (2015) Octree-based region growing for point cloud segmentation. ISPRS Journal of Photogrammetry and Remote Sensing 104, pp. 88–100. Cited by: §1.
  • [18] D. Zermas, I. Izzat, and N. Papanikolopoulos (2017) Fast segmentation of 3d point clouds: a paradigm on lidar data for autonomous vehicle applications. In 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 5067–5073. Cited by: §1.
  • [19] Q. Zhan, Y. Liang, and Y. Xiao (2009) Color-based segmentation of point clouds. Laser scanning 38 (3), pp. 155–161. Cited by: §1, Table 1, Table 2, §3, §3.