1 Introduction
The availability of low-cost 3D sensors, e.g. Microsoft Kinect and time-of-flight cameras, has attracted increasing interest in using three-dimensional point clouds Rusu20113D Han2017A to solve many tasks, such as 3D object recognition, classification, and robot localization and navigation, which are fundamental applications in 3D computer vision and robotics, since 3D point clouds provide important cues for analyzing objects and environments.
A crucial step in 3D applications is the extraction of 3D descriptors (3D features), which has a significant effect on overall performance. A powerful and discriminative descriptor should capture the geometric structure while being invariant to translation, scaling and rotation. How to extract a meaningful descriptor from 3D point clouds, especially in environments with occlusion and clutter, therefore remains a challenging research problem that is worth extensive investigation.
The existing 3D descriptors in the literature can be divided into three main categories: global-based descriptors, local-based descriptors and hybrid-based descriptors. Global descriptors generally estimate a single descriptor vector encoding the whole input 3D object. Their success relies on observing the entire geometry of the object's point cloud, which is difficult to achieve in practice. Local descriptors, in contrast, construct features from the geometric information in the local neighborhood of each keypoint, where the keypoints are obtained from the point cloud with a keypoint-extraction algorithm. Local descriptors are robust to occlusion and clutter Guo2013Rotational , which global descriptors are not. However, local descriptors are sensitive to changes in the neighborhoods around keypoints Hadji2014Local . Hybrid-based descriptors fuse the underlying ideas of local and global descriptors, or combine both kinds of descriptors, to make the most of the advantages of each.
Although a few review papers Aldoma2012Tutorial Alexandre20123D Salti2012On Mateo2014A Guo2016A on 3D descriptors and related fields have been published, they cover only a limited number of descriptors. In particular, most focus on local descriptors for meshes or depth images, while others concentrate on the performance of a small set of global descriptors for particular applications (e.g. urban object recognition). Overall, no survey paper specifically provides a comprehensive review, analysis and evaluation of 3D point cloud descriptors.
The main contributions of this paper are as follows: i) To the best of our knowledge, this is the first review paper concentrating specifically on 3D point cloud descriptors, including both local-based and global-based descriptors. ii) We give readers a comprehensive introduction to the state-of-the-art descriptors, from early work to recent research (including 31 local-based descriptors, 14 global-based descriptors and 5 hybrid-based descriptors). iii) The main traits of these descriptors are summarized in table form to offer an intuitive understanding. iv) We conduct an experimental performance evaluation of several widely used descriptors.
The remainder of the paper is organized as follows: a comprehensive description of the state-of-the-art 3D point cloud descriptors is presented in Section 2. Section 3 presents the detailed results and analysis of the experimental comparison we carried out to evaluate several descriptors, followed by the conclusion in Section 4.
2 3D point cloud descriptors
3D point cloud descriptors have recently been addressed extensively in many fields, and numerous works in this research area have been published in recent decades. The main approaches to extracting features from 3D point clouds can be categorized into the following groups: local descriptors, global descriptors and hybrid methods. We list the approaches in each category chronologically by year of publication.
2.1 Local Descriptors
3D local descriptors have been developed to encode the local geometric information around a feature point (e.g. surface normals and curvatures), which is directly associated with the quality and resolution of the point cloud models. Generally, these descriptors can be used in applications such as registration, object recognition and categorization.
2.1.1 Spin Image
Johnson et al. Johnson1998Surface introduced a local shape-based descriptor for 3D point clouds called the spin image. The feature point is represented by its coordinate p and surface normal n. The spin image attributes of each neighboring point q of the feature point are then defined as a pair of distances $(\alpha, \beta)$, where $\alpha = \sqrt{\|q - p\|^2 - (n \cdot (q - p))^2}$ and $\beta = n \cdot (q - p)$ (shown in Figure 1). Finally, the spin image is generated by accumulating the neighbors of the feature point in discrete 2D bins. This descriptor is robust to occlusion and clutter 765655 , but its performance degrades in the presence of high-level noise.
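The accumulation step can be illustrated with a minimal NumPy sketch. It assumes the usual cylindrical coordinates of a spin image (radial distance from the normal line and signed height along the normal) and uses plain nearest-bin counting; the function name `spin_image` and the parameters `bin_size` and `image_width` are illustrative choices, not part of the original work, which may use a different bin layout or interpolation.

```python
import numpy as np

def spin_image(p, n, neighbors, bin_size, image_width):
    """Accumulate the neighbours of an oriented point (p, n) into a 2D spin image.

    Assumption: nearest-bin counting without interpolation.
    """
    n = n / np.linalg.norm(n)
    diff = neighbors - p                       # vectors from p to each neighbour q
    beta = diff @ n                            # signed height along the normal
    # Radial distance from the line through p along n (clamped against round-off).
    alpha = np.sqrt(np.maximum(np.sum(diff ** 2, axis=1) - beta ** 2, 0.0))
    half_height = image_width * bin_size / 2.0
    rows = np.floor((half_height - beta) / bin_size).astype(int)  # beta decreases downwards
    cols = np.floor(alpha / bin_size).astype(int)
    image = np.zeros((image_width, image_width))
    inside = (rows >= 0) & (rows < image_width) & (cols >= 0) & (cols < image_width)
    np.add.at(image, (rows[inside], cols[inside]), 1.0)           # handles repeated bins
    return image
```

Because only the distance to the normal line and the height along it are used, rotating the neighbors about the normal leaves the image unchanged, which is why no full reference frame is needed.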
Matei et al. Matei2006 incorporated the spin image descriptor into their work to handle the challenging problem of recognizing 3D point clouds of vehicles.
Similarly, Shan et al. Shan2006 used the spin image as the descriptor in their Shapeme Histogram Projection (SHP) approach, which, together with a Bayesian framework, achieves partial object recognition by projecting the descriptor of the query model onto the subspace of the model database.
Golovinskiy et al. Golovinskiy2009 applied the shape-based spin image descriptor in conjunction with contextual features to form a discriminative descriptor for identifying urban objects.
2.1.2 3D Shape Context
Frome et al. Frome2004 directly extended the 2D shape context descriptor to 3D point clouds, yielding the 3D shape context (3DSC). A spherical support region, centered on a given feature point p, is first determined with its north pole oriented along the surface normal n. Within the support, a set of bins (Figure 2) is formed by dividing the azimuth and elevation dimensions equally and spacing the radial dimension logarithmically. The final 3DSC descriptor is then computed as the weighted sum of the number of points falling into each bin. In effect, this descriptor captures the local shape of the point cloud at p through the distribution of points in a spherical support. However, because no reference frame is defined at the feature point, the azimuth origin is ambiguous and multiple descriptors must be computed for each feature point.
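The bin layout can be sketched as follows. This is only an illustration under stated assumptions: it counts points without the per-bin weighting mentioned above, the bin counts (`n_radial`, `n_elev`, `n_azim`) are arbitrary defaults, and the tangent basis that fixes the azimuth origin is chosen arbitrarily, which is exactly the ambiguity that forces several descriptors per feature point.

```python
import numpy as np

def shape_context_counts(p, n, neighbors, r_min, r_max,
                         n_radial=5, n_elev=4, n_azim=6):
    """Unweighted point counts in a spherical grid with log-spaced radial shells."""
    n = n / np.linalg.norm(n)
    # Arbitrary tangent basis (a, b): the azimuth origin is NOT uniquely defined.
    helper = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    a = np.cross(n, helper)
    a = a / np.linalg.norm(a)
    b = np.cross(n, a)
    diff = neighbors - p
    r = np.linalg.norm(diff, axis=1)
    keep = (r >= r_min) & (r <= r_max)
    diff, r = diff[keep], r[keep]
    elev = np.arccos(np.clip(diff @ n / r, -1.0, 1.0))       # 0 at the north pole
    azim = np.arctan2(diff @ b, diff @ a) % (2.0 * np.pi)
    radial_edges = np.geomspace(r_min, r_max, n_radial + 1)  # logarithmic spacing
    ri = np.clip(np.searchsorted(radial_edges, r, side="right") - 1, 0, n_radial - 1)
    ei = np.minimum((elev / np.pi * n_elev).astype(int), n_elev - 1)
    ai = np.minimum((azim / (2.0 * np.pi) * n_azim).astype(int), n_azim - 1)
    hist = np.zeros((n_radial, n_elev, n_azim))
    np.add.at(hist, (ri, ei, ai), 1.0)
    return hist
```

Rotating the basis (a, b) about n shifts the histogram along its azimuth axis, so a matcher has to compare against every azimuth shift unless a unique reference frame is available.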
2.1.3 Eigenvalues based Descriptors
Vandapel et al. Vandapel2004 devised eigenvalue-based descriptors to extract saliency features. The eigenvalues are obtained by decomposing the local covariance matrix defined in a local support region around the feature point and are ordered decreasingly as $\lambda_1 \geq \lambda_2 \geq \lambda_3$. Three saliencies, named pointness, curveness and surfaceness, are constructed as linear combinations of these eigenvalues.
(1)
In Anand2013Contextually , Kahler2013Efficient and Zelener2015Classification , the above three features are also integrated into the respective feature representations to perform object detection in indoor scenes, 3D scene labeling and vehicle detection, respectively.
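A small sketch of such saliencies is given here. Since the exact combination is not reproduced in this text, the code assumes the form commonly quoted for this family of features: pointness is the smallest eigenvalue, curveness is the gap between the largest and the middle eigenvalue, and surfaceness is the gap between the middle and the smallest. The function name `eigen_saliencies` is illustrative.

```python
import numpy as np

def eigen_saliencies(neighborhood):
    """Saliency features from the eigenvalues of the local covariance matrix.

    Assumed combination (not taken verbatim from the source):
    pointness = l3, curveness = l1 - l2, surfaceness = l2 - l3, with l1 >= l2 >= l3.
    """
    centered = neighborhood - neighborhood.mean(axis=0)
    cov = centered.T @ centered / len(neighborhood)
    lam = np.sort(np.linalg.eigvalsh(cov))[::-1]   # decreasing order
    return {
        "pointness": lam[2],
        "curveness": lam[0] - lam[1],
        "surfaceness": lam[1] - lam[2],
    }
```

Under this assumption a linear structure gives a dominant curveness, a planar patch gives a dominant surfaceness, and an isotropic scatter of points gives a dominant pointness.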
2.1.4 Distribution Histogram
The Distribution Histogram (DH) by Anguelov et al. Anguelov2005 is computed based on the principal plane around each point. First, PCA is run on the points in a defined cube to obtain the plane spanned by the first two principal components. Then, the cube is divided into several bins oriented along this plane. Finally, the descriptor is formed by counting the number of points falling into each sub-cube. Experiments show that this feature, combined with a Markov random field model, is applicable to objects in both outdoor and indoor environments.
2.1.5 Histogram of Normal Orientation
Triebel et al. Triebel2006 evaluated a new feature descriptor representing the local distribution of the cosines of the angles between the surface normal at p and the normals of its neighbors. This local distribution is represented as a local histogram. In general, regions with strong curvature result in a uniformly distributed histogram, while flat areas lead to a peaked histogram Behley2012 .
2.1.6 Intrinsic Shape Signatures
The Intrinsic Shape Signatures (ISS) descriptor is computed with the following procedure. In the first step, for a feature point p, a local reference frame $(e_1, e_2, e_1 \times e_2)$ is computed, where $e_1$, $e_2$ and $e_3$ are the eigenvectors obtained from the eigen-analysis of p's spherical support. In the second step, the spherical angular space $(\theta, \varphi)$, constructed using an octahedron, is partitioned into several bins. The ISS descriptor is a 3D histogram created by computing the weighted sum of the points in each bin. The ISS descriptor is reported to be stable, repeatable, informative and discriminative.
Ge et al. Ge2016Non used a multilevel ISS approach to extract feature descriptors as an important step of their registration framework.
2.1.7 ThrIFT
Motivated by the success of SIFT and SURF, ThrIFT was proposed by Flint et al. Flint2007Thrift to take orientation information into account. For each keypoint p and each point q in its neighbouring support, two windows are computed and the normals of their least-squares planes are estimated (shown in Figure 4). The output descriptor for p is generated by binning the angle between these two normals into a histogram.
(2) 
Although the ThrIFT descriptor can yield promising results, it is sensitive to noise.
2.1.8 Point Feature Histogram
Point Feature Histogram (PFH), proposed by Rusu et al. Rusu2008 Rusu2008Aligning , uses the relationships between point pairs in the support region (as shown in Figure 4(a)) and the estimated surface normals to represent the geometric properties. For every pair of points $p_i$ and $p_j$ in the neighborhood of p, one point is chosen as the source $p_s$ and the other as the target $p_t$. First, a Darboux frame is constructed at $p_s$ (as shown in Figure 4(b)):
$$u = n_s, \quad v = u \times \frac{p_t - p_s}{\|p_t - p_s\|_2}, \quad w = u \times v \qquad (3)$$
Then, using this frame, three angular features $\alpha$, $\phi$ and $\theta$, expressing the difference between the normals $n_s$ and $n_t$, and the distance d are computed for each point pair in the support region:
$$\alpha = v \cdot n_t, \quad \phi = u \cdot \frac{p_t - p_s}{d}, \quad \theta = \arctan(w \cdot n_t,\; u \cdot n_t), \quad d = \|p_t - p_s\|_2 \qquad (4)$$
The final PFH representation is created by binning these four features into a histogram with $div^4$ bins, where div is the number of subdivisions along each feature's value range.
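The per-pair computation can be sketched in a few lines of NumPy. The sketch assumes unit-length normals and the standard Darboux-frame construction at the source point; normalising v after the cross product follows common implementations and is an assumption here, as is the name `pair_features`.

```python
import numpy as np

def pair_features(p_s, n_s, p_t, n_t):
    """Three angular features and the distance for one source/target point pair.

    Assumes unit normals; v is normalised (an implementation choice).
    """
    diff = p_t - p_s
    d = np.linalg.norm(diff)
    u = n_s
    v = np.cross(u, diff / d)
    v_norm = np.linalg.norm(v)
    if v_norm < 1e-12:
        raise ValueError("source normal is parallel to the line joining the points")
    v = v / v_norm
    w = np.cross(u, v)
    alpha = v @ n_t                            # cosine between v and the target normal
    phi = u @ diff / d                         # cosine between u and the joining line
    theta = np.arctan2(w @ n_t, u @ n_t)       # angle of n_t in the (u, w) plane
    return alpha, phi, theta, d
```

Because the frame is built only from the two points and the source normal, the four values do not change when the same rigid motion is applied to both points and both normals.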
2.1.9 Fast Point Feature Histogram
The computational complexity of PFH for a point cloud with n points is $O(nk^2)$, where k is the number of neighbors of a query point, which makes PFH unsuitable for real-time applications. To reduce this complexity, Rusu et al. Rusu2009Fast Rusu2009Detecting introduced the Fast Point Feature Histogram (FPFH), which simplifies the PFH descriptor. FPFH is computed in two steps. The first step is the construction of the Simplified Point Feature Histogram (SPFH). The three angular features $\alpha$, $\phi$ and $\theta$ are computed in the same way as in PFH, but only between each query point and its neighbors (in PFH, these angles are computed for all point pairs in the support region), and are binned into three separate histograms. These histograms are then concatenated to generate the SPFH. In the second step, the final FPFH of a query point p is calculated as the sum of the SPFH of p and the weighted sum of the SPFH of each point $p_i$ among its k neighbors:
$$FPFH(p) = SPFH(p) + \frac{1}{k} \sum_{i=1}^{k} \frac{1}{w_i} SPFH(p_i) \qquad (5)$$
where $w_i$ denotes the distance between the query point and neighbor $p_i$ in the support region (as shown in Figure 5). FPFH reduces the computational complexity to $O(nk)$.
Huang et al. Huang2013 combined the FPFH descriptor with an SVM (Support Vector Machine) learning algorithm to detect objects in scene point clouds, achieving effective results for cluttered industrial scenes.
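The second FPFH step, re-weighting the neighbors' SPFH histograms, can be sketched as below. The sketch assumes the SPFH histograms have already been computed, that every point comes with the indices and non-zero distances of its k neighbors, and that the weight of a neighbor is the inverse of its distance averaged over the k neighbors; the function and argument names are illustrative.

```python
import numpy as np

def fpfh_from_spfh(spfh, neighbor_indices, neighbor_distances):
    """Combine each point's SPFH with the distance-weighted SPFH of its neighbours.

    spfh:               (N, B) array, one SPFH histogram per point
    neighbor_indices:   list of N index sequences (the k neighbours of each point)
    neighbor_distances: list of N distance sequences (must be non-zero)
    """
    spfh = np.asarray(spfh, dtype=float)
    fpfh = spfh.copy()
    for i, (idx, dist) in enumerate(zip(neighbor_indices, neighbor_distances)):
        idx = np.asarray(idx, dtype=int)
        w = np.asarray(dist, dtype=float)
        if len(idx) > 0:
            # (1/k) * sum_i SPFH(p_i) / w_i
            fpfh[i] += (spfh[idx] / w[:, None]).mean(axis=0)
    return fpfh
```

Each point needs its own SPFH once, and the combination reuses the neighbors' cached histograms, which is where the saving over considering all pairs in the support comes from.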
2.1.10 Radius-based Surface Descriptor
The Radius-based Surface Descriptor (RSD) Marton2010General depicts the geometric property of a point by estimating the radial relation with its neighbouring points. The radius r is modeled through the relation between the distance d of two points and the angle $\alpha$ between the normals at these two points:
$$d(\alpha) = \sqrt{2}\, r \sqrt{1 - \cos\alpha} \approx r\alpha \qquad (6)$$
By solving this equation, the maximum and minimum radii are obtained to construct the final descriptor for each point, presented as $[r_{max}, r_{min}]$. The advantage of this method is that it is simple yet descriptive.
2.1.11 Normal Aligned Radial Feature
Given a point cloud, the Normal Aligned Radial Feature (NARF) descriptor Radu2010NARF is estimated in the following steps. A normal-aligned range value patch around the feature point is computed by constructing a local coordinate system. A star-shaped pattern is then projected onto this patch to form the final descriptor. A rotationally invariant version of the NARF descriptor is obtained by shifting the descriptor according to a unique orientation extracted from the original NARF descriptor. This approach obtains better results in feature matching.
2.1.12 Signature of Histogram of Orientation
The Signature of Histogram of Orientation (SHOT) descriptor, presented by Tombari et al. Tombari2010Unique , can be considered a combination of signatures and histograms. First, a repeatable, unique local reference frame (LRF) is computed for the feature point based on a disambiguated eigenvalue decomposition (EVD) of the covariance matrix of the points within the support region. Then, an isotropic spherical grid is used to define the signature structure that partitions the neighborhood along the radial, azimuth and elevation axes (as shown in Figure 6). For each grid sector, a local histogram is obtained by accumulating point counts into bins according to the angle between the normal at each neighboring point within the sector and the normal at the feature point. The final SHOT descriptor is formed by juxtaposing all these local histograms.
To improve the accuracy of feature matching, Tombari et al. Tombari2011A incorporated texture information (in the CIELab color space) to extend SHOT into a color version, i.e. SHOTCOLOR or CSHOT. Experimental results show that it offers a good balance between recognition accuracy and time complexity.
Gomes et al. Beserra2013Efficient applied the SHOT descriptor to foveated point clouds in their object recognition system to reduce processing time, and their experiments show attractive performance.
2.1.13 Unique Shape Context
Unique Shape Context (USC) Tombari2010Uniqueshape can be considered an improvement of the 3D Shape Context descriptor that adds a unique, unambiguous local reference frame in order to avoid computing multiple descriptors at each keypoint. Given a query point p and its spherical support region of radius R, a weighted covariance matrix is defined as
$$M = \frac{1}{Z} \sum_{i: d_i \leq R} (R - d_i)(p_i - p)(p_i - p)^T \qquad (7)$$
where $d_i = \|p_i - p\|_2$ and $Z = \sum_{i: d_i \leq R} (R - d_i)$. The three unit vectors of the LRF are computed from the eigenvector decomposition of M. The eigenvectors corresponding to the maximum and minimum eigenvalues are re-oriented to match the majority of the vectors they represent, while the sign of the third eigenvector is determined by the cross product. The detailed definition of this LRF is presented in Tombari2010Uniqueorignal . Once the LRF is built, the construction of the USC descriptor follows an approach analogous to that used in 3DSC. In terms of memory cost and efficiency, USC notably improves on 3DSC.
2.1.14 Depth Kernel descriptor
Motivated by kernel descriptors, Bo et al. Bo2011Depth extended the idea to 3D point clouds and derived five local kernel descriptors that describe size, shape and edges. Experimental results show that these features complement each other and that the resulting method significantly improves the accuracy of object recognition.
2.1.15 Spectral Histogram
Similar to the SHOT descriptor, Behley et al. Behley2012 proposed the Spectral Histogram (SH) descriptor. They first decompose the covariance matrix defined within the support radius to compute its three eigenvalues, which are sorted and then normalized. Three signature values are then calculated from the normalized eigenvalues. Finally, the range of these three values is subdivided into sectors (Figure 7), and the number of points falling into each sector is accumulated to form the descriptor. This method is reported to be well suited to classification in urban environments.
2.1.16 Covariance Based Descriptors
Inspired by the core idea of spin images, Fehr et al. Fehr2012Compact exploited covariance-based descriptors for 3D point clouds, motivated by their compactness and their flexibility in combining multiple features for representational power. The features chosen to capture the geometric relations comprise five scalar quantities together with the normal n, which are shown in the following figure. The advantages of this method are low computational and storage requirements, the absence of model parameters to tune, and scalability to various features.
To further increase performance, Fehr et al. Fehr2014RGB took the additional r, g, b color channel values from 'colored' point clouds into consideration to form a straightforward extension of the original covariance-based descriptors. This approach yields promising results.
Beksi et al. 7139443 computed covariance descriptors that also encapsulate both shape features and visual features over the entire point cloud. The difference from Fehr2014RGB is that they incorporated the principal curvatures and Gaussian curvature into the shape vector while adding gradient and depth to the visual vector. Together with dictionary learning, this forms a point cloud classification framework for classifying objects.
2.1.17 Surface Entropy
Fiolka et al. Fiolka2014Distinctive Fiolka2012SURE designed a novel local shape-texture descriptor that uses shape information and, if available, color information. For the shape descriptor, a local reference frame (u, v, w) is defined using two surfels $(p, n_p)$ and $(q, n_q)$ at keypoint p and its neighbor q. After that, surfel-pair relations consisting of four features are estimated.
(8) 
The shape descriptor is constructed by discretizing the surfel-pair relations into several bins each and building the corresponding histograms. If color information is available at the points, two more histograms are built based on the hue and saturation in the HSL color space. The final descriptor is view-pose invariant.
2.1.18 3D Self-similarity Descriptor
Huang et al. Huang2012 developed a descriptor that extends the concept of self-similarity to 3D point clouds. The 3D self-similarity descriptor is computed in two major steps. In the first step, three similarities between two points x and y, named normal similarity, curvature similarity and photometric similarity, are approximated as follows.
(9) 
(10) 
(11) 
Then, these similarities are combined together to define the united similarity:
(12) 
By comparing the feature point's united similarity to that of the neighbors within its local spherical support, the self-similarity surface is constructed directly. The second step builds a local reference frame at the feature point to guarantee rotation invariance and quantizes the correlation space into cells, each assigned the average similarity value of the points falling into it, to form the descriptor (shown in Figure 9). This descriptor can efficiently characterize the distinctive geometric signatures in point clouds.
2.1.19 Geometric and Photometric Local Feature
The Geometric and Photometric Local Feature (GPLF) makes full use of geometric properties and photometric characteristics. Given a point p and its normal n, the k-nearest neighbors $p_i$ of p are estimated. With these points, several geometric parameters are derived as follows.
(13) 
The two color features are calculated in the HSV color space according to the equations below.
(14) 
The last two angular features are defined as
(15) 
(16) 
where . The final GPLF is a histogram created on these four features with the size of 128.
2.1.20 MCOV
Cirujeda et al. Cirujeda2014MCOV fused visual and 3D shape information to develop the MCOV covariance descriptor. Given a feature point p and its radial neighborhood $N_p$, a feature selection function is defined as follows.
(17) 
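where each selected feature is represented as a vector $(R, G, B, \alpha, \beta, \gamma)$ (as shown in Figure 10). The first three elements correspond to the R, G, B values in RGB space and capture the texture information. The last three components capture the geometry through inner products involving the offset between p and each neighbor and the surface normals, the last being the inner product of the two normals. Then, a covariance descriptor at p is calculated by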
(18) 
where $\mu$ is the mean of the feature vectors. Results on test point clouds demonstrate that MCOV provides a compact and descriptive representation that boosts the discriminative performance dramatically.
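The covariance step shared by this family of descriptors can be sketched generically. The sketch assumes that the per-neighbor feature vectors (for MCOV, six values per neighbor) have already been computed and stacked row-wise, and that the sample covariance is normalised by N - 1; both the normalisation and the name `covariance_descriptor` are assumptions made for illustration.

```python
import numpy as np

def covariance_descriptor(features):
    """Covariance matrix of a set of per-neighbour feature vectors.

    features: (N, d) array with one d-dimensional feature vector per neighbour.
    Assumption: unbiased normalisation by N - 1.
    """
    features = np.asarray(features, dtype=float)
    mu = features.mean(axis=0)                 # mean feature vector
    centered = features - mu
    return centered.T @ centered / (len(features) - 1)
```

The result is a symmetric d x d matrix whose size is independent of the number of neighbors, which is what makes the representation compact; comparing two such matrices typically requires a metric suited to symmetric positive semi-definite matrices rather than a plain Euclidean distance.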
2.1.21 Histogram of Oriented Principal Components
To handle viewpoint variations and the effect of noise, Rahmani et al. Rahmani2014HOPC proposed a 3D point cloud descriptor named Histogram of Oriented Principal Components (HOPC). First, PCA is performed on the support region centered at the keypoint to yield the eigenvectors and corresponding eigenvalues. Then, each eigenvector is projected onto m directions derived from a regular m-sided polyhedron and scaled by its corresponding eigenvalue. Finally, the projected eigenvectors are concatenated in decreasing order of eigenvalue to form the HOPC descriptor.
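The three steps can be sketched as follows. This is a simplified illustration: the projection directions are passed in by the caller (the example uses the six vertices of an octahedron rather than any particular polyhedron from the original work), and the sketch does not resolve the sign ambiguity of the eigenvectors or apply any quantisation, both of which a full implementation would need to address. The name `hopc_like` signals that this is not the authors' implementation.

```python
import numpy as np

def hopc_like(support_points, directions):
    """PCA on the support, then eigenvalue-scaled projections onto fixed directions.

    support_points: (N, 3) points around the keypoint
    directions:     (m, 3) unit vectors (assumed to come from a regular polyhedron)
    Returns a vector of length 3 * m, ordered by decreasing eigenvalue.
    """
    centered = support_points - support_points.mean(axis=0)
    cov = centered.T @ centered / len(support_points)
    eigvals, eigvecs = np.linalg.eigh(cov)     # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    parts = []
    for k in order:
        projections = directions @ eigvecs[:, k]   # one value per direction
        parts.append(eigvals[k] * projections)     # scale by the eigenvalue
    return np.concatenate(parts)

# Example directions: the six vertices of a regular octahedron.
octahedron = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                       [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
```

For a support elongated along one axis, the first block of the result is dominated by the two directions aligned with that axis, since the leading eigenvector points along it and carries the largest eigenvalue.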
2.1.22 Height Gradient Histogram
The Height Gradient Histogram (HIGH), developed by Zhao et al. Zhao2014Height , takes full advantage of the height dimension, which is first extracted from the 3D point cloud as $f(p) = p_x$ (x corresponds to height) for each point $p = (p_x, p_y, p_z)$. After that, a linear gradient reconstruction method is used to compute the height gradient $\nabla f(p)$ of point p based on its neighbors. Second, the spherical support of each point p is divided into K sub-regions, and the gradient orientations of the points within each sub-region are encoded into a histogram. Finally, the HIGH feature descriptor is constructed by concatenating the histograms of all sub-regions. Experimental results show that the HIGH descriptor gives promising performance. However, one major limitation is that it cannot describe small objects well.
2.1.23 Equivalent Circumference Surface Angle Descriptor
The Equivalent Circumference Surface Angle Descriptor (ECSAD) J2015Geometric was developed to address the problem of detecting geometric edges in 3D point clouds. For a feature point p, the local support region is divided into several cells along the radial and azimuth axes only. For each cell, an angle is computed by averaging all the angles between the normal n at p and the vector $p_i - p$, where $p_i$ is a point falling into this cell. To handle empty bins, an interpolation strategy is used to estimate their values. The final dimension of ECSAD is 30.
2.1.24 Binary SHOT
To reduce computation time and memory footprint, Prakhya et al. Prakhya2015B were the first to employ a binary 3D feature descriptor, named Binary SHOT (BSHOT), which is formed by replacing each value of SHOT with either 0 or 1. The construction encodes each successive group of four values taken from a SHOT descriptor into corresponding binary values according to five possibilities defined by the authors. The proposed method requires significantly less memory and is considerably faster than SHOT.
2.1.25 Rotation, illumination, Scale Invariant Appearance and Shape Feature
The rotation, illumination, scale invariant appearance and shape feature (RISAS), presented by Li et al. Li2016RISAS , is a feature vector comprising three statistical histograms that encode spatial distribution, intensity information and geometric information. For the spatial distribution, a spherical support is divided into n sectors, and the information (position and depth value) in each sector is encoded to form the spatial histogram. For the intensity information, relative rather than absolute intensity is grouped into n bins to construct the intensity histogram. Finally, for the geometric information, the inner product $\langle n_p, n_q \rangle$ between the normal $n_p$ at feature point p and the normal $n_q$ at its neighbor q is computed, and these values are binned to form the geometric histogram. Experimental results show the effectiveness of the proposed descriptor in point cloud alignment.
2.1.26 Colored Histograms of Spatial Concentric Surflet-Pairs
Similar to the PFH and FPFH descriptors, the Colored Histograms of Spatial Concentric Surflet-Pairs (CoSPAIR) descriptor in Logoglu2016CoSPAIR is also based on surflet-pair relations Wahl2003Surflet . For the query point p and a neighbor q within its spherical support, a fixed reference frame and the three angles between them are estimated using the same strategy as PFH. Then, the neighbouring space is partitioned into several spherical shells (called levels) along the radius. For each level, three distinct histograms are generated by accumulating the points in it along the three angular features. The original SPAIR descriptor is the concatenation of the histograms of all levels. Finally, the CIELab color values are binned into one histogram per channel at each level and concatenated with the SPAIR descriptor to form the final CoSPAIR descriptor. This method is reported to be simple and fast.
2.1.27 Signature of Geometric Centroids
Tang et al. Tang2016Signature generated a new Signature of Geometric Centroids (SGC) descriptor by first constructing a unique LRF centered at feature point p based on PCA. Its local spherical support S is then aligned with the LRF, and a cubical volume encompassing S is defined with an edge length of 2R and partitioned evenly into $K \times K \times K$ voxels. Finally, the descriptor is constructed by concatenating the number of points within each voxel and the position of the centroid calculated for these points. The dimension of the SGC descriptor is $2 \times K \times K \times K$.
2.1.28 Local Feature Statistics Histograms
Yang et al. Yang2016A exploited the statistical properties of three local shape geometries: local depth, point density and angles between normals. Given a keypoint p from the point cloud and its spherical neighborhood N, all neighbors $p_i$ are projected onto a tangent plane estimated along the normal n at p to form new points $p_i'$. The local depth is then computed as $d = r - n \cdot (p_i - p)$. The angular feature is denoted as $\theta = \arccos(n \cdot n_i)$. For the point density, the ratio of points falling into each annulus, generated by equally dividing the circle on the plane, is determined via the horizontal projection distance.
(19) 
The final descriptor is constructed by concatenating the histograms built on these three features. It has low dimensionality and low computational complexity and is robust to various nuisances.
2.1.29 3D Histogram of Point Distribution
The 3D Histogram of Point Distribution (3DHoPD) descriptor Prakhya20173DHoPD is formed in the following steps. First, a local reference frame is computed for each keypoint, and the local surface is aligned with the LRF to achieve rotational invariance. Next, along the x axis, a histogram is created by partitioning the range between the smallest and largest x coordinates of the points on the surface into D bins and accumulating the points falling into each bin. The same processing is repeated along the y and z axes, and the 3DHoPD is generated by concatenating these histograms. The advantage of this method is that it is very fast to compute.
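The histogram stage is simple enough to sketch directly. The code assumes the local surface has already been expressed in its local reference frame (the LRF computation is not shown) and covers only the three per-axis histograms described above; the name `axis_histograms` is illustrative.

```python
import numpy as np

def axis_histograms(aligned_points, n_bins):
    """Concatenate per-axis point-count histograms of an LRF-aligned local surface.

    aligned_points: (N, 3) array, assumed to be already aligned with the LRF
    n_bins:         number of bins D along each axis
    """
    hists = []
    for axis in range(3):
        coords = aligned_points[:, axis]
        # Bins span the range between the smallest and largest coordinate.
        counts, _ = np.histogram(coords, bins=n_bins,
                                 range=(coords.min(), coords.max()))
        hists.append(counts)
    return np.concatenate(hists)

surface = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [2, 2, 2]], dtype=float)
print(axis_histograms(surface, 2))   # [1 3 2 2 3 1]
```

Each axis costs a single pass over the points, so the whole descriptor is linear in the number of points in the local surface, which is consistent with the speed advantage claimed for the method.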
2.2 Global Descriptors
Global descriptors encode the geometric information of the whole 3D point cloud and require relatively little computation time and memory. In general, global descriptors are increasingly used for 3D object recognition, geometric categorization and shape retrieval.
2.2.1 Point Pair Feature
Similar to the surflet-pair feature, given two points $p_1$ and $p_2$ and their normals $n_1$ and $n_2$, the Point Pair Feature (PPF) Drost2010Model is defined as $PPF(p_1, p_2) = (\|d\|_2, \angle(n_1, d), \angle(n_2, d), \angle(n_1, n_2))$, where $d = p_2 - p_1$ and each angle lies in the range $[0, \pi]$. Point pairs whose features share the same discretized value are grouped together, and the global descriptor is formed by mapping the sampled PPF space to the model.
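A compact sketch of the feature and of the grouping by discretized value is shown here. It assumes the usual four-component form of the point pair feature (one distance and three angles) and stores the model as a dictionary from quantized features to lists of point-index pairs; the step sizes and the helper names `quantize` and `build_model_table` are illustrative assumptions.

```python
import numpy as np
from collections import defaultdict

def angle(a, b):
    """Angle between two vectors, in [0, pi]."""
    cos = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return np.arccos(np.clip(cos, -1.0, 1.0))

def point_pair_feature(p1, n1, p2, n2):
    d = p2 - p1
    return np.array([np.linalg.norm(d), angle(n1, d), angle(n2, d), angle(n1, n2)])

def quantize(feature, dist_step, angle_step):
    """Discretize the distance and the three angles with fixed step sizes."""
    steps = np.array([dist_step, angle_step, angle_step, angle_step])
    return tuple(int(x) for x in feature // steps)

def build_model_table(points, normals, dist_step, angle_step):
    """Group all ordered point pairs of a model by their quantized feature."""
    table = defaultdict(list)
    for i in range(len(points)):
        for j in range(len(points)):
            if i != j:
                f = point_pair_feature(points[i], normals[i], points[j], normals[j])
                table[quantize(f, dist_step, angle_step)].append((i, j))
    return table
```

The table is quadratic in the number of model points, so in practice the model is usually subsampled before it is built; at matching time a scene pair is quantized in the same way and looked up in constant time.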
2.2.2 Global RSD
Global RSD Marton2011Combined can be regarded as a global version of the local RSD descriptor. After voxelizing the input point cloud, the smallest and largest radii are estimated in each voxel, which is then labeled as one of five surface primitives (e.g. planar, sphere) by checking the radii. Once all voxels have been labeled, a global RSD descriptor is obtained by analyzing the relations between all labels.
2.2.3 Viewpoint Feature Histogram
The Viewpoint Feature Histogram (VFH) Muja2011REIN Rusu2014Fast extends the idea and properties of FPFH by additionally including viewpoint variance. It is a global descriptor composed of a viewpoint direction component and a surface shape component. The viewpoint component is a histogram of the angles between the central viewpoint direction and each point's surface normal. For the shape component, three angles computed as in FPFH are binned into three distinct sub-histograms, each with 45 bins. VFH achieves high recognition performance with a computational complexity of O(n). However, its main drawback is its sensitivity to noise and occlusions.
Ali et al. Ali2014Contextual integrated the VFH descriptor into their proposed system to perform scene labeling. Chan et al. Chan2014A extracted VFH features from point clouds of humans for human-pose estimation.
2.2.4 Clustered Viewpoint Feature Histogram
The Clustered Viewpoint Feature Histogram (CVFH) descriptor Aldoma2012CAD can be considered an extension of VFH that takes advantage of stable object regions obtained by applying a region-growing algorithm after removing points with high curvature. Given a region s from the region set S, a Darboux frame (u, v, w) similar to that of PFH is constructed using the centroid p of s and its corresponding normal n. The angular information is then binned into four histograms as in VFH, followed by the computation of the Shape Distribution Component (SDC) of CVFH. The final CVFH descriptor is the concatenation of these histograms and has a size of 308. Although CVFH can produce promising results, the lack of an aligned Euclidean space causes the feature to miss a proper spatial description Aldoma2012OUR .
2.2.5 Oriented, Unique and Repeatable Clustered Viewpoint Feature Histogram
The Oriented, Unique and Repeatable Clustered Viewpoint Feature Histogram (OUR-CVFH) has the same surface shape components and viewpoint component as CVFH, but the viewpoint component is reduced to 64 bins. In addition, eight histograms of the distances between the points of surface S and the centroid replace the shape distribution component of CVFH; they are built on reference frames obtained with the Semi-Global Unique Reference Frame (SGURF) method. The size of the final OUR-CVFH descriptor is 303.
2.2.6 Global Structure Histograms
The Global Structure Histograms (GSH) Madry2012Improving descriptor encodes the global and structural properties of a 3D point cloud in a three-stage pipeline. Given a point cloud model, first, a local descriptor is estimated for each point, and all points are then assigned to approximate surface classes based on their descriptors using the k-means algorithm, followed by the computation of a Bag-of-Words model. In the second step, the relationship between different classes is determined along the surface formed by triangulation. The final step constructs the GSH descriptor, which represents the object as the distribution of distances along the surface in the form of a histogram. This descriptor not only maintains low variation but also retains global expressive power.
2.2.7 Shape Distribution on Voxel Surfaces
Based on the concept of shape functions, Shape Distribution on Voxel Surfaces (SDVS) Wohlkinger2012Shape first creates a voxel grid approximating the real surface of the point cloud. With its help, the distances between pairs of points randomly sampled from the point cloud are binned into three histograms according to the location of the connecting line (inside the 3D model, outside it, or a mixture of both). Although the method is simple, it does not cope well with shapes that are easily confused.
2.2.8 Ensemble of Shape Functions
Aiming at real-time applications, Wohlkinger et al. Wohlkinger2012Ensemble introduced a global descriptor for partial point clouds, dubbed Ensemble of Shape Functions (ESF), which is a concatenation of ten 64-bin histograms of shape distributions: three angle histograms, three area histograms, three distance histograms and one distance ratio histogram. The first nine histograms are created by classifying the A3 (angle formed by three randomly sampled points), D3 (area of the triangle formed by three points) and D2 (length of the line between a point pair sampled from the point cloud) shape functions into ON, OFF and MIXED cases according to the mechanism described in Ip2002Using , while the distance ratio histogram is built on the lines from the D2 shape function. The final ESF descriptor has proven to be efficient and expressive.
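The three shape functions named above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the ON/OFF/MIXED voxel classification and the distance ratio histogram are omitted, so only three of the ten histograms are produced, and the function name, sample count and histogram ranges are assumptions made for this illustration.

```python
import numpy as np

def shape_function_histograms(points, n_samples=20000, bins=64, seed=0):
    """Return normalised 64-bin D2, D3 and A3 histograms for an (N, 3) array."""
    rng = np.random.default_rng(seed)
    # Draw random point triplets (with replacement, for simplicity).
    idx = rng.integers(0, len(points), size=(n_samples, 3))
    a, b, c = points[idx[:, 0]], points[idx[:, 1]], points[idx[:, 2]]

    # D2: distance between the first two points of each triplet.
    d2 = np.linalg.norm(a - b, axis=1)
    # D3: area of the triangle spanned by the three points.
    d3 = 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)
    # A3: angle at vertex a between the edges a->b and a->c.
    ab, ac = b - a, c - a
    denom = np.linalg.norm(ab, axis=1) * np.linalg.norm(ac, axis=1)
    valid = denom > 0  # skip degenerate triplets with repeated points
    cos_a = np.clip(np.sum(ab[valid] * ac[valid], axis=1) / denom[valid], -1.0, 1.0)
    a3 = np.arccos(cos_a)

    def hist(values, upper):
        h, _ = np.histogram(values, bins=bins, range=(0.0, upper))
        return h / max(h.sum(), 1)  # normalise so each histogram sums to one

    # The bounding-box diagonal bounds every distance; half its square bounds every area.
    diameter = np.linalg.norm(points.max(axis=0) - points.min(axis=0))
    return hist(d2, diameter), hist(d3, 0.5 * diameter ** 2), hist(a3, np.pi)
```

In a full ESF-style descriptor each of these samples would additionally be assigned to the ON, OFF or MIXED histogram, which is what turns three shape functions into nine histograms.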
2.2.9 Global Fourier Histogram
The Global Fourier Histogram (GFH) descriptor Chen2014Performance is generated for an oriented point, which is chosen as the origin O. The normal at O, together with the global z-axis (z = [0, 0, 1]), is used to construct the global reference frame. Thereafter, a 3D cylindrical support region centered at O is divided into several bins by equally spacing the elevation, azimuth and radial dimensions. The GFH descriptor is a 3D histogram formed by counting the points in each bin. To further boost robustness, a 1D Fast Fourier Transform is applied to the 3D histogram along the azimuth dimension. This descriptor compensates for the drawback of the Spin Image method.
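The binning and the transform along the azimuth dimension can be sketched as follows. This is a simplified illustration, not the published implementation: the points are assumed to be already expressed in the reference frame (z along the cylinder axis), and the function name, bin counts and the use of the FFT magnitude are assumptions.

```python
import numpy as np

def cylindrical_fourier_histogram(points, radius, height, n_elev=4, n_azim=8, n_rad=4):
    """Histogram points in cylindrical bins, then take |FFT| along the azimuth axis."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    rho = np.hypot(x, y)                              # radial distance from the axis
    phi = np.mod(np.arctan2(y, x), 2.0 * np.pi)       # azimuth in [0, 2*pi]
    inside = (rho <= radius) & (np.abs(z) <= height / 2.0)

    sample = np.column_stack([z[inside], phi[inside], rho[inside]])
    hist, _ = np.histogramdd(
        sample,
        bins=(n_elev, n_azim, n_rad),
        range=((-height / 2.0, height / 2.0), (0.0, 2.0 * np.pi), (0.0, radius)),
    )
    # A rotation about the axis circularly shifts the azimuth bins; the magnitude
    # of the 1D FFT along that axis is unchanged by such shifts.
    return np.abs(np.fft.fft(hist, axis=1)).ravel()
```

Taking the magnitude is one common way to discard the azimuth phase; whether the original method keeps all coefficients is not stated in the text above.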
2.2.10 Position-related Shape, Object Height along Length, Reflective Intensity Histogram
To improve the vehicle detection rate, three novel features are extracted together in Cheng2014Robust , namely position-related shape, object height along length and a reflective intensity histogram, to robustly distinguish vehicles from other objects. The position-related shape takes both shape features (including the width-to-length ratio and the width-to-height ratio) and position information (e.g. distance to the object, angle of view and orientation) into account to address the variance in orientation and angle of view. In the next stage, the bounding box around the vehicle is divided into several blocks along its length, and the average height in each block is added to the feature vector to further improve discrimination. Finally, a reflective intensity histogram with 25 bins is computed from the characteristic intensity distribution of vehicles. The experimental results show that the final feature contributes greatly to the classification performance.
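A rough sketch of how such a feature vector could be assembled is given below. It covers only the two shape ratios, the height-along-length profile and the 25-bin intensity histogram; the position information (distance, angle of view, orientation) is left out. The function name, the object-aligned input frame, the number of blocks and the [0, 1] intensity range are assumptions for illustration.

```python
import numpy as np

def vehicle_feature_vector(points, intensity, n_blocks=5, n_bins=25):
    """points: (N, 3) in an assumed object-aligned frame (x = length, y = width,
    z = height); intensity: (N,) reflective intensities assumed scaled to [0, 1]."""
    mins, maxs = points.min(axis=0), points.max(axis=0)
    length, width, height = maxs - mins  # assumes a non-degenerate bounding box

    # Shape part: width-to-length and width-to-height ratios of the bounding box.
    shape = np.array([width / length, width / height])

    # Height along length: mean height above the box floor in each block.
    edges = np.linspace(mins[0], maxs[0], n_blocks + 1)
    block = np.clip(np.digitize(points[:, 0], edges) - 1, 0, n_blocks - 1)
    heights = np.zeros(n_blocks)
    for b in range(n_blocks):
        z = points[block == b, 2]
        if z.size:  # empty blocks keep a height of zero
            heights[b] = z.mean() - mins[2]

    # Reflective intensity histogram with 25 bins, normalised to sum to one.
    hist, _ = np.histogram(intensity, bins=n_bins, range=(0.0, 1.0))
    hist = hist / max(hist.sum(), 1)

    return np.concatenate([shape, heights, hist])
```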
2.2.11 Global Orthographic Object Descriptor
The Global Orthographic Object Descriptor (GOOD) Kasaei2016GOOD is constructed by first defining a unique and repeatable Local Reference Frame based on Principal Component Analysis. The point cloud model is then orthographically projected onto three planes, called XoZ, XoY and YoZ. The XoZ plane is divided into several bins, and a distribution matrix is computed by accumulating the points falling into each bin; the same processing is performed for XoY and YoZ. Afterwards, two statistical features, i.e. entropy and variance, are estimated for each distribution vector obtained from the matrices above. Concatenating these vectors forms a single vector for the entire object. Experimental results show that GOOD is scale and pose invariant and offers a trade-off between expressiveness and computational cost.
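The projection and statistics steps can be illustrated with the sketch below. It is a simplified reading of the description above, not the published algorithm: the sign disambiguation of the reference frame is omitted, and since the text does not specify how entropy and variance enter the final vector, they are simply returned next to the concatenated distributions. The function name and bin count are assumptions.

```python
import numpy as np

def good_like_projections(points, n_bins=5):
    """Return (concatenated projection distributions, [(entropy, variance), ...])."""
    centered = points - points.mean(axis=0)
    # PCA-based reference frame: eigenvectors of the covariance matrix.
    _, eigvecs = np.linalg.eigh(np.cov(centered.T))
    local = centered @ eigvecs[:, ::-1]     # axis 0 = direction of largest variance

    half = np.abs(local).max() or 1.0       # common square support for all planes
    vectors, stats = [], []
    for i, j in ((0, 2), (0, 1), (1, 2)):   # XoZ, XoY, YoZ projections
        matrix, _, _ = np.histogram2d(
            local[:, i], local[:, j], bins=n_bins,
            range=((-half, half), (-half, half)),
        )
        dist = matrix.ravel() / matrix.sum()  # distribution vector of this plane
        nonzero = dist[dist > 0]
        entropy = -np.sum(nonzero * np.log2(nonzero))
        vectors.append(dist)
        stats.append((entropy, dist.var()))
    return np.concatenate(vectors), stats
```

Because the bin range is tied to the extent of the object itself, uniformly scaling the input leaves the distributions essentially unchanged, which is the intuition behind the scale invariance reported above.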
2.2.12 Globally Aligned Spatial Distribution
Globally Aligned Spatial Distribution (GASD), a global descriptor proposed by Lima et al. Lima2016 , mainly consists of two steps. First, a reference frame for the entire point cloud model representing an object is estimated based on PCA: the x and z axes are the eigenvectors corresponding to the minimal and maximal eigenvalues, respectively, of the covariance matrix C computed from the points P and the centroid of the point cloud, and the y axis is the cross product of these two eigenvectors. The whole point cloud is then aligned with this reference frame. Second, an axis-aligned bounding cube of the point cloud, centered at the centroid, is partitioned into $m \times m \times m$ cells. The global descriptor is obtained by concatenating the histograms formed by counting the points falling into each cell. To obtain higher discriminative power, color information based on the HSV space is incorporated in a similar way to the shape descriptor to form the final descriptor. However, this method may not work well for objects having similar shape and color distributions.
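The two steps can be sketched in a few lines of Python. This is only an approximation of the shape part under stated assumptions: the axis assignment follows the description above, sign disambiguation of the axes and the color component are omitted, each cell contributes a single count rather than a histogram, and the function name and the default m are hypothetical.

```python
import numpy as np

def gasd_like_shape_histogram(points, m=4):
    """Align an (N, 3) cloud to a PCA frame and count points in an m x m x m grid."""
    centroid = points.mean(axis=0)
    centered = points - centroid
    cov = centered.T @ centered / len(points)
    _, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order

    x_axis = eigvecs[:, 0]                  # minimal eigenvalue (as in the text)
    z_axis = eigvecs[:, 2]                  # maximal eigenvalue (as in the text)
    y_axis = np.cross(z_axis, x_axis)       # completes a right-handed frame
    rotation = np.vstack([x_axis, y_axis, z_axis])
    aligned = centered @ rotation.T         # coordinates in the reference frame

    # Axis-aligned bounding cube centered at the centroid (now the origin).
    half = np.abs(aligned).max() or 1.0
    hist, _ = np.histogramdd(aligned, bins=(m, m, m), range=[(-half, half)] * 3)
    return hist.ravel() / len(points)       # normalised cell occupancies
```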
2.2.13 Scale Invariant Point Feature
Scale Invariant Point Feature (SIPF) Lin2017S computes $q^{*} = \arg\min_{q} \|p - q\|$ between the feature point p and the border points q as the reference direction. Then, the angle of a local cylindrical coordinate system with $q^{*}$ is partitioned into several cells. The final SIPF descriptor is constructed by concatenating all the normalized cell features $D_i = \exp(d_i / (1 - d_i))$, where $d_i$ is the minimum distance between p and the points in the i-th cell.
2.3 Hybrid methods
2.3.1 Bottom-Up and Top-Down Descriptor
Alexander et al. Alexander2008Object combined bottom-up and top-down descriptors to operate on 3D point clouds. In the bottom-up stage, spin images are computed for the point clouds; in the subsequent top-down stage, a global descriptor, Extended Gaussian Images (EGIs), is used to capture larger-scale structure information to further describe the models. Experimental results demonstrate that this approach provides a balance between efficiency and accuracy.
2.3.2 Local and Global Point Feature Histogram
For real-time applications, Himmelsbach et al. Himmelsbach2009Real describe a feature descriptor that draws on both local and global properties of point clouds. First, four object-level features, namely maximum object intensity, object intensity mean and variance, and volume, are estimated as the global part. Then, to describe local point properties, one histogram is built for each of three features, i.e. scatterness, linearness and surfaceness from Lalonde et al. Lalonde2006Natural , followed by the adoption of the feature extraction approach of Anguelov et al. Anguelov2005 , which produces five more histograms. The resulting descriptor proved well suited for object detection in large 3D point cloud scenes. Lehtomäki et al. Lehtom2016Object integrated the LGPFH descriptor above into their workflow to recognize objects (such as trees, cars, pedestrians and lamps) in road and street environments.
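The scatterness, linearness and surfaceness features mentioned above are commonly derived from the eigenvalues of the covariance matrix of a point neighborhood. The sketch below assumes that common definition (smallest eigenvalue, largest minus middle, middle minus smallest); the exact formulas and the histogram construction in the cited works may differ, and the function name is hypothetical.

```python
import numpy as np

def saliency_features(neighborhood):
    """Eigenvalue-based saliencies for an (N, 3) neighborhood of points."""
    centered = neighborhood - neighborhood.mean(axis=0)
    cov = centered.T @ centered / len(neighborhood)
    # eigvalsh returns eigenvalues in ascending order: l3 <= l2 <= l1.
    l3, l2, l1 = np.linalg.eigvalsh(cov)
    return {
        "scatterness": l3,        # large only when points spread in all directions
        "linearness": l1 - l2,    # large when one direction dominates
        "surfaceness": l2 - l3,   # large when two directions dominate
    }
```

For points sampled along a line, linearness is the largest of the three values, while for points sampled on a plane surfaceness dominates.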
2.3.3 Local-to-Global Signature Descriptor
To overcome the drawbacks of both local and global descriptors, Hadji et al. Hadji2014Local proposed the Local-to-Global Signature (LGS) descriptor. First, they classified the points based on the minimum radius of RSD Marton2010General to capture the whole geometric property of the object. Next, both local and global support regions, aggregated from points of the same class, are adopted to describe feature points (local description). Finally, the LGS is created in signature form. Since this descriptor makes use of the strengths of both local and global methods, it is highly discriminative.
2.3.4 Point-Based Descriptor
Wang et al. Wang2015A extracted a point-based descriptor from point clouds as the foundation of their multiscale, hierarchical classification framework. The first part of the feature (denoted as $F_1$) is obtained from the eigenvalues ($\lambda_1$, $\lambda_2$, $\lambda_3$, in decreasing order) computed from the covariance matrix of the feature point.
(20) 
A spin image is adopted as the second part of the feature, denoted as $F_2$. The final descriptor is the combination of $F_1$ and $F_2$, which is an 18-dimensional vector.
2.3.5 FPFH+VFH
To make the most of the advantages of local-based and global-based descriptors, Alhamzi et al. Alhamzi20153D combined the well-known local descriptor FPFH and the global descriptor VFH to represent objects in their 3D object recognition system.
Table 1 summarizes the characteristics of the local-based, global-based and hybrid-based descriptor extraction approaches. These approaches are arranged in chronological order.
No.  Name  Reference  Category  Application  Performance 

1  SI  Johnson et al. 1998 Johnson1998Surface  Local  3D object recognition  Be robust to occlusion and clutter 
2  3DSC  Frome et al. 2004 Frome2004  Local  3D object recognition  Outperform the SI 
3  Eigenvalue Based  Vandapel et al. 2004 Vandapel2004  Local  3D terrain classification  Be suitable for 3D data segmentation for terrain classification in vegetated environment 
4  DH  Anguelov et al. 2005 Anguelov2005  Local  3D segmentation  Achieve the best generalization performance 
5  SHP  Shan et al. 2006 Shan2006  Local  3D object recognition  Handle problem of partial object recognition well 
6  HNO  Triebel et al. 2006 Triebel2006  Local  3D object classification and segmentation  Yield more robust classifications 
7  Thrift  Flint et al. 2007 Flint2007Thrift  Local  3D object recognition  Be robust to missing data and viewchange. 
8  BUTD  Alexander et al. 2008 Alexander2008Object  Hybrid  3D object detection  Be applicable to very large datasets and requires limited training efforts 
9  PFH  Rusu et al. 2008 Rusu2008Aligning  Local  3D registration  Be invariant to position, orientation and point cloud density. 
10  LGPFH  Himmelsbach et al. 2009 Himmelsbach2009Real  Hybrid  3D object classification  Achieve real-time performance. 
11  FPFH  Rusu et al. 2009 Rusu2009Fast  Local  3D registration  Reduce time consuming of PFH. 
12  GFPFH  Rusu et al. 2009 Rusu2009Detecting  Global  3D object detection and segmentation  Achieve a high accuracy in terms of matching and classification. 
13  ISS  Zhong et al. 2009 Zhong2010Intrinsic  Local  3D object recognition  Be discriminative, descriptive and robust to sensor noise, obscuration and scene clutter. 
14  PPF  Drost et al. 2010 Drost2010Model  Global  3D object recognition  Achieve high performance in the presence of noise, clutter and partial occlusions. 
15  RSD  Marton et al. 2010 Marton2010General  Local  3D object reconstruction  Be a fast feature estimation method. 
16  NARF  Steder et al. 2010 Radu2010NARF  Local  3D object recognition  Be invariant to rotation and outperform SI. 
17  SHOT  Tombari et al. 2010 Tombari2010Unique  Local  3D object recognition and reconstruction  Outperform SI. 
18  USC  Tombari et al. 2010 Tombari2010Uniqueshape  Local  3D object recognition  Improve the accuracy and decrease memory cost of 3DSC. 
19  Kernel  Bo et al. 2011 Bo2011Depth  Local  3D object recognition  Outperform SI significantly. 
20  GRSD  Marton et al. 2011 Marton2011Combined  Global  3D object detection, classification and reconstruction  Reduce the complexity to be linear in the number of points. 
21  VFH  Muja et al. 2011 Muja2011REIN  Global  3D object recognition  Outperform SI and be fast and robust to large surface noise. 
22  CSHOT  Tombari et al. 2011 Tombari2011A  Local  3D object recognition  Improve the accuracy of SHOT. 
23  CVFH  Aldoma et al. 2012 Aldoma2012CAD  Global  3D object recognition  Outperform SI. 
24  OURCVFH  Aldoma et al. 2012 Aldoma2012OUR  Global  3D object recognition  Outperform CVFH and SHOT 
25  SH  Behley et al. 2012 Behley2012  Local  Laser data classification  Outperform SI. 
26  COV  Fehr et al. 2012 Fehr2012Compact  Local  3D object recognition  Be compact and low dimensional and outperform SI. 
27  SURE  Fiolka et al. 2012 Fiolka2012SURE  Local  3D matching  Outperform NARF in terms of matching score and runtime. 
28  3DSSIM  Huang et al. 2012 Huang2012  Local  3D matching and shape retrieval  Be invariant to scale and orientation change. 
29  GPLF  Hwang et al. 2012 Hwang2012Robust  Local  3D object recognition  Be stable and outperform FPFH. 
30  GSH  Madry et al. 2012 Madry2012Improving  Global  3D object classification  Outperform VFH. 
31  SDVS  Wohlkinger et al. 2012 Wohlkinger2012Shape  Global  3D object classification  Be fast and robust. 
32  ESF  Wohlkinger et al. 2012 Wohlkinger2012Ensemble  Global  3D object classification  Outperform SDVS, VFH, CVFH and GSHOT. 
33  GFH  Chen et al. 2014 Chen2014Performance  Global  3D urban object recognition  Outperform global SI and LGPFH. 
34  PRS, OHL and RIH  Cheng et al. 2014 Cheng2014Robust  Global  3D vehicle detection  Improve the vehicle detection performance greatly. 
35  MCOV  Cirujeda et al. 2014 Cirujeda2014MCOV  Local  3D matching  Outperform CSHOT and SI. 
36  LGS  Hadji et al. 2014 Hadji2014Local  Hybrid  3D object recognition  Outperform SI, SHOT and FPFH. 
37  HOPC  Rahmani et al. 2014 Rahmani2014HOPC  Local  3D action recognition  Be robust to noise and viewpoint variations and suitable for Action recognition in 3D point cloud 
38  HIGH  Zhao et al. 2014 Zhao2014Height  Local  3D scene labeling  Outperform 3DSC, SHOT and FPFH in terms of 3D scene labeling. 
39  FPFH + VFH  Alhamzi et al. 2015 Alhamzi20153D  Hybrid  3D object recognition  Outperform ESF, VFH and CVFH. 
40  ECSAD  Jørgensen et al. 2015 J2015Geometric  Local  3D object classification and recognition  Be fast and produce highly reliable edge detections. 
41  BSHOT  Prakhya et al. 2015 Prakhya2015B  Local  3D matching  Outperform SHOT, RSD and FPFH and be fast. 
42  Pointbased  Wang et al. 2015 Wang2015A  Hybrid  3D terrestrial object classification  Suit for classification of objects from terrestrial laser scanning. 
43  GOOD  Kasaei et al. 2016 Kasaei2016GOOD  Global  3D object recognition  Outperform VFH, ESF, GRSD and GFPFH. Be robust, descriptive and efficient and suited for real-time application. 
44  RISAS  Li et al. 2016 Li2016RISAS  Local  3D matching  Outperform CSHOT and be suitable for point cloud alignment. 
45  GASD  Lima et al. 2016 Lima2016  Global  3D object recognition  Outperform ESF, VFH and CVFH. 
46  CoSPAIR  Logoglu et al. 2016 Logoglu2016CoSPAIR  Local  3D object recognition  Outperform PFH, PFHRGB, FPFH, SHOT and CSHOT. 
47  SGC  Tang et al. 2016 Tang2016Signature  Local  3D matching  Outperform SI, 3DSC and SHOT. 
48  LFSH  Yang et al. 2016 Yang2016A  Local  3D registration  Outperform SI, PFH, FPFH and SHOT in terms of descriptiveness and time efficiency. 
49  SIPF  Lin et al. 2017 Lin2017S  Global  3D object detection  Outperform SI, PFH, FPFH and SHOT. 
50  3DHoPD  Prakhya et al. 2017 Prakhya20173DHoPD  Local  3D object detection  Require dramatically low computational time and outperform USC, FPFH, SHOT and 3DSC. 

3 Experimental Results and Discussion
Here, to comprehensively investigate the performance of 3D point cloud descriptors, we conduct several experiments to evaluate and compare the selected state-of-the-art descriptors in terms of descriptiveness and efficiency.
3.1 Datasets
The publicly available dataset we chose to evaluate the local and global descriptors is the Washington RGB-D Objects Dataset Lai2011A . This dataset contains 207,621 3D point clouds (in PCD format) of views of 300 objects, which are classified into 51 categories.
3.2 Selected Methods
To carry out experiments for performance evaluation, we choose 13 different descriptors (8 local-based and 5 global-based) that have a relatively high citation rate and state-of-the-art performance and are commonly used as baseline algorithms. Section 2 has already summarized these descriptors: SI, 3DSC, PFH, PFHColor, FPFH, SHOT, SHOTColor and USC (local), and GFPFH, VFH, CVFH, OURCVFH and ESF (global). All descriptors were implemented in C++, and all experiments were conducted on a PC with an Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz and 16GB of memory.
3.3 Implementation details
The diagram of the recognition pipeline based on local descriptors that we used in our experiments is illustrated in Figure 11 , including the training and test phases. As the learning strategy, we adopt a classic supervised learning algorithm, the Support Vector Machine (SVM), to process the point cloud dataset. In the case of global-based object recognition, we remove the 3D keypoint extraction from the pipeline.
3.4 Recognition Accuracy
Object recognition accuracy can be used to reflect the descriptiveness of these descriptors. The basic pipeline in object recognition is to match local/global descriptors between two models. Since several descriptors are based on normals, we resort to Principal Component Analysis to estimate the surface normals. For local descriptors, to make a fair comparison, the same keypoint extraction algorithm (SIFT in our experiments) is adopted to select feature points. Furthermore, another important parameter, the support region size, which determines the neighbors around a feature point that are used to compute the local descriptor, is set to the same value (5 cm in our experiments). For global approaches, we use these methods to extract the descriptor directly from the entire object point cloud without any preprocessing.
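The PCA-based normal estimation mentioned above can be illustrated with a short Python sketch: the normal of a local neighborhood is taken as the eigenvector of its covariance matrix with the smallest eigenvalue. This is a generic illustration rather than the code used in our experiments; the function name and the viewpoint-based sign convention are assumptions.

```python
import numpy as np

def estimate_normal(neighbors, viewpoint=(0.0, 0.0, 0.0)):
    """Estimate a unit surface normal from an (N, 3) neighborhood via PCA."""
    neighbors = np.asarray(neighbors, dtype=float)
    centroid = neighbors.mean(axis=0)
    centered = neighbors - centroid
    # eigh returns eigenvalues in ascending order; the first eigenvector is the
    # direction of least variance, i.e. the normal of the best-fitting plane.
    _, eigvecs = np.linalg.eigh(centered.T @ centered)
    normal = eigvecs[:, 0]
    # PCA leaves the sign ambiguous; orient the normal towards the viewpoint.
    if np.dot(normal, np.asarray(viewpoint) - centroid) < 0.0:
        normal = -normal
    return normal
```

For points lying on the plane z = 1 and a viewpoint at the origin, the estimated normal points along the negative z-axis, towards the sensor.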
Table 2 presents the dimensionality and recognition accuracy of the chosen local and global descriptors. From the results, we can draw the following observations. First, among local descriptors, SHOTColor and PFHColor work surprisingly well and are the best-performing methods on this dataset, outperforming the original SHOT and PFH, respectively. It can be concluded that fusing color information helps to improve the recognition accuracy. Second, the next best local descriptor is PFH, followed by FPFH and SHOT, which produce the same result. Third, among global descriptors, ESF achieves the highest accuracy, followed by VFH. OURCVFH shows moderate results, which are much better than those of CVFH and GFPFH.
Descriptors  Type  Size  Recognition Accuracy (%) 

SI  Local  153  36.7 
3DSC  Local  1980  21.7 
PFH  Local  125  53.3 
PFHColor  Local  250  55.0 
FPFH  Local  33  51.7 
SHOT  Local  352  51.7 
SHOTColor  Local  1344  58.3 
USC  Local  1960  38.3 
GFPFH  Global  16  43.3 
VFH  Global  308  76.7 
CVFH  Global  308  53.3 
OURCVFH  Global  308  60.0 
ESF  Global  640  78.3 
3.5 Efficiency
Computational efficiency of feature extraction is another crucial performance measure for evaluating 3D point cloud descriptors. For local-based methods, the feature construction relies heavily on local surface information, so the number of points within the support region directly affects the running time of descriptor generation. We therefore compute, for each local-based approach, the average time over several point cloud models (extracting 100 descriptors from each model) with respect to the size of the support region. For global-based strategies, the number of points in the object point cloud directly affects the computational efficiency, so we calculate the average time needed to form the global descriptor of an entire model over different models.
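The averaging procedure described above amounts to a simple timing loop, sketched here in Python for illustration. The names `extract` and `models` are hypothetical placeholders for a descriptor extraction routine and a list of point cloud models; the sketch does not reproduce our C++ measurement code.

```python
import time

def average_extraction_time_ms(extract, models, repeats=1):
    """Average wall-clock time (in ms) of extract(model) over all models."""
    total = 0.0
    for model in models:
        for _ in range(repeats):
            start = time.perf_counter()
            extract(model)                       # descriptor computation under test
            total += time.perf_counter() - start
    return 1000.0 * total / (len(models) * repeats)
```

For local descriptors, the same loop would be repeated once per support radius so that the running time can be plotted against the radius.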
Several experiments were carried out to evaluate the computational efficiency of the selected local and global descriptors. Figure 12 presents the running time of the local descriptors as the radius increases, as in Buch2016Local . It can be clearly seen that, first, SI is the most efficient descriptor. In contrast, PFHColor and PFH become the most computationally expensive descriptors as the radius increases, although they are faster than 3DSC and USC when the radius is 0.05. Second, FPFH, SHOT and SHOTColor, which achieve almost the same running time, are slightly slower than SI but outperform the other descriptors significantly. Third, 3DSC and USC are also relatively time-consuming to build, although they outperform PFHColor and PFH remarkably when the radius is greater than 0.07.
Table 3 gives the average computation time required to extract a global descriptor over the selected point cloud models using different global methods. It is worth pointing out that VFH achieves the best computational performance: it is more than one order of magnitude faster than CVFH and OURCVFH (roughly 47 times) and nearly three orders of magnitude faster than GFPFH (roughly 990 times).
Overall, SI and VFH perform best among the local-based and global-based descriptors, respectively, and are particularly suitable for real-time applications, while PFHColor and GFPFH incur extremely high computational cost to generate the local and global descriptors, respectively.
Global-based descriptors  Computation time (ms) 

GFPFH  110,157 
VFH  111 
CVFH  5,169 
OURCVFH  5,209 
ESF  329 
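The relative speed of the global descriptors follows directly from the computation times listed in the table above. The short Python sketch below computes how many times slower each method is than VFH, together with the corresponding order of magnitude; the variable and function names are chosen only for this illustration.

```python
import math

# Average computation times in milliseconds, copied from the table above.
times_ms = {"GFPFH": 110157, "VFH": 111, "CVFH": 5169, "OURCVFH": 5209, "ESF": 329}

def slowdown_vs(reference, times):
    """Return how many times slower each method is than the reference method."""
    ref = times[reference]
    return {name: t / ref for name, t in times.items() if name != reference}

ratios = slowdown_vs("VFH", times_ms)
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    # log10 of the ratio gives the gap in orders of magnitude.
    print(f"{name}: {ratio:.1f}x slower than VFH (10^{math.log10(ratio):.2f})")
```

ESF is about 3 times slower than VFH, CVFH and OURCVFH about 47 times, and GFPFH nearly 1000 times.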
3.6 Discussion
From the experimental results on recognition accuracy and computational efficiency described above, we summarize the following points.
First, the SHOTColor descriptor produces relatively satisfactory accuracy at the cost of a higher computation time. In contrast, the SI method can be considered a better choice for time-critical systems that do not emphasize descriptiveness.
Second, the global-based descriptors ESF and VFH and the local-based descriptor SHOTColor provide a good balance between accuracy and running efficiency. These three descriptors are suitable for real-time object recognition applications, which can make full use of their advantages (expressiveness and high efficiency).
4 Conclusion
This paper concentrates on research concerning 3D feature descriptors. We summarize the main characteristics of the existing 3D point cloud descriptors, categorized into three types (local-based, global-based and hybrid-based descriptors), and make a comprehensive evaluation and comparison of several descriptors selected for their popularity and state-of-the-art performance. Although the rapid development of 3D point cloud descriptors has already produced promising results in many applications, designing a more powerful descriptor remains a challenging research area due to complex environments, the presence of occlusion and clutter, and other nuisances.
References
 (1) R. B. Rusu, S. Cousins, 3d is here: Point cloud library (pcl), in: IEEE International Conference on Robotics and Automation, 2011, pp. 1–4.
 (2) X. F. Han, J. S. Jin, M. J. Wang, W. Jiang, L. Gao, L. Xiao, A review of algorithms for filtering the 3d point cloud, Signal Processing Image Communication 57 (2017) 103–112.
 (3) Y. Guo, F. Sohel, M. Bennamoun, M. Lu, J. Wan, Rotational projection statistics for 3d local surface description and object recognition, International Journal of Computer Vision 105 (1) (2013) 63–86.
 (4) I. Hadji, G. N. Desouza, Local-to-global signature descriptor for 3d object recognition, in: Asian Conference on Computer Vision, 2014, pp. 570–584.
 (5) A. Aldoma, Z. C. Marton, F. Tombari, W. Wohlkinger, C. Potthast, B. Zeisl, R. B. Rusu, S. Gedikli, M. Vincze, Tutorial: Point cloud library: Three-dimensional object recognition and 6 dof pose estimation, IEEE Robotics and Automation Magazine 19 (3) (2012) 80–91.
 (6) L. A. Alexandre, 3d descriptors for object and category recognition: a comparative evaluation, 2012.
 (7) S. Salti, A. Petrelli, F. Tombari, On the affinity between 3d detectors and descriptors, in: 3DIMPVT, 2012, pp. 424–431.
 (8) C. M. Mateo, P. Gil, F. Torres, A performance evaluation of surface normals-based descriptors for recognition of objects using cad-models, in: International Conference on Informatics in Control, Automation and Robotics, 2014, pp. 428–435.
 (9) Y. Guo, M. Bennamoun, F. Sohel, M. Lu, J. Wan, N. M. Kwok, A comprehensive performance evaluation of 3d local feature descriptors, International Journal of Computer Vision 116 (1) (2016) 66–89.
 (10) A. E. Johnson, M. Hebert, Surface matching for object recognition in complex 3d scenes, Image and Vision Computing 16 (9–10) (1998) 635–651.
 (11) A. E. Johnson, M. Hebert, Using spin images for efficient object recognition in cluttered 3d scenes, IEEE Transactions on Pattern Analysis and Machine Intelligence 21 (5) (1999) 433–449. doi:10.1109/34.765655.
 (12) B. Matei, Y. Shan, H. S. Sawhney, Y. Tan, R. Kumar, D. Huber, M. Hebert, Rapid object indexing using locality sensitive hashing and joint 3d-signature space estimation, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (7) (2006) 1111–1126.
 (13) Y. Shan, H. S. Sawhney, B. Matei, R. Kumar, Shapeme histogram projection and matching for partial object recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (4) (2006) 568–577. doi:10.1109/TPAMI.2006.83.
 (14) A. Golovinskiy, V. G. Kim, T. Funkhouser, Shape-based recognition of 3d point clouds in urban environments, in: 2009 IEEE 12th International Conference on Computer Vision, 2009, pp. 2154–2161.
 (15) A. Frome, D. Huber, R. Kolluri, T. Bülow, J. Malik, Recognizing Objects in Range Data Using Regional Point Descriptors, Springer Berlin Heidelberg, Berlin, Heidelberg, 2004, pp. 224–237.
 (16) N. Vandapel, D. F. Huber, A. Kapuria, M. Hebert, Natural terrain classification using 3d ladar data, in: Robotics and Automation, 2004. Proceedings. ICRA ’04. 2004 IEEE International Conference on, Vol. 5, 2004, pp. 5117–5122. doi:10.1109/ROBOT.2004.1302529.
 (17) A. Anand, H. S. Koppula, T. Joachims, A. Saxena, Contextually guided semantic labeling and search for threedimensional point clouds, International Journal of Robotics Research 32 (1) (2013) 19–34.
 (18) O. Kahler, I. Reid, Efficient 3d scene labeling using fields of trees, in: IEEE International Conference on Computer Vision, 2013, pp. 3064–3071.
 (19) A. Zelener, P. Mordohai, I. Stamos, Classification of vehicle parts in unstructured 3d point clouds, in: International Conference on 3d Vision, 2015, pp. 147–154.

 (20) D. Anguelov, B. Taskar, V. Chatalbashev, D. Koller, D. Gupta, G. Heitz, A. Ng, Discriminative learning of markov random fields for segmentation of 3d scan data, in: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Vol. 2, 2005, pp. 169–176. doi:10.1109/CVPR.2005.133.
 (21) R. Triebel, K. Kersting, W. Burgard, Robust 3d scan point classification using associative markov networks, in: Proceedings 2006 IEEE International Conference on Robotics and Automation, 2006. ICRA 2006., 2006, pp. 2603–2608. doi:10.1109/ROBOT.2006.1642094.
 (22) J. Behley, V. Steinhage, A. B. Cremers, Performance of histogram descriptors for the classification of 3d laser range data in urban environments, in: 2012 IEEE International Conference on Robotics and Automation, 2012, pp. 4391–4398. doi:10.1109/ICRA.2012.6225003.
 (23) X. Ge, Non-rigid registration of 3d point clouds under isometric deformation, ISPRS Journal of Photogrammetry and Remote Sensing 121 (2016) 192–202.
 (24) A. Flint, A. Dick, A. V. D. Hengel, Thrift: Local 3d structure recognition, in: Digital Image Computing Techniques and Applications, Biennial Conference of the Australian Pattern Recognition Society on, 2007, pp. 182–188.
 (25) R. B. Rusu, Z. C. Marton, N. Blodow, M. Beetz, Persistent point feature histograms for 3d point clouds, in: Proceedings of the 10th International Conference on Intelligent Autonomous Systems (IAS-10), 2008.
 (26) R. B. Rusu, N. Blodow, Z. C. Marton, M. Beetz, Aligning point cloud views using persistent feature histograms, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2008, pp. 3384–3391.
 (27) R. B. Rusu, N. Blodow, M. Beetz, Fast point feature histograms (fpfh) for 3d registration, in: IEEE International Conference on Robotics and Automation, 2009, pp. 1848–1853.
 (28) R. B. Rusu, A. Holzbach, M. Beetz, G. Bradski, Detecting and segmenting objects for mobile manipulation, in: IEEE International Conference on Computer Vision Workshops, 2009, pp. 47–54.
 (29) J. Huang, S. You, Detecting objects in scene point cloud: A combinational approach, in: 2013 International Conference on 3D Vision - 3DV 2013, 2013, pp. 175–182. doi:10.1109/3DV.2013.31.
 (30) Z. C. Marton, D. Pangercic, N. Blodow, J. Kleinehellefort, M. Beetz, General 3d modelling of novel objects from a single view, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010, pp. 3700–3705.
 (31) B. Steder, R. B. Rusu, K. Konolige, W. Burgard, Narf: 3d range image features for object recognition.
 (32) F. Tombari, S. Salti, L. D. Stefano, Unique signatures of histograms for local surface description, in: European Conference on Computer Vision, 2010, pp. 356–369.
 (33) F. Tombari, S. Salti, L. D. Stefano, A combined texture-shape descriptor for enhanced 3d feature matching 263 (4) (2011) 809–812.
 (34) R. Beserra Gomes, B. Marques, Ferreira Da Silva, L. K. D. M. Rocha, R. V. Aroca, Gon, L. M. G. Alves, Efficient 3d object recognition using foveated point clouds, Computers and Graphics 37 (5) (2013) 496–508.
 (35) F. Tombari, S. Salti, L. D. Stefano, Unique shape context for 3d data description, in: Proceedings of the ACM workshop on 3D object retrieval, 2010, pp. 57–62.
 (36) F. Tombari, S. Salti, L. D. Stefano, Unique signatures of histograms for local surface description., Lecture Notes in Computer Science 6313 (2010) 356–369.
 (37) L. Bo, X. Ren, D. Fox, Depth kernel descriptors for object recognition 32 (14) (2011) 821–826.
 (38) D. Fehr, A. Cherian, R. Sivalingam, S. Nickolay, Compact covariance descriptors in 3d point clouds for object recognition, in: IEEE International Conference on Robotics and Automation, 2012, pp. 1793–1798.
 (39) D. Fehr, W. J. Beksi, D. Zermas, N. Papanikolopoulos, Rgbd object classification using covariance descriptors, in: IEEE International Conference on Robotics and Automation, 2014, pp. 5467–5472.
 (40) W. J. Beksi, N. Papanikolopoulos, Object classification using dictionary learning and rgbd covariance descriptors, in: 2015 IEEE International Conference on Robotics and Automation (ICRA), 2015, pp. 1880–1885. doi:10.1109/ICRA.2015.7139443.
 (41) T. Fiolka, J. Stuckler, D. A. Klein, D. Schulz, S. Behnke, Distinctive 3d surface entropy features for place recognition, in: European Conference on Mobile Robots, 2014, pp. 204–209.
 (42) T. Fiolka, J. Stückler, D. A. Klein, D. Schulz, S. Behnke, Sure: Surface entropy for distinctive 3d features, in: International Conference on Spatial Cognition, 2012, pp. 74–93.
 (43) J. Huang, S. You, Point cloud matching based on 3d self-similarity, in: 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012, pp. 41–48. doi:10.1109/CVPRW.2012.6238913.
 (44) P. Cirujeda, X. Mateo, Y. Dicente, X. Binefa, Mcov: A covariance descriptor for fusion of texture and shape features in 3d point clouds, in: International Conference on 3d Vision, 2014, pp. 551–558.
 (45) H. Rahmani, A. Mahmood, Q. H. Du, A. Mian, Hopc: Histogram of oriented principal components of 3d pointclouds for action recognition, in: European Conference on Computer Vision, 2014, pp. 742–757.
 (46) G. Zhao, J. Yuan, K. Dang, Height gradient histogram (high) for 3d scene labeling, in: International Conference on 3d Vision, 2014, pp. 569–576.
 (47) T. B. Jørgensen, A. G. Buch, D. Kraft, Geometric edge description and classification in point cloud data with application to 3d object recognition.
 (48) S. M. Prakhya, B. Liu, W. Lin, Bshot: A binary feature descriptor for fast and efficient keypoint matching on 3d point clouds, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015, pp. 1929–1934.
 (49) X. Li, K. Wu, Y. Liu, R. Ranasinghe, G. Dissanayake, R. Xiong, Risas: A novel rotation, illumination, scale invariant appearance and shape feature.
 (50) K. B. Logoglu, S. Kalkan, A. Temizel, Cospair: Colored histograms of spatial concentric surflet-pairs for 3d object recognition, Robotics and Autonomous Systems 75 (part B) (2016) 558–570.
 (51) E. Wahl, U. Hillenbrand, G. Hirzinger, Surflet-pair-relation histograms: A statistical 3d-shape representation for rapid classification, in: International Conference on 3D Digital Imaging and Modeling, 2003. 3dim 2003. Proceedings, 2003, pp. 474–481.
 (52) K. Tang, P. Song, X. Chen, Signature of geometric centroids for 3d local shape description and partial shape matching, in: Asian Conference on Computer Vision, 2016, pp. 311–326.
 (53) J. Yang, Z. Cao, Q. Zhang, A fast and robust local descriptor for 3d point cloud registration, Information Sciences 346–347 (2016) 163–179.
 (54) S. M. Prakhya, J. Lin, V. Chandrasekhar, W. Lin, B. Liu, 3dhopd: A fast low-dimensional 3d descriptor, IEEE Robotics and Automation Letters 2 (3) (2017) 1472–1479.
 (55) B. Drost, M. Ulrich, N. Navab, S. Ilic, Model globally, match locally: Efficient and robust 3d object recognition, in: Computer Vision and Pattern Recognition, 2010, pp. 998–1005.
 (56) Z. C. Marton, D. Pangercic, N. Blodow, M. Beetz, Combined 2d–3d categorization and classification for multimodal perception systems, International Journal of Robotics Research 30 (11) (2011) 1378–1402.
 (57) M. Muja, R. B. Rusu, G. Bradski, D. G. Lowe, Rein - a fast, robust, scalable recognition infrastructure (2011) 2939–2946.
 (58) R. B. Rusu, G. Bradski, R. Thibaux, J. Hsu, Fast 3d recognition and pose using the viewpoint feature histogram, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014, pp. 2155–2162.
 (59) H. Ali, F. Shafait, E. Giannakidou, A. Vakali, N. Figueroa, T. Varvadoukas, N. Mavridis, Contextual object category recognition for rgb-d scene labeling, Robotics and Autonomous Systems 62 (2) (2014) 241–256.
 (60) K. C. Chan, C. K. Koh, C. S. G. Lee, A 3d-point-cloud system for human-pose estimation, IEEE Transactions on Systems Man and Cybernetics Systems 44 (11) (2014) 1486–1497.
 (61) A. Aldoma, M. Vincze, N. Blodow, D. Gossow, S. Gedikli, R. B. Rusu, G. Bradski, Cad-model recognition and 6dof pose estimation using 3d cues, in: IEEE International Conference on Computer Vision Workshops, 2012, pp. 585–592.
 (62) A. Aldoma, F. Tombari, R. B. Rusu, M. Vincze, Our-cvfh – oriented, unique and repeatable clustered viewpoint feature histogram for object recognition and 6dof pose estimation, in: Joint DAGM, 2012, pp. 113–122.
 (63) M. Madry, C. H. Ek, R. Detry, K. Hang, Improving generalization for 3d object categorization with global structure histograms, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012, pp. 1379–1386.
 (64) W. Wohlkinger, M. Vincze, Shape distributions on voxel surfaces for 3d object classification from depth images, in: IEEE International Conference on Signal and Image Processing Applications, 2012, pp. 115–120.
 (65) W. Wohlkinger, M. Vincze, Ensemble of shape functions for 3d object classification, in: IEEE International Conference on Robotics and Biomimetics, 2012, pp. 2987–2992.
 (66) C. Y. Ip, D. Lapadat, L. Sieger, W. C. Regli, Using shape distributions to compare solid models, in: ACM Symposium on Solid Modeling and Applications, 2002, pp. 273–280.
 (67) T. Chen, B. Dai, D. Liu, J. Song, Performance of global descriptors for velodyne-based urban object recognition, IEEE, 2014.
 (68) J. Cheng, Z. Xiang, T. Cao, J. Liu, Robust vehicle detection using 3d lidar under complex urban environment, in: IEEE International Conference on Robotics and Automation, 2014, pp. 691–696.
 (69) S. H. Kasaei, A. M. Tomé, L. S. Lopes, M. Oliveira, Good: A global orthographic object descriptor for 3d object recognition and manipulation, Pattern Recognition Letters 83 (2016) 312–320.
 (70) J. P. S. d. M. Lima, V. Teichrieb, An efficient global point cloud descriptor for object recognition and pose estimation, in: 2016 29th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), 2016, pp. 56–63. doi:10.1109/SIBGRAPI.2016.017.
 (71) B. Lin, F. Wang, F. Zhao, Y. Sun, Scale invariant point feature (sipf) for 3d point clouds and 3d multiscale object detection, Neural Computing and Applications. doi:10.1007/s00521-017-2964-1.
 (72) I. V. Alexander Patterson, P. Mordohai, K. Daniilidis, Object Detection from Large-Scale 3D Datasets Using Bottom-Up and Top-Down Descriptors, Springer Berlin Heidelberg, 2008.
 (73) M. Himmelsbach, T. Luettel, H. J. Wuensche, Real-time object classification in 3d point clouds using point feature histograms, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009, pp. 994–1000.
 (74) J. F. Lalonde, N. Vandapel, D. F. Huber, M. Hebert, Natural terrain classification using three‐dimensional ladar data for ground robot mobility, Journal of Field Robotics 23 (10) (2006) 839–861.
 (75) M. Lehtomäki, A. Jaakkola, J. Hyyppä, J. Lampinen, H. Kaartinen, A. Kukko, E. Puttonen, H. Hyyppä, Object classification and recognition from mobile laser scanning point clouds in a road environment, IEEE Transactions on Geoscience and Remote Sensing 54 (2) (2016) 1226–1239.
 (76) Z. Wang, L. Zhang, T. Fang, P. T. Mathiopoulos, X. Tong, H. Qu, Z. Xiao, F. Li, D. Chen, A multiscale and hierarchical feature extraction method for terrestrial laser scanning point cloud classification, IEEE Transactions on Geoscience and Remote Sensing 53 (5) (2015) 2409–2425.
 (77) K. Alhamzi, M. Elmogy, S. Barakat, 3d object recognition based on local and global features using point cloud library 7 (2015) 43–54.
 (78) Y. Zhong, Intrinsic shape signatures: A shape descriptor for 3d object recognition, in: IEEE International Conference on Computer Vision Workshops, 2010, pp. 689–696.
 (79) H. Hwang, S. Hyung, S. Yoon, K. Roh, Robust descriptors for 3d point clouds using geometric and photometric local feature, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012, pp. 4027–4033.
 (80) K. Lai, L. Bo, X. Ren, D. Fox, A large-scale hierarchical multi-view rgb-d object dataset, in: IEEE International Conference on Robotics and Automation, 2011, pp. 1817–1824.
 (81) A. G. Buch, H. G. Petersen, N. Krüger, Local shape feature fusion for improved matching, pose estimation and 3d object recognition, SpringerPlus 5 (1) (2016) 1–33.