
A comprehensive review of 3D point cloud descriptors

The introduction of inexpensive 3D data acquisition devices has made 3D point clouds widely available and popular, which has drawn growing attention to the effective extraction of novel 3D point cloud descriptors for accurate and efficient 3D computer vision tasks. However, developing discriminative and robust feature descriptors from various point clouds remains a challenging task. This paper comprehensively investigates the existing approaches for extracting 3D point cloud descriptors, which are categorized into three major classes: local-based descriptors, global-based descriptors and hybrid-based descriptors. Furthermore, experiments are carried out to present a thorough evaluation of the performance of several state-of-the-art 3D point cloud descriptors widely used in practice, in terms of descriptiveness, robustness and efficiency.


1 Introduction

The availability of low-cost 3D sensors, e.g. Microsoft Kinect and time-of-flight cameras, has driven increasing interest in using three-dimensional point clouds Rusu20113D Han2017A to solve many tasks, such as 3D object recognition, classification, and robot localization and navigation, which are fundamental applications in 3D computer vision and robotics, since 3D point clouds provide important information cues for analyzing objects and environments.

A crucial step in 3D applications is the extraction of 3D descriptors (3D features), which has a significant effect on overall performance. A powerful and discriminative descriptor should capture the geometric structure while remaining invariant to translation, scaling and rotation. How to extract a meaningful 3D descriptor from 3D point clouds, especially in environments with occlusion and clutter, is therefore still a challenging research area worth extensive investigation.

The existing 3D descriptors in the literature can be divided into three main categories: global-based descriptors, local-based descriptors and hybrid-based descriptors. Global descriptors generally estimate a single descriptor vector encoding the whole input 3D object. Their success relies on observing the entire geometry of the object's point cloud, which is somewhat more difficult to achieve. Local descriptors, in contrast, construct features from the geometric information in the local neighborhood of each keypoint, where keypoints are obtained from the point cloud using a keypoint-extraction algorithm. Local descriptors are robust to occlusion and clutter Guo2013Rotational , whereas global descriptors are not. However, local descriptors are sensitive to changes in the neighborhoods around keypoints Hadji2014Local . Hybrid-based descriptors fuse the underlying ideas of local and global descriptors, or combine both kinds of descriptors, to make the most of the advantages of local and global features.

A few review papers on 3D descriptors and related fields have been published Aldoma2012Tutorial Alexandre20123D Salti2012On Mateo2014A Guo2016A . However, these papers cover only a rather limited number of descriptors: most focus only on local descriptors for meshes or depth images, while others concentrate on the performance of a small range of global descriptors for particular applications (e.g. urban object recognition). Overall, no survey paper specifically provides a comprehensive review, analysis and evaluation of 3D point cloud descriptors.

The main contributions of this paper are: i) To the best of our knowledge, this is the first review paper concentrating specifically on 3D point cloud descriptors, including both local-based and global-based descriptors. ii) We give readers a comprehensive introduction to the state-of-the-art descriptors, from early work to recent research (including 31 local-based descriptors, 14 global-based descriptors and 5 hybrid-based descriptors). iii) The main traits of these descriptors are summarized in table form, offering an intuitive understanding. iv) We conduct an experimental performance evaluation of several widely used descriptors.

The remainder of the paper is organized as follows: a comprehensive description of the state-of-the-art 3D point cloud descriptors is presented in Section 2. Section 3 presents the detailed results and analysis of the experimental comparison we carried out to evaluate several descriptors, followed by the conclusion in Section 4.

2 3D point cloud descriptors

3D point cloud descriptors have been extensively studied in many fields, and numerous works in this research area have appeared in recent decades. The main approaches for extracting features from 3D point clouds can be categorized into the following groups: local descriptors, global descriptors and hybrid methods. We list the approaches in each category chronologically by year of publication.

2.1 Local Descriptors

3D local descriptors have been developed to encode the local geometric information around a feature point (e.g. surface normals and curvatures), and they therefore depend directly on the quality and resolution of the point cloud model. Generally, these descriptors can be used in applications such as registration, object recognition and categorization.

2.1.1 Spin Image

Johnson et al. Johnson1998Surface introduced a local shape-based descriptor on 3D point clouds called the spin image. The feature point is represented by its own coordinate $p$ and surface normal $n$. The spin image attributes of each neighboring point $q$ of the feature point are then defined as a pair of distances $(\alpha, \beta)$, where $\beta = n \cdot (q - p)$ and $\alpha = \sqrt{\|q - p\|^2 - \beta^2}$ (shown in Figure 1). Finally, the spin image is generated by accumulating the neighbors of the feature point in discrete 2D bins. This descriptor is robust to occlusion and clutter 765655 , but high levels of noise degrade its performance.

Figure 1: Spin Image.
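As a rough illustration of the accumulation step, the following minimal Python sketch bins the neighbors of an oriented point into a 2D histogram. It assumes the usual spin-image coordinates ($\beta$ is the signed distance along the normal, $\alpha$ the distance from the normal line), a unit-length normal, and plain counting without interpolation. The function name `spin_image` and the parameters `bin_size` and `image_width` are hypothetical and are not taken from the original paper.

```python
import math

def spin_image(p, n, neighbors, bin_size, image_width):
    """Accumulate neighbors of the oriented point (p, n) into a 2D histogram.

    Rows index beta (shifted so that beta = 0 falls in the middle row),
    columns index alpha. The normal n is assumed to have unit length.
    """
    image = [[0] * image_width for _ in range(image_width)]
    half_height = image_width * bin_size / 2.0
    for q in neighbors:
        d = [q[i] - p[i] for i in range(3)]
        beta = sum(n[i] * d[i] for i in range(3))        # signed distance along n
        alpha_sq = sum(c * c for c in d) - beta * beta
        alpha = math.sqrt(max(alpha_sq, 0.0))            # distance from the normal line
        col = int(alpha // bin_size)
        row = int((beta + half_height) // bin_size)
        if 0 <= row < image_width and 0 <= col < image_width:
            image[row][col] += 1                         # neighbors outside the image are ignored
    return image

example_image = spin_image(
    p=(0.0, 0.0, 0.0),
    n=(0.0, 0.0, 1.0),
    neighbors=[(0.5, 0.0, 0.5), (0.0, 1.5, -0.5), (3.0, 4.0, 0.0)],
    bin_size=1.0,
    image_width=4,
)
```

In this toy call the third neighbor lies too far from the normal line and is dropped, which mirrors how the support size limits the influence of distant clutter.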

Matei et al. Matei2006 incorporated the spin image descriptor into their work to handle the recognition of 3D point clouds of vehicles which is a challenging problem.

Similarly, Shan et al. Shan2006 used the spin image as the underlying descriptor in their Shapeme Histogram Projection (SHP) approach, which, together with a Bayesian framework, performs partial object recognition by projecting the descriptor of the query model onto the subspace of the model database.

Golovinskiy et al. Golovinskiy2009 applied the shape-based spin image descriptor in conjunction with contextual features to form a discriminative descriptor for identifying urban objects.

2.1.2 3D Shape Context

Frome et al. Frome2004 directly extended the 2D shape context descriptor to 3D point clouds, generating 3D shape contexts (3DSC). A spherical support region, centered on a given feature point $p$, is first determined with its north pole oriented along the surface normal $n$. Within the support, a set of bins (Figure 2) is formed by dividing the azimuth and elevation dimensions equally and spacing the radial dimension logarithmically. The final 3DSC descriptor is then computed as the weighted count of points falling into each bin. This descriptor thus captures the local shape of the point cloud at $p$ using the distribution of points in a spherical support. However, because no reference frame is defined at the feature point, multiple descriptors must be computed for each feature point.

Figure 2: Bins of 3D shape context Frome2004 .

2.1.3 Eigenvalues based Descriptors

Vandapel et al. Vandapel2004 devised eigenvalue-based descriptors to extract saliency features. The eigenvalues are obtained by decomposing the local covariance matrix defined in a local support region around the feature point and are sorted in decreasing order, $\lambda_1 \ge \lambda_2 \ge \lambda_3$. Three saliencies, named point-ness, curve-ness and surface-ness, are constructed as linear combinations of these eigenvalues.

In Anand2013Contextually , Kahler2013Efficient and Zelener2015Classification , these three features are also integrated into the corresponding feature representations to perform object detection in indoor scenes, 3D scene labeling and vehicle detection, respectively.

2.1.4 Distribution Histogram

The Distribution Histogram (DH) by Anguelov et al. Anguelov2005 is computed based on the principal plane around each point. First, PCA is run on the points in a defined cube to obtain the plane spanned by the first two principal components. Then, the cube is divided into several bins oriented along this plane. Finally, the descriptor is formed by counting the points falling into each sub-cube. Experiments show that this feature, combined with a Markov random field model, is applicable to objects in both outdoor and indoor environments.

2.1.5 Histogram of Normal Orientation

Triebel et al. Triebel2006 evaluated a new feature descriptor representing the local distribution of the cosines of the angles between the surface normal at $p$ and the normals of its neighbors. This local distribution is represented as a local histogram. In general, regions with strong curvature result in a uniformly distributed histogram, while flat areas lead to a peaked histogram Behley2012 .
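The contrast between flat and curved regions can be seen in a small Python sketch of such a histogram. It assumes unit-length normals and uniformly spaced bins over the cosine range $[-1, 1]$; the function name `normal_orientation_histogram` and the parameter `num_bins` are hypothetical, and the binning used in the cited work may differ.

```python
def normal_orientation_histogram(n_p, neighbor_normals, num_bins=8):
    """Normalized histogram of cos(angle) between n_p and each neighbor normal."""
    hist = [0] * num_bins
    for n_q in neighbor_normals:
        cosine = sum(a * b for a, b in zip(n_p, n_q))
        cosine = max(-1.0, min(1.0, cosine))              # guard against rounding errors
        # Map the cosine from [-1, 1] to a bin index in [0, num_bins - 1].
        idx = min(int((cosine + 1.0) / 2.0 * num_bins), num_bins - 1)
        hist[idx] += 1
    total = len(neighbor_normals)
    return [h / total for h in hist] if total else hist

# Flat patch: all neighbor normals agree with the normal at p, giving a single peak.
flat = normal_orientation_histogram((0, 0, 1), [(0, 0, 1)] * 3, num_bins=4)

# Strongly curved patch: the normals point in very different directions.
curved = normal_orientation_histogram(
    (0, 0, 1), [(0, 0, 1), (1, 0, 0), (0, 0, -1), (0, 1, 0)], num_bins=4
)
```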

2.1.6 Intrinsic Shape Signatures

The Intrinsic Shape Signatures (ISS) descriptor is computed as follows. In the first step, for a feature point $p$, a local reference frame $(e_1, e_2, e_1 \times e_2)$ is computed, where $e_1$, $e_2$ and $e_3$ are the eigenvectors obtained from the eigen analysis of $p$'s spherical support. In the second step, the spherical angular space $(\theta, \varphi)$, constructed using an octahedron, is partitioned into several bins. The ISS descriptor is a 3D histogram created by accumulating the weighted sum of points in each bin. The ISS descriptor is reported to be stable, repeatable, informative and discriminative.

Ge et al. Ge2016Non used a multilevel ISS approach to extract feature descriptors as an important step of their registration framework.

2.1.7 ThrIFT

Motivated by the success of SIFT and SURF, Flint et al. Flint2007Thrift proposed ThrIFT, which takes orientation information into account. For each keypoint $p$ and each point $q$ in its neighbouring support, two windows $W_{small}$ and $W_{large}$ are computed, and the normals $n_{small}$ and $n_{large}$ of their least-squares planes are estimated (shown in Figure 3). The output descriptor for $p$ is generated by binning the angle between $n_{small}$ and $n_{large}$ into a histogram.

Although the ThrIFT descriptor can yield promising results, it is sensitive to noise.

Figure 3: Example of two planes for two windows and the corresponding normals Flint2007Thrift .

2.1.8 Point Feature Histogram

The Point Feature Histogram (PFH), proposed by Rusu et al. Rusu2008 Rusu2008Aligning , uses the relationships between point pairs in the support region (as shown in Figure 4(a)) and the estimated surface normals to represent the geometric properties. For every pair of points $p_i$ and $p_j$ in the neighborhood of $p$, where one point is chosen as the source $p_s$ and the other as the target $p_t$, a Darboux frame is first constructed at $p_s$ (as shown in Figure 4(b)):

$$u = n_s, \quad v = u \times \frac{p_t - p_s}{\|p_t - p_s\|_2}, \quad w = u \times v$$

Then, using the frame defined above, three angular features $\alpha$, $\phi$ and $\theta$, expressing the difference between the normals $n_s$ and $n_t$, and the distance $d$ are computed for each point pair in the support region:

$$\alpha = v \cdot n_t, \quad \phi = u \cdot \frac{p_t - p_s}{d}, \quad \theta = \arctan(w \cdot n_t, \; u \cdot n_t), \quad d = \|p_t - p_s\|_2$$

Figure 4: Point Feature Histogram. (a) Support region for a query point $p$; (b) Darboux frame Rusu2008 .

The final PFH representation is created by binning these four features into a histogram with $div^4$ bins, where $div$ is the number of subdivisions along each feature's value range.
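To make the pair features concrete, here is a minimal Python sketch for a single source/target pair. It assumes the commonly used Darboux frame $u = n_s$, $v = u \times (p_t - p_s)/d$, $w = u \times v$, with $\alpha = v \cdot n_t$, $\phi = u \cdot (p_t - p_s)/d$ and $\theta = \operatorname{atan2}(w \cdot n_t, u \cdot n_t)$. The helper names are hypothetical, the normals are assumed to have unit length, and degenerate pairs (coincident points, or a connecting line parallel to $n_s$) are not handled.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def pair_features(p_s, n_s, p_t, n_t):
    """Return (alpha, phi, theta, d) for one source/target point pair."""
    diff = sub(p_t, p_s)
    d = math.sqrt(dot(diff, diff))
    direction = tuple(c / d for c in diff)   # unit vector from source to target
    u = n_s                                  # first axis: the source normal
    v = cross(u, direction)
    w = cross(u, v)
    alpha = dot(v, n_t)
    phi = dot(u, direction)
    theta = math.atan2(dot(w, n_t), dot(u, n_t))
    return alpha, phi, theta, d

alpha, phi, theta, d = pair_features((0, 0, 0), (0, 0, 1), (1, 0, 0), (1, 0, 0))
```

A full PFH would evaluate this function for every pair in the support region and increment one of the histogram bins per pair.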

2.1.9 Fast Point Feature Histogram

For PFH as described above, the computational complexity for a point cloud with $n$ points is $O(nk^2)$, where $k$ is the number of neighbors of a query point, which makes PFH unsuitable for real-time applications. To reduce the computational complexity, Rusu et al. Rusu2009Fast Rusu2009Detecting introduced the Fast Point Feature Histogram (FPFH), which simplifies the PFH descriptor. The FPFH is computed in two steps. The first step is the construction of the Simplified Point Feature Histogram (SPFH). Three angular features $\alpha$, $\phi$ and $\theta$ between each query point and its neighbors (in PFH, these angles are computed for all point pairs in the support region) are computed in the same way as in PFH and binned into three separate histograms. These histograms are then concatenated to generate the SPFH. In the second step, the final FPFH of a query point $p$ is calculated as the sum of the SPFH of $p$ and the weighted sum of the SPFHs of the points $p_i$ among its $k$ neighbors:

$$FPFH(p) = SPFH(p) + \frac{1}{k} \sum_{i=1}^{k} \frac{1}{w_i} SPFH(p_i)$$

where $w_i$ denotes the distance between the query point and the neighbor $p_i$ in the support region (as shown in Figure 5). The FPFH descriptor reduces the computational complexity to $O(nk)$.

Figure 5: Fast Point Feature Histogram. Support region for a query point $p$ Rusu2009Fast .
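The second step can be sketched in a few lines of Python. The sketch assumes the weighting $FPFH(p) = SPFH(p) + \frac{1}{k}\sum_i \frac{1}{w_i} SPFH(p_i)$, with $w_i$ the distance to neighbor $p_i$, and that the SPFH histograms have already been computed. The function name `fpfh` and its arguments are hypothetical.

```python
def fpfh(spfh_query, neighbor_spfhs, neighbor_distances):
    """Combine the SPFH of a query point with the SPFHs of its k neighbors.

    Each neighbor histogram is weighted by 1 / w_i, where w_i is the (non-zero)
    distance between the query point and that neighbor, and averaged over k.
    """
    k = len(neighbor_spfhs)
    result = [float(v) for v in spfh_query]
    for hist, w in zip(neighbor_spfhs, neighbor_distances):
        for i, value in enumerate(hist):
            result[i] += value / (w * k)
    return result

combined = fpfh(
    spfh_query=[1, 0, 2],
    neighbor_spfhs=[[2, 2, 0], [0, 4, 4]],
    neighbor_distances=[1.0, 2.0],
)
```

Because each point's SPFH is computed once and then reused by all of its neighbors, the cost grows linearly in $k$ rather than quadratically.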

Huang et al. Huang2013 combined the FPFH descriptor with an SVM (Support Vector Machine) learning algorithm to detect objects in scene point clouds, achieving effective results for cluttered industrial scenes.

2.1.10 Radius-based Surface Descriptor

The Radius-based Surface Descriptor (RSD) Marton2010General describes the geometric properties of a point by estimating its radial relation with its neighbouring points. The radius $r$ is modeled through the relation between the distance $d$ of two points and the angle $\alpha$ between the normals at these two points:

$$d(\alpha) = \sqrt{2}\, r \sqrt{1 - \cos(\alpha)}$$

By solving this equation, the maximum and minimum radii are obtained to construct the final descriptor for each point, $d = [r_{max}, r_{min}]$. The advantage of this method is that it is simple yet descriptive.
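The relation can be inverted per point pair: for two points on a circle of radius $r$ whose normals differ by an angle $\alpha$, the chord length is $d = \sqrt{2}\, r \sqrt{1 - \cos\alpha}$. The Python sketch below uses this to obtain one radius estimate per pair and keeps the largest and smallest. This is a simplification made for illustration, not the estimation procedure of the cited work, and the names `radius_from_pair` and `rsd` are hypothetical.

```python
import math

def radius_from_pair(distance, angle):
    """Radius of the circle on which two points with the given normal angle lie."""
    return distance / math.sqrt(2.0 * (1.0 - math.cos(angle)))

def rsd(pairs, min_angle=1e-9):
    """Return [r_max, r_min] from (distance, angle) pairs of a point and its neighbors.

    Pairs with (almost) parallel normals carry no curvature information and are
    skipped; if none remain, the surface is treated as planar (infinite radius).
    """
    radii = [radius_from_pair(d, a) for d, a in pairs if a > min_angle]
    if not radii:
        return [math.inf, math.inf]
    return [max(radii), min(radii)]

# Chords of length 2 and 5 with a 60 degree normal angle lie on circles of radius 2 and 5.
descriptor = rsd([(2.0, math.pi / 3), (5.0, math.pi / 3), (1.0, 0.0)])
```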

2.1.11 Normal Aligned Radial Feature

Given a point cloud, the Normal Aligned Radial Feature (NARF) descriptor Radu2010NARF is estimated in the following steps. A normal-aligned range value patch around the feature point is computed by constructing a local coordinate system. A star-shaped pattern is then projected onto this patch to form the final descriptor. A rotationally invariant version of the NARF descriptor is obtained by shifting the descriptor according to a unique orientation extracted from the original NARF descriptor. This approach is reported to improve feature matching results.

2.1.12 Signature of Histogram of Orientation

The Signature of Histogram of Orientation (SHOT) descriptor, presented by Tombari et al. Tombari2010Unique , can be considered a combination of signatures and histograms. First, a repeatable and unique local reference frame (LRF) is computed for the feature point based on a disambiguated eigenvalue decomposition (EVD) of the covariance matrix of the points within the support region. Then, an isotropic spherical grid is used to define the signature structure, which partitions the neighborhood along the radial, azimuth and elevation axes (as shown in Figure 6). For each grid sector, a local histogram is obtained by accumulating point counts into bins according to the angle between the normal at each neighboring point within the sector and the normal at the feature point. The final SHOT descriptor is formed by juxtaposing all these local histograms.

Figure 6: Signature structure for SHOT Tombari2010Unique .

To improve the accuracy of feature matching, Tombari et al. Tombari2011A extended SHOT with texture information (in the CIELab color space) to form a color version, i.e. SHOTCOLOR or CSHOT. Experimental results show that it offers a good balance between recognition accuracy and time complexity.

Gomes et al. Beserra2013Efficient applied the SHOT descriptor to foveated point clouds in their object recognition system to reduce the processing time, and their experiments show attractive performance.

2.1.13 Unique Shape Context

Unique Shape Context (USC) Tombari2010Uniqueshape can be considered an improvement of the 3D Shape Context descriptor that adds a unique, unambiguous local reference frame in order to avoid computing multiple descriptors at each keypoint. Given a query point $p$ and its spherical support region with radius $R$, a weighted covariance matrix is defined as

$$M = \frac{1}{Z} \sum_{i: d_i \le R} (R - d_i)(p_i - p)(p_i - p)^T$$

where $d_i = \|p_i - p\|_2$ and $Z = \sum_{i: d_i \le R} (R - d_i)$. The three unit vectors of the LRF are computed from the eigenvector decomposition of $M$. The eigenvectors corresponding to the maximum and minimum eigenvalues are re-oriented to match the majority of the vectors they represent, while the sign of the third eigenvector is determined by the cross product. The detailed definition of this LRF is presented in Tombari2010Uniqueorignal . Once the LRF is built, the construction of the USC descriptor follows an approach analogous to that used in 3DSC. In terms of memory cost and efficiency, USC improves notably on 3DSC.
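The distance-weighted covariance matrix underlying this LRF can be sketched as follows. The sketch assumes the weights $R - d_i$ with $d_i = \|p_i - p\|_2$ and normalization by the sum of the weights. It stops before the eigen decomposition and the sign disambiguation, and the function name `weighted_covariance` is hypothetical.

```python
import math

def weighted_covariance(p, neighbors, radius):
    """3x3 covariance of the neighbors around p, weighted by (radius - distance).

    Neighbors farther away than `radius` are ignored, so points near the border
    of the support contribute less than points close to p.
    """
    m = [[0.0] * 3 for _ in range(3)]
    z = 0.0                                   # sum of the weights
    for q in neighbors:
        diff = [q[i] - p[i] for i in range(3)]
        d = math.sqrt(sum(c * c for c in diff))
        if d > radius:
            continue
        weight = radius - d
        z += weight
        for r in range(3):
            for c in range(3):
                m[r][c] += weight * diff[r] * diff[c]
    if z == 0.0:
        return m
    return [[value / z for value in row] for row in m]

M = weighted_covariance((0, 0, 0), [(1, 0, 0), (0, 2, 0), (5, 0, 0)], radius=3.0)
```

The eigenvectors of the resulting matrix would then supply the three axes of the reference frame.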

2.1.14 Depth Kernel descriptor

Motivated by kernel descriptors, Bo et al. Bo2011Depth extended this idea to 3D point clouds to derive five local kernel descriptors that describe size, shape and edges. Experimental results show that these features complement each other and that the resulting method significantly improves the accuracy of object recognition.

2.1.15 Spectral Histogram

Similar to the SHOT descriptor, Behley et al. Behley2012 proposed the Spectral Histogram (SH) descriptor. They first decompose the covariance matrix defined within the support radius to compute the eigenvalues $\lambda_0$, $\lambda_1$, $\lambda_2$, where $\lambda_0 \ge \lambda_1 \ge \lambda_2$. The corresponding normalized eigenvalues are evaluated as $\bar{\lambda}_i = \lambda_i / \lambda_0$. Then three signature values, $\bar{\lambda}_2$, $\bar{\lambda}_1 - \bar{\lambda}_2$ and $\bar{\lambda}_0 - \bar{\lambda}_1$, are calculated. Finally, they subdivide the range of these three values into different sectors (Figure 7) and accumulate the number of points falling into each sector to form the descriptor. This method is reported to be among the best choices for classification in urban environments.

Figure 7: the Spectral Histogram Behley2012 .

2.1.16 Covariance Based Descriptors

Inspired by the core idea of spin images, Fehr et al. Fehr2012Compact exploited covariance-based descriptors for 3D point clouds because of their compactness and their flexibility to combine multiple features for greater representational power. The features chosen to capture the geometric relations, including the normal $n$, are shown in Figure 8. The advantages of this method are low computational and storage requirements, no model parameters to tune, and scalability to various features.

Figure 8: The different features used in the covariance-based descriptor Fehr2012Compact .

To further increase performance, Fehr et al. Fehr2014RGB additionally took the r, g, b color channel values from 'colored' point clouds into consideration, forming a straightforward extension of the original covariance-based descriptors. This approach yields promising results.

Beksi et al. 7139443 computed covariance descriptors that also encapsulate both shape features and visual features over the entire point cloud. The difference from Fehr2014RGB is that they incorporated the principal curvatures and the Gaussian curvature into the shape vector while adding gradient and depth to the visual vector. Together with dictionary learning, this yields a point cloud classification framework for classifying objects.

2.1.17 Surface Entropy

Fiolka et al. Fiolka2014Distinctive Fiolka2012SURE designed a novel local shape-texture descriptor that uses shape and, if available, color information. For the shape descriptor, a local reference frame $(u, v, w)$ is defined using two surfels $(p, n_p)$ and $(q, n_q)$ at the keypoint $p$ and its neighbor $q$. After that, surfel-pair relations containing four features $\alpha$, $\beta$, $\gamma$ and $\delta$ are estimated.

The shape descriptor is constructed by building histograms from the surfel-pair relations, each discretized into several bins. If color information is available at the points, two more histograms are built based on hue and saturation in the HSL color space. The final descriptor is invariant to view pose.

2.1.18 3D Self-similarity Descriptor

Huang et al. Huang2012 developed a descriptor by extending the concept of self-similarity to 3D point clouds. Computing this 3D self-similarity descriptor involves two major steps. In the first step, three similarities between two points $x$ and $y$, named normal similarity, curvature similarity and photometric similarity, are approximated.

These similarities are then combined to define the united similarity.

By comparing the feature point's united similarity to that of its neighbors within its local spherical support, the self-similarity surface is directly constructed. In the second step, a local reference frame is built at the feature point to guarantee rotation invariance, and the correlation space is quantized into cells, each holding the average similarity value of the points falling into it, to form the descriptor (shown in Figure 9). This descriptor can efficiently characterize distinctive geometric signatures in point clouds.

Figure 9: Local reference and quantization Huang2012 .

2.1.19 Geometric and Photometric Local Feature

The Geometric and Photometric Local Feature (GPLF) makes full use of geometric properties and photometric characteristics. Given a point $p$ and its normal $n$, the $k$-nearest neighbors $p_i$ of $p$ are found. With these points, several geometric parameters are derived, two color features are calculated in the HSV color space, and two angular features are defined. The final GPLF is a histogram of size 128 created from these four features.

2.1.20 Mcov

Cirujeda et al. Cirujeda2014MCOV fused visual and 3D shape information to develop the MCOV covariance descriptor. Given a feature point $p$ and its radial neighborhood $N_p$, a feature selection function $\phi(p, p_i)$ is defined for each neighbor $p_i \in N_p$ and represented as the vector $(R, G, B, \alpha, \beta, \gamma)$ (as shown in Figure 10). The first three elements correspond to the R, G, B values at $p_i$ in RGB space and capture the texture information. The last three components are $\langle p_i, (p_i - p) \rangle$, $\langle p, (p_i - p) \rangle$ and $\langle n_i, n \rangle$, respectively. Then, a covariance descriptor at $p$ is calculated by

$$C(p) = \frac{1}{N - 1} \sum_{i=1}^{N} (\phi_i - \mu)(\phi_i - \mu)^T$$

where $\phi_i = \phi(p, p_i)$, $N$ is the number of neighbors and $\mu$ is the mean of the $\phi_i$. Results on test point clouds demonstrate that MCOV, which provides a compact and descriptive representation, boosts discriminative performance dramatically.

Figure 10: The angular measures encoded in MCOV.
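The covariance step itself is independent of which features are stacked into the per-neighbor vectors. The following Python sketch computes a sample covariance matrix from a list of feature vectors. It assumes the unbiased $1/(N-1)$ normalization and at least two neighbors, and the function name `covariance_descriptor` is hypothetical. For MCOV each vector would hold the six colour and angular values described above; two-dimensional vectors are used here only to keep the example small.

```python
def covariance_descriptor(features):
    """Sample covariance matrix of a list of equally sized feature vectors."""
    n = len(features)
    dim = len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for f in features:
        centered = [f[j] - mean[j] for j in range(dim)]
        for a in range(dim):
            for b in range(dim):
                cov[a][b] += centered[a] * centered[b]
    # Normalize by N - 1; this normalization is an assumption of the sketch.
    return [[value / (n - 1) for value in row] for row in cov]

C = covariance_descriptor([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
```

The descriptor size depends only on the number of features, not on the number of neighbors, which is what makes covariance descriptors compact.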

2.1.21 Histogram of Oriented Principal Components

To handle viewpoint variations and the effect of noise, Rahmani et al. Rahmani2014HOPC proposed a 3D point cloud descriptor named Histogram of Oriented Principal Components (HOPC). First, PCA is performed on the support region centered at the keypoint to yield the eigenvectors and corresponding eigenvalues. Then, each eigenvector is projected onto $m$ directions derived from a regular $m$-sided polyhedron and scaled by the corresponding eigenvalue. Finally, the projected eigenvectors are concatenated in decreasing order of eigenvalues to form the HOPC descriptor.

2.1.22 Height Gradient Histogram

The Height Gradient Histogram (HIGH), developed by Zhao et al. Zhao2014Height , takes full advantage of the height dimension, which is first extracted from the 3D point cloud as $f(p) = p_x$ (where $x$ corresponds to height) for each point $p = (p_x, p_y, p_z)$. After that, the linear gradient reconstruction method is used to compute the height gradient $\nabla f(p)$ of point $p$ based on its neighbors. Second, the spherical support of each point $p$ is divided into $K$ sub-regions, and the gradient orientations of the points within each sub-region are encoded to form a histogram. Finally, the HIGH feature descriptor is constructed by concatenating the histograms of all sub-regions. Experimental results show that the HIGH descriptor gives promising performance. However, one major limitation is that it cannot describe small objects well.

2.1.23 Equivalent Circumference Surface Angle Descriptor

The Equivalent Circumference Surface Angle Descriptor (ECSAD) J2015Geometric was developed to address the problem of detecting geometric edges in 3D point clouds. For a feature point $p$, the local support region is divided into several cells along the radial and azimuth axes only. For each cell, an angle is computed by averaging all the angles between the normal $n$ at $p$ and the vector $p_i - p$, where $p_i$ is a point falling into this cell. To handle empty bins, an interpolation strategy is used to estimate their values. The final dimension of ECSAD is 30.

2.1.24 Binary SHOT

To reduce computation time and memory footprint, Prakhya et al. Prakhya2015B were the first to employ a binary 3D feature descriptor, named Binary SHOT (B-SHOT), which is formed by replacing each value of SHOT with either 0 or 1. The construction encodes each successive group of four values taken from a SHOT descriptor into corresponding binary values according to five possibilities defined by the authors. The proposed method requires significantly less memory and is considerably faster than SHOT.

2.1.25 Rotation, illumination, Scale Invariant Appearance and Shape Feature

The rotation, illumination, scale invariant appearance and shape feature (RISAS), presented by Li et al. Li2016RISAS , is a feature vector comprising three statistical histograms that encode the spatial distribution, intensity information and geometric information. For the spatial distribution, a spherical support is divided into $n$ sectors, and the information (position and depth value) in each sector is encoded to form the spatial histogram. For the intensity information, relative rather than absolute intensity is grouped into $n$ bins to construct the intensity histogram. Finally, for the geometric information, the inner product $\langle n_p, n_q \rangle$ between the normal $n_p$ at the feature point $p$ and the normal $n_q$ at its neighbor $q$ is computed, and these values are binned to form the geometric histogram. Experimental results show the effectiveness of the proposed descriptor in point cloud alignment.

2.1.26 Colored Histograms of Spatial Concentric Surflet-Pairs

Similar to the PFH and FPFH descriptors, the Colored Histograms of Spatial Concentric Surflet-Pairs (CoSPAIR) descriptor Logoglu2016CoSPAIR is also based on surflet-pair relations Wahl2003Surflet . For the query point $p$ and a neighbor $q$ within its spherical support, a fixed reference frame and three angles between them are estimated using the same strategy as in PFH. Then, the neighbouring space is partitioned into several spherical shells (called levels) along the radius. For each level, three distinct histograms are generated by accumulating the points in it along the three angular features. The original SPAIR descriptor is the concatenation of the histograms of all levels. Finally, the CIELab color values are binned into histograms for each channel at each level and combined with the SPAIR descriptor to form the final CoSPAIR descriptor. This method is reported to be simple and fast.

2.1.27 Signature of Geometric Centroids

Tang et al. Tang2016Signature generated a new Signature of Geometric Centroids (SGC) descriptor by first constructing a unique LRF centered at the feature point $p$ based on PCA. Its local spherical support $S$ is then aligned with the LRF, and a cubical volume encompassing $S$ with edge length $2R$ is defined and partitioned evenly into $K \times K \times K$ voxels. Finally, the descriptor is constructed by concatenating the number of points within each voxel and the position of the centroid calculated for these points. The dimension of the SGC descriptor is $2 \times K \times K \times K$.
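The voxelization step can be illustrated with a short Python sketch. It assumes the points are already expressed in the LRF with the feature point at the origin, and it returns the point count and centroid of every occupied voxel. It does not reproduce how the published descriptor packs these values into its final vector, and the function name `voxel_signature` is hypothetical.

```python
def voxel_signature(points, radius, k):
    """Map each occupied voxel index to (point count, centroid).

    The cube [-radius, radius]^3 is split evenly into k x k x k voxels;
    points outside the cube are ignored.
    """
    size = 2.0 * radius / k
    counts = {}
    sums = {}
    for point in points:
        if any(abs(c) > radius for c in point):
            continue
        # A coordinate exactly equal to +radius is assigned to the last voxel.
        idx = tuple(min(int((c + radius) / size), k - 1) for c in point)
        counts[idx] = counts.get(idx, 0) + 1
        total = sums.setdefault(idx, [0.0, 0.0, 0.0])
        for axis in range(3):
            total[axis] += point[axis]
    return {idx: (n, tuple(v / n for v in sums[idx])) for idx, n in counts.items()}

signature = voxel_signature(
    [(-0.5, -0.5, -0.5), (-0.25, -0.75, -0.5), (0.5, 0.5, 0.5), (2.0, 0.0, 0.0)],
    radius=1.0,
    k=2,
)
```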

2.1.28 Local Feature Statistics Histograms

Yang et al. Yang2016A exploited the statistical properties of three local shape geometries: local depth, point density and angles between normals. Given a keypoint $p$ from the point cloud and its spherical neighborhood $N$, all neighbors $p_i$ are projected onto a tangent plane estimated along the normal $n$ at $p$ to form new points $p_i'$. The local depth is then computed as $d = r - n \cdot (p_i - p)$. The angular feature is denoted as $\theta = \arccos(\langle n, n_i \rangle)$. Regarding point density, the ratio of points falling into each annulus, generated by equally dividing the circle on the plane, is determined via the horizontal projection distance.

The final descriptor is constructed by concatenating the histograms built on these three features; it has low dimension and low computational complexity and is robust to various nuisances.

2.1.29 3D Histogram of Point Distribution

The 3D Histogram of Point Distribution (3DHoPD) descriptor Prakhya20173DHoPD is formed using the following steps. First, a local reference frame is computed for each keypoint, and the local surface is aligned with the LRF to achieve rotational invariance. Next, along the $x$ axis, a histogram is created by partitioning the range between the smallest and largest $x$ coordinate values of the points on the surface into $D$ bins and accumulating the points falling into each bin. After repeating the same processing along the $y$ and $z$ axes, the 3DHoPD is generated by concatenating these histograms. The advantage of this method is that its computation is very fast.
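The per-axis histogram construction is simple enough to sketch directly in Python. The sketch assumes the points have already been aligned with the LRF, and the function names `axis_histogram` and `histogram_of_point_distribution` are hypothetical. How the published descriptor treats a degenerate axis (all coordinates equal) is not stated in the text, so putting all points into the first bin is a choice made here.

```python
def axis_histogram(values, num_bins):
    """Histogram of the values over [min(values), max(values)] with num_bins bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    hist = [0] * num_bins
    for v in values:
        if width == 0:
            idx = 0                                   # degenerate axis: single bin
        else:
            idx = min(int((v - lo) / width), num_bins - 1)
        hist[idx] += 1
    return hist

def histogram_of_point_distribution(points, num_bins):
    """Concatenate the per-axis histograms along x, y and z."""
    descriptor = []
    for axis in range(3):
        descriptor += axis_histogram([p[axis] for p in points], num_bins)
    return descriptor

hopd = histogram_of_point_distribution(
    [(0, 0, 0), (1, 0, 2), (2, 0, 4), (4, 0, 4)], num_bins=2
)
```

Each point is visited once per axis, which is consistent with the low computation time reported for this descriptor.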

2.2 Global Descriptors

Global descriptors encode the geometric information of the whole 3D point cloud and require relatively little computation time and memory. In general, global descriptors are increasingly used for 3D object recognition, geometric categorization and shape retrieval.

2.2.1 Point Pair Feature

Similar to the surflet-pair feature, given two points $p_1$ and $p_2$ and their normals $n_1$ and $n_2$, the Point Pair Feature (PPF) Drost2010Model is defined as $F(p_1, p_2) = (\|d\|_2, \angle(n_1, d), \angle(n_2, d), \angle(n_1, n_2))$, where $d = p_2 - p_1$ and the range of each angle is $[0, \pi]$. Point pairs with the same discretized feature are grouped together, and the global descriptor is formed by mapping the sampled PPF space to the model.
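A minimal Python sketch of the feature and its discretization is given below. It assumes the four-component form $(\|d\|_2, \angle(n_1, d), \angle(n_2, d), \angle(n_1, n_2))$ with $d = p_2 - p_1$. The function names and the step sizes `dist_step` and `angle_step` are hypothetical, and coincident points are not handled.

```python
import math

def angle(a, b):
    """Angle in [0, pi] between two non-zero 3D vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)

def point_pair_feature(p1, n1, p2, n2):
    """Return (distance, angle(n1, d), angle(n2, d), angle(n1, n2)) with d = p2 - p1."""
    d = tuple(b - a for a, b in zip(p1, p2))
    distance = math.sqrt(sum(c * c for c in d))
    return distance, angle(n1, d), angle(n2, d), angle(n1, n2)

def discretize(feature, dist_step, angle_step):
    """Quantize a feature so that similar point pairs share the same key."""
    return (int(feature[0] // dist_step),) + tuple(
        int(a // angle_step) for a in feature[1:]
    )

feature = point_pair_feature((0, 0, 0), (0, 0, 1), (2, 0, 0), (1, 0, 0))
key = discretize(feature, dist_step=0.5, angle_step=math.radians(40))
```

Keys of this kind can index a hash table that stores, for each key, all model point pairs producing it, which is one way to group pairs with the same discretized feature.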

2.2.2 Global RSD

Global RSD Marton2011Combined can be regarded as a global version of the local RSD descriptor. After voxelizing the input point cloud, the smallest and largest radii are estimated in each voxel, which is then labeled as one of five surface primitives (e.g. planar, sphere) by checking the radii. Once all voxels have been labeled, a global RSD descriptor can be built by analyzing the relations between all labels.

2.2.3 Viewpoint Feature Histogram

The Viewpoint Feature Histogram (VFH) Muja2011REIN Rusu2014Fast extends the idea and properties of FPFH by additionally including viewpoint variance. It is a global descriptor composed of a viewpoint direction component and a surface shape component. The viewpoint component is a histogram of the angles between the central viewpoint direction and each point's surface normal. For the shape component, three angles $\alpha$, $\phi$ and $\theta$, computed as in FPFH, are binned into three distinct sub-histograms, each with 45 bins. VFH turns out to have high recognition performance and a computational complexity of $O(n)$. However, its main drawback is its sensitivity to noise and occlusions.

Ali et al. Ali2014Contextual integrated the VFH descriptor into their proposed system to perform scene labeling. Chan et al. Chan2014A extracted VFH features from human point clouds to perform human pose estimation.

2.2.4 Clustered Viewpoint Feature Histogram

The Clustered Viewpoint Feature Histogram (CVFH) descriptor Aldoma2012CAD can be considered an extension of VFH that exploits stable object regions, obtained by applying a region growing algorithm after removing points with high curvature. Given a region s from the set of regions S, a Darboux frame (u, v, w) similar to that of FPFH is constructed using the centroid p of s and its corresponding normal n. Then the angular information ($\alpha$, $\phi$, $\theta$, $\beta$), as in VFH, is binned into four histograms, followed by the computation of the Shape Distribution Component (SDC) of CVFH. The final CVFH descriptor is the concatenation of the histograms created from ($\alpha$, $\phi$, $\theta$, SDC, $\beta$), with a size of 308. Although CVFH can produce promising results, the lack of an aligned Euclidean space prevents the feature from providing a proper spatial description Aldoma2012OUR.

2.2.5 Oriented, Unique and Repeatable Clustered Viewpoint Feature Histogram

The Oriented, Unique and Repeatable Clustered Viewpoint Feature Histogram (OUR-CVFH) Aldoma2012OUR has the same surface shape components and viewpoint component as CVFH, but the viewpoint component is reduced to 64 bins. In addition, eight histograms of the distances between the points of surface S and its centroid replace the shape distribution component of CVFH; they are built on reference frames obtained with the Semi-Global Unique Reference Frame (SGURF) method. The size of the final OUR-CVFH descriptor is 303.

2.2.6 Global Structure Histograms

The Global Structure Histograms (GSH) descriptor Madry2012Improving encodes the global and structural properties of a 3D point cloud in a three-stage pipeline. Given a point cloud model, a local descriptor is first estimated for each point, and every point is then assigned to one approximate surface class by applying the k-means algorithm to the descriptors, followed by the computation of a Bag-of-Words model. In the second step, the relationships between the different classes are determined along the surface, which is formed by triangulation. The final step constructs the GSH descriptor, which represents the object as a histogram of the distribution of distances along the surface. This descriptor not only maintains low variation but also provides global expressive power.

2.2.7 Shape Distribution on Voxel Surfaces

Based on the concept of shape functions, Shape Distribution on Voxel Surfaces (SDVS) Wohlkinger2012Shape first creates a voxel grid approximating the real surface of the point cloud. With its help, the distances between pairs of points randomly sampled from the point cloud are binned into three histograms according to the location of the connecting line (inside the 3D model, outside it, or a mixture of both). Although the method is simple, it cannot deal well with shapes that are easily confused.
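
The classification of a connecting line with the help of the voxel grid can be sketched as below. This is only an illustrative approximation under several assumptions: the line is sampled uniformly instead of being traced exactly through the grid, a line whose intermediate voxels are all occupied is labeled `IN`, one with no occupied intermediate voxel is labeled `OUT`, and all function names and parameters are hypothetical.

```python
import math
import random


def voxel_of(point, voxel_size):
    return tuple(int(math.floor(c / voxel_size)) for c in point)


def voxelize(points, voxel_size):
    """Set of occupied voxel indices approximating the surface."""
    return {voxel_of(p, voxel_size) for p in points}


def classify_segment(a, b, occupied, voxel_size, steps=32):
    """Label the line a-b as 'IN', 'OUT' or 'MIXED' from the voxels it crosses."""
    ends = {voxel_of(a, voxel_size), voxel_of(b, voxel_size)}
    hits = misses = 0
    for s in range(1, steps):
        t = s / steps
        q = tuple(a[k] + t * (b[k] - a[k]) for k in range(3))
        cell = voxel_of(q, voxel_size)
        if cell in ends:
            continue  # ignore the voxels that hold the two endpoints
        if cell in occupied:
            hits += 1
        else:
            misses += 1
    if misses == 0:
        return "IN"
    return "OUT" if hits == 0 else "MIXED"


def distance_histograms(points, voxel_size, num_pairs, num_bins, seed=0):
    """Bin sampled pair distances into one histogram per line class."""
    rng = random.Random(seed)
    occupied = voxelize(points, voxel_size)
    pairs = [rng.sample(points, 2) for _ in range(num_pairs)]
    max_dist = max(math.dist(a, b) for a, b in pairs) or 1.0
    hists = {label: [0] * num_bins for label in ("IN", "OUT", "MIXED")}
    for a, b in pairs:
        label = classify_segment(a, b, occupied, voxel_size)
        idx = min(int(math.dist(a, b) / max_dist * num_bins), num_bins - 1)
        hists[label][idx] += 1
    return hists


# Hypothetical row of five points, one per unit voxel.
row = [(0.5 + i, 0.5, 0.5) for i in range(5)]
print(classify_segment(row[0], row[4], voxelize(row, 1.0), 1.0))
print(distance_histograms(row, voxel_size=1.0, num_pairs=20, num_bins=4))
```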

2.2.8 Ensemble of Shape Functions

Aiming at real-time applications, Wohlkinger et al. Wohlkinger2012Ensemble introduced a global descriptor for partial point clouds, dubbed the Ensemble of Shape Functions (ESF), which is a concatenation of ten 64-bin histograms of shape distributions: three angle histograms, three area histograms, three distance histograms and one distance ratio histogram. The first nine histograms are created by classifying the A3 (angle formed by three randomly sampled points), D3 (area of the triangle formed by three points) and D2 (line between a point pair sampled from the point cloud) shape functions into ON, OFF and MIXED cases according to the mechanism described in Ip2002Using, while the distance ratio histogram is built on the lines from the D2 shape function. The final ESF descriptor has proven to be efficient and expressive.
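
The three shape functions themselves are simple to compute. The sketch below samples random point triples and builds one normalized 64-bin histogram each for D2, D3 and A3. It deliberately omits the ON, OFF and MIXED classification and the distance ratio histogram, so it yields three histograms rather than the ten of ESF; the function names, the normalization by the largest sampled value and the cube example are assumptions made for illustration.

```python
import math
import random


def d2(a, b):
    """D2: length of the line between two points."""
    return math.dist(a, b)


def d3(a, b, c):
    """D3: area of the triangle spanned by three points."""
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    cross = (u[1] * v[2] - u[2] * v[1], u[2] * v[0] - u[0] * v[2], u[0] * v[1] - u[1] * v[0])
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def a3(a, b, c):
    """A3: angle at a between the lines a-b and a-c, in [0, pi]."""
    u = [b[k] - a[k] for k in range(3)]
    v = [c[k] - a[k] for k in range(3)]
    denom = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
    if denom == 0.0:
        return 0.0  # coincident points: angle undefined, fall back to 0
    cos_angle = sum(x * y for x, y in zip(u, v)) / denom
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def normalized_histogram(values, num_bins, max_value):
    hist = [0.0] * num_bins
    for v in values:
        idx = min(int(v / max_value * num_bins), num_bins - 1) if max_value > 0 else 0
        hist[idx] += 1.0 / len(values)
    return hist


def shape_function_histograms(points, num_samples, num_bins=64, seed=0):
    rng = random.Random(seed)
    triples = [rng.sample(points, 3) for _ in range(num_samples)]
    dists = [d2(a, b) for a, b, _ in triples]
    areas = [d3(a, b, c) for a, b, c in triples]
    angles = [a3(a, b, c) for a, b, c in triples]
    return {
        "D2": normalized_histogram(dists, num_bins, max(dists)),
        "D3": normalized_histogram(areas, num_bins, max(areas)),
        "A3": normalized_histogram(angles, num_bins, math.pi),
    }


# Hypothetical input: the eight corners of a unit cube.
cube = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]
hists = shape_function_histograms(cube, num_samples=200)
print({name: len(h) for name, h in hists.items()})
```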

2.2.9 Global Fourier Histogram

The Global Fourier Histogram (GFH) descriptor Chen2014Performance is generated for an oriented point, which is chosen as the origin O. The normal at O together with the global z-axis (z = [0, 0, 1]) is used to construct the global reference frame. Afterwards, a 3D cylindrical support region centered at O is divided into several bins by equally spacing the elevation, azimuth and radial dimensions. The GFH descriptor is a 3D histogram formed by counting the points in each bin. To further boost robustness, a 1D Fast Fourier Transform is applied to the 3D histogram along the azimuth dimension. This descriptor compensates for the drawbacks of the Spin Image method.
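
The effect of the Fourier step can be illustrated with a small sketch. It assumes that the points are already expressed in the reference frame with the cylinder axis along z, that only the magnitudes of the transform are kept, and it uses a plain discrete Fourier transform in place of an FFT (the result is the same, only slower). All names and bin counts are hypothetical.

```python
import cmath
import math


def cylindrical_histogram(points, radius, height, n_elev, n_azim, n_rad):
    """Count points per (elevation, azimuth, radial) bin of a cylinder whose axis is z."""
    hist = [[[0] * n_rad for _ in range(n_azim)] for _ in range(n_elev)]
    for x, y, z in points:
        r = math.hypot(x, y)
        if r >= radius or abs(z) >= height / 2:
            continue  # outside the support region
        e = min(int((z + height / 2) / height * n_elev), n_elev - 1)
        azimuth = math.atan2(y, x) % (2 * math.pi)
        a = min(int(azimuth / (2 * math.pi) * n_azim), n_azim - 1)
        k = min(int(r / radius * n_rad), n_rad - 1)
        hist[e][a][k] += 1
    return hist


def dft_magnitudes(seq):
    """Magnitudes of the discrete Fourier transform of a real sequence."""
    n = len(seq)
    return [abs(sum(seq[t] * cmath.exp(-2j * math.pi * f * t / n) for t in range(n)))
            for f in range(n)]


def fourier_histogram(hist):
    """Apply the 1D transform along the azimuth dimension of every (elevation, radial) ring."""
    n_elev, n_azim, n_rad = len(hist), len(hist[0]), len(hist[0][0])
    descriptor = []
    for e in range(n_elev):
        for k in range(n_rad):
            ring = [hist[e][a][k] for a in range(n_azim)]
            descriptor.extend(dft_magnitudes(ring))
    return descriptor


# Hypothetical cloud and the same cloud rotated by 90 degrees about the cylinder axis.
cloud = [(1.0, 1.0, 0.1), (1.0, 1.0, 0.2), (-1.0, 1.0, 0.1)]
rotated = [(-y, x, z) for x, y, z in cloud]
desc_a = fourier_histogram(cylindrical_histogram(cloud, 2.0, 1.0, 1, 4, 1))
desc_b = fourier_histogram(cylindrical_histogram(rotated, 2.0, 1.0, 1, 4, 1))
print(desc_a)
print(desc_b)
```

A rotation about the axis by a whole number of azimuth bins only shifts each ring cyclically, and the magnitudes of the transform are unchanged by such a shift, which is one way to see why the Fourier step improves robustness.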

2.2.10 Position-related Shape, Object Height along Length, Reflective Intensity Histogram

To improve the vehicle detection rate, three novel features are extracted together in Cheng2014Robust, namely position-related shape, object height along length and reflective intensity histogram, to robustly distinguish vehicles from other objects. The position-related shape takes both shape features (including the width-to-length and width-to-height ratios) and position information (e.g. distance to the object, angle of view and orientation) into account to address variations in orientation and angle of view. In the next stage, the bounding box around a vehicle is divided into several blocks along its length, and the average height in each block is added to the feature vector to further improve discrimination. Finally, a reflective intensity histogram with 25 bins is computed from the characteristic intensity distribution of vehicles. The experimental results show that the final feature contributes greatly to classification performance.

2.2.11 Global Orthographic Object Descriptor

The Global Orthographic Object Descriptor (GOOD) Kasaei2016GOOD is constructed by first defining a unique and repeatable local reference frame based on Principal Component Analysis. Then, the point cloud model is orthographically projected onto three planes, called XoZ, XoY and YoZ. The XoZ plane is divided into several bins, and a distribution matrix is computed by counting the points falling into each bin; the same processing is performed for XoY and YoZ. Afterwards, two statistical features, namely entropy and variance, are estimated for each distribution vector obtained from the corresponding matrix. Concatenating these vectors forms a single vector for the entire object. Experimental results show that GOOD is scale and pose invariant and offers a trade-off between expressiveness and computational cost.
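
The projection and statistics steps can be sketched as follows. The sketch assumes that the object is already expressed in its local reference frame and fits inside a cube of half-width `half_extent`; it computes the variance over the bin values and simply reports both statistics next to the concatenated vector, since how they are used further is not detailed here. Function names and parameters are hypothetical.

```python
import math

# Coordinate indices of the two axes spanning each projection plane.
PLANES = {"XoZ": (0, 2), "XoY": (0, 1), "YoZ": (1, 2)}


def projection_matrix(points, axes, n_bins, half_extent):
    """Normalized n_bins x n_bins distribution of the points projected onto one plane."""
    matrix = [[0.0] * n_bins for _ in range(n_bins)]
    for p in points:
        idx = []
        for axis in axes:
            i = int((p[axis] + half_extent) / (2 * half_extent) * n_bins)
            idx.append(min(max(i, 0), n_bins - 1))
        matrix[idx[0]][idx[1]] += 1.0 / len(points)
    return matrix


def entropy(vec):
    return -sum(v * math.log2(v) for v in vec if v > 0)


def variance(vec):
    mean = sum(vec) / len(vec)
    return sum((v - mean) ** 2 for v in vec) / len(vec)


def good_like_descriptor(points, n_bins, half_extent):
    """Concatenate the three flattened projections and report their statistics."""
    descriptor, stats = [], {}
    for name, axes in PLANES.items():
        matrix = projection_matrix(points, axes, n_bins, half_extent)
        vec = [v for row in matrix for v in row]
        stats[name] = (entropy(vec), variance(vec))
        descriptor.extend(vec)
    return descriptor, stats


# Hypothetical two-point object in its own reference frame.
obj = [(0.5, 0.5, 0.5), (-0.5, -0.5, -0.5)]
desc, stats = good_like_descriptor(obj, n_bins=2, half_extent=1.0)
print(desc)
print(stats)
```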

2.2.12 Globally Aligned Spatial Distribution

Globally Aligned Spatial Distribution (GASD), a global descriptor proposed by Lima et al. Lima2016, mainly consists of two steps. First, a reference frame for the entire point cloud representing an object is estimated with PCA: the x and z axes are the eigenvectors $v_1$ and $v_3$ corresponding to the minimal and maximal eigenvalues, respectively, of the covariance matrix $C$ computed from the points $P_i$ and the centroid $\bar{P}$ of the point cloud, and the y axis is $v_2 = v_3 \times v_1$. The whole point cloud is then aligned with this reference frame. Second, an axis-aligned bounding cube of the point cloud centered at $\bar{P}$ is partitioned into $m \times m \times m$ cells. The global descriptor is obtained by concatenating the histograms formed by counting the points falling into each cell. To achieve higher discriminative power, color information in HSV space is incorporated in the same way as the shape information to form the final descriptor. However, this method may not work well for objects with similar shape and color distributions.
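
The two steps can be sketched in plain Python. To keep the example free of external libraries, the two eigenvectors are obtained by power iteration instead of a full eigendecomposition, which assumes distinct eigenvalues; the sign disambiguation of the axes and the color part are left out, and all function names are hypothetical.

```python
import math


def covariance(points):
    n = len(points)
    c = [sum(p[k] for p in points) / n for k in range(3)]
    cov = [[sum((p[i] - c[i]) * (p[j] - c[j]) for p in points) / n for j in range(3)]
           for i in range(3)]
    return c, cov


def dominant_eigenvector(m, iters=200):
    """Power iteration: unit eigenvector of the largest eigenvalue of a symmetric 3x3 matrix."""
    v = [1.0, 0.7, 0.3]  # arbitrary start vector
    for _ in range(iters):
        w = [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]
        length = math.sqrt(sum(x * x for x in w))
        if length == 0.0:
            break
        v = [x / length for x in w]
    return v


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0]]


def reference_frame(points):
    """Centroid and (x, y, z) axes: z = v3 (max eigenvalue), x = v1 (min), y = v3 x v1."""
    c, cov = covariance(points)
    z = dominant_eigenvector(cov)
    trace = cov[0][0] + cov[1][1] + cov[2][2]
    # trace * I - cov has its largest eigenvalue where cov has its smallest one.
    shifted = [[(trace if i == j else 0.0) - cov[i][j] for j in range(3)] for i in range(3)]
    x = dominant_eigenvector(shifted)
    return c, (x, cross(z, x), z)


def align(points, c, frame):
    return [tuple(sum((p[k] - c[k]) * axis[k] for k in range(3)) for axis in frame)
            for p in points]


def grid_histogram(aligned, m):
    """Normalized point counts in an m x m x m grid over the bounding cube."""
    half = max(abs(v) for p in aligned for v in p) or 1.0
    hist = [0.0] * (m * m * m)
    for p in aligned:
        i, j, k = (min(int((v + half) / (2 * half) * m), m - 1) for v in p)
        hist[(i * m + j) * m + k] += 1.0 / len(aligned)
    return hist


# Hypothetical cloud: widest along world y, thinnest along world z.
cloud = [(2.0, 0.0, 0.0), (-2.0, 0.0, 0.0), (0.0, 3.0, 0.0),
         (0.0, -3.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
centroid, frame = reference_frame(cloud)
print(frame)
print(grid_histogram(align(cloud, centroid, frame), m=3))
```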

2.2.13 Scale Invariant Point Feature

The Scale Invariant Point Feature (SIPF) Lin2017S takes $q^* = \arg\min_q \|p - q\|$, computed between the feature point $p$ and the border points $q$, as the reference direction. Then, the angular range of a local cylindrical coordinate system referenced to $q^*$ is partitioned into several cells. The final SIPF descriptor is constructed by concatenating all the normalized cell features $D_i = \exp(d_i / (1 - d_i))$, where $d_i$ is the minimum distance between $p$ and the points in the $i$-th cell.

2.3 Hybrid methods

2.3.1 Bottom-Up and Top-Down Descriptor

Alexander et al. Alexander2008Object combined bottom-up and top-down descriptors to operate on 3D point clouds. In the bottom-up stage, spin images are computed for the point clouds. In the subsequent top-down stage, a global descriptor, Extended Gaussian Images (EGIs), is used to capture larger-scale structural information that further describes the models. Experimental results demonstrate that this approach balances efficiency and accuracy.

2.3.2 Local and Global Point Feature Histogram

For real-time applications, Himmelsbach et al. Himmelsbach2009Real describe in detail a feature descriptor that exploits both local and global properties of point clouds. First, four object-level features, namely maximum object intensity, mean and variance of object intensity, and volume, are estimated as the global part. Then, to describe local point properties, three histograms are built for the three features scatter-ness, linear-ness and surface-ness from Lalonde et al. Lalonde2006Natural, followed by the feature extraction approach of Anguelov et al. Anguelov2005, which produces five more histograms. The resulting descriptor has proved well suited to object detection in large 3D point cloud scenes. Lehtomäki et al. Lehtom2016Object integrated this LGPFH descriptor into their workflow to recognize objects (such as trees, cars, pedestrians and lamps) in road and street environments.

2.3.3 Local-to-Global Signature Descriptor

To overcome the drawbacks of both local and global descriptors, Hadji et al. Hadji2014Local proposed the Local-to-Global Signature (LGS) descriptor. First, they classify the points based on the minimum radius of RSD Marton2010General to capture the overall geometric properties of the object. Next, both local and global support regions, aggregated by class, are adopted to describe feature points. Finally, the LGS is created in signature form. Because this descriptor exploits the strengths of both local and global methods, it is highly discriminative.

2.3.4 Point-Based Descriptor

Wang et al. Wang2015A extracted a point-based descriptor from point clouds as the foundation of their multiscale and hierarchical classification framework. The first part of the feature is obtained from the eigenvalues ($\lambda_1$, $\lambda_2$, $\lambda_3$, in decreasing order) computed from the covariance matrix of the feature point. A spin image is adopted as the second part of the feature. The final descriptor is the combination of the two parts, which is an 18-dimensional vector.

2.3.5 FPFH+VFH

To make the most of the advantages of local-based and global-based descriptors, Alhamzi et al. Alhamzi20153D combined the well-known local descriptor FPFH with the global descriptor VFH to represent objects in their 3D object recognition system.

Table 1 summarizes the characteristics of the local-based, global-based and hybrid-based descriptor extraction approaches, arranged in chronological order.

| No. | Name | Reference | Category | Application | Performance |
|---|---|---|---|---|---|
| 1 | SI | Johnson et al. 1998 Johnson1998Surface | Local | 3D object recognition | Robust to occlusion and clutter. |
| 2 | 3DSC | Frome et al. 2004 Frome2004 | Local | 3D object recognition | Outperforms SI. |
| 3 | Eigenvalue Based | Vandapel et al. 2004 Vandapel2004 | Local | 3D terrain classification | Suitable for 3D data segmentation for terrain classification in vegetated environments. |
| 4 | DH | Anguelov et al. 2005 Anguelov2005 | Local | 3D segmentation | Achieves the best generalization performance. |
| 5 | SHP | Shan et al. 2006 Shan2006 | Local | 3D object recognition | Handles partial object recognition well. |
| 6 | HNO | Triebel et al. 2006 Triebel2006 | Local | 3D object classification and segmentation | Yields more robust classifications. |
| 7 | Thrift | Flint et al. 2007 Flint2007Thrift | Local | 3D object recognition | Robust to missing data and view change. |
| 8 | BUTD | Alexander et al. 2008 Alexander2008Object | Hybrid | 3D object detection | Applicable to very large datasets and requires limited training effort. |
| 9 | PFH | Rusu et al. 2008 Rusu2008Aligning | Local | 3D registration | Invariant to position, orientation and point cloud density. |
| 10 | LGPFH | Himmelsbach et al. 2009 Himmelsbach2009Real | Hybrid | 3D object classification | Achieves real-time performance. |
| 11 | FPFH | Rusu et al. 2009 Rusu2009Fast | Local | 3D registration | Reduces the computation time of PFH. |
| 12 | GFPFH | Rusu et al. 2009 Rusu2009Detecting | Global | 3D object detection and segmentation | Achieves high accuracy in matching and classification. |
| 13 | ISS | Zhong et al. 2009 Zhong2010Intrinsic | Local | 3D object recognition | Discriminative, descriptive and robust to sensor noise, obscuration and scene clutter. |
| 14 | PPF | Drost et al. 2010 Drost2010Model | Global | 3D object recognition | Achieves high performance in the presence of noise, clutter and partial occlusions. |
| 15 | RSD | Marton et al. 2010 Marton2010General | Local | 3D object reconstruction | Fast feature estimation. |
| 16 | NARF | Steder et al. 2010 Radu2010NARF | Local | 3D object recognition | Invariant to rotation and outperforms SI. |
| 17 | SHOT | Tombari et al. 2010 Tombari2010Unique | Local | 3D object recognition and reconstruction | Outperforms SI. |
| 18 | USC | Tombari et al. 2010 Tombari2010Uniqueshape | Local | 3D object recognition | Improves the accuracy and decreases the memory cost of 3DSC. |
| 19 | Kernel | Bo et al. 2011 Bo2011Depth | Local | 3D object recognition | Outperforms SI significantly. |
| 20 | GRSD | Marton et al. 2011 Marton2011Combined | Global | 3D object detection, classification and reconstruction | Reduces the complexity to linear in the number of points. |
| 21 | VFH | Muja et al. 2011 Muja2011REIN | Global | 3D object recognition | Outperforms SI; fast and robust to large surface noise. |
| 22 | CSHOT | Tombari et al. 2011 Tombari2011A | Local | 3D object recognition | Improves the accuracy of SHOT. |
| 23 | CVFH | Aldoma et al. 2012 Aldoma2012CAD | Global | 3D object recognition | Outperforms SI. |
| 24 | OUR-CVFH | Aldoma et al. 2012 Aldoma2012OUR | Global | 3D object recognition | Outperforms CVFH and SHOT. |
| 25 | SH | Behley et al. 2012 Behley2012 | Local | Laser data classification | Outperforms SI. |
| 26 | COV | Fehr et al. 2012 Fehr2012Compact | Local | 3D object recognition | Compact and low dimensional; outperforms SI. |
| 27 | SURE | Fiolka et al. 2012 Fiolka2012SURE | Local | 3D matching | Outperforms NARF in matching score and run-time. |
| 28 | 3DSSIM | Huang et al. 2012 Huang2012 | Local | 3D matching and shape retrieval | Invariant to scale and orientation changes. |
| 29 | GPLF | Hwang et al. 2012 Hwang2012Robust | Local | 3D object recognition | Stable and outperforms FPFH. |
| 30 | GSH | Madry et al. 2012 Madry2012Improving | Global | 3D object classification | Outperforms VFH. |
| 31 | SDVS | Wohlkinger et al. 2012 Wohlkinger2012Shape | Global | 3D object classification | Fast and robust. |
| 32 | ESF | Wohlkinger et al. 2012 Wohlkinger2012Ensemble | Global | 3D object classification | Outperforms SDVS, VFH, CVFH and GSHOT. |
| 33 | GFH | Chen et al. 2014 Chen2014Performance | Global | 3D urban object recognition | Outperforms global SI and LGPFH. |
| 34 | PRS, OHL and RIH | Cheng et al. 2014 Cheng2014Robust | Global | 3D vehicle detection | Greatly improves vehicle detection performance. |
| 35 | MCOV | Cirujeda et al. 2014 Cirujeda2014MCOV | Local | 3D matching | Outperforms CSHOT and SI. |
| 36 | LGS | Hadji et al. 2014 Hadji2014Local | Hybrid | 3D object recognition | Outperforms SI, SHOT and FPFH. |
| 37 | HOPC | Rahmani et al. 2014 Rahmani2014HOPC | Local | 3D action recognition | Robust to noise and viewpoint variations; suitable for action recognition in 3D point clouds. |
| 38 | HIGH | Zhao et al. 2014 Zhao2014Height | Local | 3D scene labeling | Outperforms 3DSC, SHOT and FPFH in 3D scene labeling. |
| 39 | FPFH + VFH | Alhamzi et al. 2015 Alhamzi20153D | Hybrid | 3D object recognition | Outperforms ESF, VFH and CVFH. |
| 40 | ECSAD | Jørgensen et al. 2015 J2015Geometric | Local | 3D object classification and recognition | Fast and produces highly reliable edge detections. |
| 41 | B-SHOT | Prakhya et al. 2015 Prakhya2015B | Local | 3D matching | Outperforms SHOT, RSD and FPFH; fast. |
| 42 | Point-based | Wang et al. 2015 Wang2015A | Hybrid | 3D terrestrial object classification | Suited to the classification of objects from terrestrial laser scanning. |
| 43 | GOOD | Kasaei et al. 2016 Kasaei2016GOOD | Global | 3D object recognition | Outperforms VFH, ESF, GRSD and GFPFH; robust, descriptive, efficient and suited to real-time applications. |
| 44 | RISAS | Li et al. 2016 Li2016RISAS | Local | 3D matching | Outperforms CSHOT and is suitable for point cloud alignment. |
| 45 | GASD | Lima et al. 2016 Lima2016 | Global | 3D object recognition | Outperforms ESF, VFH and CVFH. |
| 46 | CoSPAIR | Logoglu et al. 2016 Logoglu2016CoSPAIR | Local | 3D object recognition | Outperforms PFH, PFHRGB, FPFH, SHOT and CSHOT. |
| 47 | SGC | Tang et al. 2016 Tang2016Signature | Local | 3D matching | Outperforms SI, 3DSC and SHOT. |
| 48 | LFSH | Yang et al. 2016 Yang2016A | Local | 3D registration | Outperforms SI, PFH, FPFH and SHOT in descriptiveness and time efficiency. |
| 49 | SIPF | Lin et al. 2017 Lin2017S | Global | 3D object detection | Outperforms SI, PFH, FPFH and SHOT. |
| 50 | 3DHoPD | Prakhya et al. 2017 Prakhya20173DHoPD | Local | 3D object detection | Requires dramatically less computation time and outperforms USC, FPFH, SHOT and 3DSC. |

Table 1: Summary of the 3D point cloud descriptor extraction approaches

3 Experimental Results and Discussion

To comprehensively investigate the performance of 3D point cloud descriptors, we conduct several experiments that evaluate and compare selected state-of-the-art descriptors in terms of descriptiveness and efficiency.

3.1 Datasets

The publicly available dataset we chose to evaluate the local and global descriptors is the Washington RGB-D Objects Dataset Lai2011A. This dataset contains 207,621 3D point clouds (in PCD format) of views of 300 objects, which are classified into 51 categories.

3.2 Selected Methods

To carry out the experiments for performance evaluation, we chose 13 descriptors (8 local-based and 5 global-based) that have relatively high citation rates and state-of-the-art performance and are commonly used as baselines for comparison. Section 2 has already summarized these descriptors: SI, 3DSC, PFH, PFHColor, FPFH, SHOT, SHOTColor and USC (local), and GFPFH, VFH, CVFH, OUR-CVFH and ESF (global). All descriptors were implemented in C++, and all experiments were conducted on a PC with an Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz and 16GB of memory.

3.3 Implementation details

The recognition pipeline based on local descriptors used in our experiments, including the training and test phases, is illustrated in Figure 11. As the learning strategy, we adopt a classic supervised learning algorithm, the Support Vector Machine (SVM), to process the point cloud dataset. For global-based object recognition, we remove the 3D keypoint extraction step from the pipeline.

Figure 11: Diagram of the point cloud object recognition pipeline adopted in this paper.

3.4 Recognition Accuracy

Object recognition accuracy reflects the descriptiveness of the descriptors. The basic pipeline in object recognition is to match local or global descriptors between two models. Since several descriptors are based on normals, we use Principal Component Analysis to estimate the surface normals. For local descriptors, to make a fair comparison, the same keypoint extraction algorithm (SIFT in our experiments) is adopted to select feature points. Furthermore, another important parameter, the support region size, which determines the neighbors around a feature point that are used to compute the local descriptor, is set to the same value (5 cm in our experiments). For global approaches, we extract the descriptor directly from the entire point cloud of each object without any preprocessing.

Table 2 presents the dimensionality and recognition accuracy of the chosen local and global descriptors. From the results, we can draw the following observations. First, among local descriptors, SHOTColor (58.3%) and PFHColor (55.0%) are the best-performing methods on this dataset and outperform the original SHOT (51.7%) and PFH (53.3%), respectively, which indicates that fusing color information helps improve recognition accuracy. Second, the next best local descriptor is PFH, followed by FPFH and SHOT, which produce the same result (51.7%). Third, among global descriptors, ESF achieves the highest accuracy (78.3%), followed by VFH (76.7%). OUR-CVFH yields a moderate result (60.0%), which is better than CVFH (53.3%) and GFPFH (43.3%).

| Descriptor | Type | Size | Recognition accuracy (%) |
|---|---|---|---|
| SI | Local | 153 | 36.7 |
| 3DSC | Local | 1980 | 21.7 |
| PFH | Local | 125 | 53.3 |
| PFHColor | Local | 250 | 55.0 |
| FPFH | Local | 33 | 51.7 |
| SHOT | Local | 352 | 51.7 |
| SHOTColor | Local | 1344 | 58.3 |
| USC | Local | 1960 | 38.3 |
| GFPFH | Global | 16 | 43.3 |
| VFH | Global | 308 | 76.7 |
| CVFH | Global | 308 | 53.3 |
| OUR-CVFH | Global | 308 | 60.0 |
| ESF | Global | 640 | 78.3 |

Table 2: The dimensionality and recognition accuracy of the selected descriptors

3.5 Efficiency

Computational efficiency of feature extraction is another crucial performance measure for 3D point cloud descriptors. For local-based methods, feature construction relies heavily on local surface information, so the number of points within the support region directly affects the running time of descriptor generation. We therefore compute, for each local-based approach, the average time over several point cloud models (extracting 100 descriptors from each model) with respect to the size of the support region. For global-based methods, the number of points in an object’s point cloud directly affects computational efficiency, so we calculate the average time needed on different models to build the global descriptor for each entire model.

Several experiments were carried out to evaluate the computational efficiency of the selected local and global descriptors. Figure 12 presents the running time of the local descriptors as the support radius increases, as in Buch2016Local. It can be clearly seen that, first, SI is the most efficient descriptor. In contrast, PFHColor and PFH become the most computationally expensive descriptors as the radius increases, although they are faster than 3DSC and USC when the radius is 0.05. Second, FPFH, SHOT and SHOTColor, which have almost the same running time, are slightly slower than SI but significantly faster than the other descriptors. Third, 3DSC and USC are also relatively time-consuming to build, although they are remarkably faster than PFHColor and PFH when the radius is greater than 0.07.

Table 3 gives the average computation time required to extract a global descriptor over the selected point cloud models using the different global methods. VFH achieves the best computational performance: it is about 47 times faster than CVFH and OUR-CVFH (between one and two orders of magnitude) and nearly three orders of magnitude faster than GFPFH.
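
The relative costs implied by the running times in Table 3 can be checked with a short calculation. The dictionary below simply copies the table values (in ms); the function name `slowdown_relative_to` is hypothetical.

```python
import math

# Average running times of the global descriptors, copied from Table 3 (ms).
times_ms = {"GFPFH": 110157, "VFH": 111, "CVFH": 5169, "OUR-CVFH": 5209, "ESF": 329}


def slowdown_relative_to(reference, times):
    """How many times slower each descriptor is than the reference descriptor."""
    return {name: t / times[reference] for name, t in times.items()}


ratios = slowdown_relative_to("VFH", times_ms)
for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
    print(f"{name}: {ratio:.1f}x slower than VFH, {math.log10(ratio):.2f} orders of magnitude")
```

The ratios show that ESF stays within a factor of about three of VFH, while CVFH and OUR-CVFH are between one and two orders of magnitude slower and GFPFH is close to three orders of magnitude slower.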

Overall, SI and VFH are the most efficient among the local-based and global-based descriptors, respectively, which makes them particularly suitable for real-time applications, whereas PFHColor and GFPFH incur extremely high computational costs for generating local and global descriptors, respectively.

Figure 12: Computation time required to generate 100 descriptors with each selected local descriptor for support regions of varying radius.

| Global-based descriptor | Computation time (ms) |
|---|---|
| GFPFH | 110,157 |
| VFH | 111 |
| CVFH | 5,169 |
| OUR-CVFH | 5,209 |
| ESF | 329 |

Table 3: Average running time of the selected global descriptors on randomly chosen objects’ point cloud models (in ms)

3.6 Discussion

From the experimental results on recognition accuracy and computational efficiency described above, we summarize the following points.

First, the SHOTColor descriptor produces relatively satisfactory accuracy at the cost of a higher computation time. In contrast, SI is a better choice for time-critical systems that do not emphasize descriptiveness.

Second, the global-based descriptors ESF and VFH and the local-based descriptor SHOTColor provide a good balance between accuracy and running efficiency. These three descriptors are suitable for real-time object recognition applications, which can take full advantage of their expressiveness and efficiency.

4 Conclusion

This paper concentrates on research concerning 3D feature descriptors. We summarize the main characteristics of existing 3D point cloud descriptors, categorized into three types (local-based, global-based and hybrid-based), and comprehensively evaluate and compare several descriptors selected for their popularity and state-of-the-art performance. Although the rapid development of 3D point cloud descriptors has already produced promising results in many applications, designing more powerful descriptors remains a challenging research area because of complex environments, occlusion, clutter and other nuisances.



  • (1) R. B. Rusu, S. Cousins, 3d is here: Point cloud library (pcl), in: IEEE International Conference on Robotics and Automation, 2011, pp. 1–4.
  • (2) X. F. Han, J. S. Jin, M. J. Wang, W. Jiang, L. Gao, L. Xiao, A review of algorithms for filtering the 3d point cloud, Signal Processing Image Communication 57 (2017) 103–112.
  • (3) Y. Guo, F. Sohel, M. Bennamoun, M. Lu, J. Wan, Rotational projection statistics for 3d local surface description and object recognition, International Journal of Computer Vision 105 (1) (2013) 63–86.
  • (4) I. Hadji, G. N. Desouza, Local-to-global signature descriptor for 3d object recognition, in: Asian Conference on Computer Vision, 2014, pp. 570–584.
  • (5) A. Aldoma, Z. C. Marton, F. Tombari, W. Wohlkinger, C. Potthast, B. Zeisl, R. B. Rusu, S. Gedikli, M. Vincze, Tutorial: Point cloud library: Three-dimensional object recognition and 6 dof pose estimation, Robotics and Automation Magazine IEEE 19 (3) (2012) 80–91.
  • (6) L. A. Alexandre, 3d descriptors for object and category recognition: a comparative evaluation, 2012.
  • (7) S. Salti, A. Petrelli, F. Tombari, On the affinity between 3d detectors and descriptors, in: 3DIMPVT, 2012, pp. 424–431.
  • (8) C. M. Mateo, P. Gil, F. Torres, A performance evaluation of surface normals-based descriptors for recognition of objects using cad-models, in: International Conference on Informatics in Control, Automation and Robotics, 2014, pp. 428 – 435.
  • (9) Y. Guo, M. Bennamoun, F. Sohel, M. Lu, J. Wan, N. M. Kwok, A comprehensive performance evaluation of 3d local feature descriptors, International Journal of Computer Vision 116 (1) (2016) 66–89.
  • (10) A. E. Johnson, M. Hebert, Surface matching for object recognition in complex 3-d scenes, Image and Vision Computing 16 (9-10) (1998) 635–651.
  • (11) A. E. Johnson, M. Hebert, Using spin images for efficient object recognition in cluttered 3d scenes, IEEE Transactions on Pattern Analysis and Machine Intelligence 21 (5) (1999) 433–449. doi:10.1109/34.765655.
  • (12) B. Matei, Y. Shan, H. S. Sawhney, Y. Tan, R. Kumar, D. Huber, M. Hebert, Rapid object indexing using locality sensitive hashing and joint 3d-signature space estimation, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (7) (2006) 1111–1126.
  • (13) Y. Shan, H. S. Sawhney, B. Matei, R. Kumar, Shapeme histogram projection and matching for partial object recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 28 (4) (2006) 568–577. doi:10.1109/TPAMI.2006.83.
  • (14) A. Golovinskiy, V. G. Kim, T. Funkhouser, Shape-based recognition of 3d point clouds in urban environments, in: 2009 IEEE 12th International Conference on Computer Vision, 2009, pp. 2154–2161.
  • (15) A. Frome, D. Huber, R. Kolluri, T. Bülow, J. Malik, Recognizing Objects in Range Data Using Regional Point Descriptors, Springer Berlin Heidelberg, Berlin, Heidelberg, 2004, pp. 224–237.
  • (16) N. Vandapel, D. F. Huber, A. Kapuria, M. Hebert, Natural terrain classification using 3-d ladar data, in: Robotics and Automation, 2004. Proceedings. ICRA ’04. 2004 IEEE International Conference on, Vol. 5, 2004, pp. 5117–5122. doi:10.1109/ROBOT.2004.1302529.
  • (17) A. Anand, H. S. Koppula, T. Joachims, A. Saxena, Contextually guided semantic labeling and search for three-dimensional point clouds, International Journal of Robotics Research 32 (1) (2013) 19–34.
  • (18) O. Kahler, I. Reid, Efficient 3d scene labeling using fields of trees, in: IEEE International Conference on Computer Vision, 2013, pp. 3064–3071.
  • (19) A. Zelener, P. Mordohai, I. Stamos, Classification of vehicle parts in unstructured 3d point clouds, in: International Conference on 3d Vision, 2015, pp. 147–154.
  • (20) D. Anguelov, B. Taskar, V. Chatalbashev, D. Koller, D. Gupta, G. Heitz, A. Ng, Discriminative learning of markov random fields for segmentation of 3d scan data, in: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Vol. 2, 2005, pp. 169–176.
  • (21) R. Triebel, K. Kersting, W. Burgard, Robust 3d scan point classification using associative markov networks, in: Proceedings 2006 IEEE International Conference on Robotics and Automation, 2006. ICRA 2006., 2006, pp. 2603–2608. doi:10.1109/ROBOT.2006.1642094.
  • (22) J. Behley, V. Steinhage, A. B. Cremers, Performance of histogram descriptors for the classification of 3d laser range data in urban environments, in: 2012 IEEE International Conference on Robotics and Automation, 2012, pp. 4391–4398. doi:10.1109/ICRA.2012.6225003.
  • (23) X. Ge, Non-rigid registration of 3d point clouds under isometric deformation, Isprs Journal of Photogrammetry and Remote Sensing 121 (2016) 192–202.
  • (24) A. Flint, A. Dick, A. V. D. Hengel, Thrift: Local 3d structure recognition, in: Digital Image Computing Techniques and Applications, Biennial Conference of the Australian Pattern Recognition Society on, 2007, pp. 182–188.
  • (25) R. B. Rusu, Z. C. Marton, N. Blodow, M. Beetz, Persistent point feature histograms for 3d point clouds, in: Proceedings of the 10th International Conference on Intelligent Autonomous Systems (IAS-10), 2008.
  • (26) R. B. Rusu, N. Blodow, Z. C. Marton, M. Beetz, Aligning point cloud views using persistent feature histograms, in: Ieee/rsj International Conference on Intelligent Robots and Systems, 2008, pp. 3384–3391.
  • (27) R. B. Rusu, N. Blodow, M. Beetz, Fast point feature histograms (fpfh) for 3d registration, in: IEEE International Conference on Robotics and Automation, 2009, pp. 1848–1853.
  • (28) R. B. Rusu, A. Holzbach, M. Beetz, G. Bradski, Detecting and segmenting objects for mobile manipulation, in: IEEE International Conference on Computer Vision Workshops, 2009, pp. 47–54.
  • (29) J. Huang, S. You, Detecting objects in scene point cloud: A combinational approach, in: 2013 International Conference on 3D Vision - 3DV 2013, 2013, pp. 175–182. doi:10.1109/3DV.2013.31.
  • (30) Z. C. Marton, D. Pangercic, N. Blodow, J. Kleinehellefort, M. Beetz, General 3d modelling of novel objects from a single view, in: Ieee/rsj International Conference on Intelligent Robots and Systems, 2010, pp. 3700–3705.
  • (31) B. Steder, R. B. Rusu, K. Konolige, W. Burgard, Narf: 3d range image features for object recognition.
  • (32) F. Tombari, S. Salti, L. D. Stefano, Unique signatures of histograms for local surface description, in: European Conference on Computer Vision, 2010, pp. 356–369.
  • (33) F. Tombari, S. Salti, L. D. Stefano, A combined texture-shape descriptor for enhanced 3d feature matching 263 (4) (2011) 809–812.
  • (34) R. Beserra Gomes, B. Marques, Ferreira Da Silva, L. K. D. M. Rocha, R. V. Aroca, Gon, L. M. G. Alves, Efficient 3d object recognition using foveated point clouds, Computers and Graphics 37 (5) (2013) 496–508.
  • (35) F. Tombari, S. Salti, L. D. Stefano, Unique shape context for 3d data description, in: Proceedings of the ACM workshop on 3D object retrieval, 2010, pp. 57–62.
  • (36) F. Tombari, S. Salti, L. D. Stefano, Unique signatures of histograms for local surface description, Lecture Notes in Computer Science 6313 (2010) 356–369.
  • (37) L. Bo, X. Ren, D. Fox, Depth kernel descriptors for object recognition 32 (14) (2011) 821–826.
  • (38) D. Fehr, A. Cherian, R. Sivalingam, S. Nickolay, Compact covariance descriptors in 3d point clouds for object recognition, in: IEEE International Conference on Robotics and Automation, 2012, pp. 1793–1798.
  • (39) D. Fehr, W. J. Beksi, D. Zermas, N. Papanikolopoulos, Rgb-d object classification using covariance descriptors, in: IEEE International Conference on Robotics and Automation, 2014, pp. 5467–5472.
  • (40) W. J. Beksi, N. Papanikolopoulos, Object classification using dictionary learning and rgb-d covariance descriptors, in: 2015 IEEE International Conference on Robotics and Automation (ICRA), 2015, pp. 1880–1885. doi:10.1109/ICRA.2015.7139443.
  • (41) T. Fiolka, J. Stückler, D. A. Klein, D. Schulz, S. Behnke, Distinctive 3d surface entropy features for place recognition, in: European Conference on Mobile Robots, 2014, pp. 204–209.
  • (42) T. Fiolka, J. Stückler, D. A. Klein, D. Schulz, S. Behnke, Sure: Surface entropy for distinctive 3d features, in: International Conference on Spatial Cognition, 2012, pp. 74–93.
  • (43) J. Huang, S. You, Point cloud matching based on 3d self-similarity, in: 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012, pp. 41–48. doi:10.1109/CVPRW.2012.6238913.
  • (44) P. Cirujeda, X. Mateo, Y. Dicente, X. Binefa, Mcov: A covariance descriptor for fusion of texture and shape features in 3d point clouds, in: International Conference on 3D Vision, 2014, pp. 551–558.
  • (45) H. Rahmani, A. Mahmood, D. Q. Huynh, A. Mian, Hopc: Histogram of oriented principal components of 3d pointclouds for action recognition, in: European Conference on Computer Vision, 2014, pp. 742–757.
  • (46) G. Zhao, J. Yuan, K. Dang, Height gradient histogram (high) for 3d scene labeling, in: International Conference on 3D Vision, 2014, pp. 569–576.
  • (47) T. B. Jørgensen, A. G. Buch, D. Kraft, Geometric edge description and classification in point cloud data with application to 3d object recognition.
  • (48) S. M. Prakhya, B. Liu, W. Lin, B-shot: A binary feature descriptor for fast and efficient keypoint matching on 3d point clouds, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015, pp. 1929–1934.
  • (49) X. Li, K. Wu, Y. Liu, R. Ranasinghe, G. Dissanayake, R. Xiong, Risas: A novel rotation, illumination, scale invariant appearance and shape feature.
  • (50) K. B. Logoglu, S. Kalkan, A. Temizel, Cospair: Colored histograms of spatial concentric surflet-pairs for 3d object recognition, Robotics and Autonomous Systems 75 (part B) (2016) 558–570.
  • (51) E. Wahl, U. Hillenbrand, G. Hirzinger, Surflet-pair-relation histograms: A statistical 3d-shape representation for rapid classification, in: International Conference on 3-D Digital Imaging and Modeling (3DIM), 2003, pp. 474–481.
  • (52) K. Tang, P. Song, X. Chen, Signature of geometric centroids for 3d local shape description and partial shape matching, in: Asian Conference on Computer Vision, 2016, pp. 311–326.
  • (53) J. Yang, Z. Cao, Q. Zhang, A fast and robust local descriptor for 3d point cloud registration, Information Sciences 346–347 (2016) 163–179.
  • (54) S. M. Prakhya, J. Lin, V. Chandrasekhar, W. Lin, B. Liu, 3dhopd: A fast low-dimensional 3-d descriptor, IEEE Robotics and Automation Letters 2 (3) (2017) 1472–1479.
  • (55) B. Drost, M. Ulrich, N. Navab, S. Ilic, Model globally, match locally: Efficient and robust 3d object recognition, in: Computer Vision and Pattern Recognition, 2010, pp. 998–1005.
  • (56) Z. C. Marton, D. Pangercic, N. Blodow, M. Beetz, Combined 2d–3d categorization and classification for multimodal perception systems, International Journal of Robotics Research 30 (11) (2011) 1378–1402.
  • (57) M. Muja, R. B. Rusu, G. Bradski, D. G. Lowe, Rein - a fast, robust, scalable recognition infrastructure (2011) 2939–2946.
  • (58) R. B. Rusu, G. Bradski, R. Thibaux, J. Hsu, Fast 3d recognition and pose using the viewpoint feature histogram, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010, pp. 2155–2162.
  • (59) H. Ali, F. Shafait, E. Giannakidou, A. Vakali, N. Figueroa, T. Varvadoukas, N. Mavridis, Contextual object category recognition for rgb-d scene labeling, Robotics and Autonomous Systems 62 (2) (2014) 241–256.
  • (60) K. C. Chan, C. K. Koh, C. S. G. Lee, A 3-d-point-cloud system for human-pose estimation, IEEE Transactions on Systems Man and Cybernetics Systems 44 (11) (2014) 1486–1497.
  • (61) A. Aldoma, M. Vincze, N. Blodow, D. Gossow, S. Gedikli, R. B. Rusu, G. Bradski, Cad-model recognition and 6dof pose estimation using 3d cues, in: IEEE International Conference on Computer Vision Workshops, 2012, pp. 585–592.
  • (62) A. Aldoma, F. Tombari, R. B. Rusu, M. Vincze, Our-cvfh – oriented, unique and repeatable clustered viewpoint feature histogram for object recognition and 6dof pose estimation, in: Joint DAGM and OAGM Symposium, 2012, pp. 113–122.
  • (63) M. Madry, C. H. Ek, R. Detry, K. Hang, Improving generalization for 3d object categorization with global structure histograms, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012, pp. 1379–1386.
  • (64) W. Wohlkinger, M. Vincze, Shape distributions on voxel surfaces for 3d object classification from depth images, in: IEEE International Conference on Signal and Image Processing Applications, 2012, pp. 115–120.
  • (65) W. Wohlkinger, M. Vincze, Ensemble of shape functions for 3d object classification, in: IEEE International Conference on Robotics and Biomimetics, 2012, pp. 2987–2992.
  • (66) C. Y. Ip, D. Lapadat, L. Sieger, W. C. Regli, Using shape distributions to compare solid models, in: ACM Symposium on Solid Modeling and Applications, 2002, pp. 273–280.
  • (67) T. Chen, B. Dai, D. Liu, J. Song, Performance of global descriptors for velodyne-based urban object recognition, IEEE, 2014.
  • (68) J. Cheng, Z. Xiang, T. Cao, J. Liu, Robust vehicle detection using 3d lidar under complex urban environment, in: IEEE International Conference on Robotics and Automation, 2014, pp. 691–696.
  • (69) S. H. Kasaei, A. M. Tomé, L. S. Lopes, M. Oliveira, Good: A global orthographic object descriptor for 3d object recognition and manipulation, Pattern Recognition Letters 83 (2016) 312–320.
  • (70) J. P. S. d. M. Lima, V. Teichrieb, An efficient global point cloud descriptor for object recognition and pose estimation, in: 2016 29th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), 2016, pp. 56–63. doi:10.1109/SIBGRAPI.2016.017.
  • (71) B. Lin, F. Wang, F. Zhao, Y. Sun, Scale invariant point feature (sipf) for 3d point clouds and 3d multi-scale object detection, Neural Computing and Applications. doi:10.1007/s00521-017-2964-1.
  • (72) A. Patterson IV, P. Mordohai, K. Daniilidis, Object Detection from Large-Scale 3D Datasets Using Bottom-Up and Top-Down Descriptors, Springer Berlin Heidelberg, 2008.
  • (73) M. Himmelsbach, T. Luettel, H. J. Wuensche, Real-time object classification in 3d point clouds using point feature histograms, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009, pp. 994–1000.
  • (74) J. F. Lalonde, N. Vandapel, D. F. Huber, M. Hebert, Natural terrain classification using three‐dimensional ladar data for ground robot mobility, Journal of Field Robotics 23 (10) (2006) 839–861.
  • (75) M. Lehtomäki, A. Jaakkola, J. Hyyppä, J. Lampinen, H. Kaartinen, A. Kukko, E. Puttonen, H. Hyyppä, Object classification and recognition from mobile laser scanning point clouds in a road environment, IEEE Transactions on Geoscience and Remote Sensing 54 (2) (2016) 1226–1239.
  • (76) Z. Wang, L. Zhang, T. Fang, P. T. Mathiopoulos, X. Tong, H. Qu, Z. Xiao, F. Li, D. Chen, A multiscale and hierarchical feature extraction method for terrestrial laser scanning point cloud classification, IEEE Transactions on Geoscience and Remote Sensing 53 (5) (2015) 2409–2425.
  • (77) K. Alhamzi, M. Elmogy, S. Barakat, 3d object recognition based on local and global features using point cloud library 7 (2015) 43–54.
  • (78) Y. Zhong, Intrinsic shape signatures: A shape descriptor for 3d object recognition, in: IEEE International Conference on Computer Vision Workshops, 2010, pp. 689–696.
  • (79) H. Hwang, S. Hyung, S. Yoon, K. Roh, Robust descriptors for 3d point clouds using geometric and photometric local feature, in: IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012, pp. 4027–4033.
  • (80) K. Lai, L. Bo, X. Ren, D. Fox, A large-scale hierarchical multi-view rgb-d object dataset, in: IEEE International Conference on Robotics and Automation, 2011, pp. 1817–1824.
  • (81) A. G. Buch, H. G. Petersen, N. Krüger, Local shape feature fusion for improved matching, pose estimation and 3d object recognition, Springerplus 5 (1) (2016) 1–33.