Widespread use of modern sensors in engineering and industry in recent times has resulted not only in bigger datasets but also in more complex datasets. Statistical Process Control (SPC) is an area that has witnessed increased sophistication in metrology accompanied by increased complexity in the resulting data sets related to production or manufacturing processes (Colosimo, 2018). We consider the case non-contact sensors collect point cloud and mesh data formed by thousands of points which actually lie in what is a 2-dimensional manifold, the actual surface of the object or part produced. Due to manufacturing and measurement errors, the observed surface will deviate from the target or nominal surface, given by the design of the part, usually available in some Computer Aided Design (CAD) file. In this paper, we present fundamentally new differential-geometric methods for performing SPC of a manufacturing system producing discrete parts from their associated scanned datasets. The method can be applied to mesh (triangulation), point cloud, and even voxel (tetrahedralization) datasets, and is based on techniques some of which have been used in recent years in Computer Graphics, Computer Vision, and Machine Learning.
Traditional treatment of 2 and 3-dimensional point cloud datasets in Statistics pertains to the field of Statistical Shape Analysis (SSA, Kendall (1984); Dryden & Mardia (2016); for applications in manufacturing see del Castillo & Colosimo (2011)). In SSA, the -point cloud data are represented by a configuration matrix with or 3. The Shape of an object is defined as the geometrical information in that remains after discounting the effect of similarity transformations usually excepting reflections (translations, rotations and dilations). To make inferences on the shape of objects, the Generalized Procrustes Algorithm, GPA, is first applied. The GPA registers or superimposes all the objects by finding scaling factors , rotation matrices (the special orthogonal group, which exclude reflections and have determinant one) and
dimensional translation vectors, such that they minimize the sum of squared full procrustes distances between all pairs of configuration matrices ():
where is a vector of ones. Constraints must be added to avoid the trivial solution where all parameters are zero (Dryden & Mardia, 2016). Other shape analysis methods based on euclidean distances between the points (Lele, 1993) require large distance matrices and will not be discussed here.
The problem of registering two three dimensional objects with a large but different number of (non-corresponding) points has been known for a long time in the computer vision literature, where the Iterated Closest Point (ICP) algorithm (Besl & McKay (1992); Zhang (1992)) is a standard. Consider the configurations of two distinct unlabeled objects and (with ), not necessarily having the same pose and assume . Let with if is matched with point , and zero otherwise. The objects may be located and oriented differently in space, and hence the problem is not only to find the correspondences but also a rigid body transformation that registers the two objects such that we
subject to , and or 1, where is the cost of matching point to point
. This is a hard non-linear discrete optimization problem. Existing heuristics differ by choosing different cost functions. In the ICP method, , the euclidean distance between in and the closest point in .
Commercial Computer Aided Design (CAD) and inspection software use variants of the ICP method to align the cloud points of each scanned object and that of the CAD file, in order to determine regions in the manufactured part that differ from nominal. To do this, a CAD model, usually in the form of NURBS curves, is sampled to form a mesh or triangulation, and then the alignment of the CAD and scanned part triangulations can be performed. Figure 1 shows an instance of a metal part CAD model and two color-coded comparisons between the CAD model and the 3D-printed part. This is actually voxel data, not mesh or point cloud data on the surface of the object, but the registration problem is essentially identical.
A first approach to Statistical Process Control based on surface (and in general, point cloud and voxel) data can be based on monitoring the deviations from nominal shown in Figure 1 after applying the ICP method, similarly to a “DNOM” (deviation from nominal) control chart (see e.g., Farnum (1994)). The deviations from nominal are vectors, and either their norm or their individual components could be used for SPC. The optimal value of the ICP statistic (2), between CAD file and each scanned part could be monitored on-line, for instance, to provide a “generic” SPC mechanism against a wide variety of unanticipated out of control states in the geometry of a part, in a similar sense to what Box and Ramírez thought of univariate Shewhart charts (Box & Ramírez, 1992), with other SPC or diagnostic mechanisms added to detect more specific, or more localized defects on the part. As far as we know, this simple strategy has not been proposed in the literature, so we consider it and contrast it with the new intrinsic differential-geometrical methods that conform our main contribution. We also discuss other registration-based SPC methods that instead of ICP use spectral distances in section 7.
Our approach brings SPC of discrete-part manufacturing closer to the Computer Graphics field. The differential-geometrical approaches we propose are based on techniques popular in computer graphics to characterize 3D objects, and these methods have also been used in machine learning of general manifolds of much higher dimension than the 2-manifolds (surfaces) we focus in this paper. In computer graphics and computer vision applications, however, the problem to solve is the identification of large differences between the shapes of objects for classification purposes or to match a query object, tasks that a (patient) human can in principle perform. Here, in contrast, we focus in detecting considerable smaller differences (sometimes not perceptible to the eye) between a sequence of objects which the manufacturer is trying to make as identical to each other as possible, hence differences will tend to be rather small. A further difficulty in SPC is that, in addition to detecting changes in the mean shape of an object, a noise increment should be considered a process signature change to be detected, rather than a signal that must be filtered and ignored as in Computer Vision, where shape identification methods that are robust with respect to noise are typically sought.
In this paper, we follow the traditional SPC paradigm. The main goal is detection of significant part to part differences with respect to historical variation because they may indicate a manufacturing process problem, and the aim is to detect “assignable causes of variation” as soon as possible, while avoiding false alarms using a statistical monitoring scheme. The paper is organized as follows. Section 2 first reviews some preliminary mathematical concepts in order to present the new differential-geometric SPC methods, including the main operator we will use, the Laplace-Beltrami (LB) operator, and its spectrum. The spectrum of the LB operator needs to be estimated from data, and section 3 discusses methods to do so. The spectral methods we utilize are also popular in the analysis of social networks, and as it will be shown there is a close connection between the combinatorial or graph Laplacian and the discretized Laplace-Beltrami operator we utilize, since both are based on a graph. Section 4 presents the main practical results, where a specific distribution-free multivariate chart is used to monitor a process with respect to an in-control reference data set (i.e. “phase II” in SPC) using the spectrum of the estimated discrete LB operator of a sequence of scanned parts. We show how the spectral methods have greater sensitivity to detect general out of control conditions in a discrete-part manufacturing process than using the deviations from nominal ICP method sketched above. As additional post SPC alarm diagnostics, in section 5 we investigate the use of ICP to localize the occurrence of defects on a part. To make the presentation of our SPC proposal complete, section 6 discusses a method based on the spectrum of the LB operator for the first phase (“phase I”) when monitoring a process in the absence of prior in-control data. Section 7 contains a discussion of alternative spectral distances popular in Computer Graphics that have received recent attention in the Statistics literature and their potential use in SPC. The paper concludes with conclusions and some directions for further research.
For completeness, we first define the Differential Geometric concepts we use in the sequence; for most of them see, e.g. Kreyszig (1991) or O’Neill (2006). In Differential Geometry, one begins with properties affecting the vicinity of a point on a surface and deduces properties governing the overall structure of the object under consideration.
We consider either surface or voxel data, which constitute datasets sampled from instances of a -dimensional manifold ( or 3, respectively) embedded in 3-dimensional Euclidean space. Informally, a manifold is a -dimensional space that resembles (Euclidean space) on small domains around each of its points (we will assume
is compact and connected). The manifold hypothesis, useful if true in machine learning and in engineering data, indicates that high (-) dimensional data frequently lie on or near a lower, often curved dimensional manifold, where . In smooth manifolds with an inner product, the so-called Riemannian manifolds, we can measure distances, areas, volumes, etc. If the manifold hypothesis holds, we say the intrinsic dimension of the data is , and that the manifold is embedded in a dimensional ambient space. For manifold data, any -dimensional point can be completely described by defining local (intrinsic) coordinates or “parameters” . For instance, consider a parametric curve in , such as the helicoidal curve in Figure 2. This is an instance of a 1-manifold. In this paper, we will mostly consider data sampled from surfaces or 2-manifolds of manufactured parts, although all our methods are extendable to the case of 3-manifolds, i.e., voxel-data.
Definition 1. Any property of a manifold that can be computed without recourse to the ambient space coordinates, and instead is computed using only the intrinsic or local manifold coordinates is said to be an intrinsic geometrical property, or simply, an intrinsic property, of .
Definition 2. Any geometrical property of an object that remains constant after application of a given transformation is said to be invariant with respect to that transformation.
Intrinsic geometrical properties of a manifold in Euclidean space are invariant with respect to rigid transformations (rotations and translations, but not dilations), but the opposite is not true. For instance, Euclidean distances between points in a configuration matrix are invariant to rigid transformations but they are evidently not intrinsic. An instance of an intrinsic property is the geodesic distance between two points located on a manifold , which is therefore also invariant with respect to rigid transformations. In section 7, we discuss intrinsic distances other than the geodesic.
A well-known result in Differential Geometry indicates that intrinsic geometrical properties of a manifold depend only on the so-called first fundamental form of . For the 2-manifolds (surfaces) we mostly deal with in this paper, this quadratic form is defined as follows. Consider a parametric surface , (here which defines , a Riemannian 2-manifold. Define the surface differential vectors at as:
(see Figure 3.)
Now define a parametric curve on , such that
and use the chain rule:
Finally, take the inner product (borrowed from ):
where: . We then have the following.
Definition 3. The quadratic form is called the first fundamental form of the parametric surface describing . It provides a means to measure arc lengths, areas and angles on . It defines a new inner product for vectors on tangent planes and therefore, a metric
on the surface, with associated matrix (tensor):
In this sense, the ambient space induces a metric, the Riemannian metric, on the manifold . Since , the length of a curve segment on is:
With the Riemannian metric , we can also compute differential operators acting on a function defined on , which are very useful for our purposes, i.e., for estimation, and therefore, statistical monitoring, of intrinsic geometrical properties of an object.
Definition 4. The gradient of a function on points in the direction of steepest ascent:
and creates a vector field in .
Definition 5. The divergence of a vector field in is given by:
and measures the “quantity” of the outward flux of from the infinitesimal neighborhood around each point . This is a scalar-valued function that creates a scalar field.
Definition 6. The Laplace operator of a twice differentiable function is minus the divergence of its gradient field:
and measures the difference between and the average in a neighborhood around . Given the second derivatives, it is a measure of curvature, and can be alternatively understood as minus the trace of the Hessian of .
Finally, we define the main differential-geometric operator we will use in the sequence, the Laplace-Beltrami operator, widely used in Computer Graphics and Machine Learning (Belkin, 2003; Kimmel, 2004; Levy, 2006; Patané, 2014; Reuter et al. , 2006).
Definition 7. For a function , the Laplace-Beltrami (LB) operator (sometimes called the “second differential parameter of Beltrami”) is defined as:
where is the divergence taken on . For a point defined by a parametric surface the following relation holds:
where is the normal at the point on (see Figure 3) and is the mean curvature of at , which equals the average of the maximum and minimum curvatures at . This relation provides a geometric interpretation of the action of the LB operator, which can be visualized as creating a vector field of normals on such that the “height” of the normal is twice the mean curvature of at that point.
The LB operator extends the definition of the Laplacian to functions defined on manifolds, and is an intrinsic measure of local curvature at a point. Its intrinsic nature can be seen from defining a local coordinate system (or parametrization) on the manifold (), with for surfaces). Then, the LB operator applied to a function is defined as:
where are the elements of and the minus sign is to be consistent with our definition. Appendix A illustrates how to compute the LB operator of a function defined on a -dimensional manifold using (3). The LB operator on is therefore a function of elements in the metric tensor only, and thus it is intrinsic and invariant with respect to rigid transformations. As it will be discussed below, the spectrum of the LB operator contains considerable geometric information about the manifold, and is widely used for this reason in both machine learning and computer graphics/computer vision.
The LB operator emerges from the spatial part of the solution to the heat and wave partial differential equations, see Table1. If has boundaries, the solution to the heat and wave equations requires additional conditions such as the so-called Dirichlet boundary condition for all (so the boundary acts as an absolute refrigerator for all in the case of the heat equation).
|HEAT EQUATION||WAVE EQUATION|
Divide both sides by
The Laplacian eigenvalue problem is
sometimes called the Helmholtz partial differential equation, with an infinite number of eigenfunctionsand eigenvalue pairs. The collection of eigenvalues is called the spectrum of the LB operator. Defining the inner product on the compact and connected manifold as for (space of square integrable functions on ), it can be seen that the LB operator is self-adjoint, i.e., and positive semi-definite (). Furthermore, the eigenvalues are such that and the eigenfunctions form an orthonormal basis in , see Chavel (1984). In the particular case is a circle (1-dimensional manifold) the corresponding basis functions consist of the usual Fourier harmonics and .
In case , the eigenfunctions satisfying the Helmholtz equation can be interpreted as the modes of vibration of a 2D membrane with resonant or natural frequencies . The question of whether one can determine the shape of the drum holding the membrane on its boundary from its spectrum entertained mathematical physicists for many years until it was shown by Gordon et al. (1992) that there are pairs of 2D drums that have the same spectrum, yet their shapes are different, see Figure 4. In general, for , this result implies that the geometric information contained in the spectrum is not enough to uniquely identify the shape of the surface . However, these pairs of figures are always concave polygons and correspond segment by segment, so they are rare.
The spectrum of the LB operator is always discrete, non-negative, and contains considerable geometrical and topological information about a manifold that can be used for shape identification. For instance, for a surface:
(thus the area of can be inferred from the asymptotic slope of the spectrum; note is proportional to the index ). Another result shown in a classic paper by Kac (1966) is
Also, the spectrum contains topological information about . For instance, one result showing dependency of the spectrum on topological information is that for a surface without boundary (Yang & S.T., 1980),
where is the genus of the surface (number of holes).
The spectrum changes in a continuous form with continuous changes of the shape of the manifold. Furthermore, scaling a surface by a factor of changes the eigenvalues by (see Figure 5). It is possible to extract useful information from the lower part of the spectrum. According to Reuter et al. (2009)
, the LB operator spectrum of a manifold has more discrimination power than simpler measures like surface area. These authors provided examples of shapes with the same surface area but different spectrum. Topological data analysis tools can be applied to the eigenvectors in order to extract topological information such as the number of holes. We leave this for further research (see conclusions section).
3 The spectrum of the Laplace-Beltrami operator to characterize the geometry of a 3D object
The true spectrum of very few manifolds is known. One instance where it is known is the case of a unit sphere (Figure 5). Repeated eigenvalues are the result of perfect symmetries in the geometry of an object and are therefore rare in practice. To characterize the geometry of general 3D objects, such as discrete parts from a manufactured process, we need to use a discrete approximation of the LB operator based on a sample of points, possibly with adjacency information, forming a mesh.
In machine learning, LB operator-based methods have proved useful in both unsupervised and semi-supervised learning methods (e.g., seeBelkin & Niyogi (2008)
). For unsupervised learning, the spectrum of the LB operator is used. For semi-supervised learning, a functiondefined on is observed at certain points–a point cloud dataset. The main goal is to fit this function, and an approximation to the LB operator is needed based on such dataset. Computer graphics and computer vision authors have also used the LB operator for shape classification (e.g., see Kimmel (2004); Levy (2006)).
3.1 Approximating the LB operator and its spectrum
For SPC of surface (2-manifold) data, we can make use of the geometric information encoded in the spectrum of the LB operator of a surface, which is invariant and intrinsic. The LB operator is linear, taking functions into functions. Applied to a differentiable function evaluated on a surface point , returns a scalar. In practice, we have no expression for the surface , we only have a (large) sample of points from it. We can discretize the manifold based on the data, and an approximate, or discrete LB operator is obtained in the form of a matrix acting on vectors, returning a vector. One then works with the eigenvalues of the matrix which approximate the true spectrum of the manifold. There are different ways to discretize the manifold where the data lies. In computer graphics and machine learning, it is common to build a mesh or graph to represent the manifold, with the vertices representing the data points and the edges representing proximity relations that define the geometry of the manifold. A typical case in computer graphics are triangulations, sometimes generated automatically by non-contact sensors. If the data is in the form of a point cloud, a graph is also constructed from the nearest neighbors to each point. Finite-element methods (FEM) approximations of the LB operator for mesh data (Reuter et al. , 2006) and for voxel data (Reuter et al. , 2007) have also been developed, which increases the practical use of this differential geometry tool.
Motivation for some of the most popular LB operator approximations used for analyzing the shape of an object comes from the theory of heat diffusion and wave propagation in Physics (see Table 1). For instance, if the sensor data available has the form of a triangulation , the Mesh Laplacian approximation (Belkin et al. , 2008) is given by:
where denotes the set of all triangles in the mesh, denotes the area of triangle , and denotes the set of vertices in triangle . The discrete LB operator contains the quantity
called the heat kernel, which can be thought to be the amount of heat that is transferred from to in time if there is a unit heat source at when (here both and are assumed to be in euclidean space; to obtain the exact heat kernel over a manifold one needs the eigendecomposition of the LB operator, see (12) below). The parameter , which represents time in the heat equation (Appendix B) can be used to obtain the approximation at different scales, with larger values implying heat has flowed for longer times and the approximate Laplacian is considering larger areas of interest around a given point.
The mesh Laplacian has a simple expression that connects it to the underlying graph Laplacian, as the next remark shows.
Claim 1. For a differentiable function defined on , the discretized approximation of its LB operator (6) results in an Laplacian matrix acting on vectors which can be written as:
with for and and diagonal matrix with entries .
For a proof, see Appendix B. The estimated spectrum of the LB operator is then given by the eigenvalues of matrix . Li et al. (2015) noted how the mesh Laplacian approximation (6) is a global approximation, since computing it at a given point requires the integral over the whole surface . Instead, these authors proposed a modification of the mesh Laplacian that uses geodesic distances between points and (as opposed to euclidean distances) and that considers in the last sum only points within a certain radius of each point , resulting in the alternative discrete Laplacian approximation:
where denotes the area of the one ring neighborhood of point . By writing the equation above in vector form, it is easy to obtain the Laplacian matrix , with for and and diagonal matrix with entries . We call this LB approximation the Localized Mesh Laplacian. It has the merit of resulting in sparser matrices, reducing storage and computational requirements, and was used in all the examples and figures shown below.
Remark 1. Relation to the combinatorial graph Laplacian The discrete Mesh Laplacian in equation (6) and the Localized Mesh Laplacian (8), have the same form as the combinatorial Graph Laplacian (Chung & Lu, 2006), , where is the adjacency matrix of the mesh and is a diagonal matrix with entries equal to the degree of each vertex. The graph Laplacian is an operator applied to a function defined on the vertices of the graph. It corresponds to the discrete Laplacian if the edge weights are , so the discrete LB operator (6) can be thought of as a weighted version of the combinatorial Laplacian where the edge weights are given by the integrated heat kernel (with either euclidean or geodesic distances over the neighborhood of a point, depending on the approximation used) over one third of the area of the one-ring neighborhood of either of its two end points. The graph Laplacian arises when modeling a diffusion process on a network or mesh. Let be the amount of some substance at vertex , and be the rate at which the substance flows from vertex to vertex , where is a constant. Then
where is the element of . It is easy to show (Newman, 2010) that
where is a vector with all ’s, from which the diffusion equation on a network results: , compare to the heat equation in Table 1.
Remark 2. Convergence. Both Belkin et al. (2008, 2009) and Li et al. (2015) show how as the triangulation gets finer, their Laplacians (6 and 8) converge pointwise to the continuous LB operator defined on a smooth manifold. Dey et al. (2010) further proved the convergence and stability of the spectra of the mesh Laplacian in Belkin et al. (2008).
Remark 3. General form of discrete Laplacian approximations. Patané (2017) pointed out that many of the discretized LB operators proposed in the literature can be represented in a unified way as the multiplication of two matrices
where and are symmetric, positive semi-definite ( is positive definite) matrices and called the mass matrix and the stiffness matrix, respectively. This means is symmetrizable (Liu et al. , 2012) and guaranteed to have real eigenvalues. Defining the inner product we have that , so is self-adjoint. We note that although a symmetrizable LB approximation has therefore the desirable property of having real spectrum, not all other desirable properties of their continuous LB operator counterparts can be achieved with a discrete approximation, so there is a “no free lunch” situation, explaining why there are so many discrete approximations proposed to the LB operator, see Wardetzky et al. (2007). For instance, the eigenfunctions of the continuous LB operator form an orthogonal basis in , however, the eigenvectors of the discrete LB operator approximations do not form an orthogonal basis if using the euclidean inner product, a result of the underlying meshes used for their computation not being uniform on (Rustamov, 2007). The non-uniform mesh in turns result in a non-symmetric discrete Laplacian. The eigenvectors of a discrete approximation do form an orthogonal basis but with respect to the -inner product , i.e., for . For this reason, Rustamov (2007) indicates that only when the mesh is uniform one can expect a discrete Laplacian to be “faithful” to the continuous LB operator. This indicates that mesh pre-processing techniques may be valuable, but we leave this topic for further research.
Proof. From Claim 1, both the Mesh Laplacian and the Localized Mesh Laplacian can be written as:
where and is diagonal, for and . The only differences between these two discrete Laplacians are the definitions of area and distance , which are summarized in the following table:
|Mesh Laplacian||Localized Mesh Laplacian|
|Euclidean distance||Geodesic distance|
Where recall is the area of the one ring neighborhood of point . Now define a diagonal matrix where , which is obviously positive definite. Pre-multiplying a matrix by is the same as multiplying the th row of the matrix times the th diagonal element in , , so matrix has elements:
It is easy to see that holds with either of the two area definitions above, and indices and are interchangeable in both types of distances. Therefore, is symmetric for both types of discrete Laplacians.
In addition, is a diagonal matrix and automatically symmetric. So is symmetric, and is the multiplication of two symmetric, positive semi-definite ( is positive definite) matrices. Therefore, both discrete Laplacians, (6) and (8) are symmetrizable and thus always have real eigenvalues.
We have been unable to show how the discrete Laplacian for unstructured cloud points due to Belkin et al. (2009) can be written as in (9), so it is not possible to assure if such Laplacian, which we therefore do not use herein, always has a real spectrum. In what follows, we use the Localized Mesh Laplacian due to its sparseness advantage over (6).
The spectrum of the combinatorial graph Laplacian has some properties with counterparts in the approximated spectrum of as the latter is also based on a mesh or network: first, , and the algebraic multiplicity of this eigenvalue gives the number of connected components in the graph. The first eigenvector is constant, and the second eigenvalue is greater than zero only if the graph is connected (and only if is not repeated). The signs in the second eigenvector, called Fiedler’s vector, can be used to cluster the vertices in two sets, a notion related to the nodal sets of the eigenvector of the LB operator (see Chung & Lu (2006) and also section 7 below).
To demonstrate numerically how the discrete Laplacian approximates the spectrum of a sphere, Figure 5 shows the first 10 eigenvalues of the Localized Mesh Laplacian for a mesh with 3000 points on the unit sphere. Both the true spectrum and the approximation are invariant with respect to rigid transformations (the localized mesh Laplacian (8) was used throughout in this paper).
The spectra of widely different objects are considerably different, as Figure 6 indicates. The quality of the mesh approximation gets better as the mesh size increases, as it can be seen in Figure 7, illustrating numerically the convergence of the estimated LB operator. If the noise is high, the spectrum cannot be estimated well (Figure 8). For moderate levels of noise (which includes both manufacturing errors and measurement errors), we demonstrate in section 4 how the spectrum of the LB operator can still be used for process monitoring purposes.
3.2 Computation and stability of the discrete LB operator spectrum
The computational cost of obtaining the eigenvalues of is an to operation depending on the sparseness of the matrix. In practice, only the lower part of the spectrum is needed. The Arnoldi algorithm, used in Matlab’s function eigs to find the first eigenvalue-eigenvector pairs of a sparse symmetric matrix, has a typical complexity of . Although not all discrete LB approximations result in a symmetric matrix, many of them are symmetrizable as discussed in Remark 3. It can easily be proved that is an eigenvalue-eigenvector pair of (9) if and only if is an eigenvalue-eigenvector pair of matrix , which is symmetric. Hence, we can gain the extra speed of the Arnoldi algorithm by finding the eigendecomposition of and get the eigenvectors of from , which is easy when is a diagonal matrix as proved in Proposition 1, and this is the computational method we recommend to obtain the spectrum of the discrete LB operators. There are also methods to compute spectral quantities such as heat kernels and diffusion distances (discussed below in section 7) that do not require to compute the LB spectrum first (Patané, 2014).
A numerical question of importance is if the computation of the spectrum of the LB operator is stable or not. Patané (2017) gives the following analysis of the stability of this computation for each single eigenvalue . Suppose the estimated Laplacian is symmetrizable (i.e., it satisfies equation 9) and consider a given single eigenvalue-eigenvector pair . Perturb by where is a disturbance matrix and small, and solve for the new eigenvalue-eigenvector pair in with initial conditions (as ) and . The magnitude of the derivative provides a measure of change of the eigenvalue as is perturbed. Differentiating with respect to ,
Making we get . Pre-multiplying both sides times , we get which follows because the -norm of the eigenvector, , is one. Thus, from the Cauchy-Schwarz inequality we have that
again, since . Thus, the magnitude of change of each eigenvalue is bounded by the B-norm of , and therefore the computation of the eigenvalues with multiplicity one is stable numerically. Furthermore, our numerical experiments show that the lower spectrum is stable with respect to moderate levels of noise, typically encountered in manufacturing. The computation of the spectra is not stable when there are higher multiplicities (Patané, 2017), but as mentioned before, repeated eigenvalues occur due to exact symmetries which will be rare from scanned objects, and our numerical methods did not show evidence of this kind of potential problem.
4 Using the estimated LB spectrum as a tool for SPC
Our proposal is to use the lower part of the Localized Mesh Laplacian spectrum (8), i.e., the spectrum cropped up to certain maximum index, obtained from scanned parts and consider each spectrum a profile from which we derive a general statistical process monitoring technique supplemented by additional post-alarm tools to aid the localization of the defects on a part.
4.1 Permutation tests based on the LB spectrum
Empirical evidence presented by Sun et al. (2009b)
indicates that commercial 3D scanner noise is not Gaussian. But even for normally distributed, isotropic errors added to a surface, the estimated LB spectra are not multivariate normal. Figure9 shows the Shapiro-Wilks marginal tests of normality for each of the first 500 eigenvalues of 200 “acceptable” parts simulated based on the CAD model triangulation in Figure 10 below to which we added isotropic N measurement noise. It is therefore necessary to develop distribution-free process monitoring methods for the LB spectrum.
To obtain an initial assessment of the detection capabilities of the spectrum of the LB operator, we use 2-sample permutation tests to compare the mean LB spectra between two groups of parts using a component-wise Wilcoxon rank test. Represent the spectra of the estimated LB operator (sorted from smallest to largest, up to some given index p) of each part by , and let the two samples be , when the process is in control or acceptable, and for parts that have an out of control condition. Given the non-normality of the spectrum data, in our preliminary tests we use a nonparametric, distribution free permutation test for using two different types of statistics: the maximum t-statistic (Reuter et al. , 2007) and the componentwise Wilcoxon rank sum statistics (Chen et al. , 2016). The maximum t-statistic is defined as
Here is the th element of (in our case, the th eigenvalue of part ), and
is the pooled standard error estimate of theth eigenvalue
is the standard deviation of theth element in sample .
For specifying the componentwise Wilcoxon rank sum statistic, denote the rank of the jth eigenvalue from the ith part, with respect of the pool of parts, as ( and ) and define:
where and Var (see Appendix C
for a proof). The test statistic is formed by combining theusing , with a large value of the test statistic leading to rejection of .
Both statistics and are intuitive, as there is no explicit mapping available from the eigenvalues of the LB operator to the manifold , and differences in all the first eigenvalues should be considered jointly as evidence of a difference between the two objects.
shows the results of permutation tests between two samples. The first sample (with sample size 10) is a group of acceptable parts, while the second sample (with sample size 5) is a group of parts of the type shown on top of each column, where the first three parts have different types of defects (chipped corners or a protrusion) while the last one is acceptable in the sense of being equal the CAD design of the part plus isotropic white noise. To make the simulated data realistic, each part has a different number of points (between 26750 and 26850 points) since real datasets from a non-contact sensor will not have identical number of points/part. We added isotropicnoise to all points of all part types. The second row of plots in the figure are results from the permutation test results using the maximum t-statistic, while the last row is using the Wilcoxon rank-sum statistic with as in (4.1), both using . In each of the small figures showing the test results, the blue bars are the empirical pdf’s of the test statistic from all permutations, while the red line indicates the observed value of the test statistic, or . Both the maximum-t and rank-sum tests are one-sided. The red numbers under the red lines are the estimated p-values for the corresponding tests, defined as the number of permutations with more extreme (larger) value than the observed test statistic, divided by the total number of permutations (3003 for sample sizes 10 and 5, note this is an exact permutation test).
We repeated the experiments in figure 10 using eigenvalues normalized by area, given that from (5) they relate inversely to the surface area of . For small, localized defects on the surface of a part, the detection capabilities of the LB spectrum using either normalized or non-normalized are very similar, and therefore the normalized spectrum results are not shown here. Since we wish to detect changes in shape and in size, and not only in shape, we suggest using the unnormalized spectrum, which will be used in the following sections.
shows the distributions of the estimated p-values (as defined above) when the permutation test procedure is repeated 1000 times. In these figures, the last column of plots shows the case when both groups of parts consist of acceptable parts, i.e., the null hypothesis is true. As it is easy to show, in such case the theoretical p-value should follow a standard uniform distribution, and this is approximately the case in the depicted histograms of p-values. In the other cases, when we are testing defective parts against acceptable parts, it is desired to have p-values very close to zero, as it is indeed the case.
These comparisons indicate the potential for using the LB spectrum and permutation tests for shape difference detection. These are not however, an SPC scheme, since for on-line, “Phase II” monitoring we require a sequential test, as further discussed below.
4.2 On-line SPC scheme (“Phase II”)
Given the non-normality of the LB spectrum, for on-line statistical control we recommend to use a multivariate permutation-based control chart. We have used, with some modifications as discussed below, Chen et al. (2016) distribution-free multivariate exponentially-weighted moving average (“DFEWMA”) chart, and applied it to the lower part of the estimated LB spectra. Consider the set of the th component of the vectors one wishes to monitor (eigenvalues of the LB spectra, in our case), taken from observation to the most recent observation , . Let be the rank of among , where is the number of observed vectors in the in-control (IC) state (usually from a “phase I” SPC step). We wish to test the equality of the location of the samples and , that is, the in-control observations compared to the most recent observations in a “window” of observations.
The idea of the DFEWMA chart is to compute the “exponentially weighted” rank statistic of the last observations among all IC observations thus far. If these ranks are extreme (large or small) this is evidence the process has changed from its IC state. The exponential weights give more weight to the more recent observations within the last and can be useful to detect smaller process changes. We therefore use the statistic:
in-control observations and a false alarm probability
determining the geometric distributed in-control run lengths (hence the IC average run length is), the remaining chart design parameters are thus the weight and the window size . In Appendix C
we derive the following moment expressions, which consider the covariances of the weighted ranks in the sum:
We note these expressions are not the same as in Chen et al. (2016), who weighted the standardized ranks (4.1) rather than standardizing the weighted ranks as we do (there are some errors in their formula even for their weighted standardized ranks, see Appendix C). A sequential test statistic based on statistics is the sum of squares , used by Chen et al. (2016) (maximum statistics could also be used instead). Here we report results based on and the moment expressions above.
To illustrate the use of the resulting permutation chart, we simulated again parts from the CAD part model shown in Figure 10. Mesh sizes varied randomly (in the range 26750 to 26849 vertices). Simulations were comprised of a “Phase I” of 50 in-control parts (nominal plus noise), followed by an on-line “Phase II” where defectives (parts with a protrusion in one of its “teeth”, see Figures 10-11, third column) were introduced starting at part no. 21. Figure 12 shows simulations of the modified DFEWMA charts under different parameters and . The chart has a variable control limit that depends on the observed realization of the underlying process, thanks to which it has a geometric in-control run length distribution (the main results in Chen et al. (2016) hold as well under the different test statistic (11) we use). In every case, detection of the out of control (OC) state occurred within the first 3 observations.
4.3 Run length behavior
To gain a more complete sense of the effectivity of the SPC chart, we conducted a run length analysis based on simulation of cylindrical parts of increasingly more deformed shape, with parts acquiring a more “barrel-like” shape as an OC parameter is modified, to allow us to compute out of control run lengths parametrized in a simple way. We also conducted a run length analysis simulating realizations of the part (and its defect types) shown in Figure 10. Given that a run length analysis implies computation of thousands of LB spectra, to avoid long computation times, we used smaller mesh sizes, with points for the cylindrical parts and points for the parts in Figure 10. We also permuted the already simulated parts instead of simulating new parts for new replications to further reduce the computational cost while keeping the variability of the run lengths in our analysis.
We applied the DFEWMA charts –with the moments as described in section 4.2– to the LB spectrum with and also applied it to the optimal function value returned by the ICP algorithm (2), which is a measure of the difference between two configuration matrices after discounting similarity transformations. Table 2 shows the average run lengths (ARL) and the standard deviation of the run lengths (SDRL) in the in-control processes, where we use both a perfect cylinder with radius 10 and height 50 and the acceptable part in the last column of Figure 10, both with isotropic N noise added. As the table shows, all run lengths have nearly geometrical behaviors, which is consistent with the theorem given in Chen et al. (2016).
|LB spectrum||ICP objective|
For the out-of-control analysis of the cylindrical parts, we added a first harmonic with amplitude times the standard deviation of the noise to the radius, so the deformed radius at height becomes , adding also isotropic noise to the points. Table 3 compares the ARL and SDRL of both methods (LB spectrum and ICP) as a function of the OC parameter . The LB spectrum is very sensitive to changes in the shape of an object, having almost immediate detection of the deformation of the cylinder for (an ARL of around 3 means that the chart signals as soon as the majority of the parts in the window of size are defective). On the other hand, the ICP objective is less efficient than the LB spectrum, especially when is very small (note also the large SDRL values). This is because the small increase in the ICP statistic caused by a slight deformation can be masked by the natural variability of the in-control process, plus the inherent variability in the ICP algorithm itself (recall that (2) is a hard non-convex combinatorial problem), making it hard for the chart to distinguish the change to an OC condition until more parts are available. However, this would not be a problem for the LB spectrum because it is reflecting the overall shape of the parts. Both DFEWMA charts have a limiting ARL, because as mentioned by Chen et al. (2016) these are rank-based charts which are more effective at detecting small changes in a process, with the ranks not changing much after further increases in the size of the process change introduced. These authors recommend their chart when the ratio is small, which are ideal conditions for high quality discrete manufacturing processes, hence we show most of our run length results for . The limiting ARL value depends on a complicated form in the in-control and out of control states and on the chart parameters ().
|LB spectrum||ICP objective|
We also conducted the OC run length analysis for the part and defects shown in Figure 10, where we consider three types of defects corresponding to the first three columns in the figure. The ARLs and SDRLs for both methods (LB spectrum and ICP) for the three defective parts are summarized in Table 4. In this case, the ICP method works consistently better than the LB spectrum to detect all three types of defective parts. This is because the three defects are very localized and evident to the eye, making the increase in the ICP objective function quite significant. As these three defect types are local and the mesh size used is very small, they do not change the overall shape of the part enough for the LB spectrum to quickly detect the changes in the process, particularly with the chipped part #1, which is the smallest and more localized type of defect, with a faster detection for the protrusion defect part, which is the largest of the 3 defects relative to the mesh. To demonstrate that with larger meshes the LB spectrum would detect these types of defects quicker, consider Figure 10 where we have large meshes, and the p-values of the same Wilcoxon rank-sum tests (bottom row) are significant for all three cases. This indicates that the chart will signal faster with larger mesh sizes. We also note that chipped defect #1 has a slightly larger p-value in Figure 10, and in Figure 11 the distribution of its p-values has a thicker right tale than for the chipped #2 and protrusion defects. This shows the chipped #1 defect is harder to detected by nature and explains why it results in the longest run lengths, essentially becoming undistinguishable from the IC operation of parts if using a small mesh size.
|Defect type||LB spectrum||ICP objective|
|Chipped # 2||84.8106||149.6862||3.5922||0.8365|
5 Post alarm diagnostics
We have proposed a multivariate permutation SPC chart on the lower spectrum of the LB operator as a tool to detect general out of control (OC) states in the process that are not precisely defined, similarly to the role standard Shewhart charts have, as described by Box & Ramírez (1992). Once an alarm is triggered, an investigation of the specific assignable cause that is normally carried out should include the localization of the defect on the part surface or manifold , a task we now describe.
Suppose we have a part that has triggered an alarm in the SPC charts described above and we have also have available a CAD model for the part being produced. In order to localize the defect on each part, we apply the ICP algorithm to register the CAD model and the part that triggered the alarm (we use the ICP algorithm implemented by P. Bergstrom and available in Matlab Central). Upon completion, the ICP algorithm provides for each point on the defective part the index of the closest point on the noise-free CAD model, so that deviations from target can be computed as the Euclidean distance between them (this is the minimum distance in problem (2), with and being the OC part and the CAD model respectively).
Figure 13 shows three different locally defective parts, each with different number of points and with isotropic errors added to all three coordinates. We color each point on the OC part proportionally to these deviations, with lighter colors corresponding to larger deviations. As it can be seen, the location of each of the 3 defects on a part, the two parts with chipped corners and the part with a protrusion in one of its “teeth”, is very accurately identified. We suggest to conduct this ICP localization diagnostic after each SPC alarm.
When a global change in shape occurs due to an out of control condition, the ICP localization diagnostic will not work as well as in the case of very localized defects unless the change of shape is very evident. To illustrate, Figure 14 shows three out-of-control cylindrical parts with increasing values. The number of points varies from part to part and the same isotropic noise we have been using is added. Each part is color coded by the deviation from CAD target and lighter colors means larger deviations. The global deformation of the cylinders is strongest along the “waist” of the cylinder, and is only detectable by the ICP registration when is large enough to be quite evident to the eye. This is consistent with our findings in the OC run length analysis, where the ICP objective is more effective with relatively larger values. A similar ”defect localization” could be performed with a registration method based on other distance functions between points, such as spectral distances, see section 7.
6 A permutation-based SPC scheme for “Phase I”
For the startup or “Phase I” of a statistical process control scheme, we suggest using the distribution-free multivariate chart proposed by Capizzi & Masarotto (2017), available in their R package dfphase1
, applied to the spectrum of the LB operator. Similar to the DFEWMA chart, these authors used rank related statistics, so their method is more sensitive to small rather than large changes in the process. We point out that this multivariate chart requires the number of variables to be strictly smaller than the total number of observations (parts) available in “Phase I”. This condition does not hold for the lower part of the LB spectrum, since the number of eigenvalues (usually several hundreds) is typically larger than the number of “Phase I” observations (usually no larger than 100. We fix it to 50 to be consistent with our “Phase II” analysis). To deal with this situation, we recommend performing a principal component analysis (PCA) first to reduce the dimensionality of the “Phase I” spectral data. FollowingCapizzi & Masarotto (2017), the performance of the chart is evaluated based on the false alarm probability (FAP) for an in-control process and the alarm probabilities for out-of-control scenarios.
Given the “Phase I” dataset consisting of the cropped spectra, we first discard the first eigenvalue of each part because it is theoretically zero (non-zero first eigenvalues from the estimated LB operator are pure numerical error). Next we scale all other eigenvalues such that they have unit variance and thus receive the same weight for PCA. Finally, PCA is performed and the first few principal components are the variables used in the “Phase I” chart. We compare the performance of four different numbers of principal components ( PCs): the first PC, the first 10 PCs, the first 25 PCs, and the first 49 PCs. We also consider using the first 49 eigenvalues (i.e. the second to the fiftieth eigenvalue after discarding the first one) as a benchmark to evaluate the benefit of using PCA. The same part type as in section4.3 is used here so the dimensionality of the lower spectra is still 200.
Table 5 summarizes the in-control false alarm probability (FAP), where similarly to Section 4.3, we sampled and permuted 50 IC parts from pre-simulated 40,000 IC parts instead of simulating new parts for new replications to reduce the computational effort. As the table shows, all cases have a FAP close to the nominal , indicating the method works well either with or without PCA.
|1st PC||first 10 PCs||first 25 PCs||first 49 PCs||first 49 eigenvalues|
The out-of-control alarm probabilities for the cylindrical parts with increasing values are shown in table 6. Each replication consists of 25 simulated IC parts followed by 25 simulated OC parts under the same isotropic noise as before (N). The table indicates that the cropped spectrum (first 49 eigenvalues) does not have any detection power as its alarm probabilities are always small and close to . This is expected as we lose the information contained in the larger eigenvalues when we crop the spectrum. Another important observation is that using fewer principal components works much better than more principal components, with using only the first PC of the cropped spectrum working best. This can be explained by the hierarchical structure of the principal components, i.e., the PCs are more likely to be noise as their index increases. For example, the first 10 principal components already account for more than 95% of total variance, so the more principal components we included, the more noise is entering into the test statistics. We point out how the percentage of variance explained by the first 10 principal components increases as the out of control parameter increases, so it is safe to only use the first several principal components.
|1st PC||first 10 PCs||first 25 PCs||first 49 PCs||first 49 eigenvalues|
The OC alarm probabilities for the defective parts displayed in figure 10 are shown in Table 7. Each new set of “Phase 1” data consists of 25 simulated IC parts and 25 simulated OC parts with the same isotropic noise level as before. As it can be seen, the protrusion defect is the easiest to detect, followed by chipped #2 defect, with the chipped #1 type of defect being the hardest to detect. This is consistent with our OC results from “Phase II” (table 4). Similarly to the cylindrical parts, the first 49 eigenvalues do not have any detection power and the first 49 PCs are too noisy. Again, using 10 and 25 PCs provides satisfactory alarm probabilities, as they account for 70% and 90% of the total variance in the “Phase 1” dataset, respectively. In practice, we recommend using a scree plot to decide the best number of principal components to use.
|Defect type||1st PC||first 10 PCs||first 25 PCs||first 49 PCs||first 49 eigenvalues|
7 Discussion: other intrinsic geometrical statistics for process control
7.1 Heat kernel and diffusion distances
For a long time, researchers in computer vision have aimed at finding descriptors of shapes for object recognition that would use properties of a cloud of points from the point of view of each point. In a highly cited paper, Belongie et al. (2002) introduce the concept of a shape context
, a local description of the shape in the vi