Mathematical approaches play an essential role in understanding practical harmonic techniques. Though differential geometry has contributed to theoretical studies of the Laplacian, it does not work on discrete points, such as data. In order to formulate its discrete analogue on a set of points, we focus on a universal differential calculus [14, 17], which has the advantage of defining the exterior derivative without any additional assumption on the points, such as continuous models or graphs. Since it can also be extended to define the (discrete) Laplacian, this framework is naturally expected to provide a unified view of Laplacian-based algorithms in applied harmonic analysis and machine learning. Therefore, in this paper, we aim to construct a general formulation that enables differential geometry to work on discrete points with the help of a universal differential calculus, and then study how it reveals geometric relationships among frameworks in applied harmonic analysis, machine learning, and so on.
In order to build a general setting, we start by defining a differential 1-form, a measure on functions, an inner product on 1-forms, and the Dirichlet energy over a set of discrete points, which is regarded as a manifold. Then, the Laplacian is immediately given as the Laplace-Beltrami operator. It is worth emphasizing that this Laplacian is compatible with those given in spectral graph theory [8, 27] and random walks [25, 1]. Finally, we define the Fourier transform and the curvature vector of an embedding, which characterize geometric aspects of the points. In summary, our formulation of differential geometry on discrete points consists of the items in Table 1.1.
|set of functions||§2.1|
|exterior derivative||Dfn. 2.3|
|differential 1-form||Dfn. 2.3|
|integral of a function||(2.6)|
|inner product on functions||(2.7)|
|inner product on 1-forms||(2.9)|
|Laplacian for functions||(2.15)|
To show the advantages of this formulation, we demonstrate three types of applications. First, we study graph-based frameworks: spectral graph theory and random walks. There, we review techniques that are useful for the other applications and verify the compatibility between our setting and theirs. Second, we identify geometric aspects of principal component analysis and classical many-body physics. Though these frameworks are usually not explained in geometric contexts, a covariance and a force are interpreted as the Dirichlet energy and the curvature vector, respectively. Third, we interpret practical applications in applied harmonic analysis and machine learning, namely signal processing and manifold learning, through their relations with the other frameworks.
This paper is organized as follows. In §2, we explain how to construct discrete differential geometry as in Table 1.1. Since that section is written in an abstract manner, we summarize the main concepts in matrix form in §3 for the sake of the reader. In §4 and §5, we review some results from spectral graph theory and random walks. In §6 and §7, we explain geometric viewpoints on principal component analysis and physics. Finally, we study signal processing and manifold learning in §8 and §9, respectively.
2 Differential geometry on discrete points
In this section, we review a universal differential calculus, and then define differential geometry on a set of discrete points.
In §2.1, we check algebraic aspects of a set of functions over discrete points. Then, we build a geometric setting in §2.2. The Laplacian, the Fourier transform, and the curvature vector are introduced in §2.3, §2.4, and §2.5, respectively. Their matrix description is explained in §3.
2.1 universal differential calculus
Let be a finite set. Without loss of generality, we can assume . The set of functions is denoted by , which is an -vector space in a standard manner. It is useful to take its basis as , where is Kronecker’s delta. Define a product as pointwise:
By bilinearity, decomposes into two maps and which satisfy
. Here, the tensor product is over , and is regarded as functions on by . It is easy to see that these maps and are given as
We set as the constant function taking a value , which is written as . The equation follows from the definition (2.1), or is checked by . Then, we define left and right actions on by and respectively. The next proposition follows:
is an -algebra with the product and the unity . is an -bimodule.
Now, we introduce a universal differential calculus.
Definition 2.3 ().
For , define a differential map by
and as the minimal left -submodule of containing . The pair is called the universal first order differential calculus on .
is an -bimodule.
We can see the Leibniz rule holds:
Hence, the element produced by the right action belongs to . ∎
The differential map can be defined on the higher tensor spaces in a similar way to the exterior derivative on manifolds [17, §2]. Hence, we refer to an element of as a -form.
is isomorphic to as -bimodules.
Notice that for , . Thus, is spanned by a basis in . The linear map means , and then we have . ∎
2.2 measure and metric
Let be a measure on , namely, and for any . We define an integral on with respect to the measure and an inner product for :
As usual, the corresponding norm is denoted by , and the volume of is given by . Put a mean of as . Note that the evaluation operator is represented in several ways:
Let us consider an inner product on 1-forms given as a symmetric bilinear map . In this paper, we define it by
with which satisfies for . For simplicity, we put for . This inner product satisfies the property; for . For , we have
Define a degree of the inner product as , which is often employed as a measure . Since , we get . For ,
is called the Dirichlet energy with respect to and . It is easy to check
The above inner product can be defined through a metric :
which preserves the -bimodule structures . By integrating it over , we have the inner product. Moreover, its integration over corresponds to a dual Riemann metric in differential geometry.
Sometimes, it is useful to consider another basis , which is an orthonormal basis on . By this basis, we can represent as
In general, is regarded as a set of vertices in an oriented graph and as a weight on the oriented edge . Here, we can ignore the orientation because of the condition . In this sense, and are Hilbert spaces on the vertices and the edges, respectively. When , the edge is viewed as disconnected. If there exists no non-empty proper subset which satisfies and for all and , the graph is called connected.
The original universal differential calculus refers to a disconnected edge as a non-allowed element , and then realizes a non-complete graph as a quotient algebra of by the ideal generated by non-allowed elements [17, §4]. That construction appears to describe the topology of a graph; in contrast, ours focuses on its metric structure.
According to this convention, we often refer to as (graph) weights and as a graph.
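As an illustration of the Dirichlet energy on a weighted graph, the following minimal sketch evaluates it in two ways. It assumes the common normalization $\frac{1}{2}\sum_{i,j} w_{ij}(f_i - f_j)^2$ and the quadratic form with the matrix $D - W$, where $W$ is the symmetric weight matrix and $D$ the diagonal matrix of degrees; the function name `dirichlet_energy` and the example graph are hypothetical, and the normalization used in this paper may differ by a constant factor.

```python
import numpy as np

def dirichlet_energy(W, f):
    """Assumed normalization: (1/2) * sum_{i,j} w_ij * (f_i - f_j)^2."""
    diff = f[:, None] - f[None, :]        # matrix of differences f_i - f_j
    return 0.5 * float(np.sum(W * diff ** 2))

# Hypothetical path graph on three points with symmetric weights.
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 3.0],
              [0.0, 3.0, 0.0]])
f = np.array([1.0, 2.0, 4.0])

d = W.sum(axis=1)                          # degrees d_i = sum_j w_ij
energy = dirichlet_energy(W, f)
quadratic_form = float(f @ (np.diag(d) - W) @ f)   # f^T (D - W) f

# A constant function has zero energy.
energy_of_constant = dirichlet_energy(W, np.ones(3))
```

Under these assumptions both expressions agree, and neither involves the measure, only the weights.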
2.3 Laplace operator
With the inner products given in §2.2, define a co-differential to satisfy for any and . Then, the Laplacian is defined by , in the same way as the Laplace-Beltrami operator in differential geometry. Since
Thereby, the Laplacian is represented as
This is also known as the graph Laplacian, as explained in §3. By definition, we have and the above representation follows from (2.11) as well. When the corresponding graph is connected, the Dirichlet energy takes the minimum value if and only if is a constant function. Since
is self-adjoint, we can take eigenfunctions as follows:
Here, we see and for .
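The properties stated in this subsection can be checked numerically. The sketch below assumes that the Laplacian takes the matrix form $M^{-1}(D - W)$, with a symmetric weight matrix $W$, the diagonal degree matrix $D$, and a diagonal matrix $M$ holding a positive measure; the helper `laplacian` and the four-point graph are hypothetical choices made only for illustration.

```python
import numpy as np

def laplacian(W, m):
    """Assumed matrix form M^{-1}(D - W) for symmetric weights W and measure m."""
    W = np.asarray(W, dtype=float)
    m = np.asarray(m, dtype=float)
    D = np.diag(W.sum(axis=1))            # degree matrix (row sums of the weights)
    return (D - W) / m[:, None]           # divide row i by m_i

# Hypothetical connected graph on four points.
W = np.array([[0, 1, 0, 0],
              [1, 0, 2, 1],
              [0, 2, 0, 1],
              [0, 1, 1, 0]], dtype=float)
m = np.array([1.0, 2.0, 0.5, 1.5])        # arbitrary positive measure
L = laplacian(W, m)

# Self-adjointness with respect to <f, g> = f^T M g is equivalent to M L being symmetric.
ML = np.diag(m) @ L

# L is similar to the symmetric matrix M^{-1/2}(D - W)M^{-1/2}, so both share eigenvalues.
S = ML / np.sqrt(np.outer(m, m))
eigvals = np.linalg.eigvalsh(S)           # returned in ascending order
```

In this example the smallest eigenvalue is zero with a constant eigenfunction, and the second one is strictly positive because the graph is connected.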
2.4 Fourier analysis
In the continuous setting, the Fourier transform is given by , and is an eigenfunction of the 1-dimensional Laplacian, .
By analogy, in the graph setting, it is natural to use the eigenfunctions of the Laplacian , instead of , and define for . The transformation is known as the graph Fourier transform. We call the -th Fourier coefficient or the -th frequency. The corresponding inverse Fourier transform is given by
which is just the eigenfunction expansion by . It is easy to see Parseval’s identity holds:
This is valid for other expansions by orthogonal functions, such as (2.13). Sometimes, the convolution operator is defined so that holds:
We also obtain relations and .
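A minimal numerical sketch of the graph Fourier transform follows. It assumes the Laplacian $M^{-1}(D - W)$ with a diagonal measure matrix $M$, eigenfunctions collected as the columns of a matrix $U$ that is orthonormal with respect to the measure ($U^{\top} M U = I$), the transform $\hat f = U^{\top} M f$, and a convolution defined spectrally by the pointwise product of Fourier coefficients. The function name `graph_fourier_basis` and the example data are hypothetical.

```python
import numpy as np

def graph_fourier_basis(W, m):
    """Eigenpairs of the assumed Laplacian M^{-1}(D - W), with U^T M U = I."""
    W = np.asarray(W, dtype=float)
    m = np.asarray(m, dtype=float)
    K = np.diag(W.sum(axis=1)) - W                 # D - W
    S = K / np.sqrt(np.outer(m, m))                # M^{-1/2} (D - W) M^{-1/2}
    lam, V = np.linalg.eigh(S)                     # orthonormal eigenvectors of S
    U = V / np.sqrt(m)[:, None]                    # U = M^{-1/2} V
    return lam, U

# Hypothetical weights, measure, and signals on four points.
W = np.array([[0, 1, 0, 0],
              [1, 0, 2, 1],
              [0, 2, 0, 1],
              [0, 1, 1, 0]], dtype=float)
m = np.array([1.0, 2.0, 0.5, 1.5])
f = np.array([0.3, -1.0, 2.0, 0.5])
g = np.array([1.0, 0.0, -0.5, 2.0])

lam, U = graph_fourier_basis(W, m)
f_hat = U.T @ (m * f)                              # Fourier transform of f
g_hat = U.T @ (m * g)
f_rec = U @ f_hat                                  # inverse Fourier transform

norm_sq = float(f @ (m * f))                       # squared norm of f w.r.t. the measure
conv = U @ (f_hat * g_hat)                         # assumed spectral convolution
conv_hat = U.T @ (m * conv)
```

With this normalization, Parseval's identity reads `norm_sq == f_hat @ f_hat`, and the transform of `conv` is the pointwise product of `f_hat` and `g_hat`.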
2.5 embedding and curvature
So far, our setting has not used coordinates of the points in , only their indices. Here, suppose that the points are embedded in Euclidean space . Namely, we consider a map
where and . This element is viewed as a coordinate for a point . The Euclidean group acts on , hence it defines a coordinate transformation for (as a set).
In differential geometry, an embedding induces a Riemann metric on a manifold , and especially determines the Laplace-Beltrami operator . The normal bundle is given on , and then the mean curvature vector is defined as the trace of the second fundamental form divided by . Hence, the vector indicates the normal direction on each point of , and its length is called the mean curvature. Beltrami’s formula relates those objects as
Motivated by this formula, we define a graph curvature vector of an embedding by
Unlike in differential geometry, this vector does not indicate the normal direction, because the normal direction is not defined for discrete points. Nevertheless, the vector has a special meaning in physics, as explained in §7. The embedding energy is given in terms of the curvature vector, that is, , and is invariant under the Euclidean group action. In some cases, it is convenient to suppose that a metric is induced by an embedding, as in Figure 2.1. For example, we can define weights by using the distance, such as
3 Matrix description of the geometric formulation
In this section, we give matrix description of the formulation discussed in §2.
First, we remark that our formulation contains two types of parameters, in a measure and in an inner product, independently. A measure is just given by a positive function on , thus its degree of freedom is . On the other hand, an inner product is determined by weights which satisfy and , hence its degree of freedom is . When they are related in a certain way, well-known cases appear as follows.
Let , , and be -matrices.
In the -basis, a function is represented as a numerical vector . Then, an inner product is written as , and the Dirichlet energy as , which does not depend on . The Laplacian is given as , and its eigenvalue equation is
The corresponding eigenvectors, denoted by an -matrix , define the Fourier transform and its inverse , where . An embedding is described as an -matrix , and then its curvature vector is .
In the case of -basis, since as mentioned in (2.13), we have , where . The Laplacian is described as , because
When we assume or , we obtain the known Laplacians: the combinatorial Laplacian, the random walk Laplacian, and the normalized Laplacian . However, we do not use this configuration for random walks in §5.
The above notations are summarized in Table 3.1.
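The matrix description can be illustrated with a short sketch. It assumes the Laplacian $M^{-1}(D - W)$, so that a counting measure gives the combinatorial Laplacian $D - W$ and the degree measure gives the random walk Laplacian $I - D^{-1}W$, while the normalized Laplacian is taken as $I - D^{-1/2} W D^{-1/2}$. The Gaussian weights, the helper names `laplacian` and `embedding_energy`, and the sign of the curvature vector (here simply the Laplacian applied to the coordinates) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))                # hypothetical embedding: one row per point

# Assumed Gaussian weights built from pairwise squared distances (no self-loops).
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
W = np.exp(-sq)
np.fill_diagonal(W, 0.0)
d = W.sum(axis=1)
D = np.diag(d)

def laplacian(W, m):
    """Assumed matrix form M^{-1}(D - W)."""
    return (np.diag(W.sum(axis=1)) - W) / m[:, None]

L_comb = laplacian(W, np.ones(len(d)))                 # counting measure: D - W
L_rw = laplacian(W, d)                                 # degree measure: I - D^{-1} W
L_sym = np.eye(len(d)) - W / np.sqrt(np.outer(d, d))   # I - D^{-1/2} W D^{-1/2}

# L_rw and L_sym are similar matrices, so their spectra coincide.
spec_rw = np.sort(np.linalg.eigvals(L_rw).real)
spec_sym = np.linalg.eigvalsh(L_sym)

def embedding_energy(W, m, X):
    """Sum over points of m_i * |H_i|^2, where H = Laplacian applied to X."""
    H = laplacian(W, m) @ X                            # assumed curvature vectors, row-wise
    return float(np.sum(m[:, None] * H ** 2))

# Rigid motion of the embedding with the weights held fixed.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
t = np.array([3.0, -1.0])
e0 = embedding_energy(W, d, X)
e1 = embedding_energy(W, d, X @ R.T + t)
```

The translation drops out because the Laplacian annihilates constant functions, and the rotation preserves the length of each curvature vector, so `e0` and `e1` agree up to rounding.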
4 Application I: spectral graph theory
We review basic results about eigenvalue estimation in spectral graph theory to check its compatibility with our formulation. These results are regarded as discrete analogues of spectral geometry and related to the graph cut problem in §9.1. See also [8, 27] for reference.
4.1 upper bound of eigenvalues
First, we estimate an upper bound for the largest eigenvalue . Put .
because . ∎
Next, we give an upper bound for the second smallest eigenvalue , which is characterized as a minimum value of in functions . For this purpose, the isoperimetric constant is useful, because it is defined in a similar way to the characterization of :
where and is taken over all subsets satisfying . From (2.11), we can check , then for a complement .
For any such that , put , which satisfies , then we have
Hence, we obtain by . ∎
4.2 lower bound of eigenvalues
Here, we estimate a lower bound for the second smallest eigenvalue .
First we claim
for a positive function such that . We take a sequence of subsets so that is represented as by . Then, we can see
as required. Now, we can write by positive functions which satisfy since . It is easy to check , then applying (4.5) to , we obtain
In the second inequality, we used the Cauchy-Schwarz inequality: for . ∎
This theorem is called Cheeger’s inequality, and its continuous analogue is known in differential geometry.
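Both bounds on the second smallest eigenvalue can be verified by brute force on a small graph. The sketch below assumes the standard normalized setting: the measure equals the degree, the isoperimetric constant is $h = \min_S \mathrm{cut}(S, S^c)/\mathrm{vol}(S)$ over subsets with $\mathrm{vol}(S) \le \mathrm{vol}(X)/2$, and the bounds read $h^2/2 \le \lambda_2 \le 2h$. The function names and the example graph (two triangles joined by one edge) are hypothetical, and the exhaustive search is only feasible for a handful of points.

```python
import itertools
import numpy as np

def isoperimetric_constant(W):
    """Brute-force min of cut(S, S^c) / vol(S) over S with vol(S) <= vol(X) / 2."""
    n = len(W)
    d = W.sum(axis=1)                      # degrees, used here as the measure
    total = d.sum()
    best = np.inf
    for k in range(1, n):
        for subset in itertools.combinations(range(n), k):
            S = list(subset)
            vol = d[S].sum()
            if vol > total / 2:
                continue                   # only the smaller side of the cut is admissible
            T = [i for i in range(n) if i not in subset]
            cut = W[np.ix_(S, T)].sum()    # total weight of edges leaving S
            best = min(best, cut / vol)
    return best

def second_eigenvalue(W):
    """Second smallest eigenvalue of I - D^{-1/2} W D^{-1/2}."""
    d = W.sum(axis=1)
    L_sym = np.eye(len(W)) - W / np.sqrt(np.outer(d, d))
    return np.linalg.eigvalsh(L_sym)[1]

# Hypothetical example: two triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0

h = isoperimetric_constant(W)
lam2 = second_eigenvalue(W)
```

For this graph the optimal subset is one of the triangles, which gives `h` equal to 1/7, and `lam2` lies between `h**2 / 2` and `2 * h`.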
5 Application II: random walks
5.1 heat equations
For , putting , we have
for , where we put and for . Needless to say, the eigenvalue decomposition of is given as
For and , define operators