 # Geometric Formulation for Discrete Points and its Applications

We introduce a novel formulation for geometry on discrete points. It is based on a universal differential calculus, which gives a geometric description of a discrete set by the algebra of functions. We expand this mathematical framework so that it is consistent with differential geometry, and works on spectral graph theory and random walks. Consequently, our formulation comprehensively demonstrates many discrete frameworks in probability theory, physics, applied harmonic analysis, and machine learning. Our approach would suggest the existence of an intrinsic theory and a unified picture of those discrete frameworks.


## 1 Introduction

Mathematical approaches play an essential role in understanding practical harmonic analysis techniques. Though differential geometry has contributed to the theoretical studies of the Laplacian, it does not work on discrete points, such as data. In order to formulate its discrete analogue on a set of points, we focus on a universal differential calculus [14, 17], which has the advantage of defining the exterior derivative without any additional assumption on the points, such as continuous models or graphs. Since it is also possible to extend it to define the (discrete) Laplacian, this framework is naturally expected to provide a unified view of Laplacian-based algorithms in applied harmonic analysis and machine learning. Therefore, in this paper, we aim to construct a general formulation that enables differential geometry to work on discrete points with the help of a universal differential calculus, and then study how it reveals geometric relationships among frameworks in applied harmonic analysis, machine learning, and so on.

In order to build a general setting, we start from defining a differential 1-form, a measure on functions, an inner product on 1-forms, and the Dirichlet energy over a set of discrete points, which is regarded as a manifold. Then, the Laplacian is immediately given as the Laplace-Beltrami operator. It is worth emphasizing that this Laplacian is compatible with that given in spectral graph theory [8, 27] and random walks [25, 1]. Finally, we define the Fourier transform and the curvature vector of an embedding, which characterize geometric aspects of points. In summary, our formulation for differential geometry on discrete points consists of those in Table 1.1.

To show the advantages of this formulation, we demonstrate three types of applications. First, we study graph-based frameworks: spectral graph theory and random walks. There, we review useful techniques for other applications to verify compatibility between our setting and theirs. Second, we figure out geometric aspects of principal component analysis and classical many-body physics. Though these frameworks are usually not explained in geometric contexts, a covariance and a force are interpreted as the Dirichlet energy and the curvature vector respectively. Third, we understand practical applications, signal processing and manifold learning, in applied harmonic analysis and machine learning by their relations with other frameworks.

This paper is organized as follows: In §2, we explain the way to construct discrete differential geometry as in Table 1.1. Since this section is discussed in an abstract manner, we summarize the main concepts by matrix description in §3 for the sake of the reader. In §4 and §5, we review some results from spectral graph theory and random walks; in §6 and §7, we explain geometric viewpoints in principal component analysis and physics; and lastly, we study signal processing and manifold learning in §8 and §9 respectively.

## 2 Differential geometry on discrete points

In this section, we review a universal differential calculus, and then define differential geometry on a set of discrete points.

In §2.1, we check algebraic aspects of a set of functions over discrete points. Then, we build a geometric setting in §2.2. The Laplacian, the Fourier transform, and the curvature vector are introduced in §2.3, §2.4, and §2.5 respectively. Their matrix description is explained in §3.

### 2.1 universal differential calculus

We recall algebraic structures on functions in order to state the definition of a universal differential calculus. See also [13, 17] for reference.

Let V be a finite set. Without loss of generality, we can assume V={1,2,…,n}. The set of functions is denoted by A:={f:V→ℝ}, which is an ℝ-vector space in a standard manner. It is useful to take its basis as {ei}i∈V with ei(j)=δij, where δij is Kronecker’s delta. Define a product ~σ pointwise:

 ~σ(f,g)(x)=(f⋅g)(x):=f(x)⋅g(x). (2.1)

By bilinearity, ~σ decomposes into two maps ι:A×A→A⊗A and σ:A⊗A→A which satisfy ~σ=σ∘ι. Here, the tensor product A⊗A is over ℝ, and its elements are regarded as functions on V×V by (f⊗g)(x,y)=f(x)g(y). It is easy to see these maps ι and σ are given as

 ι(ei,ej)=ei⊗ej, σ(ei⊗ej)=δijei.

We set 1A as the constant function taking the value 1, which is written as 1A=∑i∈Vei. The equation f⋅1A=1A⋅f=f follows from the definition (2.1), or is checked by σ(f⊗1A)=σ(1A⊗f)=f. Then, we define left and right actions on A⊗A by f⋅(g⊗h):=(f⋅g)⊗h and (g⊗h)⋅f:=g⊗(h⋅f) respectively. The next proposition follows:

###### Proposition 2.2.

A is an ℝ-algebra with the product ~σ and the unity 1A. A⊗A is an A-bimodule.

Now, we introduce a universal differential calculus.

###### Definition 2.3.

For f∈A, define a differential map ∂:A→A⊗A by

 ∂f:=1A⊗f−f⊗1A=∑i,j∈V(fj−fi)ei⊗ej,

and Ω1A as the minimal left A-submodule of A⊗A containing ∂A:={∂f∣f∈A}. The pair (Ω1A,∂) is called the universal first order differential calculus on A.

###### Proposition 2.4.

Ω1A is an A-bimodule.

###### Proof..

We can see the Leibniz rule holds:

 ∂(f⋅g) =1A⊗(f⋅g)−(f⋅g)⊗1A =(1A⊗f)⋅g−f⊗g+f⊗g−(f⋅g)⊗1A=∂f⋅g+f⋅∂g.

Hence, the element produced by the right action belongs to Ω1A. ∎

The differential map can be defined on the higher tensor spaces in a similar way to the exterior derivative on manifolds [17, §2]. Hence, we refer to an element of Ω1A as a 1-form.
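Concretely, a 1-form ∂f on n points can be stored as an n×n array with entries (∂f)ij=fj−fi. The following numpy sketch (ours, not code from the paper; all names are illustrative) checks the Leibniz rule of the universal differential calculus, where the right action scales the second tensor factor and the left action the first.

```python
import numpy as np

def universal_differential(f):
    """Matrix of the 1-form ∂f = Σ_{i,j} (f_j - f_i) e_i⊗e_j (Definition 2.3)."""
    f = np.asarray(f, dtype=float)
    return f[None, :] - f[:, None]

f = np.array([0.0, 1.0, 3.0])
g = np.array([2.0, -1.0, 0.5])
df, dg = universal_differential(f), universal_differential(g)
# Leibniz rule ∂(f·g) = ∂f·g + f·∂g: the right action scales columns by g,
# the left action scales rows by f.
assert np.allclose(universal_differential(f * g),
                   df * g[None, :] + f[:, None] * dg)
# ∂f has zero diagonal, i.e. it is killed by the multiplication map σ.
assert np.allclose(np.diag(df), 0.0)
```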

###### Lemma 2.5.

Ω1A is isomorphic to kerσ as A-bimodules.

###### Proof..

Notice that for i≠j, ei⋅∂ej=ei⊗ej∈Ω1A. Thus, Ω1A is spanned by the basis {ei⊗ej}i≠j in A⊗A. The linearity of σ means σ(∂f)=f−f=0, and then we have Ω1A=kerσ. ∎

### 2.2 measure and metric

Let μ be a measure on V, namely, μ:V→ℝ and μ(x)>0 for any x∈V. We define an integral on V with respect to the measure and an inner product for f,g∈A:

 ∫Vf(x)dμx:=∑x∈V∑i∈Vfiei(x)μ(x)=∑i∈Vfiμi, (2.6)

 ⟨f,g⟩A:=∫Vf(x)g(x)dμx=∑i∈Vfigiμi. (2.7)

As usual, the corresponding norm is denoted by ∥f∥A, and the volume of V is given by vol(V):=∫Vdμx=∑i∈Vμi. Put a mean of f∈A as ¯f:=∫Vf(x)dμx/vol(V). Note that the evaluation operator f↦f(i) is represented in several ways:

 f(i)=fi=⟨f,ei⟩A/μi. (2.8)

Let us consider an inner product on 1-forms given as a symmetric bilinear map ⟨⋅,⋅⟩Ω1A:Ω1A×Ω1A→ℝ. In this paper, we define it by

 ⟨ei⊗ej,ek⊗el⟩Ω1A:=δikδjlwij, (2.9)

with weights wij≥0 which satisfy wij=wji for i,j∈V. For simplicity, we put wii=0 for i∈V. This inner product satisfies the symmetry property ⟨u,v⟩Ω1A=⟨v,u⟩Ω1A for u,v∈Ω1A. For u=∑i,j∈Vuijei⊗ej and v=∑i,j∈Vvijei⊗ej, we have

 ⟨u,v⟩Ω1A=∑i,j∈V,i≠jwijuijvij.

Define a degree of the inner product as deg(i):=∑j∈Vwij, which is often employed as a measure μi=deg(i). Since wii=0, we get deg(i)=∑j∈V∖iwij. For f,g∈A,

 E(f,g):=12⟨∂f,∂g⟩Ω1A=12∑i,j∈Vwij(fi−fj)(gi−gj) (2.10)

is called the Dirichlet energy with respect to and . It is easy to check

 E(ei,ej)={deg(i)for i=j−wijfor i≠j (2.11)
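The energy (2.10) and the identity (2.11) are easy to check numerically. A small numpy sketch (illustrative, not from the paper) does so, and also verifies the matrix form E(f,g)=fT(D−W)g used in §3.

```python
import numpy as np

# The Dirichlet energy (2.10): E(f,g) = (1/2) Σ_{i,j} w_ij (f_i - f_j)(g_i - g_j).
def dirichlet_energy(W, f, g):
    n = len(f)
    return 0.5 * sum(W[i, j] * (f[i] - f[j]) * (g[i] - g[j])
                     for i in range(n) for j in range(n))

W = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 0.5],
              [2.0, 0.5, 0.0]])   # symmetric weights, w_ii = 0
D = np.diag(W.sum(axis=1))        # deg(i) = Σ_j w_ij on the diagonal
f = np.array([1.0, -2.0, 0.0])
g = np.array([0.5, 1.0, -1.0])
# Matrix form: E(f, g) = f^T (D - W) g, independent of the measure μ.
assert np.isclose(dirichlet_energy(W, f, g), f @ (D - W) @ g)
# (2.11): E(e_i, e_j) equals deg(i) on the diagonal and -w_ij off it.
E = np.array([[dirichlet_energy(W, np.eye(3)[i], np.eye(3)[j])
               for j in range(3)] for i in range(3)])
assert np.allclose(E, D - W)
```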
###### Remark 2.12.

The above inner product can be defined through a metric :

 (ei⊗ej,ek⊗el)Ω1A:=δikδjl(wij/μiμj)ei⊗ej,

which preserves the A-bimodule structures. By integrating it over V×V, we have the inner product. Moreover, its integration corresponds to a dual Riemann metric in differential geometry.

Sometimes, it is useful to consider another basis ~ei:=ei/√μi, which is an orthonormal basis on A. By this basis, we can represent f∈A as

 f=∑i∈V~fi~ei, where ~fi=⟨f,~ei⟩A=fi√μi. (2.13)

In general, V is regarded as a set of vertices in an oriented graph and wij as a weight on the oriented edge (i,j). Here, we can ignore the orientation because of the condition wij=wji. In this sense, A and Ω1A are Hilbert spaces on the vertices and the edges respectively. When wij=0, the edge (i,j) is viewed as disconnected. If there does not exist a non-empty proper subset S⊊V which satisfies wij=0 for all i∈S and j∈V∖S, the graph is called connected.

###### Remark 2.14.

The original universal differential calculus refers to a disconnected edge as a non-allowed element, and then realizes a non-complete graph as a quotient algebra of Ω1A by the ideal generated by non-allowed elements [17, §4]. This construction seems to describe the topology of a graph; by contrast, ours focuses on its metric structure.

According to this convention, we often refer to wij as (graph) weights and to the pair (V,w) as a graph.

### 2.3 Laplace operator

With the inner products given in §2.2, define a co-differential ∂∗:Ω1A→A to satisfy ⟨u,∂f⟩Ω1A=⟨∂∗u,f⟩A for any f∈A and u∈Ω1A. Then, the Laplacian is defined by L:=∂∗∂, in the same way as the Laplace-Beltrami operator in differential geometry. Since

 ⟨u,∂f⟩Ω1A=∑i,j∈V,i≠jwijuij(fj−fi)=∑i,j∈V,i≠jwij(uji−uij)fi,

we obtain

 ∂∗u=∑i∈V(1/μi){∑j∈V∖iwij(uji−uij)}ei.

Thereby, the Laplacian is represented as

 Lf=∑i,j∈V(wij/μi)(fi−fj)ei=∑i∈V(deg(i)/μi)fiei−∑i,j∈V(wij/μi)fjei. (2.15)

This is also known as the graph Laplacian, as explained in §3. By definition, we have ⟨Lf,g⟩A=E(f,g), and the above representation follows from (2.11) as well. When the corresponding graph is connected, the Dirichlet energy takes the minimum value 0 if and only if f is a constant function. Since L is self-adjoint, we can take eigenfunctions {vi}i=1,…,n as follows:

 Lvi=ρivi, 0=ρ1≤ρ2≤⋯≤ρn, ⟨vi,vj⟩A=δij. (2.16)

Here, we see v1 is a constant function, and ρ2>0 for a connected graph.
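The eigensystem (2.16) can be computed by symmetrizing L with M^{1/2}, so that a standard symmetric solver applies. A numpy sketch with our own (illustrative) naming:

```python
import numpy as np

# Eigenpairs of L = M^{-1}(D - W), via the symmetric form M^{-1/2}(D - W)M^{-1/2}.
def laplacian_eigensystem(W, mu):
    D = np.diag(W.sum(axis=1))
    Msqrt_inv = np.diag(1.0 / np.sqrt(mu))
    rho, U = np.linalg.eigh(Msqrt_inv @ (D - W) @ Msqrt_inv)  # ascending ρ_i
    V = Msqrt_inv @ U        # columns v_i, orthonormal for <·,·>_A
    return rho, V

W = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 2.0],
              [1.0, 2.0, 0.0]])
mu = np.array([1.0, 2.0, 1.5])
rho, V = laplacian_eigensystem(W, mu)
M = np.diag(mu)
assert np.isclose(rho[0], 0.0)              # ρ_1 = 0 (v_1 constant)
assert np.allclose(V.T @ M @ V, np.eye(3))  # <v_i, v_j>_A = δ_ij
```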

### 2.4 Fourier analysis

In the continuous setting, the Fourier transform is given by F[f](ξ):=∫ℝf(x)e^{−√−1xξ}dx, and e^{−√−1xξ} is an eigenfunction of the 1-dimensional Laplacian −d2/dx2.

By this analogy, in the graph setting, it is natural to use the eigenfunctions vi of the Laplacian L, instead of e^{−√−1xξ}, and define F[f]i:=⟨f,vi⟩A for f∈A. The transformation F:A→ℝn is known as the graph Fourier transform. We call F[f]i the i-th Fourier coefficient or the i-th frequency. The corresponding inverse Fourier transform is given by

 f=n∑i=1F[f]ivi=n∑i=1⟨f,vi⟩Avi, (2.17)

which is just the eigenfunction expansion by {vi}i. It is easy to see Parseval’s identity holds:

 ⟨f,g⟩A=n∑i=1⟨f,vi⟩A⟨g,vi⟩A=⟨F[f],F[g]⟩Rn.

This is valid for other expansions by orthogonal functions, such as (2.13). Sometimes, the convolution operator is defined so that F[f∗g]i=F[f]iF[g]i holds:

 f∗g:=n∑i=1F[f]iF[g]ivi. (2.18)

We also obtain the relations f∗g=g∗f and (f∗g)∗h=f∗(g∗h).
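The transforms of this subsection admit a direct matrix sketch (our own illustrative names; eigenvectors as in (2.16)), verifying inversion, Parseval's identity, and commutativity of the convolution:

```python
import numpy as np

def gft(f, V, M):
    return V.T @ M @ f        # F[f]_i = <f, v_i>_A = v_i^T M f

def igft(coeffs, V):
    return V @ coeffs         # inverse transform (2.17): f = Σ_i F[f]_i v_i

def convolve(f, g, V, M):
    return igft(gft(f, V, M) * gft(g, V, M), V)   # (2.18)

mu = np.array([1.0, 2.0, 1.5]); M = np.diag(mu)
W = np.array([[0.0, 1.0, 1.0], [1.0, 0.0, 2.0], [1.0, 2.0, 0.0]])
D = np.diag(W.sum(axis=1))
S = np.diag(mu**-0.5)
rho, U = np.linalg.eigh(S @ (D - W) @ S)
V = S @ U                     # columns satisfy V^T M V = I
f = np.array([1.0, 0.0, -1.0]); g = np.array([0.5, 2.0, 1.0])
assert np.allclose(igft(gft(f, V, M), V), f)                    # inversion
assert np.isclose(f @ M @ g, gft(f, V, M) @ gft(g, V, M))       # Parseval
assert np.allclose(convolve(f, g, V, M), convolve(g, f, V, M))  # f*g = g*f
```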

### 2.5 embedding and curvature

Our setting so far did not use coordinates of the points in V, just their indexes. Herein, suppose that the points are embedded in Euclidean space ℝd. Namely, we consider a map

 V∋i↦(r1(i),r2(i),⋯,rd(i))=:→rd(i)∈Rd, (2.19)

where rs∈A for s=1,2,…,d. This element →rd(i) is viewed as a coordinate for a point i∈V. The Euclidean group acts on ℝd, hence it defines a coordinate transformation for V (as a set).

In differential geometry, an embedding →r:M→ℝd induces a Riemann metric g on a manifold M, and especially determines the Laplace-Beltrami operator Lg. The normal bundle is given on M, and then the mean curvature vector →H is defined as the trace of the second fundamental form divided by n:=dimM. Hence, the vector indicates the normal direction at each point of M, and its length is called the mean curvature. Beltrami’s formula relates those objects as

 Lg→r→r=−n⋅→H→r.

One can refer to [7, 4] for mathematical details.

Motivated by this formula, we define a graph curvature vector →Hd of an embedding →rd by

 →Hd:=−L→rd,  Hs:=−Lrs,   for s=1,2,⋯,d. (2.20)

Unlike differential geometry, this vector does not indicate a normal direction, because no normal direction is defined for discrete points. Nevertheless, the vector has a special meaning in physics as explained in §7. The embedding energy is given with the curvature vector, that is, ∑s=1dE(rs,rs)=−∑s=1d⟨Hs,rs⟩A, and it is invariant under the Euclidean group action. In some cases, it is convenient to suppose a metric is induced by an embedding as in Figure 2.1. For example, we can define weights by using the distance, such as

 wij:=C⋅exp⎛⎝−∥→rd(i)−→rd(j)∥2Rd2σ2⎞⎠.

Several studies show that this type of weights converges to a heat kernel on a manifold in a suitable limit [21, 3, 9]. Instead, we study another type of weights in Theorem 7.4.

Figure 2.1: Dependency of geometric objects. A dashed arrow means that its head object can be determined by its tail object, but not necessarily.
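Under the induced-metric convention above, a short numpy sketch (ours, with the free parameters C and σ from the text) builds Gaussian weights from an embedding and evaluates the curvature vector (2.20):

```python
import numpy as np

def gaussian_weights(R, sigma=1.0, C=1.0):
    """w_ij = C exp(-||r(i) - r(j)||^2 / (2 σ^2)); rows of R are coordinates."""
    d2 = ((R[:, None, :] - R[None, :, :]) ** 2).sum(axis=-1)
    W = C * np.exp(-d2 / (2 * sigma**2))
    np.fill_diagonal(W, 0.0)   # convention w_ii = 0
    return W

def curvature_vector(R, W, mu):
    """Graph curvature vector (2.20): H_s = -L r_s, columnwise over s."""
    D = np.diag(W.sum(axis=1))
    L = np.diag(1.0 / mu) @ (D - W)
    return -L @ R

R = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
W = gaussian_weights(R, sigma=0.5)
mu = W.sum(axis=1)             # take the degree measure
H = curvature_vector(R, W, mu)
# Each component H_s has zero μ-mean, since the column sums of D - W vanish.
assert np.allclose(mu @ H, 0.0)
```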

## 3 Matrix description of the geometric formulation

In this section, we give matrix description of the formulation discussed in §2.

First we remark that our formulation contains two types of parameters, in a measure and an inner product, independently. A measure is just given by a positive function on V, thus its degree of freedom is n. On the other hand, an inner product is determined by weights which satisfy wij=wji and wij≥0, hence its degree of freedom is n(n−1)/2. When they are taken on a certain relation, well-known cases appear as follows.

Let M:=diag(μ1,…,μn), D:=diag(deg(1),…,deg(n)), and W:=(wij)i,j∈V be n×n-matrices.

By the e-basis, a function f is represented as a numerical vector f=(f1,…,fn)T. Then, an inner product is written as ⟨f,g⟩A=fTMg, and the Dirichlet energy is as E(f,g)=fT(D−W)g, which does not depend on M. The Laplacian is given as L=M−1(D−W), and its eigenvalue equation is

 (D−W)f=ρMf.

The corresponding eigenvectors, denoted by an n×n-matrix U=(v1,…,vn), define the Fourier transform F[f]=UTMf and its inverse f=UF[f], where UTMU=In. An embedding is described as an n×d-matrix R, then its curvature vector is as H=−M−1(D−W)R.

In the case of the ~e-basis, since ~fi=fi√μi as mentioned in (2.13), we have ~f=M1/2f, where M1/2:=diag(√μ1,…,√μn). The Laplacian is described as M−1/2(D−W)M−1/2, because

 Lf=∑i∈V(deg(i)/μi)~fi~ei−∑i,j∈V(wij/√μiμj)~fj~ei.

When we assume μi=1 or μi=deg(i), we obtain the known Laplacians: the combinatorial Laplacian D−W, the random walk Laplacian In−D−1W, and the normalized Laplacian In−D−1/2WD−1/2. However, we do not use this configuration for random walks in §5.

The above notations are summarized in Table 3.1.
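These special cases are easy to verify directly; a numpy sketch (illustrative, not from the paper):

```python
import numpy as np

# Recovering the three known Laplacians from L = M^{-1}(D - W) and its
# ẽ-basis form M^{-1/2}(D - W)M^{-1/2} by the choice of measure μ.
W = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 0.5],
              [2.0, 0.5, 0.0]])
deg = W.sum(axis=1)
D, I = np.diag(deg), np.eye(3)

# μ_i = 1: the combinatorial Laplacian D - W.
assert np.allclose(np.diag(1.0 / np.ones(3)) @ (D - W), D - W)
# μ_i = deg(i): the random walk Laplacian I - D^{-1} W.
assert np.allclose(np.diag(1.0 / deg) @ (D - W), I - np.diag(1.0 / deg) @ W)
# ẽ-basis with μ = deg: the normalized Laplacian I - D^{-1/2} W D^{-1/2}.
S = np.diag(deg**-0.5)
assert np.allclose(S @ (D - W) @ S, I - S @ W @ S)
```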

## 4 Application I: spectral graph theory

We review basic results about eigenvalue estimation in spectral graph theory to check its compatibility with our formulation. These results are regarded as discrete analogues of spectral geometry and related to the graph cut problem in §

### 4.1 upper bound of eigenvalues

First, we estimate an upper bound for the largest eigenvalue ρn. Put δ:=maxi∈Vdeg(i)/μi.

###### Lemma 4.1.

ρn≤2δ.

###### Proof..

We have

 ρn=⟨vn,Lvn⟩A=12∑i,j∈Vwij(vn(i)−vn(j))2≤∑i,j∈Vwij(vn(i)2+vn(j)2)≤2maxi∈Vdeg(i)/μi

because ∥vn∥2A=∑i∈Vvn(i)2μi=1. ∎

Next, we give an upper bound for the second smallest eigenvalue ρ2, which is characterized as the minimum value of E(f,f)/∥f∥2A over functions f with ⟨f,1A⟩A=0. For this purpose, the isoperimetric constant β is useful, because it is defined in a similar way to the characterization of ρ2:

 β:=min∅≠A⊊Vvol(∂A)/vol(A)=min∅≠A⊊VE(χA,χA)/∥χA∥2A, (4.2)

where χA is the characteristic function of A, and the minimum is taken over all subsets satisfying vol(A)≤vol(V)/2. From (2.11), we can check E(χA,χA)=∑i∈A,j∈Acwij, then vol(∂A)=vol(∂Ac) for the complement Ac:=V∖A.

###### Theorem 4.3.

ρ2≤2β.

###### Proof..

For any A such that vol(A)≤vol(V)/2, put fA:=vol(Ac)χA−vol(A)χAc, which satisfies ⟨fA,1A⟩A=0, then we have

 E(fA,fA) =∑i∈A,j∈Acwij(vol(A)+vol(Ac))2=vol(V)2vol(∂A), ∥fA∥2A =vol(A)vol(Ac)2+vol(A)2vol(Ac)=vol(V)vol(A)vol(Ac).

Hence, we obtain ρ2≤E(fA,fA)/∥fA∥2A=vol(V)vol(∂A)/(vol(A)vol(Ac))≤2vol(∂A)/vol(A) by vol(Ac)≥vol(V)/2, and taking the minimum over A yields the claim. ∎

### 4.2 lower bound of eigenvalues

Here, we estimate a lower bound for the second smallest eigenvalue ρ2.

###### Theorem 4.4.

β2/(2δ)≤ρ2.

###### Proof..

First we claim

 β∫Vf(x)dμx≤12∑i,j∈Vwij|fi−fj|, (4.5)

for a non-negative function f such that vol(supp(f))≤vol(V)/2. We take a decreasing sequence of subsets A1⊃A2⊃⋯⊃Al so that f is represented as f=∑s=1lhsχAs by hs>0. Then, we can see

 12∑i,j∈Vwij|fi−fj| =l∑s=1∑i∈As,j∈Acswijhs =l∑s=1hsvol(∂As)≥βl∑s=1hs∫VχAs(x)dμx,

as required. Now, we can write v2=g+−g− by non-negative functions g+,g− which satisfy vol(supp(g+))≤vol(V)/2, replacing v2 by −v2 if necessary. It is easy to check E(g+,g+)≤ρ2∥g+∥2A, then applying (4.5) to (g+)2, we obtain

 β2∥g+∥4A ≤14(∑i,j∈Vwij|g+(i)−g+(j)|⋅|g+(i)+g+(j)|)2 ≤14(∑i,j∈Vwij|g+(i)−g+(j)|2)⋅(∑i,j∈Vwij|g+(i)+g+(j)|2) ≤12E(g+,g+)⋅2∑i,j∈Vwij(g+(i)2+g+(j)2)≤2ρ2δ∥g+∥4A.

In the second inequality, we used the Cauchy-Schwarz inequality (∑aijbij)2≤(∑a2ij)(∑b2ij) for aij=√wij|g+(i)−g+(j)| and bij=√wij|g+(i)+g+(j)|. ∎

This theorem is called Cheeger’s inequality, and its continuous analogue is known in differential geometry.
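On small graphs, the constant in (4.2) can be computed by brute force over subsets, and the two eigenvalue bounds of this section checked numerically. A sketch (ours; we take μ=deg, so δ=1 and the bounds read β²/2 ≤ ρ₂ ≤ 2β):

```python
import numpy as np
from itertools import combinations

def isoperimetric_constant(W, mu):
    """Brute-force β of (4.2) over subsets A with vol(A) ≤ vol(V)/2."""
    n = len(mu)
    half = mu.sum() / 2 + 1e-12      # small slack against float noise
    best = np.inf
    for size in range(1, n):
        for A in combinations(range(n), size):
            Ac = [i for i in range(n) if i not in A]
            volA = mu[list(A)].sum()
            if volA <= half:
                best = min(best, W[np.ix_(list(A), Ac)].sum() / volA)
    return best

W = np.array([[0.0, 1.0, 0.2, 0.0],
              [1.0, 0.0, 0.0, 0.3],
              [0.2, 0.0, 0.0, 1.0],
              [0.0, 0.3, 1.0, 0.0]])
deg = W.sum(axis=1)
mu = deg                              # degree measure, so δ = max deg(i)/μ_i = 1
beta = isoperimetric_constant(W, mu)
S = np.diag(mu**-0.5)
rho = np.linalg.eigvalsh(S @ (np.diag(deg) - W) @ S)
assert beta**2 / 2 <= rho[1] <= 2 * beta + 1e-12   # β²/(2δ) ≤ ρ₂ ≤ 2β
```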

## 5 Application II: random walks

In this section, we deduce some notations of random walks from our formulation. In §5.1, we review random walks briefly, and in §5.2, we consider their connection with a geometric distance.

### 5.1 heat equations

For c>0, putting Sc:=id−(1/c)L, we have

 Scf=∑i∈V((cμi−deg(i))/cμi)fiei+∑i,j∈V(wij/cμi)fjei=∑i,j∈V(θij/cμi)fjei, (5.1)

for f∈A, where we put θii:=cμi−deg(i) and θij:=wij for i≠j. Needless to say, the eigenvalue decomposition of Sc is given as

 Scvi=λivi, λi=1−ρic, 1=λ1≥λ2≥⋯≥λn≥1−2δc,

by (2.16) and Lemma 4.1. Besides, Sc is described as the matrix

 Sc=In−(1/c)M−1(D−W). (5.2)
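A numpy sketch (ours, illustrative names) checks that for c ≥ δ the matrix θij/(cμi) of Sc is row-stochastic, i.e. a random walk kernel, and that its spectrum is λi=1−ρi/c:

```python
import numpy as np

W = np.array([[0.0, 1.0, 2.0], [1.0, 0.0, 0.5], [2.0, 0.5, 0.0]])
deg = W.sum(axis=1)
mu = np.array([1.0, 2.0, 1.5])
D, M_inv = np.diag(deg), np.diag(1.0 / mu)
delta = (deg / mu).max()
c = delta                          # the borderline choice keeps entries ≥ 0
Sc = np.eye(3) - (1.0 / c) * M_inv @ (D - W)
assert np.all(Sc >= -1e-12)        # non-negative entries for c ≥ δ
assert np.allclose(Sc.sum(axis=1), 1.0)   # rows sum to one: a Markov kernel
# Spectrum: Sc is similar to a symmetric matrix, eigenvalues λ_i = 1 - ρ_i/c.
S = np.diag(mu**-0.5)
rho = np.linalg.eigvalsh(S @ (D - W) @ S)
lam = np.sort(np.linalg.eigvals(Sc).real)[::-1]
assert np.allclose(lam, 1.0 - rho / c)
```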
###### Proposition 5.3.

For c>0 and t≥0, define operators

 Pk:=Skc,  Qt:=∞∑k=0(e−ttk/k!)Pk