1.1 Graph Fourier transform
The definition of the graph Fourier transform plays a central role in graph signal processing. Via the Fourier transform, a graph signal is decomposed into different spectral components and can thus be analyzed in the Fourier domain. The most popular definition of the graph Fourier transform is through the eigenvectors of the graph Laplacian matrix. Although this definition is adopted by many researchers, it has some limitations. First, the definition only applies to undirected graphs. Second, the computation of the Laplacian eigenvectors is rather expensive when the graph is large. Therefore, it is desirable to find an alternative definition of the graph Fourier transform without these disadvantages.
One basic requirement for the Fourier basis is that the basis vectors should represent a range of different oscillating frequencies. For a time-domain signal, the classical Fourier transform decomposes it into different frequency components. Likewise, in the graph setting, one expects the graph Fourier basis to have a similar property, i.e., the basis vectors represent different oscillating frequencies. Generally speaking, the magnitude of oscillation of a signal can be measured by its variation. In fact, the $\ell_2$-norm variation of the Laplacian eigenvector $u_k$ is characterized by the corresponding eigenvalue $\lambda_k$. When the eigenvalues are arranged in ascending order $0=\lambda_1\le\lambda_2\le\cdots\le\lambda_N$, the variation of the eigenvector $u_k$ is increasing with $k$, thus representing a range of frequencies from low to high. Moreover, the eigenvector $u_k$ minimizes the $\ell_2$-norm variation in the subspace orthogonal to the span of the previous eigenvectors $u_1,\dots,u_{k-1}$.
Recently, Sardellitti et al. proposed a definition of the directed graph Fourier basis as the set of orthogonal vectors minimizing the graph directed variation, and proposed two algorithms (SOC and PAMAL) to solve the related optimization problem Sardellitti2017 . However, there is a lack of theoretical analysis of the proposed Fourier basis, and the computational complexity of the proposed algorithms is rather high. Slightly different from Sardellitti's approach, we propose a definition of the Fourier basis based on iteratively solving a sequence of $\ell_1$-norm variation minimization problems. We rigorously prove a necessary condition satisfied by the proposed Fourier basis. Further, we provide a fast greedy algorithm to approximately construct the Fourier basis. Numerical experiments show that the algorithm is effective, and that the Fourier coefficients under the greedy basis and the Laplacian basis have nearly the same rate of decay for simulated and real signals.
The rest of the paper is organized as follows. In Section 2, we discuss the relation between the graph Fourier basis and signal variation, and propose the definition of the Fourier basis based on $\ell_1$-norm variation minimization. In Section 3, we prove a necessary condition of the $\ell_1$ Fourier basis, showing that the $k$th basis vector $v_k$'s components have at most $k$ different values. In Section 4, we provide a greedy algorithm to construct an approximate basis. In Section 5, we present some numerical results. Section 6 concludes the paper.
In this paper we use the following notations.
For a matrix $A\in\mathbb{R}^{m\times n}$, $\mathcal{R}(A)$ denotes its column space, i.e., $\mathcal{R}(A)=\{Ax : x\in\mathbb{R}^n\}$; and $\ker(A)$ denotes its kernel, i.e., $\ker(A)=\{x\in\mathbb{R}^n : Ax=0\}$.
For a vector $x\in\mathbb{R}^n$, $\|x\|$ denotes its Euclidean norm, i.e., $\|x\|=(\sum_{i=1}^n x_i^2)^{1/2}$. For a matrix $A$, $\|A\|$ denotes its operator norm, i.e., $\|A\|=\max_{\|x\|=1}\|Ax\|$. Denote by $B(x,\delta)$ the open ball centered at $x$ with radius $\delta$.
The cardinality of a set $S$ is denoted by $|S|$. Let $N$ be a positive integer, and $[N]=\{1,2,\dots,N\}$. For any $A\subseteq[N]$, we use $\mathbf{1}_A$ to denote the indicator vector of $A$, i.e., $(\mathbf{1}_A)_i=1$ if $i\in A$ and $(\mathbf{1}_A)_i=0$ otherwise. $\mathbf{1}_{[N]}$ is also written as $\mathbf{1}$.
For a weight matrix $W=(w_{ij})\in\mathbb{R}^{N\times N}$ and subsets $A,B\subseteq[N]$, the mutual weight $W(A,B)$ is defined as $W(A,B)=\sum_{i\in A}\sum_{j\in B}w_{ij}$.
2 Graph Fourier basis and signal variation
In this section, we shall derive the relationship between the graph Fourier basis and signal variation. Let us begin with the basic terminology of graph signal processing. Let $G=(V,W)$ be a connected, undirected, and weighted graph, where $V=[N]$ is the vertex set and $W=(w_{ij})\in\mathbb{R}^{N\times N}$ is the weight matrix satisfying $w_{ij}=w_{ji}\ge 0$ and $w_{ii}=0$. The degree of a vertex $i$ is defined as $d_i=\sum_{j=1}^N w_{ij}$, and the degree matrix is $D=\operatorname{diag}(d_1,\dots,d_N)$. The combinatorial Laplacian matrix is defined as $L=D-W$. Since $L$ is symmetric and positive semi-definite, it has eigenvalues $0=\lambda_1\le\lambda_2\le\cdots\le\lambda_N$ and a corresponding set of orthonormal eigenvectors $\{u_1,\dots,u_N\}$. We call $U=[u_1,\dots,u_N]$ the Laplacian basis of $G$. A graph signal is a real-valued function defined on $V$, and can be regarded as a vector $f\in\mathbb{R}^N$. The Fourier transform of $f$ under the Laplacian basis is defined as $\hat f=U^{\mathrm{T}}f$.
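As a concrete illustration, the Laplacian basis and the Fourier transform can be computed in a few lines of NumPy; the small weight matrix and signal below are our own toy example, not taken from the paper's experiments:

```python
import numpy as np

# Small undirected weighted graph on N = 4 vertices (weights are our choice).
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 2],
              [0, 0, 2, 0]], dtype=float)

D = np.diag(W.sum(axis=1))           # degree matrix
L = D - W                            # combinatorial Laplacian

# eigh returns eigenvalues in ascending order with orthonormal eigenvectors.
lam, U = np.linalg.eigh(L)

f = np.array([1.0, 2.0, 0.5, -1.0])  # a graph signal
f_hat = U.T @ f                      # graph Fourier transform
f_rec = U @ f_hat                    # inverse transform

print(np.allclose(f, f_rec))         # True: perfect reconstruction
print(np.isclose(lam[0], 0.0, atol=1e-8))  # True: smallest eigenvalue is 0
```

Since the graph is connected, the zero eigenvalue is simple and its eigenvector is the constant vector.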
Note that the $\ell_2$-norm variation of the Laplacian eigenvector $u_k$ is increasing with $k$. To see this, define the $\ell_2$-norm variation of a signal $f$ as

$$S_2(f)=\Big(\frac{1}{2}\sum_{i,j=1}^N w_{ij}(f_i-f_j)^2\Big)^{1/2},$$

then it can be proved that

$$S_2(f)^2=f^{\mathrm{T}}Lf.$$

That means the quadratic form $f^{\mathrm{T}}Lf$ exactly measures the $\ell_2$-norm variation of $f$. Since $Lu_k=\lambda_ku_k$, we have

$$S_2(u_k)^2=u_k^{\mathrm{T}}Lu_k=\lambda_k\le\lambda_{k+1}=S_2(u_{k+1})^2,$$

i.e., the $\ell_2$-norm variation of $u_k$ is increasing with $k$. In other words, the Laplacian basis vectors represent a range of frequencies from low to high.
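The quadratic-form identity can be verified numerically; the random symmetric weights below are only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6
W = rng.random((N, N)); W = (W + W.T) / 2; np.fill_diagonal(W, 0)  # random symmetric weights
L = np.diag(W.sum(axis=1)) - W

f = rng.standard_normal(N)
quad = f @ L @ f                                    # quadratic form f^T L f
var2 = 0.5 * sum(W[i, j] * (f[i] - f[j]) ** 2
                 for i in range(N) for j in range(N))
print(np.isclose(quad, var2))                       # True: the identity holds
```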
Furthermore, the eigenvector $u_k$ minimizes the $\ell_2$-norm variation in the subspace orthogonal to the span of the previous eigenvectors, i.e.,

$$u_k=\arg\min\big\{S_2(f)^2 : \|f\|=1,\ f\perp u_1,\dots,u_{k-1}\big\}. \qquad (2)$$

In fact, let $f$ satisfy $\|f\|=1$ and $f\perp u_1,\dots,u_{k-1}$. Let the Fourier transform of $f$ be $\hat f=U^{\mathrm{T}}f$. Then $f$ can be expressed as $f=\sum_{i=k}^N\hat f_iu_i$, hence

$$f^{\mathrm{T}}Lf=\sum_{i=k}^N\lambda_i\hat f_i^2\ge\lambda_k\sum_{i=k}^N\hat f_i^2=\lambda_k=u_k^{\mathrm{T}}Lu_k.$$

Therefore the eigenvector $u_k$ solves the $\ell_2$-norm variation minimization problem (2) for each $k=1,\dots,N$.
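This minimization property is easy to probe numerically: random unit vectors orthogonal to $u_1,\dots,u_{k-1}$ never achieve a smaller quadratic form than $\lambda_k$. The random graph below is only for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8
W = rng.random((N, N)); W = (W + W.T) / 2; np.fill_diagonal(W, 0)
L = np.diag(W.sum(axis=1)) - W
lam, U = np.linalg.eigh(L)

k = 3  # probe the k-th eigenvector
vals = []
for _ in range(200):
    f = rng.standard_normal(N)
    f -= U[:, :k-1] @ (U[:, :k-1].T @ f)   # project out u_1, ..., u_{k-1}
    f /= np.linalg.norm(f)
    vals.append(f @ L @ f)

# no sampled unit vector in the orthogonal complement beats lambda_k
print(min(vals) >= lam[k-1] - 1e-9)        # True
```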
It is natural to consider the more general $\ell_p$-norm variation. In this paper, we restrict ourselves to the $\ell_1$-norm variation defined as follows:

$$S_1(f)=\frac{1}{2}\sum_{i,j=1}^N w_{ij}|f_i-f_j|.$$

Similar to the Laplacian basis minimizing the $\ell_2$-norm variation, we define the $\ell_1$ Fourier basis as the solution of an $\ell_1$-norm variation minimization problem.
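Under this convention (the $1/2$ factor compensates the double count over ordered pairs), the $\ell_1$-norm variation is a one-liner; the path graph and the two signals below are our own examples:

```python
import numpy as np

def l1_variation(f, W):
    """S_1(f) = 1/2 * sum_{i,j} w_ij * |f_i - f_j|."""
    return 0.5 * np.sum(W * np.abs(f[:, None] - f[None, :]))

# path graph on 4 vertices with unit weights
W = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0

smooth = np.array([0.0, 1.0, 2.0, 3.0])   # monotone signal
wiggly = np.array([0.0, 3.0, 0.0, 3.0])   # oscillating signal
print(l1_variation(smooth, W))            # 3.0
print(l1_variation(wiggly, W))            # 9.0
```

As expected, the oscillating signal has a much larger variation than the monotone one.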
Let $v_1=\mathbf{1}/\sqrt{N}$. If a sequence of vectors $v_2,\dots,v_N$ solves the $\ell_1$-norm variation minimization problems

$$v_k=\arg\min\big\{S_1(f) : \|f\|=1,\ f\perp v_1,\dots,v_{k-1}\big\},\quad k=2,\dots,N, \qquad (4)$$

then we say the orthogonal matrix $V=[v_1,\dots,v_N]$ constitutes an $\ell_1$ Fourier basis, or simply an $\ell_1$ basis, of the graph $G$.
Remarks: The above definition of the $\ell_1$ Fourier basis can be extended to directed graphs. All one needs is to replace $S_1(f)$ in the minimization problem by a directed version

$$\mathrm{DV}(f)=\sum_{i,j=1}^N w_{ij}[f_i-f_j]_+,$$

where $[t]_+=\max\{t,0\}$ (more details can be found in Sardellitti2017 ). Then one can similarly define the directed $\ell_1$ Fourier basis as the solution of the corresponding problem. Without loss of generality, we only consider undirected graphs in this paper. Most results can be generalized to the directed case without essential difficulties.
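A minimal sketch of the directed variation follows; note that the sign convention inside $[\cdot]_+$ varies across references, so the choice below is an assumption, and the two-node graph is our own toy example:

```python
import numpy as np

def directed_variation(f, W):
    """DV(f) = sum_{i,j} w_ij * max(f_i - f_j, 0).
    The sign convention is one common choice and may differ from the cited reference."""
    return np.sum(W * np.maximum(f[:, None] - f[None, :], 0.0))

# a directed graph with a single edge 0 -> 1 of weight 1 (toy example)
W = np.array([[0.0, 1.0],
              [0.0, 0.0]])
print(directed_variation(np.array([1.0, 0.0]), W))  # 1.0: signal drops along the edge
print(directed_variation(np.array([0.0, 1.0]), W))  # 0.0: signal rises along the edge
```

Unlike $S_1$, the directed variation is not symmetric under sign flips of the signal, which is exactly what makes it sensitive to edge orientation.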
3 Necessary condition of $\ell_1$ Fourier basis
To study problem (4) for a fixed $k$, we write it in the following generic form:

$$\min_f\ S_1(f)\quad\text{s.t.}\ \|f\|=1,\ A^{\mathrm{T}}f=0,$$

where $A\in\mathbb{R}^{N\times(k-1)}$ is a matrix with its first column being $v_1=\mathbf{1}/\sqrt{N}$, orthonormal columns $v_1,\dots,v_{k-1}$, and $2\le k\le N$. With this notation, problem (4) can be referred to as $P(A)$. Now our goal is to solve problem $P(A)$.
First, let us recall some basic definitions of optimization theory. Denote the feasible region of problem $P(A)$ by $\mathcal{F}$, i.e.,

$$\mathcal{F}=\{f\in\mathbb{R}^N : \|f\|=1,\ A^{\mathrm{T}}f=0\}.$$
A point $f^*\in\mathcal{F}$ is called a local minimum of problem $P(A)$ if there exists $\delta>0$ such that $S_1(f^*)\le S_1(f)$ for any $f\in\mathcal{F}\cap B(f^*,\delta)$. If $S_1(f^*)\le S_1(f)$ holds for any $f\in\mathcal{F}$, then $f^*$ is called a global minimum of problem $P(A)$. Obviously a global minimum is necessarily a local minimum. We denote the set of all local minima of problem $P(A)$ by $\mathcal{M}$.
Due to the sphere constraint $\|f\|=1$, problem $P(A)$ is not a convex optimization problem. As far as we know, there are no general results about the global minimum of such problems, and in most cases it is only possible to approach a local minimum by iterative algorithms Bresson2012 ; Lai2014 . As the main result of this section, we shall prove a necessary condition satisfied by the local minima (Theorem 4). The key ingredient of the proof is the concept of piecewise representation, which is introduced as follows.
Suppose $f\in\mathbb{R}^N$. Let $c_1<c_2<\cdots<c_m$ be the distinct values of $f$'s components and $A_t=\{i\in[N] : f_i=c_t\}$ for $t=1,\dots,m$. Then $f$ can be rewritten as $f=\sum_{t=1}^m c_t\mathbf{1}_{A_t}$, where $A_1,\dots,A_m$ form a partition of $[N]$. Let $B=[\mathbf{1}_{A_1},\dots,\mathbf{1}_{A_m}]$ and $c=(c_1,\dots,c_m)^{\mathrm{T}}$. Then $f=Bc$, which is called the piecewise representation of $f$. We also call $B$ the partition matrix of $f$, denoted by $B=\mathcal{P}(f)$.
It is easy to see that any vector in $\mathbb{R}^N$ has a unique piecewise representation. Under the piecewise representation $f=Bc$, the $\ell_1$-norm variation can be simplified to a linear form in a local neighborhood of $c$.
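The piecewise representation is straightforward to compute; the sketch below (the function name is ours) recovers $B$ and $c$ from $f$:

```python
import numpy as np

def piecewise_representation(f):
    """Return (B, c) with f = B @ c, where c holds the distinct values of f
    in ascending order and B's columns are indicators of the level sets."""
    c, inverse = np.unique(f, return_inverse=True)  # distinct values, ascending
    B = np.zeros((f.size, c.size))
    B[np.arange(f.size), inverse] = 1.0             # one-hot group membership
    return B, c

f = np.array([0.5, -1.0, 0.5, 2.0, -1.0])
B, c = piecewise_representation(f)
print(np.allclose(B @ c, f))   # True
print(B.shape)                 # (5, 3): three distinct values
```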
Lemma 3. Suppose $f=Bc$ is the piecewise representation of $f$, with $B=[\mathbf{1}_{A_1},\dots,\mathbf{1}_{A_m}]$ and $c_1<\cdots<c_m$. Then there exists $\delta>0$ such that

$$S_1(Bx)=g^{\mathrm{T}}x,\quad\forall x\in B(c,\delta),$$

where $g\in\mathbb{R}^m$ is defined by

$$g_t=\sum_{s=1}^{t-1}W(A_s,A_t)-\sum_{s=t+1}^{m}W(A_t,A_s),\quad t=1,\dots,m.$$

Proof. Since $c_1<\cdots<c_m$, when $\delta$ is sufficiently small the ordering of the components is preserved, i.e., there exists $\delta>0$ such that for all $x\in B(c,\delta)$ we have $x_1<x_2<\cdots<x_m$, and $Bx$ is a piecewise representation. Therefore

$$S_1(Bx)=\frac{1}{2}\sum_{s,t=1}^m W(A_s,A_t)|x_s-x_t|=\sum_{s<t}W(A_s,A_t)(x_t-x_s)=g^{\mathrm{T}}x.$$
∎
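The linear form of Lemma 3 can be checked numerically; the partition and weights below are our own example, and `g` is built exactly as in the lemma:

```python
import numpy as np

def l1_variation(f, W):
    return 0.5 * np.sum(W * np.abs(f[:, None] - f[None, :]))

def mutual_weight(W, A, B):
    return sum(W[i, j] for i in A for j in B)

rng = np.random.default_rng(2)
N = 6
W = rng.random((N, N)); W = (W + W.T) / 2; np.fill_diagonal(W, 0)

groups = [[0, 3], [1, 4], [2, 5]]   # partition A_1, A_2, A_3 (our choice)
m = len(groups)
g = np.array([sum(mutual_weight(W, groups[s], groups[t]) for s in range(t))
              - sum(mutual_weight(W, groups[t], groups[s]) for s in range(t + 1, m))
              for t in range(m)])

x = np.array([-1.0, 0.2, 1.5])      # strictly increasing values
f = np.zeros(N)
for t, grp in enumerate(groups):
    f[grp] = x[t]
print(np.isclose(l1_variation(f, W), g @ x))  # True: the linear form matches
```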
Theorem 4. If $f\in\mathcal{M}$ and $B=\mathcal{P}(f)$, then

$$\dim\ker(A^{\mathrm{T}}B)=1. \qquad (10)$$

Equivalently, since $A^{\mathrm{T}}Bc=A^{\mathrm{T}}f=0$ and $c\neq 0$, condition (10) says $\ker(A^{\mathrm{T}}B)=\operatorname{span}\{c\}$.
Proof. The main idea is to transform problem $P(A)$ to an easier one by using Lemma 3. Suppose $f\in\mathcal{M}$ and $f=Bc$ is its piecewise representation, with $B$ having $m$ columns. By the assumption of problem $P(A)$, we have $\mathbf{1}\in\mathcal{R}(A)$ and $A^{\mathrm{T}}f=0$, therefore $f$ is a non-constant signal, i.e., $m\ge 2$. Since $f$ is a local minimum of $P(A)$, there exists $\delta_0>0$ such that

$$S_1(f)\le S_1(h),\quad\forall h\in\mathcal{F}\cap B(f,\delta_0).$$

By Lemma 3, there exist $\delta>0$ and $g\in\mathbb{R}^m$ such that $S_1(Bx)=g^{\mathrm{T}}x$ for all $x\in B(c,\delta)$. Let $D_B=B^{\mathrm{T}}B=\operatorname{diag}(|A_1|,\dots,|A_m|)$; then $\|Bx\|=1$ is equivalent to $x^{\mathrm{T}}D_Bx=1$, and $A^{\mathrm{T}}Bx=0$ is equivalent to $x\in\ker(A^{\mathrm{T}}B)$. Therefore $c$ is a local minimum of

$$\min_x\ g^{\mathrm{T}}x\quad\text{s.t.}\ x^{\mathrm{T}}D_Bx=1,\ x\in\ker(A^{\mathrm{T}}B).$$

Suppose $\dim\ker(A^{\mathrm{T}}B)=r$, and $Q=[q_1,\dots,q_r]$ is an orthonormal basis of $\ker(A^{\mathrm{T}}B)$. Define $\tilde g=Q^{\mathrm{T}}g$, $\tilde D=Q^{\mathrm{T}}D_BQ$, and $x=Qy$. Then we have

$$\min_y\ \tilde g^{\mathrm{T}}y\quad\text{s.t.}\ y^{\mathrm{T}}\tilde Dy=1. \qquad (13)$$

We next prove that problem (13) has a local minimum only if $r=1$. It is proved by contradiction.
Suppose $r\ge 2$ and $y^*$ is a local minimum of problem (13). By the method of Lagrange multipliers, $y^*$ satisfies the equation

$$\tilde g=2\mu\tilde Dy^*,$$

where $\mu$ is a Lagrange multiplier. Thus $\tilde g^{\mathrm{T}}y^*=2\mu(y^*)^{\mathrm{T}}\tilde Dy^*=2\mu$. Note that $\tilde g^{\mathrm{T}}y^*=S_1(f)>0$ since $f$ is non-constant and the graph is connected, hence $\mu>0$.
Since $r\ge 2$, there exists a nonzero vector $z$ such that $z^{\mathrm{T}}\tilde Dy^*=0$. Then $\tilde g^{\mathrm{T}}z=2\mu(y^*)^{\mathrm{T}}\tilde Dz=0$, and for $\epsilon\neq 0$,

$$(y^*+\epsilon z)^{\mathrm{T}}\tilde D(y^*+\epsilon z)=1+\epsilon^2z^{\mathrm{T}}\tilde Dz>1,$$

since $\tilde D$ is symmetric and positive definite.
Let $y(\epsilon)=(y^*+\epsilon z)/\sqrt{1+\epsilon^2z^{\mathrm{T}}\tilde Dz}$, then $y(\epsilon)^{\mathrm{T}}\tilde Dy(\epsilon)=1$. Choose $\epsilon$ small enough to guarantee that $y(\epsilon)$ stays in the neighborhood where the local minimality of $y^*$ holds. Since $\mu>0$, we have

$$\tilde g^{\mathrm{T}}y(\epsilon)=\frac{2\mu}{\sqrt{1+\epsilon^2z^{\mathrm{T}}\tilde Dz}}<2\mu=\tilde g^{\mathrm{T}}y^*,$$

which contradicts $y^*$ being the minimum of problem (13). The proof is complete. ∎
As a corollary of Theorem 4 (condition (10)), we deduce an estimate of the number of values of the components of a local minimum.
Corollary 5. If $f\in\mathcal{M}$, then the components of $f$ have at most $k$ different values.
Proof. Let $B=\mathcal{P}(f)$ have $m$ columns. By Theorem 4, $\dim\ker(A^{\mathrm{T}}B)=1$. Since

$$\dim\ker(A^{\mathrm{T}}B)=m-\operatorname{rank}(A^{\mathrm{T}}B)\ge m-(k-1),$$

we have $m\le k$. By the definition of piecewise representation, $m$ is the number of different values of $f$'s components. The proof is complete. ∎
Corollary 5 asserts that the $k$th basis vector $v_k$, as the global (hence local) minimum of problem $P([v_1,\dots,v_{k-1}])$, is at most a $k$-valued signal. In particular, $v_1$ is a constant signal and $v_2$ is exactly a two-valued signal. Intuitively speaking, the larger $k$ is, the more values $v_k$ can take, and the more oscillation might be present. Thus the $\ell_1$ basis vectors represent different oscillation frequencies from low to high, as expected.
For any vector $f\in\mathbb{R}^N$, its partition matrix $\mathcal{P}(f)$ has at most $N$ columns, and each entry of $\mathcal{P}(f)$ is either $0$ or $1$. Therefore the set of all partition matrices of vectors in $\mathbb{R}^N$ is a finite set, so the subset

$$\mathcal{B}=\{\mathcal{P}(f) : f\in\mathbb{R}^N,\ \dim\ker(A^{\mathrm{T}}\mathcal{P}(f))=1\}$$

is also finite.
By Theorem 4, if $f$ is a local minimum of problem $P(A)$, then its partition matrix belongs to $\mathcal{B}$. Conversely, given a partition matrix $B\in\mathcal{B}$, we show that there are only two $f\in\mathcal{F}$ with partition matrix equal to $B$.
Theorem 6. If $f_1,f_2\in\mathcal{F}$ and $\mathcal{P}(f_1)=\mathcal{P}(f_2)=B\in\mathcal{B}$, then $f_1=\pm f_2$.
Proof. Since $\mathcal{P}(f_1)=\mathcal{P}(f_2)=B$, there exist $c^{(1)},c^{(2)}$ such that $f_1=Bc^{(1)}$ and $f_2=Bc^{(2)}$. Then $A^{\mathrm{T}}Bc^{(1)}=0$ and $A^{\mathrm{T}}Bc^{(2)}=0$, i.e., $c^{(1)},c^{(2)}\in\ker(A^{\mathrm{T}}B)$. Since $\dim\ker(A^{\mathrm{T}}B)=1$ and $c^{(1)}\neq 0$, there exists $t\in\mathbb{R}$ such that $c^{(2)}=tc^{(1)}$, hence $f_2=tf_1$. From $\|f_1\|=\|f_2\|=1$, we have $t=\pm 1$. The proof is complete. ∎
By Theorem 6, for each $B\in\mathcal{B}$ the set $\{f\in\mathcal{F} : \mathcal{P}(f)=B\}$ has at most two elements in total, which differ by a sign. Let

$$\mathcal{C}=\{f\in\mathcal{F} : \dim\ker(A^{\mathrm{T}}\mathcal{P}(f))=1\}.$$

Then $|\mathcal{C}|\le 2|\mathcal{B}|$, i.e., $\mathcal{C}$ is a finite set.
The local minima set $\mathcal{M}$ is a subset of $\mathcal{C}$. In fact, if $f\in\mathcal{M}$, then $f\in\mathcal{F}$ and, by Theorem 4, $\dim\ker(A^{\mathrm{T}}\mathcal{P}(f))=1$, hence $f\in\mathcal{C}$. It follows that $\mathcal{M}$ is also a finite set, i.e., each local minimum is isolated and the total number of local minima is finite. Figure 1 shows the relations between these sets and definitions. Here $\mathcal{C}$ resembles the concept of critical points, which contains but does not equal the set of local minima.
Since $\mathcal{C}$ is finite, to find the global minimum of problem $P(A)$, one way is to compute $S_1(f)$ for all $f$ in $\mathcal{C}$ and pick out the smallest one. Table 1 shows a special case for small $N$ and $k$.
Through this method of enumeration, the continuous problem $P(A)$ is equivalent to a discrete problem in which the variable belongs to the finite set $\mathcal{C}$. However, as far as we know, the discrete problem has no efficient algorithm, since the size of $\mathcal{C}$ grows exponentially with $N$, and the method of enumeration is impractical for large $N$. In the next section, we will give a fast greedy algorithm to approximately construct the $\ell_1$ Fourier basis when $N$ is large.
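To make the enumeration concrete for the smallest nontrivial case $k=2$: by Corollary 5 the minimizer $v_2$ is two-valued, so it suffices to scan the two-set partitions $\{A,A^{c}\}$ of the vertices. The toy graph below (two triangles joined by a weak bridge) is our own construction:

```python
import numpy as np
from itertools import combinations

def l1_variation(f, W):
    return 0.5 * np.sum(W * np.abs(f[:, None] - f[None, :]))

# two triangles joined by a weak bridge (our toy graph)
N = 6
W = np.zeros((N, N))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

best, best_f = np.inf, None
for r in range(1, N // 2 + 1):
    for A in combinations(range(N), r):
        A = list(A)
        comp = [i for i in range(N) if i not in A]
        a = np.sqrt(len(comp) / (len(A) * N))    # unit norm, orthogonal to 1
        b = -np.sqrt(len(A) / (len(comp) * N))
        f = np.zeros(N); f[A], f[comp] = a, b
        s = l1_variation(f, W)
        if s < best:
            best, best_f = s, f

print(sorted(int(i) for i in np.flatnonzero(best_f > 0)))  # [0, 1, 2]
```

The minimizer separates the two triangles across the weak bridge, as intuition suggests.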
4 Greedy algorithm for $\ell_1$ Fourier basis
In this section, we provide a fast greedy algorithm to approximately construct the $\ell_1$ Fourier basis. Through the piecewise representation, the partition matrix of the $k$th basis vector $v_k$ naturally induces a partition of the vertex set $V$. The increase of variation with $k$ implies that the corresponding partition evolves from coarser to finer scales. Conversely, given a sequence of partitions varying across different scales, one might be able to construct an orthonormal basis close to an $\ell_1$ basis. Motivated by this idea, we propose a greedy algorithm based on a partition sequence created by iteratively grouping the vertices. In each step, we pick out the two groups of vertices with the largest mutual weight between them, and combine them into a new group. Repeating the process, we get a sequence of partitions varying from finer to coarser scales. Based on this partition sequence, we define a sequence of nested subspaces of $\mathbb{R}^N$. By using ideas similar to multi-resolution analysis, we obtain an orthonormal basis.
4.1 Greedy partition sequence
We define a sequence of partitions on the vertex set $V$ as follows.
Definition 7. Let $\mathcal{V}_N=\{\{1\},\{2\},\dots,\{N\}\}$. For $k=N,N-1,\dots,2$, suppose $\mathcal{V}_k=\{V_1,\dots,V_k\}$ has been defined; let

$$(s^*,t^*)=\arg\max_{1\le s<t\le k}W(V_s,V_t),$$

and define

$$\mathcal{V}_{k-1}=\big(\mathcal{V}_k\setminus\{V_{s^*},V_{t^*}\}\big)\cup\{V_{s^*}\cup V_{t^*}\}.$$

We call $\mathcal{V}_N,\mathcal{V}_{N-1},\dots,\mathcal{V}_1$ the greedy partition sequence of $G$.
Definition 7 actually represents a vertex-grouping process. At the beginning, the finest partition $\mathcal{V}_N$ has $N$ groups, each group having one vertex. To get the next partition $\mathcal{V}_{k-1}$, we identify $V_{s^*}$ and $V_{t^*}$ as the two groups having the largest mutual weight. Then we combine $V_{s^*}$ and $V_{t^*}$ into a new group $V_{s^*}\cup V_{t^*}$, which together with the other groups in $\mathcal{V}_k$ forms the new partition $\mathcal{V}_{k-1}$. This operation repeats $N-1$ times. At the end, we get the coarsest partition $\mathcal{V}_1=\{V\}$, with all the vertices belonging to a single group. See Figure 2 for an illustration.
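The grouping process can be sketched directly in code; the function name and the toy graph are ours:

```python
import numpy as np

def greedy_partition_sequence(W):
    """Return [V_N, ..., V_1], where V_k is a list of k groups (lists of vertices),
    built by repeatedly merging the two groups with the largest mutual weight."""
    N = W.shape[0]
    current = [[i] for i in range(N)]
    sequence = [[list(g) for g in current]]
    while len(current) > 1:
        best, pair = -np.inf, None
        for s in range(len(current)):
            for t in range(s + 1, len(current)):
                w = sum(W[i, j] for i in current[s] for j in current[t])
                if w > best:
                    best, pair = w, (s, t)
        s, t = pair
        merged = current[s] + current[t]
        current = [g for idx, g in enumerate(current) if idx not in (s, t)] + [merged]
        sequence.append([list(g) for g in current])
    return sequence

# two triangles joined by a weak bridge (our toy graph)
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    W[i, j] = W[j, i] = 1.0
W[2, 3] = W[3, 2] = 0.1

seq = greedy_partition_sequence(W)
print(len(seq))                       # 6: partitions with 6, 5, ..., 1 groups
print(sorted(map(sorted, seq[-2])))   # [[0, 1, 2], [3, 4, 5]]
```

As expected, the weak bridge is merged last, so the penultimate partition consists of the two triangles.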
4.2 Greedy basis
The greedy partition sequence defined above yields a sequence of subspaces

$$U_k=\operatorname{span}\{\mathbf{1}_{V_i} : V_i\in\mathcal{V}_k\},\quad k=1,\dots,N,$$

which satisfy the relations

$$U_1\subset U_2\subset\cdots\subset U_N=\mathbb{R}^N,\quad\dim U_k=k.$$

Denote the orthogonal complement of $U_{k-1}$ in $U_k$ by $U_k\ominus U_{k-1}$. By Definition 7, the partition $\mathcal{V}_{k-1}$ is obtained by combining two groups $S_k$ and $T_k$ in $\mathcal{V}_k$. Suppose $v\in U_k\ominus U_{k-1}$ and $\|v\|=1$. Since $v$ is constant on each group of $\mathcal{V}_k$ and orthogonal to the indicator vectors of the groups of $\mathcal{V}_{k-1}$, $v$ vanishes outside $S_k\cup T_k$ and can be written in the form $v=a\mathbf{1}_{S_k}+b\mathbf{1}_{T_k}$. From $v\perp\mathbf{1}_{S_k\cup T_k}$, we get $a|S_k|+b|T_k|=0$. Since

$$\|v\|^2=a^2|S_k|+b^2|T_k|=1,$$

there exists a sign $\sigma=\pm 1$ such that $a=\sigma\sqrt{\frac{|T_k|}{|S_k|(|S_k|+|T_k|)}}$ and $b=-\sigma\sqrt{\frac{|S_k|}{|T_k|(|S_k|+|T_k|)}}$. By requiring $a>0$, we get a unique choice of $v$. We summarize these results in the following theorem.
Theorem 8. Suppose $\mathcal{V}_N,\dots,\mathcal{V}_1$ are defined as in Definition 7. Let $v_1=\mathbf{1}/\sqrt{N}$, and for $k=2,\dots,N$ let

$$v_k=\sqrt{\frac{|T_k|}{|S_k|(|S_k|+|T_k|)}}\,\mathbf{1}_{S_k}-\sqrt{\frac{|S_k|}{|T_k|(|S_k|+|T_k|)}}\,\mathbf{1}_{T_k},$$

where $S_k,T_k\in\mathcal{V}_k$ are the two groups combined to form $\mathcal{V}_{k-1}$. Then $V=[v_1,\dots,v_N]$ is an orthogonal matrix. We call $V$ the greedy basis of the graph $G$.
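A minimal sketch of this construction, merging greedily and emitting the two-valued vector of each merge (the function name is ours and the tiny graph is only for illustration):

```python
import numpy as np

def greedy_basis(W):
    """Build the greedy basis: one two-valued vector per merge, plus the constant."""
    N = W.shape[0]
    current = [[i] for i in range(N)]
    vecs = []
    while len(current) > 1:
        best, pair = -np.inf, None
        for s in range(len(current)):
            for t in range(s + 1, len(current)):
                w = sum(W[i, j] for i in current[s] for j in current[t])
                if w > best:
                    best, pair = w, (s, t)
        s, t = pair
        S, T = current[s], current[t]
        v = np.zeros(N)
        v[S] = np.sqrt(len(T) / (len(S) * (len(S) + len(T))))
        v[T] = -np.sqrt(len(S) / (len(T) * (len(S) + len(T))))
        vecs.append(v)  # spans the complement of U_{k-1} in U_k
        current = [g for idx, g in enumerate(current) if idx not in (s, t)] + [S + T]
    vecs.append(np.ones(N) / np.sqrt(N))   # v_1, the constant vector
    return np.column_stack(vecs[::-1])     # columns v_1, ..., v_N

V = greedy_basis(np.array([[0.0, 1.0, 0.1],
                           [1.0, 0.0, 0.0],
                           [0.1, 0.0, 0.0]]))
print(np.allclose(V.T @ V, np.eye(3)))     # True: the greedy basis is orthonormal
```

Each merge vector sums to zero over the merged group, and later vectors are constant on that group, which is why the columns come out mutually orthogonal.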
An interesting question is whether the greedy basis vector $v_k$ minimizes the $\ell_1$-norm variation. We will show that the partition matrix induced by the greedy partition satisfies the necessary condition (10).
Let $A=[v_1,\dots,v_{k-1}]$, where $v_1,\dots,v_{k-1}$ are defined in Theorem 8. Suppose $2\le k\le N$, $B=\mathcal{P}(v_k)$, and $v_k=Bc$. Then $\dim\ker(A^{\mathrm{T}}B)=1$.
Proof. Suppose $x\in\ker(A^{\mathrm{T}}B)$ and $h=Bx$. Then $A^{\mathrm{T}}h=0$, i.e., $h\perp v_1,\dots,v_{k-1}$. Since $h$ is constant on each level set of $v_k$, and each group of $\mathcal{V}_k$ is contained in a level set of $v_k$, we have $h\in U_k$. Because $\operatorname{span}\{v_1,\dots,v_{k-1}\}=U_{k-1}$, that means $h\in U_k\ominus U_{k-1}$. Since $\dim(U_k\ominus U_{k-1})=1$ and $v_k\in U_k\ominus U_{k-1}$, there exists $t$ such that $h=tv_k$, i.e., $Bx=tBc$. Hence $x=tc$, i.e., $\ker(A^{\mathrm{T}}B)\subseteq\operatorname{span}\{c\}$, therefore $\ker(A^{\mathrm{T}}B)=\operatorname{span}\{c\}$ and $\dim\ker(A^{\mathrm{T}}B)=1$. ∎