 # Clustering Signed Networks with the Geometric Mean of Laplacians

Signed networks make it possible to model positive and negative relationships. We analyze existing extensions of spectral clustering to signed networks and find that they do not recover the ground truth clustering in several situations where either the positive or the negative network structure contains no noise. Our analysis shows that these problems arise because existing approaches take some form of arithmetic mean of the Laplacians of the positive and negative parts. As a solution we propose to use the geometric mean of the Laplacians of the positive and negative parts and show that it outperforms the existing approaches. While the geometric mean of matrices is computationally expensive, we show that its eigenvectors can be computed efficiently, leading to a numerical scheme for sparse matrices which is of independent interest.


## 1 Introduction

A signed graph is a graph with positive and negative edge weights. Typically positive edges model attractive relationships between objects, such as similarity or friendship, and negative edges model repelling relationships, such as dissimilarity or enmity. The concept of balanced signed networks can be traced back to Harary:1954:Notion ; Cartwright:1956:Structural . Later, in Davis:1967:Clustering , a signed graph is defined as k-balanced if there exists a partition into k groups where only positive edges are within the groups and negative edges are between the groups. Several approaches to find communities in signed graphs have been proposed (see tang2015survey for an overview). In this paper we focus on extensions of spectral clustering to signed graphs. Spectral clustering is a well-established method for unsigned graphs which, based on the first k eigenvectors of the graph Laplacian, embeds the nodes of the graph in R^k and then uses k-means to find the partition. In Kunegis:2010:spectral this idea is transferred to signed graphs: the authors define the signed ratio and normalized cut functions and show that the spectrum of suitable signed graph Laplacians yields a relaxation of those objectives. In Chiang:2012:Scalable other objective functions for signed graphs are introduced, and it is shown that a relaxation of these objectives is equivalent to weighted kernel k-means with an appropriate choice of kernel. While their clustering method is scalable, they report that they cannot find any cluster structure in real-world signed networks.

We show that the existing extensions of the graph Laplacian to signed graphs used for spectral clustering have severe deficiencies. Our analysis of the stochastic block model for signed graphs shows that, even in the perfectly balanced case, recovery of the ground-truth clusters is not guaranteed. The reason is that the eigenvectors encoding the cluster structure do not necessarily correspond to the smallest eigenvalues, leading to a noisy embedding of the data points and in turn to the failure of k-means to recover the cluster structure. The underlying mathematical reason is that all existing extensions of the graph Laplacian are based on some form of arithmetic mean of operators of the positive and negative graphs. In this paper we suggest as a solution to use the geometric mean of the Laplacians of the positive and negative parts. In particular, we show that in the stochastic block model the geometric mean Laplacian allows, in expectation, recovery of the ground-truth clusters in any reasonable clustering setting. A main challenge for our approach is that computing the geometric mean Laplacian exactly is expensive and does not scale to large sparse networks. Thus a main contribution of this paper is showing that the first few eigenvectors of the geometric mean can still be computed efficiently. Our algorithm is based on the inverse power method and the extended Krylov subspace technique introduced by druskin:1998:extendedKrylov , and computes eigenvectors of the geometric mean of two matrices without ever forming the mean itself.

In Section 2 we discuss existing work on Laplacians for signed graphs. In Section 3 we discuss the geometric mean of two matrices and introduce the geometric mean Laplacian, which is the basis of our spectral clustering method for signed graphs. In Section 4 we analyze our approach and existing ones under the stochastic block model. In Section 5 we introduce our efficient algorithm to compute eigenvectors of the geometric mean of two matrices, and finally in Section 6 we discuss the performance of our approach on real-world graphs.

## 2 Signed graph clustering

Networks encoding positive and negative relations among the nodes can be represented by weighted signed graphs. Consider two symmetric non-negative weight matrices W+ and W−, a vertex set V, and let G+ = (V, W+) and G− = (V, W−) be the induced graphs. A signed graph is the pair G± = (G+, G−), where G+ and G− encode the positive and the negative relations, respectively.

The concept of community in signed networks is typically related to the theory of social balance. This theory, as presented in Harary:1954:Notion ; Cartwright:1956:Structural , is based on the analysis of affective ties, where positive ties are a source of balance, whereas negative ties are considered a source of imbalance in social groups.

###### Definition 1 (Davis:1967:Clustering , k-balance).

A signed graph is k-balanced if the set of vertices can be partitioned into k sets such that within the subsets there are only positive edges, and between them only negative ones.

The presence of k-balance in G± implies the presence of groups of nodes that are both assortative in G+ and disassortative in G−. However, this situation is fairly rare in real-world networks, and expecting communities in signed networks to be a perfectly balanced set of nodes is unrealistic.

In the next section we show that the Laplacians inspired by Definition 1 are based on some form of arithmetic mean of Laplacians. As an alternative we propose the geometric mean of Laplacians and show that it is able to recover communities when either G+ is assortative, or G− is disassortative, or both. The results of this paper make clear that the use of the geometric mean of Laplacians allows us to recognize communities where previous approaches fail.

### 2.1 Laplacians on Unsigned Graphs

Spectral clustering of undirected, unsigned graphs using the Laplacian matrix is a well-established technique (see Luxburg:2007:tutorial for an overview). Given an unsigned graph G = (V, W), the Laplacian and its normalized version are defined as

 L = D − W,   Lsym = D^{−1/2} L D^{−1/2}   (1)

where D is the diagonal matrix of the degrees of G. Both Laplacians are positive semi-definite, and the multiplicity of the eigenvalue 0 is equal to the number of connected components in the graph. Further, the Laplacian is suitable for assortative cases Luxburg:2007:tutorial , i.e. for the identification of clusters under the assumption that the amount of edges inside the clusters is larger than the amount of edges between them.

For disassortative cases, i.e. for the identification of clusters where the amount of edges between the clusters is larger than inside them, the signless Laplacian is a better choice Liu2015 . Given the unsigned graph G = (V, W), the signless Laplacian and its normalized version are defined as

 Q = D + W,   Qsym = D^{−1/2} Q D^{−1/2}   (2)

Both Laplacians are positive semi-definite, and the smallest eigenvalue of Q is zero if and only if the graph has a bipartite component Desai:1994:characterization .
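As a quick numeric illustration of these properties, the following sketch (toy weight matrix, our own illustration) builds both Laplacians for a 4-cycle, which is connected and bipartite, and checks that each has smallest eigenvalue zero:

```python
import numpy as np

# A 4-cycle: connected and bipartite (toy weight matrix, illustration only)
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
D = np.diag(W.sum(axis=1))

L = D - W            # Laplacian, Eq. (1)
Q = D + W            # signless Laplacian, Eq. (2)

eigL = np.linalg.eigvalsh(L)
eigQ = np.linalg.eigvalsh(Q)

assert np.all(eigL > -1e-9) and np.all(eigQ > -1e-9)  # both PSD
assert abs(eigL[0]) < 1e-9   # zero eigenvalue: one connected component
assert abs(eigQ[0]) < 1e-9   # zero eigenvalue: a bipartite component
```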

### 2.2 Laplacians on Signed Graphs

Recently a number of Laplacian operators for signed networks have been introduced. Consider the signed graph G± = (G+, G−). Let D+ be the diagonal matrix of the degrees of G+, and D̄ the diagonal matrix of the overall degrees in G±, i.e. D̄ = D+ + D−.

The following Laplacians for signed networks have been considered so far

 LBR = D+ − W+ + W−,   LBN = D̄^{−1} LBR   (balance ratio/normalized Laplacian)   (3)
 LSR = D̄ − W+ + W−,   LSN = D̄^{−1/2} LSR D̄^{−1/2}   (signed ratio/normalized Laplacian)

and spectral clustering algorithms based on these Laplacians have been proposed for G± Kunegis:2010:spectral ; Chiang:2012:Scalable . Let L+ and Q− be the Laplacian and the signless Laplacian matrices of the graphs G+ and G−, respectively. We note that the matrix LSR blends the information from G+ and G− into (twice) the arithmetic mean of L+ and Q−, namely the following identity holds

 LSR = L+ + Q−.   (4)

Thus, as an alternative to the normalization defining LSN from LSR, it is natural to consider the arithmetic mean of the normalized Laplacians, L+sym + Q−sym. In the next section we introduce the geometric mean of L+sym and Q−sym and propose a new clustering algorithm for signed graphs based on that matrix. The analysis and experiments of the next sections show that blending the information from the positive and negative graphs through the geometric mean overcomes the deficiencies shown by the arithmetic mean based operators.
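The identity (4) is easy to verify numerically. The sketch below builds random symmetric non-negative weight matrices (purely illustrative data; function names are our own) and checks that LSR equals L+ + Q− entrywise:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6

def random_weights(rng, n):
    """Symmetric non-negative weight matrix with zero diagonal (toy data)."""
    M = np.triu(rng.random((n, n)) * (rng.random((n, n)) < 0.5), k=1)
    return M + M.T

Wp, Wm = random_weights(rng, n), random_weights(rng, n)

Dp = np.diag(Wp.sum(axis=1))            # degrees of G+
Dm = np.diag(Wm.sum(axis=1))            # degrees of G-
Dbar = np.diag((Wp + Wm).sum(axis=1))   # overall degrees in the signed graph

LSR = Dbar - Wp + Wm                    # signed ratio Laplacian, Eq. (3)
Lp = Dp - Wp                            # Laplacian of G+
Qm = Dm + Wm                            # signless Laplacian of G-

assert np.allclose(LSR, Lp + Qm)        # identity (4)
```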

## 3 Geometric mean of Laplacians

We define here the geometric mean of matrices and introduce the geometric mean of the normalized Laplacians for clustering signed networks. Let A^{1/2} denote the unique positive definite solution X of the matrix equation X^2 = A, where A is positive definite.

###### Definition 2.

Let A, B be positive definite matrices. The geometric mean of A and B is the positive definite matrix A#B defined by A#B = A^{1/2}(A^{−1/2} B A^{−1/2})^{1/2} A^{1/2}.

One can prove that A#B = B#A (see bhatia2009positive for details). Further, there are several useful ways to represent the geometric mean of positive definite matrices (see for instance bhatia2009positive ; Ianazzo:2012:geometricMean )

 A#B = A(A^{−1}B)^{1/2} = (BA^{−1})^{1/2}A = B(B^{−1}A)^{1/2} = (AB^{−1})^{1/2}B   (5)
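These representations can be checked numerically. The following sketch (random matrices, purely illustrative; the helper `spd_sqrt` is our own) computes A#B from Definition 2 and verifies one of the alternative representations in (5) together with the symmetry A#B = B#A:

```python
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(1)
n = 5
X = rng.standard_normal((n, n))
Y = rng.standard_normal((n, n))
A = X @ X.T + n * np.eye(n)   # random positive definite matrices
B = Y @ Y.T + n * np.eye(n)   # (illustrative only)

def spd_sqrt(M):
    """Principal square root of a symmetric positive definite matrix."""
    e, U = np.linalg.eigh(M)
    return (U * np.sqrt(e)) @ U.T

# Definition 2: A#B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}
Ah = spd_sqrt(A)
Ahi = np.linalg.inv(Ah)
gm = Ah @ spd_sqrt(Ahi @ B @ Ahi) @ Ah

# One of the representations in (5): A#B = A (A^{-1} B)^{1/2}
gm_repr = A @ np.real(sqrtm(np.linalg.solve(A, B)))

# Symmetry: A#B = B#A
Bh = spd_sqrt(B)
Bhi = np.linalg.inv(Bh)
gm_sym = Bh @ spd_sqrt(Bhi @ A @ Bhi) @ Bh

assert np.allclose(gm, gm_repr, atol=1e-8)
assert np.allclose(gm, gm_sym, atol=1e-8)
```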

The next result reveals further consistency with the scalar case: if A and B have some eigenvectors in common, then A+B and A#B have those eigenvectors as well, with eigenvalues λ+μ and √(λμ) built from the corresponding eigenvalues λ of A and μ of B.

###### Theorem 1.

Let u be an eigenvector of A and B with eigenvalues λ and μ, respectively. Then u is an eigenvector of A+B and A#B with eigenvalues λ+μ and √(λμ), respectively.

###### Proof.

Using the identities Au = λu and Bu = μu we have (A+B)u = (λ+μ)u. For the geometric mean, observe that for any positive definite matrix M, if Mu = γu, then M^{1/2}u = γ^{1/2}u and M^{−1/2}u = γ^{−1/2}u. In particular we have

 A^{−1/2}BA^{−1/2}u = λ^{−1/2}A^{−1/2}Bu = λ^{−1/2}μA^{−1/2}u = (μ/λ)u

thus (A^{−1/2}BA^{−1/2})^{1/2}u = (μ/λ)^{1/2}u. As a consequence

 (A#B)u = A^{1/2}(A^{−1/2}BA^{−1/2})^{1/2}A^{1/2}u = λ^{1/2}A^{1/2}(A^{−1/2}BA^{−1/2})^{1/2}u = λ^{1/2}(μ/λ)^{1/2}A^{1/2}u = (√(λμ))u

which concludes the proof. ∎
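Theorem 1 can be sanity-checked numerically by constructing two positive definite matrices with a shared eigenbasis (an illustrative construction of our own; the eigenvalue lists are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
S = rng.standard_normal((4, 4))
S = S + S.T
_, V = np.linalg.eigh(S)               # shared orthonormal eigenbasis

lam = np.array([1.0, 2.0, 3.0, 4.0])   # eigenvalues of A (arbitrary)
mu = np.array([4.0, 1.0, 2.0, 8.0])    # eigenvalues of B (arbitrary)
A = (V * lam) @ V.T
B = (V * mu) @ V.T

def spd_sqrt(M):
    e, U = np.linalg.eigh(M)
    return (U * np.sqrt(e)) @ U.T

Ah = spd_sqrt(A)
Ahi = np.linalg.inv(Ah)
gm = Ah @ spd_sqrt(Ahi @ B @ Ahi) @ Ah   # A#B per Definition 2

u = V[:, 0]
assert np.allclose((A + B) @ u, (lam[0] + mu[0]) * u)    # eigenvalue lam+mu
assert np.allclose(gm @ u, np.sqrt(lam[0] * mu[0]) * u)  # eigenvalue sqrt(lam*mu)
```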

### 3.1 Geometric mean for signed networks clustering

Consider the signed network G± = (G+, G−). We define the normalized geometric mean Laplacian of G± as

 LGM = L+sym # Q−sym   (6)

We propose Algorithm 1 for clustering signed networks, based on the spectrum of LGM. By Definition 2, the matrix geometric mean requires L+sym and Q−sym to be positive definite. As both the Laplacian and the signless Laplacian are positive semi-definite, in what follows we assume that the matrices in (6) are modified by a small diagonal shift ensuring positive definiteness. That is, in practice we consider L+sym + ε1I and Q−sym + ε2I, with ε1 and ε2 small positive numbers. For the sake of brevity, we do not explicitly write the shifting matrices.

Algorithm 1: Spectral clustering with LGM on signed networks.
Input: Symmetric weight matrices W+, W−; number k of clusters to construct.
Output: Clusters C1, …, Ck.
1. Compute the eigenvectors u1, …, uk corresponding to the k smallest eigenvalues of LGM.
2. Let U = (u1, …, uk). Cluster the rows of U with k-means into clusters C1, …, Ck.

The main bottleneck of Algorithm 1 is the computation of the eigenvectors in step 1. In Section 5 we propose a scalable Krylov-based method for this problem.
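For small graphs, Algorithm 1 can be prototyped densely before resorting to a scalable solver. The sketch below is a minimal, non-scalable implementation under stated assumptions: the function name, the single shift `eps` for both matrices, and the use of scipy's `kmeans2` are our own choices, not part of the paper.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def spd_sqrt(M):
    """Principal square root of a symmetric positive semi-definite matrix."""
    e, U = np.linalg.eigh((M + M.T) / 2)
    return (U * np.sqrt(np.maximum(e, 0.0))) @ U.T

def signed_spectral_gm(Wp, Wm, k, eps=1e-6, seed=0):
    """Dense sketch of Algorithm 1: spectral clustering with
    LGM = (L+sym + eps*I) # (Q-sym + eps*I). Small graphs only."""
    n = Wp.shape[0]
    def normalized(W, signless):
        d = np.maximum(W.sum(axis=1), 1e-12)
        Dinv = np.diag(1.0 / np.sqrt(d))
        L = np.diag(d) + W if signless else np.diag(d) - W
        return Dinv @ L @ Dinv
    Lp = normalized(Wp, signless=False) + eps * np.eye(n)
    Qm = normalized(Wm, signless=True) + eps * np.eye(n)
    Ah = spd_sqrt(Lp)
    Ahi = np.linalg.inv(Ah)
    LGM = Ah @ spd_sqrt(Ahi @ Qm @ Ahi) @ Ah       # Definition 2
    _, U = np.linalg.eigh((LGM + LGM.T) / 2)
    X = U[:, :k]                                   # k smallest eigenvectors
    np.random.seed(seed)                           # make kmeans2 reproducible
    _, labels = kmeans2(X, k, minit='++')
    return labels

# Two planted groups: positive cliques inside, negative edges across
Wp = np.kron(np.eye(2), np.ones((3, 3)))
np.fill_diagonal(Wp, 0)
Wm = np.kron(np.ones((2, 2)) - np.eye(2), np.ones((3, 3)))
labels = signed_spectral_gm(Wp, Wm, 2)
```

On this toy signed graph the two planted groups are recovered (up to label permutation).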

Let us briefly discuss the intuition behind the proposed clustering strategy. Algorithm 1, like the state-of-the-art clustering algorithms based on the matrices in (3), relies on the smallest eigenvalues of the considered operator and on their corresponding eigenvectors; thus the relative ordering of the eigenvalues plays a crucial role. Assume the eigenvalues of each operator are enumerated in ascending order. By Theorem 1, an eigenvector shared by L+sym and Q−sym, with eigenvalues λi and μj respectively, is mapped to the eigenvalue λi + μj of the arithmetic mean and to the eigenvalue √(λiμj) of the geometric mean. Note that the indices i and j are not the same in general, as the eigenvalues associated to a shared eigenvector may occupy different positions in the orderings of L+sym and Q−sym. This intuitively suggests that the small eigenvalues of the arithmetic mean are related to small eigenvalues of both L+sym and Q−sym, whereas the small eigenvalues of LGM are associated with small eigenvalues of either L+sym or Q−sym, or both. Therefore the relative ordering of the small eigenvalues of LGM is influenced by the presence of assortative clusters in G+ (related to small eigenvalues of L+sym) or of disassortative clusters in G− (related to small eigenvalues of Q−sym), whereas the ordering of the small eigenvalues of the arithmetic mean takes into account only the simultaneous presence of both situations.

In the next section, for networks following the stochastic block model, we analyze in expectation the spectrum of the normalized geometric mean Laplacian as well as that of the normalized Laplacians introduced previously. In this case the expected spectrum can be computed explicitly, and we observe that in expectation the ordering induced by blending the information of G+ and G− through the geometric mean allows us to recover the ground truth clusters perfectly, whereas the use of the arithmetic mean introduces a bias which translates into a significantly higher clustering error.

## 4 Stochastic block model on signed graphs

In this section we present an analysis of different signed graph Laplacians based on the stochastic block model (SBM). The SBM is a widespread benchmark generative model for networks showing a clustering, community, or group behaviour rohe2011spectral . Given a prescribed set of groups of nodes, the SBM defines the presence of an edge as a random variable whose probability depends on which groups the edge joins. To our knowledge this is the first analysis of spectral clustering on signed graphs under the stochastic block model. Let C1, …, Ck be the ground truth clusters, all having the same size |C|. We let p+in (p−in) be the probability that there exists a positive (negative) edge between nodes in the same cluster, and let p+out (p−out) denote the probability of a positive (negative) edge between nodes in different clusters.

Calligraphic letters denote matrices in expectation. In particular W+ and W− denote the weight matrices in expectation. We have W+ij = p+in and W−ij = p−in if i and j belong to the same cluster, whereas W+ij = p+out and W−ij = p−out if i and j belong to different clusters. Sorting the nodes according to the ground truth clustering shows that W+ and W− have rank k.

Consider the relations in Table 1. The conditions p+in > p+out and p−out > p−in describe the presence of assortative clusters in G+ and of disassortative clusters in G−, respectively, in expectation. Note that, by Definition 1, a graph is k-balanced in expectation if and only if p+out = p−in = 0. If both conditions hold, then both G+ and G− give information about the cluster structure. A further condition, p+out + p−in < p+in + p−out, characterizes a graph where the relative amount of conflicts, i.e. positive edges between the clusters and negative edges inside the clusters, is small; in particular, it holds whenever the first two conditions hold. Finally, the volume condition d− < d+ states that the expected volume of the negative graph is smaller than the expected volume of the positive one. This condition is therefore not related to any signed clustering structure.

Let

 χ1 = 1,   χi = (k−1)·1_{C_{i−1}} − 1_{C̄_{i−1}},   i = 2, …, k.
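The vectors χ1, …, χk are eigenvectors of the expected weight matrices, with the eigenvalues given in Eq. (7). A small numeric check (cluster size and probabilities chosen arbitrarily for illustration):

```python
import numpy as np

k, c = 3, 4                       # clusters and cluster size (illustrative)
pin, pout = 0.8, 0.2              # within/between edge probabilities

# Expected SBM weight matrix: entries p_in within a cluster, p_out across
blocks = pout * np.ones((k, k)) + (pin - pout) * np.eye(k)
W = np.kron(blocks, np.ones((c, c)))

# chi_1 = 1;  chi_i = (k-1) on C_{i-1} and -1 on the complement, i >= 2
chi1 = np.ones(k * c)
chi2 = -np.ones(k * c)
chi2[:c] = k - 1

lam1 = c * (pin + (k - 1) * pout)   # eigenvalue for chi_1 (cf. Eq. (7))
lami = c * (pin - pout)             # eigenvalue for chi_i, i >= 2

assert np.allclose(W @ chi1, lam1 * chi1)
assert np.allclose(W @ chi2, lami * chi2)
```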

The use of k-means on χ1, …, χk identifies the ground truth communities C1, …, Ck. As spectral clustering relies on the eigenvectors corresponding to the k smallest eigenvalues (see Algorithm 1), we derive here necessary and sufficient conditions such that in expectation the eigenvectors χ1, …, χk correspond to the k smallest eigenvalues of the normalized Laplacians introduced so far. In particular, we observe that only clustering-related conditions affect the ordering of the eigenvalues of the normalized geometric mean Laplacian, whereas the ordering of the eigenvalues of the operators based on the arithmetic mean also depends on the volume condition d− < d+. The latter is not related to any clustering, and thus introduces a bias in the eigenvalue ordering which translates into a noisy embedding of the data points and in turn into a significantly higher clustering error.

###### Theorem 2.

Let LSN and LBN be the normalized Laplacians defined in (3) of the expected graphs. The following statements are equivalent:

1. χ1, …, χk are the eigenvectors corresponding to the k smallest eigenvalues of LSN.

2. χ1, …, χk are the eigenvectors corresponding to the k smallest eigenvalues of LBN.

3. The two conditions d− < d+ and p−in + p+out < p+in + p−out hold simultaneously.

###### Proof.

We first prove that χ1, …, χk are the eigenvectors corresponding to the k smallest eigenvalues of LBR if and only if the two conditions hold simultaneously. It is simple to verify that χ1, …, χk are eigenvectors of W+ and W−, with eigenvalues denoted by λ+i and λ−i, respectively. Thus W+ and W− are simultaneously diagonalizable, that is, there exists a non-singular matrix Σ such that W+ = ΣΛ+Σ^{−1} and W− = ΣΛ−Σ^{−1}, where Λ+ and Λ− are diagonal matrices. Observe that the eigenvalues λ+i and λ−i admit the following explicit representations

 λ+1 = |C|(p+in + (k−1)p+out),   λ−1 = |C|(p−in + (k−1)p−out)   (7)
 λ+i = |C|(p+in − p+out),   λ−i = |C|(p−in − p−out),

for i = 2, …, k. As we assume clusters of equal size, in expectation every node has the same degree, i.e. the expected graphs are regular. Hence the expected degrees are d+ = λ+1, d− = λ−1 and d̄ = d+ + d−, with corresponding degree matrices D+ = d+I, D− = d−I and D̄ = d̄I. The expected balance ratio Laplacian is thus LBR = d+I − W+ + W−. It follows that the eigenvalues of LBR correspond to eigenvectors in the following way:

 d+ − λ+i + λ−i   with eigenvector χi, i = 1, …, k
 d+               corresponding to the remaining eigenvectors

Thus, the eigenvectors χ1, …, χk correspond to the k smallest eigenvalues if and only if

 d+ − λ+i + λ−i < d+,   i.e.   λ−i < λ+i.

By Eqs. (7) we see that for the constant eigenvector χ1 the condition λ−1 < λ+1 is equivalent to d− < d+, whereas for the eigenvectors χ2, …, χk the corresponding condition is

 λ−i < λ+i  ⟺  p−in + p+out < p+in + p−out.

We deduce that the eigenvectors χ1, …, χk correspond to the k smallest eigenvalues of LBR if and only if d− < d+ and p−in + p+out < p+in + p−out.

As LBN = D̄^{−1}LBR differs from LBR by the constant factor 1/d̄, the same conditions hold for LBN. The conditions for LSR and LSN can be proved in the same way, as the only difference in the eigenvalues is a constant shift coming from the degrees. ∎

###### Theorem 3.

Let LGM = L+sym # Q−sym be the geometric mean of the normalized Laplacians of the expected graphs. Then χ1, …, χk are the eigenvectors corresponding to the k smallest eigenvalues of LGM if and only if the condition

 (kp+out/(p+in + (k−1)p+out))(1 + (p−in − p−out)/(p−in + (k−1)p−out)) < 1

holds.

###### Proof.

We use the same notation as in the proof of Theorem 2. Observing that L+sym and Q−sym have the same eigenvectors, it follows from Theorem 1 that

 LGM = Σ √((I − Λ̂+)(I + Λ̂−)) Σ^{−1}   (8)

where Λ̂+ = Λ+/d+ and Λ̂− = Λ−/d−. We deduce that the eigenvalues of LGM correspond to eigenvectors in the following way:

 √((1 − λ+i/d+)(1 + λ−i/d−))   with eigenvector χi, i = 1, …, k
 1                             corresponding to the remaining eigenvectors

Thus, the eigenvectors χ1, …, χk correspond to the k smallest eigenvalues if and only if

 (1 − λ+i/d+)(1 + λ−i/d−) < 1

By Eqs. (7) we see that for the constant eigenvector χ1 we have

 (1 − λ+1/d+)(1 + λ−1/d−) = (1 − d+/d+)(1 + d−/d−) = 0 < 1.

For the eigenvectors χ2, …, χk first observe that

 1 − λ+i/d+ = (d+ − λ+i)/d+ = (d+ − |C|(p+in − p+out))/d+ = kp+out/(p+in + (k−1)p+out)

In the same way we have

 1 + λ−i/d− = 1 + (p−in − p−out)/(p−in + (k−1)p−out).

Thus, for the eigenvectors χ2, …, χk we have the condition

 (1 − λ+i/d+)(1 + λ−i/d−) < 1  ⟺  (kp+out/(p+in + (k−1)p+out))(1 + (p−in − p−out)/(p−in + (k−1)p−out)) < 1,

which implies in turn that the eigenvectors χ1, …, χk correspond to the k smallest eigenvalues of LGM if and only if the condition of the statement holds. ∎
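The difference between the two orderings can be observed numerically on the expected matrices. In the sketch below (parameter values and helper names are our own illustrative choices) the positive graph is pure noise, the negative graph is informative, and d− > d+, so the volume condition fails: the constant eigenvector χ1 drops out of the bottom-k spectrum of LSN but stays in the bottom-k spectrum of LGM.

```python
import numpy as np

def spd_sqrt(M):
    e, U = np.linalg.eigh((M + M.T) / 2)
    return (U * np.sqrt(np.maximum(e, 0.0))) @ U.T

k, c = 2, 4                       # clusters and cluster size (illustrative)
pip, pop = 0.4, 0.4               # p+in = p+out: positive graph is noise
pim, pom = 0.2, 0.9               # p-out > p-in: negative graph informative

def expected_W(pin, pout):
    blocks = pout * np.ones((k, k)) + (pin - pout) * np.eye(k)
    return np.kron(blocks, np.ones((c, c)))

Wp, Wm = expected_W(pip, pop), expected_W(pim, pom)
n = k * c
dp, dm = Wp.sum(axis=1)[0], Wm.sum(axis=1)[0]     # d+ = 3.2 < d- = 4.4
chi1 = np.ones(n) / np.sqrt(n)

LSN = np.eye(n) - (Wp - Wm) / (dp + dm)           # signed normalized Laplacian
eps = 1e-8
Lp = np.eye(n) - Wp / dp + eps * np.eye(n)        # L+sym + eps*I (regular case)
Qm = np.eye(n) + Wm / dm + eps * np.eye(n)        # Q-sym + eps*I
Ah = spd_sqrt(Lp)
Ahi = np.linalg.inv(Ah)
LGM = Ah @ spd_sqrt(Ahi @ Qm @ Ahi) @ Ah

def weight_in_bottom_k(L, v):
    """Norm of the projection of v onto the span of the k smallest eigenvectors."""
    _, U = np.linalg.eigh((L + L.T) / 2)
    return np.linalg.norm(U[:, :k].T @ v)

assert weight_in_bottom_k(LSN, chi1) < 1e-6   # chi1 lost by the arithmetic mean
assert weight_in_bottom_k(LGM, chi1) > 0.999  # chi1 kept by the geometric mean
```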

As mentioned above, in practical implementations one modifies the Laplacians defining LGM by adding a small diagonal shift, to ensure positive definiteness of the matrices. The next theorem extends the previous result to the case of diagonally shifted Laplacians.

###### Theorem 4.

Let LGM = (L+sym + ε1I) # (Q−sym + ε2I) be the geometric mean of the shifted Laplacians of the expected graphs. Then χ1, …, χk are the eigenvectors corresponding to the k smallest eigenvalues of LGM if the following conditions hold:

1. ε1 + ε2 < 1.

2. (kp+out/(p+in + (k−1)p+out))(1 + (p−in − p−out)/(p−in + (k−1)p−out)) + ε1 + ε2 < 1.

###### Proof.

We use the same notation as in the previous proof. Observing that L+sym + ε1I and Q−sym + ε2I have the same eigenvectors, it follows from Theorem 1 that

 LGM = Σ √((I − Λ̂+ + ε1I)(I + Λ̂− + ε2I)) Σ^{−1}   (9)

where Λ̂+ = Λ+/d+ and Λ̂− = Λ−/d−. We deduce that χi is an eigenvector of LGM with eigenvalue √((1 − λ+i/d+ + ε1)(1 + λ−i/d− + ε2)) for i = 1, …, k, while the remaining eigenvectors have eigenvalue √((1 + ε1)(1 + ε2)). Thus, the eigenvectors χ1, …, χk correspond to the k smallest eigenvalues if and only if

 (1 − λ+i/d+ + ε1)(1 + λ−i/d− + ε2) < (1 + ε1)(1 + ε2)   (10)

Cancelling the common term ε1ε2, we see that the previous inequality holds if and only if

 (1 − λ+i/d+)(1 + λ−i/d−) + ε1(1 + λ−i/d−) + ε2(1 − λ+i/d+) < 1 + ε1 + ε2

Moreover, as λ−i ≤ d− and λ+i ≥ −d+, we have ε1(1 + λ−i/d−) + ε2(1 − λ+i/d+) ≤ 2(ε1 + ε2), so that eq. (10) holds if

 (1 − λ+i/d+)(1 + λ−i/d−) + ε1 + ε2 < 1

By Eqs. (7) we see that for the constant eigenvector χ1 we have (1 − λ+1/d+)(1 + λ−1/d−) = 0. Thus,

 (1 − λ+1/d+)(1 + λ−1/d−) + ε1 + ε2 = ε1 + ε2 < 1

by the first condition.

For the eigenvectors χ2, …, χk first observe that

 1 − λ+i/d+ = (d+ − λ+i)/d+ = (d+ − |C|(p+in − p+out))/d+ = kp+out/(p+in + (k−1)p+out)

In the same way we have

 1 + λ−i/d− = 1 + (p−in − p−out)/(p−in + (k−1)p−out).

Thus, for the eigenvectors χ2, …, χk we have the condition

 (1 − λ+i/d+)(1 + λ−i/d−) + ε1 + ε2 < 1  ⟺  (kp+out/(p+in + (k−1)p+out))(1 + (p−in − p−out)/(p−in + (k−1)p−out)) + ε1 + ε2 < 1.

This implies in turn that the eigenvectors χ1, …, χk correspond to the k smallest eigenvalues of LGM if the two conditions of the statement hold. ∎

Intuition suggests that a good model should easily identify the clusters when both graphs are informative, i.e. when p+in > p+out and p−in < p−out. However, the volume condition d− < d+, required by the arithmetic mean based operators, is not automatically satisfied under that regime. Specifically, we have

###### Corollary 1.

Assume that p+in > p+out and p−in < p−out hold. Then χ1, …, χk are eigenvectors corresponding to the k smallest eigenvalues of LGM. Let p(k) denote the proportion of cases where χ1, …, χk are the eigenvectors of the k smallest eigenvalues of LSN or LBN; then p(k) ≤ 1/6 + 2/(3(k−1)) + 1/(k−1)^2.

###### Proof.

The event Evol is defined as

 Evol = {(p−in, p−out, p+in, p+out) ∈ [0,1]^4 | p−in + (k−1)p−out < p+in + (k−1)p+out}.

We can rewrite the defining inequality as

 p−out − p+out < (1/(k−1))(p+in − p−in) < 1/(k−1).

Thus the event Ẽvol defined as

 Ẽvol = {(p−in, p−out, p+in, p+out) ∈ [0,1]^4 | p−out − p+out < 1/(k−1)},

satisfies Evol ⊆ Ẽvol. Then with

 E3 = E+ ∩ E− = {(p−in, p−out, p+in, p+out) ∈ [0,1]^4 | p−in < p−out and p+out < p+in}

we observe that P(E3) = 1/4. Since E3 ⊆ EB, we then have

 p(k) = P(EB ∩ Evol | E3) = P(EB ∩ Evol ∩ E3)/P(E3) = P(Evol ∩ E3)/P(E3)

With the change of variables (x1, x2, x3, x4) = (p+in, p+out, p−out, p−in), the region Ẽvol ∩ E3 is contained in the integration domain below, so that

 P(Ẽvol ∩ E3) ≤ ∫_0^1 (∫_0^{x1} (∫_0^{x2+1/(k−1)} (∫_0^{x3} dx4) dx3) dx2) dx1
  = ∫_0^1 (∫_0^{x1} (∫_0^{x2+1/(k−1)} x3 dx3) dx2) dx1
  = ∫_0^1 (∫_0^{x1} (1/2)(x2 + 1/(k−1))^2 dx2) dx1
  = ∫_0^1 [(1/6)(x2 + 1/(k−1))^3]_0^{x1} dx1
  = ∫_0^1 (x1^3/6 + x1^2/(2(k−1)) + x1/(2(k−1)^2)) dx1
  = [x1^4/24 + x1^3/(6(k−1)) + x1^2/(4(k−1)^2)]_0^1
  = 1/24 + 1/(6(k−1)) + 1/(4(k−1)^2)

The inequality in the first line comes from the fact that we do not ensure that the upper integration limit for x3 is smaller than or equal to one. Thus, with Evol ⊆ Ẽvol and P(E3) = 1/4, we get

 p(k) ≤ (1/24 + 1/(6(k−1)) + 1/(4(k−1)^2))/(1/4) = 1/6 + 2/(3(k−1)) + 1/(k−1)^2. ∎
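A quick Monte Carlo experiment (our own sketch; sample size and the tested values of k are arbitrary) confirms that the conditional proportion stays below the bound:

```python
import numpy as np

rng = np.random.default_rng(3)

def estimate_p(k, m=200_000):
    """Estimate P(Evol | E3) by sampling the four probabilities uniformly.
    pin_p = p+in, pout_p = p+out, pin_m = p-in, pout_m = p-out."""
    pin_p, pout_p, pin_m, pout_m = rng.random((4, m))
    E3 = (pout_p < pin_p) & (pin_m < pout_m)                      # E+ and E-
    Evol = pin_m + (k - 1) * pout_m < pin_p + (k - 1) * pout_p    # d- < d+
    return (E3 & Evol).sum() / E3.sum()

for k in (2, 3, 5, 10):
    bound = 1 / 6 + 2 / (3 * (k - 1)) + 1 / (k - 1) ** 2
    assert estimate_p(k) <= bound + 0.01   # Corollary 1 bound holds
```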

In order to grasp the difference in expectation between LGM, LSN and LBN, in Fig. 1 we present the proportion of cases where Theorems 2 and 3 hold under different settings. Experiments are done with all four parameters discretized in [0,1] with 100 steps. The expected proportion of cases where the condition of Theorem 3 holds is far above the corresponding proportion for Theorem 2, showing that in expectation the geometric mean Laplacian is superior to the other signed Laplacians. In Fig. 2 we present experiments on sampled graphs with k-means applied on top of the k smallest eigenvectors. In all cases we consider clusters of equal size and report the median clustering error (i.e., the error when clusters are labeled via majority vote) over 50 runs. The results show that the actual behavior closely resembles the analysis made in expectation. In fact, even if we expect only one noisy eigenvector for LSN and LBN, the use of the geometric mean Laplacian significantly outperforms all other previously proposed techniques in terms of clustering error. LSN and LBN achieve a good clustering only when the graph resembles a k-balanced structure, whereas they fail even in the ideal situation where either the positive or the negative graph is informative about the cluster structure. As shown in Section 6, the advantages of LGM over the other Laplacians discussed so far allow us to identify a clustering structure in the Wikipedia benchmark real-world signed network, where other clustering approaches have failed.

Figure 1: Fraction of cases where in expectation χ1, …, χk correspond to the k smallest eigenvalues under the SBM.

Figure 2: Median clustering error under the stochastic block model over 50 runs.

## 5 Krylov-based inverse power method for small eigenvalues of L+sym#Q−sym

The computation of the geometric mean of two positive definite matrices of moderate size has been discussed extensively by various authors raissouli:continued_fractions ; higham:sign_function ; Ianazzo:2012:geometricMean ; Iannazzo:optimization . However, when A and B have large dimensions, the approaches proposed so far become infeasible; in fact A#B is in general a full matrix even if A and B are sparse. In this section we present a scalable algorithm for the computation of the smallest eigenvectors of A#B. The method is discussed for a general pair of positive definite matrices A and B, to emphasize its general applicability, and is therefore interesting in itself. We remark that the method takes advantage of the sparsity of A and B and does not require explicitly computing the matrix A#B. To our knowledge this is the first effective method explicitly built for computing the eigenvectors of the geometric mean of two large and sparse positive definite matrices.

Given a positive definite matrix M with eigenvalues λ1 ≤ λ2 ≤ … ≤ λn, let U1 be any eigenspace of M associated to λ1. The inverse power method (IPM) applied to M is a method that converges to an eigenvector associated to the smallest eigenvalue