 # Nonlinear Eigenproblems in Data Analysis - Balanced Graph Cuts and the RatioDCA-Prox

It has been recently shown that a large class of balanced graph cuts allows for an exact relaxation into a nonlinear eigenproblem. We review briefly some of these results and propose a family of algorithms to compute nonlinear eigenvectors which encompasses previous work as special cases. We provide a detailed analysis of the properties and the convergence behavior of these algorithms and then discuss their application in the area of balanced graph cuts.


## 1 Introduction

Spectral clustering is one of the standard methods for graph-based clustering Jost_Setzer_Hein:Lux2007 . It is based on the spectral relaxation of the so-called normalized cut, which is one of the most popular criteria for balanced graph cuts. While the spectral relaxation is known to be loose Jost_Setzer_Hein:GM98 , tighter relaxations based on the graph $p$-Laplacian have been proposed in Jost_Setzer_Hein:BueHei2009 . Exact relaxations for the Cheeger cut based on the nonlinear eigenproblem of the graph $1$-Laplacian have been proposed in Jost_Setzer_Hein:SB10 ; Jost_Setzer_Hein:HeiBue2010 . In Jost_Setzer_Hein:HeiSet2011 the general balanced graph cut problem of an undirected, weighted graph is considered. Let $V=\{1,\dots,n\}$ be the vertex set and denote the weight matrix of the graph by $W=(w_{ij})_{i,j=1}^n$; then the general balanced graph cut criterion can be written as

$$\operatorname*{arg\,min}_{A\subset V}\;\frac{\operatorname{cut}(A,\overline{A})}{\hat S(A)},$$

where $\operatorname{cut}(A,\overline{A})=\sum_{i\in A,\,j\in\overline{A}}w_{ij}$, $\overline{A}=V\setminus A$, and $\hat S:2^V\to\mathbb{R}$ is a symmetric and nonnegative balancing function. Exact relaxations of such balanced graph cuts and relations to corresponding nonlinear eigenproblems are discussed in Jost_Setzer_Hein:HeiSet2011 and are briefly reviewed in Section 2. A further generalization to hypergraphs has been established in Jost_Setzer_Hein:HeiSet2013 .

There exist different approaches to minimize the exact continuous relaxations. However, in all cases the problem boils down to the minimization of a ratio of a convex function and a difference of convex functions. The two lines of work of Jost_Setzer_Hein:BLUB2012 ; Jost_Setzer_Hein:BLUB12 and Jost_Setzer_Hein:HeiBue2010 ; Jost_Setzer_Hein:HeiSet2011 have developed different algorithms for this problem, which have been compared in Jost_Setzer_Hein:BLUB2012 . We show that both types of algorithms are special cases of our new algorithm RatioDCA-prox introduced in Section 3.1. We provide a unified analysis of the properties and the convergence behavior of RatioDCA-prox. Moreover, in Section 4 we prove stronger convergence results when the RatioDCA-prox is applied to the balanced graph cut problem or, more generally, problems where one minimizes nonnegative ratios of Lovasz extensions of set functions. Further, we discuss the choice of the relaxation of the balancing function in Jost_Setzer_Hein:HeiSet2011 and show that from a theoretical perspective the Lovasz extension is optimal, which is supported by the numerical results in Section 5.

## 2 Exact Relaxation of Balanced Graph Cuts

A key element for the exact continuous relaxation of balanced graph cuts is the Lovasz extension, which extends a set function $\hat S$ on the power set $2^V$ to a function $S$ on $\mathbb{R}^V$.

###### Definition 1

Let $\hat S:2^V\to\mathbb{R}$ be a set function with $\hat S(\emptyset)=0$. Let $f\in\mathbb{R}^V$ be ordered such that $f_1\le f_2\le\dots\le f_n$ and define $C_i=\{j\in V\mid f_j>f_i\}$. Then, the Lovasz extension $S:\mathbb{R}^V\to\mathbb{R}$ of $\hat S$ is given by

$$S(f)=\sum_{i=1}^{n-1}\hat S(C_i)\,(f_{i+1}-f_i)+f_1\,\hat S(V).$$

Note that for the characteristic function $\mathbf{1}_A$ of a set $A\subset V$, we have $S(\mathbf{1}_A)=\hat S(A)$.
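To make Definition 1 concrete, here is a small NumPy sketch (the function name `lovasz_extension` and the representation of $\hat S$ as a Python callable on vertex sets are our own illustration, not from the paper):

```python
import numpy as np

def lovasz_extension(S_hat, f):
    """Evaluate the Lovasz extension of a set function S_hat
    (with S_hat(emptyset) = 0) at f, following Definition 1:
    S(f) = sum_i S_hat(C_i) (f_{i+1} - f_i) + f_1 S_hat(V),
    where C_i = {j : f_j > f_i} for f sorted increasingly."""
    f = np.asarray(f, dtype=float)
    n = len(f)
    order = np.argsort(f)                        # indices sorting f increasingly
    fs = f[order]
    value = fs[0] * S_hat(frozenset(range(n)))   # f_1 * S_hat(V)
    for i in range(n - 1):
        C_i = frozenset(order[i + 1:].tolist())  # vertices above the i-th level
        value += S_hat(C_i) * (fs[i + 1] - fs[i])
    return value

# Ratio-cut style balancing function S_hat(A) = |A| * |V \ A| on n = 3 vertices
S_hat = lambda A: len(A) * (3 - len(A))
print(lovasz_extension(S_hat, [0.0, 1.0, 1.0]))  # equals S_hat({1, 2}) = 2.0
```

The second evaluation below doubles the input and, as expected from positive one-homogeneity, doubles the output.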

The Lovasz extension $S$ is convex if and only if $\hat S$ is submodular Jost_Setzer_Hein:Bac2013 , and every Lovasz extension can be written as a difference of convex functions Jost_Setzer_Hein:HeiSet2011 . Moreover, the Lovasz extension of a symmetric set function is positively one-homogeneous (a function $S$ is (positively) $p$-homogeneous if $S(\alpha f)=\alpha^p S(f)$ for all $\alpha\in\mathbb{R}$, resp. all $\alpha\ge0$; in the following we call such functions simply homogeneous when referring to positive homogeneity) and preserves non-negativity, that is, $S\ge0$ if $\hat S\ge0$. It is well known, see e.g. Jost_Setzer_Hein:HeiSet2013 , that the Lovasz extension of the submodular cut function, $\hat S(A)=\operatorname{cut}(A,\overline{A})$, yields the total variation on a graph,

$$R(f)=\frac{1}{2}\sum_{i,j=1}^{n}w_{ij}\,|f_i-f_j|.\qquad(1)$$
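Equation (1) can be checked numerically: on an indicator vector $\mathbf{1}_A$ the graph total variation reproduces the cut value. A minimal sketch (the helper names are ours):

```python
import numpy as np

def total_variation(W, f):
    """Graph total variation (1): 0.5 * sum_{i,j} w_ij |f_i - f_j|."""
    f = np.asarray(f, dtype=float)
    return 0.5 * np.sum(W * np.abs(f[:, None] - f[None, :]))

def cut_value(W, A):
    """cut(A, complement of A): total weight of edges leaving A."""
    mask = np.zeros(W.shape[0], dtype=bool)
    mask[list(A)] = True
    return W[mask][:, ~mask].sum()

# Path graph 0 - 1 - 2 with unit weights
W = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
print(total_variation(W, [1.0, 0.0, 0.0]), cut_value(W, {0}))  # both 1.0
```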

Theorem 2.1 shows exact continuous relaxations of balanced graph cuts Jost_Setzer_Hein:HeiSet2011 . A more general version for the class of constrained fractional set programs is given in Jost_Setzer_Hein:BueEtAl2013 .

###### Theorem 2.1

Let $G=(V,E)$ be an undirected, weighted graph, let $S:\mathbb{R}^V\to\mathbb{R}$, and let $\hat S:2^V\to\mathbb{R}$ be symmetric with $\hat S(\emptyset)=0$; then

$$\min_{f\in\mathbb{R}^V}\frac{\frac12\sum_{i,j=1}^n w_{ij}|f_i-f_j|}{S(f)}=\min_{A\subset V}\frac{\operatorname{cut}(A,\overline{A})}{\hat S(A)},$$

if either one of the following two conditions holds

1. $S$ is one-homogeneous, even, convex and $S(f+\alpha\mathbf{1})=S(f)$ for all $f\in\mathbb{R}^V$, $\alpha\in\mathbb{R}$, and $\hat S$ is defined as $\hat S(A)=S(\mathbf{1}_A)$ for all $A\subset V$.

2. $S$ is the Lovasz extension of the non-negative, symmetric set function $\hat S$ with $\hat S(\emptyset)=0$.

Let $f\in\mathbb{R}^V$ and denote by $C_t=\{i\in V\mid f_i>t\}$ the level set of $f$ for threshold $t$; then it holds under both conditions,

$$\min_{t\in\mathbb{R}}\frac{\operatorname{cut}(C_t,\overline{C_t})}{\hat S(C_t)}\le\frac{\frac12\sum_{i,j=1}^n w_{ij}|f_i-f_j|}{S(f)}.$$

We observe that the exact continuous relaxation corresponds to a minimization problem of a ratio of non-negative, one-homogeneous functions, where the numerator is convex and the denominator can be written as a difference of convex functions.
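The thresholding bound in Theorem 2.1 is constructive: scanning all level sets $C_t$ of a continuous solution $f$ yields a cut whose balanced ratio is at most the ratio attained by $f$. A sketch of this rounding step (function names and the example graph are our own illustration):

```python
import numpy as np

def best_threshold_cut(W, f, S_hat):
    """Return the level set C_t = {i : f_i > t} minimizing
    cut(C_t, complement) / S_hat(C_t) over thresholds t in the entries of f."""
    f = np.asarray(f, dtype=float)
    best_ratio, best_set = np.inf, None
    for t in np.unique(f)[:-1]:          # t = max(f) would give the empty set
        mask = f > t
        cut = W[mask][:, ~mask].sum()
        bal = S_hat(frozenset(np.flatnonzero(mask).tolist()))
        if bal > 0 and cut / bal < best_ratio:
            best_ratio = cut / bal
            best_set = set(np.flatnonzero(mask).tolist())
    return best_set, best_ratio

# Two dense pairs joined by a weak edge: the weak edge should be cut
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.1, 0.0],
              [0.0, 0.1, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
S_hat = lambda A: len(A) * (4 - len(A))      # ratio-cut balancing function
A, ratio = best_threshold_cut(W, [0.0, 0.1, 0.9, 1.0], S_hat)
print(A, ratio)  # {2, 3} with ratio 0.1 / 4 = 0.025
```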

## 3 Minimization of Ratios of Non-negative Differences of Convex Functions via the RatioDCA-prox

We consider in this paper continuous optimization problems of the form

$$\min_{f\in\mathbb{R}^V}F(f),\quad\text{where }F(f)=\frac{R(f)}{S(f)}=\frac{R_1(f)-R_2(f)}{S_1(f)-S_2(f)},\qquad(2)$$

where $R_1,R_2,S_1,S_2:\mathbb{R}^V\to\mathbb{R}$ are convex and one-homogeneous and $R=R_1-R_2$ and $S=S_1-S_2$ are non-negative. Thus we are minimizing a non-negative ratio of d.c. (difference of convex) functions. As discussed above, the exact continuous relaxation of Theorem 2.1 leads exactly to such a problem, where $R(f)=\frac12\sum_{i,j=1}^n w_{ij}|f_i-f_j|$ and $S$ is the continuous relaxation of the balancing function. Different choices of balancing functions lead to different functions $S$.

While Jost_Setzer_Hein:HeiBue2010 ; Jost_Setzer_Hein:BLUB12 ; Jost_Setzer_Hein:BLUB2012 consider only algorithms for the minimization of ratios of convex functions, in Jost_Setzer_Hein:HeiSet2011 the RatioDCA has been proposed for the minimization of problems of type (2). The generalized version RatioDCA-prox is a family of algorithms which contains the work of Jost_Setzer_Hein:HeiBue2010 ; Jost_Setzer_Hein:HeiSet2011 ; Jost_Setzer_Hein:BLUB12 ; Jost_Setzer_Hein:BLUB2012 as special cases and allows us to treat the minimization problem (2) in a unified manner.

### 3.1 The RatioDCA-prox algorithm

The RatioDCA-prox algorithm for minimization of (2) is given in Algorithm 1. In each step one has to solve the convex optimization problem

$$\min_{G(u)\le1}\Phi^{c_k}_{f^k}(u),\qquad(3)$$

which we denote as the inner problem in the following with

$$\Phi^{c_k}_{f^k}(u):=R_1(u)-\langle u,r_2(f^k)\rangle+\lambda^k\bigl(S_2(u)-\langle u,s_1(f^k)\rangle\bigr)-c_k\langle u,g(f^k)\rangle$$

and $\lambda^k=F(f^k)$. As the constraint set we can choose any set $\{u\in\mathbb{R}^V\mid G(u)\le1\}$ containing a neighborhood of $0$ such that the inner problem is bounded from below, i.e. $G$ can be any nonnegative convex $p$-homogeneous function with $p\ge1$. Although a slightly more general formulation is possible, we choose the constraint set to be compact, i.e. $G(u)=0$ only for $u=0$. Moreover, $r_2(f^k)\in\partial R_2(f^k)$, $s_1(f^k)\in\partial S_1(f^k)$, $g(f^k)\in\partial G(f^k)$, where $\partial R_2$, $\partial S_1$, $\partial G$ are the subdifferentials. Note that for any $p$-homogeneous convex function $G$ we have the generalized Euler identity (Jost_Setzer_Hein:YanWei08, Theorem 2.1), that is, $\langle f,g(f)\rangle=p\,G(f)$ for all $g(f)\in\partial G(f)$.

Clearly, $\Phi^{c_k}_{f^k}$ is also one-homogeneous and with the Euler identity we get $\Phi^{c_k}_{f^k}(f^k)=R(f^k)-\lambda^kS(f^k)-c_kp\,G(f^k)=-c_kp\,G(f^k)\le0$, so we can always find minimizers at the boundary of the constraint set.
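The overall scheme can be sketched as follows for the special case $R_2=S_2=0$ and $G(u)=\|u\|_2$, with the inner problem (3) solved approximately by projected subgradient descent. This is our own illustrative instantiation, not the paper's implementation: the callables `R`, `r1`, `S`, `s1`, the fixed proximal weight `c`, and all step-size choices are assumptions. Per Remark 1, an approximate inner minimizer with $\Phi^{c_k}_{f^k}(f^{k+1})\le\Phi^{c_k}_{f^k}(f^k)$ suffices for monotone descent.

```python
import numpy as np

def ratio_dca_prox(R, r1, S, s1, f0, c=1.0, outer=30, inner=400, lr=0.02):
    """Minimize F(f) = R(f) / S(f) for convex one-homogeneous R, S
    (i.e. the special case R2 = S2 = 0) with constraint G(u) = ||u||_2 <= 1.
    r1(u), s1(u) must return subgradients of R, S at u."""
    f = np.asarray(f0, dtype=float)
    f = f / np.linalg.norm(f)
    lam = R(f) / S(f)
    for _ in range(outer):
        s, g = s1(f), f                      # s1(f^k); g(f^k) = f^k for G = ||.||_2
        phi = lambda u: R(u) - lam * (u @ s) - c * (u @ g)
        u, best = f.copy(), f.copy()
        for _ in range(inner):               # projected subgradient descent on (3)
            u = u - lr * (r1(u) - lam * s - c * g)
            nrm = np.linalg.norm(u)
            if nrm > 1.0:                    # project back onto the unit ball
                u = u / nrm
            if phi(u) < phi(best):
                best = u.copy()
        f_new = best / np.linalg.norm(best)
        lam_new = R(f_new) / S(f_new)
        if lam_new >= lam:                   # no further descent: terminate
            break
        f, lam = f_new, lam_new
    return f, lam

# Toy ratio F(f) = ||f||_1 / ||f||_2 >= 1, minimized by coordinate vectors
R, r1 = lambda u: np.abs(u).sum(), np.sign
S, s1 = lambda u: np.linalg.norm(u), lambda u: u / np.linalg.norm(u)
f, lam = ratio_dca_prox(R, r1, S, s1, np.array([1.0, 0.2]))
print(lam)  # close to the optimal value 1.0
```

Since $\Phi^{c_k}_{f^k}(f^{k+1})\le\Phi^{c_k}_{f^k}(f^k)$ is enforced by tracking the best inner iterate, the returned sequence of ratios is non-increasing even though the inner solve is inexact.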

The difference to the RatioDCA in Jost_Setzer_Hein:HeiSet2011 is the additional proximal term $-c_k\langle u,g(f^k)\rangle$ in $\Phi^{c_k}_{f^k}$ and the free choice of $G$. It is interesting to note that this term can be derived by applying the RatioDCA to a different d.c. decomposition of $F$. Let us write $F$ as

$$F=\frac{R_1'-R_2'}{S_1'-S_2'}=\frac{(R_1+c_RG)-(R_2+c_RG)}{(S_1+c_SG)-(S_2+c_SG)}\qquad(4)$$

with arbitrary $c_R,c_S\ge0$. If we now define $c_k=c_R+\lambda^kc_S$, the function to be minimized in the inner problem of the RatioDCA reads

$$\Phi'_{f^k}(u)=\Phi^{c_k}_{f^k}(u)+c_k\,G(u),$$

which is not necessarily one-homogeneous anymore. The following lemma implies that the minimizers of the inner problem of RatioDCA-prox and of RatioDCA applied to the d.c.-decomposition (4) can be chosen to be the same.

###### Lemma 1

For $\Phi'_{f^k}(u):=\Phi^{c_k}_{f^k}(u)+c_kG(u)$ we have $\min_{\nu\ge0}\Phi'_{f^k}(\nu f^{k+1})=\min_{u\in\mathbb{R}^V}\Phi'_{f^k}(u)$ for every $f^{k+1}\in\operatorname*{arg\,min}_{G(u)\le1}\Phi^{c_k}_{f^k}(u)$. Moreover,

1. if $f^{k+1}\in\operatorname*{arg\,min}_{G(u)\le1}\Phi^{c_k}_{f^k}(u)$ then $\nu f^{k+1}\in\operatorname*{arg\,min}_{u\in\mathbb{R}^V}\Phi'_{f^k}(u)$ for some $\nu\ge1$,

2. if $f^{k}\in\operatorname*{arg\,min}_{G(u)\le1}\Phi^{c_k}_{f^k}(u)$ then $f^{k}\in\operatorname*{arg\,min}_{u\in\mathbb{R}^V}\Phi'_{f^k}(u)$.

###### Proof

For fixed $f^{k+1}\in\operatorname*{arg\,min}_{G(u)\le1}\Phi^{c_k}_{f^k}(u)$ it follows from the one-homogeneity of $\Phi^{c_k}_{f^k}$ that any minimizer of $\Phi'_{f^k}$ is a multiple of one such $f^{k+1}$, so let us look at $\Phi'_{f^k}(\nu f^{k+1})$ with $\nu\ge0$ and $G(f^{k+1})=1$. We get from the homogeneity of $\Phi^{c_k}_{f^k}$ and $G$ and from $\Phi^{c_k}_{f^k}(f^{k+1})\le\Phi^{c_k}_{f^k}(f^k)=-c_kp$ that

$$\frac{\partial}{\partial\nu}\Phi'_{f^k}(\nu f^{k+1})=\Phi^{c_k}_{f^k}(f^{k+1})+c_kp\,\nu^{p-1}\le c_kp\,(\nu^{p-1}-1),$$

which is non-positive for $\nu\le1$, and with $\Phi'_{f^k}(0)=0$ it follows that a minimum along the ray is attained at some $\nu\ge1$. If $p>1$ then the global optimum of $\Phi'_{f^k}$ exists and by the previous arguments is attained at multiples of $f^{k+1}$. If $f^k\in\operatorname*{arg\,min}_{G(u)\le1}\Phi^{c_k}_{f^k}(u)$ then also the global optimum of $\Phi'_{f^k}$ exists and the claim follows since the derivative above vanishes at $\nu=1$, i.e. $f^k$ is a minimizer of $\Phi'_{f^k}$. ∎

Note that $G(f^k)=1$ is no restriction since we get from the one-homogeneity of $R$ and $S$ that $F(\alpha f)=F(f)$ for all $\alpha>0$. The following lemma verifies the intuition that the strength $c_k$ of the proximal term of RatioDCA-prox controls in some sense how near successive iterates are.

###### Lemma 2

Let $f^{k+1}_1\in\operatorname*{arg\,min}_{G(u)\le1}\Phi^{c_k}_{f^k}(u)$ and $f^{k+1}_2\in\operatorname*{arg\,min}_{G(u)\le1}\Phi^{d_k}_{f^k}(u)$.
If $c_k>d_k\ge0$ then $\langle f^{k+1}_1,g(f^k)\rangle\ge\langle f^{k+1}_2,g(f^k)\rangle$.

###### Proof

This follows from

$$\begin{aligned}\Phi^{d_k}_{f^k}(f^{k+1}_2)&\le\Phi^{d_k}_{f^k}(f^{k+1}_1)=\Phi^{c_k}_{f^k}(f^{k+1}_1)+(c_k-d_k)\langle f^{k+1}_1,g(f^k)\rangle\\&\le\Phi^{c_k}_{f^k}(f^{k+1}_2)+(c_k-d_k)\langle f^{k+1}_1,g(f^k)\rangle\\&=\Phi^{d_k}_{f^k}(f^{k+1}_2)+(d_k-c_k)\langle f^{k+1}_2,g(f^k)\rangle+(c_k-d_k)\langle f^{k+1}_1,g(f^k)\rangle,\end{aligned}$$

i.e. $(c_k-d_k)\bigl(\langle f^{k+1}_1,g(f^k)\rangle-\langle f^{k+1}_2,g(f^k)\rangle\bigr)\ge0$. ∎
###### Remark 1

As all proofs can be split up into the individual steps, we may choose a different $c_k$ and a different function $G$ in every step of the algorithm. Moreover, it will not be necessary that $f^{k+1}$ is an exact minimizer of the inner problem; we will only use that $\Phi^{c_k}_{f^k}(f^{k+1})\le\Phi^{c_k}_{f^k}(f^k)$.

### 3.2 Special cases

It is easy to see that for $c_k=0$ and $G(u)=\|u\|_2$ we get the RatioDCA Jost_Setzer_Hein:HeiSet2011 as a special case of the RatioDCA-prox. Moreover, Lemma 1 shows that the RatioDCA-prox corresponds to the RatioDCA with a general constraint set for the d.c. decomposition of the ratio given in (4).

If we apply RatioDCA-prox to the ratio cut problem, where $\hat S(A)=|A|\,|\overline{A}|$, then $S(f)=\frac12\sum_{i,j=1}^n|f_i-f_j|$ and Jost_Setzer_Hein:BLUB12 chose $v^k\in\partial S(\tilde f^k)$. The following lemma shows that for a particular choice of $c_k$ and $G$, RatioDCA-prox and algorithm 1 of Jost_Setzer_Hein:BLUB12 , which calculates iterates for fixed $c>0$ by

$$h^{k+1}=\operatorname*{arg\,min}_u\Bigl\{\tfrac12\sum_{i,j}w_{ij}|u_i-u_j|+\tfrac{\lambda^k}{2c}\bigl\|u-(\tilde f^k+c\,v^k)\bigr\|_2^2\Bigr\},\qquad\tilde f^{k+1}=h^{k+1}/\|h^{k+1}\|_2,$$

produce the same sequence if given the same initialization.

###### Lemma 3

If $\tilde f^0=f^0$, $\|f^0\|_2=1$, and one uses the same subgradients in each step then, for the sequence $\tilde f^k$ produced by algorithm 1 of Jost_Setzer_Hein:BLUB12 and the sequence $f^k$ produced by RatioDCA-prox with $G(u)=\|u\|_2$ and $c_k=\lambda^k/c$, we have $\tilde f^k=f^k$ for all $k$.

###### Proof

If $G(u)=\|u\|_2$ and $\|f^k\|_2=1$ we choose $g(f^k)=f^k$. For RatioDCA-prox we get $f^{k+1}$ by

$$f^{k+1}=\operatorname*{arg\,min}_{\|u\|_2^2\le1}\Phi^{c_k}_{f^k}(u)$$

and for the algorithm 1 of Jost_Setzer_Hein:BLUB12

$$h^{k+1}=\operatorname*{arg\,min}_u\Bigl\{R(u)+\tfrac{\lambda^k}{2c}\bigl(\|u\|_2^2-2\langle u,f^k\rangle-2\langle u,c\,v^k\rangle\bigr)\Bigr\}=\operatorname*{arg\,min}_u\Bigl\{\Phi^{c_k}_{f^k}(u)+\tfrac{\lambda^k}{2c}\|u\|_2^2\Bigr\}.$$

Finally, $\tilde f^{k+1}=h^{k+1}/\|h^{k+1}\|_2$ and application of Lemma 1 then shows that $\tilde f^{k+1}=f^{k+1}$. As $\|\cdot\|_2^2$ is strictly convex, the minimizers are unique. ∎

Analogously, the algorithm presented in Jost_Setzer_Hein:BLUB2012 is a special case of RatioDCA-prox applied to the ratio Cheeger cut, where $\hat S(A)=\min\{|A|,|\overline{A}|\}$ and $S(f)=\|f-\operatorname{median}(f)\mathbf{1}\|_1$.

### 3.3 Monotonicity and convergence

In this section we show that the sequence $F(f^k)$ produced by RatioDCA-prox is monotonically decreasing, similar to the RatioDCA of Jost_Setzer_Hein:HeiSet2011 , and, additionally, we show a convergence property which generalizes the results of Jost_Setzer_Hein:BLUB2012 ; Jost_Setzer_Hein:BLUB12 .

###### Proposition 1

For every nonnegative sequence $c_k$, any sequence $f^k$ produced by RatioDCA-prox satisfies $F(f^{k+1})<F(f^k)$ for all $k\ge0$ or the sequence terminates. Moreover, we get that $\lim_{k\to\infty}c_k\langle f^{k+1}-f^k,g(f^k)\rangle=0$.

###### Proof

If the sequence does not terminate then $\Phi^{c_k}_{f^k}(f^{k+1})<\Phi^{c_k}_{f^k}(f^k)$ and it follows

$$R(f^{k+1})-\lambda^kS(f^{k+1})-c_k\langle f^{k+1},g(f^k)\rangle\le\Phi^{c_k}_{f^k}(f^{k+1})<\Phi^{c_k}_{f^k}(f^k)=-c_k\langle f^k,g(f^k)\rangle,$$

where we used that for any one-homogeneous convex function $A$ we have for all $f,g$ and all $a\in\partial A(g)$

$$A(f)\ge A(g)+\langle f-g,a\rangle=\langle f,a\rangle.$$

Adding $c_k\langle f^{k+1},g(f^k)\rangle$ on both sides gives

$$R(f^{k+1})-\lambda^kS(f^{k+1})<c_k\langle f^{k+1}-f^k,\,g(f^k)\rangle\le0,\qquad(5)$$

where we used that, since $G$ is convex,

$$\langle f^{k+1}-f^k,\,g(f^k)\rangle\le G(f^{k+1})-G(f^k)\le0.$$

Dividing (5) by $S(f^{k+1})>0$ gives $\lambda^{k+1}<\lambda^k$. As the sequence $\lambda^k$ is bounded from below and monotonically decreasing and thus converging, and $S$ is bounded on the constraint set, we get the convergence result from

$$\lambda^{k+1}S(f^{k+1})-\lambda^kS(f^{k+1})\le c_k\langle f^{k+1}-f^k,\,g(f^k)\rangle\le0.$$

∎

If we choose $G(u)=\|u\|_2$ we get $\langle f^{k+1}-f^k,g(f^k)\rangle=-\tfrac12\|f^{k+1}-f^k\|_2^2$ and thus $\|f^{k+1}-f^k\|_2\to0$ if $c_k$ is bounded away from zero, as in the case of Jost_Setzer_Hein:BLUB2012 ; Jost_Setzer_Hein:BLUB12 , but we can show that this convergence holds for any strictly convex function $G$.

###### Proposition 2

If $G$ is strictly convex and $c_k\ge c>0$ for all $k$, then any sequence $f^k$ produced by RatioDCA-prox fulfills $\lim_{k\to\infty}\|f^{k+1}-f^k\|=0$.

###### Proof

As in the proof of Proposition 1, we have $\langle g(f^k),f^{k+1}-f^k\rangle\le0$ and $c_k\langle g(f^k),f^{k+1}-f^k\rangle\to0$. Suppose $\|f^{k+1}-f^k\|\not\to0$, i.e. there is $\epsilon>0$ with $\|f^{k+1}-f^k\|\ge\epsilon$ for infinitely many $k$. If $\langle g(f^k),f^{k+1}-f^k\rangle=0$, then the first order condition yields for $t\in[0,1]$

$$G(f^k+t(f^{k+1}-f^k))\ge G(f^k)+\langle g(f^k),t(f^{k+1}-f^k)\rangle=G(f^k)=1,$$

which is a contradiction to the strict convexity of $G$, as for $t\in(0,1)$,

$$G(f^k+t(f^{k+1}-f^k))<(1-t)G(f^k)+tG(f^{k+1})=1.$$

Thus, with the compactness of $G_\epsilon:=\{u\mid G(u)\le1,\ \|u-f^k\|\ge\epsilon\}$, we get

$$\langle g(f^k),f^{k+1}-f^k\rangle\le\max_{u\in G_\epsilon}\langle g(f^k),u-f^k\rangle=:\delta<0.$$

However, with $c_k\ge c>0$ for all $k$ this contradicts for $k$ large enough the result $c_k\langle f^{k+1}-f^k,g(f^k)\rangle\to0$ of Proposition 1. Thus under the stated conditions $\|f^{k+1}-f^k\|\to0$ as $k\to\infty$. ∎

While the previous result does not establish convergence of the sequence, it establishes that the set of accumulation points has to be connected.

As we are interested in minimizing the ratio $F$, we want to find in each step vectors $f^{k+1}$ with $F(f^{k+1})<F(f^k)$.

###### Lemma 4

If $\lambda^k=0$ then every vector in the sequence produced by RatioDCA-prox fulfills $F(f^{k+j})=0$ for all $j\ge0$, i.e. the algorithm terminates at $f^k$.

###### Proof

As $R_1$ and $R_2$ are one-homogeneous and $\lambda^k=0$, we have for any vector $h$ with $G(h)\le1$ and $\langle h,g(f^k)\rangle\le\langle f^k,g(f^k)\rangle$,

$$\Phi^{c_k}_{f^k}(h)\ge R(h)-c_k\langle f^k,g(f^k)\rangle\ge-c_k\langle f^k,g(f^k)\rangle=\Phi^{c_k}_{f^k}(f^k),$$

where we have used that $R(h)\ge0$. Further, if $f^k$ is a minimizer of the inner problem then the algorithm terminates. ∎

### 3.4 Choice of the constraint set and the proximal term

While the iterates and thus the final result of RatioDCA and RatioDCA-prox differ in general, the following lemma shows that termination of RatioDCA implies termination of RatioDCA-prox, and under some conditions the reverse implication also holds true. Thus switching from RatioDCA to RatioDCA-prox at termination does not yield further descent.

###### Lemma 5

Let $f^k_1$, $f^k_2$, $c_k$, $\lambda^k$, $g$ be as in the algorithm RatioDCA-prox and

$$\Omega_1=\operatorname*{arg\,min}_{G(u)\le1}\Phi^{c_k}_{f^k_1}(u),\quad\text{and}\quad\Omega_2=\operatorname*{arg\,min}_{\|u\|_2\le1}\Phi^{0}_{f^k_2}(u).$$

Then the following implications hold:

1. If $f^k_1=f^k_2\in\Omega_2$ then $f^k_1\in\Omega_1$.

2. If $f^k_1=f^k_2\in\Omega_1$ and either $c_k=0$ or $\partial G(f^k_1)=\{g(f^k_1)\}$ then $f^k_2\in\Omega_2$.

###### Proof

If $f^k_1=f^k_2\in\Omega_2$ then $\Phi^0_{f^k_1}(f^k_1)=0$. As $\Phi^0_{f^k_1}$ is one-homogeneous, $f^k_1$ is also a global minimizer and thus for all $u$ with $G(u)\le1$, $\Phi^0_{f^k_1}(u)\ge\Phi^0_{f^k_1}(f^k_1)$. As $\langle u,g(f^k_1)\rangle\le\langle f^k_1,g(f^k_1)\rangle$ for all such $u$, $f^k_1$ is a minimizer of $\Phi^{c_k}_{f^k_1}$ over the constraint set, which proves the first part.
On the other hand, if

$$f^k_1\in\operatorname*{arg\,min}_{G(u)\le1}\Phi^{c_k}_{f^k_1}(u),$$

then by Lemma 1 also

$$f^k_1\in\operatorname*{arg\,min}_u\bigl\{\Phi^{c_k}_{f^k_1}(u)+c_kG(u)\bigr\}.$$

Being a global minimizer implies

$$0\in\partial\bigl(\Phi^{c_k}_{f^k_1}+c_kG\bigr)(f^k_1)=\partial\Phi^0_{f^k_1}(f^k_1)-c_kg(f^k_1)+c_k\partial G(f^k_1)=\partial\Phi^0_{f^k_1}(f^k_1),$$

where we used that by assumption $c_k\partial G(f^k_1)=\{c_kg(f^k_1)\}$. Thus $f^k_1$ is also a minimizer of $\Phi^0_{f^k_1}$ and the result follows with $f^k_1=f^k_2$ and the one-homogeneity of $\Phi^0_{f^k_2}$. ∎

### 3.5 Nonlinear eigenproblems

The sequence $F(f^k)$ is not only monotonically decreasing; we now show that cluster points of the sequence $f^k$ are generalized nonlinear eigenvectors as introduced in Jost_Setzer_Hein:HeiBue2010 .

###### Theorem 3.1

Each cluster point $f^*$ of the sequence $f^k$ produced by RatioDCA-prox fulfills $F(f^*)=\lambda^*$ for a $\lambda^*\in[0,F(f^0)]$ and there exists $c^*\ge0$ with

$$0\in\partial\bigl(R_1+c^*G\bigr)(f^*)-\partial\bigl(R_2+c^*G\bigr)(f^*)-\lambda^*\bigl(\partial S_1(f^*)-\partial S_2(f^*)\bigr).$$

If for every $f$ with $G(f)=1$ the subdifferential $\partial G(f)$ is unique or $c_k=0$ for all $k$, then

$f^*$ is an eigenvector with eigenvalue $\lambda^*$

in the sense that it fulfills

$$0\in\partial R_1(f^*)-\partial R_2(f^*)-\lambda^*\bigl(\partial S_1(f^*)-\partial S_2(f^*)\bigr).\qquad(6)$$
###### Proof

By Proposition 1 the sequence $F(f^k)$ is monotonically decreasing. By assumption $R$ and $S$ are nonnegative and hence $F$ is bounded below by zero. Thus we have convergence towards a limit

$$\lambda^*=\lim_{k\to\infty}F(f^k).$$

Note that $f^k$ is contained in a compact set, which implies that there exists a subsequence $f^{k_j}$ converging to some element $f^*$. As the sequence $F(f^{k_j})$ is a subsequence of a convergent sequence, it has to converge towards the same limit, hence also

$$\lim_{j\to\infty}F(f^{k_j})=\lambda^*.$$

Assume now that $f^*\notin\operatorname*{arg\,min}_{G(u)\le1}\Phi^{c}_{f^*}(u)$ for all $c\ge0$. Then by Proposition 1, any vector $f(c)\in\operatorname*{arg\,min}_{G(u)\le1}\Phi^{c}_{f^*}(u)$ satisfies

$$F(f(c))<\lambda^*=F(f^*),$$

which is a contradiction to the fact that the sequence $F(f^k)$ has converged to $\lambda^*$. Thus there exists $c^*\ge0$ such that $f^*\in\operatorname*{arg\,min}_{G(u)\le1}\Phi^{c^*}_{f^*}(u)$ and by Lemma 1 then $f^*\in\operatorname*{arg\,min}_u\{\Phi^{c^*}_{f^*}(u)+c^*G(u)\}$ and we get

$$0\in\partial R_1(f^*)-r_2(f^*)+\lambda^*\bigl(\partial S_2(f^*)-s_1(f^*)\bigr)-c^*g(f^*)+c^*\partial G(f^*).$$

If $c_k=0$ for all $k$ then we only need to look at $c^*=0$. In this case, or if $\partial G(f^*)=\{g(f^*)\}$, we get from $c^*\partial G(f^*)-c^*g(f^*)=\{0\}$ that

$$0\in\partial R_1(f^*)-r_2(f^*)+\lambda^*\bigl(\partial S_2(f^*)-s_1(f^*)\bigr),$$

which then implies that $f^*$ is an eigenvector in the sense of (6) with eigenvalue $\lambda^*$. ∎

###### Remark 2

(6) is a necessary condition for $f^*$ being a critical point of $F$. If $R_2$ and $S_2$ are continuously differentiable at $f^*$, it is also sufficient. The necessity of (6) follows from (Jost_Setzer_Hein:Cla83, Proposition 2.3.14). If $R_2$ and $S_2$ are continuously differentiable at $f^*$ then we get from (Jost_Setzer_Hein:Cla83, Propositions 2.3.6 and 2.3.14) that $\partial R(f^*)=\partial R_1(f^*)-\nabla R_2(f^*)$ and $\partial S(f^*)=\partial S_1(f^*)-\nabla S_2(f^*)$, and $f^*$ is a critical point of $F$.

## 4 The RatioDCA-prox for Ratios of Lovasz Extensions - Application to Balanced Graph Cuts

A large class of combinatorial problems Jost_Setzer_Hein:HeiSet2011 ; Jost_Setzer_Hein:BueEtAl2013 allows for an exact continuous relaxation which results in a minimization problem of a non-negative ratio of Lovasz extensions as introduced in Section 2. In this paper, we restrict ourselves to balanced graph cuts even though most statements can be immediately generalized to the class of problems considered in Jost_Setzer_Hein:BueEtAl2013 .

We first collect some important properties of Lovasz extensions before we prove stronger results for the RatioDCA-prox when applied to minimize a non-negative ratio of Lovasz extensions.

### 4.1 Properties of the Lovasz extension

The following lemma is a reformulation of (Jost_Setzer_Hein:Bac2013, Proposition 4.2(c)) for our purposes:

###### Lemma 6

Let $\hat S$ be a submodular set function with $\hat S(\emptyset)=0$. If $S$ is the Lovasz extension of $\hat S$ then

$$\langle\partial S(f),\mathbf{1}_{C_i}\rangle=S(\mathbf{1}_{C_i})=\hat S(C_i)$$

for all sets $C_i=\{j\in V\mid f_j>f_i\}$.

###### Proof

Let w.l.o.g. $f$ be in increasing order $f_1\le f_2\le\dots\le f_n$. With $f=f_1\mathbf{1}+\sum_{i=1}^{n-1}(f_{i+1}-f_i)\mathbf{1}_{C_i}$ and $\langle\partial S(f),\mathbf{1}\rangle=\hat S(V)$ (as $S(f+\alpha\mathbf{1})=S(f)+\alpha\hat S(V)$) we get

$$\sum_{i=1}^{n-1}\hat S(C_i)(f_{i+1}-f_i)=S(f)-f_1\hat S(V)=\langle\partial S(f),f\rangle-f_1\hat S(V)=\sum_{i=1}^{n-1}\langle\partial S(f),\mathbf{1}_{C_i}\rangle(f_{i+1}-f_i).$$

Since $\hat S$ is submodular, $S$ is convex and thus $\langle\partial S(f),\mathbf{1}_{C_i}\rangle\le S(\mathbf{1}_{C_i})=\hat S(C_i)$, but because the coefficients $f_{i+1}-f_i$ are nonnegative and the two sums agree, this holds with equality in all cases. ∎
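Lemma 6 can be verified numerically with the standard "greedy" subgradient of the Lovasz extension; the telescoping construction below and all names are our own illustration, and we assume distinct entries of $f$ for simplicity:

```python
import numpy as np

def lovasz_subgradient(S_hat, f):
    """Greedy subgradient s of the Lovasz extension of S_hat at f
    (distinct entries assumed): with vertices ordered increasingly by f
    and C_i the set of the n - i largest vertices,
    set s at the i-th smallest vertex to S_hat(C_i) - S_hat(C_{i+1})."""
    f = np.asarray(f, dtype=float)
    n = len(f)
    order = np.argsort(f)
    C = [frozenset(order[i:].tolist()) for i in range(n)] + [frozenset()]
    s = np.zeros(n)
    for i in range(n):
        s[order[i]] = S_hat(C[i]) - S_hat(C[i + 1])
    return s

# Check <s, 1_{C_i}> = S_hat(C_i) on a small example (telescoping sum)
S_hat = lambda A: len(A) * (3 - len(A))      # submodular, S_hat(empty) = 0
f = np.array([0.1, 0.5, 0.9])
s = lovasz_subgradient(S_hat, f)
print(s @ np.array([0.0, 1.0, 1.0]))  # = S_hat({1, 2}) = 2.0
```

By the same telescoping argument, $\langle s,f\rangle$ recovers the Lovasz extension value $S(f)$ itself.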

More generally, this also holds if $\hat S$ is not submodular:

Let