A unifying Perron-Frobenius theorem for nonnegative tensors via multi-homogeneous maps

01/12/2018 ∙ by Antoine Gautier, et al. ∙ Universität Saarland

Inspired by the definition of symmetric decomposition, we introduce the concept of shape partition of a tensor and formulate a general tensor spectral problem that includes all the relevant spectral problems as special cases. We formulate irreducibility and symmetry properties of a nonnegative tensor T in terms of the associated shape partition. We recast the spectral problem for T as a fixed point problem on a suitable product of projective spaces. This allows us to use the theory of multi-homogeneous order-preserving maps to derive a general and unifying Perron-Frobenius theorem for nonnegative tensors that either implies previous results of this kind or improves them by weakening the assumptions there considered. We introduce a general power method for the computation of the dominant tensor eigenpair, and provide a detailed convergence analysis.


1 Introduction

Tensor eigenvalue problems have gained considerable attention in recent years as they arise in a number of relevant applications, such as best rank-one approximation in data analysis [6, 18], higher-order Markov chains [17], solid mechanics and the entanglement problem in quantum physics [5, 16], multi-layer network analysis [20], and many others. A number of contributions have addressed relevant issues both from the theoretical and the numerical point of view. The multi-dimensional nature of tensors naturally gives rise to a variety of eigenvalue problems. In fact, the classical eigenvalue and singular value problems for a matrix can be generalized to the tensor setting following different constructions, which lead to different notions of eigenvalues and singular values for tensors, all of them reducing to the standard matrix case when the tensor is assumed to be of order two. Moreover, the extension of the power method to the tensor setting, including certain shifted variants, is the best known method for the computation of tensor eigenpairs.

For tensors with nonnegative entries, many authors have worked on tensor generalizations of the Perron-Frobenius theorem for matrices [2, 3, 8, 15, 17]. In this setting, existence, uniqueness and maximality of positive eigenpairs of the tensor are discussed in terms of certain irreducibility assumptions. Moreover, as in the matrix case, Perron-Frobenius type results make it possible to address the global convergence of the power method for tensors with nonnegative entries [2, 5, 9, 17].

However, all the contributions that have appeared so far address particular cases of tensor spectral problems individually. In this work we formulate a general tensor spectral problem which includes the known formulations as special cases. Moreover, we prove a new Perron-Frobenius theorem for the general tensor eigenvalue problem which recovers the previous results as particular cases and often significantly weakens the assumptions previously made. In addition, we prove the global convergence of a nonlinear version of the power method that computes the dominant eigenpair for the general tensor eigenvalue problem, under mild assumptions on the tensor and with an explicit upper bound on the convergence rate.

For the sake of clarity, we first discuss the case of a square tensor of order three, . Let denote the multilinear form induced by T,

and, for , consider the following nonlinear Rayleigh quotients:

(1)

Note that, since the tensor is nonnegative and has odd order, the maximum of over its domain provides a notion of norm of T, for . Furthermore note that and lead naturally to the definition of -eigenvectors, -singular vectors and -singular vectors of the tensor T [15]. Indeed the latter are respectively defined as the solutions of the following spectral equations

(2)

where and, for , the mapping is the gradient of .
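As a rough numerical illustration of the order-three setting described above, the following sketch evaluates the multilinear form of a small nonnegative tensor together with three Rayleigh-type quotients: one with a single vector in all three slots, one with two distinct vectors, and one with three. The normalization by products of p-, q- and r-norms and all function names (`multilinear_form`, `rayleigh_1`, `rayleigh_2`, `rayleigh_3`) are assumptions made for illustration and need not agree in every detail with the quotients in (1).

```python
import numpy as np

def multilinear_form(T, x, y, z):
    """f(x, y, z) = sum_{i,j,k} T[i, j, k] * x[i] * y[j] * z[k]."""
    return np.einsum("ijk,i,j,k->", T, x, y, z)

def pnorm(v, p):
    """The usual p-norm of a vector."""
    return np.sum(np.abs(v) ** p) ** (1.0 / p)

def rayleigh_1(T, x, p):
    # Assumed form: one vector in all three slots.
    return multilinear_form(T, x, x, x) / pnorm(x, p) ** 3

def rayleigh_2(T, x, y, p, q):
    # Assumed form: x in the first slot, y in the second and third.
    return multilinear_form(T, x, y, y) / (pnorm(x, p) * pnorm(y, q) ** 2)

def rayleigh_3(T, x, y, z, p, q, r):
    # Assumed form: three independent vectors.
    return multilinear_form(T, x, y, z) / (pnorm(x, p) * pnorm(y, q) * pnorm(z, r))

rng = np.random.default_rng(0)
T = rng.random((3, 3, 3))                 # nonnegative entries
x, y, z = rng.standard_normal((3, 3))     # three vectors with mixed signs
```

Under these assumptions each quotient is invariant under rescaling of its arguments, and for a nonnegative tensor replacing every vector by its componentwise absolute value cannot decrease the absolute value of the quotient.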

It is well known that the singular values of a matrix always admit a variational characterization, whereas the same holds true for eigenvalues only if the matrix is symmetric. A similar situation occurs for tensors, where suitable symmetry assumptions on T are required in order to relate the critical points of the Rayleigh quotients in (1) with the solutions of the spectral equations in (2): If T is super symmetric, i.e. the entries of T are invariant under any permutation of its indices, then and so the correspondence between the critical points of and the solutions to is clear. If T is partially symmetric with respect to its second and third indices, i.e. for every , then and and, again, it can be verified that the critical points of coincide with the solutions to the second system in (2). Finally, the third system in (2) always characterizes the critical points of as . This latter case is the analogue of the singular value problem for matrices. If T does not have such symmetries, then it can be shown that the critical points of and are solutions to spectral systems analogous to those in (2) but where the mapping is the gradient of and is a symmetrized version of T whose construction depends on the considered problem. Note that this phenomenon is, again, aligned with the matrix case. In fact, the quadratic form associated to a matrix always coincides with the form associated with the symmetric matrix . We discuss this property in detail in Section 4.
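The matrix fact mentioned at the end of this paragraph, and its order-three analogue for the form evaluated on a single repeated vector, can be checked numerically. The sketch below assumes the simplest construction, namely averaging over all permutations of the indices; the helper name `symmetrize` is hypothetical, and the symmetrizations used for the other Rayleigh quotients may differ from this one.

```python
import itertools
import numpy as np

def symmetrize(T):
    """Average T over all permutations of its axes (all dimensions must agree)."""
    perms = list(itertools.permutations(range(T.ndim)))
    return sum(np.transpose(T, perm) for perm in perms) / len(perms)

rng = np.random.default_rng(1)
x = rng.standard_normal(4)

# Matrix case: x^T M x equals x^T ((M + M^T) / 2) x.
M = rng.random((4, 4))
quad_M = x @ M @ x
quad_sym = x @ ((M + M.T) / 2) @ x

# Order-three case: the form on (x, x, x) is unchanged by full symmetrization.
T = rng.random((4, 4, 4))
S = symmetrize(T)
cubic_T = np.einsum("ijk,i,j,k->", T, x, x, x)
cubic_S = np.einsum("ijk,i,j,k->", S, x, x, x)
```

The two quadratic values agree, and so do the two cubic values, although `T` and `S` differ entrywise in general.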

Now, if T has nonnegative entries, i.e. , for all , then a simple argument shows that for every and where the absolute value is taken component wise. In particular, this implies that the maximum of is attained in the nonnegative orthant . Similar arguments show that the maxima of and are attained on nonnegative vectors as well. There is a vast literature on the study of the solutions to the systems in (2) in the particular setting where T is nonnegative. We refer to it as the Perron-Frobenius theory for nonnegative tensors [4]. Let us briefly recall typical results of the latter theory. To this end, in this paragraph, we abuse the nomenclature and refer to a solution of one of the systems in (2) as an eigenpair of T. These eigenpairs are of the form where and belongs to , or depending on which problem is considered. First, it can be shown that when are large enough, then there is always a nonnegative maximal eigenpair , i.e. is an eigenvalue of largest magnitude and has nonnegative components. Note that this is consistent with the fact that the maximum of the Rayleigh quotient is attained at nonnegative vectors, as observed above. The maximal eigenvalue has min-max and max-min characterizations which are usually referred to as Collatz-Wielandt formulas [3, 8, 9]. Furthermore, under additional irreducibility assumptions on T, known as weak irreducibility, it can be shown that has strictly positive components and that it is the unique eigenvector with this property. Moreover, assuming further irreducibility conditions, known as strong irreducibility, it can be shown that T has a unique nonnegative eigenvector. Finally, it is possible to derive conditions under which ad-hoc versions of the power method converge to . This computational aspect is particularly interesting as it makes it possible to accurately estimate the maximum of the Rayleigh quotient with a simple and efficient iterative method. Convergence rates for such generalizations of the power method have been derived for instance in [8, 9, 13, 23] under quite restrictive assumptions on and the irreducibility of T.
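To give a concrete feel for the power-type iterations mentioned above, here is a minimal sketch for an entrywise positive tensor of order three. It is not the algorithm analyzed in this paper: it assumes the single-vector spectral equation in the form T(x, x) = λ x^(p-1) for positive x, with the power taken componentwise, and it assumes that p is larger than the order of the tensor so that the normalized map is a contraction in the Hilbert projective metric. The names `tensor_apply` and `power_method` are hypothetical.

```python
import numpy as np

def tensor_apply(T, x):
    """Contract an order-three tensor with x in its second and third modes."""
    return np.einsum("ijk,j,k->i", T, x, x)

def power_method(T, p, iters=500, tol=1e-13):
    """Fixed-point iteration x <- (T x x)^(1/(p-1)), normalized in the p-norm."""
    n = T.shape[0]
    x = np.ones(n) / n ** (1.0 / p)        # positive start with unit p-norm
    for _ in range(iters):
        y = tensor_apply(T, x) ** (1.0 / (p - 1))
        y /= np.sum(y ** p) ** (1.0 / p)
        done = np.max(np.abs(y - x)) < tol
        x = y
        if done:
            break
    # For a unit p-norm solution, the eigenvalue equals the form f(x, x, x).
    lam = x @ tensor_apply(T, x)
    return lam, x

rng = np.random.default_rng(2)
T = rng.random((4, 4, 4)) + 0.1            # entrywise positive tensor
p = 4.0                                    # assumption: p exceeds the order 3
lam, x = power_method(T, p)
residual = np.max(np.abs(tensor_apply(T, x) - lam * x ** (p - 1)))
```

The quantity `residual` measures how well the computed pair satisfies the assumed spectral equation; the iterate stays entrywise positive throughout because the tensor is positive.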

In this paper, we address tensors of any order and propose a framework that allows us to unify the study of all spectral equations of the type shown in (2) and to prove a general Perron-Frobenius theorem which either improves the known results mentioned above or includes them as special cases. In particular, we give new conditions for the existence, uniqueness and maximality of positive eigenpairs for a broad class of tensor spectral equations, we prove new characterizations for the maximal eigenvalue and we discuss the convergence of the power method including explicit rates of convergence. This is done by introducing a parametrization, which we call shape partition, so that the three problems discussed in (2) can be recovered with a suitable choice of the partition. Moreover, shape partitions allow us to introduce general definitions of weak and strong irreducibility, which both reduce to existing counterparts for suitable choices of the partition. We discuss in detail the relationship between different types of irreducible nonnegative tensors and we show how they are related for different spectral equations.

A particular contribution of this paper is that we reformulate these tensor spectral problems in terms of suitable multi-homogeneous maps and the associated fixed points on a product of projective spaces. Thus, based on our results in [10], we show that most of the tensor spectral problems correspond to a multi-homogeneous mapping that is contractive with respect to a suitably defined projective metric. This relatively simple observation turns out to be very relevant as it makes it possible to systematically weaken the assumptions made in the Perron-Frobenius literature for nonnegative tensors so far. The paper is written in a self-contained manner. However, for the proofs we rely heavily on our results from [10].

2 Preliminaries

In this section we fix the main notation and definitions that are required to formulate the Rayleigh quotients in (1) and the associated spectral problems in a unified fashion for the general case of a tensor of any order and with possibly different dimensions.

Let be a nonnegative tensor of order , and define the induced multilinear form as

where for all . Furthermore, let us consider the gradient of , that is let with and defined as

As for the case of a square tensor of order three, described in the previous section, several Rayleigh quotients and spectral equations can be associated to . For instance, we have now up to different choices of the norms in the denominator of (1). Moreover, various choices for the numerator are possible, depending on how one partitions the dimensions of . In order to formalize these properties for a general tensor , we introduce here the concept of shape partition.

[Shape partition] We say that is a shape partition of if is a partition of , i.e.  and for , such that for every and , it holds . Moreover, we always assume that:

  (a) For every and it holds .

  (b) If , then for every .

Observe that the conditions (a) and (b) in the above definition are not restrictive. Indeed, if is a partition of such that for every and , then there exists a permutation such that defined as is a shape partition of the tensor defined as for all . For instance if and , then one can define for all and .
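The multilinear form and its gradient introduced at the beginning of this section, together with the reordering of modes used in the observation above, can be sketched numerically for a tensor of any order with possibly different dimensions. All names below (`multilinear_form`, `partial_gradient`, `perm`) are hypothetical, and the specific permutation is only an example.

```python
import numpy as np

def multilinear_form(T, xs):
    """Contract the order-m tensor T with the vectors xs[0], ..., xs[m-1]."""
    out = T
    for x in xs:
        # Each step contracts the current leading axis with the next vector.
        out = np.tensordot(x, out, axes=([0], [0]))
    return float(out)

def partial_gradient(T, xs, i):
    """Gradient of the form with respect to xs[i]: contract every mode except i."""
    out = np.moveaxis(T, i, -1)            # keep mode i as the last axis
    for k, x in enumerate(xs):
        if k != i:
            out = np.tensordot(x, out, axes=([0], [0]))
    return out

rng = np.random.default_rng(3)
dims = (2, 3, 3, 4)                        # possibly different dimensions
T = rng.random(dims)
xs = [rng.standard_normal(n) for n in dims]

# Reordering the modes of T and the vectors consistently leaves the form unchanged.
perm = (1, 2, 0, 3)
T_perm = np.transpose(T, perm)
xs_perm = [xs[k] for k in perm]
```

Since the form is linear in each argument, the inner product of `xs[i]` with `partial_gradient(T, xs, i)` returns the value of the form itself for every mode `i`.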

Figure 1: Ferrers diagram of the integer partition of (left) and the Cartesian product of the integer partition of with itself. They correspond to the shape partition of a fourth order tensor whose dimensions of the orders are all the same (left) or coincide two by two (right).

The concept of shape partition of a tensor is closely related to the integer partitions of its order. More precisely, it is related to the Cartesian product of the integer partitions of the number of orders of T having the same dimension. Let us explain this with an example. Let T be a tensor of order four. If then the shape partitions of T are , , , and which formally coincide with the integer partitions of the number , i.e. the order of T. By contrast, when , the shape partitions of T formally coincide with the Cartesian product of the integer partitions of the number . Precisely, when , then , , are the shape partitions of T. To help intuition, we show in Figure 1 the Ferrers diagrams of the shape partitions of the example tensor discussed here.
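The combinatorial objects behind this correspondence are easy to enumerate. The sketch below lists the integer partitions of 4 (one group of four modes with equal dimension) and the Cartesian product of the integer partitions of 2 with itself (two groups of two modes each). It enumerates block sizes only, not the shape partitions themselves, and the helper `integer_partitions` is a hypothetical name.

```python
from itertools import product

def integer_partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of positive integers."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in integer_partitions(n - first, first):
            yield (first,) + rest

# All four dimensions equal: one group of four modes.
partitions_of_4 = list(integer_partitions(4))

# Dimensions equal two by two: two groups of two modes each.
partitions_of_2 = list(integer_partitions(2))
pairs = list(product(partitions_of_2, repeat=2))
```

Each tuple in `partitions_of_4` corresponds to one row structure of a Ferrers diagram, while each element of `pairs` combines one partition per group of equal dimensions.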

Shape partitions are useful and convenient for describing all spectral systems of the same form as (2) but for tensors of any order. In particular, throughout this paper we associate to each shape partition of the numbers , , and defined as follows:

(3)

and .

Given a shape partition we will always assume the definitions in (3), although the reference to the specific will be understood implicitly. Moreover, for convenience, we will very often use the in place of the . The relation between these two numbers is made more clear by noting that the dimensions of can be rewritten as follows:

Now, given and the shape partition of , we define the Rayleigh quotient of induced by and as follows:

(4)

In particular, we note that the functions of (1) can be recovered by setting , and respectively.

The Rayleigh quotient (4) is naturally related to a norm of the tensor T which depends on both the shape partition and the choice of the norms . We denote this norm by . Note that the absolute value in the definition of can be omitted when T is nonnegative. In fact, as discussed in the introduction, if T is nonnegative, then the maximum is always attained at nonnegative vectors. In the case and , is called the spectral norm of T and it is known that its computation is NP-hard in general (cf. [12]). If , then coincides with the -norm of the matrix [2] and its computation is also known to be NP-hard for general matrices if, for instance, is a rational number or , see e.g. [11, 19].
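In the simplest instance, a matrix with Euclidean norms on both sides, the maximum of the bilinear Rayleigh quotient is the largest singular value, which is cheap to compute. The sketch below checks this numerically; it covers only that order-two special case, assumed here for illustration, and says nothing about the hard cases mentioned above.

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.random((3, 5))

# The leading singular vectors maximize |x^T M y| / (||x||_2 ||y||_2).
U, s, Vt = np.linalg.svd(M)
x, y = U[:, 0], Vt[0, :]
quotient = abs(x @ M @ y) / (np.linalg.norm(x) * np.linalg.norm(y))
spectral_norm = np.linalg.norm(M, 2)

# Random trial vectors never exceed the spectral norm.
trials = [
    abs(u @ M @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    for u, v in zip(rng.standard_normal((100, 3)), rng.standard_normal((100, 5)))
]
```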

A direct computation shows that the critical points of in (4) are solutions to the following spectral equation:

(5)

where denotes the gradient of the map , for all and if and .

It is important to note that and do not coincide in general, unless . Hence, we consider a more general class of spectral problems for tensors which is formulated as follows:

(6)

Depending on the choice of , various known spectral problems related to nonnegative tensors can be recovered from (6). Indeed, if , then and we recover equation (1.2) in [8] which characterizes the -singular vectors of . If , then for some and we recover equation (2) in [16] which characterizes the -singular vectors of the rectangular tensor . Finally, if , then and we recover equation (7) in [15] which characterizes the -eigenvectors of . Perron-Frobenius type results have been established for each of the aforementioned spectral problems. In order to unify these results, we introduce here the following definition: [-eigenvalues and eigenvectors] We say that is a -eigenpair of if it satisfies (6). We call a -eigenvalue of and a -eigenvector of .

Key assumptions in the Perron-Frobenius theory of nonnegative tensors are strict nonnegativity, weak irreducibility and (strong) irreducibility. In order to address the general spectral problem of Definition 1, we recast such assumptions in terms of the chosen shape partition.

[-nonnegativity and -irreducibility] For a nonnegative tensor and an associated shape partition , consider the matrix defined as

where is the vector of all ones. We say that is:

  • -strictly nonnegative, if has at least one nonzero entry per row.

  • -weakly irreducible, if is irreducible.

  • -strongly irreducible, if for every that is not entry-wise positive and is such that for all , there exists such that and .
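The first two conditions are stated in terms of a nonnegative matrix, so they can be tested with standard matrix tools. The sketch below does not construct the matrix associated with a tensor and a shape partition; it only shows, for an arbitrary nonnegative square matrix, how one might check for a nonzero entry in every row and for irreducibility. The function names are hypothetical.

```python
import numpy as np

def has_nonzero_rows(M):
    """True if every row of the nonnegative matrix M has a nonzero entry."""
    return bool(np.all(M.sum(axis=1) > 0))

def is_irreducible(M):
    """True if the directed graph of M is strongly connected.

    Uses the classical criterion that an n x n nonnegative matrix M is
    irreducible if and only if (I + M)^(n-1) is entrywise positive.
    Only the nonzero pattern of M matters.
    """
    n = M.shape[0]
    pattern = (np.eye(n) + (M > 0)).astype(float)
    return bool(np.all(np.linalg.matrix_power(pattern, n - 1) > 0))

cycle = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)
triangular = np.array([[1, 1], [0, 1]], dtype=float)
zero_row = np.array([[1, 1], [0, 0]], dtype=float)
```

Here `cycle` is irreducible, `triangular` has a nonzero entry in every row but is reducible, and `zero_row` fails both tests.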

These definitions coincide with most of the corresponding definitions introduced for special cases. Indeed, if , -strict nonnegativity reduces to the definition of strictly nonnegative tensor introduced in [13]. If , -weak irreducibility reduces to the definition of weak irreducibility introduced in [8] and [16], respectively. If , -strong irreducibility reduces to the existing definitions of irreducibility introduced in [3] and [8]. However, in the case , -strong irreducibility is strictly less restrictive than the definition of irreducibility introduced in [5]. In Section 6.4 we give a detailed characterization of each of these classes of nonnegative tensors. In particular, we propose equivalent formulations of these classes of tensors in terms of graphs and in terms of the entries of T. Furthermore, we show in Theorem 6.5 that -strong irreducibility implies -weak irreducibility which itself implies -strict nonnegativity. We also study how these classes are related, for a fixed tensor but different choices of .

Using different shape partitions, one can associate several spectral problems to a tensor via Definition 1 and sometimes one can transfer properties that hold true for one formulation to another one. For instance, if a symmetric matrix is irreducible, i.e. is -irreducible, then its corresponding bipartite graph is strongly connected, i.e. is also -irreducible. In particular, this implies that the classical Perron-Frobenius theorem holds not only for the eigenpairs of but also for its singular pairs. A similar situation arises in the more general setting of tensors. In order to formalize this property, we define the following partial order on the set of shape partitions of : Let , be two shape partitions of , then we write if and there exists such that for every .

Note that, for instance, in Remark 2 we have whereas and . Moreover, note that the shape partitions and of the symmetric matrix above satisfy and irreducibility with respect to carries over to . More generally, we discuss in Sections 4 and 6 several properties of the tensor preserved by the partial ordering , that is properties that automatically hold for when holding for a shape partition such that . In particular, this is the case of tensor symmetries that we define below in terms of .

We have already mentioned in the introduction that, as for the case of matrices, symmetries in the entries of allow for different variational characterizations of the associated spectrum. Therefore, given the shape partition of , we introduce the definition of -symmetry. The latter is based on the concept of partially symmetric tensors introduced in [7] which we recall for the sake of completeness: [Partially symmetric tensor, [7]] Let and let be a subset of cardinality at least. We say that is symmetric with respect to if for each pair and the value of does not change if we interchange any two indices for and any . We agree that is symmetric with respect to each for . [-symmetry] Let and let be a shape partition of . We say that is -symmetric if it is partially symmetric with respect to for all .
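Partial symmetry with respect to a set of modes, and symmetry with respect to every block of a partition of the modes, can be tested directly on an array. The sketch below is one possible check; modes are numbered from zero, as is usual in NumPy, and the function names are hypothetical. Invariance under every swap of two modes in a block is enough, since swaps generate all permutations of the block.

```python
import itertools
import numpy as np

def is_partially_symmetric(T, modes, tol=1e-12):
    """True if T is unchanged when any two axes listed in `modes` are swapped."""
    for a, b in itertools.combinations(modes, 2):
        if T.shape[a] != T.shape[b]:
            return False
        if not np.allclose(T, np.swapaxes(T, a, b), atol=tol):
            return False
    return True

def is_symmetric_for_partition(T, blocks):
    """True if T is partially symmetric with respect to every block of modes."""
    return all(is_partially_symmetric(T, block) for block in blocks)

rng = np.random.default_rng(5)
A = rng.random((3, 3, 3))
# Symmetrize the last two modes only.
B = (A + np.swapaxes(A, 1, 2)) / 2
```

A block with a single mode imposes no condition, so every tensor passes the test for the partition into singletons.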

Observe that, in particular, every matrix is -symmetric and symmetric matrices are -symmetric. Moreover, if is -symmetric, then is -symmetric for every shape partition of such that .

Similarly to the matrix case where only eigenpairs of symmetric matrices have a variational characterization, we show in Lemma 4 that solving (5) is equivalent to solving a problem of the form (6) where the tensor is -symmetric. Vice versa, in Lemma 4, we show that when the tensor is partially symmetric with respect to , then the solutions of (6) are critical points of the Rayleigh quotient in (4).

3 Main results

In this section we describe the main results of this paper: a complete characterization of the irreducibility properties of T in terms of the shape partition ; a unifying Perron-Frobenius theorem for the general tensor spectral problem (6); and a generalized power method with a linear convergence rate that computes the dominant -eigenvalue and -eigenvector of T. These results are based on a number of preliminary lemmas and results that we prove in the next sections. Thus, for the sake of readability, we postpone the proofs of the main results to the end of the paper. We devote this section to describing the results and relating them to previous work.

The first result is presented in the following: Let and let and be shape partitions of such that . Then, the following holds:

  (i) If T is -weakly irreducible, then T is -strictly nonnegative.

  (ii) If T is -strongly irreducible, then T is -weakly irreducible.

  (iii) If T is -strictly nonnegative, then T is -strictly nonnegative.

  (iv) If T is -weakly irreducible and -symmetric, then T is -weakly irreducible.

  (v) If T is -strongly irreducible and -symmetric, then T is -strongly irreducible.

Proof

See Section 6.4.

A few comments regarding the partial symmetry assumption in (iv) and (v) of the above theorem are in order: First, note that, as in the matrix case, the irreducibility of a tensor does not depend on the magnitude of its entries and so it is enough to assume that the nonzero pattern of T is -symmetric. Second, by giving explicit examples, we note in Remarks 6.3 and 6.4 that assumption (v) cannot be omitted in order to deduce -weak (resp. strong) irreducibility from -weak (resp. strong) irreducibility.

It is well known that in the case of nonnegative matrices, i.e. and , -weak irreducibility and -strong irreducibility are equivalent. This equivalence is proved also for and in Lemma 3.1 [8]. Furthermore, (i), (ii) are known for the particular cases . Precisely, refer to Lemma 3.1 [8] for an equivalent of (ii) and to Proposition 8, (b) [9], Corollary 2.1. [13] for an equivalent of (i) in the cases respectively. However, to our knowledge, the results of points (iii), (iv), (v) have not been proved before, in any setting.

Figure 2: Conditions on for different settings involving a tensor of order . The figure shows that generally, implies a less restrictive condition on than the previous existing ones. Left: Here so that . The plain line is the set of such that and the dashed line is the set of such that [16]. Middle: Here so that . The dark gray surface is the set of such that and the light gray surface is the set of such that [8, 15]. Right: Here again and is fixed to . The plain line is the set of such that , the dotted line is the set of for which there exists such that for all [9] and the dashed line the set of such that [8, 15].

Our second result is a new and unifying Perron-Frobenius theorem for -eigenpairs. First, let us consider the sets of nonnegative, nonnegative nonzero and positive tuples of vectors in , that is: let , and let be the interior of . Furthermore, let us define the -spectral radius of :

(7)

As mentioned before, the key of our Perron-Frobenius theorem is the relation with the theory of multi-homogeneous and order-preserving mappings [10]. In particular, let us consider defined as where and, for all ,

(8)

We show in Lemma 5 that the nonnegative -eigenpairs of are in bijection with the multi-homogeneous eigenvectors of , i.e. vectors for which there exists such that for all . This key observation allows us to exploit the results proved in [10]. In particular, we consider the homogeneity matrix of given as

(9)

and let be its spectral radius. In the following, always refers to the homogeneity matrix of , hence, when it is clear from the context, we omit the arguments and write instead of . Lemma 3.2 in [10] implies that is an upper bound on the Lipschitz constant of with respect to a suitable weighted Hilbert metric on . Therefore, when , we can recast the -eigenvalue problem for in terms of a non-expansive map and derive the Perron-Frobenius theorem for as a consequence.
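The contraction mechanism can be observed numerically in the simplest setting of a single vector, where the product of cones reduces to one cone and the weighted metric to the ordinary Hilbert projective metric. The sketch below assumes the map x -> (T x x)^(1/(p-1)) for a positive order-three tensor, which is order-preserving and homogeneous of degree θ = 2/(p-1); in this scalar case θ is assumed to play the role of the homogeneity matrix. The names `hilbert_metric` and `F` are chosen for illustration.

```python
import numpy as np

def hilbert_metric(x, y):
    """Hilbert projective distance between two entrywise positive vectors."""
    ratio = x / y
    return float(np.log(ratio.max() / ratio.min()))

def F(T, x, p):
    """Order-preserving map x -> (T x x)^(1/(p-1)), homogeneous of degree 2/(p-1)."""
    return np.einsum("ijk,j,k->i", T, x, x) ** (1.0 / (p - 1))

rng = np.random.default_rng(6)
T = rng.random((4, 4, 4)) + 0.1            # entrywise positive tensor
p = 4.0
theta = 2.0 / (p - 1)                      # degree of homogeneity, here 2/3

ratios = []
for _ in range(200):
    x, y = rng.random(4) + 0.1, rng.random(4) + 0.1
    ratios.append(hilbert_metric(F(T, x, p), F(T, y, p)) / hilbert_metric(x, y))
```

Every entry of `ratios` is bounded by `theta`, in line with the general fact that an order-preserving map that is homogeneous of degree θ is Lipschitz with constant θ in the Hilbert metric.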

In the particular cases , typical assumptions on found in the literature on Perron-Frobenius theory of nonnegative tensors are for every , [8, 15, 16]. It is not difficult to see that if for all , then , with equality if and only if . However, by the Collatz-Wielandt formula, we have , and thus it is clear that there are many choices of such that but . Moreover, note that the function is strictly monotonically decreasing in the sense that for every with for all , it holds with equality if and only if . An example comparing with the conditions on given in [8, 9, 15, 16] is shown in Figure 2.

The following Perron-Frobenius theorem consists of five parts: The first one is a weak Perron-Frobenius theorem ensuring the existence of a maximal nonnegative -eigenpair. The second characterizes via a Collatz-Wielandt formula, a Gelfand type formula and a cone spectral radius formula. The third part gives sufficient conditions for the existence of a positive -eigenpair. The fourth part gives conditions ensuring that -eigenvectors which are nonnegative but not positive cannot correspond to . The last part gives further conditions which guarantee that T has a unique nonnegative -eigenvector.

Let us denote by the -th composition of with itself, that is and for . Moreover, let us define the following product of balls and its positive part .

Let be a shape partition of . Furthermore let , and be as in (7), (8) and (9) respectively. Suppose that is -strictly nonnegative and . Then, there exists a unique such that and . Furthermore, we have the following:

  (i) There exists a -eigenpair of T such that .

  (ii) Let then and the following Collatz-Wielandt formula holds:

    (10)

    If additionally, , then it holds

    (11)

  (iii) If either or T is -weakly irreducible, then the -eigenvector of (i) can be chosen to be strictly positive, i.e. . Moreover, is then the unique positive -eigenvector of T.

  (iv) If T is -weakly irreducible, then for every -eigenpair of T such that , it holds .

  (v) If T is -strongly irreducible, then the -eigenvector of (i) is positive and it is the unique nonnegative -eigenvector of T.

Proof

See Section 7.
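The flavor of the Collatz-Wielandt characterization in part (ii) can be illustrated with the classical matrix case, which is the only case sketched here. For an entrywise positive matrix A and any positive vector x, the smallest and largest of the ratios (Ax)_i / x_i enclose the spectral radius, and the enclosure tightens along the power iteration. The function name `collatz_wielandt_bounds` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)
A = rng.random((5, 5)) + 0.1               # entrywise positive matrix
rho = max(abs(np.linalg.eigvals(A)))       # spectral radius

def collatz_wielandt_bounds(A, x):
    """Return (min_i (Ax)_i / x_i, max_i (Ax)_i / x_i) for a positive vector x."""
    ratios = (A @ x) / x
    return ratios.min(), ratios.max()

x = np.ones(5)
gaps = []
for _ in range(50):
    lo, hi = collatz_wielandt_bounds(A, x)
    gaps.append(hi - lo)
    x = A @ x                              # one step of the power method
    x /= x.sum()
lo, hi = collatz_wielandt_bounds(A, x)
```

After the loop, `lo` and `hi` bracket `rho` tightly, and the list `gaps` records how the width of the enclosure shrinks from one iteration to the next.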