Robust Eigenvectors of Symmetric Tensors

The tensor power method generalizes the matrix power method to higher-order arrays, or tensors. As in the matrix case, the fixed points of the tensor power method are the eigenvectors of the tensor. While every real symmetric matrix has an eigendecomposition, the vectors generating a symmetric decomposition of a real symmetric tensor are not always eigenvectors of the tensor. In this paper we show that whenever an eigenvector is a generator of the symmetric decomposition of a symmetric tensor, then (if the order of the tensor is sufficiently high) this eigenvector is robust, i.e., it is an attracting fixed point of the tensor power method. We exhibit new classes of symmetric tensors whose symmetric decomposition consists of eigenvectors. Generalizing orthogonally decomposable tensors, we consider equiangular tight frame decomposable and equiangular set decomposable tensors. Our main result implies that such tensors can be decomposed using the tensor power method.


1 Introduction

With the rising demand for techniques to handle massive, high-dimensional datasets, many scientists have turned to finding adaptations of matrix algorithms to higher-order arrays, known as tensors. The main obstacle is that determining quantities such as the rank, singular values, and eigenvalues [Hitchcock, LHLim, qi2005eigenvalues] of a general tensor is an NP-hard problem [HL]. Nonetheless, some heuristics have been proposed for computing such quantities [Anandkumar2014, pmlr-v40-Anandkumar15, KoBa09, shiftedPower, nie2014simax], and efficient algorithms exist for many families of tensors. For symmetric tensors, efficient decomposition algorithms exist in the low-rank case [tensorTrains, Jennrich, kileel2021subspace]. Furthermore, the decomposition, approximations, eigenvectors, and algebraic characterization have been thoroughly studied in the special case of orthogonally decomposable tensors [Anandkumar2014, BDHR, kolda2015symmetric, li2018jacobi, MuHsuGoldfarb, Robeva].

The computation of eigenvectors and singular vectors is particularly important because it is tightly linked to the best rank-one approximation problem [chen2009simax]. A recurring tool that has been used in several of the works cited here is the tensor power method [ZhaGol], which generalizes the well-known matrix power method. For non-symmetric tensors, the tensor power method is globally convergent [uschmajew2015pjo], with its speed of convergence established in [hu2018convergence]. For the symmetric case, fewer results are available on the convergence of the power method; examples are known where the method does not converge at all [chen2009simax, kofidis2002simax].

In this paper, we first show that if a vector in the symmetric decomposition of a symmetric tensor is an eigenvector, then for sufficiently high orders, it is robust, i.e., it is an attracting fixed point of the tensor power method (see Theorem 3.1). We then exhibit several families of tensors whose symmetric decomposition consists of (robust) eigenvectors. Generalizing the class of orthogonally decomposable tensors, we study tensors generated by linear combinations of tensor powers of vectors which form an equiangular tight frame (ETF), or, more generally, an equiangular set (ES).

The rest of the paper is organized as follows. In Section 2 we provide background on tensor decompositions and the tensor power method. In Section 3 we present our main result, Theorem 3.1. In Section 4 we introduce and provide a detailed study of ETF and ES decomposable tensors; this includes the study of not only eigenvectors and their robustness, but also regions of convergence of the tensor power method. In Section 5 we conclude with a discussion and some open problems.

2 Background

Denote by $[n]$ the set $\{1, \dots, n\}$. We write unbolded lowercase letters for scalars in $\mathbb{R}$, such as $a$; bolded lowercase letters for vectors, such as $\mathbf{v}$; bolded uppercase letters for matrices, such as $\mathbf{M}$; and script letters for $d$-tensors where $d \geq 3$, such as $\mathcal{T}$. An order-$d$ tensor with dimensions $n_1, \dots, n_d$ is an element $\mathcal{T} \in \mathbb{R}^{n_1 \times \cdots \times n_d}$. The $(i_1, \dots, i_d)$-th entry of the tensor will be denoted by $\mathcal{T}_{i_1, \dots, i_d}$, where $i_j \in [n_j]$. An order-$d$ tensor is said to be symmetric if for all permutations $\sigma$ of $[d]$,

$$\mathcal{T}_{i_1, \dots, i_d} = \mathcal{T}_{i_{\sigma(1)}, \dots, i_{\sigma(d)}}.$$

We denote the set of all symmetric tensors of order $d$ and dimension $n$ by $S^d(\mathbb{R}^n)$.

Definition 2.1.

A symmetric decomposition of a symmetric tensor $\mathcal{T} \in S^d(\mathbb{R}^n)$ is an expression of $\mathcal{T}$ of the form

$$\mathcal{T} = \sum_{i=1}^{r} \lambda_i \mathbf{v}_i^{\otimes d}, \tag{2.1}$$

where $\lambda_i \in \mathbb{R}$, the $\mathbf{v}_i \in \mathbb{R}^n$ are unit-norm vectors, and $\mathbf{v}^{\otimes d} = \mathbf{v} \otimes \cdots \otimes \mathbf{v}$ ($d$ times) is a symmetric rank-one tensor. We say that $\mathcal{T}$ is generated by the vectors $\mathbf{v}_1, \dots, \mathbf{v}_r$ and the coefficients $\lambda_1, \dots, \lambda_r$. The smallest $r$ for which such a decomposition exists is called the symmetric rank of $\mathcal{T}$.

The rank of any symmetric $n \times n$ matrix is always at most $n$. For tensors of order $d \geq 3$, this is not the case, since the rank can be much larger. The Alexander–Hirschowitz theorem [BRAMBILLA20081229] states that, with probability $1$, the symmetric rank of a random tensor $\mathcal{T} \in S^d(\mathbb{R}^n)$ (drawn from an absolutely continuous probability distribution) is $\left\lceil \frac{1}{n}\binom{n+d-1}{d} \right\rceil$, except for a few special values of $n$ and $d$ where the rank is more than this number.

A vector $\mathbf{v} \in \mathbb{R}^n \setminus \{\mathbf{0}\}$ is an eigenvector of $\mathcal{T}$ with eigenvalue $\mu \in \mathbb{R}$ if

$$\mathcal{T} \cdot \mathbf{v}^{d-1} = \mu \mathbf{v},$$

where $\mathcal{T} \cdot \mathbf{v}^{d-1} \in \mathbb{R}^n$ is the vector defined by contracting $\mathcal{T}$ by $\mathbf{v}$ along all of its modes except for one, i.e., the $i$-th entry of $\mathcal{T} \cdot \mathbf{v}^{d-1}$ is

$$\left(\mathcal{T} \cdot \mathbf{v}^{d-1}\right)_i = \sum_{i_1, \dots, i_{d-1}=1}^{n} \mathcal{T}_{i_1 \dots i_{d-1} i} \, v_{i_1} \cdots v_{i_{d-1}}.$$

Since $\mathcal{T}$ is symmetric, it does not matter which modes of $\mathcal{T}$ we contract. The eigenvectors of $\mathcal{T}$ are the fixed points (up to sign) of an iterated method called the tensor power method, given by

$$\mathbf{x}_{k+1} = \frac{\mathcal{T} \cdot \mathbf{x}_k^{d-1}}{\left\|\mathcal{T} \cdot \mathbf{x}_k^{d-1}\right\|}.$$
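The iteration above is straightforward to implement. The following is a minimal NumPy sketch (ours, not code from the paper; the helper names are our own, and the dense-tensor representation is only practical for small $n$ and $d$):

```python
import numpy as np

def sym_rank_one(v, d):
    """Return the symmetric rank-one tensor v^{⊗ d} as a dense array."""
    t = np.array(v, dtype=float)
    for _ in range(d - 1):
        t = np.multiply.outer(t, v)  # outer product adds one mode
    return t

def contract(T, x):
    """Contract the order-d tensor T with x along all modes but one,
    i.e. compute the vector T · x^{d-1}."""
    y = T
    while y.ndim > 1:
        y = y @ x          # matmul sums the last mode against x
    return y

def tensor_power_method(T, x0, iters=100):
    """Iterate x ↦ T·x^{d-1} / ||T·x^{d-1}|| starting from x0."""
    x = np.array(x0, dtype=float)
    x /= np.linalg.norm(x)
    for _ in range(iters):
        y = contract(T, x)
        x = y / np.linalg.norm(y)
    return x
```

For instance, for the odeco tensor $2\,\mathbf{e}_1^{\otimes 3} + \mathbf{e}_2^{\otimes 3}$ in $\mathbb{R}^2$, the iteration started at $(0.8, 0.6)$ converges to $\mathbf{e}_1$.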

Yet another important characterization of the eigenvectors is that they are the critical points of the symmetric best rank-one approximation problem

$$\min_{c, \mathbf{v}} \left\|\mathcal{T} - c\,\mathbf{v}^{\otimes d}\right\|_F^2; \tag{2.2}$$

see, e.g., [chen2009simax, §6] for a related discussion. Note that while it is known that a non-symmetric best rank-one approximation of a symmetric tensor can always be chosen symmetric [friedland2013best], this does not give us information about all critical points; in particular, the results on the convergence of the non-symmetric power method [uschmajew2015pjo] cannot be applied.

We call the vector $\mathbf{x}_0$ an initializing vector of the tensor power method. Note that we are interested in real tensors and their real eigenvectors, and we investigate the convergence behavior of the tensor power method for real, non-zero initializing vectors. We are also not interested in eigenvectors of $\mathcal{T}$ that have eigenvalue $0$, since in that case the tensor power method is not applicable. A robust eigenvector of $\mathcal{T}$ is an eigenvector $\mathbf{v}$ that is an attracting fixed point of the tensor power method, i.e., there exists an $\epsilon > 0$ such that the tensor power method converges to $\mathbf{v}$ for all initializing vectors in the ball of radius $\epsilon$ centered at $\mathbf{v}$. This means that an eigenvector is robust if it can be reliably obtained from the tensor power method.

A tensor $\mathcal{T} \in S^d(\mathbb{R}^n)$ is said to be orthogonally decomposable, or odeco, if it has a symmetric decomposition of the form

$$\mathcal{T} = \sum_{i=1}^{n} \lambda_i \mathbf{v}_i^{\otimes d}, \tag{2.3}$$

where $\mathbf{v}_1, \dots, \mathbf{v}_n$ form an orthonormal basis of $\mathbb{R}^n$. Since there are at most $n$ pairwise orthogonal vectors in $\mathbb{R}^n$, the symmetric rank of an odeco tensor is at most $n$. Odeco tensors have been thoroughly characterized and display a number of remarkable properties [Anandkumar2014, BDHR, Robeva, RobevaSeigal]. One, of interest here, is that the robust eigenvectors of an odeco tensor are precisely $\mathbf{v}_1, \dots, \mathbf{v}_n$ [Anandkumar2014].
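This property makes odeco tensors decomposable by power iteration plus deflation. The sketch below is our own illustration (not the paper's code); it works directly with the identity $\mathcal{T}\cdot\mathbf{x}^{d-1} = \sum_i \lambda_i \langle\mathbf{v}_i,\mathbf{x}\rangle^{d-1}\mathbf{v}_i$ rather than forming a dense tensor:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4, 3

# Odeco tensor: orthonormal v_i (columns of Q) and positive coefficients.
Q, _ = np.linalg.qr(rng.normal(size=(n, n)))
lams = np.array([3.0, 2.0, 1.5, 1.0])

def power_method(cur_lams, cur_V, x, iters=300):
    """Tensor power method using T·x^{d-1} = Σ_i λ_i <v_i, x>^{d-1} v_i."""
    for _ in range(iters):
        y = cur_V @ (cur_lams * (cur_V.T @ x) ** (d - 1))
        x = y / np.linalg.norm(y)
    return x

# Power method plus deflation: after finding an eigenpair (mu, x),
# subtract mu·x^{⊗d} by appending a rank-one term with coefficient -mu.
cur_lams, cur_V, found = lams.copy(), Q.copy(), []
for _ in range(n):
    x = power_method(cur_lams, cur_V, rng.normal(size=n))
    mu = cur_lams @ (cur_V.T @ x) ** d        # mu = <T, x^{⊗d}>
    found.append((mu, x))
    cur_lams = np.append(cur_lams, -mu)
    cur_V = np.column_stack([cur_V, x])
```

Each run lands on one of the columns of `Q` (a robust eigenvector), and after $n$ deflations the recovered coefficients match the $\lambda_i$ up to ordering.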

This brings up a few important points. In general, a symmetric tensor with a symmetric decomposition (2.1), firstly, may not have $\mathbf{v}_j$ as an eigenvector for some $j$; secondly, may have eigenvectors that are not robust; and lastly, may have robust eigenvectors that are not one of the $\mathbf{v}_i$'s. In comparison, the only robust eigenvector of a generic symmetric matrix is the one whose eigenvalue is largest in absolute value. Additionally, unlike symmetric matrices, symmetric tensors can have several robust eigenvectors.

In the following Section 3, we present our main result, which essentially says that if a term $\lambda_j \mathbf{v}_j^{\otimes d}$ is part of the symmetric decomposition (2.1) of a symmetric tensor $\mathcal{T}$, then, for $d$ sufficiently large, if $\mathbf{v}_j$ is an eigenvector, it is robust. In Section 4, we introduce a family of tensors, called equiangular tensors, which generalize odeco tensors. These tensors share the property that the vectors in the symmetric decomposition are eigenvectors. We apply our main result to study the robustness of these eigenvectors. We leave the study of robust eigenvectors that do not generate the decomposition of the symmetric tensor as an open problem in the conclusion.

3 Main Theorem

We now proceed to our main result, which gives a condition under which an eigenvector is robust.

Theorem 3.1.

For $d \geq 3$, let $\mathcal{T}_d \in S^d(\mathbb{R}^n)$ be a tensor with symmetric decomposition

$$\mathcal{T}_d = \sum_{i=1}^{r} \lambda_i \mathbf{v}_i^{\otimes d}, \tag{3.1}$$

with $\lambda_i \neq 0$ for all $i \in [r]$. Then there exists a $D$ such that for all $d \geq D$, if $\mathbf{v}_j$ is an eigenvector of $\mathcal{T}_d$ with non-zero eigenvalue, then $\mathbf{v}_j$ is a robust eigenvector of $\mathcal{T}_d$.

As we will see in the sections to follow, this result allows us to use the tensor power method in order to decompose certain classes of tensors.

The following lemma is used in the proof of Theorem 3.1.

Lemma 3.2.

[Rheinboldt1998, Theorem 3.5] Let $\mathbf{x}^* \in U$ be a fixed point of a function $\phi: U \to \mathbb{R}^n$, where $U \subseteq \mathbb{R}^n$ is an open set, and let $J(\mathbf{x})$ be the Jacobian matrix of $\phi$. Then $\mathbf{x}^*$ is an attracting fixed point of the iterative method $\mathbf{x}_{k+1} = \phi(\mathbf{x}_k)$ if $\rho(J(\mathbf{x}^*)) < 1$, where $\rho(\cdot)$ is the spectral radius of the matrix. Furthermore, if $0 < \rho(J(\mathbf{x}^*)) < 1$, then for $\mathbf{x}_0$ sufficiently close to $\mathbf{x}^*$, the rate of convergence of this iterative method is linear.

We will also need the following lemma about the structure of the Jacobian matrix of the tensor power method iteration.

Lemma 3.3.

Let $\mathcal{T} \in S^d(\mathbb{R}^n)$ and let $\phi$ be the tensor power method iteration map

$$\phi(\mathbf{x}) = \frac{\mathcal{T} \cdot \mathbf{x}^{d-1}}{\left\|\mathcal{T} \cdot \mathbf{x}^{d-1}\right\|}, \tag{3.2}$$

where $\phi: U \to \mathbb{R}^n$ and $U \subseteq \mathbb{R}^n$ is an open set. Assume that the vector $\mathbf{v}$ is a unit-norm eigenvector of $\mathcal{T}$ with non-zero eigenvalue $\mu$. Then the Jacobian matrix of $\phi$ at $\mathbf{v}$, denoted $J(\mathbf{v})$, is symmetric and has the following form:

$$J(\mathbf{v}) = \frac{d-1}{\mu}\left(\mathcal{T} \cdot \mathbf{v}^{d-2} - \mu\, \mathbf{v}\mathbf{v}^\top\right).$$
Proof.

Denote $\phi_1(\mathbf{x}) = \mathbf{x}/\|\mathbf{x}\|$ and $\phi_2(\mathbf{x}) = \mathcal{T} \cdot \mathbf{x}^{d-1}$, so that $\phi = \phi_1 \circ \phi_2$. Then

$$\phi_1'(\mathbf{x}) = \frac{\|\mathbf{x}\|^2 I_{n \times n} - \mathbf{x}\mathbf{x}^\top}{\|\mathbf{x}\|^3}$$

and

$$\phi_2'(\mathbf{x}) = (d-1)\left(\mathcal{T} \cdot \mathbf{x}^{d-2}\right).$$

Next, we can express

$$\phi_1'(\phi_2(\mathbf{x})) = \frac{\left\|\mathcal{T} \cdot \mathbf{x}^{d-1}\right\|^2 I_{n \times n} - \left(\mathcal{T} \cdot \mathbf{x}^{d-1}\right)\left(\mathcal{T} \cdot \mathbf{x}^{d-1}\right)^\top}{\left\|\mathcal{T} \cdot \mathbf{x}^{d-1}\right\|^3},$$

and therefore, by the chain rule, $J(\mathbf{x}) = \phi_1'(\phi_2(\mathbf{x}))\, \phi_2'(\mathbf{x})$.

Now let us evaluate this expression at the unit-norm eigenvector $\mathbf{v}$ corresponding to an eigenvalue $\mu \neq 0$ (i.e., satisfying $\mathcal{T} \cdot \mathbf{v}^{d-1} = \mu \mathbf{v}$). Then we have that

$$J(\mathbf{v}) = (d-1)\, \frac{\mu^2\left(\mathcal{T} \cdot \mathbf{v}^{d-2}\right) - \mu^2\, \mathbf{v}\mathbf{v}^\top\left(\mathcal{T} \cdot \mathbf{v}^{d-2}\right)}{\mu^3} = \frac{d-1}{\mu}\left(\mathcal{T} \cdot \mathbf{v}^{d-2} - \mu\, \mathbf{v}\mathbf{v}^\top\right),$$

where we used the fact that $\mathbf{v}^\top\left(\mathcal{T} \cdot \mathbf{v}^{d-2}\right) = \left(\mathcal{T} \cdot \mathbf{v}^{d-1}\right)^\top = \mu \mathbf{v}^\top$. ∎
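The Jacobian formula of Lemma 3.3 can be checked numerically against finite differences. Below is our own sanity check (not from the paper) on a non-orthogonal example: the three unit vectors at mutual angle $120°$ in $\mathbb{R}^2$ (the Mercedes-Benz frame studied in Section 4), whose frame vectors are eigenvectors of $\sum_i \mathbf{v}_i^{\otimes d}$:

```python
import numpy as np

# T = Σ_i v_i^{⊗6} for the Mercedes-Benz frame; columns are v_1, v_2, v_3.
d = 6
V = np.array([[0.0, np.sqrt(3) / 2, -np.sqrt(3) / 2],
              [1.0, -0.5, -0.5]])

def contract_vec(x):   # T·x^{d-1} = Σ_i <v_i, x>^{d-1} v_i  (a vector)
    return V @ (V.T @ x) ** (d - 1)

def contract_mat(x):   # T·x^{d-2} = Σ_i <v_i, x>^{d-2} v_i v_iᵀ  (a matrix)
    return (V * (V.T @ x) ** (d - 2)) @ V.T

def phi(x):
    y = contract_vec(x)
    return y / np.linalg.norm(y)

v = V[:, 0]                                  # unit-norm eigenvector
mu = contract_vec(v) @ v                     # its eigenvalue
J_formula = (d - 1) / mu * (contract_mat(v) - mu * np.outer(v, v))

# Compare against a central finite-difference Jacobian of phi at v.
eps = 1e-6
J_fd = np.column_stack([(phi(v + eps * e) - phi(v - eps * e)) / (2 * eps)
                        for e in np.eye(2)])

rho = np.max(np.abs(np.linalg.eigvalsh(J_formula)))
# rho < 1 here, so v is an attracting fixed point of the iteration.
```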

We now proceed with the proof of our theorem.

Proof of Theorem 3.1.

We may assume that no $\lambda_i$ is $0$ and that no two vectors $\mathbf{v}_i$ and $\mathbf{v}_j$ are collinear, or else we may rewrite $\mathcal{T}_d$ as a sum of a smaller number of symmetric rank-one tensors.

Contracting with $d-1$ copies of $\mathbf{v}_j$, since $\mathbf{v}_j$ is an eigenvector of $\mathcal{T}_d$ with eigenvalue $\mu_{j,d}$, we have

$$\mathcal{T}_d \cdot \mathbf{v}_j^{d-1} = \sum_{i=1}^{r} \lambda_i \langle \mathbf{v}_i, \mathbf{v}_j \rangle^{d-1} \mathbf{v}_i = \lambda_j \mathbf{v}_j + \sum_{i \in [r] \setminus \{j\}} \lambda_i \alpha_{i,j}^{d-1} \mathbf{v}_i = V \Lambda \left(V^\top \mathbf{v}_j\right)^{\odot(d-1)} = \mu_{j,d}\, \mathbf{v}_j, \tag{3.3}$$

where $\alpha_{i,j} = \langle \mathbf{v}_i, \mathbf{v}_j \rangle$, and contracting with $d-2$ copies of $\mathbf{v}_j$, we have

$$\mathcal{T}_d \cdot \mathbf{v}_j^{d-2} = \sum_{i=1}^{r} \lambda_i \langle \mathbf{v}_i, \mathbf{v}_j \rangle^{d-2} \mathbf{v}_i \mathbf{v}_i^\top = V \Lambda D(\mathbf{v}_j) V^\top,$$

where $V$ is the matrix whose columns are $\mathbf{v}_1, \dots, \mathbf{v}_r$, $\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_r)$, $D(\mathbf{x}) = \operatorname{diag}\!\left(\left(V^\top \mathbf{x}\right)^{\odot(d-2)}\right)$, and $\mathbf{x}^{\odot k}$ is the $k$-th Hadamard power of the vector $\mathbf{x}$. Hence, by Lemma 3.3, we have that the Jacobian matrix of $\phi$ at $\mathbf{v}_j$ is

$$J(\mathbf{v}_j) = \frac{d-1}{\mu_{j,d}}\left(V \Lambda D(\mathbf{v}_j) V^\top - \mu_{j,d}\, \mathbf{v}_j \mathbf{v}_j^\top\right). \tag{3.4}$$

Now, multiplying both sides of (3.3) by $\mathbf{v}_j^\top$ on the right, we have

$$\lambda_j \mathbf{v}_j \mathbf{v}_j^\top + \sum_{i \in [r] \setminus \{j\}} \lambda_i \alpha_{i,j}^{d-1} \mathbf{v}_i \mathbf{v}_j^\top = \mu_{j,d}\, \mathbf{v}_j \mathbf{v}_j^\top,$$

and hence

$$\lambda_j \mathbf{v}_j \mathbf{v}_j^\top - \mu_{j,d}\, \mathbf{v}_j \mathbf{v}_j^\top = -\sum_{i \in [r] \setminus \{j\}} \lambda_i \alpha_{i,j}^{d-1} \mathbf{v}_i \mathbf{v}_j^\top. \tag{3.5}$$

Next, we are going to bound the spectral radius of $J(\mathbf{v}_j)$, which is equal to $\|J(\mathbf{v}_j)\|_2$ because $J(\mathbf{v}_j)$ is symmetric. Due to (3.5), and using $\mathbf{v}_i \mathbf{v}_i^\top \mathbf{v}_j \mathbf{v}_j^\top = \alpha_{i,j}\, \mathbf{v}_i \mathbf{v}_j^\top$, we can express

$$V \Lambda D(\mathbf{v}_j) V^\top - \mu_{j,d}\, \mathbf{v}_j \mathbf{v}_j^\top = \left(\sum_{i \in [r] \setminus \{j\}} \lambda_i \alpha_{i,j}^{d-2}\, \mathbf{v}_i \mathbf{v}_i^\top\right)\left(I_{n \times n} - \mathbf{v}_j \mathbf{v}_j^\top\right).$$

Therefore, since $I_{n \times n} - \mathbf{v}_j \mathbf{v}_j^\top$ is an orthogonal projector with $\left\|I_{n \times n} - \mathbf{v}_j \mathbf{v}_j^\top\right\|_2 = 1$, and by submultiplicativity of the spectral norm, we get

$$\begin{aligned}
\rho(J(\mathbf{v}_j)) &= \left|\frac{d-1}{\mu_{j,d}}\right| \left\|V \Lambda D(\mathbf{v}_j) V^\top - \mu_{j,d}\, \mathbf{v}_j \mathbf{v}_j^\top\right\|_2 \leq \left|\frac{d-1}{\mu_{j,d}}\right| \Bigg\|\sum_{i \in [r] \setminus \{j\}} \lambda_i \alpha_{i,j}^{d-2}\, \mathbf{v}_i \mathbf{v}_i^\top\Bigg\|_2 \\
&\leq \left|\frac{d-1}{\mu_{j,d}}\right| \sum_{i \in [r] \setminus \{j\}} \left\|\lambda_i \alpha_{i,j}^{d-2}\, \mathbf{v}_i \mathbf{v}_i^\top\right\|_2 = \left|\frac{d-1}{\mu_{j,d}}\right| \sum_{i \in [r] \setminus \{j\}} \left|\lambda_i \alpha_{i,j}^{d-2}\right| \\
&\leq \left|\frac{d-1}{\mu_{j,d}}\right| (r-1) \left(\max_{i \in [r] \setminus \{j\}} |\lambda_i|\right) \left(\max_{i \in [r] \setminus \{j\}} |\alpha_{i,j}|\right)^{d-2},
\end{aligned} \tag{3.6}$$

where we used the triangle inequality and the Cauchy–Schwarz inequality.

Note that since no two vectors $\mathbf{v}_i$ and $\mathbf{v}_j$ are collinear, $|\alpha_{i,j}| < 1$ for all $i \neq j$. Therefore, rearranging (3.3) and applying the triangle inequality,

$$|\mu_{j,d} - \lambda_j| \leq \sum_{i \in [r] \setminus \{j\}} |\lambda_i|\, |\alpha_{i,j}|^{d-1},$$

we see that $\mu_{j,d}$ converges to $\lambda_j \neq 0$ as $d$ becomes large, while $|\alpha_{i,j}| < 1$ makes the right-hand side of (3.6) tend to $0$. This shows that $\rho(J(\mathbf{v}_j)) < 1$ for sufficiently large $d$, and hence the result follows by Lemma 3.2. ∎
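As a concrete illustration of the proof (our own numerical sketch, not an experiment from the paper), take the three unit vectors at mutual angle $120°$ in $\mathbb{R}^2$ (the Mercedes-Benz frame of Section 4) with all $\lambda_i = 1$: the spectral radius of $J(\mathbf{v}_j)$ exceeds $1$ at $d = 3$ but falls below $1$ at $d = 5$ and keeps shrinking as $d$ grows:

```python
import numpy as np

V = np.array([[0.0, np.sqrt(3) / 2, -np.sqrt(3) / 2],
              [1.0, -0.5, -0.5]])            # Mercedes-Benz frame, λ_i = 1

def spectral_radius(d, j=0):
    """rho(J(v_j)) for T = Σ_i v_i^{⊗d}, computed via Lemma 3.3."""
    v = V[:, j]
    mu = (V @ (V.T @ v) ** (d - 1)) @ v      # eigenvalue mu_{j,d}
    Tm = (V * (V.T @ v) ** (d - 2)) @ V.T    # the matrix T·v^{d-2}
    J = (d - 1) / mu * (Tm - mu * np.outer(v, v))
    return np.max(np.abs(np.linalg.eigvalsh(J)))

for d in (3, 5, 7, 9):
    print(d, spectral_radius(d))
```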

4 Equiangular Tensors

Our main result, Theorem 3.1, compels us to find sufficient conditions under which a generating vector in the symmetric decomposition of a symmetric tensor is an eigenvector. We will see that tensors generated by a certain class of vectors have this property.

4.1 Equiangular sets and equiangular tight frames

Definition 4.1.

An equiangular set (ES) is a collection of vectors $\mathbf{v}_1, \dots, \mathbf{v}_r \in \mathbb{R}^n$ with $r \geq 2$ such that there exists $\alpha \geq 0$ with

$$\alpha = |\langle \mathbf{v}_i, \mathbf{v}_j \rangle| \ \ \forall\, i \neq j \quad \text{and} \quad \|\mathbf{v}_i\| = 1 \ \ \forall\, i. \tag{4.1}$$

An ES is an equiangular tight frame (ETF) if, in addition,

$$V V^\top = \begin{pmatrix} | & & | \\ \mathbf{v}_1 & \cdots & \mathbf{v}_r \\ | & & | \end{pmatrix} \begin{pmatrix} - & \mathbf{v}_1^\top & - \\ & \vdots & \\ - & \mathbf{v}_r^\top & - \end{pmatrix} = \frac{r}{n}\, I_n, \tag{4.2}$$

where $V$ is the matrix whose columns are $\mathbf{v}_1, \dots, \mathbf{v}_r$ and $I_n$ is the $n \times n$ identity matrix.

Note that if $\mathbf{v}_1, \dots, \mathbf{v}_r$ form an ES, and if $\sigma_{i,j} = \operatorname{sign}(\langle \mathbf{v}_i, \mathbf{v}_j \rangle)$ for $i \neq j$, then

$$\langle \mathbf{v}_i, \mathbf{v}_j \rangle = \sigma_{i,j}\, \alpha. \tag{4.3}$$

Suppose $\mathbf{v}_1, \dots, \mathbf{v}_r$ form an ETF. Then a number of additional results can be deduced. If $\mathbf{u}_1, \dots, \mathbf{u}_r \in \mathbb{R}^n$ with $r \geq n$ is a collection of unit-norm vectors, then the following always holds:

$$\max_{\substack{i,j \in [r] \\ i \neq j}} |\langle \mathbf{u}_i, \mathbf{u}_j \rangle| \geq \sqrt{\frac{r-n}{n(r-1)}}, \tag{4.4}$$

with equality if and only if $\mathbf{u}_1, \dots, \mathbf{u}_r$ is an ETF [etfWelchBound]. Thus, in an ETF, $\alpha = \sqrt{\frac{r-n}{n(r-1)}}$. Furthermore, the matrix $V^\top V$, known as the Gram matrix, has rank $n$ (Proposition 3, [gramMatrixRank]). The Gram matrix gives a canonical representation of an equiangular tight frame. This results in a one-to-one correspondence between ETFs up to orthogonal transformation and their corresponding Gram matrices [WALDRON20092228], which means that $V$ also has rank $n$.
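These properties are easy to verify numerically. As a quick illustration (ours, not from the paper), the Mercedes-Benz frame $\mathbf{v}_1 = (0,1)$, $\mathbf{v}_2 = (\sqrt{3}/2, -1/2)$, $\mathbf{v}_3 = (-\sqrt{3}/2, -1/2)$ in $\mathbb{R}^2$ attains the Welch bound:

```python
import numpy as np

# The Mercedes-Benz frame: r = 3 unit-norm vectors in R^2.
V = np.array([[0.0, np.sqrt(3) / 2, -np.sqrt(3) / 2],
              [1.0, -0.5, -0.5]])
r, n = 3, 2

G = V.T @ V                                   # Gram matrix
off_diag = np.abs(G[~np.eye(r, dtype=bool)])  # |<v_i, v_j>| for i != j
welch = np.sqrt((r - n) / (n * (r - 1)))      # right-hand side of (4.4)

print(off_diag, welch)   # all pairwise coherences equal the Welch bound
```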

ESs correspond to sets of lines in $\mathbb{R}^n$ passing through the origin such that the angle between every pair of lines is the same. Determining the maximum number of equiangular lines in $\mathbb{R}^n$ for each $n$ is an old problem that has recently seen significant progress in [equiangularLineBEST], which determined an asymptotically tight upper bound.

ETFs with $r$ vectors in $\mathbb{R}^n$ do not exist for many values of $r$ and $n$, making ETFs quite rare [ETFexistence]. Nonetheless, they have attracted wide interest for a number of reasons. ETFs are a natural generalization of orthonormal sets of vectors in which the number of vectors in the set is allowed to exceed the dimension of the space they lie in. ETFs minimize the maximum coherence between the vectors, attaining equality in what is known as the Welch bound (4.4). ETFs can also be formulated over $\mathbb{C}$ and have found numerous applications in signal processing [steinerFrames], coding theory [grassmannFrames], and quantum information processing [quantumFrames].

4.2 Eigenvectors of equiangular tensors

We call a tensor generated by vectors from an ES equiangular set decomposable, or equiangular for short. We begin with some general results on equiangular tensors.

Theorem 4.2.

Let $\mathcal{T}$ be a tensor generated by an ES $\mathbf{v}_1, \dots, \mathbf{v}_r$ and coefficients $\lambda_1, \dots, \lambda_r$. If for some $j \in [r]$ there exists $\mu_j \in \mathbb{R}$ such that

$$\left(\lambda_1 \sigma_{1,j}^{d-1}, \dots, \lambda_{j-1} \sigma_{j-1,j}^{d-1},\ \mu_j,\ \lambda_{j+1} \sigma_{j+1,j}^{d-1}, \dots, \lambda_r \sigma_{r,j}^{d-1}\right) \in \operatorname{Ker}(V), \tag{4.5}$$

where $V$ is the matrix whose columns are $\mathbf{v}_1, \dots, \mathbf{v}_r$, then $\mathbf{v}_j$ is an eigenvector.

In particular, if $d$ is odd and $(\lambda_1, \dots, \lambda_r) \in \operatorname{Ker}(V)$, then all of $\mathbf{v}_1, \dots, \mathbf{v}_r$ are eigenvectors of $\mathcal{T}$. Furthermore, in this case, all of these vectors are robust eigenvectors if

$$\frac{\left\|V V^\top\right\|_2\, \alpha^{d-2} (d-1)}{\left(\min_{i \in [r]} |\lambda_i|\right)\left(1 - \alpha^{d-1}\right)} < 1, \tag{4.6}$$

which always holds when $d$ is large enough.

Proof.

We observe that for $j \in [r]$,

$$\mathcal{T} \cdot \mathbf{v}_j^{d-1} = \sum_{i=1}^{r} \lambda_i \langle \mathbf{v}_i, \mathbf{v}_j \rangle^{d-1} \mathbf{v}_i = \lambda_j \mathbf{v}_j + \alpha^{d-1} \sum_{i \in [r] \setminus \{j\}} \lambda_i \sigma_{i,j}^{d-1} \mathbf{v}_i,$$

and hence if (4.5) holds, then

$$\lambda_j \mathbf{v}_j + \alpha^{d-1} \sum_{i \in [r] \setminus \{j\}} \lambda_i \sigma_{i,j}^{d-1} \mathbf{v}_i = \lambda_j \mathbf{v}_j - \alpha^{d-1} \mu_j \mathbf{v}_j = \left(\lambda_j - \alpha^{d-1} \mu_j\right) \mathbf{v}_j.$$

Therefore, $\mathbf{v}_j$ is an eigenvector of $\mathcal{T}$.

Now if $d$ is odd and $(\lambda_1, \dots, \lambda_r) \in \operatorname{Ker}(V)$, we have

$$\mathcal{T} \cdot \mathbf{v}_j^{d-1} = \sum_{i=1}^{r} \lambda_i \langle \mathbf{v}_i, \mathbf{v}_j \rangle^{d-1} \mathbf{v}_i = \lambda_j \mathbf{v}_j + \sum_{i \in [r] \setminus \{j\}} \lambda_i \left(\sigma_{i,j} \alpha\right)^{d-1} \mathbf{v}_i = \lambda_j \mathbf{v}_j + \alpha^{d-1} \sum_{i \in [r] \setminus \{j\}} \lambda_i \mathbf{v}_i = \lambda_j \mathbf{v}_j + \alpha^{d-1}\left(-\lambda_j \mathbf{v}_j\right) = \lambda_j \left(1 - \alpha^{d-1}\right) \mathbf{v}_j,$$

i.e., all of $\mathbf{v}_1, \dots, \mathbf{v}_r$ are eigenvectors. In addition, using inequality (3.6),

$$\rho(J(\mathbf{v}_j)) \leq \frac{d-1}{|\lambda_j|\left(1 - \alpha^{d-1}\right)}\, \alpha^{d-2} \Bigg\|\sum_{i \in [r] \setminus \{j\}} \mathbf{v}_i \mathbf{v}_i^\top\Bigg\|_2 \leq \frac{d-1}{|\lambda_j|\left(1 - \alpha^{d-1}\right)}\, \alpha^{d-2} \left\|V V^\top\right\|_2 \leq \frac{\left\|V V^\top\right\|_2\, \alpha^{d-2}(d-1)}{\left(\min_{i \in [r]} |\lambda_i|\right)\left(1 - \alpha^{d-1}\right)}.$$

When the above quantity is less than $1$, $\mathbf{v}_j$ is a robust eigenvector, for all $j \in [r]$. ∎

When a tensor is generated by the vectors in an ETF, we call it an ETF decomposable tensor. Such tensors are a special case of frame decomposable (fradeco) tensors, which were studied in [OEDING2016125].

Theorem 4.3.

If $\mathbf{v}_1, \dots, \mathbf{v}_r$ form an ETF, then

$$\sum_{i \in [r] \setminus \{j\}} \sigma_{i,j}\, \mathbf{v}_i = \frac{1}{\alpha}\left(\frac{r}{n} - 1\right) \mathbf{v}_j \quad \text{for all } j \in [r].$$

In particular, all of $\mathbf{v}_1, \dots, \mathbf{v}_r$ are eigenvectors of the tensor

$$\mathcal{T} = \sum_{i=1}^{r} \mathbf{v}_i^{\otimes d} \tag{4.7}$$

when $d$ is even. Furthermore, in this case, all of these vectors are robust eigenvectors if

$$\frac{\frac{r}{n}\, \alpha^{d-2}(d-1)}{1 + \alpha^{d-2}\left(\frac{r}{n} - 1\right)} < 1, \tag{4.8}$$

which always holds when $d$ is large enough.

Proof.

Starting with the Gram matrix of the ETF, which has the form

$$V^\top V = I_r + \alpha\, \Sigma, \qquad \Sigma_{i,j} = \begin{cases} \sigma_{i,j}, & i \neq j, \\ 0, & i = j, \end{cases}$$

we subtract the identity matrix on both sides of the equation to obtain

$$V^\top V - I_r = \alpha\, \Sigma.$$

Multiplying on the left of both sides of the equation by $V$, and using $V V^\top = \frac{r}{n} I_n$, we obtain

$$V\left(V^\top V - I_r\right) = \frac{r}{n} V - V = \left(\frac{r}{n} - 1\right) V = \alpha\, V \Sigma.$$

Dividing by $\alpha$ and comparing the $j$-th columns of both sides, we have

$$\sum_{i \in [r] \setminus \{j\}} \sigma_{i,j}\, \mathbf{v}_i = \frac{1}{\alpha}\left(\frac{r}{n} - 1\right) \mathbf{v}_j,$$

so the first claim holds.

When $d$ is even, we now show that all of $\mathbf{v}_1, \dots, \mathbf{v}_r$ are eigenvectors of $\mathcal{T}$. We have

$$\mathcal{T} \cdot \mathbf{v}_j^{d-1} = \sum_{i=1}^{r} \langle \mathbf{v}_i, \mathbf{v}_j \rangle^{d-1} \mathbf{v}_i = \mathbf{v}_j + \sum_{i \in [r] \setminus \{j\}} \left(\sigma_{i,j} \alpha\right)^{d-1} \mathbf{v}_i = \mathbf{v}_j + \alpha^{d-1} \sum_{i \in [r] \setminus \{j\}} \sigma_{i,j}\, \mathbf{v}_i = \mathbf{v}_j + \alpha^{d-1} \cdot \frac{1}{\alpha}\left(\frac{r}{n} - 1\right) \mathbf{v}_j = \left(1 + \alpha^{d-2}\left(\frac{r}{n} - 1\right)\right) \mathbf{v}_j.$$

Therefore, by (3.6), we obtain

$$\rho(J(\mathbf{v}_j)) \leq \frac{d-1}{1 + \alpha^{d-2}\left(\frac{r}{n} - 1\right)}\, \alpha^{d-2} \left\|V V^\top\right\|_2 = \frac{\frac{r}{n}\, \alpha^{d-2}(d-1)}{1 + \alpha^{d-2}\left(\frac{r}{n} - 1\right)},$$

where the last equality follows from (4.2). Thus, $\mathbf{v}_j$ is a robust eigenvector for any $j \in [r]$ and all even $d$ for which the quantity above is strictly less than $1$ (which holds when $d$ is large enough). ∎
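The eigenvalue formula in Theorem 4.3 can be checked numerically. Below is our own quick check (not from the paper) on the three unit vectors at mutual angle $120°$ in $\mathbb{R}^2$, the Mercedes-Benz frame discussed in the next subsection, for which $r = 3$, $n = 2$, and $\alpha = 1/2$:

```python
import numpy as np

# For even d, each frame vector should be an eigenvector of
# T = Σ_i v_i^{⊗d} with eigenvalue 1 + alpha^{d-2}(r/n - 1).
V = np.array([[0.0, np.sqrt(3) / 2, -np.sqrt(3) / 2],
              [1.0, -0.5, -0.5]])
r, n, alpha = 3, 2, 0.5

def check(d):
    mu = 1 + alpha ** (d - 2) * (r / n - 1)
    return all(np.allclose(V @ (V.T @ V[:, j]) ** (d - 1), mu * V[:, j])
               for j in range(r))

print([check(d) for d in (4, 6, 8)])   # expect [True, True, True]
```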

Example 4.4.

(Orthogonally Decomposable Tensors) An ETF with $r = n$ is clearly an orthonormal set of vectors, with constant $\alpha = 0$, and thus we may choose $\sigma_{i,j}$ arbitrarily for all $i \neq j$. Tensors generated by this ETF are called orthogonally decomposable tensors, and their properties have been studied in [Robeva, RobevaSeigal]. It is not hard to see that an orthogonally decomposable tensor has all of $\mathbf{v}_1, \dots, \mathbf{v}_n$ as eigenvectors, and the spectral radius bound (4.8) is trivially equal to $0$ and therefore less than $1$, meaning all of $\mathbf{v}_1, \dots, \mathbf{v}_n$ are also robust eigenvectors of $\mathcal{T}$, for all $d \geq 3$.

In the following sections, we first study in more detail the robust eigenvectors of tensors generated by particular ETFs, and then by an ES which is not an ETF.

4.3 Regular Simplex Tensors

An ETF with $r = n+1$ always consists of the vertices of a regular simplex in $\mathbb{R}^n$ (page 623, [ETFexistence]), called a regular $n$-simplex frame, or regular simplex frame for short, with constant $\alpha = \frac{1}{n}$ and signs $\sigma_{i,j} = -1$ for all $i \neq j$. A particular example of a regular simplex frame is given by the Mercedes-Benz frame in $\mathbb{R}^2$, as shown in Figure 1, consisting of the vectors

$$\mathbf{v}_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad \mathbf{v}_2 = \begin{pmatrix} \frac{\sqrt{3}}{2} \\ -\frac{1}{2} \end{pmatrix}, \quad \mathbf{v}_3 = \begin{pmatrix} -\frac{\sqrt{3}}{2} \\ -\frac{1}{2} \end{pmatrix}. \tag{4.9}$$

We will call a tensor generated by a regular simplex frame a regular $n$-simplex tensor, or regular simplex tensor for short. Let $V$ be the matrix whose columns are $\mathbf{v}_1, \dots, \mathbf{v}_{n+1}$. We observe for the regular simplex frame that

$$V^\top V \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & -\frac{1}{n} & \cdots & -\frac{1}{n} \\ -\frac{1}{n} & 1 & \cdots & -\frac{1}{n} \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{1}{n} & -\frac{1}{n} & \cdots & 1 \end{pmatrix} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} = \mathbf{0},$$

and since $V$ has rank $n$, the kernel of $V$ is the span of $(1, \dots, 1)^\top$. Thus, consider the regular simplex tensor

$$\mathcal{T} = \sum_{i=1}^{n+1} \mathbf{v}_i^{\otimes d} \tag{4.10}$$

for $d \geq 3$. Then all of $\mathbf{v}_1, \dots, \mathbf{v}_{n+1}$ are eigenvectors of $\mathcal{T}$ by Theorems 4.2 and 4.3.

There is a systematic method of generating regular simplex frames in $\mathbb{R}^n$ for all $n$ as follows: if $\mathbf{e}_1, \dots, \mathbf{e}_n$ are the standard basis vectors and $\mathbf{1}_n = (1, \dots, 1)^\top$, then set

$$\mathbf{v}_i = \sqrt{1 + \tfrac{1}{n}}\, \mathbf{e}_i - \frac{1}{n^{3/2}}\left(\sqrt{n+1} - 1\right) \mathbf{1}_n, \quad i \in [n], \qquad \mathbf{v}_{n+1} = -\frac{1}{\sqrt{n}}\, \mathbf{1}_n. \tag{4.11}$$

These vectors are constructed by projecting the standard basis vectors in $\mathbb{R}^{n+1}$ onto the subspace orthogonal to the vector $\mathbf{1}_{n+1}$, with an appropriate rotation and rescaling.
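The construction (4.11) is easy to implement and check; the following is our own sketch (the function name is ours):

```python
import numpy as np

def simplex_frame(n):
    """Regular n-simplex frame: n + 1 unit vectors in R^n, following (4.11)."""
    e, ones = np.eye(n), np.ones(n)
    cols = [np.sqrt(1 + 1 / n) * e[:, i]
            - (np.sqrt(n + 1) - 1) / n ** 1.5 * ones for i in range(n)]
    cols.append(-ones / np.sqrt(n))
    return np.column_stack(cols)

# Sanity check for n = 3: unit norms, pairwise inner products -1/n,
# and tightness V Vᵀ = ((n+1)/n) I_n.
V = simplex_frame(3)
G = V.T @ V
print(np.allclose(np.diag(G), 1),
      np.allclose(G[~np.eye(4, dtype=bool)], -1 / 3),
      np.allclose(V @ V.T, 4 / 3 * np.eye(3)))   # expect True True True
```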

We now present a theorem which shows that all of $\mathbf{v}_1, \dots, \mathbf{v}_{n+1}$ are also robust eigenvectors of $\mathcal{T}$ for many values of $n$ and $d$.

Theorem 4.5.

Let

$$\mathcal{T} = \sum_{i=1}^{n+1} \mathbf{v}_i^{\otimes d}$$

be a tensor generated by a regular simplex frame $\mathbf{v}_1, \dots, \mathbf{v}_{n+1}$. Then all of $\mathbf{v}_1, \dots, \mathbf{v}_{n+1}$ are robust eigenvectors for all $n \geq 2$ and $d \geq 3$ such that $(n,d) \notin \{(2,3), (2,4), (3,3)\}$.

Proof.

For a regular $n$-simplex frame, we have $r = n+1$, $\alpha = \frac{1}{n}$, $\sigma_{i,j} = -1$, and $\lambda_i = 1$ for all $i$. Since $\sum_{i=1}^{n+1} \mathbf{v}_i = \mathbf{0}$, both bounds (4.6) and (4.8) apply. Thus,

$$\frac{\frac{r}{n}\, \alpha^{d-2}(d-1)}{1 + \alpha^{d-2}\left(\frac{r}{n} - 1\right)} = \frac{(n+1)(d-1)}{n^{d-1} + 1} < \frac{(n+1)(d-1)}{n^{d-1} - 1} = \frac{\left\|V V^\top\right\|_2\, \alpha^{d-2}(d-1)}{\left(\min_{i \in [r]} |\lambda_i|\right)\left(1 - \alpha^{d-1}\right)}.$$

Hence, regardless of the parity of $d$, it suffices to find values of $n$ and $d$ for which $\frac{(n+1)(d-1)}{n^{d-1} - 1} < 1$. This happens if and only if the following quantity is positive:

$$\gamma(n,d) = n^{d-1} + n - d - dn.$$

We can easily check that $\gamma$ is positive for the following values:

$$\gamma(2,5) = 3, \quad \gamma(3,4) = 14, \quad \gamma(4,3) = 5.$$

Moreover, the partial derivatives

$$\frac{\partial}{\partial n} \gamma(n,d) = (d-1)\left(n^{d-2} - 1\right), \qquad \frac{\partial}{\partial d} \gamma(n,d) = \ln(n)\, n^{d-1} - (n+1)$$

are positive for all $n \geq 2$ and $d \geq 3$ with $(n,d) \neq (2,3)$. This guarantees that $\gamma(n,d) > 0$ whenever

$$(n,d) \geq (2,5) \quad \text{or} \quad (n,d) \geq (3,4) \quad \text{or} \quad (n,d) \geq (4,3)$$

componentwise, and thus all of $\mathbf{v}_1, \dots, \mathbf{v}_{n+1}$ are robust eigenvectors for $n \geq 2$ and $d \geq 3$ with $(n,d) \notin \{(2,3), (2,4), (3,3)\}$. ∎
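The sign analysis of $\gamma$ from the proof can be verified directly over a grid (our own quick check):

```python
def gamma(n, d):
    """gamma(n, d) = n^{d-1} + n - d - d*n from the proof of Theorem 4.5."""
    return n ** (d - 1) + n - d - d * n

base_values = (gamma(2, 5), gamma(3, 4), gamma(4, 3))
print(base_values)        # the three base cases: (3, 14, 5)

def in_region(n, d):
    # componentwise (n,d) >= (2,5) or (3,4) or (4,3)
    return (n >= 2 and d >= 5) or (n >= 3 and d >= 4) or (n >= 4 and d >= 3)

ok = all(gamma(n, d) > 0
         for n in range(2, 40) for d in range(3, 40) if in_region(n, d))
print(ok)                 # True
```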

4.4 Regular 2-Simplex Tensors

In fact, for regular $2$-simplex tensors, we can prove even stronger results than Theorem 4.5. The next theorem concerns not only the robustness of the vectors in the regular simplex frame, but also the regions of convergence of the tensor power method, for tensors of even order $d$.

Theorem 4.6.

Let $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ be the vectors of a regular $2$-simplex frame. If $\mathbf{x}_0 \neq \mathbf{0}$ and there is a unique $j \in \{1,2,3\}$ which maximizes $|\langle \mathbf{x}_0, \mathbf{v}_j \rangle|$, then the tensor power method with initializing vector $\mathbf{x}_0$ applied to the tensor

$$\mathcal{T} = \mathbf{v}_1^{\otimes d} + \mathbf{v}_2^{\otimes d} + \mathbf{v}_3^{\otimes d}$$

will converge to $\mathbf{v}_j$ (up to sign), for all even $d \geq 6$. If $d = 4$, then any initializing vector is a fixed point of the tensor power method.

We leave the proof of this theorem to the Appendix.
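A quick numerical check (ours, not part of the paper's experiments) illustrates both behaviors on the Mercedes-Benz frame (4.9): at $d = 4$ the map $\mathbf{x} \mapsto \mathcal{T}\cdot\mathbf{x}^{3}$ acts as a scalar multiple of the identity on unit vectors (the constant $9/8$ below is our own computation), while at $d = 6$ the iteration lands on a frame vector up to sign:

```python
import numpy as np

V = np.array([[0.0, np.sqrt(3) / 2, -np.sqrt(3) / 2],
              [1.0, -0.5, -0.5]])        # Mercedes-Benz frame

rng = np.random.default_rng(1)
xs = rng.normal(size=(50, 2))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)

# d = 4: T·x^3 = (9/8) x for every unit vector x, so the normalized
# iteration fixes every initializing vector.
identity_like = all(np.allclose(V @ (V.T @ x) ** 3, 9 / 8 * x) for x in xs)

# d = 6: run the power method; the limit is ±v_j for some frame vector.
def tpm(x, d, iters=500):
    for _ in range(iters):
        y = V @ (V.T @ x) ** (d - 1)
        x = y / np.linalg.norm(y)
    return x

x_inf = tpm(xs[0], d=6)
hits_frame = np.max(np.abs(V.T @ x_inf)) > 1 - 1e-8
print(identity_like, hits_frame)
```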

The results of Theorems 4.5 and 4.6 can be visualized in Table 1. We denote by a green tick mark (✓) convergence which is guaranteed by Theorem 4.5, and by a red cross mark (✗) failure of convergence in the case $n = 2$, $d = 4$, where every initializing vector is a fixed point of the tensor power method, due to Theorem 4.6.

In addition, we performed the following two numerical experiments.

The first numerical experiment concerns robustness. Let $\mathcal{T}$ be the tensor generated by the regular simplex frame $\mathbf{v}_1, \dots, \mathbf{v}_{n+1}$. We can observe the values of $n$ and $d$ for which the tensor power method applied to $\mathcal{T}$ converges to vectors in the frame. Using MATLAB, we choose an initial vector $\mathbf{x}_0$ drawn from a uniform distribution on the unit sphere in $\mathbb{R}^n$ and apply the tensor power method for a fixed number of iterations. We denote by a black cross mark (✗) a lack of convergence, which occurs if the final iterate is not close to any of the eigenvectors of $\mathcal{T}$. We then performed this experiment for a range of values of $n$ and $d$. In the cases with the green tick mark, we did not observe the tensor power method converging to any vectors other than $\mathbf{v}_1, \dots, \mathbf{v}_{n+1}$. This suggests that the frame vectors may be the only robust eigenvectors, which we leave as a conjecture:

Conjecture 4.7.

The robust eigenvectors of a regular simplex tensor are precisely the vectors in the frame.

The second numerical experiment concerns the regions of convergence of the tensor power method in Theorem 4.6. Figure 2 shows the regions of convergence to the eigenvectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$, starting from an initial vector $\mathbf{x}_0$, of the tensor power method applied to the regular simplex tensor in $S^d(\mathbb{R}^2)$ generated by the vectors in a Mercedes-Benz frame (4.9). For $\mathbf{x}_0$ in the blue, red, and green regions, the method will converge to $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$, respectively. As Theorem 4.6 predicts, when $d$ is even, the regions of convergence form a partition of the unit disk into sectors. One can also observe a fractal subdivision of the regions of convergence for odd values of $d$, which we lack an explanation for. As a consequence of Theorem 3.1, however, larger values of $d$ result in greater robustness of the eigenvectors, and hence the observed thinning of these fractal subdivisions.

For the tensor generated by the Mercedes-Benz frame ($n = 2$), given by the vectors in (4.9), we can find its eigenvectors and their corresponding eigenvalues and multiplicities, which we show in Tables 2 and 3.