# Permutation Invariant Gaussian Matrix Models

Permutation invariant Gaussian matrix models were recently developed for applications in computational linguistics. A 5-parameter family of models was solved. In this paper, we use a representation theoretic approach to solve the general 13-parameter Gaussian model, which can be viewed as a zero-dimensional quantum field theory. We express the two linear and eleven quadratic terms in the action in terms of representation theoretic parameters. These parameters are coefficients of simple quadratic expressions in terms of appropriate linear combinations of the matrix variables transforming in specific irreducible representations of the symmetric group S_D where D is the size of the matrices. They allow the identification of constraints which ensure a convergent Gaussian measure and well-defined expectation values for polynomial functions of the random matrix at all orders. A graph-theoretic interpretation is known to allow the enumeration of permutation invariants of matrices at linear, quadratic and higher orders. We express the expectation values of all the quadratic graph-basis invariants and a selection of cubic and quartic invariants in terms of the representation theoretic parameters of the model.

## 1 Introduction

In the context of distributional semantics [1, 2], the meaning of words is represented by vectors which are constructed from the co-occurrences of a word of interest with a set of context words. In tensorial compositional distributional semantics [3, 4, 5, 6, 7], different types of words, depending on their grammatical role, are associated with vectors, matrices or higher rank tensors. In [8, 9] we initiated a study of the statistics of these tensors in the framework of matrix/tensor models. We focused on matrices associated with adjectives or verbs, constructed by a linear regression method, from the vectors for nouns and for adjective-noun composites or verb-noun composites.

We developed a 5-parameter Gaussian model,

$$Z(\Lambda, a, b, J^0, J^S) = \int dM \, e^{-\frac{\Lambda}{2}\sum_{i=1}^{D} M_{ii}^2 \,-\, \frac{1}{4}(a+b)\sum_{i \cdots} \cdots}$$

The parameters are coefficients of five linearly independent linear and quadratic functions of the random matrix variables which are permutation invariant, i.e. obey the equation

$$f(M_{i,j}) = f(M_{\sigma(i),\sigma(j)}) \tag{1.3}$$

for $\sigma \in S_D$, the symmetric group of all permutations of $D$ distinct objects. This invariance implements the notion that the meaning represented by the word-matrices is independent of the ordering of the context words. General observables of the model are polynomials $f(M)$ obeying the condition (1.3). At quadratic order there are $11$ linearly independent polynomials, which are listed in Appendix B of [8]. A three dimensional subspace of quadratic invariants was used in the model above. The most general Gaussian matrix model compatible with $S_D$ symmetry considers all eleven quadratic invariants and allows a coefficient for each of them. What makes the 5-parameter model relatively easy to handle is that the $D$ diagonal variables $M_{ii}$ are each decoupled from each other and from the off-diagonal elements, and there are $D(D-1)/2$ pairs of off-diagonal elements. For each $i < j$, $M_{ij}$ and $M_{ji}$ mix with each other, so the solution of the model requires an inversion of a $2 \times 2$ matrix.
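The invariance condition (1.3) can be checked numerically. The following is a minimal sketch, assuming NumPy and 0-based indices; the helper `permute` and the three sample functions are illustrative choices rather than notation from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 5
M = rng.normal(size=(D, D))

def permute(M, sigma):
    """Return the matrix with entries M[sigma[i], sigma[j]]."""
    return M[np.ix_(sigma, sigma)]

# Three sample invariants: two linear ones and one quadratic one.
invariants = {
    "sum_i M_ii": lambda M: np.trace(M),
    "sum_ij M_ij": lambda M: M.sum(),
    "sum_ij M_ij M_ji": lambda M: np.sum(M * M.T),
}

sigma = rng.permutation(D)
checks = {
    name: bool(np.isclose(f(M), f(permute(M, sigma))))
    for name, f in invariants.items()
}
```

Each entry of `checks` compares a function evaluated on $M$ and on the simultaneously row- and column-permuted matrix.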

Expectation values of $f(M)$ are computed as

$$\langle f(M) \rangle \equiv \frac{1}{Z}\int dM \, f(M) \, \mathrm{EXP} \tag{1.4}$$

where EXP is the product of exponentials in (1.2).

Representation theory of $S_D$ offers the techniques to solve the general permutation invariant Gaussian model. The $D^2$ matrix elements $M_{ij}$ transform as the tensor product $V_D \otimes V_D$ of two copies of the natural representation $V_D$. We first decompose $V_D \otimes V_D$ into irreducible representations of the diagonal $S_D$.

$$V_D \otimes V_D = 2V_0 \oplus 3V_H \oplus V_2 \oplus V_3 \tag{1.5}$$

The trivial (one-dimensional) representation $V_0$ occurs with multiplicity $2$. The $(D-1)$-dimensional irreducible representation (irrep) $V_H$ occurs with multiplicity $3$. $V_2$ is an irrep of dimension $D(D-3)/2$ which occurs with multiplicity $1$. Likewise, $V_3$ of dimension $(D-1)(D-2)/2$ occurs with multiplicity $1$. As a result of these multiplicities, the 11 parameters can be decomposed as

$$11 = 1 + 1 + 3 + 6 \tag{1.6}$$

Here $3$ is the number of independent entries of a symmetric $2 \times 2$ matrix, and $6$ is the number of independent entries of a symmetric $3 \times 3$ matrix. More precisely the parameters form

$$\mathcal{M} = \mathbb{R}^+ \times \mathbb{R}^+ \times \mathcal{M}^+_2 \times \mathcal{M}^+_3 \tag{1.7}$$

where $\mathbb{R}^+$ is the set of real numbers greater than or equal to zero, and $\mathcal{M}^+_r$ is the space of positive semi-definite matrices of size $r$. Calculating the correlators of this Gaussian model amounts to inverting a symmetric $2 \times 2$ matrix, inverting a symmetric $3 \times 3$ matrix, and applying Wick contraction rules, as in quantum field theory. There is a graph basis for permutation invariant functions of $M$. This is explained in Appendix B of [8], which gives examples of graph basis invariants and representation theoretic counting formulae which make contact with the sequence A052171 (directed multi-graphs with loops on any number of nodes) of the Online Encyclopaedia of Integer Sequences (OEIS) [10].
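The counts of two linear and eleven quadratic invariants can be reproduced by brute force for a small matrix size. The sketch below is an illustration under stated assumptions, not the counting formula of [8]: it counts orbits of $S_D$ acting on monomials in the entries $M_{ij}$, with $D = 4$ chosen because it is the smallest size at which all quadratic graph types appear. The function name `count_invariants` is hypothetical.

```python
from itertools import combinations_with_replacement, permutations, product

def count_invariants(D, degree):
    """Count S_D orbits on degree-fold monomials in the entries M_ij."""
    positions = list(product(range(D), repeat=2))
    perms = list(permutations(range(D)))
    seen = set()
    for monomial in combinations_with_replacement(positions, degree):
        # Canonical representative: the smallest relabelling over all permutations.
        canon = min(
            tuple(sorted((s[i], s[j]) for i, j in monomial)) for s in perms
        )
        seen.add(canon)
    return len(seen)

linear = count_invariants(4, 1)      # graphs with one directed edge
quadratic = count_invariants(4, 2)   # graphs with two directed edges
```

Each orbit corresponds to one graph-basis invariant, so `linear + quadratic` gives the number of parameters of the general Gaussian model.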

In this paper we show how all the linear and quadratic moments of the graph-basis invariants are expressed in terms of the representation theoretic parameters of (1.7). We also show how some cubic and quartic graph basis invariants are expressed in terms of these parameters. These results are analytic expressions valid for all $D$.

The paper is organised as follows. Section 2 introduces the relevant facts we need from the representation theory of $S_D$ in a fairly self-contained way, which can be read with little prior familiarity with representation theory, requiring only knowledge of linear algebra. This is used to define the 13-parameter family of Gaussian models (equations (2.115), (2.116), (2.118)). Section 3 calculates the expectation values of linear and quadratic graph-basis invariants in the Gaussian model. Sections 4 and 5 describe calculations of expectation values of a selection of cubic and quartic graph-basis invariants in the model.

## 2 General permutation invariant Gaussian Matrix models

We solved a permutation invariant Gaussian matrix model with $2$ linear and $3$ quadratic parameters in [8]. The linear parameters are coefficients of linear permutation invariant functions of $M$, and the quadratic parameters (denoted $\Lambda, a, b$) are coefficients of quadratic functions. We explained the existence of a $13$-parameter family of models, based on the fact that there are $11$ linearly independent quadratic permutation invariant functions of a matrix. The general $13$-parameter family of models can be solved by using techniques from the representation theory of $S_D$. Useful background references on representation theory are [11, 12, 13]. We begin by collecting the relevant facts which will allow a useful parametrisation of the quadratic invariants of a matrix $M$. An important step is to form linear combinations of the $M_{ij}$ labelled by irreducible representations of $S_D$. The results of this step are in (2.68)-(2.77). The quadratic terms in the action of the Gaussian model are close to diagonal in these $S$-variables.

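The matrix elements $M_{ij}$, where $i, j$ run over $\{1, \dots, D\}$, span a vector space of dimension $D^2$. It is isomorphic to the tensor product $V_D \otimes V_D$, where $V_D$ is a $D$-dimensional space. Consider $V_D$ as the span of $D$ basis vectors $\{e_1, e_2, \dots, e_D\}$. This is a representation of $S_D$. For every permutation $\sigma \in S_D$, there is a linear operator $\rho_{V_D}(\sigma)$ defined by

$$\rho_{V_D}(\sigma)\, e_i = e_{\sigma^{-1}(i)} \tag{2.1}$$

on the basis vectors and extended by linearity. With this definition, $\rho_{V_D}$ is a homomorphism from $S_D$ to linear operators acting on $V_D$.

$$\rho_{V_D}(\sigma_1)\, \rho_{V_D}(\sigma_2) = \rho_{V_D}(\sigma_1 \sigma_2) \tag{2.2}$$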

We can take the basis vectors to be orthonormal.

$$(e_i, e_j) = \delta_{ij} \tag{2.3}$$

We can form the following linear combinations

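$$\begin{aligned}
E_0 &= \frac{1}{\sqrt{D}}\,(e_1 + e_2 + \cdots + e_D) && (2.4)\\
E_1 &= \frac{1}{\sqrt{2}}\,(e_1 - e_2) && (2.5)\\
E_2 &= \frac{1}{\sqrt{6}}\,(e_1 + e_2 - 2e_3) && (2.6)\\
&\ \ \vdots && (2.7)\\
E_a &= \frac{1}{\sqrt{a(a+1)}}\,(e_1 + e_2 + \cdots + e_a - a\, e_{a+1}) && (2.8)\\
&\ \ \vdots && (2.9)\\
E_{D-1} &= \frac{1}{\sqrt{D(D-1)}}\,(e_1 + e_2 + \cdots + e_{D-1} - (D-1)\, e_D) && (2.10)
\end{aligned}$$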

$E_0$ is invariant under the action of $S_D$

$$\rho_{V_D}(\sigma)\, E_0 = E_0 \tag{2.11}$$

The one-dimensional vector space spanned by $E_0$ is an $S_D$-invariant vector subspace of $V_D$. We can call this vector space $V_0$. The vector space spanned by $E_a$, where $1 \le a \le D-1$, which we call $V_H$, is also an $S_D$-invariant subspace.

$$\rho_{V_D}(\sigma)\, E_a \in V_H \tag{2.12}$$

We have some matrices $D^H(\sigma)$ such that

$$\rho_{V_D}(\sigma)\, E_a = \sum_b D^H_{ba}(\sigma)\, E_b \tag{2.13}$$

These matrices are obtained by using the action on the $e_i$ and the change of basis coefficients. The vectors $E_A$ for $0 \le A \le D-1$ are orthonormal.

$$(E_A, E_B) = \delta_{A,B} \tag{2.14}$$

All the above facts are summarised by saying that the natural representation $V_D$ of $S_D$ decomposes as an orthogonal direct sum of irreducible representations of $S_D$ as

$$V_D = V_0 \oplus V_H \tag{2.15}$$
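The statements (2.1)-(2.15) can be verified directly for small $D$. The following is a minimal sketch, assuming NumPy, 0-based indices and $D = 4$; the helper names `basis_matrix`, `rho` and `DH` are hypothetical. One further assumption is the reading of the product $\sigma_1 \sigma_2$ as "apply $\sigma_1$ first, then $\sigma_2$", which is the convention under which the operators defined by (2.1) multiply as in (2.2).

```python
import numpy as np
from itertools import permutations

D = 4

def basis_matrix(D):
    """Rows are E_0, E_1, ..., E_{D-1} expanded in the basis e_1, ..., e_D."""
    C = np.zeros((D, D))
    C[0] = 1.0 / np.sqrt(D)
    for a in range(1, D):
        C[a, :a] = 1.0
        C[a, a] = -a
        C[a] /= np.sqrt(a * (a + 1))
    return C

def rho(sigma):
    """Matrix of rho(sigma): e_i -> e_{sigma^{-1}(i)}, with sigma a tuple."""
    R = np.zeros((len(sigma), len(sigma)))
    for i in range(len(sigma)):
        # sigma[k] == i means k = sigma^{-1}(i)
        R[sigma.index(i), i] = 1.0
    return R

C = basis_matrix(D)
perms = list(permutations(range(D)))

def DH(sigma):
    """Matrix D^H(sigma) with entries (E_a, rho(sigma) E_b), a, b = 1..D-1."""
    return (C @ rho(sigma) @ C.T)[1:, 1:]

orthonormal = bool(np.allclose(C @ C.T, np.eye(D)))
E0_invariant = all(np.allclose(rho(s) @ C[0], C[0]) for s in perms)
# rho(sigma) E_a has no component along E_0, so V_H is mapped to itself.
VH_invariant = all(np.allclose((C @ rho(s) @ C.T)[0, 1:], 0.0) for s in perms)
# Product convention: (s1 s2)(i) = s2[s1[i]], i.e. s1 acts first.
homomorphism = all(
    np.allclose(DH(s1) @ DH(s2), DH(tuple(s2[s1[i]] for i in range(D))))
    for s1 in perms
    for s2 in perms
)
```

The four flags correspond to (2.14), (2.11), (2.12) and the homomorphism property restricted to $V_H$.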

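By reading off the coefficients in the expansion of the $E_A$ in terms of the $e_i$, we can define the coefficients

$$\begin{aligned}
C_{0,i} &= (E_0, e_i) && (2.16)\\
C_{a,i} &= (E_a, e_i) && (2.17)
\end{aligned}$$

They are

$$\begin{aligned}
C_{0,i} &= \frac{1}{\sqrt{D}} && (2.18)\\
C_{a,i} &= N_a \Big( -a\, \delta_{i,a+1} + \sum_{j=1}^{a} \delta_{ji} \Big) && (2.19)\\
N_a &= \frac{1}{\sqrt{a(a+1)}} && (2.20)
\end{aligned}$$

The orthonormality means that

$$\begin{aligned}
\sum_i C_{0,i} C_{0,i} &= 1 && (2.21)\\
\sum_i C_{a,i} C_{b,i} &= \delta_{a,b} && (2.22)\\
\sum_i C_{0,i} C_{a,i} &= 0 && (2.23)
\end{aligned}$$

The last equation implies that

$$\sum_i C_{a,i} = 0 \tag{2.25}$$

From

$$\sum_{A=0}^{D-1} C_{A,i} C_{A,j} = C_{0,i} C_{0,j} + \sum_{a=1}^{D-1} C_{a,i} C_{a,j} = \delta_{i,j} \tag{2.26}$$

we deduce

$$\sum_a C_{a,i} C_{a,j} = \Big( \delta_{ij} - \frac{1}{D} \Big) \equiv F(i,j) \tag{2.27}$$

As we will see, this function will play an important role in calculations of correlators in the Gaussian model. It is the projector in $V_D$ for the subspace $V_H$, obeying

$$\begin{aligned}
\sum_j F(i,j)\, F(j,k) &= F(i,k) && (2.28)\\
\sum_i F(i,i) &= (D-1) && (2.29)
\end{aligned}$$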
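The properties (2.25)-(2.29) follow from orthonormality, and can also be confirmed numerically. This is a minimal sketch assuming NumPy, 0-based array indices and an arbitrary choice $D = 6$; the function name `coefficients` is hypothetical.

```python
import numpy as np

def coefficients(D):
    """Array C with C[A, i] = (E_A, e_i); row 0 is C_{0,i}, rows 1..D-1 are C_{a,i}."""
    C = np.zeros((D, D))
    C[0] = 1.0 / np.sqrt(D)
    for a in range(1, D):
        N_a = 1.0 / np.sqrt(a * (a + 1))
        C[a, :a] = N_a          # coefficient of e_1, ..., e_a
        C[a, a] = -a * N_a      # coefficient of e_{a+1}
    return C

D = 6
C = coefficients(D)
C_H = C[1:]                     # the coefficients C_{a,i} for a = 1..D-1

F = C_H.T @ C_H                 # F(i,j) = sum_a C_{a,i} C_{a,j}

rows_sum_to_zero = bool(np.allclose(C_H.sum(axis=1), 0.0))        # (2.25)
matches_formula = bool(np.allclose(F, np.eye(D) - 1.0 / D))       # (2.27)
idempotent = bool(np.allclose(F @ F, F))                          # (2.28)
trace_F = float(np.trace(F))                                      # (2.29)
```

Here `trace_F` equals $D - 1$ up to floating point error, the dimension of $V_H$.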

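Now we will use these coefficients to build linear combinations of the matrix elements which have well-defined transformation properties under $S_D \times S_D$. Define

$$\begin{aligned}
S^{00} &= \sum_{i,j} C_{0,i} C_{0,j} M_{ij} = \frac{1}{D} \sum_{i,j=1}^{D} M_{ij} && (2.30)\\
S^{0H}_a &= \sum_{i,j} C_{0,i} C_{a,j} M_{ij} = \frac{1}{\sqrt{D}} \sum_{i,j} C_{a,j} M_{ij} && (2.31)\\
S^{H0}_a &= \sum_{i,j} C_{a,i} C_{0,j} M_{ij} = \frac{1}{\sqrt{D}} \sum_{i,j} C_{a,i} M_{ij} && (2.32)\\
S^{HH}_{ab} &= \sum_{i,j} C_{a,i} C_{b,j} M_{ij} && (2.33)
\end{aligned}$$

The $a, b$ indices range over $1, \dots, D-1$. These variables are irreducible under $S_D \times S_D$, transforming as $V_0 \otimes V_0$, $V_0 \otimes V_H$, $V_H \otimes V_0$ and $V_H \otimes V_H$ respectively. Under the diagonal $S_D$, the first three transform as $V_0$, $V_H$, $V_H$ while the $S^{HH}_{ab}$ form a reducible representation.

Conversely, we can write the $M_{ij}$ variables in terms of the $S$ variables, using the orthogonality properties of the $C_{0,i}, C_{a,i}$.

$$\begin{aligned}
M_{ij} &= C_{0,i} C_{0,j} S^{00} + \sum_{a=1}^{D-1} C_{0,i} C_{a,j} S^{0H}_a + \sum_{a=1}^{D-1} C_{a,i} C_{0,j} S^{H0}_a + \sum_{a,b=1}^{D-1} C_{a,i} C_{b,j} S^{HH}_{ab} && (2.35)\\
&= \frac{1}{D} S^{00} + \frac{1}{\sqrt{D}} \sum_{a=1}^{D-1} C_{a,j} S^{0H}_a + \frac{1}{\sqrt{D}} \sum_{a=1}^{D-1} C_{a,i} S^{H0}_a + \sum_{a,b=1}^{D-1} C_{a,i} C_{b,j} S^{HH}_{ab}
\end{aligned}$$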
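The change of variables (2.30)-(2.33) and its inverse (2.35) amount to conjugating $M$ with the orthogonal matrix of coefficients. A minimal round-trip check, assuming NumPy, 0-based indices, a random test matrix and the hypothetical helper name `coefficients`:

```python
import numpy as np

def coefficients(D):
    """Array C with C[A, i] = (E_A, e_i)."""
    C = np.zeros((D, D))
    C[0] = 1.0 / np.sqrt(D)
    for a in range(1, D):
        N_a = 1.0 / np.sqrt(a * (a + 1))
        C[a, :a] = N_a
        C[a, a] = -a * N_a
    return C

rng = np.random.default_rng(1)
D = 5
C = coefficients(D)
C_H = C[1:]
M = rng.normal(size=(D, D))

# S[A, B] = sum_{i,j} C[A, i] C[B, j] M[i, j], split into the four blocks.
S = C @ M @ C.T
S00, S0H, SH0, SHH = S[0, 0], S[0, 1:], S[1:, 0], S[1:, 1:]

# Second line of (2.35), term by term.
M_rebuilt = (
    S00 / D
    + (S0H @ C_H)[None, :] / np.sqrt(D)   # depends on the column index j only
    + (SH0 @ C_H)[:, None] / np.sqrt(D)   # depends on the row index i only
    + C_H.T @ SHH @ C_H
)

round_trip_ok = bool(np.allclose(M_rebuilt, M))
S00_is_mean_sum = bool(np.isclose(S00, M.sum() / D))
```

The block shapes also reproduce the variable count $1 + (D-1) + (D-1) + (D-1)^2 = D^2$.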

The next step is to consider quadratic products of these $S$-variables, and identify the products which are invariant. In order to do this we need to understand the transformation properties of the above variables in terms of the diagonal action of $S_D$. It is easy to see that $S^{00}$ is invariant. $S^{0H}_a$ and $S^{H0}_a$ both have a single index running over $1, \dots, D-1$, and they transform in the same way as $V_H$. The vector space spanned by the $S^{HH}_{ab}$ forms a space of dimension $(D-1)^2$ which is

$$V_H \otimes V_H \tag{2.36}$$

Permutations act on this as

$$\sigma(S^{HH}_{ab}) = \sum_{a_1, b_1} D^H_{a_1 a}(\sigma)\, D^H_{b_1 b}(\sigma)\, S^{HH}_{a_1, b_1} \tag{2.37}$$

### 2.1 Useful facts from representation theory

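• The representation space $V_H \otimes V_H$ can be decomposed into irreducible representations (irreps) of the diagonal $S_D$ action as

$$V_H \otimes V_H = V_0 \oplus V_H \oplus V_2 \oplus V_3 \tag{2.38}$$

In Young diagram notation for irreps of $S_n$, with $n = D$,

$$\begin{aligned}
V_0 &\to [n] && (2.39)\\
V_H &\to [n-1, 1] && (2.40)\\
V_2 &\to [n-2, 2] && (2.41)\\
V_3 &\to [n-2, 1, 1] && (2.42)
\end{aligned}$$
• These irreps are known to have dimensions $1$, $(D-1)$, $D(D-3)/2$, $(D-1)(D-2)/2$. They add up to $(D-1)^2$, which is the dimension of $V_H \otimes V_H$.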

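• The vector $\sum_a E_a \otimes E_a$ is invariant under the diagonal action of $S_D$. The action of $\sigma$ on $V_H$ is given by

$$D^H_{ab}(\sigma) = (E_a, \sigma E_b) = \sum_i C_{a,i}\, C_{b,\sigma(i)} \tag{2.43}$$

These can be verified to satisfy the homomorphism property

$$\sum_b D^H_{ab}(\sigma)\, D^H_{bc}(\tau) = D^H_{ac}(\sigma\tau) \tag{2.44}$$

We also have $D^H_{ab}(\sigma^{-1}) = D^H_{ba}(\sigma)$. Using these properties, we can show that $\sum_a E_a \otimes E_a$ is invariant under the diagonal action. The vector

$$\frac{1}{D-1} \sum_a E_a \otimes E_a \tag{2.45}$$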
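• The vector in $V_0$ coming from $V_H \otimes V_H$ is simply

$$\frac{1}{\sqrt{D-1}} \sum_a S^{HH}_{aa} = \frac{1}{\sqrt{D-1}} \sum_{i,j} F(i,j)\, M_{ij} \equiv S^{HH \to 0} \tag{2.46}$$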

The vector in $V_H$ inside $V_H \otimes V_H$ (2.38) is some linear combination

$$\sum_{b,c} C^{H,H \to H}_{b,c;a}\, S^{HH}_{bc} \equiv S^{H,H \to H}_a \tag{2.47}$$

The coefficients $C^{H,H \to H}_{b,c;a}$ are some representation theoretic numbers (called Clebsch-Gordan coefficients) which satisfy the orthonormality condition

$$\sum_{b,c} C^{H,H \to H}_{b,c;a}\, C^{H,H \to H}_{b,c;d} = \delta_{a,d} \tag{2.48}$$

As shown in Appendix B, these Clebsch-Gordan coefficients are proportional to $C_{a,b,c}$:

$$C^{H,H \to H}_{a,b;c} = \sqrt{\frac{D}{D-2}}\; C_{a,b,c} \tag{2.49}$$

It is a useful fact that the Clebsch-Gordan coefficients for $V_H \otimes V_H \to V_H$ can be written in terms of the $C_{a,i}$ describing $V_H$ as a subspace of the natural representation. This has recently played a role in the explicit description of a ring structure on primary fields of free scalar conformal field theory [14]. It would be interesting to explore the more general construction of explicit Clebsch-Gordan coefficients and projectors in the representation theory of $S_D$ in terms of the $C_{a,i}$.

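• Similarly for $V_2$ we have corresponding vectors and Clebsch-Gordan coefficients

$$\sum_{b,c} C^{H,H \to V_2}_{b,c;a}\, S^{HH}_{bc} \equiv S^{HH \to V_2}_a \tag{2.50}$$

where $a$ ranges from $1$ to $\mathrm{Dim}(V_2) = D(D-3)/2$. We have the orthogonality property

$$\sum_{b,c} C^{H,H \to V_2}_{b,c;a_1}\, C^{H,H \to V_2}_{b,c;a_2} = \delta_{a_1,a_2} \tag{2.51}$$

And for $V_3$

$$\begin{aligned}
\sum_{b,c} C^{H,H \to V_3}_{b,c;a}\, S^{HH}_{bc} &\equiv S^{HH \to V_3}_a && (2.52)\\
\sum_{b,c} C^{H,H \to V_3}_{b,c;a_1}\, C^{H,H \to V_3}_{b,c;a_2} &= \delta_{a_1,a_2} && (2.53)
\end{aligned}$$

Here $a$ runs from $1$ to $\mathrm{Dim}(V_3) = (D-1)(D-2)/2$.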

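• The projector for the subspace of $V_H \otimes V_H$ transforming as $V_H$ under the diagonal $S_D$ is

$$\begin{aligned}
(P^{H,H \to H})_{a,b;c,d} &= \sum_e C^{H,H \to H}_{a,b;e}\, C^{H,H \to H}_{c,d;e} && (2.55)\\
&= \frac{D}{D-2} \sum_e C_{a,b,e}\, C_{c,d,e}
\end{aligned}$$
• The projector for $V_0$ in $V_H \otimes V_H$ is

$$(P^{H,H \to V_0})_{a,b;c,d} = \frac{1}{(D-1)}\, \delta_{a,b}\, \delta_{c,d} \tag{2.56}$$

$V_3$ is just the anti-symmetric part of $V_H \otimes V_H$. $V_2$ is the orthogonal complement to $V_0 \oplus V_H$ inside the symmetric subspace of $V_H \otimes V_H$, which is invariant under the swap $s$ of the two factors (often denoted $\mathrm{Sym}^2(V_H)$)

$$\begin{aligned}
(P^{H,H \to V_2}) &= (1 - P^{H,H \to H} - P^{H,H \to V_0})\, \frac{(1+s)}{2} && (2.57)\\
(s)_{a,b;c,d} &= \delta_{a,d}\, \delta_{b,c} && (2.58)
\end{aligned}$$

The quadratic invariant corresponding to $V_2$ is

$$\sum_{a,b,c,d} S^{HH}_{ab}\, (P^{H,H \to V_2})_{a,b;c,d}\, S^{HH}_{cd} \tag{2.59}$$

The quadratic invariant corresponding to $V_3$ is similar. We just have to use

$$P^{H,H \to V_3} = \frac{1}{2}(1 - s) \tag{2.60}$$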
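• The inner product

$$\langle M_{ij}, M_{kl} \rangle = \delta_{ik}\, \delta_{jl} \tag{2.61}$$

is invariant under the action of $\sigma \in S_D$.

$$\langle \sigma(M_{ij}), \sigma(M_{kl}) \rangle = \langle M_{ij}, M_{kl} \rangle \tag{2.62}$$
• The following is an important fact about invariants. Every irreducible representation of $S_D$, let us denote it by $V_R$, has the property that

$$\mathrm{Sym}^2(V_R) \tag{2.63}$$

contains the trivial irrep once. This invariant is formed by taking the sum over an orthonormal basis, $\sum_A e^V_A \otimes e^V_A$.

$$\begin{aligned}
D^{V \otimes V}(\sigma) \sum_A e^V_A \otimes e^V_A &= \sum_A D^V(\sigma)\, e^V_A \otimes D^V(\sigma)\, e^V_A && (2.64)\\
&= \sum_{A,B,C} D^V_{BA}(\sigma)\, D^V_{CA}(\sigma)\, e^V_B \otimes e^V_C && (2.65)\\
&= \sum_{A,B,C} D^V_{BA}(\sigma)\, D^V_{AC}(\sigma^{-1})\, e^V_B \otimes e^V_C && (2.66)\\
&= \sum_A e^V_A \otimes e^V_A && (2.67)
\end{aligned}$$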
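• To summarize, the matrix variables $M_{ij}$ can be linearly transformed to the following variables, organised according to representations of the diagonal $S_D$.

$$\begin{aligned}
\text{Trivial rep:}\quad & S^{00},\ S^{HH \to 0} && (2.68)\\
\text{Hook rep:}\quad & S^{0,H}_a,\ S^{H,0}_a,\ S^{H,H \to H}_a && (2.69)\\
\text{The rep } V_2:\quad & S^{HH \to V_2}_a && (2.70)\\
\text{The rep } V_3:\quad & S^{HH \to V_3}_a && (2.71)
\end{aligned}$$
• For convenience, we will also use simpler names

$$\begin{aligned}
S^{V_0;1} &= S^{00} && (2.72)\\
S^{V_0;2} &= S^{H,H \to 0} && (2.73)
\end{aligned}$$

where we introduced labels $1, 2$ to distinguish the two occurrences of the trivial irrep in the space spanned by the $M_{ij}$. We will also use

$$\begin{aligned}
S^{H;1}_a &= S^{0,H \to H}_a && (2.74)\\
S^{H;2}_a &= S^{H,0 \to H}_a && (2.75)\\
S^{H;3}_a &= S^{H,H \to H}_a && (2.76)
\end{aligned}$$

where we introduced labels $1, 2, 3$ to distinguish the three occurrences of $V_H$ in the space spanned by the $M_{ij}$. For the multiplicity-free cases, we introduce

$$\begin{aligned}
S^{V_2}_a &= S^{HH \to V_2}_a && (2.77)\\
S^{V_3}_a &= S^{HH \to V_3}_a && (2.78)
\end{aligned}$$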

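The $M_{ij}$ variables can be written as linear combinations of the $S$ variables. The rep-basis expansion of $M_{ij}$ is as follows, with repeated indices that carry no explicit summation sign summed over.

$$\begin{aligned}
M_{ij} &= C_{0,i} C_{0,j} S^{00} + C_{a,i} C_{b,j} S^{HH}_{ab} + C_{0,i} C_{a,j} S^{0H}_a + C_{a,i} C_{0,j} S^{H0}_a && (2.85)\\
&= \frac{1}{D} S^{00} + \frac{1}{\sqrt{D}} \sum_{a=1}^{D-1} C_{a,j} S^{0H}_a + \frac{1}{\sqrt{D}} \sum_{a=1}^{D-1} C_{a,i} S^{H0}_a + \sum_{a,b=1}^{D-1} C_{a,i} C_{b,j} S^{HH}_{ab}\\
&= \frac{1}{D} S^{00} + \frac{1}{\sqrt{D}} \sum_{a=1}^{D-1} C_{a,j} S^{0H}_a + \frac{1}{\sqrt{D}} \sum_{a=1}^{D-1} C_{a,i} S^{H0}_a\\
&\quad + \sum_{a,b} C_{a,i} C_{b,j} \sum_{V \in \{V_0, V_H, V_2, V_3\}} C^{HH \to V}_{a,b;c}\, S^{HH \to V}_c\\
&= \frac{1}{D} S^{00} + \frac{1}{\sqrt{D}} \sum_{a=1}^{D-1} C_{a,j} S^{0H}_a + \frac{1}{\sqrt{D}} \sum_{a=1}^{D-1} C_{a,i} S^{H0}_a + \frac{1}{\sqrt{D-1}} \sum_a C_{a,i} C_{a,j} S^{HH \to V_0}\\
&\quad + \sum_{a,b=1}^{D-1} C_{a,i} C_{b,j} C^{HH \to H}_{a,b;c}\, S^{HH \to H}_c + \sum_{a,b=1}^{D-1} C_{a,i} C_{b,j} C^{HH \to V_2}_{a,b;c}\, S^{HH \to V_2}_c\\
&\quad + \sum_{a,b=1}^{D-1} C_{a,i} C_{b,j} C^{HH \to V_3}_{a,b;c}\, S^{HH \to V_3}_c
\end{aligned}$$

In going from the first to the second line, we have used the fact that the transition from the natural representation to the trivial representation is given by simple constant coefficients. In the third line, we have used the Clebsch-Gordan coefficients for $V_H \otimes V_H \to V$, obeying the orthogonality

$$\sum_{a,b} C^{H,H \to V}_{a,b;c}\, C^{H,H \to V}_{a,b;c'} = \delta_{cc'} \tag{2.87}$$

For $V_0$, which is one dimensional, we just have

$$C^{HH \to V_0}_{ab} = \frac{\delta_{ab}}{\sqrt{D-1}} \tag{2.88}$$

It is now useful to collect together the terms corresponding to each irrep

$$\begin{aligned}
M_{ij} &= \Big( \frac{1}{D} S^{00} + \frac{1}{\sqrt{D-1}} \sum_a C_{a,i} C_{a,j} S^{HH \to 0} \Big) && (2.91)\\
&\quad + \Big( \frac{1}{\sqrt{D}} \sum_{a=1}^{D-1} C_{a,j} S^{0H}_a + \frac{1}{\sqrt{D}} \sum_{a=1}^{D-1} C_{a,i} S^{H0}_a + \sum_{a,b=1}^{D-1} C_{a,i} C_{b,j} C^{HH \to H}_{a,b;c}\, S^{HH \to H}_c \Big)\\
&\quad + \sum_{a,b=1}^{D-1} C_{a,i} C_{b,j} C^{HH \to V_2}_{a,b;c}\, S^{HH \to V_2}_c + \sum_{a,b=1}^{D-1} C_{a,i} C_{b,j} C^{HH \to V_3}_{a,b;c}\, S^{HH \to V_3}_c
\end{aligned}$$

Using the notation of (2.72), (2.74), (2.77), we write this as

$$\begin{aligned}
M_{ij} &= \Big( \frac{1}{D} S^{V_0;1} + \frac{1}{\sqrt{D-1}} \sum_a C_{a,i} C_{a,j} S^{V_0;2} \Big) && (2.92)\\
&\quad + \Big( \frac{1}{\sqrt{D}} \sum_{a=1}^{D-1} C_{a,j} S^{H;1}_a + \frac{1}{\sqrt{D}} \sum_{a=1}^{D-1} C_{a,i} S^{H;2}_a + \sum_{a,b=1}^{D-1} C_{a,i} C_{b,j} C^{HH \to H}_{a,b;c}\, S^{H;3}_c \Big) && (2.93)\\
&\quad + \sum_{a,b=1}^{D-1} C_{a,i} C_{b,j} C^{HH \to V_2}_{a,b;c}\, S^{HH \to V_2}_c + \sum_{a,b=1}^{D-1} C_{a,i} C_{b,j} C^{HH \to V_3}_{a,b;c}\, S^{HH \to V_3}_c && (2.94)
\end{aligned}$$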
• The discussion so far has included explicit bases for $V_0, V_H$ inside $V_D$ which are easy to write down. A key object in the above discussion is the projector $F(i,j)$ defined in (2.27). For the irreps $V_2, V_3$ which appear in $V_H \otimes V_H$, we will not need to write down explicit bases. Although Clebsch-Gordan coefficients for $V_H \otimes V_H \to V_2$ and $V_H \otimes V_H \to V_3$ appear in some of the above formulae, we will only need some of their orthogonality properties rather than their explicit forms. The projectors for $V_2, V_3$ in $V_D \otimes V_D$ can be written in terms of the $F(i,j)$, and it is these projectors which play a role in the correlators we will be calculating.
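The projector formulae (2.55)-(2.60) can be tested without any explicit basis for $V_2$ or $V_3$. The sketch below assumes NumPy and $D = 5$, and it assumes that the symbol $C_{a,b,c}$ of (2.49) stands for $\sum_i C_{a,i} C_{b,i} C_{c,i}$; that definition belongs to Appendix B and is not reproduced in this section, so it is an assumption here. The names `coefficients`, `C3` and `projectors` are hypothetical.

```python
import numpy as np

def coefficients(D):
    """Array C with C[A, i] = (E_A, e_i)."""
    C = np.zeros((D, D))
    C[0] = 1.0 / np.sqrt(D)
    for a in range(1, D):
        N_a = 1.0 / np.sqrt(a * (a + 1))
        C[a, :a] = N_a
        C[a, a] = -a * N_a
    return C

D = 5
n = D - 1
C_H = coefficients(D)[1:]

# Assumed definition: C_{a,b,c} = sum_i C_{a,i} C_{b,i} C_{c,i}.
C3 = np.einsum("ai,bi,ci->abc", C_H, C_H, C_H)
cg_normalised = bool(
    np.allclose(np.einsum("abe,abf->ef", C3, C3), (D - 2) / D * np.eye(n))
)

# Operators on V_H x V_H, stored as (n*n) x (n*n) matrices with rows (a,b), columns (c,d).
I = np.eye(n * n)
delta = np.eye(n)
swap = np.einsum("ad,bc->abcd", delta, delta).reshape(n * n, n * n)              # (2.58)
P_0 = np.einsum("ab,cd->abcd", delta, delta).reshape(n * n, n * n) / (D - 1)     # (2.56)
P_H = D / (D - 2) * np.einsum("abe,cde->abcd", C3, C3).reshape(n * n, n * n)     # (2.55)
P_2 = (I - P_H - P_0) @ (I + swap) / 2                                           # (2.57)
P_3 = (I - swap) / 2                                                             # (2.60)

projectors = {"V0": P_0, "VH": P_H, "V2": P_2, "V3": P_3}
idempotent = all(np.allclose(P @ P, P) for P in projectors.values())
complete = bool(np.allclose(sum(projectors.values()), I))
traces = {name: int(round(np.trace(P))) for name, P in projectors.items()}
expected = {"V0": 1, "VH": D - 1, "V2": D * (D - 3) // 2, "V3": (D - 1) * (D - 2) // 2}
```

The traces of the four projectors reproduce the dimensions of the irreps in (2.38), and `cg_normalised` checks that the factor $\sqrt{D/(D-2)}$ in (2.49) gives the orthonormality (2.48) under the assumed definition.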

### 2.2 Representation theoretic description of quadratic invariants

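With the above background of facts from representation theory at hand, we can give a useful description of quadratic invariants. Quadratic invariant functions of $M_{ij}$ form the invariant subspace of $\mathrm{Sym}^2(V_D \otimes V_D)$, since the $M_{ij}$ transform as $V_D \otimes V_D$.

$$\begin{aligned}
(V_D \otimes V_D) &= (V_0 \oplus V_H) \otimes (V_0 \oplus V_H) && (2.95)\\
&= \big( V^{0,0}_0 \oplus V^{0,H}_H \oplus V^{H,0}_H \oplus V^{H,H}_0 \oplus V^{H,H}_H \oplus V^{H,H}_{V_2} \oplus V^{H,H}_{V_3} \big) && (2.96)
\end{aligned}$$

So there are two copies of $V_0$, namely $V^{0,0}_0$ and $V^{H,H}_0$. $\mathrm{Sym}^2(V^{0,0}_0 \oplus V^{H,H}_0)$ contains three invariants.

$$\begin{aligned}
(S^{00})^2 &= (S^{V_0;1})^2 && (2.97)\\
(S^{00} S^{HH \to 0}) &= S^{V_0;2} S^{V_0;1} = S^{V_0;1} S^{V_0;2} && (2.98)\\
(S^{HH \to 0})^2 &= (S^{V_0;2})^2 && (2.99)
\end{aligned}$$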

These are all easy to write in terms of the original matrix variables (using the formulae for $S$-variables in terms of