Construction Methods for Gaussoids

02/28/2019 ∙ by Tobias Boege, et al. ∙ Otto-von-Guericke-University Magdeburg

The number of n-gaussoids is shown to be a double exponential function in n. The necessary bounds are achieved by studying construction methods for gaussoids that rely on prescribing 3-minors and encoding the resulting combinatorial constraints in a suitable transitive graph. Various special classes of gaussoids arise from restricting the allowed 3-minors.

1. Introduction

Gaussoids are combinatorial structures that encode independence among Gaussian random variables, similar to how matroids encode independence in linear algebra. They fall into the larger class of CI structures, which are arbitrary sets of conditional independence statements. The work of Fero Matúš is in particular concerned with special CI structures such as graphoids, pseudographoids, semigraphoids, separation graphoids, etc. In his works Fero Matúš followed the idea that conditional independence can be abstracted away from concrete random variables to yield a combinatorial theory. This should happen in the same manner as matroid theory abstracts away the coefficients from linear algebra. His work [Mat97] on minors of CI structures displays the inspiration from matroid theory very clearly.

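In 2007, Lněnička and Matúš defined gaussoids [LM07] of dimension $n$ as sets of symbols $(ij|K)$, denoting conditional independence statements, which satisfy the following Boolean formulas, called the gaussoid axioms:

(G1) $(ij|L) \wedge (ik|jL) \Rightarrow (ik|L) \wedge (ij|kL)$
(G2) $(ij|kL) \wedge (ik|jL) \Rightarrow (ij|L) \wedge (ik|L)$
(G3) $(ij|L) \wedge (ik|L) \Rightarrow (ij|kL) \wedge (ik|jL)$
(G4) $(ij|L) \wedge (ij|kL) \Rightarrow (ik|L) \vee (jk|L)$

for all distinct $i, j, k \in [n]$ and $L \subseteq [n] \setminus ijk$. Here and in the following, we use the efficient “Matúš set notation” where union is written as concatenation and singletons are written without curly braces. For example, $ijL$ is shorthand for $\{i\} \cup \{j\} \cup L$.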
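The axioms can be checked mechanically. The following is a minimal sketch, assuming the usual form of (G1) to (G4) from [LM07] as spelled out in the comments; the function names `statements`, `is_gaussoid` and `count_gaussoids` are illustrative and not taken from the paper. A statement $(ij|K)$ is stored as a pair of frozensets.

```python
from itertools import combinations, permutations


def st(i, j, K):
    """The symbol (ij|K), stored as a pair of frozensets."""
    return (frozenset((i, j)), frozenset(K))


def statements(n):
    """All symbols (ij|K) with i != j and K disjoint from {i, j}."""
    ground = range(1, n + 1)
    result = []
    for i, j in combinations(ground, 2):
        rest = [x for x in ground if x not in (i, j)]
        for r in range(len(rest) + 1):
            for K in combinations(rest, r):
                result.append(st(i, j, K))
    return result


def is_gaussoid(G, n):
    """Check (G1)-(G4) for all distinct i, j, k and all L avoiding them."""
    ground = range(1, n + 1)
    for i, j, k in permutations(ground, 3):
        rest = [x for x in ground if x not in (i, j, k)]
        for r in range(len(rest) + 1):
            for L in map(frozenset, combinations(rest, r)):
                ij_L, ij_kL = st(i, j, L) in G, st(i, j, L | {k}) in G
                ik_L, ik_jL = st(i, k, L) in G, st(i, k, L | {j}) in G
                jk_L = st(j, k, L) in G
                # (G1) (ij|L) and (ik|jL) imply (ik|L) and (ij|kL)
                if ij_L and ik_jL and not (ik_L and ij_kL):
                    return False
                # (G2) (ij|kL) and (ik|jL) imply (ij|L) and (ik|L)
                if ij_kL and ik_jL and not (ij_L and ik_L):
                    return False
                # (G3) (ij|L) and (ik|L) imply (ij|kL) and (ik|jL)
                if ij_L and ik_L and not (ij_kL and ik_jL):
                    return False
                # (G4) (ij|L) and (ij|kL) imply (ik|L) or (jk|L)
                if ij_L and ij_kL and not (ik_L or jk_L):
                    return False
    return True


def count_gaussoids(n):
    """Brute force over all subsets; only feasible for very small n."""
    S = statements(n)
    total = 0
    for mask in range(1 << len(S)):
        G = {s for b, s in enumerate(S) if (mask >> b) & 1}
        total += is_gaussoid(G, n)
    return total


print(count_gaussoids(3))  # 11
```

For $n = 3$ the search space has only $2^6$ subsets, and the count agrees with the eleven $3$-gaussoids mentioned later in the text. The same brute force is already impractical for $n = 4$ in pure Python, which is one reason the text turns to SAT solvers for enumeration.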

A gaussoid is realizable if its elements are exactly the conditional independence statements that are valid for some $n$-variate normal distribution. Realizability was characterized for $n = 4$ in [LM07] and a characterization for $n = 5$ is open. There is no general forbidden minor characterization for realizability of gaussoids [Šim06, Sul09]. We therefore think about gaussoids as synthetic conditional independence in the sense of Felix Klein [Kle16, Chapter V]. This view is inspired by the parallels to matroid theory. The algebra and geometry of gaussoids was developed with this in mind in [BDKS17]. Gaussoids are also the singleton-transitive compositional graphoids according to [Sad17, Section 2.3].

In the present paper we view gaussoids as structured subsets of $2$-faces of an $n$-cube. This readily simplifies the definition of a gaussoid, but it has several additional advantages. For example, it makes the formation of minors more effective, as this now corresponds to restricting to faces of the cube. To start, consider the usual 3-dimensional cube. A knee in the cube consists of two squares that share an edge. A belt consists of all but two opposing squares of the cube. The following combinatorial definition of a gaussoid can be confirmed (for example by examining Figure 2) to agree with the gaussoid axioms.

Definition 1.1.

An $n$-gaussoid is a set $\mathcal{G}$ of $2$-faces of the $n$-cube such that for any $3$-face $C$ of the $n$-cube it holds:

  1. If $\mathcal{G}$ contains a knee of $C$, then it also contains a belt that contains that knee.

  2. If $\mathcal{G}$ contains two opposing faces of $C$, then it also contains a belt that contains these two faces.

The dimension of the ambient cube is also the dimension of . is the set of -dimensional gaussoids and the set of all gaussoids.

(G1)–(G3): Any knee in the cube is completed to the unique belt which contains it.
(G4): Two opposite squares are completed to (at least) one of the two belts which contain them.
Figure 1. The gaussoid axioms in the $3$-cube. Premises of the axioms are colored in purple, possible conclusions in different shades of green. The pictures encode the gaussoid axioms mod $B_3$, the symmetry group of the $3$-cube.

This definition is illustrated in Figure 1. As with the gaussoid axioms, this definition applies certain closure rules in every $3$-face of the $n$-cube, but whereas $S_n$ acts on the axes of the cube in the gaussoid axioms, the group acting on the two pictures in Figure 1 is the full symmetry group of the $3$-cube, $B_3$. This bigger group conflates the first three axioms into the first picture.

The gaussoid axioms and also Definition 1.1 only work with $3$-cubes. This locality can be expressed as in Lemma 3.3: For any $k \geq 3$, being an $n$-gaussoid is equivalent to all restrictions to $k$-faces being $k$-gaussoids. The aim of this work is to explore gaussoid puzzling, the reversal of this idea, that is, constructing $n$-gaussoids by prescribing their $3$-minors. The implementation hinges on an understanding of how exactly the $3$-faces of the $n$-cube intersect, because these intersections are obstructions to the free specification of $3$-gaussoids. In Section 3 we encode these obstructions in a graph and then Brooks’ theorem gives access to large independent sets, where gaussoids can be freely placed. This yields a good estimate of the number of gaussoids in Theorem 3.12.

In Section 4 we explore classes of special gaussoids that arise by restricting the puzzling of $3$-gaussoids to subsets of the possibilities. Several of these classes have nice interpretations and can be matched to combinatorial objects.

Acknowledgement

The authors are supported by the Deutsche Forschungsgemeinschaft (314838170, GRK 2297, “MathCoRe”).

2. The cube

Consider the face lattice of the $n$-cube. This lattice contains $\emptyset$, the unique face of dimension $-1$. To specify a face of non-negative dimension $k$, one needs to specify the $k$ dimensions in which the face extends, and then the location of the face in the remaining $n-k$ dimensions. We employ two natural ways to work with faces. The first is string notation. In this notation a face is an element of $\{0, 1, *\}^n$ where the $*$s indicate dimensions in which the face extends and the remaining binary string determines the location; a $1$ at position $i$ means that the face is translated along the $i$-th axis inside the cube. This string notation naturally extends the binary string notation for the vertices of the $n$-cube: the vertices of a face are exactly the binary strings obtained by substituting $0$ or $1$ for each of its $*$s.

The second choice is set notation. In this notation, a face of dimension is specified by two sets and , where and .

The set of -faces of the -cube is . As in [BDKS17], the squares of the -cube are denoted by . Of special interest in this article are also the -cubes . The constructions in Section 3 based on Lemma 3.3 frequently exploit the following

Fact.

For , a -face shares at most squares with an -face or is already included in it. In particular for , if a cube shares more than a single square with an -face, then it is already contained in it.
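String notation makes statements like this Fact easy to test by brute force. The sketch below is an illustration under that encoding, not code from the paper; the helper names `faces`, `vertices`, `meet`, `dim`, `contains` and `shared_squares` are hypothetical. It enumerates faces as strings over `0`, `1`, `*` and counts the squares that two faces have in common.

```python
from itertools import combinations, product


def faces(n, k):
    """All k-faces of the n-cube as strings over '0', '1', '*'."""
    result = []
    for stars in combinations(range(n), k):
        for bits in product("01", repeat=n - k):
            it = iter(bits)
            result.append("".join("*" if p in stars else next(it) for p in range(n)))
    return result


def vertices(face):
    """Binary strings obtained by filling every '*' with 0 or 1."""
    options = [("0", "1") if c == "*" else (c,) for c in face]
    return ["".join(v) for v in product(*options)]


def meet(f, g):
    """Intersection of two faces, or None if they are disjoint."""
    out = []
    for a, b in zip(f, g):
        if a == "*":
            out.append(b)
        elif b == "*" or a == b:
            out.append(a)
        else:
            return None
    return "".join(out)


def dim(face):
    """Dimension of a face; the empty face (None) has dimension -1."""
    return -1 if face is None else face.count("*")


def contains(f, g):
    """True if the face g lies inside the face f."""
    return meet(f, g) == g


def shared_squares(f, g):
    """All 2-faces contained in both f and g."""
    m = meet(f, g)
    if dim(m) < 2:
        return []
    return [s for s in faces(len(f), 2) if contains(m, s)]


print(vertices("0*1*"))  # ['0010', '0011', '0110', '0111']
```

Two faces intersect in a face, so the shared squares are exactly the squares of the intersection. For a $3$-face this intersection is either the whole $3$-face or has dimension at most $2$, which is the mechanism behind the "at most a single square" part of the Fact.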

Minors are important in matroid theory and gaussoid theory. When a simple matroid is represented as the geometric lattice of its flats, a minor corresponds to an interval of the lattice [Wel10, Theorem 4.4.3], which is again a geometric lattice. For gaussoid minors the lattice is replaced by the set of squares in the hypercube and the lattice intervals are replaced by hypercube faces.

Minors for arbitrary CI structures have been studied for example in [Mat97]. There, a minor of a CI structure is obtained by choosing two disjoint sets and performing restriction to followed by contraction by , which are in symbols:

In [BDKS17], minors were also defined specifically for gaussoids using statistical terminology with an emphasis on the parallels to matroid theory. A minor is every set of squares arising from a gaussoid via any sequence of marginalization and conditioning:

These operations are dual to the ones defined by Matúš: and . Furthermore, either operation can be the identity, and , and finally, the two sets and in Matúš’ definition of minor can be decoupled: . Thus both notions of minor coincide.

Our aim is to provide a geometric intuition for the act of taking a gaussoid minor. A face of the -cube is canonically isomorphic to the -cube by deleting from the -cube all coordinates outside of . This deletion is a lattice isomorphism , with the face lattice of an -dimensional cube. We can interpret taking the minor as an operation in the hypercube.

Proposition 2.1.

Let , then .

Proof.

Take . Then and can be seen as subsets of and they satisfy and . From this it is immediate that and . Furthermore, , hence and .

In the other direction, suppose that and let be its preimage under . Then and it follows , and also because . Thus decomposes into where naturally . This proves that . ∎

Proposition 2.1 compactly encodes the definitions of minor. The following definition introduces notation reflecting this as well as an opposite embedding, which mounts a set of squares from the -cube into an -dimensional face of a higher hypercube.

Definition 2.2.
(1) For a set and , the -minor of is the set . A -minor is an -minor with . (2) For a set and , the embedding of into is the preimage .
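In string notation both operations of this definition reduce to deleting or re-inserting the fixed coordinates of a face. The following is a minimal sketch under that encoding; the names `minor` and `embed` are chosen here for illustration, and squares are strings with exactly two `*`s.

```python
def inside(square, face):
    """True if the square lies in the face (same length strings)."""
    return all(b == "*" or a == b for a, b in zip(square, face))


def minor(G, face):
    """Restrict G to the squares inside `face`, then delete the fixed coordinates."""
    keep = [p for p, c in enumerate(face) if c == "*"]
    return {"".join(s[p] for p in keep) for s in G if inside(s, face)}


def embed(H, face):
    """Mount squares of the k-cube into a k-face by re-inserting its fixed coordinates."""
    result = set()
    for s in H:
        it = iter(s)
        result.add("".join(next(it) if c == "*" else c for c in face))
    return result


G = {"**00", "*0*1", "1**1"}
print(sorted(minor(G, "***1")))  # ['*0*', '1**']
```

Taking the minor of an embedding returns the original set, while embedding a minor only recovers the squares of the larger set that lie inside the chosen face.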

3. Gaussoid puzzles

Several theorems in matroid theory concern the (impossibility of a) characterization of classes of matroids in terms of forbidden and compulsory minors. For CI structures such as gaussoids the definitions read as follows.

Definition 3.1.
  1. A class of sets of squares is minor-closed if with all minors of belong to .

  2. A set of squares is a forbidden minor for a minor-closed class if it is minimal with the property that it does not belong to , in the sense that all its proper minors do belong to .

  3. If there is a forbidden -minor for some , then all non-forbidden -minors are called compulsory -minors for the class .

It is easy to see that gaussoids are minor-closed, i.e. any $k$-minor of an $n$-gaussoid is always a $k$-gaussoid. But even more is true: given any set of squares in the $n$-cube, if all of its $k$-minors, for any $k \geq 3$, are $k$-gaussoids, then the whole set is an $n$-gaussoid. This claim is proved in Lemma 3.3. The present section uses this property to construct gaussoids by prescribing their $3$-minors. Section 4 investigates subclasses of gaussoids which have the same anatomy. We formalize this property in

Definition 3.2.

A class of sets of squares stratified by dimension, i.e. , has a puzzle property if it is minor-closed and its -th stratum is generated via embeddings from the strata below , i.e. if for some all its -minors, , are in , then already . The lowest stratum is the basis of and the puzzle property is based in dimension .

Lemma 3.3.

The set of gaussoids has a puzzle property based in dimension $3$, whose basis consists of the eleven $3$-gaussoids.

Proof.

Let and . We show that is an -gaussoid if and only if is a -gaussoid for every . First consider the case . The gaussoid axioms are quantified over arbitrary cubes together with an order on the set , and each axiom refers to squares inside the cube only. Confined to this cube, the axioms state precisely that this -minor is a -gaussoid. The case of is reduced to the statement for . Indeed, all -minors of are gaussoids if and only if all -minors of -minors of are gaussoids, because those two collections of minors both arise from the same set of cubes of the -cube. ∎

Turning Definition 3.2 upside down, the construction of an -gaussoid can be seen as a high-dimensional jigsaw puzzle. The puzzle pieces are lower-dimensional gaussoids which are to be embedded into faces of the -cube. The difficulty comes from the fact that every square is shared by -faces. The minors must be chosen so that all of them agree on whether a shared square is an element of the -gaussoid under construction or not. The incidence structure of -faces in the -cube is important. We study it via the following graph.

Definition 3.4.

Let , for , be the undirected simple graph with vertex set and an edge between if and only if there is a -face such that and .

The idea behind this definition is that for suitable choices of and , the faces indexed by an independent set in these graphs will be just far enough away from each other in the -cube to allow free puzzling of -gaussoids without one minor choice creating constraints for other minors.
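For small parameters the graph can be built directly from the definition. The sketch below assumes the reading suggested by the proof of Lemma 3.6: the vertices are the $k$-faces of the $n$-cube, and two of them are adjacent when some $p$-face meets each of them in dimension at least $q$. The parameter names `n, k, p, q` and the function name `cube_graph` are assumptions made for this illustration.

```python
from itertools import combinations, product


def faces(n, k):
    """All k-faces of the n-cube as strings over '0', '1', '*'."""
    result = []
    for stars in combinations(range(n), k):
        for bits in product("01", repeat=n - k):
            it = iter(bits)
            result.append("".join("*" if pos in stars else next(it) for pos in range(n)))
    return result


def meet_dim(f, g):
    """Dimension of the intersection of two faces, or -1 if they are disjoint."""
    d = 0
    for a, b in zip(f, g):
        if a == "*" and b == "*":
            d += 1
        elif a != "*" and b != "*" and a != b:
            return -1
    return d


def cube_graph(n, k, p, q):
    """Adjacency sets: k-faces joined when one p-face meets both in dimension >= q."""
    V = faces(n, k)
    adj = {v: set() for v in V}
    for S in faces(n, p):
        near = [v for v in V if meet_dim(S, v) >= q]
        for a, b in combinations(near, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


g = cube_graph(5, 3, 3, 2)
print(len(g), {len(nbrs) for nbrs in g.values()})
```

With $k = 3$, $p = 2$, $q = 2$ two cubes are adjacent exactly when they share a square, and with $p = 3$ they are adjacent when a third cube shares a square with both. In either case the printed set of degrees has a single element, in line with the regularity asserted in Theorem 3.5.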

Theorem 3.5.

The graph is transitive, hence regular. It is complete if and only if . The degree of any vertex can be calculated as follows:

where the sum extends over pairs which satisfy the feasibility and connectivity conditions

()
Proof.

The symmetry group acts on the -cube as automorphisms of the face lattice. The group action is transitive on -faces for any and respects meet and join. Therefore acts transitively on the graph .

The characterization of completeness rests on Lemma 3.6. Using the gap function defined there, it is shown that is equivalent to the adjacency of and in and that if is a face with smaller gap, then is adjacent to . Since is regular, it is complete if and only if some vertex is adjacent to all others. For that to happen, the vertex must be adjacent to one which has the largest gap to it. As shown in the lemma, the maximum of is and hence completeness is equivalent to .

The exact degree also follows from Lemma 3.6. Fix any vertex of . By regularity it suffices to count the adjacent vertices of . We subdivide vertices according to two parameters: is a disagreement between and and is the number of common dimensions of and . A priori, ranges in and ranges in , but not all combinations allow to be a -face adjacent to . First, we determine the pairs for which an adjacent -face exists and then count how many of them exist for fixed parameters. Let . For it must hold that , since and are -faces. Assuming this, can be constructed if and only if the dimensions in leave enough space to create the prescribed disagreement of size . As an inequality this is , or . Together with , this inequality already entails the condition imposed by the choice of . Thus it is sufficient to require , which is the first condition in (3.5). Given a -face with parameters and , the existence of an edge between and in imposes the condition Lemma 3.6 (1), which is the right half of (3.5).

As for the counting, let be a fixed -face and let satisfy (3.5). We count the -faces with parameters and . There are ways to place the for . On , there are independent choices from . The choices so far fix in . There are now choices for the remaining s in . Then is fixed. Now to finish , we may only place and in where has only s and s as well. Among the remaining positions, a set of size must be chosen, where is already determined by the condition that it differs from . On the remaining positions, is determined by not differing from . The feasibility of all the choices enumerated so far is guaranteed by (3.5). The tally is

Since is not adjacent to itself, which is uniquely described by the feasible parameters and , subtracting concludes the proof. ∎

Lemma 3.6.

Let , be -faces and , with and . The following hold:

  1. if and only if and are adjacent in ,

  2. the range of is ,

  3. is strictly isotone with respect to , i.e. ,

  4. for with , if and are adjacent in , then so are and .

Proof.

Given two -faces and , the ground set splits into three sets: (i) of cardinality where both have $0$ and $1$ symbols only but differ, (ii) of cardinality of shared $*$ symbols, and (iii) everything else, i.e. positions where $0$ and $1$ patterns agree or where $0$ and $1$ are in one face and $*$ in the other. In order to connect two -faces in , there needs to be a -face which intersects each of them in at least dimension . Such a face has to cover the set of size with $*$, as otherwise it will not intersect both faces. Conversely, once is covered, a -dimensional intersection with both faces is ensured by placing $0$s and $1$s appropriately. To achieve a -dimensional intersection, have to be placed on and each. By using the shared $*$s, one needs at least further to construct a connecting -face. Thus is the minimum dimension necessary to connect and in . This proves claim (1).

It is clear that is minimal when is minimal and is maximal. This can be achieved simultaneously by choosing and there . Now consider the opposing face of . The gap is assuming is a vertex of where in particular . Increasing this value would require reducing since is already maximal. Un-sharing s with consumes positions inside the block of s and s in of size which reduces by an equal amount. Hence is maximal. Furthermore, by varying but keeping , all values in the range can be attained, proving claim (2).

Claim (3) follows from a straightforward calculation:

In the situation of claim (4), since and are adjacent in , we have by (1). Applying this property in reverse proves the claim. ∎

Corollary 3.7.
  1. is complete for . Otherwise its degree is .

  2. is complete for . Otherwise its degree is . ∎

Remark 3.8.

For the theory of gaussoids, the cases are relevant. We consider it an interesting problem to study growth of the degree formula for other parameters. Certainly the graph can be complete, where the degree is as large as . To construct large independent sets, one wants smaller degrees. It is proved below that a maximal independent set in has cardinality in of which one inequality follows from the degree formula.

Proposition 3.9.

Let be an independent set in , then the following inequality holds: .

Proof.

Let . Since is independent, there is no -cube sharing a square with and with . Since , also and share no square. Thus an assignment of -gaussoids lifts to a well-defined set of squares . The map is injective.

To see that is a gaussoid, we examine its -minors. Let be arbitrary. In case is fully contained in some , then clearly since . Otherwise can share at most one square with any face in . If it shares no square with any element of , then is empty, hence a gaussoid. If it shares a square with some face in , it cannot share a square with any other element of because is an independent set in . In this case, is a singleton and hence a gaussoid. ∎

Proposition 3.10.

Let be an independent set in and the maximum size of a set of mutually range-disjoint injections of into . Then .

Proof.

The proof is analogous to Proposition 3.9 but uses the independent set to perturb any gaussoid injectively into non-gaussoids. Again, since and is independent, an assignment lifts uniquely via to a subset of . Let be a set of range-disjoint injections as in the claim. Consider the maps . To each associate .

Because the ranges of the are disjoint, the map is injective. None of the sets is a gaussoid since any certifies . ∎

Remark 3.11.

The proofs of Propositions 3.9 and 3.10 exploit two properties of the class of gaussoids: (1) it has a puzzle property, and (2) the empty set and all singletons are in its basis. The same technique does not work for realizable gaussoids because they lack property (1) and not for graphical gaussoids (see Section 4) because they lack property (2). Indeed their numbers can be shown to be single exponential. For realizable gaussoids, this follows from Nelson’s recent breakthrough: If a gaussoid is realizable with a positive-definite covariance matrix , then the matrix

both defines a vector matroid and identifies the gaussoid. By [Nel18, Theorem 1.1] there are only exponentially many realizable matroids and thus realizable gaussoids. Nelson’s bound features a cubic polynomial in the exponent, while there are certainly realizable gaussoids coming from graphical models.

To get explicit bounds we apply the propositions for . To find suitable independent sets in and we use Brooks’ Theorem [Lov75] and the degree bounds from Corollary 3.7. Since the graphs are connected, have degree at least but are not complete, there exists a proper -coloring of , and we can pick a color class as an independent set . Its size is at least that of an average color class:

For , we find analogously

Proposition 3.9 now shows, using and , that there are at least -gaussoids. Similarly, Proposition 3.10 with gives an upper bound on the ratio of -gaussoids of . We have proved

Theorem 3.12.

For , the number of -gaussoids is bounded by

Remark 3.13.

A simple way to obtain a weaker double exponential lower bound for the number of gaussoids was suggested to us by Peter Nelson, following a matroid construction of Ingleton and Piff. Let be the set of all -subsets of for some . Every defines a -face of the -cube, where are the minimal elements of . Any subset of is a gaussoid. The axioms (G1) and (G4) are satisfied because their premises contain sets of different sizes. The axioms (G2) and (G3) are satisfied because their premises correspond to the same and thus only one of them can be in. With there are at least gaussoids.

Substituting in Theorem 3.12 gives an interval for the absolute number of -gaussoids for . It shows .

We conclude this section by showing that the linear order lower bound is the best that the independent set construction in can do. The independence number of a graph is the maximal size of an independent set in . Similarly, the clique number is the maximal size of a clique in . Since is transitive, the following inequality holds [GR01, Lemma 7.2.2]:

Since , it suffices to find a clique of size in every . Take the set of cubes . This set has cardinality and any two elements , in it are connected by an edge in , since with and .

4. Special gaussoids

Because of their puzzle property, gaussoids are the largest class of CI structures whose $3$-minors are $3$-gaussoids. The base case of this definition is formed by the eleven $3$-gaussoids arising from $3 \times 3$ covariance matrices of Gaussian distributions. The $3$-gaussoids split into five symmetry classes modulo $B_3$, which we denote by letters E, L, U, B, and F. They are depicted in Figure 2.






Figure 2. The eleven $3$-gaussoids in five symmetry classes mod $B_3$, organized in columns E, L, U, B, F. From left to right: the empty gaussoid E, the lower singletons L, the upper singletons U, the belts B and the full gaussoid F.

The special -invariant types of gaussoids in this section arise from choosing subsets of these five symmetry classes to base a puzzle property on. Each of the 32 sets of bases can be converted into axioms in the -cube similar to the gaussoid axioms (G1)—(G4). SAT solvers [Thu06, TS16] were used on the resulting Boolean formulas to enumerate or count these classes. The listings can be found on our supplementary website gaussoids.de. For nine classes an entry in the OEIS [OEI19] could be found. Table 1 is the main result of this section. It summarizes the different types of gaussoids that arise from the different bases.

Name | Count in dim. $3, 4, 5, \dots$ | OEIS | Interpretation
Fast-growing
ELUBF | 11, 679, 60 212 776 | | Gaussoids
ELUB | 10, 640, 59 348 930 | |
ELUF | 8, 522, 48 633 672 | |
ELU | 7, 513, 47 867 881 | | Required for Prop. 3.9
Incompatible
LUB | 9, 111, 0, 0 | | Vanishes for $n \geq 5$
LUF | 7, 61, 1, 1 | | Only F for $n \geq 5$
LU | 6, 60, 0, 0 | | Vanishes for $n \geq 5$
{L,U}B | 6, 15, 0, 0 | | Vanishes for $n \geq 5$
{L,U}F | 4, 1, 1, 1 | | Only F for $n \geq 4$
EF | 2, 2, 2, 2 | A007395 | Only E or F for all $n$
Graphical
E{L,U}BF | 8, 64, 1 024, 32 768, 2 097 152 | A006125 | Undirected simple graphs
E{L,U}B | 7, 41, 388, 5 789, 133 501 | A213434 | Graphs without $3$-cycles
{L,U}BF | 7, 34, 206, 1 486, 12 412 | A011800 | Forests of paths on $[n]$
E{L,U}F, EBF | 5, 15, 52, 203, 877, 4 140 | A000110 | Partitions of $[n]$
E{L,U}, BF | 4, 10, 26, 76, 232, 764, 2 620 | A000085 | Involutions on $[n]$
EB | 4, 8, 16, 32, 64, 128, 256 | A000079 | Subsets of $[n-1]$
Exceptional
LUBF | 10, 142, 1 166, 12 796, 183 772, 3 221 660 | |
Table 1. 26 classes of special gaussoids categorized into four types. The remaining six classes are described by one or zero letters of ELUBF and belong to the Incompatible type, as each of them is a subclass of a class found to be Incompatible.

The classes E, B and F are themselves closed under duality, while L and U are interchanged by it. It follows that one of the 32 classes is invariant under duality if it contains either none of L and U or both of them. On the remaining classes, duality acts by swapping L with U. The combinatorial properties of the classes, e.g. the size, are unaffected by this action, hence LB and UB are conflated to {L,U}B in Table 1.

4.1. Fast-growing gaussoids

By Remark 3.11, the construction of doubly exponentially many members of a class of gaussoids requires that the class has a puzzle property and that its basis includes ELU. This explains the rapid growth of all four classes of this type.

4.2. Incompatible minors

As a consequence of Definition 3.2, if there is no gaussoid of dimension in a class, there are no gaussoids of any dimension in the class. Similarly, if the class contains only the empty or full gaussoid in dimension , the members of dimension are the empty or full gaussoid as well. Hence computations in small dimension suffice to explain these classes. Despite their simplicity, each of them provides higher compatibility axioms. For example the annihilation of LUB in dimension implies that every -minor of a gaussoid contains an empty or a full -minor. Or: a graphical -gaussoid with no belts is full or contains an empty -minor.

4.3. Graphical gaussoids

Each undirected simple graph defines a CI structure , where two vertices and are separated by a set if every path between and intersects . These are the separation graphoids of [Mat97]. They fulfill a localized version of the global Markov property. According to [LM07, Remark 2], separation graphoids are exactly the gaussoids satisfying the ascension axiom:

(A) $(ij|L) \Rightarrow (ij|kL)$

Therefore we refer to them as ascending gaussoids. The operation is a bijection whose inverse recovers the graph via its edges , where abbreviates . Any gaussoid in this section is of the form for some undirected simple graph .
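The passage from a graph to its separation graphoid and back can be traced on small examples. The following is a minimal sketch, assuming vertices labelled $1, \dots, n$, statements stored as pairs of frozensets, and ascension read as "$(ij|L)$ implies $(ij|kL)$"; all function names are chosen for this illustration only.

```python
from itertools import combinations


def separated(n, edges, i, j, K):
    """True if every path from i to j passes through K (i and j not in K)."""
    adj = {v: set() for v in range(1, n + 1)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, stack = {i}, [i]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in seen and w not in K:
                seen.add(w)
                stack.append(w)
    return j not in seen


def separation_graphoid(n, edges):
    """All (ij|K) such that K separates i from j in the graph."""
    ground = range(1, n + 1)
    result = set()
    for i, j in combinations(ground, 2):
        rest = [x for x in ground if x not in (i, j)]
        for r in range(len(rest) + 1):
            for K in combinations(rest, r):
                if separated(n, edges, i, j, set(K)):
                    result.add((frozenset((i, j)), frozenset(K)))
    return result


def is_ascending(G, n):
    """Assumed form of (A): (ij|L) in G implies (ij|kL) in G for every admissible k."""
    for pair, L in G:
        for k in range(1, n + 1):
            if k not in pair and k not in L and (pair, L | {k}) not in G:
                return False
    return True


def recover_edges(G, n):
    """ij is an edge iff i and j are not separated by all remaining vertices."""
    ground = frozenset(range(1, n + 1))
    pairs = map(frozenset, combinations(sorted(ground), 2))
    return {pair for pair in pairs if (pair, ground - pair) not in G}


cycle = [(1, 2), (2, 3), (3, 4), (4, 1)]
G = separation_graphoid(4, cycle)
print(len(G), is_ascending(G, 4))  # 2 True
```

For the $4$-cycle only the two diagonals are ever separated, each by the other diagonal, and `recover_edges` returns the four cycle edges. Enlarging a separating set keeps it separating, which is why every separation graphoid passes the ascension check.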

Since (A) uses only $2$-faces of a single $3$-face of the $n$-cube, being an ascending gaussoid is a puzzle property based in dimension $3$. Its basis consists of the ascending $3$-gaussoids. This was shown by Matúš [Mat97, Proposition 2] and in our terminology it can be restated as follows

Lemma 4.1.

A gaussoid is ascending if and only if L is a forbidden minor. ∎

This shows that EUBF are the ascending gaussoids. Their duals are ELBF and it is easy to see that their axiomatization replaces (A) by the descension axiom

(D) $(ij|kL) \Rightarrow (ij|L)$

EUBF-gaussoids arise from undirected graphs via vertex separation, i.e.  if and only if and are in different connected components of . Their duals contain if and only if and are in different connected components in the induced subgraph on . Therefore we call elements of graphical gaussoids. For our classification purposes it is sufficient to study the “Upper” half of dual pairs.

Our technique to understand EUBF and its subclasses has already been used in [Mat97]: since the presence of an edge in is encoded by the non-containment , the compulsory minors of of the form prescribe induced subgraphs on vertex triples . In the opposite direction, however, the induced -subgraphs of a graph do not in general reveal the types of all minors in its corresponding gaussoid.

Example 4.2.

Consider the cycle

corresponding to the gaussoid . Its -minors are exclusively E and U. The U minors arise precisely in the -cubes

All other -minors are E. This means that the -cycle is contained in EUBF, EUB, and EU. To match with Table 1, check that the -cycle has no induced -cycle, corresponds to the partition of , and the involution .

This graph shows that the class of a gaussoid cannot be determined by looking only at the induced subgraphs of . All -minors observable from induced subgraphs are U, but the smallest class to which this gaussoid belongs is EU.

Example 4.3.

Consider the star

with interior node and leaves . It corresponds to the gaussoid

Because the right-hand side of every element of the gaussoid contains , this gaussoid has the minor F in , E in the opposite face and U everywhere else.
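The minor types in Examples 4.2 and 4.3 can be recomputed in string notation. The sketch below assumes the $4$-cycle $1$-$2$-$3$-$4$ with gaussoid $\{(13|24), (24|13)\}$ and a star on four vertices with centre $1$; the vertex labels and the helper names are illustrative assumptions. A square $(ij|K)$ is written with `*` at $i, j$, `1` at the positions in $K$ and `0` elsewhere.

```python
from collections import Counter
from itertools import combinations, product


def faces(n, k):
    """All k-faces of the n-cube as strings over '0', '1', '*'."""
    result = []
    for stars in combinations(range(n), k):
        for bits in product("01", repeat=n - k):
            it = iter(bits)
            result.append("".join("*" if p in stars else next(it) for p in range(n)))
    return result


def minor(G, face):
    """Squares of G inside `face`, with the fixed coordinates of `face` deleted."""
    keep = [p for p, c in enumerate(face) if c == "*"]
    inside = (s for s in G if all(b == "*" or a == b for a, b in zip(s, face)))
    return {"".join(s[p] for p in keep) for s in inside}


def minor_type(M):
    """Symmetry class of M, assuming M is one of the eleven 3-gaussoids."""
    if len(M) == 0:
        return "E"
    if len(M) == 6:
        return "F"
    if len(M) == 4:
        return "B"
    if len(M) == 1:
        # upper singletons are conditioned on the third element, lower ones are not
        return "U" if "1" in next(iter(M)) else "L"
    raise ValueError("not a 3-gaussoid")


def minor_types(G, n):
    """Tally of the types of all 3-minors of G."""
    return Counter(minor_type(minor(G, f)) for f in faces(n, 3))


cycle = {"*1*1", "1*1*"}  # (13|24) and (24|13)
star = {"1**0", "1**1", "1*0*", "1*1*", "10**", "11**"}  # all (ij|K) with 1 in K

print(minor_types(cycle, 4))
print(minor_types(star, 4))
```

Under these assumptions the cycle has four U and four E minors, and the star has one F, one E and six U minors, matching the descriptions in the two examples.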

We now establish relationships of subclasses of EUBF with known combinatorial objects. For some the graph is more convenient, for others it is the complement graph which is more natural. Figure 3 shows the complement graphs corresponding to E, U, B and F and is useful to keep in mind for the proof of Theorem 4.4.

E U B F
Figure 3. The complementary graphs of -gaussoids organized in symmetry classes mod according to Figure 2. E, U, B, F index a partition of the orbits of all graphs on vertices. To obtain the diagram of graphs , flip the pictures over the vertical axis.
Theorem 4.4.

The gaussoids in the class EUBF are in bijection with the simple undirected graphs on vertices. The subclasses distribute as follows

  1. EUB contains exactly the gaussoids such that is -free.

  2. UBF contains exactly the gaussoids such that each connected component of is a path.

  3. EUF contains exactly the gaussoids such that in each connected component is a clique, and hence corresponds to partitions of the vertex set .

  4. EU is EUF where additionally every connected component of has at most two vertices.

Proof.

The first statement summarizes the discussion in the beginning of this section. (1) The graphs for are free of triangles, as seen in Figure 3. If conversely triangle-free, then does not have F among its minors . By ascension, the cardinality of