Representing Independence Models with Elementary Triplets

12/04/2016
by   Jose M. Peña, et al.
Linköping University

In an independence model, the triplets that represent conditional independences between singletons are called elementary. It is known that the elementary triplets represent the independence model unambiguously under some conditions. In this paper, we show how this representation helps to perform some operations with independence models, such as finding the dominant triplets or a minimal independence map of an independence model, computing the union or intersection of a pair of independence models, or performing causal reasoning. For the latter, we rephrase in terms of conditional independences some of Pearl's results for computing causal effects.



1 Introduction

In this paper, we explore a non-graphical approach to representing and reasoning with independence models. The approach consists in representing an independence model by its elementary triplets, i.e. the triplets that represent conditional independences between individual random variables. It is known that the elementary triplets represent the independence model unambiguously when the independence model satisfies the semi-graphoid properties [1, 13, 24]. Moreover, every elementary triplet corresponds to an elementary imset, i.e. a function over the power set of the set of random variables at hand [24]. This provides an interesting connection between the question addressed in this paper and imset theory. Specifically, structural imsets are an algebraic method to represent independence models that addresses some of the drawbacks of graphical models. Interestingly, every structural imset can be expressed as a linear combination of elementary imsets. For a detailed account of imset theory, we refer the reader to [24]. See also [5] for a study of efficient ways of solving the implication problem between two structural imsets, i.e. deciding whether the independence model represented by one of the imsets is included in the model represented by the other. This paper aims to show how to reason efficiently with independence models when these are represented by elementary triplets, instead of by structural imsets. Another set of distinguished triplets that has been used in the literature to represent and reason with independence models is the set of dominant triplets, i.e. the triplets that cannot be derived from any other triplet [2, 3, 4, 9, 12, 23]. We will later briefly compare the relative merits of elementary and dominant triplets. We will also show how to produce the dominant triplets from the elementary triplets.

The rest of the paper is organized as follows. In Section 2, we introduce some notation and concepts. In Section 3, we study under which conditions an independence model can unambiguously be represented by its elementary triplets. In Section 4, we show how this representation helps to perform some operations with independence models, such as finding the dominant triplets or a minimal independence map of an independence model, computing the union or intersection of a pair of independence models, or performing causal reasoning. Finally, we close the paper with some discussion in Section 5.

2 Preliminaries

In this section, we introduce some notation and concepts. Let $V$ denote a finite set of random variables. Subsets of $V$ are denoted by upper-case letters, whereas elements of $V$ are denoted by lower-case letters. We shall not distinguish between elements of $V$ and singletons. Given two sets $I, J \subseteq V$, we use $IJ$ to denote $I \cup J$. Union has higher priority than set difference in expressions.

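Given three disjoint sets $I, J, K \subseteq V$, the triplet $I \perp J \mid K$ denotes that $I$ is conditionally independent of $J$ given $K$. Given a set of triplets $G$, also known as an independence model, $I \perp_G J \mid K$ denotes that $I \perp J \mid K$ is in $G$, whereas $I \not\perp_G J \mid K$ denotes that $I \perp_G J \mid K$ does not hold. A triplet $I \perp J \mid K$ is called elementary if $|I| = |J| = 1$. Moreover, a triplet $I \perp J \mid K$ dominates another triplet $I' \perp J' \mid K'$ if $I' \subseteq I$, $J' \subseteq J$ and $K \subseteq K' \subseteq (I \setminus I')(J \setminus J')K$. Given a set of triplets, a triplet in the set is called dominant if no other triplet in the set dominates it.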
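The definitions of this paragraph can be made concrete with a small sketch. It is illustrative only: the function names (`triplet`, `is_elementary`, `dominates`, `dominant_triplets`) are hypothetical, triplets are stored as tuples of three frozensets, and the dominance test assumes the definition that one triplet dominates another when the independent sets shrink and the conditioning set grows only by elements dropped from the independent sets.

```python
def triplet(I, J, K=()):
    """Build the triplet I ⊥ J | K as a tuple of three disjoint frozensets."""
    I, J, K = frozenset(I), frozenset(J), frozenset(K)
    if I & J or I & K or J & K:
        raise ValueError("the three sets of a triplet must be pairwise disjoint")
    return (I, J, K)


def is_elementary(t):
    """A triplet is elementary when both independent sets are singletons."""
    I, J, _ = t
    return len(I) == 1 and len(J) == 1


def dominates(t, u):
    """Assumed definition: t = (I, J, K) dominates u = (I2, J2, K2) when
    I2 ⊆ I, J2 ⊆ J and K ⊆ K2 ⊆ (I minus I2) ∪ (J minus J2) ∪ K."""
    (I, J, K), (I2, J2, K2) = t, u
    return I2 <= I and J2 <= J and K <= K2 <= (I - I2) | (J - J2) | K


def dominant_triplets(triplets):
    """Triplets of the set that no other triplet of the set dominates."""
    ts = set(triplets)
    return {t for t in ts if not any(u != t and dominates(u, t) for u in ts)}


big = triplet("ab", "cd", "e")    # ab ⊥ cd | e
small = triplet("a", "c", "be")   # a ⊥ c | be, elementary
print(dominates(big, small), dominant_triplets({big, small}) == {big})
```

The quadratic scan in `dominant_triplets` is the naive reading of the definition and is only meant for small sets of triplets.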

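Given a probability distribution $p(V)$ and three disjoint sets $I, J, K \subseteq V$, the triplet $I \perp_p J \mid K$ denotes that $I$ is conditionally independent of $J$ given $K$ in $p$, i.e.

$$p(I \mid JK) = p(I \mid K).$$

The set of all such triplets is called the independence model induced by $p$. Moreover, if $I \perp_p J \mid KL$ does not hold but

$$p(I \mid JK, L = l) = p(I \mid K, L = l)$$

where $l$ is a value in the domain of $L$, then we say that $I$ is conditionally independent of $J$ given $K$ and the context $l$ in $p$, and we denote it by $I \perp_p J \mid K, L = l$.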

Consider the following properties between triplets:

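  • CI0: $I \perp J \mid K \Leftrightarrow J \perp I \mid K$.

  • CI1: $I \perp J \mid KL,\ I \perp K \mid L \Leftrightarrow I \perp JK \mid L$.

  • CI2: $I \perp J \mid KL,\ I \perp K \mid JL \Rightarrow I \perp J \mid L,\ I \perp K \mid L$.

  • CI3: $I \perp J \mid KL,\ I \perp K \mid JL \Leftarrow I \perp J \mid L,\ I \perp K \mid L$.

A set of triplets with the properties CI0-1/CI0-2/CI0-3 is also called a semigraphoid/graphoid/compositional graphoid. For instance, the independence model induced by a probability distribution is a semigraphoid, while the independence model induced by a strictly positive probability distribution is a graphoid, and the independence model induced by a regular Gaussian distribution is a compositional graphoid. The CI0 property is also called the symmetry property. The $\Rightarrow$ part of the CI1 property is also called the contraction property, and the $\Leftarrow$ part corresponds to the so-called weak union and decomposition properties. The CI2 and CI3 properties are also called the intersection and composition properties. Intersection is typically defined as $I \perp J \mid KL,\ I \perp K \mid JL \Rightarrow I \perp JK \mid L$. Note however that this and our definition are equivalent if CI1 holds. First, $I \perp JK \mid L$ implies $I \perp J \mid L$ and $I \perp K \mid L$ by CI1. Second, $I \perp J \mid L$ together with $I \perp K \mid JL$ imply $I \perp JK \mid L$ by CI1. Likewise, composition is typically defined as $I \perp J \mid L,\ I \perp K \mid L \Rightarrow I \perp JK \mid L$. Again, this and our definition are equivalent if CI1 holds. First, $I \perp JK \mid L$ implies $I \perp J \mid KL$ and $I \perp K \mid JL$ by CI1. Second, $I \perp K \mid JL$ together with $I \perp J \mid L$ imply $I \perp JK \mid L$ by CI1. In this paper, we will study sets of triplets that satisfy CI0-1, CI0-2 or CI0-3. So, the standard and our definitions are equivalent.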

Consider also the following properties between elementary triplets:

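  • ci0: $i \perp j \mid K \Leftrightarrow j \perp i \mid K$.

  • ci1: $i \perp j \mid kL,\ i \perp k \mid L \Leftrightarrow i \perp k \mid jL,\ i \perp j \mid L$.

  • ci2: $i \perp j \mid kL,\ i \perp k \mid jL \Rightarrow i \perp j \mid L,\ i \perp k \mid L$.

  • ci3: $i \perp j \mid kL,\ i \perp k \mid jL \Leftarrow i \perp j \mid L,\ i \perp k \mid L$.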

Note that CI2 and CI3 only differ in the direction of the implication. The same holds for ci2 and ci3. Note also that ci0-3 are the elementary versions of CI0-3, with the sole exception of ci1, which is not simply CI1 restricted to singletons.
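As an illustration of how the elementary properties can be applied mechanically, the following sketch repeatedly applies ci0 and ci1 to a set of elementary triplets until nothing new is produced. It assumes that ci1 reads "i ⊥ j | kL and i ⊥ k | L if and only if i ⊥ k | jL and i ⊥ j | L", and it stores an elementary triplet i ⊥ j | K as a tuple `(i, j, frozenset(K))`. The name `close_ci01` is hypothetical, and the loop is a naive fixed-point iteration rather than an efficient algorithm.

```python
def close_ci01(triplets):
    """Naive fixed point of ci0 (symmetry) and the assumed form of ci1."""
    closed = set(triplets)
    changed = True
    while changed:
        new = set()
        for (i, j, K) in closed:
            new.add((j, i, K))            # ci0: i ⊥ j | K gives j ⊥ i | K
            for k in K:
                L = K - {k}
                if (i, k, L) in closed:   # premises i ⊥ j | kL and i ⊥ k | L
                    new.add((i, k, L | {j}))   # conclusion i ⊥ k | jL
                    new.add((i, j, L))         # conclusion i ⊥ j | L
        changed = not new <= closed
        closed |= new
    return closed


# a ⊥ b | c together with a ⊥ c | ∅
start = {("a", "b", frozenset("c")), ("a", "c", frozenset())}
closure = close_ci01(start)
print(len(closure))
```

Because ci1 is symmetric in the roles of j and k, scanning every triplet and every element of its conditioning set covers both directions of the equivalence with a single rule.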

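Given a set of triplets $G$, let

$$P = p(G) = \{ i \perp j \mid M : I \perp_G J \mid K \text{ with } i \in I,\ j \in J \text{ and } K \subseteq M \subseteq (I \setminus i)(J \setminus j)K \}.$$

Similarly, given a set of elementary triplets $P$, let

$$G = g(P) = \{ I \perp J \mid K : i \perp_P j \mid M \text{ for all } i \in I,\ j \in J \text{ and } K \subseteq M \subseteq (I \setminus i)(J \setminus j)K \}.$$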

We say that a set of triplets is closed under CI0-1/CI0-2/CI0-3 if applying the properties CI0-1/CI0-2/CI0-3 to triplets in the set always returns triplets that are in the set. Given a set of triplets $G$, we define its closure under CI0-1/CI0-2/CI0-3, denoted as $G^*$, as the minimal superset of $G$ that is closed under CI0-1/CI0-2/CI0-3. We define similarly the closure of a set of elementary triplets $P$ under ci0-1/ci0-2/ci0-3, which we denote as $P^*$.
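The two maps between triplets and elementary triplets can be sketched as follows. This is a minimal illustration under an assumption about their definitions: the map from triplets to elementary triplets collects every i ⊥ j | M with i in I, j in J and K ⊆ M ⊆ (I \ i)(J \ j)K for each triplet I ⊥ J | K, and a triplet belongs to the inverse map exactly when all those elementary triplets are present. The names `subsets`, `p` and `in_g` are chosen for the sketch, and `in_g` is a membership test rather than an enumeration of the whole set.

```python
from itertools import combinations


def subsets(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return (frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r))


def p(G):
    """Elementary triplets (i, j, M) derived from the triplets (I, J, K) in G."""
    P = set()
    for (I, J, K) in G:
        for i in I:
            for j in J:
                for extra in subsets((I - {i}) | (J - {j})):
                    P.add((i, j, K | extra))
    return P


def in_g(P, I, J, K):
    """True if every elementary triplet required for I ⊥ J | K is in P."""
    I, J, K = frozenset(I), frozenset(J), frozenset(K)
    return all(
        (i, j, K | extra) in P
        for i in I
        for j in J
        for extra in subsets((I - {i}) | (J - {j}))
    )


G = {(frozenset("a"), frozenset("bc"), frozenset())}   # a ⊥ bc | ∅
P = p(G)
print(sorted((i, j, sorted(M)) for i, j, M in P))
print(in_g(P, "a", "bc", ""), in_g(P, "bc", "a", ""))
```

In this example the set `G` is not closed under symmetry, so the symmetric triplet is not recovered; the sketch deliberately performs no closure.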

Graphs can be used to represent independence models as follows. A directed and acyclic graph (DAG) is a graph that only has directed edges and does not have any subgraph of the form $i \to \cdots \to i$. Given a DAG $D$ over $V$, a path between a node $i$ and a node $j$ on $D$ is a sequence of distinct nodes $i = v_1, \ldots, v_n = j$ such that $D$ has an edge between every pair of consecutive nodes in the sequence. If every edge in the path is of the form $v_m \to v_{m+1}$, then $i$ is called an ancestor of $j$. Let $an_D(X)$ with $X \subseteq V$ denote the union of the ancestors of each node in $X$. A node $v_m$ on a path in $D$ is said to be a collider on the path if $v_{m-1} \to v_m \leftarrow v_{m+1}$ is a subpath. Moreover, the path is said to be connecting given $Z \subseteq V$ when

  • every collider on the path is in $Z \cup an_D(Z)$, and

  • every non-collider on the path is outside $Z$.

Let $X$, $Y$ and $Z$ denote three disjoint subsets of $V$. When there is no path in $D$ connecting a node in $X$ and a node in $Y$ given $Z$, we say that $X$ and $Y$ are d-separated given $Z$ in $D$, denoted as $X \perp_D Y \mid Z$. The independence model induced by $D$ consists of the triplets $X \perp Y \mid Z$ such that $X \perp_D Y \mid Z$.
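The d-separation criterion above can be checked directly from its definition on small graphs. The sketch below is illustrative, not an efficient algorithm: it enumerates paths of distinct nodes by depth-first search and assumes the standard reading of the two conditions, namely that a collider must be in the conditioning set or be an ancestor of a node in it, and that a non-collider must be outside the conditioning set. The DAG is given as a hypothetical `parents` dictionary mapping each node to the list of its parents.

```python
def ancestors(parents, nodes):
    """Union of the (proper) ancestors of the given nodes."""
    result, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        for u in parents.get(v, ()):
            if u not in result:
                result.add(u)
                stack.append(u)
    return result


def d_separated(parents, X, Y, Z):
    """True if no path between a node in X and a node in Y is connecting given Z."""
    Z = set(Z)
    collider_ok = Z | ancestors(parents, Z)
    children = {}
    for v, ps in parents.items():
        for u in ps:
            children.setdefault(u, set()).add(v)

    def neighbours(v):
        return set(parents.get(v, ())) | children.get(v, set())

    def search(path, target):
        b = path[-1]
        if b == target:
            return True
        for c in neighbours(b):
            if c in path:
                continue  # paths consist of distinct nodes
            if len(path) >= 2:
                a = path[-2]  # b becomes an interior node of the path a - b - c
                collider = a in parents.get(b, ()) and c in parents.get(b, ())
                if collider and b not in collider_ok:
                    continue
                if not collider and b in Z:
                    continue
            if search(path + [c], target):
                return True
        return False

    return not any(search([x], y) for x in X for y in Y)


chain = {"b": ["a"], "c": ["b"]}                 # a -> b -> c
collider = {"c": ["a", "b"], "d": ["c"]}         # a -> c <- b, c -> d
print(d_separated(chain, {"a"}, {"c"}, {"b"}))
print(d_separated(collider, {"a"}, {"b"}, {"d"}))
```

Path enumeration is exponential in the worst case, so this form is only suitable for checking small examples by hand.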

We say that a DAG $D$ over $V$ is a minimal independence map of a set of triplets $G$ relative to an ordering $\sigma$ of the elements in $V$ if (i) $I \perp_D J \mid K$ implies that $I \perp_G J \mid K$, (ii) removing any edge from $D$ makes it cease to satisfy condition (i), and (iii) the edges of $D$ are of the form $\sigma(s) \to \sigma(t)$ with $s < t$. Moreover, if $G$ is the independence model induced by a probability distribution $p(V)$, then the following factorization holds:

$$p(V) = \prod_{i \in V} p(i \mid pa_D(i))$$

where $pa_D(i)$ are the parents of $i$ in $D$. Moreover, $D$ is a perfect map of $G$ if $I \perp_D J \mid K$ implies $I \perp_G J \mid K$ and vice versa.

Finally, given three disjoint sets $X, Y, W \subseteq V$, we define the causal effect on $Y$ given $W$ of an intervention on $X$ as the conditional probability distribution of $Y$ given $W$ after setting $X$ to some value in its domain through an intervention, as opposed to an observation. We say that the causal effect is identifiable if it can be computed from observed quantities, i.e. from the probability distribution over $V$.

3 Representation

In this section, we study the use of elementary triplets to represent independence models. We start by proving in the following lemma that there is a bijection between certain sets of triplets and certain sets of elementary triplets. The lemma has previously been proven when the sets of triplets and elementary triplets satisfy CI0-1 and ci0-1 [13, Proposition 1]. We extend it to the cases where they satisfy CI0-2/CI0-3 and ci0-2/ci0-3.

Lemma 1.

If a set of triplets satisfies CI0-1/CI0-2/CI0-3 then satisfies ci0-1/ci0-2/ci0-3, , and . Similarly, if a set of elementary triplets satisfies ci0-1/ci0-2/ci0-3 then satisfies CI0-1/CI0-2/CI0-3, , and .

Proof.

The lemma has previously been proven when and satisfy CI0-1 and ci0-1 [13, Proposition 1]. Therefore, we only have to prove that if satisfies CI0-3 then satisfies ci2-3, and that if satisfies ci0-3 then satisfies CI2-3.

Proof of CI0-2 ⇒ ci2

Assume that and . Then, it follows from the definition of that or with , and . Note that the latter case implies that by CI1. Similarly, implies . Then, and by CI2. Then, and by definition of .

Proof of CI0-3 ⇒ ci3

Assume that and . Then, and by the same reasoning as before, which imply and by CI3. Then, and by definition of .

Proof of ci0-2 ⇒ CI2

  1. Assume that and .

  2. and for all and follows from (1) by definition of .

  3. and for all and by ci2 on (2).

  4. and follows from (3) by definition of .

Therefore, we have proven the result when . Assume as induction hypothesis that the result also holds when . Assume without loss of generality that . Let such that and .

  5. and by CI1 on .

  6. and by the induction hypothesis on (5) and .

  7. by the induction hypothesis on (6).

  8. by CI1 on (6) and (7).

  9. by CI1 on (8) and .

Proof of ci0-3 ⇒ CI3

  10. Assume that and .

  11. and for all and follows from (10) by definition of .

  12. and for all and by ci3 on (11).

  13. and follows from (12) by definition of .

Therefore, we have proven the result when . Assume as induction hypothesis that the result also holds when . Assume without loss of generality that . Let such that and .

  14. by CI1 on .

  15. by CI1 on .

  16. by the induction hypothesis on (14) and .

  17. by the induction hypothesis on (15) and (16).

  18. by CI1 on (17) and .

  19. and by CI1 on (18). ∎

The following lemma generalizes Lemma 1 by removing the assumptions about and .

Lemma 2.

Let denote a set of triplets. Then, , and . Let denote a set of elementary triplets. Then, , and .

Proof.

Clearly, and, thus, because satisfies CI0-1/CI0-2/CI0-3 by Lemma 1. Clearly, and, thus, because satisfies ci0-1/ci0-2/ci0-3 by Lemma 1. Then, and . Then, and , because and by Lemma 1. Finally, that is now trivial.

Similarly, and, thus, because satisfies ci0-1/ci0-2/ci0-3 by Lemma 1. Clearly, and, thus, because satisfies CI0-1/CI0-2/CI0-3 by Lemma 1. Then, and . Then, and , because and by Lemma 1. Finally, that is now trivial. ∎

Lemma 1 implies that every set of triplets satisfying CI0-1/CI0-2/CI0-3 can be paired to a set of elementary triplets satisfying ci0-1/ci0-2/ci0-3, and vice versa. The lemma also implies that the pairing is actually a bijection. Thanks to this bijection, we can use the set of elementary triplets $P$ to represent the set of triplets $G$. This is in general a much more economical representation: if $|V| = n$, then there are up to $4^n$ triplets (a triplet can be represented as an $n$-tuple whose entries state whether the corresponding element is in the first, second, third or none of the sets of the triplet), whereas there are at most $n^2 \cdot 2^{n-2}$ elementary triplets.
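The gap between the two counts can be checked by brute force for small sets of variables. The sketch below follows the tuple encoding mentioned above: each of the n elements is labelled as belonging to the first set, the second set, the conditioning set or none. It assumes that the two upper bounds are 4^n for triplets and n^2 · 2^(n-2) for elementary triplets; the function name `count_by_enumeration` is hypothetical.

```python
from itertools import product


def count_by_enumeration(n):
    """Count triplets and elementary triplets over n variables by enumeration.

    A labelling is a valid triplet when the first and second sets are both
    non-empty, and an elementary triplet when both are singletons.
    """
    triplets = elementary = 0
    for labels in product("IJK-", repeat=n):
        in_I, in_J = labels.count("I"), labels.count("J")
        if in_I >= 1 and in_J >= 1:
            triplets += 1
            if in_I == 1 and in_J == 1:
                elementary += 1
    return triplets, elementary


for n in range(2, 7):
    triplets, elementary = count_by_enumeration(n)
    # columns: n, exact triplets, bound 4^n, exact elementary, bound n^2 * 2^(n-2)
    print(n, triplets, 4 ** n, elementary, n * n * 2 ** (n - 2))
```

Both bounds are loose: the exact counts exclude labellings with an empty first or second set, and the elementary count is n(n-1) · 2^(n-2) once ordered pairs of distinct elements are counted exactly.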

Likewise, Lemma 2 implies that there is a bijection between the CI0-1/CI0-2/CI0-3 closures of sets of triplets and the ci0-1/ci0-2/ci0-3 closures of sets of elementary triplets. Thanks to this bijection, we can use to represent . Note that is obtained by ci0-1/ci0-2/ci0-3 closing , which is obtained from . So, there is no need to CI0-1/CI0-2/CI0-3 close and so produce . Whether closing can be done faster than closing on average is an open question. In the worst-case scenario, both imply applying the corresponding properties a number of times exponential in [14]. The following examples illustrate the savings in space that can be achieved by using to represent .

Example 1.

This example is taken from [12]. Let . Let . The CI0-1, CI0-2 and CI0-3 closures of have the same 162 triplets. However, they can be represented in a more concise manner by their 82 elementary triplets.

Example 2.

This example will be used again later in this work. Let . Let . The CI0-1, CI0-2 and CI0-3 closures of have the same 218 triplets. However, they can be represented in a more concise manner by their 112 elementary triplets.

One may think that Lemmas 1 and 2 have theoretical interest but little practical interest, because one may only have access to a set of triplets that is not closed under CI0-1/CI0-2/CI0-3 and, thus, the elementary triplet representation of its closure can only be obtained by first producing the CI0-1/CI0-2/CI0-3 closure of the set, or as the ci0-1/ci0-2/ci0-3 closure of the elementary triplets derived from the set. As mentioned above, the worst-case scenario for either alternative is computationally demanding. The complexity of the average case is unknown. However, we believe that Lemmas 1 and 2 are of practical interest when all one has access to is a probability distribution $p$, e.g. the empirical distribution derived from a sample. In that case, the independence model induced by $p$ can be represented by the elementary triplets $i \perp j \mid K$ such that $i \perp_p j \mid K$ holds. To see this, recall from Section 2 that the independence model induced by a probability distribution always satisfies the CI0-1 properties. Note that the process of finding the elementary triplets may be sped up by using the ci0-1 properties to derive elementary triplets from previously obtained ones, thereby avoiding some pairwise independence checks in $p$. One can instead use the ci0-2 or ci0-3 properties if $p$ is known to be strictly positive or regular Gaussian. This speed-up is justified by the fact that the elementary triplet representation must be closed under ci0-1/ci0-2/ci0-3 by Lemmas 1 and 2. For instance, having found that $i \perp_p j \mid K$ holds implies that $i \perp j \mid K$ must be in the representation of the independence model induced by $p$, which implies that so must $j \perp i \mid K$ by ci0. So, there is no need to check whether $j \perp_p i \mid K$ holds. This approach (without the speed-up sketched) of representing the independence model induced by a probability distribution with its elementary triplets has been instrumental in developing exact and assumption-free learning algorithms for chain graphs and acyclic directed mixed graphs [20, 22].

One may argue that there is no need to produce a concise representation of the independence model induced by $p$, such as the elementary triplet representation, since it takes time and storage space and it provides no additional information about $p$. However, some operations with independence models are not easy to perform without representing the independence models explicitly. For example, it is not clear to us how to compute the intersection of the independence models induced by two probability distributions without representing the independence models in any way whereas, as we will see in Section 4, this is a straightforward task given their elementary triplet representations.
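The procedure described above, i.e. collecting the elementary triplets that hold in a given distribution and using symmetry to skip half of the checks, can be sketched for a small discrete distribution. Everything here is an illustrative assumption rather than the method of [20, 22]: the joint distribution is a dictionary from value tuples to probabilities, the independence check is an exact factorization test with a small numerical tolerance (not a statistical test on a sample), and the names `marginal`, `indep` and `elementary_triplets` are hypothetical.

```python
from itertools import combinations
from math import isclose


def marginal(p, names, keep):
    """Marginal of the joint p over the variables in keep (in that order)."""
    idx = [names.index(v) for v in keep]
    out = {}
    for x, pr in p.items():
        key = tuple(x[t] for t in idx)
        out[key] = out.get(key, 0.0) + pr
    return out


def indep(p, names, i, j, K):
    """Check p(i, j, K) p(K) = p(i, K) p(j, K) for all value combinations."""
    K = sorted(K)
    pijk = marginal(p, names, [i, j] + K)
    pik = marginal(p, names, [i] + K)
    pjk = marginal(p, names, [j] + K)
    pk = marginal(p, names, K)
    vals_i = {x[0] for x in pik}
    vals_j = {x[0] for x in pjk}
    return all(
        isclose(pijk.get((a, b) + k, 0.0) * pk[k],
                pik.get((a,) + k, 0.0) * pjk.get((b,) + k, 0.0),
                abs_tol=1e-12)
        for a in vals_i for b in vals_j for k in pk
    )


def elementary_triplets(p, names):
    """Elementary triplets (i, j, K) that hold in p, testing each pair once."""
    P = set()
    for i, j in combinations(names, 2):
        others = [v for v in names if v not in (i, j)]
        for r in range(len(others) + 1):
            for K in combinations(others, r):
                if indep(p, names, i, j, K):
                    P.add((i, j, frozenset(K)))
                    P.add((j, i, frozenset(K)))  # follows by ci0, no second check
    return P


# Hypothetical binary chain a -> b -> c
names = ["a", "b", "c"]
pa = {0: 0.7, 1: 0.3}
pb = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}   # pb[a][b]
pc = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}   # pc[b][c]
p = {(a, b, c): pa[a] * pb[a][b] * pc[b][c]
     for a in (0, 1) for b in (0, 1) for c in (0, 1)}
print(elementary_triplets(p, names))
```

Only the ci0 shortcut is implemented; deriving further triplets through ci1 before testing them would follow the same pattern but is left out to keep the sketch short.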

For simplicity, all the results in the sequel assume that and satisfy CI0-1/CI0-2/CI0-3 and ci0-1/ci0-2/ci0-3. Thanks to Lemma 2, these assumptions can be dropped by replacing , , and in the results below with , , and .

Let and . In order to decide whether , the definition of implies checking whether elementary triplets are in . The following lemma simplifies this when satisfies ci0-1, as it implies checking elementary triplets. When satisfies ci0-2 or ci0-3, the lemma simplifies the decision even further as the conditioning sets of the elementary triplets checked have all the same size or form.

Lemma 3.

Let denote a set of elementary triplets. Let for all and , for all and , and for all and . If satisfies ci0-1, then . If satisfies ci0-2, then . If satisfies ci0-3, then .

Proof.

Proof for ci0-1

It suffices to prove that because clearly . Assume that . Then, and by definition of . Then, and by ci1. Then, and by definition of . By repeating this reasoning, we can then conclude that for any permutation of the set . By following an analogous reasoning for instead of , we can then conclude that for any permutations and of the sets and . This implies the desired result by definition of .

Proof for ci0-2

It suffices to prove that because clearly . Note that satisfies CI0-2 by Lemma 1. Assume that .

  1. and follow from and by definition of .

  2. by CI2 on (1), which together with (1) imply by CI1.

  3. follows from by definition of .

  4. by CI2 on (2) and (3), which together with (3) imply by CI1.

By continuing with the reasoning above, we can conclude that . Moreover, by a reasoning similar to (1-4) and, thus, by an argument similar to (2). Moreover, by a reasoning similar to (1-4) and, thus, by an argument similar to (4). Continuing with this process gives the desired result.

Proof for ci0-3

It suffices to prove that because clearly . Note that satisfies CI0-3 by Lemma 1. Assume that .

  5. and follow from and by definition of .

  6. by CI3 on (5), which together with (5) imply by CI1.

  7. follows from by definition of .

  8. by CI3 on (6) and (7), which together with (7) imply by CI1.

By continuing with the reasoning above, we can conclude that . Moreover, by a reasoning similar to (5-8) and, thus, by an argument similar to (6). Moreover, by a reasoning similar to (5-8) and, thus, by an argument similar to (8). Continuing with this process gives the desired result. ∎

As mentioned in the introduction, another set of distinguished triplets in that can be used to represent it is the set of dominant triplets [2, 9, 12, 23]. The following lemma shows how to find these triplets with the help of .

Lemma 4.

Let denote a set of triplets. If satisfies CI0-1, then is a dominant triplet in if and only if and are two maximal sets such that for all and and, for all , and for some and . If satisfies CI0-2, then is a dominant triplet in if and only if and are two maximal sets such that for all and and, for all , and for some and . If satisfies CI0-3, then is a dominant triplet in if and only if and are two maximal sets such that for all and and, for all , and for some and .

Proof.

We prove the lemma when satisfies CI0-1. The other two cases can be proven in much the same way. To see the if part, note that by Lemmas 1 and 3. Moreover, assume to the contrary that there is a triplet that dominates . Consider the following two cases: and . In the first case, CI0-1 on implies that or with and . Assume the latter without loss of generality. Then, CI0-1 implies that for all and . This contradicts the maximality of . In the second case, CI0-1 on implies that or with . Assume the latter without loss of generality. Then, CI0-1 implies that for all , which contradicts the assumptions of the lemma.

To see the only if part, note that CI0-1 implies that for all and . Moreover, assume to the contrary that for some , for all or for all . Assume the latter without loss of generality. Then, by Lemmas 1 and 3, which implies that is not a dominant triplet in , which is a contradiction. Finally, note that and must be maximal sets satisfying the properties proven in this paragraph because, otherwise, the previous paragraph implies that there is a triplet in that dominates . ∎

A natural question to ponder is whether it is better to represent an independence model by its elementary or dominant triplets. In terms of storage space, it seems that the dominant triplet representation should be preferred. For instance, for the independence model in Example 1, there are 82 elementary triplets but only 12 dominant triplets and nine non-symmetric dominant triplets [12]. For the independence model in Example 2, there are 112 elementary triplets but only two non-symmetric dominant triplets, as we will see later. In terms of running time, the answer is less clear. As mentioned before, finding for a given set of triplets implies producing the CI0-1/CI0-2/CI0-3 closure of or the ci0-1/ci0-2/ci0-3 closure of . The average case complexity of either case is unknown. The algorithms in [2, 9, 12, 23] for finding the dominant triplets in are conceptually more involved but they could be faster than finding . Performing an empirical comparison of the two alternatives is definitely an interesting research project. However, it is beyond the scope of this work. Moreover, the methods for finding dominant triplets take a set of triplets as input. It is not clear to us how to run them when all we have access to is a probability distribution , e.g. the empirical distribution derived from a sample. As discussed before, finding the elementary triplet representation in that scenario is conceptually easy. Yet another dimension to compare elementary and dominant triplet representations is the operations that each alternative allows to perform efficiently, e.g. there is no method to our knowledge for computing the intersection of the CI0-1 closures of two sets of triplets when all we have is their dominant triplet representations whereas, as we will see in Section 4, this is a straightforward question to answer from their elementary triplet representations. 
That is why we prefer to see elementary and dominant triplets as complementary rather than competing alternatives to represent independence models: depending on the task at hand, one or the other may be preferred.