1 Introduction
In this paper, we explore a nongraphical approach to representing and reasoning with independence models. The approach consists in representing an independence model by its elementary triplets, i.e. the triplets that represent conditional independences between individual random variables. It is known that the elementary triplets represent the independence model unambiguously when the independence model satisfies the semigraphoid properties
[1, 13, 24]. Moreover, every elementary triplet corresponds to an elementary imset, i.e. a function over the power set of the set of random variables at hand [24]. This provides an interesting connection between the question addressed in this paper and imset theory. Specifically, structural imsets are an algebraic method for representing independence models that overcomes some of the drawbacks of graphical models. Interestingly, every structural imset can be expressed as a linear combination of elementary imsets. For a detailed account of imset theory, we refer the reader to [24]. See also [5] for a study of efficient ways of solving the implication problem between two structural imsets, i.e. deciding whether the independence model represented by one of the imsets is included in the model represented by the other. This paper aims to show how to reason efficiently with independence models when these are represented by elementary triplets instead of by structural imsets. Another set of distinguished triplets that has been used in the literature to represent and reason with independence models is the set of dominant triplets, i.e. the triplets that cannot be derived from any other triplet [2, 3, 4, 9, 12, 23]. We will later briefly compare the relative merits of elementary and dominant triplets. We will also show how to produce the dominant triplets from the elementary triplets.
The rest of the paper is organized as follows. In Section 2, we introduce some notation and concepts. In Section 3, we study under which conditions an independence model can unambiguously be represented by its elementary triplets. In Section 4, we show how this representation helps to perform some operations with independence models, such as finding the dominant triplets or a minimal independence map of an independence model, computing the union or intersection of a pair of independence models, or performing causal reasoning. Finally, we close the paper with some discussion in Section 5.
2 Preliminaries
In this section, we introduce some notation and concepts. Let $V$ denote a finite set of random variables. Subsets of $V$ are denoted by uppercase letters, whereas elements of $V$ are denoted by lowercase letters. We shall not distinguish between elements of $V$ and singletons. Given two sets $I, J \subseteq V$, we use $IJ$ to denote $I \cup J$. Union has higher priority than set difference in expressions.
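Given three disjoint sets $I, J, K \subseteq V$, the triplet $I \perp J \mid K$ denotes that $I$ is conditionally independent of $J$ given $K$. Given a set of triplets $G$, also known as an independence model, $I \perp_G J \mid K$ denotes that $I \perp J \mid K$ is in $G$, whereas $I \not\perp_G J \mid K$ denotes that $I \perp_G J \mid K$ does not hold. A triplet $I \perp J \mid K$ is called elementary if $|I| = |J| = 1$. Moreover, a triplet $I \perp J \mid K$ dominates another triplet $I' \perp J' \mid K'$ if $I' \subseteq I$, $J' \subseteq J$ and $K \subseteq K' \subseteq (I \setminus I')(J \setminus J')K$. Given a set of triplets, a triplet in the set is called dominant if no other triplet in the set dominates it.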
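As an illustration of these definitions, the following Python sketch stores a triplet as three disjoint sets and tests the elementary and dominance conditions. The names `Triplet`, `triplet`, `is_elementary`, `dominates` and `dominant` are hypothetical helpers introduced here, and the dominance test assumes the usual definition, i.e. $I \perp J \mid K$ dominates $I' \perp J' \mid K'$ when $I' \subseteq I$, $J' \subseteq J$ and $K \subseteq K' \subseteq (I \setminus I')(J \setminus J')K$.

```python
from typing import FrozenSet, NamedTuple


class Triplet(NamedTuple):
    """A triplet I ⊥ J | K over pairwise disjoint sets of variable names."""
    I: FrozenSet[str]
    J: FrozenSet[str]
    K: FrozenSet[str]


def triplet(i, j, k=()):
    """Build a triplet; a string such as "bc" is read as the set {b, c}."""
    t = Triplet(frozenset(i), frozenset(j), frozenset(k))
    if t.I & t.J or t.I & t.K or t.J & t.K:
        raise ValueError("the three sets must be pairwise disjoint")
    return t


def is_elementary(t):
    """A triplet is elementary if its first two sets are singletons."""
    return len(t.I) == 1 and len(t.J) == 1


def dominates(t, u):
    """True if t = I ⊥ J | K dominates u = I' ⊥ J' | K' (assumed definition)."""
    upper = (t.I - u.I) | (t.J - u.J) | t.K
    return u.I <= t.I and u.J <= t.J and t.K <= u.K <= upper


def dominant(triplets):
    """Triplets of the given collection that no other member dominates."""
    ts = set(triplets)
    return {u for u in ts if not any(t != u and dominates(t, u) for t in ts)}
```

For example, under this reading `triplet("a", "bc", "d")` dominates `triplet("a", "b", "cd")`, so only the former is dominant in the set containing both.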
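Given a probability distribution $p$ over $V$ and three disjoint sets $I, J, K \subseteq V$, the triplet $I \perp_p J \mid K$ denotes that $I$ is conditionally independent of $J$ given $K$ in $p$, i.e.
$$p(I \mid JK) = p(I \mid K).$$
The set of all such triplets is called the independence model induced by $p$. Moreover, if $I \perp_p J \mid KL$ does not hold but
$$p(I \mid J, K, L = l) = p(I \mid K, L = l)$$
where $l$ is a value in the domain of $L$, then we say that $I$ is conditionally independent of $J$ given $K$ and the context $l$ in $p$, and we denote it by $I \perp_p J \mid K, L = l$.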
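The independence statement above can be checked numerically for a small discrete distribution. The sketch below is a minimal illustration under stated assumptions: the joint distribution is given as a hypothetical table `joint` mapping value tuples to probabilities, `names` fixes the variable order, and the test uses the equivalent product form $p(IJK)\,p(K) = p(IK)\,p(JK)$ to avoid dividing by zero. The functions `marginal` and `cond_indep` are illustrative names, not part of the paper.

```python
from itertools import product
from math import isclose


def marginal(joint, names, subset):
    """Marginalise a table {value tuple: probability} onto `subset`."""
    idx = [names.index(v) for v in subset]
    out = {}
    for values, prob in joint.items():
        key = tuple(values[n] for n in idx)
        out[key] = out.get(key, 0.0) + prob
    return out


def cond_indep(joint, names, I, J, K, tol=1e-9):
    """Check I ⊥ J | K via p(i,j,k) p(k) = p(i,k) p(j,k) for all values."""
    I, J, K = list(I), list(J), list(K)
    p_ijk = marginal(joint, names, I + J + K)
    p_ik = marginal(joint, names, I + K)
    p_jk = marginal(joint, names, J + K)
    p_k = marginal(joint, names, K)
    ni, nj = len(I), len(J)
    for ik, p1 in p_ik.items():
        for jk, p2 in p_jk.items():
            if ik[ni:] != jk[nj:]:
                continue  # the two entries disagree on the value of K
            k = ik[ni:]
            lhs = p_ijk.get(ik[:ni] + jk[:nj] + k, 0.0) * p_k[k]
            if not isclose(lhs, p1 * p2, abs_tol=tol):
                return False
    return True


# Illustrative binary chain a -> b -> c with made-up parameters.
names = ["a", "b", "c"]
joint = {}
for a, b, c in product([0, 1], repeat=3):
    pa = 0.3 if a else 0.7
    pb_1 = 0.9 if a else 0.2          # p(b = 1 | a)
    pc_1 = 0.7 if b else 0.1          # p(c = 1 | b)
    joint[(a, b, c)] = pa * (pb_1 if b else 1 - pb_1) * (pc_1 if c else 1 - pc_1)
```

With this table, `cond_indep(joint, names, ["a"], ["c"], ["b"])` holds whereas `cond_indep(joint, names, ["a"], ["c"], [])` does not, as expected for a chain.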
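Consider the following properties between triplets:
(CI0) $I \perp J \mid K \Leftrightarrow J \perp I \mid K$.
(CI1) $I \perp J \mid KL, \; I \perp K \mid L \Leftrightarrow I \perp JK \mid L$.
(CI2) $I \perp J \mid KL, \; I \perp K \mid JL \Rightarrow I \perp J \mid L, \; I \perp K \mid L$.
(CI3) $I \perp J \mid KL, \; I \perp K \mid JL \Leftarrow I \perp J \mid L, \; I \perp K \mid L$.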
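A set of triplets with the properties CI01/CI02/CI03, i.e. CI0 to CI1/CI0 to CI2/CI0 to CI3, is also called a semigraphoid/graphoid/compositional graphoid. For instance, the independence model induced by a probability distribution is a semigraphoid, while the independence model induced by a strictly positive probability distribution is a graphoid, and the independence model induced by a regular Gaussian distribution is a compositional graphoid. The CI0 property is also called the symmetry property. The $\Rightarrow$ part of the CI1 property is also called the contraction property, and the $\Leftarrow$ part corresponds to the so-called weak union and decomposition properties. The CI2 and CI3 properties are also called the intersection and composition properties. Intersection is typically defined as $I \perp J \mid KL, \; I \perp K \mid JL \Rightarrow I \perp JK \mid L$. Note however that this and our definition are equivalent if CI1 holds. First, $I \perp JK \mid L$ implies $I \perp J \mid L$ and $I \perp K \mid L$ by CI1. Second, $I \perp J \mid L$ together with $I \perp K \mid JL$ imply $I \perp JK \mid L$ by CI1. Likewise, composition is typically defined as $I \perp JK \mid L \Leftarrow I \perp J \mid L, \; I \perp K \mid L$. Again, this and our definition are equivalent if CI1 holds. First, $I \perp JK \mid L$ implies $I \perp J \mid KL$ and $I \perp K \mid JL$ by CI1. Second, $I \perp K \mid JL$ together with $I \perp J \mid L$ imply $I \perp JK \mid L$ by CI1. In this paper, we will study sets of triplets that satisfy CI01, CI02 or CI03. So, the standard and our definitions are equivalent.
Consider also the following properties between elementary triplets:
(ci0) $i \perp j \mid K \Leftrightarrow j \perp i \mid K$.
(ci1) $i \perp j \mid kL, \; i \perp k \mid L \Leftrightarrow i \perp k \mid jL, \; i \perp j \mid L$.
(ci2) $i \perp j \mid kL, \; i \perp k \mid jL \Rightarrow i \perp j \mid L, \; i \perp k \mid L$.
(ci3) $i \perp j \mid kL, \; i \perp k \mid jL \Leftarrow i \perp j \mid L, \; i \perp k \mid L$.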
Note that CI2 and CI3 only differ in the direction of the implication. The same holds for ci2 and ci3. Note also that ci0, ci2 and ci3 are the elementary versions of CI0, CI2 and CI3, whereas ci1 is not the elementary version of CI1.
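Given a set of triplets $G = \{I \perp J \mid K\}$, let
$$P = p(G) = \{ i \perp j \mid M : I \perp_G J \mid K \text{ with } i \in I, \; j \in J \text{ and } K \subseteq M \subseteq (I \setminus i)(J \setminus j)K \}.$$
Similarly, given a set of elementary triplets $P = \{i \perp j \mid K\}$, let
$$G = g(P) = \{ I \perp J \mid K : i \perp_P j \mid M \text{ for all } i \in I, \; j \in J \text{ and } K \subseteq M \subseteq (I \setminus i)(J \setminus j)K \}.$$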
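The two mappings between triplets and elementary triplets can be sketched in Python as follows. This is only an illustration under an assumed reading of the definitions: a triplet $I \perp J \mid K$ is taken to contribute the elementary triplets $i \perp j \mid M$ with $i \in I$, $j \in J$ and $K \subseteq M \subseteq (I \setminus i)(J \setminus j)K$, and a triplet is taken to be represented by a set of elementary triplets when all of these are present. The function names `subsets`, `elementary_of`, `to_elementary` and `holds` are hypothetical.

```python
from itertools import combinations


def subsets(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return (frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r))


def elementary_of(I, J, K):
    """Elementary triplets (i, j, M) with K ⊆ M ⊆ (I \\ i)(J \\ j)K."""
    I, J, K = frozenset(I), frozenset(J), frozenset(K)
    out = set()
    for i in I:
        for j in J:
            for extra in subsets((I - {i}) | (J - {j})):
                out.add((i, j, K | extra))
    return out


def to_elementary(G):
    """Elementary triplets contributed by a collection G of (I, J, K) triplets."""
    return set().union(*(elementary_of(I, J, K) for (I, J, K) in G))


def holds(P, I, J, K):
    """True if every elementary triplet contributed by I ⊥ J | K is in P."""
    return elementary_of(I, J, K) <= set(P)
```

Under this reading, deciding whether a single triplet $I \perp J \mid K$ is represented requires looking up $|I| \cdot |J| \cdot 2^{|I|+|J|-2}$ elementary triplets, which is the size of the set returned by `elementary_of`.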
We say that a set of triplets is closed under CI01/CI02/CI03 if applying the properties CI01/CI02/CI03 to triplets in the set always returns triplets that are in the set. Given a set of triplets $G$, we define its closure under CI01/CI02/CI03, denoted as $G^*$, as the minimal superset of $G$ that is closed under CI01/CI02/CI03. We define similarly the closure of a set of elementary triplets $P$ under ci01/ci02/ci03, which we denote as $P^*$.
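To make the notion of closure concrete, the following Python sketch computes the closure of a set of elementary triplets under symmetry (ci0) and ci1 by fixed-point iteration. It assumes that ci1 has the form $i \perp j \mid kL, \; i \perp k \mid L \Leftrightarrow i \perp k \mid jL, \; i \perp j \mid L$, and it represents an elementary triplet as a tuple `(i, j, K)` with `K` a frozenset. The function name `close_ci01` is hypothetical, and the naive iteration is meant as an illustration rather than an efficient algorithm.

```python
def close_ci01(triplets):
    """Smallest superset of `triplets` closed under ci0 and (assumed) ci1."""
    closed = set(triplets)
    changed = True
    while changed:
        changed = False
        new = set()
        for (i, a, m) in closed:
            # ci0: symmetry.
            new.add((a, i, m))
            # ci1: from i ⊥ a | bL and i ⊥ b | L derive i ⊥ b | aL and i ⊥ a | L.
            # Both directions of ci1 are instances of this single rule.
            for b in m:
                rest = m - {b}
                if (i, b, rest) in closed:
                    new.add((i, b, rest | {a}))
                    new.add((i, a, rest))
        if not new <= closed:
            closed |= new
            changed = True
    return closed
```

For instance, starting from $a \perp b \mid c$ and $a \perp c \mid \emptyset$, the rule yields $a \perp c \mid b$ and $a \perp b \mid \emptyset$, and symmetry doubles the resulting four elementary triplets.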
Graphs can be used to represent independence models as follows. A directed and acyclic graph (DAG) is a graph that only has directed edges and does not have any subgraph of the form $i \to \cdots \to i$. Given a DAG $D$ over $V$, a path between a node $i_1$ and a node $i_n$ on $D$ is a sequence of distinct nodes $i_1, \ldots, i_n$ such that $D$ has an edge between every pair of consecutive nodes in the sequence. If every edge in the path is of the form $i_s \to i_{s+1}$, then $i_1$ is called an ancestor of $i_n$. Let $an_D(X)$ with $X \subseteq V$ denote the union of the ancestors of each node in $X$. A node $i_s$ on a path in $D$ is said to be a collider on the path if $i_{s-1} \to i_s \leftarrow i_{s+1}$ is a subpath. Moreover, the path is said to be connecting given $Z \subseteq V$ when
every collider on the path is in $Z \cup an_D(Z)$, and
every noncollider on the path is outside $Z$.
Let $I$, $J$ and $K$ denote three disjoint subsets of $V$. When there is no path in $D$ connecting a node in $I$ and a node in $J$ given $K$, we say that $I$ and $J$ are d-separated given $K$ in $D$, denoted as $I \perp_D J \mid K$. The independence model induced by $D$ consists of the triplets $I \perp J \mid K$ such that $I \perp_D J \mid K$.
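The d-separation criterion can be tested mechanically. The sketch below does not enumerate connecting paths as in the definition above; instead it uses the well-known equivalent criterion based on the moral graph of the smallest ancestral set containing the three sets, which is a swapped-in technique chosen for brevity. The DAG is assumed to be given as a hypothetical dictionary `parents` mapping each node to the set of its parents, and `d_separated` is an illustrative name.

```python
def d_separated(parents, I, J, K):
    """True if I and J are d-separated given K in the DAG `parents`.

    `parents` maps a node to the set of its parents; nodes without
    parents may be omitted from the dictionary.
    """
    I, J, K = set(I), set(J), set(K)

    # 1. Smallest ancestral set containing I, J and K.
    ancestral, stack = set(), list(I | J | K)
    while stack:
        v = stack.pop()
        if v not in ancestral:
            ancestral.add(v)
            stack.extend(parents.get(v, ()))

    # 2. Moralise: link each node to its parents and marry co-parents.
    nbrs = {v: set() for v in ancestral}
    for v in ancestral:
        pa = list(parents.get(v, ()))
        for x in pa:
            nbrs[v].add(x)
            nbrs[x].add(v)
            for y in pa:
                if x != y:
                    nbrs[x].add(y)

    # 3. I and J are d-separated iff K blocks every undirected path.
    seen, stack = set(I), list(I)
    while stack:
        v = stack.pop()
        if v in J:
            return False
        for w in nbrs[v]:
            if w not in K and w not in seen:
                seen.add(w)
                stack.append(w)
    return True
```

For the collider $a \to c \leftarrow b$, this returns `True` for the empty conditioning set and `False` when conditioning on $c$ or on a descendant of $c$.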
We say that a DAG $D$ over $V$ is a minimal independence map of a set of triplets $G$ relative to an ordering $\sigma$ of the elements in $V$ if (i) $I \perp_D J \mid K$ implies that $I \perp_G J \mid K$, (ii) removing any edge from $D$ makes it cease to satisfy condition (i), and (iii) the edges of $D$ are of the form $\sigma(s) \to \sigma(t)$ with $s < t$. Moreover, if $G$ is the independence model induced by a probability distribution $p$, then the following factorization holds:
$$p(V) = \prod_{i \in V} p(i \mid pa_D(i))$$
where $pa_D(i)$ are the parents of $i$ in $D$. Moreover, $D$ is a perfect map of $G$ if $I \perp_D J \mid K$ implies $I \perp_G J \mid K$ and vice versa.
Finally, given three disjoint sets $X, Y, W \subseteq V$, we define the causal effect on $Y$ given $W$ of an intervention on $X$ as the conditional probability distribution of $Y$ given $W$ after setting $X$ to some value in its domain through an intervention, as opposed to an observation. We say that the causal effect is identifiable if it can be computed from observed quantities, i.e. from the probability distribution over $V$.
3 Representation
In this section, we study the use of elementary triplets to represent independence models. We start by proving in the following lemma that there is a bijection between certain sets of triplets and certain sets of elementary triplets. The lemma has previously been proven when the sets of triplets and elementary triplets satisfy CI01 and ci01 [13, Proposition 1]. We extend it to the cases where they satisfy CI02/CI03 and ci02/ci03.
Lemma 1.
If a set of triplets satisfies CI01/CI02/CI03 then satisfies ci01/ci02/ci03, , and . Similarly, if a set of elementary triplets satisfies ci01/ci02/ci03 then satisfies CI01/CI02/CI03, , and .
Proof.
The lemma has previously been proven when and satisfy CI01 and ci01 [13, Proposition 1]. Therefore, we only have to prove that if satisfies CI03 then satisfies ci23, and that if satisfies ci03 then satisfies CI23.
Proof of CI02 $\Rightarrow$ ci2
Assume that and . Then, it follows from the definition of that or with , and . Note that the latter case implies that by CI1. Similarly, implies . Then, and by CI2. Then, and by definition of .
Proof of CI03 $\Rightarrow$ ci3
Assume that and . Then, and by the same reasoning as before, which imply and by CI3. Then, and by definition of .
Proof of ci02 $\Rightarrow$ CI2
Therefore, we have proven the result when . Assume as induction hypothesis that the result also holds when . Assume without loss of generality that . Let such that and .
Proof of ci03 $\Rightarrow$ CI3
Therefore, we have proven the result when . Assume as induction hypothesis that the result also holds when . Assume without loss of generality that . Let such that and .
The following lemma generalizes Lemma 1 by removing the assumptions about and .
Lemma 2.
Let denote a set of triplets. Then, , and . Let denote a set of elementary triplets. Then, , and .
Proof.
Lemma 1 implies that every set of triplets satisfying CI01/CI02/CI03 can be paired to a set of elementary triplets satisfying ci01/ci02/ci03, and vice versa. The lemma implies that the pairing is actually a bijection. Thanks to this bijection, we can use the set of elementary triplets to represent the set of triplets. This is in general a much more economical representation: If $|V| = n$, then there are up to $4^n$ triplets, whereas there are at most $n^2 \cdot 2^{n-2}$ elementary triplets. To see the former, note that a triplet can be represented as an $n$-tuple whose entries state whether the corresponding element is in the first, second, third or none of the sets of the triplet.
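The gap between the two representations can be checked by brute force for small $n$. The sketch below assigns each of the $n$ variables to one of four roles (first set, second set, third set, or none), which gives $4^n$ assignments, and counts those with exactly one variable in each of the first two sets. The function name `count_triplets` is hypothetical, and the count of $n(n-1) \cdot 2^{n-2}$ ordered elementary triplets is what the enumeration yields, which is below the bound $n^2 \cdot 2^{n-2}$.

```python
from itertools import product


def count_triplets(n):
    """Return (number of role assignments, number of elementary triplets)."""
    total = elementary = 0
    # Each variable goes to I, J, K or to none of the three sets ("-").
    for roles in product("IJK-", repeat=n):
        total += 1
        if roles.count("I") == 1 and roles.count("J") == 1:
            elementary += 1
    return total, elementary
```

For $n = 5$, this gives 1024 assignments against 160 elementary triplets, and the ratio between the two grows exponentially with $n$.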
Likewise, Lemma 2 implies that there is a bijection between the CI01/CI02/CI03 closures of sets of triplets and the ci01/ci02/ci03 closures of sets of elementary triplets. Thanks to this bijection, we can use the closure of the elementary triplets to represent the closure of the triplets. Note that the former is obtained by ci01/ci02/ci03 closing the set of elementary triplets that is obtained from the given set of triplets. So, there is no need to CI01/CI02/CI03 close the given set of triplets. Whether closing the elementary triplets can be done faster than closing the triplets on average is an open question. In the worst-case scenario, both imply applying the corresponding properties a number of times exponential in the number of random variables [14]. The following examples illustrate the savings in space that can be achieved by using the closure of the elementary triplets to represent the closure of the triplets.
Example 1.
This example is taken from [12]. Let . Let . The CI01, CI02 and CI03 closures of have the same 162 triplets. However, they can be represented in a more concise manner by their 82 elementary triplets.
Example 2.
This example will be used again later in this work. Let . Let . The CI01, CI02 and CI03 closures of have the same 218 triplets. However, they can be represented in a more concise manner by their 112 elementary triplets.
One may think that Lemmas 1 and 2 have theoretical interest but little practical interest, because one may have access to a set of triplets that is not closed under CI01/CI02/CI03 and, thus, its elementary triplet representation can only be obtained by first producing the CI01/CI02/CI03 closure of the set of triplets or as the ci01/ci02/ci03 closure of the corresponding set of elementary triplets. As mentioned above, the worst-case scenario for either alternative is computationally demanding. The complexity of the average case is unknown. However, we believe that Lemmas 1 and 2 are of practical interest when all one has access to is a probability distribution $p$, e.g. the empirical distribution derived from a sample. In that case, the independence model induced by $p$ can be represented by the elementary triplets $i \perp j \mid K$ such that $i \perp_p j \mid K$ holds. To see it, recall from Section 2 that the independence model induced by a probability distribution always satisfies the CI01 properties. Note that the process of finding the elementary triplets may be sped up by using the ci01 properties to derive elementary triplets from previously obtained elementary triplets, thereby avoiding checking some pairwise independences in $p$. One can instead use the ci02 or ci03 properties if it is known that $p$ is strictly positive or regular Gaussian. This speed-up is warranted by the fact that the elementary triplet representation must be closed under ci01/ci02/ci03 by Lemmas 1 and 2. For instance, having found that $i \perp_p j \mid K$ holds implies that $i \perp j \mid K$ must be in the representation of the independence model induced by $p$, which implies that so must $j \perp i \mid K$ by ci0. So, there is no need to check whether $j \perp_p i \mid K$ holds. This approach (without the speed-up sketched) of representing the independence model induced by a probability distribution with its elementary triplets has been instrumental in developing exact and assumption-free learning algorithms for chain graphs and acyclic directed mixed graphs [20, 22].
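The symmetry-based saving mentioned above can be sketched as follows. The code assumes a hypothetical independence oracle `indep(i, j, K)`, e.g. a statistical test on a sample or an exact check on a known distribution, and it queries the oracle once per unordered pair and conditioning set, adding the symmetric elementary triplet by ci0 without a second query. The function name `elementary_triplets` is illustrative, and the further savings that the ci1 property may bring are not implemented.

```python
from itertools import combinations


def elementary_triplets(variables, indep):
    """Collect elementary triplets (i, j, K) accepted by the oracle `indep`.

    Returns the set of triplets found and the number of oracle calls made.
    """
    variables = sorted(variables)
    found, calls = set(), 0
    for i, j in combinations(variables, 2):      # unordered pairs only
        rest = [v for v in variables if v not in (i, j)]
        for r in range(len(rest) + 1):
            for cond in combinations(rest, r):
                K = frozenset(cond)
                calls += 1
                if indep(i, j, K):
                    found.add((i, j, K))
                    found.add((j, i, K))         # by ci0, no extra oracle call
    return found, calls
```

With three variables, the loop makes six oracle calls instead of the twelve that checking every ordered pair would require.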
One may argue that there is no need to produce a concise representation of the independence model induced by $p$, such as the elementary triplet representation, since it takes time and storage space and it provides no additional information about $p$. However, some operations with independence models are not easy to perform without representing the independence models explicitly. For example, it is not clear to us how to compute the intersection of the independence models induced by two probability distributions without representing the independence models in any way whereas, as we will see in Section 4, this is a straightforward question to answer from their elementary triplet representations.
For simplicity, all the results in the sequel assume that and satisfy CI01/CI02/CI03 and ci01/ci02/ci03. Thanks to Lemma 2, these assumptions can be dropped by replacing , , and in the results below with , , and .
Let and . In order to decide whether , the definition of implies checking whether elementary triplets are in . The following lemma simplifies this when satisfies ci01, as it implies checking elementary triplets. When satisfies ci02 or ci03, the lemma simplifies the decision even further as the conditioning sets of the elementary triplets checked have all the same size or form.
Lemma 3.
Let denote a set of elementary triplets. Let for all and , for all and , and for all and . If satisfies ci01, then . If satisfies ci02, then . If satisfies ci03, then .
Proof.
Proof for ci01
It suffices to prove that because clearly . Assume that . Then, and by definition of . Then, and by ci1. Then, and by definition of . By repeating this reasoning, we can then conclude that for any permutation of the set . By following an analogous reasoning for instead of , we can then conclude that for any permutations and of the sets and . This implies the desired result by definition of .
Proof for ci02
It suffices to prove that because clearly . Note that satisfies CI02 by Lemma 1. Assume that .

and follow from and by definition of .

follows from by definition of .
By continuing with the reasoning above, we can conclude that . Moreover, by a reasoning similar to (1)-(4) and, thus, by an argument similar to (2). Moreover, by a reasoning similar to (1)-(4) and, thus, by an argument similar to (4). Continuing with this process gives the desired result.
Proof for ci03
It suffices to prove that because clearly . Note that satisfies CI03 by Lemma 1. Assume that .

and follow from and by definition of .

follows from by definition of .
By continuing with the reasoning above, we can conclude that . Moreover, by a reasoning similar to (5)-(8) and, thus, by an argument similar to (6). Moreover, by a reasoning similar to (5)-(8) and, thus, by an argument similar to (8). Continuing with this process gives the desired result. ∎
As mentioned in the introduction, another set of distinguished triplets in that can be used to represent it is the set of dominant triplets [2, 9, 12, 23]. The following lemma shows how to find these triplets with the help of .
Lemma 4.
Let denote a set of triplets. If satisfies CI01, then is a dominant triplet in if and only if and are two maximal sets such that for all and and, for all , and for some and . If satisfies CI02, then is a dominant triplet in if and only if and are two maximal sets such that for all and and, for all , and for some and . If satisfies CI03, then is a dominant triplet in if and only if and are two maximal sets such that for all and and, for all , and for some and .
Proof.
We prove the lemma when satisfies CI01. The other two cases can be proven in much the same way. To see the if part, note that by Lemmas 1 and 3. Moreover, assume to the contrary that there is a triplet that dominates . Consider the following two cases: and . In the first case, CI01 on implies that or with and . Assume the latter without loss of generality. Then, CI01 implies that for all and . This contradicts the maximality of . In the second case, CI01 on implies that or with . Assume the latter without loss of generality. Then, CI01 implies that for all , which contradicts the assumptions of the lemma.
To see the only if part, note that CI01 implies that for all and . Moreover, assume to the contrary that for some , for all or for all . Assume the latter without loss of generality. Then, by Lemmas 1 and 3, which implies that is not a dominant triplet in , which is a contradiction. Finally, note that and must be maximal sets satisfying the properties proven in this paragraph because, otherwise, the previous paragraph implies that there is a triplet in that dominates . ∎
A natural question to ponder is whether it is better to represent an independence model by its elementary or dominant triplets. In terms of storage space, it seems that the dominant triplet representation should be preferred. For instance, for the independence model in Example 1, there are 82 elementary triplets but only 12 dominant triplets and nine nonsymmetric dominant triplets [12]. For the independence model in Example 2, there are 112 elementary triplets but only two nonsymmetric dominant triplets, as we will see later. In terms of running time, the answer is less clear. As mentioned before, finding the elementary triplet representation for a given set of triplets implies producing the CI01/CI02/CI03 closure of the set of triplets or the ci01/ci02/ci03 closure of the corresponding set of elementary triplets. The average-case complexity of either alternative is unknown. The algorithms in [2, 9, 12, 23] for finding the dominant triplets are conceptually more involved but they could be faster than finding the elementary triplets. Performing an empirical comparison of the two alternatives is definitely an interesting research project. However, it is beyond the scope of this work. Moreover, the methods for finding dominant triplets take a set of triplets as input. It is not clear to us how to run them when all we have access to is a probability distribution $p$, e.g. the empirical distribution derived from a sample. As discussed before, finding the elementary triplet representation in that scenario is conceptually easy. Yet another dimension along which to compare elementary and dominant triplet representations is the operations that each alternative allows one to perform efficiently. For example, there is no method to our knowledge for computing the intersection of the CI01 closures of two sets of triplets when all we have is their dominant triplet representations whereas, as we will see in Section 4, this is a straightforward question to answer from their elementary triplet representations.
That is why we prefer to see elementary and dominant triplets as complementary rather than competing alternatives to represent independence models: Depending on the task at hand, one or the other may be preferred.