
# Categoroids: Universal Conditional Independence

Conditional independence has been widely used in AI, causal inference, machine learning, and statistics. We introduce categoroids, an algebraic structure for characterizing universal properties of conditional independence. Categoroids are defined as a hybrid of two categories: the first encodes a preordered lattice structure defined by objects and arrows between them; the second, a dual parameterization, involves trigonoidal objects and morphisms defining a conditional independence structure. Bridge morphisms provide the interface between the binary and ternary structures. We illustrate categoroids using three well-known examples of axiom sets: graphoids, integer-valued multisets, and separoids. Functoroids map one categoroid to another, preserving the relationships defined by all three types of arrows in the co-domain categoroid. We describe a natural transformation across functoroids, which is natural across regular objects and trigonoidal objects, to construct universal representations of conditional independence. We use adjunctions and monads between categoroids to abstractly characterize faithfulness of graphical and non-graphical representations of conditional independence.


## 1 Introduction

A categoroid is defined by a "flexible" join of two categories, one over the usual objects and binary arrows, and the other over trigonoidal objects, or triples, and their associated morphisms. (Footnote 1: Recall that a category is defined by a collection of objects and a collection of morphisms between them, where each object is equipped with an identity morphism and morphisms satisfy associative and compositional properties (MacLane, 1971).) Bridge arrows define the join between the two categories. Categoroids are used in this paper to analyze universal properties underlying conditional independence, illustrated by previous axiomatizations such as graphoids (Pearl, 1989, 2009), integer-valued multisets (Studeny, 2010; Matus, 1995), and separoids (Dawid, 2001, 2010). Categoroids are a type of category (MacLane, 1971) with a novel dual parameterization: one involves the usual unary collection of objects and the binary arrows mapping between them, while the other comprises a collection of trigonoidal morphisms that act over triples of objects, with bridge morphisms defining the join. The trigonoidal morphisms represent a proof system defined over a conditional independence axiom system: any conditional independence property that can be inferred using the axiom system can be inferred as a sequence of trigonoidal morphisms. Our primary interest is in developing a deeper understanding of how these two parameterizations interact. Figure 1 gives an overview of the different categoroids discussed in this paper. An arrow indicates that the categoroid in its co-domain is a special case of the categoroid in its domain. To keep the figure readable, not all relationships are shown.

Conditional independence structures have been actively studied in AI, causal inference, machine learning, probability, and statistics for many years.

Dawid (2001, 2010) defines separoids, built on a join semi-lattice, to formalize reasoning about conditional independence and irrelevance in many areas, including statistics. Pearl (1989) introduced graphoids, defined over a distributive lattice of disjoint subsets of variables, to model reasoning about irrelevance in probabilistic systems, and proposed representations using directed acyclic graphs (DAGs). Studeny (2010) proposed a lattice-theoretic model of conditional independence using integer-valued multisets to address the intrinsic limitations of DAG-based representations. The design of suitable graphical and non-graphical representations of conditional independence structures continues to be actively studied (Lauritzen and Richardson, 2002; Evans, 2018; Forré and Mooij, 2017; Amini et al., 2022; Heymann et al., 2020; Sadeghi, 2017).

To illustrate the main themes of this paper, consider a directed acyclic graph (DAG) as a representation of a set of conditional independence relations, where conditional independence is defined using the graph property of $d$-separation (Pearl, 1989). A given DAG can be characterized in two ways. The first parameterization specifies the DAG in terms of its vertices and edges, which corresponds to specifying the objects and morphisms of a categoroid. The second parameterizes the DAG by its induced collection of conditional independence properties, as defined by $d$-separation. For example, the serial DAG over three variables, $a \to b \to c$, can be defined using its two edges $a \to b$ and $b \to c$, but also by its conditional independences, namely $a \perp c \mid b$, using the theory of $d$-separation. We are thus given two possibly redundant parameterizations of the same algebraic structure. However, multiple DAG models can define the same conditional independences. For example, the serial model $a \to b \to c$, the "diverging" model $a \leftarrow b \to c$, and the "reverse" serial model $a \leftarrow b \leftarrow c$ all capture the same conditional independence property $a \perp c \mid b$. (Footnote 2: This non-uniqueness arises because Bayes' rule can be used to reparameterize any one of these three DAGs into the form represented by one of the others.)

Categoroids are more general than previous graphical and non-graphical representations. A separoid (Dawid, 2001) defines a join semi-lattice $(S, \le)$, where the join operator $\vee$ over the semi-lattice defines the preorder $\le$, and a ternary relation $\perp$ is defined over triples of the form $x \perp y \mid z$ (interpreted to mean that $x$ is conditionally independent of $y$ given $z$). We can define a "separoid"-induced categoroid as a category in which a morphism $x \to y$ encodes the preorder relation $x \le y$. As pointed out by MacLane (1971), "lattice" objects can be constructed in any category. It is then possible to define an abstract "separoid" ternary relation in any category, because each regular morphism yields a bridge morphism into a trigonoidal object (over all objects in the category), abstracting from the property of separoids that a preorder relation implies a conditional independence statement. (Footnote 3: A notational remark: we often write the conditioning element in the middle of a triple when we wish to abstract from the somewhat loaded semantics associated with $\perp$ in the previous literature. This is similar to the graphoid notation of Pearl (1989), where inserting the conditioning element in the middle is suggestive of its "separating" action as defined by Dawid (2001).) Furthermore, trigonoidal morphisms can be defined that abstract the separoid axioms deriving a new conditional independence statement from an existing one together with a preorder relation. Other conditional independence properties in a separoid, for example the symmetry property stating that from $x \perp y \mid z$ we can infer $y \perp x \mid z$, can in turn be abstracted into a trigonoidal morphism between the corresponding triples.

Functoroids map from one categoroid to another, sending objects of the domain categoroid to objects of the co-domain categoroid and arrows to corresponding arrows but, uniquely, acting trigonoidally as well: trigonoids in the domain are mapped to corresponding trigonoids in the co-domain (and likewise for bridge arrows). To explore the mapping between a categoroid and its graphical or non-graphical representation, we use the theory of adjoint functors (MacLane, 1971; Richter, 2020; Riehl, 2017). We can illustrate this framework using integer-valued multisets (or imsets) (Studeny, 2010), which are another non-graphical approach to representing conditional independence. An imset is an integer-valued multiset function $u : \mathcal{P}(N) \to \mathbb{Z}$ from the power set of a set $N$ of variables to the integers $\mathbb{Z}$. An imset is defined over a distributive lattice of disjoint (or non-disjoint) subsets of variables drawn from the universe $N$ of all variables. A structural imset is one where the coefficients can be rational numbers; our analysis of imsets using adjoint functors can be extended to this case as well (footnote 4). A combinatorial imset is defined as:

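$$u = \sum_{A \subseteq N} c_A \, \delta_A$$

where $c_A$ is an integer, $\delta_A$ is the characteristic function for the subset $A$, and $A$ potentially ranges over all subsets of $N$. An elementary imset is defined over a triple $(a, b \mid C)$, where $a, b$ are singletons and $C \subseteq N \setminus \{a, b\}$. Bouckaert et al. (2010) explore linear programming methods for conditional independence inference using imsets. For a general DAG model $G$, an imset in standard form (Studeny, 2010) is defined as

$$u_G = \delta_V - \delta_\emptyset + \sum_{i \in V} \left( \delta_{\mathbf{Pa}_i} - \delta_{i \cup \mathbf{Pa}_i} \right)$$

where $\mathbf{Pa}_i$ denotes the parents of variable $i$ in the DAG. The three canonical DAGs over the variables $\{a, b, c\}$, namely the serial DAG $a \to b \to c$, the reverse serial DAG $a \leftarrow b \leftarrow c$, and the diverger DAG $a \leftarrow b \to c$, all yield the same imset representation (see Figure 2):

$$u_G = \delta_{abc} - \delta_{ab} - \delta_{bc} + \delta_b$$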
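The claim that the three DAGs share one standard imset can be checked mechanically. The following is a minimal sketch, assuming a DAG is given as a dictionary from each vertex to its set of parents; the function name `standard_imset` and this encoding are illustrative choices, not notation from Studeny (2010). The collider $a \to b \leftarrow c$ is included only as a contrasting case.

```python
from collections import Counter

def standard_imset(parents):
    """Standard imset u_G of a DAG given as {vertex: set_of_parents}."""
    vertices = frozenset(parents)
    u = Counter()
    u[vertices] += 1            # + delta_V
    u[frozenset()] -= 1         # - delta_emptyset
    for i, pa in parents.items():
        pa = frozenset(pa)
        u[pa] += 1              # + delta_{Pa_i}
        u[pa | {i}] -= 1        # - delta_{i union Pa_i}
    # Drop zero coefficients so that equal imsets compare equal as dicts.
    return {s: c for s, c in u.items() if c != 0}

serial   = {"a": set(), "b": {"a"}, "c": {"b"}}       # a -> b -> c
reverse  = {"c": set(), "b": {"c"}, "a": {"b"}}       # a <- b <- c
diverger = {"b": set(), "a": {"b"}, "c": {"b"}}       # a <- b -> c
collider = {"a": set(), "c": set(), "b": {"a", "c"}}  # a -> b <- c

# delta_abc - delta_ab - delta_bc + delta_b
expected = {
    frozenset("abc"): 1,
    frozenset("ab"): -1,
    frozenset("bc"): -1,
    frozenset("b"): 1,
}
```

Here `standard_imset(serial)`, `standard_imset(reverse)`, and `standard_imset(diverger)` all equal `expected`, while the collider yields a different imset, consistent with its different independence structure.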

We define adjunctions between functoroids to explore the mapping between the "free" objects associated with a categoroid and a "forgetful" functoroid that maps a given graphical or non-graphical representation back to a categoroid. The family of adjoint functors defined by such "free" and "forgetful" functors forms an important aspect of many constructions in category theory (MacLane, 1971; Riehl, 2017; Richter, 2020). Viewed in this light, imsets and other graphical representations of conditional independence structures emerge as illustrative examples of adjoint functoroids between categoroids. Similarly, we can study graphoids and the entire panoply of graph-based representations, including DAGs (Pearl, 1989), marginalized DAGs (Evans, 2018), hyperedge-directed graphs (HEDGes) (Forré and Mooij, 2017), chain graphs (Lauritzen and Richardson, 2002), and mixed graphs (Sadeghi, 2017), in terms of the theory of adjoint functoroids between categoroids. For example, we can define a "forgetful" functoroid from a DAG model to a universal categoroid representing its conditional independence property. Using adjunctions on categoroids, we can view the left adjoint of the "forgetful" functoroid as mapping from the "free" object in the conditional independence model, which generates all three DAG structures associated with it.

To explain the concept of adjunctions between free and forgetful functors in a bit more detail, we can analyze imsets in terms of adjunctions defined by a pair of functors between a categoroid and the category of left $\mathbf{k}$-modules over a commutative ring $\mathbf{k}$ (Riehl, 2017). The free left $\mathbf{k}$-module $\mathbf{k}[S]$ generated by a set $S$ consists of the formal finite sums

$$\sum_{x \in S} z_x \, x$$

where the coefficients $z_x$ are elements of the ring $\mathbf{k}$ (which, as in the case of imsets, can be the integers $\mathbb{Z}$ or the rationals $\mathbb{Q}$). A "forgetful" functor defines the right adjoint from the left $\mathbf{k}$-module $\mathbf{k}[S]$ to the set $S$ (where the Abelian group structure is "thrown away" in the category Sets, of which $S$ is an object). The "free" functor maps the set $S$ back to the left $\mathbf{k}$-module $\mathbf{k}[S]$. We generalize the imset construction, developing a more general theory of left $\mathbf{k}$-module representations of any categoroid, building on recent advances in mathematics on Gröbner representations of combinatorial categories (Sam and Snowden, 2016), as explained below.
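A minimal sketch of the free construction, assuming the ring is the integers and representing an element of $\mathbf{k}[S]$ as a dictionary from generators to coefficients. The helper names `unit`, `add`, `scale`, and `extend_linearly` are hypothetical and chosen for illustration. The sketch shows the universal property behind the adjunction: a function defined on generators extends in exactly one way to a linear map on formal sums.

```python
def unit(x):
    """Embed a generator x of S into Z[S] as the formal sum 1*x."""
    return {x: 1}

def add(u, v):
    """Add two formal sums, dropping zero coefficients."""
    out = dict(u)
    for x, c in v.items():
        out[x] = out.get(x, 0) + c
    return {x: c for x, c in out.items() if c != 0}

def scale(k, u):
    """Multiply a formal sum by an integer k."""
    return {x: k * c for x, c in u.items() if k * c != 0}

def extend_linearly(phi):
    """Extend phi: S -> Z[S'] to the unique linear map Z[S] -> Z[S']."""
    def phi_bar(u):
        out = {}
        for x, c in u.items():
            out = add(out, scale(c, phi(x)))
        return out
    return phi_bar

# Hypothetical example: S = {"a", "b"}; phi sends every generator to "p".
phi = lambda x: unit("p")
phi_bar = extend_linearly(phi)
u = add(scale(2, unit("a")), scale(-3, unit("b")))   # the formal sum 2a - 3b
```

In this example `phi_bar(u)` is the formal sum $2p - 3p = -p$, and `phi_bar` agrees with `phi` on every generator.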

A categoroid is defined by a unary collection of objects and a collection of binary arrows between pairs of objects, but crucially, and perhaps uniquely, it also includes a collection of ternary structures called trigonoids, defined over triples of objects, together with their associated morphisms, which represent the inference of new conditional independence properties. Trigonoidal objects and their morphisms act as the carrier of conditional independence in a categoroid. The axiomatic inference of new conditional independence properties has been extensively explored in previous work on graphoids, imsets, and separoids, among others; what is perhaps unique to our paper is the conceptualization of this process as a type of proof theory in a categoroid involving the composition of trigonoidal morphisms. Category theory has long been understood as a way to conceptualize inference rules in logic, where assertions are objects and proofs are morphisms (Baez and Stay, 2010). As an example, a separoid is a semi-lattice with a join operator, augmented with a ternary relation defined over triples of objects. The term $x \perp y \mid z$ is typically read as "$x$ is conditionally independent of $y$ given $z$". A common use of conditional independence is to gate the transmission of information from an object $x$ to an object $y$ by conditioning on a third object $z$, which eliminates all paths that carry information from $x$ to $y$. Similarly, if we interpret conditional independence as a statement about "irrelevance", then we might assert that $x \perp y \mid z$ if $y \le z$. This assertion can be interpreted as stating that if, "knowing" $z$, the element $z$ is itself irrelevant to $x$, then we can also surmise that any "weaker" $y$ is also irrelevant to $x$ when $z$ is known. The properties of separoids can be viewed as morphisms that map a triple of objects, for example $x \perp y \mid z$, into a new triple, here $y \perp x \mid z$, reflecting a symmetry principle. A fundamental aspect of conditional independence structures is that they all combine an associative algebraic structure with a ternary relation over triples of objects. Trigonoids formulate these ternary relations using the actions of trigonoidal morphisms. Unlike previous approaches, categoroids make no assumptions regarding the finiteness of conditional independence axiomatizations. A collection of trigonoids need not be finite, and can capture non-finite axiomatizations (Studeny, 2010).

Generalizing from previous axiomatizations of conditional independence, in a categoroid we interpret an arrow $x \to y$ as defining a (non-unique) preorder relation $x \le y$. We can conceptualize a categoroid as inducing two subcategories, one modeling the regular morphisms and the other representing the trigonoidal morphisms over triples of objects. These structures interact in interesting ways, however: while categoroids can be viewed as a join of two categories, it is their interaction that makes categoroids (and previous axiomatizations) distinctive. We can abstractly define a "forgetful" functor that converts a categoroid into a category by discarding one of the two parameterizations, which corresponds to projecting a separoid onto one of its two parameterizations. The collection of morphisms out of an object $x$ in a category $C$ is typically denoted $\mathrm{Hom}_C(x, -)$, or more succinctly $C(x, -)$, and constitutes a set-valued functor called a co-presheaf (MacLane and Moerdijk, 1994). We include in the presheaf all bridge and trigonoidal morphisms "out" of a trigonoidal object containing $x$. For each regular morphism, we define a collection of trigonoidal morphisms that are induced by a given axiomatization of conditional independence. For example, the axioms of separoids permit inferring a conditional independence statement from a preorder relation. Using a further property of conditional independence, an entire (possibly non-finite) collection of trigonoidal morphisms can be defined from a single regular morphism. This process generalizes the case in separoids where a conditional independence statement, together with a preorder relation, yields a new conditional independence statement. Similarly, a "basic" separoid is defined by a condition under which a conditional independence statement implies a preorder relation, leading to a notion of a "basic" categoroid as one where a trigonoidal object can be used to define a regular morphism. In a DAG model, the co-presheaf contains all the directed paths entering a vertex in a DAG.

Using the property that imsets are essentially functions on the distributive lattice of sets, which is a partially ordered set (poset), we use the theory of Möbius inversion (Lewis, 1972) to decompose each imset into a convolution of the zeta function $\zeta$ (with $\zeta(x, y) = 1$ for all $x \le y$ in a poset with finite intervals) and the Möbius function $\mu$. This reveals the connection between imsets and categoroids, showing how each imset can be expressed in terms of the arrow structure of a categoroid. In particular, for the imset representation, the presheaf is defined by the Möbius inversion of an imset, which represents the imset in terms of a function defined over all morphisms entering the lattice element defined by the subset of all variables. Thus, an important theme of our paper is that category-theoretic notions, such as adjunctions, functors, and presheaves, provide an abstract language for analyzing previous axiom systems of conditional independence.
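To make the zeta and Möbius functions concrete, here is a minimal sketch on the lattice of subsets of a small variable set. It uses the standard fact that on a subset lattice the Möbius function is $\mu(B, A) = (-1)^{|A| - |B|}$ for $B \subseteq A$. The function names are illustrative, and the sketch shows only the inversion itself, not the paper's presheaf construction.

```python
from itertools import combinations

def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def zeta_transform(g, universe):
    """f(A) = sum of g(B) over all B contained in A (convolution with zeta)."""
    return {A: sum(g.get(B, 0) for B in subsets(A)) for A in subsets(universe)}

def mobius_inverse(f, universe):
    """g(A) = sum over B contained in A of (-1)^(|A| - |B|) * f(B)."""
    return {A: sum((-1) ** (len(A) - len(B)) * f[B] for B in subsets(A))
            for A in subsets(universe)}

# The imset delta_abc - delta_ab - delta_bc + delta_b from the introduction.
u = {frozenset("abc"): 1, frozenset("ab"): -1,
     frozenset("bc"): -1, frozenset("b"): 1}

f = zeta_transform(u, "abc")          # summed over the down-set of each subset
recovered = mobius_inverse(f, "abc")  # Moebius inversion gives back u
```

Applying `mobius_inverse` to the zeta transform of `u` recovers the original coefficients, with zeros on every subset where `u` has no term.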

It is possible to combine categoroids in interesting ways. Given a collection of separoid relations, and generalizing the construction given by Dawid (2001), we can define an induced categoroid over a collection of trigonoidal relations, each defined by a collection of trigonoidal morphisms. In the case of separoids, each relation is defined as a set of triples $(x, y, z)$, and the set of all such relations itself forms a lattice, where two relations can be combined using joins (unions) or meets (intersections). Generalizing from these ideas in separoids, we can define corresponding structures in categoroids. As MacLane (1971) pointed out, it is easy to construct "lattice" objects in any category in terms of its arrows. Thus, we can construct "separoid"-like structures in any category as well, which generates a rich variety of possible categoroids. For example, we can define topoids as categoroids that satisfy the axioms of a topos, namely they have all finite colimits and limits, exponential objects, and a subobject classifier, but additionally admit a dual parameterization in terms of trigonoidal objects and their morphisms. To this end, we prove a novel variant of the Yoneda Lemma, which provides a faithful and full embedding of any categoroid in the category of sets. We construct the Yoneda embedding using Hom over all types of morphisms, defining the functoroid category of co-presheaves on a categoroid. Co-presheaves over a categoroid define a topoid, and the density theorem for sheaves (MacLane and Moerdijk, 1994) extends to categoroids as well, providing a way to construct abstract diagrams for categoroids. Specifically, abstract categoroid diagrams can be seen as functoroids from some finite indexing category of diagrams. The density theorem for categoroids shows that every presheaf in a categoroid can be represented in a canonical way as the co-limit of a set of representable presheaves. This result has many applications, including Universal Causality (Mahadevan, 2022).

Many other examples of categoroid construction are possible, such as braided categoroids (Joyal et al., 1996) and symmetric monoidal categoroids, which have applications in modeling complex compositional systems (Fong and Spivak, 2018). Andersson et al. (1997) study lattice conditional independence models, defined over a ring of subsets of a set of discrete elements. These models arise naturally in investigating missing data generated from a multivariate normal distribution, where the pattern of missingness forms a lattice. These models are also equivalent to transitive DAG models, which can naturally be viewed as a category, and they can be parameterized using the order ideals of a distributive lattice. A classic theorem by Hibi (1987) shows that a reduced Gröbner basis can be defined over ideals of a poset, which can be applied to construct a toric variety over graphical models, such as Conjunctive Bayesian Networks (Beerenwinkel et al., 2007), log-linear statistical models, and Markov random fields (Geiger et al., 2006). This approach can be generalized by defining modular representations of categoroids based on recent work on Gröbner categories (Sam and Snowden, 2016). The key idea underlying this approach is to identify conditions under which a modular representation is noetherian, that is, it satisfies an ascending chain condition on a partially ordered set, and thereby can give rise to Gröbner bases. At a high level, we are given a set-valued functor $F$, which maps a categoroid into the category of sets. Each object $x$ is therefore mapped to a set $F(x)$. Using the representation theory of left $\mathbf{k}$-modules over a ring, a standard approach in mathematics for studying many algebraic structures, we can define the left $\mathbf{k}$-module $\mathbf{k}[F(x)]$, the free $\mathbf{k}$-module on the set $F(x)$. This approach can be seen as a generalization of integer-valued multisets (Studeny, 2010), which, as described above, have been used to study conditional independence structures.

## 2 Categoroids

We define categoroids in this section, and relate them to previous axiomatizations of conditional independence, such as graphoids (Pearl, 1989), imsets (Studeny, 2010), and separoids (Dawid, 2001). Categoroids build on the formalism of category theory (MacLane, 1971), and we introduce the terminology as needed. A simple way to understand a category is as a quiver: a directed graph with a (possibly infinite) number of directed edges between any two objects, representing arrows (also referred to as morphisms). It is possible to interpret any directed graph as a "free" category, where the vertices define the objects, and the set of all (finite or non-finite) paths between a pair of vertices defines its morphisms. Succinctly, we can define a category as a collection of objects and a collection $A$ of arrows, with two functions that assign to each arrow its "head" and its "tail". The arrows compose in the usual manner, that is, $g \circ f$ is the composition of $f$ and $g$ whenever the co-domain of $f$ equals the domain of $g$. Each object is associated with a distinguished self-loop called its identity arrow. The category with one object and one (identity) morphism is denoted $\mathbf{1}$. An initial object in a category is one that has exactly one morphism from it to every object in the category (including itself). A terminal object is one that receives exactly one morphism from every object in the category (including itself).

As much of our discussion below will involve lattices, it is useful to relate these definitions to lattices and partially ordered sets. Each element of a lattice $L$ defines an object in the category. The morphisms in the category are defined by the preorder (or partial order) associated with the (semi-)lattice (or poset). Conversely, any category induces a (reflexive, transitive) preorder by defining $x \le y$ if there is a morphism $x \to y$. The initial element in the lattice is the bottom element, often denoted $\bot$ (or $\emptyset$ for lattices defined over subsets). The final element in a lattice is denoted $\top$ (or is the entire universe of objects, e.g., the set of vertices in a graphical model).
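The passage from a quiver to its induced preorder can be sketched in a few lines. This is a minimal illustration, assuming a finite quiver given as a list of (tail, head) pairs; the function name `induced_preorder` and the string labels for lattice elements are hypothetical. An object $x$ is below $y$ exactly when some path (a morphism of the free category) leads from $x$ to $y$.

```python
def induced_preorder(objects, arrows):
    """Pairs (x, y) such that the free category on the quiver has a morphism x -> y."""
    leq = {(x, x) for x in objects}   # identity arrows give reflexivity
    leq |= set(arrows)                # generating arrows
    changed = True
    while changed:                    # close under composition (transitivity)
        changed = False
        for (x, y) in list(leq):
            for (y2, z) in list(leq):
                if y == y2 and (x, z) not in leq:
                    leq.add((x, z))
                    changed = True
    return leq

# Hasse diagram of the subsets of {a, b}; "0" stands for the empty set.
objects = ["0", "a", "b", "ab"]
arrows = [("0", "a"), ("0", "b"), ("a", "ab"), ("b", "ab")]
leq = induced_preorder(objects, arrows)
```

In this example the bottom element `"0"` is below every object and the top element `"ab"` is above every object, matching the initial and final elements described above.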

Given two objects in a category, there can be a non-finite number of arrows between them. For example, consider the category Top of all topological spaces. Each object is a set $X$ together with a collection of open sets closed under finite intersections and arbitrary unions. An arrow between two topological spaces is a continuous function $f : X \to Y$ that maps each element $x \in X$ to a corresponding element $f(x) \in Y$, such that the preimage $f^{-1}(U)$ of any open set $U \subseteq Y$ is an open set in $X$. In general, there can be a non-finite number of such continuous functions between any two topological spaces. A category is called locally small if there is only a set of arrows between any two objects (note that the set need not be finite). In our paper, we assume all categories are locally small. In addition, for some of the results, it may be useful to restrict the category to a finite number of objects and a finite number of morphisms between any pair of objects, as in the application to graphical models over a finite number of variables. For simplicity, we will endeavor to keep the category-theoretic formalism at an elementary level. The definition of categoroids proposed below can be enhanced using more elaborate constructions, such as those given by Fong and Spivak (2018), including braided or symmetric monoidal categories, decorated cospans, and operads. We leave these embellishments to a future paper.

###### Definition 1.

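A conditional independence quiver is a set $O$ of objects, a set $A$ of arrows, a set $T$ of trigonoidal arrows, and two sets $B_0$ and $B_1$ of bridge arrows, along with pairs of functions $\partial^A_0$ and $\partial^A_1$, $\partial^T_0$ and $\partial^T_1$, $\partial^{B_0}_0$ and $\partial^{B_0}_1$, and finally $\partial^{B_1}_0$ and $\partial^{B_1}_1$, specifying the domain ("head") and co-domain ("tail") of each type of arrow, respectively. (Footnote 5: We do not assume that conditional independence quivers are finite unless explicitly stated in the paper; it is possible to have an infinite set of arrows of any type between the appropriate object types.) We define the composable pairs of arrows and trigonoids as the sets:

$$\begin{aligned}
A \times_0 A &= \{\langle g, f \rangle \mid g, f \in A \ \text{and}\ \partial^A_0(g) = \partial^A_1(f)\} && (1) \\
T \times_0 T &= \{\langle g, f \rangle \mid g, f \in T \ \text{and}\ \partial^T_0(g) = \partial^T_1(f)\} && (2) \\
B_1 \times_0 B_0 &= \{\langle g, f \rangle \mid g \in B_1, f \in B_0 \ \text{and}\ \partial^{B_1}_0(g) = \partial^{B_0}_1(f)\} && (3) \\
B_0 \times_0 B_1 &= \{\langle g, f \rangle \mid g \in B_0, f \in B_1 \ \text{and}\ \partial^{B_0}_0(g) = \partial^{B_1}_1(f)\} && (4)
\end{aligned}$$

or in words, the composition $g \circ f$ is possible when the "domain" of $g$ is equal to the "range" (or co-domain) of $f$, for all types of arrows. Note in particular that $B_1$ arrows can only combine with $B_0$ arrows and, vice versa, $B_0$ arrows can only combine with $B_1$ arrows. A categoroid is the free category over the conditional independence quiver with these additional functions, under the quotient equivalence class defined by ignoring the distinction between primitive morphisms and compositions of morphisms:

$$\begin{aligned}
O &\xrightarrow{\ \mathrm{id}\ } A && (5) \\
A \times_0 A &\xrightarrow{\ \circ\ } A && (6) \\
T \times_0 T &\xrightarrow{\ \circ\ } T && (7) \\
B_1 \times_0 B_0 &\xrightarrow{\ \circ\ } B_0 && (8) \\
B_0 \times_0 B_1 &\xrightarrow{\ \circ\ } B_1 && (9)
\end{aligned}$$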

Lurie (2006) defines the join of two categories as one where the objects are the disjoint union of the objects of each, and the morphisms between two objects are those of the first category if both objects come from it, and those of the second category if both objects come from the second. He defines the join over pairs of objects across the two categories differently than we do above. In our case, the bridge morphisms serve as the interface between the two categories whose join defines a categoroid.

### 2.1 Separoids, Integer-valued Multisets and Graphoids

We illustrate the definition of categoroids above using three well-known axiomatizations of conditional independence: separoids (Dawid, 2001), graphoids (Pearl, 1989), and imsets (Studeny, 2010).

#### 2.1.1 Separoids are Categoroids over Lattices

###### Theorem 1.

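A separoid (Dawid, 2001) defines a categoroid over a preordered set $(S, \le)$, that is, $\le$ is reflexive and transitive, equipped with a ternary relation $\perp$ on triples $(x, y, z)$, where $x, y, z \in S$, that satisfies the following properties:

• S1: $(S, \le)$ is a join semi-lattice.

• P1: $x \perp y \mid x$

• P2: $x \perp y \mid z \Rightarrow y \perp x \mid z$

• P3: $x \perp y \mid z$ and $w \le y \Rightarrow x \perp w \mid z$

• P4: $x \perp y \mid z$ and $w \le y \Rightarrow x \perp y \mid (z \vee w)$

• P5: $x \perp y \mid z$ and $x \perp w \mid (y \vee z) \Rightarrow x \perp (y \vee w) \mid z$

A strong separoid also defines a categoroid. A strong separoid is defined over a lattice, which has a meet operation $\wedge$ in addition to a join $\vee$, and satisfies an additional axiom:

• P6: If $z \le y$ and $w \le y$, then $x \perp y \mid z$ and $x \perp y \mid w \Rightarrow x \perp y \mid (z \wedge w)$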

Proof: The proof is quite straightforward, and involves showing how each of the elements in a separoid can be represented by an element of a categoroid.

• The set of objects of the categoroid is simply the set of elements of the preordered set $S$.

• The set of arrows is defined by the morphisms $x \to y$ whenever $x \le y$ in the preorder.

• The set of trigonoidal morphisms is defined using the properties P1 through P6 above. For example, the symmetry property P2 induces a trigonoidal morphism from the triple for $x \perp y \mid z$ to the triple for $y \perp x \mid z$. (Footnote 6: As a reminder, we are using the notation introduced by Pearl (1989) and placing the conditioning element in the middle of a triple to indicate its action as a "separoid" between the other two elements.)

• The first set of bridge arrows acts as a connection from the preorder relation to the conditional independence statement. Stated more generally, each such bridge arrow exists precisely when the corresponding preorder relation holds.

• Finally, the second set of bridge arrows acts as a connection from the conditional independence statement back to the preorder.
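As a sanity check on the kind of structure being translated, the separoid axioms can be verified by brute force on a small example. The sketch below is an illustration under stated assumptions: the axioms P1 to P6 are written out in the comments in the form given by Dawid (2001), the lattice is the power set of a three-element set ordered by inclusion, and the ternary relation "$x \perp y \mid z$ when $x \cap y \subseteq z$" is a simple set-theoretic example chosen here, not a construction from this paper. The names `powerset`, `overlap_indep`, and `check_separoid` are hypothetical.

```python
from itertools import combinations, product

def powerset(universe):
    """All subsets of the universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def overlap_indep(x, y, z):
    """Example relation: x is separated from y by z when x and y overlap only inside z."""
    return (x & y) <= z

def check_separoid(S, indep):
    """Brute-force check of P1-P6 with join = union, meet = intersection, order = inclusion."""
    for x, y, z, w in product(S, repeat=4):
        if not indep(x, y, x):                                   # P1
            return False
        if indep(x, y, z) and not indep(y, x, z):                # P2 (symmetry)
            return False
        if indep(x, y, z) and w <= y:
            if not indep(x, w, z):                               # P3
                return False
            if not indep(x, y, z | w):                           # P4
                return False
        if indep(x, y, z) and indep(x, w, y | z) and not indep(x, y | w, z):
            return False                                         # P5
        if z <= y and w <= y and indep(x, y, z) and indep(x, y, w) \
                and not indep(x, y, z & w):                      # P6 (strong separoid)
            return False
    return True

S = powerset("abc")
is_separoid = check_separoid(S, overlap_indep)
```

Each axiom instance that passes this check corresponds, under the translation in the proof, to a trigonoidal or bridge morphism of the induced categoroid. A relation that never holds fails already at P1.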

To explain the construction above in more detail, our goal is to ensure that every inference of a property in a separoid is captured by an arrow of some type: from a regular arrow $x \to y$ corresponding to the preorder relation $x \le y$, to the trigonoidal arrow capturing the symmetry property, and finally to the two bridge arrows connecting the preorder relation with the ternary relation. To show that categoroids are more general than separoids, note that we can generalize the join operator in a semi-lattice to any coproduct in a categoroid, or even more generally, to a colimit or a Kan extension (MacLane, 1971). Coproducts, colimits, and Kan extensions are universal constructions in a category that assemble a diverse set of similar constructions into a unified structure. For example, the join operator in separoids is the same as the union of two (disjoint) sets in graphoids and imsets, and all of these define coproducts or colimits in a categoroid. The following commutative diagram captures the universal property underlying coproducts. A diagram is a standard construct in category theory, where objects are depicted by labeled vertices and morphisms by labeled edges.

\begin{tikzcd}
& Z \arrow[r, "p"] \arrow[d, "q"] & X \arrow[d, "f"] \arrow[ddr, bend left, "h"] & \\
& Y \arrow[r, "g"] \arrow[drr, bend right, "i"] & X \sqcup Y \arrow[dr, "r"] & \\
& & & R
\end{tikzcd}

In the commutative diagram above, the coproduct object $X \sqcup Y$ uniquely factorizes any arrow $h : X \to R$ and any arrow $i : Y \to R$, so that $h = r \circ f$ and, furthermore, $i = r \circ g$. Coproducts are themselves special cases of the more general notion of colimits, which will be defined below. The object $X \sqcup Y$ is called a universal element (Riehl, 2017) because it represents the universal property of coproducts. Thus, defining a categoroid using coproducts generalizes the use of joins in separoids. The separoid axioms can then be generalized as well, replacing each occurrence of the join above with the coproduct operator.
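The universal property of the coproduct can be illustrated in the category of sets, where the coproduct is the disjoint union. The following minimal sketch reuses the arrow names $f$, $g$, $h$, $i$, and $r$ from the diagram; the tagging scheme and the function names `coproduct` and `copair` are illustrative assumptions.

```python
def coproduct(X, Y):
    """Disjoint union of X and Y as tagged pairs, with its two injections."""
    XY = {(0, x) for x in X} | {(1, y) for y in Y}
    f = lambda x: (0, x)    # injection f: X -> X ⊔ Y
    g = lambda y: (1, y)    # injection g: Y -> X ⊔ Y
    return XY, f, g

def copair(h, i):
    """The unique map r: X ⊔ Y -> R with r∘f = h and r∘g = i."""
    def r(tagged):
        tag, value = tagged
        return h(value) if tag == 0 else i(value)
    return r

# Hypothetical example: X and Y overlap, but the tags keep the copies apart.
X, Y = {1, 2}, {1, 3}
XY, f, g = coproduct(X, Y)
h = lambda x: x * 10        # some arrow h: X -> R
i = lambda y: -y            # some arrow i: Y -> R
r = copair(h, i)
```

Because every element of the disjoint union carries a tag, `r` is forced on each element, which is the uniqueness half of the universal property.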

### 2.2 Graphoids are Categoroids over Graphs

Pearl (1989) introduced graphoids, an axiomatization of irrelevance for studying causal and probabilistic reasoning over graphical models. Dawid (2001) shows that graphoids can be characterized as strong separoids on a distributive lattice having a minimal element and possessing relative complements, with the added condition that the join operation is defined on pairwise disjoint terms. We show more directly that semi-graphoids and graphoids define categoroids over a ring of disjoint subsets, where joins are defined by unions and meets are defined by intersections.

###### Theorem 2.

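Semi-graphoids and graphoids define categoroids. A graphoid $G$ is defined over a universe of (typically discrete) variables, where given any disjoint subsets of variables $X$, $Y$, $Z$, and $W$, the graphoid satisfies the following properties of the ternary relation $(X \perp Y \mid Z)$ (which is to be interpreted as "$X$ is independent of $Y$ given $Z$"):

• Symmetry: $(X \perp Y \mid Z) \Rightarrow (Y \perp X \mid Z)$

• Decomposition: $(X \perp Y \cup W \mid Z) \Rightarrow (X \perp Y \mid Z)$

• Weak Union: $(X \perp Y \cup W \mid Z) \Rightarrow (X \perp Y \mid Z \cup W)$

• Contraction: $(X \perp Y \mid Z)$ and $(X \perp W \mid Z \cup Y) \Rightarrow (X \perp Y \cup W \mid Z)$

• Intersection: For strictly positive distributions whose independence properties are being captured by the relation, the following additional property holds as well: $(X \perp Y \mid Z \cup W)$ and $(X \perp W \mid Z \cup Y) \Rightarrow (X \perp Y \cup W \mid Z)$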

Proof: The proof follows along the lines of the one given above for separoids. We note that the objects of the categoroid are defined as all subsets of a finite collection of variables. The binary arrows are defined using containment relationships between subsets. Each graphoid property above defines a trigonoidal morphism, following the approach used above for separoids. Similarly, the bridge morphisms can be defined analogously to the separoid case. It is worth explicitly noting how the trigonoids defined over these properties capture conditional independence over probability distributions. Specifically, if $X$ is independent of $Y \cup W$ given the empty set, that is, $(X \perp Y \cup W \mid \emptyset)$ holds, this is tantamount to asserting that

$$P(X = x, Y = y, W = w) = P(X = x)\, P(Y = y, W = w) \quad \forall x, y, w$$

That is, the joint distribution over $X$, $Y$, and $W$ factors into a product of two smaller distributions as specified above. Pearl (1989) explores many variants of these axioms for specific graph structures; for example, causal DAG models satisfy the following additional axioms:

• Weak Transitivity:

• Chordality:

Here, denote individual variables in the model. All of these variants can be easily represented as morphisms over trigonoidal objects in a categoroid. (Fong, 2012) shows how Bayesian networks and DAG models can be represented using symmetric monoidal categories, where the parents Pa of a variable , namely , in a collider DAG

can be “tensored" together as the composite object

using the symmetric monoidal structure of the category. Such symmetric monoidal categories can in turn be used to define symmetric monoidal categoroids by also including trigonoidal objects and morphisms capturing graphoid properties. Importantly, a wider class of graphical models can be defined using universal constructions. For example, a universal collider is defined as the universal construction of a pullback in category theory, where the morphisms $f$ and $g$ can represent arbitrary morphisms, in which case the universal element is the product element $X \times Y$.

Analogous to the universal construction of coproducts above, let us define the universal construction of a product in a category to show how it can be used to define more general types of universal graphical models. The commutative diagram of a limit is shown below, which can be viewed as a form of “universal collider":

\begin{tikzcd}
T \arrow[drr, bend left, "x"] \arrow[ddr, bend right, "y"] \arrow[dr, dotted, "r" description] & & \\
& X \times Y \arrow[r, "p"] \arrow[d, "q"] & X \arrow[d, "f"] \\
& Y \arrow[r, "g"] & Z
\end{tikzcd}

This diagram asserts that there is a “pullback” object labeled $X \times Y$ with morphisms $p$ and $q$, which can be viewed as a generalization of the canonical projections from a cartesian product to its components. Furthermore, the diagram asserts that given any morphism $x$ from an object $T$ to $X$, there is a unique way to factor that morphism through the product object, so that the diagram “commutes”, meaning the morphism $x = p \circ r$. Similarly, any morphism $y$ from $T$ to $Y$ is also uniquely factored through $X \times Y$, so that $y = q \circ r$. We have thus characterized the product object purely in terms of the morphisms into and out of the object. The use of universal constructions to define “universal causal models” is explored in greater detail in (Mahadevan, 2022). These constructions show how categoroids generalize graphoids.
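The universal property can be made concrete in the category of finite sets. The sketch below covers only the simplest case, a plain cartesian product (so the compatibility condition through Z is trivial), and checks by exhaustive enumeration that exactly one map r from T to X × Y makes both triangles commute. The sets, the maps `x` and `y`, and the helper `mediating` are hypothetical choices made for illustration.

```python
from itertools import product as cartesian

# Hypothetical finite sets standing in for the objects X, Y, and T.
X, Y, T = [0, 1], ["u", "v", "w"], ["s", "t"]
XY = list(cartesian(X, Y))            # the product object X × Y

p = lambda pair: pair[0]              # projection p: X × Y -> X
q = lambda pair: pair[1]              # projection q: X × Y -> Y

x = {"s": 0, "t": 1}                  # an arbitrary map x: T -> X
y = {"s": "w", "t": "u"}              # an arbitrary map y: T -> Y

def mediating(x, y):
    """The induced map r: T -> X × Y, sending t to the pair (x(t), y(t))."""
    return {t: (x[t], y[t]) for t in x}

r = mediating(x, y)

# Enumerate every function T -> X × Y and keep those making the diagram commute.
candidates = [dict(zip(T, images)) for images in cartesian(XY, repeat=len(T))]
commuting = [c for c in candidates
             if all(p(c[t]) == x[t] and q(c[t]) == y[t] for t in T)]

print(r)
print(len(candidates), len(commuting))  # many candidate maps, one that commutes
```

Uniqueness is what distinguishes the product from any other object that merely maps to both X and Y.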

#### 2.2.1 Integer Valued Multi-sets as Categoroids

Now, we turn to integer-valued multisets, or imsets, as described in (Studeny, 2010). Our goal is to explore the connections between imsets and categoroids. As mentioned earlier, an imset is defined as an integer-valued multiset function $u: \mathcal{P}(N) \to \mathbb{Z}$ from the power set of a finite set of variables $N$ to the integers $\mathbb{Z}$. An imset is defined over a partially ordered set (poset), defined as a distributive lattice of disjoint (or non-disjoint) subsets of variables. The bottom element is denoted $\emptyset$, and the top element represents the complete set of variables $N$. A full discussion of the probabilistic representations induced by imsets is given in (Studeny, 2010). We will only focus on the aspects of imsets that relate to its conditional independence structure, and its topological structure as defined by the poset. A combinatorial imset is defined as:

$$u = \sum_{A \subseteq N} c_A \, \delta_A$$

where $c_A$ is an integer, $\delta_A$ is the characteristic function for subset $A$, and $A$ potentially ranges over all subsets of $N$. An elementary imset is defined over a triple $\langle a, b \mid A \rangle$, where $a, b$ are singletons, and $A \subseteq N \setminus \{a, b\}$. A structural imset is defined as one where the coefficients can be rational numbers. For a general DAG model $G$, an imset in standard form (Studeny, 2010) is defined as

$$u_G = \delta_V - \delta_\emptyset + \sum_{i \in V} \left( \delta_{\mathbf{Pa}_i} - \delta_{\{i\} \cup \mathbf{Pa}_i} \right)$$

Figure 2 shows an example imset for DAG models over three variables, defined by an integer valued function over the lattice of subsets. Each of the three DAG models shown defines exactly the same imset function. Studeny (2010) gives a detailed analysis of imsets as a non-graphical representation of conditional independence. We construct a novel representation of imsets using the theory of Möbius inversion in this section, which we will then use to prove a theorem, showing that imsets define a special type of categoroid.
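The standard-form formula can be evaluated directly. The sketch below computes it for three Markov-equivalent DAGs over the variables a, b, c (a chain, the reversed chain, and a fork), which are chosen for illustration and are not necessarily the DAGs drawn in Figure 2, and compares them with a collider. The function name `standard_imset` and the dictionary encoding of parent sets are assumptions.

```python
from collections import Counter

def standard_imset(parents):
    """Standard imset of a DAG given as {node: iterable of parents}.

    Implements u_G = delta_V - delta_empty + sum_i (delta_Pa(i) - delta_({i} ∪ Pa(i)))
    and returns a dict from frozensets to their nonzero integer coefficients.
    """
    V = frozenset(parents)
    u = Counter()
    u[V] += 1
    u[frozenset()] -= 1
    for i, pa in parents.items():
        pa = frozenset(pa)
        u[pa] += 1
        u[pa | {i}] -= 1
    return {A: c for A, c in u.items() if c != 0}

chain    = {"a": [], "b": ["a"], "c": ["b"]}    # a -> b -> c
reverse  = {"c": [], "b": ["c"], "a": ["b"]}    # a <- b <- c
fork     = {"b": [], "a": ["b"], "c": ["b"]}    # a <- b -> c
collider = {"a": [], "c": [], "b": ["a", "c"]}  # a -> b <- c

for name, dag in [("chain", chain), ("reverse", reverse),
                  ("fork", fork), ("collider", collider)]:
    imset = standard_imset(dag)
    print(name, {"".join(sorted(A)) or "∅": c for A, c in imset.items()})
```

The first three DAGs encode the same conditional independence statement and therefore receive the same imset, while the collider receives a different one.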

###### Definition 2.

The incidence algebra $\Lambda(P)$ over a poset $P$ is defined as the collection of all functions $f: P^2 \to \mathbb{R}$ that vanish on every pair $(x, y)$ with $x \not\le y$:

$$\Lambda(P) = \{ f: P^2 \to \mathbb{R} \mid f(x, y) = 0 \text{ whenever } x \not\le y \}$$

We will show that imsets, which are essentially functions on posets, can be decomposed using a convolution of more elementary incidence algebra functions on posets. This reformulation, which can be viewed as a generalized Fourier “change of basis" of imsets, uses the properties of Möbius inversions.

###### Definition 3.

The zeta function $\zeta$ on posets is defined as $\zeta(x, y) = 1$ whenever $x \le y$, and $\zeta(x, y) = 0$ when $x \not\le y$.

###### Definition 4.

The convolution of two functions $f, g \in \Lambda(P)$ on a poset $P$ is defined as:

$$(f \star g)(x, y) = \sum_{z:\, x \le z \le y} f(x, z)\, g(z, y)$$

As an example, it is easy to show that convolving any function $f$ with the $\zeta$ function produces a new function that sums $f(x, z)$ over all elements $z$ in the interval $x \le z \le y$. This relationship will be central to the construction below.

$$(f \star \zeta)(x, y) = \sum_{z:\, x \le z \le y} f(x, z)\, \zeta(z, y) = \sum_{z:\, x \le z \le y} f(x, z)$$

###### Definition 5.

The Möbius function $\mu$ is the convolutional inverse of the $\zeta$ function:

$$(\zeta \star \mu)(x, y) = \delta(x, y)$$

where $\delta(x, y) = 1$ if and only if $x = y$, and $\delta(x, y) = 0$ otherwise. It is both a left and a right inverse.
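The definitions of the zeta function, convolution, and the Möbius function can be checked on a small example. The sketch below uses the lattice of subsets of {1, 2, 3} ordered by inclusion, computes the Möbius function from the recursion implied by its being the inverse of zeta, and confirms both the inverse property and the known closed form on subset lattices. All function names here are illustrative assumptions.

```python
from functools import lru_cache
from itertools import combinations

def subsets(ground):
    """All subsets of a finite ground set, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

P = subsets({1, 2, 3})            # the poset: subsets ordered by inclusion

def leq(x, y):
    return x <= y                 # subset-or-equal on frozensets

def zeta(x, y):
    return 1 if leq(x, y) else 0

def delta(x, y):
    return 1 if x == y else 0

def convolve(f, g):
    """Incidence-algebra convolution: sum of f(x, z) * g(z, y) over x <= z <= y."""
    return lambda x, y: sum(f(x, z) * g(z, y)
                            for z in P if leq(x, z) and leq(z, y))

@lru_cache(maxsize=None)
def mobius(x, y):
    """Möbius function via mu(x, x) = 1 and mu(x, y) = -sum of mu(x, z) over x <= z < y."""
    if not leq(x, y):
        return 0
    if x == y:
        return 1
    return -sum(mobius(x, z) for z in P if leq(x, z) and z < y)

left = convolve(mobius, zeta)
right = convolve(zeta, mobius)
print(all(left(x, y) == delta(x, y) and right(x, y) == delta(x, y)
          for x in P for y in P))
# On a subset lattice the Möbius function has the closed form (-1)^{|y \ x|}.
print(all(mobius(x, y) == (-1) ** len(y - x) for x in P for y in P if leq(x, y)))
```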

The proof of the following result is standard in any combinatorics text, and will not be given.

###### Theorem 3.

Let $P$ be any poset, and let $e$ be a real-valued function on $P$. Suppose that $P$ has a unique minimal element $\bot$. Define the function $n$ on $P$ as follows, for any $a \in P$:

$$n(a) = \sum_{x \le a} e(x)$$

We can then use the Möbius function to “invert” the original function, and perform a change of basis:

$$e(a) = \sum_{x \le a} n(x)\, \mu(x, a)$$

If we define $f(\bot, a) = e(a)$, and $f = 0$ for all other undefined values, and $g(\bot, a) = n(a)$, where similarly $g = 0$ for all other undefined values, then it follows that:

$$g(\bot, a) = (f \star \zeta)(\bot, a)$$
$$e(a) = (g \star \mu)(\bot, a)$$
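The inversion formula can be verified numerically. In the sketch below, the function being inverted is, as an illustrative assumption, the standard imset of the chain a → b → c viewed as a function on the subsets of {a, b, c}; the Möbius function is taken in its closed form for subset lattices, $(-1)^{|a \setminus x|}$. The variable names mirror the symbols $e$, $n$, and $\mu$ of the theorem.

```python
from itertools import combinations

def subsets(ground):
    """All subsets of a finite ground set, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

P = subsets("abc")                       # bottom element: the empty set

# e: an integer-valued function on the poset (here, a chain-DAG imset; illustrative).
e = {A: 0 for A in P}
e[frozenset("abc")] = 1
e[frozenset("b")] = 1
e[frozenset("ab")] = -1
e[frozenset("bc")] = -1

def mu(x, a):
    """Closed-form Möbius function of the subset lattice, valid for x <= a."""
    return (-1) ** len(a - x)

# Forward transform: n(a) = sum of e(x) over x <= a.
n = {a: sum(e[x] for x in P if x <= a) for a in P}

# Möbius inversion: e(a) = sum of n(x) * mu(x, a) over x <= a.
recovered = {a: sum(n[x] * mu(x, a) for x in P if x <= a) for a in P}

print(recovered == e)
```

The forward transform accumulates values down each poset ideal, and the Möbius function undoes that accumulation exactly.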

The importance of this theorem is that it shows we can reformulate any integer-valued multiset function using the Möbius transform as a function defined over the incidence algebra of a poset. This change of basis will clarify the connection to the arrow structure of a categoroid. To the best of our knowledge, the following simple theorem is novel in the literature.

###### Theorem 4.

Any integer-valued multiset (imset) function over a poset can be written in terms of a convolution of the function on the poset .

Proof: As shown by Studeny (2010), any integer-valued multiset function can be written as the linear combination of a set of simpler multiset functions, involving elementary, combinatorial, or structural multiset functions, with the only change being the coefficient structure. As this does not affect our proof, we simply assume that the function is defined as a linear combination of elementary multiset functions, such as:

$$u = \sum_{A \subseteq N} c_A \, \delta_A$$

Note that as $\delta_A$, the characteristic function for a set $A$, is a function on the poset $P$, which is assumed to have a unique bottom element $\bot$, we can write it using Theorem 3 as:

$$u(A) = \sum_{B \subseteq A} n(B)\, \mu(B, A)$$

where the auxiliary function $n$ is defined as:

$$n(A) = \sum_{B \subseteq A} u(B)$$

Note that for the case of DAG models, such as the ones shown in Figure 2, the bottom element is the empty set $\emptyset$, which is the unique bottom element in the lattice of subsets over which the graphoid axioms, and hence the theory of imsets, are defined. In effect, this construction shows that any imset can be written as a pair of convolutions involving the Möbius function $\mu$ and the $\zeta$ function:

$$g(\bot, A) = (f \star \zeta)(\bot, A)$$
$$u(A) = (g \star \mu)(\bot, A)$$

The significance of this change of basis of the imset function is that it shows that each imset function can be written as a linear combination of functions defined as arrows in a categoroid. This statement is the basis of the next theorem.

###### Theorem 5.

Any integer-valued multiset (imset) function on a poset defines a categoroid .

Proof: As in the previous two cases of graphoids and separoids, the proof is quite straightforward, and involves showing how each of the elements in a categoroid can be defined using an imset:

• The set of objects in the categoroid is the set of elements of the poset $P$.

• The set of arrows is defined as the morphisms $a \to b$ if $a \le b$ in the poset. In particular, for posets defined over a lattice of subsets, as shown in Figure 2, the partial order is defined by the subset relation on the lattice.

• The set of trigonoidal morphisms is defined using the conditional inference rules that are implicit in the definition of imsets. Following the construction above, each imset defines a function on a poset ideal. This representation of each imset can then be transformed into a linear combination of morphisms, where each morphism is defined using the Möbius inversion of an imset. To capture inference rules involving conditional independence properties, we can, as before, include the axioms of (semi)graphoids as trigonoidal morphisms. To make this point clear, Lemma 6.1 in Studeny (2010) states that if $u$ and $v$ are structural imsets, then the conditional independence inference implication from $u$ to $v$ holds if and only if there exists $l \in \mathbb{N}$ such that $l \cdot u - v$ is a structural imset. Each such property can be encoded as a trigonoidal morphism in the categoroid.

• The set of bridge arrows act as a connection between the poset relation to the conditional independence statement . Stated more generally, each bridge arrow here is written as precisely when .

• Finally, the set of bridge arrows act as a connection between the conditional independence statement to the partial ordering . Stated in more general terms, . These can remain the same as was done for separoids and graphoids.

## 3 Functoroids and Natural Transformations over Categoroids

Our goal is to construct a universal representation of any conditional independence, for which we define functoroids and natural transformations over functoroids. As we show later, these notions play a vital role in defining universal representations of conditional independences. We use a modified form of the Yoneda Lemma (MacLane, 1971), which constructs a set-valued functor defined by the collection of all arrows (of all types) from an object, and uses a natural transformation from this set-valued functor to any other set-valued functor defined on a categoroid as a universal representation of that functor. In effect, we are constructing through the Yoneda Lemma a “universal simulator" of a conditional independence structure, such as a graphoid or separoid, or possibly even a non-finite parameterization, such as proposed by Studeny (2010). Every proof of a conditional independence property that is deduced in a separoid by invoking a sequence of rules from P1 through P6 above is simulated in the categoroid by using the trigonoidal morphisms.

###### Definition 6.

A functoroid is a mapping from one categoroid to another, comprising the following mappings:

• For each object , is the corresponding object in .

• For each arrow in , a corresponding arrow in

• For each trigonoidal arrow in , a corresponding trigonoidal arrow in .

• For each bridge arrow , a corresponding bridge arrow .

• Finally, for each bridge arrow , a corresponding bridge arrow .

It should be clear that functoroids can be composed easily, so that can be defined as the composite functoroid from to , by composing the functoroids and the functoroid , with the only caveat that bridge arrows are composed as defined earlier, in keeping with their asymmetric structures.

###### Definition 7.

The categoroid natural transformation $\eta$ between two functoroids $F, G: \mathcal{C} \to \mathcal{D}$ is defined component-wise as follows: for each object $a$ in $\mathcal{C}$, and each morphism $f: a \to b$ in $\mathcal{C}$, the following diagram commutes:

\begin{tikzcd}
F(a) \arrow[r, "\eta_a"] \arrow[d, "F(f)"'] & G(a) \arrow[d, "G(f)"] \\
F(b) \arrow[r, "\eta_b"'] & G(b)
\end{tikzcd}

To capture the action of the trigonoids, we define a corresponding natural transformation for each trigonoid as:

\begin{tikzcd}
(F(a), F(b), F(c)) \arrow[r, "\lambda_{abc}"] \arrow[d, "F(f)"'] & (G(a), G(b), G(c)) \arrow[d, "G(f)"] \\
(F(a'), F(b'), F(c')) \arrow[r, "\lambda_{a'b'c'}"'] & (G(a'), G(b'), G(c'))
\end{tikzcd}

Finally, to capture the action of the bridge arrows, we define a corresponding natural transformation for each as:

\begin{tikzcd}
(F(a), F(b), F(c)) \arrow[r, "\lambda^0_{abc}"] \arrow[d, "F(f)"'] & (G(a), G(b), G(c)) \arrow[d, "G(f)"] \\
(F(a'), F(b')) \arrow[r, "\lambda^0_{a'b'}"'] & (G(a'), G(b'))
\end{tikzcd}

and likewise, for each bridge arrow , we get:

\begin{tikzcd}
(F(a), F(b)) \arrow[r, "\lambda^1_{ab}"] \arrow[d, "F(f)"'] & (G(a), G(b)) \arrow[d, "G(f)"] \\
(F(a'), F(b'), F(c')) \arrow[r, "\lambda^1_{a'b'c'}"'] & (G(a'), G(b'), G(c'))
\end{tikzcd}

Natural transformations also compose between functoroids $F$, $G$, and $H$, all mapping categoroid $\mathcal{C}$ to $\mathcal{D}$. We can indicate the composed natural transformation abstractly as follows (where $\alpha$ and $\beta$ bundle together all the components of the natural transformation across normal, trigonoidal, and bridge arrows):

\begin{tikzcd}[row sep=huge]
\mathcal{C} \arrow[r, bend left=65, "F"{name=F}] \arrow[r, "G"{inner sep=0,fill=white,anchor=center,name=G}] \arrow[r, bend right=65, "H"{name=H, swap}] \arrow[from=F.south-|G,to=G,Rightarrow,shorten=2pt,"\alpha"] \arrow[from=G,to=H.north-|G,Rightarrow,shorten=2pt,"\beta"] & \mathcal{D}
\end{tikzcd}

## 4 Further Examples of Categoroids

Here are a few of the many examples of categoroid structures that have been proposed in the literature. Many of these are discussed in (Dawid, 2001). Witsenhausen (1975) proposed information fields, a measure-theoretic model of decision making that uses a lattice structure of sigma algebras to model decentralized decision-making in multi-agent systems. Heymann et al. (2020) apply Witsenhausen’s model to define a topological version of conditional independence. Mahadevan (2021) defines conditional independence using Alexandroff finite topological spaces, and introduces the idea of constructing homotopic equivalences among causal models.

• Modulo arithmetic: The categoroid defines modulo arithmetic, where the ternary relation is defined as , for . The integers form a semi-lattice, where the join represents least upper bound between two integers. The trigonoids are defined by the congruence relation between triples of numbers, so for example, . It is interesting to note that the möbius inversion procedure applies in this case to give an elegant characterization of the prime factorization property of integers.

• Finite Space Topological Categoroids: A categoroid can be defined over finite space topologies, where the trigonoids are defined as denoting topological conditional independence, as discussed in (Heymann et al., 2020; Mahadevan, 2021). The preorder relation can be defined over any (finite or not) topology in terms of containment over open sets. Topological conditional independence is defined using the notion of continuity of paths over finite (Alexandroff) topological spaces, where a path is continuous if it can be represented by a continuous function from the unit interval to . We give the details of this construction for finite Alexandroff spaces in the next section.

• Orthogonoids in Hilbert and Inner Product Spaces: (Dawid, 2001) defines orthogonoids in inner product and Hilbert spaces. A subspace is orthogonal independent of given , denoted if in some inner product space , the projection onto is lies in .

• Sigma Fields and Probability Spaces: Let denote a probability space, and let denote the lattice of sub-$\sigma$-fields of , ordered by inclusion. For , Dawid (2001) defines to denote that sigma fields and are (conditionally) independent, given (under the probability measure ). If is the trivial $\sigma$-field, then conditional independence becomes the usual property that is marginally independent of (under ). This notion of conditional independence can be seen as a basis of the work on causal information fields (Heymann et al., 2020), which is based on Witsenhausen’s definition of information fields (Witsenhausen, 1975), which are subfields of a product sigma algebra over a collection of decision variables.

• Graphoids: Pearl (1989) introduced graphoids, which can be viewed as a strong separoid over a distributive lattice that contains a minimal element, possesses relative complements, and restricts the join operation to pairwise disjoint terms.

• Graphical models: Various types of graphical models, from DAGs (Pearl, 1989) to marginalized DAGs (Evans, 2018), chain graphs (Lauritzen and Richardson, 2002), hyperedge-directed graphs (Forré and Mooij, 2017), and lattice conditional independence models (Andersson et al., 1996) can all be viewed as categoroids, where the conditional independence structure is encoded in the structure of the graph. Later we will formally define the relationship between graphical models and categoroids using adjunctions on categoroids.

• Co-Presheaves: Co-presheaves define the functoroid category Set. The Yoneda embedding of any categoroid into the categoroid Set yields a presheaf, where each object in the category is mapped to the functor Hom of the set of all (regular, bridge, and trigonoidal) morphisms into object . We will derive a novel form of the Yoneda Lemma for categoroids below, building on the definition of the categoroid natural transformation above.

• Commutative monoidal preorders over co-presheaves: A commutative monoidal preorder is a preorder $(P, \le)$ and a commutative monoid $(P, \otimes, 1)$ satisfying the property $x \otimes y \le x' \otimes y'$ whenever $x \le x'$ and $y \le y'$. Bradley et al. (2022) studied commutative monoidal preorders over co-presheaves applied to deep language models, where objects represent partial sentences in English and morphisms represent all possible completions.

• Gröbner categories: Sam and Snowden (2016) introduced Gröbner categories. In brief, in a Gröbner category, for each object , a representation is defined by as the free k module with basis Hom. The set of isomorphism classes of representations must admit an admissible order, and the poset representation induced by must be noetherian, that is, satisfy the ascending chain condition.

## 5 Universal Constructions in Categoroids

The principal aim of this paper is to elucidate universal properties of conditional independence in categoroids. We need to precisely define what is meant by the term “universal". We follow the definition given by Riehl (2017) below. We first need to define some terminology from category theory.

### 5.1 Covariant and Contravariant Functoroids

Our goal is to abstract from the previous representations of conditional independence to define universal representations of categoroids using a novel variant of the Yoneda Lemma. To do that, we need to introduce some preliminary background material for readers unfamiliar with some basic terminology. Category theory can be viewed as the “science of analogy". Instead of asking the question whether two objects are “equal", it instead poses the question of whether objects are isomorphic. The Yoneda Lemma shows how to construct universal representations of objects in a category, so that they are fully and faithfully embedded up to isomorphism in the category of sets.

###### Definition 8.

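Two objects $X$ and $Y$ in a categoroid $\mathcal{C}$ are deemed isomorphic, or $X \cong Y$, if and only if there is an invertible morphism $f: X \to Y$, namely $f$ is both left invertible using a morphism $g: Y \to X$ so that $g \circ f = \mathrm{id}_X$, and right invertible using a morphism $h: Y \to X$ where $f \circ h = \mathrm{id}_Y$.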

Functoroids come in two varieties.

###### Definition 9.

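A covariant functoroid $F: \mathcal{C} \to \mathcal{D}$ from categoroid $\mathcal{C}$ to categoroid $\mathcal{D}$ is defined as the following:

• An object $FC$ (sometimes written as $F(C)$) of the categoroid $\mathcal{D}$ for each object $C$ in categoroid $\mathcal{C}$.

• A regular arrow $F(f): F(C) \to F(C')$ in categoroid $\mathcal{D}$ for every arrow $f: C \to C'$ in categoroid $\mathcal{C}$.

• A trigonoidal arrow in categoroid $\mathcal{D}$ for every trigonoidal arrow in categoroid $\mathcal{C}$, where the domain and codomain now denote trigonoidal objects (triples).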

• A bridge arrow in categoroid for every bridge arrow in categoroid , respecting the type of bridge morphism (for example, if , then and , and if , then , and ).

• The preservation of identity and composition: and for any composable arrows of any type (keeping in mind that bridge arrows compose only with arrows, and vice versa).

###### Definition 10.

A contravariant functoroid from category to category is defined exactly like the covariant functoroid, except all the arrows are reversed. In the contravariant functoroid , every morphism is assigned the reverse morphism in category . Similarly, the trigonoidal arrows are also reversed. Particularly important is the way contravariance works for bridge arrows. The contravariant version of the arrow turns it into a arrow, and similarly, the contravariant version of the arrow turns it into a arrow (this follows from the asymmetry in their domain and range).

We introduce the following functoroids, which will prove of value in the proof of the Yoneda Lemma:

• For every object in a categoroid , there exists a covariant functoroid that assigns to each object in the set of morphisms , and to each morphism , the pushforward mapping . Similarly, for every trigonoidal object in , there is a covariant functoroid that assigns to each object in , the trigonoidal morphisms , and to each trigonoidal morphism the pushforward . Similarly, for every object in , there is a covariant functoroid that assigns the bridge morphisms , and to each bridge morphism the pushforward .

• For every object in a category , there exists a contravariant functoroid that assigns to each object in the set of morphisms Hom, and to each morphism , the pullback mapping Hom. Note how “contravariance" implies the morphisms in the original category are reversed through the functorial mapping, whereas in covariance, the morphisms are not flipped. Similarly, we need to define contravariant trigonoidal morphisms and bridge morphisms, following the above constructions, with the only proviso again noting that contravariant bridge morphisms of one type get converted into the other type, as noted earlier.

###### Definition 11.

Let $F$ be a functoroid from categoroid $\mathcal{C}$ to categoroid $\mathcal{D}$. If, for all arrows $f$ (regular, trigonoidal, and bridge), the mapping $f \mapsto F(f)$ is

• injective, then the functoroid is defined to be faithful.

• surjective, then the functoroid is defined to be full.

• bijective, then the functoroid is defined to be fully faithful.
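For intuition, fullness and faithfulness can be tested mechanically on small finite examples. The sketch below treats posets as thin categories (at most one arrow between any two objects) and a monotone map as a functor; it models only the regular arrows of a categoroid, not the trigonoidal or bridge arrows, and every name in it (`hom`, `classify`, the example posets) is an illustrative assumption.

```python
def hom(leq, x, y):
    """Hom-set of a thin category: one arrow (x, y) if x <= y, otherwise empty."""
    return [(x, y)] if leq(x, y) else []

def classify(C, leq_C, leq_D, F):
    """Return (faithful, full) for the monotone map F, given as a dict on objects of C."""
    faithful = full = True
    for x in C:
        for y in C:
            arrows = hom(leq_C, x, y)
            images = {(F[a], F[b]) for a, b in arrows}
            target = set(hom(leq_D, F[x], F[y]))
            assert images <= target, "F is not monotone, hence not a functor"
            faithful &= len(images) == len(arrows)   # injective on this hom-set
            full &= images == target                 # surjective on this hom-set
    return faithful, full

divides = lambda a, b: b % a == 0
at_most = lambda a, b: a <= b

C = [1, 2, 3, 6]                                # divisors of 6 under divisibility
num_prime_factors = {1: 0, 2: 1, 3: 1, 6: 2}    # monotone map into (integers, <=)
identity = {c: c for c in C}

print(classify(C, divides, at_most, num_prime_factors))  # (True, False)
print(classify(C, divides, divides, identity))           # (True, True)
```

The first map is faithful but not full: 2 and 3 are incomparable under divisibility, yet their images are related, so the arrow between the images has no preimage.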

### 5.2 Generalizing Joins to Co-limits

As illustrated earlier, we can generalize joins to co-products and co-limits, providing a way to generalize the notion of conditional independence based on join structures in separoids. As we defined co-products above, we turn to define colimits.

###### Definition 12.

Given a functoroid from an indexing diagram category to a categoroid , an element from the set of natural transformations is called a cone. A limit of the diagram is a cone from an object lim to the diagram satisfying the universal property that for any other cone from an object to the diagram, there is a unique morphism so that for all objects in . Dually, the co-limit of the diagram is a cone satisfying the universal property that for any other cone from the diagram to the object , there is a unique mapping so that for all objects in .
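In a poset viewed as a category, the colimit of a set of objects is their join (least upper bound) and the limit is their meet (greatest lower bound), which is the sense in which colimits generalize joins. The sketch below finds both purely from the universal property, using the divisors of 60 ordered by divisibility as a hypothetical example; the function names are assumptions.

```python
def colimit(P, leq, xs):
    """Join of xs in the poset (P, leq): a cocone that factors through every other cocone."""
    upper = [u for u in P if all(leq(x, u) for x in xs)]       # all cocones under xs
    least = [u for u in upper if all(leq(u, v) for v in upper)]
    return least[0] if least else None                         # None: no colimit exists

def limit(P, leq, xs):
    """Meet of xs in the poset (P, leq): a cone through which every other cone factors."""
    lower = [l for l in P if all(leq(l, x) for x in xs)]       # all cones over xs
    greatest = [l for l in lower if all(leq(m, l) for m in lower)]
    return greatest[0] if greatest else None

divides = lambda a, b: b % a == 0
P = [d for d in range(1, 61) if 60 % d == 0]   # divisors of 60

print(colimit(P, divides, [4, 6]))   # least common multiple
print(limit(P, divides, [4, 6]))     # greatest common divisor
print(colimit([1, 2, 3], divides, [2, 3]))  # no common upper bound in this poset
```

Uniqueness of the mediating morphism is automatic here because a poset has at most one arrow between any two objects.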

### 5.3 Kan Extensions over Categoroids

Kan extensions are among the most powerful universal constructions in category theory. MacLane (1971) stated it boldly as “Every concept is a Kan extension”: products and co-products, limits and co-limits, and even the Yoneda embedding can all be derived as special cases of the Kan extension. Kan extensions are usually defined over categories, but since they are stated in terms of natural transformations, they can readily be generalized to categoroids. Intuitively, a Kan extension approximates a functoroid $F$ so that its domain can be extended from a categoroid $\mathcal{C}$ to another categoroid $\mathcal{D}$ along a functoroid $K$. Because it may be impossible to make the resulting diagram commute in general, Kan extensions rely on natural transformations to make the extension the best possible approximation to $F$ along $K$.

###### Definition 13.

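A left Kan extension of a functoroid $F: \mathcal{C} \to \mathcal{E}$ along another functoroid $K: \mathcal{C} \to \mathcal{D}$ is a functoroid $\mathrm{Lan}_K F: \mathcal{D} \to \mathcal{E}$ with a natural transformation $\eta: F \Rightarrow \mathrm{Lan}_K F \circ K$ such that for any other such pair $(G: \mathcal{D} \to \mathcal{E}, \gamma: F \Rightarrow G \circ K)$, $\gamma$ factors uniquely through $\eta$. In other words, there is a unique natural transformation $\alpha: \mathrm{Lan}_K F \Rightarrow G$.

\begin{tikzcd}[row sep=2cm, column sep=2cm]
\mathcal{C} \arrow[dr, "K"', ""{name=K}] \arrow[rr, "F", ""{name=F, below, near start, bend right}] && \mathcal{E} \\
& \mathcal{D} \arrow[ur, bend left, "\mathrm{Lan}_K F", ""{name=Lan, below}] \arrow[ur, bend right, "G"', ""{name=G}]
\arrow[Rightarrow, "!", from=Lan, to=G] \arrow[Rightarrow, from=F, to=K, "\eta"]
\end{tikzcd}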

A right Kan extension can be defined similarly.
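When all three categories are posets, the left Kan extension has a simple pointwise form: its value at d is the join of F(c) over all c with K(c) ≤ d. The sketch below works through a hypothetical instance in which C is the even numbers from 0 to 10, D is all integers from 0 to 10, K is the inclusion, and F halves its argument; the regular-arrow structure is the only part of a categoroid being modeled, and all names are assumptions.

```python
C = [0, 2, 4, 6, 8, 10]          # poset C: even numbers under <=
D = list(range(11))              # poset D: 0..10 under <=

K = lambda c: c                  # K: C -> D, the inclusion
F = lambda c: c // 2             # F: C -> E, where E is the integers under <=

def lan(d):
    """Pointwise left Kan extension: join (here, max) of F(c) over all c with K(c) <= d."""
    return max(F(c) for c in C if K(c) <= d)

def admits_transformation_from_F(G):
    """In a poset, a natural transformation F => G∘K exists iff F(c) <= G(K(c)) for all c."""
    return all(F(c) <= G(K(c)) for c in C)

print([lan(d) for d in D])       # the extension of "halving" to every d in D

# Universal property: any monotone G with F => G∘K is dominated from below by Lan_K F.
for G in (lambda d: d, lambda d: (d + 1) // 2):
    if admits_transformation_from_F(G):
        print(all(lan(d) <= G(d) for d in D))
```

In this instance the extension agrees with F on C exactly, so the unit η is the identity; in general it need only be an inequality.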

## 6 Yoneda Lemma for Categoroids

In this section, we construct a universal representation of categoroids using a modified version of the well-known Yoneda Lemma (MacLane, 1971). The central philosophy underlying category theory is to construct representations of objects in terms of their interactions with other objects. Unlike set theory, where an object like a set is defined by listing its elements, in category theory objects have no explicit internal structure, but rather are defined through the morphisms that define their interactions with respect to other objects. The celebrated Yoneda lemma makes this philosophical statement more precise. We state it first for categories, before proving an extended version for categoroids.

###### Theorem 6.

Yoneda Lemma: For every object $X$ in category $\mathcal{C}$, and every covariant functor $F: \mathcal{C} \to \mathbf{Set}$, the set of natural transformations from $\mathcal{C}(X, -)$ to $F$ is isomorphic to $F(X)$.

That is, the natural transformations from to serve to fully characterize the object up to isomorphism. In the special circumstance when the set-valued functor , the Yoneda lemma asserts that . In other words, a pair of objects are isomorphic if and only if the corresponding contravariant functors are isomorphic, namely .

To give some insight into the theorem, let us first understand at a high level how the proof of the original Yoneda Lemma works. The key insight in Yoneda Lemma is recognizing that any set-valued functor on a category must act functorially, that is, it must not only map any object to the set , but it must also map each morphism to the set-valued function (see Figure 3). It seems almost impossible that no matter what the functor is, it is possible to “mimic" the action of this functor using just the morphisms leaving an object as a universal representation. To get some intuition underlying the proof, note that for any element , the function must map into some element in . In other words, the action of is simply the functional mapping of to . Given that two sets of equal cardinality can be placed into bijective correspondence, it is sufficient to guarantee that there are enough elements in the morphism functor that can “mimic" the action of the functor . A bit of introspection reveals that this must be the case because the set-valued function must map every element , and there are exactly as many of those as there are morphisms coming out of in the category .
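The counting argument can be checked by brute force on a tiny category. The sketch below takes the chain 0 ≤ 1 ≤ 2 as a category, defines a hypothetical set-valued functor F on it, enumerates every natural transformation from the morphism functor at an object c to F, and confirms that sending each transformation to the image of the identity arrow gives a bijection with F(c). It covers only the classical Yoneda Lemma for ordinary categories; the sets and maps are invented for illustration.

```python
from itertools import product

objects = [0, 1, 2]                                   # the chain 0 <= 1 <= 2
F_obj = {0: ["a", "b"], 1: ["p", "q", "r"], 2: ["u", "v"]}
step = {(0, 1): {"a": "p", "b": "q"},                 # F(0 <= 1)
        (1, 2): {"p": "u", "q": "u", "r": "v"}}       # F(1 <= 2)

def F_map(i, j):
    """The function F(i <= j): F(i) -> F(j), obtained by composing generating steps."""
    f = {s: s for s in F_obj[i]}                      # F of an identity arrow
    for k in range(i, j):
        f = {s: step[(k, k + 1)][t] for s, t in f.items()}
    return f

def natural_transformations(c):
    """All natural transformations Hom(c, -) => F, as dicts {d: image of the arrow c <= d}."""
    above = [d for d in objects if c <= d]            # Hom(c, d) is nonempty only here
    found = []
    for choice in product(*(F_obj[d] for d in above)):
        alpha = dict(zip(above, choice))
        # Naturality: pushing alpha_d forward along d <= d2 must give alpha_d2.
        if all(F_map(d, d2)[alpha[d]] == alpha[d2]
               for d in above for d2 in above if d <= d2):
            found.append(alpha)
    return found

for c in objects:
    nats = natural_transformations(c)
    images_of_identity = sorted(alpha[c] for alpha in nats)
    print(c, len(nats), images_of_identity == sorted(F_obj[c]))
```

Each transformation is completely determined by where it sends the identity arrow at c, which is the content of the lemma.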

### 6.1 Generalizing the Yoneda Lemma to Categoroids

The principal aim of this section is to show an enhanced variant of the Yoneda Lemma, which applies to categoroids. The principal difference, of course, is that categoroids are a join of two categories: one includes regular objects and morphisms, to which the above Yoneda Lemma directly applies, while the other includes trigonoids, morphisms that act over triples of objects, with bridge morphisms defining the join. The modifications to the Yoneda Lemma are simple: since we have defined the categoroid completely in terms of three types of morphisms, and have defined an augmented natural transformation over these three types of arrows, we just have to use the modified notion of natural transformation in defining the Yoneda Lemma for categoroids. We give a detailed proof of the modified Yoneda Lemma, leaving aside some details that are covered in standard textbooks (Richter, 2020). Figure 4 gives the high-level idea of constructing once again a basis for the trigonoid actions in the category of Sets, which mimics the actions of the trigonoid morphisms in the original category. The details are given in the proof below.

###### Definition 14.

A functoroid is representable if it is isomorphic to a morphism functoroid, that is, there exists an object of C, and a natural isomorphism over functoroids such that:

$$\eta_{C,F}: \mathcal{C}(C, -) \Rightarrow F \tag{10}$$

We now state the Yoneda Lemma for categoroids, and work through a detailed proof of it, mirroring the proof of the original Yoneda Lemma, noting places where it needs to be modified to account for the trigonoidal and bridge morphisms. Our notation below follows that in the book by Richter (2020), which of course states the proof just for the regular case of categories, not categoroids. Much of the hard work of extending this lemma has already been accomplished above in our extended definitions of functoroids and categoroid natural transformations.

###### Theorem 7.

(Yoneda Lemma for Categoroids:) Let $\mathcal{C}$ be a categoroid and $F$ be a functoroid from the categoroid $\mathcal{C}$ to the category Sets.

• For each object $C$ of $\mathcal{C}$, there is a bijection between the set of all categoroid natural transformations from $\mathcal{C}(C, -)$ to $F$, denoted as $\mathrm{Nat}(\mathcal{C}(C, -), F)$, and the set $F(C)$. These include the regular objects as well as the trigonoidal objects defining conditional independence.

• The bijections defined as are the components of a categoroid natural transformation .

• If $\mathcal{C}$ is a small categoroid, that is, its collection of all (regular, bridge, trigonoidal) arrows is representable as a set, then the bijections are the components of a categoroid natural transformation from the functoroid

$$\mathrm{Nat}(\mathcal{C}(C, -), -): \mathrm{Fun}(\mathcal{C}, \mathbf{Sets}) \to \mathbf{Sets}$$

to the functoroid that sends each $F$ to the set $F(C)$.

Proof: We will give a detailed proof of the Yoneda Lemma, including the extra components for dealing with categoroids, for the sake of completeness, and because the construction details will be of value later in the paper. Let us define the categoroid natural transformation for an object in C by the function:

 C(C,C′)→F(C′)

That is, for each element , we “l