The ERA of FOLE: Foundation

12/23/2015
by   Robert E. Kent, et al.

This paper discusses the representation of ontologies in the first-order logical environment FOLE (Kent 2013). An ontology defines the primitives with which to model the knowledge resources for a community of discourse (Gruber 2009). These primitives, consisting of classes, relationships and properties, are represented by the entity-relationship-attribute ERA data model (Chen 1976). An ontology uses formal axioms to constrain the interpretation of these primitives. In short, an ontology specifies a logical theory. This paper is the first in a series of three papers that provide a rigorous mathematical representation for the ERA data model in particular, and ontologies in general, within the first-order logical environment FOLE. The first two papers show how FOLE represents the formalism and semantics of (many-sorted) first-order logic in a classification form corresponding to ideas discussed in the Information Flow Framework (IFF). In particular, this first paper provides a foundation that connects elements of the ERA data model with components of the first-order logical environment FOLE, and the second paper provides a superstructure that extends FOLE to the formalisms of first-order logic. The third paper defines an interpretation of FOLE in terms of the transformational passage, first described in (Kent 2013), from the classification form of first-order logic to an equivalent interpretation form, thereby defining the formalism and semantics of first-order logical/relational database systems (Kent 2011). The FOLE representation follows a conceptual structures approach that is completely compatible with formal concept analysis (Ganter and Wille 1999) and information flow (Barwise and Seligman 1997).


1 Introduction

The first-order logical environment FOLE (Kent [8]) is a framework for defining the semantics and formalism of logic and databases in an integrated and coherent fashion. Institutions in general, and logical environments in particular, give equivalent heterogeneous and homogeneous representations for logical systems. FOLE is an institution, since “satisfaction is invariant under change of notation”. FOLE is a logical environment, since “satisfaction respects structure linkage”. As an institution, the architecture of FOLE consists of languages as indexing components, structures to represent semantic content, specifications to represent formal content, and logics to combine formalism with semantics. FOLE structures are interpreted as relational/logical databases. This is the first of three papers, which are concerned with the presentation of FOLE: (1) the FOLE foundation, (2) the FOLE superstructure (Kent [9]), and (3) the FOLE interpretation (Kent [10]).

This paper, which is concerned with the FOLE foundation, is illustrated in Fig. 7 and is centered on the mathematical context of structures. (Footnote 1: Following the original discussion of FOLE (Kent [8]), we use “mathematical context” (Goguen [4]) for the mathematical term “category”, “passage” for the term “functor”, and “bridge” for the term “natural transformation”.) In § 3 and § 4 we show how the ERA data model is represented in FOLE by connecting elements of the ERA data model to components of the FOLE structure concept. § 3 discusses the direct lower-level connection between the ERA elements (attributes, entities, relations) and the FOLE components (type domains and entity classifications). (Footnote 2: The theory of classifications and infomorphisms is discussed in the book Information Flow by Barwise and Seligman [1].) § 4 discusses the abstract higher-level representation of the ERA data model within the FOLE architecture. In addition, we give a rudimentary description of the interpretation of FOLE structures in § 4.4. In § 5 we connect FOLE to Sowa’s knowledge representation hierarchy (Sowa [11]) and through linearization to the Olog data model (Spivak and Kent [12]).

The FOLE superstructure, which is concerned with the formalism and semantics of first-order logic, and the FOLE interpretation, which is concerned with database interpretation, are presented in the two papers that follow this one. Two further papers are pending on the integration of federated systems of knowledge: one discusses integration over a fixed type domain and the other discusses integration over a fixed universe.


Figures and Tables

Fig. 1: ERA Data Model in FOLE
Fig. 2: Example
Fig. 3: Structure
Fig. 4: Interpreted Structure
Fig. 5: Structure Morphism
Fig. 6: Interpreted Structure Morphism
Fig. 7: FOLE Foundation
Fig. 8: Analogy

Tbl. 1: FOLE-ERA Correspondence
Tbl. 2: Matrix of six central categories

2 Overview

A conceptual model for a community represents the information needed by the community — the content, relationships and constraints necessary to describe the community. The content consists of things of significance to the community (entities), and characteristics of those things (attributes). The relationships are associations between those things. The entities are the core concepts that are used for representing the semantics of the community. Entities are described by attributes, which are the various properties, characteristics, modifiers, aspects or features of entities. Hence, the entity-relationship-attribute (ERA) formalism is a ternary representation for knowledge, since it uses three kinds for representation: entities, attributes and relations. In contrast, the first-order logical environment FOLE (Kent [8]) followed the knowledge representation approach of traditional many-sorted first-order logic (MSFOL). Both the original FOLE formalism and the MSFOL formalism are binary representations for knowledge, since they use two kinds for representation: entities and relations.

Revised FOLE Terminology.

However, the first-order logical environment FOLE can very naturally represent the ERA data model. The idea is that the original FOLE relation represented a nexus of roles, where the roles were played by the original FOLE entities. In order to represent the ERA data model, we think of the original FOLE relations as the new FOLE entities described by a nexus of features or aspects, where the aspects are represented by the new FOLE attributes, which replace the original FOLE entities. In the FOLE representation of the ERA data model, entities and their attributes are primary notions, whereas relationships are secondary notions that are subsumed by other constructs. Some relations (foreign keys, subtypes, sums, …) have a special representation in FOLE; whereas, other relations can be resolved into concepts (entities) with a nexus of roles.

Tbl. 1 shows the terminological correspondence between the basic components of (old/new) FOLE and ERA. For example, the original FOLE entity type is renamed the new FOLE attribute type (sort), and this corresponds to the ERA attribute type (data type); and the original FOLE relation instance is renamed the new FOLE entity instance (key), and this corresponds to the ERA entity.

FOLE (old)    FOLE (new)    ERA
relation      entity        entity
entity        attribute     attribute

Table 1: FOLE-ERA Correspondence

Philosophy: the ERA of FOLE.

As commonly observed, an entity is a thing capable of an independent existence that can be uniquely identified. In natural language, an entity corresponds to a noun. A relationship links entities, and corresponds to a verb in natural language. Entities and relationships can both have attributes. In natural language, a relational attribute corresponds to a role or case. Inclusion and subtype relationships are special kinds of relationships. A data model can be visualized in terms of entities, relationships and attributes. But in general, relationships can be conceptualized by being converted to entities.

Hence, a data model is more simply conceptualized in terms of entities and attributes. When doing so, there is an implied boundary around the visualization, which converts an entity’s collection of attributes into a list (possibly infinite in size or arity); a signature is the list of attribute types (sorts) associated with an entity type, whereas a tuple is the list of attribute instances (values) associated with an entity instance (key). Entity types can be mapped to the associated signature, and entity instances (primary keys) identify and can be mapped to the associated tuple (horizontal dimension of Fig. 1). In general, types classify instances. Hence, entity types classify keys and sorts classify values (vertical dimension of Fig. 1). Implicit in the ERA data model is an entity type system and multi-sorted logic, which uses boolean operators and quantification, and is defined in terms of signature-based fibers of formulas (queries) in Kent [9].

Figure 1: ERA Data Model in FOLE

In review, the simplest way to handle things is first to distinguish types from instances along the FOLE classification dimension in Fig. 1, and second to view things (either types or instances) as participating in Whitehead’s fundamental prehension relationship (Sowa [11]) along the FOLE hypergraph dimension in Fig. 1, which links a prehending thing called an entity to a prehended thing called an attribute: “an entity has an attribute”. The ERA data model of FOLE uses an inclusive prehension for things; hence, it is a mixed data model: some entities are not attributes, some attributes are not entities, and some things are both. For any type in the overlap, any instance of that type is also in the overlap. Foreign keys are examples of things in the overlap, since they are both entities and attributes.

[Figure: entities and attributes in the Disjoint Model, the Mixed Model, and the Unified Model (entities = attributes)]

3 ERA Data Model

3.1 Attributes.

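In the ERA data model, attributes are represented by a typed domain consisting of a collection of data types. In FOLE, a typed domain is represented by an attribute classification $\mathcal{A} = \langle X, Y, \models_{\mathcal{A}} \rangle$ consisting of a set of attribute types (sorts) $X$, a set of attribute instances (data values) $Y$ and an attribute classification relation $\models_{\mathcal{A}}$ between values and sorts, written $y \models_{\mathcal{A}} x$. For each sort (attribute type) $x \in X$, the data domain of that type is the $\mathcal{A}$-extent $\mathcal{A}_x = \{ y \in Y \mid y \models_{\mathcal{A}} x \}$. The passage $X \to \mathbf{Set}$ maps a sort $x \in X$ to its data domain ($\mathcal{A}$-extent) $\mathcal{A}_x \subseteq Y$.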

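An $X$-signature (header) is a sort list $\langle I, s \rangle$, where $s : I \to X$ is a map from an indexing set (arity) $I$ to the set of sorts $X$. A more visual representation for this signature is $(\ldots, s_i, \ldots \mid i \in I)$. The mathematical context of $X$-signatures is $\mathbf{List}(X)$. (Footnote 3: $\mathbf{List}(X)$ is the comma context of $X$-signatures, where an object is an $X$-signature $\langle I, s \rangle$ and a morphism $h : \langle I', s' \rangle \to \langle I, s \rangle$ is an arity function $h : I' \to I$ that preserves signatures: $h \cdot s = s'$.) (Footnote 4: The header for a database table is a signature (list of sorts) $\langle I, s \rangle$. Pairs $(i : s_i)$ from a signature are called attributes (see § 4.1). Examples of attributes are ‘(name : Str)’, ‘(age : Natno)’.) A $Y$-tuple (row) is a list of data values $\langle J, t \rangle$, where $t : J \to Y$ is a map from an indexing set (arity) $J$ to the set of data values $Y$. A more visual representation for this tuple is $(\ldots, t_j, \ldots \mid j \in J)$. The mathematical context of $Y$-tuples is $\mathbf{List}(Y)$. For an attribute classification $\mathcal{A}$, the attribute list classification $\mathbf{List}(\mathcal{A})$ has $X$-signatures as types and $Y$-tuples as instances, with classification by common arity and universal $\mathcal{A}$-classification: a $Y$-tuple $\langle J, t \rangle$ is classified by an $X$-signature $\langle I, s \rangle$ when $J = I$ and $t_i \models_{\mathcal{A}} s_i$ for all $i \in I$.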
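The attribute classification and the list classification described above can be illustrated with a small sketch. This is only an illustration under stated assumptions: the sorts `Str` and `Natno` come from the attribute examples in the text, while the concrete values, the dictionary encoding of signatures and tuples (index to sort, index to value), and the function names are hypothetical.

```python
# Hypothetical toy attribute classification: a set of sorts, a set of values,
# and a classification relation between values and sorts.
SORTS = {"Str", "Natno"}
VALUES = {"Ann", "Bob", 7, 42}

def classifies(sort, value):
    """Attribute classification relation: True when `value` is of sort `sort`."""
    if sort == "Str":
        return isinstance(value, str)
    if sort == "Natno":
        return isinstance(value, int) and not isinstance(value, bool) and value >= 0
    return False

def extent(sort):
    """Data domain of a sort: all values that the sort classifies."""
    return {v for v in VALUES if classifies(sort, v)}

def tuple_has_signature(tup, sig):
    """List classification: common arity, and each value is of the matching sort."""
    return tup.keys() == sig.keys() and all(classifies(sig[i], tup[i]) for i in sig)

signature = {"name": "Str", "age": "Natno"}   # header: index -> sort
row_ok = {"name": "Ann", "age": 42}           # tuple: index -> value
row_bad = {"name": "Bob", "age": "Ann"}       # the 'age' value is not a Natno
```

Here `tuple_has_signature(row_ok, signature)` holds, while `row_bad` fails because one value is not in the data domain of its sort, and a row with a different index set fails on arity alone.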

3.2 Entities.

We distinguish between an entity instance and an entity type. An entity type is a category of existence; entity types classify entity instances. There might be many instances of an entity type, and an entity instance can be classified by many types. An entity instance (entity, for short) is also called an object. Every entity is uniquely identified by a key. In FOLE, entities and their types are collected together locally in an entity classification $\mathcal{E} = \langle R, K, \models_{\mathcal{E}} \rangle$ consisting of a set of entity types $R$, a set of entity instances (keys) $K$ and an entity classification relation $\models_{\mathcal{E}}$ between keys and entity types, written $k \models_{\mathcal{E}} r$. In the database interpretation in § 4.4, each entity type is regarded as the name for a relation (or table) in the database: for each entity type (relation name) $r \in R$, the set of primary keys for that type is the $\mathcal{E}$-extent $\mathcal{E}_r = \{ k \in K \mid k \models_{\mathcal{E}} r \}$.

3.3 Relations.

Here we discuss how the relational aspect of the ERA data model is handled in FOLE. Some relations are special. One example is subtyping, which specifies that one category of existence is more general than another. This arises when representing the taxonomic aspect of ontologies. Subtyping is handled by the binary sequents (Footnote 5: A sequent expresses interpretation widening between formulas.) in FOLE specifications (discussed further in Kent [9]). Some many-to-one relationships can be represented as attributes. But in general, many-to-many relationships are represented in FOLE as entities, whose attributes, each of which plays a thematic role for the relationship, may be other entities. (Footnote 6: As an example, the “marriage” binary relation can be represented as a Marriage entity with wife and husband attributes that are themselves Person entities.)

Consider the example (Fig. 2) of a simple entity-relationship-attribute diagram. Here we have three entities (represented by rectangles), two relationships (represented by diamonds) and numerous attributes (represented by ovals). The works_on relationship is many-to-many, and so we can represent this in FOLE as an entity type Activity with four attributes: entry_date of sort Date, job_descr of sort String, employee of sort Employee, and project of sort Project. Note that attributes employee and project are foreign keys of the Activity entity. (Footnote 7: The Employee type, which plays the employee thematic role for the works_on relationship, is both an entity type and an attribute type (sort); any value in the Employee data domain is a key of the Employee entity and a foreign key of the Activity entity.) Since the works_for relationship is many-to-one without any attributes of its own, we can represent this as an attribute called dept of sort Department. This is a foreign key of the Employee entity.
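The translation of this example into entity types with signatures can be sketched as follows. The sorts of entry_date, job_descr, employee, project and dept are taken from the text; the sorts assigned to name, id_num, location and budget are assumptions made only for illustration, as are the dictionary encoding and the helper name `foreign_keys`.

```python
# Entity type -> signature (attribute name -> sort).
# Assumption: String/Integer sorts for name, id_num, location and budget are
# illustrative guesses; the remaining sorts follow the text.
SCHEMA = {
    "Employee":   {"name": "String", "id_num": "Integer", "dept": "Department"},
    "Department": {"name": "String", "id_num": "Integer", "location": "String"},
    "Project":    {"name": "String", "id_num": "Integer", "budget": "Integer"},
    "Activity":   {"entry_date": "Date", "job_descr": "String",
                   "employee": "Employee", "project": "Project"},
}

def foreign_keys(schema):
    """Per entity type, list the attributes whose sort is itself an entity type."""
    return {
        entity: sorted(attr for attr, sort in signature.items() if sort in schema)
        for entity, signature in schema.items()
    }

fks = foreign_keys(SCHEMA)
```

In this encoding the many-to-many works_on relationship becomes the Activity entity type, and the many-to-one works_for relationship becomes the single attribute dept; the foreign keys are exactly the attributes whose sort is also an entity type.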

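[Figure 2: entity-relationship-attribute diagram. Entities: Employee (name, id_num), Department (name, id_num, location), Project (name, id_num, budget). Relationships: works_for (Employee n : 1 Department) and works_on (Employee m : n Project, with attributes entry_date and job_descr).]
Figure 2: Example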

4 FOLE Components

4.1 Schema.

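The type aspect of the ERA data model is gathered together into a schema. A schema $\mathcal{S} = \langle R, \sigma, X \rangle$ consists of a set of sorts (attribute types) $X$, a set of entity types $R$ and a signature map $\sigma : R \to \mathbf{List}(X)$. Within the schema $\mathcal{S}$, we think of each $r \in R$ as being an entity type that is locally described by the associated $X$-signature $\sigma(r) = \langle I, s \rangle$. (Footnote 8: There is an associated arity function, which maps each entity type $r$ to its indexing set $I$.) A more visual representation for this signature mapping is $r \mapsto (\ldots, s_i, \ldots \mid i \in I)$. An ERA-style visualization might be a box joined to an oval by a labeled arrow, where the box encloses the entity type $r$, the oval encloses the attribute type $s_i$, and the arrow is labeled with the index $i$.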

The entity type in the ERA data model corresponds to the relation symbol in FOLE/MSFOL. Either representation is a kind of nexus. A schema corresponds to a multi-sorted first-order logical language in the FOLE/MSFOL approach to knowledge representation. (Footnote 9: Formulas based on relation symbols can be inductively defined, thus forming extended schemas (Kent [9]). Terms composed of function symbols can be added as constraints between formulas.) In the database interpretation of FOLE (Kent [10]), we think of an entity type $r$ as being a relation name whose associated header is its signature $\sigma(r)$.

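We formally link schemas with morphisms. A schema morphism $\langle r, f \rangle$ from schema $\mathcal{S}_2 = \langle R_2, \sigma_2, X_2 \rangle$ to schema $\mathcal{S}_1 = \langle R_1, \sigma_1, X_1 \rangle$ consists of a sort function $f : X_2 \to X_1$ and an entity type function $r : R_2 \to R_1$, which preserve signatures by satisfying the following condition: for every entity type $r_2 \in R_2$ with signature $\sigma_2(r_2) = \langle I_2, s_2 \rangle$, the signature $\sigma_1(r(r_2))$ is $\langle I_2, s_2 \cdot f \rangle$, the same list with each sort relabeled by $f$.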

Let $\mathbf{Sch}$ denote the mathematical context of schemas and their morphisms.
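The signature-preservation condition for schema morphisms can be checked mechanically. The following is a minimal sketch, assuming that signatures are encoded as dictionaries from indices to sorts and that the entity type function and sort function are given as dictionaries; the schemas and all names in it are hypothetical.

```python
def relabel(signature, sort_fn):
    """Apply a sort function to every sort in a signature (index -> sort)."""
    return {index: sort_fn[sort] for index, sort in signature.items()}

def is_schema_morphism(src_sigma, tgt_sigma, entity_fn, sort_fn):
    """Check signature preservation: the target signature of each mapped
    entity type equals the relabeled source signature."""
    return all(
        tgt_sigma[entity_fn[r]] == relabel(src_sigma[r], sort_fn)
        for r in src_sigma
    )

# Hypothetical source and target schemas that differ only in naming.
src_sigma = {"Person": {"name": "Str", "age": "Natno"}}
tgt_sigma = {"Persona": {"name": "Texto", "age": "Natural"}}
entity_fn = {"Person": "Persona"}
sort_fn = {"Str": "Texto", "Natno": "Natural"}

ok = is_schema_morphism(src_sigma, tgt_sigma, entity_fn, sort_fn)
bad = is_schema_morphism(src_sigma, tgt_sigma, entity_fn,
                         {"Str": "Texto", "Natno": "Texto"})
```

The second call fails because relabeling the sort of age yields a signature that no longer matches the target header.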

4.2 Universe.

The instance aspect of the ERA data model is gathered together into a universe. A universe $\mathcal{U} = \langle K, \tau, Y \rangle$ consists of a set of values (attribute instances) $Y$, a set of keys (entity instances) $K$ and a tuple map $\tau : K \to \mathbf{List}(Y)$. Within the universe $\mathcal{U}$, we think of each key $k \in K$ as being an identifier or name for an object that is locally described by the associated tuple of values $\tau(k) = \langle J, t \rangle$. A more visual representation for this tuple mapping is $k \mapsto (\ldots, t_j, \ldots \mid j \in J)$. Note that no typing has been mentioned here and no typing restrictions are required. In a universe by itself, we do not require the data values to be members of any special data-types.

An element of a universe is a key with its associated list of values. We can think of such universe elements as object descriptions without attached typing or as tuples untethered from a database table. They develop meaning by being classified by schema elements in a structure (§ 4.3). (Footnote 10: They are somewhat like genes (bits of DNA) without the genomic structure that provides interpretation.) Hence, a FOLE universe is like the key-to-value-list store at the heart of Google’s Spanner database (Google [5]):

“Spanner’s data model is not purely relational, in that rows must have names. More precisely, every table is required to have an ordered set of one or more primary-key columns. This requirement is where Spanner still looks like a key-value store: the primary keys form the name for a row, and each table defines a mapping from the primary-key columns to the non-primary-key columns.”

(Footnote 11: When the universe $\mathcal{U}$ is the instance aspect of a FOLE structure with typed domain $\mathcal{A}$, in the database interpretation of that structure (§ 4.4 and Kent [10]), we think of the entity instance $k$ as being a primary key that indexes a row $\tau(k)$ in the table associated with the relation symbol $r$ with associated header $\sigma(r) = \langle I, s \rangle$. A more visual representation for this tuple mapping is $k \mapsto (\ldots, t_i \in \mathcal{A}_{s_i}, \ldots \mid i \in I)$, where $\mathcal{A}_{s_i}$ is the data-type for sort $s_i$. Here, we do require the data values $t_i$ to be members of the special data-types $\mathcal{A}_{s_i}$.)

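We semantically link universes with morphisms. A universe morphism $\langle k, g \rangle$ from universe $\mathcal{U}_1 = \langle K_1, \tau_1, Y_1 \rangle$ to universe $\mathcal{U}_2 = \langle K_2, \tau_2, Y_2 \rangle$ consists of a value (attribute instance) function $g : Y_1 \to Y_2$ and a key (entity instance) function $k : K_1 \to K_2$, which preserve tuples (instance lists) by satisfying the following condition: for every key $k_1 \in K_1$ with tuple $\tau_1(k_1) = \langle J_1, t_1 \rangle$, the tuple $\tau_2(k(k_1))$ is $\langle J_1, t_1 \cdot g \rangle$, the same list with $g$ applied to each value.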

Let $\mathbf{Univ}$ denote the mathematical context of universes and their morphisms.
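A universe can be pictured as an untyped store from keys to value lists, and a universe morphism as a pair of functions that commute with the tuple maps. The sketch below is illustrative only: the two stores, the key renaming and the use of upper-casing as the value function are all hypothetical choices.

```python
def map_values(tup, value_fn):
    """Apply a value function to every value in a tuple (index -> value)."""
    return {index: value_fn(v) for index, v in tup.items()}

def is_universe_morphism(src_tau, tgt_tau, key_fn, value_fn):
    """Check tuple preservation: the target tuple of each mapped key equals
    the source tuple with the value function applied to every entry."""
    return all(
        tgt_tau[key_fn(k)] == map_values(src_tau[k], value_fn)
        for k in src_tau
    )

# Hypothetical untyped universes: key -> tuple of values, with no sorts involved.
src_tau = {1: {"name": "ann", "city": "oslo"},
           2: {"name": "bob", "city": "rome"}}
tgt_tau = {"k1": {"name": "ANN", "city": "OSLO"},
           "k2": {"name": "BOB", "city": "ROME"}}

preserves = is_universe_morphism(src_tau, tgt_tau, lambda k: f"k{k}", str.upper)
collapses = is_universe_morphism(src_tau, tgt_tau, lambda k: "k1", str.upper)
```

Note that no sort or header appears anywhere in this check, which mirrors the remark that a universe by itself carries no typing.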

4.3 Structure.

The complete ERA data model is incorporated into the notion of a (model-theoretic) structure in the FOLE representation of knowledge.

Structures.

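A FOLE structure $\mathcal{M}$ is a hypergraph of classifications (Fig. 3), a two-dimensional construct whose components are the entity classification $\mathcal{E} = \langle R, K, \models_{\mathcal{E}} \rangle$ of § 3.2, the attribute classification (typed domain) $\mathcal{A} = \langle X, Y, \models_{\mathcal{A}} \rangle$ of § 3.1, the schema $\mathcal{S} = \langle R, \sigma, X \rangle$ of § 4.1, the universe $\mathcal{U} = \langle K, \tau, Y \rangle$ of § 4.2, and a list designation $\langle \sigma, \tau \rangle$ from $\mathcal{E}$ to $\mathbf{List}(\mathcal{A})$ with signature map $\sigma : R \to \mathbf{List}(X)$ and tuple map $\tau : K \to \mathbf{List}(Y)$, whose defining condition states that: if entity $k \in K$ is of type $r \in R$, then the description tuple $\tau(k) = \langle J, t \rangle$ is the same “size” ($J = I$) as the signature $\sigma(r) = \langle I, s \rangle$ and each data value $t_i$ is of sort $s_i$; or interpretively (§ 4.4 and Kent [10]), in a database table all rows are classified by the table header.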

                        
Figure 3: Structure

A FOLE structure embodies the idea of an ERA data model (compare Fig. 3 with Fig. 1). Each community of discourse that incorporates the ERA data model will have its own local FOLE structure. (Footnote 12: In anticipation of the discussion in § 4.4, we illustrate the associated tabular interpretation (3) on the right side of Fig. 3.)
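The defining condition of a structure (every row is classified by the header of its table) can be sketched as a simple check. This is an illustration under assumptions: the entity classification is encoded as a set of (key, entity type) pairs, signatures and tuples as dictionaries, and the sorts `Str` and `Natno` with their membership tests are hypothetical.

```python
def classifies(sort, value):
    """Hypothetical attribute classification for two sorts."""
    checks = {
        "Str": lambda v: isinstance(v, str),
        "Natno": lambda v: isinstance(v, int) and not isinstance(v, bool) and v >= 0,
    }
    return checks[sort](value)

def is_structure(entity_pairs, sigma, tau, classifies):
    """Defining condition: if key k is of type r, then tau[k] has the same
    index set as sigma[r] and each value is of the corresponding sort."""
    for key, etype in entity_pairs:
        signature, row = sigma[etype], tau[key]
        if row.keys() != signature.keys():
            return False
        if not all(classifies(signature[i], row[i]) for i in signature):
            return False
    return True

sigma = {"Person": {"name": "Str", "age": "Natno"}}            # signature map
tau = {"p1": {"name": "Ann", "age": 42},                       # tuple map
       "p2": {"name": "Bob", "age": 7}}
entity_pairs = {("p1", "Person"), ("p2", "Person")}            # key is of type

valid = is_structure(entity_pairs, sigma, tau, classifies)
tau_bad = dict(tau, p2={"name": "Bob", "age": -1})
invalid = is_structure(entity_pairs, sigma, tau_bad, classifies)
```

The second check fails because one value of `p2` lies outside the data domain of its sort, so that row is not classified by the Person header.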

The entity classification $\mathcal{E}$ and the attribute-list classification $\mathbf{List}(\mathcal{A})$ are equivalent to their extent diagrams (Footnote 13: Any classification is equivalent to its extent map.), and the list designation $\langle \sigma, \tau \rangle$ is equivalent to its extent diagram morphism, which consists of the signature map $\sigma$ and a bridge whose component at an entity type $r$ is the tuple function that restricts $\tau$ to the keys of type $r$ (see the tabular interpretation (3) in § 4.4). Hence, any structure has the interpretive presentation in Fig. 4 (see the discussion on linearization in § 5.2).

Figure 4: Interpreted Structure


In the concept of a FOLE structure we have abstracted the (primary) keys from the tuples that describe them. The key-embedding construction puts the keys back into their tuples.

Definition 1

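(key embedding) Any FOLE structure $\mathcal{M}$ with signature map $\sigma$ and tuple map $\tau$ has a companion key embedding structure consisting of entity classification $\mathcal{E}$, parallel sum typed domain $\mathcal{E} + \mathcal{A}$, (Footnote 14: We can think of the entity classification $\mathcal{E}$ as a type domain. For each sort (attribute type) $r \in R$, the data domain of that type is the $\mathcal{E}$-extent $\mathcal{E}_r$. The passage $R \to \mathbf{Set}$ maps a sort $r \in R$ to its data domain ($\mathcal{E}$-extent) $\mathcal{E}_r \subseteq K$.) schema with signature map from $R$ to $\mathbf{List}(R + X)$, and universe with tuple map from $K$ to $\mathbf{List}(K + Y)$. The signature and tuple maps are injective.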
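One way to read the key-embedding construction is that each signature gains a column whose sort is the entity type itself, and each tuple gains a column holding its own key. The sketch below follows that reading; it is an assumption about the construction rather than the paper's exact definition, and the reserved index name `key`, the helper names and the data are hypothetical.

```python
KEY = "key"  # hypothetical reserved index for the embedded key column

def embed_signature(sigma):
    """Add a key column to each signature; its sort is the entity type itself."""
    return {r: {KEY: r, **signature} for r, signature in sigma.items()}

def embed_tuples(tau):
    """Add a key column to each tuple; its value is the key itself."""
    return {k: {KEY: k, **row} for k, row in tau.items()}

def is_injective(mapping):
    """True when no two arguments share the same image."""
    images = [tuple(sorted(image.items())) for image in mapping.values()]
    return len(images) == len(set(images))

# Two entity types with the same signature, two keys with the same descriptor.
sigma = {"Person": {"name": "Str"}, "Pet": {"name": "Str"}}
tau = {"p1": {"name": "Ann"}, "p2": {"name": "Ann"}}

sigma_dot = embed_signature(sigma)
tau_dot = embed_tuples(tau)
```

Before embedding, neither map is injective in this example; after embedding, both are, because the added column distinguishes every entity type and every key.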

Structure Morphisms.

In order to allow communities of discourse to interoperate, we define the notion of a morphism between two structures that respects the ERA data model. A structure morphism (Fig. 5) from source structure to target structure is defined in terms of the hypergraph and classification morphisms between the source and target structure components (projections):

satisfying the conditions

Figure 5: Structure Morphism

The designation defining condition states that for any and ,

Structure morphisms compose component-wise. Let $\mathbf{Struc}$ denote the context of structures and structure morphisms. In the appendix § 0.A, we develop $\mathbf{Struc}$ as a fibered mathematical context in two orientations: either as the Grothendieck construction of the schema indexed mathematical context of structures or as the Grothendieck construction of the universe indexed mathematical context of structures. The schema indexed mathematical context of structures is used in Kent [10] to establish the institutional aspect of FOLE.

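Any structure morphism with identity value map has the interpretive presentation in Fig. 6. (Footnote 15: Any infomorphism has an equivalent condition on extents: for an infomorphism with type function $f$ and instance function $g$, the defining condition $g(b) \models \alpha \iff b \models f(\alpha)$ is equivalent to $g^{-1}(\mathrm{ext}(\alpha)) = \mathrm{ext}(f(\alpha))$ for every type $\alpha$.)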

Figure 6: Interpreted Structure Morphism
Definition 2

(key embedding) Any structure morphism has a companion key embedding structure morphism

.

with the following components:

Proof

The following conditions must hold.

We use the comparable conditions for the original structure morphism . The entity infomorphism condition is given. The type domain morphism condition is straightforward. We show the schema morphism condition. The universe morphism condition is similar. The schema morphism condition for the original structure morphism is ; that is, for any , if and , then . Hence, the schema morphism condition for the key-embedding structure morphism holds, since  

Integrity Constraints.

Integrity constraints help preserve the validity and consistency of data. Here we briefly explain how various integrity constraints are represented in the ERA data model of FOLE.

Entity:

(primary key rule) Entity integrity states that every table must have a primary key and that the column or columns chosen to be the primary key should be unique and not null. In the ERA data model of FOLE, entity integrity asserts that the universe of a structure is well-defined.

Domain:

Domain integrity specifies that all columns in a relational database must be declared upon a defined domain. In the ERA data model of FOLE, domain integrity asserts that the schema and the list designation of a structure are well-defined.

Referential:

(foreign key rule) Referential integrity states that the foreign-key value of a source table refers to a primary key value of a target table. In the ERA data model of FOLE, referential integrity asserts that the ERA data model of FOLE is a mixed data model.
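Referential integrity in the mixed data model can be illustrated by checking that every foreign-key value is a key of the entity type named by its sort. The sketch below is a simplified illustration: it assumes each key has a single entity type, and the tables, keys and function name are hypothetical.

```python
def dangling_references(sigma, tau, key_types):
    """List (key, attribute, value) triples whose foreign-key value is not a
    key of the referenced entity type. `key_types` maps key -> entity type
    (one type per key, a simplifying assumption)."""
    problems = []
    for key, row in tau.items():
        signature = sigma[key_types[key]]
        for attr, sort in signature.items():
            # A sort that is also an entity type marks a foreign-key attribute.
            if sort in sigma and key_types.get(row[attr]) != sort:
                problems.append((key, attr, row[attr]))
    return problems

sigma = {"Department": {"name": "String"},
         "Employee": {"name": "String", "dept": "Department"}}
key_types = {"d1": "Department", "e1": "Employee", "e2": "Employee"}
tau = {"d1": {"name": "R&D"},
       "e1": {"name": "Ann", "dept": "d1"},
       "e2": {"name": "Bob", "dept": "d9"}}   # d9 is not a Department key

problems = dangling_references(sigma, tau, key_types)
```

Here the only violation is the row `e2`, whose dept value does not identify any Department entity.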

Algebra.

For simplicity of presentation, this paper and the paper on FOLE superstructure (Kent [9]) use a simplified form of FOLE, in contrast to the full form presented in Kent [8]. In this paper and Kent [9], schemas are used in place of (many-sorted) first-order logical languages. Schemas are simplified logical languages without function symbols. The main practical result is that signature morphisms are replaced by term vectors in the full version of FOLE. Signature morphisms are simplified term vectors without function symbols. In the full version of FOLE, equations can be defined between parallel pairs of term vectors, thus allowing the use of equational presentations and their congruences. (Footnote 16: The tuple relational calculus is a query language for relational databases. In order to use the tuple calculus in FOLE, we need to enrich schemas with many-sorted constant declarations and equational presentations. Constant declarations are first-order logical languages with sorted nullary function symbols. A constant of sort $x$ is an $x$-sorted nullary function symbol.) Also in the full version of FOLE, the tuple map along signature morphisms becomes the algebraic operation along term vectors; hence, formula flow (substitution/quantification) in Kent [9] is lifted from being along signature morphisms to being along term vectors. (Footnote 17: Let FOLE-ARCH denote Fig. 1 in Kent [8]. FOLE-ARCH is the 3-dimensional visualization of the fibered architecture of FOLE. The upper right quadrant of Fig. 7 corresponds to the 2-D prism below in FOLE-ARCH. As indicated in FOLE-ARCH, to move from the simple version of the FOLE foundation used here (Fig. 7) to the full version in FOLE-ARCH, we lift from sort sets to algebraic languages and from typed domains to many-sorted algebras.)

Figure 7: FOLE Foundation

4.4 Interpretation.

In the model theory for traditional many-sorted first-order logic, a (possible world, model) structure corresponds to an interpretation of relation symbols (entity types) in terms of relations in a typed domain. The FOLE approach to logic replaces $n$-tuples with lists, defines quantification/substitution along signature morphisms (Kent [9]) (or term vectors in the full version [8]), and following databases, incorporates identifiers (keys) for data value lists (tuples) (here and in Kent [10]). The FOLE approach modifies the idea of model-theoretic interpretation as follows. (Footnote 18: Hence, the notions of (1) a many-sorted first-order logic interpretation, (2) a FOLE structure, and (3) an ERA data model are all equivalent; and each can be implemented as a relational database with associated logic (for more on this, see Kent [10]).)

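We assume that the traditional many-sorted first-order logic language is represented by the schema $\mathcal{S} = \langle R, \sigma, X \rangle$ and that the typed domain is represented by the attribute classification $\mathcal{A} = \langle X, Y, \models_{\mathcal{A}} \rangle$. We further assume that these are components of a structure $\mathcal{M}$.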

Traditional:

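In the traditional approach, an entity type $r \in R$ is interpreted as the set of descriptors of entities in the extent of $r$. For $X$-signature $\sigma(r) = \langle I, s \rangle$, this is the subset of $\mathcal{A}$-tuples $\{ \tau(k) \mid k \models_{\mathcal{E}} r \}$, an element of the fiber relational order over $\langle I, s \rangle$. (Footnote 19: The fibered context $\mathbf{Rel}(\mathcal{A})$ is defined in the paper on FOLE interpretation (Kent [10]). An object of $\mathbf{Rel}(\mathcal{A})$, called an $\mathcal{A}$-relation, is a pair consisting of an indexing $X$-signature $\langle I, s \rangle$ and a subset of $\mathcal{A}$-tuples classified by that signature.) This defines the traditional interpretation function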

(1)

For all , we have the relationships

(2)
Definition 3

The inequality says that the extent of an entity type need not be the morphic closure of itself w.r.t. the tuple map $\tau$. A structure is called extensive when the right hand expression in (2) is an equality for any entity type $r \in R$. (Footnote 20: Philosophical note: In the knowledge resources for a community, the entities are of first importance. The tuples in $\mathbf{List}(Y)$ are descriptors, which may or may not have an identity. An entity consists of an identifier $k$ and its descriptor $\tau(k)$. Tuples with identity are those in the image of $\tau$. Two entities that have the same descriptor are said to be “descriptor-equivalent”.)

Any structure with an injective tuple map has an associated extensive structure. An example is the key-embedding structure .
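The traditional interpretation of an entity type as a set of descriptors can be sketched as the image of its extent under the tuple map. The sketch below is illustrative only: it does not attempt to test the extensive property, and the data, the encoding of tuples as sorted item lists and the function names are hypothetical.

```python
def traditional_interpretation(entity_pairs, tau, etype):
    """Set of descriptor tuples of the keys classified by `etype`.
    Keys are dropped, so entities with equal descriptors collapse."""
    return {
        tuple(sorted(tau[key].items()))
        for key, t in entity_pairs
        if t == etype
    }

def descriptor_equivalent(tau, k1, k2):
    """Two entities are descriptor-equivalent when their tuples coincide."""
    return tau[k1] == tau[k2]

entity_pairs = {("p1", "Person"), ("p2", "Person"), ("p3", "Person")}
tau = {"p1": {"name": "Ann", "age": 42},
       "p2": {"name": "Ann", "age": 42},   # same descriptor as p1
       "p3": {"name": "Bob", "age": 7}}

relation = traditional_interpretation(entity_pairs, tau, "Person")
```

Three entities yield only two tuples here, since `p1` and `p2` are descriptor-equivalent; this loss of identity is what the key-indexed view of tables avoids.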

Tabular:

In the database approach, an entity type is interpreted as a table with the entities (both the keys and their descriptors) being explicit. For -signature , this is the -indexed -table