Why a computer program is a functional whole

07/21/2020
by   C. Maria Keet, et al.
University of Cape Town
0

Sharing, downloading, and reusing software is common-place, some of which is carried out legally with open source software. When it is not legal, it is unclear just how many copyright infringements and trade secret violations have taken place: does an infringement count for the artefact as a whole or perhaps for each file of the program? To answer this question, it must first be established whether a program should be considered as an integral whole, a collection, or a mere set of distinct files, and why. We argue that a program is a functional whole, availing of, and combining, arguments from mereology, granularity, modularity, unity, and function to substantiate the claim. The argumentation and answer contributes to the ontology of software artefacts, may assist industry in litigation cases, and demonstrates that the notion of unifying relation is operationalisable. Indirectly, it provides support for continued modular design of artefacts following established engineering practices.

READ FULL TEXT VIEW PDF
POST COMMENT

Comments

There are no comments yet.

Authors

11/20/2014

Justifying Answer Sets using Argumentation

An answer set is a plain set of literals which has no further structure ...
01/08/2013

Utilizing ASP for Generating and Visualizing Argumentation Frameworks

Within the area of computational models of argumentation, the instantiat...
03/04/2022

Forms and Norms of Indecision in Argumentation Theory

One main goal of argumentation theory is to evaluate arguments and to de...
05/11/2018

Semiotic internationalization and localization of computer programs

Localization, the process--part of translation studies--of adapting a pr...
09/24/2021

Slicing functional aspects out of legacy applications

Aspect-oriented software development builds upon object-oriented develop...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

End users of Information Technology are familiar with software ‘apps’, i.e., computer programs, that can be launched through a one or double mouse-click or a finger-tap action on the screen of the device. From a user experience perspective, the computer program may appear to be a single thing. There are typically multiple files involved in the running of a computer program for it to carry out its principal function111Note: software, computer program, (software) application, and (software) tool are often used interchangeably, but are not all one and the same thing. A software application or tool, such as Adobe Acrobat Reader, is a particular type of computer program, since it is a computer program that needs an operating system to be able to run and perform its function. Software is a broader term and also includes other files on the computer (including its external drives). however, due to the increasingly complex software nowadays, as well as the move toward modularisation in software design and development over at least the past 25 years. For instance, compare an Wordperfect v3.x word processor with MS Word with what-you-see-is-what-you-get text processing, mail merge, and inserting tables, or texting an SMS compared to all the WhatsApp features that has a range of facilitating sub-functions such as encryption, saving conversations, and reporting status updates. ‘Under the hood’ outside of the end user’s immediate view, there are typically multiple files to make such apps work. It is not impossible that there is only one single file as source code and as executable, but is is very rare and such a tool is typically not for regular consumption as industry-grade software. For instance, a hello.c source code file to print just “Hello World” on screen still calls a built-in print function that is stored in another file than the typed-up code.

A computer program may exist in different ‘formats’: human-readable source code that is normally written by humans or as compiled or interpreted code (which is then eventually object (machine) code) that is the executable that the machine will use. The program typically uses components internally and any component may ‘call’ one or more other components (files) for its proper functioning. This is often hidden from the user view, since it is irrelevant for the user how exactly the program achieves what it is supposed to achieve, just like one isn’t subjected to the inner workings of a car when driving the car. Such component files are stored in some specific directory and are artefacts—i.e., human-made things—such as plugins for a larger program, dynamic linked libraries (dll’s), icons for the interfaces and so forth, as well as configuration files to, e.g., set the language of the program or how much computer resources it is allowed to use. Disregarding those finer-grained details, one still may consider, e.g., MS Word, Google Chrome, or Tinder, or their respective source code, a single artefact that is downloaded and installed. If you ‘pirate’, say, McAfee Antivirus software, Microsoft Office, or the Adobe Suite, you presumably would break the law once for each program, not hundreds of times, once for each component file in the installation. Why is this so? Or, conversely: could one possibly be subjected to be sued for many infringements if the program might turn out not be a whole upon closer inspection, as if it were a mere set of independent objects, and each file is going to cost you money or time in prison? Why may one care about the answer?

Aside from intellectual inquiry, there are many practical and financial consequences, to the point that there are, and have been, multiple court cases on copyright infringements, trade secret violations, and patent misuse of programs. For illustrative purpose of such type of litigations, one could consider internationally widely debated ones, such as the ongoing Google vs Oracle case on whether a programming interface can be copyrighted or not [22] and a recently settled case between Cisco and Arista on computer network management tools where Arista eventually payed Cisco $ 400 million [3], but there are also ongoing local (South African) cases, such as about software for medical schemes [4]. Precedent-setting ones involve, among others, the notion of non-literal similarity of the source code and the, as it turned out, free use of copyrighted text for the purpose of training an algorithm. Closely related issues include the ‘patent wars’ on software patents and their infringements and whether data can be copyrighted or not.

One of the issues some of these litigation cases raise, amounts to a question of an ontological nature, which is introduced here in an general way and with broader applicability. One party claims that the computer program is not one whole, but that for the purpose of litigation, there are the, say, , source files that allegedly have been copied and the claimant wants the defendant to be fined for the 1000 copyright violations, once for each source file; hence, not, say, a €100 fine for the one infringement of the program, but a fine of €100000 altogether. The defendant, for obvious reasons, would rather prefer to pay the €100 for one infringement, if infringement were to be deemed to be the case according to the court. Such an argument can likewise be constructed for trade secret violations for stealing the intellectual property of an app or even a whole operating system, for pirated software that was illegally copied, downloading textbooks in website format (which has multiple files cf. the ‘one pdf’ option) or a single Game of Thrones episode that happens to have as components, say, one audio file, one video file, and one subtitle file.

On the surface, this may sound similar to the well-known case in philosophy on the collapsing bridge. However, that typical example is about one event vs. multiple collapsing events, like many small cracks before it finally collapses and drawing event boundaries. Here, to stay with the analogy of a bridge, the ‘collapsing bridge’ argument is rather one of “is the number of collapses equal to the number of the parts of the bridge?”. Clearly, for the bridge, there is one collapsing event regardless how many parts the engineer used to build the bridge. If negligence has been found in the actual construction, then there is one fine for the culpability of bad bridge-building by the responsible construction company, not thousands of fines, one for each screw, cable, metal pole, and brick that the bridge was made of. Similarly, if someone steals a car, then there is one culpable act, not thousands for each part of the car. One easily can draw a parallel with computer programs: they are functional wholes, and any unlawful act, such as illegal downloads or copying copyrighted source code of a program (not under a public licence, such as GPL), tally up for the wholes, not for its, possibly thousands, of constituent parts.

While this reasoning by analogy may already suffice as argument to some, one could dig deeper into computer programs to assess the strength of the analogy. This means that it first needs to be established how all those files of a computer program relate, and of the source code in particular since that may be subject to copyright in some, but not all, cases. This may not seem as straightforward as the structural components of a bridge or a car. The main question to answer, then is:

  1. Is a computer program (or source code) a) a (tight) whole with parts, b) a whole that is a collection of artefacts, or c) just a set of artefacts where each element is a separate self-standing, independent, item?

This question generates two more specific ones to answer regarding the parts, in order to be able to answer Q1:

  1. What is the relation between the files of a computer program (resp. source code) and the computer program (resp. source code)?

  2. What is the relation among the files of a computer program (resp. source code)?

An answer as to the ontology of the computer program as a whole may also elucidate what exactly copyright infringement really means for a computer program and, specifically, help draw boundaries of what is, and what is not, part of the program.

There is ample documentation with ‘just so’ statements about the compositionality of software, i.e., that a program consists of parts that make up the whole, and why that is essential to good software design practices (e.g., [2, 19, 20]) and to system design more generally [23]. Yet, to the best of our knowledge, there is no extant argumentation regarding the wholeness of a computer program or what it is that makes it a whole consisting of parts, why, and how from an ontological perspective. The more common question—that is investigated in philosophy in particular—is the nature of a computer program, like whether it is a process or of some other category [24]

or when an artefact can be classified as being a computer program. They are considerations

at that level of granularity of the artefact, not about its compositional nature. Such arguments do, however, strengthen the argument that a computer program is a whole.

To answer the questions, we take insights from part-whole relations and mereology, the notion of unity, and basics of granularity, to combine them with modes of participation and functional parthood. Computer programs have a main function that is carried out by genuine functional parts and those parts have a specific unifying relation among each other at that finer-grained level of granularity, where one implies the other, therewith making a computer program a functional whole.

In the remainder of the paper, we first introduce some preliminaries in Section 2. We then consider the ‘vertical’ relation between the whole and its components in Section 3, the ‘horizontal’ relation between the components in Section 4, and close the argument that software is a proper (complex) whole in Section 5. We conclude in Section 6.

2 Preliminaries

There are many aspects to computer programs, but only a subset is needed to answer the questions posed in Section 1, as it concerns the notion of compositionally specifically, which can be answered with going into a debate about, say, what copies of files are or whether a program is a static ting or a running process222In more detail, out of scope are topics such as whether a computer program is a process, a disposition, or an endurant, or of another main category typically found in a foundational ontology (i.e., the ontological nature of a computer program), the nature and multiplication aspects of so-called ‘information objects’, and digital copies with their identity, such as that the files stored on disk temporarily reside in main memory.. The argumentation for it avails of some theoretical foundations, which are described in an introductory way in subsections 2.2-2.4. While they might appear ‘uninteresting’ to some readers and some of it might seem irrelevant to computing, we include a few computing examples and we definitely will apply all that to computer programs in Sections 3-5 and therewith answer those three questions from Section 1.

To meaningfully discuss the relevant details with respect to the scope of this document, some more preliminaries will be briefly summarised for the reader unfamiliar with them. They will assist in following the argumentation to establish that the computer program is a whole, and why. They are of a general educational or recap nature, as the case may be, rather than specifically IT only. In order to work toward a conclusion and justification, there are three relevant topics:

  • Part-whole relations, and in particular:

    • i) Mereological parthood, and possibly any of its refinements;

    • ii) Mandatory, immutable, and essential parthood;

    • iii) Collective entities (collections) and sets.

  • Identity and unity.

  • Granularity.

After a basic note on computer programs and the introduction of a running example, we introduce each one briefly informally for the reader unfamiliar with them, so as to keep the document self-contained. The respective formal characterisations (i.e., formalised in logic) can be found in the literature cited. For the most part, those details are not needed for the argumentation on software wholes, unless stated otherwise.

2.1 A note on computer programs and a running example

First, one could debate the definition of computer program. Consider, for instance, the definition of “computer program” as defined in the Copyright Act 98 of 1978 of South Africa333inserted due to 1 (g) of Act 125 of 1992 (available at https://www.gov.za/sites/default/files/gcis_document/201409/act125of1992.pdf; last accessed: 16 July 2020), it being: “a set of instructions fixed or stored in any manner and which, when used directly or indirectly in a computer, directs its operation to bring about a result.”. This definition presumably entails that: i) it is essential that a computer program has some function that is ‘to bring about some result’ x, ii) that it indeed can execute its task to bring about that result, i.e., the source code compiles (or is being interpreted) and successfully executes its task to perform function x, and iii) it distinguishes form from function, since it can be done ‘in any manner’, i.e., how it achieves that function is beyond the definition’s scope. The function view on computer programs is also held up in various courts cases [9] and, as we shall see further below, also from a theoretical viewpoint.

There are many types of computer programs. For instance, there is operating system (OS) software, such as Windows, Linux, and Mac OSX that have specific editions, such as Windows 10 and Mac OSX Yosemite 10.x, and utility software, such as disk defragmentation and activity monitor tools, which assist a user to manage the computer hardware. For end users, the most well-known category is applications, which are computer programs that require an OS to run to carry out its intended function. Popular examples of specific applications include Microsoft Word (a word processing program), WhatsApp for chat, and browsers, such as Google Chrome and Internet Explorer. Among application software, there are further subtypes, such as desktop and mobile applications, and among those, further divisions, such as native mobile apps designed for a particular OS and hybrid mobile apps. Hybrid apps use Web technologies, such as HTML 5 and Javascript, and are then wrapped in a ‘native’ wrapper specific for the OS of the mobile phone (Android, iOS etc.). A popular example of a native app is WhatsApp and example of hybrid apps are the Twitter, FaceBook messenger, and Uber apps.

To answer the questions posed in the introduction, we need to dig a bit deeper, into the components that make up a computer program. Among the illustrations, we will reuse one multiple times, which is introduced in the following example

Example 1 (Running Example: IVO)

As running example throughout this document, I will use the IsiZulu Verbaliser for Ontologies (IVO) [13], which is a desktop application that puts into natural language (‘verbalises’) structured input in the form of an ‘ontology’ that is stored in some text file (in OWL/XML format, to be precise) and that the author was involved in the design and programming of. Verbalising the OWL file into sentences in isiZulu is its only function. A snippet of the input in that OWL/XML format is as follows, which is normally not intended for human consumption, but for computer programs to process:

....
<SubClassOf>
        <Class IRI="#uSolwazi"/>
        <ObjectSomeValuesFrom>
            <ObjectProperty IRI="#fundisa"/>
            <Class IRI="#isifundo"/>
        </ObjectSomeValuesFrom>
    </SubClassOf>
...

IVO takes such input and then generates natural language sentences from it in isiZulu (the Zulu language). The end user interface of IVO with the result of verbalising a particular ontology is shown in Figure 1, where the input file included the snippet above, which resulted in the third sentence in the exists series of sentences, starting with “Bonke oSolwazi…”, and the other sentences from the other serialised axioms in that file. It can also output that on the command line, but then without colours for the different categories of words.

Figure 1: Screenshot of the isiZulu verbaliser with the enduser rendering of the output (source: adapted from [13]).

2.2 Parts, wholes, and part-whole relations

Part-whole relations are used and are being investigated in several fields of study and applications, notably philosophy, ontologies for information systems, conceptual models for application development, linguistics, and natural language processing (see

[11, 25, 27] and references therein). For instance, you heart is part of your body, in the process of eating food, there is a part-process of swallowing that food (and also taking a bite and chewing), and this paragraph is part of this section that is part of the document. There are some finer-grained details to it depending on the context where a ‘part of’ is used in natural language, which has resulted in a fairly stable list of common types of part-whole relations, as shown informally in Figure 2, including mereological parthood, refinements thereof, and informal ones in natural language utterances only but not mereologically [11]. For instance, instead of using aforementioned ‘part-process’, one could name the parthood that relates only processes to its part-processes as, e.g., (), specialising mereological parthood () that does not have any constraint on its participating entities; i.e., . Then, with a natural language utterance ‘each Eating event involves a Swallowing event’, one then identifies involves as a surface realisation of the relation, and thus eating has as part swallowing. The easier way of communication is ‘eating involves swallowing’.

Figure 2: Summarised sketch of the part-whole taxonomy and informal descriptions of their domain and range (extended from [11]/simplified from [12]).
Example 2 (Part-whole relation examples for computer programs)

Computer programs, when they run, can be seen ontologically as an ongoing process with sub (i.e., part) processes until someone closes the running of the program444Such a process in ontological sense is different from the technical meaning of the term ‘process’ in computing, where it has a narrower meaning.. Program code when the programmer writes it, is in letters such as these, and within that written ‘text’, on can identify parts of the source code file, such as a ‘header’, a class declaration, or a function, which can have components in turn. For instance, consider the function locpre that adds a locative prefix to a word, such as adding e- to iTheku:

def locpre(word,nc):
    phonolocword = ’e’ + ’attached’
    if nc == ’1a’ or nc == ’2a’ or nc == ’3a’ or nc == ’17’:
        phonolocword = vowel_coal(’ku’,word)
    else:
        phonolocword = vowel_coal(’e’,word)
    return phonolocword

It calls another function specified elsewhere in the code, being vowel_coal, that takes care of phonological conditioning when prefixing the word555Following the example: isiZulu does not have consecutive vowels, and the vowel coalescence in this case then applies to e- + i- of iTheku, which then returns eTheku. Locative suffix processing will eventually result in eThekwini (Durban).. In turn, locpre is called in the function pw_ci, which, in turn, is one of the functions to generate a natural language sentence (for contained in), which is a function in the IVO that has as function to verbalise an ontology. That is: one can construct a hierarchy (more precisely: a partonomy) of functions and part-functions.

There is a notion of atom: those parts that cannot be divided further, i.e., that do not have any parts themselves. In the other direction, toward the ever greater whole, the boundary lies with the software specification: while our specification went as far as the IVO, one easily can take the source code and use it as a module in some grander computer program; e.g., one that accommodates the visually impaired and also synthesise speech of the sentences generated and that allows one to save both the sentences and the generated speech. What we see here conceptually, is some semblance of a software version of Lego, with basic building blocks, recognisable units, whole entities that have those units as part and that also can dual as modules (parts) of larger wholes, building software from components.

Similar structural parthood relations can be observed in many other everyday artefacts. For instance: the cells of a heart are part of a heart are part of a body, the nail hammered in a piece of wood that is part of the bridge. These things are all physically connected. There are also whole artefacts whose components are not physically connected, but stand in a part-whole relation nonetheless. For instance, a bikini and its parts, and if a manager orders a new “desktop computer for the office” for a new employee, it goes without saying that means machine + monitor + keyboard + mouse, not just the machine without any input/output devices.

Collectives.

On the right-hand side of Figure 2 are part-whole relations that are not parthood in the strict sense of mereology. For instance, in an sentence “Ms. X is part of the project team GoTeam”, the ‘part of’ is better characterised as or relation between the part or grain Human that Ms. X is and the collective Project Team that GoTeam is an instance of, rather than parthood, because transitivity cannot be guaranteed [11]. A key feature of such constructions is that the object of the sentence is a physical object or a social object that is a collective, such as team, flock, or fleet. For instance, Project Team may be considered to be a collective consisting of human beings, any Flock has as members sheep, and Fleet has as members (grains) the ships that are part of the fleet. What exactly collective is, has been investigated as well, in fields as different as conceptual modelling for information system development [8], biomedicine [21], and social theory [5]. They may endow collectives with features or constraints on the whole collective or on its parts. Widely-agreed upon key characteristics are that the collective: i) has an identity, ii) has some particular meaning, and iii) it may survive its members, i.e., a member may change or be swapped but the whole keeps its identity. For instance, the rock band Queen can be identified, it has the characteristics of a rock band, and it has had some changes of membership along the way (it has had three bass players before John Deacon), yet Queen was already, and remained, Queen.

One can put further constraints on the collective for it to be collective, such as that they can and do perform actions [5], which is an argument popular especially in legal matters. Another possibility is to put an additional constraint on a collective’s parts; e.g., that those parts must all perform the same role and if they do not, then the entity is more complex than a mere collective [8]. Collective nouns in natural language, such as a fleet and flock thus may, or may not, be considered collectives ontologically, depending on such additional constraints.

Sets.

Collectives stand in contrast to mere sets. A set is just a mathematical ‘device’ where a change in member changes the identity of the set, its members do not need a common binding feature, and therewith a set does not need to have a specific meaning, and sets have no agency. For instance, your left foot and my laptop is a set with two objects as its members, but it is meaningless ontologically and it does not cause anything of itself. One may, of course, construct meaningful sets, such as the set of all mobile apps, but that does not mean that all sets have to be meaningful. Also, sets have members, not parts.

Particiaption in the relations.

In addition to looking at the type of part-whole relations, we can say something about how the part and the whole relate, which, perhaps, could be referred to as the strength of the participation of the part and the whole in the relation. There are four principal options [1]:

  • the part may, or may not, be part of the whole , or the may or may not have that part; e.g., a rear-view camera that may be part of a car, but a car does not need to have one.

  • some must be part of , or vv., but not necessarily that one; e.g., for a house to be a house, it has to have a roof as part, but the roof can be renewed while still being the same house, and similarly for a human to mandatorily have a heart as part.

  • that particular must be part of (or vv.), for as long as is in a particular role; e.g., a boxer must have as part his own hands and if he loses a hand he ceases to be a boxer (as entity he still may continue to live).

  • that particular must be part of and if it loses that , the will cease to be ; e.g., how the particular brain is part of a human: remove the brain and the human cease to exist as a human (turns into a corpse).

Besides how the part/whole participates in the relation, one also can make statements about shareability of the part:

  • some may be part of more than one at the same time, e.g., a talk may be part of a course and of a seminar series;

  • is part of one at a time but can be part of different s at different times, e.g., a removable network card may be part of one PC at one time and part of another one another time but it cannot reside in two different PCs at the same time;

  • can be part of at most one , e.g., entities such as the brain and spinal cord as part of a human body and some particular swallowing event that was part of eating your lunch yesterday.

Functional parts.

To talk of functional parts, we first have to consider function. Rather than investigating that here, we use an off-the-shelf definition that emerged from extensive research. Mizoguchi et al. define functions as “a role played by a behavior specified in a context” (for detail, see [18]). While this definition may sound somewhat obscure, it is closer to the meaning than, say, the main Oxford Dictionary definition666It defines function as “An activity that is natural to or the purpose of a person or thing.”, but a function need never be realised so then is not an activity that is happening; e.g., after using the new scissors to screw in the screw, it may never perform its intended function of cutting because the blade got bended. (Definition’s source: https://www.lexico.com/definition/function; last accessed on 14 July 2020), since this is broad enough to also include situations where, say, a pair of scissors is used to screw in a screw, in that that pair of scissors performs the role of screwdriver. Many types of function have been identified; e.g., the ontology of function identified 89 different types of functions [16]. Two key observations are that a “function is necessarily supported by the structure and/or properties of the things” and that “one of the most significant properties which function must have is implementation-independence.” [16]. This thus relates ‘vertically’ between the part and the whole, where those parts contribute to realise the overall function (with to without component functions that contribute to that) and the structures that realise that would thus be “functional parts” of the whole in some way [17, 26]. The definition of functional part provided by Mizoguchi and Borgo [17] is as follows: “Given an entity A and a behavior B of it, a functional part for that behavior is a mereological part of A that, when installed in A, has a behavior that contributes to the behavior B of A.” This is underspecified when taken in isolation, because it is meant in the context of their definition of function of the “a role played by a behavior specified in a context”. Therefore, we refine the definition as follows:

Definition 1 (Functional part)

Given an entity and a behaviour of type of it that plays the role of function F in that context , a functional part for that behaviour of type is a mereological part of that, when installed in , has a behaviour of type that contributes to of , where plays the role of part function .

Mizoguchi and Borgo then go on to identify four types of functional parts which are: genuine, replaceable, persistent, and constituent functional part. A part is a genuine functional part if it is installed correctly in , from a structural viewpoint; a replaceable functional part refers to the regular ‘mandatory’ participation, discussed earlier; a persistent functional part is either an essential or an immutable part; and a constituent functional part is a generic part regardless its assigned position as it may be temporarily taken out physically [17]. Although Mizoguchi and Borgo did not formally characterise these parts, it is clear from the context in the paper that it is to be understood as proper part.

2.3 Identity and unity

Ascertaining the identity of an entity is about examining something as being the same or different at one point in time (synchronic identity) or across time (diachronic identity). To establish the identity of an entity, one uses identity criteria that the entity has, which is a stronger notion than identification criteria (typically a single artificial attribute) that may be assigned to entities. For instance, one may identify Tibbles, the neighbour’s cat, by the colour of its fur, its size, and how it meows, whereas artificial attributes of entities assists with identification, such as one’s ID number and the MAC address of a network card. Within the context of software, it may be of use to also note “striking resemblance” or “comprehensive non-literal similarity” of program code777The famous cases where this got established are Franklin vs. Apple (1984) where Franklin had copied Apple’s operating system often line by line and Whelan vs. Jaslow (1987) where Whelan had developed a program in Fortran and Jaslow had then rewritten it line by line in Pascal., which holds when two pieces of code are identifiably different, but the semantics/logic of the expression of the idea is the same. Two typical examples are 1) to take some piece of code and change the names of the variables: they are different documents with, to the letter, different text, but have striking resemblance; 2) to keep all the business logic of the code but translate it line by line from one programming language into another language (say, Java to C++).

Unity [6, 7] is a special case contributing to identity, which focuses on the relation that the parts have that make up the whole entity that has some identity. Unity criteria are those criteria that have to hold among the parts for the whole to be a whole. Depending on the nature of the unifying relation, one can identify different types of wholes. Some typical examples are topological wholes, such as a tree and a heap of sand, morphological wholes, such as a bouquet of flowers and a constellation of stars, functional wholes, such as scissors and a bikini, and social wholes, which are certain types of collectives, such as a population of a country and a herd of sheep. Besides the aforementioned part-whole relation between the parts and the whole, there is a ‘something’ that binds the parts together to be a meaningful whole with an identity, which stands in contrast to, notably, stuffs (typically indicated with mass nouns, such as water, mayonnaise, and wood) and arbitrary sets of entities like the set of your left foot and my laptop. Since in the current context we only deal with programs, we exclude stuffs or amount of matter from the possible relata henceforth.

Guarino and Welty sought to formalise the notion of unity, with a good start made in [6]: with the generic unifying relation that binds the parts to each other with respect to the whole, and mereological parthood that is temporally indexed (presumably with time as discrete time), their first condition is that

(1)

Informally: if there is a parthood between and as some time , then at the same time it holds that for any entity that is also part of the same whole at that time, then it is in a unifying relation with the part , and vice versa.

Eq. 1 and the description in [6] assume weak supplementation would apply, under the assumption that , which they specified as , i.e., every proper part must be supplemented by another, disjoint [not overlapping ], part, because of the implication: there must be another part of , being , so that they can relate through .

One would want to exclude any mereological sum that is just a contingent whole, like my thumb and index finger. This can be seen in analogy to the difference between mere sets and collectives. Noting a typo in their formalisation in [6] (corrected in their lecture slides on OntoClean), and adding quantification, we formalise it as

(2)

That is, it does not hold for all and that when they are related through at time they are also a part of , or: and may be bound in some way, but they are not therefore also part of a whole because of that. This raises a few questions about the exact configuration when Eqs. 1 and 2 are taken together. For instance, can or do they have to be different s that relate and ? Can they be part of different wholes, or: are the s in Eqs. 1 and 2 supposed to refer to the same whole (which they currently do not)? Weak supplementation aims to rule out the possibility that , in that there cannot be a whole with a single proper part, but unity is declared with , not , so would unity still apply just in case there is only one part? These finer-grained details about the configuration brings us to the last preliminary: granularity.

2.4 Granularity

Granularity refers to the level at which one operates or analyses and that there are different levels of granularity [10]. For instance, at level , some amount of water is a type of stuff, but at its finer-grained level , there is a set or collection of HO molecules, or at one has an instance of Team and at are its grains, Human Beings. The level of analysis may be of importance ontologically. For instance, Organisation is a social entity at level that has one or more instances of Human (more precisely: its employees) as members at . It is only the former, however, that must have a bearer with a single physical address that holds for that entity at that level. That feature is not inherited by its constituent parts, since, e.g., the company’s employees typically live elsewhere. Similarly, it is the app that has, at that level of granularity , the property of directing the computer’s operation to bring about a result. That is, the whole entity at has properties of its own, which can be—and typically is—very different from what its constituent parts do at the finer-grained level where they reside.

Note that granularity focuses on the more/less details of analysis, which is different from prioritisation at the same level of granularity. For instance, there may be direct parts , and at level of whole at and one is interested only in and . This does not push into a finer-grained level , since an ontologist’s interest is a separate matter.

3 Software components and their relation to the source code and program

Let us first have a look at some important features of computer programs, which concern their parts, construction and use, and how those files relate to source code and executable.

Figure 3: Principal components of the OWL verbaliser (source: adapted from [13]).

For illustrations, we continue to use the IVO, but now we start looking ‘under the hood’ to its architecture and components, whose principal components are shown in Figure 3 and described in the following example.

Example 3 (IVO’s architecture)

Consider Figure 3, with the coloured shapes: the large green rectangle is the IVO computer program, with main being the central component that is used to ‘launch’ it and to do some final processing. main pulls all the other components together, being 1) the integrated modules (in orange) Owlready and Tkinter that already existed beforehand and were developed by someone else, 2) the two other clear purple boxes with program code that this author wrote (verbaliser algorithms and pluraliser of nouns), which, in turn, 3) call certain text files (pale purple, right-hand side) with required data structures to function properly. The yellow rectangle separates out the input/output: one gives IVO a text file in OWL/XML format, to load and process, IVO will then process it and return the generated sentences as output either in the GUI, like in Figure 1, or in the terminal.

One has to distinguish between the source code as artefact and the compiled code (or interpreted code) as artefact, since multiple source code files may compile into one compiled artefact that is a digital file (the executable), source code files may result in (with ) files of the software application, and the application has one ‘point of entry’ to launch it irrespective of the number of files it needs for its proper operation to bring about a result. Since the arguments are the same for compiled code and interpreted code, we will use for convenience just the shorthand of compiled code and compilation process henceforth, but it all just as well applies to program code written in interpreted languages, such as Python, and who’s program files are interpreted into machine code as well.

3.1 Modularisation and compositionality

Good design principles, be they for engineering for physical objects or for software, avail of modularisation and compositionality [2, 19, 20, 23]. That is, one breaks up a large problem into smaller ones, devises solutions for the smaller problems, and puts those together—declaring relations or rules-as-relations to specify how the components interact—to solve the larger problem. Some key benefits of this approach include manageability, division of labour when multiple people work on solving the problem, potential for reuse, quality control, and maintainability.

Example 4

For instance, with IVO (Figure 3), Owlready is reused from elsewhere, two files with the algorithms access the same data structure file (nncPairs.txt) rather than having to declare that twice (once in each file), and the vocabulary is easy to maintain by just updating a text file rather than having to dig through the algorithms in the source code files with program code.

There is a way how these files relate to the program and source code as a whole, and, taking a small abstraction step from concrete files to conceptual containers, how components of the architecture of the software relate. Since they are components, one should have a look at theories of how components, i.e., parts, relate a whole: this is the topic of mereology. A mereology-based approach (joined with category theory to accommodate multiple types of part-whole relations), has been applied to component-based software architectures [15], but not at the level of code. One could extend that idea by assuming that each component of the architecture is realised by one file. Practically, especially for production-level software, each component of the architecture may be divided up into smaller components to realise the purpose of the component, and some may be combined into one file, which brings us to the next point.

There is a difference between what an artefact, such as a computer program, is designed to do, i.e., its function or intended purpose, and how that is realised. This follows directly from the modularisation, since there may be multiple ways of breaking up the design into smaller components. Indeed, for some simple artefact, such as a hammer, these options are rather limited. A straight-forward webpage can have everything declared in one single HTML file, but that also can be split up into multiple files: the text to be displayed written in an .html file, the layout relegated to a style sheet in .css called from the .html file, and some javascript in a separate .js file also called from the .html file. This can be much more complicated for programs, and especially in their source code form. There may be modularisation by sub-function that each are stored in a separate file.

Example 5

Regarding sub-functions, consider the pluraliser.py that generates a plural from a noun in the singular that is used by the module that generates the sentences, imported packages that themselves consist of several files (e.g., Owlready), a data structure that is essential to several functions that are written in several files and therefore it is stores in a separate file (e.g., nncPairs.txt), a file with configurations values, and so on.

Then, while a single source file in, say, the programming language C will result in a single program file, in the intermediate steps there are other files and processes involved: during the compilation process that translates the source file into the object file, .lib files are called by the linker to generate the program file.

In this context of function and modularisation, let us digress briefly into the user experience of the computer program. A function may be realised by more or less well integrated components that may be automatically invoked or manually started. Manual actions may give the user the impression of it being distinct but related components, whereas automated processes will provide the impression of it being one entity. This may be configured as such by design. For instance, a user may by default block the running of scripts in spreadsheets, so when there is one, the user has to approve of its execution. However, when the security setting is such that running scripts is permitted by default, end-users will not be aware that it is executing the script as a sub-process of the spreadsheet process or even that there is a script running at all.

User experience of seamless integration is not a good measure whether some component file is really a part of the program, or whether there are actually multiple programs. A possible way of determining that, is with respect to the code and behaviour888e.g., in the GNU FAQ about “aggregates”: https://www.gnu.org/licenses/gpl-faq.en.html#MereAggregation; last accessed on 16 July 2020. : how do the files interact (i.e., what mechanism), and what does that interaction mean (i.e., the semantics of the communication)? The GNU FAQ suggests that links such as function calls and inclusion by reference definitely is a tight enough interaction for indicating parthood, as are the links resulting from the linking stage in the compilation, whereas piping on the command line—e.g., output from one to be used as input into another—is seen as too decoupled interaction and therewith a reason to lean toward them being separate programs, with the caveat of “exchanging complex internal data structures” where the semantics of what is being communicated is “intimate enough”, then it still may be considered as one program with two parts. Either way, these are examples of considerations for possibly counting as unifying relation that is required for a whole to be a whole (recall Section 2.3). We shall return to this, and attempt to resolve that aspect of interaction among (possible) parts, in Section 4. Currently, to the best of our knowledge, there are no formal, widely agreed-upon, and standardised criteria for determining whether a file is part of a program or not.

3.2 Mode of participation in the part-whole relation

We have to consider in what way the files of a program participate in the relation to the program as a whole. As noted at the end of Section 2.2, there are four options: optional, mandatory, immutable, and essential participation. We now have to assess which one(s) apply in which cases with respect to the source code and the compiled program. On principled grounds, we will first take this to be under the assumption that the program is free of bugs, and sources code that needs to be compiled, will do so without errors.

Optional participation of a file: In the interest of optimal design and development and optimised (minimised) software, there should be no optional files either in the source code or in the running program. That is: if the source code still compiles and the program can continue to operate without any bugs without that file, then that file is redundant and should be removed. Yet, there may well be sub-optimal source code and programs, in particular when some modules or libraries are imported. Such imports are typically used as-is and operationally redundant files may not have been removed to save oneself testing whether it could be removed.

Example 6

IVO and optional participation: The programmers did not check whether all the files of the imported Owlready are used and just bundled Owlready with the verbaliser. It is very likely that not all Owlready source files are used, since IVO only fetches data from the OWL file but does not modify it, yet there are functions stored in separate files to modify the ontology, such as instance_editor.py, which is never used by IVO at least.

Mandatory participation of a file: the file has to be present and at the right location, but it can be swapped for another without altering the function and functioning of the software and it won’t introduce bugs. This would be the most common case.

Example 7

Mandatory participation applies to IVO as well: the code looks up nouns in the nncPairs.txt file for any axiom it has to verbalise. If it is not there, it will throw an error, but it can be overwritten with a different file also called nncPairs.txt that has to same outward function and minimal (but possibly more) content, e.g., one that has more nouns with noun classes listed in the file. The function—verbalising the supported types of axioms in isiZulu—remains the same regardless.

Essential participation, in short: remove that particular file and the app is ‘broken’. There are different levels of ‘broken’, however, such as having minor bugs but the core function still can be achieved mostly or fully, major issues so that it only works partially, and the program not being able to run at all. From the binary viewpoint of ‘bug-free or not’, these differences do not matter: if not all specified functions, including sub-functions, work when that particular file is not present, then it is an essential file.

Example 8

Essential participation may hold for the Owlready module. For instance, if any other version than that instance Owlready-0.3 is not backward or forward compatible, then that version is essential to the function of verbalising ontologies by IVO.

Immutable participation of a file: putting the definition in terms of software, then when the software performs a particular role, it is essential to it. It is highly debatable whether this mode of participation in a part-whole relation would be applicable for either source code or computer program, for two main reasons. First, if at some point the program has to perform some function , then it is part of the specs at all times. The only case where it may be argued to be applicable is when the program has multiple alternate functions, and one is using only a subset of it, then if there are files essential to the operation of that subset of functions but they are not used when running deploying the tool differently in a different, disjoint, function, then they could be said to be immutable parts. The easy counter-argument to that, however, is that the software specs has the requirement that the software has to have those two different functions, and then we’re back at mandatory participation at least. Second, in addition to the former, for that particular file in that particular role or mode of operation, it would have to be the case that only that version of the file or module would work with the rest of the program.

Example 9

IVO has two ‘modes’ of operation—terminal-based with as aim to incorporate the verbaliser into other tools and one for end user interaction for reading of the output with a GUI. Tkinter is only essential for the GUI operation mode999Tkinter is the de facto user interface module for Python; see also https://wiki.python.org/moin/TkInter; last accessed on 16 July 2020., but not the terminal-based mode that still performs the essential function of verbalising ontologies. One thus might argue that Tkinter could be an example of an immutable part of IVO, if it were the case that another version of Tkinter would not be able to generate the user interface. Yet, if the requirement is ‘to be able to operate in two modes, terminal and with end user formatting’, then omitting Tkinter breaks the overall functionality, and would therewith be at least mandatory (if not essential), rather than immutable. The requirement was to have both ways—the GUI for the linguist to check the output easily and the terminal output in anticipation of reuse—so then immutability does not apply in this case.

In sum, for bug-free operation, mandatory participation in the parthood relation is the common case, with optional participation arguably amounting to lazy or time-constrained coding and essential participation would be an example of ‘brittle’ coding practices. With these principles in place, one can start going off a sliding scale: what if we let go of the bug-free premise? What if one prioritises the functions of a program? Drawing the line in that case will end up to be subjective and dependent on the example. For instance, take the relatively well-known program MS Word: even if the Clip Art feature does not work, it still easily can go on with word processing, but if the ‘insert table’ or ‘create columns’ functions are dysfunctional, definitely less so, although one still can type and save text. This would require establishing a partonomy of functions, each with priorities or weights assigned, and some threshold function to compute the outcome.

Lastly, shareability of a file depends on whether a file has to have a lock on it when open because of concurrency issues. If that is the case, then that component file of the computer program (not the source code) is only sequentially sharable. Due to the unclear nature of software artefacts, exclusivity is debatable depending on the choices—among others, whether copies count as distinct and whether it is the file as information artefact, etc. Since this is not crucial for source code copyright/copyleft issues, trade secret violations, and illegal downloads, we will not address this topic further.

3.3 Functional parthood

Since there is the compositionality thanks to modular design, and the types of files with their contents are distinct rather than subsets, the applicable relation between those component files and the whole—the Source code or its compiled (or interpreted) Executable—is that of parthood. One may argue that it would also be a proper parthood in most cases, because production-level software normally consist of multiple files in source code and the compilation process. Multiple part-whole relations have been proposed (recall Figure 2), so this relation possibly may be refined. However, because the ontological status of a file and of a computer program is not fully settled [24], a confidence in applicability of a refinement along the line of [11]

is limited, except that from a ‘written text viewpoint’ on code, it would probably be

structural parthood and when the program is being executed there are running processes and sub-processes, hence, involvement.

It is complicated further in that a program has a specific assigned function, such as Text processing for MS Word and Song management for iTunes, which each have sub-functions, such as Text formatting, Printing, and Playlist creation, i.e., there are functions and part-functions, and they in turn may have part-functions, such as Changing text to bold face, Adding song to playlist etc.

Recalling Definition 1 for functional parthood, and the four types of functional parthood that Mizoguchi and Borgo specified [17], we can no apply this to clarify the relation between the parts of the computer program and the program. If we assume that source and compiled code consist of more than one file and there are no optional files, then

  • Source code of the computer program: all files are genuine functional parts and replaceable. The files have to be in specific locations to be called.

  • Compiled code of the computer program: all files are genuine functional parts and possibly also persistent (if not, then they are replaceable). The files have to be in the right folders and only there.

This finalises the ‘vertical’ relation between the parts and the whole. We will move on to the relation among the parts in the next section.

4 The relation that binds the components

Taking stock, the files that are part of a computer program participate mostly mandatorily in the genuine functional parthood relation () with the source code or compiled program, normally there is more than one file involved, and each file either contributes to some function or performs one or more functions to contribute to the specified function(s) of the software. In terms of applying Eq. 1 toward sorting out that ‘unity’ for wholes, a first step would thus be the following:

(3)

Note that the time indexing with is kept here from Guarino and Welty’s Eq. 1. This is more precise as it allows for specification whether it has to hold at some time or at all times or at different times, rather than not saying anything at all about it. For computer programs, it may assist later on with any possible extensions that would take into account versioning, without affecting the foundations.

This brings us now finally to addressing the research questions posed in Section 1. First, we look at its question Q2, which amounts to resolving the “unifying relation” for the source code and the compiled program, and related loose ends from the formalisation of unity in Section 2, such as the implicit assumption that in the axiom.

4.1 The unifying relation : can it even be applied?

When we characterise from Eq. 3, i.e., the relation among the parts, it will ensure we arrive at either that the program is a (functional) whole since it has unity, or, if it cannot be specified, that then there is no whole after all. What such as then may be is unclear from the exposé on unity in [6]. There is not even one example of such a unifying relation and nor do the OntoClean examples [7] have any example for either—it is just assumed to exist with some appeal to intuition. Philosophers, such as [5, 14], do not do a better job at it, like with a mere “x is unified in the right way” [14, p180]. At the other end of the spectrum in the conceptual modelling area, it goes the other extreme with non-reusable situation-specific instance-level relations like “carrying out research in the same sub-area of the area of distributed systems in the University of Twente” [8, p159], but no type-level relation that may hold throughout similar cases. For instance, while Bikini is a functional whole consisting of two parts, the particular type of unifying relation that holds between the two parts, for any bikini, is left undetermined. However, if we are to assess program, then this unifying relation needs to be ascertained, for it would assist explaining why it is a whole or not. We shall try this in the remainder of this section.

Let us first consider two cases with commonsense examples. For a morphological whole, such as the Constellation example (e.g., the stars making up the Sagittarius sign), this may be, say, that all the stars involved share that they participate in the connecting the drawing lines to make up the figure that is a certain shape. For social wholes, such as Electorate of a country, its parts (or members of the collection)—the voters—share that that they stand in a fellow adult citizen (of some country) property relation to each other. Applying Eq. 1 specifically for the purpose of illustrating its workings, let’s formalise the latter as follows:

(4)

So, if The Daily Show presenter Trevor Noah is a voter in the Electorate of South Africa, then if there is another voter, say, the national rugby team’s captain Siya Kolisi, in the Electorate of South Africa, then Trevor Noah and Siya Kolisi are fellow adult citizens and vv.; hence, it can be applied generically and for specific instances.

4.2 for computer programs

Since the notion of unity seems to be usable at least in other domains, let’s consider programs: what is it that unifies, or binds, the files, other than some vague ‘being another component’? Because there is a compiled program and a source code program, with as consequence that the files comprising each are different, the relation among the file is, perhaps, different, too. Therefore, we shall address each in turn. In both cases, we use the ‘realistic case’ argument, i.e., setting aside the corner case of one main file without calling anything else ever, since we are interested in practical applications.

4.2.1 The unifying relation for source code files

Recall that regarding source code, the whole is the source code for the application, such as Firefox v77.0 source code or Linux kernel v5.7 source code, which operates at level of granularity, regardless how that is organised. The components and the organisation thereof—i.e., the files that make up the source code like a file with the main() and imports—operate at the finer-grained level of granularity. This is likewise for the computer program, such as the Firefox executable app and Linux kernel with the compiled code.

There are various ways to link the source files to the ‘main’ file that stands at the root. For instance, there may be import statements, as is the case with the Python and Java programming languages, an #include statement in case of C and C++, or some other coupling mechanism, such as function calls [19, ch9]. An example of cascading imports for the IVO source code is depicted in Figure 4. The organisation of the source files form a tree that may have files with more than one parent, which is also called a dependency graph. Since each coupling mechanism goes one way, it is called ‘directed’. Mathematically, it is thus a directed graph with a top-node, where the other nodes may have more than one incoming edge and more than one outgoing edge, ideally that graph is acyclic, and through that chain of connections one should be able to end up at that top. A circular import, be it directly between two files importing each other or indirectly through a chain of import statements may not be forbidden by the syntax of a programming language, but it is considered an anti-pattern, i.e., bad design, and, consequently, there is an Acyclic Dependencies Principle with strategies for breaking cycles to foster good design [19, ch9]. Also, there may be ‘orphans’ in the source code: one or more files that do not relate to anything, or at least do not seem to. While one cannot guarantee they are useless during execution, it is highly likely to be either excess baggage or the coupling mechanism is so loose, it may have to be considered to be a separate program. This argument is advanced also by GNU’s copyleft licences (cf. footnote 8).

Assuming the aforementioned sufficiently tight coupling where the files will show up in a dependency graph, then, for any two component files of the source code, being genuine functional parts, they thus relate through that directed graph of dependencies, either further down in the hierarchy or upward and into another branch. One thus always can construct a path between any two nodes (ignoring the direction of the edge). Let us call that the property, then Eq. 3 can be updated into Eq. 5:

(5)

That is, there is a notion of unity for source code. We still need to verify it is not just a contingent whole. That is, either the following holds:

(6)

or, at least, that it holds with Eq. 2’s generic rather than the refined . Is it possible that and are (genuine functional) parts of some whole but not do stand in a relation to each other, or the other way around? This is indeed possible: they may be parts of a library of header files that come with the C/C++ Integrated Development Environments such as Eclipse101010http://www.eclipse.org/cdt/; last accessed on 16 July 2020. or and are in a Python repository of modules , such as PyPI111111https://pypi.org/; last accessed on 16 July 2020. or your own one. In those cases, and are neither in the relation nor, arguably, but another part-whole relation, since there they are part of a collective. It may be that one header file imports another, or one module avails of another in such as repository or library, but this need not be the case, and many files will not do so.

Figure 4: Series of actual import statements of the isiZulu OWL verbaliser, for the Python files only (it calls other files as well, not shown).

Lastly, Eq. 1 implicitly assumed weak supplementation (‘there must be some other part, too’) even though it was not formalised as such. That is, when applied to the software case, that there ought to be more than one source code file. There exist source code that has no import or #include statement, although this is rare for production-level software. Still, with the current formalisation, even such a corner case is not a problem, since 1) weak supplementation is asserted for , not , but is used in the unity axiom and 2) for “” in the axiom to hold, even just one example satisfies it, which we just have with the module library.

We therefore can define source code to be a functional whole, as follows:

Definition 2 (Program source code as a functional whole)

For any computer program in source code form that has one or more genuine functional parts with and , and time where is the set of discrete time point, it holds that . For at least some , does not hold, i.e.,:
.

This concludes the part of the argument that, besides the ‘vertical’ relation of (functional) parthood, there’s also that ‘horizontal’ relation among the parts with the dependency graph counting for unifying relation for the source code.

4.2.2 The unifying relation for the object code (compiled program)

How can Eq. 3 be finalised for the computer program, as in the executable (object code) version of it? At that stage of the software development, the explicitly stated to be imported files have been included, so is not applicable here and thus no tree of source code files, since they have been compiled. Yet, there still may be multiple files that are part of the computer program that “brings about a result”; examples include an .ini file for initialisation setting, an .xml file with the locale setting for menu options in a different language, a dictionary to load for spellchecking, and a .dll as dynamically linked library.

Example 10

For instance, the platform-independent version of the Protégé tool (v5.2)121212https://protege.stanford.edu/; last accessed on 16 July 2020. is unzipped into three run files for different operating systems and five folders with files, which include the core jar files such as protege-launcher.jar, protege-editor- core.jar, protege-editor-owl.jar, and protege-common.jar, as well as jar files for other essential features, such as owlapi-osgidistribution.jar

, and a bunch of plugins, such as the automated reasoner

org.semanticweb.hermit-1.3.8.413.jar and some query functionality org.coode.dlquery-4.0.1.jar, further configuration settings (like in config.xml) and the occasional icon for the interface. This is in contrast to the platform-specific versions, such as a Protege.exe, which come as a single executable where all those separate files are bundled together and hidden from the user. That is, it is basically the same application, but it is offered in different modalities, which is entirely an engineering decision to cater for different user preferences, since the former, platform-independent version, is easier for, mainly, plugin management and less popular operating systems, whereas the latter is easier for end users to give it an out-of-the-box one-click-launch look-and-feel.

Put differently, when there are separate files, at least some of them could have been put ‘inside’ the executable, but from an engineering and usability viewpoint, they allow for more flexible customisation of the program and support the prospect of reusability. In any case, regardless the variation in configuration, if there is more than one file for the computer program, they are linked—in the sense of the compilation processes in compilers, not the generic term ‘linked’—so that for compiled code, we obtain a similar definition as for the source code functional whole, but then rather than with . There are several differences between the two relations: the latter is mutable and the former () is not; the latter can have those files swapped, the former not; they hold between different types of files; and the latter has the links manually specified by the programmer’s design, whereas this happens automatically in the compilation by the optimisation algorithms that are part of the compiler.

Can be linked but not be part of a computer program , or be genuine functional part but not linked, i.e., that the bi-implication does not hold? For code, there are object code libraries that are used in the linking stage of the compilation process, which are part of and while being part of that, they are not linked. With sub-optimal program code and that optional participation, surely also at least one example can be found where either or is not a of . We thus obtain the following definition:

Definition 3 (Compiled program as a functional whole)

For any computer program in compiled form that has one or more genuine functional parts with and , and time where is the set of discrete time points, it holds that . For at least some , does not hold, i.e.,: .

The situation is sketched in Figure 5, which shows two of the possibly many levels of granularity, each having entities residing at that level. The ‘vertical’ relations between the whole at level , and its parts at the finer-grained level are typically a part-whole relation, but may be different [10]. The relation among the parts at a particular level is the unifying relation, such as being in a source code graph with and begin a fellow adult citizen of a country. Informally, this constructs a triangle of relations between the entities: if you have one vertical side, there must be the other two (a horizontal and another vertical one).

Figure 5: Levels of granularity, with entities residing in each level, which may be related ‘horizontally’ among themselves at that level, and ‘vertically’ between the parts and the whole each residing in a different level. The object a2 on the left visualises a collective with its members and a1 visualises an integral whole with its parts.

5 Computer programs as wholes, collections, or set?

Having answered Q2 and Q3—, , and —we now can turn to Q1: is a program a whole, mere collection, or just a set? Clearly, it is not just a set of discrete artefacts, for the artefacts have a specific relation to each other, and they need to have that coupling to achieve the proper functioning of the program. Sets, on the other hand, do not pose such conditions on their members. Put differently: there are more things that are sets than that are programs. For instance, one can put protege-launcher.jar and pluraliser.py together as a set, but that does not make them two parts of a computer program, since they are independent files because there is no ‘communication’ between the two, notably not being in the same source code graph nor being linked, and not even more loosely connected through sockets or pipes.

That program is not a collection either may be less obvious, and the arguments depend on the definition of collection that one prefers. Let us consider each in turn. By Guizzardi’s and Rector et al.’s requirement on the parts or grains to have to perform the same role [8, 21], a computer program is not a collective, because its components do not have all the same function or role nor are they necessarily the same type of file. Moreover, a crucial aspect of modularisation in software development is to separate the different functions into different modules and files. Thus, when following established software design principles, it would never be a collective by their characterisation.

Considering Copp’s criteria for collective [5]: the program can and does perform actions, therewith meeting one of the three criteria. Copp focusses on social collectives rather than any collective, and therefore also has the requirement that a collective is composed of persons and has a “plausible theory of the legal system”, which clearly do not apply. He generalises collective so that possibly they are “mereological sums of collective stages linked by a unity relation” where his “unity” refers to a property of the whole to track the collective through time to determine diachronic identity and the mereological sum applies to aggregating the time-slices, not about the parts being summed into a whole. Diachronic identity is not applicable here as a criterion, supporting yet further the notion that software is not a collective. Thus, also by Copp’s criteria and definition, a computer program is not a collective.

Tangentially, it does raise again the topic of mereological sum: could a software app be just a mereological sum? Bare mereological sums permit a sort of ‘contingent unity’, however, like a sum of your right index finger and right thumb. This is not what we have with program code, since each unity is deliberate by design and has properties at the level of the whole. Guarino and Welty tried to avoid simple mereological sums by assuming that “An object is an intrinsic whole under if, at any time where exists, it is a contingent whole under .” [6] where their is here. That is, the unifying relation is essential to the whole, whereas just a mereological sum need not to have holding among its parts all the time or even at some time. For instance, a Python module repository may contain the modules Owlready v0.3, Tkinter v5, and NLTK v2, which is a mereological sum and also a collective, yet the modules are independent from each other. The files in one’s ‘Downloads’ folder have even less unity: albeit sharing the characteristic of having been downloaded, each download, such as a yetAnotherVideoTool.pkg and a 1234567.jpg, is independent. The files of a particular version of the source code or computer program (without orphan files), however, always will have a binding relation to each other for as long as they exist; hence, then indeed is an intrinsic whole under or , respectively.

Having used both the process of elimination and having provided supporting arguments as to what makes a whole a whole (recall Definitions 2 and 3), a computer program—be it in source code form or as compiled code—qualifies as being a whole. It is a functional whole, since the function aspect comes from the whole having a function, which has ‘sub’ or part-functions (recall Section 3.3), as per design of the artefact. Thus, then the source code and the computer program are a functional whole.

It perhaps invites one to propose a definition for what a computer program is. This, however, may at least in part be dependent on answering the ontological nature of a computer program as a whole [24], which was not addressed in this paper—here it was argued only that it is a whole. For the situation in South Africa, where the author resides, there is no need for proposing a detailed definition, since it has been defined in its Copyright Act of 1978 that we have to adhere to. As noted in Section 2.1, “computer program” is defined in the Act as: “a set of instructions fixed or stored in any manner and which, when used directly or indirectly in a computer, directs its operation to bring about a result.”. Observe that the notion that the source code and compiled code are functional wholes is compatible with this definition, since both assert there is an entity (the program), the Act’s definition does not make any statement about the files—be that component files of the program, in the sense of stored as human-readable source code, bytecode, or object code—so this cannot contradict, and the Act’s statement is slightly stronger regarding function in the sense that the function actually has to work (cf. possibly not ever being realised).

6 Conclusions

An argumentation was presented that computer programs—be it as source code or as executable—is a functional whole, and why. In so doing, it had to draw from theories of mereology, granularity, unity, and function. The unifying relation among the parts (files) is the graph for source code and being linked for compiled code. The relation between the component files and the program is one of functional parthood, since the files perform (sub)functions of the overall function of the program. These additional insights into software and the notion of the internal structure of wholes may assist practically with, among others, litigation cases in software development, illegal downloads, and copyright infringements, as well as more generally by having demonstrated that the notion of a unifying relation is indeed operationalisable.

An aspect that may be of interest for future work concerns the assumption of ‘free of bugs’ and the modes of participation of the parts in the whole: should one distinguish between those files that prevent a program from running altogether and those that cause one to lose some functionality but not its principal one and those where mainly just aesthetics are affected? That is, whether a sliding scale should be applied to determining importance of a source file to the whole program.

Acknowledgements I would like to thank the advocate and lawyer on the litigation case for the conversations on this topic, which assisted with devising examples and descriptions to try to explain the theory to a non-specialist audience.

References

  • [1] A. Artale, N. Guarino, and C. M. Keet. Formalising temporal constraints on part-whole relations. In G. Brewka and J. Lang, editors, 11th International Conference on Principles of Knowledge Representation and Reasoning (KR’08), pages 673–683. AAAI Press, 2008. Sydney, Australia, September 16-19, 2008.
  • [2] D. Bjørner and A. Eir. Compositionality: Ontology and Mereology of Domains – Some Clarifying Observations in the Context of Software Engineering, pages 22–59. Springer Berlin Heidelberg, Berlin, Heidelberg, 2010.
  • [3] S. Brachmann. Arista pays cisco $400m to end patent litigation at district court and itc. IPwatchdog, 8 August 2018 2018. Online: https://www.ipwatchdog.com/2018/08/08/arista-pays-cisco-400m-end-patent-litigation-district-court-itc/id=100102/; Last accessed on 8 June 2020.
  • [4] P. Cairns. Does afrocentric have a r1bn problem? The Citizen, 16 July 2018 2018. Online: https://citizen.co.za/business/1979904/does-afrocentric-have-a-r1bn-problem/; Last accessed on 8 June 2020.
  • [5] D. Copp. What collectives are: Agency, individualism and legal theory. Dialogue, 23:249–269, 1984.
  • [6] N. Guarino and C. Welty. Identity, unity, and individuality: towards a formal toolkit for ontological analysis. In W. Horn, editor,

    Proceedings of the European Conference on Artificial Intelligence (ECAI’00)

    , pages 219–223. IOS Press, Amsterdam, 2000.
  • [7] N. Guarino and C. Welty. An overview of ontoclean. In S. Staab and R. Studer, editors, Handbook on Ontologies, pages 201–220. Springer Verlag, 2009.
  • [8] G. Guizzardi. Ontological Foundations for Structural Conceptual Models. Phd thesis, University of Twente, The Netherlands. Telematica Instituut Fundamental Research Series No. 15, 2005.
  • [9] S. Karjiker. Copyright protection of computer programs. South African Law Journal, 133(1):51–72, 2016.
  • [10] C. M. Keet. A Formal Theory of Granularity. Phd thesis, KRDB Research Centre, Faculty of Computer Science, Free University of Bozen-Bolzano, Italy, April 2008.
  • [11] C. M. Keet and A. Artale. Representing and reasoning over a taxonomy of part-whole relations. Applied Ontology – Special issue on Ontological Foundations for Conceptual Modeling, 3(1-2):91–110, 2008.
  • [12] C. M. Keet and L. Khumalo. On the ontology of part-whole relations in Zulu language and culture. In S. Borgo and P. Hitzler, editors, 10th International Conference on Formal Ontology in Information Systems 2018 (FOIS’18), volume 306 of FAIA, pages 225–238. IOS Press, 2018. 17-21 September, 2018, Cape Town, South Africa.
  • [13] C. M. Keet, M. Xakaza, and L. Khumalo. Verbalising owl ontologies in isizulu with python. In E. Blomqvist, K. Hose, H. Paulheim, A. Lawrynowicz, F. Ciravegna, and O. Hartig, editors, The Semantic Web: ESWC 2017 Satellite Events, volume 10577 of LNCS, pages 59–64. Springer, 2017. 30 May - 1 June 2017, Portoroz, Slovenia.
  • [14] K. Koslicki. Substance, independence, and unity. In E. Feser, editor, Aristotle on Method and Metaphysics, chapter 9, pages 169–195. Palgrave McMillan, 2013.
  • [15] D. T. M. Lê and R. Janicki. A Categorical Approach to Mereology and Its Application to Modelling Software Components, pages 146–174. Springer Berlin Heidelberg, Berlin, Heidelberg, 2008.
  • [16] R. Mizoguchi. Functional ontology of artifacts. In Interdisciplinary Ontology Conference 2008, pages 1–10. Keio University Press Inc., 2008.
  • [17] R. Mizoguchi and S. Borgo. A preliminary study of functional parts as roles. In FOUST-II: 2nd Workshop on Foundational Ontology, Joint Ontology Workshops 2017, volume 2050 of CEUR-WS, page 9p, 2017. 21-23 September 2017, Bolzano, Italy.
  • [18] R. Mizoguchi, Y. Kitamura, and S. Borgo. A unifying definition for artifact and biological functions. Applied Ontology, 11:129–154, 2016.
  • [19] M. Noback. Principles of Package Design: Creating Reusable Software Components. Apress, 1st ed. edition, 2018.
  • [20] C. Prehofer, J. van Gurp, and J. Bosch. Compositionality in software product lines. In A. D. Lucia, F. Ferrucci, G. Tortora, and M. Tucci, editors, Emerging Methods, Technologies, and Process Management in Software Engineering, chapter 2, pages 21–42. John Wiley & Sons, 2008.
  • [21] A. Rector, J. Rogers, and T. Bittner. Granularity, scale and collectivity: When size does and does not matter. Journal of Biomedical Informatics, 39:333–349, 2006.
  • [22] P. Samuelson. Api copyrights revisited. Communications of the ACM, 62(7):20–22, 2019.
  • [23] S. Tripakis. Compositionality in the science of system design. Proceedings of the IEEE, 104(5):960–972, 2016.
  • [24] R. Turner, N. Angius, and G. Primiero. The philosophy of computer science. In E. N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, spring 2019 edition, 2019.
  • [25] A. C. Varzi. Mereology. In E. N. Zalta, editor, Stanford Encyclopedia of Philosophy. Stanford, fall 2004 edition, 2004. http://plato.stanford.edu/archives/fall2004/entries/mereology/.
  • [26] L. Vieu and M. Aurnague. Part-of relations, functionality and dependence. In M. Aurnague, M. Hickmann, and L. Vieu, editors, Categorization of Spatial Entities in Language and Cognition. John Benjamins, Amsterdam, 2005.
  • [27] M. Winston, R. Chaffin, and D. Herrmann. A taxonomy of partwhole relations. Cognitive Science, 11(4):417–444, 1987.