Privacy Ontology, Privacy Requirements, Privacy by Design (PbD), Requirements Engineering, Socio-technical Systems
Nowadays, most companies collect, store, and manage personal information (e.g., information about customers, citizens, etc.) to deliver their services. These companies need to protect the privacy of personal information not only to maintain their credibility and reputation but also to comply with various privacy laws and regulations. More specifically, most developed countries have enacted laws and regulations to govern the use of personal information. For instance, the General Data Protection Regulation (GDPR) was recently adopted by the European Union (EU) with the aim of safeguarding the use of personal information across all EU member states. Moreover, the Australian Government issued the Privacy Act 1988, which includes a set of privacy rights known as the Information Privacy Principles (IPPs). Canada has developed the Personal Information Protection and Electronic Documents Act (PIPEDA), which regulates how personal information can be collected, used, and disclosed. In the United States, more domain-specific laws have been developed (e.g., HIPAA for the healthcare domain, the Financial Services Modernization Act, etc.).
Failing to comply with privacy laws and regulations can result in huge monetary sanctions, which companies want to avoid. Accordingly, privacy has become a main concern for system designers. In other words, dealing with privacy-related concerns is a must these days because privacy breaches may have severe consequences [29, 47]. In particular, the absence of appropriate privacy protection mechanisms may lead to privacy breaches, which impose huge direct costs [1, 29] as well as long-term consequences [8, 10], such as having one’s personal information in the wrong hands. However, most of these breaches can be avoided if the privacy requirements of the system-to-be are captured properly during system design (e.g., Privacy by Design (PbD)) [52, 57]. Nevertheless, most existing work on privacy requirements deals with them either as non-functional requirements (NFRs) with no specific techniques on how such requirements can be met, or as security requirements (e.g., [106, 52]), i.e., focusing mainly on confidentiality and overlooking important privacy aspects such as anonymity, pseudonymity, unlinkability, unobservability, etc.
On the other hand, privacy is one of the few concepts that have been studied across many disciplines, including law, sociology [102, 22], psychology, and information systems. Although it has been studied for more than a century, it is still an elusive and vague concept to grasp [86, 87, 52]. Despite this, numerous attempts have been made by scholars to clarify the concept by linking it to more refined concepts such as secrecy, confidentiality, anonymity, pseudonymity, unlinkability, unobservability, and control of personal information [87, 108, 71], or to solitude, intimacy, anonymity, and reserve. Other studies suggest that the notion of risk is also related to privacy, as the loss of information control implies risk [72, 84, 56], while Awad and Krishnan investigated how transparency can influence privacy. However, there is no consensus on the definition of these concepts or on which of them should be used to analyze privacy.
In addition, many of these concepts overlap, which contributes to the confusion when dealing with privacy. This confusion among designers and stakeholders has led in turn to wrong design decisions. Ontologies have proven to be a key success factor for eliciting high-quality requirements, as they reduce conceptual vagueness and terminological confusion by providing a shared understanding of the related concepts between the designers and stakeholders of the system [96, 51, 18, 88]. In this context, a well-defined ontology that captures key privacy-related concepts and relationships could solve this problem.
Privacy is a social concept [60, 34]. Accordingly, the privacy ontology should conceptualize privacy requirements in their social and organizational context. In other words, the ontology should consider not only the technical aspects of privacy but also its related social and organizational aspects, since most systems these days are socio-technical systems consisting not only of technical components but also of humans along with their interrelationships, where different kinds of vulnerabilities might manifest themselves [59, 36]. More specifically, focusing on the technical aspects and leaving the social and organizational aspects outside the system’s boundary leaves the system open to vulnerabilities that arise in the social interactions and/or the organizational structure of the system.
In previous research, we worked toward addressing this problem by proposing an ontology for privacy requirements that has been mined through a systematic literature review. In this paper, we extend that ontology with new and more refined concepts concerning both personal information and privacy requirements. Moreover, we implement the ontology, apply it to an Ambient-Assisted Living (AAL) illustrating example, and then validate it by querying the ontology instance (i.e., the AAL example) based on a set of competency questions. Finally, we evaluate the ontology against common pitfalls in ontologies with the help of some tools, lexical semantics experts, and privacy and security researchers.
The rest of the paper is organized as follows. Section (§2) presents the AAL example that is used to illustrate our work, and Section (§3) describes the process we followed for developing the COPri ontology. Section (§4) describes the conceptual model of COPri, and we implement and validate the ontology in Sections (§5) and (§6), respectively. We evaluate the ontology in Section (§7), and we discuss threats to its validity in Section (§8). Related work is presented in Section (§9), and we conclude and discuss future work in Section (§10).
2 Illustrating example: the Ambient-Assisted Living (AAL) System
Longevity among the elderly has resulted in many challenges for society and the health care system, such as an increase in age-related diseases (e.g., Alzheimer’s, diabetes, etc.), which in turn leads to a shortage of caregivers. But this is not the only problem: most older people (around 89%) prefer to stay in their own homes [107, 79], and given the costs of home care nursing, it is imperative to develop technologies that help older people age in place.
AAL systems appear to be an appropriate solution to this problem. AAL systems rely on monitoring and actuating devices to shift some healthcare services from a hospital-centered to a patient-centric treatment. In other words, instead of being measured face-to-face, a patient’s health status can be sensed remotely, continuously, and in real time, and then such information is processed and transferred to a hospital or health care center. Moreover, AAL technologies facilitate communication between physicians and patients, and allow for discussing medical data and negotiating treatment procedures remotely. This decreases both the costs of health care services and the workload of medical practitioners [105, 63, 107]. However, numerous studies have shown that privacy is one of the most frequently highlighted criticisms of such technology.
Our motivating example concerns an elderly person called Jack who suffers from diabetes. Jack lives in a home that is equipped with an AAL system, which provides an appropriate environment for Jack to live normally. In particular, the AAL system depends on various interconnected body sensors (e.g., electrocardiography (ECG), electromyography (EMG), Continuous Glucose Monitoring (CGM), location, and motion sensors) that collect various information concerning Jack’s vital signs, location, and activities. This information is transmitted to Jack’s Personal Digital Assistant (PDA), which assesses his health situation and provides the required notifications accordingly.
Jack’s PDA also forwards the information to a nearby caring center, where a virtual nurse called Sarah can monitor such information. She can also monitor some of Jack’s activities (e.g., watching TV, sleeping, preparing or having a meal, etc.) by collecting location and motion information. Sarah can detect unusual situations and react accordingly. She also has access to all of Jack’s health records, and she may contact the medical professionals (e.g., General Practitioner (GP), consulting physicians) that might be needed depending on Jack’s situation. Jack, like many other users, wants to preserve his privacy by controlling what is collected and shared of his personal information, who is using such information, and for which reasons.
Such an architecture classifies the BAN communications into three types: (1) intra-BAN, which has a range of about two meters around the human body and covers communications between body sensors (e.g., ECG, EMG, CGM, and motion sensors) and the PDA; (2) inter-BAN, which covers communications between a PDA and one or more access points (APs); and (3) beyond-BAN, which connects the APs to the internet and other networks. This architecture helps in better understanding and dealing with privacy requirements/concerns.
3 The process for developing the COPri ontology
The process for developing COPri is based on [37, 95, 25] and follows the five principles proposed by Gruber (i.e., clarity, coherence, extendibility, minimal encoding bias, and minimal ontological commitment). The process is depicted in Figure 2, and it is composed of five main steps:
Step 1. Scope & objective identification aims at identifying the scope of the ontology, the purposes it will be used for, and its intended users [95, 25]. As previously highlighted, there is a need for addressing privacy concerns during system design (e.g., Privacy by Design (PbD) [52, 57]). Nevertheless, based on the results of our systematic literature review, most existing studies miss several key privacy concepts and relationships, which makes it very difficult to address the main privacy concerns during system design. To this end, COPri aims at assisting software engineers while designing privacy-aware systems that belong to various domains by providing a generic and expressive set of key privacy concepts and relationships, which enable capturing the privacy requirements of the system-to-be in their social and organizational context.
Step 2. Knowledge acquisition aims at identifying and collecting the knowledge needed for the construction of the ontology. We conducted a systematic literature review with the main purpose of identifying the key concepts and relationships for capturing privacy requirements. In particular, five electronic database sources were used for the acquisition of knowledge. The search returned 240 relevant papers, among which we selected 34 after removing duplicated papers and applying several selection and quality assessment criteria. Then, we analyzed the contents of the selected studies, identifying 38 privacy-related concepts and relationships (in the case of multiple synonyms, some were omitted). These have been grouped into four main groups based on their type: 17 organizational concepts that capture social and technical aspects of the system-to-be; 9 risk concepts that capture risks that might endanger privacy requirements; 5 treatment concepts that capture countermeasure techniques to mitigate risks to privacy needs; and 7 privacy concepts that capture the stakeholders’ privacy requirements/needs.
Step 3. Conceptualization aims at structuring the acquired knowledge into a conceptual model that captures the key concepts of the ontology along with their interrelationships. In our previous work, we built an ontology based on the 38 selected key concepts and relationships. In this paper, we extend this ontology with more refined concepts concerning both personal information and privacy requirements. Additionally, we conducted a survey to collect feedback from privacy and security researchers to evaluate the completeness of our proposed ontology, i.e., to determine whether the selected concepts and relationships are capable of properly dealing with privacy requirements or whether they need to be extended or refined. The feedback confirmed that most of the concepts and relationships are appropriate and that the ontology is capable of capturing privacy requirements. Some feedback suggested refining, including, or excluding some concepts/relationships, and we took these suggestions into account while developing the final ontology. A detailed description of the resulting ontology (COPri) is provided in the next section.
Step 4. Implementation aims at codifying the ontology in a formal language. This requires an environment that guarantees the absence of lexical and syntactic errors from the ontology; translators, to guarantee the portability of the definitions into other target languages; and an automated reasoner to detect incompleteness, inconsistencies, and redundant knowledge. Although several environments exist for developing (codifying) ontologies (NeOn Toolkit, OntoEdit, SWOOP, Protégé), we have chosen Protégé (http://protege.stanford.edu/), a set of open-source and domain-independent ontology design software developed at Stanford Medical Informatics. Protégé can be used easily for creating, modifying, visualizing, and checking the consistency of an ontology. Moreover, the reasoner can be used to automatically compute a classification hierarchy (the inferred hierarchy) based on a manually constructed class hierarchy, which is called the asserted hierarchy. In addition, Protégé offers several useful plug-ins for visualizing ontologies, and most importantly it offers a plug-in for using SPARQL (SPARQL Protocol and RDF Query Language) to extract knowledge from an ontology through defined queries and rules. The implementation of COPri is discussed in Section 5.
Step 5. Validation aims at ensuring that the resulting ontology meets the needs of its usage, i.e., that the ontology corresponds to the system it is supposed to represent. Both informal and formal questions/queries can be used to validate an ontology. Following [26, 16], we validated COPri after applying it to the AAL illustrating example by querying the ontology instances based on Competency Questions (CQs), and then verifying the correctness of the results of such queries. More specifically, the CQs are used to evaluate whether the ontology captures enough detailed information about the targeted domain to fulfill the needs of its intended use. The validation of COPri is discussed in more detail in Section 6.
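To illustrate the idea behind validating an ontology with competency questions, the following minimal Python sketch answers one question by matching triple patterns against a handful of instances. It is only an illustration under stated assumptions: the triples, the property names (`own`, `collect`, `hasPermissionToCollect`), and the `match` helper are hypothetical, and the validation described in this paper relies on SPARQL queries in Protégé rather than on this code.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]

# Hypothetical instance data loosely based on the AAL example.
TRIPLES: set[Triple] = {
    ("Jack", "own", "JackGlucoseLevel"),
    ("Jack", "own", "JackLocation"),
    ("Sarah", "collect", "JackGlucoseLevel"),
    ("Sarah", "collect", "JackLocation"),
    ("Sarah", "hasPermissionToCollect", "JackGlucoseLevel"),
}


def match(s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield triples that agree with every non-None position of the pattern."""
    for triple in sorted(TRIPLES):
        if all(q is None or q == v for q, v in zip((s, p, o), triple)):
            yield triple


def collectors_without_permission() -> list[tuple[str, str]]:
    """CQ: who collects information without a permission to collect it?"""
    result = []
    for actor, _, info in match(p="collect"):
        if not any(match(actor, "hasPermissionToCollect", info)):
            result.append((actor, info))
    return result


print(collectors_without_permission())
```

A competency question is considered answered correctly when the returned bindings agree with what a domain expert expects from the modeled example.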
4 The conceptual model of COPri
In this section, we present the conceptual model of COPri in terms of its concepts and relationships. Figure 3 shows the meta-model of COPri as a UML class diagram. The concepts of COPri are organized into four main dimensions:
- Organizational dimension:
proposes concepts to capture the social and technical components of the system in terms of their capabilities, objectives, and dependencies.
- Risk dimension:
proposes concepts to capture risks that might endanger privacy needs at the social and organizational levels.
- Treatment dimension:
proposes concepts to capture countermeasure techniques to mitigate risks to privacy needs.
- Privacy dimension:
proposes concepts to capture the stakeholders’ (actors) privacy requirements/needs concerning their personal information.
(1) Organizational dimension includes concepts for capturing the organizational aspects of the system, which are further organized into several categories such as agentive, intentional, and informational entities, social dependencies, and social trust. In what follows, we define each of these categories in terms of their concepts and relationships.
Agentive entities: capture the active entities of the system. We have three concepts along with two relationships:
- An actor
represents an autonomous entity that has intentionality and strategic goals within the system, and it covers two entities: a role and an agent:
- A role
represents an abstract characterization of an actor in terms of a set of behaviors and functionalities within some specialized context. A role can be a specialization (is_a relationship) of another role.
- An agent
represents an autonomous entity that has a specific manifestation in the system. An agent can play one or more roles within the system, and it inherits the properties of the roles it plays.
Intentional entities: capture the objectives that the actors aim to achieve. Therefore, we adopted the goal concept as well as and/or-decomposition (refinement) relationships to represent such objectives.
- A goal
is a state of affairs that an actor aims to achieve. When a goal is too coarse to be achieved, it can be refined through and/or-decompositions of a root goal into finer sub-goals.
- And-decomposition
implies that the achievement of the root goal requires the achievement of all of its sub-goals.
- Or-decomposition
is used to provide different alternatives to achieve the root goal, and it implies that the achievement of the root goal requires the achievement of any of its sub-goals.
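The semantics of and/or-decomposition can be sketched as a recursive evaluation over a goal tree. The snippet below is a minimal illustration, not part of COPri itself: the `Goal` class, its field names, and the example goals are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    name: str
    achieved: bool = False        # only consulted for leaf goals
    decomposition: str = "leaf"   # "and", "or", or "leaf"
    subgoals: list["Goal"] = field(default_factory=list)

    def is_achieved(self) -> bool:
        if self.decomposition == "and":
            # All sub-goals must be achieved.
            return all(g.is_achieved() for g in self.subgoals)
        if self.decomposition == "or":
            # Any one sub-goal is enough.
            return any(g.is_achieved() for g in self.subgoals)
        return self.achieved


# Hypothetical goals inspired by the AAL example.
alert = Goal("Notify via PDA alert", achieved=False)
call = Goal("Notify via phone call", achieved=True)
notify = Goal("Notify caregiver", decomposition="or", subgoals=[alert, call])
collect = Goal("Collect vital signs", achieved=True)
root = Goal("Monitor Jack's health", decomposition="and",
            subgoals=[collect, notify])

print(root.is_achieved())
```

Here the root goal is achieved because its and-decomposed sub-goals are both achieved, the second one through a single alternative of its or-decomposition.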
Informational entities: capture the information-related concepts and relationships:
- Information
represents a statement provided or learned about something or someone. Information can be atomic or composite (composed of several parts), and we rely on the partOf relationship to capture the relationship between an information entity and its sub-parts. Moreover, we differentiate between two types of information:
- Public information,
any information that cannot be related (directly or indirectly) to an identified or identifiable legal entity.
- Personal information,
Several researchers have advocated that not all personal information has the same sensitivity level (e.g., [97, 17, 57]). Moreover, various sensitivity levels and categories for personal information have been proposed (e.g., [12, 21, 93, 55]). To this end, we include in our ontology the concept of a sensitivity level for personal information. Following prior work, we adopt four different sensitivity levels ordered as (R)estricted, (C)onfidential, (S)ensitive, and Secre(T), where Secre(T) is the most sensitive. Personal information with different sensitivity levels may have different privacy requirements, i.e., sensitivity levels can be used to facilitate the identification of privacy requirements.
On the other hand, numerous works (e.g., [67, 5, 70]) have linked the sensitivity of personal information to when and where such information has been collected and for what purposes, i.e., the context/state of affairs related to such information. Thus, we adopt the concept of situation as a means to determine the sensitivity level of personal information, where a situation can be defined as a partial state of affairs in terms of the things that exist in that state, their properties, and their interrelations.
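As a rough illustration of how ordered sensitivity levels and situations could interact, the following Python sketch encodes the four levels as an ordered enumeration and looks up the level of a piece of information from the situation in which it was collected. The situation names and the lookup table are hypothetical assumptions made for this example; the ontology itself does not prescribe them.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    # Ordered from least to most sensitive, as in the text.
    RESTRICTED = 1
    CONFIDENTIAL = 2
    SENSITIVE = 3
    SECRET = 4


# Hypothetical mapping from (kind of information, situation) to a level.
SITUATION_LEVEL = {
    ("location", "at_home"): Sensitivity.CONFIDENTIAL,
    ("location", "at_clinic"): Sensitivity.SECRET,
    ("glucose_level", "at_home"): Sensitivity.SENSITIVE,
}


def sensitivity_of(info_kind: str, situation: str,
                   default: Sensitivity = Sensitivity.RESTRICTED) -> Sensitivity:
    """Return the level for a situation, or the default if none is defined."""
    return SITUATION_LEVEL.get((info_kind, situation), default)


print(sensitivity_of("location", "at_clinic").name)
```

Because the levels are ordered, a designer could, for instance, require stricter privacy goals whenever the level is at or above a chosen threshold.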
Use is a relationship between a goal and information, and it has three attributes:
- Type of Use (ToU),
our ontology provides four different types of use:
Produce indicates that information is created by a goal;
Read indicates that information is consumed by a goal;
Modify indicates that information is modified/altered by a goal;
Collect indicates that information is acquired by a goal.
- Need to Use (NtU)
captures the necessity of use, and we differentiate between two main types:
- Purpose of Use (PoU),
we differentiate between two main categories of purposes of use:
the first indicates that the purpose for which information is used is compliant with the rules that guarantee the best interest of its owner;
the second indicates that the purpose for which information is used is not compliant with the rules that guarantee the best interest of its owner.
Describes is a relationship where information characterizes a goal (activity) while it is being pursued by some actor.
Information ownership & Permissions: capture the relationships among personal information, the legal entities who own them, and how such entities control the use of such information by others.
Ownership indicates that an actor is the legitimate owner of information, where the information owner has full control over the use of the information it owns.
A permission is consent that identifies a particular use of particular information in a system. The information owner (data subject; the two terms are synonyms in this paper) controls the use of its own information through permissions over such information. In COPri, a permission has a type that is (P)roduce, (R)ead, (M)odify, or (C)ollect, which cover the four relationships between goals and information that our ontology proposes.
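The correspondence between types of use and permission types can be sketched as a simple lookup followed by a check. This is a minimal illustration under assumed names: the `Permission` class, the `is_permitted` function, and the example data are hypothetical and are not taken from the COPri implementation.

```python
from dataclasses import dataclass

# Each type of use requires the permission type with the matching initial.
PERMISSION_FOR_USE = {"produce": "P", "read": "R", "modify": "M", "collect": "C"}


@dataclass(frozen=True)
class Permission:
    holder: str
    information: str
    types: frozenset[str]   # subset of {"P", "R", "M", "C"}


def is_permitted(actor: str, type_of_use: str, information: str,
                 permissions: list[Permission]) -> bool:
    """True if the actor holds a permission covering this use of the information."""
    needed = PERMISSION_FOR_USE[type_of_use]
    return any(
        p.holder == actor and p.information == information and needed in p.types
        for p in permissions
    )


# Hypothetical permission granted by the information owner.
granted = [Permission("Sarah", "JackGlucoseLevel", frozenset({"R", "C"}))]

print(is_permitted("Sarah", "read", "JackGlucoseLevel", granted))
print(is_permitted("Sarah", "modify", "JackGlucoseLevel", granted))
```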
Entities interactions: capture the interactions/dependencies among actors of the system concerning their objectives and entitlements. The ontology adopts three types of interactions:
- Information provision
captures the transmission of information (provisionOf) by an actor (provisionBy) to another one (provisionTo), where the source of the provision relationship is the provider and the destination is the requester. Moreover, information provision has a type that can be either confidential or nonConfidential, where the former guarantees the confidentiality of the transmitted information, while the latter does not.
Delegation indicates that actors can delegate obligations and entitlements to one another, where the source of the delegation is called the delegator, the destination is called the delegatee, and the subject of the delegation is called the delegatum. The concept of delegation is further specialized into two concepts: Goal delegation, where the delegatum is a goal; and Permission delegation, where the delegatum is a permission.
is considered a key component of social commitment, and it indicates that an actor accepts to take responsibility for the delegated objectives and/or entitlements from another actor.
Entities social trust: the need for trust arises when actors depend on one another for goals or permissions, since such dependencies might entail risk [11, 31]. Therefore, our ontology adopts the concept of trust to capture the actors’ expectations of one another concerning their delegations. The source of trust is called the trustor, the destination is called the trustee, and the subject of trust is called the trustum. Trust has a type that can be either:
Trust means the trustor expects that the trustee will behave as expected considering the trustum (e.g., a trustee will achieve the delegated goal, or it will not misuse the delegated permission).
Distrust means the trustor expects that the trustee will not behave as expected considering the trustum (e.g., a trustee will not achieve the delegated goal, or it will misuse the delegated permission).
The concept of Trust is further specialized into two concepts: GoalTrust, where the trustum is a goal; and PermissionTrust, where the trustum is a permission.
Monitoring: can be defined as the process of observing and analyzing the performance of an actor in order to detect any undesirable performance. We adopt the concept of monitoring to compensate for the lack of trust, or for distrust, in the trustee concerning the trustum [27, 106], where the source of monitoring is called the monitor and the destination is called the monitoree.
The concept of monitor is further specialized into two concepts: GoalMonitor, where the subject of the monitoring is a goal; and PermissionMonitor, where the subject of the monitoring is a permission.
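One possible reading of how delegation, trust, and monitoring relate is that a delegation deserves attention when it is backed neither by trust nor by monitoring. The Python sketch below illustrates that reading only; the class names, the rule itself, the assumption that any monitor of the delegatee suffices, and the example actors (including the `CloudProvider`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delegation:
    delegator: str
    delegatee: str
    delegatum: str


@dataclass(frozen=True)
class Trust:
    trustor: str
    trustee: str
    trustum: str
    kind: str = "trust"   # "trust" or "distrust"


@dataclass(frozen=True)
class Monitoring:
    monitor: str
    monitoree: str
    subject: str


def unprotected_delegations(delegations, trusts, monitorings):
    """Delegations with neither a matching trust nor any monitoring in place."""
    trusted = {(t.trustor, t.trustee, t.trustum)
               for t in trusts if t.kind == "trust"}
    # Assumption: monitoring by any actor compensates for missing trust.
    monitored = {(m.monitoree, m.subject) for m in monitorings}
    return [d for d in delegations
            if (d.delegator, d.delegatee, d.delegatum) not in trusted
            and (d.delegatee, d.delegatum) not in monitored]


delegations = [
    Delegation("Jack", "Sarah", "monitor health"),
    Delegation("Sarah", "GP", "assess glucose level"),
    Delegation("Sarah", "CloudProvider", "store health records"),
]
trusts = [
    Trust("Jack", "Sarah", "monitor health"),
    Trust("Sarah", "GP", "assess glucose level", kind="distrust"),
]
monitorings = [Monitoring("Sarah", "GP", "assess glucose level")]

print(unprotected_delegations(delegations, trusts, monitorings))
```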
(2) Risk dimension includes risk-related concepts along with their interrelationships (e.g., threats, vulnerabilities, attacks, etc.) concerning personal information. In what follows, we define each of these concepts and their interrelationships:
- A vulnerability
- A threat
is a potential incident that threatens personal information by exploiting a vulnerability concerning such information [62, 85, 54]. A threat can be either natural (e.g., disaster), accidental (e.g., hardware or software failure), or intentional (e.g., theft of personal information) [98, 88]. COPri differentiates between two types of threat:
- Incidental threat
is a casual, natural, or accidental threat that is not caused by a threat actor and does not require an attack method. An incidental threat has a probability that measures the likelihood of its occurrence, and it is characterized by three different values: high, medium, or low.
- Intentional threat
- Threat actor
- Attack method
(3) Treatment dimension includes countermeasure concepts to mitigate risks. COPri proposes high-abstraction-level concepts to capture the required protection/treatment level (e.g., privacy goal), which can be refined into concrete protection/treatment constraints (e.g., mechanisms or policies) that can be implemented. The concepts of the treatment dimension are:
- A privacy goal
defines an aim to counter threats and prevent harm to personal information by satisfying privacy criteria concerning such information.
- A privacy constraint
is a privacy statement that defines the permitted and/or forbidden actions to be carried out by actors of the system toward information.
- A privacy mechanism
is a concrete technique to be implemented to help satisfy a privacy goal. Some mechanisms can be directly applied to personal information (e.g., anonymity, unlinkability).
(4) Privacy dimension introduces concepts to capture the actors’ privacy requirements/needs concerning their personal information. The concepts of the privacy dimension are:
- Privacy requirement
is used to capture information owners’ privacy needs at a high abstraction level concerning their personal information. Privacy requirements are interpretedBy privacy goals. Moreover, privacy requirement is further specialized into seven more refined concepts:
Note that non-disclosure also covers information provision (e.g., confidential information provision). Therefore, non-disclosure can be analyzed depending on the existence of read permission as well as the confidentiality of information provision.
- Need to Know (NtK),
personal information can only be used if it is strictly necessary for completing a certain task. NtK can be analyzed based on Need to Use (NtU), which captures the necessity of use, i.e., if the type of NtU is optional (i.e., not required), a violation can be raised.
- Purpose of Use (PoU),
- Anonymity,
the identity of the information owner should not be disclosed unless it is strictly required [17, 87, 50, 71], i.e., the primary/secondary identifiers of the data subject (e.g., name, social security number, address, etc.) should be removed if they are not strictly required and the information can still be used for the same purpose after their removal. Personal information can be anonymized by means of some privacy mechanism.
- Unobservability,
the identity of the information owner should not be observed by others, especially third parties, while performing an activity (e.g., pursuing a goal) [50, 52, 71]. Unlike Anonymity and Unlinkability, which try to hide the identity of the information owner, Unobservability aims to hide some activities that are performed by the information owner.
Unobservability can be analyzed relying on the describes relationship, which enables detecting situations where personal information that describes an activity (goal) being pursued by a data subject is being collected by some other actor.
- Notice,
the information owner should be notified when its information is being collected [97, 87, 17]. Notice is considered mainly to address situations where personal information related to a legitimate entity is being collected without their knowledge. Notice can be analyzed based on the collect relationship and its corresponding permission.
If personal information is being collected and there is no permission to collect, a notice violation is raised. Providing a permission to collect means that the actor has already been notified and agrees to the collection of their personal information.
- Authentication,
a mechanism that aims at verifying whether actors are who they claim they are. Authentication can be analyzed by verifying whether (1) the actor is playing a role that enables identifying its main responsibilities; and (2) the actor is not playing any threat actor role. If these rules do not both hold, a violation can be raised.
- Authorization,
a mechanism that aims at verifying whether actors can use information in accordance with their credentials. Authorization can be analyzed by verifying whether the actor has the required permissions to perform the task at hand.
- Accountability,
information owners should have a mechanism available to them to hold information users accountable for their actions concerning information [17, 54]. We rely on the non-repudiation principle to analyze accountability.
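Two of the analysis rules described above, namely Need to Know (a use whose NtU is optional) and Notice (a collection without a permission to collect), can be sketched as simple checks over recorded uses. The snippet is a minimal illustration under stated assumptions: the `Use` class, the `"required"` label for non-optional uses, and the tuple encoding of permissions are hypothetical and are not the rules as implemented for COPri.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Use:
    actor: str
    information: str
    type_of_use: str   # "produce", "read", "modify", or "collect"
    need_to_use: str   # "required" (assumed label) or "optional"


def find_violations(uses, permissions):
    """permissions is a set of (actor, information, type_of_use) tuples."""
    violations = []
    for u in uses:
        # Need to Know: the use is not strictly necessary.
        if u.need_to_use == "optional":
            violations.append(("NeedToKnow", u.actor, u.information))
        # Notice: information is collected without a permission to collect.
        if (u.type_of_use == "collect"
                and (u.actor, u.information, "collect") not in permissions):
            violations.append(("Notice", u.actor, u.information))
    return violations


uses = [
    Use("Sarah", "JackGlucoseLevel", "collect", "required"),
    Use("Sarah", "JackLocation", "collect", "optional"),
]
permissions = {("Sarah", "JackGlucoseLevel", "collect")}

for violation in find_violations(uses, permissions):
    print(violation)
```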
5 The implementation of COPri
This section describes how we have implemented (codified) the COPri ontology using the Protégé software (http://protege.stanford.edu/). In Protégé, an ontology consists of Classes, Properties, and Individuals. Classes are concrete representations of concepts, and they can be interpreted as sets that contain Individuals (also known as instances of classes). In other words, classes are used to specify conditions that must be satisfied by individuals to be members of such classes, where all Classes are subclasses of the class Thing. Properties, in turn, are binary relationships among Classes/Individuals.
We have implemented the conceptual model of COPri relying on classes and object properties in Protégé, and we had to modify and create new classes and relationships during this process. We have also created new classes and subclasses to represent attributes of some classes, since classes in Protégé cannot have attributes. Moreover, for each class that has attributes with quantitative values, we have created a class (following the Value Partition pattern) to represent each of these attributes, and several individuals to cover all quantitative values of each of the attributes.
For example, the Sensitivity level attribute of the class Sensitivity, which can take the values Secret, Sensitive, Confidential, or Restricted, has been represented by a class named Sensitivity level that has four defined individuals: slSecret, slSensitive, slConfidential, and slRestricted. Furthermore, we have defined the hasSensitive property to link the PersonalInformation class to the Sensitivity level class.
Classes may overlap, and to ensure that an individual that belongs to one of the classes cannot be a member of any other class, such classes must be made disjoint from one another. Thus, all primitive sibling classes (e.g., PersonalInformation and PublicInformation) in our ontology have been made disjoint. This helps the reasoner to check the logical consistency of the ontology. Moreover, we have used Probe Classes, which are classes that are subclasses of two or more disjoint classes, to test and ensure that the ontology does not include inconsistencies.
Additionally, we have used covering axioms to address the open world assumption in OWL-based ontologies, where a covering axiom defines a class as the union of the classes that cover it. In other words, a covering axiom means that any member of the covered class must be a member of at least one of the covering classes. For example, PersonalInformation and PublicInformation are the only subclasses of the Information class, and using a covering axiom here means that any Information must belong to one of these two subclasses, i.e., Information is covered by PersonalInformation and PublicInformation.
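The intent of disjointness and covering axioms can be illustrated with ordinary finite sets. The Python sketch below is only an analogy under a closed-world reading: an OWL reasoner works on axioms rather than on enumerated sets, and the helper functions and individuals used here are hypothetical.

```python
def are_disjoint(*classes: set[str]) -> bool:
    """True if no individual belongs to more than one of the given classes."""
    seen: set[str] = set()
    for members in classes:
        if seen & members:
            return False
        seen |= members
    return True


def is_covered(parent: set[str], *subclasses: set[str]) -> bool:
    """True if every member of the parent is in at least one subclass."""
    return parent <= set().union(*subclasses)


# Hypothetical individuals.
information = {"JackGlucoseLevel", "JackLocation", "WeatherForecast"}
personal = {"JackGlucoseLevel", "JackLocation"}
public = {"WeatherForecast"}

print(are_disjoint(personal, public))
print(is_covered(information, personal, public))
```

An individual that appeared in both `personal` and `public` would play a role similar to a probe class: its existence would signal an inconsistency.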
A restriction in Protégé can be used to describe a class of individuals based on the relationships that members of the class participate in. In other words, the class contains all of the individuals that satisfy the defined restriction. Restrictions can be categorized into existential and universal restrictions:
- Existential restrictions
(also known as some restrictions (someValuesFrom) and denoted by ∃) describe classes of individuals that participate in at least one relationship along a specified property to individuals that are members of a specified class.
- Universal restrictions
(also known as all restrictions (allValuesFrom) and denoted ) describe classes of individuals that for a given property only have relationships along this property to individuals that are members of a specified class .
By relying on existential restrictions, we can say that a class is a subclass of another class if some property (relationship) holds. For example, PersonalInformation is a subclass of the Information class if some related property to the Actor class exists. These are called necessary conditions: if something is a member of this class, then it necessarily fulfills these conditions. However, with necessary conditions alone, we cannot say that if something fulfills these conditions then it must be a member of this class.
This problem can be solved by relying on sufficient conditions that use universal restrictions, which means if something fulfills the defined conditions then it must be a member of this class. In this context, sufficient conditions enable us to say that if something is a member of the class PersonalInformation it is necessary for it to be a kind of Information, and it is necessary for it to only have a property of type related to the Actor class.
Using only sufficient conditions, the class PersonalInformation may also contain individuals that are Information and do not participate in any property of type related to the Actor class, because universal restrictions do not specify the existence of a relationship. They merely state that if a relationship exists for the property, then it must be to individuals that are members of a specific class.
This problem can be solved by using both necessary and sufficient conditions, which enables us to say that if something is a member of the class PersonalInformation, then it is necessary for it to be a kind of Information, and it is necessary for it to have a property of type related to the Actor class. In other words, using both necessary and sufficient conditions is enough to recognize all individuals that must be members of the class PersonalInformation.
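The difference between the two kinds of restriction, including the vacuous case discussed above, can be sketched over a toy relation. The data and function names are hypothetical, and the check is a closed-world reading rather than OWL reasoning.

```python
# Hypothetical toy data: the `related` property as (subject, object) pairs.
related = {
    ("jacks_location", "jack"),
    ("mixed_record", "jack"),
    ("mixed_record", "hospital_building"),
}
actors = {"jack", "sarah"}
information = {"jacks_location", "mixed_record", "bus_timetable"}


def targets(individual, prop):
    """All objects that `individual` is linked to through `prop`."""
    return {obj for subj, obj in prop if subj == individual}


def some_values_from(individual, prop, cls):
    """Existential restriction: at least one link along `prop` ends in `cls`."""
    return any(t in cls for t in targets(individual, prop))


def all_values_from(individual, prop, cls):
    """Universal restriction: every link along `prop` ends in `cls`.

    Vacuously true when the individual has no link at all.
    """
    return all(t in cls for t in targets(individual, prop))


# Combining both restrictions excludes individuals with no `related` link
# as well as individuals with a link that leaves the Actor class.
personal_information = {
    i for i in information
    if some_values_from(i, related, actors) and all_values_from(i, related, actors)
}
print(sorted(personal_information))
```

Here `bus_timetable` satisfies the universal restriction only vacuously, which is exactly why the universal restriction alone is not enough.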
On the other hand, properties are used to link individuals from a domain to individuals from a range. Thus, we have defined the domain and range for each object property in our ontology (shown in Table 1), which can be used by a reasoner to make inferences and detect inconsistencies. For example, the domain of property (relationship) aims is the class Actor and its range is the class Goal.
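As a rough illustration of how domain and range declarations help detect modelling errors, the following sketch flags assertions whose subject or object does not match the declared classes. Only the `aims` declaration comes from the text; the individuals, their types, and the asserted triples are hypothetical. Note also that an OWL reasoner would use domain and range to infer class membership and would report an inconsistency only in combination with disjointness, so this is a simplification.

```python
# Property declarations: name -> (domain class, range class).
# The `aims` declaration follows the text; everything else is hypothetical.
declarations = {"aims": ("Actor", "Goal")}

# Asserted type of each individual (hypothetical).
types = {
    "Jack": "Actor",
    "Sarah": "Actor",
    "Assess Jack's situation": "Goal",
}


def violations(triples):
    """Return the triples whose subject/object do not match domain/range."""
    bad = []
    for subj, prop, obj in triples:
        domain, rng = declarations[prop]
        if types.get(subj) != domain or types.get(obj) != rng:
            bad.append((subj, prop, obj))
    return bad


triples = [
    ("Sarah", "aims", "Assess Jack's situation"),  # Actor aims Goal: fine
    ("Assess Jack's situation", "aims", "Jack"),   # swapped: Goal aims Actor
]
print(violations(triples))
```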
Moreover, we used only one inverse property (the related property between the PersonalInformation and Actor classes) in our ontology to minimize the number of properties and because most such properties can be inferred. Finally, we have used cardinality restrictions to specify the number of relationships between classes using the at least, at most, or exactly keywords.
|Object property||Domain||Range||Object property||Domain||Range|
6 The validation of COPri
In this section, we discuss how we validated our ontology using Competency Questions (CQs) that the ontology should be able to answer, i.e., to evaluate whether the ontology is able to capture detailed information about the targeted domain to fulfill the needs of its intended use. In other words, CQs represent a set of questions that the ontology must be capable of answering to be considered competent for tackling the problem it has been developed to solve [26, 95, 16]. In particular, we applied the COPri ontology to the AAL illustrating example, and then we validated COPri by formulating a set of CQs to query the ontology instances and check whether these queries are able to return reliable answers. Figure 5 shows a partial diagram of the AAL illustrating example represented in an extended goal model language (note that this modeling language has not been developed yet; the diagram is provided to help the reader better understand the usefulness of the CQs).
The CQs are meant to assist and guide requirements engineers while dealing with privacy requirements in their social and organizational context by returning useful knowledge concerning the ontology. Moreover, some CQs have been developed to assist designers in capturing (detecting and reporting) wrong/bad design decisions related to the four dimensions of our ontology, namely organizational, risk, treatment, and most importantly privacy requirements (e.g., confidentiality violation, notice violation, etc.).
The formulation of the CQs was an iterative process that aimed at covering the main wrong/bad design decisions, which we call violations. Therefore, several CQs were refined and extended before we arrived at the final set of CQs. Note that the concepts and relationships of the ontology were refined and extended as well while we were formulating the CQs, because some limitations and inadequacies in the ontology were revealed. The set of CQs (whose main focus is privacy requirements, not goal analysis) is listed in a dedicated table, where each CQ is represented both informally (natural language) and formally (SPARQL query). In what follows, we describe the four groups of CQs:
CQ1-3 are used to query organizational aspects, where CQ1 can be used to capture situations where a permission is delegated without trust or a trust compensation (e.g., monitoring). In the absence of trust and monitoring relationships, the delegator cannot guarantee that the delegatee will not misuse the delegated permission. For example, if there were neither trust nor monitoring between Jack and Sarah concerning the delegation of the read and/or collect permissions on Jack’s location (shown in Figure 5), CQ1 would detect and report such a situation.
CQ2 can be used to capture situations where an actor monitors a delegation of permission although he/she trusts the delegatee, i.e., both trust and monitoring are used for the same delegation of permission. In such a situation, monitoring is not required and it is considered a bad design decision. Concerning the previous example, if there is both trust and monitoring concerning the delegation of read/collect between Jack and Sarah, CQ2 will detect such a situation and report that the monitoring relationship is not required.
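The logic behind CQ1 and CQ2 can be sketched in plain Python as follows. This is not the SPARQL formulation used with the ontology; the data structures and function names are hypothetical and only mirror the Jack and Sarah example.

```python
from typing import NamedTuple


class Delegation(NamedTuple):
    delegator: str
    delegatee: str
    permission: str
    information: str


read_loc = Delegation("Jack", "Sarah", "read", "Jack's location")
collect_loc = Delegation("Jack", "Sarah", "collect", "Jack's location")
delegations = [read_loc, collect_loc]


def cq1(delegations, trusted, monitored):
    """Delegations backed by neither trust nor monitoring."""
    return [d for d in delegations if d not in trusted and d not in monitored]


def cq2(delegations, trusted, monitored):
    """Delegations that are both trusted and monitored (redundant monitoring)."""
    return [d for d in delegations if d in trusted and d in monitored]


# Scenario A: no trust and no monitoring at all, so CQ1 reports both delegations.
print(cq1(delegations, trusted=set(), monitored=set()))

# Scenario B: the read delegation is trusted and also monitored, so CQ2 reports it.
print(cq2(delegations, trusted={read_loc}, monitored={read_loc}))
```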
CQ3 can be used to return different sets of personal information based on their sensitivity levels (e.g., Secret, Sensitive, Confidential or Restricted).
CQ4-13 are used to query risk-related aspects, where CQ4 can be used to return all vulnerabilities as well as the personal information that is subject to them. For example, CQ4 will return “V1. Weak masking technique” and “I1. Jack’s glucose level” if applied to the AAL example (Figure 5). CQ5 can be used to return vulnerabilities as well as the threats that can exploit them. Concerning the AAL example, it will return “V1.” and both “T1. Linking info to Jack SL[M]” and “T2. Linking info to Jack SL[M] PL[L]”. CQ6 can be used to return unmitigated vulnerabilities. In the AAL example, only one unmitigated vulnerability will be returned (i.e., “V1.”). CQ7 can be used to return any threat that is threatening personal information as well as the threatened information. This CQ will return “T1.”, “T2.” and “I1.” if applied to the AAL example.
CQ8 can be used to return different sets of threats based on their severity levels (e.g., Low, Medium, or High). If CQ8 is applied to return threats with a Medium severity level, both “T1.” and “T2.” will be returned. For the other levels, it will return nothing, i.e., no threats with a Low or High severity level exist in the AAL example. CQ9 can be used to return intentional threats as well as the personal information threatened by them. For example, CQ9 will return “T1.” and “I1.” as “T1.” is the only intentional threat in the example. CQ10 can be used to return threat actors and the intentional threats they intend to carry out. Applying CQ10 to our example will return “Bob” and “T1.” since “Bob” is the only threat actor in our example.
CQ11 can be used to return attack methods and the intentional threats they are used for. For instance, CQ11 will return “AM1. De-masking technique” and “T1.” if applied to the AAL example. CQ12 can be used to return incidental threats and the personal information threatened by them, and it will return “T2.” and “I1.” if applied to the AAL example. CQ13 can be used to return different sets of incidental threats based on their probability levels (e.g., Low, Medium, or High). If CQ13 is applied to return incidental threats with a Low probability level, “T2.” will be returned. For the other levels, it will return nothing since no incidental threats with a Medium or High probability exist in the example.
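A few of the risk-related CQs can be mimicked with simple filters over the AAL example data described above. The dictionary layout and function names are assumptions made for illustration; the actual CQs are SPARQL queries over the ontology instance.

```python
# AAL example facts as stated in the text; the record layout is hypothetical.
vulnerabilities = [
    {"id": "V1", "name": "Weak masking technique", "affects": "I1", "mitigated_by": None},
]
threats = [
    {"id": "T1", "kind": "intentional", "severity": "Medium",
     "probability": None, "exploits": "V1", "threatens": "I1"},
    {"id": "T2", "kind": "incidental", "severity": "Medium",
     "probability": "Low", "exploits": "V1", "threatens": "I1"},
]


def cq6_unmitigated():
    """Vulnerabilities with no mitigation attached."""
    return [v["id"] for v in vulnerabilities if v["mitigated_by"] is None]


def cq8_threats_by_severity(level):
    """Threats with the given severity level."""
    return [t["id"] for t in threats if t["severity"] == level]


def cq13_incidental_by_probability(level):
    """Incidental threats with the given probability level."""
    return [t["id"] for t in threats
            if t["kind"] == "incidental" and t["probability"] == level]


print(cq6_unmitigated())
print(cq8_threats_by_severity("Medium"), cq8_threats_by_severity("High"))
print(cq13_incidental_by_probability("Low"))
```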
CQ14-15 are used to query treatment-related aspects, where CQ14 can be used to return privacy goals that have been realized by privacy constraints. Applying CQ14 to our example will return two privacy goals, “PG1. Ensure anonymity” and “PG2. Ensure unlinkability”. CQ15 can be used to return privacy mechanisms as well as the personal information such mechanisms are applied to. Applying CQ15 to the example will return both the “PC1. Anonymization mechanism” and “PC2. Unlinkability mechanism” privacy mechanisms as well as “I1.”.
CQ16-26 are used to query privacy requirements related violations, where the definitions of the seven main privacy requirements violations are presented in Table 2. In particular, CQ16-19 are used for analyzing Confidentiality, where CQ16-17 are used for analyzing non-disclosure by detecting and reporting when personal information is read without the owner’s permission (CQ16), or when it has been transferred relying on a non-confidential transmission means (CQ17). For example, if Sarah did not have a read permission concerning “Jack’s glucose level”, CQ16 will detect and report such a situation. Similarly, if “Jack’s health situation” (personal information) is transferred to Mike relying on a non-confidential provision, CQ17 will detect and report such a violation.
|Pri. req.||Privacy requirement violation definition|
|Confidentiality||Disclosure, personal information is disclosed without the owner’s consent (permission), personal information is used for tasks where it is not strictly required, and/or personal information is used for tasks that are incompatible with the specific, explicit, legitimate purposes for which it has been permitted to be used.|
|Anonymity||Identifiability, the identity of information owner can be sufficiently identified, i.e., it can be disclosed.|
|Unlinkability||Linkability, the link between personal information and its owner can be sufficiently distinguished, i.e., it is possible to link personal information back to its owner.|
|Unobservability||Observability, the identity of information owner can be observed/detected by others while performing an activity.|
|Notice||Unnotified, personal information is being collected without notifying its owner.|
|Transparent||Untransparent, information owner is not able to know who is using his/her information and for what purposes.|
|Accountable||Unaccountable, information owner cannot hold information users accountable for their actions concerning his/her personal information.|
CQ18 is used for analyzing the Need to Know (NtK) principle by verifying whether personal information is strictly required by the goals using it. For example, if the Need to Use (NtU) of the goal “Assess Jack’s situation” concerning “Jack’s vital signs” is optional, CQ18 will detect and report that such information is not strictly required by the goal.
CQ19 is used for analyzing the Purpose of Use (PoU) principle by verifying whether personal information is used for the specific, explicit, legitimate purposes for which it has been permitted to be used. For instance, if the PoU of the goal “Assess Jack’s situation” concerning “Jack’s vital signs” is incompatible, CQ19 will detect and report that the PoU of such information is incompatible with the specific, explicit, legitimate purposes for which it has been permitted to be used.
CQ20 is used for analyzing anonymity by verifying whether the identity of the information owner can be sufficiently identified. For example, if “Jack’s glucose level” has not been anonymized relying on “PC1. Anonymization mechanism”, CQ20 will detect and report that “Jack’s glucose level” can be used to sufficiently identify Jack.
CQ21 is used for analyzing Unlinkability by verifying whether it is possible to link personal information back to its owner. For example, if “PC2. Unlinkability mechanism” was not applied to “Jack’s glucose level”, CQ21 will detect and report that it is possible to link “Jack’s glucose level” to Jack.
CQ22 is used for analyzing unobservability by verifying whether the identity of the information owner can be observed by others while performing some activity. Consider that Jack does not want his activities to be monitored while he is in the bathroom. Then “Jack’s location” should not be collected when he is in the bathroom, since such information can be used to describe activities performed by Jack, which he does not want to be observed by others. If “Jack’s location” is collected, CQ22 will be able to detect and report that.
CQ23 is used for analyzing notice by verifying whether personal information is being collected without notifying its owner. We consider that providing permission to collect implies that the actor has already been notified and has agreed to the collection of his/her personal information. If personal information is being collected and there is no permission to collect, CQ23 will detect and report such a violation.
CQ24-25 are used for analyzing transparency, where CQ24 analyzes the authentication principle and CQ25 analyzes the authorization principle. In particular, CQ24 verifies whether an actor can be authenticated by checking that it is playing at least one role that enables identifying its main responsibilities (if an actor is not playing any role, it will be impossible to authenticate it), and that the actor is not playing any threat actor role. Accordingly, CQ24 will be able to detect and report whether such an actor can be authenticated. Considering our example, Bob is playing a threat actor role, and he will be returned if CQ24 is applied. CQ25 analyzes authorization by verifying that actors are not using personal information without the required permissions. If Sarah was reading/collecting any of Jack’s personal information without a read/collect permission, CQ25 will be able to detect and report such a violation.
Finally, CQ26 is used for analyzing accountability relying on the non-repudiation principle by verifying that actors cannot repudiate that they accepted delegations. This can be done by relying on the adoption concept: if there exists a delegatee without an adopt relationship to the delegatum, CQ26 will detect and report such a violation. Concerning our example, if Sarah did not adopt the read and collect permissions that have been delegated by Jack, CQ26 will detect and report such violations.
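Several of the violation checks above reduce to a set difference between what happens and what has been permitted or adopted. The sketch below illustrates this for the notice check (CQ23) and the non-repudiation check (CQ26); the tuples and function names are hypothetical and do not reproduce the SPARQL queries.

```python
# Hypothetical facts: who collects what, and which collect permissions exist.
collections = {("Sarah", "Jack's location"), ("Sarah", "Jack's glucose level")}
collect_permissions = {("Sarah", "Jack's location")}

# Hypothetical facts: delegated permissions and those the delegatee adopted.
delegated = {
    ("Sarah", "read", "Jack's location"),
    ("Sarah", "collect", "Jack's location"),
}
adopted = {("Sarah", "read", "Jack's location")}


def cq23_notice_violations(collections, collect_permissions):
    """Collections performed without a matching collect permission."""
    return sorted(collections - collect_permissions)


def cq26_unadopted_delegations(delegated, adopted):
    """Delegations that the delegatee has not adopted (repudiable)."""
    return sorted(delegated - adopted)


print(cq23_notice_violations(collections, collect_permissions))
print(cq26_unadopted_delegations(delegated, adopted))
```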
Evaluation aims to provide evidence that an artifact achieves the purpose for which it has been developed [45, 75, 99, 39]. We evaluate the COPri ontology against the common pitfalls in ontologies identified in the literature (the catalog of the pitfalls can be found in Appendix A). These pitfalls can be classified under three criteria: 1- Consistency verifies whether the ontology includes or allows for any inconsistencies; 2- Completeness verifies whether the domain of interest is appropriately covered; and 3- Conciseness verifies whether the ontology includes irrelevant elements or redundant representations of some elements with respect to the domain to be covered. The classification of the pitfalls by criteria is shown in Table 3, which also identifies the four different methods we followed to evaluate the COPri ontology against each of the pitfalls:
|Criteria||ID||Pitfall||Protégé & HermiT||OOPS!||Lexical semantics experts||Survey|
|Consistency||P1.||Creating polysemous elements||-||-||✓||-|
|Consistency||P5.||Defining wrong inverse relationships||-||✓||-||-|
|Consistency||P6.||Including cycles in the hierarchy||✓||✓||-||-|
|Consistency||P7.||Merging different concepts in the same class||-||✓||✓||-|
|Consistency||P15.||Misusing “not some” and “some not”||✓||-||-||-|
|Consistency||P18.||Specifying too much the domain or the range||✓||-||-||-|
|Consistency||P19.||Swapping intersection and union||-||✓||-||-|
|Consistency||P24.||Using recursive definition||-||✓||✓||-|
|Completeness||P4.||Creating unconnected ontology elements||✓||✓||-||-|
|Completeness||P9.||Missing basic information||-||-||-||✓|
|Completeness||P11.||Missing domain or range in properties||✓||✓||-||-|
|Completeness||P12.||Missing equivalent properties||-||✓||-||-|
|Completeness||P13.||Missing inverse relationships||-||✓||-||-|
|Completeness||P16.||Misusing primitive and defined classes||✓||-||-||-|
|Conciseness||P2.||Creating synonyms as classes||-||✓||✓||-|
|Conciseness||P3.||Creating the relationship “is” instead of using “subclassOf”, “instanceOf” or “sameIndividual”||-||✓||-||-|
|Conciseness||P17.||Specializing too much a hierarchy||✓||-||✓||-|
|Conciseness||P21.||Using a miscellaneous class||✓||✓||✓||-|
1- Protégé & HermiT Reasoner (http://www.hermit-reasoner.com/): HermiT is the first publicly available OWL reasoner based on a hypertableau calculus, and it can perform various automated checks such as consistency, satisfiability, etc. of OWL-based ontologies. Both Protégé and HermiT have been used to verify COPri against several pitfalls. In particular, HermiT is able to detect cycles in the hierarchy (P6.), and P4. has been verified using the OntoGraf plug-in, which enables visualizing the ontology. Concerning P10., we have already made all primitive sibling classes in our ontology disjoint, i.e., no missing disjointness can be found in the ontology. We have manually checked whether the domain and range of all object properties have been defined (P11.). Moreover, we verified P14. using Probe Classes, which can be used to test and ensure that the ontology does not include inconsistencies.
The COPri ontology cannot suffer from P15. since we did not use complement operators to describe/define any of the classes. All defined classes have been defined using both necessary and sufficient conditions, which makes the inferred hierarchy exactly the same as the asserted one. The concepts of the ontology are general enough to avoid both P17. and P18., i.e., specializing a hierarchy too much (P17.), or specifying the domain and/or the range too much (P18.). Additionally, when we needed very specific concepts (e.g., an attribute of a class), we used individuals. No miscellaneous class has been identified (P21.), since the names of all classes and their sub-classes have been carefully chosen.
2- Evaluation with OntOlogy Pitfall Scanner (OOPS!): OOPS! is a web-based ontology evaluation tool (http://oops.linkeddata.es/index.jsp) for detecting common pitfalls in ontologies. The ontology can be uploaded to OOPS!, which returns an evaluation report about the detected pitfalls, where each pitfall is described by its identifier, title, description, elements affected (e.g., classes, object properties, or even the whole ontology) and an importance level. There are three importance levels based on the impact that a pitfall may have on the ontology: 1- Critical: it is crucial to correct the pitfall, otherwise the consistency, reasoning, applicability, etc. of the ontology could be affected; 2- Important: it is not critical for the functionality of the ontology, but it is important to correct it; and 3- Minor: it does not represent a problem, but correcting it makes the ontology better organized and more user friendly.
Result of evaluation: the COPri ontology was uploaded to the OOPS! pitfall scanner, which returned an evaluation report that is shown in Figure 6 (the evaluation with OOPS! was performed after evaluating the ontology with Protégé & HermiT, i.e., several pitfalls had already been detected and corrected). In particular, two suggestions (Figure 7) have been returned, proposing that it might be better to characterize both the is_a and partOf relationships as symmetric or transitive. We took these suggestions into account, characterizing both of these relationships as transitive. In addition, 53 minor pitfalls (P13: inverse relationships not explicitly declared) have been identified. However, as mentioned earlier, we used only one inverse property to minimize the number of properties/relationships in the ontology.
Finally, only one critical pitfall has been identified (shown in Figure 8). Looking closely at this pitfall, we can see that the scanner identifies that we are using an is_a relationship instead of the OWL primitive for representing the subclass relationship (rdfs:subClassOf). However, the is_a relationship is used in most goal-based modeling languages, from which we have adopted many of the concepts/relationships of the COPri ontology. Therefore, we chose not to replace it with the subClassOf relationship. The result of the second test after addressing one of the suggestions is shown in Figure 9.
3- Lexical semantics experts: two lexical semantics experts whose main focus is Natural Language Processing (NLP) were provided with the COPri ontology and asked to check whether the ontology suffers from any of the following pitfalls: P1. Creating polysemous elements, P2. Creating synonyms as classes, P7. Merging different concepts in the same class, P17. Specializing too much a hierarchy, P21. Using a miscellaneous class, and P24. Using recursive definition.
Result of evaluation: several issues have been raised by the experts, mainly concerning P2. Creating synonyms as classes, and P24. Using recursive definition. Most of these issues have been properly addressed. The experts’ feedback and how it was addressed can be found in Appendix B.
4- A survey with researchers: the main purpose of this survey was to evaluate the adequacy and completeness of the COPri ontology in terms of its concepts and relationships for dealing with privacy requirements in their social and organizational context (P9.), i.e., whether the selected concepts and relationships are adequate to deal with privacy requirements or whether they need to be refined or extended. The survey was closed, i.e., it was accessible only through a special link provided to the invited participants, to avoid unintended participants. In what follows, we describe the survey participants, the survey template design, and the result of the survey:
Survey participants: in total, 25 potential participants were contacted to complete the survey, and they were asked to forward the email to anyone who fits the participation criteria (e.g., has good experience in privacy and/or security). We received 16 responses (64% response rate).
Survey template design: the survey template (available at https://goo.gl/bro8nG) is composed of four main sections: S1. General information about the survey includes a description of the purpose of the survey, a privacy and confidentiality statement, and an informed consent to be read and accepted (checked) by participants before providing any input. S2. Participant demographic includes four questions related to the participant’s name, occupation, type of experience (academic and/or industry), and years of experience with privacy and/or security. S3. Evaluating the COPri ontology aims at collecting feedback from participants for evaluating the adequacy and completeness of the COPri ontology in terms of its main concepts and relationships categories and dimensions. S4. Final remarks aims at collecting suggestions and/or criticism concerning the COPri ontology.
S2. Result of demographic questions. 15 (93.8%) of the participants are researchers (e.g., Professors, Post-docs, and PhD. candidates) and 1 (6.2%) is a student (e.g., MSc, bachelor). Concerning experience with privacy and/or security: 2 (12.5%) of the participants have both academic and industrial experience, and 14 (87.5%) have pure academic experience. Moreover, 3 (18.8%) have less than one year, 7 (43.8%) have between one and four years, and 6 (37.5%) have more than four years of experience.
S3. Result of evaluation questions. This section is composed of 10 subsections, each of which is dedicated to collecting feedback concerning the adequacy and completeness of a specific dimension/category of concepts and relationships. In each of these subsections, we provide the definitions of the concepts and relationships of the targeted dimension/category as well as a diagram representing them. These are followed by a mandatory question asking the participant to grade the completeness of the presented concepts and relationships with respect to the system aspects they aim to capture on a scale from 1 (incomplete) to 5 (complete).
In total, we have defined 10 questions, one for each category or dimension under evaluation. In particular, Q1-7 cover the seven main categories of concepts in the organizational dimension as follows: Q1 for the agentive entities category, Q2 for the intentional entities category, Q3 for the information entities category, Q4 for the goals & information interrelationships category, Q5 for the information ownership & permissions category, Q6 for the entities interactions category, and Q7 for the entities social trust category. Moreover, we defined Q8 for the risk dimension, Q9 for the treatment dimension, and Q10 for the privacy dimension. The result of the evaluation for each of these sections is summarized in Table 4. The result tends to demonstrate that most of the targeted dimensions/categories of concepts and relationships properly cover the aspects they aim to represent.
Additionally, we have added an optional question in each of the 10 sections to evaluate the adequacy of the concepts and relationships by collecting suggestions to improve the category/dimension under evaluation. Some feedback suggested refining, including or excluding some of the concepts/relationships, and we took some of these suggestions into account while developing the final ontology.
Result of the final remarks question: most of the feedback was valuable, raised important issues, and ranged from complimenting to criticizing. For example, among the encouraging feedback we received was “COPri covers a wide range of privacy-related concepts, with actor and goal oriented perspectives, which looks promising. We look forward to seeing it used to capture real-world privacy problem context”. Another piece of feedback and suggestion was “I think it is very precise and a very good work. Maybe some others concepts could be expressed somewhere”. One of the comments we received was “How satisfaction of privacy requirements can be verified using it?”. We also received criticisms such as the following one: “I have no idea how good it is unless it is applied to many real cases. I’m concerned that it is not grounded in reality. It’s also very complicated, which makes it hard to apply in industry”. However, such criticism opens the way for future research directions.
|Question||1 (incomplete)||2||3||4||5 (complete)|
|Q1. Agentive cat.||0 (0.0%)||1 (6.3%)||3 (18.8%)||6 (37.5%)||6 (37.5%)|
|Q2. Intentional cat.||0 (0.0%)||1 (6.3%)||4 (25.0%)||7 (43.8%)||4 (25.0%)|
|Q3. Informational cat.||0 (0.0%)||2 (12.5%)||4 (25.0%)||4 (25.0%)||6 (37.5%)|
|Q4. Goals & info cat.||0 (0.0%)||2 (12.5%)||2 (12.5%)||6 (37.5%)||6 (37.5%)|
|Q5. Ownership cat.||0 (0.0%)||1 (6.3%)||1 (6.3%)||5 (31.3%)||9 (56.3%)|
|Q6. Interactions cat.||0 (0.0%)||1 (6.3%)||1 (6.3%)||6 (37.5%)||8 (50.0%)|
|Q7. Social Trust cat.||0 (0.0%)||0 (0.0%)||4 (25.0%)||7 (43.8%)||5 (31.3%)|
|Q8. Risk dim.||0 (0.0%)||3 (18.8%)||0 (0.0%)||8 (50.0%)||5 (31.3%)|
|Q9. Treatment dim.||0 (0.0%)||0 (0.0%)||3 (18.8%)||7 (43.8%)||6 (37.5%)|
|Q10. Privacy dim.||0 (0.0%)||2 (12.5%)||2 (12.5%)||5 (31.3%)||7 (43.8%)|
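To make the survey result easier to compare across questions, the counts in Table 4 can be summarized as a mean grade and as the share of grades of 4 or higher. The sketch below assumes that the five columns correspond to the grades 1 to 5 from left to right; the summary statistics themselves are an illustration and are not reported in the original evaluation.

```python
# Response counts per grade (assumed order: grade 1, 2, 3, 4, 5), from Table 4.
ratings = {
    "Q1": [0, 1, 3, 6, 6],
    "Q2": [0, 1, 4, 7, 4],
    "Q3": [0, 2, 4, 4, 6],
    "Q4": [0, 2, 2, 6, 6],
    "Q5": [0, 1, 1, 5, 9],
    "Q6": [0, 1, 1, 6, 8],
    "Q7": [0, 0, 4, 7, 5],
    "Q8": [0, 3, 0, 8, 5],
    "Q9": [0, 0, 3, 7, 6],
    "Q10": [0, 2, 2, 5, 7],
}


def mean_grade(counts):
    """Weighted mean of the grades 1..5 given the number of responses per grade."""
    return sum(g * n for g, n in enumerate(counts, start=1)) / sum(counts)


def share_at_least(counts, grade):
    """Fraction of responses with a grade greater than or equal to `grade`."""
    return sum(counts[grade - 1:]) / sum(counts)


for question, counts in ratings.items():
    print(f"{question}: n={sum(counts)}, mean={mean_grade(counts):.2f}, "
          f"grade>=4: {share_at_least(counts, 4):.0%}")
```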
8 Threats to validity
After presenting and discussing the evaluation of our ontology, we list and discuss the threats to its validity in this section. Following Runeson et al., we classify the identified threats under two types, internal and external:
1- Internal threats are concerned with factors that have not been considered in the study and that could have influenced the investigated factors [92, 81]. We have identified one threat: Authors’ background. The authors of this study have good experience in goal modeling (especially in i*-based languages). This may have influenced the selection and definitions of the concepts and relationships of the ontology. However, i* languages have been developed with the aim to capture requirements in their social and organizational context, which is also a main objective of our ontology.
2- External threats are concerned with the extent to which the results of the study can be generalized. We have identified two threats: 1. Validity of the survey result: the number of participants can raise concerns about the validity of the result. However, most of them are experts with good experience in privacy, and some of them are high-profile researchers. 2. Extensive evaluation: the ontology has been evaluated against the common pitfalls in ontologies with the help of some tools, lexical semantics experts, and privacy researchers, yet it has not been applied in industry, which may reveal undetected errors and new ways to improve it. However, applying our ontology to real case studies from different domains is on our list for future work.
9 Related work
Several ontologies have been proposed for dealing with privacy and security. For example, Oltramari et al. propose PrivOnto, a semantic framework for analyzing privacy policies; they also developed an interactive online tool that allows users to explore 23,000 annotated data practices instantiated in the PrivOnto knowledge base. Singhal and Wijesekera provide a security ontology that can be used to identify which threats endanger which assets and what countermeasures can be used. Moreover, Massacci et al. propose an ontology for security requirements engineering that adopts concepts from the Secure Tropos methodology and several industrial case studies. Velasco et al. introduce an ontology-based framework for representing and reusing security requirements based on risk analysis. Additionally, Kang and Liang developed a security ontology for software development, which includes the most common security concerns, and Dritsas et al. developed an ontology for designing and developing a set of security patterns that can be used to deal with security requirements for e-health applications. General privacy ontologies/taxonomies (e.g., Anton and Earp, Solove et al., and Wuyts et al.) can serve as a general knowledge repository for knowledge-based privacy goal refinement.
On the other hand, several approaches for dealing with privacy requirements have been proposed in the literature. For instance, Spiekermann and Cranor propose guidelines for building privacy-friendly systems and a three-layer model of user privacy concerns, and relate them to system operations in terms of data transfer, storage, and processing. Moreover, Deng et al. provide a methodology for modeling privacy-specific threats for software systems along with a catalog of privacy-specific threat tree patterns, which can be used to address threats identified during the analysis. Radics et al. introduce PREprocess, a framework for privacy requirements engineering, which has been designed to guide a privacy analyst during the collection and elicitation of privacy requirements through the identification of privacy-related patterns.
Moreover, Labda et al. propose a privacy-aware Business Processes (BP) framework for modeling, reasoning and enforcing privacy constraints. The framework offers five concepts that can be used for analyzing privacy-related aspects, namely access control, Separation of Tasks (SoT), Binding of Tasks (BoT), user consent, and Necessity to Know (NtK). Hong et al. propose a privacy risk model specifically for ubiquitous computing, which captures privacy concerns at a high abstraction level and then refines them into concrete solutions. Gharib et al. propose a holistic approach for analyzing privacy requirements that aims at assisting software engineers in designing privacy-aware systems by providing guidance and support while dealing with privacy requirements. Finally, Kalloniatis et al. introduce PriS, a security requirements engineering method that supports eight types of privacy goals corresponding to the eight privacy concerns they identify in their work, namely: authentication, authorization, identification, data protection, anonymity, pseudonymity, unlinkability, and unobservability.
10 Conclusions and Future Work
We introduce COPri, a Core Ontology for Privacy requirements engineering that adopts and extends our previous work, in which we proposed a privacy ontology mined through a systematic literature review. In this paper, we extend and refine the concepts and relationships proposed in , implement the ontology using Protégé, and apply it to an illustrative example concerning Ambient-Assisted Living (AAL) systems. We then validate the ontology by querying the ontology instance (the AAL example) with Competency Questions (CQs). This allows evaluating whether the ontology is able to capture detailed knowledge about the targeted domain to fulfill the needs of its intended use. Finally, we evaluate the ontology against common pitfalls in ontologies with the help of software tools, lexical semantics experts, and privacy and security researchers.
The main aim of developing COPri is to assist software engineers in designing privacy-aware systems by providing a generic and expressive set of key privacy concepts and relationships, which enable privacy requirements to be captured in their social and organizational context. This work is our second step towards proposing a well-defined privacy ontology, which, when completed, would constitute a significant step forward in improving the quality of privacy-aware systems. However, much work remains to be done.
For future work, we plan to better validate our ontology by applying it to capture privacy requirements in real case studies from different domains. Moreover, we will refine and analyze several privacy-related concepts. For example, we plan to better analyze how the sensitivity level of personal information can be determined based on the situation, and how sensitivity levels can be used to facilitate the identification of related privacy requirements. We will also refine the analysis of the Need to Use (NtU) property, trying to better characterize the relationship between a goal and personal information. Additionally, special attention will be given to refining the analysis of the Purpose of Use (PoU) property, as Compatible/Incompatible are too abstract to characterize such an important property, and we will investigate how the PoU can be determined automatically based on the characteristics of the goal.
-  Alessandro Acquisti, Allan Friedman, and Rahul Telang. Is There a Cost to Privacy Breaches? An Events Study. Fifth Workshop on the Economics of Information Security, pages 1–20, 2006.
-  Irwin Altman. Privacy: a conceptual analysis. Environment and behavior, 8(1):7–29, 1976.
-  Annie I. Anton and Julia B. Earp. A requirements taxonomy for reducing Web site privacy vulnerabilities. Requirements Engineering, 9(3):169–185, 2004.
-  Awad and Krishnan. The Personalization Privacy Paradox: An Empirical Evaluation of Information Transparency and the Willingness to Be Profiled Online for Personalization. MIS Quarterly, 30(1):13, 2006.
-  Adam Barth, Anupam Datta, John C. Mitchell, and Helen Nissenbaum. Privacy and contextual integrity: Framework and applications. Proceedings - IEEE Symposium on Security and Privacy, 2006:184–198, 2006.
-  Shirley Beul, Martina Ziefle, and Eva Maria Jakobs. It’s all about the medium: Identifying patients’ medial preferences for telemedical consultations. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 7058 LNCS, pages 321–336. Springer, 2011.
-  Stefano Braghin, Alberto Coen-Porisini, Pietro Colombo, Sabrina Sicari, and Alberto Trombetta. Introducing privacy in a hospital information system. In Proceedings of the fourth international workshop on Software engineering for secure systems - SESS ’08, pages 9–16. ACM, 2008.
-  Katherine Campbell, Lawrence A Gordon, Martin P Loeb, and Lei Zhou. The economic cost of publicly announced information security breaches: empirical evidence from the stock market. Journal of Computer Security, 11(3):431–448, 2003.
-  Cristiano Castelfranchi. Modeling social actions for AI agents. Artificial Intelligence, 103(January 1997):157–182, 1998.
-  Huseyin Cavusoglu, Birendra Mishra, and Srinivasan Raghunathan. The Effect of Internet Security Breach Announcements on Market Value: Capital Market Reactions for Breached Firms and Internet Security Developers. International Journal of Electronic Commerce, 9(1):69–104, 2004.
-  Kari Chopra and William A Wallace. Trust in Electronic Environments. In System Sciences, 2003. Proceedings of the 36th Annual Hawaii International Conference on, pages 10–19. IEEE, 2002.
-  Edward V Comber. Management of Confidential Information. In Proceedings of the joint computer conference, pages 135–143. ACM, 1969.
-  Mary J Culnan and Pamela K Armstrong. Information privacy concerns procedural fairness and impersonal trust: an empirical investigation. Organization science, 10(1):104–115, 1999.
-  Mina Deng, Kim Wuyts, Riccardo Scandariato, Bart Preneel, and Wouter Joosen. A privacy threat analysis framework: supporting the elicitation and fulfillment of privacy requirements. Requirements Engineering, 16(1):1–27, 2011.
-  Tamara Dinev, Heng Xu, Jeff H. Smith, and Paul Hart. Information privacy and correlates: An empirical attempt to bridge and distinguish privacy-related concepts. European Journal of Information Systems, 22(3):295–316, May 2013.
-  Hai Dong, Farookh Khadeer Hussain, and Elizabeth Chang. Application of Protégé and SPARQL in the field of project knowledge management. Second International Conference on Systems and Networks Communications, ICSNC 2007, 2007.
-  S Dritsas, L Gymnopoulos, M Karyda, T Balopoulos, S Kokolakis, C Lambrinoudakis, and S Katsikas. A knowledge-based approach to security requirements for e-health applications. Electronic Journal for E-Commerce Tools and Applications, pages 1–24, 2006.
-  Dang Viet Dzung and Atsushi Ohnishi. Ontology-based reasoning in requirements elicitation. In SEFM 2009 - 7th IEEE International Conference on Software Engineering and Formal Methods, pages 263–272. IEEE, 2009.
-  Golnaz Elahi, Eric Yu, and Nicola Zannone. A modeling ontology for integrating vulnerabilities into security requirements conceptual foundations. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), volume 5829 LNCS, pages 99–114. Springer, 2009.
-  Golnaz Elahi, Eric Yu, and Nicola Zannone. A vulnerability-centric requirements engineering framework: Analyzing security attacks, countermeasures, and requirements based on vulnerabilities. Requirements Engineering, 15(1):41–62, 2010.
-  Leonard Ellis. Privacy and the Computer - Steps to Practicality. British Computer Society, London, UK, 1972.
-  Amitai Etzioni. The Limits of Privacy. Ethics, 111(4):288, 1999.
-  Barbara J Evans, Bernard Lo, Deven Mcgraw, Deborah Peel, Richard Platt, and Kristen Rosati. The Health Insurance Portability and Accountability Act (HIPAA) of 1996. 25(1), 2011.
-  Federal Trade Commission. Gramm-Leach-Bliley Act: Financial Privacy and Pretexting. Technical report, 2002.
-  M Fernández-López, A Gómez-Pérez, and Natalia Juristo. METHONTOLOGY: From Ontological Art Towards Ontological Engineering. AAAI-97 Spring Symposium Series, SS-97-06:33–40, 1997.
-  Mark S. Fox, John F. Chionglo, and Fadi G. Fadel. A common-sense model of the enterprise. Proceedings of the 2nd Industrial and Engineering Applications of Artificial Intelligence and Expert Systems, pages 25–34, 1993.
-  Christiane Funken, G Gans, M Jarke, St Kethers, and G Lakemeyer. Modeling the Impact of Trust and Distrust in Agent Networks. In Proceedings of the 3rd Workshop on Agent Oriented Information Systems AOIS01, volume Juni, pages 45–58, 2001.
-  Aldo Gangemi, Carola Catenacci, Massimiliano Ciaramita, and Jos Lehmann. Modelling ontology evaluation and validation. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 4011 LNCS:140–154, 2006.
-  Robert Gellman. Privacy, Consumers, and Costs - How The Lack of Privacy Costs Consumers and Why Business Studies of Privacy Costs are Biased and Incomplete. In Ford Foundation, pages 1–37, 2002.
-  General Data Protection Regulation. Regulation (EU) 2016/679 of the European Parliament and of the Council of 27 April 2016 on the protection of natural persons with regard to the processing of personal data and on the free movement of such data, and repealing Directive 95/46. Official Journal of the European Union (OJ), 59:1–88, 2016.
-  Mohamad Gharib and Paolo Giorgini. Analyzing trust requirements in socio-technical systems: A belief-based approach. In Lecture Notes in Business Information Processing, volume 235, pages 254–270. Springer, 2015.
-  Mohamad Gharib and Paolo Giorgini. Modeling and Reasoning About Information Quality Requirements. In Requirements Engineering: Foundation for Software Quality, volume 9013, pages 49–64. Springer, 2015.
-  Mohamad Gharib, Paolo Giorgini, and John Mylopoulos. Ontologies for Privacy Requirements Engineering: A Systematic Literature Review. arXiv preprint arXiv:1611.10097, 2016.
-  Mohamad Gharib, Paolo Giorgini, and John Mylopoulos. Towards an Ontology for Privacy Requirements via a Systematic Literature Review. In International Conference on Conceptual Modeling, volume 10650 LNCS, pages 193–208. Springer, nov 2017.
-  Mohamad Gharib, Paolo Lollini, and Andrea Bondavalli. A conceptual model for analyzing information quality in System-of-Systems. In 2017 12th System of Systems Engineering Conference, SoSE 2017, pages 1–6. IEEE, 2017.
-  Mohamad Gharib, Mattia Salnitri, Elda Paja, Paolo Giorgini, Haralambos Mouratidis, Michalis Pavlidis, Jose F. Ruiz, Sandra Fernandez, and Andrea Della Siria. Privacy Requirements: Findings and Lessons Learned in Developing a Privacy Platform. In Proceedings - 2016 IEEE 24th International Requirements Engineering Conference, RE, pages 256–265. IEEE, 2016.
-  A Gómez-Pérez, M Fernandez, and A De Vicente. Towards a method to conceptualize domain ontologies. Workshop on Ontological Engineering, pages 41–51, 1996.
-  Asunción Gómez-Pérez. OOPS! (OntOlogy Pitfall Scanner!): supporting ontology evaluation on-line. 1:1–5, 2009.
-  Shirley Gregor and Alan R Hevner. Positioning and Presenting Types of Knowledge in Design Science Research. MIS Quarterly, 37(2):337–355, 2013.
-  Thomas R. Gruber. Toward principles for the design of ontologies used for knowledge sharing. International Journal of Human-Computer Studies, 43(5-6):907–928, 1995.
-  Nicola Guarino and Christopher A. Welty. A Formal Ontology of Properties. Knowledge Engineering and Knowledge Management Methods, Models, and Tools, (1937):97–112, 2000.
-  Zahia Guessoum, Mikal Ziane, and Nora Faci. Monitoring and organizational-level adaptation of multi-agent systems. In Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems-Volume 2, pages 514–521. IEEE Computer Society, 2004.
-  Peter Haase, H Lewen, Rudi Studer, and DT Tran. The neon ontology engineering toolkit. Www, (April):4–6, 2008.
-  Daojing He, Chun Chen, Sammy Chan, Jiajun Bu, and Athanasios V. Vasilakos. A distributed trust evaluation model and its application scenarios for medical sensor networks. IEEE Transactions on Information Technology in Biomedicine, 16(6):1164–1175, 2012.
-  Alan R Hevner and Salvatore T March. Design Science in Information Systems Research. Information Systems Research, 28(1):75–105, 2004.
-  Jason I. Hong and James A. Landay. An architecture for privacy-sensitive ubiquitous computing. Proceedings of the 2nd international conference on Mobile systems, applications, and services - MobiSYS ’04, page 177, 2004.
-  Jason I. Hong, Jennifer D. Ng, Scott Lederer, and James A. Landay. Privacy risk models for designing privacy-sensitive ubiquitous computing systems. In Proceedings of the 2004 conference on Designing interactive systems processes, practices, methods, and techniques - DIS ’04, page 91. ACM, 2004.
-  Jennifer Horkoff, Daniele Barone, Lei Jiang, Eric Yu, Daniel Amyot, Alex Borgida, and John Mylopoulos. Strategic business modeling: Representation and reasoning. Software and Systems Modeling, 13(3):1015–1041, 2014.
-  Matthew Horridge, Holger Knublauch, Alan Rector, Robert Stevens, Chris Wroe, Simon Jupp, Georgina Moulton, Nick Drummond, and Sebastian Brandt. A Practical Guide to Building OWL Ontologies Using Protege 4 and CO-ODE Tools. Matrix, pages 0–107, 2011.
-  ISO. ISO/IEC 15408-2. Information Technology, Security techniques. Evaluation criteria for IT security. Security functional components. Technical report, 2009.
-  H. Kaiya and M. Saeki. Using Domain Ontology as Domain Knowledge for Requirements Elicitation. In 14th IEEE International Requirements Engineering Conference (RE’06), pages 189–198. IEEE, 2006.
-  Christos Kalloniatis, Evangelia Kavakli, and Stefanos Gritzalis. Addressing privacy requirements in system design: The PriS method. Requirements Engineering, 13(3):241–255, 2008.
-  Aditya Kalyanpur, Bijan Parsia, Evren Sirin, Bernardo Cuenca Grau, and James Hendler. Swoop: A Web Ontology Editing Browser. Web Semantics, 4(2):144–153, 2006.
-  Wentao Kang and Ying Liang. A security ontology with MDA for software development. In Proceedings - 2013 International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery, CyberC 2013, pages 67–74. IEEE, 2013.
-  T A Kosa, K El-Khatib, and S Marsh. Measuring Privacy. Journal of Internet Services and Information Security, 1:60–73, 2011.
-  Hanna Krasnova, Sarah Spiekermann, Ksenia Koroleva, and Thomas Hildebrand. Online social networks: why we disclose. Journal of Information Technology, 25(2):109–125, 2010.
-  Wadha Labda, Nikolay Mehandjiev, and Pedro Sampaio. Modeling of privacy-aware business processes in BPMN to protect personal data. In Proceedings of the 29th Annual ACM Symposium on Applied Computing, pages 1399–1405. ACM, 2014.
-  Luncheng Lin, Bashar Nuseibeh, Darrel Ince, Michael Jackson, and Jonathan Moffett. Introducing abuse frames for analysing security requirements. In 11th Requirements Engineering International Conference, pages 371–372. IEEE, 2003.
-  Lin Liu, Eric Yu, and John Mylopoulos. Security and privacy requirements analysis within a social setting. In 11th International Requirements Engineering Conference, pages 151–161. IEEE, 2003.
-  Stephen T. Margulis. Privacy as a social issue and behavioral concept. Journal of Social Issues, 59(2):243–261, jun 2003.
-  Fabio Massacci, John Mylopoulos, Federica Paci, Thein Thun Tun, and Yijun Yu. An extended ontology for security requirements. In Advanced Information Systems Engineering Workshops, pages 622–636. Springer, 2011.
-  Nicolas Mayer. Model-based management of information system security risk. PhD thesis, University of Namur, 2009.
-  Edward Alan Miller. The technical and interpersonal aspects of telemedicine: effects on doctor–patient communication. Journal of telemedicine and telecare, 9(1):1–7, 2003.
-  Ministry of Justice of Canada. Personal Information Protection and Electronic Documents Act, S.C. 2000, c. 5. Technical report, 2018.
-  H Mouratidis and P Giorgini. Secure Tropos: A security-oriented extension of the Tropos methodology. Journal of Software Engineering and Knowledge Engineering, 17(2):285–309, 2007.
-  Mobile Networks. Body Area Networks: A Survey. Mobile Networks and Applications, 16(2):171–193, 2010.
-  H Nissenbaum. Privacy as contextual integrity. Wash. L. Rev., pages 101–139, 2004.
-  Office of the Australian information commissioner. Australia. Privacy Act 1988. Technical report, 1988.
-  Alessandro Oltramari, Dhivya Piraviperumal, Florian Schaub, Shomir Wilson, Sushain Cherivirala, Thomas B. Norton, N. Cameron Russell, Peter Story, Joel Reidenberg, and Norman Sadeh. PrivOnto: A semantic framework for the analysis of privacy policies. Semantic Web, 9(2):185–203, 2018.
-  Inah Omoronyia, Luca Cavallaro, Mazeiar Salehie, Liliana Pasquale, and Bashar Nuseibeh. Engineering adaptive privacy: On the role of privacy awareness requirements. In Proceedings - International Conference on Software Engineering, pages 632–641, 2013.
-  Andreas Pfitzmann and Marit Hansen. A terminology for talking about privacy by data minimization: Anonymity, Unlinkability, Undetectability, Unobservability, Pseudonymity, and Identity Management. Technical University Dresden, pages 1–98, 2010.
-  Joseph Phelps, Glen Nowak, and Elizabeth Ferrell. Privacy concerns and consumer willingness to provide personal information. Journal of Public Policy & Marketing, 19(1):27–41, 2000.
-  María Poveda-Villalón, Mari Carmen Suárez-Figueroa, and Asunción Gómez-Pérez. A Double Classification of Common Pitfalls in Ontologies. Development, pages 1–12, 2010.
-  Valentina Presutti. D2.5.1: A Library of Ontology Design Patterns: reusable solutions for collaborative design of networked ontologies. pages 1–183, 2008.
-  Jan Pries-Heje, Richard Baskerville, and J Venable. Strategies for design science research evaluation. ECIS 2008 proceedings, pages 1–12, 2008.
-  Matthew Horridge, Holger Knublauch, Alan Rector, Robert Stevens, Chris Wroe, Simon Jupp, Georgina Moulton, Nick Drummond, and Sebastian Brandt. A Practical Guide To Building OWL Ontologies Using Protégé 4 and CO-ODE Tools - chapters 1,3,4. Matrix, pages 0–27, 2009.
-  Eric Prud’Hommeaux, Andy Seaborne, and Others. SPARQL Query Language for RDF (Working Draft). W3C recommendation, 2007.
-  Peter J Radics, Denis Gracanin, and Dennis Kafura. Preprocess before you build: Introducing a framework for privacy requirements engineering. In Social Computing (SocialCom), 2013 International Conference on, pages 564–569. IEEE, 2013.
-  Parisa Rashidi and Alex Mihailidis. A survey on ambient-assisted living tools for older adults. IEEE journal of biomedical and health informatics, 17(3):579–590, 2013.
-  Lillian Rostad. An extended misuse case notation: Including vulnerabilities and the insider threat. In The Twelfth Working Conference on Requirements Engineering: Foundation for Software Quality, pages 67–77. Springer, 2006.
-  Per Runeson and Martin Höst. Guidelines for conducting and reporting case study research in software engineering. Empirical software engineering, 14(2):131–164, 2009.
-  Ravi S Sandhu, Edward J Coyne, Hal L Feinstein, and Charles E Youman. Role-Based Access Control Models. IEEE computer, 29(2):38–47, 1996.
-  S. Drude. Abstracting information on body area networks. PhD thesis, University of Cambridge, 2006.
-  Kim Bartel Sheehan and Mariea Grubbs Hoy. Dimensions of privacy concern among online consumers. Journal of public policy & marketing, 19(1):62–73, 2000.
-  Anoop Singhal and Duminda Wijesekera. Ontologies for modeling enterprise level security metrics. In Proceedings of the Sixth Annual Workshop on Cyber Security and Information Intelligence Research, page 58. ACM, 2010.
-  Daniel J Solove. Conceptualizing privacy. California Law Review, pages 1087–1155, 2002.
-  Daniel J. Solove. A Taxonomy of Privacy. University of Pennsylvania Law Review, 154(3):477, 2006.
-  Amina Souag, Camille Salinesi, Raúl Mazo, and Isabelle Comyn-Wattiau. A security ontology for security requirements elicitation. In Engineering Secure Software and Systems, pages 157–177. Springer, 2015.
-  Sarah Spiekermann and Lorrie Faith Cranor. Engineering privacy. Software Engineering, IEEE Transactions on, 35(1):67–82, 2009.
-  M. C. Suárez-Figueroa. NeOn Methodology for Building Ontology Networks: Specification, Scheduling and Reuse. PhD thesis, 2010.
-  York Sure, Juergen Angele, and S Staab. OntoEdit: Guiding ontology development by methodology and inferencing. Proc. of the Confederated International Conferences CoopIS, DOA and ODBASE, pages 1205–1222, 2002.
-  W Trochim and J P Donnelly. The Research Methods Knowledge Base. Cengage Learning, 2006.
-  Rein Turn. Classification of personal information for privacy protection purposes. page 301, 1976.
-  Michael Uschold and Martin King. Towards a methodology for building ontologies. Citeseer, 1995.
-  Mike Uschold. Building Ontologies : Towards a Unified Methodology. Proceedings Expert Systems 1996, the 16th Annual Conference of the British Computer Society Specialist Group on Expert Systems, (September):1–18, 1996.
-  Mike Uschold and Michael Gruninger. Ontologies: Principles, methods and applications. The knowledge engineering review, 11(02):93–136, 1996.
-  G W Van Blarkom, J J Borking, and J G E Olk. Handbook of privacy and privacy-enhancing technologies. Privacy Incorporated Software Agent (PISA) Consortium, The Hague, 2003.
-  Joaquin Lasheras Velasco, Rafael Valencia-García, Jesualdo Tomás Fernández-Breis, Ambrosio Toval, and Others. Modelling reusable security requirements based on an ontology framework. Journal of Research and Practice in Information Technology, 41(2):119, 2009.
-  John Venable, Jan Pries-Heje, and Richard Baskerville. A comprehensive framework for evaluation in design science research. In International Conference on Design Science Research in Information Systems, pages 423–438. Springer, 2012.
-  Ju An Wang and Minzhe Guo. OVM: an ontology for vulnerability management. In Proceedings of the 5th Annual Workshop on Cyber Security and Information Intelligence Research, page 34. ACM, 2009.
-  Samuel D. Warren and Louis D. Brandeis. The Right to Privacy. Harvard Law Review, 4(5):193, 1890.
-  Alan F Westin. Privacy and freedom. Washington and Lee Law Review, 25(1):166, 1968.
-  Kim Wuyts, Riccardo Scandariato, Bart De Decker, and Wouter Joosen. Linking privacy solutions to developer goals. In Availability, Reliability and Security, 2009. ARES’09. International Conference on, pages 847–852. IEEE, 2009.
-  Eric Yu. Modelling Strategic Relationship for Process Reengineering. PhD thesis, University of Toronto, 1995.
-  Khairuddin Yusof, Karen Hong Beng Neoh, Muhammad Arif bin Hashim, Ishak Ibrahim, and Others. Role of teleconsultation in moving the healthcare system forward. Asia-Pacific Journal of Public Health, 14(1):29–34, 2002.
-  Nicola Zannone. A requirements engineering methodology for trust, security, and privacy. PhD thesis, University of Trento, 2006.
-  Martina Ziefle, Carsten Rocker, and Andreas Holzinger. Medical technology in smart homes: exploring the user’s perspective on privacy, intimacy and trust. In 35th Computer Software and Applications Conference Workshops (COMPSACW), pages 410–415. IEEE, 2011.
-  Detlev Zwick and Nikhilesh Dholakia. Whose identity is it anyway? Consumer representation in the age of database marketing. Journal of Macromarketing, 24(1):31–43, 2004.
Catalog of Common Pitfalls
In what follows, we present the catalog of 24 pitfalls identified in :
Creating polysemous elements: an ontology element whose name has different meanings is included in the ontology to represent more than one conceptual idea.
Creating synonyms as classes: several classes whose identifiers are synonyms are created and defined as equivalent. For example, we could define “Car”, “Motorcar” and “Automobile” as equivalent classes. This pitfall is related to the guidelines presented in .
Creating the relationship “is” instead of using “subclassOf”, “instanceOf” or “sameIndividual”: the “is” relationship is created in the ontology instead of using OWL primitives for representing the subclass relationship (“subclassOf”), the membership to a class (“instanceOf”), or the equality between instances (“sameAs”). This pitfall is also related to the guidelines for understanding the “is-a” relation provided in .
Creating unconnected ontology elements: ontology elements (classes, relationships or attributes) are created with no relation to the rest of the ontology. An example of this type of pitfall is to create the relationship “memberOfTeam” and to miss the class representing teams; thus, the relationship created is isolated in the ontology.
Defining wrong inverse relationships: two relationships are defined as inverse relations when actually they are not. For example, something is sold or something is bought; in this case, the relationships “isSoldIn” and “isBoughtIn” are not inverse.
Including cycles in the hierarchy [28, 41]: a cycle between two classes in the hierarchy is included in the ontology, although it is not intended to have such classes as equivalent. That is, some class A has a subclass B and at the same time B is a subclass of A. An example of this type of pitfall is represented by the class “Professor” as a subclass of “Person”, and the class “Person” as a subclass of “Professor”.
Merging different concepts in the same class: a class is created whose identifier is referring to two or more different concepts. An example of this type of pitfall is to create the class “StyleAndPeriod”, or “ProductOrService”.
Missing annotations: ontology terms lack annotation properties. Properties of this kind improve the understanding and usability of the ontology from a user's point of view.
Missing basic information: needed information is not included in the ontology. Sometimes this pitfall is related to the requirements in the Ontology Requirements Specification Document (ORSD)  that are not covered by the ontology. Other times it is related to knowledge that could be added to the ontology in order to make it more complete.
Missing disjointness: the ontology lacks disjoint axioms between classes or between properties that should be defined as disjoint. For example, we can create the classes “Odd” and “Even” (or the classes “Prime” and “Composite”) without being disjoint; such representation is not correct based on the definition of these types of numbers.
Missing domain or range in properties: relationships lacking a domain, a range, or both are included in the ontology. There are situations in which the relation is very general and the range should be the most general concept “Thing”. This pitfall is related to the common error when defining ranges and domains described in .
Missing equivalent properties: when an ontology is imported into another, classes that are duplicated in both ontologies are normally defined as equivalent classes. However, the ontology developer misses the definition of equivalent properties in those cases of duplicated relationships and attributes. For example, the classes “CITY” and “City” in two different ontologies are defined as equivalent classes; however, relationships “hasMember” and “has-Member” in two different ontologies are not defined as equivalent relations.
Missing inverse relationships: there are two relationships in the ontology that should be defined as inverse relations. For example, the case in which the ontology developer omits the inverse definition between the relations “hasLanguageCode” and “isCodeOf”, or between “hasReferee” and “isRefereeOf”.
Misusing “allValuesFrom” : this pitfall can appear in two different ways. In the first, the anomaly is to use the universal restriction (“allValuesFrom”) as the default qualifier instead of using the existential restriction (“someValuesFrom”). This means that the developer thinks that “allValuesFrom” implies “someValuesFrom”. In the second, the mistake is to include “allValuesFrom” to close off the possibility of further additions for a given property.
Misusing “not some” and “some not” : to mistake the representation of “some not” for “not some”, or the other way round. An example of this type of pitfall is to define a vegetarian pizza as any pizza which both has some topping that is not meat and also has some topping that is not fish. This example is explained in more detail in .
Misusing primitive and defined classes : to fail to make the definition “complete” rather than “partial” (or “necessary and sufficient” rather than just “necessary”). It is critical to understand that, in general, nothing will be inferred to be subsumed under a primitive class by the classifier. This pitfall implies that the developer does not understand the open world assumption. A more detailed explanation and examples can be found in .
Specializing too much a hierarchy: the hierarchy in the ontology is specialized in such a way that the final leaves cannot have instances, because they are actually instances and should have been created as such instead of being created as classes. Authors in  provide guidelines for distinguishing between a class and an instance when modeling hierarchies. An example of this type of pitfall is to create the class “RatingOfRestaurants” and the classes “1fork”, “2forks”, and so on, as subclasses instead of as instances. Another example is to create the classes “Madrid”, “Barcelona”, “Sevilla”, and so on as subclasses of “Place”. This pitfall could also be named “Individuals are not Classes”.
Specifying too much the domain or the range [41, 74]: not to find a domain or a range that is general enough. An example of this type of pitfall is to restrict the domain of the relationship “isOfficialLanguage” to the class “City”, instead of allowing also the class “Country” to have an official language or a more general concept such as “GeopoliticalObject”.
Swapping intersection and union: the ranges and/or domains of the properties (relationships and attributes) are defined by intersecting several classes in cases in which the ranges and/or domains should be the union of such classes. An example of this type of pitfall is to create the relationship “takesPlaceIn” with domain “OlympicGames” and with the range the intersection of the classes “City” and “Nation”. This pitfall is related to the common error described in [74, 41].
Swapping Label and Comment: the contents of the Label and Comment annotation properties are swapped. An example of this type of pitfall is to include in the Label annotation of the class “Crossroads” the following sentence “the place of intersection of two or more roads”; and to include in the Comment annotation the word “Crossroads”.
Using a miscellaneous class: to create in a hierarchy a class that contains the instances that do not belong to the sibling classes instead of classifying such instances as instances of the class in the upper level of the hierarchy. This class is normally named “Other” or “Miscellaneous”. An example of this type of pitfall is to create the class “HydrographicalResource”, and the subclasses “Stream”, “Waterfall”, etc., and also the subclass “OtherRiverElement”.
Using different naming criteria in the ontology: no naming convention is used in the identifiers of the ontology elements. Some notions about naming conventions are provided in . For example, we can name a class by starting with upper case, e.g. “Ingredient”, and its subclasses by starting with lower case, e.g. “animalorigin”, “drink”, etc.
Using incorrectly ontology elements: an ontology element (class, relationship or attribute) is used to model a part of the ontology that should be modeled with a different element. A particular case of this pitfall regarding the misuse of classes and property values is addressed in . An example of this type of pitfall is to create the relationship “isEcological” between an instance of “Car” and the instance “Yes” or “No”, instead of creating the attribute “isEcological” whose range is Boolean.
Using recursive definition: an ontology element is used in its own definition. An example of this type of pitfall is to create the relationship “hasFork” and to establish as its range the set of restaurants that have at least one value for the relationship “hasFork”.
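Some of the structural pitfalls in the catalog above lend themselves to simple automated checks. The following is a minimal sketch in Python, assuming a toy ontology encoded as plain dictionaries; the names SUBCLASS_OF, PROPERTIES, and the three functions are hypothetical and are not part of COPri, Protégé, or any pitfall-scanning tool. It illustrates checks for "Including cycles in the hierarchy", "Missing domain or range in properties", and "Creating unconnected ontology elements" (restricted here to classes).

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical toy ontology: each class maps to its direct superclasses.
SUBCLASS_OF: Dict[str, Set[str]] = {
    "Person": {"Professor"},   # deliberate mistake: closes a cycle
    "Professor": {"Person"},
    "Team": set(),
    "City": set(),
    "Crossroads": set(),       # deliberately left unconnected
}

# Hypothetical properties: name -> (domain, range); None means "not declared".
PROPERTIES: Dict[str, Tuple[Optional[str], Optional[str]]] = {
    "memberOfTeam": ("Person", "Team"),
    "isOfficialLanguage": (None, "City"),  # deliberate mistake: no domain
}


def classes_in_cycles(subclass_of: Dict[str, Set[str]]) -> Set[str]:
    """Return the classes that can reach themselves via subclassOf links."""
    def reachable(start: str) -> Set[str]:
        seen: Set[str] = set()
        stack = list(subclass_of.get(start, ()))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(subclass_of.get(node, ()))
        return seen

    return {cls for cls in subclass_of if cls in reachable(cls)}


def properties_missing_domain_or_range(
    properties: Dict[str, Tuple[Optional[str], Optional[str]]]
) -> List[str]:
    """Return the properties whose domain, range, or both are undeclared."""
    return sorted(
        name for name, (dom, rng) in properties.items()
        if dom is None or rng is None
    )


def unconnected_classes(
    subclass_of: Dict[str, Set[str]],
    properties: Dict[str, Tuple[Optional[str], Optional[str]]],
) -> List[str]:
    """Return classes with no subclass link and no use as a domain or range."""
    linked: Set[str] = set()
    for child, parents in subclass_of.items():
        if parents:
            linked.add(child)
            linked.update(parents)
    for dom, rng in properties.values():
        linked.update(c for c in (dom, rng) if c is not None)
    return sorted(set(subclass_of) - linked)


if __name__ == "__main__":
    print("Cycles:", sorted(classes_in_cycles(SUBCLASS_OF)))
    print("Missing domain/range:", properties_missing_domain_or_range(PROPERTIES))
    print("Unconnected:", unconnected_classes(SUBCLASS_OF, PROPERTIES))
```

Pitfalls that depend on the intended meaning of terms, such as polysemous elements or recursive definitions, are not covered by structural checks of this kind and still call for expert review.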
The list of pitfalls identified by experts is shown in Table 5. Each pitfall is described with its identifier (e.g., P1., P2., P7., P17., P21. or P24.), the affected element(s) (e.g., a class or a relationship), and a description of the pitfall, followed by how we addressed it.
|P2.||“A goal is a state of affairs that an actor intends (aims) to achieve”.||The term “intends” might be confused with the term “intends” in the definition of the threat actor|
|We have refined the definition of the goal to address this comment as follows:|
|“A goal is a state of affairs that an actor aims to achieve.”|
|P24.||“Information represents any informational entity without intentionality.”||Information is used in its own definition|
|We have refined the definition of Information as follows:|
|“Information represents a statement provided or learned about something”|
|P2.||“We adopt four different sensitivity levels that range from 1 to 4, where 4 is the most sensitive.”||There is no need to include numerical levels; such information is already conveyed by the natural-language description of the sensitivity levels.|
|We have addressed this comment as follows:|
|“We adopt four different sensitivity levels ordered as (R)estricted, (C)onfidential, (S)ensitive, and Secre(T), where Secre(T) is the most sensitive.”|
|P2.||“Accordingly, we adopt four corresponding categories (we represent as classes) of personal information, namely Restricted, Confidential, Sensitive, and Secret.”||Adding Restricted, Confidential, Sensitive, and Secret subclasses of personal information is not required; such information is already captured by the sensitivity levels of personal information.|
|We have addressed this comment by removing the four subclasses of personal information (i.e., Restricted, Confidential, Sensitive, and Secret).|
|P24.||“Compliant indicates that the purpose for which information is used is compliant with the rules that guarantee the best interest of its owner;” (same for Incompliant)||Compliant/Incompliant are used in their own definitions|
|We have performed the following modifications:|
|“Compatible indicates that the purpose for which information is used is compliant with the rules that guarantee the best interest of its owner”|
|“Incompatible indicates that the purpose for which information is used is not compliant with the rules that guarantee the best interest of its owner”|
|P24.||“Describes is a relationship between information and goal, where information describes the goal while it is pursued by some actor.”||Describes is used in its own definition|
|We have refined the definition of Describes to address this comment as follows:|
|“Describes is a relationship between information and a goal, where information characterizes the goal while it is being pursued by some actor”|
|P24.||“Information provision captures the provision of (provisionOf) information ..”||provision is used in its own definition|
|We have addressed this comment as follows:|
|“Information provision captures the transmission of information ..”|
|P21.||Information, personal and public information||Dividing information into public information and personal information implies that personal information cannot be public, which is not correct (the properties public and personal are not disjoint). I think that the subclasses should be public information and private information|
|We did not make any changes to address this comment, since we believe the concepts we adopt (i.e., public and personal) are appropriate and are widely adopted by privacy researchers.|