Confidentiality of data (also called privacy or secrecy in some contexts) is a major security goal. Releasing data to a querying user without disclosing confidential information has long been investigated in areas like access control, k-anonymity, inference control, and data fragmentation. Such approaches prevent disclosure according to some security policy by restricting data access (denial, refusal), by modifying some data (perturbation, noise addition, cover stories, lying, weakening), or by breaking sensitive associations (fragmentation). Several approaches (like [3, 8, 13, 14, 2, 15]) employ logic-based mechanisms to ensure data confidentiality. In particular, Dix et al. [5] use brave reasoning in default logic theories to solve a privacy problem in a classical database (a set of ground facts). For a non-classical knowledge base (where negation as failure is allowed), Zhao et al. [16] study correctness of access rights. Confidentiality of predicates in collaborative multi-agent abduction is a topic in Ma et al. [10].
In this article we analyze confidentiality-preserving data publishing in a knowledge base setting: data as well as integrity constraints or deduction rules are represented as logical formulas. If such a knowledge base is released to the public for general querying (e.g., microcensus data) or outsourced to a storage provider (e.g., database-as-a-service in cloud computing), confidential data could be disclosed. We assume that users accessing the published knowledge base use a form of credulous (also called brave) reasoning to retrieve data from it; users also possess some invariant “a priori knowledge” that can be applied to these data to deduce further information. On the knowledge base side, a confidentiality policy specifies which information is confidential and must never be disclosed. This paper is one of only a few papers (see [11, 16, 10]) covering confidentiality for logic programs. This formalism, however, has relevance in multi-agent communications where agent knowledge is modeled by logic programs. With extended abduction (see [9, 12]) we obtain a “secure version” of the knowledge base that can safely be published even when a priori knowledge is applied. We show that computing the secure version for a credulous user corresponds to finding a skeptical anti-explanation for all the elements of the confidentiality policy. Extended abduction has been used in different applications, for example to provide a logical framework for dishonest reasoning [11]. It can be solved by computing the answer sets of an update program (see [12]); thus an implementation of extended abduction can profit from current answer set programming (ASP) solvers [4]. To obtain the confidentiality-preserving knowledge base from the input knowledge base, the a priori knowledge and the confidentiality policy, a sequence of transformations is applied; the overall approach is depicted in Figure 1.
In sum, this paper makes the following contributions:
- It formalizes confidentiality-preserving data publishing for a user who retrieves data under a credulous query response semantics.
- It devises a procedure to securely publish a logic program (with an expressiveness up to extended disjunctive logic programs) respecting a subset-minimal change semantics.
- It shows that confidentiality-preservation for credulous users corresponds to finding a skeptical anti-explanation and can be solved by extended abduction.
In the remainder of this article, Section 2 provides background on extended disjunctive logic programs and answer set semantics; Section 3 defines the problem of confidentiality in data publishing; Section 4 recalls extended abduction and update programs; Section 5 shows how answer sets of update programs correspond to confidentiality-preserving knowledge bases; and Section 6 gives some discussion and concluding remarks.
2 EDPs and answer set semantics
In this article, a knowledge base $K$ is represented by an extended disjunctive logic program (EDP) – a set of formulas called rules of the form:
$$L_1 ; \ldots ; L_l \leftarrow L_{l+1}, \ldots, L_m, \mathit{not}\, L_{m+1}, \ldots, \mathit{not}\, L_n \qquad (n \geq m \geq l \geq 0)$$
A rule contains literals $L_i$, disjunction “;”, conjunction “,”, negation as failure “$\mathit{not}$”, and material implication “$\leftarrow$”. A literal is a first-order atom or an atom preceded by classical negation “$\neg$”. $\mathit{not}\, L$ is called a NAF-literal. The disjunction left of the implication $\leftarrow$ is called the head, while the conjunction right of $\leftarrow$ is called the body of the rule. For a rule $R$, we write $\mathit{head}(R)$ to denote the set of literals $\{L_1, \ldots, L_l\}$ and $\mathit{body}(R)$ to denote the set of (NAF-)literals $\{L_{l+1}, \ldots, L_m, \mathit{not}\, L_{m+1}, \ldots, \mathit{not}\, L_n\}$. Rules consisting only of a singleton head $L \leftarrow$ are identified with the literal $L$ and used interchangeably. An EDP is ground if it contains no variables. If an EDP contains variables, it is identified with the set of its ground instantiations: the elements of its Herbrand universe are substituted in for the variables in all possible ways. We assume that the language contains no function symbol, so that each rule with variables represents a finite set of ground rules. For a program $K$, we denote by $\mathit{Lit}_K$ the set of ground literals in the language of $K$. Note that EDPs offer a high expressiveness including disjunctive and non-monotonic reasoning.
In a medical knowledge base states that a patient is ill with disease ;
states that is treated with medicine .
Assume that if you read the record and find that one treatment (Medi1) is recorded and another one (Medi2) is not recorded, then you know that the patient is at least ill with Aids or Flu (and possibly has other illnesses).
serves as a running example.
The semantics of $K$ can be given by the answer set semantics [7]: A set $S \subseteq \mathit{Lit}_K$ of ground literals satisfies a ground literal $L$ if $L \in S$; $S$ satisfies a conjunction if it satisfies every conjunct; $S$ satisfies a disjunction if it satisfies at least one disjunct; $S$ satisfies a ground rule $L_1 ; \ldots ; L_l \leftarrow L_{l+1}, \ldots, L_m, \mathit{not}\, L_{m+1}, \ldots, \mathit{not}\, L_n$ if whenever the body literals are contained in $S$ ($\{L_{l+1}, \ldots, L_m\} \subseteq S$) and all NAF-literals are not contained in $S$ ($\{L_{m+1}, \ldots, L_n\} \cap S = \emptyset$), then at least one head literal is contained in $S$ ($L_i \in S$ for an $i$ such that $1 \leq i \leq l$). If an EDP $K$ contains no NAF-literals ($m = n$), then such a set $S$ is an answer set of $K$ if $S$ is a subset-minimal set such that
- $S$ satisfies every rule from the ground instantiation of $K$,
- if $S$ contains a pair of complementary literals $L$ and $\neg L$, then $S = \mathit{Lit}_K$.
This definition of an answer set can be extended to full EDPs (containing NAF-literals) as in [7]: For an EDP $K$ and a set of ground literals $S \subseteq \mathit{Lit}_K$, $K$ can be transformed into a NAF-free program $K^S$ as follows. For every ground rule from the ground instantiation of $K$ (with respect to its Herbrand universe), the rule $L_1 ; \ldots ; L_l \leftarrow L_{l+1}, \ldots, L_m$ is in $K^S$ if $\{L_{m+1}, \ldots, L_n\} \cap S = \emptyset$. Then, $S$ is an answer set of $K$ if $S$ is an answer set of $K^S$. An answer set is consistent if it is not $\mathit{Lit}_K$. A program $K$ is consistent if it has a consistent answer set; otherwise $K$ is inconsistent.
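The reduct-based definition can be checked by brute force on small ground programs. The following Python sketch is only an illustration and not part of the formalism: it assumes rules are triples `(head, pos, naf)` of sets of ground literals written as strings, with a leading `-` standing for classical negation. The patient names `mary` and `pete` and the concrete facts are hypothetical, chosen in the spirit of the medical running example. Only consistent answer sets are enumerated.

```python
from itertools import combinations

def rule(head, pos=(), naf=()):
    """A ground rule: head literals, positive body, NAF body."""
    return (frozenset(head), frozenset(pos), frozenset(naf))

def complement(lit):
    """Classical negation on string literals: a <-> -a."""
    return lit[1:] if lit.startswith("-") else "-" + lit

def subsets(items):
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_model(s, naf_free):
    """s satisfies every NAF-free rule (head, pos)."""
    return all(not pos <= s or head & s for head, pos in naf_free)

def answer_sets(program):
    """Consistent answer sets of a ground EDP, by exhaustive search."""
    # Literals that occur in no head can never be in a minimal model.
    heads = frozenset().union(*(h for h, _, _ in program))
    found = []
    for s in subsets(heads):
        if any(complement(lit) in s for lit in s):
            continue  # skip sets containing complementary literals
        # Reduct: drop rules blocked by s, strip NAF-literals from the rest.
        reduct = [(h, p) for h, p, n in program if not n & s]
        if is_model(s, reduct) and not any(
                t != s and is_model(t, reduct) for t in subsets(s)):
            found.append(s)
    return found

# Hypothetical instance (names and facts are assumptions of this sketch).
k = [
    rule(["ill(pete,aids)", "ill(pete,flu)"],
         pos=["treat(pete,medi1)"], naf=["treat(pete,medi2)"]),
    rule(["ill(mary,aids)"]),
    rule(["treat(pete,medi1)"]),
]
sets_k = answer_sets(k)
sets_k_neg = answer_sets(k + [rule(["-ill(pete,flu)"])])
```

On this hypothetical instance the search yields two answer sets for `k` (one per disjunct in the rule head) and a single consistent one once the negative fact is added.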
The example has the following two consistent answer sets
When adding the negative fact to , then there is just one consistent answer set left: for the unique answer set is
If a rule $R$ is satisfied in every answer set of $K$, we write $K \models R$. In particular, $K \models L$ if a literal $L$ is included in every answer set of $K$.
3 Confidentiality-Preserving Knowledge Bases
When publishing a knowledge base $K^{pub}$ while preserving confidentiality of some data in $K$ we do this according to
- the query response semantics that a user querying the published knowledge base applies; we focus on credulous query response semantics;
- a confidentiality policy (denoted $\mathit{policy}$) describing confidential information that should not be released to the public;
- background (a priori) knowledge (denoted $\mathit{prior}$) that a user can combine with query responses from the published knowledge base.
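First we define the credulous query response semantics: a ground formula $Q$ is true in $K$, if $Q$ is satisfied in some answer set of $K$ – that is, there might be answer sets that do not satisfy $Q$. If a rule $Q(X)$ is non-ground and contains some free variables $X$, the credulous response of $K$ is the set of ground instantiations of $Q(X)$ that are true in $K$.
Definition 1 (Credulous query response semantics)
Let $U$ be the Herbrand universe of a consistent knowledge base $K$. The credulous query responses of formula $Q(X)$ (with a vector $X$ of free variables) in $K$ are
$$\mathit{cred}(K, Q(X)) = \{\, Q(A) \mid A \text{ is a vector of elements } a \in U \text{ and there is an answer set of } K \text{ that satisfies } Q(A) \,\}$$
In particular, for a ground formula $Q$,
$$\mathit{cred}(K, Q) = \begin{cases} \{Q\} & \text{if } K \text{ has an answer set that satisfies } Q \\ \emptyset & \text{otherwise} \end{cases}$$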
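As a rough illustration of Definition 1, the following Python sketch computes credulous responses by enumerating answer sets exhaustively. It is a sketch under stated assumptions: queries are conjunctions of (NAF-)literals with a single free variable, given as a hypothetical function `query(c)` that returns the ground instance for a constant `c`; the function name `cred`, the rule encoding and the patient names are likewise assumptions. Instead of the ground formulas themselves it returns the constants whose instance holds in some answer set.

```python
from itertools import combinations

def rule(head, pos=(), naf=()):
    return (frozenset(head), frozenset(pos), frozenset(naf))

def complement(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def subsets(items):
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_model(s, naf_free):
    return all(not pos <= s or head & s for head, pos in naf_free)

def answer_sets(program):
    """Consistent answer sets of a ground EDP (exhaustive search)."""
    heads = frozenset().union(*(h for h, _, _ in program))
    found = []
    for s in subsets(heads):
        if any(complement(lit) in s for lit in s):
            continue
        reduct = [(h, p) for h, p, n in program if not n & s]
        if is_model(s, reduct) and not any(
                t != s and is_model(t, reduct) for t in subsets(s)):
            found.append(s)
    return found

def holds(s, pos, naf):
    """s satisfies the ground conjunction of pos literals and NAF-literals."""
    return frozenset(pos) <= s and not frozenset(naf) & s

def cred(program, query, universe):
    """Constants c such that query(c) holds in at least one answer set."""
    sets = answer_sets(program)
    return {c for c in universe if any(holds(s, *query(c)) for s in sets)}

# Hypothetical instance (names and facts are assumptions of this sketch).
k = [
    rule(["ill(pete,aids)", "ill(pete,flu)"],
         pos=["treat(pete,medi1)"], naf=["treat(pete,medi2)"]),
    rule(["ill(mary,aids)"]),
    rule(["treat(pete,medi1)"]),
]
people = ["mary", "pete"]
aids = cred(k, lambda x: ([f"ill({x},aids)"], []), people)
flu = cred(k, lambda x: ([f"ill({x},flu)"], []), people)
only_medi1 = cred(k, lambda x: ([f"treat({x},medi1)"], [f"treat({x},medi2)"]), people)
```

Note that `pete` appears in both `aids` and `flu`: each disease holds in a different answer set, which is exactly what makes the credulous user harder to protect against than a skeptical one.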
It is usually assumed that, in addition to the query responses, a user has some additional knowledge that he can apply to the query responses. Hence, we additionally assume given a set of rules as some invariant a priori knowledge $\mathit{prior}$. Without loss of generality we assume that $\mathit{prior}$ is an EDP. Thus, the a priori knowledge may consist of additional facts that the user assumes to hold in $K$, or some rules that the user can apply to data in $K$ to deduce new information.
A confidentiality policy $\mathit{policy}$ specifies confidential information. We assume that $\mathit{policy}$ contains only conjunctions of (NAF-)literals. However, see Section 5.1 for a brief discussion on how to use more expressive policy formulas. We must not only ensure that the published knowledge base contains no confidential information but also prevent the user from deducing confidential information with the help of his a priori knowledge; this is known as the inference problem [6, 2].
If we wish to declare the disease aids as confidential for any patient we can do this with . A user querying might know that a person suffering from flu is not able to work. Hence . If we wish to also declare a lack of work ability as confidential, we can add this to the confidentiality policy: .
Next, we establish a definition of confidentiality-preservation that allows for the answer set semantics as an inference mechanism and respects the credulous query response semantics: when treating elements of the confidentiality policy as queries, the credulous responses must be empty.
Definition 2 (Confidentiality-preservation for credulous user)
A knowledge base $K^{pub}$ preserves confidentiality of a given confidentiality policy $\mathit{policy}$ under the credulous query response semantics and with respect to a given a priori knowledge $\mathit{prior}$, if for every conjunction $C(X)$ in the policy, the credulous query responses of $C(X)$ in $K^{pub} \cup \mathit{prior}$ are empty: $\mathit{cred}(K^{pub} \cup \mathit{prior}, C(X)) = \emptyset$.
Note that in this definition the Herbrand universe of $K^{pub} \cup \mathit{prior}$ is applied in the query response semantics; hence, free variables in policy elements are instantiated according to this universe. Note also that $K^{pub} \cup \mathit{prior}$ must be consistent. Confidentiality-preservation for skeptical query response semantics is a topic of future work.
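Definition 2 can be tested directly on small ground instances. The sketch below is an illustration under assumptions, not the procedure developed in this article: the function name `preserves`, the rule encoding, the literal `-able(pete)` (standing for "Pete is not able to work") and all patient names are hypothetical, and policy elements are assumed to be already grounded.

```python
from itertools import combinations

def rule(head, pos=(), naf=()):
    return (frozenset(head), frozenset(pos), frozenset(naf))

def complement(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def subsets(items):
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_model(s, naf_free):
    return all(not pos <= s or head & s for head, pos in naf_free)

def answer_sets(program):
    """Consistent answer sets of a ground EDP (exhaustive search)."""
    heads = frozenset().union(*(h for h, _, _ in program))
    found = []
    for s in subsets(heads):
        if any(complement(lit) in s for lit in s):
            continue
        reduct = [(h, p) for h, p, n in program if not n & s]
        if is_model(s, reduct) and not any(
                t != s and is_model(t, reduct) for t in subsets(s)):
            found.append(s)
    return found

def preserves(k_pub, prior, policy):
    """True iff k_pub + prior is consistent and no answer set satisfies
    any ground policy conjunction (pos, naf)."""
    sets = answer_sets(k_pub + prior)
    if not sets:
        return False  # the published program plus prior must be consistent
    return not any(frozenset(pos) <= s and not frozenset(naf) & s
                   for s in sets for pos, naf in policy)

# Hypothetical instance (all names are assumptions of this sketch).
r = rule(["ill(pete,aids)", "ill(pete,flu)"],
         pos=["treat(pete,medi1)"], naf=["treat(pete,medi2)"])
mary, t1, t2 = rule(["ill(mary,aids)"]), rule(["treat(pete,medi1)"]), rule(["treat(pete,medi2)"])
prior = [rule(["-able(pete)"], pos=["ill(pete,flu)"])]
policy = [(["ill(mary,aids)"], []), (["ill(pete,aids)"], []), (["-able(pete)"], [])]

original_ok = preserves([r, mary, t1], prior, policy)
without_mary_ok = preserves([r, t1], prior, policy)
only_rule_ok = preserves([r], prior, policy)
with_insertion_ok = preserves([r, t1, t2], prior, policy)
```

Deleting only Mary's record is not enough here, because the rule still yields an answer set with Pete's confidential diagnosis; blocking the rule, by deleting or by inserting a treatment fact, closes that channel.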
A goal secondary to confidentiality-preservation is minimal change: We want to publish as much data as possible and want to modify these data as little as possible. Different notions of minimal change are used in the literature (see for example [1] for a collection of minimal change semantics in a data integration setting). We apply a subset-minimal change semantics: we choose a $K^{pub}$ that differs from $K$ only subset-minimally. In other words, there is no other confidentiality-preserving knowledge base which inserts (or deletes) fewer rules to (from) $K$ than $K^{pub}$.
Definition 3 (Subset-minimal change)
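A confidentiality-preserving knowledge base $K^{pub}$ subset-minimally changes $K$ (or is minimal, for short) if there is no confidentiality-preserving knowledge base $K'$ such that $((K \setminus K') \cup (K' \setminus K)) \subset ((K \setminus K^{pub}) \cup (K^{pub} \setminus K))$.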
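For intuition, subset-minimal solutions can be found by exhaustive search on tiny instances. The following Python sketch restricts itself to deletions (insertions are ignored, which is an assumption of the sketch) and identifies rules by their list index; the helper names and the example facts are hypothetical. It is a naive baseline for comparison, not the abductive procedure of the later sections.

```python
from itertools import combinations

def rule(head, pos=(), naf=()):
    return (frozenset(head), frozenset(pos), frozenset(naf))

def complement(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def subsets(items):
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_model(s, naf_free):
    return all(not pos <= s or head & s for head, pos in naf_free)

def answer_sets(program):
    """Consistent answer sets of a ground EDP (exhaustive search)."""
    heads = frozenset().union(*(h for h, _, _ in program))
    found = []
    for s in subsets(heads):
        if any(complement(lit) in s for lit in s):
            continue
        reduct = [(h, p) for h, p, n in program if not n & s]
        if is_model(s, reduct) and not any(
                t != s and is_model(t, reduct) for t in subsets(s)):
            found.append(s)
    return found

def preserves(k_pub, prior, policy):
    sets = answer_sets(k_pub + prior)
    return bool(sets) and not any(
        frozenset(pos) <= s and not frozenset(naf) & s
        for s in sets for pos, naf in policy)

def minimal_deletions(k, prior, policy):
    """Index sets of rules whose deletion is confidentiality-preserving
    and subset-minimal (deletions only)."""
    solutions = []
    for size in range(len(k) + 1):          # smaller deletion sets first
        for removed in combinations(range(len(k)), size):
            removed = frozenset(removed)
            if any(sol <= removed for sol in solutions):
                continue                     # a subset already suffices
            k_pub = [r for i, r in enumerate(k) if i not in removed]
            if preserves(k_pub, prior, policy):
                solutions.append(removed)
    return solutions

# Hypothetical instance: index 0 is the rule, 1 and 2 are facts.
k = [
    rule(["ill(pete,aids)", "ill(pete,flu)"],
         pos=["treat(pete,medi1)"], naf=["treat(pete,medi2)"]),
    rule(["ill(mary,aids)"]),
    rule(["treat(pete,medi1)"]),
]
policy = [(["ill(mary,aids)"], []), (["ill(pete,aids)"], [])]
solutions = minimal_deletions(k, [], policy)
```

Because candidates are visited in order of increasing size, any candidate that contains an earlier solution is non-minimal and can be skipped, so the returned list contains exactly the subset-minimal deletion sets.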
For the example and and no a priori knowledge, the fact has to be deleted. But also can be deduced credulously, because it is satisfied by answer set . In order to avoid this, we have three options: delete , delete the non-literal rule in or insert . The same solutions are found for , and : they block the credulous deduction of . The same applies to and .
In the following sections we obtain a minimal solution $K^{pub}$ for a given input $K$, $\mathit{prior}$ and $\mathit{policy}$ by transforming the input into a problem of extended abduction and solving it with an appropriate update program.
4 Extended Abduction
Traditionally, given a knowledge base $K$ and an observation formula $O$, abduction finds a “(positive) explanation” $E$ – a set of hypothesis formulas – such that every answer set of the knowledge base and the explanation together satisfy the observation; that is, $K \cup E \models O$. Going beyond that, [9, 12] use extended abduction with the notions of “negative observations”, “negative explanations” $F$ and “anti-explanations”. An abduction problem in general can be restricted by specifying a designated set $\mathcal{A}$ of abducibles. This set poses syntactical restrictions on the explanation sets $E$ and $F$. In particular, positive explanations are characterized by $E \subseteq \mathcal{A} \setminus K$ and negative explanations by $F \subseteq K \cap \mathcal{A}$. If $\mathcal{A}$ contains a formula with variables, it is meant as a shorthand for all ground instantiations of the formula. In this sense, an EDP $K$ accompanied by an EDP $\mathcal{A}$ is called an abductive program written as $\langle K, \mathcal{A} \rangle$. The aim of extended abduction is then to find (anti-)explanations as follows (where in this article only skeptical (anti-)explanations are needed):
- given a positive observation $O$, find a pair $(E, F)$ where $E$ is a positive explanation and $F$ is a negative explanation such that
  - [skeptical explanation] $O$ is satisfied in every answer set of $(K \setminus F) \cup E$; that is, $(K \setminus F) \cup E \models O$
  - [consistency] $(K \setminus F) \cup E$ is consistent
- given a negative observation $O$, find a pair $(E, F)$ where $E$ is a positive anti-explanation and $F$ is a negative anti-explanation such that
  - [skeptical anti-explanation] there is no answer set of $(K \setminus F) \cup E$ in which $O$ is satisfied
  - [consistency] $(K \setminus F) \cup E$ is consistent
Among (anti-)explanations, minimal (anti-)explanations characterize a subset-minimal alteration of the program $K$: an (anti-)explanation $(E, F)$ of an observation $O$ is called minimal if for any (anti-)explanation $(E', F')$ of $O$, $E' \subseteq E$ and $F' \subseteq F$ imply $E' = E$ and $F' = F$.
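The two notions above can be checked mechanically for a candidate pair of insertions and deletions. The following Python sketch assumes ground programs, observations that are single ground literals, and an exhaustive answer set search; the function names and the example instance are hypothetical and only meant to make the definitions concrete.

```python
from itertools import combinations

def rule(head, pos=(), naf=()):
    return (frozenset(head), frozenset(pos), frozenset(naf))

def complement(lit):
    return lit[1:] if lit.startswith("-") else "-" + lit

def subsets(items):
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_model(s, naf_free):
    return all(not pos <= s or head & s for head, pos in naf_free)

def answer_sets(program):
    """Consistent answer sets of a ground EDP (exhaustive search)."""
    heads = frozenset().union(*(h for h, _, _ in program))
    found = []
    for s in subsets(heads):
        if any(complement(lit) in s for lit in s):
            continue
        reduct = [(h, p) for h, p, n in program if not n & s]
        if is_model(s, reduct) and not any(
                t != s and is_model(t, reduct) for t in subsets(s)):
            found.append(s)
    return found

def apply_change(k, inserted, deleted):
    """Delete the rules in `deleted` from k, then add those in `inserted`."""
    return [r for r in k if r not in deleted] + [r for r in inserted if r not in k]

def is_skeptical_explanation(k, inserted, deleted, obs):
    sets = answer_sets(apply_change(k, inserted, deleted))
    return bool(sets) and all(obs in s for s in sets)      # consistent, obs everywhere

def is_skeptical_anti_explanation(k, inserted, deleted, obs):
    sets = answer_sets(apply_change(k, inserted, deleted))
    return bool(sets) and not any(obs in s for s in sets)  # consistent, obs nowhere

# Hypothetical instance (names and facts are assumptions of this sketch).
t1, t2 = rule(["treat(pete,medi1)"]), rule(["treat(pete,medi2)"])
k = [
    rule(["ill(pete,aids)", "ill(pete,flu)"],
         pos=["treat(pete,medi1)"], naf=["treat(pete,medi2)"]),
    rule(["ill(mary,aids)"]),
    t1,
]
by_deletion = is_skeptical_anti_explanation(k, [], [t1], "ill(pete,aids)")
by_insertion = is_skeptical_anti_explanation(k, [t2], [], "ill(pete,aids)")
no_change = is_skeptical_anti_explanation(k, [], [], "ill(pete,aids)")
mary_explained = is_skeptical_explanation(k, [], [], "ill(mary,aids)")
flu_explained = is_skeptical_explanation(k, [], [], "ill(pete,flu)")
```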
For an abductive program both and are semantically identified with their ground instantiations with respect to the Herbrand universe, so that set operations over them are defined on the ground instances. Thus, when contain formulas with variables, means deleting every instance of formulas in , and inserting any instance of formulas in from/into . When contains formulas with variables, the set inclusion is defined for any set of instances of formulas in . Generally, given sets and of literals/rules containing variables, any set operation is defined as where is the ground instantiation of . For example, when , for any constant occurring in , it holds that , , and , etc. Moreover, any literal/rule in a set is identified with its variants modulo variable renaming.
4.1 Normal form
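Although extended abduction can handle the very general format of EDPs, some syntactic transformations are helpful. Based on [12] we will briefly describe how a semantically equivalent normal form of an abductive program $\langle K, \mathcal{A} \rangle$ is obtained – where both the program $K$ and the set $\mathcal{A}$ of abducibles are EDPs. This makes an automatic handling of abductive programs easier; for example, abductive programs in normal form can be easily transformed into update programs as described in Section 4.2. The main step is that rules in $\mathcal{A}$ can be mapped to atoms by a naming function $n$. Let $\mathcal{R}$ be the set of abducible rules:
$$\mathcal{R} = \{\, \Sigma \leftarrow \Gamma \mid (\Sigma \leftarrow \Gamma) \in \mathcal{A} \text{ and } (\Sigma \leftarrow \Gamma) \text{ is not a literal} \,\}$$
Then the normal form $\langle K^n, \mathcal{A}^n \rangle$ is defined as follows where $n$ maps each rule $R$ to a fresh atom $n(R)$ with the same free variables as $R$:
$$K^n = (K \setminus \mathcal{R}) \cup \{\, \Sigma \leftarrow \Gamma, n(R) \mid R = (\Sigma \leftarrow \Gamma) \in \mathcal{R} \,\} \cup \{\, n(R) \mid R \in K \cap \mathcal{R} \,\}, \qquad \mathcal{A}^n = (\mathcal{A} \setminus \mathcal{R}) \cup \{\, n(R) \mid R \in \mathcal{R} \,\}$$
We define that any abducible literal $L$ has the name $L$, i.e., $n(L) = L$. It is shown in [12] that for any observation $O$ there is a 1-1 correspondence between (anti-)explanations with respect to $\langle K, \mathcal{A} \rangle$ and those with respect to $\langle K^n, \mathcal{A}^n \rangle$. That is, for $n(E) = \{ n(R) \mid R \in E \}$ and $n(F) = \{ n(R) \mid R \in F \}$: an observation $O$ has a (minimal) skeptical (anti-)explanation $(E, F)$ with respect to $\langle K, \mathcal{A} \rangle$ iff $O$ has a (minimal) skeptical (anti-)explanation $(n(E), n(F))$ with respect to $\langle K^n, \mathcal{A}^n \rangle$. Hence, insertion (deletion) of a rule’s name in the normal form corresponds to insertion (deletion) of the rule in the original program. In sum, with the normal form transformation, any abductive program with abducible rules is reduced to an abductive program with only abducible literals.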
We transform the example knowledge base into its normal form based on a set of abducibles that is identical to : that is ; a similar setting will be used in Section 5.2 to achieve deletion of formulas from . Hence we transform into its normal form as follows where we write for the naming atom of the only rule in :
4.2 Update programs
Minimal (anti-)explanations can be computed with update programs (UPs) [12]. The update-minimal (U-minimal) answer sets of a UP describe which rules have to be deleted from the program, and which rules have to be inserted into the program, in order to (un-)explain an observation.
For the given EDP $K$ and a given set of abducibles $\mathcal{A}$, a set of update rules $UR$ is devised that describe how entries of $K$ can be changed. This is done with the following three types of rules.
- [Abducible rules] The rules for abducible literals state that an abducible is either true in $K$ or not. For each $L \in \mathcal{A}$, a new atom $\bar{L}$ is introduced that has the same variables as $L$. Then the set of abducible rules for each $L$ is defined as $\mathit{abd}(L) = \{\, L \leftarrow \mathit{not}\, \bar{L},\ \ \bar{L} \leftarrow \mathit{not}\, L \,\}$.
- [Insertion rules] Abducible literals that are not contained in $K$ might be inserted into $K$ and hence might occur in the set $E$ of the explanation $(E, F)$. For each $L \in \mathcal{A} \setminus K$, a new atom $+L$ is introduced and the insertion rule is defined as $+L \leftarrow L$.
- [Deletion rules] Abducible literals that are contained in $K$ might be deleted from $K$ and hence might occur in the set $F$ of the explanation $(E, F)$. For each $L \in \mathcal{A} \cap K$, a new atom $-L$ is introduced and the deletion rule is defined as $-L \leftarrow \mathit{not}\, L$.
The update program is then defined by replacing abducible literals in $K$ with the update rules; that is, $UP = (K \setminus \mathcal{A}) \cup UR$.
Continuing Example 5, from we obtain
The set of atoms $+L$ is the set $\mathcal{UA}^+$ of positive update atoms; the set of atoms $-L$ is the set $\mathcal{UA}^-$ of negative update atoms. The set of update atoms is $\mathcal{UA} = \mathcal{UA}^+ \cup \mathcal{UA}^-$. From all answer sets of an update program $UP$ we can identify those that are update minimal (U-minimal): they contain fewer update atoms than others. Thus, $S$ is U-minimal iff there is no answer set $T$ such that $T \cap \mathcal{UA} \subset S \cap \mathcal{UA}$.
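To make the construction of update programs tangible, the following Python sketch builds update rules for ground abducible literals and filters U-minimal answer sets. The concrete rule shapes are an assumption based on the update-program construction of [12]: a choice between an abducible `a` and a fresh atom `bar(a)`, an insertion atom `ins(a)` derived when a missing abducible is present, and a deletion atom `del(a)` derived when an existing abducible is absent. The atom spellings, function names, example facts and the two constraints standing in for a goal are all hypothetical.

```python
from itertools import combinations

def rule(head, pos=(), naf=()):
    return (frozenset(head), frozenset(pos), frozenset(naf))

def subsets(items):
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_model(s, naf_free):
    return all(not pos <= s or head & s for head, pos in naf_free)

def answer_sets(program):
    """Answer sets of a ground program without classical negation."""
    heads = frozenset().union(*(h for h, _, _ in program))
    found = []
    for s in subsets(heads):
        reduct = [(h, p) for h, p, n in program if not n & s]
        if is_model(s, reduct) and not any(
                t != s and is_model(t, reduct) for t in subsets(s)):
            found.append(s)
    return found

def update_program(k, abducibles):
    """Replace abducible facts of k by update rules (assumed shapes)."""
    up = [r for r in k if r not in {rule([a]) for a in abducibles}]
    for a in sorted(abducibles):
        up += [rule([a], naf=[f"bar({a})"]), rule([f"bar({a})"], naf=[a])]
        if rule([a]) in k:
            up.append(rule([f"del({a})"], naf=[a]))   # a was in k and is now absent
        else:
            up.append(rule([f"ins({a})"], pos=[a]))   # a was not in k and is now present
    return up

def update_atoms(s):
    return frozenset(l for l in s if l.startswith(("ins(", "del(")))

def u_minimal(sets):
    """Answer sets whose update atoms are subset-minimal."""
    return [s for s in sets
            if not any(update_atoms(t) < update_atoms(s) for t in sets)]

# Hypothetical instance; the constraints forbid any derived illness for pete.
pa, pf = "ill(pete,aids)", "ill(pete,flu)"
t1, t2 = "treat(pete,medi1)", "treat(pete,medi2)"
k = [rule([pa, pf], pos=[t1], naf=[t2]), rule(["ill(mary,aids)"]), rule([t1])]
up = update_program(k, {t1, t2})
all_sets = answer_sets(up)
goal = [rule([], pos=[pa]), rule([], pos=[pf])]
changes = {update_atoms(s) for s in u_minimal(answer_sets(up + goal))}
```

In this toy instance the U-minimal answer sets propose either deleting the first treatment fact or inserting the second one, while the answer set that does both is discarded as non-minimal.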
4.3 Ground observations
It is shown in [12] how in some situations the observation formulas can be mapped to new positive ground observations. Non-ground atoms with variables can be mapped to a new ground observation. Several positive observations can be conjoined and mapped to a new ground observation. A negative observation (for which an anti-explanation is sought) can be mapped as a NAF-literal to a new positive observation (for which then an explanation has to be found). Moreover, several negative observations can be mapped as a conjunction of NAF-literals to one new positive observation such that its resulting explanation acts as an anti-explanation for all negative observations together. Hence, in extended abduction it is usually assumed that the observation is a positive ground observation for which an explanation has to be found. In case of finding a skeptical explanation, an inconsistency check has to be made on the resulting knowledge base. Transformations to a ground observation and inconsistency check will be detailed in Section 5.1 and applied to confidentiality-preservation.
5 Confidentiality-Preservation with UPs
We now show how to achieve confidentiality-preservation by extended abduction: we define the set of abducibles and describe how a confidentiality-preserving knowledge base can be obtained by computing U-minimal answer sets of the appropriate update program. We additionally distinguish between the case that we allow only deletions of formulas – that is, in the anti-explanation the set of positive anti-explanation formulas is empty – and the case that we also allow insertions.
5.1 Policy transformation for credulous users
Elements of the confidentiality policy will be treated as negative observations for which an anti-explanation has to be found. Accordingly, we will transform policy elements to a set of rules containing new positive observations as sketched in Section 4.3. We will call these rules policy transformation rules for credulous users ($PTR^{cred}$).
More formally, assume $\mathit{policy}$ contains $k$ elements. For each conjunction $C_i \in \mathit{policy}$ ($i = 1, \ldots, k$), we introduce a new negative ground observation $O_i^-$ and map $C_i$ to $O_i^-$. As each $C_i$ is a conjunction of (NAF-)literals, the resulting formula is an EDP rule. As a last policy transformation rule, we add one that maps all new negative ground observations $O_i^-$ (in their NAF version) to a positive observation $O^+$. Hence, the set of policy transformation rules for $\mathit{policy}$ is
$$PTR^{cred} = \{\, O_i^- \leftarrow C_i \mid C_i \in \mathit{policy} \,\} \cup \{\, O^+ \leftarrow \mathit{not}\, O_1^-, \ldots, \mathit{not}\, O_k^- \,\}$$
Lastly, we consider a goal rule $GR$ that enforces the single positive observation $O^+$: $GR = \{\, \leftarrow \mathit{not}\, O^+ \,\}$.
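The policy transformation described above is purely syntactic and easy to script. The Python sketch below generates one rule per policy element, the rule deriving the positive observation, and the goal constraint, and prints them in an ASP-like syntax. The atom names `o_minus_i` and `o_plus`, the printing format and the two policy elements are assumptions of the sketch; policy elements are given as already grounded conjunctions.

```python
def rule(head, pos=(), naf=()):
    """A ground rule: head literals, positive body, NAF body."""
    return (frozenset(head), frozenset(pos), frozenset(naf))

def policy_transformation(policy):
    """policy: list of ground conjunctions (pos, naf).
    Returns the policy transformation rules and the goal rule."""
    names = [f"o_minus_{i}" for i in range(1, len(policy) + 1)]
    # One rule per policy element: the element implies its negative observation.
    rules = [rule([name], pos, naf) for name, (pos, naf) in zip(names, policy)]
    # The positive observation holds iff no negative observation is derived.
    rules.append(rule(["o_plus"], naf=names))
    # The goal rule is a constraint enforcing the positive observation.
    goal = rule([], naf=["o_plus"])
    return rules, goal

def show(r):
    """Render a rule in an ASP-like textual syntax."""
    head, pos, naf = r
    body = sorted(pos) + ["not " + lit for lit in sorted(naf)]
    text = " ; ".join(sorted(head))
    if body:
        text += " :- " + ", ".join(body)
    return text.strip() + "."

# Hypothetical policy: an illness fact and a classically negated literal.
policy = [(["ill(pete,aids)"], []), (["-able(pete)"], [])]
ptr, goal = policy_transformation(policy)
lines = [show(r) for r in ptr] + [show(goal)]
```

The rendered rules can be pasted into an ASP solver together with the update program; whether a given solver accepts this exact surface syntax is not checked here.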
We can also allow more expressive policy elements in disjunctive normal form (DNF: a disjunction of conjunctions of (NAF-)literals). If we map a DNF formula $C_1 \vee \ldots \vee C_l$ to a new observation $O_{disj}^-$ (that is, $O_{disj}^- \leftarrow C_1 \vee \ldots \vee C_l$) this is equivalent to mapping each conjunction $C_j$ to the observation (that is, $O_{disj}^- \leftarrow C_1, \ \ldots, \ O_{disj}^- \leftarrow C_l$). We also semantically justify this splitting into disjuncts by arguing that in order to protect confidentiality of a disjunctive formula we indeed have to protect each disjunct alone. However, if variables are shared among disjuncts, these variables have to be grounded according to the Herbrand universe first; otherwise the shared semantics of these variables is lost.
5.2 Deletions for credulous users
As a simplified setting, we first of all assume that only deletions are allowed to achieve confidentiality-preservation. This setting can informally be described as follows: For a given knowledge base $K$, if we only allow deletions of rules from $K$, we have to find a skeptical negative explanation $F$ that explains the new positive observation $O^+$ while respecting $\mathit{prior}$ as invariable a priori knowledge. The set of abducibles is thus identical to $K$ as we want to choose formulas from $K$ for deletion: $\mathcal{A} = K$. That is, in total we consider the abductive program $\langle K, \mathcal{A} \rangle$. Then, we transform it into normal form $\langle K^n, \mathcal{A}^n \rangle$, and compute its update program $UP$ as described in Section 4.2. As for $\mathit{prior}$, we add this set to the update program $UP$ in order to make sure that the resulting answer sets of the update program do not contradict $\mathit{prior}$. Finally, we add all the policy transformation rules $PTR^{cred}$ and the goal rule $GR$. The goal rule is then meant as a constraint that retains only those answer sets in which $O^+$ is true. We thus obtain a new program $P$ as
$$P = UP \cup \mathit{prior} \cup PTR^{cred} \cup GR$$
and compute its U-minimal answer sets. If $S$ is one of these answer sets, the negative explanation $F$ is obtained from the negative update atoms contained in $S$: $F = \{\, L \mid -L \in S \,\}$.
To obtain a confidentiality-preserving knowledge base for a credulous user, we have to check for inconsistency with the negation of the positive observation $O^+$ (which makes $F$ a skeptical explanation of $O^+$); and allow only answer sets of $P$ that are U-minimal among those respecting this inconsistency property. More precisely, we check whether
$$(K \setminus F) \cup \mathit{prior} \cup PTR^{cred} \cup \{\, \leftarrow O^+ \,\} \text{ is inconsistent.} \qquad (1)$$
We combine the update program of with and the policy transformation rules and goal rule. This leads to the following two U-minimal answer sets with only deletions which satisfy the inconsistency property (1):
These answer sets correspond to the minimal solutions from Example 4 where must be deleted together with either or the rule named .
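The deletion-only procedure of this section can be prototyped end to end on a toy instance. The Python sketch below is a brute-force illustration under several assumptions: the program is ground and already in normal form, with the single non-literal rule named by a hypothetical atom `n` that is added to its body and stated as a fact; the update rule shapes follow the construction assumed from [12]; and the atoms `o1`, `o_plus`, `bar(...)`, `del(...)` as well as all patient names are made up for the example. A real implementation would delegate answer set computation to an ASP solver.

```python
from itertools import combinations

def rule(head, pos=(), naf=()):
    return (frozenset(head), frozenset(pos), frozenset(naf))

def subsets(items):
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_model(s, naf_free):
    return all(not pos <= s or head & s for head, pos in naf_free)

def answer_sets(program):
    """Answer sets of a ground program without classical negation."""
    heads = frozenset().union(*(h for h, _, _ in program))
    found = []
    for s in subsets(heads):
        reduct = [(h, p) for h, p, n in program if not n & s]
        if is_model(s, reduct) and not any(
                t != s and is_model(t, reduct) for t in subsets(s)):
            found.append(s)
    return found

def secure_deletions(kn, abducibles, prior, ptr, o_plus):
    """Subset-minimal sets of abducible facts to delete (deletions only)."""
    facts = {rule([a]) for a in abducibles}
    up = [r for r in kn if r not in facts]
    for a in sorted(abducibles):              # all abducibles are facts of kn
        up += [rule([a], naf=[f"bar({a})"]), rule([f"bar({a})"], naf=[a]),
               rule([f"del({a})"], naf=[a])]
    program = up + prior + ptr + [rule([], naf=[o_plus])]   # goal rule
    candidates = []
    for s in answer_sets(program):
        deleted = frozenset(a for a in abducibles if f"del({a})" in s)
        rest = [r for r in kn if r not in {rule([a]) for a in deleted}]
        # Inconsistency property: forbidding o_plus must leave no answer set.
        if not answer_sets(rest + prior + ptr + [rule([], pos=[o_plus])]):
            candidates.append(deleted)
    return {f for f in candidates if not any(g < f for g in candidates)}

# Hypothetical normal-form instance; "n" names the non-literal rule.
pa, pf = "ill(pete,aids)", "ill(pete,flu)"
m, t1, t2 = "ill(mary,aids)", "treat(pete,medi1)", "treat(pete,medi2)"
kn = [rule([pa, pf], pos=[t1, "n"], naf=[t2]), rule(["n"]), rule([m]), rule([t1])]
ptr = [rule(["o1"], pos=[m]), rule(["o1"], pos=[pa]), rule(["o_plus"], naf=["o1"])]
result = secure_deletions(kn, {"n", m, t1}, [], ptr, "o_plus")
```

On this instance the sketch returns two deletion sets: the Mary fact together with either the treatment fact or the name of the rule, mirroring the two minimal solutions discussed above. Deleting the Mary fact alone is an answer set of the combined program but fails the inconsistency check, because the remaining rule still admits an answer set containing Pete's confidential diagnosis.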
Theorem 5.1 (Correctness for deletions)
A knowledge base $K^{pub} = K \setminus F$ preserves confidentiality and changes $K$ subset-minimally iff $F$ is obtained by an answer set of the program $P$ that is U-minimal among those satisfying the inconsistency property (1).
(Sketch) First of all note that because we chose $K$ to be the set of abducibles $\mathcal{A}$, only negative update atoms from $\mathcal{UA}^-$ occur in $UP$ – no insertions with update atoms from $\mathcal{UA}^+$ will be possible. Hence we automatically obtain an anti-explanation $(E, F)$ where $E$ is empty. As shown in [12], there is a 1-1 correspondence of minimal explanations and U-minimal answer sets of update programs; and anti-explanations are identical to explanations of a new positive observation when applying the transformations as in [12]. By properties of skeptical (anti-)explanations we have thus $(K \setminus F) \cup \mathit{prior} \cup PTR^{cred} \models O^+$ but for every $O_i^-$ there is no answer set in which $O_i^-$ is satisfied. This holds iff for every policy element $C_i$ there is no answer set of $K^{pub} \cup \mathit{prior}$ that satisfies any instantiation of $C_i$ (with respect to the Herbrand universe of $K^{pub} \cup \mathit{prior}$); thus $\mathit{cred}(K^{pub} \cup \mathit{prior}, C_i) = \emptyset$. Subset-minimal change carries over from U-minimality of answer sets.
5.3 Deletions and literal insertions
To obtain a confidentiality-preserving knowledge base, (incorrect) entries may also be inserted into the knowledge base. To allow for insertions of literals, a more complex set of abducibles has to be chosen. We reinforce the point that the abducibles already contained in the knowledge base $K$ are those that may be deleted, while the abducibles not contained in $K$ are those that may be inserted.
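First of all, we assume that the policy transformation is applied as described in Section 5.1. Then, starting from the new negative observations $O_i^-$ used in the policy transformation rules, we trace back all rules in $K \cup \mathit{prior}$ that influence these new observations and collect all literals in the bodies of these rules. In other words, we construct a dependency graph and collect the literals that the negative observations depend on. More formally, let $P_0$ be the set of literals that the new observations $O_i^-$ directly depend on:
$$P_0 = \{\, L \mid L \in \mathit{body}(R) \text{ or } \mathit{not}\, L \in \mathit{body}(R) \text{ where } R \in PTR^{cred} \text{ and } O_i^- \in \mathit{head}(R) \,\}$$
Next we iterate and collect all the literals that the $P_0$ literals depend on:
$$P_{j+1} = \{\, L \mid L \in \mathit{body}(R) \text{ or } \mathit{not}\, L \in \mathit{body}(R) \text{ where } R \in K \cup \mathit{prior} \text{ and } \mathit{head}(R) \cap P_j \neq \emptyset \,\}$$
and combine all such literals in a set $\mathcal{P} = \bigcup_{j \geq 0} P_j$.
As we also want to have the option to delete rules from $K$ (not only the literals in $\mathcal{P}$), we define the set of abducibles as the set $\mathcal{P}$ plus all those rules in $K$ whose head depends on literals in $\mathcal{P}$:
$$\mathcal{A} = \mathcal{P} \cup \{\, R \mid R \in K \text{ and } \mathit{head}(R) \cap \mathcal{P} \neq \emptyset \,\}$$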
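The backward tracing of dependencies is a simple fixpoint computation. The following Python sketch collects, for ground programs, all literals that the new negative observations depend on and then adds the non-literal rules of the knowledge base whose heads contain such a literal. The function name, the observation atoms `o1` and `o2`, the literal `-able(pete)` and the example facts are assumptions made for illustration.

```python
def rule(head, pos=(), naf=()):
    """A ground rule: head literals, positive body, NAF body."""
    return (frozenset(head), frozenset(pos), frozenset(naf))

def body_literals(rules, targets):
    """Literals (with or without NAF) in bodies of rules whose head meets targets."""
    return {lit for h, p, n in rules if h & targets for lit in p | n}

def abducibles(k, prior, ptr, observations):
    """Literals the observations depend on, plus dependent non-literal rules of k."""
    collected = body_literals(ptr, frozenset(observations))   # direct dependencies
    frontier = set(collected)
    while frontier:                                           # trace back until stable
        new = body_literals(k + prior, frozenset(frontier)) - collected
        collected |= new
        frontier = new
    rules = [r for r in k
             if (r[1] or r[2] or len(r[0]) > 1) and r[0] & collected]
    return collected, rules

# Hypothetical ground instance (all names are assumptions of this sketch).
pa, pf = "ill(pete,aids)", "ill(pete,flu)"
m, t1, t2 = "ill(mary,aids)", "treat(pete,medi1)", "treat(pete,medi2)"
r = rule([pa, pf], pos=[t1], naf=[t2])
k = [r, rule([m]), rule([t1])]
prior = [rule(["-able(pete)"], pos=[pf])]
ptr = [rule(["o1"], pos=[m]), rule(["o1"], pos=[pa]),
       rule(["o2"], pos=["-able(pete)"])]
literals, rules = abducibles(k, prior, ptr, {"o1", "o2"})
```

Literals reached this way that are facts of the knowledge base are candidates for deletion, while the remaining ones (here the second treatment fact, for instance) are candidates for insertion.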
For the example , the dependency graph is shown in Figure 2. We note that the new negative observation directly depends on the literal and the new negative observation directly depends on the literal ; this is the first set of literals . By tracing back the dependencies in the graph, is obtained. Lastly, we also have to add the rule from to because literals in its head are contained in .
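We obtain the normal form and then the update program $UP$ for $K$ and the new set of abducibles $\mathcal{A}$. The process of finding a skeptical explanation proceeds with finding an answer set $S$ of program $P$ as in Section 5.2 where additionally the positive explanation $E$ is obtained as $E = \{\, L \mid +L \in S \,\}$ and $S$ is U-minimal among those satisfying
$$(K \setminus F) \cup E \cup \mathit{prior} \cup PTR^{cred} \cup \{\, \leftarrow O^+ \,\} \text{ is inconsistent.} \qquad (2)$$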
Theorem 5.2 (Correctness for deletions & literal insertions)
A knowledge base $K^{pub} = (K \setminus F) \cup E$ preserves confidentiality and changes $K$ subset-minimally iff $(E, F)$ is obtained by an answer set of program $P$ that is U-minimal among those satisfying inconsistency property (2).
(Sketch) In $UP$, positive update atoms from $\mathcal{UA}^+$ occur for literals on which the negative observations depend. For subset-minimal change, only these literals are relevant for insertions; inserting other literals will lead to non-minimal change. In analogy to Theorem 5.1, by the properties of minimal skeptical (anti-)explanations that correspond to U-minimal answer sets of an update program, we obtain a confidentiality-preserving $K^{pub}$ with minimal change.
6 Discussion and Conclusion
This article showed that when publishing a logic program, confidentiality-preservation can be ensured by extended abduction; more precisely, we showed that under the credulous query response semantics it reduces to finding skeptical anti-explanations with update programs. This is an application of data modification, because a user can be misled by the published knowledge base to believe incorrect information; we hence apply dishonesties [11] as a security mechanism. This is in contrast to [16] whose aim is to avoid incorrect deductions while enforcing access control on a knowledge base. Another difference to [16] is that they do not allow disjunctions in rule heads; hence, to the best of our knowledge this article is the first one to handle a confidentiality problem for EDPs. In [3] the authors study databases that may provide users with incorrect answers to preserve security in a multi-user environment. Different from our approach, they consider a database as a set of formulas of propositional logic and formulate the problem using modal logic. In analogy to [12], a complexity analysis for our approach can be achieved by reduction of extended abduction to normal abduction. Work in progress covers data publishing for skeptical users; future work might handle insertion of non-literal rules.
- [1] Foto N. Afrati and Phokion G. Kolaitis. Repair checking in inconsistent databases: algorithms and complexity. In ICDT2009, volume 361 of ACM International Conference Proceeding Series, pages 31–41. ACM, 2009.
- [2] Joachim Biskup. Usability confinement of server reactions: Maintaining inference-proof client views by controlled interaction execution. In DNIS 2010, volume 5999 of LNCS, pages 80–106. Springer, 2010.
- [3] Piero A. Bonatti, Sarit Kraus, and V. S. Subrahmanian. Foundations of secure deductive databases. IEEE Trans. Knowl. Data Eng., 7(3):406–422, 1995.
- [4] Francesco Calimeri, Giovambattista Ianni, Francesco Ricca, Mario Alviano, Annamaria Bria, Gelsomina Catalano, Susanna Cozza, Wolfgang Faber, Onofrio Febbraro, Nicola Leone, Marco Manna, Alessandra Martello, Claudio Panetta, Simona Perri, Kristian Reale, Maria Carmela Santoro, Marco Sirianni, Giorgio Terracina, and Pierfrancesco Veltri. The third answer set programming competition: Preliminary report of the system competition track. In LPNMR 2011, volume 6645 of LNCS, pages 388–403. Springer, 2011.
- [5] Jürgen Dix, Wolfgang Faber, and V. S. Subrahmanian. The relationship between reasoning about privacy and default logics. In LPAR 2005, volume 3835 of LNCS, pages 637–650. Springer, 2005.
- [6] Csilla Farkas and Sushil Jajodia. The inference problem: A survey. SIGKDD Explorations, 4(2):6–11, 2002.
- [7] Michael Gelfond and Vladimir Lifschitz. Classical negation in logic programs and disjunctive databases. New Generation Computing, 9(3/4):365–386, 1991.
- [8] Bernardo Cuenca Grau and Ian Horrocks. Privacy-preserving query answering in logic-based information systems. In ECAI2008, volume 178 of Frontiers in Artificial Intelligence and Applications, pages 40–44. IOS Press, 2008.
- [9] Katsumi Inoue and Chiaki Sakama. Abductive framework for nonmonotonic theory change. In Fourteenth International Joint Conference on Artificial Intelligence (IJCAI 95), volume 1, pages 204–210. Morgan Kaufmann, 1995.
- [10] Jiefei Ma, Alessandra Russo, Krysia Broda, and Emil Lupu. Multi-agent confidential abductive reasoning. In ICLP (Technical Communications), volume 11 of LIPIcs, pages 175–186. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2011.
- [11] Chiaki Sakama. Dishonest reasoning by abduction. In 22nd International Joint Conference on Artificial Intelligence (IJCAI 2011), pages 1063–1064. IJCAI/AAAI, 2011.
- [12] Chiaki Sakama and Katsumi Inoue. An abductive framework for computing knowledge base updates. Theory and Practice of Logic Programming, 3(6):671–713, 2003.
- [13] Phiniki Stouppa and Thomas Studer. Data privacy for knowledge bases. In Sergei N. Artëmov and Anil Nerode, editors, LFCS2009, volume 5407 of LNCS, pages 409–421. Springer, 2009.
- [14] Tyrone S. Toland, Csilla Farkas, and Caroline M. Eastman. The inference problem: Maintaining maximal availability in the presence of database updates. Computers & Security, 29(1):88–103, 2010.
- [15] Lena Wiese. Horizontal fragmentation for data outsourcing with formula-based confidentiality constraints. In IWSEC 2010, volume 6434 of LNCS, pages 101–116. Springer, 2010.
- [16] Lingzhong Zhao, Junyan Qian, Liang Chang, and Guoyong Cai. Using ASP for knowledge management with user authorization. Data & Knowl. Eng., 69(8):737–762, 2010.