1 Introduction
In this paper, we axiomatize the first-order consequences of inclusion logic. Inclusion logic was introduced by Galliani [6] as a variant of dependence logic, which was introduced by Väänänen [29]. Another important variant of dependence logic is independence logic, introduced by Grädel and Väänänen [11]. Dependence logic and its variants adopt the framework of team semantics of Hodges [21, 22] to characterize dependency notions. Inclusion logic aims to characterize inclusion dependencies by extending first-order logic with inclusion atoms, which are strings of the form $\vec{x}\subseteq\vec{y}$, where $\vec{x}$ and $\vec{y}$ are sequences of variables of the same length. In team semantics, inclusion atoms and other formulas are evaluated in a model with respect to sets of assignments (called teams), in contrast to single assignments as in the usual first-order logic. Intuitively, the inclusion atom $\vec{x}\subseteq\vec{y}$ specifies that all values of $\vec{x}$ in a team $X$ are included in the values of $\vec{y}$ in the same team $X$.
Galliani and Hella proved that inclusion logic is expressively equivalent to positive greatest fixed-point logic [8]. It then follows from the results of Immerman [23] and Vardi [31] that over finite ordered structures inclusion logic captures PTIME. Building on these results, Grädel defined model-checking games for inclusion logic [9], which then found application in [10]. There have also been studies [16, 13, 27, 14] on the computational complexity and syntactic fragments of inclusion logic. Embedding the semantics of inclusion atoms into that of the quantifiers, Rönnholm [28] introduced inclusion quantifiers, which generalize the idea of the slashed quantifiers of independence-friendly logic [20] (a close relative of dependence logic). Inclusion atoms have also found natural applications in a recent formalization of Arrow's Theorem in social choice in dependence and independence logic [26]. Motivated by the increasing interest in inclusion logic, we present in this paper a proof-theoretic investigation of inclusion logic, which is currently missing in the literature.
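It is worth noting that inclusion atoms correspond exactly to the inclusion dependencies studied in database theory. The implication problem of inclusion dependencies, i.e., the problem of deciding whether $\Gamma\models\phi$ for a set $\Gamma\cup\{\phi\}$ of inclusion dependencies (or inclusion atoms), is completely axiomatized in [4] by the following three rules/axioms:

$\vec{x}\subseteq\vec{x}$ (identity)

from $x_1\dots x_n\subseteq y_1\dots y_n$ infer $x_{i_1}\dots x_{i_k}\subseteq y_{i_1}\dots y_{i_k}$, for $i_1,\dots,i_k\in\{1,\dots,n\}$ (projection and permutation)

from $\vec{x}\subseteq\vec{y}$ and $\vec{y}\subseteq\vec{z}$ infer $\vec{x}\subseteq\vec{z}$ (transitivity)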
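The soundness of the three rules above can be checked by brute force on small teams. The following is a minimal sketch, not part of the axiomatization in [4]: the representation of a team as a list of dictionaries and the names `satisfies_inclusion`, `all_teams` and `rules_are_sound` are assumptions made only for this illustration.

```python
from itertools import combinations, product

def satisfies_inclusion(team, xs, ys):
    """True iff every tuple of xs-values in the team occurs as a tuple of ys-values."""
    if len(xs) != len(ys):
        raise ValueError("both sides of an inclusion atom must have the same length")
    ys_values = {tuple(s[y] for y in ys) for s in team}
    return all(tuple(s[x] for x in xs) in ys_values for s in team)

# All assignments of the variables x, y, z over the two-element domain {0, 1}.
VARS = ("x", "y", "z")
ASSIGNMENTS = [dict(zip(VARS, vals)) for vals in product((0, 1), repeat=len(VARS))]

def all_teams():
    """Enumerate every team (set of assignments) over ASSIGNMENTS, including the empty team."""
    for k in range(len(ASSIGNMENTS) + 1):
        for team in combinations(ASSIGNMENTS, k):
            yield list(team)

def rules_are_sound():
    """Check one instance of each rule (identity, projection/permutation, transitivity) on every team."""
    for team in all_teams():
        # identity: xy ⊆ xy always holds
        if not satisfies_inclusion(team, ("x", "y"), ("x", "y")):
            return False
        # projection and permutation: from xy ⊆ yz infer yx ⊆ zy and x ⊆ y
        if satisfies_inclusion(team, ("x", "y"), ("y", "z")):
            if not satisfies_inclusion(team, ("y", "x"), ("z", "y")):
                return False
            if not satisfies_inclusion(team, ("x",), ("y",)):
                return False
        # transitivity: from x ⊆ y and y ⊆ z infer x ⊆ z
        if satisfies_inclusion(team, ("x",), ("y",)) and satisfies_inclusion(team, ("y",), ("z",)):
            if not satisfies_inclusion(team, ("x",), ("z",)):
                return False
    return True
```

Such a check only confirms soundness on one finite domain; it says nothing about completeness, which is the content of the result in [4].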
The team semantics interpretation of inclusion atoms has recently also been utilized to study the implication problems of inclusion atoms together with other dependency atoms [15, 18, 19]. In this paper, we study instead the axiomatization problem of inclusion logic, i.e., inclusion atoms enriched with the connectives and quantifiers of first-order logic; that is, the problem of finding a deduction system for which the completeness theorem
(1) $\Gamma\vdash\phi\iff\Gamma\models\phi$
holds for $\Gamma\cup\{\phi\}$ being a set of formulas of the logic.
It is known that dependence logic is not (effectively) axiomatizable, since the sentences of the logic are equiexpressive with sentences of existential second-order logic (ESO) [29]. Nevertheless, if one restricts the consequence $\phi$ in (1) to a first-order sentence and $\Gamma$ to a set of sentences of dependence logic, an axiomatization can be found. This is because finding a model for such a set of sentences of dependence logic is the same as finding a model for a set of ESO sentences (i.e., sentences consisting of a block of existential second-order quantifiers followed by a first-order formula), which in turn reduces to finding a model for a set of first-order sentences (obtained by dropping the second-order quantifier prefixes). A concrete system of natural deduction for dependence logic admitting this type of completeness theorem was given in [25]. The proof of the completeness theorem uses a nontrivial technique that involves showing the equivalence between a dependence logic sentence and its so-called game expression (an infinitary first-order sentence describing a semantic game) over countable models, and the fact that the game expression can be finitely approximated over recursively saturated models. Subsequently, using a similar method, a system of natural deduction completely axiomatizing the first-order consequences of independence logic with respect to sentences was also introduced [12]. These partial axiomatizations for sentences were first generalized in [24] to cover formulas by expanding the language with a new predicate symbol to interpret the teams, and later generalized further in [32] to cover the case when the consequence in (1) is not necessarily first-order itself but has an essentially first-order translation, by applying a trick that involves the weak classical negation and the addition of the RAA rule for this negation.
As we will demonstrate formally in this paper, inclusion logic is not (effectively) axiomatizable either. Since inclusion logic is less expressive than ESO, by the same argument as above, the first-order consequences of inclusion logic can also be axiomatized. In this paper, we give such an axiomatization explicitly. To be more precise, we introduce a system of natural deduction for inclusion logic for which the completeness theorem (1) holds for $\phi$ being a first-order formula and $\Gamma$ being a set of formulas. Our completeness proof uses the technique developed in [25] together with the trick in [32]. Our system of inclusion logic is a conservative extension of the system of first-order logic, as it has the same rules as that of first-order logic when restricted to first-order formulas only. The rules for inclusion atoms include some of those introduced in [12], and the rules characterizing interactions between inclusion atoms and the connectives and quantifiers appear to be simpler than the corresponding ones in the systems of dependence and independence logic [25, 12]. We also show that the crucial rule needed for the trick of [32], the RAA rule for the weak classical negation, is actually derivable in our system with respect to first-order formulas.
The paper is organized as follows. In Section 2 we recall the basics of inclusion logic, and also give a formal proof that inclusion logic is not (effectively) axiomatizable. Section 3 discusses the normal form for inclusion logic. In Section 4, we define the game expressions and their finite approximations that are crucial for the proof of the completeness theorem of the system of natural deduction for inclusion logic. We introduce this system in Section 5, and also prove the soundness theorem as well as some useful derivable clauses in the section. The proof of the completeness theorem will be given in Section 6. We conclude with Section 7 showing some applications of our system, in particular, we derive in our system the axioms for anonymity atoms proposed recently by Väänänen [30].
2 Preliminaries
In this section, we recall the basics of inclusion logic and prove formally that inclusion logic is not (effectively) axiomatizable. We consider first-order signatures with a built-in equality symbol $=$. Fix a set of first-order variables, and denote its elements by $x, y, z, \dots$ (with or without subscripts). First-order terms are built recursively as usual, and first-order formulas are defined by the grammar:
Throughout the paper, we reserve the first greek letters (with or without subscripts) for firstorder formulas. As usual, we define and for firstorder formulas and . Formulas of inclusion logic () are defined recursively as follows:
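where $\alpha$ is an arbitrary first-order formula. The formula $\vec{x}\subseteq\vec{y}$ is called an inclusion atom; note that negation in inclusion logic applies only to first-order formulas.
The set of free variables of a formula of inclusion logic is defined inductively as usual, with the additional case that the free variables of an inclusion atom $\vec{x}\subseteq\vec{y}$ are exactly the variables occurring in $\vec{x}$ and $\vec{y}$.
We write $\phi(\vec{x})$ to indicate that the free variables of $\phi$ are among $\vec{x}$. Formulas with no free variables are called sentences.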
We assume that the domain of a first-order model $M$ has at least two elements, and use the same letter $M$ to stand for both the model and its domain. An assignment of a model $M$ for a set $V$ of variables is a function $s: V\to M$. The interpretation of a term $t$ under $M$ and $s$ is defined as usual and denoted as $s(t^M)$. For any sequence $\vec{x}=\langle x_1,\dots,x_n\rangle$ of variables, we write $s(\vec{x})$ for $\langle s(x_1),\dots,s(x_n)\rangle$. For any element $a\in M$, $s(a/x)$ is the assignment that maps $x$ to $a$ and agrees with $s$ on all other variables.
A set $X$ of assignments of a model $M$ with the same domain is called a team (of $M$). In particular, the empty set $\emptyset$ is a team, and the singleton $\{\emptyset\}$ of the empty assignment is a team with the empty domain.
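For any formula $\phi$ of inclusion logic, any model $M$ and any team $X$ of $M$ whose domain contains the free variables of $\phi$, we define the satisfaction relation $M\models_X\phi$ inductively as follows:

$M\models_X\bot$ iff $X=\emptyset$.

$M\models_X\lambda$, for $\lambda$ a first-order atomic formula, iff for all $s\in X$, $M\models_s\lambda$ in the usual sense.

$M\models_X\neg\alpha$ iff for all $s\in X$, $M\models_s\neg\alpha$ in the usual sense.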

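$M\models_X\vec{x}\subseteq\vec{y}$ iff for all $s\in X$, there is $s'\in X$ such that $s(\vec{x})=s'(\vec{y})$.

$M\models_X\phi\wedge\psi$ iff $M\models_X\phi$ and $M\models_X\psi$.

$M\models_X\phi\vee\psi$ iff there exist $Y,Z\subseteq X$ with $X=Y\cup Z$ such that $M\models_Y\phi$ and $M\models_Z\psi$.

$M\models_X\exists x\phi$ iff $M\models_{X(F/x)}\phi$ for some function $F: X\to\wp(M)\setminus\{\emptyset\}$, where $X(F/x)=\{s(a/x)\mid s\in X,\ a\in F(s)\}$.

$M\models_X\forall x\phi$ iff $M\models_{X(M/x)}\phi$, where $X(M/x)=\{s(a/x)\mid s\in X,\ a\in M\}$.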
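The satisfaction clauses of lax team semantics can be made concrete with a small brute-force evaluator over finite models. This is only an illustrative sketch: the tuple encoding of formulas, the treatment of first-order literals as Python predicates, and the names `sat`, `extend`, `subsets` and `asg` are assumptions of this example, and the search is exponential in the size of the team.

```python
from itertools import chain, combinations, product

def subsets(items):
    """All subsets of a finite collection, as tuples."""
    items = list(items)
    return chain.from_iterable(combinations(items, k) for k in range(len(items) + 1))

def asg(**values):
    """Build an assignment as a frozenset of (variable, value) pairs."""
    return frozenset(values.items())

def extend(s, x, a):
    """The assignment s(a/x): maps x to a and agrees with s elsewhere."""
    return frozenset((v, b) for v, b in s if v != x) | {(x, a)}

def sat(domain, team, phi):
    """Decide whether the team satisfies phi over the given finite domain (lax semantics)."""
    team = frozenset(team)
    op = phi[0]
    if op == "fo":  # flat first-order literal, given as a predicate on single assignments
        return all(phi[1](dict(s)) for s in team)
    if op == "inc":  # inclusion atom xs ⊆ ys
        _, xs, ys = phi
        ys_values = {tuple(dict(s)[y] for y in ys) for s in team}
        return all(tuple(dict(s)[x] for x in xs) in ys_values for s in team)
    if op == "and":
        return sat(domain, team, phi[1]) and sat(domain, team, phi[2])
    if op == "or":  # lax disjunction: the two subteams may overlap
        for left in map(frozenset, subsets(team)):
            for overlap in map(frozenset, subsets(left)):
                right = (team - left) | overlap
                if sat(domain, left, phi[1]) and sat(domain, right, phi[2]):
                    return True
        return False
    if op == "exists":  # lax existential: each assignment gets a nonempty set of witnesses
        _, x, body = phi
        members = list(team)
        choices = [c for c in map(frozenset, subsets(domain)) if c]
        for pick in product(choices, repeat=len(members)):
            new_team = {extend(s, x, a) for s, F in zip(members, pick) for a in F}
            if sat(domain, new_team, body):
                return True
        return False
    if op == "forall":  # universal: duplicate each assignment along the whole domain
        _, x, body = phi
        return sat(domain, {extend(s, x, a) for s in team for a in domain}, body)
    raise ValueError(f"unknown connective: {op}")

D = (0, 1)
T1 = {asg(x=0, y=1), asg(x=1, y=0)}   # satisfies x ⊆ y
T2 = {asg(x=0, y=1)}                  # does not satisfy x ⊆ y
```

For instance, `sat(D, T1, ("inc", ("x",), ("y",)))` holds while the same atom fails on `T2`, and the lax disjunction `x = 0 ∨ x = 1` holds on `T1` by splitting the team, although neither disjunct holds on the whole team.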
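We write $\Gamma\models\phi$ if, for all models $M$ and teams $X$, $M\models_X\gamma$ for all $\gamma\in\Gamma$ implies $M\models_X\phi$. We write simply $\phi\models\psi$ for $\{\phi\}\models\psi$, and $\models\phi$ for $\emptyset\models\phi$. If both $\phi\models\psi$ and $\psi\models\phi$, we write $\phi\equiv\psi$.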
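Our version of the team semantics for disjunction and existential quantifier is known in the literature as lax semantics; see [6] for further discussion. In some literature (e.g., [6]) inclusion atoms are allowed to have arbitrary terms as arguments, namely strings of the form $t_1\dots t_n\subseteq t_1'\dots t_n'$ are considered well-formed formulas, and the semantics of these inclusion atoms is defined (naturally) as:

$M\models_X t_1\dots t_n\subseteq t_1'\dots t_n'$ iff for all $s\in X$, there is $s'\in X$ such that $s(t_i^M)=s'({t_i'}^M)$ for all $1\leq i\leq n$.
It is easy to check that inclusion atoms of this type are definable in our version of inclusion logic, since $\vec{t}\subseteq\vec{t'}\equiv\exists\vec{x}\exists\vec{y}(\vec{x}=\vec{t}\wedge\vec{y}=\vec{t'}\wedge\vec{x}\subseteq\vec{y})$ for fresh variables $\vec{x},\vec{y}$, where $\exists\vec{x}$ abbreviates $\exists x_1\dots\exists x_n$ for some $n$, and $\vec{x}=\vec{t}$ is short for $\bigwedge_{i}x_i=t_i$.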
For any assignment $s$ and any set $V$ of variables, we write $s\upharpoonright V$ for the assignment $s$ restricted to $V$. For any team $X$, define $X\upharpoonright V=\{s\upharpoonright V\mid s\in X\}$. We list the most important properties of formulas of inclusion logic in the following lemma. The reader is referred to [6, 8] for other properties.
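Let $\phi$ be a formula of inclusion logic, $M$ a model, and $X$ and $Y$ teams of $M$ whose domains contain the free variables of $\phi$.
 (Locality)

If $X$ and $Y$ have the same restriction to the free variables of $\phi$, then $M\models_X\phi\iff M\models_Y\phi$.
 (Union Closure)

If $M\models_X\phi$ and $M\models_Y\phi$, then $M\models_{X\cup Y}\phi$.
 (Flatness of First-order Formulas)

For any first-order formula $\alpha$, $M\models_X\alpha$ iff $M\models_{\{s\}}\alpha$ for all $s\in X$.
Consequently, first-order formulas are also downwards closed, that is, $M\models_X\alpha$ and $Y\subseteq X$ imply $M\models_Y\alpha$.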
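The contrast between union closure and downward closure can be seen on a two-element domain. The sketch below is an illustration under assumptions of its own (teams as lists of dictionaries, the helper names `inc`, `flat` and `union`); it checks union closure of the atom $x\subseteq y$ exhaustively and exhibits a team on which the atom holds but fails on a subteam, whereas the first-order formula $x\neq y$ is preserved under subteams.

```python
from itertools import combinations, product

def inc(team, x, y):
    """Team semantics of the unary inclusion atom x ⊆ y."""
    y_values = {s[y] for s in team}
    return all(s[x] in y_values for s in team)

def flat(team):
    """The first-order formula x ≠ y, evaluated assignment by assignment."""
    return all(s["x"] != s["y"] for s in team)

def union(X, Y):
    """Union of two teams represented as lists without duplicates."""
    return X + [s for s in Y if s not in X]

# Every team over the variables x, y and the domain {0, 1}.
ASSIGNMENTS = [{"x": a, "y": b} for a, b in product((0, 1), repeat=2)]
TEAMS = [list(c) for k in range(len(ASSIGNMENTS) + 1) for c in combinations(ASSIGNMENTS, k)]

# Union closure: whenever two teams satisfy x ⊆ y, so does their union.
union_closed = all(
    inc(union(X, Y), "x", "y")
    for X in TEAMS for Y in TEAMS
    if inc(X, "x", "y") and inc(Y, "x", "y")
)

# Inclusion atoms are not downward closed: X satisfies x ⊆ y, its subteam X[:1] does not.
X = [{"x": 0, "y": 1}, {"x": 1, "y": 0}]
inclusion_survives_subteam = inc(X[:1], "x", "y")

# The flat formula x ≠ y holds on X and on every subteam of X.
flat_downward_closed = flat(X) and all(flat(list(c)) for k in range(3) for c in combinations(X, k))
```

The failure of downward closure is the feature that separates inclusion logic from dependence logic, whose formulas are all downward closed.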
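If $\phi$ is a sentence, the locality property implies that $M\models_{\{\emptyset\}}\phi$ iff $M\models_X\phi$ for all teams $X$ of $M$. We call $M$ a model of $\phi$, written $M\models\phi$, if $M\models_{\{\emptyset\}}\phi$.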
By the result of [6], sentences of inclusion logic can be translated into existential second-order logic (ESO); namely, for every sentence of inclusion logic, there exists an ESO-sentence with exactly the same models. Since ESO is well-known to be compact, it follows that inclusion logic is compact as well, that is, if every finite subset of a set of sentences has a model, then the set itself has a model. It was further proved in [8] that inclusion logic is expressively equivalent to positive greatest fixed point logic in the sense of the following theorem.
[[8]] For any formula of with , there exists an formula of with a fresh ary predicate symbol such that for all models and teams of with ,
and vice versa, where is an ary predicate on that serves as the interpretation for . In particular, sentences can be translated into and vice versa.
As a consequence of [23], over finite models, positive greatest fixed point logic and least fixed point logic have the same expressive power. In particular, by [23, 31], over ordered finite models, inclusion logic captures PTIME.
Due to its strong expressive power, inclusion logic is not (effectively) axiomatizable. We now give an explicit proof of this fact by following an argument similar to that in [25]. (Footnote 2: The author would like to thank Jouko Väänänen for suggesting this proof; the formula used in Proposition 2 is taken essentially from [8].)
For any model with the signature of arithmetic, iff is not wellfounded.
Proof.
It is easy to prove that holds in iff contains an infinite descending chain . We leave the proof details to the reader. ∎
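The link between descending chains and greatest fixed points can be illustrated on finite structures. The sketch below assumes, purely for illustration, that the sentence in question has the shape $\exists x\exists y(y<x\wedge y\subseteq x)$; the exact formula is not shown in the text above, so this shape is a labeled assumption, as are the function name `descending_core` and the encoding of $<$ as a set of pairs. On a finite structure, a nonempty witnessing team exists exactly when the greatest set of elements that each have a $<$-predecessor inside the set is nonempty, which for finite relations means that $<$ contains a cycle.

```python
def descending_core(elements, less_than):
    """Greatest set S such that every a in S has some b in S with (b, a) in less_than.

    Computed as a greatest fixed point: start from all elements and repeatedly
    discard those without a predecessor among the remaining ones.
    """
    core = set(elements)
    while True:
        kept = {a for a in core if any((b, a) in less_than for b in core)}
        if kept == core:
            return core
        core = kept

# A strict linear order on {0, 1, 2, 3}: well-founded, so the core is empty.
linear = {(a, b) for a in range(4) for b in range(4) if a < b}

# A relation with the cycle 0 < 1 < 2 < 0 and the extra pair 2 < 3:
# every element starts an infinite descending chain, so the core is everything.
cyclic = {(0, 1), (1, 2), (2, 0), (2, 3)}

linear_core = descending_core(range(4), linear)
cyclic_core = descending_core(range(4), cyclic)
```

Under the assumed shape of the sentence, the team of all assignments $s$ with $s(x), s(y)$ in the core and $s(y)<s(x)$ would witness the sentence whenever the core is nonempty. Over infinite models such as nonstandard models of arithmetic, the same greatest fixed point collects the elements that start an infinite descending chain.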
Now, put , and let be a (firstorder) sentence stating that all (finitely many) axioms of Peano arithmetic except for the induction axiom fail. For any sentence of arithmetic, we have
(2) 
where is the standard model of arithmetic. To see why, for the left to right direction, suppose that and that is a model of such that . By Proposition 2, is wellfounded, which implies that is (isomorphic to) the standard model of arithmetic. Thus . Conversely, suppose . By Proposition 2, is not true in the standard model of . Thus we must have .
The equivalence (2) shows that truth in the standard model can be reduced to logical validity in inclusion logic. This means that validity in inclusion logic is not arithmetical, and therefore inclusion logic cannot have any (effective) complete axiomatization.
Nevertheless, there can be partial axiomatizations for the logic. The main objective of the present paper is to introduce a system of natural deduction for inclusion logic that is complete for first-order consequences, in the sense that
$\Gamma\vdash\alpha\iff\Gamma\models\alpha$
holds whenever $\Gamma$ is a set of formulas of inclusion logic, and $\alpha$ is a first-order formula. Our completeness proof will mainly follow the argument of [25], which roughly goes as follows. First, we show that any sentence of inclusion logic is semantically equivalent to a formula in a certain normal form, and that in the system to be introduced every formula implies its normal form. Then, we show that the normal form is equivalent over countable models to a first-order sentence of infinite length (called its game expression). Next, we show that the game expression can be approximated in a certain sense (in the sense of Theorem 4) by first-order sentences of finite length. Finally, making essential use of these approximations, we prove the completeness theorem by a model-theoretic argument together with a trick using the weak classical negation developed in [32].
3 Normal form
In this section, we prove that every formula is (semantically) equivalent to a formula in the form where is a conjunction of inclusion atoms, and is a firstorder quantifierfree formula. This normal form is similar to the normal forms for dependence and independence logic introduced in [29, 12], and is more refined than the two normal forms for formulas introduced in the literature, which we recall in the following.
[[7]] Every formula is semantically equivalent to a formula of the form
(3) 
where and is a quantifier free formula.
Proof (sketch).
This follows from the observation that if , then

;

; (4)

;

, where are fresh variables.∎
[[13]] Every formula of the form (3) is semantically equivalent to a formula of the form
(5) 
where , is fresh and is the quantifier free formula in (3).
Proof (idea).
This is proved by exhaustively applying the equivalences
(6) 
and ∎
We show next that the quantifier-free formula in the above two theorems can be turned into an equivalent formula in some normal form.
Every quantifierfree formula is semantically equivalent to a formula of the form
(7) 
where is a firstorder quantifierfree formula, and each and are sequences of variables from .
Proof.
We prove the lemma by induction on . The case being a firstorder formula is trivial. If , clearly
Assume that where are firstorder and quantifierfree, the sequences and do not have variables in common,
(8) 
If , then by (3) we have
If , we show that is equivalent to
(9) 
We first claim that for any firstorder formula , any formula ,
(10) 
where each and consist of variables from the sequence . Then we can prove the desired equivalence by consecutively applying (10) as follows:
(by (3)) 
We now finish the proof by verifying the claim (10). For the direction left to right, suppose . Then there are teams and suitable sequence of functions for such that , and . We now define suitable (sequence of) functions for the quantifications as follows: Pick two distinct elements .

Define in such a way that the resulting team satisfies
where . We omit here the precise technical definition.

Define by taking .

Define by taking
Put . Clearly, . It remains to show that and for all .
For the former, define
Clearly , as . Since , and , we obtain and by locality.
For the latter, let be arbitrary. If , since , there exists such that . Now, since , by the definition of and , we know that . Thus, for , we have
If , then and thereby by the definition of . Thus, , namely that itself is the witness of for .
For the direction right to left of the claim (10), suppose there are suitable (sequence of) functions for the quantifications such that for , we have that . Then there are teams such that , and . Since is flat, we may let be the maximal such team.
Consider and . Clearly, , and by locality. It remains to show that .
Define a suitable sequence of functions for in such a way that . We omit the precise technical definition here. Now, since , we have by locality. To show that satisfies each , take any . Let be an arbitrary extension of . Since , there exists such that . Since and , we have . It then follows that , which in turn implies that . Then , as was assumed to be the maximal subteam of that satisfies . Hence, and . ∎
Finally, applying the above normal form results we obtain the desired more refined normal form as follows.
Every formula is semantically equivalent to a formula of the form
(11) 
where is a firstorder quantifierfree formula, and each and are sequences of variables from .
Proof.
By Theorem 3, we may assume that is in prenex normal form (3). Furthermore, by Lemma 3, the quantifier free formula in (3) is equivalent to a formula of the form (7). Hence, is equivalent to a formula of the form
Finally, applying Theorem 3 to the above formula (and rearranging the order of the existential quantifiers) we obtain an equivalent formula of the form (11). ∎
To simplify notations in the normal form (11), we now introduce some conventions. For any permutation and , we define a function by taking for any sequence . That is, is a sequence of variables from . When no confusion arises we drop the superscripts in and write simply . We reserve the greek letters (with or without superscripts) for such functions. The normal form of an sentence (with no free variables) can then be written as
(12) 
Observe that the formula in the above normal form has only one (explicit) universal quantifier (i.e., ), and yet, some existentially quantified variables from are essentially universally quantified, as ensured by the inclusion atoms (cf. Equivalence (6)).
4 Game expression and approximations
In this section, we define the game expression for every sentence of inclusion logic in normal form. Intuitively, the game expression is a first-order sentence of infinite length that simulates all possible plays in the semantic game (in team semantics) of the given sentence, and we will show that over countable models the sentence and its game expression are equivalent. For a game of finite length $n$, we define a first-order formula of finite length, called the $n$-approximation of the sentence. It follows from the model-theoretic argument in [25] that the game expression is equivalent to the (infinitary) conjunction of all its approximations over countable recursively saturated models. These game expressions and their finite approximations will be essential for proving the completeness theorem for the system of inclusion logic to be introduced in the next section.
Now, let be an sentence (with no free variables). By Theorem 3, we may assume that is in normal form (12). We now define the game expression of as the following firstorder sentence of infinite length:
where

and with

being the set of indices of variables introduced as witnesses for each with respect to the variables from ,

being the set of indices of variables introduced as witnesses for each with respect to all new pairs with from and from (denote the set consisting of all such new pairs by , and note that we are requiring that );


;

;

.
The formula is defined in layers that correspond essentially to the plays in the semantic game of the formula (see e.g., [5] for the definition of the semantic game for ). Each layer of consists of the subformula with and . The intuitive reading of each layer is as follows: Each layer introduces new existentially quantified variables and one universally quantified variable , and specifies (via ) that holds for the existentially quantified variables . For each inclusion atom in , with respect to each sequence of existentially quantified variables introduced in layer , a witness sequence of variables (as specified via the formula ) together with the accompanying sequence are introduced in layer as part of . Similarly, for each inclusion atom in , with respect to each new combination of existentially quantified variables introduced up to layer and universally quantified variables introduced up to layer , a witness sequence of variables (as specified via the formula ) together with the accompanying sequence are introduced in layer as part of .
We assume that the reader is familiar with the game-theoretic semantics of first-order and infinitary logic. Let us now recall the semantic game of the game expression over a model $M$, which is an infinite game played between two players belard and loise. At each round the players take turns to pick elements from $M$ for the quantified variables, as illustrated in the following table:
round  0  1  n  

The choices of the two players generate an assignment for the quantified variables defined as
The player loise wins the (infinite) game if for each natural number ,
Finally,
where a winning strategy for loise is a function that tells her what to choose at each round and guarantees that she wins every play of the game. We now show, using the game-theoretic semantics, that a sentence of inclusion logic is semantically equivalent to its game expression over countable models.
Let be an sentence, and a model. Then

,

and whenever is a countable model.
Proof.
(i) Suppose . Then, there exists a suitable sequence of functions for such that for ,
(13) 
We prove by constructing a winning strategy for loise in the semantic game as follows:

In round , choose any assignment in , and let loise choose and . Let be the assignment for generated by loise’s choices so far. By (13), we have , which implies , thus the winning condition is maintained.

Let be the assignment generated by the choices of the two players up to round . Assume that we have maintained that for each in the domain of , the assignment for defined as is in , and assume that belard has chosen in round .

For any with the corresponding choices by the two players in (at most) two earlier than rounds, the assignment must be in . For each , since , there exists such that . Let loise choose and . Clearly,
