1 Introduction
Pseudofinite model theory is a branch of model theory that studies the class of finite structures by expanding this class with infinite structures that have “finitary” behaviour, in the sense that every first order (FO) sentence true in such an infinite structure is also true in some finite structure. These infinite structures are called pseudofinite. The notion is usually taken to include finite structures as well, which are trivially pseudofinite. While the study of pseudofinite structures has been pursued in mathematics since at least the ’80s [9], the study of such structures with reference to computer science is more recent, being largely initiated in [18]. That paper develops the subject as an alternative approach to studying finite models, in contrast with finite model theory, which typically either studies extensions of FO over all finite structures [13, 17, 5], or studies FO (and its extensions) over restricted classes of finite structures [2, 1, 16, 12]. The class of all pseudofinite structures forms an elementary class (that is, one definable by an FO theory); indeed it is the class of arbitrary models of the FO theory of the class of all finite structures. This is an equivalent definition of pseudofiniteness. Given that most results of model theory relativize to elementary classes, pseudofinite structures are naturally model-theoretically well-behaved.
Shrubdepth [6] is a graph parameter that was introduced in the context of obtaining algorithmic meta theorems for model checking properties expressible in a well-studied extension of FO called Monadic Second Order logic (MSO), with improved dependence on the size of the input sentence considered as a parameter. In contrast to the usual nonelementary dependence on this parameter, which holds even for model checking over the class of all trees [8], graph classes of bounded shrubdepth admit fixed parameter tractability of model checking with a parameter dependence that is a fixed tower of exponentials, of height proportional to the shrubdepth. These classes are defined using tree models of height that use at most labels for some naturals and , where informally, such a tree model of a graph is a rooted tree whose leaves are the vertices of , that are assigned labels from the set . The presence or absence of an edge between two vertices of is determined by the labels of these vertices in and the distance between them in . Since its inception, shrubdepth has been actively studied not just for its algorithmic properties, but also for its structural and logical aspects [6, 14, 4, 11].
In this paper, we study a relativization of the notion of pseudofiniteness to classes of graphs of bounded shrubdepth, and go further to consider the version of it for MSO. That is, instead of considering FO and all finite graphs in the definition of pseudofiniteness, we consider MSO and the class of finite graphs that have tree models of height and labels. This gives us the class of arbitrary graphs that are pseudofinite relative to . The graphs in this class are those for which every MSO sentence true in the graph is true also in some graph of . Equivalently, these are the graphs satisfying the MSO theory of . Of central interest to us in the paper is understanding the graphs structurally, akin to this understanding for . Towards this study, we consider arbitrary graphs that have tree models of height and labels, where the tree models could now be infinite. We denote this class of graphs . Clearly is the class of finite graphs of . As the first central result of this paper, we show the following.
Theorem 1.1.
For every and every graph of , there exists a graph in such that: (i) is an induced subgraph of ; (ii) and agree on all sentences of MSO of quantifier rank at most ; and (iii) the size of is at most a fold exponential in and . Thus, in particular, the graphs of are pseudofinite relative to .
Theorem 1.1 in fact shows a stronger property than pseudofiniteness for any infinite graph of ; namely, that every MSO sentence true in is also true in a finite induced subgraph of . Theorem 1.1 can therefore also be seen as showing a strong form of the classic Löwenheim-Skolem theorem from model theory for graphs of : not only is any FO sentence true in an infinite graph also true in a countable induced subgraph of , but the same holds with MSO instead of FO and ‘finite’ instead of ‘countable’. Further, as a consequence of the bounds provided, we also obtain that the index of the equivalence relation over , which relates two graphs of if they agree on all MSO sentences of rank at most , is an elementary function of , indeed a fold exponential in . This is in contrast to the usual nonelementary lower bound for this index even for FO over the class of all finite trees.
Theorem 1.1 tells us that the graphs of are models of the theory of . We go further to investigate what other graphs are models of this theory. Before considering infinite models, one first observes that in contrast to the case of the class of all graphs that are ()pseudofinite (relative to all finite graphs), where the class naturally includes all finite graphs, it is a nontrivial question whether there are any finite graphs not in , that are pseudofinite (or even pseudofinite) relative to . It turns out that there are no such graphs. This is because has a characterization in terms of a finite number of excluded induced subgraphs [6], and therefore is axiomatized in the finite by a universal FO sentence that describes this characterization. This sentence hence belongs to the and theories of , and indeed axiomatizes both of these in the finite.
Moving onward to the infinite, we now ask what infinite models, other than those in , the MSO theory of has. This is not an easy question as such, since there are not as many tools for dealing with the MSO properties of infinite structures as there are for their FO properties. Since the graphs of are also FO-pseudofinite, we examine the infinite models of the FO theory of . Here we prove the second central result of this paper.
Theorem 1.2.
The class is closed under ultraproducts and ultraroots.
The above theorem, in conjunction with the fact that is closed under isomorphisms, readily gives us that is an elementary class, using a well-known characterization of elementariness in terms of the mentioned closure properties (Theorem 2.4). This inference, in conjunction with Theorem 1.1, gives us the following characterization of pseudofiniteness (and also pseudofiniteness) relative to .
Theorem 1.3.
The class is exactly the class of arbitrary models of the FO theory, and hence also the MSO theory, of . As a consequence, is characterized over all graphs by the same finite set of excluded finite induced subgraphs that characterizes over finite graphs.
The main tool we use for showing Theorem 1.1 is a version of the Feferman-Vaught composition theorem proved in [7]. This version allows evaluating any sentence over the disjoint union of an arbitrary family of structures by examining the truth of an FO sentence over an type indicator . This is a structure over a monadic vocabulary that contains all the information about the equivalence classes of the relation to which the structures in belong (indeed these classes constitute the mentioned vocabulary). The fact that the vocabulary of the type indicator is monadic allows us to shrink the structure to under a certain threshold size (that depends on the rank) without any change in the theory in going to the shrunk structure, where is the considered rank. Using this simple observation, we first prove our results for trees of bounded height, noting that trees are after all constructed inductively from forests of lesser height, and the latter lend themselves to using the mentioned composition theorem. Subsequently, the results for trees are transferred to using the interpretability of the latter in the former. The proof above turns out to give elementary sized small models for over and is considerably simpler than the proof of the mentioned result shown in [6]. For Theorems 1.2 and 1.3, we use a combination of combinatorial and infinitary (compactness based) reasoning to show the results.
2 Background
We assume the reader is familiar with the terminology in connection with the syntax and semantics of FO and MSO. A sequence of variables is denoted . An formula whose free variables are in is denoted . A sentence is a formula without free variables. The quantifier rank, or simply rank, of an formula , denoted , is the maximum number of quantifiers (both first order and second order) appearing in any root-to-leaf path in the parse tree of the formula. We denote by the class of all formulae of rank at most .
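The definition of rank as the maximum number of quantifiers on a root-to-leaf path of the parse tree can be made concrete with a small sketch. The formula representation below (the classes `Atom`, `Not`, `BinOp`, `Quant`) is a hypothetical encoding chosen only for illustration and is not taken from the paper; first order and set quantifiers are counted alike, as in the definition above.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical parse-tree encoding of formulae (illustration only).
@dataclass(frozen=True)
class Atom:
    text: str  # e.g. "E(y, z)" or "X(x)"


@dataclass(frozen=True)
class Not:
    sub: "Formula"


@dataclass(frozen=True)
class BinOp:
    op: str  # "and" or "or"
    left: "Formula"
    right: "Formula"


@dataclass(frozen=True)
class Quant:
    kind: str  # "exists" or "forall"
    var: str   # a first order variable or a set variable
    body: "Formula"


Formula = Union[Atom, Not, BinOp, Quant]


def rank(phi: Formula) -> int:
    """Maximum number of quantifiers on any root-to-leaf path of the parse tree."""
    if isinstance(phi, Atom):
        return 0
    if isinstance(phi, Not):
        return rank(phi.sub)
    if isinstance(phi, BinOp):
        return max(rank(phi.left), rank(phi.right))
    if isinstance(phi, Quant):
        # Set quantifiers and first order quantifiers both add one.
        return 1 + rank(phi.body)
    raise TypeError(f"not a formula: {phi!r}")


# Example: exists X ( (forall x X(x)) and (exists y exists z E(y, z)) )
example = Quant(
    "exists", "X",
    BinOp(
        "and",
        Quant("forall", "x", Atom("X(x)")),
        Quant("exists", "y", Quant("exists", "z", Atom("E(y, z)"))),
    ),
)
```

Here `rank(example)` is 3: one set quantifier followed by the deeper of the two branches, which nests two first order quantifiers.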
A simple, undirected graph is an structure in which the binary relation is interpreted as an irreflexive and symmetric relation. All graphs in the paper are simple and undirected. A tree is a connected graph that does not contain any cycles. A labeled rooted tree is a structure such that: (i) the reduct of is a tree; (ii) the unary relation symbol is interpreted as a set consisting of a single element called the root of , and denoted ; and (iii) the s for form a partition of the nodes of (some of the parts could be empty). We shall often call labeled rooted trees simply trees when is clear from context. We use the standard notions of parent, child, ancestor and descendant in the context of trees. For a tree and a node of it, the subtree of rooted at , denoted , is the substructure of induced by the descendants of in (these include ), except for the interpretation of the predicate, which is (instead of ). A labeled rooted forest is a disjoint union of labeled rooted trees for belonging to an index set ; we then write where denotes disjoint union. A (labeled rooted) tree is said to be a leaf-hereditary subtree of a tree if the roots of and are the same, and there is a subset of nodes of such that deleting the subtrees rooted at these nodes yields the tree . Equivalently, is a leaf-hereditary subtree of if the roots of and are the same, is a substructure of , and every leaf of is also a leaf of . If is a leaf-hereditary subtree of , then it follows that: (i) for any two leaf nodes of , the distance between them in is the same as the distance between them in ; (ii) if is the forest of labeled rooted trees obtained by removing the root of for , then for every tree of , there exists a tree of such that is a leaf-hereditary subtree of . The height of a tree is the maximum root-to-leaf distance in . We denote by the class of arbitrary (finite or infinite) labeled rooted trees of height at most .
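For finite trees, the second ("equivalently") formulation of leaf-hereditary subtrees is easy to check mechanically. The following is a minimal sketch under stated assumptions: a rooted tree is encoded as a hypothetical `node -> parent` dictionary with the root mapped to `None`, labels are ignored, and both inputs are assumed to be well-formed trees. None of these names come from the paper.

```python
from typing import Dict, Hashable, Optional, Set

# Hypothetical encoding: node -> parent, with the root mapped to None.
Parent = Dict[Hashable, Optional[Hashable]]


def root(tree: Parent) -> Hashable:
    roots = [v for v, p in tree.items() if p is None]
    assert len(roots) == 1, "a rooted tree has exactly one root"
    return roots[0]


def leaves(tree: Parent) -> Set[Hashable]:
    internal = {p for p in tree.values() if p is not None}
    return set(tree) - internal


def height(tree: Parent) -> int:
    """Maximum root-to-leaf distance."""
    def depth(v: Hashable) -> int:
        d = 0
        while tree[v] is not None:
            v = tree[v]
            d += 1
        return d
    return max(depth(v) for v in leaves(tree))


def is_leaf_hereditary_subtree(s: Parent, t: Parent) -> bool:
    """Same root, s is a substructure of t, and every leaf of s is a leaf of t."""
    same_root = root(s) == root(t)
    # Every node of s is a node of t with the same parent.
    substructure = all(v in t and t[v] == p for v, p in s.items())
    leaves_kept = leaves(s) <= leaves(t)
    return same_root and substructure and leaves_kept


# t: r has children a, b; a has children x, y; b has child z.
t = {"r": None, "a": "r", "b": "r", "x": "a", "y": "a", "z": "b"}
s_good = {"r": None, "a": "r", "x": "a"}   # drops the subtrees at y and at b
s_bad = {"r": None, "a": "r", "b": "r"}    # a and b become leaves, but are internal in t
```

In this example `s_good` is a leaf-hereditary subtree of `t`, while `s_bad` is not, since its leaves `a` and `b` are internal nodes of `t`.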
Shrubdepth:
We recall the notion of tree models from [6] and state it in its extended version for arbitrary cardinality graphs. For where denotes the set of naturals including 0, a tree model of labels and height for a graph is a pair where is an labeled arbitrary rooted tree of height and is a set called the signature of the tree model such that:

The length of every root to leaf path in is exactly .

The set is exactly the set of leaves of .

Each leaf has a unique label from and all internal nodes are labeled .

For any and , it holds that if and only if .

For vertices , if and are the labels of and seen as leaves of , and the distance between and in is , then iff . Observe that the distance between and in is an even number as all root to leaf paths are of length , and is thus the distance between (or ) and the least common ancestor of and .
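The edge condition in the last item can be illustrated for finite tree models with a short sketch. The encoding is an assumption made for illustration: the tree is a hypothetical `node -> parent` dictionary, `label` maps each leaf to its label, and the signature is a set of triples written here as `(a, b, i)`, where `i` is the number of steps from a leaf up to the least common ancestor. The triple order and all function names are hypothetical, and the signature is assumed to be symmetric in its two labels.

```python
from itertools import combinations


def graph_from_tree_model(parent, label, signature):
    """Return (vertices, edges) of the graph defined by a finite tree model.

    parent:    dict node -> parent, root mapped to None (hypothetical encoding)
    label:     dict leaf -> label
    signature: set of triples (a, b, i), assumed symmetric in a and b
    """
    def ancestors(v):
        # v, parent(v), ..., root
        chain = [v]
        while parent[chain[-1]] is not None:
            chain.append(parent[chain[-1]])
        return chain

    internal = {p for p in parent.values() if p is not None}
    leaf_nodes = sorted(v for v in parent if v not in internal)

    edges = set()
    for u, v in combinations(leaf_nodes, 2):
        anc_v = set(ancestors(v))
        # i = steps from u up to the least common ancestor; since all
        # root-to-leaf paths have equal length, dist(u, v) = 2 * i.
        i = next(k for k, w in enumerate(ancestors(u)) if w in anc_v)
        if (label[u], label[v], i) in signature:
            edges.add(frozenset((u, v)))
    return leaf_nodes, edges


# Height-1 tree model: root r with leaves u, v (label 1) and w (label 2).
parent = {"r": None, "u": "r", "v": "r", "w": "r"}
label = {"u": 1, "v": 1, "w": 2}
signature = {(1, 2, 1), (2, 1, 1)}  # join label 1 to label 2 at distance 2
vertices, edges = graph_from_tree_model(parent, label, signature)
```

With this signature the resulting graph is a path with `w` in the middle: `u` and `v` are both adjacent to `w` but not to each other.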
The class of arbitrary tree models of labels and height is denoted and the class of trees contained in these tree models is denoted ; so . The class of arbitrary graphs that have tree models in is denoted , and the class of finite graphs of is denoted . Recalling from [6], a class of finite graphs has shrubdepth at most if it is a subclass of ; analogously, we can define a class of arbitrary graphs to have shrubdepth at most if it is a subclass of . It is easy to see that and are both hereditary classes, that is, closed under induced subgraphs.
We make some observations from the definition of that we will need in Section 3. We see that for every signature , there exists a pair of formulae that when evaluated on a tree in produces the graph of which is a tree model. Specifically: (i) the formula says that is a leaf node; and (ii) the formula says that and are leaves, and for some , it is the case that and are true, and that the distance between and is exactly . The pair is called an interpretation in . Thus defines a function from to , which also we denote as ; so for and as above . Then .
We now make some important observations about . Firstly, the rank of , defined as the maximum rank of the formulae appearing in it, is such that . Next, if , then for an formula of rank in the vocabulary of , there is an formula that we denote , of rank in the vocabulary of such that the following holds. (This is a special case of a more general result called the fundamental theorem of interpretations.)
. As a consequence, if and are trees of , then
Finally for , if is a leaf-hereditary subtree of , then and is an induced subgraph of .
Feferman-Vaught composition:
Let be one of the logics or . Given and structures and from a class of structures (say labeled rooted trees or unlabeled graphs) over a vocabulary , we say and are equivalent, denoted , if and agree on all sentences of rank at most . It is known that the relation has finite index [3]. For , we let denote this index (of the relation) restricted to . Let be a family of structures of with disjoint universes, indexed by an index set of an arbitrary cardinality. Let and be the relational vocabulary consisting of a distinct unary predicate symbol for each equivalence class of the relation over , and containing no other predicate symbols. The type indicator for the family is now defined as a structure such that: (i) the universe of is ; and (ii) for that corresponds to an equivalence class of over , the interpretation of in is the set . Observe that for each , there is exactly one predicate such that is in the interpretation of in . We now have the following theorem from [7]. (This is the special case of and in [7, Theorem 14].)
Theorem 2.1 (Theorem 14, [7]).
Let be a class of structures over a vocabulary . For every sentence over of rank , there exists an sentence over such that if is a family of structures of with disjoint universes for an index set of an arbitrary cardinality, then the following holds:
Further, if is the class of structures of expanded with (all possible interpretations of) new unary predicate symbols, then the rank of is .
Remark 2.2.
In [7], the result is actually stated for which does not contain any variables and whose atomic formulae, instead of being the usual atomic formulae (of the form , for an variable , and where and are variables), are of the “second order” forms and where are variables and is either an variable or a predicate of , with being the arity of . The semantics for these atomic forms are as suggested by their names. Every “usual” formula can be converted into an equivalent formula over the mentioned second order atomic formulae, without any change of quantifier rank (see [7, page 4]). We have hence recalled [7, Theorem 14] in the form stated above in Theorem 2.1 which features as a usual formula.
We provide here the justification for the last statement of Theorem 2.1 which does not appear explicitly in [7] but is indeed a consequence of the proof of [7, Theorem 14]. We refer the reader to [7, Section 3.1, pp. 6 – 9] to find the formulae and other constructions we refer to in our description here.
We first observe in the proof of Lemma 8 of [7], that the “capping” constant for the formula is simply the rank of , since is an FO sentence over a monadic vocabulary (and we also see a similar such result in Lemma 3.3). Then the number mentioned in the proof is at most , whereby the rank of , and hence the rank of , is at most , where denotes the vocabulary of , namely . Call this observation (*).
We now come to the proof of Theorem 14 of [7], and make the following observations about the rank of following the inductive construction of as given in the proof. For the base cases, since is a quantifier-free formula for any , we get that if , then the rank of is 1, and if or , then the rank of is 2 since the width is 0 by our assumption. If is a Boolean combination of a set of formulae, then is the maximum of the ranks of the formulae in the mentioned set. We now come to the nontrivial case when .
We see from [7, page 9, para 2] that the rank of is the maximum of the ranks of where is obtained from (denoted as simply in the proof as a shorthand) by substituting the atoms with the quantifier-free formula for a suitable . Then . It follows from (*) above that ; call this inequality (**). Let denote the equivalence relation that relates two structures over the same vocabulary iff they agree on all sentences (over the vocabulary of the structures) of rank at most . Then the vocabulary of is the set of all equivalence classes of the relation over all structures over the vocabulary of the family (where is as in the statement of [7, Theorem 14]), expanded with (all possible interpretations of) set predicates, where is the number of free variables of and is the rank of . Here we now importantly observe that if the structures of the family come from a class , then it is sufficient to consider just those equivalence classes of the relation that are nonempty when restricted to the structures of expanded with set predicates. Then if is a subformula of a rank sentence over a vocabulary and we are interested only in a given class of structures and expansions of these with set predicates, then the size of the vocabulary of is at most the index of the equivalence relation over the class of structures of expanded with (all possible interpretations of exactly) set predicates. Applying this observation to (**) iteratively, we then get that if and then
showing the last statement of Theorem 2.1.
Ultraproducts:
Given a family of structures over a relational vocabulary and an ultrafilter on the index set , let be the direct (Cartesian) product of the ’s. Let be the equivalence relation on the universe of defined as: if and are tuples of elements from , then if, and only if, . Let denote the equivalence class of under . Then the ultraproduct of is the structure defined as: (i) the universe of is the set ; and (ii) for a ary relation and tuples of for , it holds that if, and only if, . Given the definition of , it can be seen that the presented interpretation of in is well-defined. If for all , then is called the ultrapower of with respect to , and is called the ultraroot of with respect to . Two well-known theorems concerning the ultraproduct are stated below.
Theorem 2.3 (Łoś theorem; Theorem 4.1.9 [3]).
Let and be the direct product and ultraproduct respectively of a family of structures over a vocabulary , with respect to an ultrafilter over an index set . Then for an formula over and elements of for ,
Theorem 2.4 (Theorems 4.1.12 and 6.1.15 [3]).
A class of structures is elementary iff it is closed under isomorphisms, ultraproducts and ultraroots.
For a class of finite structures and , we denote by the class of all sentences that are true in all structures of . We say an arbitrary structure is pseudofinite relative to if ; that is (equivalently) if every sentence of is true in some structure of .
3 Pseudofiniteness of relative to
Following are the central results of this section. Define the function as: and . The core technical result is Theorem 3.2 that shows a relativized pseudofiniteness theorem for . This is then transferred to in Theorem 3.1 using interpretations.
Theorem 3.1.
Let be given. There exists an increasing function such that the following are true for each .

For every graph , there exists such that: (i) ; (ii) is at most ; and (iii) .

The index of the relation over is at most .
Theorem 3.2.
Let be given. There exists an increasing function such that if is the function given by , then the following are true for each .

For every tree , there exists a leaf-hereditary subtree of such that: (i) the heights of and are the same; (ii) is at most ; and (iii) .

The index of the relation over is at most if , and is if .
Proof of Theorem 3.1.
Since , from Section 2 there exists a tree model for and an interpretation such that . Let be the rank of ; we know that . Since can be seen as a tree in , by Theorem 3.2(1), there exists a leaf-hereditary subtree of such that (i) , and (ii) . Since every root-to-leaf path of is also a root-to-leaf path of , and every internal node of is labeled with the label while the leaf nodes are labeled with labels from , we get that and that is a tree model for . From the properties of the interpretation as discussed in Section 2, we infer the following: (i) Since is a leaf-hereditary subtree of , we have ; (ii) Since is the set of leaves of , we have where , and ; (iii) Since , we have .
We now look at the index of the relation over . For a given signature , let denote the subclass of consisting of those graphs that have a tree model . Then is a surjective map from to . For as above, since any equivalence class of the relation over gets mapped by to a subclass of an equivalence class of the relation over , we get by the surjectivity of that for as above. Then . ∎
Towards the proof of Theorem 3.2, we will require the following lemma that can be shown using a simple Ehrenfeucht-Fraïssé game argument.
Lemma 3.3.
Let be a finite vocabulary consisting of only monadic relation symbols. Let be an arbitrary structure over such that every element of is in the interpretation of exactly one predicate of . Let be a given element of , and let and . Then there exists a substructure of such that: (i) contains , (ii) , and (iii) .
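The shrinking step behind Lemma 3.3 can be sketched for finite inputs as follows. This is an illustration under stated assumptions, not the paper's construction: the structure is encoded as a hypothetical `element -> predicate` dictionary (each element lies in exactly one predicate), and the per-predicate threshold `r` is the standard bound from an `r`-round Ehrenfeucht-Fraïssé game on such structures, assumed here because the exact parameters of the lemma are not recoverable from the text. The function name `shrink` is likewise hypothetical.

```python
from collections import defaultdict


def shrink(colour, keep, r):
    """Keep at most r elements per predicate, always retaining `keep`.

    colour: dict element -> the unique predicate containing it (hypothetical encoding)
    keep:   distinguished element that must survive
    r:      assumed per-predicate threshold (the considered rank)
    """
    by_colour = defaultdict(list)
    for x, c in colour.items():
        by_colour[c].append(x)

    kept = {}
    for c, elems in by_colour.items():
        # Start with the distinguished element if it has this colour.
        chosen = [keep] if colour[keep] == c else []
        for x in elems:
            if len(chosen) >= r:
                break
            if x != keep:
                chosen.append(x)
        for x in chosen:
            kept[x] = c
    return kept


# 100 elements split evenly between two predicates P and Q.
structure = {i: ("P" if i % 2 == 0 else "Q") for i in range(100)}
small = shrink(structure, keep=42, r=3)
```

For `r` at least 1, the result has at most `r` elements per predicate, so its size is bounded by `r` times the number of predicates. In the example, `small` has 6 elements and still contains 42. A predicate with fewer than `r` elements is kept whole, which is what lets a duplicator answer every move in an `r`-round game.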
Proof of Theorem 3.2.
We prove the theorem by induction on . The base case of is trivial to see by taking . Assume as induction hypothesis that the statement is true for for .
Consider a tree of height equal to . Let be the forest of labeled rooted trees obtained by removing the root of . Let be the class of rooted forests whose constituent trees belong to ; so . Consider now the type indicator for the family and the sentence that axiomatizes the equivalence class of (this is known to exist [15]). By Theorem 2.1, there is an FO sentence over the vocabulary such that
Now we know from Theorem 2.1 that for any sentence over the vocabulary of , the sentence given by the theorem has rank that is where is the expansion of with new unary predicates. Now there is a natural 1-1 correspondence between and , and two structures are equivalent iff their corresponding structures are. Then the rank of is . Let be a constant such that for all . In fact, as shown in Section 2, one can take and .
From the analysis above then, we have that . Now by induction hypothesis, we see that: (i) if , then ; and (ii) if , then . Then the rank of is at most

if , and

, if .
If is the function given by , then we see that in either case above, the rank of is at most . We also see that is the vocabulary which contains one unary predicate symbol for every equivalence class of the relation over , and only those predicates; then which is equal to if , and at most if . Then for all .
We observe now that is a structure over a finite monadic vocabulary such that each element of its universe is in the interpretation of exactly one predicate in the vocabulary. Let for be such that the height of is equal to (there must be such a tree in since the height of is equal to ). Recall that is the universe of . Then by Lemma 3.3, taking and in the lemma, we get that there exists a substructure of such that (i) contains , (ii) , and (iii) . Then can be seen as the type indicator of the family for a subset that contains . Then by Theorem 2.1, we have
Since (i) , (ii) , and (iii) , we have and therefore