Structural information is captured very well by homomorphism counts. Indeed, an old theorem due to Lovász states that two graphs $G, H$ are isomorphic if and only if $\hom(F, G) = \hom(F, H)$ for all graphs $F$. Here $\hom(F, G)$ denotes the number of homomorphisms from graph $F$ to graph $G$; homomorphisms are mappings between vertices that preserve adjacency. This simple theorem is quite useful and can be seen as the starting point for the theory of graph limits [9, 31, 32]: by associating each graph $G$ with the vector $(\hom(F, G))_{F}$ of its homomorphism counts, we map graphs into an infinite dimensional real vector space, which can be turned into a Hilbert space by defining a suitable inner product. This transformation enables us to analyse graphs with methods of linear algebra and functional analysis and, for example, to consider convergent sequences of graphs and their limits, called graphons (see ). Vector embeddings of graphs are also crucial for applying machine learning methods to graphs. Notably, there is a close connection between homomorphism counts and so-called graph kernels (e.g. [40, 24]) and graph neural networks (e.g. [33, 36]).
However, not only the full homomorphism vector of a graph $G$, but also its projections on natural subspaces capture very interesting information about $G$. For a class $\mathcal{F}$ of graphs, we consider the projection of the homomorphism vector of $G$ onto the subspace indexed by the graphs in $\mathcal{F}$. Following , we call graphs $G, H$ homomorphism-indistinguishable over $\mathcal{F}$ if $\hom(F, G) = \hom(F, H)$ for all $F \in \mathcal{F}$. Dvořák proved that two graphs are homomorphism-indistinguishable over the class of graphs of tree width at most $k$ if and only if they are not distinguishable by the $k$-dimensional Weisfeiler-Leman algorithm, a well-known combinatorial isomorphism test. As we can always restrict homomorphism vectors to connected graphs without loss of information, this implies that two graphs are homomorphism-indistinguishable over the class of trees if and only if they are not distinguishable by the $1$-dimensional Weisfeiler-Leman algorithm, which is also known as colour refinement and naive vertex classification. Via well-known characterisations of Weisfeiler-Leman indistinguishability in terms of the solvability of certain natural systems of linear inequalities [2, 20, 34] or systems of polynomial equations or inequalities [3, 6, 19], this also yields algebraic characterisations of homomorphism indistinguishability over classes of bounded tree width. A related algebraic characterisation was obtained for homomorphism indistinguishability over the class of paths . It is well-known (though usually phrased differently) that two graphs are homomorphism-indistinguishable over the class of cycles if and only if they are co-spectral, that is, their adjacency matrices have the same eigenvalues with the same multiplicities. Böker proved that two graphs are homomorphism-indistinguishable over the class of bipartite graphs if and only if they have isomorphic bipartite double covers. The most recent addition to this picture is a result due to Mančinska and Roberson stating that two graphs are homomorphism-indistinguishable over the class of all planar graphs if and only if they are quantum isomorphic. Quantum isomorphism, introduced in , is a complicated notion that is based on similar systems of equations as those characterising homomorphism indistinguishability over graphs of bounded tree width, but with non-commutative variables ranging over the elements of some $C^*$-algebra.
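The statement about cycles and co-spectrality can be checked numerically on a small example. The sketch below relies on the standard fact that the trace of the $k$-th power of the adjacency matrix counts closed walks of length $k$, which for $k \ge 3$ is the number of homomorphisms from the cycle of length $k$. The two example graphs (a star with four leaves, and a 4-cycle plus an isolated vertex) and all function names are illustrative choices, not taken from the paper.

```python
def adjacency(n, edges):
    """Adjacency matrix of an undirected graph on vertices 0..n-1."""
    A = [[0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = A[v][u] = 1
    return A

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def closed_walks(A, max_len):
    """Return [tr(A^1), ..., tr(A^max_len)]; for k >= 3, tr(A^k) = hom(C_k, G)."""
    traces, power = [], A
    for _ in range(max_len):
        traces.append(sum(power[i][i] for i in range(len(A))))
        power = matmul(power, A)
    return traces

def walks(A, length):
    """Number of walks with `length` edges, i.e. hom(path with length+1 vertices, G)."""
    n = len(A)
    vec = [1] * n
    for _ in range(length):
        vec = [sum(A[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return sum(vec)

star = adjacency(5, [(0, 1), (0, 2), (0, 3), (0, 4)])
c4_plus_k1 = adjacency(5, [(0, 1), (1, 2), (2, 3), (3, 0)])

# Equal traces of the first five powers force equal spectra for 5-vertex graphs.
print(closed_walks(star, 6) == closed_walks(c4_plus_k1, 6))
# The same two graphs are separated by a path on three vertices.
print(walks(star, 2), walks(c4_plus_k1, 2))
```

On this pair the closed-walk counts agree for every length, while the path count differs, so indistinguishability over cycles does not imply indistinguishability over paths.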
What we see emerging is a rich theory connecting combinatorics, structural graph theory, and algebraic graph theory. It turns out that logic is also an integral part of this theory, not only because some of the algebraic characterisations of homomorphism indistinguishability can be phrased in terms of propositional proof complexity [3, 6, 19], but also because there is a well-known characterisation of the Weisfeiler-Leman algorithm and hence homomorphism indistinguishability over classes of bounded tree width in terms of logical equivalence. The logic C is the extension of first-order logic by counting quantifiers of the form $\exists^{\ge p} x$ (“there exist at least $p$ elements $x$”). Every C-formula is equivalent to a formula of plain first-order logic. However, we are mainly interested in fragments of the logic obtained by restricting the quantifier rank or the number of variables of formulas, and the translation from C to first-order logic preserves neither the quantifier rank nor the number of variables (see Remark 2.1). The logic C and its finite variable fragments were first considered by Immerman in the 1980s [22, 23], and they have played an important role in finite model theory since then. Cai, Fürer, and Immerman showed that equivalence in the $(k+1)$-variable fragment $\mathsf{C}^{k+1}$ of C corresponds to indistinguishability by the $k$-dimensional Weisfeiler-Leman algorithm. Thus, two graphs are $\mathsf{C}^{k+1}$-equivalent if and only if they are homomorphism-indistinguishable over the class of graphs of tree width at most $k$.
Rather than restricting the number of variables in a formula, it is, arguably, even more fundamental to restrict the quantifier rank (maximum number of nested quantifiers in a formula). Our main result is the following characterisation of equivalence in the fragment $\mathsf{C}_k$ of C consisting of all formulas of quantifier rank at most $k$.
For all $k$ and all graphs $G, H$ the following are equivalent.
(1) $G$ and $H$ are homomorphism-indistinguishable over the class of all graphs of tree depth at most $k$.
(2) $G$ and $H$ satisfy the same $\mathsf{C}_k$-sentences.
Tree depth, introduced by Nešetřil and Ossona de Mendez , is a structural graph parameter that has received a lot of attention in recent years (e.g. [5, 10, 12, 17, 16]). Our result adds a characterisation of homomorphism indistinguishability over classes of bounded tree depth to the theory of homomorphism indistinguishability sketched above.
However, our result is also interesting from a purely logical point of view. It can be seen simultaneously as a locality theorem and as a quantifier elimination theorem. Locality, because as noted above, when considering homomorphism indistinguishability, we can restrict our attention to connected graphs. Connected graphs of tree depth at most are known to have a radius of at most (see ), and hence their homomorphic images will always be contained in neighbourhoods of radius at most . This means that homomorphism indistinguishability over graphs of tree depth and thus -equivalence only depend on neighbourhoods of radius at most . This consequence of our main theorem was known before , but we believe that our approach sheds some new light on locality. It should be seen in the context of other recent and not-so-recent locality results for counting logics [27, 28, 29, 25, 26, 39]. Let us remark (as already noted by Libkin ) that the exact choice of a counting extension of first-order logic is not so important when we only study equivalence between structures. (Footnote 1: The reason is that over a fixed finite graph, formulas of other counting extensions of first-order logic, such as the logic of , are equivalent to C-formulas of the same quantifier rank.)
Our theorem is a quantifier-elimination result, because it says that we can replace the nested quantifiers of a $\mathsf{C}_k$-formula, which may involve alternations between existential and universal quantifiers, by flat, unnested homomorphism counts. While new in this context, replacing quantifier alternation by counting is a common theme in complexity theory, most prominently represented by Toda’s theorem that $\mathsf{P}^{\#\mathsf{P}}$ contains the polynomial hierarchy.
The proof of our theorem is harder than one might expect in view of the numerous previous results on homomorphism indistinguishability. The overall structure of the proof is as follows: in the first step we use linear algebraic techniques that go back to Lovász to show that homomorphism counts can be expressed by counts of more restrictive structure-preserving mappings. In the second step, the connection to logic is established via an Ehrenfeucht-Fraïssé game and interpolation techniques. To carry out the first step, we need to prove the invertibility of certain homomorphism matrices, which we achieve by a decomposition into lower-triangular and upper-triangular matrices of full rank. The precise nature of this decomposition is what makes the proof difficult; we need to go through various intermediate mappings obeying certain carefully chosen constraints.
The structure of the paper is simple: we prove the theorem and then discuss some of its consequences.
2.1 Graphs and Homomorphisms
We always assume graphs to be undirected and vertex-coloured. Thus a graph is a triple where is a finite set, , and for some set whose elements we view as “colours”. (Footnote 2: For clarity of the presentation, we decided to focus on undirected graphs here. The result can be extended to arbitrary relational structures, see Section 5.3 for a brief discussion.) The order of a graph is . A graph is a subgraph of a graph (we write ) if , , and for all .
A homomorphism from a graph to a graph is a mapping such that for all and for all . We write to denote that is a homomorphism from to . We denote the number of homomorphisms from to by . Graphs are homomorphism-indistinguishable over a class of graphs if for all ; otherwise they are homomorphism-distinguishable over .
Observe that for a disconnected graph $F$ with connected components $F_1, \ldots, F_m$ and for an arbitrary graph $G$ it holds that $\hom(F, G) = \prod_{i=1}^{m} \hom(F_i, G)$. This means that if $\mathcal{F}$ is a class of graphs such that all connected components of graphs in $\mathcal{F}$ belong to $\mathcal{F}$ as well, then graphs $G, H$ are homomorphism-indistinguishable over $\mathcal{F}$ if and only if they are homomorphism-indistinguishable over the class of all connected graphs in $\mathcal{F}$.
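The definition of a homomorphism and the observation about connected components can be illustrated by a brute-force counter. The following is a minimal sketch for tiny graphs only: it enumerates all vertex mappings and keeps those that preserve colours and edges. The dictionary encoding and the names `make_graph` and `hom` are assumptions made for this example, not notation from the paper.

```python
from itertools import product

def make_graph(vertices, edges, colours=None):
    """Encode a vertex-coloured undirected graph (hypothetical encoding)."""
    vertices = list(vertices)
    colours = colours or {v: 0 for v in vertices}
    return {"V": vertices, "E": {frozenset(e) for e in edges}, "col": colours}

def hom(F, G):
    """Count colour- and edge-preserving maps from V(F) to V(G) by enumeration."""
    count = 0
    for image in product(G["V"], repeat=len(F["V"])):
        h = dict(zip(F["V"], image))
        if any(F["col"][v] != G["col"][h[v]] for v in F["V"]):
            continue
        # frozenset((x, x)) has one element and is never an edge, so loops are excluded.
        if all(frozenset((h[u], h[v])) in G["E"] for u, v in map(tuple, F["E"])):
            count += 1
    return count

# F is the disjoint union of a path on three vertices and a single edge.
path3 = make_graph("abc", [("a", "b"), ("b", "c")])
edge = make_graph("de", [("d", "e")])
F = make_graph("abcde", [("a", "b"), ("b", "c"), ("d", "e")])
G = make_graph([0, 1, 2], [(0, 1), (1, 2)])

print(hom(F, G), hom(path3, G) * hom(edge, G))
```

The count for the disconnected graph equals the product of the counts for its components, which is why restricting to connected graphs loses no information.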
A homomorphism is an embedding (or monomorphism) from to (we write ) if it is injective. A homomorphism is an epimorphism from to (we write ) if is surjective and for every edge there is an edge such that and . (Note that not every surjective homomorphism is an epimorphism.) If is an epimorphism, then is a homomorphic image of . By and we denote the numbers of embeddings and epimorphisms from to .
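The difference between surjective homomorphisms and epimorphisms can be seen on a small case. The sketch below is uncoloured for brevity and uses hypothetical helper names. It enumerates homomorphisms from a path on four vertices to a triangle and classifies them according to the definitions above: an embedding is injective, and an epimorphism must hit every vertex and every edge of the target.

```python
from itertools import product

def homomorphisms(F_vertices, F_edges, G_vertices, G_edges):
    """Yield every edge-preserving map from F to G as a dict (brute force)."""
    G_adj = {frozenset(e) for e in G_edges}
    for image in product(G_vertices, repeat=len(F_vertices)):
        h = dict(zip(F_vertices, image))
        if all(frozenset((h[u], h[v])) in G_adj for u, v in F_edges):
            yield h

def is_embedding(h):
    return len(set(h.values())) == len(h)

def is_surjective(h, G_vertices):
    return set(h.values()) == set(G_vertices)

def is_epimorphism(h, F_edges, G_vertices, G_edges):
    """Surjective on vertices and every edge of G is the image of an edge of F."""
    covered = {frozenset((h[u], h[v])) for u, v in F_edges}
    return is_surjective(h, G_vertices) and covered == {frozenset(e) for e in G_edges}

path_v, path_e = [0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)]
tri_v, tri_e = ["a", "b", "c"], [("a", "b"), ("b", "c"), ("c", "a")]

homs = list(homomorphisms(path_v, path_e, tri_v, tri_e))
n_surjective = sum(is_surjective(h, tri_v) for h in homs)
n_epi = sum(is_epimorphism(h, path_e, tri_v, tri_e) for h in homs)
n_emb = sum(is_embedding(h) for h in homs)
print(len(homs), n_surjective, n_epi, n_emb)
```

In this example many homomorphisms are surjective on vertices but miss one edge of the triangle, so they are not epimorphisms in the sense used here.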
If is a partial mapping from to , then by we denote the number of homomorphisms from to that extend . In particular, for vertices and , by we denote the number of homomorphisms with . We use similar notations for embeddings, epimorphisms, and other types of mappings that we shall introduce later.
2.2 First-Order Logic with Counting
To define the syntax of the logic C, we assume that we have an infinite supply of variables, which we denote by and variants such as . Variables range over the vertices of a graph. Atomic formulas (in the language of graphs) are of the form , (“there is an edge between ”), and for colours (“ has colour ”). C-formulas are constructed from atomic formulas using negation , disjunction , and counting quantifiers , where , is a variable, and , are formulas.
An occurrence of a variable is free in a formula if it is outside the range of all quantifications . A sentence is a formula without any free variables. We often write to indicate that the free variables of are among . (Not all of these variables are required to appear in .) For a formula , a graph , and vertices , we write to denote that satisfies if the variables are interpreted by the vertices . We also write and for tuples , . Now we can define the semantics of the logic C inductively in the obvious way. In particular, for we let if there are mutually distinct such that for all .
The quantifier rank of a C-formula is defined inductively by letting for all atomic formulas and , , and . By we denote the fragment of C consisting of all formulas of quantifier rank at most . Graphs are -equivalent if for all -sentences . We write to denote that and are -equivalent. We extend this notation to formulas with free variables, writing for tuples to denote that for all -formulas it holds that .
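The syntax, semantics, and quantifier rank of C can be made concrete with a small evaluator. The sketch below assumes a hypothetical encoding of formulas as nested tuples (`"eq"`, `"edge"`, `"colour"`, `"not"`, `"or"`, and `"atleast"` for a counting quantifier with threshold `p`), and it assumes the usual inductive rank: atoms have rank 0, negation keeps the rank, disjunction takes the maximum, and a counting quantifier adds one.

```python
def make_graph(vertices, edges, colours=None):
    """Hypothetical encoding of a vertex-coloured undirected graph."""
    vertices = list(vertices)
    colours = colours or {v: 0 for v in vertices}
    return {"V": vertices, "E": {frozenset(e) for e in edges}, "col": colours}

def holds(phi, G, assignment):
    """Decide whether G satisfies phi under the given variable assignment."""
    op = phi[0]
    if op == "eq":
        return assignment[phi[1]] == assignment[phi[2]]
    if op == "edge":
        return frozenset((assignment[phi[1]], assignment[phi[2]])) in G["E"]
    if op == "colour":
        return G["col"][assignment[phi[1]]] == phi[2]
    if op == "not":
        return not holds(phi[1], G, assignment)
    if op == "or":
        return holds(phi[1], G, assignment) or holds(phi[2], G, assignment)
    if op == "atleast":  # ("atleast", p, x, psi): at least p vertices x satisfy psi
        _, p, x, psi = phi
        witnesses = sum(holds(psi, G, {**assignment, x: v}) for v in G["V"])
        return witnesses >= p
    raise ValueError(f"unknown connective: {op}")

def quantifier_rank(phi):
    op = phi[0]
    if op in ("eq", "edge", "colour"):
        return 0
    if op == "not":
        return quantifier_rank(phi[1])
    if op == "or":
        return max(quantifier_rank(phi[1]), quantifier_rank(phi[2]))
    if op == "atleast":
        return 1 + quantifier_rank(phi[3])
    raise ValueError(f"unknown connective: {op}")

# "At least two vertices have at least three neighbours."
phi = ("atleast", 2, "x", ("atleast", 3, "y", ("edge", "x", "y")))
k4 = make_graph(range(4), [(i, j) for i in range(4) for j in range(i + 1, 4)])
star = make_graph(range(5), [(0, 1), (0, 2), (0, 3), (0, 4)])
print(quantifier_rank(phi), holds(phi, k4, {}), holds(phi, star, {}))
```

The example sentence has quantifier rank 2 and separates the complete graph on four vertices from the star with four leaves.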
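Interpreting the usual existential quantifier $\exists x$ as $\exists^{\ge 1} x$, we can view first-order logic FO as a fragment of C. Observe that C has the same expressive power as its fragment FO, because $\exists^{\ge p} x\, \varphi(x)$ can be equivalently expressed as
$\exists x_1 \cdots \exists x_p \Big( \bigwedge_{1 \le i < j \le p} x_i \neq x_j \wedge \bigwedge_{i=1}^{p} \varphi(x_i) \Big).$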
However, this increases the quantifier rank. It is easy to see that for every , is strictly more expressive than the fragment of first-order logic consisting of all formulas of quantifier rank at most . Actually, for every the -formula is not equivalent to any -formula.
2.3 The Bijective Pebble Game
The bijective pebble game, introduced by Hella , gives a combinatorial characterisation of equivalence in the logic C and its fragments .
Let be graphs of the same order. The bijective pebble game on and is played by two players called Spoiler and Duplicator. Positions of the game are pairs where for some . A play of the game consists of a sequence of rounds, starting from some initial position , where and for some . The default initial position is the “empty position” . In round of the game, Duplicator chooses a bijection . Then Spoiler chooses a , and we let . The position after round is . In the $k$-round game, the play ends after $k$ rounds, and Duplicator wins the play if is a local isomorphism from to , that is, for all the following conditions are satisfied:
If is not a local isomorphism, then Spoiler wins the play.
We can now define winning strategies for Spoiler and Duplicator in the usual way.
The following lemma, which links the bijective pebble game to the logic C, is a minor variant of a theorem due to Hella  and of the standard characterisation of first-order logic in terms of Ehrenfeucht-Fraïssé games (see, for example, ).
For all , all graphs of the same order, and all the following are equivalent.
Duplicator has a winning strategy for the -round bijective pebble game on with initial position .
If we do not specify the initial position of the game, we always assume it is the empty position . Thus the lemma implies that Duplicator has a winning strategy for the -round bijective pebble game on if and only if .
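For very small graphs, the existence of a winning strategy for Duplicator can be decided by comparing recursively defined "types" of positions. This is a reformulation, not the game itself. It rests on the assumption that a suitable bijection in a round exists exactly when the multisets of types of the one-vertex extensions coincide on both sides. The sketch also assumes that a local isomorphism means agreement on equalities, edges, and colours. All names are hypothetical.

```python
def make_graph(vertices, edges, colours=None):
    """Hypothetical encoding of a vertex-coloured undirected graph."""
    vertices = list(vertices)
    colours = colours or {v: 0 for v in vertices}
    return {"V": vertices, "E": {frozenset(e) for e in edges}, "col": colours}

def atomic_type(G, tup):
    """Colours plus the equality/adjacency pattern of a tuple of pebbled vertices."""
    m = len(tup)
    pattern = tuple((tup[i] == tup[j], frozenset((tup[i], tup[j])) in G["E"])
                    for i in range(m) for j in range(i + 1, m))
    return (tuple(G["col"][v] for v in tup), pattern)

def game_type(G, tup, rounds):
    """Type of a position: atomic type plus the sorted multiset of extension types."""
    base = atomic_type(G, tup)
    if rounds == 0:
        return base
    extensions = sorted(game_type(G, tup + (v,), rounds - 1) for v in G["V"])
    return (base, tuple(extensions))

def duplicator_wins(G, H, rounds, v=(), w=()):
    """True if the positions v in G and w in H have the same `rounds`-round type."""
    return (len(G["V"]) == len(H["V"])
            and game_type(G, v, rounds) == game_type(H, w, rounds))

cycle6 = make_graph(range(6), [(i, (i + 1) % 6) for i in range(6)])
two_triangles = make_graph(range(6), [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)])
print(duplicator_wins(cycle6, two_triangles, 2),
      duplicator_wins(cycle6, two_triangles, 3))
```

The 6-cycle and the disjoint union of two triangles agree for two rounds but differ at three rounds, which matches the fact that a triangle has tree depth 3 and maps homomorphically into only one of the two graphs.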
2.4 Graphs of Bounded Tree Depth
It will be convenient in this paper to view trees and forests as partially ordered sets. A forest is a pair consisting of a (finite) vertex set and a partial order on such that for every the set is a chain, that is, its elements are pairwise comparable. We denote the strict partial order associated with by . If and there is no such that and , then we say that is a child of and that is the parent of . This gives us a one-to-one correspondence between forests viewed as partially ordered sets and rooted forests in the usual graph-theoretic sense. The minimal elements of are called the roots of . The height of is the length of the longest chain in . Note that, differing from the standard graph theoretic definition, we count the number of vertices (and not the number of edges) on a path from the root to a leaf. In particular, a forest consisting of roots only has height $1$.
A forest with a unique root is a tree. We denote the root of a tree by . A subtree of a tree is a tree with such that is the restriction of to . Thus a subtree is an induced substructure that is a tree itself. Observe that a set induces a subtree of if and only if has a unique -minimal element. This notion of subtree does not coincide with the usual graph-theoretic notion of a subtree of a tree. In particular, elements of a subtree can be interleaved with elements that do not belong to the subtree.
An elimination forest of a graph is a forest such that and for every edge , either or . If an elimination forest of is a tree, we also call it an elimination tree of . The tree depth of a graph is the minimum such that has an elimination forest of height . We denote the class of all graphs of tree depth at most by and the class of all connected graphs in by .
Lemma 2.3 (Nešetřil and Ossona de Mendez ).
consists of all -vertex graphs.
For , is the class of all connected graphs that have a vertex such that all connected components of are in .
For all , is the class of disjoint unions of graphs in .
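The recursive characterisation in the lemma translates directly into an exponential-time procedure for computing tree depth on small graphs. The sketch below uses a hypothetical function name and input encoding. It follows the three cases: a single vertex has tree depth 1, a connected graph is handled by removing the best possible vertex, and a disconnected graph takes the maximum over its components.

```python
from functools import lru_cache

def tree_depth(vertices, edges):
    """Tree depth of a small undirected graph via the recursive characterisation."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    def components(vertex_set):
        """Connected components of the subgraph induced on vertex_set."""
        remaining, comps = set(vertex_set), []
        while remaining:
            stack, comp = [remaining.pop()], set()
            while stack:
                u = stack.pop()
                comp.add(u)
                for w in adj[u]:
                    if w in remaining:
                        remaining.remove(w)
                        stack.append(w)
            comps.append(frozenset(comp))
        return comps

    @lru_cache(maxsize=None)
    def td(vertex_set):
        if len(vertex_set) <= 1:
            return len(vertex_set)
        comps = components(vertex_set)
        if len(comps) > 1:
            return max(td(c) for c in comps)
        # Connected: delete one vertex (the root of an elimination tree) and recurse.
        return 1 + min(td(vertex_set - {v}) for v in vertex_set)

    return td(frozenset(vertices))

def path(n):
    return list(range(n)), [(i, i + 1) for i in range(n - 1)]

print([tree_depth(*path(n)) for n in (1, 3, 4, 7, 8)])
```

Paths show the logarithmic growth of tree depth, which is consistent with the convention above that height counts vertices rather than edges.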
We let be the class of all pairs where is a graph and an elimination tree of . We usually denote elements of by . (Footnote 3: The reader may wonder why we chose the letter “d” (in and ). One reason is that it picks up the “d” in depth and that is close to . Or think of “d” as standing for “decomposed graph”.)
For , we let , and , , , , and . We call the root of . The height of is the height of . We denote the class of all of height at most by . Observe that a connected graph is in if and only if there is a such that .
There is a strange asymmetry in the definition of : for pairs , we require to be a tree, not an arbitrary forest, but we do not require the graph to be connected. Yet this definition is carefully chosen. In particular, if we required to be connected then we would run into difficulties in the proof of Lemma 4.6.
3 Past-Preserving Homomorphisms
Let , and let be an arbitrary graph. A homomorphism from to is simply a homomorphism from to . We write to denote that is a homomorphism from to , and we let be the number of homomorphisms from to . A homomorphism is an epimorphism (we write ) if it is an epimorphism from to .
A homomorphism is past-injective if for all with we have . If in addition, for all with we have , then is past-preserving. We denote the number of past-injective homomorphisms from to by and the number of past-preserving homomorphisms from to by . In this section, we shall prove that we can compute the numbers of past-preserving homomorphisms to a graph from the numbers of homomorphisms and vice versa. The difficult first step will be to establish an equivalence between the numbers of past-injective homomorphisms and homomorphisms.
The general strategy for establishing such an equivalence, going back to Lovász , is to establish a linear relationship between the corresponding counting vectors, in our case the vectors and the corresponding vector of past-injective homomorphism counts and then show that the matrix relating the two vectors is invertible (this will happen in Lemma 3.2, Corollary 3.3, and Lemma 3.5). On the linear algebra side, we shall write the (infinite) matrix of homomorphism counts as a product of an upper-triangular matrix with nonzero diagonal entries and a lower-triangular matrix with nonzero diagonal entries. This decomposition of the homomorphism matrix corresponds to a decomposition of homomorphisms. The upper triangular matrix is obtained by considering some form of injective homomorphisms, in our case past-injective homomorphisms. The lower triangular matrix corresponds to suitable surjective homomorphisms, in our case shrinking epimorphisms, to be introduced next. The reason that we cannot just work with plain injective and surjective homomorphisms (or rather epimorphisms) is that the homomorphic image of a graph of tree depth at most may have larger tree depth than . However, we shall prove (in Lemma 3.1) that shrinking epimorphisms preserve tree depth.
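The classical form of this strategy, for plain injective homomorphisms rather than the past-injective and shrinking variants used in this paper, can be verified by brute force. The sketch relies on the standard identity that every homomorphism from $F$ factors uniquely through the quotient of $F$ by its kernel partition, followed by an injective homomorphism, so the homomorphism count is the sum of injective counts over all quotients. It is uncoloured and all names are hypothetical.

```python
from itertools import permutations, product

def set_partitions(items):
    """Yield all partitions of a list into non-empty blocks."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        yield [[first]] + partition

def hom(F_vertices, F_edges, G_vertices, G_edges):
    G_adj = {frozenset(e) for e in G_edges}
    return sum(
        all(frozenset((h[u], h[v])) in G_adj for u, v in F_edges)
        for h in (dict(zip(F_vertices, image))
                  for image in product(G_vertices, repeat=len(F_vertices))))

def inj_of_quotient(blocks, F_edges, G_vertices, G_edges):
    """Injective homomorphisms from the quotient of F by `blocks` into G."""
    block_of = {v: i for i, block in enumerate(blocks) for v in block}
    quotient_edges = {(block_of[u], block_of[v]) for u, v in F_edges}
    if any(a == b for a, b in quotient_edges):
        return 0  # the quotient has a loop, so it cannot map into a loopless graph
    G_adj = {frozenset(e) for e in G_edges}
    return sum(
        all(frozenset((image[a], image[b])) in G_adj for a, b in quotient_edges)
        for image in permutations(G_vertices, len(blocks)))

F_v, F_e = [0, 1, 2], [(0, 1), (1, 2)]                       # path on three vertices
G_v, G_e = ["a", "b", "c"], [("a", "b"), ("b", "c"), ("c", "a")]  # triangle

total = sum(inj_of_quotient(p, F_e, G_v, G_e) for p in set_partitions(F_v))
print(hom(F_v, F_e, G_v, G_e), total)
```

Ordering the quotients by number of vertices makes the coefficient matrix of this linear relationship triangular, which is the shape of argument the text describes.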
Let , and let be a graph with (but not necessarily ). A shrinking homomorphism from to is a homomorphism such that for all and is idempotent, that is, for all . We are mainly interested in shrinking epimorphisms. We denote the number of shrinking epimorphisms from to by . Note that if is a shrinking epimorphism from to then for all . Indeed, since is surjective, we have for some and therefore . This implies that for all we have .
To simplify the notation, for graphs we write if and for all . For a we write instead of .
Lemma 3.1.
Let , , and let be a shrinking epimorphism from to . Then induces a subtree on , and this subtree is an elimination tree of of height at most , that is, .
We first prove that is a tree of height at most . Observe that and thus . Hence has a unique -minimal element, and is a tree. Clearly, the height of is at most the height of and hence at most .
It remains to prove that is an elimination tree of . Let . We shall prove that either or . Since is an epimorphism, there is an edge such that and . Then and . Since is an elimination tree of , either or . Without loss of generality we assume . Then . Since the set is a chain in the tree , either or . As is the restriction of to , this implies that or . ∎
Let , and let be a homomorphism from to some graph . Then there is a graph , a shrinking epimorphism , and a past-injective homomorphism such that .
Furthermore, , , and are unique. That is, if and is a shrinking epimorphism and is past-injective such that , then , , and .
Let be the range of . Then the sets , for , form a partition of . For every , let be the -minimal element in . There is at most one such element because is a chain. Note that is idempotent. Let be the graph with vertex set and edge set . Then is a shrinking epimorphism. Hence by Lemma 3.1, the induced subtree is an elimination tree of of height at most .
For all , if then . Thus there is a mapping such that . As is a homomorphism and , the mapping is a homomorphism from to . Indeed, for every edge there is an edge such that and . Then .
To prove that is past-injective, suppose for contradiction that there are such that and . Note that implies . As is the identity on , we have . By the definition of , this means that and are -minimal elements in . Since , it follows that . This is a contradiction.
It remains to prove the uniqueness. Let and a shrinking epimorphism and a past-injective homomorphism such that . If then , because , and because and are surjective. Suppose for contradiction that . Let such that and, subject to this condition, is -minimal.
- Case 1:
Let . Then and, since is idempotent, . Thus
By the minimality of , we have . This implies
Since and and is a tree, either or . Since is past-injective, by (3.) we have neither nor and thus . This is a contradiction.
- Case 2:
Let . Then . Since is idempotent, we have and thus . By the minimality of we have . Hence , which contradicts being past-injective. ∎
Let , and let be a graph. Then
Let , and let be graphs such that for all with it holds that
Let , and let be graphs such that for all with ,
Let be the set of all such that and . In particular, .
Let be an enumeration of such that and for . Observe that for all and that for only if and hence . Let be the matrix with entries . Then is a lower triangular matrix with diagonal entries for all . This implies that is invertible.
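The invertibility claim is the familiar fact that a lower triangular matrix with nonzero diagonal entries can be inverted by forward substitution. The sketch below uses exact rational arithmetic. The matrix entries are made-up illustrative numbers, not the shrinking-epimorphism counts from the proof, and the function name is hypothetical.

```python
from fractions import Fraction

def solve_lower_triangular(A, b):
    """Solve A x = b for a lower triangular A with nonzero diagonal entries."""
    x = []
    for i in range(len(A)):
        if A[i][i] == 0:
            raise ValueError("zero diagonal entry: matrix is not invertible")
        partial = sum(A[i][j] * x[j] for j in range(i))
        x.append((Fraction(b[i]) - partial) / A[i][i])
    return x

A = [[1, 0, 0],
     [2, 3, 0],
     [4, 5, 6]]
x_true = [1, 2, 3]
b = [sum(A[i][j] * x_true[j] for j in range(3)) for i in range(3)]

print(solve_lower_triangular(A, b) == x_true)
```

Because the solution is unique, two vectors with the same image under such a matrix must be equal, which is how equality of one family of counts transfers to the other.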
Let be the vector with entries , and let be the vector with entries . By Corollary 3.3, for every we have
Thus , and since is invertible, .
Now let be the vector with entries , and let be the vector with entries . Then .
By the assumption of the lemma, we have . Thus . In particular,
Let us now move on to past-preserving homomorphisms.
Let , and let be a past-injective homomorphism from to a graph . Then there is a unique graph with such that is an elimination tree of and is a past-preserving homomorphism from to .
Suppose that . We let be the graph with ,
and . Then , because is a homomorphism, and is an elimination tree of , because implies or . Moreover, is past-preserving, because it is past-injective and for we have .
It remains to prove the uniqueness. Let with such that is an elimination tree of and is a past-preserving homomorphism from to . Then , because is a homomorphism from to . Moreover, for all , either or , because is an elimination tree of , and , because is past-preserving. Thus and therefore . ∎
Let , and let be a graph. Then
Let , and let be graphs such that for all with and ,
Let , and let be graphs. Suppose that for all with and we have
Let . Let be the set of all such that and is an elimination tree of . In particular, . Let be an enumeration of such that and for . Let be the matrix with entries if and