# Amalgamation is PSPACE-hard

The finite models of a universal sentence Φ in a finite relational signature are the age of a homogeneous structure if and only if Φ has the amalgamation property. We prove that the computational problem whether a given universal sentence Φ has the amalgamation property is PSPACE-hard, even if Φ is additionally Horn and the signature of Φ only contains relation symbols of arity at most three. The decidability of the problem remains open.

## 1 Introduction

A relational structure is called homogeneous if every isomorphism between finite substructures can be extended to an automorphism. In model theory, homogeneous structures with a finite relational signature are an important source of structures with a large automorphism group and quantifier elimination. Famous examples are the ordered rational numbers $(\mathbb{Q};<)$, the random graph, and the countable homogeneous poset that embeds all finite posets.

Fraïssé proved that the age of a homogeneous structure, i.e., the class of all finite structures that embed into it, forms an amalgamation class. Conversely, for every amalgamation class there exists a countable homogeneous structure whose age equals that class, and this structure is unique up to isomorphism. Hence, a countable homogeneous structure is uniquely given by its age.

All countable homogeneous posets have been classified by Schmerl [Sch79], all countable homogeneous tournaments by Lachlan [Lac84], all countable homogeneous undirected graphs by Lachlan and Woodrow [LW80], and all countable homogeneous digraphs by Cherlin [Che98]. There are only finitely many homogeneous tournaments and posets, and only countably many homogeneous graphs. It is an interesting question whether countable homogeneous structures can be classified in general. However, it is not entirely clear what one could mean by ‘classify’ in this context. There are uncountably many countable homogeneous digraphs, but only countably many of them do not have the free amalgamation property, so Cherlin’s result about the homogeneous directed graphs can still be considered a classification.

For many of the well-known examples of homogeneous structures, the age can be described by finitely many forbidden substructures. Consider a finite set of finite structures with a finite relational signature $\tau$, and the class of all finite $\tau$-structures that contain no isomorphic copy of a structure from this set. For binary $\tau$, long before the second and third author were born, Knight and Lachlan [KL87] (page 227) proved the following claim.

“Claim: (…) we can effectively check whether is an amalgamation class.”

They proceed

“We do not know whether the claim is true for languages of arbitrary arity.”

This problem is still open and still relevant; it can also be found in the recent monograph of Cherlin [Che20] (Problem 6). We prove that the problem is undecidable. This shows that we may not expect a solution to the general classification problem for countable homogeneous structures that would be sufficiently explicit to help decide whether concretely given classes of finite structures are amalgamation classes. Our proof is by a reduction from the undecidability of the inclusion problem for context-free languages. Our method of producing classes of finite relational structures (which might or might not be amalgamation classes) from pairs of context-free languages might be an interesting source of example structures for many other open problems about homogeneous structures.

## 2 Preliminaries

The set $\{1, \dots, n\}$ is denoted by $[n]$. We use the bar notation for tuples; for a tuple $\bar{t}$ indexed by a set $I$, the value of $\bar{t}$ at the position $i \in I$ is denoted by $\bar{t}[i]$. We always use capital Fraktur letters $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$, etc. to denote structures, and the corresponding capital Roman letters $A$, $B$, $C$, etc. for their domains.

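Let $\tau$ be a finite relational signature. A class $\mathcal{C}$ of relational $\tau$-structures has the amalgamation property (AP) if, whenever there exist embeddings $f_i \colon \mathfrak{A} \to \mathfrak{B}_i$ ($i \in [2]$) for some $\mathfrak{A}, \mathfrak{B}_1, \mathfrak{B}_2 \in \mathcal{C}$, there exist $\mathfrak{C} \in \mathcal{C}$ and embeddings $g_i \colon \mathfrak{B}_i \to \mathfrak{C}$ ($i \in [2]$) such that $g_1 \circ f_1 = g_2 \circ f_2$. A class of finite relational $\tau$-structures is called an amalgamation class if it is closed under isomorphism and substructures and has the AP.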

If is a -structure and is a -sentence, we write if satisfies . If and are -sentences, we write if every -structure that satisfies also satisfies . For universal sentences, this is equivalent to the statement that every finite -structure that satisfies also satisfies , by a standard application of the compactness theorem of first-order logic. If is a -sentence, we denote the class of all finite models of by . We sometimes omit universal quantifiers in universal sentences (all first-order variables are then implicitly universally quantified). It is well-known that a class of finite -structures can be described by forbidding isomorphic copies of fixed finitely many finite -structures if and only if for some universal -sentence . This allows us to reformulate the decision problem from the introduction as follows:

INPUT: A universal sentence $\Phi$ in a finite relational signature.
QUESTION: Does the class of all finite models of $\Phi$ have the AP?

We assume that equality as well as the symbol $\bot$ for falsity are always available when building formulas; thus, atomic formulas are of the form $\bot$, $x = y$ for variables $x, y$, and $R(x_1, \dots, x_k)$ for some relation symbol $R$ of arity $k$ and variables $x_1, \dots, x_k$. A universal sentence $\Phi$ is called Horn if its quantifier-free part is a conjunction of disjunctions of positive or negative atomic formulas, each having exactly one positive disjunct. Equivalently, every conjunct of $\Phi$ can be written as a Horn implication

$$(\phi_1 \wedge \cdots \wedge \phi_n) \Rightarrow \phi_0$$

where are atomic formulas (possibly ).

###### Definition 1

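Let $\Phi$ be a universal Horn sentence and $\psi$ a Horn implication, both in a fixed signature $\tau$. An SLD-derivation of length $k$ of $\psi$ from $\Phi$ is a finite sequence of Horn implications $\psi^0, \dots, \psi^k = \psi$ such that $\psi^0$ is a conjunct in $\Phi$ and each $\psi^i$ ($i \in [k]$) is a binary resolvent of $\psi^{i-1}$ and a conjunct $\phi^i$ from $\Phi$, i.e., for some $j$ other than $0$, we have

$$\begin{array}{ll}
\psi^{i-1} & (\neg\psi^{i-1}_1 \vee \cdots \vee \neg\psi^{i-1}_j \vee \cdots \vee \neg\psi^{i-1}_{n_{i-1}} \vee \psi^{i-1}_0) \\
\phi^{i} & (\neg\phi^{i}_1 \vee \cdots \vee \neg\phi^{i}_{m_i} \vee \psi^{i-1}_j)
\end{array}$$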

There exists an SLD-deduction of $\psi$ from $\Phi$, written as $\Phi \vdash \psi$, if $\psi$ is a tautology or if there exists an SLD-derivation from $\Phi$ of a Horn implication $\psi'$ such that, up to a substitution of variables, every disjunct of $\psi'$ other than $\bot$ is contained in $\psi$.

The following theorem presents a fundamental property of universal Horn sentences.

###### Theorem 2.1 (Theorem 7.10 in [NCdW97])

Let $\Phi$ be a universal Horn sentence and $\psi$ a Horn implication, both in a fixed signature $\tau$. Then

$$\Phi \models \psi \quad\text{iff}\quad \Phi \vdash \psi.$$
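For small instances, entailment of a Horn implication can also be tested bottom-up. The following is a minimal sketch using forward chaining rather than SLD-resolution; it is not taken from the paper. It assumes that no equality atoms occur, that every variable in the conclusion of a rule also occurs in its premise, and that the queried conclusion only mentions variables of the queried premise. The names `match`, `closure`, and `entails` are hypothetical helpers introduced for illustration.

```python
BOTTOM = ("FALSE", ())  # stands for the falsity symbol


def match(premise, facts, subst):
    """Yield every extension of subst that maps all premise atoms into facts."""
    if not premise:
        yield subst
        return
    (rel, args), rest = premise[0], premise[1:]
    for frel, fargs in facts:
        if frel != rel or len(fargs) != len(args):
            continue
        new = dict(subst)
        # Bind each variable, rejecting bindings that clash with earlier ones.
        if all(new.setdefault(v, c) == c for v, c in zip(args, fargs)):
            yield from match(rest, facts, new)


def closure(rules, atoms):
    """Least set of atoms containing `atoms` and closed under the Horn rules."""
    facts = set(atoms)
    changed = True
    while changed:
        changed = False
        for premise, (crel, cargs) in rules:
            # Materialize the matches before adding new facts.
            for s in list(match(premise, frozenset(facts), {})):
                new_fact = (crel, tuple(s[v] for v in cargs))
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


def entails(rules, premise, conclusion):
    """Test whether the rules entail (premise => conclusion), variables read as constants."""
    facts = closure(rules, premise)
    return BOTTOM in facts or conclusion in facts


# Transitivity of E as a single Horn implication.
transitivity = [([("E", ("x", "y")), ("E", ("y", "z"))], ("E", ("x", "z")))]
chain = [("E", ("a", "b")), ("E", ("b", "c")), ("E", ("c", "d"))]
print(entails(transitivity, chain, ("E", ("a", "d"))))
print(entails(transitivity, chain, ("E", ("d", "a"))))
```

The variables of the queried premise are treated as fresh constants, so the closure plays the role of a canonical model of the premise.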

## 3 Universal Horn Sentences and the AP

This section contains some important observations about the AP in the context of universal Horn sentences which we later use in the proof of our main result. Let $\tau$ be a relational signature. We say that a Horn implication $\phi$ is complete if the graph on the variables occurring in $\phi$, in which two variables $x$ and $y$ form an edge if they appear jointly in a conjunct of the premise of $\phi$, is a complete graph in the usual graph-theoretic sense.

In the following, we fix a finite relational signature . Let be relational -structures with for . W.l.o.g. we may assume that , otherwise we rename elements accordingly. Then the free amalgam of is defined as the union of , and .

###### Proposition 1

Let $\Phi$ be a conjunction of complete Horn implications. Then the class of all finite models of $\Phi$ has the AP.

###### Proof

The class of all finite models of $\Phi$ is closed under the formation of free amalgams, and therefore has the AP. ∎
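The role of completeness can be illustrated on a toy example. The sketch below is an illustration under stated assumptions, not the paper's construction: a structure is a pair of a domain and a set of atoms, the two structures are assumed to intersect exactly in the common substructure, and `free_amalgam` and `satisfies` are hypothetical helper names. It forms a free amalgam and checks it against a complete Horn implication (no 2-cycles) and against a non-complete one (transitivity).

```python
BOTTOM = ("FALSE", ())  # stands for the falsity symbol


def match(premise, facts, subst):
    """Yield every extension of subst that maps all premise atoms into facts."""
    if not premise:
        yield subst
        return
    (rel, args), rest = premise[0], premise[1:]
    for frel, fargs in facts:
        if frel != rel or len(fargs) != len(args):
            continue
        new = dict(subst)
        if all(new.setdefault(v, c) == c for v, c in zip(args, fargs)):
            yield from match(rest, facts, new)


def satisfies(structure, rules):
    """Check that every instance of every Horn rule holds in the structure."""
    _, facts = structure
    for premise, (crel, cargs) in rules:
        for s in match(premise, facts, {}):
            if (crel, tuple(s[v] for v in cargs)) not in facts:
                return False
    return True


def free_amalgam(b1, b2):
    """Union of domains and relations; assumes b1 and b2 overlap only in the common part."""
    return (b1[0] | b2[0], b1[1] | b2[1])


# Complete: x and y occur jointly in a premise atom.
no_two_cycle = [([("E", ("x", "y")), ("E", ("y", "x"))], BOTTOM)]
# Not complete: x and z never occur jointly in a premise atom.
transitive = [([("E", ("x", "y")), ("E", ("y", "z"))], ("E", ("x", "z")))]

B1 = ({"a", "b"}, {("E", ("b", "a"))})
B2 = ({"a", "c"}, {("E", ("a", "c"))})
C = free_amalgam(B1, B2)

print(satisfies(C, no_two_cycle))  # the free amalgam stays a model
print(satisfies(C, transitive))    # E(b, c) is missing in the free amalgam
```

Both `B1` and `B2` satisfy transitivity on their own, so the second check shows why the completeness assumption cannot simply be dropped when arguing via free amalgams.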

A class of relational $\tau$-structures has the one-point amalgamation property if it has the AP restricted to triples $(\mathfrak{A}, \mathfrak{B}_1, \mathfrak{B}_2)$ which satisfy $|B_1| = |B_2| = |A| + 1$.

###### Definition 2 (Subsumption)

If and are conjunctions of atomic formulas, then we say that subsumes on (the variables from) with respect to a universal Horn sentence if for every atomic formula ,

Note that subsumption can be tested using SLD-deduction (Theorem 2.1). In our proofs later it will be useful to take a proof-theoretic perspective on the amalgamation property.

###### Lemma 1

Let $\Phi$ be a universal Horn sentence over the relational signature $\tau$. Then the following are equivalent:

1. The class of all finite models of $\Phi$ has the amalgamation property.

2. The class of all finite models of $\Phi$ has the one-point amalgamation property.

3. Suppose that , , and are conjunctions of atomic formulas such that subsumes both and on with respect to . Then subsumes on with respect to .

###### Proof

The equivalence (1)$\Leftrightarrow$(2) is well-known; see, e.g., Proposition 2.3.17 in [Bod21].

(2)(3): Let , , and be as in item (3). Let , , and be the conjunctions of all atomic formulas implied by , , and , respectively. If , , or contains , then contains by the subsumption assumption and we are done. So suppose that all three conjunctions are satisfiable. Define , , and for variables from , , and , if or contains the conjunct , respectively. Define , , and as the structures whose domains consist of the equivalence classes of , , and , respectively, and where , , or is in a tuple of a relation for if the conjunct is contained in or , respectively. Since and are subsumed by on w.r.t. , there exist embeddings for . Note that, by construction, , , and satisfy every Horn implication in . Since is universal Horn, this implies that . Since has the one-point amalgamation property, there exists together with embeddings for such that . Let be an atomic formula (not necessarily containing the variable ) such that By the construction of ,, and , and because and are homomorphisms, there exist such that . In particular, cannot be equal to . Since is an embedding, we must also have . Thus, by the construction of ,, and , it follows that

(3)(2): Let be such that and for . We construct a structure with and as follows. Without loss of generality we may assume that , i.e., is the intersection of and . Let be a tuple of variables representing the elements of in some order, and let be the conjunctions of all atomic -formulas which hold in , respectively. Note that both and are subsumed by on w.r.t. . Let be the conjunction of all atomic formulas implied by . We claim that does not contain : otherwise, , and then (3) implies that or , which is impossible since and . Define , for variables from , if contains the conjunct . Define as the structure whose domain are the equivalence classes of , and the tuple is in the relation for of if contains the conjunct . For and define . We claim that is an embedding from to . It is clear from the construction of that is a homomorphism. Suppose for contradiction that there exist an atomic -formula and such that while . For the sake of notation, we assume that ; the case that can be shown analogously. Note that the construction of implies that . Then implies that , a contradiction to . Thus, is an embedding from to . By the construction of we also clearly have that , which concludes the proof of the one-point amalgamation property. ∎

The following example illustrates the use of Lemma 1 and is relevant for the proof of Theorem 4.1 below.

###### Example 1

For , we define

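$$C_k := \bigwedge_{i,j \in [k]} E(x_i, x_j), \qquad C^{\mathrm{left}}_k := \bigwedge_{i \in [k]} E^{\mathrm{left}}(x_i, y_1), \qquad C^{\mathrm{right}}_k := \bigwedge_{i \in [k]} E^{\mathrm{right}}(x_i, y_2).$$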

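Let $\Phi$ be the universal Horn sentence with the signature

$$\tau := \{R_{a_1}, R_{a_2}, R_{a_3}, R_{a_4}, R^{\mathrm{left}}_{b_1}, R^{\mathrm{right}}_{b_1}, R^{\mathrm{left}}_{b_2}, R^{\mathrm{right}}_{b_2}, E, E^{\mathrm{left}}, E^{\mathrm{right}}, F\}$$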

which consists of the following five Horn implications.

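$$\begin{aligned}
C_3 \wedge C^{\mathrm{left}}_3 \wedge C^{\mathrm{right}}_3 \wedge \bigwedge_{i \in [2]} R_{a_i}(x_i, x_{i+1}) &\Rightarrow R^{\mathrm{left}}_{b_1}(x_1, y_1, y_2) \\
C_3 \wedge C^{\mathrm{left}}_3 \wedge C^{\mathrm{right}}_3 \wedge \bigwedge_{i \in [2]} R_{a_i}(x_i, x_{i+1}) &\Rightarrow R^{\mathrm{right}}_{b_1}(x_3, y_1, y_2) \\
C_3 \wedge C^{\mathrm{left}}_3 \wedge C^{\mathrm{right}}_3 \wedge \bigwedge_{i \in [2]} R_{a_{i+2}}(x_i, x_{i+1}) &\Rightarrow R^{\mathrm{left}}_{b_2}(x_1, y_1, y_2) \\
C_3 \wedge C^{\mathrm{left}}_3 \wedge C^{\mathrm{right}}_3 \wedge \bigwedge_{i \in [2]} R_{a_{i+2}}(x_i, x_{i+1}) &\Rightarrow R^{\mathrm{right}}_{b_2}(x_3, y_1, y_2) \\
F(x_1, x_3) \wedge C_3 \wedge C^{\mathrm{left}}_3 \wedge C^{\mathrm{right}}_3 \wedge \bigwedge_{i \in [2]} \big(R^{\mathrm{left}}_{b_i}(x_i, y_1, y_2) \wedge R^{\mathrm{right}}_{b_i}(x_{i+1}, y_1, y_2)\big) &\Rightarrow \bot
\end{aligned}$$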

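We claim that $\Phi$ entails

$$F(x_1, x_5) \wedge C_5 \wedge C^{\mathrm{left}}_5 \wedge C^{\mathrm{right}}_5 \wedge \bigwedge_{i \in [4]} R_{a_i}(x_i, x_{i+1}) \Rightarrow \bot. \tag{1}$$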

Note that the first four Horn implications in this example applied to the premise of (1) produce the conjunction

$$\psi := R^{\mathrm{left}}_{b_1}(x_1, y_1, y_2) \wedge R^{\mathrm{right}}_{b_1}(x_3, y_1, y_2) \wedge R^{\mathrm{left}}_{b_2}(x_3, y_1, y_2) \wedge R^{\mathrm{right}}_{b_2}(x_5, y_1, y_2).$$

Then the last Horn implication, applied to the premise of (1) conjoined with $\psi$ under a suitable substitution of variables, produces $\bot$. It is easy to see that both

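$$F(x_1, x_5) \wedge C_5 \wedge C^{\mathrm{left}}_5 \wedge \bigwedge_{i \in [4]} R_{a_i}(x_i, x_{i+1}) \quad\text{and}\quad F(x_1, x_5) \wedge C_5 \wedge C^{\mathrm{right}}_5 \wedge \bigwedge_{i \in [4]} R_{a_i}(x_i, x_{i+1})$$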

are subsumed by w.r.t.  because both conjunctions do not fire any Horn implications in . However, (1) witnesses that is not subsumed by on w.r.t. . Thus, by Lemma 1, does not have the AP. Observe that the failure of the AP for requires all five Horn implications to be present in . Also note that the AP can be restored by conjoining with the Horn implication

$$F(x_1, x_5) \wedge C_5 \wedge C^{\mathrm{left}}_5 \wedge \bigwedge_{i \in [4]} R_{a_i}(x_i, x_{i+1}) \Rightarrow \bot.$$
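The computations in Example 1 can be replayed mechanically. The following sketch is only an illustration and rests on assumptions: it uses forward chaining instead of SLD-deduction, reads the variables of the premise of (1) as constants `c1`, ..., `c5`, `d1`, `d2`, and writes the relation symbols as plain strings such as `"Ra1"` or `"Rleft_b1"`. All helper names are hypothetical.

```python
BOTTOM = ("FALSE", ())  # stands for the falsity symbol


def match(premise, facts, subst):
    """Yield every extension of subst that maps all premise atoms into facts."""
    if not premise:
        yield subst
        return
    (rel, args), rest = premise[0], premise[1:]
    for frel, fargs in facts:
        if frel != rel or len(fargs) != len(args):
            continue
        new = dict(subst)
        if all(new.setdefault(v, c) == c for v, c in zip(args, fargs)):
            yield from match(rest, facts, new)


def closure(rules, atoms):
    """Least set of atoms containing `atoms` and closed under the Horn rules."""
    facts = set(atoms)
    changed = True
    while changed:
        changed = False
        for premise, (crel, cargs) in rules:
            for s in list(match(premise, frozenset(facts), {})):
                new_fact = (crel, tuple(s[v] for v in cargs))
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


def clique(xs):   # C_k
    return [("E", (u, v)) for u in xs for v in xs]

def left(xs, y):  # C^left_k
    return [("Eleft", (u, y)) for u in xs]

def right(xs, y):  # C^right_k
    return [("Eright", (u, y)) for u in xs]


x = ["x1", "x2", "x3"]
guard = clique(x) + left(x, "y1") + right(x, "y2")
ys = ("y1", "y2")
rules = [
    ([("Ra1", ("x1", "x2")), ("Ra2", ("x2", "x3"))] + guard, ("Rleft_b1", ("x1",) + ys)),
    ([("Ra1", ("x1", "x2")), ("Ra2", ("x2", "x3"))] + guard, ("Rright_b1", ("x3",) + ys)),
    ([("Ra3", ("x1", "x2")), ("Ra4", ("x2", "x3"))] + guard, ("Rleft_b2", ("x1",) + ys)),
    ([("Ra3", ("x1", "x2")), ("Ra4", ("x2", "x3"))] + guard, ("Rright_b2", ("x3",) + ys)),
    ([("F", ("x1", "x3")),
      ("Rleft_b1", ("x1",) + ys), ("Rright_b1", ("x2",) + ys),
      ("Rleft_b2", ("x2",) + ys), ("Rright_b2", ("x3",) + ys)] + guard, BOTTOM),
]

c = ["c1", "c2", "c3", "c4", "c5"]
core = [("F", ("c1", "c5"))] + clique(c) + [
    (f"Ra{i + 1}", (c[i], c[i + 1])) for i in range(4)
]
only_left = core + left(c, "d1")
only_right = core + right(c, "d2")
both = core + left(c, "d1") + right(c, "d2")

print(BOTTOM in closure(rules, both))                  # premise of (1)
print(closure(rules, only_left) == set(only_left))     # nothing fires
print(closure(rules, only_right) == set(only_right))   # nothing fires
```

With both guard conjunctions present the closure contains the falsity atom, whereas each of the two smaller conjunctions is already closed under the five implications.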

## 4 Undecidability of the AP

In this section we prove that there is no algorithm that decides whether the class of all finite models of a given universal Horn sentence has the amalgamation property. Our proof is based on a reduction from the problem of deciding containment for context-free languages [AN00]. A context-free grammar (CFG) is a $4$-tuple $G = (N, \Sigma, P, s)$ where

• $N$ is a finite set of non-terminal symbols,

• $\Sigma$ is a finite set of terminal symbols,

• $P$ is a finite set of production rules, and

• $s \in N$ is the start symbol.

For $u, v \in (N \cup \Sigma)^*$ we write $u \to_G v$ if there exist $\alpha, \beta \in (N \cup \Sigma)^*$ and $(a, w) \in P$ such that $u = \alpha a \beta$ and $v = \alpha w \beta$. The transitive closure of $\to_G$ is denoted by $\to^*_G$. The language of $G$ is $L(G) := \{w \in \Sigma^* \mid s \to^*_G w\}$. We write $\epsilon$ for the empty word, i.e., the word of length $0$.
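The derivation relation can be explored by a bounded search. The sketch below is an illustration under the assumption that the grammar has no empty productions, so that sentential forms never shrink and the search can discard every form longer than the target word. Productions are given as pairs of a symbol and a tuple of symbols; the function name `derives` and the toy grammar are hypothetical.

```python
from collections import deque


def derives(productions, start, word):
    """Return True if `word` is reachable from `start` in one or more derivation steps.

    Assumes that no production has an empty right-hand side.
    """
    target = tuple(word)
    seen = set()
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        for i, sym in enumerate(form):
            for lhs, rhs in productions:
                if lhs != sym:
                    continue
                # Rewrite one occurrence of lhs by rhs.
                new = form[:i] + tuple(rhs) + form[i + 1:]
                if len(new) > len(target) or new in seen:
                    continue
                if new == target:
                    return True
                seen.add(new)
                queue.append(new)
    return False


# Toy grammar for the words a^n b^n with n >= 1.
P = [("s", ("a", "s", "b")), ("s", ("a", "b"))]
print(derives(P, "s", "aabb"))
print(derives(P, "s", "aab"))
```

The search terminates because only finitely many forms of bounded length exist over a finite alphabet and each form is enqueued at most once.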

###### Theorem 4.1

For a given universal Horn sentence $\Phi$, the question whether the class of all finite models of $\Phi$ has the AP is undecidable, even if the signature contains only relation symbols of arity at most three.

###### Proof

Let be a CFG such that

• for every (no ‘empty productions’), and

• is not a subword of for every pair .

The containment problem for the languages of such CFGs is known to be undecidable [AN00]. For formal reasons, we also require that for every . Note that this does not influence the language at all. As in Example 1, for , let

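$$C_k := \bigwedge_{i,j \in [k]} E(x_i, x_j).$$

We use the signature that contains the binary symbols $E$ and $F$ and, for every element $a \in N \cup \Sigma$, a binary relation symbol $R_a$. Let $\Phi_G$ be the universal Horn sentence that contains, for every $(b, a_1 \dots a_n) \in P$, the Horn implication

$$\begin{aligned}
C_{n+1} \wedge \bigwedge_{i \in [n]} R_{a_i}(x_i, x_{i+1}) &\Rightarrow R_b(x_1, x_{n+1}) && \text{if } b \neq s, \\
F(x_1, x_{n+1}) \wedge C_{n+1} \wedge \bigwedge_{i \in [n]} R_{a_i}(x_i, x_{i+1}) &\Rightarrow \bot && \text{if } b = s.
\end{aligned}$$

Note that, due to the presence of $C_{n+1}$ in the premise, every Horn implication in $\Phi_G$ is complete, which means that the class of all finite models of $\Phi_G$ has the AP by Proposition 1.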
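The encoding of productions as Horn implications can be sketched in a few lines. This is an illustration under stated assumptions rather than the paper's formal construction: relation symbols are written as strings such as `"R_a"`, entailment is tested by forward chaining with the premise variables read as constants, and the toy grammar does not enforce the additional requirements placed on $G$ in the proof. The names `horn_sentence` and `word_premise` are hypothetical.

```python
BOTTOM = ("FALSE", ())  # stands for the falsity symbol


def match(premise, facts, subst):
    """Yield every extension of subst that maps all premise atoms into facts."""
    if not premise:
        yield subst
        return
    (rel, args), rest = premise[0], premise[1:]
    for frel, fargs in facts:
        if frel != rel or len(fargs) != len(args):
            continue
        new = dict(subst)
        if all(new.setdefault(v, c) == c for v, c in zip(args, fargs)):
            yield from match(rest, facts, new)


def closure(rules, atoms):
    """Least set of atoms containing `atoms` and closed under the Horn rules."""
    facts = set(atoms)
    changed = True
    while changed:
        changed = False
        for premise, (crel, cargs) in rules:
            for s in list(match(premise, frozenset(facts), {})):
                new_fact = (crel, tuple(s[v] for v in cargs))
                if new_fact not in facts:
                    facts.add(new_fact)
                    changed = True
    return facts


def horn_sentence(productions, start):
    """One Horn implication per production (b, a_1 ... a_n)."""
    rules = []
    for b, rhs in productions:
        xs = [f"x{i}" for i in range(1, len(rhs) + 2)]
        chain = [(f"R_{a}", (xs[i], xs[i + 1])) for i, a in enumerate(rhs)]
        guard = [("E", (u, v)) for u in xs for v in xs]  # C_{n+1}
        if b == start:
            rules.append(([("F", (xs[0], xs[-1]))] + chain + guard, BOTTOM))
        else:
            rules.append((chain + guard, (f"R_{b}", (xs[0], xs[-1]))))
    return rules


def word_premise(word):
    """Atoms F(c1, c_{n+1}), C_{n+1} and the R-chain spelling out `word`."""
    cs = [f"c{i}" for i in range(1, len(word) + 2)]
    chain = [(f"R_{a}", (cs[i], cs[i + 1])) for i, a in enumerate(word)]
    guard = [("E", (u, v)) for u in cs for v in cs]
    return [("F", (cs[0], cs[-1]))] + chain + guard


# Toy grammar: s -> a t, t -> b | b t, generating the words a b^n with n >= 1.
P = [("s", ("a", "t")), ("t", ("b",)), ("t", ("b", "t"))]
phi_G = horn_sentence(P, "s")

print(BOTTOM in closure(phi_G, word_premise("abb")))
print(BOTTOM in closure(phi_G, word_premise("ba")))
```

On this toy grammar the falsity atom is derived exactly for the tested word that the grammar generates, which matches the correspondence between derivations and entailment that the proof relies on.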

The following correspondence can be shown via a straightforward induction.

###### Claim 1

For every and every , we have

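$$b \to^*_G w \quad\text{if and only if}\quad \Phi_G \models \forall x_1, \dots, x_{n+1} \Big( C_{n+1} \wedge \bigwedge_{i \in [n]} R_{a_i}(x_i, x_{i+1}) \Rightarrow R_b(x_1, x_{n+1}) \Big). \tag{2}$$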

For , we have

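$$s \to^*_G w \quad\text{if and only if}\quad \Phi_G \models \forall x_1, \dots, x_{n+1} \Big( F(x_1, x_{n+1}) \wedge C_{n+1} \wedge \bigwedge_{i \in [n]} R_{a_i}(x_i, x_{i+1}) \Rightarrow \bot \Big). \tag{3}$$
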
###### Proof

We first prove the case $b \neq s$ in detail and then finish the proof with a short argument for the case $b = s$.

”: Suppose that . Then there is a path in from to of length . We prove the statement by induction on .

In the induction base we have in which case (2) is a conjunct of .

In the induction step , we assume that the claim holds for all paths of length , and that there exists a path of length from to , i.e., there exists a path of length from to some such that, for some ,

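$$c_1 \dots c_{\ell-1} = a_1 \dots a_{\ell-1}, \qquad c_{\ell+1} \dots c_m = a_{r+1} \dots a_n, \qquad\text{and}\qquad (c_\ell, a_\ell \dots a_r) \in P.$$

By the induction hypothesis, we have that

$$\Phi_G \models \forall x_1, \dots, x_{m+1} \Big( C_{m+1} \wedge \bigwedge_{i \in [m]} R_{c_i}(x_i, x_{i+1}) \Rightarrow R_b(x_1, x_{m+1}) \Big). \tag{4}$$

By the construction of $\Phi_G$, we also have that

$$\Phi_G \models \forall x_1, \dots, x_{r-\ell+2} \Big( C_{r-\ell+2} \wedge \bigwedge_{i \in [r-\ell+1]} R_{a_{i+\ell-1}}(x_i, x_{i+1}) \Rightarrow R_{c_\ell}(x_1, x_{r-\ell+2}) \Big).$$

Note that the Horn implication above entails the Horn implication

$$C_{n+1} \wedge \bigwedge_{i \in [n]} R_{a_i}(x_i, x_{i+1}) \Rightarrow R_{c_\ell}(x_\ell, x_{r+1})$$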

because the premise of the first implication is a subformula of the premise of the second implication up to a positive index shift by for each variable , i.e.,

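$$\Phi_G \models \forall x_1, \dots, x_{n+1} \Big( C_{n+1} \wedge \bigwedge_{i \in [n]} R_{a_i}(x_i, x_{i+1}) \Rightarrow R_{c_\ell}(x_\ell, x_{r+1}) \Big). \tag{5}$$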

If we substitute in (4) with (), then it follows from (4) together with (5) that

$$\Phi_G \models \forall x_1, \dots, x_{n+1} \Big( C_{n+1} \wedge \bigwedge_{i \in [n]} R_{a_i}(x_i, x_{i+1}) \Rightarrow R_b(x_1, x_{n+1}) \Big).$$

”: Suppose that . By Theorem 2.1, there is an SLD-deduction of (2) from . We prove the claim by induction on the number of SLD-derivations used in the SLD-deduction.

In the base case , the Horn implication in (2) must be a tautology or a conjunct of . If it is a tautology, then and in which case because we assume that . If it is a conjunct of , then, by the construction of , we get that and thus . Note that (2) cannot have an SLD-deduction from using a derivation sequence of length starting with any other Horn implication besides the one that is obtained from . This is because our encoding of each in uses: for , a chain of length that produces a literal on the endpoints , and, for , an additional predicate in the premise that is not present in (2).

In the induction step , we assume that the claim holds for all SLD-derivations of length and that (2) requires an SLD-derivation of length . By the construction of , there must exist with and such that contains a conjunct of the form

$$C_{r-\ell+2} \wedge \bigwedge_{i \in [r-\ell+1]} R_{a_{i+\ell-1}}(x_i, x_{i+1}) \Rightarrow R_{c_\ell}(x_1, x_{r-\ell+2}) \tag{6}$$

that is used as the last implication in a shortest possible SLD-derivation. Moreover, since , for and , there exists an SLD-deduction from using an SLD-derivation of length of

$$C_{m+1} \wedge \bigwedge_{i \in [m]} R_{c_i}(x_i, x_{i+1}) \Rightarrow R_b(x_1, x_{m+1}). \tag{7}$$

By the induction hypothesis, (7) is equivalent to . Then follows by composition.

In the case , we proceed similarly as in the case , with one small caveat. One might think that a Horn implication of the form (3) can have an SLD-deduction from that ignores some of the -atoms in the premise. But note that every Horn implication in whose conclusion is of the form has an -atom in its premise that contains: the first variable of the chain in its first argument and the last variable of the chain in its second coordinate. This prevents the SLD-deduction of (3) from using any shorter subword (, ). ∎

Given an instance of the containment problem consisting of two context-free grammars and as above Claim 1, we may assume that and . Let be the universal Horn sentence in a new signature obtained from as follows.

The signature differs from in that it contains: two new binary relation symbols and , and, for every , two ternary relation symbols and instead of a single binary symbol . As in Example 1, for , let

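$$C^{\mathrm{left}}_k := \bigwedge_{i \in [k]} E^{\mathrm{left}}(x_i, y_1) \qquad\text{and}\qquad C^{\mathrm{right}}_k := \bigwedge_{i \in [k]} E^{\mathrm{right}}(x_i, y_2).$$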

In every Horn implication in we introduce two new universally quantified variables , and extend its premise by the conjunctions and where is the length of the word such that is encoded by . We also replace every literal for with the conjunction