 # A New Perspective on FO Model Checking of Dense Graph Classes

We study the first-order (FO) model checking problem of dense graphs, namely those which have FO interpretations in (or are FO transductions of) some sparse graph classes. We give a structural characterization of the graph classes which are FO interpretable in graphs of bounded degree. This characterization allows us to efficiently compute such an FO interpretation for an input graph. As a consequence, we obtain an FPT algorithm for successor-invariant FO model checking of any graph class which is FO interpretable in (or an FO transduction of) a graph class of bounded degree. The approach we use to obtain these results may also be of independent interest.

## 1 Introduction

Algorithmic metatheorems are theorems stating that all problems expressible in a certain logic are efficiently solvable on certain classes of (relational) structures, e.g. on finite graphs. Note that the model checking problem for first-order logic – given a graph $G$ and an FO formula $\varphi$ we want to decide whether $G$ satisfies $\varphi$ (written as $G \models \varphi$) – is trivially solvable in time $|V(G)|^{O(|\varphi|)}$. “Efficient solvability” hence in this context often means fixed-parameter tractability (FPT); that is, solvability in time $f(|\varphi|) \cdot |V(G)|^{O(1)}$ for some computable function $f$.

In the past two decades algorithmic metatheorems for FO logic on sparse graph classes received considerable attention. After the result of Seese  establishing fixed-parameter tractability of FO model checking on graphs of bounded degree there followed a series of results [10, 5, 7] establishing the same result for increasingly rich sparse graph classes. This line of research culminated in the result of Grohe, Kreutzer and Siebertz , who proved that FO model checking is FPT on nowhere dense graph classes.

The result of Grohe, Kreutzer and Siebertz  is essentially the best possible of its kind, in the following sense: if a graph class $\mathcal{C}$ is monotone (i.e., closed under taking subgraphs) and not nowhere dense, then the FO model checking problem on $\mathcal{C}$ is as hard as that on all graphs. Possible ways to continue the research into algorithmic metatheorems for FO logic include the following two directions:

First, one can study relational structures other than graphs. This line of research has recently been initiated by Bova, Ganian and Szeider , who gave an FPT algorithm for existential FO model checking on partially ordered sets of bounded size of a maximum antichain. Their result was first improved upon in  and shortly after that followed the result of Gajarský et al. , who extended  to full FO. Apart from these results, very little is known and it remains to be seen what other types of structures and their parameterizations admit fast FO model checking algorithms.

Second, one may consider metatheorems for FO logic on classes of graphs which are not sparse. Again, little is known along this line of research. One can mention the result of Ganian et al.  establishing that certain subclasses of interval graphs admit an FPT algorithm for FO model checking. Besides, the aforementioned result of  can also be seen as a result about dense (albeit directed) graphs, and  actually happens to imply the result of .

We would like to initiate a systematic study of dense graph classes for which the FO model checking problem is efficiently solvable. It appears that a natural way to arrive at new graph classes admitting FPT algorithms for FO model checking is by means of interpretation, or transduction. In a simplified setting of interpretations – given a graph $G$ and an FO formula $\psi(x,y)$ with two free variables, we can define a graph $H$ on the same vertex set as $G$ and the edge set determined by $\psi$: a pair of distinct vertices $u, v$ is an edge of $H$ iff $G \models \psi(u,v)$. We then say that $H$ is interpreted in $G$ using $\psi$. A graph class $\mathcal{D}$ is FO interpretable in a graph class $\mathcal{C}$ if there exists an FO formula $\psi$ such that every member of $\mathcal{D}$ is interpreted in some member of $\mathcal{C}$ using $\psi$.

For now let us assume we have an efficient FO model checking algorithm for the previous class $\mathcal{C}$, and consider the FO model checking problem of the class $\mathcal{D}$. If an input graph $H$ from $\mathcal{D}$ was given together with the corresponding FO interpretation in a graph $G$ from $\mathcal{C}$, then one could easily solve the model checking problem using the existing algorithm for $\mathcal{C}$. This is based on the following natural property of interpretations: if $H$ is interpreted in $G$ using formula $\psi$, and our question is to decide whether $H \models \varphi$, it is a standard routine to construct from $\varphi$ and $\psi$ a sentence $\varphi'$ such that $H \models \varphi$ if and only if $G \models \varphi'$. Then $G \models \varphi'$ is decided by the algorithm given for $\mathcal{C}$.

However, if the assumed interpretation (or transduction) is not given, then the situation is markedly harder. In this context we ask the following question:

###### Question 1.1.

Let $\mathcal{C}$ be a graph class admitting an FPT algorithm for FO model checking, and $\mathcal{D}$ be a graph class FO interpretable in $\mathcal{C}$. Does there exist an FPT algorithm for FO model checking on $\mathcal{D}$?

As outlined above, the difficulty of this question lies in the fact that our inputs come from $\mathcal{D}$, without any reference to the respective members of $\mathcal{C}$ in which they are interpreted. Even if the interpretation formula $\psi$ is fixed and known beforehand, we have generally no efficient way of obtaining the respective member $G \in \mathcal{C}$ for an input $H \in \mathcal{D}$. Thus, Question 1.1 can be reduced to the following:

###### Question 1.2.

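Let $\mathcal{C}, \mathcal{D}$ be graph classes such that $\mathcal{D}$ is FO interpretable in $\mathcal{C}$. Does there exist a constant $s$ and a polynomial-time algorithm which, given $H \in \mathcal{D}$ as input, outputs $G \in \mathcal{C}$ and an FO formula $\psi$ of size at most $s$ such that $H$ is interpreted in $G$ using $\psi$?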

An answer to Question 1.2 is far from being obvious, and it can strongly depend on the choice of the interpretation formula. Take, for example, the following particular FO interpretation: a graph $H$ is the square of a graph $G$ if the edges of $H$ are those pairs of distinct vertices which are at distance at most $2$ in $G$. Then the problem of finding, for a given $H$, a graph $G$ such that $H$ is the square of $G$, is NP-hard . Another such negative example, specifically tailored to our setting, is discussed in Section 7. These examples show that it is important to choose a suitable interpretation formula (avoiding the hard cases) in an attempt to answer Question 1.2.

#### Our contribution

We answer both Questions 1.1 and 1.2 in the positive for the case when $\mathcal{C}$ is a class of graphs of bounded degree. Our answers also cover the more general case of FO transductions of bounded-degree classes, and include checking successor-invariant FO properties in addition to ordinary FO ones.

We first define near-uniform graph classes (Definition 4.2), based on a new notion of near-$k$-twin relation, which generalizes the folklore twin-vertex relation and is related also to the neighbourhood diversity parameter. The idea behind this approach is to classify pairs of vertices which have almost the same adjacency to the rest of the graph. The approach seems promising and may be of independent use in further investigation of well structured dense graph classes. While the definition of near-uniformity lends itself well to being used in proofs, it is sometimes unnecessarily technical to reason about. We therefore also introduce an equivalent notion of near-covered graph classes (Definition 4.3), which is more intuitive, easier to grasp and offers a slightly different perspective.

We then give an efficient FO model checking algorithm (Theorem 5.1) for the near-uniform graph classes. This algorithm is based upon the above idea of interpretation; briefly, given an input graph we use the near-$k$-twin relation for a suitable value of $k$ to partition its vertex set and to find a bounded-degree graph in which the input graph is interpreted using a universal formula depending only on the class in question (Theorem 5.5). Then we employ the aforementioned algorithm of Seese . Furthermore, we extend our algorithm to include also stronger successor-invariant FO properties (see Section 5.3 for more details), for which we can use the recent result of .

In the second half of the paper we argue that the concept of near-uniform graph classes is robust and sufficiently rich in content. We prove that the near-covered (and therefore also near-uniform, since the two are equivalent) graph classes are exactly those which are FO interpretable in graphs of bounded degree (Theorem 6.3) and, more generally, that any FO transduction of a graph class of bounded degree is a near-covered graph class (Theorem 6.4). The key tool we use is Gaifman’s theorem . At this place we remark that properties of graphs which are FO interpretable in graphs of bounded degree have already been studied, e.g., by Dong, Libkin and Wong in  in a different context, but those previous results do not imply our conclusions.

We then complement the previous tractability results with a negative example of a particular FO interpretation which is NP-hard to “reverse” even on the class of graphs of degree at most  (Theorem 7.1). We finish by sketching some interesting open directions for future research.

## 2 Definitions and preliminaries

We begin by clarifying the terminology and recalling some established concepts concerning logic on graphs. We assume that $0$ is a natural number, i.e. $0 \in \mathbb{N}$. Let $A \triangle B$ denote the symmetric difference of two sets $A$ and $B$.

#### Graph theory

We work with finite simple undirected graphs and use standard graph theoretic notation. We refer to the vertex set of a graph $G$ as $V(G)$ and to its edge set as $E(G)$. As is common in the context of FO logic on graphs, vertices of our graphs can carry arbitrary labels.

#### FO logic

The first-order logic of graphs (abbreviated as FO) applies the standard language of first-order logic to a graph $G$ viewed as a relational structure with the domain $V(G)$ and the single binary (symmetric) relation $E(G)$. That is, in FO we have the standard predicate $x = y$, a binary predicate $\mathit{edge}(x,y)$ with the meaning $\{x,y\} \in E(G)$, an arbitrary number of unary predicates $L(x)$ with the meaning that $x$ holds the label $L$, the usual logical connectives $\land, \lor, \neg, \to$, and quantifiers $\forall x$, $\exists x$ over the vertex set $V(G)$.

For example, $\varphi(x,y) \equiv \exists z\,(\mathit{edge}(x,z) \land \mathit{edge}(y,z) \land \mathit{red}(z))$ states that the vertices $x, y$ have a common neighbour in $G$ which has the label ‘red’.

#### Parameterized model checking

The instances of a parameterized problem can be considered as pairs $(x, k)$ where $x$ is the main part of the instance and $k$ is the parameter of the instance; the latter is usually a non-negative integer. A parameterized problem is fixed-parameter tractable (FPT) if instances $(x, k)$ of size $n$ can be solved in time $f(k) \cdot n^{c}$ where $f$ is a computable function and $c$ is a constant independent of $k$. In parameterized model checking, instances are considered in the form $((G, \varphi), |\varphi|)$ where $G$ is a structure, $\varphi$ a formula, the question is whether $G \models \varphi$ and the parameter is the size of $\varphi$.

When speaking about the FO model checking problem in this paper, we implicitly consider the formula (its size) as a parameter.

#### Interpretations

In order to simplify our exposition and proofs we work with a simplified version of FO interpretations (note, however, this does not impact generality of our conclusions, as we will see later).

Let $\psi(x,y)$ be an FO formula with two free variables over the language of (possibly labelled) graphs such that for any graph $G$ and any $u, v \in V(G)$ it holds that $G \models \psi(u,v) \iff G \models \psi(v,u)$ and $G \not\models \psi(u,u)$, i.e. the relation on $V(G)$ defined by the formula is symmetric and irreflexive. From now on we will assume that formulas with two free variables are symmetric and irreflexive (which can easily be enforced). Given a graph $G$, the formula $\psi$ maps $G$ to a graph $H = I_\psi(G)$ defined by $V(H) = V(G)$ and $E(H) = \{\{u,v\} : G \models \psi(u,v)\}$. We then say that the graph $H$ is interpreted in $G$. Notice that even though the graph $G$ can be labelled, our graph $H$ is not. This is to simplify our notation – nevertheless, one may easily inherit labels from $G$ to $H$ if needed.

In the rest of the paper, whenever we consider graphs $G$ and $H$ in the context of interpretations, graph $G$ will be the graph in which we are interpreting, and graph $H$ will be the “result” of the interpretation.

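The notion of interpretation can be extended to graph classes – to a graph class $\mathcal{C}$ the formula $\psi$ assigns the graph class $I_\psi(\mathcal{C}) = \{H : H = I_\psi(G),\ G \in \mathcal{C}\}$. We say that a graph class $\mathcal{D}$ is interpretable in a graph class $\mathcal{C}$ if there exists a formula $\psi$ such that $\mathcal{D} \subseteq I_\psi(\mathcal{C})$. Note that we do not require $\mathcal{D} = I_\psi(\mathcal{C})$, as we just want every graph from $\mathcal{D}$ to have a preimage in $\mathcal{C}$.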

Interpretations are useful for defining new graphs from old using logic (again, we think of $H$ as a result of application of $\psi$ to $G$), but can also be used to evaluate formulas on $H$ quickly, provided that we have a fast algorithm to evaluate formulas on $G$. Let $H = I_\psi(G)$, let $\varphi$ be a sentence and let $\varphi'$ be a sentence obtained from $\varphi$ by replacing every occurrence of the atom $\mathit{edge}(x,y)$ by $\psi(x,y)$. Then, obviously, $H \models \varphi \iff G \models \varphi'$.
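The rewriting step can be illustrated on a toy example. The following Python sketch is not taken from the paper: the tuple-based formula encoding, the helper names `evaluate`, `substitute` and `rewrite`, and the sample formula (the "square of a graph" interpretation) are all assumptions made for illustration. The evaluator is a naive brute-force one, and the sketch assumes that the bound variables of the interpretation formula do not occur in the sentence being rewritten, so that no variable capture can happen.

```python
def evaluate(adj, f, env=None):
    """Brute-force evaluation of formula f on a graph adj (dict: vertex -> set)."""
    env = env or {}
    op = f[0]
    if op == "edge":
        return env[f[2]] in adj[env[f[1]]]
    if op == "eq":
        return env[f[1]] == env[f[2]]
    if op == "not":
        return not evaluate(adj, f[1], env)
    if op == "and":
        return evaluate(adj, f[1], env) and evaluate(adj, f[2], env)
    if op == "or":
        return evaluate(adj, f[1], env) or evaluate(adj, f[2], env)
    # Quantifiers are encoded as ("exists", var, body) or ("forall", var, body).
    results = (evaluate(adj, f[2], {**env, f[1]: v}) for v in adj)
    return any(results) if op == "exists" else all(results)


def substitute(f, mapping):
    """Rename free variables of f according to mapping (no capture handling)."""
    op = f[0]
    if op in ("edge", "eq"):
        return (op, mapping.get(f[1], f[1]), mapping.get(f[2], f[2]))
    if op == "not":
        return ("not", substitute(f[1], mapping))
    if op in ("and", "or"):
        return (op, substitute(f[1], mapping), substitute(f[2], mapping))
    inner = {k: v for k, v in mapping.items() if k != f[1]}
    return (op, f[1], substitute(f[2], inner))


def rewrite(phi, psi):
    """Replace every atom edge(a, b) in phi by psi(a, b); psi has free variables x, y."""
    op = phi[0]
    if op == "edge":
        return substitute(psi, {"x": phi[1], "y": phi[2]})
    if op == "eq":
        return phi
    if op == "not":
        return ("not", rewrite(phi[1], psi))
    if op in ("and", "or"):
        return (op, rewrite(phi[1], psi), rewrite(phi[2], psi))
    return (op, phi[1], rewrite(phi[2], psi))


def edge(a, b):
    return ("edge", a, b)


# psi(x, y): x != y and dist(x, y) <= 2, i.e. the "square" interpretation.
psi = ("and", ("not", ("eq", "x", "y")),
       ("or", edge("x", "y"),
        ("exists", "z", ("and", edge("x", "z"), edge("z", "y")))))

G = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}          # path on 4 vertices
H = {u: {v for v in G if evaluate(G, psi, {"x": u, "y": v})} for u in G}

# phi: "there is a vertex adjacent to all other vertices".
phi = ("exists", "a", ("forall", "b", ("or", ("eq", "a", "b"), edge("a", "b"))))
phi_prime = rewrite(phi, psi)

print(evaluate(H, phi), evaluate(G, phi_prime), evaluate(G, phi))
```

In this example the sentence holds in the interpreted graph and the rewritten sentence holds in the original path, while the unrewritten sentence fails on the path, which is the behaviour the equivalence above predicts.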

#### FO transductions

While interpretations are restricted in the choice of the target domain (and, in our case, we even require $V(H) = V(G)$), a more general view is provided by so-called transductions, see Courcelle and Engelfriet . Informally, in addition to an interpretation, a transduction allows one to add arbitrary “parameters” (as labels) to a graph and to make several disjoint copies of the graph.

Here we provide a brief definition based on , simplified to target only the FO graph case. A basic FO-transduction is a triple of FO formulas with 0, 1 and 2 free variables, respectively, such that maps a graph into a graph on the vertex set and the edge set (an induced subgraph of ), or is undefined if .

The -copy operation maps a graph to the graph such that , the subset for each induces a copy of  (there are no edges between distinct copies), and is additionally equipped with a binary relation and unary relations such that; for iff , and . The -parameter expansion maps a graph to the set of all graphs which result by expansion of by unary predicates.

Altogether, a many-valued map is an FO transduction (of simple undirected graphs) if it is where is a basic transduction, is a -copy operation for some , and is a -parameter expansion for some . Note that, in this formal setting, the formulas of may also refer to the relations and established by the copy operation .

We remark, once again, that the result of a transduction of one graph is generally a set of graphs, due to the involved -parameter expansion. For a graph class , the result of a transduction of the class is the union of the particular transduction results, precisely, .

## 3 Outline of our approach

Before diving into technical details of our claims and proofs, we give a brief exposition of ideas leading to our results. We start by explaining the core ideas behind our approach to analysing dense graphs and then we sketch how interpretations are combined with our approach to dense graphs to obtain the results presented in Sections 4 and 6.

### 3.1 Locality, indistinguishability, and the new approach

The existing FPT algorithms for FO model checking of sparse graph classes we mentioned at the beginning of Section 1 rely heavily on the use of locality of FO logic – i.e. the fact that evaluating FO formulas can be reduced to evaluating local FO formulas (cf. Gaifman’s theorem , also in Section 6). This, together with the fact that in sparse graphs it is possible to evaluate local formulas efficiently, made the locality-based approach suitable for studying FO logic on sparse graphs. The problem with using this approach for dense graphs is obvious – in a dense graph the whole graph can be in the 1-neighbourhood of a single vertex. (This is also true for some sparse graphs, say stars, but we hope that it is clear that for dense graphs this can cause substantial problems.) This makes evaluating local formulas around such a vertex expensive (from the FPT perspective), because this amounts to evaluating them on the whole graph.

An alternative approach to FO model checking, as described in Section 4, is based on the concept of vertex indistinguishability. This approach can be used for dense graphs, but is a bit too limited in its scope. The key notion here is that of twin vertices – two vertices $u, v$ of a graph $G$ are twins if they have the same neighbourhood. The fact that two vertices are twins means that they behave in the same way with respect to any other vertex in the graph. Consequently, no FO formula can distinguish between $u$ and $v$. It is not hard to see that the twin relation is an equivalence on the vertex set of a graph. One may also say that the set of vertex neighbourhoods occurring in $G$ is “covered” by the set of neighbourhoods of representatives of each twin class of $G$. The number of equivalence classes of the twin relation is called the neighbourhood diversity  of a graph, and graph classes of bounded neighbourhood diversity admit a very simple FPT algorithm for FO model checking. However, as already mentioned, the problem with this approach is that it is too restrictive – even such simple graph classes as paths have unbounded neighbourhood diversity.
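To make the last remark concrete, here is a minimal Python sketch that counts twin classes under the definition used above (two vertices are twins if they have the same neighbourhood). The adjacency-dictionary representation and the helper names `path`, `complete_bipartite` and `twin_classes` are assumptions of this sketch, not notation from the paper.

```python
def path(n):
    """Path on vertices 0..n-1 as a dict: vertex -> set of neighbours."""
    return {i: {j for j in (i - 1, i + 1) if 0 <= j < n} for i in range(n)}


def complete_bipartite(a, b):
    """Complete bipartite graph with sides {0..a-1} and {a..a+b-1}."""
    left, right = range(a), range(a, a + b)
    adj = {u: set(right) for u in left}
    adj.update({v: set(left) for v in right})
    return adj


def twin_classes(adj):
    """Group vertices by their (open) neighbourhood."""
    classes = {}
    for v, nbrs in adj.items():
        classes.setdefault(frozenset(nbrs), []).append(v)
    return list(classes.values())


for n in (5, 10, 20):
    print(n, len(twin_classes(path(n))))        # grows with n
print(len(twin_classes(complete_bipartite(3, 4))))  # two classes, one per side
```

On paths with at least four vertices every vertex has a distinct neighbourhood, so the number of twin classes equals the number of vertices, whereas a complete bipartite graph has only two classes regardless of its size.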

Our approach is based on observing that the locality-based approach, when used on sparse graphs, exploits, in its essence, the indistinguishability of vertices. Take, for example, the graphs of bounded degree. Here any two vertices behave the same way with respect to the rest of the vertex set (they are non-adjacent to it), with only a few exceptions (the vertices in their neighbourhood). In other words, any two vertices have almost the same neighbourhood. This leads to a relaxation of the notion of twin vertices. We say that two vertices are near-$k$-twins if their neighbourhoods differ in at most $k$ vertices. To see how this notion works around the issues with locality and indistinguishability explained above, fix a degree bound $d$ and consider the class of graphs of degree at most $d$ together with the class of their complements. For a suitable $k$ depending only on $d$, the near-$k$-twin relation on every graph from these classes is an equivalence with just one class. Yet, the complements are dense and some of them contain universal vertices.

The above considerations lead us to studying graph classes such that for each graph from these classes there exists a small $k$ such that the near-$k$-twin relation is an equivalence with a small number of classes – the near-uniform graph classes. However, unlike the ordinary twin relation, the near-$k$-twin relation is not automatically guaranteed to be an equivalence (this depends heavily on the graph and on the choice of $k$) and, consequently, dealing with near-uniformity is slightly cumbersome and requires great care.

However, there is also another (and perhaps simpler to deal with) way to view and formally capture the above informal discussion of diversity of neighbourhoods in a graph – that one can “cover” all distinct neighbourhoods in the graph with only few representative neighbourhoods. This view leads to a new definition – a class of graphs is near-covered if there exists a small $k$ such that every graph in this graph class contains a small (of a constant size) set of vertices such that every vertex is a near-$k$-twin of at least one vertex from this set. It is easily seen that near-uniformity implies near-coveredness – just pick any one representative from each equivalence class. As we shall see, the converse is also true and the two notions are (asymptotically) equivalent. Precisely, we shall prove that for any graph class $\mathcal{C}$ the following conditions are equivalent:

1. $\mathcal{C}$ is near-uniform (Definition 4.2);

2. $\mathcal{C}$ is near-covered (Definition 4.3);

3. $\mathcal{C}$ is interpretable in a class of graphs of bounded degree.

Since we can efficiently compute the interpretation claimed in (3), we can then solve FO model checking on near-uniform graph classes in FPT using established tools, such as the algorithm of  for FO model checking on graphs of bounded degree. Our proof is structured as follows: we first prove the equivalence between (1) and (2) (Lemma 4.4), and then the implications (1) $\Rightarrow$ (3) (Theorem 5.5) and (3) $\Rightarrow$ (2) (Theorem 6.3).

Given the technical difficulties related to the near-uniformity notion, one may question whether it is necessary to consider near-uniformity at all rather than working with near-coveredness alone. However, the equivalence aspect of the near-$k$-twin relation is crucial in proving that graphs with certain properties are interpretable in graphs of bounded degree. We therefore believe that it deserves a separate definition.

### 3.2 Interpretability in graphs of bounded degree

Besides dealing with the FO model checking problem via interpretation of certain graph classes into classes of bounded degree, we are also interested in the other direction – to find out which graph classes can be FO interpreted into classes of bounded degree (the direction (3) $\Rightarrow$ (2) above).

Our characterization of such classes relies on a simple corollary of Gaifman’s locality theorem: for a graph $G$ and two vertices $u, v$ which are far apart from each other, the truth value of the formula $\psi(u,v)$ depends only on the formulas with one free variable (up to the quantifier rank $q$, which depends on $\psi$) valid on $u$ and $v$ (i.e. their logical $q$-types). This in turn means that when the formula $\psi$ is used for interpretation (to obtain the graph $H$ from a graph $G$ of degree at most $d$) and vertices $u$ and $v$ satisfy the same formulas with one free variable (again, up to the quantifier rank $q$), $u$ and $v$ will be adjacent to the same vertices in the resulting graph $H$, except for a small number of vertices which were in their respective $r$-neighbourhoods in graph $G$ (here $r$ also depends on $\psi$). Any two vertices of the same $q$-type will therefore be near-$k$-twins for a suitable $k$ depending only on $d$ and $r$.

While the previous consideration is quite simple, note the following possible pitfall. Since the relation “being of the same $q$-type” is an equivalence with a bounded number of classes, it is tempting to believe that the near-$k$-twin relation (for a suitably chosen $k$) is an equivalence with a bounded number of classes (independent of the graph) for any graph from a graph class FO interpretable in a class of graphs of bounded degree. This, however, is not true – it can happen that some vertices $u$ and $v$ of different $q$-types are near-$k$-twins and a vertex $w$ of yet another $q$-type is a near-$k$-twin of $u$ but not of $v$, so transitivity fails.

Instead, we finish as follows. Since for any $q$ there are finitely many (say $m$) different $q$-types, in $H$ there exist at most $m$ vertices such that every vertex is a near-$k$-twin of (at least) one of them. This in turn means that graph classes FO interpretable in graphs of bounded degree are near-covered, and hence also near-uniform by (1) $\Leftrightarrow$ (2) above.

## 4 Near-uniform and near-covered graph classes

In this section we formally establish the key concepts. For a graph $G$ and a vertex $v \in V(G)$, we define the neighbourhood of $v$ as $N_G(v) = \{w \in V(G) : vw \in E(G)\}$. If the graph is clear from the context, we write just $N(v)$. Note that, by definition, $v \notin N(v)$.

A useful concept in graph theory is that of twin vertices. Two vertices $u, v$ are called false twins if $N(u) = N(v)$, and they are true twins if $N(u) \cup \{u\} = N(v) \cup \{v\}$. We actually follow the concept of false twins, which better suits our purposes, in the next definition.

###### Definition 4.1 (near-k-twin relation).

For a graph $G$ and $k \in \mathbb{N}$, the near-$k$-twin relation of $G$ is the relation $\rho_k$ on $V(G)$ defined by $(u,v) \in \rho_k \iff |N(u) \triangle N(v)| \le k$.

Considering, e.g., a small parameter $k$ and a large graph $G$ then, intuitively, two vertices of $G$ are near-$k$-twins if they have “almost the same” neighbourhood. This relation, unlike the ordinary twin relations on graph vertices, does not always “behave nicely”; in particular, $\rho_k$ may not be an equivalence relation (see e.g. the examples below). On the other hand, if the near-$k$-twin relation is an equivalence of bounded index, then we can use it to decompose the vertex set of the graph into similarly behaving clusters. This leads to the following.

###### Definition 4.2 (near-uniform).

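A graph $G$ is $(k_0,p)$-near-uniform if there exists $k \le k_0$ for which the near-$k$-twin relation of $G$ is an equivalence of index at most $p$.
A graph class $\mathcal{C}$ is $(k_0,p)$-near-uniform if every member of $\mathcal{C}$ is $(k_0,p)$-near-uniform, and $\mathcal{C}$ is near-uniform if there exist integers $k_0, p$ such that $\mathcal{C}$ is $(k_0,p)$-near-uniform.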
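
As a small illustration of this definition, the following Python sketch computes the near-twin classes of a graph for a given threshold, or reports that the relation is not an equivalence. The adjacency-dictionary representation and the names `near_twin_classes` and `near_uniform_witness` are assumptions of the sketch; it is a direct quadratic-time check, not the paper's algorithm. Since the relation is always reflexive and symmetric, it is an equivalence exactly when the greedy partition built below agrees with the relation on every pair.

```python
from itertools import combinations


def near_twin_classes(adj, k):
    """Return the classes of the near-k-twin relation, or None if it is not an equivalence."""
    def related(u, v):
        return len(adj[u] ^ adj[v]) <= k   # size of the symmetric difference

    classes = []
    for v in adj:
        for cls in classes:
            if related(v, cls[0]):
                cls.append(v)
                break
        else:
            classes.append([v])

    index = {v: i for i, cls in enumerate(classes) for v in cls}
    for u, v in combinations(adj, 2):
        if related(u, v) != (index[u] == index[v]):
            return None                    # transitivity fails somewhere
    return classes


def near_uniform_witness(adj, k0, p):
    """Smallest k <= k0 whose relation is an equivalence of index <= p, else None."""
    for k in range(k0 + 1):
        classes = near_twin_classes(adj, k)
        if classes is not None and len(classes) <= p:
            return k
    return None


# Path on five vertices 0 - 1 - 2 - 3 - 4.
P5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
for k in range(5):
    print(k, near_twin_classes(P5, k))
print(near_uniform_witness(P5, 4, 2))
```

On this path the relation is an equivalence for thresholds 0, 2 and 4 but not for 1 and 3, so being an equivalence for one threshold says nothing about larger thresholds.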

To simplify the discussion, we use the following as a shorthand. If $\rho_k$ of Definition 4.1 is an equivalence relation, then we call $\rho_k$ the near-$k$-twin equivalence of $G$, and the equivalence classes of $\rho_k$ the near-$k$-twin classes of $G$.

For example, take a class of the graphs of maximum degree at most , and let . Then the near--twin relation is a trivial equivalence of index one (i.e., with one class) for every graph from . The same holds for the class of the complements of graphs of . Another sort of examples comes, say, with a class of the graphs obtained from complete bipartite graphs by subtracting a subgraph of degrees at most . For and every graph of , the near--twin relation is an equivalence of index at most two. On the other hand, we can easily see that the near--twin relation of, e.g., a path of length  is not an equivalence; see Figure 1. Even more, examples such as that of Figure 1 show that, having a near--twin equivalence for some , does not imply that the near--twin relation is an equivalence for . That is why we cannot simply use one universal value of in Definition 4.2.

The fact that the near-$k$-twin relation of a graph $G$ is an equivalence on $V(G)$ can be used as follows: the neighbourhood of a vertex is represented by the neighbourhood of a selected representative of its class and the (small) difference of these two neighbourhoods. For such a purpose of representation it is not always necessary to have a near-$k$-twin equivalence; just having at least one such representative for every vertex of $G$ may be sufficient (we may not care that there is more than one “close” representative). This simplified scenario leads to the following definition.

###### Definition 4.3 (near-covered).

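A graph $G$ is $(k,p)$-near-covered if there exist vertices $v_1, \dots, v_p$ in $G$ such that each vertex of $G$ is a near-$k$-twin of at least one of $v_1, \dots, v_p$.
A graph class $\mathcal{C}$ is $(k,p)$-near-covered if every member of $\mathcal{C}$ on at least $p$ vertices is $(k,p)$-near-covered, and $\mathcal{C}$ is near-covered if there exist integers $k, p$ such that $\mathcal{C}$ is $(k,p)$-near-covered.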
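
The covering condition can be tested by brute force on small graphs. The sketch below is only an illustration under stated assumptions: the function name `near_cover` and the adjacency-dictionary input are hypothetical, the search tries all sets of at most the allowed number of representatives (so it is exponential in that number), and it accepts fewer representatives than the bound.

```python
from itertools import combinations


def near_cover(adj, k, p):
    """Return a tuple of at most p vertices covering the graph, or None.

    A set of representatives covers the graph if every vertex has a
    neighbourhood differing in at most k vertices from that of some representative.
    """
    vertices = list(adj)
    for size in range(1, min(p, len(vertices)) + 1):
        for reps in combinations(vertices, size):
            if all(any(len(adj[v] ^ adj[r]) <= k for r in reps) for v in vertices):
                return reps
    return None


# Path on five vertices 0 - 1 - 2 - 3 - 4.
P5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(near_cover(P5, 3, 1))   # a single representative suffices for threshold 3
print(near_cover(P5, 1, 2))   # two representatives are not enough for threshold 1
print(near_cover(P5, 1, 3))
```

For threshold 3 the path is covered by a single vertex although the near-3-twin relation on it is not transitive, which shows how covering can hold without the relation being an equivalence.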

The following lemma establishes that the two notions – being near-uniform and being near-covered – are in fact equivalent. While the definition of being near-covered is less technical and easier to grasp, the definition of near-uniformity is more convenient to work with in the algorithmic context of Section 5, which is the main reason for including both definitions.

###### Lemma 4.4.

A graph class $\mathcal{C}$ is near-uniform if and only if $\mathcal{C}$ is near-covered.

###### Proof.

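It is easy to see that if $\mathcal{C}$ is $(k_0,p)$-near-uniform then it is $(k_0,p)$-near-covered: for every graph $G \in \mathcal{C}$ there is $k \le k_0$ such that the near-$k$-twin relation is an equivalence with classes $U_1, \dots, U_m$ where $m \le p$. We pick an arbitrary vertex from each class to obtain vertices $v_1, \dots, v_m$. Clearly, each vertex of $G$ is a near-$k$-twin, and hence also a near-$k_0$-twin, of one of these vertices.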

To prove the opposite direction, consider first the following construction: To any graph and , we define auxiliary graph on the same vertex set by setting if and only if and are near--twins in . Observe the following easy properties of this construction:

1. Graph is -near-covered if and only if has a dominating set of size at most .

2. If for some the graph is a disjoint union of at most cliques, then near--twin is an equivalence with classes on (and so is -near-uniform).

3. If two vertices are at distance at most in then they are -near-twins in .

4. If contains a component with radius greater than then this component has to be dominated by at least two vertices. Moreover, in any dominating set of such connected component there are two vertices which are at distance at most in .

We now prove that any graph class which is near-covered with parameters and is also near-uniform. We proceed by induction on . For the case when the graph has a dominating set of size . This means that every two vertices in are at distance at most , it follows from (3) that any two vertices of are near--twins. Graph is therefore -near-uniform, which finishes the induction basis.

For the induction step, we fix and assume that every -near-covered graph class, for any , is -near-uniform for some values depending only on and . Consider now a graph class which is -near-covered. We will prove that every graph from is -near-uniform or -near-covered. The latter case, from the induction hypothesis, implies that is -near-uniform where depend only on and . As a result, every graph in is -near-uniform and so is near-uniform.

We take a graph which is -near-covered, and consider the derived graph . If has dominating set of size smaller than then it is actually -near-covered, which means it is also -near-covered as desired. From now on we therefore assume that has a smallest dominating set . We distinguish two cases:

1. contains a connected component with radius at least . By property (4) of the construction, there are two vertices from which are at distance at most in . Consider now the graph . We claim that has a dominating set of size at most , which means that is -near-covered and therefore also -near-covered as desired.

First note that is supergraph of , so is a dominating set of . We claim that (of size ) is also a dominating set of . To see this, consider any vertex dominated by in . Since the distance between and in is at most , the distance between and is at most in . This means, by (3), that and are -near-twins in . This in turn means that there is an edge between and in , and so is dominated by in . Since was an arbitrary neighbour of (in ), every vertex dominated by in is dominated by in . Therefore, is a dominating set in of size .

2. All connected components of have radius at most . This means that consists of components such that for . In this case we consider the graph . Since every two vertices in the same component of are at distance at most , they are -near-twins in and so there is an edge between them in , which means that each component forms a clique in . We distinguish two possibilities:

1. There is no pair of distinct indices such that there exists an edge in between some vertices and . In this case the graph is a disjoint union of cliques, which means that is -near-uniform by property (2).

2. There exists a pair of distinct indices such contains an edge between some vertices and . Recall that and are the vertices from which are contained in and , respectively. These vertices are in the same component in and at distance at most . By the same argument as in the case I, the set is a dominating set of size of the graph , which means that is -near-covered, as desired.

## 5 FO model checking algorithm

This section constitutes the main algorithmic contribution of the paper.

Our model checking algorithm for near-uniform graph classes can be briefly summarized as follows. The input is a graph from a $(k_0,p)$-near-uniform graph class and an FO sentence $\varphi$. Perform the following steps:

1. For each $k \le k_0$, compute the near-$k$-twin relation of the input graph, and check whether it is an equivalence of index at most $p$. This test has to succeed for some value of $k$ (Definition 4.2).

2. Compute a universal formula depending on and , and the graph depending on and found in step 1, such that and the vertex degrees in are at most  (Theorem 5.5).

3. Run the algorithm of  for FO model checking on graphs of bounded degree on and the sentence , where is obtained from by replacing every occurrence of with .

###### Theorem 5.1.

Let be a -near-uniform graph class for some . Then the FO model checking problem of is fixed-parameter tractable when parameterized by the formula size, i.e., solvable in time for a computable function and input .

The rest of this section is devoted to the proof of this statement.

### 5.1 Properties of the near-k-twin relation

To give details of the algorithm and to prove Theorem 5.1, we study some structural properties of graphs for which the near-$k$-twin relation is actually an equivalence.

As outlined above in the algorithm, our key step is to show that all near-uniform graph classes are FO interpretable in graph classes of bounded degree. For this we show that for any two large enough equivalence classes of a near-$k$-twin equivalence, every vertex from one class is connected to almost all or to almost none of the vertices of the other class, and vice versa. More precisely:

###### Lemma 5.2.

Let and be a graph such that the near--twin relation of is an equivalence on . Let and be two near--twin classes of with at least vertices each (it may be ). Then for every we have

$$\min\{\,|U \cap N(v)|,\ |U \setminus N(v)|\,\} \le 2k.$$

Note that the claim of Lemma 5.2 universally holds only when both and are sufficiently large. A counterexample with small is a graph consisting of and inducing a large clique, such that is connected to half of the vertices of . For this graph the near--twin classes are exactly and , but both and are unbounded.
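The counterexample can be checked numerically. The following sketch is an illustration under stated assumptions: near-$k$-twins are taken to be vertices with $|N(u) \triangle N(v)| \le k$, the value $k = 3$ and the clique size are chosen only for the demonstration, and the helper names are hypothetical.

```python
def counterexample(n):
    """Clique on 0..n-1 plus an extra vertex 'u' joined to the first n // 2 clique vertices."""
    adj = {v: {w for w in range(n) if w != v} for v in range(n)}
    adj["u"] = set(range(n // 2))
    for v in range(n // 2):
        adj[v].add("u")
    return adj


def near_twins(adj, a, b, k):
    # Assumed definition: the neighbourhoods differ in at most k vertices.
    return len(adj[a] ^ adj[b]) <= k


def split_sizes(adj, v, cls):
    """Return (|cls ∩ N(v)|, |cls \\ N(v)|) for a vertex v and a vertex set cls."""
    return len(cls & adj[v]), len(cls - adj[v])


adj = counterexample(20)
clique = set(range(20))
# All clique vertices are pairwise near-3-twins, while 'u' is a near-3-twin of none of them.
print(all(near_twins(adj, a, b, 3) for a in clique for b in clique))
print(any(near_twins(adj, "u", v, 3) for v in clique))
# Both quantities grow linearly with the clique size.
print(split_sizes(adj, "u", clique))
```

Both entries of the last pair equal half of the clique size, so neither can be bounded in terms of $k$ alone.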

###### Proof.

For and , let . Thus to prove the lemma we need to show that for .

Towards a contradiction assume for some . Clearly, there is a subset such that and , too. Since for any by the definition of , we also get for all .

We are going to count the number of pairs such that , are distinct vertices and exactly one of , is an edge of . See Figure 2. On the one hand, for any fixed , every forming such a desired pair belongs to and so we obtain the upper bound

$$D \le \sum_{\{u,u'\} \in \binom{U'}{2}} |N(u) \mathbin{\triangle} N(u')| \le \binom{|U'|}{2} \cdot k = \binom{4k+2}{2} \cdot k < 3k^2(4k+2), \tag{1}$$

where holds by the definition of .

On the other hand, we may fix and count the number of unordered pairs such that exactly one of , is an edge of ; this number is equal to if , and to or if . Therefore,

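$$D \ge \sum_{w \in V} \bigl(\alpha_{U'}(w) - 1\bigr) \cdot \bigl(|U'| - 1 - \alpha_{U'}(w)\bigr) \ge \sum_{w \in V} (k+1-1)(4k+2-1-k-1) = |V| \cdot 3k^2 \ge 3k^2(4k+2) \tag{2}$$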

since we have and .

Now, (1) and (2) contradict each other, and hence the lemma follows. ∎
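As a sanity check on the arithmetic behind this contradiction (a worked verification added for the reader, assuming $k \ge 1$ and reading $|U'| = 4k+2$ and $|V| \ge 4k+2$ off the displayed bounds), the strict inequality at the end of (1) reduces to a linear one:

$$\binom{4k+2}{2} \cdot k = \frac{k(4k+1)(4k+2)}{2} < 3k^2(4k+2) \iff 4k+1 < 6k \iff k > \tfrac{1}{2}.$$

For example, with $k = 1$ the upper bound (1) gives $D \le \binom{6}{2} \cdot 1 = 15$, whereas the lower bound (2) gives $D \ge |V| \cdot 3 \ge 18$, which is impossible.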

###### Corollary 5.3.

Let and be the two classes of Lemma 5.2 such that . Then exactly one of the following two possibilities holds:

1. every vertex of is connected to at most vertices of  and every vertex of is connected to at most vertices of , or

2. every vertex of is connected to all but at most vertices of  and every vertex of is connected to all but at most vertices of .

###### Proof.

We first show that either

• every vertex of is connected to at most vertices of , or

• every vertex of is connected to all but vertices of .

Indeed, for any vertex taken separately, only one of these cases can happen since , and one of these cases has to happen by Lemma 5.2. Assume that there exist with having at most neighbours in while is connected to all but at most vertices of . Then , contradicting the definition of .

To finish the proof, we have to show that the following case (relevant if ) is impossible: every vertex of is connected to at most vertices of  and every vertex of is connected to all but at most vertices of . We count the total number of edges between and ; it would be at most and, at the same time, at least . However, the difference between these lower and upper estimates is

$$(|U|-2k) \cdot |V| - 2k \cdot |U| = |U| \cdot |V| - 2k \cdot (|U|+|V|) = (|U|-4k)(|V|-4k) + 2k(|U|+|V|) - 16k^2 > k \cdot k + 2k(5k+5k) - 16k^2 = 5k^2 > 0, \tag{3}$$

a contradiction, thus finishing the whole proof. ∎
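The middle equality in (3) is perhaps easiest to verify by expanding the product (a short worked step added for the reader):

$$(|U|-4k)(|V|-4k) = |U| \cdot |V| - 4k(|U|+|V|) + 16k^2,$$

so adding $2k(|U|+|V|) - 16k^2$ leaves exactly $|U| \cdot |V| - 2k(|U|+|V|)$. The strict inequality that follows then appears to rely on both classes having more than $5k$ vertices, which is how the substituted values $k \cdot k$ and $2k(5k+5k)$ arise.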

###### Remark 5.4.

Note that Corollary 5.3 still applies if . I.e., for a single near--twin equivalence class with either

1. every vertex of has at most neighbours in , or

2. every vertex of has at least neighbours in .

### 5.2 From near-k-twins to bounded degree

Here we present the core of our algorithm – a procedure which, given a graph for which the near--twin relation of is an equivalence of bounded index, produces a (labelled) graph (on the same vertex set) of bounded degree, and a formula such that .

The idea behind the procedure is the following: We start by dividing the near--twin classes of into “small” and “large” ones (w.r.t. ), dealing with each of these two types of classes separately.

• Each large class (more precisely, the vertices in the class) is assigned a label and each pair of large classes receives another label indicating whether “almost all” or “almost none” of the edges between the two classes are present. The exceptions to the “almost all” or “almost none” rules are recorded as edges of the graph (by Corollary 5.3 each vertex has a bounded number of such exceptions, hence the bounded degree of ). Using these labels and the graph we properly encode the -adjacency between the vertices in the large classes.

• The -adjacency of the vertices from small equivalence classes (both within the small classes and also to the large ones) is encoded by assigning a new label to each such vertex and another new label to its neighbourhood. The vertices from small classes have no edges in the graph .

Note that the construction sketched above depends on and also on the number of near--twin equivalence classes of . Unfortunately, as explained earlier, we cannot fix one universal value of the parameter beforehand, but at least we can use upper bounds on both and the number of equivalence classes (as in Definition 4.2). With a slightly more complicated use of labels, we can then give a universal formula which depends only on the parameters and of a -near-uniform graph class , but is independent of the particular . This way we get a result even stronger than what is required for the proof of Theorem 5.1 (see Section 6 for more discussion):

###### Theorem 5.5.

Let , and be a -near-uniform graph class. There exists an FO formula , depending only on and , such that where denotes the class of (finite) graphs of degree at most .

Furthermore, for any and such that the near--twin relation of is an equivalence of index at most , one can in polynomial time compute a graph such that .

###### Proof.

We are going to prove the theorem by defining the formula and, for each , efficiently constructing a graph such that . We give the construction of the graph first, while postponing the definition of to the end of the proof.

Let be such that the near--twin relation of is an equivalence of index at most . Let where be the near--twin classes of with more than vertices (possible “small” near--twin classes are ignored now). Observe that contains all but at most vertices of . Let denote the remaining vertices in “small” equivalence classes. See an illustration in Figure 3.

We will construct the graph in three stages. First, we define the graph on the set , where the edge sets are given as:

• Let be the set of those indices from such that every vertex of has at least neighbours in (case (b) of Remark 5.4). We put .

• Let be the set of those index pairs from such that every vertex of is connected to all but at most vertices of  and every vertex of is connected to all but at most vertices of  (case (b) of Corollary 5.3). We put .

In the second step, we adjust by the original edges from : Let . Then we put . See Figure 4. Note that every vertex of has degree at most  by Corollary 5.3.

In the degenerate case of we arrive at the same conclusion by the following alternative argument. By the definition, each near--twin class is an independent set and each pair of classes is again independent or induces a complete bipartite subgraph—this now defines and which is actually edgeless.
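The first two stages can be sketched in Python as follows. This is a minimal illustration under several assumptions, not the construction verbatim: the size bound for “large” classes is passed in as the hypothetical parameter `large_threshold`; whether a class or a pair of classes is of the “almost all” kind is decided by a simple majority test on one representative vertex, which is adequate only when the dichotomy of Corollary 5.3 holds; and the second stage is read as keeping exactly those vertex pairs whose adjacency in the original graph deviates from the “almost all”/“almost none” rule. The name `exception_graph` is likewise hypothetical.

```python
from itertools import combinations


def exception_graph(adj, classes, large_threshold):
    """Return (large, dense_within, dense_between, h_adj) for the large classes.

    adj maps each vertex to its neighbour set; classes is a list of vertex sets.
    h_adj contains an edge {u, v} iff the adjacency of u and v in the input graph
    is an exception to the rule recorded for their class (or pair of classes).
    """
    large = [set(c) for c in classes if len(c) > large_threshold]
    W = set().union(*large)

    def mostly_adjacent(v, cls):
        # Majority test; assumes v sees almost all or almost none of cls.
        others = cls - {v}
        return 2 * len(adj[v] & others) > len(others)

    # Classes that are nearly cliques, and pairs that are nearly complete bipartite.
    dense_within = {i for i, c in enumerate(large)
                    if mostly_adjacent(next(iter(c)), c)}
    dense_between = {(i, j) for i, j in combinations(range(len(large)), 2)
                     if mostly_adjacent(next(iter(large[i])), large[j])}

    index = {v: i for i, c in enumerate(large) for v in c}
    h_adj = {v: set() for v in W}
    for u, v in combinations(W, 2):
        i, j = sorted((index[u], index[v]))
        expected = (i in dense_within) if i == j else ((i, j) in dense_between)
        if expected != (v in adj[u]):         # record only the exceptions
            h_adj[u].add(v)
            h_adj[v].add(u)
    return large, dense_within, dense_between, h_adj
```

Vertices of small classes do not appear in `h_adj` at all, matching the description that they receive no edges; their adjacency has to be encoded separately by labels.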

In the third step we add back the vertices from (remember that ) by putting . Note that ,

Finally we label the vertices of by the following fixed label set, which is independent of the particular :

 L := {λi,λ′i:i=1,…,p} ∪{μi,j,νi,j,μ′i,j,ν′i,j:1≤i

The vertices of are labelled as follows (see again Figure 4):

• For , each vertex of is assigned label if , and label otherwise.

• For , each vertex of is assigned label and each of label if , and labels and , respectively, if .

• Let be indexed in any chosen order. For , the vertex is assigned label and each neighbour of in is assigned label .

With in place, we can now define the formula

 ψ(x,y)≡(x≠y)∧(ψ′(x,y)∨ψ′(y,x))

where

 ψ′(x,y) ≡⋁