Testing graph properties has been at the core of combinatorial property testing since the very beginning, with the important results of Goldreich-Goldwasser-Ron . There are several different models of interest. In the dense graph model an -vertex graph is given by its boolean adjacency matrix. For this model there are characterizations of the properties that can be tested with a constant number of queries by -sided error tests , -sided error tests , and the properties that are defined by forbidden induced subgraphs and are testable with very small query complexity .
In the other model, called the incidence-list model, an -vertex graph is represented by its incidence lists. That is, an array of size in which every entry is associated with a vertex, and contains a list of the neighbours of that vertex. This model contains the important special case of the bounded-degree model in which the degree of the vertices is bounded by a universal parameter (and hence the lists are of size at most ).
The bounded-degree model, first considered in the property testing context by Goldreich and Ron , has attracted much of the research interest in combinatorial property testing in the past decade. One reason is the algorithmic sophistication and wealth of structural results that were developed in the study of property testing in this model, e.g., the use of random walks to test partition properties, starting in , the sophisticated recent results in [9, 6] for expander and clustering testing, the “local-partition” oracle [16, 17, 20], and others. The other motivation is the rapidly growing research on very large networks, e.g., the Internet, and other natural large networks such as social networks. These large networks often turn out to be represented by bounded-degree (di)graphs (or very sparse (di)graphs). Property testing of sparse graphs can provide a useful filter to discard unwanted instances at a very low cost (in time and space), as well as algorithmic and structural insights regarding the tested properties.
Despite the focus and wealth of results, the bounded-degree model remains far from being understood. In particular, at present there is no characterization of the properties that are testable in constant query complexity, neither by -sided error tests nor by -sided error tests.
We focus on 1-sided error testing. Our main result is a characterization of the monotone (di)graph properties, and the hereditary (di)graph properties, that are 1-sided-error strongly-testable (for a formal definition of “property testing” see Section 2). Here “strongly-testable” means that the property can be tested by a constant number of queries that is independent of the graph size, but may depend on the distance parameter . The characterization essentially states that a monotone graph property is strongly-testable if and only if it is close (see Definition 2.5) to a property that is defined by a set of forbidden subgraphs of constant size (Theorem 6.3). For hereditary properties we obtain a similar result (Theorem 6.4), except that forbidden subgraphs are replaced with forbidden induced subgraphs.
We believe that our results form a first step towards a characterization of all -sided error strongly-testable graph properties in the bounded-degree model.
The bounded-degree model extends naturally to directed graphs. There are two different models that have been studied for directed graphs. In the first, the access to the graph is via queries to outgoing neighbours, and correspondingly, only the out-degree of vertices is bounded. This model corresponds to the standard representation of directed graphs in algorithmic computer science, namely, one where an -vertex directed graph (digraph) is represented by lists, each associated with a distinct vertex in the graph and containing the list of forward edges going out from . The access to a -outdegree bounded digraph in this model is via queries of the following type: a query specifies a pair where and . As a response, the algorithm discovers the th outgoing neighbour of the vertex (if there is one, or a special symbol otherwise). In what follows we abbreviate this model as the -model, where is the upper bound on the out-degree of vertices.
In the other model, both the in-degree and out-degree are bounded by . In this case an -vertex graph is represented by lists; the list of outgoing edges and the list of incoming edges for each vertex. The query type changes accordingly and allows both ‘outgoing’ and ‘incoming’ edge queries. We denote this model as the -model (‘forward’ and ‘backward’ queries). This model contains the model of undirected -bounded degree graphs (where each undirected edge is replaced by a pair of anti-parallel edges).
We note that the model, as a collection of graphs, strictly contains the model, while algorithmically it is more restricted by the limited access to the graph.
In all models, an -vertex (di)graph is said to be -far from a (di)graph property if it is required to change (delete and/or insert) at least edges in order to get a -bounded degree graph (in the corresponding model) that has the property .
The results in this paper are the characterization of the monotone digraph properties and hereditary digraph properties that are -sided error strongly-testable in the -model (Theorems 3.3 and 3.5). The results for the model easily follow from those for the -model. As the -model contains the undirected case, an analogous characterization of graph properties for the -bounded degree undirected graph model is implied. We note that these are the first results that restrict neither the family of graphs nor the family of testers under consideration (apart from being -sided-error).
Related results: There are many results for the bounded-degree model on the testability of specific properties of graphs or digraphs, cf. [4, 9, 14, 12, 18, 21, 23, 22], and others. In  the authors relate (2-sided error) testability in the and models. Other general results typically fall into three categories. In the first, not all -bounded degree graphs are considered, but rather a restricted family of graphs. It is shown, e.g., in [16, 17, 19] (and citations therein) that under certain restrictions on the input graphs all graph properties are -sided error strongly-testable (for instance, any graph property is -sided error strongly-testable for any hyperfinite family of graphs). The other two types of general results are when the graph properties under study are restricted, or the class of testers is restricted. Most relevant for this work are the results of Czumaj, Shapira and Sohler , and Goldreich-Ron . In  it is shown that any hereditary property is -sided error strongly-testable if the input graph belongs to a hereditary and non-expanding family of graphs. In  restricted -sided error testers called proximity oblivious testers (POT) for graph properties (and other properties) are studied. The POT is not constructed for an explicitly given distance parameter . Instead, the tester works for any distance parameter , but its success probability deteriorates as the distance parameter tends to 0.  give several general results as to when graph properties have a POT in the bounded-degree model (and other models).
Techniques and description of results: Aiming at a characterization result, we should understand which limitations a 1-sided error test, making queries, puts on the structure of the property it tests. It turns out that this is relatively simple. Using the tools from  (see also ), one can transform any -sided error tester into a “canonical” one that picks (uniformly) random vertices in , and then scans the balls of radius around each. Finally, it makes its decision based only on the subgraph it discovers and its interface to the rest of the graph. To make this latter point clearer, consider the -degree bounded model and the property of not having a vertex of degree two. This property is -sided error strongly-testable simply by looking at a random vertex and rejecting if its degree is exactly two. Note that this decision cannot be reached just from the fact that the subgraph seen is a subgraph of . It is important that the sampled vertex is not connected to any vertex other than its discovered neighbours. Namely, this property is not specified by a forbidden subgraph (or induced subgraph). This suggests the notion of configuration appearing also in , and defined for our setting in Section 2.
Loosely speaking a configuration specifies an induced subgraph with an induced “interface” to the rest of the graph (see Definition 2.12). With this notion it is fairly easy to see that any -sided error test can essentially test only graph properties that are close to being defined by a collection of forbidden configurations (the additional subtleties arise from the fact that the tester is actually being designed for a distance parameter , and for different ’s testers might reject different configurations).
Is the converse true? Namely, is every property that is defined by a set of forbidden configurations (let alone being “close” to such) strongly-testable? This is open at this point.
Showing that a property that is defined by a forbidden set of configurations is -sided error strongly-testable usually amounts to proving what are called “removal lemmas”, namely, lemmas stating that if a graph is -far from a property then it has a large number of appearances of forbidden configurations (here “large” is , namely linear in ). While this is not generally sufficient for testing, it is essential.
In the case of monotone properties the notion of ‘forbidden configurations’ can be replaced with ‘forbidden subgraphs’. As it turns out, a removal lemma is true for monotone properties in all models. For hereditary properties ‘forbidden configurations’ can be replaced with ‘forbidden induced subgraphs’. A removal lemma is also true for hereditary properties in the -model, but not true for the -model. In the latter case we use a somewhat different argument and test.
Our main results show that for all the bounded-degree models, for both monotone properties, and hereditary properties, a property is -sided error strongly-testable if and only if it is “close” to a property that is defined by an appropriate set of forbidden graphs (see Section 2 for the exact definition of “close” in this context). It could be that by replacing forbidden graphs with forbidden configurations, this becomes true for any graph property. If indeed true, this will settle the characterization problem of -sided error strongly-testable properties (see the discussion at the end of Section 4). We do not currently know if a generalization of some sort is true even for undirected -degree bounded graphs.
Finally, the characterization that we present is a structural result on -sided error strongly-testable properties. It provides a better understanding of the different models and the difference between them. One could further ask whether the characterization could be used to easily determine whether a given property is -sided error strongly-testable using arguments totally outside the area of property testing. This is indeed demonstrated (Section 7) by proving (the known results) that 2-colorability is not -sided error strongly-testable, and that not having a -star as a minor is strongly-testable (here is constant).
Organization: We start with the essential notations and preliminaries in Section 2. Section 3 contains a statement of our main results for the -model, and Section 4 contains the proofs of the main results. Section 5 contains further discussion, and examples of properties that are strongly-testable but neither monotone nor hereditary. Section 6 contains the analogous characterizations for the -model. Finally, Sections 7 and 8 contain the application of our results to give simple proofs of some known results, and some concluding remarks, respectively.
2.1 Graph related notations
Graphs here are mostly directed, can have anti-parallel edges but no multiple edges. We will describe the results (and corresponding definitions) mainly for the -model which is the more interesting technically. Moreover, as we do not have a bound on the in-degree for this model, better understanding this model may form a tiny step towards better understanding testing in sparse graphs (of unbounded degree).
For a directed graph we denote by the directed edge . That is, is a forward edge from . In turn, will be a member in the outgoing list of neighbours of .
Definition 2.1 (Neighbourhood)
For a digraph and a vertex we denote by the set of outgoing neighbours of . Formally, .
Similarly, and .
Note that for undirected graphs and coincide.
We generalize the notion of neighbourhood for sets of vertices: For we denote by . and are defined analogously.
Definition 2.2 (Degree Bound)
For an integer a digraph is called -bounded-out-degree if for every . The -model contains all digraphs that are -bounded-out-degree.
Note that the in-degree of a vertex can be arbitrary.
For a (di)graph and , we denote by the (di)graph on that is obtained from by deleting the vertices in . We denote by the induced subgraph of on (that is, contains all edges in with both endpoints in ).
A directed -star is the graph containing vertices and the edges . In this case is called the “center”.
2.2 Properties and testers
Definition 2.3 (The -model; queries)
Let be a graph on vertices in the -model. The access to is via the following oracle: a query specifies the name of a vertex . As a result the oracle provides as an answer.
Note that an algorithm has no direct access to the incoming edges of a specified vertex .
We note that a standard query in the incidence-list model is for a pair , where and an index, on which the oracle’s answer is the th vertex in the ordered list . For the two query-types are asymptotically equivalent (up to multiplying the number of queries by a factor of ). We use the definition above to emphasise that algorithms, as well as properties, are invariant to the order of the vertices in .
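As an illustration of the two query types discussed above, the following Python sketch models both oracles and shows how one vertex query can be simulated by index queries, one per possible index. The class and function names, and the representation of the graph as a dictionary of out-neighbour lists, are assumptions made for this example only and do not appear in the paper.

```python
class VertexQueryOracle:
    """Vertex-query access: a query names a vertex and returns its out-neighbour set."""

    def __init__(self, out_lists, d):
        # out_lists: dict mapping each vertex to the list of its out-neighbours.
        for nbrs in out_lists.values():
            # Out-degree at most d, and no multiple edges.
            assert len(nbrs) <= d and len(set(nbrs)) == len(nbrs)
        self._out = out_lists
        self.d = d
        self.queries = 0  # number of queries made so far

    def query(self, v):
        self.queries += 1
        # A set is returned, so the answer does not depend on the list order.
        return frozenset(self._out.get(v, ()))


class IndexQueryOracle(VertexQueryOracle):
    """Incidence-list access: a query (v, i) returns the i-th out-neighbour of v."""

    def query(self, v, i):
        self.queries += 1
        nbrs = self._out.get(v, [])
        # None plays the role of the special symbol for a missing neighbour.
        return nbrs[i - 1] if 1 <= i <= len(nbrs) else None


def neighbourhood_from_index_queries(oracle, v):
    """Simulate one vertex query by d index queries."""
    answers = (oracle.query(v, i) for i in range(1, oracle.d + 1))
    return frozenset(w for w in answers if w is not None)
```

In this sketch a single vertex query costs `d` index queries, which is the multiplicative overhead mentioned in the text.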
The -model is similar where for a query , the answer is the pair of sets and (both sets are of size at most ). In the undirected case the result is (of size bounded by ). In terms of property testing, the -bounded degree model for undirected graphs can be seen as a submodel of -model where each undirected edge is represented as two anti-parallel edges.
Definition 2.4 ((di)Graph Properties)
A (di)graph property is a set of (di)graphs that is closed under isomorphism. Namely if then any isomorphic copy of is in . We write , where is the set of -vertex graphs in .
Definition 2.5 (Graph distance, distance to a property, distance between properties)
Let and be (di)graphs on vertices in any of the -bounded degree models (that is, the -model, , or -bounded degree undirected graph model). The distance, , is the number of edges that need to be deleted from and/or inserted into in order to make it .
We say that are -far (or is -far from ) if . Otherwise are said to be -close.
Let be properties of -vertex (di)graphs. is -close to if it is -close to some . We say that are -close (or is -close to ) if every graph in is -close to and every graph in is -close to .
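The distance notions of Definition 2.5 can be sketched in a few lines of Python. This is only an illustration: digraphs on a common vertex set are represented as sets of ordered pairs, the function names are hypothetical, and the threshold `eps * d * n` is the normalisation commonly used in bounded-degree models, assumed here because the exact expression is not recoverable from the text.

```python
def edge_distance(edges_g, edges_h):
    """Number of edge deletions and insertions needed to turn one digraph
    into the other (both on the same vertex set)."""
    return len(set(edges_g) ^ set(edges_h))  # size of the symmetric difference


def is_far(edges_g, edges_h, eps, d, n):
    """Assumed normalisation: far means at least eps * d * n edge changes."""
    return edge_distance(edges_g, edges_h) >= eps * d * n


def is_far_from_property(edges_g, property_members, eps, d, n):
    """A digraph is far from a property if it is far from every member.
    property_members is an explicit (finite) list of edge sets."""
    return all(is_far(edges_g, edges_h, eps, d, n) for edges_h in property_members)
```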
Definition 2.6 (Monotone properties and hereditary properties)
A (di)graph property is monotone (decreasing) if for every , deleting any edge results in a (di)graph that is in . A (di)graph property is hereditary if for every and , .
Many natural (di)graph properties are monotone, e.g., being acyclic, being -colourable etc. Note that if is a monotone graph property then for every is by itself monotone.
Definition 2.7 (The (di)Graph Properties and )
Let be a set of digraphs. A digraph is -free if for every does not contain any subgraph that is isomorphic to .
The monotone property contains all digraphs that are -free, and contains all -vertex (di)graphs in . Similarly, we denote by the property that is defined by being -free as and the set of -vertex (di)graphs in .
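To make the two notions of being free of a forbidden family concrete, here is a brute-force Python sketch that checks subgraph containment and induced-subgraph containment by trying all injective vertex maps. It is meant for tiny examples only; the function names and the representation of a digraph as a (vertex list, edge set) pair are assumptions of this illustration.

```python
from itertools import permutations


def contains_subgraph(g_vertices, g_edges, h_vertices, h_edges):
    """True if some injective map sends every edge of H to an edge of G."""
    g_edges, h_vertices = set(g_edges), list(h_vertices)
    for image in permutations(list(g_vertices), len(h_vertices)):
        phi = dict(zip(h_vertices, image))
        if all((phi[u], phi[v]) in g_edges for (u, v) in h_edges):
            return True
    return False


def contains_induced_subgraph(g_vertices, g_edges, h_vertices, h_edges):
    """As above, but non-edges of H must also map to non-edges of G."""
    g_edges, h_edges, h_vertices = set(g_edges), set(h_edges), list(h_vertices)
    for image in permutations(list(g_vertices), len(h_vertices)):
        phi = dict(zip(h_vertices, image))
        if all(((phi[u], phi[v]) in g_edges) == ((u, v) in h_edges)
               for u in h_vertices for v in h_vertices if u != v):
            return True
    return False


def is_free(g_vertices, g_edges, family, induced=False):
    """family: iterable of (vertex list, edge set) pairs of forbidden digraphs."""
    check = contains_induced_subgraph if induced else contains_subgraph
    return not any(check(g_vertices, g_edges, hv, he) for hv, he in family)
```

For example, a directed triangle contains a directed path on three vertices as a subgraph but not as an induced subgraph, so it is free of that path only in the induced sense.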
Definition 2.8 (bounded-size collections)
Let be a set of (di)graphs. We call a -set if every member has at most vertices.
A natural example of a monotone decreasing graph property is a property that is defined by a family of forbidden subgraphs . It is immediate from the definition that every monotone graph property is defined by a family of forbidden subgraphs, but this family may be infinite.
Recall that is monotone if and only if is monotone for every . Namely, being monotone is defined for every separately. In this respect being monotone is not a ‘global’ feature of but rather a feature of the individual . In what follows it will be important to us how the individual monotone properties are defined. Obviously for any fixed , is defined by an -set of forbidden subgraphs, but may depend on .
To make this clearer, consider the property of being acyclic. This property is defined by forbidding all di-cycles, which is an infinite family. For the individual slices the corresponding family, although finite, is not a -set unless . An example of a slightly different nature is that of the monotone property that contains the digraphs that are not Hamiltonian. For every is defined by one forbidden subgraph (the simple directed -cycle). Thus is defined by a -set of forbidden subgraphs but for no fixed , can be defined by an -set for every .
This distinction will become important in our characterization results. It will turn out that the strongly-testable monotone properties are tightly related to properties that are defined by -sets of forbidden subgraphs for that is independent of .
For a family of forbidden digraphs, the monotone property of being -free is determined by the minimal members of (w.r.t. edge deletions). That is, if for it holds that is a subgraph of , then being -free is identical to being -free.
Hereditary (di)graph properties are very natural in graph theory. It is immediate from the definition that a property is hereditary if and only if it is defined by a collection (possibly infinite) of forbidden induced subgraphs. Examples are the property of not containing an induced (di)cycle of length , and the property of being bipartite (which is expressed in this case as not containing an odd-size cycle). Both these properties are monotone and hereditary.
Hereditary properties are not necessarily monotone, and monotone properties are not necessarily hereditary. Further, the feature of being hereditary, unlike being monotone, depends on the entire property and cannot be defined for a single -slice .
Testers: We define here -sided error testers for digraph properties in the -model.
Definition 2.10 (-sided error -test for a digraph property , -model)
A -sided error test for a digraph property is a randomized algorithm that gets two parameters, and a distance parameter . It accesses its input graph via vertex queries (Definition 2.3), and satisfies the following two conditions.
It accepts every -vertex digraph in that belongs to with probability .
It rejects every -vertex digraph that is -far from with probability at least .
The query complexity of the test is the maximum number of queries it makes for any input graph (in or not in ) and for every run. Hence the query complexity is a function of and .
A note on the definition of testers: A test for a graph property is formally an infinite set of tests , where is a test for and distance parameter . Namely, we deal here with a non-uniform model of computation. We often use the term -test to emphasize that the test is designed for an error parameter . This will be of special importance in this paper, as for different distance parameters, the test will behave differently. We are interested, as usual, in the query complexity as a function of and . Note further that since our models are parameterized by , the query complexity (or even the fact whether a property is testable in the corresponding -bounded degree model) may depend on . We may state the query complexity dependence on but this is of no particular importance in this paper.
Definition 2.11 (strong-testability)
Let . If a property has an -test whose query complexity on every -vertex graph is bounded by , we say that is -strongly-testable. If is -strongly-testable for every we say that is strongly-testable.
2.3 Configurations - the -model
The following definition of configuration is of major importance in this paper. The motivation behind the definition is that a configuration is what a tester discovers after making some queries to the graph. It will turn out that the configuration that a tester discovers contains all the information that is used by the tester in order to form its decision.
Definition 2.12 (Configuration, -model)
A configuration is a pair , where is a -bounded-out-degree graph, and is a function . The out-degree of every frontier vertex is .
Consider a run of a tester on a graph . The tester discovers all (the at most ) outgoing neighbours of every queried vertex. At the end of the run, after making queries, the tester discovers a subgraph of . contains the vertices that are queried; these correspond to the developed vertices in the configuration it discovers. may also contain vertices that are neighbours of queried vertices but that were not themselves queried. These vertices are the frontier vertices. A frontier vertex that is discovered by the tester and was not queried may have outgoing neighbours, but the corresponding edges (the forward edges from the frontier vertex) will not be discovered by the tester. Consequently, the out-degree of a frontier vertex in the discovered configuration is 0. In contrast, all forward edges of a developed vertex are discovered.
We now make the above formal using the definition below.
Definition 2.13 (-Free, -model)
Let be a configuration, where is a digraph and . Let be a digraph in the -model. We say that has a -appearance if there is an injective mapping with the following two properties:
and , .
For every developed if then .
We say that is -free if has no -appearance.
Let be a configuration with being the developed vertices. Definition 2.13 implies that if has a -appearance on a vertex set , with being the mapping as in the definition, then is isomorphic to . Namely, induces on its developed vertices a digraph isomorphic to the one that induces on the vertices that are the images of the developed set of vertices . Further, the 2nd requirement in Definition 2.13 asserts that for every , all forward edges of in are the ‘images of edges’ in . It is not necessarily the case that is isomorphic as an induced subgraph to . This is because there might be an edge that is not in . This can happen only if is an image of a frontier vertex.
To exemplify Definition 2.13 further, consider , where is the directed -star and the center is the only developed vertex in . A digraph has a -appearance if and only if it has a vertex with exactly two outgoing neighbours . There could be an edge and hence the subgraph that induces on might not be isomorphic to . There could also be an edge . However, there cannot be an edge where .
We sum up this discussion with the following obvious fact.
Let be a configuration and a digraph (all with respect to the -model). Then:
If has a -appearance then contains as a subgraph.
If is isomorphic to as an induced subgraph, then a subgraph of that is obtained by deleting in has a -appearance (for the given ).
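The following brute-force Python sketch illustrates the notion of a configuration appearance on the star example above. It rests on one reading of Definition 2.13, stated here as an assumption: an appearance is an injective map that sends every edge of the configuration's digraph to an edge of the host digraph, and under which every out-edge of the image of a developed vertex is the image of an edge of the configuration. The function name and data representation are hypothetical.

```python
from itertools import permutations


def has_appearance(g_vertices, g_edges, h_vertices, h_edges, developed):
    """Brute-force search for an appearance of the configuration
    (H, developed) in G; feasible for very small graphs only."""
    g_edges, h_edges = set(g_edges), set(h_edges)
    h_vertices = list(h_vertices)
    out_g = {v: set() for v in g_vertices}
    for u, w in g_edges:
        out_g[u].add(w)
    out_h = {v: set() for v in h_vertices}
    for u, w in h_edges:
        out_h[u].add(w)
    for image in permutations(list(g_vertices), len(h_vertices)):
        phi = dict(zip(h_vertices, image))
        # Assumed condition 1: edges of H map to edges of G.
        edges_ok = all((phi[u], phi[v]) in g_edges for (u, v) in h_edges)
        # Assumed condition 2: a developed vertex has no out-edges in G
        # other than the images of its out-edges in H.
        closed = all(out_g[phi[v]] == {phi[w] for w in out_h[v]}
                     for v in developed)
        if edges_ok and closed:
            return True
    return False
```

With the directed 2-star whose center is the only developed vertex, this check succeeds exactly when the host digraph has a vertex with exactly two outgoing neighbours, whereas plain subgraph containment would also accept a vertex of out-degree three.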
Finally, looking towards a characterization theorem, it would be of use if we could restrict the behaviour of possible testers to “canonical” ones. This proved useful in the dense graph model in  and it is of similar flavour (and simpler) here. It was already done in  for undirected -bounded degree graphs and the extension to directed graphs (in both models) is straightforward. We state it here in order to be consistent with our notations.
Definition 2.15 (-disc around a vertex, -model)
Let be a digraph and . The -disc around denoted , is the subgraph of that is induced by all vertices for which there is a path from to of length at most .
We note that a tester can discover the -disc around a given vertex . This is done by making a ‘BFS-like’ search from , where at each step the tester queries the earliest-discovered vertex that has not yet been queried and is at distance less than from . Discovering takes at most queries for a graph in the -model. It is useful to consider such a procedure as an augmented query, motivating the following definition.
Definition 2.16 (-disc query, -model)
An -disc query is made by specifying a vertex for which the answer is the -disc around .
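The ‘BFS-like’ search behind a disc query can be sketched in Python as follows. The sketch assumes the oracle is given as a function `query(u)` returning the out-neighbours of `u`; this interface and the function name are hypothetical. Only vertices at distance less than `r` from the start vertex are queried, so out-edges of vertices at distance exactly `r` are not discovered.

```python
from collections import deque


def disc_query(query, v, r):
    """Explore the radius-r neighbourhood of v using out-neighbour queries.

    Returns (discovered vertices, discovered edges, queried vertices)."""
    dist = {v: 0}          # directed distance from v of each discovered vertex
    edges = set()
    queried = set()
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] >= r:   # vertices at distance r are discovered but not queried
            continue
        queried.add(u)
        for w in query(u):
            edges.add((u, w))
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return set(dist), edges, queried
```

In the terminology of Section 2.3, the queried vertices returned here play the role of developed vertices and the remaining discovered vertices that of frontier vertices.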
Definition 2.17 (canonical-testers)
A -canonical tester for a graph property is a tester that chooses vertices uniformly at random . It then makes an -disc query around for . Then, depending only on the configuration it sees and possibly on (but not the order of the queries, or the internal coins) it makes its decision.
The following result  shows that strongly-testable properties can be tested by canonical-testers (in  it is done only for undirected graphs, but the generalization to directed graphs in both models is straightforward).
Let be a -sided error -test for a digraph property in the -model. If the query complexity of is bounded by then there is a -canonical tester that is a -sided error -test for .
Note that a -canonical-tester is a ‘non-adaptive’ algorithm with respect to -disc queries.
3 Our main results
We consider in what follows the model (for constant ). The -model is the more natural model from the algorithmic point of view, being consistent with the standard data structures for directed graphs. It contains a strictly larger set of graphs than the -model (as the in-degree is not bounded). From the property testing perspective it is more restricted algorithmically due to the limited access to the graph.
We prove here that the strongly-testable monotone graph properties are those that are close (in the sense of Definition 2.5) to being expressed by an -set of forbidden subgraphs that have some additional connectivity requirements. For hereditary properties the results are essentially the same where forbidden subgraphs are replaced with forbidden induced subgraphs. We need the following definitions.
Definition 3.1 (Component)
Let be a directed graph. A subset defines a component of , if by disregarding the directions of the edges of , induces a connected component in the resulting undirected graph. We say in this case that , the directed subgraph of that is induced by , is a component of .
We note that Definition 3.1 is not a standard graph-theory term, and we warn the reader not to confuse it with strongly connected components of the digraph. We are concerned with graphs of multiple components as the forbidden graphs that define a monotone property might be such. E.g., let be the directed -cycle, and consider the property of being -free, the property of being -free, the property of being -free, and the property of being free of the single graph that is a vertex disjoint union of and . Namely, a graph is not in if it has a subgraph and a disjoint subgraph. All properties are distinct. The properties are defined by one forbidden graph. is defined by two forbidden graphs. The forbidden graphs defining have one component each, while the single forbidden graph defining has two components.
Definition 3.2 (Rooted digraph)
A digraph is rooted if every component of has a vertex such that for every , there is a di-path from to in .
We note that a digraph can have many roots. In particular, if it is strongly connected then every vertex of it is a root. The significance of being a root in a component of size at most is that making an -disc query around will discover the whole component that contains .
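Definitions 3.1 and 3.2 translate directly into a small check. The Python sketch below computes the components of a digraph by ignoring edge directions and then tests whether each component has a vertex from which every vertex of the component is reachable by a directed path. Function names and the (vertex list, edge set) representation are assumptions of this illustration.

```python
def weak_components(vertices, edges):
    """Components in the sense of Definition 3.1 (edge directions ignored)."""
    nbrs = {v: set() for v in vertices}
    for u, w in edges:
        nbrs[u].add(w)
        nbrs[w].add(u)
    seen, comps = set(), []
    for s in vertices:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            x = stack.pop()
            for y in nbrs[x]:
                if y not in comp:
                    comp.add(y)
                    stack.append(y)
        seen |= comp
        comps.append(comp)
    return comps


def reachable_from(src, edges):
    """Vertices reachable from src by a directed path (including src)."""
    out = {}
    for u, w in edges:
        out.setdefault(u, set()).add(w)
    reach, stack = {src}, [src]
    while stack:
        x = stack.pop()
        for y in out.get(x, ()):
            if y not in reach:
                reach.add(y)
                stack.append(y)
    return reach


def is_rooted(vertices, edges):
    """True if every component contains a root."""
    return all(any(reachable_from(v, edges) >= comp for v in comp)
               for comp in weak_components(vertices, edges))
```

For instance, a directed path is rooted at its first vertex, while the digraph with edges from two distinct vertices into a common third vertex is connected as an undirected graph but has no root.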
Our main theorem, characterizing the strongly-testable monotone properties is the following.
Theorem 3.3. Let be a monotone digraph property in the -model. Then is strongly-testable if and only if there is a function such that for any and there is a non-redundant -set of rooted digraphs such that the property that consists of the -vertex digraphs that are -free satisfies the following two conditions:
(b) is -close to .
We note that the sets in Theorem 3.3 may depend on (as the bound depends on ).
A similar theorem for hereditary properties is the following.
Let be a set of digraphs. We say that is essential if the digraph is -free as induced subgraph. Namely, does not contain as an induced subgraph any member of except for itself. If every is essential, we say that is non-redundant.
Let be a set of digraphs. Recall the definition of the property from Definition 2.6. We denote by the set of -vertex digraphs in .
Theorem 3.5. Let be a hereditary digraph property in the -model. Then is strongly-testable if and only if there are functions and such that for any there is a -set of rooted digraphs such that for every satisfies the following two conditions:
(b) is -close to .
Some comments on the results:
The lower bound in Theorem 3.5 is essential and not an artifact of the proof. Consider the -model and let be the directed cycle of size . Let be the property that contains an -vertex graph if it is free of all cycles for (as induced subgraphs). This is a strongly-testable hereditary (and monotone) property as asserted by Theorem 3.5 and the set that contains all cycles up to size , for .
However, for any possible -set for which , for to be -close to , should contain all cycles of size at most . But then only for .
The ‘only if’ direction of Theorem 3.3 is restated as Theorem 4.9. In Theorem 4.20 we generalize Theorem 4.9 by replacing the forbidden set of digraphs with a finite set of forbidden configurations (see Definitions 2.12 and 2.13). In turn, this stronger (and more immediate) theorem is true for any strongly-testable digraph property (rather than just for monotone ones). Thus, Theorem 4.20 gives a necessary condition for any graph property to be -sided error strongly-testable. For all we know, this could also be a sufficient condition. This will be further discussed in Section 8.
One may ask whether the extra restriction that (or in case of hereditary property) is -close to rather than just being is a necessity or rather just an artifact of our proof. The answer is that this is needed. Indeed, as mentioned in the introduction, acyclicity is not strongly-testable in the -model for large enough , even by -sided error testers . However, it is easy to see that directed acyclicity is -sided error strongly-testable in the -model. Acyclicity, while monotone, cannot be defined by an -set of forbidden subgraphs in the -model for any fixed . Rather, it is -close (in the -model) to being -free as induced graphs for the -set that contains all cycles of size at most .
4 Proofs of the main results
4.1 Monotone properties and hereditary properties that are strongly-testable
Theorem 3.3 states that if is -close to for an -set of rooted digraphs then is -sided error strongly-testable. We start by proving that the monotone property itself is strongly-testable for a fixed -set .
Let be a -set of digraphs and the monotone property that contains the digraphs that are -free. Remark 2.9 implies that we may assume in what follows that does not contain two graphs such that one is a subgraph of the other. We also note that if contains a graph that is an isolated vertex (or a set of isolated vertices) then becomes trivial (empty for large enough ). We assume in what follows that the above does not happen.
We start with the following preliminary proposition for the subcase of Theorem 3.3, where .
Let be a -set of rooted digraphs and . Then the monotone property has a -sided error -test in the -model, making neighbourhood queries.
Proof. The top level idea is simple, and a similar idea was used in : Suppose that a digraph is -far from being -free. We will show that there is a large set of vertices, each being a root in an -appearance in for some . Hence sampling of a random vertex and scanning the -disc around it will find a forbidden -appearance in . Some extra care should be taken for disconnected forbidden subgraphs.
Formally, we prove that the following test is a test for .
: Repeat for times independently: Choose a vertex uniformly at random and make an -disc query around . If some is found as a subgraph in the discovered subgraph of then reject. Otherwise accept.
Obviously the test accepts with probability every graph that is -free. Further, the claimed complexity is clear.
Assume that is a digraph on vertices that is -far from . We claim that contains at least edge disjoint subgraphs, each isomorphic to some . To see this, let be any maximal edge disjoint collection of subgraphs of , each isomorphic to some . After deleting all outgoing edges that are adjacent to vertices in (at most ), none of the subgraphs in is a forbidden subgraph anymore. Further, no new forbidden subgraph is created (by the assumption that no graph in is a subgraph of another graph in ). Therefore, becomes -free after deleting these edges. We conclude that .
Fix such a collection of subgraphs . We deduce that there is some fixed graph that is isomorphic to at least of the digraphs in . Fix such edge disjoint subgraphs in , which we refer to as .
Assume first that is composed of one single rooted component. Since the subgraphs in are edge disjoint, a root vertex can appear in at most such distinct subgraphs (on account that it must have at least one forward edge in each such appearance). We conclude that there are at least distinct vertices, each being a root in an -appearance in . Hence, with probability a random vertex will be one of these roots. Assuming that such a vertex is chosen by , then making the -disc query to will discover the corresponding -appearance. Thus the failure probability is bounded by .
Finally, assume that is composed of several rooted components. Since , is composed of at most components . In this case, finding vertices , with the th being the root of a subgraph isomorphic to will discover an isomorphic copy of in . The probability of sampling a root of a component of type is at least . The union-bound implies that the probability that there exists some type that we don’t sample a root of is at most . This concludes the proof.
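The sampling test analysed in this proof can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' implementation: the graph is given as a list of out-neighbour lists, the hypothetical helper `disc_query` follows out-edges for `r` levels of queried vertices, the subgraphs discovered over all rounds are pooled (so that components of a disconnected forbidden graph found in different rounds can be combined), and the forbidden graphs are detected by a brute-force search that is only practical for very small forbidden graphs. The names `disc_query`, `contains_subgraph`, `monotone_tester`, and the parameters `r` and `rounds` are all illustrative.

```python
import random
from itertools import permutations


def disc_query(out_nbrs, v, r):
    """Query v and everything reachable in fewer than r steps along out-edges.

    Returns the set of discovered vertices and the set of discovered edges.
    """
    seen = {v}
    frontier = [v]
    edges = set()
    for _ in range(r):
        next_frontier = []
        for u in frontier:
            for w in out_nbrs[u]:
                edges.add((u, w))
                if w not in seen:
                    seen.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return seen, edges


def contains_subgraph(vertices, edges, k, h_edges):
    """Brute force: is there an injective map of {0..k-1} preserving h_edges?"""
    for image in permutations(sorted(vertices), k):
        if all((image[a], image[b]) in edges for a, b in h_edges):
            return True
    return False


def monotone_tester(out_nbrs, forbidden, r, rounds, rng):
    """Return True (accept) or False (reject).

    forbidden is a list of pairs (k, h_edges) describing each forbidden
    digraph on vertex set {0..k-1}. The test rejects only when a forbidden
    subgraph is actually found, so it never rejects a graph in the property.
    """
    n = len(out_nbrs)
    seen, edges = set(), set()
    for _ in range(rounds):
        v = rng.randrange(n)  # uniformly random start vertex
        s, e = disc_query(out_nbrs, v, r)
        seen |= s
        edges |= e
    for k, h_edges in forbidden:
        if contains_subgraph(seen, edges, k, h_edges):
            return False
    return True
```

For example, on a union of vertex disjoint directed triangles every sampled vertex is a root of a forbidden di-triangle, so the sketch rejects, whereas a directed path is always accepted.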
It is assumed implicitly in Proposition 4.1 that is a collection of digraphs in the -model. Therefore, the fact that is an -set implies that is bounded in terms of (exponentially). Although not of prime interest for this paper, we still give the above tighter dependence on because could be much smaller than the worst case bound.
For hereditary properties we state a proposition analogous to Proposition 4.1. In this case being -free as subgraphs is replaced by being free as induced subgraphs. However, unlike the easier case of monotone properties, we can’t assume that if is -far from the property, then it contains many vertices that are roots of -appearances. The reason is that deleting edges in an -appearance in may create a new -appearance. (It could be true that for every , if is far from being -free as induced subgraphs, then there are many -appearances in , but we have neither a proof nor a counterexample for this.) We use a different argument.
Let be a non-redundant -set of rooted digraphs. Then the hereditary property of being -free as induced subgraphs is -sided error strongly-testable in the -model.
The following lemma is folklore. We state it for completeness.
Lemma 4.3 (sampling a random edge)
Let be a graph in the -model with . Then, with probability at least , the following randomized algorithm outputs an edge that is distributed uniformly in , and outputs a special failure indication otherwise. The algorithm samples a vertex uniformly at random, queries this vertex to obtain , and outputs each edge going out of with probability . In other words, letting , the algorithm stops indicating failure with probability , and otherwise it samples uniformly at random and outputs .
Proof. Since there are at least vertices each with outdegree at least . Let this set be . The algorithm outputs an edge whenever it chooses and does not then indicate failure after choosing . This occurs with probability at least .
The algorithm outputs a fixed edge with probability . Since this is identical for all edges, the algorithm induces the uniform distribution on .
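The edge sampler of Lemma 4.3 can be sketched as below. The probabilities in the lemma were lost in this copy of the text, so the sketch assumes the standard folklore choice: with `d` the out-degree bound of the model, each out-edge of the sampled vertex is output with probability `1/d`, which makes every fixed edge appear with probability `1/(dn)`. The function name `sample_edge`, the parameter `d`, and the use of `None` as the failure indication are illustrative assumptions.

```python
import random


def sample_edge(out_nbrs, d, rng):
    """Return a uniformly distributed edge (v, u), or None on failure.

    Assumes every vertex has at most d out-neighbours. A vertex v is picked
    uniformly, then one of d 'slots' is picked uniformly; the call fails when
    the slot is empty, i.e. with probability 1 - outdeg(v)/d given v.
    """
    n = len(out_nbrs)
    v = rng.randrange(n)
    nbrs = out_nbrs[v]
    slot = rng.randrange(d)
    if slot >= len(nbrs):
        return None  # special failure indication
    return (v, nbrs[slot])
```

Under this assumption, conditioning on success gives the uniform distribution over edges, since every edge has the same probability `1/(dn)` of being output, and the success probability is the number of edges divided by `dn`.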
Proof. [of Proposition 4.2] For this proof, we abbreviate “-appearance” and “-appearance” for -appearance as induced subgraph, and -appearance as induced subgraphs, respectively.
We may assume that does not include an isolated vertex as a member, as otherwise, being -free is an empty property. Further, we may assume that for no contains an isolated vertex, as otherwise we replace such with that is obtained from by removing the isolated vertices. Obviously, for large enough, contains as an induced subgraph if and only if contains as an induced subgraph.
The test samples some vertices and scans the -disc around each. It rejects only if it finds a -appearance in the subgraph of that it discovers. The vertex set that is sampled is a set of endpoints of random edges. This is done by calling the algorithm of Lemma 4.3 for times. Note that the lemma guarantees a success probability of per edge query only for graphs with edges. In general, these calls could result in some random edges or none at all. If fewer than edges are produced by the calls to the algorithm in Lemma 4.3, the test stops and accepts. Thus the overall query complexity is neighbourhood queries in addition to -disc queries.
It is clear that for that is -free the test accepts with probability .
Let be a digraph on vertices that is -far from being -free as induced subgraphs. Since must be -far from the empty graph, it follows that . This implies that with probability at least the calls to the algorithm in Lemma 4.3 will indeed produce at least random edges. In what follows we condition the analysis on the assumption that indeed random edges are produced.
For simplicity we first analyze the test for the case that each has only one rooted component (i.e., this does not cover, e.g., the property of being free of a disjoint pair of a di-triangle and a -cycle). The argument for the general case will be somewhat harder.
Let be a maximal set of subgraphs of , each being an -appearance, and in which the forward-edges of the roots are disjoint. For each subgraph in fix one root vertex. Let this set of vertices be .
Assume first that . Then for an edge , sampled uniformly at random from , is a root of an -appearance with probability at least . Hence, choosing random edges will find a vertex that is a root of an -appearance with probability of at least .
Suppose now that . Then (as we fixed one root vertex per member in ). Let .
Assume first that . Let be the set of all edges adjacent to (both incoming and outgoing edges). Then . Therefore deleting all edges in results in a subgraph in which the vertices in become isolated and all old -appearances in will be destroyed. We claim that the resulting graph becomes -free. Indeed, if is isomorphic to some , either is also so, or it is created by the absence of some old edges that are deleted. In the first case, must share an edge with an appearance in , and where is a root in both appearances. This cannot happen as the edge is deleted. For the second possibility, as we delete all edges (forward and backward edges) adjacent to roots, deleting an edge makes isolated in and hence, by the discussion in the first paragraph of the proof, cannot be part of an -appearance.
The fact that becomes -free is in contradiction with the assumption that is -far from being such, as we have deleted less than edges. Hence . But then sampling a random edge will result in for which with success probability at least . Thus, choosing random edges implies that we pick a root of an -appearance with probability at least .
We conclude that in all cases (of sizes of ) we find a vertex that is a root vertex of an -appearance with probability at least . If this happens, then scanning the -disc around the endpoints of the sampled edges will discover the -appearance. This concludes the proof for this simple case (in which each has a single rooted component).
The general case: For the general case, the same argument does not work directly. To see the difficulty, assume that a forbidden graph consists of two components: a di-triangle and a disjoint -cycle. Assume also that is -far from being -free and that there is a small number of -appearances in . Then, similarly to the second case above, we conclude that is large, where is the set of roots of the -appearances. This would mean that we can find a root vertex in an -appearance by making only a small number of queries. But what if most of these edges go into vertices in di-triangles, and only very few into vertices in -cycles? In order to discover a forbidden subgraph we also need to discover a -cycle. In the general case we therefore need to combine the several cases of different sizes of more carefully. This we do as follows:
Let be a digraph on vertices that is -far from being -free as induced subgraphs (where we no longer assume that each forbidden graph in has only one component).
For let be composed of disjoint components . Let be a maximal set of subgraphs of , each being an -appearance for some , and in which the forward-edges of the roots are disjoint.
We can write where contains the corresponding appearances of in . Let be the set of the corresponding roots, one per each appearance in , and . Note that ranges over and ranges over all possible components types of which is a number .
For each let .
case (a): Assume that for some , for every , .
In this case, for every , for a random edge , is going to be a root of an appearance (namely in ) with probability at least . In addition, for every , a random edge picked uniformly from will have with probability at least (as could be a root of at most distinct members in ).
Hence sampling random edges implies that a root in an appearance of for every will be found with probability at least . Calling the sampling algorithm of Lemma 4.3 for times results in at least random edges with probability at least . Therefore, the overall success probability in this case is at least .
case (b): If case (a) does not hold, then for every , there is for which . (It could be that for some there are more than one as above; in that case, choose an arbitrary one.)
But then, after deleting, for every , all edges incident to every root in (forward and backward edges), all -occurrences in are destroyed (as for each we have destroyed all appearances of in ). Moreover, no new appearances are created, by the same reasoning as in the simple case. Finally, we have deleted at most edges, which contradicts the assumption that is -far from being -free.
We have proved so far that monotone or hereditary properties that are defined by an -set of forbidden rooted digraphs are strongly-testable. To prove the ‘if-part’ of Theorems 3.3 and 3.5, we will also show that properties that are close to such properties are strongly-testable. This is done next. The following is a restatement of the ‘if-part’ of Theorem 3.3.
Let be a -set of rooted digraphs and for let the monotone property that contains all -vertex digraphs that are -free as subgraphs. Let be a digraph property in the -model for which, (a) , and (b) is -close to . Then, is -sided error -strongly-testable in the -model.
Proof. By Proposition 4.1, for every there is a -sided error -test for . Let and be a corresponding -sided error -test for . We run on , accept if accepts and reject otherwise. If then since the test will accept w.p. . On the other hand, if is -far from , then it must be -far from as is -close to . Hence, is rejected with probability at least .
Let be a non-redundant -set of rooted digraphs and for let the hereditary property that contains all -vertex digraphs that are -free as induced subgraphs. Let be a digraph property in the -model for which (a) and (b) is -close to . Then is -sided error -strongly-testable in the -model.
Theorem 4.4 is stated in terms of a fixed family of forbidden digraphs . However, since the conditions (a) and (b) in the theorem are in terms of the slices , namely for -vertex graphs, the family may depend on . The only global requirement of is that it is an -set, where is a function of only.
To make this clearer consider e.g., the property in the -model that contains every -vertex graph if is even, and contains the digraphs that do not have a directed -cycle otherwise. is monotone but it is not defined by a single set of forbidden subgraphs. Rather, for every is a slice of a property that is defined in this way. Hence, is -sided error strongly-testable.
Note that the digraph property that is asserted to be strongly-testable in Theorem 4.4 is not necessarily monotone. It is only required that it is close to a monotone property. In this sense, Theorem 4.4 is slightly stronger than the ‘if-part’ of Theorem 3.3. An analogous remark also holds for the property in Theorem 4.5.
Theorem 3.3 requires that the corresponding family contains members that are rooted. We first show why this restriction is needed. We say that is minimal if there is no for which is a subgraph of .
Let be a set of forbidden digraphs and be the corresponding monotone property of -vertex graphs. If for some minimal , is not rooted, then any 1-sided error -test for makes queries in the -model.
Proof. Assume that is minimal and not rooted. Set . An -test for that is -sided error must discover some on any run that rejects. Hence it is enough to prove that any test that discovers a -appearance and makes queries must have a success probability that is less than on some -vertex graphs that are -far from .
We use Yao’s principle to prove the lower bound. Namely, we construct a probability distribution that is supported on -vertex digraphs in that are -far from . We then show that any deterministic algorithm making queries fails to find a copy of for more than of the inputs weighted according to .
Let be an unlabelled directed graph on vertices that is a union of vertex disjoint copies of (if does not divide , we augment with at most isolated vertices to get an -vertex graph). The distribution is formed by labelling according to a random permutation uniformly chosen from the set of all permutations on elements. Obviously is supported on -far graphs. Moreover, the only forbidden subgraphs in each graph supported by are disjoint copies of . Hence, any deterministic -sided error test with respect to ends correctly only when it finds a copy of .
Let be any deterministic algorithm making queries, adaptively. Every query made by is of the form , where is either one of the vertices that occurred as answers for some prior queries, or is a new vertex that was not yet seen. We will augment the algorithm so that on query , the algorithm receives the entire subgraph containing all vertices reachable from in the copy of where lies. Note that this gives more information to the algorithm in the form of possibly additional vertices but with at least one vertex in the -appearance of that is excluded by the assumption that is not rooted. Hence, if the augmented algorithm does not discover a copy of , neither does . Note further that the additional information makes the queries of the first type, namely queries to vertices that are the answers to prior queries, redundant.
Hence the augmented algorithm will end correctly after making queries only if, for some distinct , the vertices and belong to the same component of but none is reachable from the other. This probability is clearly bounded by , for our choice of and large enough.
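The hard input distribution used in this lower bound can be sketched as follows. The sketch is illustrative only: it builds vertex disjoint copies of a given forbidden digraph, pads with isolated vertices when the copy size does not divide the number of vertices, and relabels everything by a uniformly random permutation. The function name `hard_instance` and the example forbidden digraph (two sources pointing to a common sink, which has no vertex from which all others are reachable) are assumptions chosen for the illustration.

```python
import random


def hard_instance(n, h_size, h_edges, rng):
    """Return out-neighbour lists of a randomly labelled union of copies.

    h_edges describes the forbidden digraph on vertex set {0..h_size-1}.
    The graph consists of n // h_size disjoint copies of it, plus
    n % h_size isolated vertices, with labels permuted uniformly at random.
    """
    copies = n // h_size
    perm = list(range(n))
    rng.shuffle(perm)  # uniformly random labelling of the n vertices
    out_nbrs = [[] for _ in range(n)]
    for c in range(copies):
        base = c * h_size
        for a, b in h_edges:
            out_nbrs[perm[base + a]].append(perm[base + b])
    return out_nbrs


# Example forbidden digraph that is not rooted: 0 -> 2 <- 1.
two_sources_one_sink = [(0, 2), (1, 2)]
graph = hard_instance(10, 3, two_sources_one_sink, random.Random(7))
```

In such an instance, a query to a source vertex reveals only that source and the sink of its copy, so an algorithm has to hit both sources of the same copy to see a full forbidden subgraph, which is the event bounded at the end of the proof.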
4.2.1 The ‘only-if’ part of Theorem 3.3
Proving the ‘only-if’ part of Theorem 3.3 naturally brings us back to configurations in digraphs as this is what a tester discovers in its run. This motivates the following definition analogous to Definition 2.7.
For a set of configurations , the property contains all graphs that are -free for every .
We comment that for an unrestricted set of forbidden configurations , may happen to be hereditary, monotone, or neither (in the -model, -model and the undirected bounded-degree graph model). E.g., the property of not having a vertex of out-degree exactly in the -model is a property that is defined by one forbidden configuration that is the directed -star, where the center is the only developed vertex. However, the property is neither monotone nor hereditary (and happens to be strongly-testable).
The following is a restatement of the ‘only if’ part of Theorem 3.3 followed by its proof. Note that configurations do not appear in the statement, but will appear in the proof.
Assume that the monotone property is -sided error strongly-testable in the -model. Then for any there is a such that for any there is a -set of rooted digraphs such that the corresponding property that contains the -vertex digraphs that are -free, satisfies the following two conditions:
(b) is -close to .
Proof. Since is strongly-testable, Theorem 2.18 implies that for any there is a -canonical -sided error -test, for