In this work we consider the complexity of solving “ordering constraint satisfaction problems (OCSP)” in the “streaming setting”. We introduce these notions below before describing our results.
1.1 Orderings and Constraint Satisfaction Problems
In this work we consider optimization problems where the solution space is all possible orderings of variables. The Travelling Salesperson Problem and most forms of scheduling fit this framework, though our work considers a more restricted class of problems, namely ordering constraint satisfaction problems (OCSPs). OCSPs as a class were first defined by Guruswami, Håstad, Manokaran, Raghavendra, and Charikar [ghmrc-classical-ugc-hardness]. To describe them here, we first set up some notation and terminology.
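We let $[n]$ denote the set $\{0,\ldots,n-1\}$ and $S_n$ denote the set of permutations on $[n]$, i.e., the set of bijections $\sigma : [n] \to [n]$. We sometimes use $[\sigma(0)\ \sigma(1)\ \cdots\ \sigma(n-1)]$ to denote $\sigma$. The solution space of ordering problems is $S_n$, i.e., an assignment to $n$ variables is given by $\sigma \in S_n$. Given $k$ distinct integers $a_0,\ldots,a_{k-1}$ we define $\mathrm{ord}(a_0,\ldots,a_{k-1})$ to be the unique permutation in $S_k$ which sorts $a_0,\ldots,a_{k-1}$. In other words, $\mathrm{ord}(a_0,\ldots,a_{k-1})$ is the unique permutation $\pi \in S_k$ such that $a_{\pi(0)} < \cdots < a_{\pi(k-1)}$. A $k$-ary ordering constraint function is given by a predicate $\Pi : S_k \to \{0,1\}$. An ordering constraint application on $n$ variables is given by a constraint function $\Pi$ and a $k$-tuple $\mathbf{j} = (j_0,\ldots,j_{k-1}) \in [n]^k$ where the $j_i$’s are distinct. In the interest of brevity we will often skip the term “ordering” below and further refer to constraint functions as “functions” and constraint applications as “constraints”. A constraint $(\Pi,\mathbf{j})$ is satisfied by an assignment $\sigma \in S_n$ if $\Pi(\mathrm{ord}(\sigma|_{\mathbf{j}})) = 1$, where $\sigma|_{\mathbf{j}}$ is the $k$-tuple $(\sigma(j_0),\ldots,\sigma(j_{k-1}))$.
A maximum ordering constraint satisfaction problem, Max-OCSP($\Pi$), is specified by a single ordering constraint function $\Pi : S_k \to \{0,1\}$, for some positive integer arity $k$. An instance of Max-OCSP($\Pi$) on $n$ variables is given by $m$ constraints $C_0,\ldots,C_{m-1}$ where $C_i = (\Pi,\mathbf{j}(i))$, i.e., the application of the function $\Pi$ to the variables $\mathbf{j}(i) = (j(i)_0,\ldots,j(i)_{k-1})$. (We omit $\Pi$ from the description of a constraint $C_i$ when clear from context.) The value of an ordering $\sigma \in S_n$ on an instance $\Psi = (C_0,\ldots,C_{m-1})$, denoted $\mathrm{val}_\Psi(\sigma)$, is the fraction of constraints satisfied by $\sigma$, i.e., $\mathrm{val}_\Psi(\sigma) = \frac{1}{m}\sum_{i \in [m]} \Pi(\mathrm{ord}(\sigma|_{\mathbf{j}(i)}))$. The optimal value of $\Psi$ is defined as $\mathrm{val}_\Psi = \max_{\sigma \in S_n} \mathrm{val}_\Psi(\sigma)$.
Two simple examples of Max-OCSP problems are the maximum acyclic subgraph (MAS) problem and the Betweenness problem. MAS corresponds to the ordering constraint function $\Pi_{\mathrm{MAS}} : S_2 \to \{0,1\}$ given by $\Pi_{\mathrm{MAS}}([0\ 1]) = 1$ and $\Pi_{\mathrm{MAS}}([1\ 0]) = 0$. If we re-interpret the constraints $(j_0,j_1)$ as directed edges in a graph on $n$ vertices, the problem asks for an ordering of the vertices which maximizes the number of forward edges (which form an acyclic subgraph). The Betweenness problem corresponds to the ordering constraint function $\Pi_{\mathrm{Btwn}} : S_3 \to \{0,1\}$ given by $\Pi_{\mathrm{Btwn}}([0\ 1\ 2]) = \Pi_{\mathrm{Btwn}}([2\ 1\ 0]) = 1$, and $\Pi_{\mathrm{Btwn}}(\pi) = 0$ for all other $\pi \in S_3$. Here, a constraint $(j_0,j_1,j_2)$ reads as “$j_1$ lies between $j_0$ and $j_2$”, and the goal is again to find a permutation maximizing the number of satisfied constraints.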
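To make the definitions of this subsection concrete, the following Python sketch computes the sorting permutation, checks whether a constraint is satisfied, and finds the optimal value of tiny MAS and Betweenness instances by exhaustive search. It is only an illustration under stated assumptions: variables are 0-indexed, an ordering is a tuple `sigma` whose `j`-th entry is the position of variable `j`, and the names `ord_perm`, `value`, and `opt_value` are ours rather than the paper's. The search takes time proportional to $n!$ and is not meant as an algorithm.

```python
from itertools import permutations

def ord_perm(values):
    """Sorting permutation: the tuple pi with values[pi[0]] < values[pi[1]] < ..."""
    if len(set(values)) != len(values):
        raise ValueError("entries must be distinct")
    return tuple(sorted(range(len(values)), key=lambda i: values[i]))

def pi_mas(pi):
    # Assumed convention: edge (j0, j1) is satisfied when j0 precedes j1.
    return 1 if pi == (0, 1) else 0

def pi_btwn(pi):
    # Satisfied when the middle variable of the constraint lies between the other two.
    return 1 if pi in ((0, 1, 2), (2, 1, 0)) else 0

def is_satisfied(predicate, constraint, sigma):
    """Apply the predicate to the relative order that sigma induces on the constraint."""
    restricted = tuple(sigma[j] for j in constraint)
    return predicate(ord_perm(restricted)) == 1

def value(predicate, constraints, sigma):
    """Fraction of constraints satisfied by the ordering sigma."""
    return sum(is_satisfied(predicate, c, sigma) for c in constraints) / len(constraints)

def opt_value(predicate, constraints, n):
    """Optimal value by brute force over all n! orderings (tiny n only)."""
    return max(value(predicate, constraints, s) for s in permutations(range(n)))

cycle = [(0, 1), (1, 2), (2, 0)]   # directed 3-cycle as a MAS instance
btwn = [(0, 1, 2), (1, 2, 0)]      # two conflicting Betweenness constraints
print(opt_value(pi_mas, cycle, 3), opt_value(pi_btwn, btwn, 3))
```

On the directed 3-cycle at most two of the three edges can point forward, and the two Betweenness constraints demand different middle elements, so the optimal values are 2/3 and 1/2 respectively.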
1.2 Approximability and Streaming Algorithms
In this work we consider the “approximability” of Max-OCSP($\Pi$) in the “streaming setting”. We define these terms next, starting with the latter.
In the (single-pass) “streaming setting” an instance $\Psi = (C_0,\ldots,C_{m-1})$ of Max-OCSP($\Pi$) is presented as a stream of constraints, with the $i$th element of the stream being $\mathbf{j}(i)$ where $C_i = (\Pi,\mathbf{j}(i))$. A streaming algorithm $A$ updates its state with each element of the stream and at the end produces the output $A(\Psi) \in [0,1]$. The measure of complexity of interest to us is the space used by $A$, and in particular we distinguish between algorithms that use space polylogarithmic in the input length and space that grows polynomially ($\Omega(n^\delta)$ for $\delta > 0$) in the input length.
We say that $A$ is an $\alpha$-approximation algorithm for Max-OCSP($\Pi$) if for every instance $\Psi$,
$$\alpha \cdot \mathrm{val}_\Psi \le A(\Psi) \le \mathrm{val}_\Psi$$
with probability at least 2/3 over the internal coin tosses of $A$, where $\mathrm{val}_\Psi$ denotes the optimal value of $\Psi$. Thus our approximation factors $\alpha$ are numbers in the interval $[0,1]$. Given $\Pi : S_k \to \{0,1\}$, let $\rho(\Pi)$ denote the probability that $\Pi$ is satisfied by a random ordering. Every instance $\Psi$ of Max-OCSP($\Pi$) satisfies $\mathrm{val}_\Psi \ge \rho(\Pi)$, and thus the algorithm that always outputs $\rho(\Pi)$ is a $\rho(\Pi)$-approximation algorithm for Max-OCSP($\Pi$). We say that a problem is approximable (in the streaming setting) if we can beat this trivial algorithm by a positive factor. Specifically, Max-OCSP($\Pi$) is said to be approximable if for every $\delta > 0$ there exist $\epsilon > 0$ and a space $O(n^\delta)$ algorithm $A$ that is a $(\rho(\Pi)+\epsilon)$-approximation algorithm for Max-OCSP($\Pi$). We say Max-OCSP($\Pi$) is approximation-resistant (in the streaming setting) otherwise.
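The trivial threshold above, the probability that the constraint function is satisfied by a random ordering, can be computed directly for small arity. A uniformly random ordering of all variables induces a uniformly random relative order on any fixed set of distinct variables, so the threshold equals the fraction of permutations that the predicate accepts. The sketch below assumes 0-indexed permutations written as tuples; the function name `rho` and the two example predicates are our own encoding of MAS and Betweenness.

```python
from fractions import Fraction
from itertools import permutations

def rho(predicate, k):
    """Fraction of the k! permutations of range(k) accepted by the predicate."""
    perms = list(permutations(range(k)))
    accepted = sum(1 for pi in perms if predicate(pi) == 1)
    return Fraction(accepted, len(perms))

def pi_mas(pi):
    # Assumed encoding of MAS: accept only the identity order on two variables.
    return 1 if pi == (0, 1) else 0

def pi_btwn(pi):
    # Assumed encoding of Betweenness: accept the two orders with variable 1 in the middle.
    return 1 if pi in ((0, 1, 2), (2, 1, 0)) else 0

print(rho(pi_mas, 2), rho(pi_btwn, 3))
```

Under this encoding the thresholds are 1/2 for MAS and 1/3 for Betweenness, which are the factors that an approximation-resistance result says cannot be beaten.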
1.3 Main result and comparison to prior works
Theorem 1.1 (Main theorem).
For every and every , is approximation resistant in the (single-pass) streaming setting. In particular for every , every approximation algorithm for requires space.
Our theorem parallels a result of Guruswami, Håstad, Manokaran, Raghavendra, and Charikar [ghmrc-classical-ugc-hardness], who prove approximation resistance with respect to polynomial-time algorithms assuming the Unique Games Conjecture. In our setting of streaming algorithms, the only problem that seems to have been explored in the literature before is MAS, and even in this case a tight result was not known. Guruswami, Velingker, and Velusamy [gvv] proved that for every , MAS is not -approximable in space. A stronger hardness of approximation for MAS is indicated in the work of Guruswami and Tao [GT19], who suggest that their hardness result for unique games, an “unordered” CSP, could be converted to such a hardness result for MAS. As far as we know our result is the first tight hardness result for for any non-constant , while yielding tight hardness results for every .
We start by describing our proof technique for the special case of the MAS problem. Later we describe the general case.
Our general approach is to start with a hardness result for CSPs over alphabets of size (i.e., constraint satisfaction problems where the variables take values in ), and then to reduce these CSPs to the OCSP at hand. While this general approach is not new, the optimality of our results seems to come from the fact that we choose the source CSP carefully, and are able to get optimal hardness results for problems of our choice thanks to a general result of Chou, Golovnev, Sudan and Velusamy [CGSV21]. In contrast, previous approaches to proving hardness of MAS were unable to get optimal results even though they started from optimal hardness results for their source problem (unique games).
Recall that while . For a large constant , we define the constraint function by iff . , the problem of maximizing constraints applied to variables which take values in , aims to capture a “-coarsening” of . Specifically we think of an ordering of variables as dividing the variables into blocks with variables being in the first block, being in the second block and so on. is defined so that if the -assignment to the variables based on which block they belong to satisfies an constraint, then the underlying constraint will be satisfied by .
We can get an optimal hardness result for from the work of [CGSV21]: we can use their results to show that space algorithms cannot distinguish “YES instances” whose value is from “NO instances” whose value is . (We remark that even to get this result we need to choose some “distributions” carefully and this is not immediate from the previous work, but once these choices are made, the lower bound follows from the previous work.) However, this does not immediately imply a hardness result for the original OCSP problem : by definition of it follows that the YES instances of MAS have values at least and they are indistinguishable to small space algorithms from the NO instances, but the NO instances may now have value much higher than .
To get hardness of we can no longer use the main theorems of [CGSV21] as a black box. Instead we need to delve into their reduction and notice that the hard instances (in the NO case) not only have small values but also are “small partition expanders” in a specific sense: any partition of the constraint graph into roughly equal sized blocks has very few edges, specifically a fraction, which lie within the blocks. This additional property allows us to prove that the reduction from the coarsened problem to the ordering problem preserves values approximately (to within an additive amount).
Extending the idea to other OCSPs involves two additional steps. We define analogously to (the definition is completely determined by and ), but we still need to find the right “distributions” that allow us to apply the results of [CGSV21]. We describe this process in Section 3.1. Having done this we now need an analysis of the NO instances arising from the construction in [CGSV21]. Specifically we show that the constraint hypergraph is now a “small partition hypergraph expander”, in the sense that any partition into roughly equal sized blocks would have very few hyperedges that contain two vertices from the same block. This allows us to show that the -coarsened unordered instances have roughly the same and values (in the NO case) and this allows us to get optimal hardness results for all ordering CSPs.
We remark in passing that our notion of coarsening is somewhat similar to, but not the same as that used in previous works, notably [ghmrc-classical-ugc-hardness]. In particular the techniques used to compare the OCSP value before coarsening with the CSP value after coarsening are somewhat different: Their analysis involves more sophisticated tools such as influence of variables and Gaussian noise stability. Our analysis in contrast is a more elementary analysis of the type common with random graphs.
Organization of the rest of the paper.
In Section 2 we introduce some notation we use and background material. In Section 3 we prove our main theorem, Theorem 1.1. In this section we also introduce two distributions on instances, the YES distribution and the NO distribution, and state lemmas asserting that these distributions are concentrated on instances with high, and respectively low, OCSP value; and that these distributions are indistinguishable to single-pass small space streaming algorithms. We prove the lemmas on the OCSP values in Section 4, and prove the indistinguishability lemma in Section 5.
2 Preliminaries and definitions
2.1 Basic notation
Some of the notation we use is already introduced in Section 1.1. Here we introduce some more notation we use.
The support of an ordering constraint function is the set .
A (directed, self-loop-free, multi-) -hypergraph is given by a set of vertices and a multiset of -hyperedges (i.e., ordered -tuples of vertices), such that no vertex appears in the same -hyperedge twice. A -hyperedge is incident on a vertex if appears in . Let denote the set of vertices to which a -hyperedge is incident, and let denote the number of -hyperedges in .
A -hypergraph is a -hypermatching if it has the property that no pair of (distinct) -hyperedges is incident on the same vertex. For , an -partial -hypermatching is a -hypermatching which contains -hyperedges.
A -partition of is a map . Importantly, -partitions are ordered objects; that is, composing a -partition with a nontrivial permutation on leads to a new -partition which we treat as distinct. Given a -partition of and , we define the -th block as the set .
Given an instance of on variables, we define the constraint hypergraph to be the -hypergraph on , where each -hyperedge corresponds to a constraint (given by the exact same -tuple). We also let denote the number of constraints in (equiv., the number of -hyperedges in ).
2.2 Concentration bound
We also require Azuma’s inequality, a concentration inequality for submartingales. For us the following form, for Boolean-valued random variables with bounded conditional expectations, taken from Kapralov and Krachun [KK19], is particularly convenient.
Lemma 2.1 ([KK19, Lemma 2.5]).
Let be (not necessarily independent) -valued random variables, such that for some , for every . Then if ,
3 The streaming space lower bound
In this section we prove our main theorem, modulo some lemmas that we prove in later sections. We restate the theorem below for convenience.
Our lower bound is proved, as is usual for such statements, by showing that no small space algorithm can “distinguish” YES instances with OCSP value at least , from NO instances with OCSP value at most . Such a statement is in turn proved by exhibiting two families of distributions, the YES distributions and the NO distributions, and showing these are indistinguishable. Specifically we choose some parameters and a permutation carefully and define two distributions and . We claim that for our choice of parameters is supported on instances with value at least ; this is asserted in Lemma 3.6. Similarly we claim that is mostly supported (with probability ) on instances with value at most (see Lemma 3.7). Finally we assert in Lemma 3.8 that any algorithm that distinguishes from with “advantage” at least (i.e., accepts with probability more than ) requires space.
3.1 Distribution of hard instances
The work of [CGSV21] reduces the task of building hard instances of -ary CSPs over alphabets of size in the streaming setting to the task of defining two distributions supported on satisfying certain properties. Following the same approach, to define and , we first define a pair of distributions on , where is the arity of , which are denoted and . Later, in Definition 3.5, we use these distributions to define and .
For , define the -tuple of “contiguous” values . For a -tuple and a permutation , define the permuted -tuple as . We define in this way because:
If is a -tuple of distinct integers, then (where denotes composition of permutations).
Let , so that is the unique permutation such that . Let , so that is the unique permutation such that . Then . Hence , as desired. ∎
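The composition identity proved above can be checked exhaustively for small arity. The sketch below fixes one concrete reading of the notation, which is our assumption since the symbols are not recoverable from the text: position $i$ of the permuted tuple $\mathbf{a}_\pi$ holds $a_{\pi^{-1}(i)}$, and $\mathrm{ord}(\mathbf{a})$ is the permutation $\tau$ with $a_{\tau(0)} < \cdots < a_{\tau(k-1)}$. Under that reading the identity takes the form $\mathrm{ord}(\mathbf{a}_\pi) = \pi \circ \mathrm{ord}(\mathbf{a})$; with the opposite convention for $\mathbf{a}_\pi$ the same argument yields $\pi^{-1}$ in place of $\pi$. All function names are illustrative.

```python
from itertools import permutations

def ord_perm(values):
    """Sorting permutation tau with values[tau[0]] < values[tau[1]] < ..."""
    return tuple(sorted(range(len(values)), key=lambda i: values[i]))

def inverse(pi):
    inv = [0] * len(pi)
    for i, p in enumerate(pi):
        inv[p] = i
    return tuple(inv)

def permute_tuple(a, pi):
    """Assumed convention: position i of the result holds a[pi^{-1}(i)]."""
    inv = inverse(pi)
    return tuple(a[inv[i]] for i in range(len(a)))

def compose(p, q):
    """Composition (p o q)(r) = p(q(r))."""
    return tuple(p[q[r]] for r in range(len(q)))

def identity_holds_for_all(k):
    """Check ord(a_pi) == pi o ord(a) over all orderings of k distinct integers."""
    base = [7 * x + 3 for x in range(k)]  # arbitrary distinct integers
    for a in permutations(base):
        for pi in permutations(range(k)):
            if ord_perm(permute_tuple(a, pi)) != compose(pi, ord_perm(a)):
                return False
    return True

print(identity_holds_for_all(3), identity_holds_for_all(4))
```

In particular, when the entries of the tuple are already increasing, its sorting permutation is the identity and the permuted tuple sorts to exactly $\pi$, which is the situation for a tuple of contiguous values that does not wrap around.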
Now the distributions supported on are defined as follows:
Definition 3.2 ( and ).
Let be a Max-OCSP of arity . For and , is the uniform distribution over the set
is the uniform distribution over the set. For , is the uniform distribution over all -tuples in .
For a distribution supported on and index we define its th marginal to be the distribution supported on sampled by picking and outputting . We say that a distribution has uniform marginals if is the uniform distribution on for every .
The following proposition follows immediately from the definition of the and .
For every , , and , the distributions and have uniform marginals.
Definition 3.4 (Uniform distribution over partial hypermatchings).
Let denote the uniform distribution over all -partial -hypermatchings on .
We now formally define our YES and NO distributions for . See Figure 1 below for a visual interpretation in the case of MAS.
Definition 3.5 ( and ).
Let , , and let or for some . We define the distribution , over -variable instances, as follows:
1. Sample a uniformly random -partition .
2. Sample hypermatchings independently .
3. For each , do the following. Let be an empty -hypergraph on . For each -hyperedge , sample a tuple , and add the -hyperedge to if and only if .
4. Return the instance on variables given by the constraint hypergraph .
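The sampling procedure above can be sketched in Python as follows. This is a minimal illustration, not the authors' construction verbatim: the parameter names (`n` variables, arity `k`, `q` blocks, `m` hyperedges per hypermatching, `T` hypermatchings) and the callable `sample_tuple`, which stands in for the tuple distribution, are our assumptions. The sketch also returns the hidden partition so that it can be inspected.

```python
import random

def sample_partial_hypermatching(n, k, m, rng):
    """Uniformly sample m vertex-disjoint ordered k-tuples on range(n)."""
    if k * m > n:
        raise ValueError("need k * m <= n")
    vertices = rng.sample(range(n), k * m)  # distinct vertices in random order
    return [tuple(vertices[k * e: k * (e + 1)]) for e in range(m)]

def sample_instance(n, k, q, m, T, sample_tuple, rng):
    """Return (partition, constraints) following the four steps of the definition."""
    # Step 1: a uniformly random map from variables to blocks.
    b = [rng.randrange(q) for _ in range(n)]
    constraints = []
    for _ in range(T):
        # Step 2: an independent partial hypermatching.
        matching = sample_partial_hypermatching(n, k, m, rng)
        # Step 3: keep a hyperedge only if its block pattern equals a fresh sample.
        for e in matching:
            target = tuple(sample_tuple(rng))
            if tuple(b[j] for j in e) == target:
                constraints.append(e)
    # Step 4: the retained hyperedges, in stream order, form the instance.
    return b, constraints

def uniform_tuples(q, k):
    """Hypothetical stand-in for a tuple distribution: uniform over range(q)^k."""
    return lambda rng: tuple(rng.randrange(q) for _ in range(k))

rng = random.Random(0)
partition, instance = sample_instance(60, 2, 3, 20, 5, uniform_tuples(3, 2), rng)
print(len(instance))
```

Passing a different `sample_tuple` changes only step 3, which mirrors how the two distributions in the definition differ only in the tuple distribution they use.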
We say that an algorithm achieves advantage in distinguishing from if there exists an such that for all , we have
In the following section we state lemmas which highlight the main properties of the distributions above.
3.2 Statement of key lemmas
Our first lemma shows that is supported on instances of high value.
Lemma 3.6 ( has high values).
For every ordering constraint satisfaction function , every , and , we have (i.e., this occurs with probability 1).
Lemma 3.7 ( has low values).
For every -ary ordering constraint function , and every , there exists and such that for all and , there exists such that for all , for sufficiently large , we have
We prove Lemma 3.7 in Section 4.3. We note that this lemma is more technically involved than Lemma 3.6, and its proof is the one that needs the notion of “small partition expanders”. Finally the following lemma asserts the indistinguishability of and to small space streaming algorithms. We remark that this lemma follows directly from the work of [CGSV21].
For every there exists such that for every , the following holds: For every and , every streaming algorithm distinguishing from with advantage for all lengths uses space .
3.3 Proof of Theorem 1.1
We now prove Theorem 1.1.
Proof of Theorem 1.1.
Let be a approximation algorithm for that uses space . Fix . Consider the algorithm defined as follows: on input , an instance of , if , then outputs ; otherwise, it outputs . Observe that uses space. Set such that the condition of Lemma 3.7 holds. Set such that the conditions of Lemma 3.7 hold. Consider any and : let be set as in Lemma 3.7. Consider any : since , it follows from Lemma 3.6 that for , we have , and hence with probability at least , . Therefore, . Similarly, by the choice of , it follows from Lemma 3.7 that
and hence, . Therefore, distinguishes from with advantage . By applying Lemma 3.8, we conclude that the space complexity of is at least . ∎
4 Bounds on values of and
4.1 CSPs and coarsening
In preparation for proving the lemmas, we recall the definition of (non-ordering) constraint satisfaction problems (CSPs), whose solution spaces are (as opposed to ), and define an operation called -coarsening on Max-OCSPs, which restricts the solution space from to .
A maximum constraint satisfaction problem, , is specified by a single constraint function , for some positive integer . An instance of on variables is given by constraints where , i.e., the application of the function to the variables . The value of an assignment on an instance , denoted , is the fraction of constraints satisfied by , i.e., , where for . The optimal value of is defined as .
Definition 4.1 (-coarsening).
Let be a -ary Max-OCSP and let . The -coarsening of is the -ary Max-CSP problem where we define as follows: For , iff the entries in are all distinct and . The -coarsening of an instance of is the instance of given by the identical collection of constraints.
The following lemma captures the idea that coarsening restricts the space of possible solutions; compare to 4.8 below.
If , is an instance of , and is the -coarsening of , then .
We will show that for every assignment to , we can construct an assignment to such that . Specifically, given an assignment to , for , let be the sequence of indices with assigned value , enumerated in some arbitrary order. Next, let be the ordering on given by placing in order. Consider any constraint in which is satisfied by . Since , . By construction, since are distinct, . Hence is also satisfied by in , and so . ∎
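The construction in this proof is easy to experiment with. The sketch below implements the coarsened predicate (satisfied iff the block values are all distinct and their sorting permutation satisfies the original predicate) and the lifting of a block assignment to an ordering by listing the blocks in increasing order, then checks on random small MAS instances that lifting never decreases the value. It assumes 0-indexed variables and blocks; the names `coarsen`, `lift_to_ordering`, `csp_value`, and `ocsp_value` are ours.

```python
import random

def ord_perm(values):
    """Sorting permutation tau with values[tau[0]] < values[tau[1]] < ..."""
    return tuple(sorted(range(len(values)), key=lambda i: values[i]))

def pi_mas(pi):
    # Assumed encoding of MAS: edge (j0, j1) is satisfied when j0 precedes j1.
    return 1 if pi == (0, 1) else 0

def coarsen(predicate):
    """Coarsened predicate on block values: distinct entries and original predicate holds."""
    def coarse(block_values):
        if len(set(block_values)) != len(block_values):
            return 0
        return predicate(ord_perm(block_values))
    return coarse

def csp_value(coarse, constraints, b):
    """Fraction of constraints satisfied by the block assignment b."""
    return sum(coarse(tuple(b[j] for j in c)) for c in constraints) / len(constraints)

def ocsp_value(predicate, constraints, sigma):
    """Fraction of constraints satisfied by the ordering sigma (variable -> position)."""
    total = sum(predicate(ord_perm(tuple(sigma[j] for j in c))) for c in constraints)
    return total / len(constraints)

def lift_to_ordering(b, q):
    """Place block 0 first, then block 1, and so on; order within a block is arbitrary."""
    order = [j for block in range(q) for j in range(len(b)) if b[j] == block]
    sigma = [0] * len(b)
    for position, j in enumerate(order):
        sigma[j] = position
    return tuple(sigma)

rng = random.Random(1)
n, q = 8, 3
constraints = [tuple(rng.sample(range(n), 2)) for _ in range(20)]
b = [rng.randrange(q) for _ in range(n)]
print(csp_value(coarsen(pi_mas), constraints, b),
      ocsp_value(pi_mas, constraints, lift_to_ordering(b, q)))
```

The inequality can be strict: a constraint whose variables share a block is never satisfied after coarsening, but the lifted ordering may still happen to satisfy it.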
4.2 has high values
In this section, we prove Lemma 3.6, which states that the values of instances drawn from are large. For convenience, we restate it here:
Note that we prove a bound for every instance in the support of , although it would suffice for our application to prove that such a bound holds with high probability over the choice of .
To prove Lemma 3.6, if is the -coarsening of , by Lemma 4.2, it suffices to show that . One natural approach is to consider the -partition sampled when sampling , and define the assignment to by . Consider any constraint in ; by the definition of (Definition 3.5), we have for some (unique) , which we term the identifier of (recall, we defined as the -tuple ). Now . Hence, is satisfied by iff . By 3.1 above, . Hence a sufficient condition for to satisfy (which is in fact necessary in the case ) is that (since then ); this happens iff ’s identifier . Unfortunately, when sampling the constraints , we might get “unlucky” and get a sample which over-represents the constraints with identifier . We can resolve this issue using “shifted” versions of . (Alternatively, in expectation, . Hence with probability at least , by Markov’s inequality; this suffices for a “with-high-probability” statement.) The proof is as follows:
Proof of 3.6.
For , define the assignment to as for .
Fix . Then we claim that satisfies any constraint with identifier such that . Indeed, if is a constraint with identifier , since , then we have ; as long as , then , and so and .
Now (no longer fixing ), for each , let be the fraction of constraints in with identifier . By the above claim, for each , we have . On the other hand, (since every constraint has some (unique) identifier). Hence
since each term appears exactly times in the expanded sum. Hence by averaging, there exists some such that , and so , as desired. ∎
4.3 has low values
In this section, we prove Lemma 3.7, which states that the value of an instance drawn from does not significantly exceed the random ordering threshold , with high probability. Restated:
Using concentration bounds (i.e., Lemma 2.1), one could show that a fixed solution satisfies more than constraints with probability which is exponentially small in . However, taking a union bound over all permutations would cause an unacceptable blowup in the probability. Instead, to prove Lemma 3.7, we take an indirect approach: we bound the Max-CSP value of the -coarsening of a random instance, and we bound the gap between the Max-OCSP value and the -coarsened Max-CSP value. To do this, we define the following notions of small set expansion for -hypergraphs:
Definition 4.3 (Lying on a set).
Let be a -hypergraph. Given a set , a -hyperedge lies on if it is incident on two (distinct) vertices in (i.e., if ).
Definition 4.4 (Congregating on a partition).
Let be a -hypergraph. Given a -partition , a -hyperedge congregates on if it lies on one of the blocks .
We denote by the number of -hyperedges of which lie on .
Definition 4.5 (Small set hypergraph expansion (SSHE) property).
A -hypergraph is a -small set hypergraph expander (SSHE) if it has the following property: For every subset of size at most , (i.e., the number of -hyperedges in which lie on is at most ).
Definition 4.6 (Small partition hypergraph expansion (SPHE) property).
A -hypergraph is a -small partition hypergraph expander (SPHE) if it has the following property: For every partition where each block has size at most , the number of -hyperedges in which congregate on is at most .
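The combinatorial notions in Definitions 4.3 and 4.4 are simple to compute. The sketch below, with illustrative names of our own, tests whether a hyperedge lies on a set, whether it congregates on a partition, and counts the congregating hyperedges, which is the quantity that the SPHE property bounds. It assumes hyperedges are tuples of distinct 0-indexed vertices and that a partition is a list mapping each vertex to its block; the thresholds in the SSHE and SPHE definitions are not encoded here.

```python
def lies_on(hyperedge, subset):
    """True if the hyperedge is incident on at least two distinct vertices of subset."""
    return sum(1 for v in hyperedge if v in subset) >= 2

def congregates(hyperedge, b):
    """True if two vertices of the hyperedge fall in the same block of partition b."""
    blocks = [b[v] for v in hyperedge]
    return len(set(blocks)) < len(blocks)

def count_lying_on(hyperedges, subset):
    """Number of hyperedges that lie on the given vertex set."""
    return sum(1 for e in hyperedges if lies_on(e, subset))

def count_congregating(hyperedges, b):
    """Number of hyperedges that congregate on partition b."""
    return sum(1 for e in hyperedges if congregates(e, b))

def block(b, i):
    """The i-th block of partition b as a set of vertices."""
    return {v for v, blk in enumerate(b) if blk == i}

edges = [(0, 1, 2), (3, 4, 5), (0, 3, 5)]
partition = [0, 1, 2, 0, 1, 2]
print(count_congregating(edges, partition))
```

In this example only the hyperedge (0, 3, 5) congregates, because vertices 0 and 3 share block 0, and it is exactly the hyperedge that lies on that block.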
In the context of Figure 1, the SPHE property says that for any partition with small blocks, there cannot be too many “orange” edges.
Having defined the SSHE and SPHE properties, we now sketch the proof of Lemma 3.7. It will be proved formally later in this section.
Proof sketch of Lemma 3.7.
For sufficiently large , with high probability, the Max-CSP value of the -coarsening of a random instance drawn from is not much larger than (4.13 below). The constraint hypergraph for a random instance drawn from is a good SSHE with high probability (4.11 below). Hypergraphs which are good SSHEs are also (slightly worse) SPHEs (4.7 below). Finally, if the constraint hypergraph of a instance is a good SPHE, its value cannot be much larger than its -coarsened Max-CSP value (4.8 below); intuitively, this is because if we “coarsen” an optimal ordering for the Max-OCSP by lumping vertices together in small groups to get an assignment for the coarsened Max-CSP, we can view this assignment as a partition on , and for every -hyperedge in which does not congregate on this partition, the corresponding constraint in is satisfied. ∎
We remark that the bounds on Max-CSP values of coarsened random instances (4.13 below) and on SSHE in random instances (4.11 below) both use concentration inequalities (i.e., 2.1) and union bound over a space of size only (the space of all solutions to the coarsened Max-CSP and the space of all small subsets of , respectively); this lets us avoid the issue of union-bounding over the entire space directly.
In the remainder of this section, we prove the necessary lemmas and then give a formal proof of Lemma 3.7. We begin with several short lemmas.
Lemma 4.7 (Good SSHEs are good SPHEs).
For every , if a -hypergraph is a -SSHE, then it is a -SPHE.
Let . Consider any partition of where each block has size at most . WLOG, all but one block has size at least