1 Background and Main Contributions
1.1 Motivational Discussion: Space Complexity of Parameterized 2SAT
Since Cook  demonstrated its -completeness, the Boolean formula satisfiability problem (SAT) of determining whether a given Boolean formula is satisfied by a suitably-chosen variable assignment has been studied extensively for more than 50 years. As its restricted variant, the CNF Boolean formula satisfiability problem (SAT), for an integer index , whose input formulas are of -conjunctive normal form (CNF) has also been a centerpiece of computational complexity theory. Since SAT is complete for NP [4], its solvability is linked to the computational complexity of all other problems; for instance, if SAT is solved in polynomial time, then so are all NP problems. Recent studies have focused on the solvability of SAT with Boolean variables and clauses within “sub-exponential” (which means for an absolute constant and a suitable polynomial ) runtime. In this line of study, Impagliazzo, Paturi, and Zane  took a new approach toward SAT and its search version, , parameterized by the number of Boolean variables and the number of clauses in a given CNF formula as natural “size parameters” (which were called “complexity parameters” in ). To discuss such sub-exponential-time solvability for a wide range of -complete problems, Impagliazzo et al. further devised a crucial notion of sub-exponential-time reduction family (or SERF-reduction), which preserves the sub-exponential-time complexity, and they cleverly demonstrated that the two size parameters, and , make SERF-equivalent (that is, both are SERF-reducible to each other). As a working hypothesis, Impagliazzo and Paturi  formally proposed the exponential time hypothesis (ETH), which asserts the insolvability of SAT parameterized by (succinctly denoted by ) in sub-exponential time for all indices . Their hypothesis is obviously a stronger assertion than and it has since led to intriguing consequences, including finer lower bounds on the solvability of various parameterized NP problems (see, e.g., a survey ).
Whereas ETH concerns SAT for , we are focused on the remaining case of . The decision problem SAT is known to be complete for (nondeterministic logarithmic space) under log-space reductions. (This is because Jones, Lien, and Laaser  demonstrated the -completeness of the complement of (called in ) and Immerman  and Szelepcsényi  proved the closure of under complementation.) Since already enjoys a polynomial-time algorithm (because ), we are more concerned with how much memory space such an algorithm requires to run. An elaborate algorithm solves with variables and clauses using simultaneously polynomial time and space (Theorem 4.2), where is a constant and is a suitable polylogarithmic function. This space bound is slightly below ; however, it is not yet known whether 2SAT parameterized by (or ) can be solved in polynomial time using strictly “sub-linear” space. Here, the informal term “sub-linear” for a size parameter refers to a function of the form on input instances for a certain absolute constant and an appropriately-chosen polylogarithmic function . Of course, this multiplicative factor becomes redundant if is relatively large (for example, for any constant ) and thus “sub-linear” turns out to be simply .
In parallel to a restriction of SAT to SAT, for polynomial-time sub-linear-space solvability, we further limit to , which consists of all satisfiable formulas in which each variable appears as literals in at most clauses. Notice that for each is also -complete (Proposition 4.1) as 2SAT is; in contrast, already falls into for any index .
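As a small illustration of the occurrence restriction just described, the following sketch checks whether a 2CNF formula respects a given bound on the number of literal occurrences per variable. The clause encoding (tuples of nonzero integers, with a negative sign marking a negated variable) and the function names are assumptions made for this example only; the check counts occurrences of a variable and of its negation together.

```python
from collections import Counter

def max_variable_occurrences(clauses):
    """Return the largest number of literal occurrences of any single variable.

    A clause is a tuple of nonzero integers: +v stands for variable v and
    -v for its negation (a hypothetical DIMACS-style encoding).
    """
    counts = Counter(abs(lit) for clause in clauses for lit in clause)
    return max(counts.values(), default=0)

def respects_occurrence_bound(clauses, k):
    """Check the syntactic side condition only (not satisfiability):
    every clause has one or two literals, and every variable occurs,
    positively or negatively, at most k times in total."""
    if any(len(clause) == 0 or len(clause) > 2 for clause in clauses):
        return False
    return max_variable_occurrences(clauses) <= k

example = [(1, 2), (-1, 3), (-2, -3)]
print(max_variable_occurrences(example))
print(respects_occurrence_bound(example, 2))
```

Note that this test concerns only the shape of the formula; deciding satisfiability is a separate step.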
1.2 Sub-Linear Space and Short Reductions
All (parameterized) decision problems solvable in polynomial time using sub-linear space form a new complexity class (whose prefix “P” refers to “polynomial time”), which is located between and . This class naturally includes, for example, DCFL (deterministic context-free) because Cook  earlier showed that every language in DCFL is recognized in polynomial time using -space, where is input size. Unfortunately, there is no separation known among , , , and .
It turns out that does not seem to be closed under standard log-space reductions; thus, those reductions are no longer suitable tools to discuss the solvability of NL-complete problems in polynomial time using sub-linear space. Therefore, we need to introduce a much weaker form of reductions, called short reductions, which preserve polynomial-time, sub-linear-space complexity. Intuitively speaking, a short reduction is a reduction between two (parameterized) decision problems computed by a reduction machine (or a reduction function) that can generate strings of size parameter proportional to or less than size parameter of its input string. In particular, we will define three types of such short reductions in Section 3: short L-m-reducibility (), short L-T-reducibility (), and short sub-linear-space-T-reducibility ().
As noted earlier, Impagliazzo et al. demonstrated in [10, Corollary 2] that is SERF-equivalent to . Similarly, we can give a short reduction from with to with , and vice versa; in other words, and are equivalent under short L-T-reductions (Lemma 4.3(2)). On the contrary, such equivalence is not known for and this circumstance signifies the importance of .
Another importance of can be demonstrated by showing that is actually one of the hardest problems in a natural subclass of , which we call Syntactic NL or simply SNL. An SNL formula is of the form , starting with a second-order existential quantifier, followed by first-order quantifiers, with a supporting semantic model. From this model, we define a certificate size . Their precise definitions will be given in Section 4. We say that syntactically expresses if, for every , exactly when is true. The notation stands for the collection of all , each of which is expressed syntactically by an appropriate SNL-formula and satisfies for a certain constant .
is complete for under short SLRF-T-reductions.
1.3 A New, Practical Working Hypothesis for 2SAT
Since its introduction in 2001, ETH for SAT () has served as a driving force to obtain finer lower bounds on the sub-exponential-time computability of various parameterized NP problems, since those bounds do not seem to be obtained directly from the popular assumption of . In a similar vein, we wish to propose a new working hypothesis, called the linear space hypothesis (LSH) for , in which no deterministic algorithm solves simultaneously in polynomial time using sub-linear space. More precisely:
The Linear Space Hypothesis (LSH) for : For any choice of and any polylogarithmic function , no deterministic Turing machine solves parameterized by simultaneously in polynomial time using space, where refers to an input instance to .
We can replace in the above definition by (see Section 4), and thus we often omit it. Consider the case of . Since belongs to , it is also in . This consequence contradicts LSH for . Therefore, we immediately obtain:
If LSH for is true, then .
From Theorem 1.2, our working hypothesis LSH for is expected to lead to finer, better consequences than what the assumption can lead to.
Let denote the infimum of a real number for which there is a deterministic Turing machine solving simultaneously in polynomial time using at most space on instances for a certain fixed polylogarithmic function . Here, we acknowledge three possible cases: (i) , (ii) , and (iii) , and one of them must be true after all. The hypothesis LSH for exactly matches (iii).
The working hypothesis LSH for is true iff holds.
For any -reduction, the notation refers to the collection of all (parameterized) decision problems that can be reduced by -reductions to certain problems in .
The following statements are all logically equivalent. (1) . (2) . (3) .
Proposition 1.4(3) can be compared to the fact that .
Furthermore, we seek two other characterizations of the hypothesis LSH for . The first problem is a variant of a well-known NP-complete problem, called the -linear programming problem (). In what follows, a vector of dimension means an matrix and a rational number is treated as a pair of appropriate integers.
(2,)-Entry -Linear Programming Problem (LP):
Instance: a rational matrix and a rational vector of dimension , where and each row of has at most two nonzero entries and each column of has at most non-zero entries.
Question: is there any -vector satisfying ?
As natural size parameters and , we take the numbers of columns and of rows of for instance given to , respectively.
Another problem to consider is a variant of the directed - connectivity problem () of asking whether a path between two given vertices exists in a directed graph. (This is also known as the graph accessibility problem and the graph reachability problem in the literature.)
Degree- Directed - Connectivity Problem (DSTCON):
Instance: a directed graph of degree (i.e., indegree plus outdegree) at most , and two designated vertices and .
Question: is there any path from to in ?
For any instance to , and respectively denote the number of vertices and that of edges in .
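For orientation, the following sketch decides directed connectivity by breadth-first search and also computes the maximum total degree (indegree plus outdegree) of an instance. It is only a baseline: the set of visited vertices may grow linearly in the number of vertices, so it does not achieve the sub-linear space that the hypothesis is about. The adjacency-list representation and the function names are assumptions of this example.

```python
from collections import Counter, deque

def max_total_degree(out_neighbors):
    """Largest indegree plus outdegree over all vertices.

    out_neighbors maps each vertex to the list of its out-neighbours.
    """
    degree = Counter()
    for v, ws in out_neighbors.items():
        degree[v] += len(ws)      # contribution of outgoing edges
        for w in ws:
            degree[w] += 1        # contribution of incoming edges
    return max(degree.values(), default=0)

def reachable(out_neighbors, s, t):
    """Decide whether there is a directed path from s to t (BFS).

    The 'seen' set can hold every vertex, so the space usage is linear
    in the number of vertices in the worst case.
    """
    seen = {s}
    queue = deque([s])
    while queue:
        v = queue.popleft()
        if v == t:
            return True
        for w in out_neighbors.get(v, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return False

graph = {1: [2], 2: [3], 3: [], 4: [1]}
print(max_total_degree(graph), reachable(graph, 1, 3), reachable(graph, 3, 1))
```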
The following statements are logically equivalent: (1) LSH for , (2) LSH for (with or ), and (3) LSH for (with or ).
This theorem allows us to use and for LSH as substitutes for .
1.4 Four Examples of How to Apply the Working Hypothesis
To demonstrate the usefulness of LSH for , we will seek four applications of LSH in the fields of search problems and optimization problems. Although many NL decision problems have been turned into NL search problems (whose precise definition is given in Section 6), not all NL problems can be “straightforwardly” converted into a framework of NL search problems. For example, 2SAT is NL-complete but the problem of finding a truth assignment (when variables are ordered in an arbitrarily fixed way) that satisfies a given 2CNF formula does not look like a legitimate form of NL search problem. In addition, its optimization version, Max2SAT, is already complete for (polynomial-time approximable NP optimization) instead of NLO (NL optimization class) under polynomial-time approximation-preserving reductions (see ).
First, we will see two simple applications of LSH for in the area of NL search problems. Earlier, Jones et al.  discussed the NL-completeness of a decision problem concerning one-way nondeterministic finite automata (or 1nfa’s). We modify this problem into an associated search problem, called , as given below.
1NFA Membership Search Problem (Search-1NFA):
Instance: a 1nfa with no -moves, and a parameter , where is the empty string for .
Solution: an input string of length accepted by (i.e., when is written on ’s read-only input tape, eventually enters a final state in before or on reading the last symbol of ).
As a meaningful size parameter , we set for instance .
Assuming LSH for , for every fixed value , there is no polynomial-time -space algorithm for .
Jenner  presented a few variants of the well-known knapsack problem and showed their NL-completeness. Here, we choose one of them that fits into the NL-search framework by a small modification. Given a string , a substring of is called unique if there exists a unique pair satisfying . Write for the set .
Unique Ordered Concatenation Knapsack Search Problem (Search-UOCK):
Instance: a string and a sequence of strings over a certain fixed alphabet such that, for every , if is a substring of , then is unique.
Solution: a sequence of indices with such that and .
Our size parameter for is the number of elements in the above definition (namely, for instance ).
If LSH for holds, then, for any , there is no polynomial-time -space algorithm for .
We then turn to the area of NL optimization problems (or NLO problems, in short) [19, 20]. See Section 6 for their formal definition. We will consider a problem that belongs to but does not seem to be solvable using log space. Here, is the collection of NLO problems that have log-space approximation schemes, where a log-space approximation scheme for an NLO problem is a deterministic Turing machine that takes any input of the form and outputs a solution of using space at most for a certain log-space computable function for which the performance ratio satisfies . Such a solution is called a -approximate solution. Notice that the performance ratio is a ratio between the value of an optimal solution and that of ’s output.
In 2007, Tantau  presented an NL maximization problem, called , which falls into . This problem was later rephrased in [20, arXiv version] in terms of complete graphs and it was shown to be computationally hard for (log-space computable NL optimization) under approximation-preserving exact NC-reduction.
Maximum Hot Potato Problem (Max-HPP):
Instance: an matrix whose entries are drawn from , a number , and a start index , where .
Solution: an index sequence of length with for any .
Measure: total weight .
We use the number of columns in a given matrix as size parameter . We can show that, under the assumption of LSH for , cannot have polynomial-time -space approximation schemes of finding -approximate solutions for instances .
If LSH for is true, then, for any , there is no polynomial-time -space algorithm finding -approximate solutions of , where is any instance and is an approximation parameter.
The fourth example concerns the computational complexity of transforming one type of finite automata into another type. It is known that we can convert a 1nfa to an “equivalent” one-way deterministic finite automaton (or 1dfa) in the sense that both and recognize exactly the same language. In particular, we consider the case of transforming an -state unary 1nfa into its equivalent unary 1dfa, where a unary finite automaton takes a single-letter input alphabet. A standard procedure of such transformation requires polynomial-time and space (cf. ). Under LSH for , we can demonstrate that this space bound cannot be made significantly smaller.
If LSH for is true, then, for any constant , there is no polynomial-time -space algorithm that takes an -state unary 1nfa as input and produces an equivalent unary 1dfa of states.
We hope that the working hypothesis LSH for will stimulate the study on the space complexity of problems and lead us to a rich research area.
2 Basic Notions and Notation
Let be the set of natural numbers (i.e., nonnegative integers) and set . Two notations and denote respectively the set of all real numbers and that of all nonnegative real numbers. For any two integers and with , the notation denotes the set , which is an integer interval between and . For simplicity, when , we write for .
In this paper, all polynomials are assumed to have nonnegative integer coefficients. All logarithms are to base . A polylogarithmic (or polylog) function is a function mapping to such that there exists a polynomial for which holds for all , provided that “” is conventionally set to be .
In the course of our study on polynomial-time sub-linear-space computability, it is convenient to expand the standard framework of decision problems to problems parameterized by properly chosen “size parameters” (called “complexity parameters” in ), which serve as a basic unit of the time/space complexity of an algorithm. In this respect, we follow a framework of Impagliazzo et al.  to work with a flexible choice of size parameter. A standard size parameter is the total length of the binary representation of an input instance and it is often denoted by . More generally, a (log-space) size parameter for a problem is a function mapping (where is an input alphabet) to such that (1) must be computed using log space (that is, by a certain Turing machine that takes input and outputs in unary on an output tape using at most space for certain constants ) and (2) there exists a polynomial satisfying for all instances of .
As key examples, for any graph-related problem (such as ), and denote respectively the total number of edges and that of vertices in a given graph instance . Clearly, and are log-space computable. To emphasize the use of size parameter , we often write in place of . We say that a multi-tape Turing machine uses logarithmic space (or log space, in short) with respect to size parameter if there exist two absolute constants such that each of the work tapes (not including input and output tapes) used by on is upper-bounded in size by on every input .
Two specific notations and respectively stand for the classes of all decision problems solvable on multi-tape deterministic and nondeterministic Turing machines using log space. It is known that the additional requirement of “polynomial runtime” does not change these classes. More generally, expresses a class composed of all (parameterized) decision problems solvable deterministically in polynomial time (in ) using space at most on any instance given to .
To define NL search and optimization problems in Section 6, it is convenient for us to use a practical notion of “auxiliary Turing machine” (see, e.g., ). An auxiliary Turing machine is a multi-tape deterministic Turing machine equipped with an extra read-only auxiliary input tape, in which a tape head scans each auxiliary input symbol only once by moving from the left to the right. Given two alphabets and , a (parameterized) decision problem with is in if there exist a polynomial and an auxiliary Turing machine that takes a standard input and an auxiliary input of length and decides whether accepts or not in time polynomial in using space logarithmic in . Its functional version is denoted by , provided that each underlying Turing machine is equipped with an extra write-only output tape (in which a tape head moves to the right whenever it writes a non-blank output symbol) and that the machine produces output strings of at most polynomial length.
3 Sub-Linear Space and Short Reductions
Recall from [9, 10] that the term “sub-exponential” means for a certain constant . In contrast, our main subject is polynomial-time, sub-linear-space computability, where the term “sub-linear” refers to functions of the form on input instances for a certain constant and a certain polylogarithmic function . As noted in Section 1.2, the multiplicative factor can be eliminated whenever is relatively large.
First, we will provide basic definitions for (parameterized) decision problems. A decision problem parameterized by size parameter is said to be solvable in polynomial time using sub-linear space if, for a certain choice of constant , there exist a deterministic Turing machine , a polynomial , and a polylogarithmic function for which solves simultaneously in at most steps using space at most for all instances given to .
The notation expresses the collection of all (parameterized) decision problems that are solvable in polynomial time using sub-linear space. In other words, for input instances , where refers to an arbitrary (log-space) size parameter and refers to any polylogarithmic function. It thus follows that but none of these inclusions is known to be proper.
The notion of reducibility among decision problems is quite useful in measuring the relative complexity of the problems. For the class , in particular, we need a restricted form of reducibility, which we call “short” reducibility, satisfying a special property that any outcome of the reduction is linearly upper-bounded in size by an input of the reduction. We will define such restricted reductions for (parameterized) decision problems of our interest.
We begin with a description of L-m-reducibility for (parameterized) decision problems. Given two (parameterized) decision problems and , we say that is L-m-reducible to , denoted by , if there are a function (where refers to the bit length) and two constants such that, for any input string , (i) iff and (ii) . Notice that all functions in are, by their definition, polynomially bounded.
Concerning polynomial-time sub-linear-space solvability, we introduce a restricted variant of this L-m-reducibility, which we call the short L-m-reducibility (or sL-m-reducibility, in short), obtained by replacing the inequality in the above definition of with . To express this new reducibility, we use a new notation of .
Since many-one reducibility is too restrictive to use, we need a stronger notion of Turing reduction, which fits into a framework of polynomial-time, sub-linear-space computability. Our reduction is actually a polynomial-time sub-linear-space reduction family (SLRF, in short), performed by oracle Turing machines. A (parameterized) decision problem is SLRF-T-reducible to another one , denoted by , if, for every fixed value , there exist an oracle Turing machine equipped with an extra write-only query tape, a polynomial , a polylog function , and three constants such that, for every instance to , (1) runs in at most time using at most space, provided that its query tape is not subject to this space bound, (2) if makes a query to with query word written on the query tape, then satisfies both and , and (3) after makes a query, in a single step, it automatically erases its query tape, it returns its tape head back to the initial cell, and oracle informs the machine of its answer by changing the machine’s inner state.
The short SLRF-T-reducibility (or sSLRF-T-reducibility, in short) is obtained from the SLRF-T-reducibility by substituting for the above inequality . The notation denotes this restricted reducibility. In the case where is limited to log-space usage, we use a different notation of . Note that any -reduction is an -reduction but the converse is not true because there is a pair of problems reducible by -reductions but not by -reductions.
For any reduction , a decision problem is said to be -complete for a given class of problems if (1) and (2) every problem in is -reducible to . We use the notation to express the collection of all problems that are -reducible to certain problems in . When is a singleton, say, , we write instead of .
It follows that implies , which further implies . The same statement holds for , , and . Moreover, implies . The same holds for and .
Here are other basic properties of SLRF-T- and sSLRF-T-reductions.
The reducibilities and are reflexive and transitive.
The class is closed under -reductions.
There exist recursive decision problems and such that but . A similar statement holds also for and .
4 The 2CNF Boolean Formula Satisfiability Problem and SNL
We will briefly discuss (CNF Boolean formula satisfiability problem) and the complexity class . As noted in Section 1.1, is NL-complete under L-m-reductions.
In what follows, we are focused on two specific size parameters: and , which respectively denote the numbers of propositional variables and clauses appearing in formula-related instance (not necessarily limited to instances of ).
We further restrict by limiting the number of literals appearing in an input Boolean formula as follows. Let . We denote by the collection of all formulas in such that, for each variable in , the number of occurrences of and is at most . Since and are solvable using only log space, we focus our attention on the case of . From with the help of the fact that 2SAT is NL-complete, we can immediately obtain the following.
For each index , is -complete.
To solve 2SAT in polynomial time, we need slightly larger than sub-linear space.
For a certain constant and a polylog function , with variables and clauses can be solved in polynomial time using space.
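To make the polynomial-time side of this discussion concrete, the following sketch decides satisfiability of a 2CNF formula by the well-known implication-graph criterion: a formula is unsatisfiable exactly when, for some variable, the literal and its negation reach each other in the implication graph. This is not the space-efficient algorithm behind the above statement; it stores the whole graph and therefore uses linear space. The integer clause encoding and all function names are assumptions of this example.

```python
from collections import defaultdict, deque

def implication_graph(clauses):
    """Build the implication graph of a 2CNF formula.

    A clause (a, b) yields the edges (not a) -> b and (not b) -> a.
    A unit clause (a,) is treated as (a, a). Literals are nonzero
    integers, with -v denoting the negation of variable v.
    """
    graph = defaultdict(list)
    for clause in clauses:
        a, b = clause if len(clause) == 2 else (clause[0], clause[0])
        graph[-a].append(b)
        graph[-b].append(a)
    return graph

def _reaches(graph, src, dst):
    """Breadth-first search for a directed path from src to dst."""
    seen = {src}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return True
        for w in graph.get(u, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return False

def is_satisfiable_2cnf(clauses):
    """Return True iff no variable x has both x ->* not x and not x ->* x."""
    graph = implication_graph(clauses)
    variables = {abs(lit) for clause in clauses for lit in clause}
    return not any(
        _reaches(graph, x, -x) and _reaches(graph, -x, x) for x in variables
    )

print(is_satisfiable_2cnf([(1, 2), (-1, 3), (-2, -3)]))
print(is_satisfiable_2cnf([(1, 2), (-1, 2), (1, -2), (-1, -2)]))
```

The per-variable reachability test mirrors the connection between 2SAT and directed connectivity; it runs in polynomial time but makes no attempt to save space.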
For any reduction defined in Section 3, we write if both and hold.
Let and . (1) and (2) .
Contrary to Lemma 4.3(2), it is still unknown whether .
Hereafter, we will define the notion of SNL formulas, which induce the complexity class . Let be any instance, including “sets” and “objects” . An SNL formula is of the form , where is a quantifier-free formula, which is a Boolean combination of atomic formulas of the following forms: , , , , and (i.e., is the th symbol of ), where is a second-order predicate symbol, and are first-order variables, having the following semantic model for . In this model, ranges over a subset of (where is a universe) with , each ranges over a number in , each takes an element in another universe with , and each ranges over a set of at most elements (i.e., ) for absolute constants and polynomials , not depending on the choice of . A certificate size is defined to be as our basis size parameter.
As a quick example, let us consider a (parameterized) decision problem such that there are a polynomial , a constant , and a deterministic Turing machine recognizing simultaneously in time at most using space at most for every instance to , where is a work-tape alphabet. We assume that terminates in a configuration in which the work tape is blank and all tape heads return to the initial position. For our convenience, is extended to include a special transition from an accepting configuration to itself. To express , we define an SNL-formula as: with a semantic model supporting , , , where , is the set of a unique accepting configuration, indicates the largest index that ensures , expresses a -transition between two configurations, and asserts that represents a function satisfying . Note that . Hence, belongs to .
5 The Working Hypothesis LSH for 2SAT
The exponential time hypothesis (ETH) has served as a driving force to obtain better lower bounds on the computational complexity of various important problems (see, e.g., ).
In Theorem 4.2, we have seen that with variables and clauses can be solved in polynomial time using space for a certain constant ; however, it is not yet known to be solvable in polynomial time using sub-linear space. This circumstance encourages us to propose (in Section 1.3) a practical working hypothesis, the linear space hypothesis (LSH) for , which asserts the insolvability of in polynomial time using sub-linear space. The choice of does not matter; as shown in Lemma 4.3(2) with the help of Lemma 3.1(2), we can replace in the definition of LSH by . Theorem 1.5 has further given two alternative definitions to LSH in terms of and .
The working hypothesis LSH concerns but it also carries over to .
Assuming that LSH for is true, each of the following statements holds: (1) and (2) .
As another consequence of LSH for , we can show the existence of a pair of problems in the class , which are incomparable with respect to -reductions. This indicates that the class has a fine, complex structure with respect to sSLRF-T-reducibility.
Assuming LSH for , there are two decision problems and in such that and .
6 Proofs of the Four Examples of LSH Applications
In Section 1.4, we have described four examples of how to apply our working hypothesis LSH for . Here, we will give three of their proofs.
First, we will briefly describe (parameterized) NL search problems. In general, a search problem parameterized by (log-space) size parameter is expressed as , where consists of (admissible) instances and is a function from to a set of strings (called a solution space) such that, for any , implies for certain constants , where stands for . In particular, when we use the standard “bit length” of instances, we omit “” and write instead of . Of all search problems, (parameterized) NL search problems are (parameterized) search problems for which and . Finally, we denote by the collection of all (parameterized) NL search problems.
We say that a deterministic Turing machine solves if, for any instance , takes as input and produces a solution in if , and produces a designated symbol (“no solution”) otherwise. Now, we recall from Section 1.4 a special NL search problem, called , in which we are asked to find an input of length accepted by a given -free 1nfa . Theorem 1.6 states that no polynomial-time -space algorithm solves .
Proof of Theorem 1.6. Toward a contradiction, we assume that is solved by a deterministic Turing machine in time polynomial in using space at most on instances , where are constants. Our aim is to show that can be solved in polynomial time using sub-linear space, because this contradicts LSH for , which is equivalent to LSH for by Theorem 1.5(3).
Let be any instance to with and . Let . Associated with this , we define a 1nfa as follows. First, let and . Define and . For each , consider its neighbor . We assume that all elements in are enumerated in a fixed linear order as with . The transition function is defined as if .
Suppose that is a path from to in . For each index , we choose an index satisfying and we then set . When reads , it eventually enters , which is a halting state, and therefore accepts . On the contrary, in the case where there is no path from to in , never accepts any input. Therefore, it follows that (*) has a path from to iff accepts .
Finally, we set as an instance to parameterized by . Note that . By (*), can be solved by running on in polynomial time; moreover, the space required for this computation is upper-bounded by , which is obviously sub-linear.
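The shape of the construction in this proof can be sketched as follows. Since the precise state set, alphabet, and length parameter are not reproduced here, the sketch assumes the simplest reading: the states are the vertices, the start state is the source vertex, the only final state is the target vertex, and reading symbol i in a state moves to its i-th out-neighbour under a fixed enumeration. Acceptance means entering a final state before or on reading the last symbol, as in the definition of the search problem. All names below are hypothetical.

```python
def graph_to_1nfa(out_neighbors, s, t):
    """Turn a directed graph with source s and target t into a 1nfa.

    Assumed encoding (for illustration only): states are vertices and
    symbol i in state v leads to the i-th out-neighbour of v (1-indexed).
    Returns (transitions, start_state, final_states).
    """
    transitions = {}
    for v, ws in out_neighbors.items():
        for i, w in enumerate(ws, start=1):
            transitions[(v, i)] = {w}
    return transitions, s, {t}

def accepts_some_prefix(nfa, word):
    """Return True iff the automaton enters a final state before or on
    reading the last symbol of word (no empty-string moves)."""
    transitions, start, finals = nfa
    current = {start}
    if current & finals:
        return True
    for symbol in word:
        # Union of the successor sets of all currently active states.
        current = set().union(
            *(transitions.get((q, symbol), set()) for q in current)
        )
        if current & finals:
            return True
    return False

graph = {1: [2, 3], 2: [4], 3: [], 4: []}
nfa = graph_to_1nfa(graph, 1, 4)
print(accepts_some_prefix(nfa, [1, 1]), accepts_some_prefix(nfa, [2, 1]))
```

Under these assumptions, a word accepted by the automaton spells out a path from the source to the target, and conversely every such path yields an accepted word, which is the equivalence (*) used above.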
Another NL search problem, , asks to find, for a given string , an index sequence in increasing order that makes the concatenation equal to among . Here, we present the proof of Theorem 1.7.
Proof of Theorem 1.7. Let us assume that there is a polynomial-time -space algorithm for on instances for certain constants . We will use this to solve in polynomial time using sub-linear space.
Let be any instance to with . For simplicity of our argument, let , , and . Now, we define for each pair . First, we modify into another graph , where and