The Power of Self-Reducibility: Selectivity, Information, and Approximation

02/21/2019
by   Lane A. Hemaspaandra, et al.

This chapter provides a hands-on tutorial on the important technique known as self-reducibility. Through a series of "Challenge Problems" that are theorems that the reader will---after being given definitions and tools---try to prove, the tutorial will ask the reader not to read proofs that use self-reducibility, but rather to discover proofs that use self-reducibility. In particular, the chapter will seek to guide the reader to the discovery of proofs of four interesting theorems---whose focus areas range from selectivity to information to approximation---from the literature, whose proofs draw on self-reducibility. The chapter's goal is to allow interested readers to add self-reducibility to their collection of proof tools. The chapter simultaneously has a related but different goal, namely, to provide a "lesson plan" (and a coordinated set of slides is available online to support this use [Hem19]) for a lecture to a two-lecture series that can be given to undergraduate students---even those with no background other than basic discrete mathematics and an understanding of what polynomial-time computation is---to immerse them in hands-on proving, and by doing that, to serve as an invitation to them to take courses on Models of Computation or Complexity Theory.

1 Introduction

Section 1.1 explains the two quite different audiences that this chapter is intended for, and for each describes how that group might use the chapter. If you’re not a computer science professor it would make sense to skip Section 1.1.2, and if you are a computer science professor you might at least on a first reading choose to skip Section 1.1.1.

Section 1.2 introduces the type of self-reducibility that this chapter will focus on, and the chapter’s central set, SAT (the satisfiability problem for propositional Boolean formulas).

1.1 A Note on the Two Audiences, and How to Read This Chapter

This chapter is unusual in that it has two intended audiences, and those audiences differ dramatically in their amounts of background in theoretical computer science.

1.1.1 For Those Unfamiliar with Complexity Theory

The main intended audience is those—most especially young students—who are not yet familiar with complexity theory, or perhaps have not even yet taken a models of computation course. If that describes you, then this chapter is intended to be a few-hour tutorial immersion in—and invitation to—the world of theoretical computer science research. As you go through this tutorial, you’ll try to solve—hands-on—research issues that are sufficiently important that their original solutions appeared in some of theoretical computer science’s best conferences and journals.

You’ll be given the definitions and some tools before being asked to try your hand at finding a solution to a problem. And with luck, for at least a few of our four challenge problems, you will find a solution to the problem. Even if you don’t find a solution for the given problem—and the problems increase in difficulty and especially the later ones require bold, flexible exploration to find possible paths to the solution—the fact that you have spent time trying to solve the problem will give you more insight into the solution when the solution is then presented in this chapter.

A big-picture goal here is to make it clear that doing theoretical computer science research is often about playful, creative, flexible puzzle-solving. The underlying hope here is that many people who thought that theoretical computer science was intimidating and something that they could never do or even understand will realize that they can do theoretical computer science and perhaps even that they (gasp!) enjoy doing theoretical computer science.

The four problems also are tacitly bringing out a different issue, one more specifically about complexity. Most people, and even most computer science professors, think that complexity theory is extraordinarily abstract and hard to grasp. Yet in each of our four challenge problems, we’ll see that doing complexity is often extremely concrete, and in fact is about building a program that solves a given problem. Building programs is something that many people already have done, e.g., anyone who has taken an introduction to programming course or a data structures course. The only difference in the programs one builds when doing proofs in complexity theory is that the programs one builds typically draw on some hypothesis that provides a piece of the program’s action. For example, our fourth challenge problem will be to show that if a certain problem is easy to approximate, then it can be solved exactly. So your task, when solving it, will be to write a program that exactly solves the problem. But in writing your program you will assume that you have, as a black box that you can draw on, a program (a subroutine) that, given an instance of the problem, gives an approximate solution.

This view that complexity is largely about something that is quite concrete, namely building programs, in fact is the basis of an entire graduate-level complexity-theory textbook [HO02], in which the situation is described as follows:

Most people view complexity theory as an arcane realm populated by pointy-hatted (if not indeed pointy-headed) sorcerers stirring cauldrons of recursion theory with wands of combinatorics, while chanting incantations involving complexity classes whose very names contain hundreds of characters and sear the tongues of mere mortals. This stereotype has sprung up in part due to the small amount of esoteric research that fits this bill, but the stereotype is more strongly attributable to the failure of complexity theorists to communicate in expository forums the central role that algorithms play in complexity theory.

Expected Background

To keep this chapter as accessible as possible, the amount of expected background has been kept quite small. But there are some types of background that are being assumed here. The reader is assumed to know the following material, which would typically be learned within about the first two courses of most computer science departments’ introductory course sequences.

  1. What a polynomial is.

    As an example, $n^{2} + 3n + 7$ is a polynomial; $2^{n}$ is not.

  2. What it means for a set or function to be computable in polynomial time, i.e., to be computed in time polynomial in the number of bits in the input to the problem. The class of all sets that can be computed in polynomial time is denoted P, and is one of the most important classes in computer science.

    As an example, the set of all positive integers that are multiples of 10 is a set that belongs to P.

  3. Some basics of logic such as the meaning of quantifiers ($\exists$ and $\forall$) and what a (propositional) Boolean formula is.

    As an example of the latter, the formula $x_1 \land (x_2 \lor \overline{x_3})$ is such a formula, and evaluates as True—with each of $x_1$, $x_2$, and $x_3$ being variables whose potential values are True or False—exactly if $x_1$ is True and either $x_2$ is True or the negation of $x_3$ is True.
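The multiples-of-10 example from item 2 can be made concrete with a short sketch. The function name and the choice of a decimal-string input are assumptions made here for illustration; the point is only that the check runs in time linear (hence polynomial) in the length of the input.

```python
def is_multiple_of_10(decimal_digits: str) -> bool:
    """Decide whether a positive integer, given as a decimal string, is a multiple of 10."""
    # Reject anything that is not a well-formed positive integer.
    # (Assumption for this sketch: no leading zeros, so "0" and "010" are rejected.)
    if not (decimal_digits.isascii() and decimal_digits.isdigit()):
        return False
    if decimal_digits[0] == "0":
        return False
    # A positive integer is a multiple of 10 exactly if its last decimal digit is 0.
    # One pass over the input suffices, so the running time is linear in its length.
    return decimal_digits[-1] == "0"
```

For instance, `is_multiple_of_10("1230")` returns `True`, while `is_multiple_of_10("105")` returns `False`.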

If you have that background in hand, wonderful! You have the background to tackle this chapter’s puzzles and challenges, and please (unless you’re a professor thinking of modeling a lecture series on this chapter) skip from here right on to Section 1.2.

1.1.2 For Computer Science Professors

Precisely because this chapter is designed to be accessible and fun for students who don’t have a background in theoretical computer science, the chapter avoids—until Section 7—trying to abstract away from the focus on SAT. In particular, this chapter either avoids mentioning complexity class names such as NP, coNP, and PSPACE, or at least, when it does mention them, uses phrasings such as “classes known as” to make clear that the reader is not expected to possess that knowledge.

Despite that, computer science professors are very much an intended audience for this chapter, though in a way that reflects the fact that the real target audience for these challenges is young students. In particular, in addition to providing a tutorial introduction for students, of the flavor described in Section 1.1.1, this chapter also has as its goal to provide to you, as a teacher, a “lesson plan” to help you offer in your course a one- or two-day lecture (but really hands-on workshop) sequence (see Footnote 1) in which you present the definitions and tools of the first of these problems, and then ask the students to break into groups and in groups spend about 10–25 minutes working on solving the problem (see Footnote 2), and then you ask whether some group has found a solution and would like to present it to the class, and if so you and the class listen to and if needed correct the solution (and if no group found a solution, you and the class will work together to reach a solution). And then you go on to similarly treat the other three problems, again with the class working in teams. This provides students with a hands-on immersion in team-based, on-the-spot theorem-proving—something that most students never get in class. I’ve done this in classes of size up to 79 students, and they love it. The approach does not treat them as receptors of information lectured at them, but rather shows them that they too can make discoveries—even ones that when first obtained appeared in such top forums as CCC (the yearly Computational Complexity Conference), ICALP (the yearly International Colloquium on Automata, Languages, and Programming), the journal Information and Computation, and SIAM Journal on Computing.

Footnote 1: To cover all four problems would take two class sessions. Covering just the first two or perhaps three of the problems could be done in a single 75-minute class session.

Footnote 2: In this chapter, since student readers of the chapter will be working as individuals, I suggest to the reader, for most of the problems, longer amounts of time. But in a classroom setting where students are working in groups, 10–25 minutes may be an appropriate amount of time; perhaps 10 minutes for the first challenge problem, 15 for the second, 15 for the third, and 25 for the fourth. You’ll need to judge for yourself the time amounts that are best, based on your knowledge of your students. For many classes, the just-mentioned times will not be enough. Myself, I try to keep track of whether the groups seem to have found an answer, and I will sometimes stretch out the time window if many groups seem to be still working intensely and with interest. Also, if TAs happen to be available who don’t already know the answers, I may assign them to groups so that the class’s groups will have more experienced members, though the TAs do know to guide rather than dominate a group’s discussions.

To support this use of the chapter as a teaching tool in class, I have made publicly available a set of LaTeX/Beamer slides that can be used for a one- or two-class hands-on workshop series on this chapter. The slides are available online [Hem19], both as pdf slides and, for teachers who might wish to modify the slides, as a zip archive of the source files.

Since the slides might be used in courses where students already do know of such classes as NP and coNP, the slides don’t defer the use of those classes as aggressively as this chapter itself does. But the slides are designed so that the mention of the connections to those classes is parenthetical (literally so—typically a parenthetical, at the end of a theorem statement, noting the more general application of the claim to all of NP or all of coNP), and those parentheticals can be simply skipped over. Also, the slides define on the fly both NP and coNP, so that if you do wish to cover the more general versions of the theorems, the slides will support that too.

The slides don’t themselves present the solutions to Challenge Problems 1, 2, or 3. Rather, they assume that one of the class’s groups will present a solution on the board (or document camera) or will speak the solution with the professor acting as a scribe at the board or the document camera. Challenge Problems 1, 2, and 3 are doable enough that usually at least one group will either have solved the question, or at least will have made enough progress that, with some help from classmates or some hints/help from the professor, a solution can quickly be reached building on the group’s work. (The professor ideally should have read the solutions in this chapter to those problems, so that even if a solution isn’t reached or almost reached by the students on one or two of those problems, the professor can provide a solution at the board or document camera. However, in the ideal case, the solutions of those problems will be heavily student-driven and won’t need much, if any, professorial steering.)

Challenge Problem 4 is sufficiently hard that the slides do include both a slide explaining why a certain very natural approach—which is the one students often (quite reasonably) try to make work—cannot possibly work, and thus why the approach that the slides gave to students as a gentle, oblique hint may be the way to go, and then the slides present a solution along those lines.

The difficulty of Challenge Problem 4 has a point. Although this chapter is trying to show students that they can do theory research, there is also an obligation not to give an artificial sense that all problems are easily solved. Challenge Problem 4 shows students that some problems can have multiple twists and turns on the way to finding a solution. Ideally, the students won’t be put off by this, but rather will appreciate both that solving problems is something they can do, and that in doing so one may well run into obstacles that will take some out-of-the-box thinking to try to get around—obstacles that might take not minutes of thought but rather hours or days or more, as well as working closely with others to share ideas as to what might work.

1.2 Self-Reducibility and SAT

Now that you have read whatever part of Section 1.1 applied to you, to get the lay of the land as to what this chapter is trying to provide you, let us discuss the approach that will be our lodestar throughout this chapter.

One of the most central paradigms of computer science is “divide and conquer.” Some of the most powerful realizations of that approach occur through the use of what is known as self-reducibility, which is our chapter’s central focus.

Loosely put, a set is self-reducible if any membership question regarding the set can be easily resolved by asking (perhaps more than one) membership questions about smaller strings.

That certainly divides, but does it conquer?

The answer varies greatly depending on the setting. Self-reducibility itself, depending on which polynomial-time variant one is looking at, gives upper bounds on a set’s complexity. However, those bounds—which in some cases are the complexity classes known as NP and PSPACE—are nowhere near to putting the set into deterministic polynomial time (aka P).

The magic of self-reducibility, however, comes when one adds another ingredient to one’s stew. Often, one can prove that if a set is self-reducible and has some other property regarding its structure, then the set is feasible, i.e., is in deterministic polynomial time.

This tutorial will ask the reader to—and help the reader to—discover for him- or herself the famous proofs of three such magic cases (due to Selman, Berman, and Fortune), and then of a fourth case that is about the “counting” analogue of what was described in the previous paragraph.

Beyond that, I hope you’ll keep the tool/technique of self-reducibility in mind for the rest of your year, decade, and lifetime—and on each new challenge will spend at least a few moments asking, “Can self-reducibility play a helpful role in my study of this problem?” And with luck, sooner or later, the answer may be, “Yes! Goodness… what a surprise!”

Throughout this chapter, our model language (set) will be “SAT,” i.e., the question of whether a given Boolean formula, for some way of assigning each of its variables to True or to False, evaluates to True. SAT is a central problem in computer science, and possesses a strong form of self-reducibility. As a quiet bonus, though we won’t focus on this in our main traversal of the problems and their solutions, SAT has certain “completeness” properties that make results proven about SAT often yield results for an entire important class of problems known as the “NP-complete” sets; for those interested in that, Section 7, “Going Big: Complexity-Class Implications,” briefly covers that broader view.

2 Definitions Used Throughout: SAT and Self-Reducibility

The game plan of this chapter, as mentioned above, is this: For each of the four challenge problems (theorems), you will be given definitions and some other background or tools. Then the challenge problem (theorem) will be stated, and you’ll be asked to try to solve it, that is, you’ll be asked to prove the theorem. After you do, or after you hit a wall so completely that you feel you can’t solve the theorem even with additional time, you’ll read a proof of the theorem. Each of the four challenge problems has an appendix presenting a proof of the result.

But before we start on the problems, we need to define SAT and discuss its self-reducibility.

Definition 1.

SAT is the set of all satisfiable (propositional) Boolean formulas.

Example 1.
  1. $x \land \overline{x} \notin \mathrm{SAT}$, since neither of the two possible assignments to $x$ causes the formula to evaluate to True.

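  2. $(x_1 \land x_2 \land \overline{x_3}) \lor (x_4 \land \overline{x_4}) \in \mathrm{SAT}$, since that formula evaluates to True under at least one of the sixteen possible ways that the four variables can each be assigned to be True or False. For example, when we take $x_1 = x_2 = x_4 = \mathrm{True}$ and $x_3 = \mathrm{False}$, the formula evaluates to True.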
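Membership in SAT can always be checked, very slowly, by trying every assignment. The sketch below does exactly that. Representing a formula as a Python function of its variables is an assumption made here for illustration, as are the names `find_satisfying_assignment`, `contradiction`, and `four_variable_formula`; the two sample formulas are illustrative and are not taken from the chapter.

```python
from itertools import product
from typing import Callable, Optional, Tuple


def find_satisfying_assignment(
    formula: Callable[..., bool], num_vars: int
) -> Optional[Tuple[bool, ...]]:
    """Try all 2**num_vars assignments; return one that satisfies the formula, or None."""
    for assignment in product([True, False], repeat=num_vars):
        if formula(*assignment):
            return assignment
    return None


def contradiction(x: bool) -> bool:
    # x AND (NOT x): false under both possible assignments to x.
    return x and not x


def four_variable_formula(x1: bool, x2: bool, x3: bool, x4: bool) -> bool:
    # (x1 OR x2) AND ((NOT x3) OR x4): an illustrative satisfiable formula.
    return (x1 or x2) and ((not x3) or x4)
```

With one variable there are two assignments to try, and with four variables there are sixteen. The number of assignments doubles with each added variable, which is why this brute-force approach does not run in polynomial time.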

SAT has the following “divide and conquer” property.

Fact 1 (2-disjunctive length-decreasing self-reducibility).

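Let $k \geq 1$. Let $F(x_1, x_2, \ldots, x_k)$ be a Boolean formula (without loss of generality, assume that each of the variables actually occurs in the formula). Then

$$F(x_1, x_2, \ldots, x_k) \in \mathrm{SAT} \iff \big( F(\mathrm{True}, x_2, \ldots, x_k) \in \mathrm{SAT} \lor F(\mathrm{False}, x_2, \ldots, x_k) \in \mathrm{SAT} \big).$$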

The above fact says that SAT is self-reducible (in particular, in the lingo, it says that SAT is 2-disjunctive length-decreasing self-reducible).

Note: We won’t at all focus in this chapter on details of the encoding of formulas and other objects. That indeed is an issue if one wants to do an utterly detailed discussion/proof. But the issue is not a particularly interesting one, and certainly can be put to the side during a first traversal, such as that which this chapter is inviting you to make.
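One step of Fact 1 can be checked mechanically on small formulas. In the sketch below, a formula is again modeled as a Python function of its variables (an assumption made here, in keeping with the note above about setting encodings aside), and `functools.partial` plays the role of substituting True or False for the first variable. The names `is_satisfiable` and `fact_1_holds` are hypothetical.

```python
from functools import partial
from itertools import product
from typing import Callable


def is_satisfiable(formula: Callable[..., bool], num_vars: int) -> bool:
    """Brute-force check: does some assignment make the formula True?"""
    return any(formula(*a) for a in product([True, False], repeat=num_vars))


def fact_1_holds(formula: Callable[..., bool], num_vars: int) -> bool:
    """Check one step of self-reducibility for a formula with num_vars >= 1 variables."""
    whole = is_satisfiable(formula, num_vars)
    # Fix the first variable to True, then to False; each resulting formula
    # has one fewer variable than the original.
    with_true = is_satisfiable(partial(formula, True), num_vars - 1)
    with_false = is_satisfiable(partial(formula, False), num_vars - 1)
    return whole == (with_true or with_false)
```

For example, `fact_1_holds(lambda a, b, c: (a or b) and not c, 3)` returns `True`. The check itself still relies on brute force, so it illustrates the statement of Fact 1 rather than a fast algorithm.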

A Bit of History

In this chapter, we typically won’t focus much on references. It is best to immerse oneself in the challenges, without getting overwhelmed with a lot of detailed historical context. However, so that those who are interested in history can have some sense of the history, and so that those who invented the concepts and proved the theorems are properly credited, we will for most sections have an “A Bit of History” paragraph that extremely briefly gives literature citations and sometimes a bit of history and context. As to the present section, self-reducibility dates back to the 1970s, and in particular is due to the work of Schnorr [Sch76] and Meyer and Paterson [MP79].

3 Challenge Problem 1: Is SAT even Semi-Feasible?

Pretty much no one believes that SAT has a polynomial-time decision algorithm, i.e., that $\mathrm{SAT} \in \mathrm{P}$ [Gas19]. This section asks you to show that it is even unlikely that SAT has a polynomial-time semi-decision algorithm—a polynomial-time algorithm that, given any two formulas, always outputs one of them and does so in such a way that if at least one of the input formulas is satisfiable then the formula that is output is satisfiable.

3.1 Needed Definitions

A set $L$ is said to be feasible (in the sense of belonging to P) if there is a polynomial-time algorithm that decides membership in $L$.

A set is said to be semi-feasible (aka P-selective) if there is a polynomial-time algorithm that semi-decides membership, i.e., that given any two strings, outputs one that is “more likely” to be in the set (to be formally cleaner, since the probabilities are all 0 and 1 and can tie, what is really meant is “no less likely” to be in the set). The following definition makes this formal. (Here and elsewhere, $\Sigma$ will denote our (finite) alphabet and $\Sigma^*$ will denote the set of finite strings over that alphabet.)

Definition 2.

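A set $L$ is P-selective if there exists a polynomial-time function, $f : \Sigma^* \times \Sigma^* \to \Sigma^*$, such that,

$$(\forall a, b \in \Sigma^*)\big[ f(a,b) \in \{a, b\} \land \big( \{a, b\} \cap L \neq \emptyset \implies f(a,b) \in L \big) \big].$$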

It is known that some P-selective sets can be very hard. Some even have the property known as being “undecidable.” Despite that, our first challenge problem is to prove that SAT cannot be P-selective unless SAT is outright easy computationally, i.e., $\mathrm{SAT} \in \mathrm{P}$. Since it is close to an article of faith in computer science that $\mathrm{SAT} \notin \mathrm{P}$, showing that some hypothesis implies that $\mathrm{SAT} \in \mathrm{P}$ is considered, with the full weight of modern computer science’s current understanding and intuition, to be extremely strong evidence that the hypothesis is unlikely to be true. (In the lingo, the hypothesis is implying that P = NP. Although it is possible that P = NP is true, basically no one believes that it is [Gas19]. However, the P versus NP issue is the most important open issue in applied mathematics, and there is currently a $1,000,000 prize for whoever resolves the issue [Cla].)
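To get a feel for Definition 2, it can help to see a selector for a set that has nothing to do with SAT. The sketch below assumes, purely for illustration, a set of the form "all natural numbers up to some threshold" (with numbers standing in for the strings that encode them) and uses the hypothetical names `selector`, `in_left_cut`, and `is_valid_selector_on`. Taking the minimum of the two inputs is a valid selector for every such set.

```python
from typing import Iterable


def selector(a: int, b: int) -> int:
    """Candidate selector: always output the smaller of the two inputs."""
    return min(a, b)


def in_left_cut(x: int, threshold: int) -> bool:
    """Membership in the illustrative set {x : x <= threshold}."""
    return x <= threshold


def is_valid_selector_on(values: Iterable[int], threshold: int) -> bool:
    """Check both conditions of the selectivity definition on all pairs from values."""
    values = list(values)
    for a in values:
        for b in values:
            chosen = selector(a, b)
            if chosen not in (a, b):
                return False  # the output must be one of the two inputs
            some_input_in_set = in_left_cut(a, threshold) or in_left_cut(b, threshold)
            if some_input_in_set and not in_left_cut(chosen, threshold):
                return False  # if either input is in the set, the output must be too
    return True
```

Note that `selector` never looks at the threshold. It gives a correct "no less likely" answer without being able to decide membership, which is the sense in which semi-deciding a set can be much easier than deciding it.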

A Bit of History

Inspired by an analogue in computability theory, P-selectivity was defined by Alan L. Selman in a seminal series of papers [Sel79, Sel81, Sel82b, Sel82a], which included a proof of our first challenge theorem. The fact, alluded to above, that P-selective sets can be very hard is due to Alan L. Selman’s above work and to the work of the researcher in whose memory this chapter is written, Ker-I Ko [Ko82]. In that same paper, Ko also did very important early work showing that P-selective sets are unlikely to have what is known as “small circuits.” For those particularly interested in the P-selective sets, they and their nondeterministic cousins are the subject of a book, Theory of Semi-Feasible Algorithms [HT03].

3.2 Can SAT Be P-Selective?

Challenge Problem 1.

(Prove that) if SAT is P-selective, then $\mathrm{SAT} \in \mathrm{P}$.

Keep in mind that what you should be trying to do is this. You may assume that SAT is P-selective. So you may act as if you have in hand a polynomial-time computable function, $f$, that in the sense of Definition 2 shows that SAT is P-selective. And your task is to give a polynomial-time algorithm for SAT, i.e., a program that in time polynomial in the number of bits in its input determines whether the input string belongs to SAT. (Your algorithm surely will be making calls to $f$—possibly quite a few calls.)

So that you have them easily at hand while working on this, here are some of the key definitions and tools that you might want to draw on while trying to prove this theorem.

SAT

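SAT is the set of all satisfiable (propositional) Boolean formulas.

Self-reducibility

Let $k \geq 1$. Let $F(x_1, x_2, \ldots, x_k)$ be a Boolean formula (without loss of generality assume that each of the variables occurs in the formula). Then $F(x_1, x_2, \ldots, x_k) \in \mathrm{SAT} \iff \big( F(\mathrm{True}, x_2, \ldots, x_k) \in \mathrm{SAT} \lor F(\mathrm{False}, x_2, \ldots, x_k) \in \mathrm{SAT} \big)$.

P-selectivity

A set $L$ is P-selective if there exists a polynomial-time function, $f : \Sigma^* \times \Sigma^* \to \Sigma^*$, such that, $(\forall a, b \in \Sigma^*)\big[ f(a,b) \in \{a, b\} \land \big( \{a, b\} \cap L \neq \emptyset \implies f(a,b) \in L \big) \big]$.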

My suggestion to you would be to work on proving this theorem until either you find a proof, or you’ve put in at least 20 minutes of thought, are stuck, and don’t think that more time will be helpful.

When you’ve reached one or the other of those states, please go on to Appendix A to read a proof of this theorem. Having that proof will help you check whether your proof (if you found one) is correct, and if you did not find a proof, will show you a proof. Knowing the answer to this challenge problem before going on to the other three challenge problems is important, since an aspect of this problem’s solution will show up in the solutions to the other challenge problems.

I’ve put the solutions in separate appendix sections so that you can avoid accidentally seeing them before you wish to. But please do (unless you are completely certain that your solution to the first problem is correct) read the solution for the first problem before moving on to the second problem. If you did not find a proof for this first challenge problem, don’t feel bad; everyone has days when we see things and days when we don’t. On the other hand, if you did find a proof of this first challenge theorem, wonderful, and if you felt that it was easy, well, the four problems get steadily harder, until by the fourth problem almost anyone would have to work very, very hard and be a bit lucky to find a solution.

4 Challenge Problem 2: Low Information Content and SAT, Part 1: Can SAT Reduce to a Tally Set?

Can SAT have low information content? To answer that, one needs to formalize what notion of low information content one wishes to study. There are many such notions, but a particularly important one is whether a given set can “many-one polynomial-time reduce” to a tally set (a set over a 1-letter alphabet).

A Bit of History

Our second challenge theorem was stated and proved by Piotr Berman [Ber78]. Berman’s paper started a remarkably long and productive line of work, which we will discuss in more detail in the “A Bit of History” note accompanying the third challenge problem. That same note will provide pointers to surveys of that line of work, for those interested in additional reading.

4.1 Needed Definitions

$\epsilon$ will denote the empty string.

Definition 3.

A set $T$ is a tally set if $T \subseteq \{\epsilon, 0, 00, 000, \ldots\}$.

Definition 4.

We say that $A \leq_m^p B$ ($A$ many-one polynomial-time reduces to $B$) if there is a polynomial-time computable function $f$ such that

$$(\forall x \in \Sigma^*)\big[ x \in A \iff f(x) \in B \big].$$

Informally, this says that $B$ is so powerful that each membership query to $A$ can be efficiently transformed into a membership query to $B$ that gets the same answer as would the question regarding membership in $A$.
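Definitions 3 and 4 can be seen working together on a toy case. The sets and names below are assumptions chosen for illustration, not taken from the chapter: `in_A` decides the set of binary strings of even length, `in_T` decides the tally set of even-length strings of 0s, and `g` maps a string to the all-0 string of the same length. Both sets here happen to be easy; in general, nothing in Definition 4 requires the target set to be easy.

```python
from itertools import product


def in_A(x: str) -> bool:
    """A = binary strings of even length (illustrative source set)."""
    return len(x) % 2 == 0


def in_T(y: str) -> bool:
    """T = {0^n : n is even}, a tally set over the one-letter alphabet {0}."""
    return set(y) <= {"0"} and len(y) % 2 == 0


def g(x: str) -> str:
    """Candidate many-one reduction from A to T; clearly polynomial-time computable."""
    return "0" * len(x)


def reduction_is_correct_up_to(max_len: int) -> bool:
    """Check x in A <=> g(x) in T for every binary string of length at most max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            x = "".join(bits)
            if in_A(x) != in_T(g(x)):
                return False
    return True
```

Here `reduction_is_correct_up_to(8)` returns `True`, confirming on all short inputs that each membership question about `A` is turned into a membership question about `T` with the same answer.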

4.2 Can SAT Reduce to a Tally Set?

Challenge Problem 2.

(Prove that) if there exists a tally set $T$ such that $\mathrm{SAT} \leq_m^p T$, then $\mathrm{SAT} \in \mathrm{P}$.

Keep in mind that what you should be trying to do is this. You may assume that there exists a tally set $T$ such that $\mathrm{SAT} \leq_m^p T$. You may not assume that you have a polynomial-time algorithm for $T$; you are assuming that $T$ exists, but for all we know, $T$ might well be very hard. On the other hand, you may assume that you have in hand a polynomial-time computable function $g$ that reduces from SAT to $T$ in the sense of Definition 4. (After all, that reduction is (if it exists) a finite-sized program.) Your task here is to give a polynomial-time algorithm for SAT, i.e., a program that in time polynomial in the number of bits in its input determines whether the input string belongs to SAT. (Your algorithm surely will be making calls to $g$—possibly quite a few calls.)

So that you have them easily at hand while working on this, here are some of the key definitions and tools that you might want to draw on while trying to prove this theorem.

SAT

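SAT is the set of all satisfiable (propositional) Boolean formulas.

Self-reducibility

Let $k \geq 1$. Let $F(x_1, x_2, \ldots, x_k)$ be a Boolean formula (without loss of generality assume that each of the $x_i$ actually occurs in the formula). Then $F(x_1, x_2, \ldots, x_k) \in \mathrm{SAT} \iff \big( F(\mathrm{True}, x_2, \ldots, x_k) \in \mathrm{SAT} \lor F(\mathrm{False}, x_2, \ldots, x_k) \in \mathrm{SAT} \big)$.

Tally sets

A set $T$ is a tally set if $T \subseteq \{\epsilon, 0, 00, 000, \ldots\}$.

Many-one reductions

We say that $A \leq_m^p B$ if there is a polynomial-time computable function $f$ such that $(\forall x \in \Sigma^*)\big[ x \in A \iff f(x) \in B \big]$.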

My suggestion to you would be to work on proving this theorem until either you find a proof, or you’ve put in at least 30 minutes of thought, are stuck, and don’t think that more time will be helpful.

When you’ve reached one or the other of those states, please go on to Appendix B to read a proof of this theorem. The solution to the third challenge problem is an extension of this problem’s solution, so knowing the answer to this challenge problem before going on to the third challenge problem is important.

5 Challenge Problem 3: Low Information Content and SAT, Part 2: Can $\overline{\mathrm{SAT}}$ Reduce to a Sparse Set?

This problem challenges you to show that $\overline{\mathrm{SAT}}$ cannot be reduced even to a set from a class that is far broader than the tally sets, namely, the so-called sparse sets, unless $\mathrm{SAT} \in \mathrm{P}$.

A Bit of History

This third challenge problem was stated and proved by Steve Fortune [For79]. It was another step in what was a long line of advances—employing more and more creative and sometimes difficult proofs—that eventually led to the understanding that, unless P = NP, no sparse set can be hard for NP even with respect to extremely flexible types of reductions. The most famous result within this line is known as Mahaney’s Theorem: If there is a sparse set $S$ such that $\mathrm{SAT} \leq_m^p S$, then P = NP [Mah82]. There are many surveys of the just-mentioned line of work, e.g., [Mah86, Mah89, You92, HOW92]. The currently strongest result in that line is due to Glaßer [Gla00] (see the survey/treatment of that in [GH00], and see also the results of Arvind et al. [AHH93]).

5.1 Needed Definitions

Let $\|S\|$ denote the cardinality of set $S$, e.g., $\|\{\epsilon, 0, 00\}\| = 3$.

For any set $L$, let $\overline{L}$ denote the complement of $L$.

Let $|x|$ denote the length of string $x$, e.g., $|0010| = 4$.

Definition 5.

A set $S$ is sparse if there exists a polynomial $q$ such that, for each natural number $n$, it holds that

$$\|\{x \mid x \in S \land |x| \leq n\}\| \leq q(n).$$

Informally put, the sparse sets are the sets whose number of strings up to a given length is at most polynomial. $\{0,1\}^*$ is, for example, not a sparse set, since up to length $n$ it has $2^{n+1} - 1$ strings. But all tally sets are sparse, indeed all via the bounding polynomial $q(n) = n + 1$.
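The contrast between a sparse and a non-sparse set can be checked by direct counting on small lengths. The sketch below assumes the binary alphabet and uses hypothetical names (`count_up_to_length`, `every_string`, `only_zeros`). It counts how many strings of length at most n belong to a set, for the set of all binary strings and for the largest tally set over the letter 0.

```python
from itertools import product
from typing import Callable


def count_up_to_length(member: Callable[[str], bool], n: int, alphabet: str = "01") -> int:
    """Count the strings over the alphabet, of length at most n, accepted by member."""
    return sum(
        1
        for length in range(n + 1)
        for chars in product(alphabet, repeat=length)
        if member("".join(chars))
    )


def every_string(x: str) -> bool:
    """Membership test for the set of all binary strings."""
    return True


def only_zeros(x: str) -> bool:
    """Membership test for the tally set {empty string, 0, 00, 000, ...}."""
    return set(x) <= {"0"}
```

For each n, the first count is 2^(n+1) - 1, which no polynomial bounds, while the second is n + 1. Any other tally set over the same letter is a subset of the second set, so the same polynomial bound applies to it.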

5.2 Can $\overline{\mathrm{SAT}}$ Reduce to a Sparse Set?

Challenge Problem 3.

(Prove that) if there exists a sparse set $S$ such that $\overline{\mathrm{SAT}} \leq_m^p S$, then $\mathrm{SAT} \in \mathrm{P}$.

Keep in mind that what you should be trying to do is this. You may assume that there exists a sparse set $S$ such that $\overline{\mathrm{SAT}} \leq_m^p S$. You may not assume that you have a polynomial-time algorithm for $S$; you are assuming that $S$ exists, but for all we know, $S$ might well be very hard. On the other hand, you may assume that you have in hand a polynomial-time computable function $g$ that reduces from $\overline{\mathrm{SAT}}$ to $S$ in the sense of Definition 4. (After all, that reduction is—if it exists—a finite-sized program.) And you may assume that you have in hand a polynomial that upper-bounds the sparseness of $S$ in the sense of Definition 5. (After all, one of the countably infinite list of simple polynomials $n^k + k$—for $k = 1, 2, 3, \ldots$—will provide such an upper bound, if any such polynomial upper bound exists.) Your task here is to give a polynomial-time algorithm for SAT, i.e., a program that in time polynomial in the number of bits in its input determines whether the input string belongs to SAT. (Your algorithm surely will be making calls to $g$—possibly quite a few calls.)

One might wonder why I just said that you should build a polynomial-time algorithm for SAT, given that the theorem speaks of $\overline{\mathrm{SAT}}$. However, since it is clear that $\mathrm{SAT} \in \mathrm{P} \iff \overline{\mathrm{SAT}} \in \mathrm{P}$ (namely, given a polynomial-time algorithm for SAT, if we simply reverse the answer on each input, then we now have a polynomial-time algorithm for $\overline{\mathrm{SAT}}$), it is legal to focus on SAT—and most people find doing so more natural and intuitive.

Do be careful here. Solving this challenge problem may take an “aha!… insight” moment. Knowing the solution to Challenge Problem 2 will be a help here, but even with that knowledge in hand one hits an obstacle. And then the challenge is to find a way around that obstacle.

So that you have them easily at hand while working on this, here are some of the key definitions and tools that you might want to draw on while trying to prove this theorem.

SAT and $\overline{\mathrm{SAT}}$

$\mathrm{SAT}$ is the set of all satisfiable (propositional) Boolean formulas. $\overline{\mathrm{SAT}}$ denotes the complement of $\mathrm{SAT}$.

Self-reducibility

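Let $k \ge 1$. Let $F(x_1, x_2, \ldots, x_k)$ be a Boolean formula (without loss of generality assume that each of the $x_i$ actually occurs in the formula). Then $F(x_1, x_2, \ldots, x_k) \in \mathrm{SAT} \iff \big(F(\mathrm{True}, x_2, \ldots, x_k) \in \mathrm{SAT} \lor F(\mathrm{False}, x_2, \ldots, x_k) \in \mathrm{SAT}\big)$.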

Sparse sets

A set $S$ is sparse if there exists a polynomial $q$ such that, for each natural number $n$, it holds that $|\{x \in S : |x| \le n\}| \le q(n)$.

Many-one reductions

We say that $A \le_m^p B$ if there is a polynomial-time computable function $g$ such that $(\forall x)[x \in A \iff g(x) \in B]$.
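As a small sanity check of the self-reducibility statement recalled above, here is a minimal Python sketch. The representation of a formula as a pair (number of variables, evaluation function), the helper names `is_satisfiable` and `restrict`, and the example formula are all assumptions made for this illustration; the brute-force test takes exponential time and is only a stand-in for reasoning about membership in SAT.

```python
from itertools import product

def is_satisfiable(formula):
    """Brute-force satisfiability test; exponential time, for illustration only."""
    num_vars, evaluate = formula
    return any(evaluate(bits) for bits in product([True, False], repeat=num_vars))

def restrict(formula, value):
    """Return the child formula obtained by fixing the first variable to `value`."""
    num_vars, evaluate = formula
    return (num_vars - 1, lambda rest: evaluate((value,) + tuple(rest)))

# Hypothetical example: F = (x1 or x2) and (not x1 or x3) and (not x3)
F = (3, lambda v: (v[0] or v[1]) and ((not v[0]) or v[2]) and (not v[2]))

parent = is_satisfiable(F)
children = [is_satisfiable(restrict(F, b)) for b in (True, False)]

# Self-reducibility: the parent is satisfiable iff at least one child is.
print(parent, children)
```

Here the child with the first variable set to True is unsatisfiable while the other child is satisfiable, so the parent is satisfiable, as the equivalence predicts.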

My suggestion to you would be to work on proving this theorem until either you find a proof, or you’ve put in at least 40 minutes of thought, are stuck, and don’t think that more time will be helpful.

When you’ve reached one or the other of those states, please go on to Appendix C to read a proof of this theorem.

6 Challenge Problem 4: Is #SAT as Hard to (Enumeratively) Approximate as It Is to Solve Exactly?

This final challenge is harder than the three previous ones. To solve it, you’ll have to have multiple insights—as to what approach to use, what building blocks to use, and how to use them.

The problem is sufficiently hard that the solution is structured to give you, if you did not solve the problem already, a second bite at the apple! That is, the solution, after discussing why the problem can be hard to solve, gives a very big hint, and then invites you to retry the problem with that hint in hand.

A Bit of History

The function $\#\mathrm{SAT}$, the counting version of $\mathrm{SAT}$, will play a central role in this challenge problem. $\#\mathrm{SAT}$ was introduced and studied by Valiant [Val79a, Val79b]. This final challenge problem, its proof (including the lemma given in the solution I give here and the proof of that lemma), and the notion of enumerators and enumerative approximation are due to Cai and Hemachandra [CH89]. The challenge problem is a weaker version of the main result of that paper, which proves the result for not just 2-enumerators but even for sublinear-enumerators; later work showed that the result even holds for all polynomial-time computable enumerators [CH91].

6.1 Needed Definitions

$|F|$ will denote the length of (the encoding of) formula $F$.

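$\#\mathrm{SAT}$ is the function that given as input a Boolean formula $F(x_1, x_2, \ldots, x_k)$ (without loss of generality assume that each of the variables occurs in $F$) outputs the number of satisfying assignments the formula has (i.e., of the $2^k$ possible assignments of the variables to $\mathrm{True}$/$\mathrm{False}$, the number of those under which $F$ evaluates to $\mathrm{True}$; so the output will be a natural number in the interval $[0, 2^k]$). For example, $\#\mathrm{SAT}(x_1 \lor x_2) = 3$ and $\#\mathrm{SAT}(x_1 \land \overline{x_1}) = 0$.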
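To make the definition of the counting function concrete, here is a minimal brute-force sketch in Python. The function name `count_satisfying` and the encoding of a formula as a Python predicate over a tuple of Booleans are assumptions made for this illustration; the loop visits all $2^k$ assignments, so it only restates the definition and is in no way a polynomial-time algorithm.

```python
from itertools import product

def count_satisfying(num_vars, evaluate):
    """Count the assignments (out of 2**num_vars) under which `evaluate` is True."""
    return sum(1 for bits in product([True, False], repeat=num_vars) if evaluate(bits))

# Two small hypothetical formulas.
print(count_satisfying(2, lambda v: v[0] or v[1]))         # x1 or x2
print(count_satisfying(1, lambda v: v[0] and (not v[0])))  # x1 and (not x1)
```

The two calls print 3 and 0, and every value returned for a $k$-variable formula lies between 0 and $2^k$.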

Definition 6.

We say that $\#\mathrm{SAT}$ has a polynomial-time $2$-enumerator (aka, $\#\mathrm{SAT}$ is polynomial-time 2-enumerably approximable) if there is a polynomial-time computable function $h$ such that on each input $x$,

  1. $h(x)$ outputs a list of two (perhaps identical) natural numbers, and

  2. $\#\mathrm{SAT}(x)$ appears in the list output by $h(x)$.

So a 2-enumerator outputs a list of (at most) two candidate values for the value of $\#\mathrm{SAT}$ on the given input, and the actual output is always somewhere in that list. This notion generalizes in the natural way to other list cardinalities, e.g., enumerators whose lists have any other fixed constant length and, more generally, enumerators whose list length is bounded by a function of the input length.

6.2 Food for Thought

You’ll certainly want to use some analogue of the key self-reducibility observation, except now respun by you to be about the number of solutions of a formula and how it relates to or is determined by the number of solutions of its two “child” formulas.

But doing that is just the first step of your quest. So… please play around with ideas and approaches. Don’t be afraid to be bold and ambitious. For example, you might say “Hmmmm, if we could do/build XYZ (where perhaps XYZ might be some particular insight about combining formulas), that would be a powerful tool in solving this, and I suspect we can do/build XYZ.” And then you might want to work both on building XYZ and on showing in detail how, if you did have tool XYZ in hand, you could use it to show the theorem.

6.3 Is #SAT as Hard to (Enumeratively) Approximate as It Is to Solve Exactly?

Challenge Problem 4 (Cai and Hemachandra).

(Prove that) if $\#\mathrm{SAT}$ has a polynomial-time $2$-enumerator, then there is a polynomial-time algorithm for $\#\mathrm{SAT}$.

Keep in mind that what you should be trying to do is this. You may assume that you have in hand a polynomial-time 2-enumerator for $\#\mathrm{SAT}$. Your task here is to give a polynomial-time algorithm for $\#\mathrm{SAT}$, i.e., a program that in time polynomial in the number of bits in its input determines the number of satisfying assignments of the (formula encoded by the) input string. (Your algorithm surely will be making calls to the 2-enumerator, possibly quite a few calls.)

Do be careful here. Proving this may take about three “aha!… insight” moments; Section 6.2 gave slight hints regarding two of those.

So that you have them easily at hand while working on this, here are some of the key definitions and tools that you might want to draw on while trying to prove this theorem.

#SAT

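$\#\mathrm{SAT}$ is the function that given as input a Boolean formula $F(x_1, x_2, \ldots, x_k)$ (without loss of generality assume that each of the variables occurs in $F$) outputs the number of satisfying assignments the formula has (i.e., of the $2^k$ possible assignments of the variables to $\mathrm{True}$/$\mathrm{False}$, the number of those under which $F$ evaluates to $\mathrm{True}$; so the output will be a natural number in the interval $[0, 2^k]$). For example, $\#\mathrm{SAT}(x_1 \lor x_2) = 3$ and $\#\mathrm{SAT}(x_1 \land \overline{x_1}) = 0$.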

Enumerative approximation

We say that $\#\mathrm{SAT}$ has a polynomial-time $2$-enumerator (aka, $\#\mathrm{SAT}$ is polynomial-time 2-enumerably approximable) if there is a polynomial-time computable function $h$ such that on each input $x$, (a) $h(x)$ outputs a list of two (perhaps identical) natural numbers, and (b) $\#\mathrm{SAT}(x)$ appears in the list output by $h(x)$.

My suggestion to you would be to work on proving this theorem until either you find a proof, or you’ve put in at least 30–60 minutes of thought, are stuck, and don’t think that more time will be helpful.

When you’ve reached one or the other of those states, please go on to Appendix D, where you will find first a discussion of what the most tempting dead end here is, why it is a dead end, and a tool that will help you avoid the dead end. And then you’ll be urged to attack the problem again with that extra tool in hand.

7 Going Big: Complexity-Class Implications

During all four of our challenge problems, we focused just on the concrete problem, $\mathrm{SAT}$, in its language version or in its counting analogue, $\#\mathrm{SAT}$.

However, the challenge results in fact apply to broader classes of problems. Although we (mostly) won’t prove those broader results in this chapter, this section will briefly note some of those (and the reader may well be able to in most cases easily fill in the proofs). The original papers, cited in the “A Bit of History” notes, are an excellent source to go to for more coverage. None of the claims below, of course, are due to the present tutorial paper, but rather they are generally right from the original papers. Also often of use for a gentler treatment than the original papers is the textbook, The Complexity Theory Companion [HO02], in which coverage related to our four problems can be found in, respectively, chapters 1, 1 [sic], 3, and (using a different technique and focusing on a concrete but different target problem) 6.

Let us define the complexity class NP by

$\mathrm{NP} = \{L \mid L \le_m^p \mathrm{SAT}\}$. NP more commonly is defined as the class of sets accepted by nondeterministic polynomial-time Turing machines; but that definition in fact yields the same class of sets as the alternate definition just given, and would require a detailed discussion of what Turing machines are.

Recall that $\overline{L}$ denotes the complement of $L$. Let us define the complexity class $\mathrm{coNP}$ by $\mathrm{coNP} = \{L \mid \overline{L} \in \mathrm{NP}\}$.

A set $H$ is said to be hard for a class $\mathcal{C}$ if for each set $L \in \mathcal{C}$ it holds that $L \le_m^p H$. If in addition $H \in \mathcal{C}$, then we say that $H$ is $\mathcal{C}$-complete. It is well known (although it takes quite a bit of work to show, and showing this was one of the most important steps in the history of computer science) that $\mathrm{SAT}$ is NP-complete [Coo71, Lev75, Kar72].

The following theorem follows easily from our first challenge theorem, basically because if some NP-hard set is P-selective, that causes $\mathrm{SAT}$ to be P-selective. (Why? Our P-selector function for $\mathrm{SAT}$ will simply polynomial-time reduce each of its two inputs to the NP-hard set, will run that set’s P-selector function on those two strings, and then will select as the more likely to belong to $\mathrm{SAT}$ whichever input string corresponded to the selected string, and for definiteness will choose its first argument in the degenerate case where both its arguments map to the same string.)

Theorem 1.

If there exists an $\mathrm{NP}$-hard, $\mathrm{P}$-selective set, then $\mathrm{P} = \mathrm{NP}$.

The converse of the above theorem also holds, since if $\mathrm{P} = \mathrm{NP}$ then $\mathrm{SAT}$ and indeed all of NP is P-selective, since P sets unconditionally are P-selective.
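The selector-transfer argument behind Theorem 1 can be sketched in Python as follows. Everything concrete here is a toy stand-in chosen for this illustration and is not from the original papers: the "hard" set is the integers that are at least 10 (whose selector just picks the larger number), the set being reduced to it is the set of strings of length at least 10, and the reduction is `len`. Only the wrapper `make_selector` reflects the construction described above.

```python
def make_selector(reduce_to_hard_set, select_in_hard_set):
    """Build a selector for a set A from a reduction of A to H and a selector for H."""
    def select(a, b):
        image_a, image_b = reduce_to_hard_set(a), reduce_to_hard_set(b)
        if image_a == image_b:
            return a  # degenerate case: both inputs map to the same string
        chosen = select_in_hard_set(image_a, image_b)
        return a if chosen == image_a else b
    return select

# Toy stand-ins (assumptions for illustration only):
# H = {n : n >= 10} has the selector "pick the larger number".
toy_reduction = len                          # reduces {s : len(s) >= 10} to H
toy_hard_selector = lambda x, y: max(x, y)

select = make_selector(toy_reduction, toy_hard_selector)
print(select("short", "a" * 12))
print(select("abc", "xyz"))
```

The first call returns the 12-character string (the only input that can be in the toy set), and the second returns its first argument because both inputs map to the same value.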

The following theorem follows easily from our second challenge theorem.

Theorem 2.

If there exists an $\mathrm{NP}$-hard tally set, then $\mathrm{P} = \mathrm{NP}$.

The converse of the above theorem also holds.

The following theorem follows easily from our third challenge theorem.

Theorem 3.

If there exists a $\mathrm{coNP}$-hard sparse set, then $\mathrm{P} = \mathrm{NP}$.

The converse of the above theorem also holds.

To state the complexity-class analogue of the fourth challenge problem takes a bit more background, since the result is about function classes rather than language classes.

There is a complexity class, which we will not define here, defined by Valiant and known as $\#\mathrm{P}$ [Val79a], that is the set of functions that count the numbers of accepting paths of what are known as nondeterministic polynomial-time Turing machines.

Metric reductions give a reduction notion that applies to the case of functions rather than languages, and are defined as follows. A function $f$ is said to polynomial-time metric reduce to a function $g$ if there exist two polynomial-time computable functions, $\varphi$ and $\psi$, such that $(\forall x)[f(x) = \psi(x, g(\varphi(x)))]$ [Kre88]. (We are assuming that our output natural numbers are naturally coded in binary.) We say a function $g$ is hard for $\#\mathrm{P}$ with respect to polynomial-time metric reductions if for every $f \in \#\mathrm{P}$ it holds that $f$ polynomial-time metric reduces to $g$; if in addition $g \in \#\mathrm{P}$, we say that $g$ is $\#\mathrm{P}$-complete with respect to polynomial-time metric reductions.
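As a hedged illustration of the shape of a metric reduction (a pre-processing map applied to the input, one call to the target function, and a post-processing map that also sees the original input), here is a toy Python sketch. The example is an assumption of this illustration, not one taken from the source: it reduces "number of falsifying assignments" to the counting function, using the identity map for pre-processing and "subtract from $2^k$" for post-processing. The brute-force counter merely stands in for the target function.

```python
from itertools import product

def count_sat(formula):
    """Stand-in for the target function g: brute-force count of satisfying assignments."""
    num_vars, evaluate = formula
    return sum(1 for bits in product([True, False], repeat=num_vars) if evaluate(bits))

def pre(formula):
    """Pre-processing map: here simply the identity."""
    return formula

def post(formula, answer):
    """Post-processing map: sees the original input and g's answer."""
    num_vars, _ = formula
    return 2 ** num_vars - answer

def count_falsifying(formula):
    """f(x) = post(x, g(pre(x))): the metric-reduction pattern with a single call to g."""
    return post(formula, count_sat(pre(formula)))

F = (2, lambda v: v[0] or v[1])  # hypothetical example: x1 or x2
print(count_sat(F), count_falsifying(F))
```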

With that groundwork in hand, we can now state the analogue, for counting classes, of our fourth challenge theorem. Since we have not defined $\#\mathrm{P}$ here, we’ll state the theorem both in terms of $\#\mathrm{SAT}$ and in terms of $\#\mathrm{P}$ (the two statements below in fact turn out to be equivalent).

Theorem 4.
  1. If there exists a function $f$ such that $\#\mathrm{SAT}$ polynomial-time metric reduces to $f$ and $f$ has a $2$-enumerator, then there is a polynomial-time algorithm for $\#\mathrm{SAT}$.

  2. If there exists a function $f$ that is $\#\mathrm{P}$-hard with respect to polynomial-time metric reductions and has a $2$-enumerator, then there is a polynomial-time algorithm for $\#\mathrm{SAT}$.

The converse of each of the theorem parts also holds. The above theorem parts (and their converses) even hold if one asks not about 2-enumerators but rather about polynomial-time enumerators that have no limit on the number of elements in their output lists (aside from the polynomial limit that is implicit from the fact that the enumerators have only polynomial time to write their lists).

8 Conclusions

In conclusion, self-reducibility provides a powerful tool with applications across a broad range of settings.

Myself, I have found self-reducibility and its generalizations to be useful in understanding topics ranging from election manipulation [HHM13] to backbones of and backdoors to Boolean formulas [HN17, HN19] to the complexity of sparse sets [HS95], space-efficient language recognition [HOT94], logspace computation [HJ97], and approximation [HZ96, HH03].

My guess and hope is that perhaps you too may find self-reducibility useful in your future work. That is, please, if it is not already there, consider adding this tool to your personal research toolkit: When you face a problem, think (if only for a moment) whether the problem happens to be one where the concept of self-reducibility will help you gain insight. Who knows? One of these years, you might be happily surprised in finding that your answer to such a question is “Yes!”

Acknowledgments

I am grateful to the students and faculty at the computer science departments of RWTH Aachen University, Heinrich Heine University Düsseldorf, and the University of Rochester. I “test drove” this chapter at each of those schools in the form of a lecture or lecture series. Particular thanks go to Peter Rossmanith, Jörg Rothe, and Muthu Venkitasubramaniam, who invited me to speak, and to Gerhard Woeginger regarding the counterexample in Appendix D. My warm thanks to Ding-Zhu Du, Bin Liu, and Jie Wang for inviting me to contribute to this project that they have organized in memory of the wonderful Ker-I Ko, whose work contributed so richly to the beautiful, ever-growing tapestry that is complexity theory.



Appendices

Appendix A Solution to Challenge Problem 1

Before we start on the proof, let us put up a figure that shows the flavor of a structure that we will use to help us understand and exploit $\mathrm{SAT}$’s self-reducibility. The structure is known as the self-reducibility tree of a formula. At the root of this tree sits the formula. At the next level as the root’s children, we have the formula with its first variable assigned to $\mathrm{True}$ and to $\mathrm{False}$. At the level below that, we have the two formulas from the second level, except with each of their first variables (i.e., the second variable of the original formula) assigned to both $\mathrm{True}$ and $\mathrm{False}$. Figure 1 shows the self-reducibility tree of a two-variable formula.

Figure 1: The self-reducibility tree (completely unpruned) of a two-variable formula, represented generically.

Self-reducibility tells us that, for each node $N$ in such a self-reducibility tree (except the leaves, since they have no children), $N$ is satisfiable if and only if at least one of its two children is satisfiable. Inductively, the formula at the root of the tree is satisfiable if and only if at least one leaf of the self-reducibility tree is satisfiable. And, also, the formula at the root of the tree is satisfiable if and only if every level of the self-reducibility tree has at least one satisfiable node.

How helpful is this tree? Well, we certainly don’t want to solve $\mathrm{SAT}$ by checking every leaf of the self-reducibility tree. On formulas with $k$ variables, that would take time at least $2^k$, basically a brute-force exponential-time algorithm. Yuck! That isn’t surprising though. After all, the tree is really just listing all assignments to the formula.

But the magic here, which we will exploit, is that the “self-reducibility” relationship between nodes and their children as to satisfiability will, at least with certain extra assumptions such as about P-selectivity, allow us to not explore the whole tree. Rather, we’ll be able to prune away, quickly, all but a polynomially large subtree. In fact, though on its surface this chapter is about four questions from complexity theory, it really is about tree-pruning—a topic more commonly associated with algorithms than with complexity. To us, though, that is not a problem but an advantage. As we mentioned earlier, complexity is largely about building algorithms, and that helps make complexity far more inviting and intuitive than most people realize.

That being said, let us move on to giving a proof of the first challenge problem. Namely, in this section we sketch a proof of the result:

If $\mathrm{SAT}$ is $\mathrm{P}$-selective, then $\mathrm{SAT} \in \mathrm{P}$.

So assume that $\mathrm{SAT}$ is P-selective, via (in the sense of Definition 2) polynomial-time computable function $f$. Let us give a polynomial-time algorithm for $\mathrm{SAT}$. Suppose the input to our algorithm is the formula $F(x_1, x_2, \ldots, x_k)$. (If the input is not a syntactically legal formula we immediately reject, and if the input is a formula that has zero variables, e.g., $\mathrm{True} \land \mathrm{False}$, we simply evaluate it and accept if and only if it evaluates to $\mathrm{True}$.) Let us focus on $F$ and $F$’s two children in the self-reducibility tree, as shown in Figure 2.

Figure 2: $F$ and $F$’s two children.

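Now, run $f$ on $F$’s two children. That is, compute, in polynomial time, $f(F(\mathrm{True}, x_2, \ldots, x_k), F(\mathrm{False}, x_2, \ldots, x_k))$. Due to the properties of P-selectivity and self-reducibility, note that the output of that application of $f$ is a formula/node that has the property that the original formula is satisfiable if and only if that child-node is satisfiable.

In particular, if $f(F(\mathrm{True}, x_2, \ldots, x_k), F(\mathrm{False}, x_2, \ldots, x_k)) = F(\mathrm{True}, x_2, \ldots, x_k)$ then we know that $F(x_1, x_2, \ldots, x_k)$ is satisfiable if and only if $F(\mathrm{True}, x_2, \ldots, x_k)$ is satisfiable. And if $f(F(\mathrm{True}, x_2, \ldots, x_k), F(\mathrm{False}, x_2, \ldots, x_k)) = F(\mathrm{False}, x_2, \ldots, x_k)$ then we know that $F(x_1, x_2, \ldots, x_k)$ is satisfiable if and only if $F(\mathrm{False}, x_2, \ldots, x_k)$ is satisfiable.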

Either way, we have in time polynomial in the input’s size eliminated the need to pay attention to one of the two child nodes, and now may focus just on the other one.

Repeat the above process on the child that, as per the above, was selected by the selector function. Now, “split” that formula by assigning $x_2$ both possible ways. That will create two children, and then analogously to what was done above, use the selector function to decide which of those two children is the more promising branch to follow.

Repeat this until we have assigned all variables. We now have a fully assigned formula, but due to how we got to it, we know that it evaluates to $\mathrm{True}$ if and only if the original formula is satisfiable. So if that fully assigned formula evaluates to $\mathrm{True}$, then we state that the original formula is satisfiable (and indeed, our path down the self-reducibility tree has outright put into our hands a satisfying assignment). And, more interestingly, if the fully assigned formula evaluates to $\mathrm{False}$, then we state that the original formula is not satisfiable. We are correct in stating that, because at each iterative stage we know that if the formula we start that stage focused on is satisfiable, then the child the selector function chooses for us will also be satisfiable.

The process above is an at most polynomial number of at most polynomial-time “descend one level having made a linkage” stages, and so overall itself runs in polynomial time. Thus we have given a polynomial-time algorithm for $\mathrm{SAT}$, under the hypothesis that $\mathrm{SAT}$ is P-selective. This completes the proof sketch.
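The single-path walk described in this proof sketch can be illustrated in Python as follows. This is a minimal sketch under loud assumptions: formulas are represented as pairs (number of variables, evaluation function), and `toy_selector` is a brute-force, exponential-time stand-in for the hypothesized polynomial-time P-selector (no such polynomial-time selector is known). Only `decide_sat` mirrors the procedure of the proof.

```python
from itertools import product

def restrict(formula, value):
    """Child formula obtained by fixing the first remaining variable to `value`."""
    num_vars, evaluate = formula
    return (num_vars - 1, lambda rest: evaluate((value,) + tuple(rest)))

def brute_force_sat(formula):
    num_vars, evaluate = formula
    return any(evaluate(bits) for bits in product([True, False], repeat=num_vars))

def toy_selector(a, b):
    """Stand-in selector (assumption): returns a satisfiable input whenever one exists."""
    return a if brute_force_sat(a) else b

def decide_sat(formula, selector):
    """Walk one root-to-leaf path, letting the selector choose the child at each level."""
    node, selector_calls = formula, 0
    while node[0] > 0:
        node = selector(restrict(node, True), restrict(node, False))
        selector_calls += 1
    # All variables are assigned: evaluate the fully assigned formula.
    return bool(node[1](())), selector_calls

F = (3, lambda v: (v[0] or v[1]) and ((not v[0]) or v[2]) and (not v[2]))
G = (1, lambda v: v[0] and (not v[0]))
print(decide_sat(F, toy_selector), decide_sat(G, toy_selector))
```

The selector is called once per variable, which is the "single path through the tree" behavior the proof relies on.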

Our algorithm was mostly focused on tree pruning. Though $F$ induces a giant binary tree as to doing variable assignments one variable at a time in all possible ways, thanks to the guidance of the selector function, we walked just a single path through that tree.

Keeping this flavor of approach in mind might be helpful on Challenge Problem 2, although that is a different problem and so perhaps you’ll have to bring some new twist, or greater flexibility, to what you do to tackle that.

And now, please pop right on back to the main body of the chapter, to read and tackle Challenge Problem 2!

Appendix B Solution to Challenge Problem 2

In this section we sketch a proof of the result:

If there exists a tally set $T$ such that $\mathrm{SAT} \le_m^p T$, then $\mathrm{SAT} \in \mathrm{P}$.

So assume that there exists a tally set $T$ such that $\mathrm{SAT} \le_m^p T$. Let $g$ be the polynomial-time computable function performing that reduction, in the sense of Definition 4. (Keep in mind that we may not assume that $T \in \mathrm{P}$. We have no argument line in hand that would tell us that that happens to be true.) Let us give a polynomial-time algorithm for $\mathrm{SAT}$.

Suppose the input to our algorithm is the formula $F(x_1, x_2, \ldots, x_k)$. (If the input is not a syntactically legal formula we immediately reject, and if the input is a formula that has zero variables we simply evaluate it and accept if and only if it evaluates to $\mathrm{True}$.)

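Let us focus first on $F$. Compute, in polynomial time, $g(F)$. If $g(F)$ is not a tally string, then clearly $F \notin \mathrm{SAT}$, since we know that (a) $T$ contains only tally strings and (b) $F \in \mathrm{SAT} \iff g(F) \in T$. So in that case, we output that $F \notin \mathrm{SAT}$. Otherwise, we descend to the next level of the “self-reducibility tree” as follows.

We consider the nodes (i.e., in this case, formulas) $F(\mathrm{True}, x_2, \ldots, x_k)$ and $F(\mathrm{False}, x_2, \ldots, x_k)$. Compute $g(F(\mathrm{True}, x_2, \ldots, x_k))$ and $g(F(\mathrm{False}, x_2, \ldots, x_k))$. If either of our two nodes in question does not, under the action just computed of $g$, map to a tally string, then that node certainly is not a satisfiable formula, and we can henceforward mentally ignore it and the entire tree (created by assigning more of its variables) rooted at it. This is one key type of pruning that we will use: eliminating from consideration nodes that map to “nontally” strings.

But there is a second type of pruning that we will use: If it happens to be the case that $g(F(\mathrm{True}, x_2, \ldots, x_k))$ is a tally string and $g(F(\mathrm{True}, x_2, \ldots, x_k)) = g(F(\mathrm{False}, x_2, \ldots, x_k))$, then at this point it may not be clear to us whether $F$ is or is not satisfiable. However, what is clear is that

$F(\mathrm{True}, x_2, \ldots, x_k) \in \mathrm{SAT} \iff F(\mathrm{False}, x_2, \ldots, x_k) \in \mathrm{SAT}.$

How do we know this? Since $g$ reduces $\mathrm{SAT}$ to $T$, we know that

$F(\mathrm{True}, x_2, \ldots, x_k) \in \mathrm{SAT} \iff g(F(\mathrm{True}, x_2, \ldots, x_k)) \in T$

and

$F(\mathrm{False}, x_2, \ldots, x_k) \in \mathrm{SAT} \iff g(F(\mathrm{False}, x_2, \ldots, x_k)) \in T.$

By those observations, the fact that $g(F(\mathrm{True}, x_2, \ldots, x_k)) = g(F(\mathrm{False}, x_2, \ldots, x_k))$, and the transitivity of “$\iff$”, we indeed have that $F(\mathrm{True}, x_2, \ldots, x_k) \in \mathrm{SAT} \iff F(\mathrm{False}, x_2, \ldots, x_k) \in \mathrm{SAT}$. But since that says that either both or neither of these nodes is a formula belonging to $\mathrm{SAT}$, there is no need at all for us to further explore more than one of them, since they stand or fall together as to membership in $\mathrm{SAT}$. So if we have $g(F(\mathrm{True}, x_2, \ldots, x_k)) = g(F(\mathrm{False}, x_2, \ldots, x_k))$, we can mentally dismiss $F(\mathrm{False}, x_2, \ldots, x_k)$, and of course also the entire subtree rooted at it, from all further consideration.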

After doing the two types of pruning just mentioned, we will have either one or two nodes left at the level of the tree (the level one down from the root) that we are considering. (If we have zero nodes left, we have pruned away all possible paths and can safely reject.) Also, if $k = 1$, then we can simply check whether at least one node that has not been pruned away evaluates to $\mathrm{True}$, and if so we accept and if not we reject.

But what we have outlined can iteratively be carried out in a way that drives us right down through the tree, one level at a time. At each level, we take all nodes (i.e., formulas; we will speak interchangeably of the node and the formula that it is representing) that have not yet been eliminated from consideration, and for each, take the next unassigned variable and make two child formulas, one with that variable assigned to $\mathrm{True}$ and one with that variable assigned to $\mathrm{False}$. So if at a given level after pruning we end up with $j$ formulas, we in this process start the next level with $2j$ formulas, each with one fewer variable. Then for those $2j$ formulas we do the following: For each of them, if $g$ applied to that formula outputs a string that is not a tally string, then eliminate that node from all further consideration. After all, the node clearly is not a satisfiable formula. Also, for all nodes among the $2j$ such that the string that $g$ maps them to is a tally string and is mapped to by $g$ by at least one other of the $2j$ nodes, for each such cluster of nodes that map to the same (tally) string eliminate all but one of the nodes from consideration. After all, by the argument given above, either all of that cluster are satisfiable or none of them are, so we can eliminate all but one from consideration, since eliminating all the others still leaves one that is satisfiable, if in fact the nodes in the cluster are satisfiable.

Continue this process until (it internally terminates with a decision, or) we reach a level where all variables are assigned. If there were $j$ nodes at the level above that after pruning, then at this no-variables-left-to-assign level we have at most $2j$ formulas. The construction is such that $F \in \mathrm{SAT}$ if and only if at least one of these at most $2j$ variable-free formulas belongs to $\mathrm{SAT}$, i.e., evaluates to $\mathrm{True}$. But we can easily check that in time polynomial in $j$ and $|F|$.

Is the proof done? Not yet. If $j$ can be huge, we’re dead, as we might have just sketched an exponential-time algorithm. But fortunately, and this was the key insight in Piotr Berman’s paper that proved this result, as we go down the tree, level by level, the tree never grows too wide. In particular, it is at most polynomially wide!

How can we know this? The insight that Berman (and with luck, also you!) had is that there are not many “tally” strings that can be reached by the reduction $g$ on the inputs that it will be run on in our construction on a given input. And that fact will ensure us that after we do our two kinds of pruning, we have at most polynomially many strings left at the now-pruned level.

Let us be more concrete about this, since it is not just the heart of this problem’s solution, but also might well (hint!, hint!) be useful when tackling the third challenge problem.

In particular, we know that $g$ is polynomial-time computable. So there certainly is some natural number $c$ such that, for each natural number $n$, the function $g$ runs in time at most $n^c + c$ on all inputs of length $n$. Let $m = |F|$. Note that, at least if the encoding scheme is reasonable and we if needed do reasonable, obvious simplifications (e.g., $\mathrm{True} \lor y \equiv \mathrm{True}$, $\mathrm{True} \land y \equiv y$, $\neg\mathrm{True} \equiv \mathrm{False}$, and $\neg\mathrm{False} \equiv \mathrm{True}$), then each formula in the tree is of length less than or equal to $m$. Crucially, $g$ applied to strings of length less than or equal to $m$ can never output any string of length greater than $m^c + c$. And so there are at most $m^c + c + 1$ tally strings (the “+ 1” is because the empty string is one of the strings that can be reached) that can be mapped to by any of the nodes that are part of our proof’s self-reducibility tree when the input is $F$. So at each level of our tree-pruning, we eliminate all nodes that map to strings that are not tally strings, and since we leave at most one node mapping to each tally string that is mapped to, and as we just argued that there are at most $m^c + c + 1$ of those, at the end of pruning a given level, at most $m^c + c + 1$ nodes are still under consideration. But $m$ is the length of our problem’s input, so each level, after pruning, finishes with at most $m^c + c + 1$ nodes, and so the level after it, after we split each of the current level’s nodes, will begin with at most $2(m^c + c + 1)$ nodes. And after pruning that level, it too ends up with at most $m^c + c + 1$ nodes still in play. The tree indeed remains at most polynomially wide.

Thus when we reach the “no variables left unassigned” level, we come into it with a polynomial-sized set of possible satisfying assignments (namely, a set of at most $2(m^c + c + 1)$ assignments), and we know that the original formula is satisfiable if and only if at least one of these assignments satisfies $F$.

Thus the entire algorithm is a polynomial number of rounds (one per variable eliminated), each taking polynomial time. So overall it is a polynomial-time algorithm that correctly decides $\mathrm{SAT}$. This completes the proof sketch.
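The level-by-level pruning of this proof sketch can be illustrated in Python. This is a minimal sketch under stated assumptions: formulas are pairs (number of variables, evaluation function); tally strings are taken, for this illustration only, to be strings over the one-letter alphabet {0}; and `toy_reduction` is a brute-force, exponential-time stand-in for the hypothesized polynomial-time reduction to a tally set (it maps satisfiable formulas to nonempty strings of 0s and unsatisfiable ones to the non-tally string "1"). Only `decide_sat_via_tally` mirrors the procedure of the proof.

```python
from itertools import product

def restrict(formula, value):
    """Child formula obtained by fixing the first remaining variable to `value`."""
    num_vars, evaluate = formula
    return (num_vars - 1, lambda rest: evaluate((value,) + tuple(rest)))

def is_tally(s):
    return set(s) <= {"0"}

def toy_reduction(formula):
    """Stand-in reduction (assumption): '0' * (#solutions) if satisfiable, else '1'."""
    num_vars, evaluate = formula
    count = sum(1 for b in product([True, False], repeat=num_vars) if evaluate(b))
    return "0" * count if count > 0 else "1"

def decide_sat_via_tally(formula, reduction):
    if formula[0] == 0:
        return bool(formula[1](()))
    if not is_tally(reduction(formula)):
        return False
    level = [formula]
    while True:
        children = [restrict(node, b) for node in level for b in (True, False)]
        if children[0][0] == 0:  # all variables assigned: just evaluate
            return any(child[1](()) for child in children)
        survivors = {}
        for child in children:
            label = reduction(child)
            # Pruning 1: drop non-tally labels. Pruning 2: keep one node per label.
            if is_tally(label) and label not in survivors:
                survivors[label] = child
        if not survivors:
            return False
        level = list(survivors.values())

F = (3, lambda v: (v[0] or v[1]) and ((not v[0]) or v[2]) and (not v[2]))
G = (2, lambda v: v[0] and (not v[0]))
print(decide_sat_via_tally(F, toy_reduction), decide_sat_via_tally(G, toy_reduction))
```

After pruning, the width of each level is bounded by the number of distinct tally labels the reduction can produce, which is the source of the polynomial bound in the proof.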

And now, please pop right on back to the main body of the chapter, to read and tackle Challenge Problem 3! While doing so, please keep this proof in mind, since doing so will be useful on Challenge Problem 3… though you also will need to discover a quite cool additional insight—the same one Steve Fortune discovered when he originally proved the theorem that is our Challenge Problem 3.

Appendix C Solution to Challenge Problem 3

In this section we sketch a proof of the result:

If there exists a sparse set $S$ such that $\overline{\mathrm{SAT}} \le_m^p S$, then $\mathrm{SAT} \in \mathrm{P}$.

So assume that there exists a sparse set $S$ such that $\overline{\mathrm{SAT}} \le_m^p S$. Let $g$ be the polynomial-time computable function performing that reduction, in the sense of Definition 4. (Keep in mind that we may not assume that $S \in \mathrm{P}$. We have no argument line in hand that would tell us that that happens to be true.) Let us give a polynomial-time algorithm for $\mathrm{SAT}$.

Suppose the input to our algorithm is the formula $F(x_1, x_2, \ldots, x_k)$. (If the input is not a syntactically legal formula we immediately reject, and if the input is a formula that has zero variables we simply evaluate it and accept if and only if it evaluates to $\mathrm{True}$.)

What we are going to do here is that we are going to mimic the proof that solved Challenge Problem 2. We are going to go level by level down the self-reducibility tree, pruning at each level, and arguing that the tree never gets too wide—at least if we are careful and employ a rather jolting insight that Steve Fortune (and with luck, also you!) had.

Note that of the two types of pruning we used in the Challenge Problem 2 proof, one applies perfectly well here. If two or more nodes on a given level of the tree map under $g$ to the same string, we can eliminate from consideration all but one of them, since either all of them or none of them are satisfiable.

However, the other type of pruning (eliminating all nodes not mapping to a tally string) completely disappears here. Sparse sets don’t have too many strings per level, but the strings are not trapped to always being of a specific, well-known form.

Is the one type of pruning that is left to us enough to keep the tree from growing exponentially bushy as we go down it? At first glance, it seems that exponential width growth is very much possible, e.g., imagine the case that every node of the tree maps to a different string than all the others at the node’s same level. Then with each level our tree would be doubling in size, and by its base, if we started with $k$ variables, we’d have $2^k$ nodes at the base level, clearly an exponentially bushy tree.

But Fortune stepped back and realized something lovely. He realized that if the tree ever became too bushy, then that itself would be an implicit proof that $F$ is satisfiable! Wow; mind-blowing!

In particular, Fortune used the following beautiful reasoning.

We know $g$ runs in polynomial time. So let the polynomial $p(n)$ bound $g$’s running time on inputs of length $n$, and without loss of generality, assume that $p$ is nondecreasing. We know that $S$ is sparse, so let the polynomial $q(n)$ bound the number of strings in $S$ up to and including length $n$, and without loss of generality, assume that $q$ is nondecreasing.

Let $m = |F|$, and as before, note that all the nodes in our proof are of length less than or equal to $m$.

How many distinct strings in $S$ can be reached by applying $g$ to strings of length at most $m$? On inputs of length at most $m$, clearly $g$ maps to strings of length at most $p(m)$. But note that the number of strings in $S$ of length at most $p(m)$ is at most $q(p(m))$.

Now, there are two cases. Suppose that at each level of our tree we have, after pruning, at most $q(p(m))$ nodes left active. Since $q(p(m))$ itself is a polynomial in the input size, $m$, that means our tree remains at most polynomially bushy (since levels of our tree are never, even right after splitting a level’s nodes to create the next level, wider than $2q(p(m))$). Analogously to the argument of Challenge Problem 2’s proof, when we reach the “all variables assigned” level, we enter it with a set of at most $2q(p(m))$ no-variables-left formulas such that $F$ is satisfiable if and only if at least one of those formulas evaluates to $\mathrm{True}$. So in that case, we easily do compute in polynomial time whether the given input is satisfiable, analogously to the previous proof.

On the other hand, suppose that on some level, after pruning, we have at least $q(p(m)) + 1$ nodes. This means that at that level, we had at least $q(p(m)) + 1$ distinct labels. But there are only $q(p(m))$ distinct strings that $g$ can possibly reach, on our inputs, that belong to $S$. So at least one of the formulas in our surviving nodes maps to a string that does not belong to $S$. But $g$ was a reduction from $\overline{\mathrm{SAT}}$ to $S$, so that node that mapped to a string that does not belong to $S$ must itself be a satisfiable formula. Ka-zam! That node is satisfiable, and yet that node is simply $F$ with some of its variables fixed. And so $F$ itself certainly is satisfiable. We are done, and so the moment our algorithm finds a level that has $q(p(m)) + 1$ distinct labels, our algorithm halts and declares that $F$ is satisfiable.

Note how subtle the action here is. The algorithm is correct in reasoning that, when we have at least $q(p(m)) + 1$ distinct labels at a level, at least one of the still-live nodes at that level must be satisfiable, and thus $F$ is satisfiable. However, the algorithm doesn’t know a particular one of those at least $q(p(m)) + 1$ nodes that it can point to as being satisfiable. It merely knows that at least one of them is. And that is enough to allow the algorithm to act correctly. (One can, if one wants, extend the above approach to actually drive onward to the base of the tree; what one does is that at each level, the moment one gets to $q(p(m)) + 1$ distinct labels, one stops handling that level, and goes immediately on to the next level, splitting each of those $q(p(m)) + 1$ nodes into two at the next level. This works since we know that at least one of the $q(p(m)) + 1$ nodes is satisfiable, and so we have ensured that at least one node at the next level will be satisfiable.) This completes the proof sketch.
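The "too many distinct labels means satisfiable" idea can be sketched in Python as follows. This is only an illustration under loud assumptions: formulas are pairs (number of variables, evaluation function); `toy_reduction` is a brute-force, exponential-time stand-in for the hypothesized reduction from the complement of SAT to a sparse set (here the toy sparse set contains just the empty string, so unsatisfiable formulas map to "" and satisfiable ones map elsewhere); and the parameter `bound` plays the role of the number of strings of the sparse set that the reduction can reach.

```python
from itertools import product

def restrict(formula, value):
    """Child formula obtained by fixing the first remaining variable to `value`."""
    num_vars, evaluate = formula
    return (num_vars - 1, lambda rest: evaluate((value,) + tuple(rest)))

def toy_reduction(formula):
    """Stand-in (assumption): unsatisfiable -> '' (in the toy sparse set), else outside it."""
    num_vars, evaluate = formula
    count = sum(1 for b in product([True, False], repeat=num_vars) if evaluate(b))
    return "" if count == 0 else "1" + format(count, "b")

def decide_sat_via_sparse(formula, reduction, bound):
    level = [formula]
    while level[0][0] > 0:
        children = [restrict(node, b) for node in level for b in (True, False)]
        if children[0][0] == 0:  # all variables assigned: just evaluate
            return any(child[1](()) for child in children)
        survivors = {}
        for child in children:
            survivors.setdefault(reduction(child), child)  # one node per label
        if len(survivors) > bound:
            # More distinct labels than reachable strings of the sparse set:
            # some node maps outside the set, so that node is satisfiable.
            return True
        level = list(survivors.values())
    return bool(formula[1](()))

F = (2, lambda v: v[0] and v[1])
G = (2, lambda v: v[0] and (not v[0]))
H = (2, lambda v: (v[0] or (not v[0])) and v[1])
print([decide_sat_via_sparse(x, toy_reduction, 1) for x in (F, G, H)])
```

On F the algorithm stops early because two distinct labels exceed the bound of 1; on G and H the width stays within the bound and the answer is read off at the bottom level.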

And now, please pop right on back to the main body of the chapter, to read and tackle Challenge Problem 4! There, you’ll be working within a related but changed and rather challenging setting: you’ll be working in the realms of functions and counting. Buckle up!

Appendix D Solution to Challenge Problem 4

D.1 Why One Natural Approach Is Hopeless

One natural approach would be to run the hypothetical 2-enumerator on the input formula $F$ and both of $F$’s $x_1$-assigned subformulas, and to argue that purely based on the two options that the enumerator gives for each of those three, i.e., viewing the formulas for a moment as black boxes (note: without loss of generality, we may assume that each of the three applications of the 2-enumerator has two distinct outputs; the other cases are even easier), we can either output the number of solutions of $F$ or can identify at least one of the subformulas such that we can show a particular 1-to-1 linkage between which of the two predicted numbers of solutions it has and which of the two predicted numbers of solutions $F$ has. And then we would iteratively walk down the tree, doing that.

But the following example, based on one suggested by Gerhard Woeginger, shows that that is impossible. Suppose the enumerator predicts outputs $\{0, 1\}$ for $F$, and predicts outputs $\{0, 1\}$ for the left subformula, and predicts outputs $\{0, 1\}$ for the right subformula. That is, for each, it says “this formula either has zero satisfying assignments or has exactly one satisfying assignment.” In this case, note that the values of the root can’t be, based solely on the numbers the enumerator output, linked 1-to-1 to those of the left subformula, since 0 solutions for the left subformula can correspond to a root value of 0 ($0 + 0$) or to a root value of 1 ($0 + 1$). The same clearly also holds for the right subformula.

The three separate number-pairs just don’t have enough information to make the desired link! But don’t despair: we can make the enumerator help us far more powerfully than was done above!
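The obstacle can be checked mechanically. The following is a minimal Python sketch, not part of the original argument: it treats the three outputs as black boxes, assumes each of the three pairs is {0, 1} as in the example above, and uses hypothetical helper names (`consistent_triples`, `child_determines_root`). It lists every way of picking one value per pair so that the root count equals the sum of the two subformula counts, and then asks whether a subformula's value pins down the root's value.

```python
from itertools import product

def consistent_triples(root_opts, left_opts, right_opts):
    """All (root, left, right) picks, one per pair, with root == left + right."""
    return [(a, b, c)
            for a, b, c in product(root_opts, left_opts, right_opts)
            if a == b + c]

def child_determines_root(triples, child_index):
    """True if every value of the chosen child is paired with only one root value."""
    seen = {}
    for t in triples:
        root, child = t[0], t[child_index]
        if seen.setdefault(child, root) != root:
            return False
    return True

triples = consistent_triples((0, 1), (0, 1), (0, 1))
print(triples)  # [(0, 0, 0), (1, 0, 1), (1, 1, 0)]
print(child_determines_root(triples, 1),
      child_determines_root(triples, 2))  # False False
```

Neither subformula's value determines the root's value, so the three separate pairs alone cannot supply the linkage.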

D.2 XYZ Idea/Statement

To get around the obstacle just mentioned, we can try to trick the enumerator into giving us linked/coordinated guesses! Let us see how to do that.

What I was thinking of, when I mentioned XYZ in the food-for-thought hint (Section 6.2), is the fact that we can efficiently combine two Boolean formulas into a new one such that from the number of satisfying assignments of the new formula we can easily “read off” the number of satisfying assignments of both the original formulas. In fact, it turns out that we can do the combining in such a way that if we concatenate the (appropriately padded as needed) bitstrings capturing the numbers of solutions of the two formulas, we get the (appropriately padded as needed) bitstring capturing the number of solutions of the new “combined” formula. We will, when $F$ is a Boolean formula, use $\|F\|$ to denote the number of satisfying assignments of $F$.

Lemma 1.

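There are polynomial-time computable functions $\mathit{combiner}$ and $\mathit{decoder}$ such that for any Boolean formulas $F$ and $G$, $\mathit{combiner}(F, G)$ is a Boolean formula and $\mathit{decoder}(F, G, \|\mathit{combiner}(F, G)\|)$ prints $\|F\|, \|G\|$.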

Proof Sketch.

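Let $F = F(x_1, \ldots, x_n)$ and $G = G(y_1, \ldots, y_m)$, where $x_1, \ldots, x_n, y_1, \ldots, y_m$ are all distinct. Let $z$ and $z'$ be two new Boolean variables. Then

$$H = (F \wedge z) \vee (\overline{z} \wedge x_1 \wedge \cdots \wedge x_n \wedge G \wedge z')$$

gives the desired combination, since $\|H\| = \|F\| \cdot 2^{m+1} + \|G\|$ and $\|G\| \leq 2^m$. ∎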

We can easily extend this technique to combine three, four, or even polynomially many formulas.
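To see the combining lemma in action, here is a small Python sketch. It is illustrative only and rests on assumptions that are not in the text: a formula is represented as a (predicate, number-of-variables) pair, `count` is a brute-force (exponential-time) counter used purely for checking, and the combination is assumed to have the shape $(F \wedge z) \vee (\overline{z} \wedge x_1 \wedge \cdots \wedge x_n \wedge G \wedge z')$, so that the combined count is $\|F\| \cdot 2^{m+1} + \|G\|$.

```python
from itertools import product

def count(formula):
    """Brute-force number of satisfying assignments (exponential; checking only)."""
    pred, n = formula
    return sum(1 for bits in product((False, True), repeat=n) if pred(bits))

def combiner(f, g):
    """Build H = (F and z) or (not z and x_1 and ... and x_n and G and z')."""
    f_pred, n = f
    g_pred, m = g

    def h_pred(bits):
        xs, ys = bits[:n], bits[n:n + m]
        z, z2 = bits[n + m], bits[n + m + 1]
        return (f_pred(xs) and z) or ((not z) and all(xs) and g_pred(ys) and z2)

    return (h_pred, n + m + 2)

def decoder(f, g, h_count):
    """Split ||H|| = ||F|| * 2**(m+1) + ||G|| back into (||F||, ||G||)."""
    _, m = g
    return divmod(h_count, 2 ** (m + 1))

F = (lambda b: b[0] or b[1], 2)   # x1 or x2: 3 satisfying assignments
G = (lambda b: b[0] != b[1], 2)   # y1 xor y2: 2 satisfying assignments
H = combiner(F, G)
print(count(H))                   # 26, i.e. 3 * 2**3 + 2
print(decoder(F, G, count(H)))    # (3, 2)
```

The high-order part of the combined count carries $\|F\|$ and the low-order $m+1$ bits carry $\|G\|$, which is the "concatenated bitstrings" picture described above.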

D.3 Invitation to a Second Bite at the Apple

Now that you have in hand the extra tool that is Lemma 1, this would be a great time, unless you already found a solution to the fourth challenge problem, to try again to solve the problem. My guess is that if you did not already solve the fourth challenge problem, then the ideas you had while trying to solve it will stand you in good stead when you revisit the problem with the combining lemma in hand.

My suggestion to you would be to work again on proving Challenge Problem 4 until either you find a proof, or you’ve put in at least 15 more minutes of thought, are stuck, and don’t think that more time will be helpful.

When you’ve reached one or the other of those states, please go on to Section D.4 to read a proof of the theorem.

D.4 Proof Sketch of the Theorem

Recall that we are trying to prove:

If #SAT has a polynomial-time 2-enumerator, then there is a polynomial-time algorithm for #SAT.

Here is a quick proof sketch. Start with our input formula, $F$, whose number of solutions we wish to compute in polynomial time.

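If $F$ has no variables, we can simply directly output the right number of solutions, namely, 1 (if $F$ evaluates to True), or 0 (otherwise). Otherwise, self-reduce formula $F$ on its first variable. Using the XYZ trick, twice, combine the original formula and the two subformulas into a single formula, $H$, whose number of solutions gives the number of solutions of all three. For example, if our three formulas are $F(x_1, x_2, \ldots, x_n)$, $F(\mathrm{True}, x_2, \ldots, x_n)$, and $F(\mathrm{False}, x_2, \ldots, x_n)$, our combined formula can be

$$H = \mathit{combiner}\bigl(F(x_1, x_2, \ldots, x_n),\ \mathit{combiner}(F(\mathrm{True}, x_2, \ldots, x_n),\ F(\mathrm{False}, x_2, \ldots, x_n))\bigr),$$

where $\mathit{combiner}$ is the combining function of Lemma 1, and the decoding process is clear from this and Lemma 1 (and its proof). Run the 2-enumerator on $H$. If either of the two decoded guesses in the enumerator’s output is inconsistent ($\|F\| \neq \|F(\mathrm{True}, x_2, \ldots, x_n)\| + \|F(\mathrm{False}, x_2, \ldots, x_n)\|$), then ignore that line and the other one is the truth. If both are consistent and agree on $\|F\|$, then we’re also done. Otherwise, the two guesses must each be internally consistent and the two guesses must disagree on $\|F\|$, and so it follows that the two guesses differ in their claims about at least one of $\|F(\mathrm{True}, x_2, \ldots, x_n)\|$ and $\|F(\mathrm{False}, x_2, \ldots, x_n)\|$. Thus if we know the number of solutions of that one, shorter formula, we know the number of solutions of $F$.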

Repeat the above on that formula, and so on, right on down the tree, and then (unless the process resolves internally or ripples back up earlier) at the end we have reached a zero-variable formula and for it we by inspection will know how many solutions it has (either 1 or 0), and so using that we can ripple our way all the way back up through the tree, using our linkages between each level and the next, and thus we now have computed $\|F\|$. The entire process is a polynomial number of polynomial-time actions, and so runs in polynomial time overall.

That ends the proof sketch, but let us give an example regarding the key step from the proof sketch, as that will help make clear what is going on.

Which of the guesses    $\|F\|$    $\|F(\mathrm{True}, x_2, \ldots, x_n)\|$    $\|F(\mathrm{False}, x_2, \ldots, x_n)\|$
First                   100        83                                          17
Second                  101        85                                          16

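In this example, note that we can conclude that $\|F\| = 100$ if $\|F(\mathrm{True}, x_2, \ldots, x_n)\| = 83$, and $\|F\| = 101$ if $\|F(\mathrm{True}, x_2, \ldots, x_n)\| = 85$; and we know that $\|F(\mathrm{True}, x_2, \ldots, x_n)\| \in \{83, 85\}$.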

So we have in polynomial time completely linked $\|F\|$ to the issue of the number of satisfying assignments of the (after simplifying) shorter formula $F(\mathrm{True}, x_2, \ldots, x_n)$. This completes our example of the key linking step.
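The proof sketch and the linking example above can be put together in a short Python sketch. It is a toy under stated assumptions, not the chapter's algorithm verbatim: formulas are (predicate, number-of-variables) pairs; the combining and decoding steps are abstracted away, so the hypothetical `oracle` directly returns two already-decoded guesses, each a triple (count of the formula, count with first variable True, count with first variable False), at least one of which is correct; and `toy_oracle` cheats by brute-force counting, purely so the control flow can be exercised.

```python
from itertools import product

def count(formula):
    """Brute-force count; used only by the toy oracle below."""
    pred, n = formula
    return sum(1 for bits in product((False, True), repeat=n) if pred(bits))

def fix_first(formula, value):
    """Self-reduce: assign `value` to the first variable."""
    pred, n = formula
    return (lambda bits: pred((value,) + tuple(bits)), n - 1)

def link_step(guesses):
    """Return ('known', root) or ('link', child_index, {child_count: root_count})."""
    ok = [g for g in guesses if g[0] == g[1] + g[2]]  # drop inconsistent lines
    if len({g[0] for g in ok}) == 1:                  # survivors agree on the root
        return ("known", ok[0][0])
    g1, g2 = ok                                       # consistent, roots disagree
    idx = 1 if g1[1] != g2[1] else 2                  # a child on which they differ
    return ("link", idx, {g1[idx]: g1[0], g2[idx]: g2[0]})

def solve(formula, oracle):
    """Walk down one path of the tree, then ripple the answer back up."""
    links, cur = [], formula
    while True:
        pred, n = cur
        if n == 0:                                    # zero-variable formula
            value = 1 if pred(()) else 0
            break
        kids = (fix_first(cur, True), fix_first(cur, False))
        step = link_step(oracle(cur, kids))
        if step[0] == "known":                        # resolved internally
            value = step[1]
            break
        _, idx, table = step
        links.append(table)
        cur = kids[idx - 1]
    for table in reversed(links):                     # ripple back up
        value = table[value]
    return value

def toy_oracle(formula, kids):
    """Hypothetical stand-in: the truth plus a consistent but wrong decoy."""
    a, b, c = count(formula), count(kids[0]), count(kids[1])
    return [(a + 1, b + 1, c), (a, b, c)]

print(link_step([(100, 83, 17), (101, 85, 16)]))  # ('link', 1, {83: 100, 85: 101})
F = (lambda b: (b[0] or b[1]) and not b[2], 3)
print(solve(F, toy_oracle))                       # 3
```

Each loop iteration makes one oracle call and shortens the formula by one variable, which mirrors the "polynomial number of polynomial-time actions" count in the proof sketch (given a genuinely polynomial-time enumerator rather than the brute-force stand-in).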

References

  • [AHH93] V. Arvind, Y. Han, L. Hemachandra, J. Köbler, A. Lozano, M. Mundhenk, M. Ogiwara, U. Schöning, R. Silvestri, and T. Thierauf. Reductions to sets of low information content. In K. Ambos-Spies, S. Homer, and U. Schöning, editors, Complexity Theory, pages 1–45. Cambridge University Press, 1993.
  • [Ber78] P. Berman. Relationship between density and deterministic complexity of NP-complete languages. In Proceedings of the 5th International Colloquium on Automata, Languages, and Programming, pages 63–71. Springer-Verlag Lecture Notes in Computer Science #62, July 1978.
  • [CH89] J.-Y. Cai and L. Hemachandra. Enumerative counting is hard. Information and Computation, 82(1):34–44, 1989.
  • [CH91] J.-Y. Cai and L. Hemachandra. A note on enumerative counting. Information Processing Letters, 38(4):215–219, 1991.
  • [Cla] Clay Mathematics Institute. www.claymath.org/millennium-problems, URL verified 2019/2/16.
  • [Coo71] S. Cook. The complexity of theorem-proving procedures. In Proceedings of the 3rd ACM Symposium on Theory of Computing, pages 151–158. ACM Press, May 1971.
  • [For79] S. Fortune. A note on sparse complete sets. SIAM Journal on Computing, 8(3):431–433, 1979.
  • [Gas19] W. Gasarch. The third P=?NP poll. SIGACT News, 50(1), 2019. To appear.
  • [GH00] C. Glaßer and L. Hemaspaandra. A moment of perfect clarity II: Consequences of sparse sets hard for NP with respect to weak reductions. SIGACT News, 31(4):39–51, 2000.
  • [Gla00] C. Glaßer. Consequences of the existence of sparse sets hard for NP under a subclass of truth-table reductions. Technical Report TR 245, Institut für Informatik, Universität Würzburg, Würzburg, Germany, January 2000.
  • [Hem19] L. Hemaspaandra. The power of self-reducibility: Selectivity, information, and approximation, 2019. File set—providing slides and their source code—available online at http://www.cs.rochester.edu/u/lane/=self-reducibility/, URL verified 2019/2/25.
  • [HH03] L. Hemaspaandra and H. Hempel. P-immune sets with holes lack self-reducibility properties. Theoretical Computer Science, 302(1–3):457–466, 2003.
  • [HHM13] E. Hemaspaandra, L. Hemaspaandra, and C. Menton. Search versus decision for election manipulation problems. In Proceedings of the 30th Annual Symposium on Theoretical Aspects of Computer Science, volume 20, pages 377–388. Leibniz International Proceedings in Informatics (LIPIcs), February/March 2013.
  • [HJ97] L. Hemaspaandra and Z. Jiang. Logspace reducibility: Models and equivalences. International Journal of Foundations of Computer Science, 8(1):95–108, 1997.
  • [HN17] L. Hemaspaandra and D. Narváez. The opacity of backbones. In Proceedings of the 31st AAAI Conference on Artificial Intelligence, pages 3900–3906. AAAI Press, February 2017.
  • [HN19] L. Hemaspaandra and D. Narváez. Existence versus exploitation: The opacity of backbones and backdoors under a weak assumption. In Proceedings of the 45th International Conference on Current Trends in Theory and Practice of Computer Science, pages 247–259. Springer-Verlag Lecture Notes in Computer Science #11376, January 2019.
  • [HO02] L. Hemaspaandra and M. Ogihara. The Complexity Theory Companion. Springer-Verlag, 2002.
  • [HOT94] L. Hemaspaandra, M. Ogihara, and S. Toda. Space-efficient recognition of sparse self-reducible languages. Computational Complexity, 4(3):262–296, 1994.
  • [HOW92] L. Hemachandra, M. Ogiwara, and O. Watanabe. How hard are sparse sets? In Proceedings of the 7th Structure in Complexity Theory Conference, pages 222–238. IEEE Computer Society Press, June 1992.
  • [HS95] L. Hemaspaandra and R. Silvestri. Easily checked generalized self-reducibility. SIAM Journal on Computing, 24(4):840–858, 1995.
  • [HT03] L. Hemaspaandra and L. Torenvliet. Theory of Semi-Feasible Algorithms. Springer-Verlag, 2003.
  • [HZ96] L. Hemaspaandra and M. Zimand. Strong self-reducibility precludes strong immunity. Mathematical Systems Theory, 29(5):535–548, 1996.
  • [Kar72] R. Karp. Reducibilities among combinatorial problems. In R. Miller and J. Thatcher, editors, Complexity of Computer Computations, pages 85–103, 1972.
  • [KM81] K. Ko and D. Moore. Completeness, approximation, and density. SIAM Journal on Computing, 10(4):787–796, 1981.
  • [Ko82] K. Ko. The maximum value problem and NP real numbers. Journal of Computer and System Sciences, 24(1):15–35, 1982.
  • [Ko83] K. Ko. On self-reducibility and weak P-selectivity. Journal of Computer and System Sciences, 26(2):209–221, 1983.
  • [Ko87] K. Ko. On helping by robust oracle machines. Theoretical Computer Science, 52(1–2):15–36, 1987.
  • [Kre88] M. Krentel. The complexity of optimization problems. Journal of Computer and System Sciences, 36(3):490–509, 1988.
  • [Lev75] L. Levin. Universal sequential search problems. Problems of Information Transmission, 9(3):265–266, 1975. March 1975 translation into English of Russian article originally published in 1973.
  • [Mah82] S. Mahaney. Sparse complete sets for NP: Solution of a conjecture of Berman and Hartmanis. Journal of Computer and System Sciences, 25(2):130–143, 1982.
  • [Mah86] S. Mahaney. Sparse sets and reducibilities. In R. Book, editor, Studies in Complexity Theory, pages 63–118. John Wiley and Sons, 1986.
  • [Mah89] S. Mahaney. The Isomorphism Conjecture and sparse sets. In J. Hartmanis, editor, Computational Complexity Theory, pages 18–46. American Mathematical Society, 1989. Proceedings of Symposia in Applied Mathematics #38.
  • [MP79] A. Meyer and M. Paterson. With what frequency are apparently intractable problems difficult? Technical Report MIT/LCS/TM-126, Laboratory for Computer Science, MIT, Cambridge, MA, 1979.
  • [Sch76] C. Schnorr. Optimal algorithms for self-reducible problems. In Proceedings of the 3rd International Colloquium on Automata, Languages, and Programming, pages 322–337. Edinburgh University Press, July 1976.
  • [Sel79] A. Selman. P-selective sets, tally languages, and the behavior of polynomial time reducibilities on NP. Mathematical Systems Theory, 13(1):55–65, 1979.
  • [Sel81] A. Selman. Some observations on NP real numbers and P-selective sets. Journal of Computer and System Sciences, 23(3):326–332, 1981.
  • [Sel82a] A. Selman. Analogues of semirecursive sets and effective reducibilities to the study of NP complexity. Information and Control, 52(1):36–51, 1982.
  • [Sel82b] A. Selman. Reductions on NP and P-selective sets. Theoretical Computer Science, 19(3):287–304, 1982.
  • [Val79a] L. Valiant. The complexity of computing the permanent. Theoretical Computer Science, 8(2):189–201, 1979.
  • [Val79b] L. Valiant. The complexity of enumeration and reliability problems. SIAM Journal on Computing, 8(3):410–421, 1979.
  • [You92] P. Young. How reductions to sparse sets collapse the polynomial-time hierarchy: A primer. SIGACT News, 23, 1992. Part I (#3, pages 107–117), Part II (#4, pages 83–94), and Corrigendum to Part I (#4, page 94).