PPP-Completeness with Connections to Cryptography

08/20/2018 ∙ by Katerina Sotiraki, et al. ∙ Northeastern University, MIT

Polynomial Pigeonhole Principle (PPP) is an important subclass of TFNP with profound connections to the complexity of the fundamental cryptographic primitives: collision-resistant hash functions and one-way permutations. In contrast to most of the other subclasses of TFNP, no complete problem is known for PPP. Our work identifies the first PPP-complete problem without any circuit or Turing Machine given explicitly in the input, and thus we answer a longstanding open question from [Papadimitriou1994]. Specifically, we show that constrained-SIS (cSIS), a generalized version of the well-known Short Integer Solution problem (SIS) from lattice-based cryptography, is PPP-complete. In order to give intuition behind our reduction for constrained-SIS, we identify another PPP-complete problem with a circuit in the input but closely related to lattice problems. We call this problem BLICHFELDT and it is the computational problem associated with Blichfeldt's fundamental theorem in the theory of lattices. Building on the inherent connection of PPP with collision-resistant hash functions, we use our completeness result to construct the first natural hash function family that captures the hardness of all collision-resistant hash functions in a worst-case sense, i.e. it is natural and universal in the worst-case. The close resemblance of our hash function family to SIS leads us to the first candidate collision-resistant hash function that is both natural and universal in an average-case sense. Finally, our results enrich our understanding of the connections between PPP, lattice problems and other concrete cryptographic assumptions, such as the discrete logarithm problem over general groups.


1 Introduction

The fundamental task of Computational Complexity theory is to classify computational problems according to their inherent computational difficulty. This led to the definition of complexity classes such as NP, which contains the decision problems with efficiently verifiable proofs in the “yes” instances. The search analog of the class NP, called FNP, is defined as the class of search problems whose decision version is in NP. The same definition extends to the class FP, as the search analog of P. The seminal works of [JPY88, Pap94] considered search problems in FNP that are total, i.e. their decision version is always affirmative and thus a solution must always exist. This totality property makes the definition of FNP inadequate to capture the intrinsic complexity of total problems in the appropriate way, as was first shown in [JPY88]. Moreover, there was evidence for the hardness of total search problems, e.g. in [HPV89]. Megiddo and Papadimitriou [MP89] defined the class TFNP that contains the total search problems of FNP, and Papadimitriou [Pap94] proposed the following classification rule of problems in TFNP:

Total search problems should be classified in terms of the profound mathematical principles that are invoked to establish their totality.

Along these lines, many subclasses of TFNP have been defined. Johnson, Papadimitriou and Yannakakis [JPY88] defined the class PLS. A few years later, Papadimitriou [Pap94] defined the complexity classes PPA, PPAD, and PPP, each one associated with a profound mathematical principle in accordance with the above classification rule. More recently, the classes CLS and PWPP were defined in [DP11] and [Jer16], respectively. In Section 1.1 we give a high-level description of all these classes.

Finding complete problems for the above classes is important as it enhances our understanding of the underlying mathematical principles. In turn, such completeness results reveal equivalences between total search problems that seemed impossible to discover without invoking the definition of these classes. Since the definition of these classes in [JPY88, Pap94], it has been clear that completeness results for problems that do not explicitly have a Turing machine or a circuit as part of their input are of particular importance. For this reason, it has become customary to call such problems natural in the context of the complexity of total search problems (see [FRG18]).

Many natural complete problems are known for PLS and PPAD, and recently natural complete problems for PPA were identified too (see Section 1.1). However, no natural complete problems are known for the classes PPP and PWPP, which have profound connections with the hardness of important cryptographic primitives, as we explain later in detail.

Our Contributions. Our main contribution is to provide the first natural complete problems for PPP and PWPP, and thus solve a longstanding open problem from [Pap94]. Beyond that, our completeness results lead the way towards answering important questions in cryptography and lattice theory as we highlight below.

Universal Collision-Resistant Hash Function. Building on the inherent connection of PWPP with collision-resistant hash functions, we construct a natural hash function family with the following properties:

  1. Worst-Case Universality. No efficient algorithm can find a collision in every function of the family, unless worst-case collision-resistant hash functions do not exist.

    Moreover, if an (average-case hard) collision-resistant hash function family exists, then there exists an efficiently samplable distribution over our family under which it is an (average-case hard) collision-resistant hash function family.

  2. Average-Case Hardness. No efficient algorithm can find a collision in a function chosen uniformly at random from the family, unless we can efficiently find short lattice vectors in any (worst-case) lattice.

The first property of our family is reminiscent of the existence of worst-case one-way functions from the assumption that [Sel92]. The corresponding assumption for the existence of worst-case collision-resistant hash functions is assuming , but our hash function family is the first natural definition that does not involve circuits, and admits this strong completeness guarantee in the worst-case.

The construction and properties of our family lead us to the first candidate of a natural and universal collision-resistant hash function family. The idea of universal constructions of cryptographic primitives was initiated by Levin in [Lev87], who constructed the first universal one-way function, and was followed up by [Lev03, KN09]. Using the same ideas we can also construct a universal collision-resistant hash function family, as we describe in Appendix C. The hash function constructed this way, though, takes an explicit description of a Turing machine as part of its input, and hence it fails to be natural under the definition of naturality that we described before. In contrast, our candidate construction is natural, simple, and could have practical applications.

Complexity of Lattice Problems in TFNP. The hardness of lattice problems in NP ∩ coNP [AR04] has served as the foundation for numerous cryptographic constructions in the past two decades. This line of work was initiated by the breakthrough work of Ajtai [Ajt96], and later developed in a long series of works (e.g. [AD97, MR07, Reg09, GPV08, Pei09, GVW13, BLP13, BV14, GSW13, GVW15b, GKW17, WZ17, PRSD17]). This wide use of search (approximation) lattice problems further motivates their study.

We make progress in understanding this important research front by showing that:

  1. the computational problem BLICHFELDT associated with Blichfeldt’s theorem, which can be viewed as a generalization of Minkowski’s theorem, is PPP-complete,

  2. the cSIS problem, a constrained version of the Short Integer Solution (SIS) problem, is PPP-complete,

  3. we combine known results and techniques from lattice theory to show that most approximation lattice problems are reducible to BLICHFELDT and cSIS.

These results create a new path towards a better understanding of lattice problems in terms of complexity classes.

Complexity of Other Cryptographic Assumptions. Besides lattice problems, we discuss the relationship between other well-studied cryptographic assumptions and PPP. Additionally, we formulate a white-box variation of the generic group model for the discrete logarithm problem [Sho97]; we observe that this problem is in PPP and is another natural candidate for being PPP-complete.

1.1 Related Work

In this section we discuss the previous work on the complexity of total search problems, that has drawn attention from the theoretical computer science community over the past decades. We start with a high-level description of the total complexity classes and then discuss the known results for each one of them.

PLS.

The class of problems whose totality is established using a potential function argument.
Every finite directed acyclic graph has a sink.

PPA.

The class of problems whose totality is proved through a parity argument.
Any finite graph has an even number of odd-degree nodes.

PPAD.

The class of problems whose totality is proved through a directed parity argument.
All directed graphs of degree two or less have an even number of degree one nodes.

PPP.

The class of problems whose totality is proved through a pigeonhole principle argument.
Any map from a set $S$ to itself either is onto or has a collision.

In the same spirit, two more classes were defined after [Pap94], in [DP11] and [Jer16].

CLS.

The class of problems whose totality is established using both a potential function argument and a parity argument.

PWPP.

The class of problems whose totality is proved through a weak pigeonhole principle.
Any map from a set $S$ to a strict subset of $S$ has a collision.

Figure 1: The classes PPP and PWPP in the TFNP world.

Recently, a syntactic analog PTFNP of the semantic class TFNP has been defined in [GP17], and a complete problem for this class has been identified. It has also been shown that all the classes we described above are subsets of PTFNP. Oracle separations between all these classes are known [BCE98], with the only exception of whether PLS is contained in PPP.

PLS-completeness. The class PLS represents the complexity of local optimization problems. Some important problems that have been shown to be PLS-complete are: Local Max-Cut [SY91], Local Travelling Salesman Problem [Pap92], and Finding a Pure Nash Equilibrium [FPT04]. Recently, important results for the smoothed complexity of the Local Max-Cut problem were shown in [ER17, ABPW17].

PPAD-completeness. Arguably, the most celebrated application of the complexity of total search problems is the characterization of the computational complexity of finding a Nash equilibrium in terms of PPAD-completeness [DGP09, CDT09]. This problem lies at the heart of game theory and economics. The proof that Nash Equilibrium is PPAD-complete initiated a long line of research in the intersection of computer science and game theory and revealed connections between the two scientific communities that were unknown before (e.g. [EGG06, CDDT09, VY11, KPR13, CDO15, Rub15, Rub16, CPY17, SSB17]).

PPA-completeness. PPA-complete problems usually arise as the undirected generalizations of their PPAD-complete analogs. For example, Papadimitriou [Pap94] showed that Sperner’s Lemma in a 3-D cube is PPAD-complete and later Grigni [Gri01] showed that Sperner’s Lemma in a 3-manifold consisting of the product of a Möbius strip and a line segment is PPA-complete. Since the Möbius strip is non-orientable, this is indeed a non-directed version of Sperner’s Lemma. Similarly, other problems have been shown to be PPA-complete, all involving some circuit as an input in their definition [ABB15, DEF16, BIQ17]. Recently, the first natural PPA-complete problem, without a circuit as part of the input, has been identified in [FRG18]. This illustrates an interesting relation between PPA and the complexity of social choice theory problems.

CLS-completeness. The class CLS was defined in [DP11] to capture the complexity of problems such as P-matrix LCP, computing KKT-points, and finding Nash equilibria in congestion and network coordination games. Recently, it has been proved that the problem of finding a fixed point whose existence is guaranteed by Banach’s Fixed Point Theorem is CLS-complete [DTZ18, FGMS17].

PPP and cryptography. The connection of PPP and cryptography was illustrated by Papadimitriou in [Pap94], where he proved that if PPP = FP then one-way permutations cannot exist. In [BO06], a special case of integer factorization was shown to be in PPA ∩ PPP. This was generalized in [Jer16] by proving that the problem of factoring integers is in PPA ∩ PPP under randomized reductions. Recently, strong cryptographic assumptions were used to prove the average-case hardness of PPAD and CLS [BPR15, GPS16, HY17]. In [RSS17] it was shown that average-case PPAD hardness does not imply one-way functions under black-box reductions, whereas in [HNY17] it was shown that any hard-on-average problem in NP implies the average-case hardness of TFNP. Finally, in [KNY17] it is proved that the existence of multi-collision resistant hash functions is equivalent to a variation of the total search problem RAMSEY, which is not known to belong to any of the above complexity classes. Interestingly, they prove that a variation of RAMSEY called colorful-Ramsey (C-RAMSEY) is PWPP-hard. Although this is an important result, the problem C-RAMSEY still invokes a circuit in the input and is not known to be in PWPP, hence it does not resolve the problem of identifying a natural complete problem for PWPP.

PPP and lattices. In [BJP15] it was shown that the computational analog of Minkowski’s theorem (namely MINKOWSKI) is in PPP, and it was conjectured that it is also PPP-complete. The authors justified their conjecture by showing that EQUAL-SUMS, a problem from [Pap94] that is conjectured to be PPP-complete, reduces to MINKOWSKI. Additionally, they show that a number theoretic problem called DIRICHLET reduces to MINKOWSKI, and thus is in PPP. In [HRRY17] it is proven that the problem NUMBER-BALANCING is equivalent to a polynomial approximation of Minkowski’s theorem in the norm (via Cook reductions for both directions).

1.2 Roadmap of the paper

We start our exposition with a brief description of the results contained in this paper. First we briefly describe the PPP-completeness of BLICHFELDT, which illustrates some of the basic ideas behind our main result that cSIS is PPP-complete. The complete proof of the PPP-completeness of BLICHFELDT can be found in Section 3. We suggest that readers who have experience with the fundamental concepts of the theory of lattices skip the details of Section 3.

Then, we present a brief description of our main theorem and its proof. The complete proof of the PPP-completeness of cSIS can be found in Section 4.

In Section 5 we describe the PWPP-completeness of a weaker version of cSIS and its relation with the definition of the first natural universal collision-resistant hash function family in the worst-case sense. This proof also provides the first candidate for a collision-resistant hash function family that is both natural and universal in the average-case sense.

Finally, in Section 6 we present, for completeness of our exposition, other lattice problems that are already known to belong to PPP and PWPP, and in Section 7 we present other, more general cryptographic assumptions that belong to PPP and PWPP.

1.3 Overview of the Results

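Before we describe our results in more detail we define the class PPP more formally. The class PPP contains the set of problems that are reducible to the PIGEONHOLE CIRCUIT problem. The input to PIGEONHOLE CIRCUIT is a binary circuit $\mathcal{C} : \{0,1\}^n \to \{0,1\}^n$ and its output is either an $x$ such that $\mathcal{C}(x) = 0$, or a pair $(x, y)$ such that $x \neq y$ and $\mathcal{C}(x) = \mathcal{C}(y)$.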
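To make the definition concrete, the following is a minimal sketch of an exhaustive solver for PIGEONHOLE CIRCUIT. It runs in time exponential in $n$ and is meant only to show why a solution always exists. The names `solve_pigeonhole_circuit` and `toy_circuit` are hypothetical, and the circuit is modelled as a plain Python function on bit tuples rather than as an explicit gate list.

```python
from itertools import product


def solve_pigeonhole_circuit(circuit, n):
    """Exhaustive search for a PIGEONHOLE CIRCUIT solution (illustration only).

    `circuit` maps a tuple of n bits to a tuple of n bits.
    Returns ("zero", x) if circuit(x) is all zeros, otherwise
    ("collision", x, y) with x != y and circuit(x) == circuit(y).
    """
    zero = (0,) * n
    seen = {}  # output -> first input that produced it
    for x in product((0, 1), repeat=n):
        y = tuple(circuit(x))
        if y == zero:
            return ("zero", x)
        if y in seen:
            return ("collision", seen[y], x)
        seen[y] = x
    # 2^n inputs cannot map injectively into the 2^n - 1 nonzero outputs.
    raise AssertionError("unreachable by the pigeonhole principle")


def toy_circuit(x):
    """Hypothetical 3-bit circuit: shift left and force the last bit to 1."""
    return (x[1], x[2], 1)


if __name__ == "__main__":
    print(solve_pigeonhole_circuit(toy_circuit, 3))
```

Since `toy_circuit` never outputs the all-zero string, the search must end with a collision. The hardness of the problem comes entirely from the circuit being a succinct description of an exponentially large table.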

Our first and technically most challenging result is to identify and prove the PPP-completeness of two problems, both of which share similarities with lattice problems. For our exposition, a lattice $\mathcal{L}$ can be viewed as a finitely generated additive subgroup of $\mathbb{Z}^n$. A lattice is generated by a full-rank matrix $B \in \mathbb{Z}^{n \times n}$, called basis. In the rest of this section we also use the fundamental parallelepiped of $\mathcal{L}$ defined as $\mathcal{P}(B) = \{ B x : x \in [0,1)^n \}$.

1.3.1 BLICHFELDT is PPP-complete.

We define the BLICHFELDT problem as the computational analog of Blichfeldt’s theorem (see Theorem 3.1). Its input is a basis for a lattice $\mathcal{L}$ and a set $S \subseteq \mathbb{Z}^n$ of cardinality greater than or equal to the volume of $\mathcal{L}$. Its output is either a point in $S$ that belongs to $\mathcal{L}$, or two (different) points in $S$ such that their difference belongs to $\mathcal{L}$. In the overview below, we explain why such an output always exists. Notice that finding a solution to BLICHFELDT becomes trivial if the input representation of $S$ has length proportional to its size, i.e. one can iterate over all element pairs of $S$. The problem becomes challenging when $S$ is represented succinctly. We introduce a notion for a succinct representation of sets that we call value function. Informally, a value function for a set $S$ is a small circuit that takes as input $\lceil \log |S| \rceil$ bits that describe an index $i$, and outputs the $i$-th element of $S$.
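The trivial case mentioned above, where the set is given explicitly, can be sketched as follows. This is an illustrative brute-force search restricted to dimension 2, not the paper's construction: the names `in_lattice` and `blichfeldt_bruteforce` are hypothetical, the basis is a 2x2 integer matrix whose columns are the basis vectors, and lattice membership is tested with the adjugate matrix so that only integer arithmetic is needed.

```python
from itertools import combinations


def in_lattice(B, v):
    """Return True iff v lies in the lattice generated by the columns of B.

    B = ((a, b), (c, d)) is a full-rank 2x2 integer matrix. v = B z has an
    integer solution z exactly when adj(B) v is divisible by det(B).
    """
    (a, b), (c, d) = B
    det = a * d - b * c
    assert det != 0, "basis must be full rank"
    z0 = d * v[0] - b * v[1]
    z1 = -c * v[0] + a * v[1]
    return z0 % det == 0 and z1 % det == 0


def blichfeldt_bruteforce(B, S):
    """Find a lattice point in S, or two points of S differing by a lattice vector."""
    (a, b), (c, d) = B
    assert len(set(S)) == len(S), "S must be a set of distinct points"
    assert len(S) >= abs(a * d - b * c), "need |S| >= volume of the lattice"
    for s in S:
        if in_lattice(B, s):
            return ("lattice point", s)
    for s, t in combinations(S, 2):
        if in_lattice(B, (s[0] - t[0], s[1] - t[1])):
            return ("pair", s, t)
    raise AssertionError("unreachable: guaranteed by Blichfeldt's theorem")


if __name__ == "__main__":
    B = ((2, 1), (0, 2))                   # basis vectors (2, 0) and (1, 2); det = 4
    S = [(1, 0), (0, 1), (1, 1), (3, 1)]   # four distinct integer points
    print(blichfeldt_bruteforce(B, S))
```

The pair loop is quadratic in $|S|$, which is fine for an explicit list but useless when $S$ has exponentially many elements described by a small circuit.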

We give a proof overview of our first main theorem, and highlight the obstacles that arise, along with our solutions.

Theorem 3.5.

The BLICHFELDT problem is PPP-complete.

Membership of BLICHFELDT Overview. We denote with the set . We define the map that reduces any point in $\mathbb{Z}^n$ modulo the parallelepiped to , i.e. . Using this map we can efficiently check the membership of any point in $\mathcal{L}$, by checking whether the map sends it to the origin. Observe that if two points have the same image under the map, then their difference belongs to $\mathcal{L}$.

We show in Lemma 3.2 that , hence the input requirement for BLICHFELDT is equivalent to . Notice that the points of $S$, after applying the map, either have a collision in or a preimage of the origin exists in $S$. It follows by a pigeonhole argument that a solution to BLICHFELDT always exists. For the rest of this part we assume that and let .
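The reduction modulo the fundamental parallelepiped can be sketched in a few lines. The code below is an illustration under stated assumptions, not the paper's circuit: it is limited to dimension 2, the name `reduce_mod_parallelepiped` is hypothetical, and the basis vectors are the columns of the matrix. A point is sent to the unique representative of its coset inside $B[0,1)^2$, so lattice points go to the origin and two points with the same image differ by a lattice vector.

```python
def reduce_mod_parallelepiped(B, v):
    """Map v in Z^2 to the unique point of v + L(B) inside P(B) = B [0,1)^2.

    B = ((a, b), (c, d)) is a full-rank integer matrix with the basis
    vectors as columns.
    """
    (a, b), (c, d) = B
    det = a * d - b * c
    assert det != 0, "basis must be full rank"
    # The coordinates of v in the basis are (t0 / det, t1 / det).
    t0 = d * v[0] - b * v[1]
    t1 = -c * v[0] + a * v[1]
    if det < 0:  # normalise the sign so that // below is a true floor
        det, t0, t1 = -det, -t0, -t1
    k0, k1 = t0 // det, t1 // det  # integer parts of the coordinates
    # Subtract the lattice vector B (k0, k1).
    return (v[0] - (a * k0 + b * k1), v[1] - (c * k0 + d * k1))


if __name__ == "__main__":
    B = ((2, 1), (0, 2))
    for point in [(1, 0), (0, 1), (1, 1), (3, 1), (3, 2)]:
        print(point, "->", reduce_mod_parallelepiped(B, point))
```

In this toy instance (1, 1) and (3, 1) have the same image, and their difference (2, 0) is a basis vector, which is the kind of collision the pigeonhole argument guarantees.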

We construct a circuit that on input an appropriate index , evaluates the value function of $S$ to obtain , and computes . The most challenging part of the proof is to construct an efficient map from to in the following way. We define an appropriate parallelepiped where the are non-negative integers, and a bijection . Because is a Cartesian product, a natural efficient indexing procedure exists as described in Lemma 2.1. This allows us to map to . The circuit outputs the binary decomposition of . It follows that any such that corresponds to a vector such that . On the other hand, a collision with corresponds to a collision , where is the restriction of on , and hence .

Hardness of BLICHFELDT Overview. We start with a circuit that is an input to PIGEONHOLE CIRCUIT. We construct a set and a lattice as input to BLICHFELDT in the following way. The set contains the elements and is represented succinctly with the value function that maps to . Notice that . The lattice consists of all that satisfy the equation . By Lemma 2.3, one can efficiently obtain a basis from this description of and in addition the volume of is at most . Thus, and is a valid input for BLICHFELDT.

The output of BLICHFELDT is either an that (by construction of ) implies , or two different elements of , , with that implies and (by construction of ) .

1.3.2 cSIS is PPP-complete.

Part of the input to BLICHFELDT is represented with a value function, which requires a small circuit. As we explained before, this makes BLICHFELDT a non-natural problem with respect to the definition of naturality in the context of the complexity of total search problems. We now introduce a natural problem that we call constrained Short Integer Solution (cSIS), and show that it is PPP-complete. The cSIS problem is a generalization of the well-known Short Integer Solution (SIS) problem, and we discuss their connection in Section 5.2.

The input is , and , for some positive integer and . The matrix has the property that for every we can efficiently find an such that . We define such matrices as binary invertible. The output is either a vector such that and , or two different vectors such that and . We give a proof overview of the next theorem, and a full proof in Section 4.

Theorem 4.11.

The cSIS problem is PPP-complete.

Membership of cSIS Overview. We show the membership of cSIS in PPP for the general class of binary invertible matrices in Section 4. In order to simplify the exposition, we assume that and to be the “gadget” matrix concatenated with a random matrix . That is, has the form where .

Let , , and be the input to cSIS. We now explain why suffices to always guarantee a solution to cSIS. First, observe that the first columns of , corresponding to the gadget matrix , are enough to guarantee that for every there exists an such that . Hence, there are at least solutions to the equation . Also, there are possible values for . By a pigeonhole argument a solution to cSIS always exists. To complete the membership proof, issues similar to those for BLICHFELDT arise with the circuit representation of the problem instance, but we overcome them using similar ideas.
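The role of the gadget matrix in the pigeonhole count can be illustrated with a small sketch. Everything below is an assumption made for illustration, since the exact parameters are not legible in this text: the modulus is taken to be $q = 2^{\ell}$, the gadget matrix is taken to be the standard block-diagonal matrix with rows $(1, 2, \dots, 2^{\ell-1})$, and the names `gadget_matrix`, `binary_preimage` and `mat_vec_mod` are hypothetical. Under these assumptions every target vector has a binary preimage, which is just its bit decomposition.

```python
from itertools import product


def gadget_matrix(d, ell):
    """Assumed gadget matrix: d x (d * ell), block diagonal with (1, 2, ..., 2^(ell-1))."""
    g = [1 << i for i in range(ell)]
    return [
        [g[j - r * ell] if r * ell <= j < (r + 1) * ell else 0 for j in range(d * ell)]
        for r in range(d)
    ]


def binary_preimage(b, ell):
    """Return a 0/1 vector r with G r = b (mod 2^ell): the bits of each entry of b."""
    q = 1 << ell
    bits = []
    for entry in b:
        entry %= q
        bits.extend((entry >> i) & 1 for i in range(ell))  # least significant bit first
    return bits


def mat_vec_mod(M, x, q):
    """Matrix-vector product modulo q."""
    return tuple(sum(m * xi for m, xi in zip(row, x)) % q for row in M)


if __name__ == "__main__":
    d, ell = 2, 3
    q = 1 << ell
    G = gadget_matrix(d, ell)
    # Every target b in Z_q^d is hit by some binary vector.
    for b in product(range(q), repeat=d):
        assert mat_vec_mod(G, binary_preimage(b, ell), q) == b
    print("all", q ** d, "targets have a binary preimage")
```

This is the property that the text calls binary invertibility, shown here only for the simplest gadget case.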

Hardness of cSIS Overview. We start with a circuit that is an input to PIGEONHOLE CIRCUIT. Since the input of PIGEONHOLE CIRCUIT is a circuit and the input of cSIS is a pair of matrices and a vector, we need to represent this circuit in an algebraic way. In particular, we devise a way to encode the circuit in a binary invertible matrix and a vector . To gain a better intuition of why this is possible, we note that a gate can be expressed as the linear modular equation , where . By a very careful construction, we can encode these linear modular equations in a binary invertible matrix . For further details we defer to Section 4.
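The remark that a gate can be written as a linear modular equation can be checked mechanically on a small example. The specific equation below is an assumption chosen for illustration and is not taken from the paper, whose own equation is not legible here: an AND gate $z = x \wedge y$ is encoded as $x + y - 2z - w \equiv 0 \pmod 4$ with an auxiliary bit $w$. The function name `and_gate_solutions` is hypothetical.

```python
from itertools import product


def and_gate_solutions(x, y, modulus=4):
    """All binary (z, w) with x + y - 2z - w = 0 (mod modulus).

    Illustrative encoding (an assumption, not the paper's): z plays the
    role of the gate output and w is an auxiliary bit.
    """
    return [
        (z, w)
        for z, w in product((0, 1), repeat=2)
        if (x + y - 2 * z - w) % modulus == 0
    ]


if __name__ == "__main__":
    for x, y in product((0, 1), repeat=2):
        solutions = and_gate_solutions(x, y)
        # Exactly one binary solution, and it forces z = x AND y.
        assert solutions == [(x & y, x ^ y)]
        print(f"x={x} y={y} -> z={solutions[0][0]} w={solutions[0][1]}")
```

Because the left-hand side ranges over $\{-3, \dots, 2\}$ for binary values, the congruence modulo 4 holds only when the expression is exactly zero, which is why the binary solution is unique and the auxiliary bit ends up equal to $x \oplus y$.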

Since with returns a vector such that and PIGEONHOLE CIRCUIT asks for a binary vector such that , a natural idea is to let be of the form , where the identity matrix corresponds to the columns representing the output of the circuit in . Finally, we argue that a solution to cSIS with input and as constructed above gives either a collision or a preimage of zero for the circuit, as required.

It can be argued that this reduction shares common ideas with the reduction of to ; this shows the importance of the input conditions for cSIS and hints at the numerous complications that arise in the proof. Without these conditions, we could end up with a trivial reduction to an NP-hard problem! Fortunately, we are able to show that our construction satisfies the input conditions of cSIS.

1.3.3 Towards a Natural and Universal Collision-Resistant Hash Family.

PWPP is a subclass of PPP in which a collision always exists; it is not hard to show that variations of both BLICHFELDT and cSIS are PWPP-complete. We tweak the parameters of valid inputs in order to guarantee that a collision always exists. The PWPP-complete variation of cSIS, which we denote by weak-cSIS, gives a function family which is a universal collision-resistant hash function family in a worst-case sense: if there is a function family that contains at least one function for which it is hard to find collisions, then our function family also includes a function for which it is hard to find collisions.

We now describe the differences between cSIS and weak-cSIS. As before we assume that . The input to weak-cSIS is a matrix , and a binary invertible matrix . Notice that there is no vector in the input, and the relation between and is that has to be strictly greater than . Namely, . This change in the relation of the parameters might seem insignificant, but is actually very important, as it allows us to replace in cSIS by the zero vector. This transforms weak-cSIS into a pure lattice problem: on input matrices with corresponding bases and , where is binary invertible, find two vectors and such that and .

The great resemblance of weak-cSIS to SIS and its completeness for PWPP lead us to the first candidate for a universal collision-resistant hash function :

  1. The key is a pair of matrices , where is binary invertible.

  2. Given a key and a binary vector , is the binary decomposition of , where such that .

Because lattice problems have worst-to-average case reductions and our hash family is based on a lattice problem, this gives hope for showing that our construction is universal in the average-case sense.
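For intuition about the SIS-style hashing that the candidate resembles, here is a minimal sketch of the plain SIS hash $x \mapsto A x \bmod q$ on binary inputs. It is deliberately not the paper's construction: the binary invertible matrix and the constraint are omitted, the parameters are toy values chosen so that exhaustive search is feasible, and the names `sis_hash` and `find_collision` are hypothetical. The point it shows is that a collision yields a nonzero vector with entries in $\{-1, 0, 1\}$ in the kernel of $A$ modulo $q$.

```python
import random
from itertools import product


def sis_hash(A, x, q):
    """Plain SIS-style hash: x -> A x mod q for a binary vector x."""
    return tuple(sum(a * xi for a, xi in zip(row, x)) % q for row in A)


def find_collision(A, q):
    """Exhaustive collision search over all binary inputs (toy sizes only)."""
    m = len(A[0])
    seen = {}
    for x in product((0, 1), repeat=m):
        h = sis_hash(A, x, q)
        if h in seen:
            return seen[h], x
        seen[h] = x
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    n, q, m = 2, 4, 5  # 2^m = 32 inputs but only q^n = 16 outputs
    A = [[rng.randrange(q) for _ in range(m)] for _ in range(n)]
    x, y = find_collision(A, q)
    z = [xi - yi for xi, yi in zip(x, y)]
    # The difference of colliding inputs is a short nonzero kernel vector.
    assert any(z) and all(zi in (-1, 0, 1) for zi in z)
    assert all(sum(a * zi for a, zi in zip(row, z)) % q == 0 for row in A)
    print("collision:", x, y, "short kernel vector:", z)
```

With realistic parameters the input space is far too large for exhaustive search, and finding such a short kernel vector is exactly the SIS problem.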

1.3.4 Other Lattice Problems Known to be in PPP.

We show that the computational analog of Minkowski’s theorem, namely MINKOWSKI, is in PPP via a Karp-reduction to BLICHFELDT. We note that a Karp-reduction showing was shown in [BJP15]. Based on these two problems and the known reductions between lattice problems, we conclude that a variety of lattice (approximation) problems belong to ; the most important among them are , - and (see Figure 2).

1.3.5 Other Cryptographic Assumptions in PPP.

By definition, the class PWPP contains all cryptographic assumptions that imply collision-resistant hash functions. These include the factoring of Blum integers, the Discrete Logarithm problem over and over elliptic curves, and the lattice problem (a special case of ). Also, Jeřábek [Jer16] showed that the problem of factoring integers is in .

We extend the connection between PPP and cryptography by introducing a white-box model to describe general groups, which we define to be cyclic groups with a succinct representation of their elements and group operation (i.e. a small circuit). We show that the Discrete Logarithm over general groups is in PPP. An example of a general group is . These connections are also summarized in Figure 2.

Figure 2: Solid arrows denote a Karp reduction, and dashed arrows denote a Cook reduction.

1.4 Open questions.

Numerous new questions arise from our work and the connections we draw between PPP, cryptography and lattices. We summarize some of them here.

Open Problem 1.1.

Is there a worst-to-average case reduction from to itself?
Such a reduction would provide the first collision-resistant hash function family that is both universal and natural, in the sense that it does not explicitly invoke a Turing machine in the input.

Open Problem 1.2.

Is or -hard?

Open Problem 1.3.

Is - in for ?

Open Problem 1.4.

Is - -hard for ?

Open Problem 1.5.

Is the discrete logarithm problem in for general elliptic curves?

2 Preliminaries

General Notation. Let be the set , and We use small bold letters to refer to real vectors in finite dimension and capital bold letters to refer to matrices in . For a matrix , we denote by its -th row and by its -th element. Let denote the -dimensional identity matrix. We denote with the matrix that has all zeros except that . A function is negligible if for any constant and sufficiently large . All logarithms are in base 2.

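Vector Norms. We define the $\ell_p$-norm of $x \in \mathbb{R}^n$ to be $\|x\|_p = \left( \sum_{i} |x_i|^p \right)^{1/p}$ and the $\ell_\infty$-norm of $x$ to be $\|x\|_\infty = \max_{i} |x_i|$. For simplicity we use $\|\cdot\|$ for the $\ell_2$-norm instead of $\|\cdot\|_2$. It is well known that for and for .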

2.1 Complexity Classes and Reductions

Binary Strings and Natural Numbers. We use bold and underlined small letters to refer to binary strings. Binary strings of length can also be viewed as vectors in . Every binary string can be mapped to a non-negative integer through the nonlinear map called bit composition, where . It is easy to see that this map is bijective, and hence we can define the inverse mapping, called bit decomposition, which is also easy to compute for any given number .

The bit decomposition function is extended to integer vectors and the result is the concatenation of the bit decomposition of each coordinate of the vector. Similarly, this is also extended to integer matrices. Then of course the bit composition function is no longer well defined because its output can be either a number or a vector of numbers, but for simplicity we still use the notation and it will be made clear from the context whether the output is a number or a vector of numbers. When is applied to a set the output is still a set with the bit decomposition of each element .
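The two maps can be sketched as follows. The bit order (most significant bit first) and the names `bit_composition`, `bit_decomposition` and `bit_decomposition_vector` are assumptions made for this illustration, since the exact formula is not legible in this text. The vector version concatenates the decompositions of the coordinates, as described above.

```python
def bit_composition(bits):
    """Map a binary string (sequence of 0/1) to the integer it encodes.

    Assumption: the first bit is the most significant one.
    """
    value = 0
    for b in bits:
        assert b in (0, 1)
        value = 2 * value + b
    return value


def bit_decomposition(value, k):
    """Inverse map: the k-bit string of an integer in [0, 2^k)."""
    assert 0 <= value < 2 ** k
    return [(value >> (k - 1 - i)) & 1 for i in range(k)]


def bit_decomposition_vector(values, k):
    """Concatenate the k-bit decompositions of every coordinate."""
    out = []
    for v in values:
        out.extend(bit_decomposition(v, k))
    return out


if __name__ == "__main__":
    k = 4
    # The two maps are mutually inverse on k-bit strings.
    for value in range(2 ** k):
        assert bit_composition(bit_decomposition(value, k)) == value
    print(bit_decomposition(5, k), bit_decomposition_vector([1, 2], 2))
```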

Boolean Circuits. A boolean circuit $\mathcal{C}$ with $n$ inputs and $1$ output is represented as a labeled directed acyclic graph with in-degree at most $2$, with exactly $n$ source nodes and exactly $1$ sink node. Each source node is an input of $\mathcal{C}$ and the sink node is the output of $\mathcal{C}$. Each of the input nodes of $\mathcal{C}$ is labeled with a number in $[n]$ denoting the ordering of the input variables. Each node with in-degree $1$ is labeled with one of the boolean functions with $1$ variable. Each node with in-degree $2$ can be labeled with one of the boolean functions with two variables, but for our purposes we are going to use only the following five boolean functions: nand, nor, xor, and, or, with corresponding symbols $\bar{\wedge}, \bar{\vee}, \oplus, \wedge, \vee$. Every boolean circuit defines a boolean function on $n$ variables, $\mathcal{C} : \{0,1\}^n \to \{0,1\}$. Let $x$ be a binary string of length $n$, i.e. $x \in \{0,1\}^n$. The value of the circuit on input $x$ is computed by evaluating the nodes of $\mathcal{C}$ one by one in a topological sorting of $\mathcal{C}$, starting from the input nodes. Then, $\mathcal{C}(x)$ is the value of the output node. The size $|\mathcal{C}|$ of $\mathcal{C}$ is the number of nodes in the graph that represents $\mathcal{C}$.
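The evaluation procedure described above can be sketched directly. The representation below is an assumption made for illustration: the input nodes are numbered from 0, every other node is a pair (label, predecessor indices) listed in topological order, and the last node is the output. The names `GATES`, `evaluate` and `circuit_size` are hypothetical.

```python
from itertools import product

# Gate labels used in the text, plus the one-variable functions "not" and "id".
GATES = {
    "nand": lambda a, b: 1 - (a & b),
    "nor": lambda a, b: 1 - (a | b),
    "xor": lambda a, b: a ^ b,
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "not": lambda a: 1 - a,
    "id": lambda a: a,
}


def evaluate(n, nodes, x):
    """Evaluate a boolean circuit on the n-bit input x.

    Nodes 0..n-1 are the inputs; nodes[j] describes node n + j and may only
    refer to earlier nodes, so the list order is a topological order.
    """
    assert len(x) == n
    values = list(x)
    for label, preds in nodes:
        assert all(p < len(values) for p in preds), "not topologically sorted"
        values.append(GATES[label](*(values[p] for p in preds)))
    return values[-1]  # value of the output (sink) node


def circuit_size(n, nodes):
    """Number of nodes in the graph, inputs included."""
    return n + len(nodes)


if __name__ == "__main__":
    # (x0 xor x1) and (not x2)
    nodes = [("xor", (0, 1)), ("not", (2,)), ("and", (3, 4))]
    for x in product((0, 1), repeat=3):
        assert evaluate(3, nodes, x) == (x[0] ^ x[1]) & (1 - x[2])
    print("size:", circuit_size(3, nodes))
```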

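Circuits. We can now define a circuit $\mathcal{C}$ with $n$ inputs and $m$ outputs as an ordered tuple of $m$ boolean circuits $\mathcal{C} = (\mathcal{C}_1, \dots, \mathcal{C}_m)$, which defines a function $\mathcal{C} : \{0,1\}^n \to \{0,1\}^m$, where $\mathcal{C}(x) = (\mathcal{C}_1(x), \dots, \mathcal{C}_m(x))$. The size of $\mathcal{C}$ is equal to $\sum_{i=1}^{m} |\mathcal{C}_i|$. It is known that $\mathrm{P} \subseteq \mathrm{P/poly}$ (see [AB09]), where $\mathrm{P/poly}$ is the class of polynomial-sized circuits. Thus, any polynomial-time procedure we describe implies an equivalent circuit of polynomial size.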

Search Problems. A search problem in FNP is defined by a relation $R$ that on input $x$ of size $n$ and for every $y$ of size $\mathrm{poly}(n)$, $R(x, y)$ is polynomial-time computable on $n$. A solution to the search problem with input $x$ is a $y$ of size $\mathrm{poly}(n)$ such that $R(x, y)$ holds.

A search problem is total if for every input $x$ of size $n$, there exists a $y$ of size $\mathrm{poly}(n)$ such that $R(x, y)$ holds. The class of total search problems in FNP is called TFNP.

Karp Reductions Between Search Problems. A search problem $P_1$ is Karp-reducible to a search problem $P_2$ if there exist polynomial-time (in the input size of $P_1$) computable functions $f$ and $g$ such that if $x$ is an input of $P_1$, then $f(x)$ is an input of $P_2$, and if $y$ is any solution of $P_2$ with input $f(x)$, then $g$ maps $y$ to a solution of $P_1$ with input $x$.

Cook Reductions Between Search Problems. A search problem $P_1$ is Cook-reducible to a search problem $P_2$ if there exists a polynomial-time (in the input size of $P_1$) oracle Turing machine $T$ such that if $x$ is an input of $P_1$, $T$ computes a $y$ such that $y$ is a solution of $P_1$ whenever all the oracle answers are solutions of $P_2$. The set of all search problems that are Cook-reducible to problem is denoted by .

The PPP Complexity Class. The class PPP is a subclass of TFNP and consists of all search problems Karp-reducible to the following problem called PIGEONHOLE CIRCUIT.

PIGEONHOLE CIRCUIT Problem.
Input: A circuit $\mathcal{C}$ with $n$ inputs and $n$ outputs.
Output: One of the following:

  1. a binary vector $x$ such that $\mathcal{C}(x) = 0$, or

  2. two binary vectors $x \neq y$, such that $\mathcal{C}(x) = \mathcal{C}(y)$.

The weak PPP Complexity Class. The class PWPP is the set of all search problems Karp-reducible to the following problem called COLLISION.

COLLISION Problem.
Input: A circuit $\mathcal{C}$ with $n$ inputs and $m$ outputs with $m < n$.
Output: Two binary vectors $x \neq y$, such that $\mathcal{C}(x) = \mathcal{C}(y)$.
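The two definitions above are related by a simple Karp reduction, which the following sketch illustrates: a compressing circuit with $m < n$ outputs is padded to $n$ outputs in a way that never produces the all-zero string, so any PIGEONHOLE CIRCUIT solution for the padded circuit must be a collision of the original one. This padding is a standard argument used here as an illustration and is not quoted from the paper. The names `pad_to_pigeonhole`, `solve_pigeonhole_circuit` and `solve_collision` are hypothetical, and the solver is an exponential-time stand-in for an oracle.

```python
from itertools import product


def pad_to_pigeonhole(circuit, n, m):
    """The map f of the reduction: n -> m bits becomes n -> n bits, never all zero."""
    assert m < n
    def padded(x):
        # Leading 1 rules out the zero output; trailing zeros fill up to n bits.
        return (1,) + tuple(circuit(x)) + (0,) * (n - m - 1)
    return padded


def solve_pigeonhole_circuit(circuit, n):
    """Exhaustive stand-in for a PIGEONHOLE CIRCUIT oracle (toy sizes only)."""
    seen = {}
    for x in product((0, 1), repeat=n):
        y = circuit(x)
        if not any(y):
            return ("zero", x)
        if y in seen:
            return ("collision", seen[y], x)
        seen[y] = x
    raise AssertionError("unreachable by the pigeonhole principle")


def solve_collision(circuit, n, m):
    """Solve COLLISION through the reduction; the map g just returns the pair."""
    solution = solve_pigeonhole_circuit(pad_to_pigeonhole(circuit, n, m), n)
    assert solution[0] == "collision"  # a zero preimage is impossible after padding
    return solution[1], solution[2]


if __name__ == "__main__":
    compress = lambda x: (x[0] ^ x[1], x[2])  # hypothetical circuit, 3 bits -> 2 bits
    x, y = solve_collision(compress, 3, 2)
    assert x != y and compress(x) == compress(y)
    print("collision:", x, y)
```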

2.2 Set Description Using Circuits

Let and let , i.e. the elements of can be represented using bits. As we will see later there is an inherent connection between proofs of both the inclusion and the hardness of and the succinct representation of subsets using circuits. We define here three such representations: the characteristic function, the value function and the index function.

Characteristic Function. We say that a circuit with binary inputs and one output is a characteristic function representation of if if and only if .

Value Function. Let be a tuple where is a circuit with binary inputs and outputs and . Let be a function such that for all with . Then, is a value function representation of if and only if is a bijective map between and . The value can be arbitrary when .

Index Function. Let be a tuple where is a circuit with binary inputs and outputs and . Let be a function such that for all . Then, is an index function representation of if and only if is a bijective map between and . The value can be arbitrary when .
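As an illustration of the value function and index function, the sketch below treats the simplest structured case, a box of integer points. This choice is an assumption: the set is taken to be $\{0, \dots, l_1 - 1\} \times \dots \times \{0, \dots, l_n - 1\}$ with lexicographic ordering, and the names `value_function` and `index_function` are hypothetical. The two functions are mixed-radix conversions, and each is the inverse of the other.

```python
from itertools import product
from math import prod


def value_function(sizes, i):
    """Index i in [0, l_1 * ... * l_n) -> the i-th point of the box (lexicographic)."""
    assert 0 <= i < prod(sizes)
    point = []
    for l in reversed(sizes):
        i, r = divmod(i, l)  # peel off the least significant mixed-radix digit
        point.append(r)
    return tuple(reversed(point))


def index_function(sizes, point):
    """Point of the box -> its position in the lexicographic ordering."""
    i = 0
    for l, p in zip(sizes, point):
        assert 0 <= p < l, "point lies outside the box"
        i = i * l + p
    return i


if __name__ == "__main__":
    sizes = (2, 3)
    points = [value_function(sizes, i) for i in range(prod(sizes))]
    # The value function enumerates the box, and the index function inverts it.
    assert points == list(product(range(2), range(3)))
    assert all(index_function(sizes, p) == i for i, p in enumerate(points))
    print(points)
```

Both directions cost a number of arithmetic operations linear in the dimension, which is why a Cartesian product admits small circuits for these representations even when the set itself is exponentially large.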

Some remarks about the above definitions are in order. First, given a succinct representation of it is computationally expensive to compute , thus we provide it explicitly using . Second, even though the input and the output of each circuit have to be binary vectors, we abuse notation and let the input of the index function and the characteristic function, and the output of the value function, be an element in . Formally, according to the above definitions the output of is the bit decomposition of an element in , namely . In the rest of the paper, we abuse notation and drop to denote by the vector in . Similarly, we drop and for the characteristic and the index functions.

To illustrate the definitions of succinct representations of sets, we explain how to define them in the simple case of the set . Although this is a simple example, it is an ingredient that we need when we show the connection of lattice problems with the class .

Lemma 2.1.

Let