# On the Inner Product Predicate and a Generalization of Matching Vector Families

Motivated by cryptographic applications such as predicate encryption, we consider the problem of representing an arbitrary predicate as the inner product predicate on two vectors. Concretely, fix a Boolean function $P$ and some modulus $q$. We are interested in encoding $x$ to $\vec{x}$ and $y$ to $\vec{y}$ so that $P(x,y) = 1 \iff \langle \vec{x}, \vec{y} \rangle = 0 \bmod q$, where the vectors should be as short as possible. This problem can also be viewed as a generalization of matching vector families, which corresponds to the equality predicate. Matching vector families have been used in the constructions of Ramsey graphs, private information retrieval (PIR) protocols, and more recently, secret sharing. Our main result is a simple lower bound that allows us to show that known encodings for many predicates considered in the cryptographic literature, such as greater than and threshold, are essentially optimal for prime modulus $q$. Using this approach, we also prove lower bounds on encodings for composite $q$, and then show tight upper bounds for such predicates as greater than, index and disjointness.


## 1 Introduction

There are many situations in cryptography where one is interested in computing some function of a sensitive input, but the computational model is restricted so that only “simple” functions can be directly computed. For instance, the entries of the input may be encrypted so that only affine functions can be computed, or distributed between multiple non-interacting parties so that only local functions can be computed, or we may simply only know how to construct schemes for handling simple functions.

For all of these reasons, it is useful to be able to “encode” complex functions as simple functions. An extremely influential example of an “encoding” in the cryptographic literature is that of garbling schemes (or randomized encodings), which have found applications in many areas of cryptography and elsewhere (see [Yao82, FKN94, IK00, AIK06, App11, BHR12, PS03] and references therein).

In this work, we consider the problem of inner product encoding, namely, representing an arbitrary predicate as the inner product predicate on two vectors. Concretely, fix a Boolean function (a predicate) $P$ and some modulus $q$ (which may be composite as well as prime). We are interested in mappings that map the inputs $x$ and $y$ to vectors $\vec{x}$ and $\vec{y}$ of length $\ell$ with entries in $\mathbb{Z}_q$ such that for all $x, y$:

$$P(x,y) = 1 \iff \langle \vec{x}, \vec{y} \rangle = 0 \bmod q,$$

and the vector length is as small as possible. This notion is motivated by the study of predicate encryption in [KSW08], where the modulus is typically very large, for instance, as large as the domains of the inputs, and can also be viewed as a natural generalization of matching vector families to arbitrary predicates.

As an example, consider the equality predicate over . Here, if , then it is not difficult to show that the vectors must have length . On the other hand, if , then it is sufficient to use vectors of length : the inner product of and is iff . More generally, for any predicate and any prime , the “truth table” construction achieves vectors of length .

Interestingly, inner product predicate encodings for the equality predicate have been studied in combinatorics and complexity theory, where they are known as matching vector families. Moreover, matching vector families have found many applications, including the construction of Ramsey graphs, private information retrieval (PIR) protocols [Gro00, Yek08, Efr12, DGY11, DG15], and more recently, secret-sharing schemes [LVW17, LVW18, LV18]. Here, prior works showed that if the modulus is a prime, then we must use vectors of length [DGY11].

### 1.1 Our results

Our main results are nearly tight bounds for many predicates considered in the cryptographic literature such as greater than and threshold, for both prime and composite modulus $q$. In particular, we have the following results for prime modulus $q$:

• Greater than predicate for numbers in requires vectors of length . This rules out the possibility of deriving the predicate encryption for range queries with ciphertext and secret key sizes in [BW07] as a special case of inner product predicate encryption.

• Threshold for -bit strings and threshold requires vectors of length . This rules out the possibility of constructing full-fledged functional encryption schemes by carrying out FHE decryption in the lattice-based predicate encryption of Gorbunov, Vaikuntanathan and Wee [GVW15] using a pairing-based functional encryption scheme for the inner product predicate.

We then investigate encodings for composite , specifically when is a product of distinct primes. In many cases, a lower bound of for composite follows naturally if our method gives lower bound for prime . For predicates such as greater than, index and disjointness, we are able to show tight lower and upper bounds for both prime and composite . The full summary of upper and lower bounds is shown in Table 1, and the listed predicates are described in Section 3.

Finally, we also consider probabilistic inner product predicate encoding. For example, there is a probabilistic encoding of length for the greater than predicate for numbers in , while any deterministic encoding must have length , if is prime.

#### Our lower bound technique.

Our lower bound technique is remarkably simple. Suppose that $q$ is prime and we can represent a predicate $P$ as an inner product predicate on vectors of length $\ell$ corresponding to mappings $x \mapsto \vec{x}$, $y \mapsto \vec{y}$. Following [BDL13], we consider a matrix $F$ over $\mathbb{Z}_q$, with one row per input $x$ and one column per input $y$, whose $(x,y)$’th entry is $\langle \vec{x}, \vec{y} \rangle$. Then the matrix $F$ has rank at most $\ell$, because we can write $F$ as the product of two matrices, the first with $\ell$ columns and the second with $\ell$ rows. Concretely, $F = AB$, where the $x$’th row of $A$ is $\vec{x}$ and the $y$’th column of $B$ is $\vec{y}$. This means that to show a lower bound on $\ell$, it suffices to show that $F$ has large rank, e.g. by exhibiting a full rank submatrix.

As an example, consider the greater than predicate on for any prime modulus . Then, the matrix is an upper triangular matrix where all the entries on and above the diagonal are non-zero. This matrix has rank , hence any correct construction must have dimension at least . Note that the above lower bound argument breaks down when is composite. In fact, if , there is an encoding for greater than with dimension : take . Correctness follows from the fact that , and the construction extends also to the setting where is a product of distinct primes.
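The rank argument for greater than can be checked numerically. The following is a minimal sketch, not taken from the paper: the helper names `rank_mod_p` and `random_gt_matrix` are hypothetical, and the matrix is filled with arbitrary non-zero values on and above the diagonal, which is all the argument assumes.

```python
import random

def rank_mod_p(matrix, p):
    """Rank of an integer matrix over the field Z_p (p prime), by Gaussian elimination."""
    rows = [[v % p for v in row] for row in matrix]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        # Find a row at or below position `rank` with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def random_gt_matrix(n, p, seed=0):
    """A matrix representing greater than on {1..n}: entry (x, y) is 0 iff x > y."""
    rng = random.Random(seed)
    return [[0 if x > y else rng.randrange(1, p) for y in range(1, n + 1)]
            for x in range(1, n + 1)]

n, p = 6, 7
F = random_gt_matrix(n, p)
print(rank_mod_p(F, p))  # full rank, so any encoding modulo p needs length at least n
```

Whatever non-zero values are chosen, the matrix stays upper triangular with a non-zero diagonal, so its rank over a prime field is always $n$.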

In order to extend our lower bounds to composite that is the product of distinct primes, we observe that if contains a triangular submatrix of dimensions , then there exists some prime factor of such that contains a triangular submatrix of dimensions ; this follows from looking at the CRT decomposition of and a pigeonhole argument. This simple observation allows us to translate many of our lower bounds to the composite modulus setting, which we prove to be essentially optimal via new upper bounds.

For instance, for the “greater than” predicate, we obtain a tight bound of when is a product of distinct primes; this is in sharp contrast to standard matching vector families (i.e., the equality predicate), where we have constructions of length when is a product of distinct primes. For the upper bound, we begin with a construction of length for and then derive the more general construction by treating the inputs as vectors of length and then dividing that into blocks each of length .

Finally, we extend our results to the randomized setting. Here, we use a similar argument to show that the minimum size of a probabilistic inner product encoding is upper bounded by the probabilistic rank introduced by Alman and Williams [AW17].

#### Organization.

The paper is organized as follows. We describe our lower bound method in Section 2. The notation and predicates used throughout the rest of the paper are defined in Section 3. In Section 4 we describe lower and upper bounds for these predicates. Finally, we consider probabilistic encodings in Section 5.

## 2 Main Theorem

In this section we describe our lower bound technique. Let $P$ be a predicate, and $q$ be the integer modulus. We say that a matrix $F$ over $\mathbb{Z}_q$ represents $P$ modulo $q$ if for all $x, y$, we have $P(x,y) = 1$ iff $F_{x,y} = 0$.

An inner product encoding of $P$ of length $\ell$ is a pair of mappings from the input domains to $\mathbb{Z}_q^{\ell}$ that map $x, y$ to $\vec{x}, \vec{y}$ in a way that the matrix $F$ defined by $F_{x,y} = \langle \vec{x}, \vec{y} \rangle$ represents $P$. Denote the length of the shortest reduction from $P$ to inner product modulo $q$ by $\mathrm{DI}(P,q)$ (Deterministic Inner product). Then we have the following simple and effective lower bound method.

###### Theorem 1.

For any predicate $P$ and any prime $q$, we have

$$\mathrm{DI}(P,q) = \min_{F} \operatorname{rank}(F),$$

where $F$ is any matrix that represents $P$ modulo $q$.

###### Proof.

We show that if $P$ can be represented by a matrix $F$ modulo $q$, then the necessary and sufficient length of an encoding of $P$ whose inner products are given by $F$ is exactly $\operatorname{rank}(F)$. By the decomposition definition of rank, the rank of an $m \times n$ matrix $F$ is the smallest integer $r$ such that $F$ can be factored as $F = AB$, where $A$ is an $m \times r$ matrix and $B$ is an $r \times n$ matrix. Let $\vec{x}$ be the row vector of $A$ that corresponds to $x$ and $\vec{y}$ be the column vector of $B$ that corresponds to $y$. Then the pair of mappings $x \mapsto \vec{x}$ and $y \mapsto \vec{y}$ is a correct encoding of $P$, which is also the shortest possible for $F$. ∎
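The proof is constructive: a rank factorization of a representing matrix directly yields an encoding. The sketch below illustrates this under stated assumptions. The function name `rank_factorization` is hypothetical, the factorization used is the standard "pivot columns times reduced row echelon form" one, and the example matrix with entries $x - y$ modulo 5 is chosen here only because it represents equality on four values.

```python
def rank_factorization(F, p):
    """Return (A, B) with F = A * B mod p, where A is m x r, B is r x n, r = rank(F).

    A consists of the pivot columns of F; B consists of the non-zero rows of
    the reduced row echelon form of F over Z_p (p prime).
    """
    m, n = len(F), len(F[0])
    R = [[v % p for v in row] for row in F]
    pivots, r = [], 0
    for c in range(n):
        piv = next((i for i in range(r, m) if R[i][c]), None)
        if piv is None:
            continue
        R[r], R[piv] = R[piv], R[r]
        inv = pow(R[r][c], -1, p)  # modular inverse (Python 3.8+)
        R[r] = [(v * inv) % p for v in R[r]]
        for i in range(m):
            if i != r and R[i][c]:
                f = R[i][c]
                R[i] = [(a - f * b) % p for a, b in zip(R[i], R[r])]
        pivots.append(c)
        r += 1
    A = [[F[i][c] % p for c in pivots] for i in range(m)]
    B = R[:r]
    return A, B

# Illustrative matrix: entry (x, y) is x - y mod 5, which is 0 iff x == y.
p = 5
F = [[(x - y) % p for y in range(4)] for x in range(4)]
A, B = rank_factorization(F, p)

def encode_x(x):
    return A[x]                                # row of A

def encode_y(y):
    return [B[j][y] for j in range(len(B))]    # column of B

def inner(u, v):
    return sum(a * b for a, b in zip(u, v)) % p
```

Here `encode_x` and `encode_y` produce vectors whose length equals the rank of `F`, and their inner product reproduces the corresponding entry of `F`.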

Therefore, to show a lower bound on the length of an encoding for $P$, it is sufficient to exhibit a set of rows $R$ and a set of columns $C$ such that for any matrix $F$ that represents $P$, the submatrix of $F$ indexed by $R$ and $C$ has full rank. Typically we find a large full rank upper triangular submatrix and apply Theorem 1. Other times, we prove a lower bound for some predicate $P'$, and then prove that the same lower bound holds for $P$ by showing a predicate reduction from $P'$ to $P$ (see Section 3 for details).

For composite , we have the following lower bound:

###### Theorem 2.

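Let $q = p_1 p_2 \cdots p_k$ be a product of $k$ distinct primes. Let $P$ be a predicate such that every matrix $F$ of dimensions $n \times n$ that represents $P$ modulo $q$ is a triangular matrix such that all numbers on the main diagonal are non-zero modulo $q$. Then

$$\mathrm{DI}(P,q) \geq n/k.$$

###### Proof.

Let $F$ represent $P$ modulo $q$. Let $F_i = F \bmod p_i$ (all entries taken modulo $p_i$). Since all entries on the main diagonal of $F$ are non-zero modulo $q$, each of them is non-zero modulo at least one $p_i$, so by the pigeonhole principle there exists $i$ such that there are at least $n/k$ non-zero entries on the main diagonal of $F_i$. As $F_i$ is also a triangular matrix, the rows and columns of these entries form a triangular submatrix with a non-zero diagonal, so the rank of $F_i$ modulo $p_i$ is at least $n/k$. By Theorem 1, the length of any encoding that yields $F_i$ modulo $p_i$ must be at least $n/k$, hence also $\mathrm{DI}(P,q) \geq n/k$. ∎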
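The pigeonhole step of the proof can be illustrated on a small example. This is a sketch with made-up values: the modulus $30 = 2 \cdot 3 \cdot 5$, the diagonal entries, and the helper name `nonzero_diagonal_counts` are all chosen here for illustration.

```python
def nonzero_diagonal_counts(F, primes):
    """For each prime factor p, count diagonal entries of F that are non-zero mod p."""
    return {p: sum(1 for i in range(len(F)) if F[i][i] % p != 0) for p in primes}

primes = [2, 3, 5]                  # q = 30, so k = 3
diag = [6, 10, 15, 1, 2, 3, 5]      # every entry is non-zero modulo 30
n = len(diag)

# Upper triangular matrix with the chosen diagonal and ones above it.
F = [[diag[i] if i == j else (1 if j > i else 0) for j in range(n)] for i in range(n)]

counts = nonzero_diagonal_counts(F, primes)
best = max(counts, key=counts.get)
print(best, counts[best])           # some prime keeps at least n/k diagonal entries non-zero
```

Each diagonal entry survives modulo at least one prime factor, so the counts sum to at least $n$ and the largest one is at least $n/k$.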

## 3 Definitions and Predicates

In this section, first we describe some of the notation used throughout the paper. Then we define the predicates examined in the paper, and define the predicate reduction.

#### Notation.

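We denote the set of all subsets of $[n]$ by $2^{[n]}$. For a set $S \subseteq [n]$, define the characteristic vector $\chi(S) \in \{0,1\}^n$ by

$$\chi(S)_i = \begin{cases} 1, & \text{if } i \in S,\\ 0, & \text{otherwise.} \end{cases}$$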

Conversely, for a vector , let be the characteristic set of .

For simplicity, denote the characteristic vector of by (the length is usually inferred from the context). The characteristic vectors of and are denoted by and . We denote the identity matrix of dimension by , and the all-ones matrix by .

For a truth expression , we define to be 1 if is true, and 0 if is false. For example, iff .

For a number , let be the binary representation of .

#### Predicates.

We consider the predicates listed below.

• Equality: and

• Greater than: and

• Inequality: and

• Index: and Here, denotes the ’th coordinate of . Note that we can also interpret as the characteristic vector of a subset of . Because in our model corresponds to “true”, we have defined the index to be true if the bit value in the corresponding position is 0.

• Disjointness: and

• Exact threshold: and where is the threshold parameter.

• Threshold: and where is the threshold parameter.

• Multilinear polynomials: , , the latter is the set of all multilinear polynomials of degree at most . Then

• Disjunction of equality tests: and

#### Reductions

We say that a predicate can be reduced to a predicate if there exist two mappings and such that for all (or mappings and ). In that case we write .

For example, consider the following reductions:

• .

The reduction holds since . On the other hand, , as .

• .

As , the reduction follows.

• Let be any predicate. Then .

Let be the truth table of defined by . Then we have and . Similarly, we also have .

Effectively, then an inner product encoding for implies an encoding for and a lower bound for implies a lower bound for . This makes it easier to prove upper and lower bounds. For example, as later we prove that for prime (see Section 4.2), the last reduction implies that for all predicates .

If is a product of distinct primes, then for the same reason. Therefore, for any predicate, if , there is an encoding of and simply to numbers modulo .

## 4 Deterministic Encodings

In this section, we apply our technique to provide lower bounds on deterministic inner product encodings for many well-known predicates. For each of them, first we discuss the encodings and then proceed to prove lower bounds.

### 4.1 Equality

An encoding for over is a matching family of vectors modulo [DGY11]. The maximum size of a matching family of vectors of length modulo is denoted by and has been studied extensively. Lower and upper bounds on give upper and lower bounds on , respectively (in the relevant literature, usually and are denoted by and , respectively).

For prime , a tight bound is known [DGY11]. If is a product of primes, we have a upper bound from [Gro00]. For any composite , we also have an lower bound from [DH13].

Here, first we show two simple upper bounds for and . Then we reprove the optimal lower bound for using our rank lower bound.

#### Upper bounds.

For , we construct an encoding of length . Let and . Then , thus it is a correct inner product encoding and .

Let be any integer such that . Let and . Then , so it is 0 iff . Therefore, .

#### Lower bound.

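We show a matching lower bound for the case $q = 2$. There is a unique matrix over $\mathbb{Z}_2$ that represents equality on $n$ values, namely $J - I$, where $J$ is the all-ones matrix and $I$ is the identity matrix. Express $I = J - (J - I)$. By sub-additivity of rank, we have $n = \operatorname{rank}(I) \leq \operatorname{rank}(J) + \operatorname{rank}(J - I) = 1 + \operatorname{rank}(J - I)$. Hence, by Theorem 1, any inner product encoding of equality modulo 2 requires vectors of length at least $n - 1$.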
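As a sanity check of the sub-additivity argument, the rank of the all-ones matrix minus the identity can be computed over $\mathbb{Z}_2$ for small sizes. This is a minimal sketch with hypothetical helper names (`rank_mod_p`, `equality_matrix_mod2`); it only confirms the bound of $n - 1$ numerically for the sizes tried.

```python
def rank_mod_p(matrix, p):
    """Rank of an integer matrix over Z_p (p prime), by Gaussian elimination."""
    rows = [[v % p for v in row] for row in matrix]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % p for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank

def equality_matrix_mod2(n):
    """The matrix J - I over Z_2: zero on the diagonal (x == y), one elsewhere."""
    return [[0 if x == y else 1 for y in range(n)] for x in range(n)]

for n in range(2, 9):
    print(n, rank_mod_p(equality_matrix_mod2(n), 2))  # each rank is at least n - 1
```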

### 4.2 Index

We prove that , for every that is a product of distinct primes.

For some , the upper bound follows from (see Section 4.5). However, there is a much simpler encoding, which we present below. Moreover, this upper bound holds for every that is the product of distinct primes.

#### Upper bound.

We begin with the warm-up for the special case . Here, consider

$$\vec{x} = \prod_{i=1}^{n} p_i^{1 - x_i}, \qquad \vec{y} = q / p_y.$$

Then iff .
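The warm-up encoding can be verified exhaustively for a small instance. The sketch below assumes that $q$ is the product of $n$ distinct primes $p_1, \dots, p_n$ and uses 0-based positions; the function names and the choice of the primes 2, 3, 5, 7 are illustrative only.

```python
from itertools import product
from math import prod

def encode_x(x, primes):
    """Length-1 encoding of the data string: product of p_i over positions with x_i = 0."""
    return prod(p for p, bit in zip(primes, x) if bit == 0)

def encode_y(y, primes):
    """Length-1 encoding of the index y (0-based): q / p_y."""
    return prod(primes) // primes[y]

primes = [2, 3, 5, 7]
q = prod(primes)

for x in product([0, 1], repeat=len(primes)):
    for y in range(len(primes)):
        is_zero = (encode_x(x, primes) * encode_y(y, primes)) % q == 0
        # Index is "true" when the selected bit is 0, matching inner product 0.
        assert is_zero == (x[y] == 0)
```

The product is divisible by $q$ exactly when $p_y$ divides the encoding of $x$, which happens exactly when the selected bit is 0.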

Next, we consider general . Since , it is enough to construct an encoding for the case . The data is the string , and the index is given by . Encode as an binary matrix , and as an binary matrix .

Now we construct the encoding.

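$$\vec{x}_i = \prod_{j=1}^{k} p_j^{X_{i,j}}, \qquad \vec{y}_i = \begin{cases} q / p_j, & \text{if } Y_{i,j} = 1,\\ 0, & \text{otherwise.} \end{cases}$$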

Now we analyze the correctness of the protocol. Let be such that . Then .

• If , then .

• If , then , hence .

#### Lower bound.

The lower bound follows from (see Section 4.3).

### 4.3 Inequality

We show that , for every that is the product of distinct primes.

#### Upper bound.

The upper bound follows from (see Section 4.2).

#### Lower bound.

Any matrix that represents inequality on $n$ values is a diagonal matrix with non-zero entries on the main diagonal. By Theorem 2, it follows that any encoding must have length at least $n/k$.

### 4.4 Greater Than

We show that , for every that is the product of distinct primes.

#### Upper bound.

The upper bound follows from (see Section 4.2).

If is prime, the encoding simplifies to and . If , a different simple encoding is and .

#### Lower bound.

Let $F$ be any matrix that represents greater than on $n$ values modulo $q$. Then all entries below the main diagonal are 0, while all entries on and above the main diagonal are non-zero, hence $F$ is a triangular matrix. By Theorem 2, we conclude that any encoding must have length at least $n/k$.

### 4.5 Disjointness

We prove that for an appropriate choice of that depends on , and that if is any product of distinct primes.

#### Upper bound.

We start with a simple encoding for that works for any product of distinct primes . Recall that the sets and are the input to disjointness. Let

$$\vec{x} = \prod_{i=1}^{n} p_i^{1 - \chi(S)_i}, \qquad \vec{y} = \prod_{i=1}^{n} p_i^{1 - \chi(T)_i}.$$

Then is iff and are disjoint. If , then for any it is possible that although some of the products are not divisible by , their sum might be divisible by , hence the encoding doesn’t work for any .
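The simple length-1 encoding for disjointness can be checked by brute force on a small universe. This sketch assumes $q$ is the product of $n$ distinct primes, one per element of the universe; the function names and the primes 2, 3, 5, 7 are illustrative choices, and elements are numbered from 0.

```python
from itertools import combinations
from math import prod

def encode_set(S, primes):
    """Length-1 encoding of a set: product of p_i over elements i NOT in the set."""
    return prod(p for i, p in enumerate(primes) if i not in S)

def all_subsets(n):
    return [frozenset(c) for r in range(n + 1) for c in combinations(range(n), r)]

primes = [2, 3, 5, 7]
q = prod(primes)

for S in all_subsets(len(primes)):
    for T in all_subsets(len(primes)):
        is_zero = (encode_set(S, primes) * encode_set(T, primes)) % q == 0
        # q divides the product iff every p_i appears in it, i.e. no i lies in both sets.
        assert is_zero == S.isdisjoint(T)
```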

For the general case, the following variation of Dirichlet’s theorem will be useful for us.

###### Theorem 3 (Dirichlet).

For any integer , there are infinitely many primes such that .

Let be a product of distinct primes to be defined later. We construct an encoding of length for the case . Encode as an binary matrix . Similarly encode as . Let

$$\vec{x}_i = \prod_{j=1}^{k} p_j^{1 - X_{i,j}}, \qquad \vec{y}_i = \prod_{j=1}^{k} p_j^{1 - Y_{i,j}}.$$

Now we find the appropriate primes for the general case. We construct them and prove the correctness by induction on .

Base case. If , then is a prime itself. Pick any prime such that . We have . If and are disjoint, then . Suppose that and are not disjoint. Let . Then . As , we have , therefore .

Inductive step. Assume that there exists a correct encoding for some such that it is a product of distinct primes . Let be a prime such that and (such exist by Theorem 3).

Suppose that , where . Examine the sets and .

• Suppose that and are not disjoint. Then the set is non-empty. If , then at least one of and is 0, thus is divisible by . Thus, we have that . Therefore, .

• Suppose that and are disjoint. Then for all , we have that . Therefore, .

Moreover, since , we have that and . Therefore, is equal to 0 iff the sets and are disjoint by the inductive hypothesis.

#### Lower bound.

The lower bound follows from (see Section 4.2).

### 4.6 Exact Threshold

#### Upper bound.

The following encoding modulo of length is due to Katz, Sahai and Waters [KSW08]. For all , let , and let . For all , let , and let . Then is equal to 0 iff . Therefore, .
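The exact formulas of this encoding did not survive in the text above, so the following is only a sketch of one natural instantiation consistent with the description. It assumes (this is an assumption, not the source's statement) that the encoding has length $n + 1$, appends the constant 1 to $\chi(S)$ and $-t$ to $\chi(T)$, and uses a modulus $q > n$, so that the inner product equals $|S \cap T| - t$ modulo $q$.

```python
from itertools import combinations

def encode_s(S, n):
    """Assumed encoding of S: characteristic vector followed by the constant 1."""
    return [1 if i in S else 0 for i in range(1, n + 1)] + [1]

def encode_t(T, n, t, q):
    """Assumed encoding of T: characteristic vector followed by -t mod q."""
    return [1 if i in T else 0 for i in range(1, n + 1)] + [(-t) % q]

def inner(u, v, q):
    return sum(a * b for a, b in zip(u, v)) % q

def all_subsets(n):
    return [frozenset(c) for r in range(n + 1)
            for c in combinations(range(1, n + 1), r)]

n, q = 4, 5   # q > n, so |S ∩ T| - t is 0 mod q only when it is exactly 0
for t in range(n + 1):
    for S in all_subsets(n):
        for T in all_subsets(n):
            assert (inner(encode_s(S, n), encode_t(T, n, t, q), q) == 0) == (len(S & T) == t)
```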

Surprisingly, if , there exist constant size encodings.

• If , there is an encoding of length 2. The encoding is as follows: and . Then we have , which is 0 iff .

• If , there is an encoding of length 3. The encoding for and is as follows:

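$$\vec{x} = \begin{cases} (1, 0, 0), & \text{if } |S| = n,\\ (0, i, 1), & \text{if } S = [n] \setminus \{i\},\\ (1, -1, 1), & \text{otherwise,} \end{cases} \qquad \vec{y} = \begin{cases} (1, 0, 0), & \text{if } |T| = n,\\ (0, 1, -i), & \text{if } T = [n] \setminus \{i\},\\ (1, 1, 1), & \text{otherwise.} \end{cases}$$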

It is easy to check by hand that iff . Note that we require .
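The length-3 encoding for threshold $n - 1$ can also be checked mechanically rather than by hand. The sketch below enumerates all pairs of subsets for $n = 4$; the modulus $q = 7$ is chosen here so that $q > n + 1$, which is what this particular check needs (the exact requirement stated in the text is not recoverable). Function names are illustrative.

```python
from itertools import combinations

def all_subsets(n):
    return [frozenset(c) for r in range(n + 1)
            for c in combinations(range(1, n + 1), r)]

def encode_s(S, n):
    if len(S) == n:
        return (1, 0, 0)
    if len(S) == n - 1:
        (i,) = set(range(1, n + 1)) - S   # the single missing element
        return (0, i, 1)
    return (1, -1, 1)

def encode_t(T, n):
    if len(T) == n:
        return (1, 0, 0)
    if len(T) == n - 1:
        (i,) = set(range(1, n + 1)) - T   # the single missing element
        return (0, 1, -i)
    return (1, 1, 1)

def inner(u, v, q):
    return sum(a * b for a, b in zip(u, v)) % q

n, q = 4, 7   # assumption for this check: q > n + 1
for S in all_subsets(n):
    for T in all_subsets(n):
        assert (inner(encode_s(S, n), encode_t(T, n), q) == 0) == (len(S & T) == n - 1)
```

The only products that depend on the modulus are $i - j$ and $\pm(i + 1)$ for $i, j \in [n]$, which is why a modulus larger than $n + 1$ suffices here.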

#### Lower bound.

We show that for , we have .

1. First we prove that if , the length of any encoding must be at least . We show that by using two reductions.

Firstly, we have , because we can map . Secondly, we prove that . Consider the following mappings:

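$$f = \begin{cases} 1 \mapsto \varnothing,\\ i \mapsto [i-1], \end{cases} \qquad g = \begin{cases} j \mapsto \{j\},\\ m+1 \mapsto \varnothing. \end{cases} \tag{1}$$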

Consider a pair of numbers . If , then and also . If , then and . Otherwise, . Hence the reduction is correct.

Therefore, we conclude that

$$\mathrm{DI}(\mathrm{ETHR}^{t}_{n}, q) \geq \mathrm{DI}(\mathrm{ETHR}^{1}_{n-t+1}, q) \geq \mathrm{DI}(\mathrm{GT}_{n-t+2}, q) \geq (n-t+2)/k$$

by the lower bound on greater than of Section 4.4.

2. Now we prove that if , the length of any encoding is at least . Again, we exhibit two reductions.

Firstly, simply mapping any set to itself. Secondly, . This is because we can map for any . Then the size of the intersection is equal to if , and , if .

Therefore, it follows that

$$\mathrm{DI}(\mathrm{ETHR}^{t}_{n}, q) \geq \mathrm{DI}(\mathrm{ETHR}^{t}_{t+2}, q) \geq \mathrm{DI}(\mathrm{NEQ}_{t+2}, q) \geq (t+2)/k$$

by the lower bound on inequality of Section 4.3.

Therefore, for any , any encoding must have length at least and we have that .
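The reduction from greater than to exact threshold with threshold 1, given by the mappings in (1), can be verified exhaustively. This sketch assumes greater than is taken on the numbers $1, \dots, m+1$ and exact threshold on subsets of $[m]$; the function names are illustrative.

```python
def f(x):
    """Map x to the set [x - 1] = {1, ..., x - 1}; in particular f(1) is empty."""
    return set(range(1, x))

def g(y, m):
    """Map y to the singleton {y} for y <= m, and m + 1 to the empty set."""
    return {y} if y <= m else set()

def greater_than(x, y):
    return x > y

def exact_threshold_1(S, T):
    return len(S & T) == 1

m = 6
for x in range(1, m + 2):
    for y in range(1, m + 2):
        # x > y holds exactly when y lies in {1, ..., x - 1}.
        assert greater_than(x, y) == exact_threshold_1(f(x), g(y, m))
```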

### 4.7 Multilinear Polynomials

First we show a known encoding that gives . Then we show a lower bound of . For prime , we show an optimal lower bound .

#### Upper bound.

The following is a simple construction by [KSW08]. For , let and let be a multilinear polynomial of degree at most . For each subset such that , let and ; then is precisely equal to . Since a multilinear polynomial of degree at most on variables has at most monomials, it follows that .
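The monomial-based construction can be sketched as follows. The representation of a polynomial as a dictionary from index tuples to coefficients, the function names, and the parameters $n = 4$, $d = 2$, $q = 7$ are assumptions made for this illustration; inputs are taken to be 0/1 vectors.

```python
import random
from itertools import combinations, product
from math import prod

def monomials(n, d):
    """All index sets S (as sorted tuples) with |S| <= d, in a fixed order."""
    return [S for size in range(d + 1) for S in combinations(range(n), size)]

def encode_point(x, d, q):
    """One coordinate per monomial: the product of x_i over i in S (1 for the empty set)."""
    return [prod(x[i] for i in S) % q for S in monomials(len(x), d)]

def encode_poly(coeffs, n, d, q):
    """One coordinate per monomial: its coefficient (0 if the monomial is absent)."""
    return [coeffs.get(S, 0) % q for S in monomials(n, d)]

def evaluate(coeffs, x, q):
    return sum(c * prod(x[i] for i in S) for S, c in coeffs.items()) % q

n, d, q = 4, 2, 7
rng = random.Random(0)
coeffs = {S: rng.randrange(q) for S in monomials(n, d)}   # a random polynomial

for x in product([0, 1], repeat=n):
    u, v = encode_point(x, d, q), encode_poly(coeffs, n, d, q)
    assert sum(a * b for a, b in zip(u, v)) % q == evaluate(coeffs, x, q)
```

The encoding length equals the number of monomials of degree at most $d$, which is the count returned by `len(monomials(n, d))`.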

#### Lower bound.

We show a reduction . Let be the bijection from the numbers in to subsets of of size . For a pair of inputs , consider mappings and . Since iff , it is a correct reduction. Thus, by the lower bound from Section 4.3.

Note that if is prime, we can get a tight lower bound of . Since any two distinct polynomials disagree on some inputs, each polynomial must be mapped to a different vector. Therefore, the number of possible vectors must be at least the number of possible polynomials, . The total number of possible monomials of degree at most is . Each monomial can have any coefficient in . It implies that and .

### 4.8 Threshold

First we show an upper bound of for , and then a lower bound of .

#### Upper Bound.

The idea is to encode the threshold into multilinear polynomial evaluation. Let and . Examine the following polynomial:

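$$p_y(x) = \left( \sum_{i=1}^{n} x_i y_i - t \right) \cdot \left( \sum_{i=1}^{n} x_i y_i - (t+1) \right) \cdot \ldots \cdot \left( \sum_{i=1}^{n} x_i y_i - n \right).$$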

Firstly, , thus iff . Secondly, the degree of each factor is 1, hence . Note that the polynomial is still multilinear, since all the variables are 0 or 1. Therefore, we have a reduction . The upper bound from Section 4.7 implies that .
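The behaviour of this polynomial can be checked directly for a small case. The sketch below assumes a prime modulus $q > n$, so that a factor vanishes modulo $q$ only when it is exactly zero; the function name and the parameters $n = 4$, $q = 5$ are illustrative.

```python
from itertools import product

def threshold_poly(x, y, t, q):
    """Evaluate the product of (<x, y> - j) for j = t, ..., n modulo q."""
    s = sum(a * b for a, b in zip(x, y))
    value = 1
    for j in range(t, len(x) + 1):
        value = value * (s - j) % q
    return value

n, q = 4, 5   # assumption: q is prime and larger than n
for t in range(1, n + 1):
    for x in product([0, 1], repeat=n):
        for y in product([0, 1], repeat=n):
            s = sum(a * b for a, b in zip(x, y))
            # The polynomial vanishes exactly when the intersection size reaches t.
            assert (threshold_poly(x, y, t, q) == 0) == (s >= t)
```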

#### Lower Bound.

First of all, we have