1 Introduction
We study the following fundamental problem from similarity search and statistics, which asks to find the most correlated pair in a dataset:
Definition 1.1 (Bichromatic Maximum Inner Product (MaxIP)).
For $n, d \in \mathbb{N}$, the $\mathsf{MaxIP}_{n,d}$ problem is defined as: given two sets $A, B$, each of $n$ vectors from $\{0,1\}^d$, compute
$$\mathsf{OPT}(A,B) := \max_{a \in A,\, b \in B} a \cdot b.$$
We use $\mathbb{Z}\text{-}\mathsf{MaxIP}_{n,d}$ ($\mathbb{R}\text{-}\mathsf{MaxIP}_{n,d}$) to denote the same problem, but with $A, B$ being sets of $n$ vectors from $\mathbb{Z}^d$ ($\mathbb{R}^d$).
Hardness of Approximating MaxIP.
A natural brute-force algorithm solves $\mathsf{MaxIP}_{n,d}$ in $O(n^2 d)$ time. Assuming SETH (the Strong Exponential Time Hypothesis, which states that for every $\varepsilon > 0$ there is a $k$ such that $k$-SAT cannot be solved in $O(2^{(1-\varepsilon)n})$ time [IP01]), there is no $n^{2-\Omega(1)}$ time algorithm for $\mathsf{MaxIP}_{n,d}$ when $d = \omega(\log n)$ [Wil05].
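For concreteness, the natural brute-force algorithm can be sketched in a few lines (the function name is ours, purely illustrative):

```python
from itertools import product

def max_ip(A, B):
    """Brute-force MaxIP: scan all n^2 pairs, O(n^2 * d) time."""
    return max(sum(a_i * b_i for a_i, b_i in zip(a, b)) for a, b in product(A, B))

A = [(1, 0, 1, 1), (0, 1, 0, 0)]
B = [(1, 1, 1, 0), (0, 0, 0, 1)]
print(max_ip(A, B))  # best pair is (1,0,1,1) . (1,1,1,0) = 2
```

The lower bounds discussed below say that, in high dimensions, no algorithm can beat this quadratic pair scan by a polynomial factor under SETH.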
Despite being one of the most central problems in similarity search, with numerous applications [IM98, AI06, RR07, RG12, SL14, AINR14, AIL15, AR15, NS15, SL15, Val15, AW15, KKK16, APRS16, TG16, CP16, Chr17], until the recent breakthrough of Abboud, Rubinstein and Williams [ARW17] it was unclear whether a near-linear-time approximating algorithm could exist (see [ARW17] for a thorough discussion of the state of affairs on hardness of approximation in P before their work).
In [ARW17], a framework for proving inapproximability results for problems in P is established (the distributed PCP framework), from which the following is derived:
Theorem 1.2 ([ARW17]).
Assuming SETH, there is no $2^{(\log n)^{1-o(1)}}$-multiplicative-approximating $n^{2-\Omega(1)}$ time algorithm for $\mathsf{MaxIP}_{n,n^{o(1)}}$.
Theorem 1.2 is an exciting breakthrough for hardness of approximation in P, implying important inapproximability results for a host of other problems, including Bichromatic LCS Closest Pair Over Permutations, Approximate Regular Expression Matching, and Diameter in Product Metrics [ARW17]. However, we still do not have a complete understanding of the approximation hardness of MaxIP. For instance, consider the following two concrete questions:
Question 1.
Is there a multiplicative-approximating $n^{2-\Omega(1)}$ time algorithm for ? What about a multiplicative-approximating algorithm for ?
Question 2.
Is there an additive-approximating $n^{2-\Omega(1)}$ time algorithm for ?
We note that the lower bound from [ARW17] cannot answer Question 1. Tracing the details of their proofs, one can see that they only show approximation hardness for dimension $(\log n)^{\omega(1)}$. Question 2, concerning additive approximation, is not addressed at all by [ARW17]. Given the importance of MaxIP, it is interesting to ask:
For what ratios do $n^{2-\Omega(1)}$ time approximation algorithms exist for MaxIP?
Does the best-possible approximation ratio (in $n^{2-\Omega(1)}$ time) relate to the dimensionality in some way?
In an important recent work, Rubinstein [Rub18] improved the distributed PCP construction in a crucial way, from which one can derive more refined lower bounds for approximating MaxIP. Building on his technique, in this paper we provide full characterizations, determining the essentially optimal multiplicative and additive approximations to MaxIP, under SETH.
Hardness of Exact MaxIP.
Recall that by [Wil05], there is no $n^{2-\Omega(1)}$ time algorithm for exact Boolean MaxIP in $\omega(\log n)$ dimensions. Since real-life applications of similarity search often deal with real-valued rather than just Boolean data, it is natural to ask about $\mathbb{Z}\text{-}\mathsf{MaxIP}$ (which is certainly a special case of $\mathbb{R}\text{-}\mathsf{MaxIP}$): what is the maximum $d$ such that $\mathbb{Z}\text{-}\mathsf{MaxIP}_{n,d}$ can be solved exactly in $n^{2-\Omega(1)}$ time?
Besides being interesting in its own right, there are also reductions from $\mathbb{Z}\text{-}\mathsf{MaxIP}$ to Furthest Pair and Bichromatic Closest Pair. Hence, lower bounds for $\mathbb{Z}\text{-}\mathsf{MaxIP}$ imply lower bounds for these two famous problems in computational geometry (see [Wil18] for a discussion of this topic).
Prior to our work, it was implicitly shown in [Wil18] that:
Theorem 1.3 ([Wil18]).
Assuming SETH, there is no $n^{2-\Omega(1)}$ time algorithm for $\mathbb{Z}\text{-}\mathsf{MaxIP}_{n,\omega(\log\log n)^2}$, with vectors of $O(\log n)$-bit entries.
However, the best known algorithms for $\mathbb{Z}\text{-}\mathsf{MaxIP}_{n,d}$ run in roughly $n^{2-\Theta(1/d)}$ time [Mat92, AESW91, Yao82] (the algorithms of [AESW91, Yao82] are stated for Furthest Pair or Bichromatic Closest Pair; they also work for $\mathbb{Z}\text{-}\mathsf{MaxIP}$, as there are reductions from $\mathbb{Z}\text{-}\mathsf{MaxIP}$ to these two problems, see [Wil18] or Lemma 4.5 and Lemma 4.6). Hence, there is still a gap between the lower bound and the best known upper bounds. To confirm that these algorithms are in fact optimal, we would like to prove an $n^{2-o(1)}$ lower bound for $\omega(1)$ dimensions.
In this paper, we significantly strengthen the previous lower bound from $\omega(\log\log n)^2$ dimensions to $2^{O(\log^* n)}$ dimensions ($\log^*$ is an extremely slow-growing function; see the preliminaries for its formal definition).
1.1 Our Results
We use $\mathsf{OV}_{n,d}$ to denote the Orthogonal Vectors problem: given two sets $A, B$, each consisting of $n$ vectors from $\{0,1\}^d$, determine whether there are $a \in A$ and $b \in B$ such that $a \cdot b = 0$. (Here we use the bichromatic version of OV instead of the monochromatic one for convenience, as they are equivalent.) Similarly, we use $\mathbb{Z}\text{-}\mathsf{OV}_{n,d}$ to denote the same problem except that $A, B$ consist of vectors from $\mathbb{Z}^d$ (this is also called Hopcroft's problem).
All our results are based on the following widely used conjecture about OV:
Conjecture 1.4 (Orthogonal Vectors Conjecture (OVC)).
For every $\varepsilon > 0$, there is a constant $c \ge 1$ such that $\mathsf{OV}_{n, c \log n}$ cannot be solved in $n^{2-\varepsilon}$ time.
Characterizations of Hardness of Approximate MaxIP
The first main result of our paper characterizes when there is a truly subquadratic time ($n^{2-\Omega(1)}$ time, for some universal constant hidden in the big-$\Omega$) multiplicative-approximating algorithm for MaxIP, and characterizes the best-possible additive approximations as well. We begin with formal definitions of these two standard types of approximation:

- We say an algorithm for $\mathsf{MaxIP}_{n,d}$ ($\mathbb{Z}\text{-}\mathsf{MaxIP}_{n,d}$) is $t$-multiplicative-approximating if, for all inputs $A, B$, it outputs a value $\widetilde{\mathsf{OPT}}$ such that $\mathsf{OPT}(A,B)/t \le \widetilde{\mathsf{OPT}} \le \mathsf{OPT}(A,B)$.
- We say an algorithm for $\mathsf{MaxIP}_{n,d}$ ($\mathbb{Z}\text{-}\mathsf{MaxIP}_{n,d}$) is $t$-additive-approximating if, for all inputs $A, B$, it outputs a value $\widetilde{\mathsf{OPT}}$ such that $|\widetilde{\mathsf{OPT}} - \mathsf{OPT}(A,B)| \le t$.
- To avoid ambiguity, we call an algorithm computing $\mathsf{OPT}(A,B)$ exactly an exact algorithm for $\mathsf{MaxIP}_{n,d}$ ($\mathbb{Z}\text{-}\mathsf{MaxIP}_{n,d}$).
Multiplicative Approximations for MaxIP.
In the multiplicative case, our characterization (formally stated below) basically says that there is a $t$-multiplicative-approximating $n^{2-\Omega(1)}$ time algorithm for $\mathsf{MaxIP}_{n,d}$ if and only if $t = (d/\log n)^{\Omega(1)}$. Note that in the following theorem we require $d = \omega(\log n)$, since in the case of $d = O(\log n)$ there are $n^{2-\Omega(1)}$ time algorithms for exact $\mathsf{MaxIP}_{n,d}$ [AW15, ACW16].
Theorem 1.5.
Letting $d = \omega(\log n)$ and $t = t(n)$ (note that $d$ and $t$ are both functions of $n$; for simplicity, we assume throughout this paper that they are efficiently computable), the following holds:

- There is an $n^{2-\Omega(1)}$ time $t$-multiplicative-approximating algorithm for $\mathsf{MaxIP}_{n,d}$ if $t = (d/\log n)^{\Omega(1)}$,
and under SETH (or OVC), there is no $n^{2-\Omega(1)}$ time $t$-multiplicative-approximating algorithm for $\mathsf{MaxIP}_{n,d}$ if $t = (d/\log n)^{o(1)}$.
- Moreover, let $\varepsilon = \log t / \log(d/\log n)$. There are $t$-multiplicative-approximating deterministic algorithms for $\mathsf{MaxIP}_{n,d}$ running in time
or time
Remark 1.6.
The first algorithm is slightly faster, but it is only truly subquadratic when , while the second algorithm still obtains a nontrivial speedup over the brute-force algorithm as long as .
We remark here that the above algorithms in fact also work in the case where the sets consist of nonnegative reals (i.e., $\mathbb{R}_{+}\text{-}\mathsf{MaxIP}$):
Corollary 1.7.
Assuming and letting , there is a multiplicative-approximating deterministic algorithm for $\mathbb{R}_{+}\text{-}\mathsf{MaxIP}_{n,d}$ running in time
The lower bound is a direct corollary of the new improved protocols for Set-Disjointness from [Rub18], which are based on Algebraic Geometry codes. Together with the framework of [ARW17], these protocols imply a reduction from OV to approximating MaxIP.
Our upper bounds are an application of the polynomial method [Wil14, AWY15]: we define appropriate sparse polynomials approximating MaxIP on small groups of vectors, and use fast (rectangular) matrix multiplication to speed up the evaluation of these polynomials on many pairs of points.
Via the known reduction from MaxIP to LCS Closest Pair in [ARW17], we also obtain a more refined lower bound for approximating the LCS Closest Pair problem (defined below).
Definition 1.8 (LCS Closest Pair).
The problem is: given two sets $A, B$ of $n$ strings from $\Sigma^d$ (where $\Sigma$ is a finite alphabet), determine
$$\max_{a \in A,\, b \in B} \mathsf{LCS}(a,b),$$
where $\mathsf{LCS}(a,b)$ is the length of the longest common subsequence of the strings $a$ and $b$.
Corollary 1.9 (Improved Inapproximability for LCS Closest Pair).
Assuming SETH (or OVC), for every , multiplicative-approximating LCS Closest Pair requires time, if .
A Different Approach Based on the Approximate Polynomial for OR.
Making use of the $O(\sqrt{n})$-degree approximate polynomial for OR [BCDWZ99, dW08], we also give a completely different proof of the hardness of multiplicative approximation to $\mathbb{R}\text{-}\mathsf{MaxIP}$ (that is, MaxIP with the sets $A$ and $B$ consisting of vectors from $\mathbb{R}^d$). The lower bound from this approach is inferior to Theorem 1.5: in particular, it cannot achieve a characterization.
It is asked in [ARW17] whether one can make use of the quantum communication protocol for Set-Disjointness [BCW98] to prove conditional lower bounds. Indeed, that quantum communication protocol is based on the $O(\sqrt{n})$ time quantum query algorithm for OR (Grover's algorithm [Gro96]), which induces the needed approximate polynomial for OR. Hence, the following theorem in some sense answers their question in the affirmative:
Theorem 1.10 (Informal).
Assuming SETH (or OVC), there is no multiplicative-approximating $n^{2-\Omega(1)}$ time algorithm for $\mathbb{R}\text{-}\mathsf{MaxIP}$.
Additive Approximations for MaxIP.
Our characterization for additive approximations to MaxIP says that there is a $t$-additive-approximating $n^{2-\Omega(1)}$ time algorithm for $\mathsf{MaxIP}_{n,d}$ if and only if $t = \Omega(d)$.
Theorem 1.11.
Letting $d = \omega(\log n)$ and $t = t(n)$, the following holds:

- There is an $n^{2-\Omega(1)}$ time $t$-additive-approximating algorithm for $\mathsf{MaxIP}_{n,d}$ if $t = \Omega(d)$,
and under SETH (or OVC), there is no $n^{2-\Omega(1)}$ time $t$-additive-approximating algorithm for $\mathsf{MaxIP}_{n,d}$ if $t = o(d)$.
- Moreover, letting $\varepsilon = t/d$, there is an
time, $t$-additive-approximating randomized algorithm for $\mathsf{MaxIP}_{n,d}$ when .
The lower bound above was already established in [Rub18], while the upper bound works by reducing the problem to the $d = O(\log n)$ case via random sampling of coordinates, and then solving the reduced problem via known methods [AW15, ACW16].
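The sampling step behind this upper bound can be illustrated with a small sketch (the parameter choices and names here are ours, purely illustrative): sample coordinates uniformly with replacement, compute the inner product on the sample, and rescale. Standard concentration bounds then give an additive error of roughly $O(d/\sqrt{k})$ from $k$ samples.

```python
import random

def additive_estimate(a, b, k, rng):
    """Estimate a . b by sampling k coordinates uniformly with replacement
    and rescaling; concentration gives additive error O(d / sqrt(k)) w.h.p."""
    d = len(a)
    idx = [rng.randrange(d) for _ in range(k)]
    return d * sum(a[i] * b[i] for i in idx) / k

rng = random.Random(42)
d = 1000
a = [1] * d
b = [1] * 500 + [0] * 500   # true inner product is 500
est = additive_estimate(a, b, 2000, rng)
print(abs(est - 500) < 100)  # True: the estimate lands well within the bound
```

After sampling, the surviving instance has only $O(\log n)$ effective coordinates, which is exactly the regime where exact subquadratic algorithms are known.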
Remark 1.12.
We remark here that the lower bounds for approximating MaxIP are direct corollaries of the new protocols for Set-Disjointness in [Rub18]. Our main contribution is providing the complementary upper bounds, showing that these lower bounds are indeed tight.
All-Pair-MaxIP.
Finally, we remark that our algorithms (with slight adaptations) also work for the following stronger problem, All-Pair-MaxIP: given two sets $A$ and $B$ of $n$ vectors from $\{0,1\}^d$, for each $a \in A$ we must compute $\max_{b \in B} a \cdot b$. (Since All-Pair-MaxIP is stronger than MaxIP, lower bounds for MaxIP automatically apply to All-Pair-MaxIP.) An algorithm is $t$-multiplicative-approximating ($t$-additive-approximating) for All-Pair-MaxIP if, for all $a \in A$, it computes correspondingly approximate answers.
Corollary 1.13.
Suppose , and let
There is an time multiplicative-approximating algorithm and an time additive-approximating algorithm for All-Pair-MaxIP, when .
Hardness of Exact $\mathbb{Z}$-MaxIP in $2^{O(\log^* n)}$ Dimensions
Thirdly, we show that $\mathbb{Z}\text{-}\mathsf{MaxIP}$ is hard to solve in $n^{2-\Omega(1)}$ time, even with $2^{O(\log^* n)}$-dimensional vectors:
Theorem 1.14.
Assuming SETH (or OVC), there is a constant $c$ such that any exact algorithm for $\mathbb{Z}\text{-}\mathsf{MaxIP}_{n,d}$ with $d = c^{\log^* n}$ dimensions requires $n^{2-o(1)}$ time, with vectors of $O(\log n)$-bit entries.
As direct corollaries of the above theorem, using reductions implicit in [Wil18], we also conclude hardness for Furthest Pair and Bichromatic Closest Pair under SETH (or OVC) in $2^{O(\log^* n)}$ dimensions.
Theorem 1.15 (Hardness of Furthest Pair in $2^{O(\log^* n)}$ Dimensions).
Assuming SETH (or OVC), there is a constant $c$ such that Furthest Pair in $c^{\log^* n}$ dimensions requires $n^{2-o(1)}$ time, with vectors of $O(\log n)$-bit entries.
Theorem 1.16 (Hardness of Bichromatic Closest Pair in $2^{O(\log^* n)}$ Dimensions).
Assuming SETH (or OVC), there is a constant $c$ such that Bichromatic Closest Pair in $c^{\log^* n}$ dimensions requires $n^{2-o(1)}$ time, with vectors of $O(\log n)$-bit entries.
Improved Dimensionality Reduction for OV and Hopcroft's Problem
Our hardness result for $\mathbb{Z}\text{-}\mathsf{MaxIP}$ is established by a reduction from Hopcroft's problem, whose hardness is in turn derived from the following significantly improved dimensionality reduction for OV.
Lemma 1.17 (Improved Dimensionality Reduction for OV).
Let $1 \le \ell \le d$. There is an
reduction from $\mathsf{OV}_{n,d}$ to instances of $\mathbb{Z}\text{-}\mathsf{OV}_{n,d/\ell}$, with vectors of entries of bit-length .
Comparison with [Wil18].
Compared to the old construction in [Wil18], our reduction here is more efficient when $\ell$ is much smaller than $d$ (which is the case we care about). That is, in [Wil18], $\mathsf{OV}_{n,d}$ can be reduced to instances of $\mathbb{Z}\text{-}\mathsf{OV}_{n,d/\ell}$, while we get instances in our improved reduction. So, for example, when , the old reduction yields instances (recall that for an arbitrary constant ), while our improved one yields only instances, each with dimensions.
Theorem 1.18 (Hardness of Hopcroft's Problem in $2^{O(\log^* n)}$ Dimensions).
Assuming SETH (or OVC), there is a constant $c$ such that $\mathbb{Z}\text{-}\mathsf{OV}_{n,c^{\log^* n}}$ with vectors of $O(\log n)$-bit entries requires $n^{2-o(1)}$ time.
Connection Between $\mathbb{Z}$-MaxIP Lower Bounds and Communication Protocols
We also show a new connection between $\mathbb{Z}\text{-}\mathsf{MaxIP}$ and a special type of communication protocol. Let us first recall the Set-Disjointness problem:
Definition 1.19 (Set-Disjointness).
For $n \in \mathbb{N}$, in Set-Disjointness ($\mathsf{DISJ}_n$), Alice holds a vector $X \in \{0,1\}^n$, Bob holds a vector $Y \in \{0,1\}^n$, and they want to determine whether $X \cdot Y = 0$.
Recall that in [ARW17], the hardness of approximating MaxIP is established via a connection to communication protocols (in particular, a fast communication protocol for Set-Disjointness). Our lower bound for (exact) $\mathbb{Z}\text{-}\mathsf{MaxIP}$ can also be connected to similar protocols (note that ).
Formally, we define these protocols as follows:
Definition 1.20.
For a problem with inputs of length $n$ (Alice holds $x$ and Bob holds $y$), we say a communication protocol is an efficient communication protocol if the following holds:

- There are three parties, Alice, Bob and Merlin, in the protocol.
- Merlin sends Alice and Bob an advice string of length , which is a function of $x$ and $y$.
- Given $y$ and the advice string, Bob sends Alice bits, and Alice decides whether to accept or not. (For such protocols, one-way communication is in fact equivalent to the seemingly more powerful setting in which Alice and Bob communicate interactively [PS86].) Alice and Bob have an unlimited supply of private random coins (private, not public, which is important) during their conversation. The following conditions hold:
- If the answer is yes, there is an advice string from Merlin such that Alice always accepts.
- Otherwise, for all possible advice strings from Merlin, Alice accepts with probability at most .

Moreover, we say the protocol is computationally efficient if, in addition, the probability distributions of both Alice's and Bob's behavior can be computed in
time given their input and the advice.

Our new reduction from OV to $\mathbb{Z}\text{-}\mathsf{MaxIP}$ actually implies a super-efficient protocol for Set-Disjointness.
Theorem 1.21.
For all , there is an
communication protocol for $\mathsf{DISJ}_n$.
For example, when , Theorem 1.21 implies that there is a computationally efficient communication protocol for $\mathsf{DISJ}_n$. Moreover, we show that if the protocol of Theorem 1.21 can be improved slightly (removing the term), we would obtain the desired hardness for $\mathbb{Z}\text{-}\mathsf{MaxIP}$ in $\omega(1)$ dimensions.
Theorem 1.22.
Assuming SETH (or OVC), if there is an increasing and unbounded function $f$ such that for all , there is an
communication protocol for $\mathsf{DISJ}_n$, then $\mathbb{Z}\text{-}\mathsf{MaxIP}_{n,\omega(1)}$ requires $n^{2-o(1)}$ time with vectors of $O(\log n)$-bit entries. The same holds for Furthest Pair and Bichromatic Closest Pair.
Improved MA Protocols for Set-Disjointness
Finally, we also obtain a new MA protocol for Set-Disjointness, which improves on the previous $O(\sqrt{n} \log n)$-communication protocol in [AW09], and moves closer to the $\Omega(\sqrt{n})$ lower bound of [Kla03]. Like the protocol in [AW09], our new protocol also works for the following slightly harder problem, Inner Product.
Definition 1.23 (Inner Product).
For $n \in \mathbb{N}$, in Inner Product ($\mathsf{IP}_n$), Alice holds a vector $X \in \{0,1\}^n$, Bob holds a vector $Y \in \{0,1\}^n$, and they want to compute $X \cdot Y$.
Theorem 1.24.
There is an MA protocol for $\mathsf{DISJ}_n$ and $\mathsf{IP}_n$ with communication complexity $O(\sqrt{n \log n \log\log n})$.
In [Rub18], the author asked whether the MA communication complexity of DISJ (IP) is $\Theta(\sqrt{n})$ or $\Theta(\sqrt{n}\log n)$, and suggested that $\sqrt{n}\log n$ may be necessary for IP. Our result makes progress on that question by showing that the true complexity lies between $\Omega(\sqrt{n})$ and $O(\sqrt{n \log n \log\log n})$.
1.2 Intuition for the Dimensionality Self-Reduction for OV
The $2^{O(\log^* d)}$ factor in Lemma 1.17 is not common in theoretical computer science (other examples of such bounds include an algorithm from [Mat93], the fast integer multiplication algorithms of Fürer and their modifications [Für09, CT15, HVDHL16], and an old algorithm for Klee's measure problem [Cha08]), and our new reduction for OV is considerably more complicated than the polynomial-based construction from [Wil18]. Hence, it is worth discussing the intuition behind Lemma 1.17, and the reason why we incur a factor of $2^{O(\log^* d)}$.
A Direct Chinese Remainder Theorem Based Approach.
We first discuss a direct reduction based on the Chinese Remainder Theorem (CRT) (see Theorem 2.5 for a formal statement). CRT says that given a collection of distinct primes $p_1, \dots, p_\ell$ and a collection of integers $r_1, \dots, r_\ell$ with $0 \le r_j < p_j$, there exists a unique integer $0 \le m < \prod_j p_j$ such that $m \equiv r_j \pmod{p_j}$ for each $j$; we write $m = \mathsf{CRR}(r_1, \dots, r_\ell)$ (CRR stands for Chinese Remainder Representation).
Now, suppose we would like to have a dimensionality reduction from $\mathsf{OV}_{n,d}$ to $\mathbb{Z}\text{-}\mathsf{OV}_{n,d/\ell}$. We can partition an input vector $x \in \{0,1\}^d$ into $d/\ell$ blocks, each of length $\ell$, and represent each block via CRT: that is, for a block $x^{(i)} = (x^{(i)}_1, \dots, x^{(i)}_\ell)$, we map it into the single integer $\psi(x^{(i)}) := \mathsf{CRR}(x^{(i)}_1, \dots, x^{(i)}_\ell)$, and the concatenation of the $\psi(x^{(i)})$'s over all blocks of $x$ is $\psi(x)$.
The key idea here is that, modulo each prime $p_j$, the product $\psi(x^{(i)}) \cdot \psi(y^{(i)})$ is simply $x^{(i)}_j \cdot y^{(i)}_j$. That is, the multiplication of two integers simulates the coordinate-wise multiplication of the two vectors $x^{(i)}$ and $y^{(i)}$!
Therefore, if we make all the primes large enough, we can in fact determine the coordinate products from $\psi(x) \cdot \psi(y)$, by looking at the product modulo $p_j$ for each $j$.
Hence, let $V$ be the set of all integers $v$ in the appropriate range such that for all $j$, we have $v \equiv 0 \pmod{p_j}$.
The reduction is completed by enumerating all integers $v \in V$, and appending corresponding values to $\psi(x)$ and $\psi(y)$ so that the resulting integer vectors are orthogonal exactly when $\psi(x) \cdot \psi(y) = v$ (this step is from [Wil18]).
Note that a nice property of $\psi$ is that each $\psi(x^{(i)})$ only depends on the $i$-th block of $x$, and the mapping is the same on each block; we call this the block mapping property.
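The identity underlying this packing is easy to check in code. Below is a minimal sketch (function names are ours, purely illustrative): it packs two 0/1 blocks into integers via CRR, and verifies that the product of the packed integers reduces, modulo each prime, to the coordinate-wise product of the original blocks.

```python
def crr(residues, primes):
    """Chinese Remainder Representation: the unique m in [0, prod(primes))
    with m = residues[j] (mod primes[j]) for every j."""
    M = 1
    for p in primes:
        M *= p
    m = 0
    for r, p in zip(residues, primes):
        Mp = M // p
        m = (m + r * Mp * pow(Mp, -1, p)) % M  # pow(Mp, -1, p): inverse mod p
    return m

primes = [101, 103, 107]          # all comfortably larger than the block entries
x_block = [1, 0, 1]
y_block = [1, 1, 1]
X, Y = crr(x_block, primes), crr(y_block, primes)

# Multiplying the packed integers multiplies the blocks coordinate-wise mod p_j:
recovered = [(X * Y) % p for p in primes]
print(recovered)  # [1, 0, 1] -- the coordinate-wise product of the two blocks
```

This is exactly the "one integer multiplication simulates a whole block of coordinate multiplications" trick that drives the dimensionality reduction.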
Analysis of the Direct Reduction.
To continue building intuition, let us analyze the above reduction. The size of $V$ is the number of instances we create. The primes have to be all distinct, and it follows that their product is $2^{\Theta(\ell \log \ell)}$. Since we want to create at most $n^{o(1)}$ instances (or $n^{\delta}$ for an arbitrarily small constant $\delta$), we must set $\ell$ accordingly. Moreover, to base our hardness on OVC, which deals with $c \log n$-dimensional vectors, we need to set $d = c \log n$ for an arbitrary constant $c$. Therefore, the above reduction only obtains the same hardness result as [Wil18].
Key Observation: “Most of the Space Modulo $p_j$” is Actually Wasted.
To improve the above reduction, we need to make $|V|$ smaller. Our key observation is that the primes $p_j$ are mostly quite large, yet each residue $x^{(i)}_j$ only takes values in $\{0,1\}$. Hence, “most of the space modulo $p_j$” is actually wasted.
Make More “Efficient” Use of the “Space”: A Recursive Reduction.
Based on the previous observation, we want to use the “space modulo $p_j$” more efficiently. It is natural to consider a recursive reduction. We will require all our primes to be larger than . Let be a very small integer compared to , and let , with a set and a block mapping , be a similar reduction on a much smaller input: for , . We also require here that for all and .
For an input and a block of , our key idea is to partition again into “micro” blocks, each of size . For a block in , let be its micro blocks; we map into an integer .
Now, given two blocks , we can see that
That is, is in fact equal to , where is the concatenation of the -th micro blocks of each block of , and is defined similarly. Hence, we can determine whether from for all , and therefore also determine whether from .
We can now observe that , which is smaller than before; thus we get an improvement, depending on how large can be. Clearly, the reduction can also be constructed from even smaller reductions, and after recursing $O(\log^* d)$ times, we can switch to the direct construction discussed before. By a straightforward (but tedious) calculation, we can derive Lemma 1.17.
High-Level Explanation of the $2^{O(\log^* d)}$ Factor.
Ideally, we would like a reduction from OV to $\mathbb{Z}\text{-}\mathsf{OV}$ with only instances; in other words, we want . The reason we need to pay an extra factor in the exponent is as follows:
In our reduction, is at least , which is also the bound on each coordinate of the reduction: equals a encoding of a vector with , whose value can be as large as . That is, all we want is to control the upper bound on the coordinates of the reduction.
Suppose we are constructing an “outer” reduction from a “micro” reduction with coordinate upper bound (), and let (that is, is the extra factor compared to the ideal case). Recall that we have to ensure to make our construction work, and therefore we have to set larger than .
Then the coordinate upper bound for the outer reduction becomes . Therefore, we can see that after one recursion, the “extra factor” at least doubles. Since our recursion proceeds for $O(\log^* d)$ rounds, we have to pay an extra factor of $2^{O(\log^* d)}$ in the exponent.
1.3 Related Works
SETH-based Conditional Lower Bounds.
SETH is one of the most fruitful conjectures in Fine-Grained Complexity. There are numerous conditional lower bounds based on it for problems in P across different areas, including: dynamic data structures [AV14], computational geometry [Bri14, Wil18, DKL16], string problems [AVW14, BI15, BI16, BGL16, BK18], and graph algorithms [RV13, GIKW17, AVY15, KT17]. See [Vas18] for a very recent survey on SETH-based lower bounds (and more).
Communication Complexity and Conditional Hardness.
The connection between communication protocols (in various models) for Set-Disjointness and SETH dates back at least to [PW10], in which it is shown that a sublinear, computationally efficient protocol for 3-party Number-On-Forehead Set-Disjointness would refute SETH. It is also worth mentioning that [AR18]'s result builds on the IP communication protocol for Set-Disjointness in [AW09].
Distributed PCP.
Hardness of Approximation in P.
Making use of Chebyshev embeddings, [APRS16] proved an inapproximability lower bound on $\mathbb{R}\text{-}\mathsf{MaxIP}$ (which is improved by our Theorem 1.10). [AB17] take an approach different from Distributed PCP, and show that under certain complexity assumptions, LCS does not have a deterministic approximation in $n^{2-\Omega(1)}$ time. They also establish a connection with circuit lower bounds, and show that the existence of such a deterministic algorithm implies that $\mathsf{E}^{\mathsf{NP}}$ does not have non-uniform linear-size Valiant series-parallel circuits. In [AR18], this is improved to show that any constant-factor deterministic approximation algorithm for LCS in $n^{2-\Omega(1)}$ time implies that $\mathsf{E}^{\mathsf{NP}}$ does not have non-uniform linear-size circuits. See [ARW17] for more related results on hardness of approximation in P.
Organization of the Paper
In Section 2, we introduce the necessary preliminaries for this paper. In Section 3, we prove our characterizations of approximating MaxIP and other related results. In Section 4, we prove the $2^{O(\log^* n)}$-dimensional hardness of $\mathbb{Z}\text{-}\mathsf{MaxIP}$ and other related problems. In Section 5, we establish the connection between communication protocols and SETH-based lower bounds for exact $\mathbb{Z}\text{-}\mathsf{MaxIP}$. In Section 6, we present our protocol for Set-Disjointness.
2 Preliminaries
We begin by introducing some notation. For an integer $n$, we use $[n]$ to denote the set of integers from $1$ to $n$. For a vector $u$, we use $u_i$ to denote the $i$-th element of $u$.
We use $\log x$ to denote the logarithm of $x$ base $2$ (with ceilings applied as appropriate), and $\ln x$ to denote the natural logarithm of $x$.
In our arguments, we use the iterated logarithm function $\log^* n$, which is defined recursively as follows: $\log^* n = 0$ for $n \le 1$, and $\log^* n = 1 + \log^*(\log n)$ otherwise.
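The recursive definition translates directly into code (a sketch using base-2 logarithms):

```python
import math

def log_star(n):
    """Iterated logarithm: 0 if n <= 1, else 1 + log*(log2 n)."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

# 2^65536 -> 65536 -> 16 -> 4 -> 2 -> 1: five applications of log2.
print(log_star(2 ** 65536))  # 5
```

The function grows so slowly that for every practically conceivable input size, $\log^* n \le 5$, which is why a $2^{O(\log^* n)}$-dimensional lower bound is "almost" a constant-dimensional one.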
2.1 Fast Rectangular Matrix Multiplication
Similar to previous algorithms using the polynomial method, our algorithms make use of fast rectangular matrix multiplication.
Theorem 2.1 ([GU18]).
There is an $N^{2+o(1)}$ time algorithm for multiplying two matrices $A$ and $B$ of sizes $N \times N^{\alpha}$ and $N^{\alpha} \times N$, where $\alpha > 0.31389$.
Theorem 2.2 ([Cop82]).
There is an $N^{2} \cdot \mathrm{polylog}(N)$ time algorithm for multiplying two matrices $A$ and $B$ of sizes $N \times N^{\alpha}$ and $N^{\alpha} \times N$, where $\alpha > 0.172$.
2.2 Number Theory
Here we recall some facts from number theory. In our reduction from OV to $\mathbb{Z}\text{-}\mathsf{OV}$, we will apply the famous prime number theorem, which supplies a good estimate of the number of primes smaller than a given number. See e.g. [Apo13] for a reference on this topic.
Theorem 2.3 (Prime Number Theorem).
Let $\pi(n)$ be the number of primes $\le n$. Then we have
$$\lim_{n \to \infty} \frac{\pi(n)}{n / \ln n} = 1.$$
From a simple calculation, we obtain:
Lemma 2.4.
There are at least $\frac{n}{2 \ln n}$ distinct primes in $(n, 2n]$, for all large enough $n$.
Proof.
For a large enough $n$, by the prime number theorem, the number of primes in $(n, 2n]$ is
$$\pi(2n) - \pi(n) = (1+o(1)) \frac{2n}{\ln 2n} - (1+o(1)) \frac{n}{\ln n} \ge \frac{n}{2 \ln n}.$$
∎
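A quick numerical sanity check of the prime number theorem's estimate, using a simple sieve (illustrative only):

```python
from math import log

def primes_up_to(n):
    """Sieve of Eratosthenes; returns the number of primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(sieve[i * i :: i])
    return sum(sieve)

n = 10 ** 6
pi_n = primes_up_to(n)
print(pi_n, round(n / log(n)))  # 78498 vs the PNT estimate 72382
```

The estimate undercounts by a lower-order factor, which is fine for the purposes of the lemma: all that is needed is an abundant supply of distinct primes of controlled size.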
Next, we recall the Chinese Remainder Theorem and the Chinese remainder representation.
Theorem 2.5.
Given pairwise coprime integers $p_1, \dots, p_k$ and integers $r_1, \dots, r_k$ with $0 \le r_i < p_i$, there is exactly one integer $0 \le m < \prod_{i=1}^{k} p_i$ such that
$$m \equiv r_i \pmod{p_i} \quad \text{for all } i \in [k].$$
We call $m$ the Chinese remainder representation (or the CRR encoding) of the $r_i$'s (with respect to the $p_i$'s). We also denote
$$m = \mathsf{CRR}(r_1, r_2, \dots, r_k)$$
for convenience. We sometimes omit the sequence $p_1, \dots, p_k$ for simplicity, when it is clear from the context.
Moreover, $\mathsf{CRR}(r_1, \dots, r_k)$ can be computed in time polynomial in the total number of bits of all the given integers.
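The encoding is constructive: the standard proof of CRT directly yields a polynomial-time algorithm, and decoding is just reduction modulo each $p_i$. A minimal sketch (names are ours):

```python
from functools import reduce

def crr_encode(rs, ps):
    """Return the unique m in [0, prod(ps)) with m = rs[i] (mod ps[i])."""
    M = reduce(lambda a, b: a * b, ps)
    m = 0
    for r, p in zip(rs, ps):
        Mp = M // p
        m = (m + r * Mp * pow(Mp, -1, p)) % M  # pow(x, -1, p): modular inverse
    return m

def crr_decode(m, ps):
    return [m % p for p in ps]

ps = [3, 5, 7]
m = crr_encode([2, 3, 1], ps)
print(m, crr_decode(m, ps))  # 8 [2, 3, 1]
```

All arithmetic is on integers whose bit-length is bounded by the total input size, which is what makes the encoding polynomial-time.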
2.3 Communication Complexity
In this paper we will make use of a certain kind of protocol, which we call efficient protocols. (Our notation here is adopted from [KLM17]; they also define similar multi-party communication protocols, while we only discuss two-party protocols in this paper.)
Definition 2.6.
We say a protocol is efficient for a communication problem if, in the protocol:

- There are three parties, Alice, Bob and Merlin; Alice holds input $x$ and Bob holds input $y$.
- Merlin sends an advice string of length to Alice, which is a function of $x$ and $y$.
- Alice and Bob jointly toss coins to obtain a random string of length .
- Given $y$ and the random string, Bob sends Alice a message of length .
- After that, Alice decides whether to accept or not.
- When the answer is yes, Merlin has exactly one advice string such that Alice always accepts.
- When the answer is no, or Merlin sends a wrong advice string, Alice accepts with probability at most .
2.4 Derandomization
We make use of expander graphs to reduce the number of random coins needed in one of our communication protocols. We abstract the following result for our use here.
Theorem 2.7 (see e.g. Theorem 21.12 and Theorem 21.19 in [AB09]).
Let $n$ be an integer, and set . Suppose . There is a universal constant $c$ such that for all , there is a time computable function , such that
here means is one of the elements in the sequence .
3 Hardness of Approximate MaxIP
In this section we prove our characterizations of approximating MaxIP.
3.1 The Multiplicative Case
We begin with the proof of Theorem 1.5, which we recap here for convenience.
Reminder of Theorem 1.5. Letting $d = \omega(\log n)$ and $t = t(n)$, the following holds:

- There is an $n^{2-\Omega(1)}$ time $t$-multiplicative-approximating algorithm for $\mathsf{MaxIP}_{n,d}$ if $t = (d/\log n)^{\Omega(1)}$,
and under SETH (or OVC), there is no $n^{2-\Omega(1)}$ time $t$-multiplicative-approximating algorithm for $\mathsf{MaxIP}_{n,d}$ if $t = (d/\log n)^{o(1)}$.
- Moreover, let $\varepsilon = \log t / \log(d/\log n)$. There are $t$-multiplicative-approximating deterministic algorithms for $\mathsf{MaxIP}_{n,d}$ running in time
or time
In Lemma 3.2, we construct the desired approximation algorithms, and afterwards we prove the matching lower bound.
The Algorithm
First we need the following simple lemma, which says that the $r$-th root of the sum of the $r$-th powers of nonnegative reals gives a good approximation to their maximum.
Lemma 3.1.
Let $x_1, \dots, x_n$ be nonnegative real numbers, let $r$ be a positive integer, and let $M = \max_{i \in [n]} x_i$. We have
$$M \le \left( \sum_{i=1}^{n} x_i^r \right)^{1/r} \le n^{1/r} \cdot M.$$
Proof.
Since
$$M^r \le \sum_{i=1}^{n} x_i^r \le n \cdot M^r,$$
the lemma follows directly by taking the $r$-th root of all sides.
∎
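Lemma 3.1 is easy to verify numerically (an illustrative sketch):

```python
def soft_max(xs, r):
    """( sum x_i^r )^(1/r): an n^(1/r)-multiplicative approximation of max(xs)."""
    return sum(x ** r for x in xs) ** (1.0 / r)

xs = [3.0, 1.0, 2.0, 3.0]
n, r = len(xs), 10
s = soft_max(xs, r)
print(max(xs) <= s <= n ** (1.0 / r) * max(xs))  # True
```

The point of the lemma is that once $r$ is moderately large compared to $\log n$, the approximation factor $n^{1/r}$ becomes small, so computing sums of $r$-th powers over many pairs, which is amenable to the polynomial method, suffices to approximate the maximum.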
Lemma 3.2.
Assuming $d = \omega(\log n)$ and letting $\varepsilon = \log t / \log(d/\log n)$, there are $t$-multiplicative-approximating deterministic algorithms for $\mathsf{MaxIP}_{n,d}$ running in time
or time
Proof.
Let . From the assumption, we have , and . When , we simply use a multiplicative-approximating algorithm instead; hence in the following we assume . We begin with the first algorithm.
Construction and Analysis of the Power-of-Sum Polynomial.
Let $r$ be a parameter to be specified later, and let $z$ be a vector from $\{0,1\}^d$; consider the following polynomial:
$$P(z) := \left( \sum_{i=1}^{d} z_i \right)^{r}.$$
Observe that since each $z_i$ takes values in $\{0,1\}$, we have $z_i^k = z_i$ for all $k \ge 1$. Therefore, by expanding out the polynomial and replacing every power $z_i^k$ with $k \ge 2$ by $z_i$, we can write $P$ as a multilinear polynomial
in which , and the ’s are the corresponding coefficients. Note that $P$ has at most $\sum_{k=0}^{r} \binom{d}{k}$ terms.
Then consider $P(x_1 y_1, x_2 y_2, \dots, x_d y_d)$; plugging in $z_i = x_i y_i$, it can be written as
where , and is defined similarly.
Construction and Analysis of the Batch Evaluation Polynomial.
Now, let and be two sets of vectors from $\{0,1\}^d$; we define
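Although the section breaks off here, the core mechanism of the polynomial method can be illustrated at toy scale (this is our own simplified sketch, not the paper's exact construction): the $r$-th power of an inner product is itself an inner product of $r$-fold tensor powers, so computing $(a \cdot b)^r$ for many pairs at once reduces to a single (rectangular) matrix product, and by Lemma 3.1 sums of such powers approximate the maximum inner product.

```python
from itertools import product

def tensor_power(v, r):
    """r-fold tensor power of v: one coordinate per index tuple (i_1, ..., i_r),
    with value v[i_1] * ... * v[i_r]."""
    out = []
    for idx in product(range(len(v)), repeat=r):
        p = 1
        for i in idx:
            p *= v[i]
        out.append(p)
    return out

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

A = [(1, 0, 1), (0, 1, 1)]
B = [(1, 1, 0), (1, 0, 1)]
r = 3

# (a.b)^r = phi(a).phi(b): the n x n matrix of r-th powers of inner products
# is one matrix product between the n x d^r matrices of tensor powers.
powers_direct = [[dot(a, b) ** r for b in B] for a in A]
powers_tensor = [[dot(tensor_power(a, r), tensor_power(b, r)) for b in B] for a in A]
print(powers_direct == powers_tensor)  # True
```

In the actual algorithm, the multilinear collapse described above keeps the number of monomials (and hence the width of the rectangular matrices) small enough for fast rectangular matrix multiplication to apply.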