# A Subquadratic Algorithm for 3XOR

Given a set $X$ of $n$ binary words of equal length $w$, the 3XOR problem asks for three elements $a, b, c \in X$ such that $a \oplus b = c$, where $\oplus$ denotes the bitwise XOR operation. The problem can be easily solved on a word RAM with word length $w$ in time $O(n^2 \log n)$. Using Han's fast integer sorting algorithm (2002/2004) this can be reduced to $O(n^2 \log\log n)$. With randomization or a sophisticated deterministic dictionary construction, creating a hash table for $X$ with constant lookup time leads to an algorithm with (expected) running time $O(n^2)$. At present, seemingly no faster algorithms are known. We present a surprisingly simple deterministic, quadratic time algorithm for 3XOR. Its core is a version of the Patricia trie for $X$, which makes it possible to traverse the set $a \oplus X$ in ascending order for arbitrary $a \in \{0,1\}^w$ in linear time. Furthermore, we describe a randomized algorithm for 3XOR with expected running time $O(n^2 \cdot \min\{\log^3 w / w, (\log\log n)^2 / \log^2 n\})$. The algorithm transfers techniques to our setting that were used by Baran, Demaine, and Pătraşcu (2005/2008) for solving the related int3SUM problem (the same problem with integer addition in place of binary XOR) in expected time $o(n^2)$. As suggested by Jafargholi and Viola (2016), linear hash functions are employed. The latter authors also showed that assuming 3XOR needs expected running time $n^{2-o(1)}$ one can prove conditional lower bounds for triangle enumeration just as with 3SUM. We demonstrate that 3XOR can be reduced to other problems as well, treating the examples offline SetDisjointness and offline SetIntersection, which were studied for 3SUM by Kopelowitz, Pettie, and Porat (2016).


## 1 Introduction

The 3XOR problem [15] is the following: Given a set $X \subseteq \{0,1\}^w$ of $n$ binary strings of equal length $w$, are there elements $a, b, c \in X$ such that $a \oplus b = c$, where $\oplus$ is bitwise XOR? We work with the word RAM [9] model with word length $w = \Omega(\log n)$, and we assume as usual that one input string fits into one word. Then, using sorting, the problem can easily be solved in time $O(n^2 \log n)$. Using Han's fast integer sorting algorithm [14] the time can be reduced to $O(n^2 \log\log n)$. In order to achieve quadratic running time, one could utilize a randomized dictionary for $X$ with expected linear construction time and constant lookup time (like in [8]) or (weakly non-uniform, quite complicated) deterministic static dictionaries with construction time $O(n \log n)$ and constant lookup time as provided in [13]. Once such a dictionary is available, one just has to check whether $a \oplus b \in X$, for all pairs $a, b \in X$. No subquadratic algorithms seem to be known.

It is natural to compare the situation with that for the 3SUM problem, which is as follows: Given a set $X$ of $n$ real numbers, are there $a, b, c \in X$ such that $a + b = c$? (There are many different, but equivalent versions of 3SUM and 3XOR, differing in the way the input elements are grouped. Often one sees the demand that the three elements $a$, $b$, and $c$ with $a + b = c$ or $a \oplus b = c$, resp., come from different sets.) There is a very simple quadratic time algorithm for this problem (see Section 3 below). After a randomized subquadratic algorithm was suggested by Grønlund Jørgensen and Pettie [16], improvements ensued [10, 12], and recently Chan [5] gave the fastest deterministic algorithm known, with a running time of $O(n^2 (\log\log n)^{O(1)} / \log^2 n)$. The restricted version where the input consists of integers whose bit length does not exceed the word length $w$ is called int3SUM. The currently best randomized algorithm for int3SUM was given by Baran, Demaine, and Pătraşcu [2, 3]; it runs in expected time $O(n^2 \cdot \min\{\log^2 w / w, (\log\log n)^2 / \log^2 n\})$ for $w = O(n \log n)$. The 3SUM problem has received a lot of attention in recent years, because it can be used as a basis for conditional lower time bounds for problems e.g. from computational geometry and data structures [11, 18, 22]. Because of this property, 3SUM is at the center of attention of papers dealing with low-level complexity. Chan and Lewenstein [6] give upper bounds for inputs with a certain structure. Kane, Lovett, and Moran [17] prove near-optimal upper bounds for linear decision trees. Wang [24] considers randomized algorithms for subset sum, trying to minimize the space, and Lincoln et al. [19] investigate time-space tradeoffs in deterministic algorithms for $k$-SUM.

In contrast, 3XOR received relatively little attention, before Jafargholi and Viola [15] studied 3XOR and described techniques for reducing this problem to triangle enumeration. In this way they obtained conditional lower bounds in a way similar to the conditional lower bounds based on int3SUM.

The main results of this paper are the following: We present a surprisingly simple deterministic algorithm for 3XOR that runs in time $O(n^2)$. When $X$ is given in sorted order, it constructs in linear time a version of the Patricia trie [21] for $X$, using only word operations and not looking at single bits at all. This tree then makes it possible to traverse the set $a \oplus X$ in ascending order in linear time, for arbitrary $a \in \{0,1\}^w$. This is sufficient for achieving running time $O(n^2)$. The second result is a randomized algorithm for 3XOR that runs in time $O(n^2 \cdot \min\{\log^3 w / w, (\log\log n)^2 / \log^2 n\})$ for $w = O(n \log n)$, which is almost the same bound as that of [2] for int3SUM. Finding a deterministic algorithm for 3XOR with subquadratic running time remains an open problem. Finally, we reduce 3XOR to offline SetDisjointness and offline SetIntersection, establishing conditional lower bounds (as in [18] conditioned on the int3SUM conjecture).

Unfortunately, no (non-trivial) relation between the required (expected) time for 3SUM and 3XOR is known. In particular, we cannot exclude the case that one of these problems can be solved in (expected) time $O(n^{2-\varepsilon})$ for some constant $\varepsilon > 0$ whereas the other one requires (expected) time $n^{2-o(1)}$. Actually, this possibility is the background of some conditional statements on the cost of listing triangles in graphs in [15, Cor. 2]. However, due to the similarity of 3XOR to 3SUM, the question arises whether the recent results on 3SUM can be transferred to 3XOR.

In Section 2, we review the word RAM model and examine 1-universal classes of linear hash functions. In particular, we determine the evaluation cost of such hash functions and we restate a hashing lemma [2] on the expected number of elements in "overfull" buckets. Furthermore, we state how fast one can solve the set intersection problem on word-packed arrays (with details given in the appendix). In Section 3, we construct a special enhanced binary search tree $T_X$ to represent a set $X$ of binary strings of fixed length. This representation makes it possible to traverse the set $a \oplus X$ in ascending order for any $a \in \{0,1\}^w$ in linear time, which leads to a simple deterministic algorithm for 3XOR that runs in time $O(n^2)$. Then, we turn to randomized algorithms and show how to solve 3XOR in subquadratic expected time in Section 4: $O(n^2 \cdot \min\{\log^3 w / w, (\log\log n)^2 / \log^2 n\})$ for $w = O(n \log n)$, and $O(n \log^2 n)$ for $w = \Omega(n \log n)$. Our approach uses the ideas of the subquadratic expected time algorithm for int3SUM presented in [2], i.e., computing buckets and fingerprints, word packing, exploiting word-level parallelism, and using lookup tables. Altogether, we get the same expected running time for $w = O(\log^2 n \log\log n)$ and a word-length-dependent upper bound on the expected running time for $w = \Omega(\log^2 n \log\log n)$ that is worse by a $\log w$ factor in comparison to the int3SUM setting. Based on these results and the similarity of 3XOR to 3SUM, it seems natural to conjecture that 3XOR requires expected time $n^{2-o(1)}$, too, and so 3XOR is a candidate for reductions to other computational problems just as 3SUM. In Section 5, we describe how to reduce 3XOR to offline SetDisjointness and offline SetIntersection, transferring the results of [18] from 3SUM to 3XOR.

Recently, Bouillaguet et al. [4] studied algorithms "for the 3XOR problem". This is related to our setting, but not identical. These authors study a variant of the "generalized birthday problem", well known in cryptography as a problem to which some attacks on cryptosystems can be reduced, see [4]. Translated into our notation, their question is: Given a random set $X \subseteq \{0,1\}^w$ of size $n$, find, if possible, three different strings $a, b, c \in X$ such that $a \oplus b = c$. Adapting the algorithm from [2], these authors achieve a running time of $O(n^2 (\log\log n)^2 / \log^2 n)$, which corresponds to the running time of our algorithm for short word lengths. The difference to our situation is that their input is random. This means that the issue of 1-universal families of linear hash functions disappears (a projection of the elements in $X$ on some bit positions does the job) and that complications from weak randomness are absent (e.g., one can use projection into relatively small buckets and use Chernoff bounds to prove that the load is very even with high probability). This means that the algorithm described in [4] does not solve our version of the 3XOR problem.

## 2 Preliminaries

### 2.1 The Word RAM Model

As is common in the context of fast algorithms for the int3SUM problem [2], we base our discussion on the word RAM model [9]. This is characterized by a word length $w$. Each memory cell can store $w$ bits, interpreted as a bit string or an integer or a packed sequence of subwords, as is convenient. The word length is assumed to be at least $\log n$ and at least the bit length of a component of the input. It is assumed that the operations of the multiplicative instruction set, i.e., arithmetic operations (addition, subtraction, multiplication), word operations (left shift, right shift), bitwise Boolean operations (AND, OR, NOT, XOR), and random memory accesses can be executed in constant time. We will write $\oplus$ to denote the bitwise XOR operation. A randomized word RAM also provides an operation that in constant time generates a uniformly random value in $\{0, \dots, v-1\}$ for any given $v \le 2^w$.

### 2.2 Linear Hash Functions

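We consider hash functions $h\colon U \to M$, where the domain ("universe") is $U = \{0,1\}^\ell$ and the range is $M = \{0,1\}^\mu$ with $\mu \le \ell$. Both universe and range are vector spaces over $\mathbb{Z}_2$. In [2] and in successor papers on int3SUM "almost linear" hash functions based on integer multiplication and truncation were used, as can be found in [7]. As noted in [15], in the 3XOR setting the situation is much simpler. We may use $\mathcal{H}^{\mathrm{lin}}_{\ell,\mu}$, the set of all $\mathbb{Z}_2$-linear functions from $\{0,1\}^\ell$ to $\{0,1\}^\mu$. A function $h_A$ from this family is described by a matrix $A \in \{0,1\}^{\mu \times \ell}$, and given by $h_A(x) = A \cdot x$, where $x$ and $h_A(x)$ are written as column vectors. For all hash functions $h \in \mathcal{H}^{\mathrm{lin}}_{\ell,\mu}$ and all $x, y \in U$ we have $h(x \oplus y) = h(x) \oplus h(y)$, by the very definition of linearity. Further, this family is 1-universal, indeed, we have $\Pr_h[h(x) = h(y)] = \Pr_h[h(x \oplus y) = 0] = 2^{-\mu}$, for all pairs $x \ne y$ of different keys in $U$. We remark that the convolution class described in [20], a subfamily of $\mathcal{H}^{\mathrm{lin}}_{\ell,\mu}$, can be used as well, as it is also 1-universal, and needs only $\ell + \mu - 1$ random bits.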
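To make the linearity property concrete, the following is a minimal Python sketch of a random $\mathbb{Z}_2$-linear hash function, not the word RAM evaluation discussed in the paper. The name `random_linear_hash` and the row-wise storage of the matrix are assumptions made for illustration; Python integers stand in for bit strings.

```python
import random

def random_linear_hash(in_bits, out_bits, rng):
    """Return h(x) = A*x over GF(2) for a random out_bits x in_bits matrix A.

    Hypothetical helper for illustration: each row of A is stored as an
    in_bits-bit integer.
    """
    rows = [rng.getrandbits(in_bits) for _ in range(out_bits)]

    def h(x):
        value = 0
        for i, row in enumerate(rows):
            # Bit i of h(x) is the parity of the bitwise AND of row i and x.
            parity = bin(row & x).count("1") & 1
            value |= parity << i
        return value

    return h

rng = random.Random(2018)
h = random_linear_hash(in_bits=16, out_bits=4, rng=rng)
a, b = 0xBEEF, 0x1234
# Linearity: hashing commutes with XOR.
linear = h(a ^ b) == h(a) ^ h(b)
```

Because every output bit is a parity of selected input bits, `h(a ^ b) == h(a) ^ h(b)` holds for every choice of the matrix, which is exactly what the bucketing and fingerprinting steps rely on.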

The universe we consider here is $U = \{0,1\}^w$. The time for evaluating a hash function $h_A$ on one or on several inputs depends on the instruction set and on the way $A$ is stored. In contrast to the int3SUM setting [2], we are not able to calculate hash values in constant time.

###### Lemma .

For and inputs from we have:
(a) can be calculated in time , if of -bit words is a constant time operation.
(b) can always be calculated in time .
(c) can be evaluated in time .

###### Proof.

(Sketch.) Assume . For (a) we store the rows of as -bit strings, and obtain each bit of the hash value by a bitwise operation followed by . For (b) we assume the columns of are stored as -bit blocks, in words. An evaluation is effected by selecting the columns indicated by the 1-bits of and calculating the of these vectors in a word-parallel fashion. In rounds, these vectors are added, halving the number of vectors in each round. For (c), we first pack the columns selected for the input strings into words and then carry out the calculation indicated in (b), but simultaneously for all and within as few words as possible. This makes it possible to further exploit word-level parallelism, if should be much smaller than . ∎

We shall use linear, 1-universal hashing for splitting the input set into buckets and for replacing keys by fingerprints in Section 4.

In the following, we will apply Section 2.2(c) to map binary strings of length to hash values of length in time . Since will dominate the running time only for huge word lengths, we assume in the rest of the paper that and that all hash values can be calculated in time .

When randomization is allowed, we will assume that we have constructed in expected time $O(n)$ a standard hash table for input set $X$ with constant lookup time [8]. (Arbitrary 1-universal classes can be used for this.)

### 2.3 A Hashing Lemma for 1-Universal Families

A hash family $H$ of functions from $U$ to $M$ is called 1-universal if $\Pr_{h \in H}[h(x) = h(y)] \le 1/|M|$ for all $x, y \in U$ with $x \ne y$. We map a set $S \subseteq U$ with $|S| = n$ into $M$ with $|M| = m$ by a random element $h \in H$. In [2, Lemma 4] it was noted that for 1-universal families the expected number of keys that collide with more than $t$ other keys can be bounded in terms of $n$, $m$, and $t$. We state a slightly stronger version of that lemma. (The strengthening is not essential for the application in the present paper.)

###### Lemma (slight strengthening of Lemma 4 in [2]).

Let be a 1-universal class of hash functions from to , with , and let with . Choose uniformly at random. For define . Then for we have:

 Eh∈H[|{x∈S∣|Bh(x)|≥t}|]

(The bound in [2] was about twice as large. The proof is given in Section A.1.)

In our algorithm, we will be interested in the number of elements in buckets with size at least three times the expectation. Choosing $t = 3n/m$ in the lemma above, we conclude that the expected number of such elements is smaller than the number of buckets.

###### Corollary .

In the setting of the lemma above we have $\mathbb{E}_{h \in H}\bigl[\,|\{x \in S \mid |B_h(x)| \ge 3n/m\}|\,\bigr] < m$.

### 2.4 Set Intersection on Unsorted Word-Packed Arrays

We consider the problem “set intersection on unsorted word-packed arrays”: Assume and are such that , and that two words and are given that both contain many -bit strings: contains and contains . We wish to determine whether is empty or not and find an element in the intersection if it is nonempty.

In [3, proof of Lemma 3] a similar problem is considered: It is assumed that the first sequence is sorted and the second is bitonic, meaning that it is a cyclic rotation of a sequence that first grows and then falls. In this case one sorts the second sequence by a word-parallel version of bitonic merge (time $O(\log k)$ for $k$ packed elements), and then merges the two sequences into one sorted sequence (again in time $O(\log k)$). Identical elements now stand next to each other, and it is not hard to identify them. We can use a slightly slower modification of the approach of [3]: We sort both sequences by word-packed bitonic sort [1], which takes time $O(\log^2 k)$, and then proceed as before. (It is this slower version of packed intersection that causes our randomized 3XOR algorithm to be a little slower than the int3SUM algorithm for large word lengths.) We obtain the following result.

###### Lemma .

Assume , and assume that two sequences of -bit strings, each of length , are given. Then the entries that occur in both sequences can be listed in time .

For completeness, we give a more detailed description in Section A.2.

## 3 A Deterministic 3XOR Algorithm in Quadratic Time

A well known deterministic algorithm for solving the 3SUM problem in time $O(n^2)$ is reproduced in Algorithm 1.

After sorting the input as $x_1 < x_2 < \dots < x_n$ in time $O(n \log n)$, we consider each $a \in X$ separately and look for triples of the form $a + b = c$. Such triples correspond to elements of the intersection of $a + X$ and $X$. Since $X$ is sorted, we can iterate over both $a + X$ and $X$ in ascending order and compute the intersection with an interleaved linear scan.
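The interleaved scan can be sketched in a few lines of Python. This is an illustrative reading of the description above, not a transcription of Algorithm 1; the function name `three_sum` is an assumption, and, following the problem statement literally, $a$, $b$, and $c$ are not required to be distinct.

```python
def three_sum(numbers):
    """Return (a, b, c) from the input with a + b == c, or None."""
    xs = sorted(set(numbers))
    for a in xs:
        i = j = 0  # i scans a + X, j scans X, both in ascending order
        while i < len(xs) and j < len(xs):
            s = a + xs[i]
            if s == xs[j]:
                return a, xs[i], xs[j]
            if s < xs[j]:
                i += 1  # a + xs[i] is too small to match xs[j] or anything later
            else:
                j += 1  # xs[j] is too small to match a + xs[i] or anything later
    return None
```

Each of the $n$ outer iterations advances one of the two indices per step, so the scan takes linear time and the total running time is quadratic after sorting.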

Unfortunately, the $\oplus$-operation is not order preserving, i.e., $x < y$ does not imply $a \oplus x < a \oplus y$ for the lexicographic ordering on bitstrings (or, indeed, for any total ordering on bitstrings). We may sort $X$ and each set $a \oplus X$, for $a \in X$, separately to obtain an algorithm with running time $O(n^2 \log n)$. Using fast deterministic integer sorting [14] reduces this to time $O(n^2 \log\log n)$. In order to achieve quadratic running time, one may utilize a randomized dictionary for $X$ with expected linear construction time and constant lookup time (like in [8]) or (weakly non-uniform, rather complex) deterministic static dictionaries with construction time $O(n \log n)$ and constant lookup time as provided in [13]. Once such a dictionary is available, one just has to check whether $a \oplus b \in X$, for all $a, b \in X$.
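As a point of comparison, the dictionary-based approach can be sketched as follows. Python's built-in `set` stands in for the dictionary with constant lookup time (an assumption about expected behaviour, not one of the constructions from [8] or [13]), and the function name is hypothetical.

```python
from itertools import combinations_with_replacement

def three_xor_hashing(words):
    """Return (a, b, c) from the input with a ^ b == c, or None."""
    table = set(words)  # stands in for the dictionary with O(1) lookups
    for a, b in combinations_with_replacement(sorted(table), 2):
        if (a ^ b) in table:
            return a, b, a ^ b
    return None
```

The loop inspects every unordered pair once, which gives the quadratic number of lookups mentioned above.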

Here we describe a rather simple deterministic algorithm with quadratic running time. For this, we utilize a special binary search tree $T_X$ that allows, for arbitrary $a \in \{0,1\}^w$, to traverse the set $a \oplus X$ in lexicographically ascending order, in linear time. (The structure of the tree is that of the Patricia trie [21] for $X$.) For $X \ne \emptyset$, the tree $T_X$ is recursively defined as follows.

• If $X = \{x\}$, then $T_X$ is $(x)$, a tree consisting of a single leaf with label $x$.
• If $|X| \ge 2$, let $p$ denote the longest common prefix of the elements of $X$ when viewed as bitstrings. That is, all elements of $X$ coincide on the first $|p|$ bits, the elements of some nonempty set $X_0 \subseteq X$ start with $p0$ and the elements of $X_1 = X \setminus X_0$ start with $p1$. We define $T_X = (T_{X_0}, \ell, T_{X_1})$ for some label $\ell \in \{0,1\}^w$ whose most significant 1-bit is at position $|p|+1$, meaning that $T_X$ consists of a root vertex with label $\ell$, a left subtree $T_{X_0}$ and a right subtree $T_{X_1}$. The choice of $\ell$ is irrelevant, but it is convenient to define the label more concretely as $\ell = \max X_0 \oplus \min X_1$.

Note that along paths of inner nodes down from the root the labels when regarded as integers are strictly decreasing. We give an example in Figure 1 and provide an $O(n \log n)$ time construction of $T_X$ from $X$ in Algorithm 4.

In the context of $T_X = (T_{X_0}, \ell, T_{X_1})$ as described above, the $(|p|+1)$st bit is the most significant bit where elements of $X$ differ. Crucially, this is also true for the set $a \oplus X$ for any $a \in \{0,1\}^w$. Since the elements of $a \oplus X$ are partitioned into $a \oplus X_0$ and $a \oplus X_1$ according to their $(|p|+1)$st bit, either all elements of $a \oplus X_0$ are less than all elements of $a \oplus X_1$, or vice versa, depending on whether the $(|p|+1)$st bit of $a$ is $0$ or $1$. Using that the $(|p|+1)$st bit of $a$ is $1$ iff $a \oplus \ell < a$, this suggests a simple recursive algorithm to produce $a \oplus X$ in sorted order, given as Algorithm 3.

With the data structure in place, the strategy from 3SUM carries over to 3XOR as seen in Algorithm 2. Summing up, we have obtained the following result:

###### Theorem .

On a deterministic word RAM the 3XOR problem can be solved in time $O(n^2)$. ∎

In Algorithm 4 we provide a linear time construction of $T_X$ from a stream containing the sorted array interleaved with the labels (due to sorting the total runtime is $O(n \log n)$). Despite its brevity, the recursive build function is somewhat subtle.
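The tree, the ordered traversal, and the resulting quadratic algorithm can be sketched together in Python. This is a reconstruction under stated assumptions rather than the paper's Algorithms 2 to 4: leaves are plain integers, inner nodes are tuples `(left, label, right)`, the label between consecutive sorted elements is taken to be their XOR, and all function names are hypothetical.

```python
def build_tree(xs):
    """Build the tree for a sorted list xs of distinct non-negative ints."""
    # Assumed labels: XOR of neighbours in sorted order (max X0 ^ min X1 at each split).
    labels = [xs[i] ^ xs[i + 1] for i in range(len(xs) - 1)]
    pos = 0  # index of the next unread element

    def build(bound):
        nonlocal pos
        tree = xs[pos]
        pos += 1
        # Keep attaching right subtrees while the next label lies below the bound.
        while pos < len(xs) and labels[pos - 1] < bound:
            label = labels[pos - 1]
            tree = (tree, label, build(label))
        return tree

    return build(float("inf"))

def xor_sorted(tree, a, out):
    """Append a ^ x for every leaf x of the tree to out, in ascending order."""
    if isinstance(tree, int):
        out.append(a ^ tree)
        return
    left, label, right = tree
    if a ^ label < a:  # a has a 1 at the branching bit, so the two halves swap
        left, right = right, left
    xor_sorted(left, a, out)
    xor_sorted(right, a, out)

def three_xor(words):
    """Return (a, b, c) from the input with a ^ b == c, or None."""
    xs = sorted(set(words))
    if not xs:
        return None
    tree = build_tree(xs)
    for a in xs:
        shifted = []
        xor_sorted(tree, a, shifted)  # a ^ X in ascending order
        i = j = 0
        while i < len(xs) and j < len(xs):
            if shifted[i] == xs[j]:
                return a, a ^ shifted[i], xs[j]
            if shifted[i] < xs[j]:
                i += 1
            else:
                j += 1
    return None
```

The test `a ^ label < a` never inspects a single bit: XOR-ing with the label flips the branching bit of `a`, and the result is smaller exactly when that bit was 1. The recursion depth is bounded by the tree height, which is at most the bit length of the inputs.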

###### Claim (Correctness of Algorithm 4).

If build() is called while the stream contains the elements , the call consumes a prefix of the stream until where . It returns where .

Once this is established, the correctness of makeTree immediately follows as for the outer call we have and (with the understanding that ).

###### Proof of Section 3.

By the -call we mean the (recursive) call to build() with . In particular the -call consumes from the stream and our claim concerns the -call. It is clear from the algorithm that an -call can only invoke an -call if . Therefore the -call cannot directly or indirectly cause the -call since . At the same time, the -call can only terminate when . This establishes that when the -call ends – the first part of our claim.

Next, note that since is sorted, there is some such that we have and where is the partition from the definition of . Moreover, is the largest label among . This implies that the -call is directly invoked from the -call. Just before the -call is made, the -call played out just as though the stream had been , which would have produced by induction444Formally the induction is on the value of . The case of is trivial.. However, due to , instead of returning , the while loop is entered (again) and produces . The stream for the -call is and is the first label not smaller than . So, again by induction, the -call produces and ends with . Given this, it is clear that afterwards the loop condition in the -call is not satisfied (since ) and the new is returned immediately, establishing the second part of the claim. ∎

## 4 A Subquadratic Randomized Algorithm

In this section we present a subquadratic expected time algorithm for the 3XOR problem. Its basic structure is the same as in the corresponding algorithm for int3SUM presented in [2], in particular, it uses buckets and fingerprints, word packing, word-level parallelism, and lookup tables. Changes are made where necessary to deal with the different setting. This makes it a little more difficult in some parts of the algorithm (mainly because -ing a sorted sequence with some will destroy the order) and easier in other parts (in particular where linearity of hash functions is concerned). Altogether, we get an expected running time that is the same as in [2] for and slightly worse for larger . Recall we assume throughout.

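###### Theorem .

A randomized word RAM with word length $w$ can solve the 3XOR problem in expected time

$$O\!\left(n^2 \cdot \min\left\{\frac{\log^3 w}{w}, \frac{(\log\log n)^2}{\log^2 n}\right\}\right) \quad \text{for } w = O(n \log n),$$

and $O(n \log^2 n)$, otherwise. The crossover point between the $w$ and the $\log n$ factor is $w = (\log^2 n) \log\log n$. The only difference to the running time of [2] is in an extra $\log w$ factor in the word-length-dependent part.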

###### Proof.

We briefly describe the main ideas of the algorithm. For full details, see Appendix B. If , we proceed as for . We use two levels of hashing.

We split $X$ into $R = 2^r$ buckets $X_u = \{x \in X \mid h_1(x) = u\}$, $u \in \{0,1\}^r$, using a randomly chosen hash function $h_1 \in \mathcal{H}^{\mathrm{lin}}_{w,r}$. By linearity, for every solution $a \oplus b = c$ we also have $h_1(a) \oplus h_1(b) = h_1(c)$. Given $a \in X_u$ and $b \in X_v$, we only have to inspect bucket $X_{u \oplus v}$ when looking for a $c$ such that $a \oplus b = c$.

For $u \in \{0,1\}^r$, the expected size of bucket $X_u$ is $n/R$. A bucket of size larger than $3n/R$ is called bad, as are elements of bad buckets. All other buckets and elements are called good. By the corollary in Section 2.3, the expected number of bad elements is smaller than $R$. We can even assume that the total number of bad elements is smaller than $2R$. (By Markov's inequality, we simply have to repeat the choice of $h_1$ expected $O(1)$ times until this condition is satisfied.)
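The first hashing level can be illustrated with a small Python sketch. It only demonstrates the bucket-restriction idea (a partner for a pair from buckets $u$ and $v$ can only sit in bucket $u \oplus v$) and the threshold of three times the expected bucket size; it is not the subquadratic algorithm. All names and the default parameters `r`, `in_bits`, and `seed` are assumptions.

```python
import random
from collections import defaultdict

def make_linear_hash(out_bits, in_bits, rng):
    """Random GF(2)-linear map, rows stored as integers (illustrative helper)."""
    rows = [rng.getrandbits(in_bits) for _ in range(out_bits)]
    return lambda x: sum((bin(row & x).count("1") & 1) << i
                         for i, row in enumerate(rows))

def split_into_buckets(words, r, in_bits, rng):
    """Return the hash function, the buckets, and the set of bad bucket indices."""
    h1 = make_linear_hash(r, in_bits, rng)
    buckets = defaultdict(list)
    for x in words:
        buckets[h1(x)].append(x)
    limit = 3 * len(words) / 2 ** r  # three times the expected bucket size
    bad = {u for u, bucket in buckets.items() if len(bucket) > limit}
    return h1, buckets, bad

def three_xor_bucketed(words, r=3, in_bits=16, seed=0):
    """Return (a, b, c) with a ^ b == c, looking only into one bucket per pair."""
    words = sorted(set(words))
    h1, buckets, _bad = split_into_buckets(words, r, in_bits, random.Random(seed))
    for a in words:
        for b in words:
            # By linearity, any c with a ^ b == c lies in bucket h1(a) ^ h1(b).
            if (a ^ b) in buckets.get(h1(a) ^ h1(b), ()):
                return a, b, a ^ b
    return None
```

The sketch still scans the candidate bucket element by element; the fingerprinting and word-packing steps described next are what replace this scan by word-parallel operations.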

#### Fingerprints and Word-Packed Arrays

Furthermore, we use another hash function for some appropriately chosen to calculate -bit fingerprints for all elements in . If , we can pack all fingerprints of elements of a good bucket into one word . This packed representation is called word-packed array. Again by linearity, for every solution we have . On the other hand, the expected number of colliding triples, i. e., triples with but and , is at most .
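A word-packed array and the word-parallel XOR that the algorithm later applies to it can be sketched as follows. Python integers stand in for machine words, the field layout (fingerprint $i$ at bit offset $i \cdot p$) is an assumption, and the helper names are hypothetical.

```python
def pack(fingerprints, p):
    """Pack p-bit fingerprints into one integer, field i at bits [i*p, (i+1)*p)."""
    word = 0
    for i, f in enumerate(fingerprints):
        word |= f << (i * p)
    return word

def unpack(word, count, p):
    """Inverse of pack: read count fields of p bits each."""
    mask = (1 << p) - 1
    return [(word >> (i * p)) & mask for i in range(count)]

def xor_all(word, count, p, f):
    """XOR the p-bit value f into every field with one multiplication and one XOR."""
    ones = sum(1 << (i * p) for i in range(count))  # a 1 at the start of each field
    return word ^ (ones * f)  # ones * f replicates f into every field
```

Since `f` fits into `p` bits, the product `ones * f` places a copy of `f` in each field without carries between fields, so a single XOR updates all fingerprints at once. After this step the packed sequence is in general no longer sorted, which is why the packed intersection of Section 2.4 has to re-sort it.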

The total time for all the hashing steps described so far is , see Section 2.2. We consider two choices of and , cf. [2, proof of Lemma 3] and [2, proof of Thm. 2]. The first one is better for larger words of length whereas the second one yields better results for smaller words. In both cases, we search for triples with a fixed number of bad elements separately. The strategies for finding triples of good elements correspond to the approach for int3SUM in [2]. However, for triples with at least one bad element we have to rely on a more fine-grained examination than in [2]. For this, we will use hash tables and another lookup table.

#### Long Words: Exploiting Word-Level Parallelism

For word lengths , we choose and to be able to pack all fingerprints of elements of a good bucket into one word. We examine triples with at most one and at least two bad elements separately, as seen in Algorithm 5 in Section B.4.

When looking for triples with at most one bad element, we do the following for every (good or bad) and where and the corresponding bucket are good (as in [2, proof of Lemma 3] for all good elements): We every fingerprint of the word-packed array with . Then, we apply Section 2.4 to get a list of common pairs in this modified word-packed array and . For each such pair, we only have to check whether it derives from a non-colliding triple. Since we can stop when we find a non-colliding triple and since the expected total number of colliding triples is , we are done in expected time . (The corresponding strategy in [2] is only used to examine triples of good elements.)

In order to examine all triples with at least two bad elements, we provide a hash table for with expected construction time and constant lookup time [8]. Now, for each of the at most pairs of bad elements we can check if in constant time.555Note that it would not be possible to derive expected time for checking all pairs of bad elements if we did not start all over if the number of keys in bad buckets is at least .

The total expected running time for this parameter choice is .

#### Short Words: Using Lookup Tables

For word lengths , we choose and to pack all fingerprints of elements of a good bucket into bits, for some .

We start by looking for triples with no bad element. For this, we consider all triples of corresponding good buckets (as in [2, proof of Thm. 2]). We use a lookup table of size to check whether such a triple of buckets yields a triple of fingerprints (in the word-packed arrays) with in constant time. If this is the case, we search for a corresponding triple in the buckets of size . Since one table entry can be computed in time , setting up the lookup table takes time . Furthermore, the expected colliding triples cause additional expected running time . Since we can stop when we find a non-colliding triple, the total expected time is .

Searching for triples with exactly one bad element can be done in a similar way. For each bad element and each good bucket , , we all fingerprints in the word-packed array with and use a lookup table to check whether it has some fingerprints in common with the word-packed array of the corresponding good bucket. If this lookup yields a positive result, we check all pairs in the corresponding buckets. As before, the expected running time is , including the time due to colliding triples.

Examining all triples with at least two bad elements can be done using a hash table as mentioned above in expected time .

The total expected running time for this parameter choice is . ∎

## 5 Conditional Lower Bounds from the 3XOR Conjecture

As already mentioned in Section 1, the best word RAM algorithm for int3SUM currently known [2] can solve this problem in expected time $O(n^2 \cdot \min\{\log^2 w / w, (\log\log n)^2 / \log^2 n\})$ for $w = O(n \log n)$. The best deterministic algorithm [5] takes time $O(n^2 (\log\log n)^{O(1)} / \log^2 n)$. It is a popular conjecture that every algorithm for 3SUM (deterministic or randomized) needs (expected) time $n^{2-o(1)}$. Therefore, this conjectured lower bound can be used as a basis for conditional lower bounds for a wide range of other problems [11, 15, 18, 22].

Similarly, it seems natural to conjecture that every algorithm for the related 3XOR problem (deterministic or randomized) needs (expected) time $n^{2-o(1)}$. (In Section 4, the upper bound for short word lengths is $O(n^2 (\log\log n)^2 / \log^2 n) = n^{2-o(1)}$.) Therefore, it is a valid candidate for reductions to other computational problems [15, 23].

The general strategy from [2], already employed in Section 4, is quite similar to the methods in [18]. Therefore, we are able to reduce 3XOR to offline SetDisjointness and offline SetIntersection, too. Hence, the conditional lower bounds for the problems mentioned in [18] (and bounds for dynamic problems from [22]) also hold with respect to the 3XOR conjecture. A detailed discussion can be found in [23]. Below, we will outline the general proof strategy.

### 5.1 Offline SetDisjointness and Offline SetIntersection

We reduce 3XOR to the following two problems.

###### Problem (Offline SetDisjointness).

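Input: Finite set $C$, finite families $A$ and $B$ of subsets of $C$, $q$ pairs of subsets $(S, S') \in A \times B$.

Task: Find all of the $q$ pairs $(S, S')$ with $S \cap S' \ne \emptyset$.

###### Problem (Offline SetIntersection).

Input: Finite set $C$, finite families $A$ and $B$ of subsets of $C$, $q$ pairs of subsets $(S, S') \in A \times B$.

Task: List all elements of the intersections $S \cap S'$ of the $q$ pairs $(S, S')$.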
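To fix the input and output conventions of the two problems, here is a naive reference solver in Python. It assumes the queries are given directly as pairs of Python sets and says nothing about the running times discussed in this section; the function names are hypothetical.

```python
def offline_set_disjointness(queries):
    """Return the indices of the query pairs whose two sets intersect."""
    return [i for i, (s, t) in enumerate(queries) if not s.isdisjoint(t)]

def offline_set_intersection(queries):
    """Return, for every query pair, the sorted list of common elements."""
    return [sorted(s & t) for s, t in queries]

queries = [({1, 2, 3}, {3, 4}), ({1, 2}, {5, 6}), ({7, 8, 9}, {8, 9})]
hits = offline_set_disjointness(queries)
common = offline_set_intersection(queries)
```

In the reduction, a reported pair or element is only a candidate: it still has to be checked against the original 3XOR instance, since hash collisions can produce false positives.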

### 5.2 Reductions from 3XOR

By giving an expected time reduction from 3XOR to offline SetDisjointness and offline SetIntersection, we can prove lower bounds for the latter two problems, conditioned on the 3XOR conjecture.

Assume 3XOR requires expected time for on a word RAM. Then for every algorithm for offline SetDisjointness that works on instances with , , for all and requires expected time .

Assume 3XOR requires expected time for on a word RAM. Then for and , every algorithm for offline SetIntersection which works on instances with , , for all , and expected output size requires expected time .

###### Proof.

(For more details, see [23, ch. 6].) Let be the given 3XOR instance. As in Section 4, we use two levels of hashing. Algorithms 7 and 6 in Section B.4 illustrate the reduction to offline SetDisjointness and offline SetIntersection, respectively.

At first, we hash the elements of with a randomly chosen hash function into buckets in time . Then, we apply Section 2.3: There are expected elements in buckets with more than three times their expected size. For each such bad element, we can naively check in time whether it is part of a triple with or not. Since , all bad elements can be checked in expected time . Therefore, we can assume that every bucket , , has elements.

The second level of hashing uses two independently and randomly chosen hash functions where for offline SetDisjointness and for offline SetIntersection. (The function with is randomly chosen from a linear and 1-universal class of hash functions .) The hash values can be calculated in time . (The additional factor is only necessary for offline SetDisjointness, since we need to use choices of hash functions to get an error probability that is small enough.) For each and , we create “shifted” buckets and . One such set can be computed in time . Therefore, all sets can be computed in time for offline SetDisjointness and for offline SetIntersection.

We can show that for all and , if there are such that and , then . Therefore, we create the following offline SetDisjointness (offline SetIntersection) instance: , , and queries for all and in time . (These are queries for offline SetIntersection. For offline SetDisjointness, we create queries for each of the choices of .)

After the offline SetDisjointness or offline SetIntersection instance has been solved, we can use this answer to compute the answer for in expected time . We only have to check if a positive answer from offline SetDisjointness (a pair with non-empty intersection) or offline SetIntersection (an element of an intersection) yields a solution triple of or not.

For offline SetDisjointness, we can show that the probability for a triple to yield a false positive can be made polynomially small if we consider choices of and only examine if this is suggested by all corresponding queries. For offline SetIntersection, the expected number of colliding triples is . By trying to guess a good triple times before creating the offline SetIntersection instance we can avoid a problem for the expected running time if a 3XOR instance yields an offline SetIntersection instance with output size .

For all relevant values of and , the total running time is in addition to the time needed to solve the offline SetDisjointness or offline SetIntersection instance. ∎

## 6 Conclusions and Remarks

We have presented a simple deterministic algorithm with running time $O(n^2)$. Its core is a version of the Patricia trie for $X$, which makes it possible to traverse the set $a \oplus X$ in ascending order for arbitrary $a \in \{0,1\}^w$ in linear time. Furthermore, our randomized algorithm solves the 3XOR problem in expected time $O(n^2 \cdot \min\{\log^3 w / w, (\log\log n)^2 / \log^2 n\})$ for $w = O(n \log n)$, and $O(n \log^2 n)$ for $w = \Omega(n \log n)$. The crossover point between the $w$ and the $\log n$ factor is $w = (\log^2 n) \log\log n$. The only difference to the running time of [2] is in an extra $\log w$ factor in the word-length-dependent part. This is due to the necessity to re-sort a word-packed array in time $O(\log^2 w)$ after we have XOR-ed each of its elements with a (common) element. Finally, we have reduced 3XOR to offline SetDisjointness and offline SetIntersection, establishing conditional lower bounds (as in [18] conditioned on the int3SUM conjecture).

A simple but important observation, used in apparently all deterministic subquadratic-time algorithms for 3SUM, is Fredman’s trick:

 a + b < c + d ⟺ a − d < c − b

Unfortunately, such a relation does not exist in our setting, since there is no linear order on such that holds for all . Since all elements are self-inverse, for and any , we would get . Is there another “trivial-looking” trick for 3XOR that establishes a basic approach to solving 3XOR in deterministic subquadratic time?
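A short derivation may help make the obstacle concrete. It rests on an assumption: that the desired property is invariance of the order under XOR with a common element, i.e. a ≺ b implies a ⊕ c ≺ b ⊕ c. The symbol ≺ and the choice of c are introduced here for illustration only.

```latex
\begin{align*}
&\text{Assume } a \prec b \implies a \oplus c \prec b \oplus c
  \text{ for all } a, b, c \in \{0,1\}^w. \\
&\text{Take } a \neq b \text{ with } a \prec b \text{ and set } c = a \oplus b. \\
&a \oplus c = a \oplus a \oplus b = b, \qquad
  b \oplus c = b \oplus a \oplus b = a. \\
&\text{Hence } a \prec b \implies b \prec a,
  \text{ contradicting that } \prec \text{ is a strict linear order.}
\end{align*}
```

Over the integers the analogous step is harmless, because adding a common value to both sides of an inequality preserves it; the self-inverse property of XOR is what breaks the argument.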

Another open question is how the optimal running times for 3SUM and 3XOR are related. At first sight, the two problems seem to be very similar, but the details make the difference. The observations mentioned above (especially the problem of re-sorting slightly modified word-packed arrays and the possible absence of a relation like Fredman’s trick) hint at a larger gap than expected. On the other hand, the fact that both problems can be reduced to a wide variety of computational problems in a similar way (e.g. listing triangles in a graph, offline SetDisjointness and offline SetIntersection) increases hope for a more concrete dependence.

## References

• [1] Susanne Albers and Torben Hagerup. Improved Parallel Integer Sorting without Concurrent Writing. Information and Computation, 136(1):25–51, 1997.
• [2] Ilya Baran, Erik D. Demaine, and Mihai Pătraşcu. Subquadratic algorithms for 3SUM. Algorithmica, 50(4):584–596, 2008.
• [3] Ilya Baran, Erik D. Demaine, and Mihai Pătraşcu. Subquadratic Algorithms for 3SUM. In Proceedings of the 9th International Conference on Algorithms and Data Structures (WADS), pages 409–421. Springer-Verlag, 2005.
• [4] Charles Bouillaguet, Claire Delaplace, and Pierre-Alain Fouque. Revisiting and Improving Algorithms for the 3XOR Problem. IACR Transactions on Symmetric Cryptology, 2018(1):254–276, 2018.
• [5] Timothy M. Chan. More logarithmic-factor speedups for 3SUM, (median, +)-convolution, and some geometric 3SUM-hard problems. In Proceedings of the 29th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 881–897. SIAM, 2018.
• [6] Timothy M. Chan and Moshe Lewenstein. Clustered Integer 3SUM via Additive Combinatorics. In Proceedings of the Forty-seventh Annual ACM Symposium on Theory of Computing, STOC 2015, pages 31–40, New York, NY, USA, 2015. ACM.
• [7] Martin Dietzfelbinger. Universal hashing and k-wise independent random variables via integer arithmetic without primes. In Proc. 13th Annual Symposium on Theoretical Aspects of Computer Science (STACS), pages 569–580. Springer-Verlag, 1996.
• [8] Michael L. Fredman, János Komlós, and Endre Szemerédi. Storing a Sparse Table with Worst Case Access Time. J. ACM, 31(3):538–544, June 1984.
• [9] Michael L. Fredman and Dan E. Willard. Surpassing the Information Theoretic Bound with Fusion Trees. Journal of Computer and System Sciences, 47(3):424–436, 1993.
• [10] Ari Freund. Improved Subquadratic 3SUM. Algorithmica, 77(2):440–458, Feb 2017.
• [11] Anka Gajentaan and Mark H. Overmars. On a Class of Problems in Computational Geometry. Comput. Geom. Theory Appl., 5(3):165–185, October 1995.
• [12] Omer Gold and Micha Sharir. Improved bounds for 3SUM, K-SUM, and linear degeneracy. CoRR, abs/1512.05279, 2015. URL: http://arxiv.org/abs/1512.05279.
• [13] Torben Hagerup, Peter Bro Miltersen, and Rasmus Pagh. Deterministic dictionaries. J. Algorithms, 41(1):69–85, 2001.
• [14] Yijie Han. Deterministic sorting in O(n log log n) time and linear space. J. Algorithms, 50(1):96–105, 2004.
• [15] Zahra Jafargholi and Emanuele Viola. 3SUM, 3XOR, triangles. Algorithmica, 74(1):326–343, 2016.
• [16] Allan Grønlund Jørgensen and Seth Pettie. Threesomes, degenerates, and love triangles. In 55th IEEE Annual Symposium on Foundations of Computer Science (FOCS), pages 621–630, 2014.
• [17] Daniel M. Kane, Shachar Lovett, and Shay Moran. Near-Optimal Linear Decision Trees for k-SUM and Related Problems. CoRR, abs/1705.01720, 2017. URL: http://arxiv.org/abs/1705.01720.
• [18] Tsvi Kopelowitz, Seth Pettie, and Ely Porat. Higher lower bounds from the 3SUM conjecture. In Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pages 1272–1287. SIAM, 2016.
• [19] Andrea Lincoln, Virginia Vassilevska Williams, Joshua R. Wang, and R. Ryan Williams. Deterministic Time-Space Tradeoffs for k-SUM. CoRR, abs/1605.07285, 2016. URL: http://arxiv.org/abs/1605.07285.
• [20] Y. Mansour, N. Nisan, and P. Tiwari. The Computational Complexity of Universal Hashing. Theor. Comput. Sci., 107(1):121–133, 1993.
• [21] Donald R. Morrison. PATRICIA - Practical Algorithm To Retrieve Information Coded in Alphanumeric. J. ACM, 15(4):514–534, 1968.
• [22] Mihai Pătraşcu. Towards Polynomial Lower Bounds for Dynamic Problems. In Proc. 42nd ACM Symp. on Theory of Computing (STOC), pages 603–610. ACM, 2010.
• [23] Philipp Schlag. Untere Schranken für Berechnungsprobleme auf der Basis der 3SUM-Vermutung. Master’s thesis, TU Ilmenau, Germany, 2016.
• [24] Joshua R. Wang. Space-Efficient Randomized Algorithms for K-SUM. In Algorithms - ESA 2014: 22th Annual European Symposium. Proceedings, pages 810–829. Springer, 2014.

## Appendix A

### A.1 Proof of a Hashing Lemma

We prove the hashing lemma stated in Section 2.3.

###### Proof.

As probability space we use

with the uniform distribution. Fix

with . For we define two sets,

 B′h ={Bh