# Inner Product and Set Disjointness: Beyond Logarithmically Many Parties

A basic goal in complexity theory is to understand the communication complexity of number-on-the-forehead problems F : ({0,1}^n)^k → {0,1} with k ≫ log n parties. We study the problems of inner product and set disjointness and determine their randomized communication complexity for every k ≥ log n, showing in both cases that Θ( (log n)/(log⌈1 + k/log n⌉) + 1 ) bits are necessary and sufficient. In particular, these problems admit constant-cost protocols if and only if the number of parties is k ≥ n^ϵ for some constant ϵ > 0.


## 1. Introduction

The number-on-the-forehead model, due to Chandra et al., is the standard model of multiparty communication. The model features k collaborative players and a Boolean function F : X_1 × X_2 × ⋯ × X_k → {0,1} with k arguments. An input (x_1, x_2, …, x_k) is distributed among the players with overlap, by giving the ith player the arguments x_1, …, x_{i−1}, x_{i+1}, …, x_k but not x_i. This arrangement can be visualized as having the players seated in a circle with x_i written on the ith player's forehead, whence the name of the model. The players communicate according to a protocol agreed upon in advance. The communication occurs in the form of broadcasts, with a message sent by any given player instantly reaching everyone else. The players' objective is to compute F on any given input with minimal communication. We are specifically interested in randomized protocols, where the players have an unbounded supply of shared random bits. The cost of a protocol is the total bit length of all the messages broadcast in a worst-case execution. The ϵ-error randomized communication complexity R_ϵ(F) is the least cost of a randomized protocol that computes F with probability of error at most ϵ on every input.

Number-on-the-forehead communication complexity is a natural subject of study in its own right, in addition to its applications to circuit complexity, pseudorandomness, and proof complexity [3, 27, 15, 23, 6]. Number-on-the-forehead is the most studied model in the area because any other way of assigning arguments to players results in a less powerful formalism (provided, of course, that one does not assign all the arguments to some player, in which case there is never a need to communicate). The generous overlap in the players' inputs makes proving lower bounds in the number-on-the-forehead model difficult. The strongest lower bound for an explicit communication problem, Ω(n/4^k), was obtained by Babai et al. almost thirty years ago. This lower bound becomes trivial for k ≥ log n players, and it is a longstanding open problem to overcome this logarithmic barrier and prove strong lower bounds for an explicit function with k ≫ log n. As one would expect, the existence of such functions is straightforward to prove using a counting argument [3, 13]. In particular, it is known [13, Sec. 9.2] that for all n and k, a uniformly random function F : ({0,1}^n)^k → {0,1} almost surely has randomized communication complexity

 R_{1/3}(F) ≥ n − 5, (1.1)

which essentially matches the trivial upper bound of n + 1.

The two most studied problems in communication complexity theory are (generalized) inner product and set disjointness. In the k-party versions of these problems, the inputs are sets X_1, X_2, …, X_k ⊆ {1, 2, …, n}. As usual, the ith player knows all of the sets except X_i. In the inner product problem, the objective is to determine whether |X_1 ∩ X_2 ∩ ⋯ ∩ X_k| is odd. In set disjointness, the objective is to determine whether X_1 ∩ X_2 ∩ ⋯ ∩ X_k = ∅. In Boolean form, these two functions are given by the formulas

 GIP_{n,k}(X) = ⨁_{i=1}^{n} ⋀_{j=1}^{k} X_{i,j},    DISJ_{n,k}(X) = ⋀_{i=1}^{n} ⋁_{j=1}^{k} ¬X_{i,j},

respectively, where the input is an n × k Boolean matrix X whose columns are the characteristic vectors of the input sets. In the setting of k = 2 players, the communication complexity is well known to be Θ(n) for both inner product and set disjointness [17, 22, 4]. A moment's thought reveals that the k-party communication complexity of these problems is monotonically nonincreasing in k, and determining this dependence has been the subject of extensive research in the area [3, 14, 26, 7, 19, 9, 5, 25, 24]. On the upper bounds side, Grolmusz proved that k-party inner product has communication complexity O(k²n/2^k), which easily carries over to k-party set disjointness. The best lower bounds to date are Ω(n/4^k) for inner product, due to Babai et al., and lower bounds of the form n^{Ω(1)}/2^{O(k)} for set disjointness, due to Sherstov [25, 24].
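In concrete terms, the matrix formulation above can be sketched in a few lines of Python (a toy evaluator for small instances, with function names of our own choosing; it is not part of the paper's development):

```python
def gip(X):
    """Generalized inner product GIP: parity of the number of all-ones rows of X."""
    return sum(all(row) for row in X) % 2

def disj(X):
    """Set disjointness DISJ: 1 iff X has no all-ones row."""
    return int(not any(all(row) for row in X))

# Columns of X are the characteristic vectors of X1, X2 ⊆ {1, 2, 3}.
X = [(1, 1),   # element 1 lies in both sets
     (1, 0),   # element 2 lies in X1 only
     (0, 1)]   # element 3 lies in X2 only
print(gip(X), disj(X))  # one all-ones row: GIP = 1, DISJ = 0
```

A matrix with an even number of all-ones rows has GIP equal to 0, and DISJ equals 1 precisely on the matrices with no all-ones row.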

### 1.1. Our results

Our work began with a basic question: how many players does it take to compute inner product and set disjointness with constant communication? As discussed above, the best bounds on the communication complexity of these functions for large k prior to this paper were Ω(1) and O(log n). We close this logarithmic gap, determining the communication complexity up to a multiplicative constant for every k ≥ log n.

###### Theorem 1.1 (Main result).

For any k ≥ log n, inner product and set disjointness have randomized communication complexity

 R_{1/3}(GIP_{n,k}) = Θ( (log n)/(log⌈1 + k/log n⌉) + 1 ),    R_{1/3}(DISJ_{n,k}) = Θ( (log n)/(log⌈1 + k/log n⌉) + 1 ).

To our knowledge, Theorem 1.1 is the first nontrivial (i.e., superconstant) lower bound for any explicit communication problem with k ≫ log n players. In particular, inner product and set disjointness have communication protocols with constant cost if and only if the number of players is k ≥ n^ϵ for some constant ϵ > 0. It is noteworthy that we prove the upper bounds in Theorem 1.1 using simultaneous protocols, where the players do not interact. In more detail, each player in a simultaneous protocol broadcasts all his messages at once and without regard to the messages from the other players. The output of a simultaneous protocol is fully determined by the shared randomness and the concatenation of the messages broadcast by all the players. The cost of a simultaneous protocol is defined in the usual way, as the total number of bits broadcast by all the players in a worst-case execution. Theorem 1.1 shows that as far as inner product and set disjointness are concerned, simultaneous protocols are asymptotically as powerful as general ones.
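To get a feel for the bound of Theorem 1.1, one can tabulate the expression inside the Θ(·) in its different regimes (a quick numerical sketch; the constant factors are of course suppressed by the Θ notation):

```python
import math

def theorem_1_1_bound(n, k):
    """The expression log(n) / log(ceil(1 + k / log n)) + 1 from Theorem 1.1,
    meaningful in the regime k >= log n (all logarithms base 2)."""
    logn = math.log2(n)
    return logn / math.log2(math.ceil(1 + k / logn)) + 1

n = 2 ** 20
print(theorem_1_1_bound(n, 20))       # k = log n: the bound is Theta(log n)
print(theorem_1_1_bound(n, 2 ** 10))  # k = n^(1/2): the bound is constant
```

At k = log n the expression equals log n + 1, and once k grows polynomially in n it flattens to a constant depending only on the polynomial's exponent, matching the dichotomy stated above.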

A natural next step is to construct a problem whose communication complexity remains nontrivial for all k. Its existence follows from the lower bound (1.1) on the communication complexity of random functions. In the theorem below, we give an explicit function with communication complexity at least c log n for an absolute constant c > 0 and all n and k. We remind the reader that MOD_3 stands for the Boolean function that evaluates to true if and only if the sum of its arguments is a multiple of 3.

###### Theorem 1.2.

Define F_{n,k} : ({0,1}^n)^k → {0,1} by

 F_{n,k}(X) = MOD_3( ⨁_{j=1}^{k} X_{1,j}, …, ⨁_{j=1}^{k} X_{n,j} ).

Then

 R_{1/3}(F_{n,k}) ≥ (1/3) log n − 1/3.

As with inner product and set disjointness, we show that the lower bound of Theorem 1.2 is asymptotically tight for all n and k.
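The function of Theorem 1.2 is straightforward to evaluate; the following toy Python sketch makes the definition concrete (the helper names are ours):

```python
def mod3(bits):
    """MOD3: true iff the sum of the arguments is a multiple of 3."""
    return int(sum(bits) % 3 == 0)

def f(X):
    """F_{n,k}(X): apply MOD3 to the XOR of the k columns, row by row."""
    row_xors = [sum(row) % 2 for row in X]
    return mod3(row_xors)

X = [(1, 0),   # row XOR: 1
     (1, 1),   # row XOR: 0
     (0, 0)]   # row XOR: 0
print(f(X))  # sum of row XORs is 1, not a multiple of 3
```
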

### 1.2. Our techniques

The upper bounds in Theorem 1.1 are based on Grolmusz's deterministic protocol for multiparty inner product, which we are able to speed up using public randomness. The lower bounds in Theorems 1.1 and 1.2 are more subtle. First of all, it may be surprising that we are able to prove any lower bounds at all for k ≫ log n players, since all known techniques for explicit functions stop working at k ≈ log n players. The key is to realize that we only need to rule out communication protocols of cost ℓ for small ℓ, and in any given execution of such a protocol, all but ℓ of the players remain silent! This makes it possible to reduce the analysis to the setting of ℓ players, where strong lower bounds are known. This reduction involves constructing an input distribution such that the portion of the input seen by any small set of players does not significantly help with computing the output. Our communication lower bounds use the discrepancy method, which we adapt here to reflect the number of active players.

The remainder of this paper is organized as follows. Section 2 gives a thorough review of the technical preliminaries. Our results for inner product and set disjointness are presented in Sections 3 and 4, respectively. Section 5 concludes the paper with a proof of Theorem 1.2 along with a matching upper bound.

## 2. Preliminaries

### 2.1. General

We use lowercase letters for vectors and strings, and uppercase letters for matrices. The empty string is denoted ε. For a bit string x, we let |x| denote the Hamming weight of x. We let the bar operator denote either complex conjugation or set complementation, depending on the nature of the argument. For convenience, we adopt the convention that 0/0 = 0. The notation log x refers to the logarithm of x to base 2.

We will view Boolean functions as mappings X → {0,1} for a finite set X, typically X = {0,1}^n. A partial function f on a set X is a function whose domain of definition, denoted dom f, is a proper subset of X. For (possibly partial) Boolean functions f and g on sets X and Y, respectively, the symbol f ⊕ g refers to the (possibly partial) Boolean function on X × Y given by (f ⊕ g)(x, y) = f(x) ⊕ g(y). Clearly, the domain of f ⊕ g is the set of all (x, y) ∈ X × Y for which f(x) and g(y) are both defined. As usual, for (possibly partial) Boolean functions f and g on X and Y, the symbol f ∧ g refers to the (possibly partial) Boolean function on X × Y given by (f ∧ g)(x, y) = f(x) ∧ g(y). Observe that in this notation, f ⊕ g and f ∧ g are completely different functions. We abbreviate f^{⊕m} = f ⊕ f ⊕ ⋯ ⊕ f (m times), and likewise f^{∧m}. The familiar functions AND_n, OR_n, and XOR_n on the Boolean hypercube are given by AND_n(x) = x_1 ∧ x_2 ∧ ⋯ ∧ x_n, OR_n(x) = x_1 ∨ x_2 ∨ ⋯ ∨ x_n, and XOR_n(x) = x_1 ⊕ x_2 ⊕ ⋯ ⊕ x_n. We let MOD_3 : {0,1}^n → {0,1} be the Boolean function given by MOD_3(x) = 1 if and only if x_1 + x_2 + ⋯ + x_n is a multiple of 3. Finally, we define a partial Boolean function UAND_n on {0,1}^n as the restriction of AND_n to inputs of Hamming weight at least n − 1. In other words,

 UAND_n(x) = x_1 ∧ x_2 ∧ ⋯ ∧ x_n if |x| ≥ n − 1, and undefined otherwise.

We let X^{n×m} denote the family of n × m matrices with entries in X, the most common cases being those of real matrices ℝ^{n×m} and Boolean matrices {0,1}^{n×m}. For a matrix X and a set S of row indices, we let X|_S denote the submatrix of X obtained by keeping the rows with index in S. More generally, for sets S and T, we let X|_{S,T} denote the submatrix of X obtained by keeping the rows with index in S and the columns with index in T. We adopt the standard convention that the ordering of the rows (and columns) in a submatrix is inherited from the containing matrix.

For nonnegative integers n and k, we define

 \binom{n}{\leq k} := \binom{n}{0} + \binom{n}{1} + ⋯ + \binom{n}{k} = ∑_{i=0}^{min{k,n}} \binom{n}{i}.

The following bounds are well-known [16, Proposition 1.4]:

 (n/k)^k ≤ \binom{n}{\leq k} ≤ (en/k)^k  (1 ≤ k ≤ n). (2.1)
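The bounds (2.1) are easy to confirm numerically for small parameters (a throwaway Python check, not part of the development):

```python
import math

def binom_le(n, k):
    """The partial binomial sum binom(n, <= k)."""
    return sum(math.comb(n, i) for i in range(min(k, n) + 1))

# Verify (n/k)^k <= binom(n, <= k) <= (e*n/k)^k for 1 <= k <= n.
for n in range(1, 25):
    for k in range(1, n + 1):
        s = binom_le(n, k)
        assert (n / k) ** k <= s <= (math.e * n / k) ** k
print("bounds (2.1) hold for all 1 <= k <= n < 25")
```
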

### 2.2. Analytic preliminaries

For a finite set X, we let ℝ^X denote the linear space of real functions ϕ : X → ℝ. This space is equipped with the usual norms and inner product:

 ‖ϕ‖_∞ = max_{x∈X} |ϕ(x)|  (ϕ ∈ ℝ^X),
 ‖ϕ‖_1 = ∑_{x∈X} |ϕ(x)|  (ϕ ∈ ℝ^X),
 ⟨ϕ, ψ⟩ = ∑_{x∈X} ϕ(x)ψ(x)  (ϕ, ψ ∈ ℝ^X).

The support of ϕ ∈ ℝ^X is the subset supp ϕ = {x ∈ X : ϕ(x) ≠ 0}. The pointwise (Hadamard) product of ϕ, ψ ∈ ℝ^X is denoted ϕψ and given by (ϕψ)(x) = ϕ(x)ψ(x). The tensor product of ϕ ∈ ℝ^X and ψ ∈ ℝ^Y is the function ϕ ⊗ ψ ∈ ℝ^{X×Y} given by (ϕ ⊗ ψ)(x, y) = ϕ(x)ψ(y). The tensor product ϕ ⊗ ϕ ⊗ ⋯ ⊗ ϕ (m times) is abbreviated ϕ^{⊗m}. Tensor product notation generalizes to partial functions in the natural way: if ϕ and ψ are partial real functions on X and Y, respectively, then ϕ ⊗ ψ is a partial function on X × Y with domain dom ϕ × dom ψ and is given by (ϕ ⊗ ψ)(x, y) = ϕ(x)ψ(y) on that domain. Similarly, ϕ^{⊗m} is a partial function on X^m with domain (dom ϕ)^m.

We now recall the Fourier transform on {0,1}^n. For a subset S ⊆ {1, 2, …, n}, define χ_S : {0,1}^n → {−1, 1} by χ_S(x) = (−1)^{∑_{i∈S} x_i}. Then every function ϕ : {0,1}^n → ℝ has a unique representation of the form ϕ = ∑_S \hat{ϕ}(S) χ_S, where \hat{ϕ}(S) = 2^{−n} ∑_x ϕ(x)χ_S(x). The reals \hat{ϕ}(S) are the Fourier coefficients of ϕ, and the mapping ϕ ↦ \hat{ϕ} is the Fourier transform of ϕ.
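As a sanity check on the definitions, the transform can be computed by brute force for a small example (an illustrative Python sketch with our own naming):

```python
from itertools import product

def fourier_coeff(phi, S, n):
    """Fourier coefficient of phi at S, with characters chi_S(x) = (-1)^(sum of x_i, i in S)."""
    return sum(phi[x] * (-1) ** sum(x[i] for i in S)
               for x in product((0, 1), repeat=n)) / 2 ** n

n = 2
phi = {x: x[0] ^ x[1] for x in product((0, 1), repeat=n)}  # the two-bit XOR
subsets = [(), (0,), (1,), (0, 1)]
coeffs = {S: fourier_coeff(phi, S, n) for S in subsets}

# The representation phi = sum over S of phihat(S) * chi_S is exact pointwise.
for x in product((0, 1), repeat=n):
    recon = sum(c * (-1) ** sum(x[i] for i in S) for S, c in coeffs.items())
    assert abs(recon - phi[x]) < 1e-9
print(coeffs)  # XOR has weight only on the empty set and on {0, 1}
```
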

### 2.3. Probability

We view probability distributions first and foremost as real functions and use the notational shorthands above. In particular, we write supp μ to refer to the support of the probability distribution μ, and μ × λ to refer to the Cartesian product of the distributions μ and λ. The notation x ∼ μ means that the random variable x is distributed according to μ. We let B(n, p) denote the binomial distribution with n trials and success probability p.

###### Fact 2.1.

For any integer n ≥ 1 and any reals 0 ≤ p < 1 and 0 < q ≤ 1,

 E_{s∼B(n−1,p)} 1/√(n−s) ≤ 1/√((1−p)n), (2.2)
 E_{s∼B(n−1,q)} 1/√(s+1) ≤ 1/√(qn), (2.3)
 E_{s∼B(n,p)} |s − pn| ≤ √(p(1−p)n). (2.4)
###### Proof.

For (2.2), we have

 ( E_{s∼B(n−1,p)} 1/√(n−s) )² ≤ E_{s∼B(n−1,p)} 1/(n−s)
  = ∑_{s=0}^{n−1} \binom{n−1}{s} p^s (1−p)^{n−1−s} · 1/(n−s)
  = (1/((1−p)n)) ∑_{s=0}^{n−1} \binom{n}{s} p^s (1−p)^{n−s}
  = (1 − p^n)/((1−p)n),

where the first step follows from the Cauchy–Schwarz inequality, and the last step uses the binomial theorem. Taking square roots yields (2.2). The bound (2.3) follows from (2.2), since the distribution of s + 1 for s ∼ B(n−1, q) coincides with the distribution of n − s for s ∼ B(n−1, 1−q). For (2.4),

 ( E_{s∼B(n,p)} |s − pn| )² ≤ E_{s∼B(n,p)} (s − pn)² = p(1−p)n,

where the first step uses the Cauchy–Schwarz inequality, and the second step uses the fact that the binomial distribution B(n, p) has variance p(1−p)n. Taking square roots yields (2.4). ∎
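The three estimates of Fact 2.1 can be spot-checked by direct summation over the binomial distribution (a numerical sanity check under our own naming):

```python
import math

def expect(n, p, f):
    """E f(s) for s ~ B(n, p), by direct summation over the support."""
    return sum(math.comb(n, s) * p ** s * (1 - p) ** (n - s) * f(s)
               for s in range(n + 1))

for n in (2, 5, 20):
    for p in (0.1, 0.5, 0.9):
        assert expect(n - 1, p, lambda s: 1 / math.sqrt(n - s)) <= 1 / math.sqrt((1 - p) * n)  # (2.2)
        assert expect(n - 1, p, lambda s: 1 / math.sqrt(s + 1)) <= 1 / math.sqrt(p * n)        # (2.3)
        assert expect(n, p, lambda s: abs(s - p * n)) <= math.sqrt(p * (1 - p) * n)            # (2.4)
print("Fact 2.1 verified on a small grid")
```
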

### 2.4. Approximation by polynomials

We let deg p denote the total degree of a multivariate polynomial p. In this paper, we use the terms "degree" and "total degree" interchangeably, preferring the former for brevity. Let ϕ : X → {0,1} be given, for a finite subset X ⊂ ℝ^n. The ϵ-approximate degree of ϕ, denoted deg_ϵ(ϕ), is the least degree of a real polynomial p such that |ϕ(x) − p(x)| ≤ ϵ for all x ∈ X. We generalize this definition to partial functions ϕ on X by defining deg_ϵ(ϕ) as the least degree of a real polynomial p with

 |ϕ(x) − p(x)| ≤ ϵ  for x ∈ dom ϕ,
 |p(x)| ≤ 1 + ϵ  for x ∈ X ∖ dom ϕ. (2.5)

For a (possibly partial) real function ϕ on a finite subset X ⊂ ℝ^n, we define E(ϕ, d) to be the least ϵ such that (2.5) holds for some polynomial p of degree at most d. In this notation, deg_ϵ(ϕ) = min{d : E(ϕ, d) ≤ ϵ}. The canonical setting of the error parameter is ϵ = 1/3, which is without loss of generality since the error in a uniform approximation of a Boolean function can be reduced from any given constant in [0, 1/2) to any other constant in (0, 1/2) with only a constant-factor increase in the degree of the approximant. One of the earliest results on the approximation of Boolean functions by polynomials is the following seminal theorem due to Nisan and Szegedy.
###### Theorem 2.2 (Nisan and Szegedy).

 deg_{1/3}(AND_n) = Θ(√n).

### 2.5. Multiparty communication

An excellent introduction to communication complexity theory is the monograph by Kushilevitz and Nisan. In our overview, we will limit ourselves to key definitions and notation. This paper uses the standard model of randomized multiparty communication, known as the number-on-the-forehead model. Let F be a (possibly partial) Boolean function on X_1 × X_2 × ⋯ × X_k, for some finite sets X_1, X_2, …, X_k. The model features k players. A given input (x_1, x_2, …, x_k) ∈ X_1 × X_2 × ⋯ × X_k is distributed among the players by placing x_i on the forehead of party i (for i = 1, 2, …, k). In other words, party i knows x_1, …, x_{i−1}, x_{i+1}, …, x_k but not x_i. The players communicate according to an agreed-upon protocol by writing bits on a shared blackboard, visible to them all. They additionally have access to a shared source of random bits, which they can use in deciding what messages to send. Their goal is to accurately compute the value of F on any given input in the domain of F. An ϵ-error communication protocol for F is one which, on every input in the domain of F, produces the correct answer with probability at least 1 − ϵ. The cost of a communication protocol is the total number of bits written to the blackboard in the worst case on any input. The ϵ-error randomized communication complexity of F, denoted R_ϵ(F), is the least cost of an ϵ-error randomized communication protocol for F. As usual, the standard setting of the error parameter is ϵ = 1/3, which is without loss of generality since the error probability in a communication protocol can be efficiently reduced by running the protocol several times independently and outputting the majority answer.

The communication problems of interest to us are generalized inner product GIP_{n,k} and set disjointness DISJ_{n,k}, given by

 GIP_{n,k}(X) = ⨁_{i=1}^{n} ⋀_{j=1}^{k} X_{i,j},    DISJ_{n,k}(X) = ⋀_{i=1}^{n} ⋁_{j=1}^{k} ¬X_{i,j}.

These k-party communication problems are both defined on n × k Boolean matrices, where the jth party receives as input all but the jth column of the matrix. The disjointness function evaluates to true if and only if the input matrix does not have an all-ones row, whereas the generalized inner product function evaluates to true if and only if the number of all-ones rows is odd. We also consider a partial Boolean function UDISJ_{n,k} on {0,1}^{n×k}, called unique set disjointness and defined as the restriction of DISJ_{n,k} to matrices with at most one all-ones row. In other words, UDISJ_{n,k}(X) is undefined if X has two or more all-ones rows, and is given by UDISJ_{n,k}(X) = DISJ_{n,k}(X) otherwise.

Let F be a (possibly partial) Boolean function on X_1 × X_2 × ⋯ × X_k, representing a k-party communication problem, and let f be a (possibly partial) Boolean function on {0,1}^m. We view the composition f ∘ F, given by (f ∘ F)(X^1, …, X^m) = f(F(X^1), …, F(X^m)), as a k-party communication problem on X_1^m × X_2^m × ⋯ × X_k^m. It will be helpful to keep in mind that for all positive integers m, n, and k,

 GIP_{mn,k} = XOR_m ∘ GIP_{n,k}, (2.6)
 DISJ_{mn,k} = AND_m ∘ DISJ_{n,k}, (2.7)
 UDISJ_{mn,k} = UAND_m ∘ UDISJ_{n,k}. (2.8)

Similarly, if F_i (for i = 1, 2, …, m) is a (possibly partial) k-party communication problem on X_1 × X_2 × ⋯ × X_k, we view F_1 ⊕ F_2 ⊕ ⋯ ⊕ F_m as a k-party communication problem on X_1^m × X_2^m × ⋯ × X_k^m.
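The identities (2.6) and (2.7) amount to splitting an mn × k matrix into m blocks of n rows; for tiny parameters they can be verified exhaustively (an illustrative Python check with our own helper names):

```python
from itertools import product

def gip(X):
    return sum(all(row) for row in X) % 2  # parity of all-ones rows

def disj(X):
    return int(not any(all(row) for row in X))  # 1 iff no all-ones row

m, n, k = 2, 2, 2
for bits in product((0, 1), repeat=m * n * k):
    X = [bits[r * k:(r + 1) * k] for r in range(m * n)]
    blocks = [X[i:i + n] for i in range(0, m * n, n)]
    assert gip(X) == sum(gip(B) for B in blocks) % 2  # (2.6): GIP is the XOR of the blocks
    assert disj(X) == min(disj(B) for B in blocks)    # (2.7): DISJ is the AND of the blocks
print("identities (2.6) and (2.7) verified for m = n = k = 2")
```
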

### 2.6. Cylinder intersections

Let X_1, X_2, …, X_k be nonempty finite sets. A cylinder intersection on X_1 × X_2 × ⋯ × X_k is any function χ : X_1 × X_2 × ⋯ × X_k → {0,1} of the form

 χ(x1,…,xk)=k∏i=1χi(x1,…,xi−1,xi+1,…,xk), (2.9)

where χ_i : X_1 × ⋯ × X_{i−1} × X_{i+1} × ⋯ × X_k → {0,1}. In other words, a cylinder intersection is the product of k Boolean functions, where the ith function does not depend on the ith coordinate but may depend arbitrarily on the other coordinates. For a given set S ⊆ {1, 2, …, k}, we further specialize this notion to S-cylinder intersections, defined as functions of the form

 χ(x1,…,xk)=∏i∈Sχi(x1,…,xi−1,xi+1,…,xk)

for some Boolean functions χ_i : X_1 × ⋯ × X_{i−1} × X_{i+1} × ⋯ × X_k → {0,1}. Finally, an ℓ-cylinder intersection on X_1 × X_2 × ⋯ × X_k is any S-cylinder intersection for a subset S ⊆ {1, 2, …, k} of cardinality at most ℓ. Cylinder intersections were introduced by Babai, Nisan, and Szegedy and play a fundamental role in the theory due to the following fact.

###### Fact 2.3.

Let Π be a deterministic k-party communication protocol with cost c. Then

 Π = ∑_{i=1}^{2^c} a_i χ_i

for some min{c, k}-cylinder intersections χ_1, χ_2, …, χ_{2^c} and some a_1, a_2, …, a_{2^c} ∈ {0, 1}.

A simple proof of Fact 2.3 appears in prior work. Recall that a randomized protocol of cost c is a probability distribution on deterministic protocols of cost at most c. With this in mind, one easily infers the following from Fact 2.3:

###### Corollary 2.4.

Let F be a possibly partial Boolean function on X_1 × X_2 × ⋯ × X_k. If R_ϵ(F) = c, then

 |F(x_1, …, x_k) − Π(x_1, …, x_k)| ≤ ϵ  for (x_1, …, x_k) ∈ dom F,
 |Π(x_1, …, x_k)| ≤ 1  for (x_1, …, x_k) ∈ X_1 × X_2 × ⋯ × X_k,

where Π is a linear combination Π = ∑_i a_i χ_i of min{c, k}-cylinder intersections χ_i with ∑_i |a_i| ≤ 2^c.

### 2.7. Discrepancy

For a (possibly partial) Boolean function F on X_1 × X_2 × ⋯ × X_k, a probability distribution μ on the domain of F, and a set S ⊆ {1, 2, …, k}, the S-discrepancy of F with respect to μ is defined as

 disc_S(F, μ) = max_χ |⟨(−1)^F, μχ⟩| = max_χ | ∑_{x∈dom F} (−1)^{F(x)} μ(x) χ(x) |,

where the maximum is over S-cylinder intersections χ. Further maximizing over S gives the key notions of ℓ-discrepancy and discrepancy, as follows:

 disc_ℓ(F, μ) = max_{S ⊆ {1,2,…,k}, |S| ≤ ℓ} disc_S(F, μ),    disc(F, μ) = max_{S ⊆ {1,2,…,k}} disc_S(F, μ).

By definition,

 disc_S(F, μ) ≤ disc_ℓ(F, μ) ≤ disc(F, μ)

for every ℓ and every set S of cardinality at most ℓ.

In light of Corollary 2.4, upper bounds on the discrepancy give lower bounds on the communication complexity. This fundamental technique is known as the discrepancy method [11, 3, 18]:

###### Theorem 2.5 (Discrepancy method).

For every possibly partial Boolean function F on X_1 × X_2 × ⋯ × X_k and every probability distribution μ on the domain of F,

 2^{R_ϵ(F)} ≥ (1 − 2ϵ)/disc(F, μ). (2.10)

More generally,

 2^{R_ϵ(F)} ≥ (1 − 2ϵ)/disc_{min{R_ϵ(F), k}}(F, μ). (2.11)

A proof of (2.10) can be found in the standard literature; that same proof carries over to discrepancy with respect to any given family of functions, and in particular establishes (2.11) as well.

A useful property of discrepancy is its convexity in the second argument, as formalized by the following proposition.

###### Proposition 2.6 (Convexity of discrepancy).

Let F be a possibly partial Boolean function on X_1 × X_2 × ⋯ × X_k, and let μ and λ be probability distributions on the domain of F. Then for every p ∈ [0, 1] and every set S ⊆ {1, 2, …, k},

 disc_S(F, pμ + (1−p)λ) ≤ p · disc_S(F, μ) + (1−p) · disc_S(F, λ),

and likewise for disc_ℓ and disc.

By induction, Proposition 2.6 immediately generalizes to any finite convex combination of probability distributions. It is this more general form that we will invoke in our applications.

###### Proof of Proposition 2.6.

Immediate from the following inequality, valid for any cylinder intersection χ:

 |⟨(−1)^F, (pμ + (1−p)λ)χ⟩| ≤ p |⟨(−1)^F, μχ⟩| + (1−p) |⟨(−1)^F, λχ⟩|,

where for partial functions the inner products are restricted to the domain of F. ∎

It is clear that discrepancy is a continuous function of the input distribution. The following proposition quantifies this continuity.

###### Proposition 2.7 (Continuity of discrepancy).

For any k-party communication problem F and any probability distributions μ and ~μ on the domain of F,

 disc(F, μ) ≤ disc(F, ~μ) + ‖μ − ~μ‖_1.

More generally, for any set S ⊆ {1, 2, …, k}, any possibly partial k-party communication problems F_1, F_2, …, F_m, and any probability distributions μ_i and ~μ_i on the corresponding domains,

 disc_S( ⨁_{i=1}^m F_i, ⨂_{i=1}^m μ_i ) ≤ ∑_{A ⊆ {1,2,…,m}} disc_S( ⨁_{i∈A} F_i, ⨂_{i∈A} ~μ_i ) ∏_{i∉A} ‖μ_i − ~μ_i‖_1,

and likewise for disc_ℓ and disc.

###### Proof.

It suffices to prove the claim for disc_S. Fix a set S ⊆ {1, 2, …, k} and an S-cylinder intersection χ with

 disc_S( ⨁_{i=1}^m F_i, ⨂_{i=1}^m μ_i ) = | ⟨ ⨂_{i=1}^m (−1)^{F_i}, χ · ⨂_{i=1}^m μ_i ⟩ |,

where as usual the inner product on the right-hand side is restricted to the domain of ⨁_{i=1}^m F_i. Then

 disc_S( ⨁_{i=1}^m F_i, ⨂_{i=1}^m μ_i )
  = | ⟨ ⨂_{i=1}^m (−1)^{F_i}, χ · ⨂_{i=1}^m ( ~μ_i + (μ_i − ~μ_i) ) ⟩ |
  = | ∑_{A ⊆ {1,2,…,m}} ⟨ ⨂_{i=1}^m (−1)^{F_i}, χ · Λ_A ⟩ |, (2.12)

where Λ_A is given by Λ_A(x_1, …, x_m) = ∏_{i∈A} ~μ_i(x_i) · ∏_{i∉A} (μ_i(x_i) − ~μ_i(x_i)). Continuing,

 | ⟨ ⨂_{i=1}^m (−1)^{F_i}, χ · Λ_A ⟩ |
  = | ∑_{x_1,…,x_m} χ(x) ∏_{i∈A} (−1)^{F_i(x_i)} ~μ_i(x_i) · ∏_{i∉A} (−1)^{F_i(x_i)} (μ_i(x_i) − ~μ_i(x_i)) |
  ≤ ∑_{x_i : i∉A} | ∑_{x_i : i∈A} χ(x) ∏_{i∈A} (−1)^{F_i(x_i)} ~μ_i(x_i) | · ∏_{i∉A} |μ_i(x_i) − ~μ_i(x_i)|
  ≤ ∑_{x_i : i∉A} disc_S( ⨁_{i∈A} F_i, ⨂_{i∈A} ~μ_i ) ∏_{i∉A} |μ_i(x_i) − ~μ_i(x_i)|
  = disc_S( ⨁_{i∈A} F_i, ⨂_{i∈A} ~μ_i ) ∏_{i∉A} ‖μ_i − ~μ_i‖_1, (2.13)

where the next-to-last step is legitimate because for any fixing of x_i, i ∉ A, the function χ continues to be an S-cylinder intersection with respect to the remaining coordinates. In view of (2.12) and (2.13), the proof is complete. ∎

## 3. Inner product

In this section, we determine the communication complexity of the inner product problem for k ≥ log n players. Our proofs build on the classic lower and upper bounds for this problem for k ≤ log n, due to Babai et al. and Grolmusz, respectively.

### 3.1. Lower bound

For the lower bound, we use the generalization of the discrepancy method given by Theorem 2.5. We will work with the following input distribution.

###### Definition 3.1.

Let υ_{n,k,ℓ} denote the probability distribution on Boolean matrices {0,1}^{n×k} whereby each row is chosen independently and uniformly at random from the set of vectors in {0,1}^k of Hamming weight at least k − ℓ.

In particular, υ_{n,k,k} is the uniform probability distribution on {0,1}^{n×k}. In this special case, a strong upper bound on the discrepancy of generalized inner product was obtained in the seminal work of Babai, Nisan, and Szegedy.

###### Theorem 3.2 (Babai, Nisan, and Szegedy).

For any positive integers n and k,

 disc(GIP_{n,k}, υ_{n,k,k}) ≤ (1 − 1/4^{k−1})^n.
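For k = 2 players, cylinder intersections are exactly combinatorial rectangles, and the bound of Theorem 3.2 can be confirmed by exhaustive search in tiny cases (an illustration only; nothing in the proofs relies on it):

```python
from itertools import product

def gip2(x, y):
    """GIP_{n,2}: parity of the number of coordinates where x and y are both 1."""
    return sum(a & b for a, b in zip(x, y)) % 2

def disc_uniform(n):
    """Brute-force disc(GIP_{n,2}, uniform): maximize the correlation over all
    rectangles A x B with A, B subsets of {0,1}^n."""
    cube = list(product((0, 1), repeat=n))
    best = 0.0
    for A in product((0, 1), repeat=len(cube)):
        for B in product((0, 1), repeat=len(cube)):
            corr = sum((-1) ** gip2(x, y)
                       for x, a in zip(cube, A) if a
                       for y, b in zip(cube, B) if b)
            best = max(best, abs(corr) / 4 ** n)
    return best

for n in (1, 2):
    assert disc_uniform(n) <= (1 - 1 / 4) ** n  # Theorem 3.2 with k = 2
print(disc_uniform(1), disc_uniform(2))
```
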

We generalize this discrepancy bound to υ_{n,k,ℓ} for any ℓ ≤ k.

###### Theorem 3.3.

For any positive integers n, k, and ℓ with ℓ ≤ k,

 disc_ℓ(GIP_{n,k}, υ_{n,k,ℓ}) ≤ ( 1 − 1/(2^{ℓ−1} \binom{k}{\leq ℓ}) )^n.

For ℓ = k, this discrepancy bound essentially matches that of Theorem 3.2. In the setting of interest to us, ℓ ≪ k, however, the new bound is substantially stronger.

###### Proof of Theorem 3.3.

Since the communication problem GIP_{n,k} and the probability distribution υ_{n,k,ℓ} are both symmetric with respect to the players, we have

 disc_ℓ(GIP_{n,k}, υ_{n,k,ℓ}) = disc_{{1,2,…,ℓ}}(GIP_{n,k}, υ_{n,k,ℓ}). (3.1)

For a given set S ⊆ {1, 2, …, n} and a probability distribution μ on matrices {0,1}^{n×k}, let μ|S stand for the probability distribution induced by μ after conditioning on the event that a row is all-ones in columns ℓ + 1, ℓ + 2, …, k if and only if its index is in S. Observe that υ_{n,k,ℓ}|S is a probability distribution on matrices whereby X|_S and X|_{\bar S} are distributed independently such that

 X|_{S, {1,2,…,ℓ}} ∼ υ_{|S|, ℓ, ℓ}, (3.2)
 X|_{S, {ℓ+1, ℓ+2, …, k}} = the all-ones matrix, (3.3)

and X|_{\bar S, {ℓ+1, ℓ+2, …, k}} does not have an all-ones row. In particular, the rows in \bar S do not affect the value of the function. The conditional probability distribution of X|_S given any value of X|_{\bar S} is always given by (3.2)–(3.3). Viewing υ_{n,k,ℓ}|S as the convex combination of these conditional probability distributions, corresponding to every possible value of X|_{\bar S}, we conclude by Proposition 2.6 that

 disc_{{1,2,…,ℓ}}(GIP_{n,k}, υ_{n,k,ℓ}|S) ≤ disc(GIP_{|S|,ℓ}, υ_{|S|,ℓ,ℓ}). (3.4)

We will now express υ_{n,k,ℓ} as a convex combination of the probability distributions υ_{n,k,ℓ}|S and use the convexity of discrepancy to complete the proof. Specifically, let p = 2^ℓ / \binom{k}{\leq ℓ} denote the probability that a given row is all-ones in columns ℓ + 1, ℓ + 2, …, k. Then

 disc_{{1,2,…,ℓ}}(GIP_{n,k}, υ_{n,k,ℓ})
  = disc_{{1,2,…,ℓ}}( GIP_{n,k}, E_{s∼B(n,p)} E_{S⊆{1,2,…,n}, |S|=s} υ_{n,k,ℓ}|S )   [by definition of υ_{n,k,ℓ}]
  ≤ E_{s∼B(n,p)} E_{S⊆{1,2,…,n}, |S|=s} disc_{{1,2,…,ℓ}}(GIP_{n,k}, υ_{n,k,ℓ}|S)   [by Proposition 2.6]
  ≤ E_{s∼B(n,p)} disc(GIP_{s,ℓ}, υ_{s,ℓ,ℓ})   [by (3.4)]
  ≤ E_{s∼B(n,p)} (1 − 1/4^{ℓ−1})^s   [by Theorem 3.2]
  = ∑_{s=0}^{n} \binom{n}{s} (1 − 1/4^{ℓ−1})^s p^s (1 − p)^{n−s}