 # Near-Optimal Communication Lower Bounds for Approximate Nash Equilibria

We prove an N^{2−o(1)} lower bound on the randomized communication complexity of finding an ϵ-approximate Nash equilibrium (for constant ϵ > 0) in a two-player N × N game.


## 1 Introduction

How many bits of communication are needed to find an ϵ-approximate Nash equilibrium (for a small constant ϵ > 0) in a two-player game? More precisely:

• Alice holds her N × N payoff matrix of some constant precision.

• Bob holds his N × N payoff matrix of some constant precision.

• Output an ϵ-approximate Nash equilibrium: a mixed strategy A for Alice and a mixed strategy B for Bob such that neither player can unilaterally change their strategy and increase their expected payoff by more than ϵ. (See Section 1.3 for a formal definition.)

It is well known that such approximate equilibria have a concise O(log² N)-bit description: one may assume wlog that A and B are supported on at most O(log N) actions [LMM03]. There is a trivial upper bound of O(N²) by communicating an entire payoff matrix. Previous work [BR17] showed that finding an ϵ-approximate Nash equilibrium requires N^c bits of communication for a small constant c > 0. In this work, we improve this to a near-optimal N^{2−o(1)} lower bound. Our main theorem is slightly more general, as it also applies for games of asymmetric dimensions.

###### Theorem 1.

There exists a constant ϵ > 0 such that the randomized communication complexity of finding an ϵ-approximate Nash equilibrium in an N × N game is N^{2−o(1)}.

It is interesting to note that there is a polylog(N)-communication protocol for computing an ϵ-correlated equilibrium of an N × N game [BM07]. Hence our result is the first that separates approximate Nash and correlated equilibrium.

Our result also implies the first near-quadratic lower bound for finding an approximate Nash equilibrium in the weaker query complexity model, where the algorithm has black-box oracle access to the payoff matrices (previous work established such lower bounds only against deterministic algorithms [FS16]). In this query complexity model, there is an Õ(N)-query algorithm for computing an ϵ-coarse correlated equilibrium of an N × N game [GR16]. Hence our result is the first that separates approximate Nash and coarse correlated equilibrium in the query complexity model. See Table 1 for a summary of known bounds. (We thank Yakov Babichenko for his help in understanding these connections and other insightful communication.)

### 1.1 Background

Nash equilibrium is the central solution concept in game theory. It is named after John Nash who, more than 60 years ago, proved that every game has an equilibrium [Nas51]. Once players are at an equilibrium, they do not have an incentive to deviate. However, Nash’s theorem does not explain how the players arrive at an equilibrium in the first place.

Over the last several decades, many dynamics, or procedures by which players in a repeated game update their respective strategies to adapt to other players’ strategies, have been proposed since Nash’s result (e.g., [Bro51, Rob51, KL93, HM03, FY06]). But despite significant effort, we do not know of any plausible dynamics that converge even to an approximate Nash equilibrium. It is thus natural to conjecture that there are no such dynamics. However, one has to be careful about defining “plausible” dynamics. The first example of dynamics we consider implausible is “players agree a priori on a Nash equilibrium”. The uncoupled dynamics model proposed by Hart and Mas-Colell [HM03] rules out such trivialities by requiring that a player’s strategy depend only on her own utility function and the history of other players’ actions. Another example of implausible dynamics that converge to a Nash equilibrium is exhaustive search, which enumerates over the entire search space. (Exhaustive search dynamics can converge to an approximate Nash equilibrium in finite time by enumerating over an ϵ-net of the search space.) We thus consider a second natural desideratum: dynamics should converge (much) faster than exhaustive search. Note that the two restrictions (uncoupledness and fast convergence) are still very minimal: it is easy to come up with dynamics that satisfy both and yet cannot plausibly be expected to predict players’ behavior. But, since we are after an impossibility result, it is fair to say that if we can rule out any dynamics that satisfy these two restrictions and converge to a Nash equilibrium, we have strong evidence against any plausible dynamics.

A beautiful observation by Conitzer and Sandholm [CS04] and Hart and Mansour [HM10] is that the communication complexity of computing an (approximate) Nash equilibrium, in the natural setting where each player knows her own utility function, precisely captures (up to a logarithmic factor) the number of rounds for an arbitrary uncoupled dynamics to converge to an (approximate) Nash equilibrium. Thus the question of ruling out plausible dynamics is reduced to the question of proving lower bounds on communication complexity. There are also other good reasons to study the communication complexity of approximate Nash equilibria; see e.g. [Rou14].

### 1.2 Related work

The problem of computing (approximate) Nash equilibrium has been studied extensively, mostly in three models: communication complexity, query complexity, and computational complexity.

##### Communication complexity.

The study of the communication complexity of Nash equilibria was initiated by Conitzer and Sandholm [CS04], who proved a quadratic lower bound on the communication complexity of deciding whether a game has a pure equilibrium, even for zero-one payoffs (note that for pure equilibrium this also rules out any more efficient approximation). Hart and Mansour [HM10] proved exponential lower bounds for pure and exact mixed Nash equilibrium in n-player games. Roughgarden and Weinstein [RW16] proved communication complexity lower bounds on the related problem of finding an approximate Brouwer fixed point (on a grid). In [BR17], in addition to the lower bound for two-player games which we improve here, there is also an exponential lower bound for n-player games. The same paper also posed the open problem of settling the communication complexity of approximate correlated equilibrium in two-player games; partial progress has been made by [GK17, KS17], but to date the problem of determining the communication complexity of ϵ-approximate correlated equilibrium remains open (even at the granularity of polylog(N) vs poly(N)). On the algorithmic side, Czumaj et al. [CDF16] gave a polylogarithmic-communication protocol for computing a 0.382-approximate Nash equilibrium in two-player games, improving upon [GP14a].

##### Query complexity.

In the query complexity model, the algorithm has black-box oracle access to the payoff matrix of each player. Notice that this model is strictly weaker than the communication complexity model (hence our communication lower bound applies to this model as well). For the deterministic query complexity of ϵ-approximate Nash equilibrium in two-player games, Fearnley et al. [FGGS15] proved a linear lower bound, which was subsequently improved to a (tight) quadratic one by Fearnley and Savani [FS16]. For randomized algorithms, the only previous lower bound, also by [FS16], was far below quadratic. As mentioned earlier, Goldberg and Roth [GR16] can find an approximate coarse correlated equilibrium with Õ(N) queries (two-player N × N game) or poly(n) queries (n-player game with two actions per player). In the latter regime of n players and two actions per player, a long sequence of works [HN16, Bab12, CCT17, Rub16] eventually established a 2^Ω(n) lower bound on the query complexity of ϵ-approximate Nash equilibrium. (This last result is implied by and was the starting point for the exponential query complexity lower bound in [BR17].) Finally, some of the aforementioned results are inspired by a query complexity lower bound for approximate fixed points due to Hirsch et al. [HPV89] and its adaptation to the ℓ∞-norm in [Rub16]; this construction will also be the starting point of our reduction.

##### Computational complexity.

For computational complexity, the problem of finding an exact Nash equilibrium in a two-player game is PPAD-complete [DGP09, CDT09]. Following a sequence of improvements [KPS09, DMP09, DMP07, BBM10, TS08], we know that a 0.3393-approximate Nash equilibrium can be computed in polynomial time. But there exists some constant ϵ > 0 such that, assuming the “Exponential Time Hypothesis (ETH) for PPAD”, computing an ϵ-approximate Nash equilibrium requires N^(log^(1−o(1)) N) time [Rub16], which is essentially tight by the quasi-polynomial-time algorithm of [LMM03].

### 1.3 Definition of ϵ-Nash

A two-player game is defined by two utility functions (or payoff matrices) U_A, U_B : [N] × [N] → [0, 1]. A mixed strategy for Alice (resp. Bob) is a distribution A (resp. B) over the set of actions [N]. We say that (A, B) is an ϵ-approximate Nash equilibrium (ϵ-ANE) if every alternative Alice-strategy A′ performs at most ϵ better than A against Bob’s mixed strategy B, and the same holds with roles reversed. Formally, the condition for Alice is

$$\mathop{\mathbb{E}}_{a\sim A,\; b\sim B}\big[U_A(a,b)\big] \;\ge\; \max_{A'}\; \mathop{\mathbb{E}}_{a'\sim A',\; b\sim B}\big[U_A(a',b)\big] - \epsilon.$$
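To make the definition concrete, here is a minimal sketch (in Python; function and variable names are illustrative, not from the paper) that checks the ϵ-ANE condition by brute force. Since expected payoff is linear in a player's own mixed strategy, it suffices to compare against pure deviations.

```python
def is_eps_ane(UA, UB, A, B, eps):
    """Check whether mixed strategies (A, B) form an eps-approximate Nash
    equilibrium of the bimatrix game (UA, UB).

    UA, UB: N x N payoff matrices (lists of lists with entries in [0, 1]).
    A, B:   mixed strategies (probability vectors of length N).
    """
    N = len(UA)
    # Alice's expected payoff under (A, B).
    alice_val = sum(A[a] * B[b] * UA[a][b] for a in range(N) for b in range(N))
    # Best deviation for Alice: a pure strategy attains the maximum.
    alice_best = max(sum(B[b] * UA[a][b] for b in range(N)) for a in range(N))
    # Symmetrically for Bob.
    bob_val = sum(A[a] * B[b] * UB[a][b] for a in range(N) for b in range(N))
    bob_best = max(sum(A[a] * UB[a][b] for a in range(N)) for b in range(N))
    return alice_val >= alice_best - eps and bob_val >= bob_best - eps
```

For example, in matching pennies the uniform profile passes this check, while any pure profile fails it.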

## 2 Technical Overview

Our proof follows the high-level approach of [BR17]; see also the lecture notes [Rou18] for exposition. The approach of [BR17] consists of four steps.

1. Query complexity lower bound for the well-known PPAD-complete EoL problem.

2. Lifting of the above into a communication lower bound for a two-party version of EoL.

3. Reduction from EoL to ϵ-BFP, a problem of finding an (approximate) Brouwer fixed point.

4. Constructing a hard two-player game that combines problems from both Steps 2 and 3.

In this paper we improve the result from [BR17] by optimizing Steps 1, 2, and 4.

### 2.1 Steps 1–2: Lower bound for End-of-Line

The goal of Steps 1–2 is to obtain a randomized communication lower bound for the End-of-Line (or EoL for short) problem: given an implicitly described graph on the vertex set [N] where a special vertex (vertex 1) is the start vertex of a path, find an end of a path or a non-special start of a path. The following definition is a “template” in that it does not yet specify the protocols π_v.

EoL template

• Input:  Alice and Bob receive inputs α and β that implicitly describe successor and predecessor functions S, P : [N] → [N]. Namely, for each v ∈ [N] there is a “low-cost” protocol π_v to compute the pair (S(v), P(v)).

• Output:  Define a digraph G on [N] where (u, v) ∈ E iff S(u) = v and P(v) = u. The goal is to output a vertex v such that either

  • v = 1 and v is a non-source or a sink in G; or

  • v ≠ 1 and v is a source or a sink in G.

The prior work [BR17] proved a polynomial lower bound for a version of EoL where the π_v had polylogarithmic communication cost c. The cost parameter is, surprisingly, very important: later reductions (in Step 4) will incur a blow-up in input size, and hence a quantitative reduction in the eventual lower bound, that is exponential in c. (Namely, when constructing payoff matrices in Step 4, the data defining a strategy for Alice will include a c-bit transcript of some π_v.)

In this work, we obtain an optimized lower bound:

###### Theorem 2.

There is a version of EoL with randomized communication complexity Ω̃(N) where the π_v have constant cost c = O(1).

Note:  Since we consider c = O(1), a c-bit transcript of a π_v cannot even name arbitrary log N-bit vertices in [N]. Thus we need to clarify what it means for π_v to “compute” (S(v), P(v)). The formal requirement is that the pair is some prescribed function of both v and the c-bit transcript of π_v. Concretely, we will fix some bounded-degree host graph G* independent of the inputs, and define graphs G as subgraphs of G*. For example, we can let π_v announce S(v) as “the i-th out-neighbor of v in G*”, which takes only O(1) bits to represent.

As in [BR17], our lower bound is obtained by first proving an analogous result for query complexity, and then applying a lifting theorem that escalates the query hardness into communication hardness. A key difference is that instead of a generic lifting theorem [GLM16, GPW17], as used by [BR17], we employ a less generic, but quantitatively better one [HN12, GP14b].

##### Step 1: Query lower bound.

The query complexity analogue of EoL is defined as follows.

Q-EoL_G for host digraph G = (V, E)

• Input:  An input x ∈ {0,1}^E describes a (spanning) subgraph G_x of G consisting of the edges e ∈ E such that x_e = 1.

• Output:  Find a vertex v such that either

  • v = 1 and v is a non-source or a sink in G_x; or

  • v ≠ 1 and v is a source or a sink in G_x.

We exhibit a bounded-degree host graph G such that any randomized decision tree needs to make Ω̃(N) queries to the input in order to solve Q-EoL_G. Moreover, the lower bound is proved using critical block sensitivity (cbs), a measure introduced by Huynh and Nordström [HN12] that lower bounds randomized query complexity (among other things); see Section 3.1 for definitions.

###### Lemma 3.

There is a bounded-degree host graph G such that cbs(Q-EoL_G) = Ω̃(N).

It is not hard to prove an Ω̃(N) bound for a complete host graph (equipped with successor/predecessor pointers), nor a weaker bound for a bounded-degree host graph (by reducing degrees in the complete graph via binary trees, which blows up the number of vertices). But to achieve both an Ω̃(N) bound and constant degree requires a careful choice of a host graph that has good enough routing properties. Our construction uses butterfly graphs.

Prior to this work, a near-linear randomized query lower bound was known for a bounded-degree Tseitin problem [GP14b], a canonical PPA-complete search problem. Since PPAD ⊆ PPA, our new lower bound is qualitatively stronger (also, the proof is more involved).

##### Step 2: Communication lower bound.

Let S ⊆ {0,1}^N × O be a query search problem (e.g., Q-EoL_G), that is, on input x ∈ {0,1}^N the goal is to output some o ∈ O such that (x, o) ∈ S. Any such S can be converted into a communication problem via gadget composition. Namely, fix some two-party function g : X × Y → {0,1}, called a gadget. The composed search problem S ∘ g^N is defined as follows: Alice holds α ∈ X^N, Bob holds β ∈ Y^N, and their goal is to find an o such that (x, o) ∈ S where

$$x \coloneqq g^N(\alpha,\beta) = \big(g(\alpha_1,\beta_1),\dots,g(\alpha_N,\beta_N)\big).$$

It is generally conjectured that the randomized communication complexity of S ∘ g^N is characterized by the randomized query complexity of S, provided the gadget is chosen carefully. This was proved in [GPW17], but only for a non-constant-size gadget where Alice’s input to each gadget is Θ(log N) bits. This is prohibitively large for us, since we seek protocols π_v of constant communication cost. We use instead a more restricted lifting theorem due to [GP14b] (building on [HN12]) that works for a constant-size gadget, but can only lift critical block sensitivity bounds.
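As a toy illustration of gadget composition (using XOR as a stand-in gadget; the actual lifting gadget of [GP14b] is a different, specific function), the composed input is obtained coordinate-wise:

```python
def g_xor(a, b):
    # Toy stand-in gadget; the lifting theorem requires a specific gadget.
    return a ^ b

def lift_input(alpha, beta, g=g_xor):
    """Compose coordinate-wise: x := g^N(alpha, beta)."""
    assert len(alpha) == len(beta)
    return tuple(g(a, b) for a, b in zip(alpha, beta))
```

A solution o is then valid for the communication problem iff (lift_input(α, β), o) is a valid input/solution pair of the query problem S.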

###### Lemma 4 ([GP14b]).

There is a fixed constant-size gadget g such that for any search problem S the randomized communication complexity of S ∘ g^N is at least Ω(cbs(S)).

Theorem 2 now follows by combining Lemmas 3 and 4. We need only verify that the composed problem Q-EoL_G ∘ g^N fits our EoL template. For v ∈ V consider the protocol π_v that computes (S(v), P(v)) as follows on input (α, β):

1. Alice sends all symbols α_e for edges e incident to v.

2. Bob privately computes all values x_e = g(α_e, β_e) for edges e incident to v.

3. Bob announces S(v) as the first out-neighbor of v in the subgraph determined by x if such an out-neighbor exists; otherwise Bob announces S(v) = v. Similarly for P(v).

This protocol has indeed cost O(1) because G is of bounded degree and g is of constant size.
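The steps above can be sketched in code (a simulation, not a real two-party protocol; the data structures `host_out[v]`/`host_in[v]`, listing the edges leaving/entering v in the host graph, are illustrative assumptions):

```python
def pi_v(v, alpha, beta, host_out, host_in, g):
    """Sketch of the constant-cost protocol computing (S(v), P(v)).
    Alice's lone message is her gadget inputs alpha[e] for the constantly
    many edges e incident to v; Bob then resolves everything privately."""
    msg = {e: alpha[e] for e in host_out[v] + host_in[v]}   # Alice -> Bob
    x = {e: g(msg[e], beta[e]) for e in msg}                # Bob evaluates gadgets
    succs = [w for (_, w) in host_out[v] if x[(v, w)]]
    preds = [u for (u, _) in host_in[v] if x[(u, v)]]
    # First out/in-neighbor in the subgraph, or v itself as a null pointer.
    return (succs[0] if succs else v, preds[0] if preds else v)
```

The message length is O(1) precisely because the host graph has bounded degree and each gadget input is constant-size.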

### 2.2 Step 3: Reduction to ϵ-BFP

By Brouwer’s fixed point theorem, any continuous function f : [0,1]^n → [0,1]^n has a fixed point, that is, a point x such that f(x) = x. The BFP query problem is to find such a fixed point, given oracle access to f. We will consider the easier ϵ-BFP problem, where we merely have to find an x such that x is ϵ-close to f(x).

A theorem of [Rub16] reduces Q-EoL to ϵ-BFP with a constant ϵ > 0. For our purposes, there are two downsides to using this theorem. First, it is a reduction between query complexity problems, which seems to undermine the lifting to communication we obtained in Step 2. (This obstacle was already encountered in [RW16] and resolved in [BR17].)

The second issue with [Rub16]’s reduction is that it blows up the search space. We can discretize [0,1]^n to obtain a finite search space. But even if the discretization used one bit per coordinate (and in fact we need a large constant number of bits), the dimension n is still larger than log N by yet another constant factor due to the seemingly-unavoidable use of error correcting codes. All in all we have a polynomial blow-up in the size of the search space, and while that was a non-issue for [Rub16, BR17], it is crucial for our fine-grained result.

Our approach for both obstacles is to postpone dealing with them to Step 4. But for all the magic to happen in Step 4, we need to properly set up some infrastructure before we conclude Step 3. Concretely, without changing the construction of f from [Rub16], we observe that it can be computed in a way that is “local” in two different ways (we henceforth say that f is doubly-local). Below is an informal description of what this means; see Section 4.4 for details.

• First, every point x ∈ [0,1]^n corresponds to a vertex v(x) from the host graph of the Q-EoL problem. (In fact, each x corresponds to zero, one, or two vertices from the host graph, where the two vertices are either identical or neighbors. For simplicity, in this informal discussion we refer to “the corresponding vertex”.) We observe that in order to compute f(x), one only needs local access to the neighborhood of v(x) in the Q-EoL (actual, not host) graph. A similar sense of locality was used in [BR17].

• Second, if we only want to compute the i-th coordinate of f(x), we do not even need to know the entire vector x. Rather, it suffices to know x_i, the values of x on a random subset of the coordinates, and the local information of the Q-EoL graph described in the previous bullet (including v(x)). This is somewhat reminiscent of the local decoding used in [Rub16] (but our locality is much simpler and does not require any PCP machinery).

###### Theorem 5 (Q-EoL to ϵ-BFP, informal [Rub16]).

There is a reduction from Q-EoL over N vertices to ϵ-BFP on a function f : [0,1]^n → [0,1]^n with n = O(log N), where f is “doubly-local”.

### 2.3 Step 4: Reduction to ϵ-Nash

The existence of a Nash equilibrium is typically proved using Brouwer’s fixed point theorem. McLennan and Tourky [MT05] proved the other direction, namely that the existence of a Nash equilibrium in a special imitation game implies the existence of a fixed point. Viewed as a reduction from Brouwer fixed point to Nash equilibrium, it turns out to be (roughly) approximation-preserving, and thus extremely useful in recent advances on hardness of approximation of Nash equilibrium in query complexity [Bab16, CCT17, Rub16], computational complexity [Rub15, Rub16], and communication complexity [RW16, BR17].

In the basic imitation game, we think of Alice’s and Bob’s action spaces as (discretizations of) [0,1]^n, and define their utility functions as follows. First, Alice chooses x^(a) that should imitate the x^(b) chosen by Bob:

$$U_A\big(x^{(a)};x^{(b)}\big) \coloneqq -\big\|x^{(a)}-x^{(b)}\big\|_2^2.$$

Notice that Alice’s expected utility decomposes as

$$\mathop{\mathbb{E}}_{x^{(b)}}\Big[U_A\big(x^{(a)};x^{(b)}\big)\Big] = -\big\|x^{(a)}-\mathbb{E}[x^{(b)}]\big\|_2^2 - \mathrm{Var}\big[x^{(b)}\big],$$

where the second term does not depend on Alice’s action at all. This significantly simplifies the analysis because we do not need to think about Bob’s mixed strategy: in expectation, Alice just tries to get as close as possible to E[x^(b)]. Similarly, Bob’s utility function is defined as:

$$U_B\big(x^{(b)};x^{(a)}\big) \coloneqq -\big\|f(x^{(a)})-x^{(b)}\big\|_2^2.$$

It is not hard to see that in every Nash equilibrium of the game, x^(a) = x^(b) = f(x^(a)), i.e., the players’ common point is a fixed point of f.
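The decomposition of Alice's expected utility can be checked numerically. The sketch below (names are illustrative) evaluates both sides for an arbitrary mixed strategy of Bob, given as a finite support and probability vector:

```python
def expected_utility_A(xa, bob_support, bob_probs):
    """E over Bob's mixed strategy of -||xa - xb||_2^2."""
    return sum(p * -sum((xa[i] - xb[i]) ** 2 for i in range(len(xa)))
               for xb, p in zip(bob_support, bob_probs))

def decomposed_utility_A(xa, bob_support, bob_probs):
    """-||xa - E[xb]||_2^2 - Var[xb], with the variance summed over coordinates."""
    n = len(xa)
    mean = [sum(p * xb[i] for xb, p in zip(bob_support, bob_probs))
            for i in range(n)]
    var = sum(sum(p * (xb[i] - mean[i]) ** 2 for xb, p in zip(bob_support, bob_probs))
              for i in range(n))
    return -sum((xa[i] - mean[i]) ** 2 for i in range(n)) - var
```

The two functions agree on every input, which is exactly the bias-variance identity used in the text: Alice's best response depends on Bob's strategy only through its mean.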

For our reduction, we need to make some modifications to the above imitation game. First, observe that Bob’s utility must not encode the entire function f; otherwise Bob could find the fixed point (or Nash equilibrium) with zero communication from Alice! Instead, we ask that Alice’s action specify a vertex v, as well as her inputs to the lifting gadgets associated to (edges adjacent to) v. If v is indeed the vertex corresponding to x^(a), Bob can use his own inputs to the lifting gadgets to locally compute f(x^(a)) (this corresponds to the first type of “local”).

The second issue is that for our fine-grained reduction, we cannot afford to let Alice’s and Bob’s actions specify an entire point x ∈ [0,1]^n. Instead, we force the equilibria of the game to be strictly mixed, where each player chooses a small (pseudo-)random subset of the coordinates [n]. Then, each player’s mixed strategy represents a full point x, but each action only specifies its restriction to the corresponding subset of coordinates. By the second type of “local”, Bob can locally compute the value of f on the intersection of the subsets. Inconveniently, the switch to mixed strategies significantly complicates the analysis: we have to make sure that Alice’s mixed strategy is consistent with a single x^(a), deal with the fact that in any approximate equilibrium she is only approximately randomizing her selection of subset, etc.

Finally, the ideas above can be combined to give an N^{1−o(1)} lower bound on the communication complexity (already much stronger than previous work). The bottleneck to improving further is that while we are able to distribute the vector x^(a) across the support of Alice’s mixed strategy, we cannot do the same with the corresponding vertex v from the EoL graph. The reason is that given just a single action of Alice (not her mixed strategy), Bob must be able to compute his own utility; for that he needs to locally compute f(x^(a)) (on some coordinates); and even with the doubly-local property of f, that still requires knowing the entire v. Finally, even with the most succinct encoding, if Alice’s action represents an arbitrary vertex, she needs at least as many actions as there are vertices. To improve to the desired N^{2−o(1)} lower bound, we observe that when Bob locally computes his utility he does have another input: his own action. We thus split the encoding of v between Alice’s action and Bob’s action, enabling us to use an EoL host graph over roughly N² vertices. (More generally, for an asymmetric game we can split the encoding unevenly.)

## 3 Critical Block Sensitivity of EoL

In this section, we define critical block sensitivity and then prove Lemma 3. Our construction of the host graph G will have one additional property, which will be useful in Section 5.

###### Fact 6.

In Lemma 3, we can take the vertex set of G to be a subset of {0,1}^{O(log N)} such that the labels of any two adjacent vertices differ in at most O(1) coordinates.

### 3.1 Definitions

Let S ⊆ {0,1}^n × O be a search problem, that is, on input x ∈ {0,1}^n the goal is to output some o such that (x, o) ∈ S. We call an input critical if it admits a unique solution. Let f be some total function that solves the search problem, that is, (x, f(x)) ∈ S for all x. The block sensitivity of f at input x, denoted bs(f, x), is the maximum number of pairwise-disjoint blocks B₁, …, B_k ⊆ [n] each of which is sensitive for x, meaning f(x^{B_i}) ≠ f(x) where x^{B_i} is x but with the bits in B_i flipped. The critical block sensitivity of S [HN12] is defined as

$$\mathrm{cbs}(S) \coloneqq \min_{f \text{ solves } S}\; \max_{\text{critical } x}\; \mathrm{bs}(f,x).$$
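For intuition, block sensitivity at a point can be computed by brute force for tiny n (this sketch is exponential-time and purely illustrative; cbs would additionally minimize over all solvers f and maximize over critical inputs):

```python
from itertools import combinations

def flip(x, block):
    """Flip the bits of x indexed by block."""
    return tuple(b ^ 1 if i in block else b for i, b in enumerate(x))

def block_sensitivity(f, x):
    """Max number of pairwise-disjoint blocks B with f(x^B) != f(x)."""
    n = len(x)
    sensitive = [frozenset(B) for r in range(1, n + 1)
                 for B in combinations(range(n), r) if f(flip(x, B)) != f(x)]
    best = 0
    def extend(chosen, rest):
        nonlocal best
        best = max(best, len(chosen))
        for i, B in enumerate(rest):
            if all(B.isdisjoint(C) for C in chosen):
                extend(chosen + [B], rest[i + 1:])
    extend([], sensitive)
    return best
```

For example, for the 3-bit OR function the all-zeros input has block sensitivity 3 (the three singletons), while (1, 1, 0) has block sensitivity 1.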

### 3.2 Unbounded degree

As a warm-up, we first study a simple version of the Q-EoL problem relative to a complete host graph, which is equipped with successor and predecessor pointers. The input is a string x of length O(N log N) describing a digraph G_x of in/out-degree at most 1 on the vertex set [N]. Specifically, for each v ∈ [N], x specifies a log N-bit predecessor pointer and a log N-bit successor pointer. We say there is an edge (u, v) in G_x iff v is the successor of u and u is the predecessor of v. The search problem is to find a vertex v such that either

• v = 1 and v is a non-source or a sink in G_x; or

• v ≠ 1 and v is a source or a sink in G_x.

Let f be any function that solves the search problem. Our goal is to show that there is a critical input x such that bs(f, x) = Ω(N).
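In the pointer representation, the trivial algorithm simply walks the path from vertex 1. The sketch below (with self-loops standing in for null pointers, an encoding choice of this illustration) shows why up to Θ(N) pointer queries may be needed:

```python
def solve_eol(succ, pred):
    """Walk successor pointers from vertex 1 until the path ends.
    succ/pred: dicts of pointers; a vertex pointing to itself encodes null.
    Returns (solution vertex, number of pointer queries made)."""
    v, queries = 1, 0
    while True:
        s = succ[v]; queries += 1
        if s == v:                 # no successor: v is a sink
            return v, queries
        queries += 1
        if pred[s] != v:           # inconsistent pointers: edge (v, s) absent
            return v, queries
        v = s
```

On a path 1 → 2 → 3 this walk makes five pointer queries before returning the end of the path.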

##### Two examples.

It is instructive to first investigate two extremal examples of f. What does f output on a bicritical input consisting of two disjoint paths (one starting at the special vertex 1)? Any function f must make a choice between the unique canonical solution (the end of the path starting at vertex 1) and the other two non-canonical solutions:

*(Figure: a bicritical input; the end of the path starting at vertex 1 is the canonical solution, and the start and end of the other path are the two non-canonical solutions.)*

Example 1: Suppose f always outputs the canonical solution on a bicritical input. Consider any critical input x (a single path starting at vertex 1) and define a system of disjoint blocks B_i^del such that x^{B_i^del} is x but with the i-th edge deleted (say, by assigning null pointers as the successor of the i-th vertex and as the predecessor of the (i+1)-st vertex).

*(Figure: a critical input; the i-th block deletes the i-th edge of the path.)*

Each B_i^del is sensitive for x since f(x) is the end of the path whereas f(x^{B_i^del}) is the newly created canonical solution (the new end of the path starting at vertex 1). This shows bs(f, x) = Ω(N).

Example 2: Suppose f always outputs a non-canonical solution on a bicritical input. Consider any critical input x. Pair up the j-th and the (N−j)-th edge of x. We use these edge pairs to form blocks B_j^cut for j ∈ [N/2 − 1] (assume N is even). Namely, x^{B_j^cut} is the bicritical input obtained from x by shortcutting the j-th pair of edges: delete the j-th edge pair, call them (u₁, u₂) and (u₃, u₄), and insert the edge (u₁, u₄) (say, by assigning null pointers as the predecessor of u₂ and as the successor of u₃, and making (u₁, u₄) a predecessor–successor pair).

*(Figure: shortcutting the j-th pair of edges detaches a middle segment, creating a bicritical input with two new non-canonical solutions.)*

Each B_j^cut is sensitive for x since f(x) is the end of the path whereas f(x^{B_j^cut}) is one of the newly created non-canonical solutions u₂ and u₃ (the start and the end of the detached middle segment). This shows bs(f, x) = Ω(N).

##### General case.

A general f need not fall into either example case discussed above: on bicritical inputs, f can decide whether or not to output a canonical solution based on the path lengths and vertex labels. However, we can still classify any f according to which decision (canonical vs. non-canonical) it makes on most bicritical inputs. Indeed, we define two distributions:

• Critical: Let D₁ be the uniform distribution over critical inputs. That is, generate a directed path on N vertices with vertex labels picked at random from [N] (without replacement) subject to the start vertex being 1.

• Bicritical: Let D₂ be a distribution over bicritical inputs generated as follows: choose an even number i ∈ {2, 4, …, N−2} uniformly at random, and output a graph consisting of two directed paths, one having i vertices, the other having N − i vertices. The vertex labels are picked at random from [N] (without replacement) subject to the start vertex of the first path being 1.

Given a sample x ∼ D₁ we can generate a sample y ∼ D₂ by either deleting (Example 1) or shortcutting (Example 2) edges of x. Specifically:

1. Deletion: Let x ∼ D₁ and choose an even i ∈ {2, 4, …, N−2} at random. Output x^{B_i^del}.

2. Shortcutting: Let x ∼ D₁ and choose j ∈ [N/2 − 1] at random. Output x^{B_j^cut}.

We have two cases depending on whether or not f prefers canonical solutions on input y ∼ D₂. Indeed, consider the probability

$$\Pr_{y\sim D_2}\big[f(y)\text{ is canonical}\big] \;=\; \begin{cases} \Pr_{x\sim D_1,\ i\in\{2,4,\dots,N-2\}}\big[f(x^{B^{\mathrm{del}}_i})\text{ is canonical}\big],\\[2pt] \Pr_{x\sim D_1,\ j\in[N/2-1]}\big[f(x^{B^{\mathrm{cut}}_j})\text{ is canonical}\big].\end{cases}$$

Case “≥ 1/2”: Here Pr_{x∼D₁, i}[f(x^{B_i^del}) ≠ f(x)] ≥ 1/2 since f(x) is non-canonical for x^{B_i^del}. By averaging, there is some fixed critical input x such that Pr_i[f(x^{B_i^del}) ≠ f(x)] ≥ 1/2. But this implies bs(f, x) = Ω(N), as desired.

Case “< 1/2”: Here Pr_{x∼D₁, j}[f(x^{B_j^cut}) ≠ f(x)] ≥ 1/2 since f(x) is canonical for x^{B_j^cut}. By averaging, there is some fixed critical input x such that Pr_j[f(x^{B_j^cut}) ≠ f(x)] ≥ 1/2. But this implies bs(f, x) = Ω(N), concluding the proof (for unbounded degree).

### 3.3 Logarithmic degree

Next, we prove an Ω̃(N) query lower bound for Q-EoL_G where the host graph G has logarithmic degree. As a minor technicality, in this section, we relax the rules of the Q-EoL problem (as originally defined in Section 2.1) by allowing many paths to pass through a single vertex; we will un-relax this in Section 3.4. Namely, an input x describes a subgraph G_x of G as before. The problem is to find a vertex v such that either

• v = 1 and indeg(v) ≥ outdeg(v) in G_x; or

• v ≠ 1 and indeg(v) ≠ outdeg(v) in G_x.

##### Host graph.

For convenience, we define our host graph as a multigraph, allowing parallel edges. We describe below a simple bounded-degree digraph B. The actual host graph is then taken as G = B^{×t}, t = Θ(log N), defined as the graph B but with each edge repeated t times.

The digraph B is constructed by glueing together two butterfly graphs. The ℓ-th butterfly graph is a directed graph with ℓ + 1 layers, each layer containing 2^ℓ vertices: the vertex set is {0,1}^ℓ × {0, 1, …, ℓ} and each vertex (x, i), i < ℓ, has two out-neighbours, (x, i+1) and (x ⊕ e_{i+1}, i+1), where x ⊕ e_{i+1} is x but with the (i+1)-st bit flipped. Let B₁ and B₂ be two copies of the ℓ-th butterfly graph. To construct B, we identify the last layer of B₁ with the first layer of B₂. Thus B has altogether Θ(ℓ · 2^ℓ) vertices, and we choose ℓ so that this number is Θ(N). We rename the first layer of B₁ (which we also identify with the last layer of B₂, so that consecutive left-to-right paths can be concatenated) as {0,1}^ℓ, and the remaining vertices arbitrarily.
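The layer structure of a single butterfly and its unique first-to-last-layer routing can be sketched as follows (the bit-index conventions are an illustrative choice, not the paper's):

```python
def out_neighbors(x, i, ell):
    """Out-neighbors of vertex (x, i) in the ell-th butterfly graph:
    copy x to the next layer, with or without flipping the bit fixed at layer i."""
    if i == ell:
        return []
    return [(x, i + 1), (x ^ (1 << i), i + 1)]

def route(x, y, ell):
    """The unique (x, 0) -> (y, ell) path: layer i sets bit i of x to bit i of y."""
    path, cur = [(x, 0)], x
    for i in range(ell):
        cur = (cur & ~(1 << i)) | (y & (1 << i))
        path.append((cur, i + 1))
    return path
```

Choosing the bits of the intermediate vertex at random instead of copying them from y is exactly the random-middle-vertex path generation used below.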

*(Figure: the digraph B, two butterfly graphs B₁ and B₂ glued at the middle layer; embedded vertices live on the outer layer and edges route left to right.)*

##### Oblivious routing.

To prove a critical block sensitivity lower bound we proceed analogously to the unbounded-degree proof in Section 3.2. The key property of B that we will exploit is that we can embed inside G any bounded-degree digraph on the vertex set {0,1}^ℓ. Namely, we embed the vertices via the identity map (into the first layer of B₁), and an edge (u, v) as a (u, v)-path in B (left-to-right path in the above figure) in such a way that any two edges map to edge-disjoint paths. Moreover, such routing can be done nearly obliviously: each path can be chosen independently at random, and the resulting paths can be made edge-disjoint by “local” rearrangements. Let us make this formal.

Define P_{u,v}, for u, v ∈ {0,1}^ℓ, as the uniform distribution over (u, v)-paths in B of the minimum possible length, namely 2ℓ. One way to generate a path p ∼ P_{u,v} is to choose a vertex w from B’s middle layer (last layer of B₁) uniformly at random and define p as the concatenation of the unique (u, w)-path in B₁ and the unique (w, v)-path in B₂. Another equivalent way is to generate a random length-ℓ path starting from u (in each step, choose a successor according to an unbiased coin) and then follow the unique length-ℓ path from the middle layer to v.

Let H be a bounded-degree digraph on {0,1}^ℓ. We can try to embed H inside B by sampling a collection of paths P ∼ ∏_{(u,v)∈E(H)} P_{u,v} from the product distribution. The resulting paths are likely to overlap, but not by too much. Indeed, for an outcome P, define the congestion of P as the maximum, over the edges of B, of the number of paths in P that touch that edge.
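Computing the congestion of a concrete collection of paths is straightforward; this helper (an illustrative aid, not from the paper) implements the quantity bounded in the claim below:

```python
from collections import Counter

def congestion(paths):
    """Max, over edges, of the number of paths using that edge (0 if none).
    Each path is a sequence of vertices; consecutive pairs are its edges."""
    load = Counter()
    for p in paths:
        for edge in zip(p, p[1:]):
            load[edge] += 1
    return max(load.values(), default=0)
```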

###### Claim 7.

Let H be a bounded-degree digraph. With probability 1 − o(1) over P, the congestion is O(log N / log log N).

###### Proof.

We may assume that the in/out-degree of every vertex in H is at most 1, since every bounded-degree graph is a union of constantly many such graphs. The congestion of the vertices on the last layer of B₁ is described by the usual balls-into-bins model, namely, 2^ℓ balls are randomly thrown into 2^ℓ bins. It is a basic fact (e.g., [MU05, Lemma 5.1]) that the congestion of each of these vertices is O(log N / log log N) with probability at least 1 − 1/poly(N). Similar bounds hold for vertices in the other layers, as they can be viewed as being on the last layer of some smaller butterfly graph. The claim follows by a union bound over all the vertices of B. ∎

Suppose P has congestion bounded by t (which happens whp by Claim 7). There is a natural way to embed all the paths inside G = B^{×t} in an edge-disjoint fashion. Since each edge e of B is used by only k ≤ t paths, say p₁, …, p_k, there is room to use the t parallel edges corresponding to e in G to route the p_i along distinct copies of e. Such a local routing is fully specified by some injection ρ_e : {p₁, …, p_k} → [t].

We are now ready to formalize how H embeds into G via edge-disjoint paths. Namely, H embeds as a distribution emb(H) over subgraphs of G (and the failure symbol ⊥) defined as follows.

1. Sample P ∼ ∏_{(u,v)∈E(H)} P_{u,v}.

2. If P has congestion > t, then output ⊥; otherwise:

   (a) embed P randomly into G by choosing all the injections ρ_e uniformly at random;

   (b) output the resulting subgraph of G.

Note that Pr[⊥] = o(1) by Claim 7.

##### Lower bound proof.

Let f be a function that solves the Q-EoL search problem relative to host graph G. Recall the distributions D₁ and D₂ over critical and bicritical graphs from Section 3.2, now over the vertex set {0,1}^ℓ, which we identify with [n], n = 2^ℓ. We can extend D₁ (resp. D₂) to a distribution H₁ (resp. H₂) over critical (resp. bicritical) subgraphs of G by defining H_i := emb(D_i), that is, a sample y ∼ H_i is obtained by first sampling x ∼ D_i and then sampling y from emb(x). The Q-EoL solutions of y can be classified as canonical/non-canonical in the natural way (which respects the embedding). We again have two cases depending on whether f prefers canonical solutions on input y ∼ H₂. That is, consider the probability

$$\Pr_{y\sim H_2}\big[\,y\neq\bot\ \text{and}\ f(y)\text{ is canonical}\,\big].$$

Case “≥ 1/2”: Define a distribution H₂^del as follows.

1. Sample x ∼ D₁, and then y ∼ emb(x).

2. If y = ⊥, output ⊥; otherwise:

   (a) Sample an even i ∈ {2, 4, …, n−2}.

   (b) Output y^{B_i^del}, that is, y but with the i-th path (image of the i-th edge of x) deleted.

###### Claim 8.

Distributions H₂ and H₂^del are within o(1) in statistical distance.

###### Proof.

Given an (where and the embedding is according to ) we can generate a sample as follows: (1) if , output ; (2) otherwise, let be the unique edge such that is critical; (3) sample ; (4) if does not exceed the congestion threshold , output (which equals with the path embedded and then immediately deleted!); otherwise . Here we used the oblivious routing property: in embedding we can first embed all of and then the edge . By 7 we have that with probability , which proves the claim. ∎

From the definition of and 8, we have . By averaging, there is some fixed critical input such that . But this implies , as desired.

Case “ ”: Let be a distribution over graphs as illustrated below (for ) with vertex labels randomly chosen from , subject to the special vertex being as depicted:

[Figure figs/embedded omitted: the sampled graph with its solid and dashed edges.]

Let be an embedding of a random graph from . Assuming , we write for the critical input that is the subgraph of consisting of (the embeddings of) the solid edges. Write also for the bicritical input consisting of minus its -th and -th edges plus the -th dashed edge. Note that is a system of pairwise-disjoint edge flips.

Define a distribution as follows.

1. Sample .

2. If , output ; otherwise:

   a. Sample .

   b. Output .

The following claim is proved analogously to 8.

###### Claim 9.

Distributions and are within in statistical distance. ∎

From the definition of and 9, we have . By averaging, there is some fixed critical input such that . But this implies , concluding the proof (for logarithmic degree).

### 3.4 Constant degree

##### Reducing degree.

The digraph has in-degree and out-degree . It is easy to reduce this to a constant by replacing each vertex in with a bounded-degree graph that has connectivity properties similar to a complete bipartite graph between left vertices (corresponding to incoming edges) and right vertices (corresponding to outgoing edges). One way to construct such a graph is to start with a complete bipartite graph on and then replace each degree- vertex with a binary tree of height (assume this is an integer). This produces a layered graph with layers and vertices. Denote by the digraph resulting from replacing each vertex of with a copy of ; formally, this construction is known as the replacement product; see, e.g., [RVW02, 6.2]. We have so that , which is only a polylogarithmic blow-up.
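A toy version of this degree reduction can be checked directly: replace each side vertex of a complete bipartite graph K_{Δ,Δ} by a binary tree with Δ leaves, and join leaves of opposite sides pairwise. The sketch below is our own illustrative construction (Δ a power of two), not the paper's exact gadget; it confirms the maximum degree stays constant:

```python
from collections import defaultdict

def gadget_edges(delta):
    """
    Bounded-degree stand-in for K_{delta,delta}: each side vertex becomes a
    perfect binary tree with delta leaves, and leaf j of left tree i is
    joined to leaf i of right tree j. (delta must be a power of two.)
    """
    edges = []
    h = delta.bit_length() - 1               # tree height = log2(delta)

    def add_tree(side, i):
        # nodes labeled (side, tree index, level, position in level)
        for level in range(h):
            for pos in range(2 ** level):
                parent = (side, i, level, pos)
                edges.append((parent, (side, i, level + 1, 2 * pos)))
                edges.append((parent, (side, i, level + 1, 2 * pos + 1)))

    for i in range(delta):
        add_tree('L', i)
        add_tree('R', i)
    for i in range(delta):                   # cross edges between leaves
        for j in range(delta):
            edges.append((('L', i, h, j), ('R', j, h, i)))
    return edges

deg = defaultdict(int)
for u, v in gadget_edges(4):
    deg[u] += 1
    deg[v] += 1
assert max(deg.values()) <= 3   # internal degree is a constant
```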

##### Lower bound (sketch).

The critical block sensitivity lower bound in Section 3.3 extends naturally to the host graph . Indeed, every subgraph of consisting of edge-disjoint paths that we considered in Section 3.3 corresponds in a natural 1-to-1 way to a subgraph of consisting of vertex-disjoint paths. In particular, in Section 3.3 we allowed many paths to pass through a single vertex, but the natural mapping will now route at most one path through a vertex. We also interpret each isolated vertex in a subgraph of as having a self-loop, so that an isolated vertex does not count as a solution to . In this way, every solving induces an solving . This concludes the proof of 3.

##### Vertex labels.

Finally, we establish 6. Namely, we argue that for , the vertices of can be labeled with -bit strings having the difference property: the labels of any two adjacent vertices differ in at most coordinates. Since and vertices of are adjacent only if their and parts are adjacent, it suffices to label both and appropriately and then concatenate the labels.

The vertices of , viewed as , can be made to have the difference property by just encoding the index set using a Gray code. Hence it remains to label the vertices of with -bit strings for . We can view ; here, an index in indicates a layer; the first layer is , the last layer is . We can define adjacency similarly as in the butterfly graph so that is adjacent to only if the strings and differ in at most one position and . Moreover, we can encode the index set using a Gray code.
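The Gray-code labeling is easy to verify concretely: with the standard binary-reflected code, consecutive indices receive labels differing in exactly one coordinate, which is the difference property used above. A minimal sketch (helper names are our own):

```python
def gray(i, width):
    """Binary-reflected Gray code of i, as a width-bit string."""
    return format(i ^ (i >> 1), f'0{width}b')

width = 6
codes = [gray(i, width) for i in range(2 ** width)]
# consecutive labels differ in exactly one coordinate
diffs = [sum(a != b for a, b in zip(codes[i], codes[i + 1]))
         for i in range(len(codes) - 1)]
assert set(diffs) == {1}
```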

## 4 A Hard Brouwer Function

In this section we present and slightly modify a reduction due to [Rub16] from EoL (for host graph on ) to BFP, the problem of finding an approximate fixed point of a continuous function . The reader should think of the reduction as happening between the query variants of both problems, although we will use further properties of the construction of , as detailed in Section 4.4. The most important, and somewhat novel, part of this section is Section 4.4, where we formulate the sense in which our hard instance of BFP is “local” and even “doubly-local”.

The construction has two main components: 11 shows how to embed an EoL graph as a collection of continuous paths in ; 12 describes how to embed a continuous Brouwer function whose fixed points correspond to endpoints of the paths constructed in 11.

### 4.1 Preliminaries

We use (respectively ) to denote the length- vectors whose value is () in every coordinate.

##### Constants.

This section (and the next) uses several arbitrarily small constants that satisfy:

 0 < ε_Nash ≪ ε_Precision ≪ ε_Uniform ≪ ε_Brouwer ≪ δ ≪ h ≪ 1.

By this we mean that we first pick a sufficiently small constant , and then a sufficiently smaller constant , etc. We will sometimes use the small constants together with asymptotic notation (e.g., ), by which we mean “bounded by ”, for an absolute constant (independent of , , etc.); in particular if then .

Although their significance will be fully understood later, we briefly sketch their roles here: is the approximation factor of Nash equilibrium; is the precision with which the players can specify real values (Section 5.2); is used in the analysis of the hard game (Section 5.4) to bound the distance-from-uniform of certain nearly-uniform distributions; every -Nash equilibrium corresponds to an -approximate fixed point (Section 5.4), whereas we prove that it is hard to find -approximate fixed points (Section 4); finally, in the construction of hard Brouwer functions, quantifies the size of special neighborhoods around special points (Section 4). In Section 5.3 we will define additional small constants and relate them to the constants defined here.

##### Norms.

We use normalized -norms: for a vector we define

 ‖x‖_p^p := E_{i∈[n]}[(x_i)^p],

where the expectation is taken with respect to the uniform distribution.
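In code, the normalized p-norm simply replaces the sum in the usual formula by an average. A minimal sketch (`norm_p` is our own name):

```python
def norm_p(x, p):
    """Normalized l_p norm: ( E_{i in [n]} |x_i|^p )^(1/p)."""
    return (sum(abs(v) ** p for v in x) / len(x)) ** (1 / p)

# Under the normalized norm, any vector with |x_i| = 1 everywhere has
# norm exactly 1 for every p -- unlike the unnormalized norm.
x = [1.0, -1.0, 1.0, -1.0]
assert norm_p(x, 2) == 1.0
assert norm_p([2.0, 0.0], 1) == 1.0   # (|2| + |0|) / 2 = 1
```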

##### Partitioning the coordinates.

Let (where the implicit constant is eventually fixed in Section 5.1). Let and be tiny super-constants, e.g., . We consider two families and of subsets of . Every subset has cardinality exactly , and the intersection, for every “bichromatic” pair of subsets satisfies .

To construct the subsets we think of the elements of as entries of a matrix. Each (resp. ) is a collection of columns (resp. rows). Notice that this guarantees the cardinality and intersection desiderata.

Specifically, we consider a -wise independent hashing of into buckets of equal size. By standard constructions (e.g., using low-degree polynomials [Kop13, Example 7]), this can be done using random bits. Consider all possible buckets ( outcomes of the randomness, buckets for each). For , we let (resp. ) be the union of columns (resp. rows) in the -th bucket. This ensures that a random corresponds to a -wise independent subset of columns.
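The low-degree-polynomial construction mentioned above can be sketched as follows. The prime, parameters, and names are illustrative; note that reducing modulo the number of buckets is only approximately uniform (the exactly balanced buckets used in the text need slightly more care):

```python
import random

P = (1 << 61) - 1   # a Mersenne prime, much larger than the hashed domain

def kwise_hash_family(k, n_buckets, rng):
    """
    Sample h from a k-wise independent family:
    h(x) = (poly of degree k-1 with uniform coefficients, evaluated mod P)
           mod n_buckets.
    Uses only k * log(P) random bits, as in the standard construction.
    """
    coeffs = [rng.randrange(P) for _ in range(k)]  # coeffs[0] = constant term
    def h(x):
        acc = 0
        for c in reversed(coeffs):   # Horner evaluation mod P
            acc = (acc * x + c) % P
        return acc % n_buckets
    return h

rng = random.Random(0)
h = kwise_hash_family(k=4, n_buckets=8, rng=rng)
buckets = [h(x) for x in range(64)]
assert all(0 <= b < 8 for b in buckets)
```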

##### A concentration bound.

The following Chernoff-type bound for -wise independent random variables is proved in [SSS95, Theorem 5.I].

###### Theorem 10 ([SSS95]).

Let be -wise independent random variables, and let and . Then

 Pr[ |∑_{i=1}^n x_i − μ| > δμ ] ≤ e^{−Ω(min{k, δ²μ})}.
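As a quick numerical sanity check of the bound (fully independent coins are in particular -wise independent for every k), with illustrative parameters:

```python
import random

rng = random.Random(0)
n, trials, delta = 10_000, 200, 0.1
mu = n * 0.5                      # mean of a sum of n fair coin flips
exceed = 0
for _ in range(trials):
    s = sum(rng.random() < 0.5 for _ in range(n))
    if abs(s - mu) > delta * mu:  # deviation by more than delta * mu
        exceed += 1
# delta * mu = 500 is 10 standard deviations here, so no trial exceeds it
assert exceed == 0
```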

### 4.2 Embedding with a code

Let be some sufficiently small constant (we later set ). For convenience of notation we will construct a function (instead of ); in particular, now the vertices of the discrete hypercube are interior points of our domain.

###### Lemma 11.

We can efficiently embed an EoL graph over as a collection of continuous paths and cycles in , such that the following hold:

• Each edge in corresponds to a concatenation of a few line segments between vertices of ; we henceforth call them Brouwer line segments and Brouwer vertices.

• The points on any two non-consecutive Brouwer line segments are -far.

• The points on any two consecutive Brouwer line segments are also -far, except near the point where the two Brouwer line segments connect.

• Every two consecutive Brouwer line segments are orthogonal.

• Given any point , we can use the EoL predecessor and successor oracles to determine whether is -close to any Brouwer line segment and, if so, the distance to that segment and its endpoints.

• There is a one-to-one correspondence between endpoints of the embedded paths and solutions of the EoL instance.
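The closeness test in the penultimate bullet boils down to point-to-segment distance computations. A dimension-agnostic sketch (our own helper, not the paper's oracle):

```python
def dist_to_segment(p, a, b):
    """Euclidean distance from point p to the segment [a, b], any dimension."""
    ab = [bi - ai for ai, bi in zip(a, b)]
    ap = [pi - ai for ai, pi in zip(a, p)]
    denom = sum(v * v for v in ab)
    # project p onto the line through a, b and clamp to the segment
    t = 0.0 if denom == 0 else max(0.0, min(1.0,
        sum(x * y for x, y in zip(ap, ab)) / denom))
    closest = [ai + t * v for ai, v in zip(a, ab)]
    return sum((pi - ci) ** 2 for pi, ci in zip(p, closest)) ** 0.5

assert dist_to_segment((0, 1), (0, 0), (1, 0)) == 1.0   # above the segment
assert dist_to_segment((2, 0), (0, 0), (1, 0)) == 1.0   # past the endpoint
```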

For point