# Generalizing the Sharp Threshold Phenomenon for the Distributed Complexity of the Lovász Local Lemma

Recently, Brandt, Maus and Uitto [PODC'19] showed that, in a restricted setting, the dependency of the complexity of the distributed Lovász Local Lemma (LLL) on the chosen LLL criterion exhibits a sharp threshold phenomenon: They proved that, under the LLL criterion p2^d < 1, if each random variable affects at most 3 events, the deterministic complexity of the LLL in the LOCAL model is O(d^2 + log^* n). In stark contrast, under the criterion p2^d ≤ 1, there is a randomized lower bound of Ω(loglog n) by Brandt et al. [STOC'16] and a deterministic lower bound of Ω(log n) by Chang, Kopelowitz and Pettie [FOCS'16]. Brandt, Maus and Uitto conjectured that the same behavior holds for the unrestricted setting where each random variable affects arbitrarily many events. We prove their conjecture, by providing an algorithm that solves the LLL in time O(d^2 + log^* n) under the LLL criterion p2^d < 1, which is tight in bounded-degree graphs due to an Ω(log^* n) lower bound by Chung, Pettie and Su [PODC'14]. By the work of Brandt, Maus and Uitto, obtaining such an algorithm can be reduced to proving that all members in a certain family of functions in arbitrarily high dimensions are convex on some specific domain. Unfortunately, an analytical description of these functions is known only for dimension at most 3, which led to the aforementioned restriction of their result. While obtaining those descriptions for functions of (substantially) higher dimension seems out of the reach of current techniques, we show that their convexity can be inferred by combinatorial means.


## 1 Introduction

### 1.1 Background

The Lovász Local Lemma is a celebrated result from 1975 due to Erdős and Lovász [EL75], with applications in many types of problems such as coloring, scheduling or satisfiability problems [AS08, Bec91, CPS17, CS00a, CS00b, EPS15, HSS10, LMR99, Mos09]. It states the following.

###### Lovász Local Lemma (LLL).

Let $X_1, \dots, X_m$ be a set of mutually independent random variables and $E_1, \dots, E_n$ probabilistic events that depend on the $X_i$. For each $E_i$, let $\mathrm{vbl}(E_i)$ denote the random variables $E_i$ depends on. We say that $E_i$ and $E_j$ share a random variable if $\mathrm{vbl}(E_i) \cap \mathrm{vbl}(E_j) \neq \emptyset$. Assume that there is some $p < 1$ such that for each $E_i$, we have $\Pr[E_i] \le p$, and let $d$ be a positive integer such that each $E_i$ shares a random variable with at most $d$ other $E_j$ (where $j \neq i$). Then, if $4pd \le 1$, there exists an assignment of values to the random variables such that none of the events occurs. (We note that the LLL criterion guaranteeing the existence of the desired variable assignment is not optimal and has been subject to improvements by Spencer [Spe77] and Shearer [She85].)

The LLL can be seen as a generalization of the well-known fact that for any set of independent events that all occur with probability strictly less than $1$, the probability that none of the events occurs is non-zero: some amount of dependency between the events is tolerable for preserving the avoidance guarantee, and how much exactly depends on the parameter $p$ that bounds the occurrence probabilities of the events.

While being an indispensable tool for applying the probabilistic method, the LLL, in its original form, is of limited usefulness from an algorithmic standpoint, as it gives a purely existential statement and does not provide a method for finding such an assignment to the random variables. The underlying algorithmic question of computing such an assignment, called the algorithmic (or constructive) LLL (problem), received considerable attention in a series of papers [Alo91, CS00a, MR98, Mos08, Mos09, Sri08], starting with Beck [Bec91] in the 90s, and culminating in a breakthrough result by Moser and Tardos [MT10] in 2010. The latter work showed that an assignment to the random variables that avoids all events can be found quickly by a simple resampling approach. Moreover, this approach is easily parallelizable, and implies a (randomized) distributed algorithm that finds the desired assignment in rounds of communication in a distributed setting w.h.p. for the LLL criterion . (As usual, w.h.p. stands for “with probability at least ”.) In the following, we take a closer look at the distributed version of the algorithmic LLL, the main topic of this work.
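The resampling idea can be illustrated by a minimal sequential sketch. It is not the parallel or distributed variant mentioned above; the function name `moser_tardos`, the encoding of events, the use of fair bits as random variables, and the resampling budget are assumptions made purely for illustration.

```python
import random

def moser_tardos(num_vars, events, rng, max_resamplings=10_000):
    """Sequential resampling sketch (illustrative, not the distributed variant).

    events: list of (vbl, occurs) pairs, where vbl is a tuple of variable
    indices and occurs(values) is True if the bad event happens.
    """
    x = [rng.randint(0, 1) for _ in range(num_vars)]  # sample every variable once
    for _ in range(max_resamplings):
        occurring = [vbl for vbl, occurs in events
                     if occurs(tuple(x[i] for i in vbl))]
        if not occurring:
            return x  # no bad event occurs under x
        for i in occurring[0]:  # resample only the variables of one bad event
            x[i] = rng.randint(0, 1)
    raise RuntimeError("resampling budget exhausted")

def all_ones(values):
    """Bad event of the toy instance: every variable of the event equals 1."""
    return all(v == 1 for v in values)

# Toy instance: three bad events over four fair bits each;
# every event shares one variable with each of the other two.
EVENTS = [((0, 1, 2, 3), all_ones),
          ((3, 4, 5, 6), all_ones),
          ((6, 7, 8, 0), all_ones)]
assignment = moser_tardos(9, EVENTS, random.Random(0))
```

In this toy instance each event occurs with probability $1/16$ and shares a variable with two other events; the returned `assignment` avoids all three events.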

#### The Distributed LLL

Let an instance of the LLL be given by mutually independent random variables and events satisfying some LLL criterion that guarantees the existence of an assignment avoiding all events. The distributed version of the LLL is commonly phrased using the notion of the so-called dependency graph. In the dependency graph of an LLL instance, the events are the nodes, and there is an edge between two events if the two events share a variable. Each node is aware of the random variables its event depends on and knows exactly for which combinations of values of these random variables the event occurs. As before, the task is to find an assignment to the variables such that none of the events occurs. To specify the output, each node has to output a value for each variable it depends on, and any two nodes outputting a value for the same random variable have to agree on the value.
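As a small illustration of this definition, the following sketch builds the dependency graph from the variable sets of the events, computes its maximum degree $d$, and tests the criterion $p2^d < 1$ considered in this work. The names `dependency_graph`, `max_degree`, `satisfies_criterion` and the toy instance are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def dependency_graph(vbl):
    """vbl maps each event to the set of variables it depends on.

    Returns an adjacency dict: two events are adjacent iff they share a variable.
    """
    adj = {event: set() for event in vbl}
    for e, f in combinations(vbl, 2):
        if vbl[e] & vbl[f]:
            adj[e].add(f)
            adj[f].add(e)
    return adj

def max_degree(adj):
    """Maximum number of other events any single event shares a variable with."""
    return max((len(nbrs) for nbrs in adj.values()), default=0)

def satisfies_criterion(p, d):
    """Check the criterion p * 2^d < 1 (exact when p is a Fraction)."""
    return p * 2 ** d < 1

# Toy instance: E1, E2, E3 form a path, E4 is isolated.
VBL = {"E1": {"x1", "x2"}, "E2": {"x2", "x3"},
       "E3": {"x3", "x4"}, "E4": {"x5"}}
ADJ = dependency_graph(VBL)
D = max_degree(ADJ)
```

Here `D` equals 2, so the criterion holds for $p = 1/5$ but not for $p = 1/4$, where $p2^d = 1$.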

We will consider the LLL in the LOCAL model of distributed computing [Lin92, Pel00] (the communication graph for the LLL is the dependency graph; for details regarding the LOCAL model, we refer to Section 2.1), in which the LLL has been the focus of a number of important works in recent years (see Section 1.3 for an in-depth overview). One particularly intriguing result underlining the importance of the LLL was given by Chang and Pettie [CP17]: they show that any problem from a very natural problem class, called locally checkable labelings (roughly speaking, these are problems for which the correctness of the global solution can be verified by checking the correctness of the output in the local neighborhood of each node), that has sublogarithmic randomized complexity also admits a randomized algorithm that solves it in time , where denotes the randomized complexity of the LLL under a polynomial criterion (i.e., a criterion of the form for an arbitrarily large constant ).

#### The LLL criterion

As can be expected, the complexity of solving the algorithmic LLL depends on the chosen LLL criterion. Strengthening the LLL criterion, i.e., restricting the set of allowed LLL instances by making fewer pairs $(p, d)$ satisfy the criterion, clearly can only reduce the complexity of the LLL problem, but it is a major open question precisely how the LLL complexity relates to the chosen criterion. (There is some disagreement about whether this should be called a strengthening or weakening of the LLL criterion. We will use the same (sensible) terminology as the closest work to ours by Brandt, Maus and Uitto [BMU19], to make it easier to relate the results.) At which points in strengthening the LLL criterion does the asymptotic complexity of the LLL change, and do we obtain smooth or sharp transitions between the different complexities?

While it is known, due to a result by Chung, Pettie and Su [CPS17], that $\Omega(\log^* n)$ rounds are required for any LLL criterion, the only lower bounds known so far that could possibly be used to obtain a separation between the complexities for different criteria are an $\Omega(\log\log n)$ lower bound for randomized algorithms by Brandt et al. [BFH16], and an $\Omega(\log n)$ lower bound for deterministic algorithms by Chang, Kopelowitz and Pettie [CKP16], which both hold even under the strong criterion $p2^d \le 1$. It is natural to ask whether any further strengthening of the LLL criterion breaks the lower bound or whether the lower bound can be extended to stronger criteria.

Very recently, Brandt, Maus and Uitto [BMU19] showed that if we restrict the random variables to affect at most $3$ events each (which they call rank at most $3$), then already under the minimally strengthened criterion $p2^d < 1$, there is a deterministic LLL algorithm with a complexity of $O(d^2 + \log^* n)$. They conjectured that this behavior also holds without their restriction on the variables.

###### Conjecture 1.1 ([BMU19], rephrased).

There is a (deterministic) distributed algorithm that solves the LLL problem in time $O(d^2 + \log^* n)$ under the criterion $p2^d < 1$.

### 1.2 Contributions and Techniques

In this work, we prove Conjecture 1.1, by providing such a deterministic algorithm. This gives a first (unrestricted) answer to the aforementioned question about the relation between the LLL criterion and the complexity of the LLL: a sharp transition occurs at the criterion $p2^d < 1$, where the complexity of the LLL drops from $\Omega(\log\log n)$ randomized, resp. $\Omega(\log n)$ deterministic, to $O(d^2 + \log^* n)$. Moreover, our upper bound is tight on bounded-degree graphs due to the $\Omega(\log^* n)$ lower bound by Chung, Pettie and Su [CPS17]. Finally, as is the nature of upper bounds for the LLL, our result immediately implies the same upper bound for all problems that can be phrased as an LLL problem with criterion $p2^d < 1$, such as certain hypergraph edge-coloring problems or orientation problems in hypergraphs (see [BMU19]).

#### Previous Techniques

Our work builds on techniques developed in [BMU19]. In their work, Brandt, Maus and Uitto obtain an $O(d^2 + \log^* n)$-round LLL algorithm under the criterion $p2^d < 1$ for the case of variables of rank at most $3$. In the following, we give an informal overview of their approach.

The basic idea of the algorithm is to go sequentially through all variables and fix them to some values one by one while preserving a certain invariant that makes sure that the final assignment avoids all events. In order to define the invariant, each edge of the dependency graph is assigned two non-negative values, one for each endpoint of the edge, that sum up to at most $2$. When fixing a random variable, the algorithm is also allowed to change these “book-keeping” values. The invariant now states that for any node $v$ in the dependency graph, the product of the values around $v$ multiplied by $p$ is an upper bound for the conditional probability of the event associated with node $v$ to occur (where we naturally condition on the already-fixed random variables being fixed as prescribed by the (partial) value assignments performed by the algorithm so far). If this invariant is preserved, then, after all variables are fixed, each event occurs with probability at most $p2^d < 1$, and therefore with probability $0$, as desired.

Brandt, Maus and Uitto not only show that such a sequential process preserving the invariant at all times exists (even if the order in which the random variables have to be fixed is chosen adversarially), but also that it can be made to work in a local manner: in order to fix a random variable, the algorithm only needs to know the random variables and edge values in a small local neighborhood. This allows random variables that affect events that are sufficiently far from each other in the dependency graph to be processed in parallel. By adding an -round preprocessing step to the algorithm where a -hop node coloring with colors is computed in the dependency graph, the sequential fixing process can then be parallelized by iterating through the color classes in a standard way, yielding the desired runtime of $O(d^2 + \log^* n)$ rounds. We will provide a more detailed overview of the algorithm from [BMU19] in Section 2.2.

The crucial, and rather surprising, observation making the algorithm work is that in each step in which a random variable is fixed, the existence of a value for that random variable that preserves the invariant is guaranteed if a certain function is shown to be convex on some domain. Hence, proving the existence of the desired algorithm is reduced to solving an analytical problem for a fixed function, providing a very intriguing connection between distributed algorithms and analysis. To be precise, Brandt, Maus and Uitto show that for any integer rank $r$, there is a fixed function on some domain satisfying the following property: if this function is convex, then for any rank-$r$ random variable, there is a value that this variable can be fixed to such that the invariant is preserved. By proving the convexity of the function for $r = 3$, they prove the desired upper bound for the case of variables of rank at most $3$. (Taking care of the case of rank-$1$ and rank-$2$ variables is comparably easy.)

One of the main problems with extending this proof to arbitrary ranks is that the function for rank $r$ is only given in an indirect way, by a characterization of the set of points in $\mathbb{R}^r$ that lie below and on the function. No closed-form expression describing the function is known for any $r > 3$, and the relatively compact form of the function for the case $r = 3$ is arguably due to the cancellation of certain terms that do not cancel out in higher dimensions. In fact, none of the ways to obtain the function for $r = 3$ from the characterization of the mentioned point set seems to yield any closed-form expression if adapted to higher dimensions, and even if a closed-form expression for all $r$ were found in some way, it is far from clear that proving convexity of these functions would be feasible.

#### New Techniques

We overcome this obstacle by showing that, perhaps surprisingly, even without any analytical access to these functions, we can infer their convexity for all ranks $r$. In the following we give an informal overview of our approach. Our main idea is to prove convexity of such a function (or equivalently, convexity of the set bounded by the function from below) by finding a so-called locally supporting hyperplane for each point on the function. More precisely, for each such point, we want to find a number of vectors such that the following two properties hold:

1. The affine subspace of $\mathbb{R}^r$ spanned by the vectors and containing the point is a hyperplane, i.e., an affine subspace of dimension $r - 1$.

2. In an $\varepsilon$-ball around the point, the hyperplane is contained in the set consisting of all points on and below the function.

These properties ensure convexity of in ; however, a priori it is completely unclear how to find such vectors. In order to obtain these vectors, we consider the combinatorial description of the points on and below that is tightly connected to the aforementioned invariant: Consider a hyperedge of rank and write two non-negative values that sum up to at most on each edge of the skeleton of the hyperedge (i.e., a clique induced by the hyperedge) one value for each endpoint of the edge. For each endpoint of the hyperedge multiply the values belonging to the endpoint, and consider the -dimensional vector obtained by collecting the resulting products. The points that can be generated in this way are exactly the (non-negative) points that lie on or below .

For each such point , call the tuple of the values written on the edges that generate in the above description a generator of ; a point can have (and usually has) more than one generator. Roughly speaking, we find the desired vectors for a point by picking an arbitrary generator and, for each edge in the skeleton of the hyperedge, computing the vector by which changes if we subtract some small from one value on and add it to the other. A crucial insight is that it is fine to pick such a large set of vectors: due to the specific construction, one can show that the affine subspace spanned by these vectors and containing is -dimensional. Moreover, the redundancy contained in this choice enables us to prove Item 2 by finding, for each on the hyperplane in an -ball around , a way to write as a linear combination of of these vectors that satisfies certain desirable properties.

Note that we will use terminology that does not refer to the convexity of the function as we do not make use of this function from an analytical perspective. Instead, we will aim for the equivalent goal of showing that the set bounded by from below is convex, by making use of its combinatorial description.

### 1.3 Further Related Work

Following the resampling approach of Moser and Tardos [MT10], many of the results for the distributed LLL were based on randomized algorithms. The bounds given in the following hold w.h.p. The bound of for the algorithm by Moser and Tardos [MT10] is due to steps in which variables are resampled, where in each step a maximal independent set (MIS) is computed in rounds in order to perform the resampling in a conflict-free manner. By showing that a weaker variant of an MIS is sufficient for this purpose, Chung, Pettie and Su [CPS17] obtained an upper bound of , for the same LLL criterion . In turn, by improving the computation of such a weak MIS from to , Ghaffari [Gha16] improved this bound to .

In the aforementioned work, Chung, Pettie and Su also showed that faster algorithms can be obtained if the LLL criterion is strengthened: under the criterion , they provide an algorithm running in time , and under an exponential criterion, i.e., a criterion of the form where is exponential in , they give an upper bound of . For LLL instances with , Fischer and Ghaffari [FG17] provided a -round algorithm under the criterion . Ghaffari, Harris and Kuhn [GHK18] improved on this result by showing that for any integer , there is an LLL algorithm running in time under the criterion , where and represent a power tower and the iterated logarithm, respectively. Finally, Rozhon and Ghaffari [RG20] proved, as one of the many implications of their recent breakthrough in computing network decompositions, that on bounded-degree graphs a variable assignment avoiding all events can be found in rounds under the criterion , closing in on a conjecture by Chang and Pettie [CP17] stating that rounds are sufficient.

The latter three works [FG17, GHK18, RG20] also provide the first non-trivial deterministic algorithms for the distributed LLL. The currently best known upper bound by Rozhon and Ghaffari [RG20] (for polynomial criteria) states that rounds suffice under the criterion , for any constant .

## 2 Preliminaries

### 2.1 Model

The model in which we study the LLL is the LOCAL model of distributed computing [Lin92, Pel00]. In the LOCAL model, we usually want to solve a graph problem, but unlike in centralized computation, the actual computation is performed by the nodes of the input graph. To this end, each node of the input graph is considered as a computational entity, and each edge as a communication link over which the entities can communicate. The computation proceeds in synchronous rounds, where in each round two things happen: first, each node sends an arbitrarily large message to each of its neighbors and then, after the messages have arrived, each node can perform an arbitrarily complex internal computation. Each node has to decide at some point that it terminates and then it must output its local part of the global solution to the given problem—in the case of the LLL problem this local part is the values of all random variables the associated event depends on. The runtime of a distributed algorithm is the number of rounds until the last node terminates.
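The round structure of the LOCAL model can be made concrete with a small centralized simulation. The sketch below assumes that each node simply forwards everything it knows in every round, so that after $t$ rounds a node knows exactly the identifiers within distance $t$; the name `run_local_rounds` and the path example are hypothetical.

```python
def run_local_rounds(adj, rounds):
    """Simulate synchronous rounds in which every node forwards all it knows.

    adj: adjacency dict of the communication graph.
    Returns, for each node, the set of node identifiers it has learned.
    """
    known = {v: {v} for v in adj}
    for _ in range(rounds):
        # Send phase: every node receives the current knowledge of its neighbors.
        inbox = {v: [known[u] for u in adj[v]] for v in adj}
        # Compute phase: every node merges the received messages into a new set.
        for v in adj:
            known[v] = known[v].union(*inbox[v])
    return known

# Path 0 - 1 - 2 - 3.
PATH = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
KNOWN = run_local_rounds(PATH, 2)
```

After two rounds, node 0 of the path knows nodes 0, 1 and 2, but not node 3, which is at distance 3.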

### 2.2 The Reduction

In this section, we will give a detailed explanation of the argument presented in [BMU19] that reduces proving the existence of an $O(d^2 + \log^* n)$-round distributed deterministic LLL algorithm under the criterion $p2^d < 1$ to showing that a certain family of sets or functions is convex. The blueprint for such an algorithm is given as follows.

Consider an instance of the LLL, given by a set of mutually independent random variables and a set of events that depend on the random variables. Consider the dependency graph $G$ of this instance, and denote the event associated with a vertex $v$ by $E_v$, and the maximum degree of $G$ by $d$. Let $p$ be a parameter such that each event occurs with probability at most $p$, and assume that $p2^d < 1$, i.e., fix the LLL criterion to $p2^d < 1$. As any two events that depend on the same variable are neighbors of each other in $G$, we can create for each random variable $X$ a hyperedge that has the nodes $v$ such that $E_v$ depends on $X$ as endpoints. Technically, the hyperedges are not part of $G$, but for simplicity, we might consider them as such.

Algorithm starts by computing a -hop coloring with colors in rounds, by applying the coloring algorithm by Fraigniaud, Heinrich and Kosowski [FHK16] to , i.e., to the graph obtained by connecting any two nodes of distance at most in by an edge. Then, it iterates through the colors one by one, and each time a color is processed, each node of color fixes each incident random variable (i.e., each random variable whose corresponding hyperedge is incident to ) that has not been fixed so far. We will see that in order to fix all incident random variables of a node in a suitable way, rounds suffice, and as there are colors, algorithm runs in rounds.
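The scheduling idea behind this step can be sketched in a centralized way. The code below is only a stand-in under stated assumptions: it uses a sequential greedy coloring rather than the distributed algorithm of [FHK16], and it assumes that the coloring is computed on the graph in which any two nodes at distance at most 2 are adjacent. All function names are hypothetical.

```python
def square_graph(adj):
    """Connect any two distinct nodes whose distance in adj is at most 2."""
    sq = {v: set(adj[v]) for v in adj}
    for v in adj:
        for u in adj[v]:
            sq[v] |= adj[u]
        sq[v].discard(v)
    return sq

def greedy_coloring(adj):
    """Sequential greedy coloring; uses at most (maximum degree + 1) colors."""
    color = {}
    for v in sorted(adj):
        used = {color[u] for u in adj[v] if u in color}
        color[v] = next(c for c in range(len(adj) + 1) if c not in used)
    return color

def color_classes(adj):
    """Group the nodes by their color in a coloring of the square graph.

    Nodes in the same class are at distance at least 3 in adj and can
    therefore be scheduled in the same phase.
    """
    coloring = greedy_coloring(square_graph(adj))
    classes = {}
    for v, c in coloring.items():
        classes.setdefault(c, []).append(v)
    return [classes[c] for c in sorted(classes)]

# Path 0 - 1 - 2 - 3 - 4 - 5.
PATH = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
SCHEDULE = color_classes(PATH)
```

For the path, `SCHEDULE` is `[[0, 3], [1, 4], [2, 5]]`: three phases, each containing nodes at pairwise distance 3. Since the square graph has maximum degree at most $d^2$, the greedy coloring uses at most $d^2 + 1$ colors.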

The challenging part is to fix the random variables in a manner such that the produced final assignment is correct, i.e., such that none of the events occurs under the assignment. To this end, during the fixing process the authors keep track of, roughly speaking, how favorable or unfavorable the variable fixings performed so far were for the nodes (regarding avoiding the associated event), by assigning two values to each edge. More precisely, they assign a non-negative value to each pair for which is incident to . We can imagine the two values and to be written on edge ; each time a random variable is fixed by a node, the node also updates the values that are written on the edges in the skeleton of the hyperedge corresponding to .

The purpose of these edge values w.r.t. obtaining a correct output in the end of the process is to define a property that is kept as an invariant during the fixing process and guarantees that the final assignment avoids all events. Consider an arbitrary point in the fixing process where some random variables already have been fixed to some values , respectively. Property is satisfied if the following two conditions hold.

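1. For each edge, the two values written on the edge sum up to at most $2$.

2. For each node $v$, the probability that the event associated with $v$ occurs, conditioned on the fixings performed so far, is at most $p$ times the product of the values belonging to $v$ on its incident edges.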
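The two conditions can be checked by brute force on a tiny instance. The sketch below assumes that all random variables are fair bits, that the sum of the two values on an edge is bounded by 2, and that the bound in the second condition is $p$ times the product of the values belonging to a node; the names `conditional_probability`, `invariant_holds` and the instance are hypothetical.

```python
from itertools import product
from math import prod

def conditional_probability(vbl, occurs, fixed):
    """Pr[event | fixed] for independent fair bits, by enumerating free variables."""
    free = [x for x in vbl if x not in fixed]
    hits = 0
    for bits in product((0, 1), repeat=len(free)):
        assignment = {**fixed, **dict(zip(free, bits))}
        hits += bool(occurs(assignment))
    return hits / 2 ** len(free)

def invariant_holds(events, edge_value, p, fixed):
    """events: name -> (vbl, occurs).

    edge_value[(v, u)] is the value belonging to endpoint v on edge {v, u}.
    """
    # Condition 1: values are non-negative and sum to at most 2 on every edge.
    for (v, u), value in edge_value.items():
        if value < 0 or value + edge_value[(u, v)] > 2:
            return False
    # Condition 2: conditional probability is at most p times the product
    # of the values belonging to the node.
    for v, (vbl, occurs) in events.items():
        bound = p * prod(val for (w, _), val in edge_value.items() if w == v)
        if conditional_probability(vbl, occurs, fixed) > bound:
            return False
    return True

# Two events sharing the variable x1; each occurs with probability 1/4.
EVENTS = {
    "A": (("x0", "x1"), lambda a: a["x0"] == 1 and a["x1"] == 1),
    "B": (("x1", "x2"), lambda a: a["x1"] == 1 and a["x2"] == 1),
}
INITIAL = {("A", "B"): 1, ("B", "A"): 1}
SHIFTED = {("A", "B"): 2, ("B", "A"): 0}
```

With $p = 1/4$ and $d = 1$, the conditions hold initially and after fixing `x0` to 0. Fixing `x0` to 1 violates the second condition both for the initial edge values and for the shifted values, so in this example the value 0 is the safe choice for `x0`.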

If this property is satisfied when all variables have been fixed, then for each event we have a bound of $p2^d < 1$ for the probability that it occurs, which implies that it does not occur since the probability of it occurring can only be $0$ or $1$. By initializing each value to $1$, the authors make sure that the property is satisfied when the algorithm starts. The crucial insight in [BMU19] is that there is always a way to preserve the property each time a random variable is fixed if a certain function or set is convex. For the precise statement, the authors introduce the notion of a representable triple.

###### Definition 2.1 (Definition 3.3 of [BMU19]).

A triple is called representable if there are values such that , , , , , and . Let denote the set of all representable triples.

Using this definition, the authors prove the following statement for the case of rank-$3$ random variables (which we give in a reformulated version using the notion of convexity instead of the concept of “incurvedness” used in [BMU19]).

If is a convex set, then there is a way to fix any given random variable of rank at most at any point in time during the algorithm (or, more generally, for any arbitrary fixing of already fixed random variables such that Property is satisfied) such that Property is preserved. Moreover, the only information required to fix is the set of values written on the edges that belong to the skeleton of the hyperedge corresponding to . We refer to [BMU19, Section 3.3] for the details of the proof.

Hence, in algorithm , each node that has the task to fix all its incident unfixed random variables can simply collect all edge values written on edges between nodes in its inclusive -hop neighborhood, and then go through its incident random variables one by one, each time finding a value for the random variable in question that preserves Property . As the sequential fixing does not require any communication after obtaining the required edge values, fixing all incident unfixed variables of a node can be done in rounds. Moreover the local nature of and the fact that the set of edge values required and rewritten by a node during the fixing does not intersect with the set of analogous edge values for a node in distance at least ensures that any two nodes with the same color in the computed -hop coloring can perform the variable fixing in parallel. This concludes the description of the reduction.

As already noted by the authors, the definitions and proofs (for the reduction to the convexity statement) generalize straightforwardly to the case of random variables of arbitrary rank. However, showing that the convexity of the respective set indeed holds for higher dimensions remained unanswered in [BMU19]; and indeed, even given our resolution, it remains unclear and would be interesting to see whether their analytical approach can feasibly be extended to higher dimensions than . To be precise, their approach extends in the following way: to prove the existence of the deterministic algorithm in the case that each random variable affects at most events, it suffices to prove that the set is convex, where is the set of all representable tuples, which are tuples that can be generated by some generator, as defined below.

###### Definition 2.2 (generator).

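We call a vector $a = (a_{ij})_{1 \le i \neq j \le r}$ with $r(r-1)$ coordinates a generator if for each $1 \le i \neq j \le r$ we have $a_{ij} \ge 0$ and $a_{ij} + a_{ji} \le 1$. The generator generates the $r$-dimensional tuple $(s_1, \dots, s_r)$ with $s_i = \prod_{j \neq i} a_{ij}$ for $1 \le i \le r$. We call a generator non-zero if none of its coordinates is $0$. We use a shorthand notation and denote the generator simply as $(a_{ij})$.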

Note that if is a non-zero generator, then for each .

###### Definition 2.3 (representable tuples).

A tuple is called representable if there exists a generator that generates it. Let denote the set of all representable tuples.
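To make the two definitions concrete, the following sketch checks whether a collection of edge values is a generator and computes the tuple it generates. It assumes the scaled convention in which the two values on an edge sum to at most 1, and that coordinate $i$ of the generated tuple is the product of the values belonging to endpoint $i$; indices start at 0, and the names `is_generator` and `generated_tuple` are hypothetical.

```python
from fractions import Fraction
from math import prod

def is_generator(a, r):
    """a maps every ordered pair (i, j) with i != j to the value at endpoint i
    of the edge {i, j}; values must be non-negative and sum to at most 1 per edge."""
    pairs = [(i, j) for i in range(r) for j in range(r) if i != j]
    return (set(a) == set(pairs)
            and all(a[i, j] >= 0 and a[i, j] + a[j, i] <= 1 for i, j in pairs))

def generated_tuple(a, r):
    """Coordinate i is the product of the values belonging to endpoint i."""
    return tuple(prod(a[i, j] for j in range(r) if j != i) for i in range(r))

HALF = Fraction(1, 2)
# Rank 3, every edge split evenly between its two endpoints.
UNIFORM = {(i, j): HALF for i in range(3) for j in range(3) if i != j}
# Rank 3, every edge gives everything to its smaller endpoint.
SKEWED = {(i, j): Fraction(int(i < j)) for i in range(3) for j in range(3) if i != j}
```

The uniform generator produces the tuple $(1/4, 1/4, 1/4)$, while the skewed one produces $(1, 0, 0)$; both tuples are therefore representable under the stated assumptions.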

Note that , as we require instead of . We consider this scaled version, as this makes the proof cleaner later on: note that being convex directly implies that is convex as the latter is just a scaled variant of the former set. In the following, we drop the superscripts when clear from context and we denote with the set of representable tuples with respect to the scaled down version and as the set of points in which are not representable. Our main contribution is the proof of the following theorem.

###### Theorem 2.4.

For every , is convex.

This settles Conjecture 1.1 as described above.

## 3 Proving that $S_{\mathrm{non}}$ is convex

In this section we prove that the set $S_{\mathrm{non}}$ is convex, omitting two longer proofs that are postponed to Section 4 and Section 5.

### 3.1 Notation

We work with the standard Euclidean space where distances are measured with the Euclidean norm; 0 and 1 denote the vectors and , respectively. We define as the closed ball around with radius . A subset is open if for any , there exists such that . A subset is closed if is open. A set is bounded if there exists such that . A set is compact if it is closed and bounded. Equivalently, is compact if every sequence with each has a subsequence that converges to some . The subset is compact. The interior of a set is an open subset of and defined as . The boundary of a set is defined as . A set is path-connected if for any there exists a continuous function such that and .

A hyperplane is an affine subspace of dimension . Equivalently, it is a set of points for some vector and . A weakly supporting hyperplane for intersecting is a hyperplane with and for any . Finally, a weakly locally supporting hyperplane for intersecting is a hyperplane with satisfying the following property: there exists an such that for any we have .

### 3.2 Proof

Convexity of a set can be verified in several equivalent ways. As we outlined in Section 1.2, we rely on the “supporting hyperplane formulation”, i.e., a set is convex if for each boundary point we can find a hyperplane such that the whole set lies on one side of the hyperplane. Moreover, for connected sets, it is enough to prove that each such hyperplane is “locally” supporting as formalized in the following theorem, which is stated in a more general form in [Val75] (Theorem 4.10 there).

###### Theorem 3.1.

Let be an open and path-connected set in . The set is convex if for every point contained in the boundary of , there exists a weakly locally supporting hyperplane with respect to going through .

Note that Theorem 3.1 can only be used to prove convexity of open sets and thus cannot be directly applied to establish the convexity of . Instead, we use Theorem 3.1 to first establish convexity of the interior of , which is an open set and which we denote by . Once we have established the convexity of , we prove the convexity of by induction on the dimension . To prove convexity of , we need to show that is path-connected and that for every boundary point of , there exists a weakly locally supporting hyperplane going through the boundary point. We now prove the former, using the following simple observation, which will be used in several other proofs.

###### Observation 3.2.

Let be a representable tuple. Then any tuple with for all is also representable.

###### Proof.

Consider a generator of . For any , pick some and set . Set all other values in equal to the corresponding value in . is a valid generator generating the tuple . ∎
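The construction in this proof can be sketched as follows, assuming that the modified value in row $i$ is obtained by scaling one value belonging to endpoint $i$ by the ratio of the target coordinate to the current coordinate. The function names, the choice of the scaled entry, and the zero-based indexing are assumptions for illustration.

```python
from fractions import Fraction
from math import prod

def generated_tuple(a, r):
    """Coordinate i is the product of the values belonging to endpoint i."""
    return tuple(prod(a[i, j] for j in range(r) if j != i) for i in range(r))

def shrink_generator(a, r, target):
    """Given a generator a of a tuple s and a target with 0 <= target[i] <= s[i],
    return a generator of target by scaling one value per endpoint (r >= 2)."""
    s = generated_tuple(a, r)
    b = dict(a)
    for i in range(r):
        if s[i] > 0:  # if s[i] == 0, then target[i] == 0 and nothing changes
            j = (i + 1) % r  # any j != i works
            # Scaling by a factor of at most 1 keeps the edge sums valid.
            b[i, j] = a[i, j] * Fraction(target[i]) / s[i]
    return b

HALF = Fraction(1, 2)
UNIFORM = {(i, j): HALF for i in range(3) for j in range(3) if i != j}
TARGET = (Fraction(1, 8), Fraction(1, 4), Fraction(0))
SHRUNK = shrink_generator(UNIFORM, 3, TARGET)
```

Starting from the generator of $(1/4, 1/4, 1/4)$, the sketch returns a generator of $(1/8, 1/4, 0)$. Only one entry per endpoint is decreased, so no edge sum increases.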

Now, we are ready to prove that is path-connected.

###### Lemma 3.3.

The set is path-connected.

###### Proof.

For any , consider the vector with for every . Note that the union of the two segments between and and between and is a path. Moreover, any tuple on this path is contained in and either dominates or . Hence, by Observation 3.2, each tuple on the path is in . ∎

Next, we need to understand the boundary between and . To do so, it will be helpful to prove that is closed. As is bounded, this is equivalent to showing that is compact.

###### Lemma 3.4.

The set is compact.

###### Proof.

The set is defined as the image of a continuous function that maps each generator (Definition 2.2) from the compact set of all generators to the corresponding representable tuple. Hence, it is compact, as the image of a compact set under a continuous function is always compact. ∎

Next, we set up the notion of maximal tuples.

###### Definition 3.5 (domination and maximal tuples).

Let and be two representable tuples. We say that weakly dominates if for all , and . Moreover, we say that strongly dominates if for all . We call a representable tuple maximal if there is no representable tuple that weakly dominates .

Intuitively, maximal tuples form the boundary between and , and this is indeed what we prove.

###### Lemma 3.6.

Let be contained in . Then, there either exists such that or is a maximal representable tuple.

We defer the easy, yet slightly technical proof, together with proofs of a few other technical lemmas, to Section 4. Our main technical contribution is a proof that a locally supporting hyperplane can be found for any maximal tuple .

###### Lemma 3.7.

For each maximal representable tuple , there exists a locally supporting hyperplane for intersecting .

The non-trivial proof of the above lemma is deferred to Section 5. As a corollary, we infer that the whole set is convex.

###### Corollary 3.8.

The set is convex.

###### Proof.

By Theorem 3.1 it suffices to provide a weakly locally supporting hyperplane for any . By Lemma 3.6, any is either a maximal representable tuple and hence the existence of the supporting hyperplane follows from Lemma 3.7, or we have or , respectively, for some . But then the hyperplane or , respectively, is a weakly (locally) supporting hyperplane for intersecting . ∎

The proof of Theorem 2.4 now easily follows.

###### Proof of Theorem 2.4.

We prove the statement by induction on . For , the statement trivially holds. Now, let be arbitrary and assume that is convex. Let and be arbitrary. We need to show that for we have . As is a closed set (Lemma 3.4), there exists some with such that and, hence, since the whole segment is contained in , and . Furthermore, there exists an such that .

If , then, by Corollary 3.8, it follows that and we are done. Otherwise, or . Without loss of generality, assume that . Since , Lemma 3.6 implies that there exists some with . Our choice of now implies that either or .

In the first case, as is not representable, there exists some with . Therefore, and as , any generator of would need to have and , a contradiction with . Hence, .

In the second case, assume without loss of generality that . Let , , be equal to the vectors , and restricted to the first coordinates. We have , since otherwise taking their generator and augmenting it by zeros would generate or , respectively. As is a convex combination of and , the induction hypothesis implies that and therefore , which concludes the induction step. ∎

## 4 Technical preparation

In this section we prove several technical results that are needed for the proof. First, we prove the equivalence of the notions of weak and strong dominance. To this end, we first show a simple “continuity” statement: for any representable tuple , one can increase all but one of its coordinates a little bit at the expense of decreasing the remaining one.

###### Lemma 4.1.

Let be a representable tuple with for each . For each , there exist an and a such that for all with , the tuple defined by and for is also representable.

###### Proof.

Let be a generator of . As for each , is a non-zero generator. Now, for some , consider with

$$
b_{ij} = \begin{cases} a_{ij} - \delta & \text{if } i = k \\ a_{ij} + \delta & \text{if } j = k \\ a_{ij} & \text{otherwise} \end{cases}
$$

for each . We have for each . Furthermore, if we choose such that , we have for each . In that case, is a valid generator that generates a tuple such that:

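$$
b_k = \prod_{j \neq k} b_{kj} = \prod_{j \neq k} (a_{kj} - \delta) \geq \Big( \prod_{j \neq k} a_{kj} \Big) - \delta f((a_{ij})) = a_k - \delta f((a_{ij}))
$$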

for some function with . Note that such a function exists, as and therefore for each . For each , we have:

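$$
\begin{aligned}
b_i &= \prod_{j \neq i} b_{ij} = (a_{ik} + \delta) \cdot \prod_{j \notin \{i,k\}} a_{ij} = a_i + \delta \cdot \prod_{j \notin \{i,k\}} a_{ij} \\
&\geq a_i + \delta \cdot \prod_{j \neq i} a_{ij} = a_i + \delta a_i
\end{aligned}
$$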

Set , and . Now, consider some arbitrary with . The definition of implies that . Thus, we can represent a tuple with and for . This tuple dominates the tuple . As for each , Observation 3.2 implies that we can represent . ∎
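The perturbation used in this proof can be checked on a small instance. The following is a minimal sketch, not the paper's construction verbatim: it uses 0-based indices, stores a generator as a square list of lists whose diagonal is unused, and assumes that all entries lie in [0, 1] (which the last inequality for b_i relies on). Definition 2.2 is not reproduced in this section, so the sketch only checks that the pairwise sums a_ij + a_ji are unchanged by the shift. All function names are hypothetical.

```python
from fractions import Fraction
from math import prod


def tuple_from_generator(gen):
    """Return the tuple (a_i) with a_i = product of gen[i][j] over j != i."""
    r = len(gen)
    return [prod(gen[i][j] for j in range(r) if j != i) for i in range(r)]


def shift_generator(gen, k, delta):
    """b_ij = a_ij - delta if i == k, a_ij + delta if j == k, else a_ij."""
    r = len(gen)
    out = [row[:] for row in gen]
    for j in range(r):
        if j != k:
            out[k][j] = gen[k][j] - delta  # row k loses delta
            out[j][k] = gen[j][k] + delta  # column k gains delta
    return out


# Illustrative instance (an assumption): r = 3, every entry equal to 1/2.
half = Fraction(1, 2)
gen = [[None, half, half], [half, None, half], [half, half, None]]
delta = Fraction(1, 10)

a = tuple_from_generator(gen)
shifted = shift_generator(gen, 0, delta)
b = tuple_from_generator(shifted)

# Coordinate k = 0 decreases, every other coordinate grows by at least delta * a_i.
print(a, b)
```

Exact rational arithmetic via `Fraction` avoids floating-point noise when comparing b_i with a_i + δ·a_i.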

Now we are ready to show that if a tuple is weakly dominated by some other tuple, it is also strongly dominated by (a potentially different) one.

###### Corollary 4.2 (strong vs weak domination).

Let be a representable tuple such that for all we have . If there exists a representable tuple that weakly dominates , then there also exists a representable tuple that strongly dominates .

###### Proof.

Let be a representable tuple that weakly dominates . Note that we have for each . Let such that . According to Lemma 4.1, there exists and such that for all , the tuple is representable. For small enough, this tuple strongly dominates the tuple . ∎

We are now ready to prove Lemma 3.6.

###### Proof of Lemma 3.6.

We show the contrapositive. Let such that for each and is not a maximal representable tuple. We show that this implies the existence of a ball around with radius such that either or , which in turn implies .

For there is clearly such a ball. Otherwise, we have , as we assume that for each , . If , but is not maximal representable, there is a representable tuple that weakly dominates and as , Corollary 4.2 provides a representable tuple that strongly dominates . Hence, there exists some such that the tuple is a representable tuple. This implies that for we have due to Observation 3.2 and, hence, as needed. Finally, in the case Lemma 3.4 implies that the complement of is open which in turn implies the existence of an so that . For we then have , as needed. ∎

## 5 Construction of hyperplanes

In this section we prove our main technical contribution, Lemma 3.7, which states that for each maximal tuple we can find a locally weakly supporting hyperplane for the set . First, we give an informal proof of this result for the case , which captures the intuition behind the general proof for all that we give later.

### 5.1 Informal outline for r=3

Our main observation is that finding a locally supporting hyperplane comes down to proving that a certain set of tuples in the neighborhood of is representable. In this section, we denote the tuples, more intuitively, as triples. So, we now focus on how to generate triples similar to .

#### Generating more triples

Given a representable triple generated by the generator , what other triples close to are representable? Certainly, all the triples that dominates. Besides, we can play with the generator itself. Adding to and subtracting it from gives us again a valid generator that generates triples of the form

$$
\big( a_{13}(a_{12} + \alpha_{12}),\; (a_{21} - \alpha_{12})\, a_{23},\; a_{31} a_{32} \big) = a + \alpha_{12}\, (a_{13}, -a_{23}, 0)
$$

I.e., it generates triples on the line

$$
a + \alpha_{12}\, (a_{13}, -a_{23}, 0) = a + \alpha_{12} w_{12}, \quad \alpha_{12} \in \mathbb{R}
$$

for small enough. Similarly, by adding to and subtracting it from , we can generate triples on the line

$$
a + \alpha_{13}\, (a_{12}, 0, -a_{32}) = a + \alpha_{13} w_{13}, \quad \alpha_{13} \in \mathbb{R}
$$

and by adding to and subtracting it from , we can generate triples on the line

$$
a + \alpha_{23}\, (0, a_{21}, -a_{31}) = a + \alpha_{23} w_{23}, \quad \alpha_{23} \in \mathbb{R}
$$

in some neighborhood around the triple . We call the three lines and .
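The three single-pair shifts can be verified numerically. The sketch below is illustrative only: the generator values are an arbitrary example (an assumption, not taken from the paper), the generator is stored as a dictionary keyed by ordered index pairs, and the function names are hypothetical. It checks that shifting one pair (i, j) by α moves the generated triple exactly along the direction w_ij given in the displayed formulas.

```python
from fractions import Fraction as F


def triple(g):
    """Triple generated by g: (a12*a13, a21*a23, a31*a32)."""
    return (g[1, 2] * g[1, 3], g[2, 1] * g[2, 3], g[3, 1] * g[3, 2])


def directions(g):
    """Direction vectors w12, w13, w23 read off from the displayed lines."""
    return {
        (1, 2): (g[1, 3], -g[2, 3], 0),
        (1, 3): (g[1, 2], 0, -g[3, 2]),
        (2, 3): (0, g[2, 1], -g[3, 1]),
    }


def shift_pair(g, i, j, alpha):
    """Add alpha to a_ij and subtract it from a_ji."""
    h = dict(g)
    h[i, j] += alpha
    h[j, i] -= alpha
    return h


def on_line(g, i, j, alpha):
    """Check that the shifted generator produces a + alpha * w_ij exactly."""
    a, w = triple(g), directions(g)[i, j]
    expected = tuple(a[c] + alpha * w[c] for c in range(3))
    return triple(shift_pair(g, i, j, alpha)) == expected


# Arbitrary example generator (assumed values with a_ij + a_ji = 1).
g = {(1, 2): F(1, 2), (2, 1): F(1, 2),
     (1, 3): F(1, 3), (3, 1): F(2, 3),
     (2, 3): F(1, 4), (3, 2): F(3, 4)}

print(all(on_line(g, i, j, F(1, 100)) for (i, j) in directions(g)))
```

Because each shift changes only one factor in each affected coordinate, the dependence on α is exactly linear, which is why the comparison can use equality rather than a tolerance.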

Since all components of the generator of are nonzero, these three lines define an affine subspace of dimension at least two. Later we prove that if is a maximal representable triple, then the three lines lie on a common plane. The plane spanned by and then becomes an obvious suspect for the supporting hyperplane we wish to find!

In fact, we prove that not only triples on the lines and are representable, given that they lie in some small neighborhood around the maximal representable triple , but any triple in the affine span of the three lines is representable, provided that for some positive that depends on . This finishes our proof, as we can now find a weakly locally supporting hyperplane for each maximal representable triple .

We now prove that for maximal triples all three lines lie in a common plane and all triples in that plane are representable (if they are close enough to ).

###### Claim 5.1.

For a maximal triple , the affine hull of and is a plane.

Assume the contrary. Then, there exist , such that . Now, change the values of proportionally to the values of to obtain the generator with

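$$
\begin{aligned}
a'_{12} &= a_{12} + \xi\alpha_{12}, & a'_{21} &= a_{21} - \xi\alpha_{12}; \\
a'_{13} &= a_{13} + \xi\alpha_{13}, & a'_{31} &= a_{31} - \xi\alpha_{13}; \\
a'_{23} &= a_{23} + \xi\alpha_{23}, & a'_{32} &= a_{32} - \xi\alpha_{23}.
\end{aligned}
$$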

Intuitively, we expect these changes to give us a generator of . This is almost the case:

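$$
\begin{aligned}
a' &= (a'_{12} a'_{13},\; a'_{21} a'_{23},\; a'_{31} a'_{32}) \\
&= \big( (a_{12} + \xi\alpha_{12})(a_{13} + \xi\alpha_{13}),\; (a_{21} - \xi\alpha_{12})(a_{23} + \xi\alpha_{23}),\; (a_{31} - \xi\alpha_{13})(a_{32} - \xi\alpha_{23}) \big) \\
&= (a_{12} a_{13},\; a_{21} a_{23},\; a_{31} a_{32}) + \xi\alpha_{12}\, (a_{13}, -a_{23}, 0) \\
&\quad + \xi\alpha_{13}\, (a_{12}, 0, -a_{32}) + \xi\alpha_{23}\, (0, a_{21}, -a_{31}) \\
&\quad + \xi^2 (\alpha_{12}\alpha_{13},\; -\alpha_{12}\alpha_{23},\; \alpha_{13}\alpha_{23}) \\
&= a + \xi\alpha_{12} w_{12} + \xi\alpha_{13} w_{13} + \xi\alpha_{23} w_{23} + \xi^2 (\alpha_{12}\alpha_{13},\; -\alpha_{12}\alpha_{23},\; \alpha_{13}\alpha_{23}) \\
&= a + \xi (1, 1, 1) + \xi^2 (\alpha_{12}\alpha_{13},\; -\alpha_{12}\alpha_{23},\; \alpha_{13}\alpha_{23})
\end{aligned}
$$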

Choosing small enough, we conclude that the triple is representable and therefore is not maximal, a contradiction!
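The expansion above, and the way a small ξ yields a strongly dominating triple, can be replayed on a concrete instance. The sketch below is illustrative: the generator g and the coefficients α are assumed example values, chosen so that α12·w12 + α13·w13 + α23·w23 = (1, 1, 1) holds for this particular g, and the function names are hypothetical. It confirms the identity exactly and shows that for ξ = 1/1000 the quadratic term is too small to cancel the linear gain.

```python
from fractions import Fraction as F


def triple(g):
    """Triple generated by g: (a12*a13, a21*a23, a31*a32)."""
    return (g[1, 2] * g[1, 3], g[2, 1] * g[2, 3], g[3, 1] * g[3, 2])


def shift_all(g, alpha, xi):
    """a'_ij = a_ij + xi*alpha_ij and a'_ji = a_ji - xi*alpha_ij for i < j."""
    h = dict(g)
    for (i, j), coeff in alpha.items():
        h[i, j] += xi * coeff
        h[j, i] -= xi * coeff
    return h


def expansion(g, alpha, xi):
    """Right-hand side: a + xi * sum(alpha_ij * w_ij) + xi^2 * quadratic term."""
    a = triple(g)
    w = {(1, 2): (g[1, 3], -g[2, 3], 0),
         (1, 3): (g[1, 2], 0, -g[3, 2]),
         (2, 3): (0, g[2, 1], -g[3, 1])}
    lin = [sum(alpha[p] * w[p][c] for p in w) for c in range(3)]
    quad = (alpha[1, 2] * alpha[1, 3],
            -alpha[1, 2] * alpha[2, 3],
            alpha[1, 3] * alpha[2, 3])
    return tuple(a[c] + xi * lin[c] + xi ** 2 * quad[c] for c in range(3))


# Assumed example generator and coefficients (not from the paper).
g = {(1, 2): F(1, 2), (2, 1): F(1, 2),
     (1, 3): F(1, 3), (3, 1): F(2, 3),
     (2, 3): F(1, 4), (3, 2): F(3, 4)}
alpha = {(1, 2): F(23), (1, 3): F(-40, 3), (2, 3): F(27, 2)}
xi = F(1, 1000)

new = triple(shift_all(g, alpha, xi))
print(new == expansion(g, alpha, xi))               # identity holds exactly
print(all(x > y for x, y in zip(new, triple(g))))   # every coordinate increased
```

For this example the three direction vectors are linearly independent, so the triple generated by g is strongly dominated by a nearby triple whose generator keeps the same pairwise sums. That is the situation the claim rules out for maximal triples.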

#### Generating triples on the plane

We are given a maximal triple and some in the affine hull of and that is sufficiently close to . We need to prove that is representable. To do so, we first note that as is contained in the affine hull, there exist and such that