# Gaussian discrepancy: a probabilistic relaxation of vector balancing

We introduce a novel relaxation of combinatorial discrepancy called Gaussian discrepancy, whereby binary signings are replaced with correlated standard Gaussian random variables. This relaxation effectively reformulates an optimization problem over the Boolean hypercube into one over the space of correlation matrices. We show that Gaussian discrepancy is a tighter relaxation than the previously studied vector and spherical discrepancy problems, and we construct a fast online algorithm that achieves a version of the Banaszczyk bound for Gaussian discrepancy. This work also raises new questions such as the Komlós conjecture for Gaussian discrepancy, which may shed light on classical discrepancy problems.


## 1 Introduction and overview of the paper

In this work we introduce a probabilistic relaxation of the classical combinatorial discrepancy problem that we call Gaussian discrepancy. In this section, we first briefly survey discrepancy theory and formally define our relaxation. Then we discuss our main results which consist of (i) sharp comparisons of Gaussian discrepancy to previously studied relaxations, and (ii) a fast algorithm for online Gaussian discrepancy. We conclude this overview with some open problems and an outline of the remainder of the paper.

### 1.1 Discrepancy theory and Gaussian discrepancy

#### 1.1.1 Background on combinatorial discrepancy

Discrepancy theory is a rich area of mathematics which has both inspired the development of novel tools for its study and found numerous applications in a variety of fields such as combinatorics, computer science, geometry, optimization, and statistics; see the textbooks matousek; chazelle. In one of its most fundamental forms, discrepancy asks the following question: given an $m \times n$ matrix $A$, determine the value of the discrepancy of $A$, defined as

$$\mathrm{disc}(A) := \min_{\sigma \in \{\pm 1\}^n} \|A\sigma\|_\infty = \min_{\sigma \in \{\pm 1\}^n} \max_{1 \le i \le m} |(A\sigma)_i|. \tag{1}$$

This question can be interpreted in terms of vector balancing: if $v_1, \dots, v_n \in \mathbb{R}^m$ denote the columns of $A$, then we are looking for a signing, that is, a vector of signs $\sigma \in \{\pm 1\}^n$, that makes the signed sum $\sum_{j=1}^n \sigma_j v_j$ have small entries.

A seminal result in this area, due to spencer and independently gluskin1989extremal (see also giannopoulous1997vectorbalancing), states that

$$\mathrm{disc}(A) \le 6\sqrt{n} \tag{2}$$

when $m = n$ and the entries of $A$ are bounded in magnitude by $1$. This remarkable result is the best possible, up to the constant factor $6$, and should be compared with the discrepancy incurred by a signing chosen uniformly at random, which is of order $\sqrt{n \log n}$. A far-reaching extension of this result is the Komlós conjecture (spencer).

[Komlós conjecture] There exists a constant $K > 0$ such that for any matrix $A$ whose columns have Euclidean norm at most $1$, it holds that

$$\mathrm{disc}(A) \le K.$$

This conjecture remains one of the most important open problems in the field, and the best known bound for the Komlós problem, due to banaszczyk, yields $\mathrm{disc}(A) = O(\sqrt{\log n})$. The Komlós conjecture contains as a special case the long-standing Beck-Fiala conjecture, which states that if $A$ has $t$-sparse columns in $\{0,1\}^m$, then $\mathrm{disc}(A) = O(\sqrt{t})$ (BecFia81).

The original proofs of these results are non-constructive in the sense that they do not readily yield efficient algorithms for computing signings which achieve these discrepancy upper bounds. In the last decade, considerable effort was devoted to matching these upper bounds algorithmically. Starting with the breakthrough work of Ban10, there are now a number of algorithmic results matching Spencer’s bound  (LovMek12; HarSchSin14; LevRamRot17; Rot17; EldSin18). The task of making Banaszczyk’s bound algorithmic was more challenging, and it was settled in the last few years in a line of works (BanDadGar16; LevRamRot17; BanDadGarLov18; DadGarLov19).

Recently, online discrepancy minimization (BanSpe20; bansal2020online; alweiss2020discrepancy; BanJiaMek21; LiuSahSawhney21) has seen increasing interest and has led to a new perspective on Banaszczyk’s result. In the oblivious online setting, an adversary picks in advance vectors $v_1, \dots, v_T \in \mathbb{R}^m$, each with Euclidean norm at most $1$. During each round $t \in [T]$, the algorithm receives the vector $v_t$, and it must output a sign $\sigma_t \in \{\pm 1\}$. The goal of the algorithm is to minimize the maximum discrepancy incurred at any time, i.e. the quantity $\max_{t \in [T]} \bigl\|\sum_{s=1}^{t} \sigma_s v_s\bigr\|_\infty$. alweiss2020discrepancy conjecture the following online version of the Banaszczyk bound.

[Online Banaszczyk] There exists a randomized algorithm for online balancing in the oblivious adversarial setting that with high probability achieves the bound

$$\max_{t \in [T]} \Bigl\|\sum_{s=1}^{t} \sigma_s v_s\Bigr\|_\infty \lesssim \sqrt{\log(mT)},$$

for any sequence of vectors $v_1, \dots, v_T \in \mathbb{R}^m$ with Euclidean norm at most $1$. Such a result would be nearly optimal, in view of the nearly matching lower bound established in bansal2020online.

#### 1.1.2 Gaussian discrepancy and the coupling perspective

Motivated by the aforementioned longstanding conjectures, recent algorithmic progress, and the goal of shedding new light on the classical discrepancy minimization problem, in this work we introduce a novel relaxation called Gaussian discrepancy. Our route to Gaussian discrepancy is through an alternative perspective on the discrepancy objective (1) based on couplings of random variables.

Recall that a Rademacher random variable is distributed uniformly on $\{\pm 1\}$. A coupling of $n$ Rademacher random variables is a random vector $\sigma = (\sigma_1, \dots, \sigma_n)$ for which each marginal distribution is Rademacher. Since a signing $\sigma$ and its negative $-\sigma$ achieve the same discrepancy, the uniform distribution on the set of optimal signings furnishes an optimal Rademacher coupling that minimizes the right-hand side of (1) in expectation. Precisely, it holds that

$$\mathrm{disc}(A) = \min\Bigl\{\mathbb{E}\|A\sigma\|_\infty : \Pr(\sigma_j = -1) = \Pr(\sigma_j = +1) = \tfrac{1}{2} \text{ for all } j \in [n]\Bigr\}, \tag{3}$$

where the minimization above is over couplings of Rademacher random variables. (Such a minimization can be seen as an instance of a multimarginal optimal transport problem; see pass2015mot; AltBoi21.)

The coupling perspective plays an important role in discrepancy theory and its applications. The recent algorithmic proof of Banaszczyk’s theorem by BanDadGarLov18 relies on the equivalence between Banaszczyk’s theorem and the existence of sub-Gaussian distributions supported on the signed sums $\{A\sigma : \sigma \in \{\pm 1\}^n\}$, as established by DadGarLov19. (More precisely, in this statement we refer to the general result of banaszczyk about balancing vectors to lie in a convex body $K$; the case where $K$ is the scaled $\ell_\infty$ ball corresponds to the combinatorial discrepancy.) Their algorithm, known as the Gram-Schmidt walk, focuses on correlating the entries of $\sigma$ in order to control the sub-Gaussian constant of $A\sigma$. Further, in applications of discrepancy theory such as randomized controlled trials, it is important to output not only a single signing but rather a distribution over low-discrepancy signings for the purpose of inferring treatment effects (KriAzrKap19; TurMekRig20; HarSavSpiZha19).

To construct our relaxation, we replace the Rademacher distribution with the standard Gaussian distribution in the coupling interpretation (3) of combinatorial discrepancy. The Gaussian discrepancy is defined to be

$$\mathrm{disc}_{\mathsf{G}}(A) = \min\Bigl\{\mathbb{E}\|Ag\|_\infty : g_j \sim \mathcal{N}(0,1) \text{ for all } j \in [n],\ g_1, \dots, g_n \text{ jointly Gaussian}\Bigr\}, \tag{4}$$

where $\mathcal{N}(0,1)$ denotes the standard normal distribution. Equivalently, the minimization is over covariance matrices $\Sigma$ for the random vector $g = (g_1, \dots, g_n)$ that lie in the elliptope $\mathcal{E}_n$, the set of all positive semidefinite matrices whose diagonal is the all-ones vector. Unlike Rademacher couplings which require, a priori, an exponential number of parameters to describe, joint Gaussian couplings admit a compact description that is completely determined by the covariance matrix $\Sigma$.

Let us record the simple observation that Gaussian discrepancy is indeed a relaxation of combinatorial discrepancy.

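Proposition. For any matrix $A$, it holds that $\mathrm{disc}_{\mathsf{G}}(A) \le \sqrt{2/\pi}\,\mathrm{disc}(A)$.

Proof. Given an optimal signing $\sigma$ for $\mathrm{disc}(A)$, let $g_j := \sigma_j \xi$ for $j \in [n]$, where $\xi \sim \mathcal{N}(0,1)$ is a standard Gaussian (equivalently, the covariance matrix of $g$ is $\sigma\sigma^\top$). Then,

$$\mathrm{disc}_{\mathsf{G}}(A) \le \mathbb{E}\|Ag\|_\infty = \|A\sigma\|_\infty\,\mathbb{E}|\xi| = \sqrt{2/\pi}\,\mathrm{disc}(A).$$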

As a consequence, results on ordinary discrepancy, such as Spencer’s theorem and Banaszczyk’s theorem, immediately translate into bounds on the Gaussian discrepancy.
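The rank-one coupling used in this argument can be checked numerically. The following is a minimal sketch, not part of the paper: the helper `disc_bruteforce`, the matrix size, and the sample count are illustrative assumptions, and the brute-force search is only feasible for very small $n$.

```python
import itertools
import math

import numpy as np


def disc_bruteforce(A):
    """Combinatorial discrepancy by enumerating all 2^n signings (tiny n only)."""
    n = A.shape[1]
    best_val, best_sigma = math.inf, None
    for signs in itertools.product((-1.0, 1.0), repeat=n):
        sigma = np.array(signs)
        val = np.max(np.abs(A @ sigma))
        if val < best_val:
            best_val, best_sigma = val, sigma
    return best_val, best_sigma


rng = np.random.default_rng(0)
A = rng.standard_normal((4, 8))        # hypothetical test matrix
disc, sigma = disc_bruteforce(A)

# Rank-one Gaussian coupling g = sigma * xi with xi ~ N(0, 1);
# its covariance matrix is sigma sigma^T, which has unit diagonal.
xi = rng.standard_normal(200_000)
g = np.outer(sigma, xi)                # shape (n, num_samples)

# Monte Carlo estimate of E ||A g||_inf versus the closed form sqrt(2/pi) * disc(A).
mc_value = np.mean(np.max(np.abs(A @ g), axis=0))
closed_form = math.sqrt(2 / math.pi) * disc
print(mc_value, closed_form)
```

The two printed numbers should agree up to Monte Carlo error, since $\|Ag\|_\infty = |\xi|\,\|A\sigma\|_\infty$ holds sample by sample for this coupling.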

### 1.2 Notation

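We write $[n] := \{1, \dots, n\}$, $a \wedge b := \min(a, b)$, and $a \vee b := \max(a, b)$. We write $\mathcal{N}(\mu, \Sigma)$ for the Gaussian distribution with mean $\mu$ and covariance matrix $\Sigma$. $S^{n-1}$ denotes the unit sphere centered at the origin in $\mathbb{R}^n$, and $\mathbb{S}_+^n$ denotes the set of $n \times n$ symmetric positive semidefinite matrices. For symmetric matrices $A, B$ we write $A \preceq B$ if $B - A \in \mathbb{S}_+^n$. For vectors $x$ we write $\|x\|_p$ for the $\ell_p$ norm of $x$, while for matrices $M$, $\|M\|_{p \to q}$ denotes the induced operator norm from $\ell_p$ to $\ell_q$. We write $\langle \cdot, \cdot \rangle$ for both the Euclidean inner product between vectors and the Frobenius inner product between matrices. For two sequences of positive real numbers $(a_n)$ and $(b_n)$ we write $a_n \lesssim b_n$ if there exists a constant $C > 0$ with $a_n \le C b_n$ for all sufficiently large $n$, and write $a_n \asymp b_n$ if $a_n \lesssim b_n \lesssim a_n$. Thus, $a_n \lesssim b_n$ is a synonym for $a_n = O(b_n)$. The $n \times n$ identity matrix is denoted by $I_n$, and the all-ones vector is denoted by $\mathbf{1}_n$. We define the elliptope to be $\mathcal{E}_n := \{\Sigma \in \mathbb{S}_+^n : \operatorname{diag}\Sigma = \mathbf{1}_n\}$.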

### 1.3 Results

#### 1.3.1 Comparisons between relaxations of discrepancy

In this section we develop an understanding of Gaussian discrepancy by comparing it to the vector discrepancy and spherical discrepancy relaxations of combinatorial discrepancy. Let us first define these notions and describe some results known about them.

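The vector discrepancy relaxation replaces the signs $\sigma_1, \dots, \sigma_n$ in combinatorial discrepancy (1) with unit vectors $u_1, \dots, u_n \in S^{n-1}$. Formally,

$$\mathrm{disc}_{\mathsf{v}}(A) := \min\Bigl\{\max_{i \in [m]} \Bigl\|\sum_{j=1}^{n} A_{i,j} u_j\Bigr\|_2 : u_1, \dots, u_n \in S^{n-1}\Bigr\}. \tag{5}$$

This problem is recast as a semidefinite program over the elliptope by constructing the Gram matrix $\Sigma$, with $\Sigma_{j,k} := \langle u_j, u_k \rangle$ for $j, k \in [n]$, of the unit vectors $u_1, \dots, u_n$; see Niko13 for more details.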
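To make the Gram-matrix reformulation concrete, the sketch below evaluates the vector discrepancy objective in two ways: directly from unit vectors, and from their Gram matrix via $\sqrt{\max_i \langle A_{i,:}, \Sigma A_{i,:} \rangle}$. The unit vectors here are random and therefore feasible but not optimal; the sizes and variable names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 6
A = rng.standard_normal((m, n))        # hypothetical test matrix

# Arbitrary (not optimal) unit vectors u_1, ..., u_n, stored as the rows of U.
U = rng.standard_normal((n, n))
U /= np.linalg.norm(U, axis=1, keepdims=True)

# Gram matrix Sigma_{j,k} = <u_j, u_k>; it lies in the elliptope.
Sigma = U @ U.T

# Objective written with the vectors: max_i || sum_j A_{i,j} u_j ||_2.
vector_form = np.max(np.linalg.norm(A @ U, axis=1))

# The same objective written with the Gram matrix only.
quad_forms = np.einsum("ij,jk,ik->i", A, Sigma, A)
gram_form = np.sqrt(np.max(quad_forms))

print(vector_form, gram_form)
```

Because the objective depends on the vectors only through $\Sigma$, minimizing over unit vectors is the same as minimizing a convex function of $\Sigma$ over the elliptope.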

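Given a signing $\sigma \in \{\pm 1\}^n$ and any fixed unit vector $u \in S^{n-1}$, we can associate to it the unit vectors $u_j := \sigma_j u$ for $j \in [n]$. From this we see that vector discrepancy is indeed a relaxation of discrepancy:

$$\mathrm{disc}_{\mathsf{v}}(A) \le \mathrm{disc}(A).$$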

Vector discrepancy has been highly influential in discrepancy theory. It led to the initial algorithm of Ban10 for Spencer’s theorem, which uses a random walk guided by vector discrepancy solutions (see also BanDadGar16), as well as the recent constructive proof of Banaszczyk’s theorem (BanDadGarLov18). (BanDadGarLov18 remark that their Gram-Schmidt algorithm was “…inspired by the constructive proof of DadGarLov19 for the existence of solutions to the Komlós vector coloring program of Nikolov.”) Vector discrepancy has also been studied in its own right in the work of Niko13, which provides an in-depth analysis of this relaxation using SDP duality.

Spherical discrepancy was more recently introduced by jonesmcpartlon2020spheericaldisc. It is obtained by relaxing the space of solutions in (1) from $\{\pm 1\}^n$ to the sphere of radius $\sqrt{n}$. Formally,

$$\mathrm{disc}_{\mathsf{s}}(A) := \min_{x \in \sqrt{n}\,S^{n-1}} \|Ax\|_\infty. \tag{6}$$

jonesmcpartlon2020spheericaldisc prove sharp bounds on spherical discrepancy in the setting of Spencer’s theorem ($|A_{i,j}| \le 1$ for all $i, j$) and the Komlós conjecture ($\|A_{:,j}\|_2 \le 1$ for all $j$) and apply their results to derive lower bounds for certain sphere covering problems.

With these definitions in hand, it is natural to wonder about the relationships between the various relaxations of discrepancy. For example, which relaxation gives the best approximation to combinatorial discrepancy? Our first result, whose proof we defer to Section 2, provides a unifying probabilistic perspective on the three relaxations of discrepancy and leads to a straightforward comparison.

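For any matrix $A$, it holds that

$$\mathrm{disc}_{\mathsf{v}}(A) \asymp \min\Bigl\{\max_{i \in [m]} \mathbb{E}|(Ag)_i| \Bigm| g \sim \mathcal{N}(0, \Sigma),\ \Sigma \in \mathbb{S}_+^n,\ \operatorname{diag}\Sigma = \mathbf{1}_n\Bigr\}, \tag{7}$$

$$\mathrm{disc}_{\mathsf{s}}(A) \asymp \min\Bigl\{\mathbb{E}\max_{i \in [m]} |(Ag)_i| \Bigm| g \sim \mathcal{N}(0, \Sigma),\ \Sigma \in \mathbb{S}_+^n,\ \operatorname{tr}\Sigma = n\Bigr\}. \tag{8}$$

Here, $\mathbb{S}_+^n$ is the set of $n \times n$ symmetric positive semidefinite matrices.

For comparison, we recall that

$$\mathrm{disc}_{\mathsf{G}}(A) = \min\Bigl\{\mathbb{E}\max_{i \in [m]} |(Ag)_i| \Bigm| g \sim \mathcal{N}(0, \Sigma),\ \Sigma \in \mathbb{S}_+^n,\ \operatorname{diag}\Sigma = \mathbf{1}_n\Bigr\}.$$

Hence, $\mathrm{disc}_{\mathsf{v}}$ relaxes the objective of Gaussian discrepancy by moving the maximum over rows outside of the expectation, whereas $\mathrm{disc}_{\mathsf{s}}$ relaxes the constraint of Gaussian discrepancy from $\operatorname{diag}\Sigma = \mathbf{1}_n$ to $\operatorname{tr}\Sigma = n$. In comparison, from Proposition 1.1.2, the usual definition of discrepancy can be understood as adding a constraint to Gaussian discrepancy, namely that $\operatorname{rank}\Sigma = 1$. The following relationship between the notions of discrepancy is an immediate consequence.

For any matrix $A$,

$$\mathrm{disc}_{\mathsf{v}}(A) \vee \mathrm{disc}_{\mathsf{s}}(A) \lesssim \mathrm{disc}_{\mathsf{G}}(A) \lesssim \mathrm{disc}(A).$$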

In words, Gaussian discrepancy is a tighter relaxation of discrepancy than vector discrepancy and spherical discrepancy. Moreover, we show via a suite of examples in Section 2 that none of the inequalities in Corollary 1.3.1 can be reversed up to a constant factor and that spherical discrepancy and vector discrepancy are incomparable in general.

Although Gaussian discrepancy is always larger than the vector discrepancy, the two relaxations are in fact closely related. Indeed, their common feasible set is the elliptope, which consists of the Gram matrices of collections of unit vectors or equivalently the covariance structures of feasible Gaussian couplings. The connection between these two relaxations yields our main tools for bounding Gaussian discrepancy as well as an $O(\sqrt{\log m})$-factor approximation algorithm for computing Gaussian discrepancy.

Our main inequality for controlling Gaussian discrepancy relies on a notion of rank-constrained vector discrepancy, defined as follows. For every rank $r \in [n]$, let

$$\mathrm{disc}_{\mathsf{v},r}(A) := \min\Bigl\{\max_{i \in [m]} \Bigl\|\sum_{j=1}^{n} A_{i,j} u_j\Bigr\|_2 : u_1, \dots, u_n \in S^{r-1}\Bigr\}. \tag{9}$$

Equivalently, we require the Gram matrix to have rank at most $r$. Since a rank-$1$ matrix in the elliptope corresponds precisely to a signing, note that

$$\mathrm{disc}_{\mathsf{v}}(A) = \mathrm{disc}_{\mathsf{v},n}(A) \le \mathrm{disc}_{\mathsf{v},n-1}(A) \le \cdots \le \mathrm{disc}_{\mathsf{v},1}(A) = \mathrm{disc}(A).$$

The next result compares Gaussian discrepancy with this rank-constrained problem and is a crucial ingredient in our study of online Gaussian discrepancy.

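Proposition. For any $r \in [n]$, it holds that $\mathrm{disc}_{\mathsf{G}}(A) \le \sqrt{r}\,\mathrm{disc}_{\mathsf{v},r}(A)$.

Proof. Let $u_1, \dots, u_n \in S^{r-1}$ be an optimal solution for $\mathrm{disc}_{\mathsf{v},r}(A)$. Then, if $\xi$ denotes a standard Gaussian vector in $\mathbb{R}^r$, we can define $g_j := \langle u_j, \xi \rangle$ for $j \in [n]$, and it is easily checked that this is feasible for the definition of Gaussian discrepancy. Hence, by the Cauchy–Schwarz inequality,

$$\mathrm{disc}_{\mathsf{G}}(A) \le \mathbb{E}\max_{i \in [m]} \Bigl|\Bigl\langle \sum_{j=1}^{n} A_{i,j} u_j, \xi \Bigr\rangle\Bigr| \le \max_{i \in [m]} \Bigl\|\sum_{j=1}^{n} A_{i,j} u_j\Bigr\|_2\,\mathbb{E}\|\xi\|_2 \le \sqrt{r}\,\mathrm{disc}_{\mathsf{v},r}(A). \tag{10}$$

The last inequality uses the standard probabilistic fact that $\mathbb{E}\|\xi\|_2 \le \sqrt{r}$. When $r = 1$, this can be improved to $\mathbb{E}|\xi| = \sqrt{2/\pi}$, which recovers Proposition 1.1.2.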

Applying a similar strategy but using the union bound instead of Cauchy–Schwarz shows that vector discrepancy upper bounds Gaussian discrepancy up to a logarithmic term.

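Proposition. It holds that $\mathrm{disc}_{\mathsf{G}}(A) \le \sqrt{2\ln(2m)}\,\mathrm{disc}_{\mathsf{v}}(A)$.

Proof. Let $u_1, \dots, u_n \in S^{n-1}$ be an optimal solution for $\mathrm{disc}_{\mathsf{v}}(A)$, and let $\xi \sim \mathcal{N}(0, I_n)$. As in the proof of Proposition 1.3.1,

$$\mathrm{disc}_{\mathsf{G}}(A) \le \mathbb{E}\max_{i \in [m]} \Bigl|\Bigl\langle \sum_{j=1}^{n} A_{i,j} u_j, \xi \Bigr\rangle\Bigr|.$$

The random variable $\langle \sum_{j=1}^{n} A_{i,j} u_j, \xi \rangle$ is sub-Gaussian with variance parameter $\|\sum_{j=1}^{n} A_{i,j} u_j\|_2^2$. Thus the standard maximal inequality for sub-Gaussian random variables (see e.g. boucheronlugosimassart2013concentration, Theorem 2.5) yields

$$\mathrm{disc}_{\mathsf{G}}(A) \le \sqrt{2\ln(2m)}\,\max_{i \in [m]} \Bigl\|\sum_{j=1}^{n} A_{i,j} u_j\Bigr\|_2 = \sqrt{2\ln(2m)}\,\mathrm{disc}_{\mathsf{v}}(A). \tag{11}$$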
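Both the Cauchy–Schwarz bound and the maximal-inequality bound apply to any feasible collection of unit vectors, not only an optimal one, which makes them easy to test. The sketch below is an illustration under assumed sizes: it draws random unit vectors in $\mathbb{R}^r$, forms the coupling $g_j = \langle u_j, \xi \rangle$, and compares a Monte Carlo estimate of $\mathbb{E}\|Ag\|_\infty$ with the two upper bounds evaluated at the same vectors.

```python
import math

import numpy as np

rng = np.random.default_rng(2)
m, n, r = 50, 12, 3
A = rng.standard_normal((m, n))        # hypothetical test matrix

# Feasible (not optimal) unit vectors u_1, ..., u_n in R^r, stored as rows of U.
U = rng.standard_normal((n, r))
U /= np.linalg.norm(U, axis=1, keepdims=True)

# Gaussian coupling g_j = <u_j, xi> with xi ~ N(0, I_r); each g_j is N(0, 1).
xi = rng.standard_normal((r, 100_000))
g = U @ xi
gauss_obj = np.mean(np.max(np.abs(A @ g), axis=0))   # estimate of E ||A g||_inf

# Vector discrepancy objective at the same vectors.
vec_obj = np.max(np.linalg.norm(A @ U, axis=1))

cs_bound = math.sqrt(r) * vec_obj                        # Cauchy-Schwarz route
union_bound = math.sqrt(2 * math.log(2 * m)) * vec_obj   # maximal inequality route
print(gauss_obj, cs_bound, union_bound)
```

Which of the two bounds is smaller depends on whether $r$ or $2\ln(2m)$ is smaller, which is the trade-off behind the $\sqrt{n \wedge \log m}$ factor discussed in this section.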

We record three useful consequences of Proposition 1.3.1. When combined with Theorem 1.3.1 it implies that Gaussian discrepancy is approximated by vector discrepancy up to a logarithmic factor. Since vector discrepancy admits an SDP formulation, it can be computed in polynomial time using methods from convex optimization (nesterov2018cvxopt), and in turn this yields our claimed $O(\sqrt{\log m})$-factor approximation algorithm for Gaussian discrepancy.

Second, Proposition 1.3.1 allows us to draw a new connection between spherical discrepancy and vector discrepancy. Combining this result with Corollary 1.3.1 and Proposition 1.3.1, we obtain the following. For any matrix $A$, it holds that

$$\mathrm{disc}_{\mathsf{s}}(A) \lesssim \sqrt{n \wedge \log m}\;\mathrm{disc}_{\mathsf{v}}(A).$$

As described in detail in Section 2, this inequality is tight and also may not be reversed; in fact it is possible to have $\mathrm{disc}_{\mathsf{s}}(A) = 0$ while $\mathrm{disc}_{\mathsf{v}}(A) > 0$ (see Example 2.2). We remark that this sharp result falls out naturally with Gaussian discrepancy as a mediator between spherical and vector discrepancy.

Finally, Proposition 1.3.1 leads to a simple algorithm that achieves Banaszczyk’s bound for Gaussian discrepancy. The main result of Niko13 states that the Komlós conjecture holds for vector discrepancy. Hence, a solution to the SDP formulation of vector discrepancy yields a feasible covariance matrix for Gaussian discrepancy with objective value $O(\sqrt{\log m})$. (This bound can be further refined using standard reductions.) Note that since Gaussian discrepancy is a relaxation, existing approaches for combinatorial discrepancy also achieve Banaszczyk’s bound for Gaussian discrepancy with much more involved algorithms (BanDadGar16; LevRamRot17; BanDadGarLov18).

This raises the questions of whether or not the Komlós conjecture (Conjecture 1.1.1) or online Banaszczyk (Conjecture 1.1.1) can be proved for Gaussian discrepancy. More broadly, it is natural to ask whether unresolved conjectures about combinatorial discrepancy can be solved for a given relaxation. Indeed, Niko13 and jonesmcpartlon2020spheericaldisc establish the Komlós conjecture for vector and spherical discrepancy, respectively. While we are not able to establish a Gaussian version of the Komlós conjecture, it serves as a tantalizing question for further study; see Section 1.4 for further discussion. On the other hand, our main algorithmic result establishes a Gaussian discrepancy variant of the online Banaszczyk bound, as discussed in the next section.

#### 1.3.2 Online Banaszczyk bound for Gaussian discrepancy

We begin by recalling the setting of online discrepancy minimization with an oblivious adversary. First, an adversary picks vectors $v_1, \dots, v_T \in \mathbb{R}^m$ of norm at most $1$. Then, in each round $t \in [T]$ the vector $v_t$ is revealed to the algorithm, which must output a sign $\sigma_t \in \{\pm 1\}$. The aim of the algorithm is to minimize the maximum discrepancy incurred, i.e. $\max_{t \in [T]} \bigl\|\sum_{s=1}^{t} \sigma_s v_s\bigr\|_\infty$. Equivalently, in each round $t$ the algorithm is required to choose a vector of signs $\sigma^{(t)} \in \{\pm 1\}^t$ that obeys the following consistency condition:

$$\sigma^{(t)}_{[t-1]} = \sigma^{(t-1)}, \tag{12}$$

where $\sigma^{(t)}_{[t-1]}$ denotes the restriction of $\sigma^{(t)}$ to the first $t-1$ coordinates. That is, the new signing chosen in round $t$ is forbidden from changing the signs specified in previous rounds.

The latter formulation motivates our Gaussian discrepancy variant of the online discrepancy problem. At each round $t$, we require the algorithm to output a $t$-dimensional correlation matrix $\Sigma^{(t)} \in \mathcal{E}_t$ which satisfies a consistency condition analogous to (12), namely

$$\Sigma^{(t)}_{[t-1] \times [t-1]} = \Sigma^{(t-1)}. \tag{13}$$

Thus after round $t$, the correlations between the first $t$ coordinates must remain fixed. The aim is still to minimize the maximum discrepancy incurred, i.e. $\max_{t \in [T]} \mathbb{E}\bigl\|\sum_{s=1}^{t} g_s v_s\bigr\|_\infty$, where $g \sim \mathcal{N}(0, \Sigma^{(T)})$. A simple argument based on the Cholesky decomposition (see Appendix B for details) shows that any such sequence $\Sigma^{(1)}, \dots, \Sigma^{(T)}$ can be realized as a sequence of Gram matrices of initial segments of a sequence of unit vectors $u_1, \dots, u_T$. Hence, it is equivalent to require that the algorithm output a unit vector $u_t$ at each round $t$, and then set $\Sigma^{(t)}_{s,s'} := \langle u_s, u_{s'} \rangle$ for $s, s' \in [t]$. We arrive at the following formulation of online Gaussian discrepancy. As in online combinatorial discrepancy, we allow the algorithm to be randomized.
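The Cholesky realization mentioned above can be illustrated with a short sketch. It assumes, for simplicity, that the final correlation matrix is strictly positive definite so that `numpy.linalg.cholesky` applies; the degenerate case is not covered here, and the matrix itself is a random stand-in.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 6

# A hypothetical positive definite correlation matrix Sigma^{(T)}.
W = rng.standard_normal((T, T))
W /= np.linalg.norm(W, axis=1, keepdims=True)
Sigma_T = W @ W.T

# Cholesky factor: Sigma_T = L L^T with L lower triangular.
# Row t of L plays the role of the unit vector u_t, supported on the first t coordinates.
L = np.linalg.cholesky(Sigma_T)

# The Gram matrix of u_1, ..., u_t reproduces the leading t x t block Sigma^{(t)}.
gram_errors = [
    np.max(np.abs(L[:t] @ L[:t].T - Sigma_T[:t, :t])) for t in range(1, T + 1)
]

# Online consistency: factoring only the leading block gives the same first t vectors,
# so u_t can be computed at round t without knowledge of later rounds.
prefix_errors = [
    np.max(np.abs(np.linalg.cholesky(Sigma_T[:t, :t]) - L[:t, :t]))
    for t in range(1, T + 1)
]
print(max(gram_errors), max(prefix_errors))
```

The lower-triangular structure is what makes the realization compatible with the consistency condition (13): appending a row to the factor never changes the earlier rows.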

[Online Gaussian discrepancy with oblivious adversary] An adversary selects vectors $v_1, \dots, v_T \in \mathbb{R}^m$ with norm at most $1$ in advance. At each round $t \in [T]$, the algorithm observes $v_t$ and outputs a random unit vector $u_t$. The algorithm aims to minimize

$$\mathrm{disc}_{\mathsf{G}}(v_1, \dots, v_T; \Sigma^{(T)}) = \max_{t \in [T]} \mathbb{E}\Bigl\|\sum_{s=1}^{t} g_s v_s\Bigr\|_\infty, \qquad g \sim \mathcal{N}(0, \Sigma^{(T)}),$$

with high probability over the random correlation matrix $\Sigma^{(T)}$.

We show in Appendix B that, without loss of generality, it suffices to consider $u_t$ whose support is a subset of the first $t$ coordinates.

One strategy for generating feasible couplings in each round is to first fix a rank parameter $r$ in advance and output a unit vector $u_t \in S^{r-1}$ in each round. Our main result is an algorithm of this form that solves the Gaussian discrepancy variant of the online Banaszczyk conjecture (Conjecture 1.1.1).

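[Online Banaszczyk bound for Gaussian discrepancy] Let $v_1, \dots, v_T \in \mathbb{R}^m$ denote vectors of norm at most $1$ selected in advance by an adversary. For all positive integers $r \ge 2$ and all $\delta \in (0, 1)$, there is a randomized online algorithm that with probability at least $1 - \delta$ outputs $u_1, \dots, u_T \in S^{r-1}$ such that

$$\mathrm{disc}_{\mathsf{G}}(v_1, \dots, v_T; \Sigma^{(T)}) = O\bigl(\sqrt{\log(mT/\delta)}\bigr),$$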

where . The algorithm runs in time per round.

The proof of Theorem 1.3.2 is the main content of Section 3. If we could prove this theorem with $r = 1$, then the online Banaszczyk problem (Conjecture 1.1.1) would follow from the proof of Proposition 1.1.2. Unfortunately, there is an obstruction preventing us from considering $r = 1$, as described further below. Nevertheless, Theorem 1.3.2 shows that we are ‘one rank away’ from establishing Conjecture 1.1.1.

Our algorithm is based on an intriguing idea in the recent paper LiuSahSawhney21: namely, if one can find a Markov chain on $\mathbb{R}$ whose increments take values in $\{\pm 1\}$ and whose stationary distribution is a Gaussian, then it is possible to construct an algorithm, which they call the Gaussian fixed-point walk, with the property that each partial sum is the difference of two Gaussian vectors. The online Banaszczyk bound would then follow from a union bound. However, LiuSahSawhney21 exhibit a parity obstruction which implies no such Markov chain exists on $\mathbb{R}$, and this leads them to consider instead Markov chains whose increments lie in larger sets of values (and hence the resulting algorithms output partial colorings or improper colorings). We show that an analogous Markov chain does exist on $\mathbb{R}^r$ for any $r \ge 2$ whose increments lie in $S^{r-1}$ and whose stationary distribution is a Gaussian with independent and identically distributed coordinates (see Figure 1). Working with higher ranks avoids certain technical complications resulting in the partial coloring version derived by LiuSahSawhney21 and allows for a simple algorithm and analysis.

Figure 1: $(X^{(t)})_{t \ge 0}$ denotes the Markov chain on $\mathbb{R}^2$ with unit vector steps and Gaussian stationary distribution. (a) Evolution of $(X^{(t)})_{t \ge 0}$ over two runs of the Markov chain with 2000 steps. The orange and blue trajectories are initialized inside and outside the unit circle, respectively. (b) Scatter plot of $X^{(100)}$ over 5000 independent runs of the Markov chain started from the stationary distribution $X^{(0)} \sim \mathcal{N}(0, \sigma^2 I_2)$, where $\sigma = 0.5$.

Our algorithm also has consequences for the previously unexplored problem of online vector discrepancy. Perhaps surprisingly, Theorem 1 below shows that the Komlós conjecture, with sharp constant, is attainable for vector discrepancy even in the oblivious online setting. In contrast, bansal2020online exhibit a super-constant lower bound for online combinatorial discrepancy in the oblivious setting. The next result is an immediate consequence of Theorem 3.3 proved in Section 3.

[Online Komlós bound for vector discrepancy] Let denote vectors of norm at most selected in advance by an adversary. Fix . When run with rank , the algorithm outputs online with

$$\max_{t \in [T]} \max_{i \in [m]} \Bigl\|\sum_{s=1}^{t} (v_s)_i\, u_s\Bigr\|_2 \le 1 + \varepsilon$$

with probability at least . The algorithm runs in time per round.

As a corollary of the above theorem, we obtain a new proof of the vector Komlós theorem of Niko13, which states that $\mathrm{disc}_{\mathsf{v}}(A) \le 1$ for all matrices $A$ whose columns have norm at most $1$. Since the vector discrepancy of the identity matrix is $1$, our result is essentially sharp.

Although we present our algorithm for the online setting, it yields new algorithmic implications for the offline setting as well. In particular, our runtime is nearly linear in the input size for moderate values of the approximation parameter . This improves on the time complexity of previous algorithms for vector Komlós such as off-the-shelf SDP solvers, which require arithmetic operations (nesterov2018cvxopt), and an iterative approach of DadGarLov19 that naïvely runs in time due to costly matrix inversion steps.666

We do not take into account fast matrix multiplication for these runtime estimates. We also note that an SDP solver would in general yield the stronger guarantee

for .

Furthermore, setting in Theorem 1 implies that the Komlós conjecture holds for rank-constrained vector discrepancy with . To the best of our knowledge this result was previously unknown even in an offline setting.

### 1.4 Open problems

Our work leads to some natural open questions which we briefly describe below.

1. Komlós conjecture for Gaussian discrepancy. Does there exist a constant $K > 0$ such that $\mathrm{disc}_{\mathsf{G}}(A) \le K$ for any matrix $A$ whose columns have norm at most $1$?

Since Gaussian discrepancy is a relaxation, solving this conjecture is a natural prerequisite for establishing the original Komlós conjecture (Conjecture 1.1.1). It also leads to a related problem: prove that the Komlós conjecture for Gaussian discrepancy implies the Komlós conjecture.

2. Rounding Gaussian discrepancy solutions. Does there exist an efficient rounding scheme to convert Gaussian discrepancy solutions to low-discrepancy signings?

In Appendix C, we show that two simple rounding schemes (Goemans-Williamson rounding and a PCA-based rounding) are not effective in the setting of Spencer’s theorem and the Komlós conjecture.

3. Computational tractability of Gaussian discrepancy. Given a matrix, can we compute its Gaussian discrepancy exactly or approximately in polynomial time? In particular, is it NP-hard to approximate Gaussian discrepancy up to a constant factor? We note that hardness of approximation results are already known for combinatorial discrepancy (ChaNewNik11) and spherical discrepancy (jonesmcpartlon2020spheericaldisc).

4. Achieving Banaszczyk’s bound online. Finally, we mention that the question of achieving Banaszczyk’s bound online for combinatorial discrepancy, as originally posed by alweiss2020discrepancy, is still open.

### 1.5 Organization of the paper

In Section 2.1, we give the proof of our main comparison result between the relaxations of discrepancy (Theorem 1.3.1). In Section 2.2, we provide a suite of examples showing that the inequalities given in Section 1.3.1 are sharp. Finally, in Section 3, we present our algorithm and prove that it achieves the online Banaszczyk bound for Gaussian discrepancy.

## 2 Comparisons between relaxations of discrepancy

In this section we prove Theorem 1.3.1 and give a number of illustrative examples that highlight the differences between the various relaxations of discrepancy. These examples show that none of the inequalities in Corollary 1.3.1 can be reversed in general (at least with constants independent of $m$ and $n$). Moreover, our results in this section give evidence for the following assertion: vector discrepancy and spherical discrepancy each capture distinct aspects of the original discrepancy problem, namely SDP solutions are “aligned with the coordinate axes”, whereas spherical discrepancy solutions are “low rank”. Gaussian discrepancy appears to capture both of these aspects simultaneously.

This assertion can already be partially justified by observing that the feasible covariance matrices for vector discrepancy satisfy $\operatorname{diag}\Sigma = \mathbf{1}_n$, i.e. the variance along each of the coordinate axes is fixed to be $1$. On the other hand, from the proof of Theorem 1.3.1 below, we see that a spherical discrepancy solution $x$ gives rise to the covariance matrix $xx^\top$, which is indeed low rank but is not guaranteed to have unit variance along each coordinate axis.

### 2.1 Proof of main comparison result

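[Proof of Theorem 1.3.1] Vector discrepancy: Given $\Sigma \in \mathcal{E}_n$, let $g^\Sigma$ denote a centered Gaussian vector with covariance matrix $\Sigma$. Writing $\Sigma$ as the Gram matrix of unit vectors $u_1, \dots, u_n$, observe that

$$\max_{i \in [m]} \Bigl\|\sum_{j=1}^{n} A_{i,j} u_j\Bigr\|_2 = \max_{i \in [m]} \sqrt{\mathbb{E}\bigl[(Ag^\Sigma)_i^2\bigr]} = \sqrt{\max_{i \in [m]} \langle A_{i,:}, \Sigma A_{i,:} \rangle}. \tag{14}$$

Therefore

$$\mathrm{disc}_{\mathsf{v}}(A) = \min_{\Sigma \in \mathcal{E}_n} \sqrt{\max_{i \in [m]} \langle A_{i,:}, \Sigma A_{i,:} \rangle} \overset{(\star)}{\asymp} \min_{\Sigma \in \mathcal{E}_n} \max_{i \in [m]} \mathbb{E}\Bigl|\sum_{j=1}^{n} A_{i,j}\, g^\Sigma_j\Bigr| = \min_{\Sigma \in \mathcal{E}_n} \max_{i \in [m]} \mathbb{E}\bigl|(Ag^\Sigma)_i\bigr|,$$

where $(\star)$ uses Lemma A and Cauchy–Schwarz.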
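As a side remark on the comparison step, the constants can be made explicit in the Gaussian case. The following worked equation is a sketch of one way to see it; it uses only the first absolute moment of a centered Gaussian and is not necessarily the route taken through Lemma A.

```latex
% For Z ~ N(0, s^2), the first absolute moment is E|Z| = s * sqrt(2/pi).
% Each coordinate (A g^Sigma)_i is a centered Gaussian with variance
% <A_{i,:}, Sigma A_{i,:}>, so for every Sigma in the elliptope:
\[
  \mathbb{E}\bigl|(Ag^{\Sigma})_i\bigr|
  = \sqrt{\tfrac{2}{\pi}}\,
    \sqrt{\langle A_{i,:}, \Sigma A_{i,:} \rangle},
  \qquad i \in [m].
\]
% Taking the maximum over i and then the minimum over Sigma gives
\[
  \min_{\Sigma \in \mathcal{E}_n} \max_{i \in [m]}
  \mathbb{E}\bigl|(Ag^{\Sigma})_i\bigr|
  = \sqrt{\tfrac{2}{\pi}}\;\mathrm{disc}_{\mathsf{v}}(A).
\]
```

Under this reading, the two sides of (7) differ by exactly the factor $\sqrt{2/\pi}$, the same constant that appears in Proposition 1.1.2.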

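Spherical discrepancy: Let us temporarily denote

$$\widetilde{\mathrm{disc}}_{\mathsf{s}}(A) := \min\Bigl\{\mathbb{E}\|Ag\|_\infty : g \sim \mathcal{N}(0, \Sigma),\ \Sigma \in \mathbb{S}_+^n,\ \operatorname{tr}\Sigma = n\Bigr\}.$$

Let $g$ be a Gaussian that achieves the minimum in the definition of $\widetilde{\mathrm{disc}}_{\mathsf{s}}(A)$. Then, by Markov’s inequality,

$$\Pr\bigl\{\|Ag\|_\infty \ge 3\,\widetilde{\mathrm{disc}}_{\mathsf{s}}(A)\bigr\} \le \frac{1}{3}. \tag{15}$$

From Lemma A, we further have

$$\Pr\bigl\{\|g\|_2 \le c\,\mathbb{E}\|g\|_2\bigr\} \le \frac{1}{3} \qquad \text{and} \qquad \mathbb{E}\|g\|_2 \gtrsim \sqrt{\mathbb{E}\|g\|_2^2} \tag{16}$$

for some constant $c > 0$. From (16) and (15), there exists a realization $\mathsf{g}$ of $g$ with

$$\|A\mathsf{g}\|_\infty \le 3\,\widetilde{\mathrm{disc}}_{\mathsf{s}}(A), \qquad \text{and} \qquad \|\mathsf{g}\|_2 \ge c\,\mathbb{E}\|g\|_2 \gtrsim \sqrt{\mathbb{E}\|g\|_2^2} = \sqrt{n}.$$

It follows that

$$\mathrm{disc}_{\mathsf{s}}(A) \le \Bigl\|A\,\frac{\sqrt{n}\,\mathsf{g}}{\|\mathsf{g}\|_2}\Bigr\|_\infty = \frac{\sqrt{n}}{\|\mathsf{g}\|_2}\,\|A\mathsf{g}\|_\infty \lesssim \widetilde{\mathrm{disc}}_{\mathsf{s}}(A).$$

Conversely, let $x$ be an optimal solution for $\mathrm{disc}_{\mathsf{s}}(A)$ and let $\xi$ be a standard Gaussian variable on $\mathbb{R}$, so $\xi x \sim \mathcal{N}(0, xx^\top)$ and $\operatorname{tr}(xx^\top) = \|x\|_2^2 = n$. Then,

$$\widetilde{\mathrm{disc}}_{\mathsf{s}}(A) \le \mathbb{E}\|A(\xi x)\|_\infty = \mathbb{E}|\xi|\,\|Ax\|_\infty = \sqrt{\frac{2}{\pi}}\,\mathrm{disc}_{\mathsf{s}}(A),$$

which establishes the converse bound.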

### 2.2 Tightness of the comparison results

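Our first example demonstrates the sharpness of Corollary 1.3.1 with respect to both $m$ and $n$.

For this example we first consider an infinitely tall matrix $A$. Let the rows of $A$ consist of all unit vectors in $\mathbb{R}^n$. By taking the vector discrepancy solution $\Sigma = I_n$ and using (14), we see that

$$\mathrm{disc}_{\mathsf{v}}(A) \le \sqrt{\max_{a \in S^{n-1}} \langle a, \Sigma a \rangle} = 1.$$

(In fact it is not hard to see that this is an equality.) However, for any $x \in \sqrt{n}\,S^{n-1}$ we have

$$\max_{a \in S^{n-1}} |\langle a, x \rangle| = \|x\|_2 = \sqrt{n},$$

which shows that $\mathrm{disc}_{\mathsf{s}}(A) = \sqrt{n}$. Thus, the spherical discrepancy can be much larger than vector discrepancy.

Although this example used an infinitely tall matrix $A$, we can modify it by taking the rows of $A$ to consist of an $\varepsilon$-net of $S^{n-1}$ for a fixed constant $\varepsilon \in (0, 1)$. Then, the number of rows $m$ of $A$ can be taken to be $\exp(O(n))$, and a standard argument involving nets (see the proof of vershynin2018highdimprob, Lemma 4.4.1) shows that we still have $\mathrm{disc}_{\mathsf{v}}(A) \le 1$ and $\mathrm{disc}_{\mathsf{s}}(A) \gtrsim \sqrt{n}$. In particular, this shows that the bound

$$\mathrm{disc}_{\mathsf{s}}(A) = O\bigl(\sqrt{n \wedge \log m}\bigr)\,\mathrm{disc}_{\mathsf{v}}(A)$$

obtained in Corollary 1.3.1 is sharp with respect to both $m$ and $n$.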

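The next example shows that Corollary 1.3.1 cannot be reversed, even with a constant depending on $m$ and $n$, so that vector discrepancy cannot in general be controlled by spherical discrepancy.

Let $v \in S^{n-1}$ be a unit vector and suppose that the rows of $A$ consist of all vectors in

$$\mathcal{A} := S^{n-1} \cap v^\perp,$$

the set of unit vectors which are orthogonal to $v$ (as in the preceding example, this example can also be modified to a matrix with finitely many rows). Then, $\mathrm{disc}_{\mathsf{s}}(A) = 0$, as witnessed by $x = \sqrt{n}\,v$. On the other hand, we claim that $\mathrm{disc}_{\mathsf{v}}(A) > 0$ for most choices of $v$. Suppose to the contrary that there exist unit vectors $u_1, \dots, u_n$ witnessing the fact that $\mathrm{disc}_{\mathsf{v}}(A) = 0$. Then, for each $a \in \mathcal{A}$,

$$0 = \Bigl\|\sum_{j=1}^{n} a_j u_j\Bigr\|_2^2 = \sum_{k=1}^{n} \langle a, u^{(k)} \rangle^2,$$

where $u^{(k)}$ denotes the vector $((u_1)_k, \dots, (u_n)_k)$ of $k$-th coordinates. This identity implies that each $u^{(k)}$ is a multiple of $v$, i.e., there exists a scalar $c_k$ such that $(u_j)_k = c_k v_j$ for all $j \in [n]$. Since $u_j$ is a unit vector,

$$1 = \|u_j\|_2^2 = v_j^2 \sum_{k=1}^{n} c_k^2, \qquad \text{for all } j \in [n].$$

In order for this to hold, each coordinate of $v$ must have the same magnitude, i.e. $v$ must be a signing (scaled by $1/\sqrt{n}$). Hence, for most directions $v$ we have $\mathrm{disc}_{\mathsf{s}}(A) = 0$ but $\mathrm{disc}_{\mathsf{v}}(A) > 0$.

This shows in particular that there does not exist $C > 0$ such that $\mathrm{disc}_{\mathsf{v}}(A) \le C\,\mathrm{disc}_{\mathsf{s}}(A)$ for all matrices $A$, even if the constant $C$ is allowed to depend on $m$ and $n$.

The two preceding examples show that the statements $\mathrm{disc}_{\mathsf{s}}(A) \le C\,\mathrm{disc}_{\mathsf{v}}(A)$ and $\mathrm{disc}_{\mathsf{v}}(A) \le C'\,\mathrm{disc}_{\mathsf{s}}(A)$ do not hold with universal constants $C, C'$. In particular, since $\mathrm{disc}_{\mathsf{v}}(A) \vee \mathrm{disc}_{\mathsf{s}}(A) \lesssim \mathrm{disc}_{\mathsf{G}}(A)$ by Corollary 1.3.1, it also implies that the statements $\mathrm{disc}_{\mathsf{G}}(A) \le C\,\mathrm{disc}_{\mathsf{v}}(A)$ and $\mathrm{disc}_{\mathsf{G}}(A) \le C'\,\mathrm{disc}_{\mathsf{s}}(A)$ also do not hold with universal constants $C$ and $C'$.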

We give another example to show that in general, Gaussian discrepancy can indeed be smaller than combinatorial discrepancy.

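Consider the case when $m = 1$, so that $A$ consists of a single row $a$, and assume that the entries of $a$ are i.i.d. standard Gaussians. Then, the discrepancy of $A$ is usually referred to as the number balancing problem.

Let us suppose that $n$ is a multiple of $3$, and divide the coordinates into $3$ groups consisting of $n/3$ coordinates each. Due to concentration of measure, the sums

$$\ell_1 := \sum_{i=1}^{n/3} |a_i|, \qquad \ell_2 := \sum_{i=n/3+1}^{2n/3} |a_i|, \qquad \text{and} \qquad \ell_3 := \sum_{i=2n/3+1}^{n} |a_i|$$

will concentrate around their expected value $\sqrt{2/\pi}\,n/3$. In particular, with high probability each of $\ell_1, \ell_2, \ell_3$ is at most the sum of the other two, which says that $\ell_1$, $\ell_2$, and $\ell_3$ form the side lengths of a triangle in $\mathbb{R}^2$. If we let $u_1, u_2, u_3 \in S^1$ denote unit vectors corresponding to the sides of this triangle, then

$$0 = \sum_{i=1}^{n/3} |a_i|\,u_1 + \sum_{i=n/3+1}^{2n/3} |a_i|\,u_2 + \sum_{i=2n/3+1}^{n} |a_i|\,u_3 = \sum_{i=1}^{n/3} a_i (\operatorname{sgn} a_i)\,u_1 + \sum_{i=n/3+1}^{2n/3} a_i (\operatorname{sgn} a_i)\,u_2 + \sum_{i=2n/3+1}^{n} a_i (\operatorname{sgn} a_i)\,u_3.$$

This shows that $\mathrm{disc}_{\mathsf{v},2}(A) = 0$ with high probability, and hence $\mathrm{disc}_{\mathsf{G}}(A) = 0$ by Proposition 1.3.1.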
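The triangle construction can be carried out explicitly. The sketch below is illustrative only: the dimension, the seed, and the way the triangle is placed in the plane (first side along the $x$-axis, angle from the law of cosines) are assumptions, and the code presumes the triangle inequalities hold, which is the high-probability event described above.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300                                  # assumed to be a multiple of 3
a = rng.standard_normal(n)               # the single row of A
groups = np.repeat([0, 1, 2], n // 3)    # group index of each coordinate

# Side lengths ell_1, ell_2, ell_3: sums of |a_i| within each group.
ell = np.array([np.abs(a[groups == k]).sum() for k in range(3)])

# Unit directions of a closed triangle with these side lengths.
# Law of cosines: |ell_1 u1 + ell_2 u2|^2 = ell_3^2 fixes the angle between u1 and u2.
cos_theta = (ell[2] ** 2 - ell[0] ** 2 - ell[1] ** 2) / (2 * ell[0] * ell[1])
theta = np.arccos(cos_theta)
u1 = np.array([1.0, 0.0])
u2 = np.array([np.cos(theta), np.sin(theta)])
u3 = -(ell[0] * u1 + ell[1] * u2) / ell[2]   # closes the triangle; has unit norm
dirs = np.stack([u1, u2, u3])

# Rank-2 vector solution: coordinate i receives sgn(a_i) times its group direction.
U = np.sign(a)[:, None] * dirs[groups]

# The vector discrepancy objective is the norm of sum_i a_i * U[i].
residual = a @ U
print(np.linalg.norm(residual))
```

The printed residual is zero up to floating-point error, so these $n$ unit vectors in $\mathbb{R}^2$ balance the row exactly even though no signing does.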

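On the other hand, it is well-known that $\mathrm{disc}(A)$ is typically of order $\sqrt{n}\,2^{-n}$, and in particular strictly positive (see e.g. KarKarLue86; Cos09; TurMekRig20). In particular, there does not exist a constant $C > 0$ (even if the constant is allowed to depend on $m$ and $n$) such that $\mathrm{disc}(A) \le C\,\mathrm{disc}_{\mathsf{G}}(A)$.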

In summary, if we require the constants to be universal, then in general none of the inequalities in Corollary 1.3.1 can be reversed, and furthermore vector discrepancy and spherical discrepancy are incomparable.

In this section, we have argued that vector discrepancy and spherical discrepancy capture distinct aspects of the original discrepancy problem, and can therefore be viewed as complementary. It is then natural to ask whether a combination of vector discrepancy and spherical discrepancy can control Gaussian discrepancy. This is also motivated by the results of Niko13 and jonesmcpartlon2020spheericaldisc that prove the Komlós conjecture for vector discrepancy and spherical discrepancy respectively; hence, a control on the Gaussian discrepancy in terms of these two notions would imply the Komlós conjecture for Gaussian discrepancy as well. However, we resolve this question negatively via the following example.

Let the rows of $A$ consist of all unit vectors in $\mathbb{R}^n$ orthogonal to $e_1$. Then, similarly to the previous examples, we have $\mathrm{disc}_{\mathsf{v}}(A) \le 1$ and $\mathrm{disc}_{\mathsf{s}}(A) = 0$. On the other hand, for any feasible Gaussian coupling $g$,

$$\mathbb{E}\|Ag\|_\infty = \mathbb{E}\max_{a \in S^{n-1} \cap e_1^\perp} |\langle a, g \rangle| = \mathbb{E}\bigl\|\operatorname{proj}_{e_1^\perp} g\bigr\|_2 \overset{(\star)}{\gtrsim} \sqrt{\mathbb{E}\bigl[\|(0, g_2, \dots, g_n)\|_2^2\bigr]} = \sqrt{n-1},$$

where the inequality $(\star)$ uses Lemma A. Hence, $\mathrm{disc}_{\mathsf{G}}(A) \gtrsim \sqrt{n}$. This in fact shows that there is no non-trivial function $f$, independent of both $m$ and $n$, such that

$$\mathrm{disc}_{\mathsf{G}}(A) \le f\bigl(\mathrm{disc}_{\mathsf{v}}(A), \mathrm{disc}_{\mathsf{s}}(A)\bigr)$$

for all matrices $A$.

## 3 Gaussian fixed-point walk in higher dimensions

### 3.1 High-level overview

We begin by summarizing the Gaussian fixed-point walk as introduced in LiuSahSawhney21. Recall the setting of online discrepancy minimization with oblivious adversary: the adversary chooses in advance vectors $v_1, \dots, v_T \in \mathbb{R}^m$ and at each time step $t \in [T]$, the vector $v_t$ is revealed to the algorithm, upon which it must then choose a sign $\sigma_t \in \{\pm 1\}$. The loss incurred by the algorithm is the maximum discrepancy $\max_{t \in [T]} \bigl\|\sum_{s=1}^{t} \sigma_s v_s\bigr\|_\infty$.

Let $w_t := w_0 + \sum_{s=1}^{t} \sigma_s v_s$ denote the partial sum vector with initialization $w_0 \sim \mathcal{N}(0, I_m)$. The idea behind the Gaussian fixed-point walk is to choose the signs to ensure that $w_t \sim \mathcal{N}(0, I_m)$ for all $t$ (observe that if this holds, then the loss incurred by the algorithm is easily controlled via a union bound). Towards this end, we may assume as an inductive hypothesis that $w_{t-1} \sim \mathcal{N}(0, I_m)$. Then, we may write

$$w_t = \underbrace{\operatorname{proj}_{v_t^\perp} w_{t-1}}_{\sim\,\mathcal{N}(0,\, I_m - v_t v_t^\top)} + \Bigl[\underbrace{\operatorname{proj}_{v_t} w_{t-1}}_{\sim\,\mathcal{N}(0,\, v_t v_t^\top)} + \sigma_t v_t\Bigr].$$

If we choose $\sigma_t$ independently of $\operatorname{proj}_{v_t^\perp} w_{t-1}$, then the two terms in the above decomposition are independent. Moreover, if we could choose $\sigma_t$ such that $\operatorname{proj}_{v_t} w_{t-1} + \sigma_t v_t \sim \mathcal{N}(0, v_t v_t^\top)$, then $w_t \sim \mathcal{N}(0, I_m)$. These considerations lead to the problem of finding a one-dimensional Markov chain (on $\mathbb{R}$) whose increments belong to $\{\pm 1\}$, and whose stationary distribution is $\mathcal{N}(0, 1)$. However, a parity argument given in LiuSahSawhney21 shows that such a Markov chain does not exist, even if we allow for changing the variance of the Gaussian from $1$.

To deal with this issue, LiuSahSawhney21 show that such Markov chains exist if we allow the increments to take values in larger sets, which leads to the algorithm outputting partial or improper colorings. If the variance of the Gaussian is set to be sufficiently large, then they can further argue that the algorithm outputs an actual signing with high probability, but the large variance of the Gaussian prevents them from establishing an online Banaszczyk result.

Our contribution in this section lies in adapting the idea above to online Gaussian discrepancy, by constructing a Markov chain in $\mathbb{R}^r$ whose stationary distribution is a Gaussian, and whose increments lie on the unit sphere $S^{r-1}$; this is given in Section 3.2. The algorithm and analysis are then given in Section 3.3.

### 3.2 Construction of the Markov chain

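Our goal in this section is to construct a Markov chain on $\mathbb{R}^r$ for $r \ge 2$ whose increments are unit vectors and whose stationary distribution is $\mathcal{N}(0, \sigma^2 I_r)$ for some $\sigma > 0$. In this and subsequent sections, $\sigma$ is used to denote the standard deviation (and in particular should not be confused with a signing).

The density of the $\chi$-distribution with $r$ degrees of freedom is

$$\chi_r(s) = \frac{1}{2^{r/2-1}\,\Gamma(r/2)}\, s^{r-1} e^{-s^2/2}\, \mathbb{1}\{s \ge 0\}.$$

Hence, the density of $\|X\|_2$ where $X \sim \mathcal{N}(0, \sigma^2 I_r)$ is

$$\chi_{r,\sigma^2}(s) = \frac{1}{2^{r/2-1}\,\Gamma(r/2)}\, \frac{s^{r-1}}{\sigma^r}\, e^{-s^2/(2\sigma^2)}\, \mathbb{1}\{s \ge 0\}.$$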
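As a sanity check on the scaled density, the sketch below integrates it numerically and compares its mean with the empirical mean of $\|X\|_2$ for simulated Gaussian vectors. The function name `chi_density`, the grid, and the parameter values ($r = 2$, $\sigma = 0.5$, matching Figure 1) are choices made for this illustration.

```python
import math

import numpy as np


def chi_density(s, r, sigma=1.0):
    """Density of ||X||_2 for X ~ N(0, sigma^2 I_r), evaluated at s."""
    s = np.asarray(s, dtype=float)
    norm_const = 2 ** (r / 2 - 1) * math.gamma(r / 2) * sigma**r
    dens = s ** (r - 1) * np.exp(-(s**2) / (2 * sigma**2)) / norm_const
    return np.where(s >= 0, dens, 0.0)


r, sigma = 2, 0.5
grid = np.linspace(0.0, 10 * sigma, 20_001)
vals = chi_density(grid, r, sigma)
dx = grid[1] - grid[0]

# Trapezoid rule for the total mass and for the mean of the density.
total_mass = np.sum((vals[1:] + vals[:-1]) / 2) * dx
weighted = grid * vals
density_mean = np.sum((weighted[1:] + weighted[:-1]) / 2) * dx

# Empirical mean of the norm of simulated Gaussian vectors.
rng = np.random.default_rng(5)
X = sigma * rng.standard_normal((200_000, r))
empirical_mean = np.linalg.norm(X, axis=1).mean()

print(total_mass, density_mean, empirical_mean)
```

For $r = 2$ this is the Rayleigh distribution, whose mean is $\sigma\sqrt{\pi/2}$, which gives an independent value to compare against.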

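Let $x \in \mathbb{R}^r$, and define the following sets:

$$S_x := \{y \in \mathbb{R}^r : \|x - y\|_2 = 1 \text{ and } \|y\|_2 = \|x\|_2\}, \qquad S'_x := \{y \in \mathbb{R}^r : \|x - y\|_2 = 1 \text{ and } \|y\|_2 = 1 - \|x\|_2\}.$$

Since $\|y\|_2^2 = \|x\|_2^2 - 2\langle x, x - y \rangle + \|x - y\|_2^2$, if $\|x - y\|_2 = 1$ then from the Cauchy-Schwarz inequality, we obtain

$$(\|x\|_2 - 1)^2 \le \|y\|_2^2 \le (\|x\|_2 + 1)^2.$$

Conversely, if $s \ge 0$ satisfies $(\|x\|_2 - 1)^2 \le s^2 \le (\|x\|_2 + 1)^2$, then there exists $y \in \mathbb{R}^r$ such that $\|x - y\|_2 = 1$ and $\|y\|_2 = s$. From this, we deduce that

$$\Bigl[S_x \ne \emptyset \ \text{ if } \ \|x\|_2 \ge \tfrac{1}{2}\Bigr] \qquad \text{and} \qquad \Bigl[S'_x \ne \emptyset \ \text{ if } \ \|x\|_2 \le 1\Bigr].$$

Further note that if $0 < \|x\|_2 \le 1$, then $S'_x$ consists of a single element given by $x - x/\|x\|_2$, while $S'_0$ is the unit sphere.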
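The two sets can be parametrized explicitly, which is convenient for simulation. The sketch below is an illustration, with function names chosen here: writing $y = x - d$ for a unit vector $d$, the condition $\|y\|_2 = \|x\|_2$ is equivalent to $\langle x, d \rangle = 1/2$, so a point of $S_x$ is obtained by fixing the component of $d$ along $x$ and drawing the orthogonal component in a uniformly random direction. Interpreting "uniform on $S_x$" as this rotation-invariant choice is an assumption.

```python
import numpy as np


def sample_S_x(x, rng):
    """Draw a point of S_x = {y : ||x - y|| = 1, ||y|| = ||x||}; needs ||x|| >= 1/2, r >= 2."""
    x = np.asarray(x, dtype=float)
    s = np.linalg.norm(x)
    if s < 0.5:
        raise ValueError("S_x is empty when ||x|| < 1/2")
    x_hat = x / s
    # Uniformly random unit vector w orthogonal to x.
    w = rng.standard_normal(x.shape)
    w -= (w @ x_hat) * x_hat
    w /= np.linalg.norm(w)
    # Unit step d with <x, d> = 1/2, i.e. component 1/(2 ||x||) along x_hat.
    along = 1.0 / (2.0 * s)
    d = along * x_hat + np.sqrt(max(0.0, 1.0 - along**2)) * w
    return x - d


def point_of_S_prime_x(x):
    """The unique element x - x/||x|| of S'_x when 0 < ||x|| <= 1."""
    x = np.asarray(x, dtype=float)
    return x - x / np.linalg.norm(x)


rng = np.random.default_rng(6)
x = np.array([0.3, -1.2, 0.7])
y = sample_S_x(x, rng)
print(np.linalg.norm(x - y), np.linalg.norm(y), np.linalg.norm(x))

x_small = np.array([0.2, 0.1])
y_small = point_of_S_prime_x(x_small)
print(np.linalg.norm(x_small - y_small), np.linalg.norm(y_small))
```

In both cases the step $x - y$ has unit length; a move within $S_x$ preserves the norm, while the move to $S'_x$ maps norm $\|x\|_2$ to $1 - \|x\|_2$.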

The Markov chain transitions from with norm to the next point according to the following rules.

1. If , move to a point chosen uniformly at random in .

2. If :

1. with probability