1.1. Background and related work
Counting colorings of a bounded degree graph is a benchmark problem in approximate counting, owing both to its importance in combinatorics and statistical physics and to the fact that it has repeatedly challenged existing algorithmic techniques and stimulated the development of new ones.
Given a finite graph $G$ of maximum degree $\Delta$, and a positive integer $q$, the goal is to count the number of (proper) vertex colorings of $G$ with $q$ colors. It is well known by Brooks’ Theorem that such a coloring always exists if $q \geq \Delta + 1$. While counting colorings exactly is #P-complete, a long-standing conjecture asserts that approximately counting colorings is possible in polynomial time provided $q \geq \Delta + 1$. It is known that when $q < \Delta$, even approximate counting is NP-hard.
This question has led to numerous algorithmic developments over the past 25 years. The first approach was via Markov chain Monte Carlo (MCMC), based on the fact that approximate counting can be reduced to sampling a coloring (almost) uniformly at random. Sampling can be achieved by simulating a natural local Markov chain (or Glauber dynamics) that randomly flips colors on vertices: provided the chain is rapidly mixing, this leads to an efficient algorithm (a fully polynomial randomized approximation scheme, or FPRAS).
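As a rough illustration of the local Markov chain described above, the following sketch runs Glauber dynamics for proper colorings on a small graph. The function names, the greedy starting coloring, and the `steps` parameter are assumptions made for this example only; no mixing-time bound is computed, so the number of steps is simply chosen by the caller.

```python
import random

def greedy_coloring(adj, q):
    """Build some proper coloring greedily (assumes q > maximum degree)."""
    coloring = [None] * len(adj)
    for v in range(len(adj)):
        blocked = {coloring[u] for u in adj[v] if coloring[u] is not None}
        coloring[v] = min(c for c in range(q) if c not in blocked)
    return coloring

def glauber_step(coloring, adj, q, rng):
    """Recolor a uniformly random vertex with a uniformly random color
    that is not currently used by any of its neighbors."""
    v = rng.randrange(len(adj))
    blocked = {coloring[u] for u in adj[v]}
    allowed = [c for c in range(q) if c not in blocked]
    coloring[v] = rng.choice(allowed)  # non-empty whenever q > maximum degree

def is_proper(coloring, adj):
    """Check that no edge is monochromatic."""
    return all(coloring[v] != coloring[u]
               for v in range(len(adj)) for u in adj[v])

def sample_coloring(adj, q, steps, seed=0):
    """Run the chain for a caller-chosen number of steps (hypothetical
    parameter; rapid mixing is what would justify a polynomial choice)."""
    rng = random.Random(seed)
    coloring = greedy_coloring(adj, q)
    for _ in range(steps):
        glauber_step(coloring, adj, q, rng)
    return coloring
```

Each step preserves properness, so the chain stays inside the set of proper colorings; how many steps are needed for the output to be close to uniform is exactly the rapid-mixing question discussed in the text.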
Jerrum’s 1995 result that the Glauber dynamics is rapidly mixing for $q \geq 2\Delta + 1$ gave the first non-trivial randomized approximation algorithm for colorings and led to a plethora of follow-up work on MCMC (see, e.g., [11, 12, 15, 21, 22, 23, 24, 33, 38] and  for a survey), focusing on reducing the constant 2 in front of $\Delta$. The best constant known for general graphs remains essentially $11/6$, obtained by Vigoda using a more sophisticated Markov chain, though this was very recently reduced to $11/6 - \varepsilon$ for a very small $\varepsilon$ by Chen et al. The constant can be substantially improved if additional restrictions are placed on the graph: e.g., Dyer et al. achieve roughly $q \geq 1.49\Delta$ provided the girth is at least 6 and the degree is a large enough constant, while Hayes and Vigoda improve this to $q \geq (1 + \varepsilon)\Delta$ for girth at least 11 and degree $\Delta = \Omega(\log n)$, where $n$ is the number of vertices.
A significant recent development in approximate counting is the emergence of deterministic approximation algorithms that in some cases match, or even improve upon, the best known MCMC algorithms. (In this case, the notion of an FPRAS is replaced by that of a fully polynomial time approximation scheme, or FPTAS. An FPTAS for $q$-colorings of graphs of maximum degree at most $\Delta$ is an algorithm that, given the graph $G$ and an error parameter $\varepsilon$ as input, produces a $(1 \pm \varepsilon)$-factor multiplicative approximation to the number of $q$-colorings of $G$ in time $\mathrm{poly}(|G|, 1/\varepsilon)$; the degree of the polynomial is allowed to depend upon the constants $q$ and $\Delta$.) These algorithms have made use of one of two main techniques: decay of correlations, which exploits the decreasing influence of the spins (colors) on distant vertices on the spin at a given vertex; and polynomial interpolation, which uses the absence of zeros of the partition function in a suitable region of the complex plane. Early examples of the decay of correlations approach include [40, 2, 1], while for early examples of the polynomial interpolation method, we refer to the monograph of Barvinok (see also, e.g., [4, 25, 34, 27, 30, 13] for more recent examples). Unfortunately, however, in the case of colorings on general bounded degree graphs, these techniques have so far lagged well behind the MCMC algorithms mentioned above. One obstacle to getting correlation decay to work is the lack of a higher-dimensional analog of Weitz’s beautiful algorithmic framework, which allows correlation decay to be fully exploited via strong spatial mixing in the case of spin systems with just two spins (as opposed to the $q$ colors present in coloring). For polynomial interpolation, the obstacle has been a lack of precise information about the location of the zeros of associated partition functions (see below for a definition of the partition function in the context of colorings).
So far, the best algorithmic condition for colorings obtained via correlation decay is $q \geq 2.58\Delta + 1$, due to Lu and Yin, and this remains the best available condition for any deterministic algorithm. This improved on an earlier bound of roughly $q \geq 2.78\Delta$ (proved only for triangle-free graphs), due to Gamarnik and Katz. For the special case $\Delta = 3$, Lu et al. give a correlation decay algorithm for counting 4-colorings. Furthermore, Gamarnik, Katz and Misra establish the related property of “strong spatial mixing” under the weaker condition $q \geq \alpha\Delta + \beta$ for any constant $\alpha > \alpha^\star$, where $\alpha^\star \approx 1.7633$ is the unique solution to $x e^{-1/x} = 1$ and $\beta$ is a constant depending on $\alpha$, and under the assumption that $G$ is triangle-free (see also [20, 21] for similar results on restricted classes of graphs). However, as discussed in , this strong spatial mixing result unfortunately does not lead to a deterministic algorithm. (The strong spatial mixing condition does imply fast mixing of the Glauber dynamics, and hence an FPRAS, but only when the graph family being considered is “amenable”, i.e., if the size of the $\ell$-neighborhood of any vertex does not grow exponentially in $\ell$. This restriction is satisfied by regular lattices, but fails, e.g., for random regular graphs.)
The newer technique of polynomial interpolation, pioneered by Barvinok, has also recently been brought to bear on counting colorings. In a recent paper, Bencs et al. use this technique to derive an FPTAS for counting colorings provided $q \geq e\Delta + 1$. This result is of independent interest because it uses a different algorithmic approach, and because it establishes a new zero-free region for the associated partition function in the complex plane (see below), but it is weaker than those obtained via correlation decay.
In this paper, we push the polynomial interpolation method further and obtain an FPTAS for counting colorings under the condition $q \geq 2\Delta$:
Theorem 1.1. Fix positive integers $q$ and $\Delta$ such that $q \geq 2\Delta$. Then there exists a fully polynomial time deterministic approximation scheme (FPTAS) for counting $q$-colorings in any graph of maximum degree $\Delta$.
This is the first deterministic algorithm (of any kind) that for all $\Delta$ matches the “natural” bound for MCMC, first obtained by Jerrum. Indeed, $q \geq 2\Delta + 1$ remains the best bound known for rapid mixing of the basic Glauber dynamics that does not require either additional assumptions on the graph or a spectral comparison with another Markov chain: all the improvements mentioned above require either lower bounds on the girth and/or maximum degree, or (in the case of Vigoda’s result) analysis of a more sophisticated Markov chain. This is for good reason, since the bound $q \geq 2\Delta + 1$ coincides with the closely related Dobrushin uniqueness condition from statistical physics, which in turn is closely related to the path coupling method of Bubley and Dyer that provides the simplest currently known proof of the $q \geq 2\Delta + 1$ bound for the Glauber dynamics.
We therefore view our result as a promising starting point for deterministic coloring algorithms to finally compete with their randomized counterparts. In fact, as discussed later in section 1.3, our technique is capable of directly harnessing strong spatial mixing arguments used in the analysis of Markov chains for certain classes of graphs. As an example, we can exploit such an argument of Gamarnik, Katz and Misra to improve the bound on $q$ in Theorem 1.1 when the graph is triangle-free, for all but small values of $\Delta$. (Recall that $\alpha^\star$ is the unique positive solution of the equation $x e^{-1/x} = 1$.)
Theorem 1.2. For every $\alpha > \alpha^\star$, there exists a $\beta = \beta(\alpha)$ such that the following is true. For all integers $q$ and $\Delta$ such that $q \geq \alpha\Delta + \beta$, there exists a fully polynomial time deterministic approximation scheme (FPTAS) for counting $q$-colorings in any triangle-free graph of maximum degree $\Delta$.
We mention also that our technique applies without further effort to the more general setting of list colorings, where each vertex has a list of allowed colors of size $q$, under the same conditions as above on $q$. Indeed, our proofs are written to handle this more general situation.
In the next subsection we describe our algorithm in more detail.
1.2. Our approach
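Let $G = (V, E)$ be an $n$-vertex graph of maximum degree $\Delta$, and $[q] := \{1, \dots, q\}$ a set of colors. Define the polynomial
$$Z_G(w) := \sum_{\sigma : V \to [q]} w^{m(\sigma)}. \qquad (1)$$
Here $\sigma$ ranges over arbitrary (not necessarily proper) assignments of colors to vertices, and each such coloring has a weight $w^{m(\sigma)}$, where $m(\sigma)$ is the number of monochromatic edges in $\sigma$. Note that the number of proper $q$-colorings of $G$ is just $Z_G(0)$.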
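To make the definition concrete, here is a minimal brute-force sketch that evaluates the polynomial by enumerating all color assignments and weighting each one by $w$ raised to its number of monochromatic edges. The function name and the edge-list representation are choices made for this illustration; the enumeration is exponential in the number of vertices and is meant only for tiny examples.

```python
from itertools import product

def potts_partition(n, edges, q, w):
    """Sum of w**(number of monochromatic edges) over all q**n assignments
    of colors 0..q-1 to vertices 0..n-1 (brute force, tiny graphs only)."""
    total = 0
    for sigma in product(range(q), repeat=n):
        mono = sum(1 for u, v in edges if sigma[u] == sigma[v])
        total += w ** mono  # in Python, 0 ** 0 == 1, so w = 0 keeps proper colorings
    return total

# Triangle with 3 colors: w = 0 counts proper colorings, w = 1 counts all assignments.
triangle = [(0, 1), (1, 2), (0, 2)]
proper = potts_partition(3, triangle, 3, 0)
everything = potts_partition(3, triangle, 3, 1)
```

Because the weight is a power of $w$, the same routine also accepts complex values of `w`, which is the viewpoint taken later when the polynomial is studied in the complex plane.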
$Z_G(w)$ is the partition function of the Potts model of statistical physics, and implicitly defines a probability distribution on colorings $\sigma$ according to their weights in (1). The parameter $w$ measures the strength of nearest-neighbor interactions. The value $w = 1$ corresponds to the trivial setting where there is no constraint on the colors of neighboring vertices, while $w = 0$ imposes the hard constraint that no neighboring vertices receive the same color. For intermediate values $0 < w < 1$, neighbors with the same color are penalized by a factor of $w$. Theorems 1.1 and 1.2 are in fact special cases of the following more general theorem.
Theorem 1.3 of course subsumes Theorems 1.1 and 1.2, but the extension to other values of $w$ is of independent interest, as the computation of partition functions is a very active area of study in statistical physics and combinatorics.
To prove Theorem 1.3, we view $Z_G(w)$ as a polynomial in the complex variable $w$ and identify a region in the complex plane in which $Z_G(w)$ is guaranteed to have no zeros. Specifically, we will show that this holds for the open connected set obtained by augmenting the real interval $[0, 1]$ with a ball of radius $\tau_\Delta$ around each point, where $\tau_\Delta$ is a (small) constant depending only on $\Delta$.
We remark that this theorem is also of independent interest, as the location of zeros of partition functions has a long and noble history going back to the Lee-Yang theorem of the 1950s [29, 41]. In the case of the Potts model, Sokal [36, 37] proved (in the language of the Tutte polynomial) that the partition function has no zeros in the entire unit disk centered at 0, under the strong condition $q \geq 7.964\Delta$; the constant was later improved to 6.907 by Fernández and Procacci (see also ). Much more recently, the work of Bencs et al. referred to above gives a zero-free region analogous to that in Theorem 1.4 above, but under the stronger condition $q \geq e\Delta + 1$. We note also that Barvinok and Soberón (see also  for an improved version) established a zero-free region in a disk centered at $w = 1$.
Theorem 1.4 immediately gives our algorithmic result, Theorem 1.3, by appealing to the recent algorithmic paradigm of Barvinok . The paradigm (see Lemma 2.2.3 of ) states that, for a partition function of degree , if one can identify a connected, zero-free region for in the complex plane that contains a -neighborhood of the interval , and a point on that interval where the evaluation of is easy (in our setting this is the point ), then using the first coefficients of , one can obtain a multiplicative approximation of at any point . Barvinok’s framework is based on exploiting the fact that the zero-freeness of in is equivalent to being analytic in , and then using a carefully chosen transformation to deform into a disk (with the easy point at the center) in order to perform a convergent Taylor expansion. The coefficients of are used to compute the coefficients of this Taylor expansion.
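The core numerical step of this paradigm can be sketched as follows. The example assumes the simplest situation, in which the polynomial already has no zeros in a disk around the easy point (taken to be $0$ here) that contains the evaluation point, so that no deformation of the region is needed; it also assumes real coefficients with a positive constant term. The function names are hypothetical. The coefficients $a_k$ of $\log p$ are obtained from the coefficients $c_k$ of $p$ through the identity $p' = p \cdot (\log p)'$, which gives $k c_k = \sum_{j=1}^{k} j\, a_j\, c_{k-j}$.

```python
import math

def log_taylor_coefficients(c, m):
    """Return [0, a_1, ..., a_m], where log p(z) = log c[0] + sum_k a_k z**k
    and p(z) = sum_k c[k] z**k. Uses k*c_k = sum_{j=1..k} j*a_j*c_{k-j}."""
    def coef(k):
        return c[k] if k < len(c) else 0.0
    a = [0.0] * (m + 1)
    for k in range(1, m + 1):
        s = sum(j * a[j] * coef(k - j) for j in range(1, k))
        a[k] = (k * coef(k) - s) / (k * c[0])
    return a

def approx_log_p(c, z, m):
    """Truncate the Taylor series of log p at order m and evaluate it at z.
    Only meaningful when p has no zeros in a disk of radius > |z| around 0."""
    a = log_taylor_coefficients(c, m)
    return math.log(c[0]) + sum(a[k] * z ** k for k in range(1, m + 1))

# p(z) = (1 + z/3)(1 + z/4) has zeros at -3 and -4, well outside the unit disk.
coeffs = [1.0, 7.0 / 12.0, 1.0 / 12.0]
estimate = math.exp(approx_log_p(coeffs, 1.0, 25))  # compare with p(1) = 5/3
```

Only the first `m` coefficients of `p` are read, and the truncation error decays geometrically in `m` at a rate governed by the distance to the nearest zero; this is why a logarithmic number of coefficients suffices for a multiplicative approximation once a zero-free region is known.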
Barvinok’s framework in general leads to a quasi-polynomial time algorithm as the computation of the terms of the expansion may take time for the partition functions considered here. However, additional insights provided by Patel and Regts  (see, e.g., the proof of Theorem 6.2 in ) show how to reduce this computation time to for many models on bounded degree graphs of degree at most , including the Potts model with a bounded number of colors at each vertex. Hence we obtain an FPTAS. This (by now standard) reduction is the same path as that followed by Bencs et al. [6, Corollary 1.2]; for completeness, we provide a sketch in Appendix A. We note that for each fixed and the running time of our final algorithm is polynomial in (the size of ) and , as required for an FPTAS. However, as is typical of deterministic algorithms for approximate counting, the exponent in the polynomial depends on (through the quantity in Theorem 1.4, which in the case of bounded list sizes is inverse polynomial in ).
We end this introduction by sketching our approach to proving Theorem 1.4, which is the main contribution of the paper.
1.3. Technical overview
The starting point of our proof is a simple geometric observation, versions of which have been used before for constructing inductive proofs of zero-freeness of partition functions (see, e.g., [3, 6]). Fix a vertex in the graph . Given , and a color , let denote the restricted partition function in which one only includes those colorings in which . Then, since , the zero-freeness of will follow if the angles between the complex numbers
, viewed as vectors in, are all small, and provided that at least one of the is non-zero. (In fact, this condition on angles can be relaxed for those that are sufficiently small in magnitude, and this flexibility is important when is a complex number close to .) Therefore, one is naturally led to consider so-called marginal ratios:
The broad contours of our approach as outlined so far are quite similar to some recent work [3, 6]. However, it is at the crucial step of how the marginal ratios are analyzed that we depart from these previous results. Instead of attacking the restricted partition functions or the marginal ratios directly for given , as in these previous works, we crucially exploit the fact that for any close to the given , these quantities have natural probabilistic interpretations, and hence can be much better understood via probabilistic and combinatorial methods. For instance, when , the marginal ratio is in fact a ratio of the marginal probabilities and , under the natural probability distribution on colorings . In fact, our analysis cleanly breaks into two separate parts:
First, understand the behavior of true marginal probabilities of the form for . This is carried out in section 3.
A key point in our technical analysis is the notion of “niceness” of vertices, which stipulates that the marginal probability should be small enough in terms only of the local neighborhood of vertex in (see Definition 3.1). Note that this condition refers only to real non-negative , and hence is amenable to analysis via standard combinatorial tools. Indeed, our proofs that the conditions on and in Theorems 1.2 and 1.1 imply this niceness condition are very similar to probabilistic arguments used by Gamarnik et al.  to establish the property of “strong spatial mixing” (in the special case ). We emphasize that this is the only place in our analysis where the lower bounds on are used. One can therefore expect that combinatorial and probabilistic ideas used in the analysis of strong spatial mixing and the Glauber dynamics with smaller number of colors in special classes of graphs can be combined with our analysis to obtain deterministic algorithms for those settings, as we have demonstrated in the case of .
The above ideas are sufficient to understand the real-valued case (part 1 above). For the complex case in part 2, we start from a recurrence for the marginal ratios that is a generalization (to the case ) of a similar recurrence used by Gamarnik et al.  (see Lemma 2.4). The inductive proofs in sections 5 and 4 use this recurrence to show that, if is close to , then all the relevant remain close to throughout. The actual induction, especially in the case when is close to , requires a delicate choice of induction hypotheses (see Lemmas 5.3 and 4.2). The key technical idea is to use the “niceness” property of vertices established in part 1 to argue that the two recurrences (real and complex) remain close at every step of the induction. This in turn depends upon a careful application of the mean value theorem, separately to the real and imaginary parts (see Lemma 2.5), of a function that arises in the analysis of the recurrence (see Lemma 2.6).
1.4. Comparison with correlation-decay based algorithms
We conclude this overview with a brief discussion of how we are able to obtain a better bound on the number of colors than in correlation decay algorithms, such as [18, 31] cited earlier. In these algorithms, one first uses recurrences similar to the one mentioned above to compute the marginal probabilities, and then appeals to self-reducibility to compute the partition function. Of course, expanding the full tree of computations generated by the recurrence will in general give an exponential time (but exact) algorithm. The core of the analysis of these algorithms is to show that even if this tree of computations is only expanded to depth about , and the recurrence at that point is initialized with arbitrary values, the computation still converges to an -approximation of the true value. However, the requirement that the analysis be able to deal with arbitrary initializations implies that one cannot directly use properties of the actual probability distribution (e.g., the “niceness” property alluded to above); indeed, this issue is also pointed out by Gamarnik et al. . In contrast, our analysis does not truncate the recurrence, and thus only has to handle initializations that make sense in the context of the graph being considered. Moreover, the exponential size of the recursion tree is no longer a barrier since, in contrast to correlation decay algorithms, we are using the tree only as a tool to establish zero-freeness; the algorithm itself follows from Barvinok’s polynomial interpolation paradigm. Our approach suggests that this paradigm can be viewed as a method for using (complex-valued generalizations of) strong spatial mixing results to obtain deterministic algorithms.
2.1. Colorings and the Potts model
Throughout, we assume that the graphs that we consider are augmented with a list of colors for every vertex. Formally, a graph is a triple $G = (V, E, L)$, where $V$ is the vertex set, $E$ is the edge set, and $L$ specifies a list $L(v)$ of colors for every vertex $v$. The partition function as defined in the introduction generalizes naturally to this setting: the sum is now over all those colorings $\sigma$ which satisfy $\sigma(v) \in L(v)$ for every vertex $v$.
We also allow graphs to contain pinned vertices: a vertex $v$ is said to be pinned to a color $c$ if only those colorings of $G$ are allowed in which $v$ has color $c$. Suppose that a vertex $v$ of degree $d$ in a graph $G$ is pinned to a color $c$, and consider the graph $G'$ obtained by replacing $v$ with $d$ copies of itself, each of which is pinned to $c$ and connected to exactly one of the original neighbors of $v$ in $G$. It is clear that $Z_{G'}(w) = Z_G(w)$ for all $w$. We will therefore assume that all pinned vertices in our graphs have degree exactly one. The size of a graph $G$, denoted as $|G|$, is defined to be the number of unpinned vertices. It is worth noting that the above operation of duplicating pinned vertices does not change the size of the graph.
Let be a graph and an unpinned vertex in . A color in the list of is said to be good for if for every pinned neighbor of is pinned to a color different from . The set of good colors for a vertex in graph is denoted . We sometimes omit the graph and write when is clear from the context. A color that is not in is called bad for . Further, given a graph with possibly pinned vertices, we say that the graph is unconflicted if no two neighboring vertices in are pinned to the same color. Note that since all pinned vertices have degree exactly one, each conflicted graph is the direct sum of an unconflicted graph and a collection of disjoint, conflicted edges.
We will assume throughout that all unconflicted graphs we consider have at least one proper coloring: this will be guaranteed in our applications since we will always have for every unpinned vertex in .
For a graph , a vertex and a color , the restricted partition function is the partition function restricted to colorings in which the vertex receives color .
Let be a formal variable. For any , a vertex and colors , we define the marginal ratio of color to color as Similarly we also define formally the corresponding pseudo marginal probability as
Note that when a numerical value is substituted in place of in the above formal definition, is numerically well-defined as long as , and is numerically well-defined as long as . In the proof of the main theorem in sections 5 and 4, we will ensure that the above definitions are numerically instantiated only in cases where the corresponding conditions for such an instantiation to be well-defined, as stated above, are satisfied. For instance, when , this is the case for the first definition when either (i) ; or (ii) , but is unconflicted and ; while for the second definition, this is the case when either (i) ; or (ii) , but is unconflicted.
Note also that when , the pseudo probabilities, if well-defined, are actual marginal probabilities. In this case, we will also write as . For arbitrary complex , this interpretation as probabilities is of course not valid (since can be non-real), but provided that it is still true that We also note that if is pinned to color , then is when and when .
For the case we will sometimes shorten the notations and to and respectively.
Definition 2.3 (The graphs ).
Given a graph and a vertex in , let be the neighbors of . We define (the vertex will be understood from the context) to be the graph obtained from as follows:
first we replace vertex with , and connect to , to , and so on;
next we pin vertices to color , and vertices to color ;
finally we remove the vertex .
Note that the graph has one fewer unpinned vertex than .
We now derive a recurrence relation between the marginal ratios of the graph $G$ and pseudo marginal probabilities of the graphs of Definition 2.3. This is an extension to the Potts model of a similar recurrence relation derived by Gamarnik, Katz and Misra for the special case of colorings (that is, $w = 0$).
Let be a formal variable. For a graph , a vertex and colors , we have
where we define . In particular, when a numerical value is substituted in place of , the above recurrence is valid as long as the quantities and for are all non-zero.
For , let be the graph obtained from as follows:
first we replace vertex with , and connect to , to , and so on;
we then pin vertices to color , and vertices to color .
Note that is the same as , except that the last step of the construction of is skipped, i.e, the vertex is not removed, and, further, is pinned to color . We can now write
Next, for , let and . We observe that
Therefore we have
where . The claim about the validity of the recurrence on numerical substitution then follows from the conditions outlined in Definition 2.2. ∎
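The recurrence can be sanity-checked numerically on a small list-coloring instance. The sketch below assumes that it takes the product form $\prod_{k=1}^{d} \frac{1 - \gamma P_k[i]}{1 - \gamma P_k[j]}$ with $\gamma = 1 - w$, where $P_k[c]$ is the marginal probability that the $k$-th neighbor of $v$ receives color $c$ in the $k$-th graph of Definition 2.3 (this form is a reconstruction from the telescoping argument in the proof, not a quotation). Pinned vertices are modeled as vertices with a one-color list, and all function names are hypothetical.

```python
from itertools import product

def Z(lists, edges, w):
    """Potts partition function with per-vertex color lists (brute force)."""
    verts = list(lists)
    total = 0.0
    for colors in product(*(lists[x] for x in verts)):
        sigma = dict(zip(verts, colors))
        mono = sum(1 for a, b in edges if sigma[a] == sigma[b])
        total += w ** mono
    return total

def marginal(lists, edges, w, v, c):
    """Probability that v receives color c under the Potts weights."""
    if c not in lists[v]:
        return 0.0
    restricted = dict(lists)
    restricted[v] = [c]
    return Z(restricted, edges, w) / Z(lists, edges, w)

def direct_ratio(lists, edges, w, v, i, j):
    """Marginal ratio computed from the two restricted partition functions."""
    li, lj = dict(lists), dict(lists)
    li[v], lj[v] = [i], [j]
    return Z(li, edges, w) / Z(lj, edges, w)

def ratio_via_recurrence(lists, edges, w, v, i, j):
    """Same ratio via the assumed product recurrence over the neighbors of v."""
    nbrs = [b if a == v else a for a, b in edges if v in (a, b)]
    rest_edges = [e for e in edges if v not in e]
    rest_lists = {u: cs for u, cs in lists.items() if u != v}
    gamma = 1 - w
    result = 1.0
    for k, u_k in enumerate(nbrs):
        g_lists, g_edges = dict(rest_lists), list(rest_edges)
        for t, u_t in enumerate(nbrs):
            if t == k:
                continue  # the k-th copy of v is the one that is removed
            pin = ("pin", t)  # copy of v: pinned to i before k, to j after k
            g_lists[pin] = [i] if t < k else [j]
            g_edges.append((pin, u_t))
        p_i = marginal(g_lists, g_edges, w, u_k, i)
        p_j = marginal(g_lists, g_edges, w, u_k, j)
        result *= (1 - gamma * p_i) / (1 - gamma * p_j)
    return result
```

On instances where all denominators are non-zero, the two functions should agree up to floating-point error; unequal lists are used in practice so that the ratio is not trivially equal to 1 by color symmetry.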
2.2. Complex analysis
In this subsection we collect some tools and observations from complex analysis. Throughout this paper, we use $\iota$ to denote the imaginary unit $\sqrt{-1}$, in order to avoid confusion with the symbol “$i$” used for other purposes. For a complex number $z = a + \iota b$ with $a, b \in \mathbb{R}$, we denote its real part $a$ as $\Re z$, its imaginary part $b$ as $\Im z$, its length $\sqrt{a^2 + b^2}$ as $|z|$, and, when $z \neq 0$, its argument as $\arg z \in (-\pi, \pi]$. We also generalize the notation $[x, y]$ used for closed real intervals to the case when $x, y \in \mathbb{C}$, and use it to denote the closed straight line segment joining $x$ and $y$.
We start with a consequence of the mean value theorem for complex functions, specifically tailored to our application. Let be any domain in with the following properties.
For any , .
For any , there exists a point such that one of the numbers has zero real part while the other has zero imaginary part.
If are such that either or , then the segment lies in .
We remark that a rectangular region symmetric about the real axis will satisfy all the above properties.
Lemma 2.5 (Mean value theorem for complex functions).
Let be a holomorphic function on such that for , has the same sign as . Suppose further that there exist positive constants and such that
for all , ;
for all , .
Then for any , there exists such that
Due to space considerations, we defer the proof to Appendix B.
We will apply the above lemma to the function
which, as we shall see later, will play a central role in our proofs. (We note that here, and also later in the paper, we use to denote the principal branch of the complex logarithm; i.e., if with and , then .) Below we verify that such an application is valid, and record the consequences.
Consider the domain given by
where and are positive real numbers such that . Suppose and consider the function as defined in eq. 2. Then,
The function and the domain satisfy the hypotheses of Lemma 2.5, if and in the statement of the theorem are taken to be and , respectively.
If and are such that and , then for any ,
In particular, we note that the domain is indeed rectangular and symmetric about the real axis. Due to space considerations, we defer the proof to Appendix C.
Let be complex numbers such that the angle between any two non-zero is at most . Then .
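The following sketch tests numerically the standard form of such an angle bound, namely $\left|\sum_i z_i\right| \geq \cos(\alpha/2) \sum_i |z_i|$ whenever all pairwise angles are at most $\alpha$. That this is the exact inequality intended above is an assumption; the inequality itself follows by projecting every $z_i$ onto the bisector of the cone containing them. The function name and sampling scheme are illustrative only.

```python
import cmath
import math
import random

def check_angle_bound(alpha, n, trials, seed=0):
    """Sample n complex numbers inside a cone of opening angle alpha and
    check |sum z_i| >= cos(alpha / 2) * sum |z_i| on every trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        base = rng.uniform(-math.pi, math.pi)  # direction of one edge of the cone
        zs = [cmath.rect(rng.uniform(0.0, 2.0), base + rng.uniform(0.0, alpha))
              for _ in range(n)]
        lhs = abs(sum(zs))
        rhs = math.cos(alpha / 2) * sum(abs(z) for z in zs)
        if lhs < rhs - 1e-12:  # small slack for floating-point error
            return False
    return True
```

In the zero-freeness argument, a bound of this kind is what turns "all restricted partition functions point in nearly the same direction" into "their sum cannot vanish".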
3. Properties of the real-valued recurrence
In this section we prove some basic properties of the real-valued recurrence established in Lemma 2.4, that is, in the case where is real (and hence, ).
We remark that in all graphs appearing in our analysis, we will be able to assume that for any unpinned vertex in , . Thus, whenever either (i) ; or (ii) , but is unconflicted. As discussed in the previous section, this implies that the marginal ratios and the pseudo marginal probabilities are well-defined, and, further, the latter are actual probabilities. Moreover, if is not connected, and is the connected component containing , then we have and . Thus without loss of generality, we will only consider connected graphs in this section.
We now formally state the conditions on the list sizes under which our main theorem holds.
Condition 1 (Large lists).
The graph satisfies at least one of the following two conditions.
for each unpinned vertex in .
The graph is triangle-free and further, for each vertex of ,
where is any fixed constant larger than the unique positive solution of the equation and is a constant chosen so that . We note that lies in the interval , and as chosen above is at least .
Note that the condition imposed in case 1 above is without loss of generality, since any vertex with can be removed from after removing the unique color in its list from the lists of its neighbors, without changing the number of colorings of .
As stated in the introduction, an important element of our analysis is going to be the fact that under Condition 1, one can show that certain vertices are “nice” in the sense of the following definition. We emphasize that Condition 1 is ancillary to our main technical development: any condition under which the probability bounds imposed in the following definition can be proved (as is done in Lemma 3.2 below) will be sufficient for the analysis.
Given a graph and an unpinned vertex in , let be the number of unpinned neighbors of . We say the vertex is nice in if for any and any color ,
We adopt the convention that if is a conflicted graph (so that it has no proper colorings) and , then for every color and every unpinned vertex in . This is just to simplify the presentation in this section by avoiding the need to explicitly exclude this case from the lemmas below. In the proof of our main result in sections 5 and 4, we will never consider conflicted graphs in a situation where could be , so that this convention will then be rendered moot.
If satisfies Condition 1 then for any vertex in , and any unpinned neighbor of , we have that is nice in .
We prove this lemma separately for each of the two cases in Condition 1.
3.1. Analysis for case 1 of Condition 1
Consider any valid coloring333Here, we say that a coloring is valid if the color assigns to any vertex is from , and further, in case , no two neighbors are assigned the same color by . of the neighbors of in . For , let denote the number of neighbors of that are colored in . Then for any and ,
since at most of the can be positive. Note in particular that if is not a good color for in , then the probability is . Since this holds for any coloring , we have Now, let be the number of unpinned neighbors of in . Noting that , and recalling the observation above that , we thus have
Thus is nice in . ∎
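The counting step in this argument can be checked by brute force on a small instance. The sketch assumes the bound has the form: the marginal probability of any color at a vertex $u$ is at most $1/(|L(u)| - \deg(u))$ for every $w \in [0, 1]$, since conditioned on the neighbors' colors the probability of color $i$ is $w^{n_i} / \sum_j w^{n_j}$ and at most $\deg(u)$ of the $n_j$ are positive. The precise graph in which the paper applies this bound is not reproduced here, and the function names are hypothetical.

```python
from itertools import product

def list_partition(lists, edges, w):
    """Potts partition function with per-vertex color lists (brute force)."""
    verts = list(lists)
    total = 0.0
    for colors in product(*(lists[x] for x in verts)):
        sigma = dict(zip(verts, colors))
        mono = sum(1 for a, b in edges if sigma[a] == sigma[b])
        total += w ** mono
    return total

def max_marginal(lists, edges, w, u):
    """Largest marginal probability of any single color at vertex u."""
    z = list_partition(lists, edges, w)
    best = 0.0
    for c in lists[u]:
        restricted = dict(lists)
        restricted[u] = [c]
        best = max(best, list_partition(restricted, edges, w) / z)
    return best

def degree(edges, u):
    """Number of edges incident to u."""
    return sum(1 for a, b in edges if u in (a, b))

# Triangle with unequal lists; vertex 0 has 4 colors and degree 2.
lists = {0: [0, 1, 2, 3], 1: [0, 1, 2], 2: [1, 2, 3]}
edges = [(0, 1), (0, 2), (1, 2)]
bound = 1.0 / (len(lists[0]) - degree(edges, 0))
worst = max(max_marginal(lists, edges, w, 0) for w in (0.0, 0.25, 0.5, 1.0))
```

The bound depends only on the list size and degree of the vertex itself, which is the sense in which niceness is a purely local condition.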
3.2. Analysis for case 2 of Condition 1
Proof of Lemma 3.2.
We conclude this section by noting that the niceness condition can be strengthened in the case when all the list sizes are uniformly large (e.g., as in the case of $q$-colorings).
In Condition 1, if we replace the degree of a vertex by the maximum degree (e.g., in case 1 of the condition, if we assume , instead of , for each ), then for every vertex in the graph , it holds that .
To see this, notice that the same calculation as in the proof of Lemma 3.3 above gives We will refer to this stronger condition on list sizes (which holds, in particular, when one is considering the case of -colorings), as the uniformly large list size condition.
4. Zero-free region for small $|w|$
As explained in the introduction, all our algorithmic results follow from Theorem 1.4, which establishes a zero-free region for the partition function $Z_G(w)$ around the interval $[0, 1]$ in the complex plane. We split the proof of Theorem 1.4 into two parts: in this section, we establish the existence of a zero-free disk around the endpoint $w = 0$ (see Theorem 4.1); this is the most delicate case because $w = 0$ corresponds to proper colorings. Then in section 5 (see Theorem 5.1) we derive a zero-free region around the remainder of the interval, using a similar but less delicate approach. Taken together, Theorems 4.1 and 5.1 immediately imply Theorem 1.4, so this will conclude our analysis.
Fix a positive integer . There exists a such that the following is true. Let be a graph of maximum degree satisfying Condition 1, and having no pinned vertices. Then, for any satisfying .
In the proof, we will encounter several constants which we now fix. Given the degree bound , we define
We will then see that the quantity in the statement of the theorem can be chosen to be . (In fact, we will show that if one has the slightly stronger assumption of uniformly large list sizes considered at the end of section 3, then can be chosen to be ).
Throughout the rest of this section, we fix to be the maximum degree of the graphs, and let be as above.
We now briefly outline our strategy for the proof. Recall that, for a vertex and colors , the marginal ratio is given by When is an unconflicted graph, is always a well-defined non-negative real number. Intuitively, we would like to show that , independent of the size of , when is close to . Given such an approximation one can use a simple geometric argument (see Consequence 4.3) to conclude that the partition function does not vanish for such . In order to prove the above approximate equality inductively for a given graph , we take an approach that exploits the properties of the “real” case (i.e., of ) and then uses the notion of “niceness” of certain vertices described earlier to control the accumulation of errors. To this end, we will prove the following lemma via induction on the number of unpinned vertices in . Theorem 4.1 will follow almost immediately from the lemma; see the end of this section for the details.
For , .
For , if has all neighbors pinned, then .
For , if has unpinned neighbors, then
For any , if has unpinned neighbors, we have .
For any , then .
We begin by verifying that the induction hypothesis holds in the base case when is the only unpinned vertex in an unconflicted graph . In this case, items 4 and 3 are vacuously true since has no unpinned neighbors. Since all neighbors of in are pinned, the fact that all pinned vertices have degree at most one implies that can be decomposed into two disjoint components and , where consists of and its pinned neighbors, while consists of a disjoint union of unconflicted edges (since is unconflicted). Now, since and are disjoint components, we have for all and all . This proves items 2 and 1. Similarly, when , we have , where is the number of neighbors of pinned to color . This gives
since , and proves item 5.
We now derive some consequences of the above induction hypothesis that will be helpful in carrying out the induction. Throughout, we assume that is an unconflicted graph satisfying Condition 1.
Note that . From item 4, we see that the angle between the complex numbers and , when , is at most . Applying Lemma 2.7 to the terms corresponding to the good colors and item 5 to the terms corresponding to the bad colors, we then have
where we use the fact that in the second inequality, and in the last inequality. Since and , we then have , which in turn is positive from item 1. ∎
The pseudo-probabilities approximate the real probabilities in the following sense:
for any , .
for any ,