Stability of Talagrand's influence inequality

09/26/2019
by   Ronen Eldan, et al.
Weizmann Institute of Science

We strengthen several classical inequalities concerning the influences of a Boolean function, showing that near-maximizers must have large vertex boundaries. An inequality due to Talagrand states that for a Boolean function $f$, $\mathrm{var}(f) \le C \sum_{i=1}^n \mathrm{Inf}_i(f) / \left(1 + \log(1/\mathrm{Inf}_i(f))\right)$, where $\mathrm{Inf}_i(f)$ denotes the influence of the $i$-th coordinate. We give a lower bound for the size of the vertex boundary of functions saturating this inequality. As a corollary, we show that for sets that satisfy the edge-isoperimetric inequality or the Kahn-Kalai-Linial inequality up to a constant, a constant proportion of the mass is in the inner vertex boundary. Our proofs rely on new techniques, based on stochastic calculus, and bypass the use of hypercontractivity common to previous proofs.


1 Introduction.

The influence of a Boolean function $f \colon \{-1,1\}^n \to \{-1,1\}$ in direction $i$ is defined as

$$\mathrm{Inf}_i(f) = \mu\left(f(x) \neq f(\sigma_i x)\right),$$

where $\sigma_i x$ is the same as $x$ but with the $i$-th bit flipped, and $\mu$ is the uniform measure on the discrete hypercube $\{-1,1\}^n$. The expectation and variance of a function are given by

$$\mathbb{E}f = \int f \, d\mu, \qquad \mathrm{var}(f) = \mathbb{E}f^2 - \left(\mathbb{E}f\right)^2.$$

The Poincaré inequality gives an immediate relation between the aforementioned quantities, namely,

$$\mathrm{var}(f) \le \sum_{i=1}^n \mathrm{Inf}_i(f).$$
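To make these definitions concrete, here is a minimal Python sketch (ours, not from the paper; the helper names `influences` and `variance` are our own) that computes the influences and the variance of a Boolean function by enumeration over the cube, and checks the Poincaré inequality for the 3-bit majority function, for which every influence equals $1/2$.

```python
from itertools import product

def influences(f, n):
    """Inf_i(f): the probability, under the uniform measure on {-1,1}^n,
    that flipping the i-th bit changes the value of f."""
    infs = []
    for i in range(n):
        count = 0
        for x in product([-1, 1], repeat=n):
            y = list(x); y[i] = -y[i]          # flip the i-th bit
            if f(x) != f(tuple(y)):
                count += 1
        infs.append(count / 2**n)
    return infs

def variance(f, n):
    """var(f) = E[f^2] - (E[f])^2 under the uniform measure."""
    vals = [f(x) for x in product([-1, 1], repeat=n)]
    mean = sum(vals) / 2**n
    return sum(v * v for v in vals) / 2**n - mean**2

maj3 = lambda x: 1 if sum(x) > 0 else -1       # 3-bit majority
infs = influences(maj3, 3)
print(infs, variance(maj3, 3))
assert variance(maj3, 3) <= sum(infs)          # Poincare inequality
```

Coordinate $i$ of majority is pivotal exactly when the other two bits disagree, which happens with probability $1/2$, so the Poincaré bound here reads $1 \le 3/2$.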
This inequality in fact holds for any function, and it is natural to ask whether this can be improved when Boolean functions are considered. A breakthrough paper by Kahn, Kalai and Linial (KKL) [5, Theorem 3.1] showed that this inequality can be improved logarithmically (see below), and their inequality was later generalized by Talagrand [9] who proved the following:

Theorem 1.

There exists an absolute constant $C > 0$ such that for every $f \colon \{-1,1\}^n \to \{-1,1\}$,

$$\mathrm{var}(f) \le C \sum_{i=1}^{n} \frac{\mathrm{Inf}_i(f)}{1 + \log\left(1/\mathrm{Inf}_i(f)\right)}. \qquad (1)$$

It is known that this inequality is sharp, in the sense that for any admissible sequence of influences there exist functions attaining it up to a constant factor [7].
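The logarithmic denominators in (1) are what makes the inequality stronger than Poincaré when all influences are small. The following sketch (ours; `talagrand_rhs` is a name we introduce) compares the plain sum of influences with the right-hand side of (1), without the constant $C$, for the 5-bit majority function, whose influences all equal $3/8$.

```python
import math
from itertools import product

def influences(f, n):
    """Inf_i(f) by enumeration over the cube."""
    infs = []
    for i in range(n):
        cnt = 0
        for x in product([-1, 1], repeat=n):
            y = list(x); y[i] = -y[i]
            if f(x) != f(tuple(y)):
                cnt += 1
        infs.append(cnt / 2**n)
    return infs

def talagrand_rhs(infs):
    """Sum of Inf_i / (1 + log(1/Inf_i)): the right-hand side of (1) without C."""
    return sum(I / (1 + math.log(1 / I)) for I in infs if I > 0)

maj5 = lambda x: 1 if sum(x) > 0 else -1
infs = influences(maj5, 5)
print(sum(infs), talagrand_rhs(infs))   # the second quantity is strictly smaller
```

For functions with influences as small as $\log n / n$, the gap between the two sides grows by a factor of order $\log n$, which is exactly the KKL improvement.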

Talagrand’s original proof of this theorem, as well as later proofs (see e.g. [2]), all rely on the hypercontractive principle. In this paper, we give a stochastic-analysis proof of Theorem 1 which bypasses the use of hypercontractivity and, in fact, uses classical Boolean Fourier analysis only sparingly.

The vertex boundary of a function $f \colon \{-1,1\}^n \to \{-1,1\}$ is defined as

$$\partial f = \left\{x : f(\sigma_i x) \neq f(x) \text{ for some } i\right\},$$

and is the disjoint union of the inner vertex boundary,

$$\partial^{-} f = \partial f \cap \{f = 1\},$$

and the outer vertex boundary,

$$\partial^{+} f = \partial f \cap \{f = -1\}.$$
The main contribution of this paper is the following result, which shows that if near-equality is attained in equation (1), then both the inner and outer vertex boundaries of $f$ are large.

Theorem 2.

Let , and denote . There exists an absolute constant such that

In fact, a relation between influences, variance and vertex boundary was also shown in a second paper of Talagrand [11]. In the same paper, Talagrand puts forth a conjecture which is partially settled by our results. We discuss this connection below.

We conclude by applying Theorem 2 to two related functional inequalities, the edge-isoperimetric inequality and the KKL inequality, showing that when either inequality is tight up to a constant, the function must have a large vertex boundary.

Let $f \colon \{-1,1\}^n \to \{-1,1\}$ with $\mathbb{E}f \le 0$, and let $A = \{x : f(x) = 1\}$ be the support of the indicator $(1+f)/2$, so that $\mu(A) \le 1/2$. The edge-isoperimetric inequality [4, section 3] states that

$$\sum_{i=1}^n \mathrm{Inf}_i(f) \ge 2 \mu(A) \log_2\left(1/\mu(A)\right),$$

with equality if and only if $A$ is a subcube. Our first corollary states that if this inequality is tight up to a constant multiplicative factor, then a constant proportion of the set is in its inner vertex boundary.
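A quick numerical check of the equality case (ours, not from the paper; `total_influence` is a name we introduce): for a subcube obtained by fixing $k$ coordinates, each fixed coordinate has influence $2\mu(A)$ and each free coordinate has influence $0$, so the total influence is exactly $2\mu(A)\log_2(1/\mu(A))$.

```python
import math
from itertools import product

def total_influence(f, n):
    """Sum over i of Inf_i(f), computed by enumeration."""
    tot = 0
    for i in range(n):
        for x in product([-1, 1], repeat=n):
            y = list(x); y[i] = -y[i]
            if f(x) != f(tuple(y)):
                tot += 1
    return tot / 2**n

n = 4
# Subcube obtained by fixing the first two coordinates to +1, so mu(A) = 1/4.
subcube = lambda x: 1 if x[0] == 1 and x[1] == 1 else -1
mu = 1 / 4
lhs = total_influence(subcube, n)
rhs = 2 * mu * math.log2(1 / mu)
print(lhs, rhs)   # equal for a subcube
```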

Corollary 3.

Let Then there exists a constant depending only on such that

Proof.

As in Theorem 2, denote . Observe that for every index , . Since , we have

This gives a bound on :

Thus, by Theorem 2, there exists a constant such that

The KKL theorem [5, Theorem 3.1] states that a Boolean function must have a variable with a relatively large influence: There exists an absolute constant $c > 0$ such that for every $f \colon \{-1,1\}^n \to \{-1,1\}$, there exists an index $i$ with

$$\mathrm{Inf}_i(f) \ge c \, \mathrm{var}(f) \frac{\log n}{n}.$$
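The classical near-extremal example for the KKL bound is the tribes function, an OR of ANDs over disjoint blocks, whose influences are all of order $\log n / n$. The sketch below (ours, not from the paper) computes its influences for three tribes of width three and compares the maximal influence with $\mathrm{var}(f)\log n / n$; the two quantities come out the same up to a small constant factor.

```python
import math
from itertools import product

def influences(f, n):
    """Inf_i(f) by enumeration over the whole cube."""
    infs = []
    for i in range(n):
        cnt = 0
        for x in product([-1, 1], repeat=n):
            y = list(x); y[i] = -y[i]
            if f(x) != f(tuple(y)):
                cnt += 1
        infs.append(cnt / 2**n)
    return infs

n, w = 9, 3   # three tribes of width three

def tribes(x):
    # f(x) = 1 iff some tribe has all of its bits equal to 1
    return 1 if any(all(b == 1 for b in x[t:t + w])
                    for t in range(0, n, w)) else -1

infs = influences(tribes, n)
vals = [tribes(x) for x in product([-1, 1], repeat=n)]
mean = sum(vals) / 2**n
var = 1 - mean**2                         # var of a {-1,1}-valued function
print(max(infs), var * math.log(n) / n)   # same order of magnitude
```

Here a bit is pivotal iff the other two bits of its tribe equal $1$ and no other tribe fires, so every influence equals $(1/4)(7/8)^2 = 49/256$.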

Our second corollary states that if all influences are of order $\log n / n$, then the function must have a large (inner and outer) vertex boundary.

Corollary 4.

Suppose that for some , we have for all . Then there exists a constant depending only on such that

Proof.

In this case, we have

Thus, by Theorem 2, there exists a constant such that

A relation to a conjecture by Talagrand

For a point $x \in \{-1,1\}^n$, denote by $h_f(x)$ the number of points $y$ which differ from $x$ in exactly one coordinate and satisfy $f(y) \neq f(x)$. In [11], Talagrand conjectured the relation

and proved a similar inequality, but with a different power on the logarithmic term. An application of the Cauchy-Schwarz inequality to the left-hand side of the above display would give

(2)

The result of Theorem 2 can be written as,

In some regimes, the last inequality implies (2), but it is not clear to us whether one of them is stronger than the other in general.

Acknowledgements

R.E. would like to thank Noam Lifshitz for useful discussions and in particular for pointing out the possible application to stability of the isoperimetric inequality.

2 Background and notation

2.1 Boolean functions

For a general introduction to Boolean functions, see [8]; in what follows, we provide a brief overview of the required background and notation.

Every Boolean function may be uniquely written as a sum of monomials:

$$f(x) = \sum_{S \subseteq [n]} \hat{f}(S) \prod_{i \in S} x_i, \qquad (3)$$

where $\hat{f}(S) = \mathbb{E}\left[f(x) \prod_{i \in S} x_i\right]$. Equation (3) may be used to extend a function’s domain from the discrete hypercube to real space $\mathbb{R}^n$. We call this the harmonic extension, and denote it also by $f$. Under this notation, $f(0) = \mathbb{E}f$. In general, for $y \in [-1,1]^n$, the harmonic extension is a convex combination of $f$’s values on all the points $x \in \{-1,1\}^n$:

$$f(y) = \sum_{x \in \{-1,1\}^n} f(x) \prod_{i=1}^n \frac{1 + x_i y_i}{2}, \qquad (4)$$

where the weights $\prod_{i=1}^n \frac{1 + x_i y_i}{2}$ are nonnegative and sum to $1$.
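The convex-combination formula (4) can be checked directly in code. The following sketch (ours; `harmonic_extension` is a name we introduce) evaluates the multilinear extension of the 3-bit majority function: at the center of the cube it returns $\mathbb{E}f$, and at a vertex it agrees with $f$.

```python
from itertools import product

def harmonic_extension(f, n, y):
    """Evaluate the harmonic (multilinear) extension of f at y in [-1,1]^n
    via the convex-combination formula (4)."""
    total = 0.0
    for x in product([-1, 1], repeat=n):
        w = 1.0
        for xi, yi in zip(x, y):
            w *= (1 + xi * yi) / 2      # nonnegative weights summing to 1
        total += f(x) * w
    return total

maj3 = lambda x: 1 if sum(x) > 0 else -1
print(harmonic_extension(maj3, 3, (0, 0, 0)))    # = E[f] = 0 for majority
print(harmonic_extension(maj3, 3, (1, 1, -1)))   # agrees with f on the cube
```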

The derivative of a function $f$ in direction $i$ is defined as

$$\partial_i f(x) = \frac{f(x^{i \to 1}) - f(x^{i \to -1})}{2},$$

where $x^{i \to a}$ has $a$ at coordinate $i$, and is identical to $x$ at all other coordinates. The gradient is then defined as $\nabla f = (\partial_1 f, \dots, \partial_n f)$. A function is called monotone if $f(x) \le f(y)$ whenever $x_i \le y_i$ for all $i$. Similar to the function $f$, we denote the harmonic extension of $\partial_i f$ also by $\partial_i f$, and think of it as a function on $[-1,1]^n$.

A short calculation reveals the following properties of the derivative:

  1. The harmonic extension of the derivative $\partial_i f$ is equal to the partial derivative (in the sense of real differentiability) of the harmonic extension of $f$.

  2. For functions whose range is $\{-1,1\}$, the derivative takes values in $\{-1, 0, 1\}$, and the influence of the $i$-th coordinate of $f$ is given by

    $$\mathrm{Inf}_i(f) = \mathbb{E}\left[(\partial_i f)^2\right] = \mathbb{E}\left[\left|\partial_i f\right|\right]. \qquad (5)$$
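As a sanity check of (5), the sketch below (ours; `discrete_derivative` is a name we introduce) computes $\partial_i f$ pointwise for the 3-bit majority function and recovers $\mathrm{Inf}_0(f) = \mathbb{E}[(\partial_0 f)^2] = 1/2$.

```python
from itertools import product

def discrete_derivative(f, i, x):
    """(f(x with x_i = 1) - f(x with x_i = -1)) / 2; takes values
    in {-1, 0, 1} when f is {-1,1}-valued."""
    up = list(x); up[i] = 1
    dn = list(x); dn[i] = -1
    return (f(tuple(up)) - f(tuple(dn))) / 2

maj3 = lambda x: 1 if sum(x) > 0 else -1
n = 3
pts = list(product([-1, 1], repeat=n))
# Influence via (5): Inf_i(f) = E[(d_i f)^2]
inf0 = sum(discrete_derivative(maj3, 0, x)**2 for x in pts) / 2**n
print(inf0)   # 0.5 for 3-bit majority
```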

2.2 Stochastic processes

For a general introduction to stochastic processes and Poisson processes, see [3] and [6].

A Poisson point process is an integer-valued process $(N_t)_{t \ge 0}$ with rate $\lambda(t)$ such that $N_0 = 0$, and for every $s < t$, the difference

$$N_t - N_s$$

distributes as a Poisson random variable with rate

$$\int_s^t \lambda(u) \, du.$$

If $\lambda(t) < \infty$ for all $t$, then the sample-paths of a Poisson point process are right-continuous almost surely. The (random) set of times at which the sample-path is discontinuous is denoted by $\mathcal{J}$.

Let $\lambda \colon [0,1) \to [0,\infty)$ be such that $\lambda(t) < \infty$ for all $t$, and let $N_t$ be a Poisson point process with rate $\lambda$. The set $\mathcal{J}$ is then almost surely discrete. A process $X_t$ is said to be a piecewise-smooth jump process with rate $\lambda$ if it is right-continuous and is smooth in the interval between any two successive jump times. This definition can be extended to the case where $\int_0^1 \lambda(t) \, dt = \infty$ but $\lambda(t) < \infty$ for all $t < 1$: In this case $\mathcal{J}$ has only a single accumulation point, at $t = 1$, and intervals between successive jump times are still well defined.

An important notion in the analysis of stochastic processes is quadratic variation. The quadratic variation of a process $X_t$, denoted $[X]_t$, is defined as

$$[X]_t = \lim_{|P| \to 0} \sum_{k=1}^{m} \left(X_{t_k} - X_{t_{k-1}}\right)^2$$

if the limit exists; here $P = \{0 = t_0 < t_1 < \dots < t_m = t\}$ is an $m$-part partition of $[0, t]$, and the notation $|P| \to 0$ indicates that the size of the largest part goes to $0$. Not all processes have a (finite) quadratic variation, but piecewise-smooth jump processes do; in fact, it can be seen from the definition that if $X_t$ is a piecewise-smooth jump process then

$$[X]_t = \sum_{s \in \mathcal{J}, \, s \le t} \left(\Delta X_s\right)^2, \qquad (6)$$

where $\Delta X_s = X_s - X_{s^-}$ is the size of the jump at time $s$.

The quadratic variation is especially useful for martingales due to its relation with the variance: If $X_t$ is a martingale started at a deterministic point, then

$$\mathrm{var}(X_t) = \mathbb{E}\left[[X]_t\right]. \qquad (7)$$
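Relations (6) and (7) can be illustrated on the simplest jump martingale, the compensated Poisson process $M_t = N_t - \lambda t$ (our example, not from the paper): all jumps have size $1$, so $[M]_t = N_t$, and therefore $\mathrm{var}(M_t) = \mathbb{E}[N_t] = \lambda t$. The Monte Carlo sketch below checks this numerically.

```python
import random

def simulate_compensated_poisson(lam, T, rng):
    """One path of M_t = N_t - lam*t up to time T; returns (M_T, [M]_T).
    All jumps have size 1, so by (6) the quadratic variation equals N_T."""
    t, jumps = 0.0, 0
    while True:
        t += rng.expovariate(lam)    # i.i.d. exponential inter-arrival times
        if t > T:
            break
        jumps += 1
    return jumps - lam * T, float(jumps)

rng = random.Random(0)
lam, T, trials = 2.0, 1.0, 20000
samples = [simulate_compensated_poisson(lam, T, rng) for _ in range(trials)]
mean_M = sum(m for m, _ in samples) / trials
var_M = sum(m * m for m, _ in samples) / trials - mean_M**2
mean_qv = sum(q for _, q in samples) / trials
print(var_M, mean_qv)   # both close to lam*T = 2, illustrating (7)
```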

3 The main tool: A jump process

The proof of Theorems 1 and 2 relies on the construction of a piecewise-smooth jump process martingale $X_t$, described below. One of its key properties is that it allows us to express the variance of $f$ in terms of derivatives of the harmonic extension:

This integral can then be approximated from above by the right hand side of equation (1) using tools from real analysis and stochastic processes. The process is characterized by the following properties:

  1. , with independent and identically distributed for all .

  2. is a martingale for all .

  3. almost surely for all and .

Proposition 5.

There exists a right-continuous martingale with the above properties. Furthermore, for all ,

(8)
Proof.

Let be a standard Brownian motion. Consider the family of stopping times

and define . Then by definition, , and is a martingale due to the optional stopping theorem. Observe that can fail to be right-continuous only if is different from for all in some open interval around

. This event happens with probability

, and so there exists a modification of where paths are right-continuous almost surely. The process is defined as , where are independent copies of .

To prove equation (8), set and use the martingale property:

Rearranging gives as needed. ∎

It can be readily seen that $X_t$ is a piecewise-smooth jump process. Denote its set of discontinuities by $\mathcal{J}$.

For a function $f$, the harmonic extension process $f(X_t)$ is a multilinear polynomial in the coordinates of $X_t$. As the product of two independent martingales is also a martingale with respect to its natural filtration, by independence of the coordinates of $X_t$, we conclude that

Fact 6.

For a function $f$, the process $f(X_t)$ is a martingale.

Some example sample paths for the -bit majority function are given in Figure 1.

Figure 1: Sample paths of for the -bit majority function
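Sample paths of this kind are easy to simulate. The sketch below is ours and uses a deliberately simplified stand-in for the paper's Brownian construction: each coordinate starts at $0$ and jumps once, at an independent uniform time, to an independent uniform sign. This toy coordinate process is a martingale with values in $[-1,1]$ whose terminal value is a uniform random sign, so by Fact 6 the harmonic extension evaluated along it is a martingale starting at $\mathbb{E}f$ and ending at a value of $f$.

```python
import random
from itertools import product

def harmonic_extension(f, n, y):
    """Multilinear extension of f at y in [-1,1]^n, via formula (4)."""
    total = 0.0
    for x in product([-1, 1], repeat=n):
        w = 1.0
        for xi, yi in zip(x, y):
            w *= (1 + xi * yi) / 2
        total += f(x) * w
    return total

def sample_path(f, n, rng, grid=50):
    """Simplified stand-in for X_t (NOT the paper's construction): each
    coordinate starts at 0 and jumps to a uniform sign at a uniform time."""
    jump_time = [rng.random() for _ in range(n)]
    sign = [rng.choice([-1, 1]) for _ in range(n)]
    path = []
    for k in range(grid + 1):
        t = k / grid
        y = [sign[i] if t >= jump_time[i] else 0 for i in range(n)]
        path.append(harmonic_extension(f, n, y))
    return path

maj5 = lambda x: 1 if sum(x) > 0 else -1
rng = random.Random(1)
p = sample_path(maj5, 5, rng)
print(p[0], p[-1])   # starts at E[f] = 0, ends at a value of f, i.e. +1 or -1
```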

Since $f(X_t)$ is a piecewise-smooth jump process, its quadratic variation is a sum of squared jumps. A crucial property of $X_t$ is that the expected value of these jumps behaves smoothly, as the next lemma shows:

Lemma 7.

Let , let , and let be an index. Let be a bounded, right-continuous process which is independent of and almost surely has only a finite number of discontinuity points in the interval . Then

(9)
Proof.

To prove (9), assume first that , so that the number of jumps that makes in the time interval is almost surely finite. For any integer , partition the interval into equal parts, setting for . Since is independent of , almost surely its discontinuity points are disjoint from those of ; and since almost surely has only a finite number of discontinuity points for , then for every time point , there is a small neighborhood of on which is continuous. Thus, almost surely we have

Since , , and are bounded, the expression

is bounded by a constant times the number of jumps of in the interval , which is integrable. By the dominated convergence theorem, we then have

and since both and are independent of , the expectation breaks up into

(10)

The set is a Poisson process with rate , and so the number of jumps in the interval distributes as , where

The probability of having at least one jump is then equal to

Plugging this into display (10), we get

The factor is negligible in the limit , since the sum contains only bounded terms. We are left with

Since both and are right continuous, by the definition of the Riemann integral, this term is equal to , and we get

for all . Taking the limit gives the desired result for by continuity of the right hand side in . ∎

Corollary 8.

Let . Then

(11)
Proof.

The first equality follows by using the fact that $X_1$ is uniform on the cube $\{-1,1\}^n$, and so . We turn to the second one. Since $f(X_t)$ is a piecewise-smooth jump process, by (6) its quadratic variation is equal to the sum of squares of its jumps. Now, almost surely, $X_t$ can make a jump only in one coordinate at a time, and when the $i$-th coordinate jumps, the value of $f(X_t)$ changes by a multiple of that jump, since $f$ is multilinear. By (7), the variance is the expected value of the quadratic variation, and so

Setting in (9) completes the proof. ∎

Corollary 9.

Let . Then

Proof.

By the martingale property of ,

Taking the derivative of equation (11) and using the fundamental theorem of calculus on the right hand side gives the desired result. ∎

For every index , let be the harmonic extension of . For monotone functions we have since the derivatives are positive, but in general,

(12)

by convexity. In particular, plugging (12) into Corollary 8, we have

(13)

We call the process the “influence process”, because of how the expectation of its square relates to the influence of : Observe that by (5),

(14)

Thus, at time , we have , while at time , since for , we have . The expected value increases from to as goes from to .

The integral may be more easily handled using a time-change which makes the integrand log-convex; this can be used to bound it by a power of the influence. For this purpose, for , denote

Lemma 10.

is a log-convex function of .

Proof.

Expanding as a Fourier polynomial, we have

(15)

This is a positive linear combination of log-convex functions, and is therefore also log-convex [1, section 3.5.2]. ∎

The next lemma is likely to be well-known to experts, and can be derived from hypercontractivity as shown in [2]. We give a different proof based on the analysis of the stochastic process. Later on, we will see how this analysis can be pushed further to obtain the stability results. On an intuitive level, and in light of equation (9), the lemma shows that all of the “action” which contributes to the variance of the function happens very close to the terminal time $t = 1$.

Lemma 11.

There exists a universal constant so that if then

(16)

for all .

Proof.

Let to be chosen later. We start by showing that there exists a constant such that

(17)

Let ; by applying Corollary 9 to the function , we see that satisfies

(18)

The right hand side of equation (18) can be bounded using the following lemma, which is a -biased version of the “Level-1” inequality. The proof is postponed to the appendix.

Lemma 12.

There exists a constant so that the following holds. Let be the harmonic extension of a function, and let be such that for all . Then

(19)
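For intuition about Level-1 inequalities, recall the standard (unbiased) version: for an indicator $f$ with $\mathbb{E}f = \alpha \le 1/2$, the degree-1 Fourier weight $\sum_i \hat{f}(\{i\})^2$ is $O(\alpha^2 \log(1/\alpha))$. The sketch below (ours, illustrating the unbiased statement rather than the biased Lemma 12; `level1_weight` is a name we introduce) shows that a subcube indicator attains this order exactly: fixing $k$ coordinates gives weight $k\alpha^2 = \alpha^2 \log_2(1/\alpha)$.

```python
import math
from itertools import product

def level1_weight(f, n):
    """Sum of squared degree-1 Fourier coefficients of f."""
    total = 0.0
    pts = list(product([-1, 1], repeat=n))
    for i in range(n):
        coef = sum(f(x) * x[i] for x in pts) / 2**n   # hat{f}({i})
        total += coef**2
    return total

n, k = 5, 2
# Indicator (0/1-valued) of the subcube fixing the first k coordinates to +1
ind = lambda x: 1 if all(x[i] == 1 for i in range(k)) else 0
alpha = 2**-k
print(level1_weight(ind, n), alpha**2 * math.log2(1 / alpha))   # equal here
```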

Taking and in equation (19) and substituting this in equation (18), we have

For ,

(20)

Let

be the solution to the ordinary differential equation

(21)

Then, since is positive and increasing for , we have that for all in an interval in which ,

(22)

The solution to the differential equation (21) is given by

where in the last equality we used equation (14) and the fact that . Supposing that (so ), we have for all , and so by (22),

for all . Since , if then