    # Ultra-log-concavity and discrete degrees of freedom

We develop the notion of discrete degrees of freedom of a log-concave sequence and use it to prove that the quantity ℙ(X=𝔼X) is minimized, under fixed integral mean, by the Poisson distribution.



## 1. Introduction

Let us fix a nonnegative integer n and let [n] = {0, 1, …, n}. We say that a sequence (a_k) of nonnegative real numbers is log-concave if its support {k : a_k > 0} is a discrete interval and the inequality a_k² ≥ a_{k−1}a_{k+1} is satisfied for all k. The sequence is called ultra-log-concave if (a_k/C(n,k))_k is log-concave, where C(n,k) stands for the binomial coefficient. Let us define ULC(n) to be the class of all random variables X taking values in [n] and such that their probability mass function p_k = ℙ(X = k) is ultra-log-concave. We shall slightly abuse notation by using the same letter to denote both the law of X and its probability mass function. Note that if b_k = C(n,k)p^k(1−p)^{n−k}, k ∈ [n], stands for the probability mass function of the binomial distribution B(n,p), then ultra-log-concavity of (p_k) is equivalent to log-concavity of (p_k/C(n,k)), and thus, since the factor p^k(1−p)^{n−k} is log-affine, equivalent to (p_k/b_k) being log-concave, that is, to (p_k) being log-concave with respect to the binomial distribution B(n,p).

Note that the case n = ∞ can also be considered. In this case random variables X ∈ ULC(∞) take values in the set of nonnegative integers and are called ultra-log-concave if (p_k/π_λ(k)) is log-concave, where π_λ(k) = e^{−λ}λ^k/k! stands for the probability mass function of the Poisson random variable Pois(λ) with parameter λ > 0. Since λ^k e^{−λ} is log-affine in k, this is the same as log-concavity of the sequence (k! p_k), and thus ultra-log-concavity is in this case equivalent to saying that (p_k) is log-concave with respect to Pois(λ), for every (equivalently, for some) λ > 0.
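These definitions are straightforward to test numerically. The sketch below (the helper names are ours, not the paper's) checks ultra-log-concavity in the n = ∞ sense via log-concavity of (k! p_k):

```python
import math

def is_log_concave(a):
    """Nonnegative sequence: interval support and a_k^2 >= a_{k-1} a_{k+1}."""
    support = [i for i, x in enumerate(a) if x > 0]
    if support != list(range(support[0], support[-1] + 1)):
        return False  # support must be a discrete interval
    # relative slack absorbs floating-point rounding in the equality case
    return all(a[k] * a[k] >= a[k - 1] * a[k + 1] * (1 - 1e-9)
               for k in range(1, len(a) - 1))

def is_ulc_infinity(p):
    """X in ULC(infinity): the sequence k! * p_k is log-concave."""
    return is_log_concave([math.factorial(k) * pk for k, pk in enumerate(p)])

# A (truncated) Poisson pmf is ultra-log-concave: k! * pi_lambda(k)
# is proportional to lambda^k, which is log-affine.
lam = 3.0
poisson = [math.exp(-lam) * lam ** k / math.factorial(k) for k in range(30)]
print(is_ulc_infinity(poisson))  # True

# A geometric pmf is log-concave but NOT ultra-log-concave:
# (k!)^2 >= (k-1)!(k+1)! fails for every k >= 1.
geom = [0.5 ** (k + 1) for k in range(30)]
print(is_log_concave(geom), is_ulc_infinity(geom))  # True False
```

The geometric example shows that ultra-log-concavity is strictly stronger than log-concavity.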

Ultra-log-concave random variables have attracted considerable attention from researchers over the last two decades. The definition itself is, to the best of our knowledge, due to Pemantle, who introduced it in the context of the theory of negative dependence of random variables. After reading Pemantle's then-unpublished manuscript, Liggett wrote his article in which he proved that the convolution of a ULC(n) random variable and a ULC(m) random variable is a ULC(n+m) random variable. Note that the border case of this statement is the family of binomial random variables in the case of finite support (if X ∼ B(n,p) and Y ∼ B(m,p) are independent, then X+Y ∼ B(n+m,p)) and the family of Poisson distributions for random variables with infinite support (if X ∼ Pois(λ) and Y ∼ Pois(μ) are independent, then X+Y ∼ Pois(λ+μ)). A short proof of this fact, surprisingly connecting the statement to the famous Alexandrov–Fenchel inequality in convex geometry, was given by Gurvits. The statement for random variables with infinite support is actually much older and due to Walkup (see Theorem 1 therein). A direct and simpler proof of Walkup's theorem also appeared in the context of Khintchine inequalities; see also a recent proof using localization techniques. It is worth mentioning that the same statement holds true if one replaces log-concavity with log-convexity, a result due to Davenport and Pólya.
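Walkup's closure theorem is easy to probe numerically. In this sketch (helper names ours) we convolve two truncated, renormalized Poisson profiles, each ultra-log-concave, and verify that the convolution stays in ULC(∞):

```python
import math

def convolve(p, q):
    """Pmf of X + Y for independent X ~ p, Y ~ q on the nonnegative integers."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r

def is_ulc(p):
    """Ultra-log-concavity: (k! p_k) is a log-concave sequence."""
    a = [math.factorial(k) * pk for k, pk in enumerate(p)]
    return all(a[k] * a[k] >= a[k - 1] * a[k + 1] * (1 - 1e-9)
               for k in range(1, len(a) - 1))

def trunc_poisson(lam, n):
    """Truncated, renormalized Poisson profile: k! p_k is log-affine, so ULC."""
    w = [lam ** k / math.factorial(k) for k in range(n + 1)]
    s = sum(w)
    return [x / s for x in w]

p, q = trunc_poisson(2.0, 8), trunc_poisson(0.7, 5)
print(is_ulc(p), is_ulc(q), is_ulc(convolve(p, q)))  # True True True
```

As Walkup's theorem predicts, ultra-log-concavity survives the convolution.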

More recently, Johnson considered ULC(∞) random variables in the context of the Shannon entropy H(X) = −∑ p_k log p_k, where we use the convention 0·log 0 = 0. He proved that the Poisson distribution maximizes entropy in the class of ultra-log-concave distributions under fixed mean. The author uses an interesting semigroup technique based on adding a Poisson random variable and the operation of thinning. Aravinda, Marsiglietti and Melbourne established concentration inequalities for the class ULC(∞). The main ingredient of the proof (see Lemma 2.1 therein) was the inequality 𝔼e^{tX} ≤ 𝔼e^{tZ} satisfied for any t ∈ ℝ, where Z ∼ Pois(𝔼X). By considering the Taylor expansion around t = 0 the authors deduced the inequality Var(X) ≤ 𝔼X. The main tool used to establish these results was a rather sophisticated localization technique developed in earlier work.

The first goal of the present paper is to generalize, and give simple proofs of, the results mentioned above. In particular, we shall work with the class ULC(n) for arbitrary n, including n = ∞, in which case the random variables take values in the set of all nonnegative integers. We prove the following theorem.

###### Theorem 1.

Let X ∈ ULC(n) and let p ∈ [0,1] be such that 𝔼X = np. Then

• for any convex function f one has 𝔼f(X) ≤ 𝔼f(B(n,p)),

• for t ≥ 1 we have 𝔼X^t ≤ 𝔼(B(n,p))^t and for t ∈ [0,1] we have 𝔼X^t ≥ 𝔼(B(n,p))^t,

• Var(X) ≤ np(1−p) and 𝔼e^{tX} ≤ 𝔼e^{tB(n,p)} for all t ∈ ℝ,

• H(X) ≤ H(B(n,p)).

Similarly, if X ∈ ULC(∞) and λ > 0 is such that 𝔼X = λ, then

• for any convex function f one has 𝔼f(X) ≤ 𝔼f(Pois(λ)),

• for t ≥ 1 we have 𝔼X^t ≤ 𝔼(Pois(λ))^t and for t ∈ [0,1] we have 𝔼X^t ≥ 𝔼(Pois(λ))^t,

• Var(X) ≤ λ and 𝔼e^{tX} ≤ e^{λ(e^t−1)} for all t ∈ ℝ,

• H(X) ≤ H(Pois(λ)).
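These comparisons can be confirmed on a concrete example (our own illustration, not from the paper): take n = 4 and p_k proportional to C(4,k)², so that p_k/C(4,k) is proportional to C(4,k), which is log-concave; hence X ∈ ULC(4), and by symmetry 𝔼X = 2 = np with p = 1/2:

```python
import math

n = 4
w = [math.comb(n, k) ** 2 for k in range(n + 1)]   # ULC(4) pmf, symmetric
p = [x / sum(w) for x in w]
b = [math.comb(n, k) / 2 ** n for k in range(n + 1)]  # B(4, 1/2)

mean = lambda q: sum(k * qk for k, qk in enumerate(q))
var = lambda q: sum(k * k * qk for k, qk in enumerate(q)) - mean(q) ** 2
ent = lambda q: -sum(qk * math.log(qk) for qk in q if qk > 0)
mgf = lambda q, t: sum(qk * math.exp(t * k) for k, qk in enumerate(q))

print(mean(p), mean(b))            # both equal 2 (up to rounding)
print(var(p) <= var(b))            # Var(X) <= np(1-p): True
print(ent(p) <= ent(b))            # H(X) <= H(B(n,p)): True
print(mgf(p, 0.7) <= mgf(b, 0.7))  # a convex-order consequence: True
```

The pmf p is visibly more concentrated around the mean than the binomial, which is exactly what the convex-order comparison expresses.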

We shall use the following generalization of a lemma due to Barthe and Naor (see Lemma 8 therein).

###### Lemma 2.

Let X, Y be real random variables with laws μ ≠ ν satisfying 𝔼X = 𝔼Y and such that the Radon–Nikodym derivative dμ/dν exists and the function dμ/dν − 1 changes sign at most two times. Then μ−ν changes sign precisely two times and if the sign pattern of μ−ν is (+,−,+), then for any convex function f one has 𝔼f(X) ≥ 𝔼f(Y).

The inequality 𝔼f(Y) ≤ 𝔼f(X), valid for all convex f, defines the so-called Choquet order on the space of probability distributions. We give a simple proof of the above lemma using the technique of intersecting densities, developed earlier and used in the contexts of information theory and of convex geometry. The lemma was originally formulated and used for continuous random variables. However, here we shall demonstrate its relevance in the discrete setting. The following corollary is immediate.

###### Corollary 3.

Let X, Y be random variables supported in the set of integers with laws μ, ν satisfying 𝔼X = 𝔼Y. Assume that ν is log-concave and the sequence (μ(n) − ν(n)) changes sign at most two times. Then it changes sign precisely two times and if the sign pattern of (μ(n) − ν(n)) is (−,+,−), then H(X) ≤ H(Y).

Let us also mention that it is clearly possible to formulate an analogue of this corollary in the continuous setting.

In the third section we go beyond the convex case discussed in Theorem 1. We develop a discrete analogue of the concept of degrees of freedom, introduced earlier in the continuous setting, and use it to prove the following theorem.

###### Theorem 4.

Let X be an ultra-log-concave random variable with integral mean 𝔼X. Then

 P(X=EX)≥P(Pois(EX)=EX).

This theorem is motivated by Theorem 1.1 of the aforementioned work of Aravinda, Marsiglietti and Melbourne, where concentration inequalities for the class ULC(∞) were derived. Let us also mention that the idea of degrees of freedom has been used in [14, 4] to lower bound the entropy and Rényi entropy of log-concave random variables in terms of variance.
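Theorem 4 is easy to sanity-check numerically. In the sketch below (an illustration of ours, not part of the paper) we use the binomial law B(6, 1/2), which is ultra-log-concave and has integral mean 3:

```python
import math

def pois_pmf(lam, k):
    """Poisson pmf pi_lambda(k) = e^{-lambda} lambda^k / k!."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# B(n, p) is ultra-log-concave; pick parameters with np integral.
n, p = 6, 0.5  # mean 3
binom = [math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
m = 3
print(binom[m])                    # 0.3125
print(binom[m] >= pois_pmf(m, m))  # True, as Theorem 4 predicts
```

Here ℙ(X = 𝔼X) = 0.3125 while ℙ(Pois(3) = 3) ≈ 0.224, so the Poisson value is indeed the smaller one.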

## 2. Proofs of Lemma 2 and Theorem 1

We first prove our main lemma.

###### Proof of Lemma 2.

Since μ ≠ ν and ∫ d(μ−ν) = 0, we get that μ−ν has to change sign at least once. Suppose μ−ν changes sign exactly once, at a point x₀. Since 𝔼X = 𝔼Y implies ∫ x d(μ−ν)(x) = 0, we get that ∫ (x−x₀) d(μ−ν)(x) = 0. This is a contradiction, since the integrand has a fixed sign and is not identically zero. We have proved that μ−ν changes sign exactly two times.

Now, our goal is to show that ∫ f d(μ−ν) ≥ 0, which is, for any constants a, b, equivalent to ∫ (f(x) − ax − b) d(μ−ν)(x) ≥ 0, because ∫ (ax+b) d(μ−ν)(x) = 0 by equality of means and of total masses. Suppose μ−ν changes sign at the points x₁ < x₂. Let us choose a, b in such a way that f(x_i) = ax_i + b for i = 1, 2 (a simple system of two linear equations). By convexity we see that f(x) − ax − b has sign pattern (+,−,+) and changes sign exactly at x₁, x₂. Since μ−ν has the same sign pattern (+,−,+), the integrand is non-negative and the assertion follows. ∎

The proof of Corollary 3 is immediate.

###### Proof of Corollary 3.

Note that log-concavity of ν and the fact that μ−ν has sign pattern (−,+,−) imply that the support of μ is contained in the support of ν. The sequence (−log ν(n)) is convex and therefore from Lemma 2 we get

 H(Y)=E[−logν(Y)]≥E[−logν(X)]=−∑μ(n)logν(n)≥−∑μ(n)logμ(n)=H(X),

where the last estimate is the well known Gibbs’ inequality. ∎

Lemma 2 easily implies Theorem 1.

###### Proof of Theorem 1.

(a) We can assume that p ∈ (0,1), otherwise X is deterministic and there is nothing to prove. Take B ∼ B(n,p). Let μ be the law of X and ν be the law of B. With the notation of Lemma 2 we have (dμ/dν)(k) = p_k/b_k. Since the sequence (p^k(1−p)^{n−k}) is log-affine, by the definition of the class ULC(n) we see that (p_k/b_k) is log-concave. Since μ is supported on a discrete interval and on this interval (log(p_k/b_k)) is a concave sequence, we get that the equation p_k = b_k has at most two solutions. Thus μ−ν changes sign at most two times. Lemma 2 implies that μ−ν changes sign precisely two times, and the concavity of (log(p_k/b_k)) implies that the sign pattern of μ−ν is (−,+,−). The assertion follows from Lemma 2 applied with the roles of the two laws interchanged.

Points (b) and (c) follow immediately from (a). Note that the functions x ↦ x^t for t ≥ 1, x ↦ −x^t for t ∈ [0,1], x ↦ (x−np)² and x ↦ e^{tx} are convex, and that Var(B(n,p)) = np(1−p).

(d) Again let B ∼ B(n,p). According to Corollary 3 we have to verify the log-concavity of (b_k), which reduces, after canceling the log-affine factor p^k(1−p)^{n−k}, to the inequality C(n,k)² ≥ C(n,k−1)·C(n,k+1), 1 ≤ k ≤ n−1. This is equivalent to (k+1)(n−k+1) ≥ k(n−k), which is clearly true.

The proofs of points (a’) and (b’) are very similar, but simpler. The last step in the proof of point (d’) is to verify the inequality (1/k!)² ≥ 1/((k−1)!(k+1)!) for k ≥ 1, which is equivalent to k+1 ≥ k. ∎

## 3. Discrete degrees of freedom

Suppose p = (p_n) is a log-concave sequence supported in some finite discrete interval, which without loss of generality can be assumed to be [0, L]. We say that p has at least d degrees of freedom if there exist linearly independent sequences q₁, …, q_d supported in [0, L] and ε > 0 such that for all δ = (δ₁, …, δ_d) with max_i |δ_i| < ε the sequence

 p+δ1q1+…+δdqd

is log-concave in [0, L].
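For intuition, consider a strictly positive geometric sequence, for which every log-concavity inequality holds with equality: only perturbations keeping the sequence log-affine survive in both directions ±δ. The sketch below (directions and names are our illustration) exhibits the two free directions, p itself and (n·p_n), and a third independent direction that fails immediately, consistent with such a sequence having exactly two degrees of freedom:

```python
def is_log_concave(a):
    """Log-concavity of a positive sequence, with a tiny rounding tolerance."""
    return all(a[k] * a[k] >= a[k - 1] * a[k + 1] - 1e-15
               for k in range(1, len(a) - 1))

L, r = 6, 0.6
p = [r ** n for n in range(L + 1)]           # log-affine: all inequalities tight

scale = p                                    # free direction q1 = p
tilt = [n * pn for n, pn in enumerate(p)]    # free direction q2 = (n * p_n)
spike = [1.0] + [0.0] * L                    # a third, independent direction

for d in (0.05, -0.05):
    assert is_log_concave([pn + d * qn for pn, qn in zip(p, scale)])
    assert is_log_concave([pn + d * qn for pn, qn in zip(p, tilt)])

# The spike breaks log-concavity for every positive delta, however small:
print(is_log_concave([pn + 1e-6 * qn for pn, qn in zip(p, spike)]))  # False
```

Adding any positive mass at n = 0 violates p₁² ≥ p₀p₂, which was an equality, so the spike is not an admissible perturbation direction.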

We shall prove the following lemma describing sequences with a small number of degrees of freedom. The proof is a rather straightforward adaptation of an argument known in the continuous setting.

###### Lemma 5.

Let d ≥ 2. Suppose a positive log-concave sequence p supported in [0, L] has at most d degrees of freedom. Then p_n = e^{−V(n)} with V = max(W₁, …, W_{d−1}), where W₁, …, W_{d−1} are arithmetic progressions.

###### Proof.

Since p is strictly positive and log-concave, it can be written in the form p_n = e^{−V(n)}, where V is convex. The sequence V′(n) = V(n+1) − V(n) is called the slope sequence. Clearly the slope sequence is non-decreasing. We prove the lemma by contraposition. We shall assume that V cannot be written as a maximum of d−1 arithmetic progressions. Our goal is then to prove that p has at least d+1 degrees of freedom.

Define the sequence (n_i) inductively by taking n₀ = min{n : V′(n) > V′(0)} and n_{i+1} = min{n > n_i : V′(n) > V′(n_i)}, as long as the set in question is non-empty. This produces points n₀ < n₁ < … < n_k with k ≥ d−1, as V is not piecewise linear with at most d−1 pieces.

For i = 0, 1, …, k let us define the sequence V_i via the expression

 Vi(n) = { V(n) for n ∈ [0, n_i];  V(n_i) + V′(n_i)(n − n_i) for n ∈ [n_i, L].

It is not hard to show that V₀, …, V_k are convex. We shall assume that V′(0) ≠ 0, so that the sequence V restricted to [0, n₀] is not constant. If this is not the case it suffices to reflect the picture and use the sequence (p_{L−n}) instead of (p_n).

Claim 1. There exists ε > 0 such that for all δ, δ₀, δ₁, …, δ_k ∈ (−ε, ε) the sequence

 e−V(1+δ+δ0V0+δ1V1+…+δkVk)

is log-concave.

###### Proof of Claim 1.

On each of the intervals [n_i, n_{i+1}], i = −1, 0, …, k, where we take n_{−1} = 0 and n_{k+1} = L, the above sequence is given by an expression of the form e^{−V(n)}(1 + μ₁W(n) + μ₂V(n)), where W is an arithmetic progression and μ₁, μ₂ depend linearly on δ, δ₀, …, δ_k. We first check that the log-concavity inequality holds in the interior of each such interval for sufficiently small parameters. By continuity one can assume that μ₂ ≠ 0. We want to prove convexity of

 Ṽ(n) = V(n) − log(1 + μ₁W(n) + μ₂V(n)) = V(n) − log(1 + μ₂(V(n) + (μ₁/μ₂)W(n))) = −(μ₁/μ₂)W(n) + [V(n) + (μ₁/μ₂)W(n)] − log(1 + μ₂(V(n) + (μ₁/μ₂)W(n))) = −(μ₁/μ₂)W(n) + h_{μ₂}(V(n) + (μ₁/μ₂)W(n)),

where

 hμ(t)=t−log(1+μt).

The first term is affine. For small μ the function h_μ is increasing and convex. Since the sequence V(n) + (μ₁/μ₂)W(n) is convex, it is enough to show that f∘g is a convex sequence whenever f is an increasing convex function and g is a convex sequence. This is straightforward, since

 f(g(n))≤f(12g(n+1)+12g(n−1))≤12f(g(n+1))+12f(g(n−1)).

Now we are left with checking the log-concavity inequality at the points n₀, …, n_k. But since at these points the convexity inequality for V is strict (the slope sequence jumps at each n_i), the inequality follows by a simple continuity argument.

Claim 2. The sequences 1, V₀, …, V_k are linearly independent.

###### Proof of Claim 2.

Let V_{−1} ≡ 1. Let us consider the sequences U_{−1} = V_{−1} and U_i = V_i − V_{i−1}, i = 0, 1, …, k. To prove that 1, V₀, …, V_k are linearly independent, it suffices to show that U_{−1}, U₀, …, U_k are linearly independent. Indeed, suppose that b_{−1}V_{−1} + b₀V₀ + … + b_kV_k ≡ 0. Since V_j = U_{−1} + U₀ + … + U_j, this means that

 b−1U−1+b0(U−1+U0)+…+bk(U−1+U0+…+Uk)≡0,

which is

 (b−1+b0+…+bk)U−1+(b0+…+bk)U0+…+bkUk≡0.

If U_{−1}, U₀, …, U_k are linearly independent, it follows that b_i + … + b_k = 0 for i = −1, 0, …, k, which easily leads to b_i = 0 for all i.

Now the fact that U_{−1}, U₀, …, U_k are linearly independent is easy, since U_i for i ≥ 1 is supported in (n_i, L]. These intervals form a decreasing sequence, so in order to show that every combination c_{−1}U_{−1} + c₀U₀ + … + c_kU_k ≡ 0 in fact has zero coefficients it is enough to evaluate this equality first at the points of [0, n₀] to conclude that c_{−1} = c₀ = 0 (note that the support of U_i for i ≥ 1 is contained in (n₁, L], while on [0, n₀] we have U₀ = V − 1 with V affine and non-constant there) and then consecutively at the points of (n_i, n_{i+1}], i = 1, …, k, to conclude that c_i = 0. ∎

Combining Claim 1 and Claim 2 finishes the proof of Lemma 5. ∎

Let us now consider the space of all log-concave sequences supported in a finite discrete interval I. We shall identify a sequence p = (p_n)_{n∈I} with a vector in ℝ^{|I|}. Suppose we are given vectors v₁, …, v_d ∈ ℝ^{|I|} and real numbers a₁, …, a_d. Let us introduce the polytope

 Pd(a,v)={p:⟨p,vi⟩=ai, i=1,…,d}∩[0,∞)|I|.

We will be assuming that this polytope is bounded, which will be the case in our applications. Let us now assume that we are given a convex continuous functional Φ : ℝ^{|I|} → ℝ. The following lemma is well known and can be found, in the continuous setting, in the literature.

###### Lemma 6.

The supremum of a convex continuous functional Φ on the set of log-concave sequences contained in P_d(a, v) is attained on some sequence having at most d degrees of freedom.

###### Proof.

Let K be the set of log-concave sequences in P_d(a, v). By compactness of K the supremum of Φ on K is attained. By convexity of Φ the maximum on conv(K) is the same as the maximum on K. Moreover, as we work in a finite-dimensional Euclidean space, conv(K) is also compact. A baby version of the Krein–Milman theorem shows that every maximizer p is a convex combination of extreme points of conv(K), that is p = ∑ θ_i p_i, where the positive numbers θ_i sum up to one. By convexity Φ attains its maximum on conv(K) also at all the points p_i. Thus, the maximum of Φ on K is attained at some extreme point of conv(K). Clearly extreme points of conv(K) must belong to K. It is therefore enough to show that if p ∈ K has more than d degrees of freedom, then p is not an extreme point of conv(K).

Suppose p ∈ K has support S and there exist k > d linearly independent sequences q₁, …, q_k supported in S and ε > 0 such that for all δ = (δ₁, …, δ_k) with max_i |δ_i| < ε the sequence

 pδ=p+δ1q1+…+δkqk

is log-concave in S and thus also in I. Note that the set of parameters δ for which ⟨p_δ, v_i⟩ = a_i for i = 1, …, d forms a linear subspace of dimension at least k − d. Since k > d, this subspace is non-trivial and contains two antipodal points δ and −δ, and for such parameters p_δ and p_{−δ} belong to K. Note that p = ½(p_δ + p_{−δ}) and thus p is not an extreme point of conv(K), as both p_δ and p_{−δ} belong to K. ∎

We are now ready to prove Theorem 4.

###### Proof of Theorem 4.

Step 1. Let n₀ = 𝔼X and let p be the probability mass function of X. Our goal is to prove the inequality p_{n₀} ≥ e^{−n₀}n₀^{n₀}/n₀!. By an approximation argument one can assume that p has its support contained in some finite interval. Note that p_n = a_n/n!, where (a_n) is log-concave. We would like to maximize the linear (and thus convex) functional p ↦ −p_{n₀} under the constraints given by the vectors v₁ = (1, 1, …) (fixing p to be a probability distribution) and v₂ = (0, 1, 2, …), fixing the mean. Thus Lemma 6 implies that the maximum is attained on sequences having at most two degrees of freedom and therefore, by Lemma 5, on sequences for which (n!p_n) is log-affine on its support, say n!p_n = c·x₀ⁿ for some c, x₀ > 0. As a consequence, in order to prove the inequality it is enough to consider only sequences of the form

 μ(n) = (1/f(x₀))·(x₀ⁿ/n!)·1_{[k,l]}(n), where f(x) = ∑_{i=k}^{l} x^i/i!.
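This reduction can be exercised numerically: for several windows [k, l] (our own test cases) we tune x₀ by bisection so that the truncated sequence has mean exactly n₀ = 3, and verify that its value at n₀ dominates the Poisson one:

```python
import math

def window(k, l, x):
    """Truncated-Poisson profile mu(n) = x^n / (n! f(x)) on [k, l]."""
    w = [x ** n / math.factorial(n) for n in range(k, l + 1)]
    s = sum(w)
    return {n: w[n - k] / s for n in range(k, l + 1)}

def mean(mu):
    return sum(n * m for n, m in mu.items())

def solve_x(k, l, target, lo=1e-9, hi=100.0):
    """Bisection for x0 giving a prescribed mean (the mean grows with x)."""
    for _ in range(200):
        mid = (lo + hi) / 2
        if mean(window(k, l, mid)) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

n0 = 3
for (k, l) in [(0, 10), (1, 7), (2, 5), (3, 12)]:
    x0 = solve_x(k, l, n0)
    mu = window(k, l, x0)
    assert mu[n0] >= math.exp(-n0) * n0 ** n0 / math.factorial(n0)
print("all extremal-family cases dominate the Poisson value")
```

In each case the truncated sequence places at least as much mass at n₀ as Pois(n₀) does, as Theorem 4 asserts.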

Step 2. One can assume that k < l, since for k = l the random variable X is deterministic and ℙ(X = 𝔼X) = 1. Clearly 𝔼X = x₀f′(x₀)/f(x₀) = n₀. Our goal is to prove the inequality

 (1/f(x₀))·(x₀^{n₀}/n₀!) ≥ (1/e^{n₀})·(n₀^{n₀}/n₀!).

This simplifies to e^{n₀}x₀^{n₀} ≥ n₀^{n₀}f(x₀), which after taking the logarithm reads n₀ log x₀ + n₀ ≥ n₀ log n₀ + log f(x₀). Recall that n₀ = x₀f′(x₀)/f(x₀). Plugging this in gives the equivalent form

 log f(x₀) ≤ (x₀f′(x₀)/f(x₀))·(1 − log(f′(x₀)/f(x₀))).

It would therefore be enough to show that the function

 h(x) = x·(f′(x)/f(x)) − x·(f′(x)/f(x))·log(f′(x)/f(x)) − log f(x)

is nonnegative for all x > 0. Taking x = x₀ will then finish the proof.

Step 3. By a direct computation we have

 h′(x) = −log(f′(x)/f(x)) · (1/f²(x)) · (−x(f′(x))² + xf(x)f′′(x) + f(x)f′(x)).

Claim 1. For all x > 0 we have −x(f′(x))² + xf(x)f′′(x) + f(x)f′(x) ≥ 0.

###### Proof of Claim 1.

By the Cauchy–Schwarz inequality,

 f(x)(xf′(x)+x²f′′(x)) = (∑_{i=k}^{l} x^i/i!)(∑_{i=k}^{l} [x^i/(i−1)! + x^i/(i−2)!]) = (∑_{i=k}^{l} x^i/i!)(∑_{i=k}^{l} i·x^i/(i−1)!) ≥ (∑_{i=k}^{l} x^i/(i−1)!)² = (xf′(x))².

The assertion follows by dividing both sides by x. ∎
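The Cauchy–Schwarz bound of Claim 1 is also easy to spot-check numerically (grid and windows ours):

```python
import math

def lhs_minus_rhs(x, k, l):
    """Claim 1: f * (x f' + x^2 f'') - (x f')^2 should be nonnegative."""
    f     = sum(x ** i / math.factorial(i) for i in range(k, l + 1))
    xfp   = sum(i * x ** i / math.factorial(i) for i in range(k, l + 1))
    x2fpp = sum(i * (i - 1) * x ** i / math.factorial(i)
                for i in range(k, l + 1))
    return f * (xfp + x2fpp) - xfp ** 2

vals = [lhs_minus_rhs(x / 7, k, l)
        for (k, l) in [(0, 6), (1, 8), (3, 5)] for x in range(1, 60)]
print(min(vals) >= 0)  # True
```

The gap is strictly positive whenever k < l, since equality in Cauchy–Schwarz would require the two sequences of terms to be proportional.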

Claim 2. The function x ↦ log(f′(x)/f(x)) has a unique zero x* ∈ [0, ∞).

###### Proof.

According to a theorem due to Gurvits (see also [10, 11] for alternative proofs) a function of the form ∑_i a_i x^i/i! is log-concave on (0, ∞) if the sequence (a_i) is log-concave. Thus f is log-concave on (0, ∞). Equivalently, f′/f is a decreasing function on (0, ∞), and thus x ↦ f(x)/f′(x) is increasing.

If k ≥ 1 then lim_{x→0+} f′(x)/f(x) = +∞ while lim_{x→∞} f′(x)/f(x) = 0. By the intermediate value property, log(f′/f) has a unique zero in (0, ∞). If k = 0 then f′(0)/f(0) = 1 and x* = 0 is the unique zero of log(f′/f). ∎

We can now easily finish the proof. By Claims 1 and 2 we see that h′ is nonpositive on (0, x*) and nonnegative on (x*, ∞). Therefore h attains its minimum at x*. It is therefore enough to check the inequality h(x*) ≥ 0. Clearly f′(x*) = f(x*) implies that log(f′(x*)/f(x*)) = 0. Thus h(x*) = x* − log f(x*). The inequality h(x*) ≥ 0 is therefore equivalent to f(x*) ≤ e^{x*} and is obvious, as f is a truncated sum defining the exponential function. ∎