# On the Shannon entropy of the number of vertices with zero in-degree in randomly oriented hypergraphs

Suppose that you have $n$ colours and $m$ mutually independent dice, each of which has $r$ sides. Each die lands on any of its sides with equal probability. You may colour the sides of each die in any way you wish, with one restriction: you are not allowed to use the same colour more than once on the sides of a die. Let $X$ be the number of different colours that you see after rolling the dice. How should you colour the sides of the dice in order to maximize the Shannon entropy of $X$? In this article we investigate this question. We show that the entropy of $X$ is at most $\frac{1}{2}\log(n) + O(1)$ and that the bound is tight, up to an additive constant, when there are equally many coins (two-sided dice) and colours. Our proof employs the differential entropy bound on discrete entropy, along with a lower bound on the entropy of binomial random variables whose outcome is conditioned to be an even integer. We conjecture that the entropy is maximized when the colours are distributed over the sides of the dice as evenly as possible.

## 1 Prologue and main results

This work is motivated by the following entropy-maximization problem: Fix positive integers $n, m, r$ such that $r \le n$. Suppose you are given $n$ colours and $m$ mutually independent dice, each of which has $r$ sides. Each die lands on any of its sides with equal probability. You can colour the sides of the dice in any way you want, with one restriction: you are not allowed to use the same colour more than once on the sides of a die. All other colourings are allowed. Let $X$ be the number of different colours that you see after rolling the dice. In what way should you colour the sides of the dice in order to maximize the Shannon entropy of $X$?

The Shannon entropy (or entropy, for short) of a random variable $X$ that takes values on a finite set $S$ is defined as

$$H(X) = -\sum_{s \in S} \mathbb{P}[X = s] \cdot \log \mathbb{P}[X = s],$$

with the convention $0 \cdot \log 0 = 0$. Throughout the text, $\log(\cdot)$ denotes logarithm with base $2$. Shannon entropy may be thought of as the "amount of information", or the "amount of surprise", that is evidenced by a random variable and, in a certain sense, random variables with large entropy are less "predictable". Entropy enjoys several interesting properties which render it a useful tool for problems in enumeration, statistics and theoretical computer science, among others. We refer the reader to [2, 3] for excellent textbooks on the topic. A central theme that motivates the development of the theory of entropy concerns the so-called maximum entropy principle: within a given class of random variables, find one that has maximum entropy (see [2] for a whole chapter devoted to the topic).
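The definition can be sketched in a few lines of Python. This is only an illustration: the function name `shannon_entropy`, the dictionary representation of a distribution, and the two example distributions are assumptions made here, not part of the text.

```python
import math

def shannon_entropy(pmf):
    """Entropy in bits of a distribution given as {outcome: probability}."""
    # Outcomes with probability zero are skipped, which implements
    # the convention 0 * log 0 = 0.
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

# Hypothetical examples: a fair six-sided die and a biased three-outcome variable.
fair_die = {side: 1 / 6 for side in range(1, 7)}
biased = {"a": 0.9, "b": 0.1, "c": 0.0}

print(shannon_entropy(fair_die))  # agrees with log2(6) up to rounding
print(shannon_entropy(biased))
```

The uniform example attains the value $\log(|S|)$, while the biased one falls well below it.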

It is well-known that for any random variable $X$ taking values in a finite set $S$, it holds

$$H(X) \le \log(|S|).$$

This is a consequence of Jensen’s inequality. Notice that the bound is attained by the random variable that takes each value from $S$ with equal probability. The following result, due to James Massey, is referred to as the differential entropy bound on discrete entropy and may be seen as a refinement of the aforementioned bound.

###### Theorem 1.1 (Massey).

Let $X$ be a discrete random variable with finite variance, denoted $\mathrm{Var}(X)$. Then

$$H(X) \le \frac{1}{2}\log\left(2\pi e\left(\mathrm{Var}(X) + \frac{1}{12}\right)\right).$$

In other words, bounds on the variance of discrete random variables imply bounds on their entropy. A proof of Theorem 1.1 can be found in [4] (see also [2, p. 258]) and a refinement can be found in [5].
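As a numerical illustration of Theorem 1.1, the following sketch compares the entropy of a few binomial distributions with the bound computed from their variance. The helper names (`binomial_pmf`, `massey_bound`, and so on) and the choice of test distributions are assumptions made for this example only.

```python
import math

def shannon_entropy(pmf):
    """Entropy in bits, with the convention 0 * log 0 = 0."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def variance(pmf):
    mean = sum(k * p for k, p in pmf.items())
    return sum(p * (k - mean) ** 2 for k, p in pmf.items())

def massey_bound(var):
    """Right-hand side of Theorem 1.1, in bits."""
    return 0.5 * math.log2(2 * math.pi * math.e * (var + 1 / 12))

def binomial_pmf(n, p):
    return {k: math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)}

for n in (1, 5, 50):
    pmf = binomial_pmf(n, 0.5)
    print(n, shannon_entropy(pmf), massey_bound(variance(pmf)))
```

In each printed row the second number (the entropy) should not exceed the third (the bound).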

This work is concerned with the problem of maximizing the entropy of the random variable that counts the number of different colours after a roll of $m$ fair dice whose sides have been coloured using $n$ colours, subject to the condition that no colour is used more than once on the sides of a die; we refer to this condition as a proper colouring. The random variable that counts the number of different colours after a toss of $m$ properly coloured coins (i.e., when $r = 2$) has been previously studied in [6, 7], in the setting of maximising its median. Similarly to the setting of the median, we conjecture that a proper colouring of the dice that maximizes Shannon entropy is one in which the colours have been distributed as evenly as possible over the sides of the dice. In order to be more precise, we need some extra notation.

Let the positive integers $n, m, r$ be such that $r \le n$. Suppose that $C$ is a configuration consisting of $m$ dice all of whose sides have been properly coloured using $n$ colours. Let $X_C$ be the number of different colours after a roll of the dice. One may associate a hypergraph, $H = (V, E)$, to this configuration: for every colour put a vertex in $V$ and for every (properly) coloured die put an edge in $E$ containing all vertices corresponding to the colours on the sides of the die. Notice that $|V| = n$ and $|E| = m$. Moreover, the hypergraph may have isolated vertices and, since the same coloured die may appear more than once in the configuration $C$, it may have edges that appear more than once in $E$; i.e., it is a multi-hypergraph. Notice also that every edge has cardinality $r$ or, in other words, the hypergraph is $r$-uniform. A $2$-uniform (multi)hypergraph is just a (multi)graph. Here and later, the class consisting of all $r$-uniform multi-hypergraphs on $n$ vertices and $m$ edges is denoted by $\mathcal{D}_{n,m,r}$. The class $\mathcal{D}_{n,n,r}$, i.e., the class consisting of all $r$-uniform hypergraphs having $n$ vertices and $n$ edges, will be of particular interest. When $r = 2$, we write $\mathcal{D}_{n,m}$ instead of $\mathcal{D}_{n,m,2}$.

Now rolling the dice corresponds to choosing an element from each edge uniformly at random, where each choice is made independently of all other choices; we refer to this sampling procedure by saying that each edge in $E$ is randomly oriented towards one of its elements. Here and later, the phrase random orientation on the edges of a hypergraph $H$ expresses the fact that each edge in $E$ is oriented towards each of its elements with probability $1/r$, independently of all other edges. For every $v \in V$, let $\deg^-_H(v)$ denote the in-degree of $v$, i.e., the number of edges in $E$ that are oriented towards $v$ after a random orientation on the edges of $H$. Notice that

$$X_C = \left|\{v \in V : \deg^-_H(v) > 0\}\right|.$$

In other words, $X_C$ is the number of vertices with non-zero in-degree after a random orientation on the edges of $H$ and the above mentioned question on coloured dice can be equivalently expressed as follows.

###### Problem 1.2.

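Fix positive integers $n, m, r$ such that $r \le n$. For every $H \in \mathcal{D}_{n,m,r}$ let $X_H$ be the random variable that counts the number of vertices with non-zero in-degree after a random orientation on the edges of $H$. Find a hypergraph $H \in \mathcal{D}_{n,m,r}$ such that

$$H(X_H) \ge H(X_F), \quad \text{for all } F \in \mathcal{D}_{n,m,r}.$$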
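For small instances, the distribution of the number of vertices with non-zero in-degree can be computed exactly by enumerating all orientations. The sketch below assumes that a hypergraph is given as a list of tuples of vertex labels (one tuple per edge, repeats allowed); the function name and the example dice are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def distinct_colour_distribution(edges):
    """Exact pmf of the number of vertices with non-zero in-degree.

    Each edge is a tuple of r distinct vertices (the colours on one die).
    """
    counts = Counter()
    # product(*edges) picks one head vertex per edge: one orientation each,
    # and all orientations are equally likely.
    for heads in product(*edges):
        counts[len(set(heads))] += 1
    total = sum(counts.values())
    return {x: Fraction(c, total) for x, c in sorted(counts.items())}

# Two three-sided dice over colours {0, 1, 2, 3} that share colours 1 and 2.
dice = [(0, 1, 2), (1, 2, 3)]
print(distinct_colour_distribution(dice))
```

Of the nine equally likely outcomes, only the two in which both dice show the same shared colour give one distinct colour, so the result is $\{1: 2/9,\ 2: 7/9\}$. The enumeration visits $r^m$ orientations, so it is practical only for small $m$.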

We conjecture that the hypergraph that maximizes entropy is such that the degrees of its vertices are as equal as possible. Given a hypergraph $H = (V, E)$ and a vertex $v \in V$, we denote by $\deg_H(v)$ the number of edges in $E$ that contain $v$. More precisely, we believe that the following holds true.

###### Conjecture 1.3.

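Let the positive integers $n, m, r$ be such that $r \le n$. A hypergraph $H = (V, E)$ from $\mathcal{D}_{n,m,r}$ for which it holds

$$H(X_H) \ge H(X_G), \quad \text{for all } G \in \mathcal{D}_{n,m,r}$$

is such that

$$|\deg_H(v_1) - \deg_H(v_2)| \le 1, \quad \text{for all } v_1, v_2 \in V. \tag{1}$$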

In other words, the colours should be distributed over the dice as evenly as possible. In this note we provide an upper bound on the entropy of $X_H$, for $H \in \mathcal{D}_{n,m,r}$. More precisely, we obtain the following.

###### Theorem 1.4.

Fix $H \in \mathcal{D}_{n,m,r}$ and let $X_H$ be the number of vertices with non-zero in-degree after a random orientation on the edges of $H$. Then

$$H(X_H) \le \frac{1}{2}\log(n) + \frac{1}{2}\log(\pi e).$$

We prove Theorem 1.4 in Section 2. Moreover, we show that the bound is tight, up to an additive constant, when $m = n$ and $r = 2$. That is, we show that there exist graphs $G \in \mathcal{D}_{n,n}$ for which $H(X_G)$ is of order $\frac{1}{2}\log(n)$. In fact, one can find several graphs $G$ for which the entropy of $X_G$ is of order $\frac{1}{2}\log(n)$. For example, if $n$ is even, one can take $G$ to be the union of $n/2$ vertex-disjoint double edges. The degree sequence of $G$ satisfies (1), but $G$ is not connected. An example of a connected graph $G$ for which $X_G$ has entropy of order $\frac{1}{2}\log(n)$ is a star on $n$ vertices, plus an edge joining any two of its vertices; however, this graph does not satisfy (1). It would therefore be interesting to know whether there exist connected graphs $G$, whose degree sequence satisfies (1), and are such that $H(X_G)$ is of order $\frac{1}{2}\log(n)$. The next result shows that cycles on $n$ vertices are examples of such graphs. Moreover, it is not difficult to see that the entropy corresponding to a cycle is slightly larger than the entropies corresponding to the aforementioned examples.

###### Theorem 1.5.

Let $C_n$ denote a cycle on $n$ vertices. Let $X_{C_n}$ be the number of vertices with non-zero in-degree after a random orientation on the edges of $C_n$. Then

$$H(X_{C_n}) \ge \frac{1}{2}\log(n) + \frac{1}{2}\log(\pi e) - \frac{3}{2} - \frac{1}{2\ln(2)(n-1)}.$$

We conjecture that $H(X_{C_n}) \ge H(X_G)$ for all $G \in \mathcal{D}_{n,n}$. We prove Theorem 1.5 in Section 3. The proof employs a lower bound on the entropy of a binomial random variable whose outcome is conditioned to be an even integer. Our article ends with Section 4 in which we state a conjecture.

## 2 Proof of Theorem 1.4

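We show that $\mathrm{Var}(X_H) \le n/4$. The result then follows from Theorem 1.1. For every vertex $v \in V$, let $I_v$ be the indicator of the event $\{\deg^-_H(v) > 0\}$ and notice that $X_H = \sum_{v \in V} I_v$. We may therefore write

$$\mathrm{Var}(X_H) = \sum_{v_1, v_2 \in V}\left(\mathbb{E}[I_{v_1} \cdot I_{v_2}] - \mathbb{E}[I_{v_1}] \cdot \mathbb{E}[I_{v_2}]\right) = \sum_{v \in V}\left(\mathbb{E}[I_v] - \mathbb{E}[I_v]^2\right) + \sum_{v_1 \ne v_2}\left(\mathbb{E}[I_{v_1} \cdot I_{v_2}] - \mathbb{E}[I_{v_1}] \cdot \mathbb{E}[I_{v_2}]\right).$$

We now show that, whenever $v_1 \ne v_2$, we have

$$\mathbb{E}[I_{v_1} \cdot I_{v_2}] - \mathbb{E}[I_{v_1}] \cdot \mathbb{E}[I_{v_2}] \le 0, \tag{2}$$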

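or, in other words, the indicators $I_{v_1}$ and $I_{v_2}$ are negatively correlated. This is clearly true when there exists no edge $e \in E$ such that $\{v_1, v_2\} \subseteq e$, since the two indicators are then independent, and we may therefore assume that $v_1$ and $v_2$ are both elements of some edge in $E$. Let $E_1$ be the class consisting of those edges in $E$ that contain $v_1$ and do not contain $v_2$ and, similarly, let $E_2$ be the class consisting of those edges in $E$ that contain $v_2$ and do not contain $v_1$. Finally, let $E_3$ be the subset of the edges in $E$ that contain both $v_1$ and $v_2$. Notice that

$$\mathbb{E}[I_{v_1}] \cdot \mathbb{E}[I_{v_2}] = \left(1 - \left(\frac{r-1}{r}\right)^{\deg_H(v_1)}\right) \cdot \left(1 - \left(\frac{r-1}{r}\right)^{\deg_H(v_2)}\right)$$

and we proceed by working out the term $\mathbb{E}[I_{v_1} \cdot I_{v_2}]$.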

Let $A_1$ be the event "there is no edge in $E_3$ which is oriented towards either $v_1$ or $v_2$", let $A_2$ be the event "no edge from $E_3$ is oriented towards $v_2$, but some edge from $E_3$ is oriented towards $v_1$", let $A_3$ be the event "no edge from $E_3$ is oriented towards $v_1$, but some edge from $E_3$ is oriented towards $v_2$", and, finally, let $A_4$ be the event "some edge from $E_3$ is oriented towards $v_1$ and some other edge from $E_3$ is oriented towards $v_2$". In particular, notice that the event $A_4$ has non-zero probability if and only if $|E_3| \ge 2$. If $|E_3| = 1$, then $A_4$ is empty and so $\mathbb{P}(A_4) = 0$. Moreover, when $r = 2$ the event $A_1$ is empty as well; thus $\mathbb{P}(A_1) = 0$.

Now we may write

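$$\mathbb{E}[I_{v_1} \cdot I_{v_2}] = \sum_{i=1}^{4} \mathbb{P}(A_i) \cdot \mathbb{P}(I_{v_1} \cdot I_{v_2} = 1 \mid A_i).$$

Let $d_1 = |E_1|$, $d_2 = |E_2|$ and $d_3 = |E_3|$ and notice that $\deg_H(v_1) = d_1 + d_3$ and $\deg_H(v_2) = d_2 + d_3$. We compute

$$\mathbb{P}(A_1) \cdot \mathbb{P}(I_{v_1} \cdot I_{v_2} = 1 \mid A_1) = \left(\frac{r-2}{r}\right)^{d_3} \cdot \left(1 - \left(\frac{r-1}{r}\right)^{d_1}\right) \cdot \left(1 - \left(\frac{r-1}{r}\right)^{d_2}\right).$$

Moreover, since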

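$$\mathbb{P}(A_2) = \mathbb{P}(A_3) = \sum_{i=1}^{d_3} \binom{d_3}{i}\left(\frac{1}{r}\right)^{i}\left(\frac{r-2}{r}\right)^{d_3 - i},$$

the binomial theorem yields

$$\mathbb{P}(A_2) \cdot \mathbb{P}(I_{v_1} \cdot I_{v_2} = 1 \mid A_2) = \frac{1}{r^{d_3}}\left((r-1)^{d_3} - (r-2)^{d_3}\right) \cdot \left(1 - \left(\frac{r-1}{r}\right)^{d_2}\right)$$

as well as

$$\mathbb{P}(A_3) \cdot \mathbb{P}(I_{v_1} \cdot I_{v_2} = 1 \mid A_3) = \frac{1}{r^{d_3}}\left((r-1)^{d_3} - (r-2)^{d_3}\right) \cdot \left(1 - \left(\frac{r-1}{r}\right)^{d_1}\right).$$

Finally, notice that

$$\mathbb{P}(A_4) \cdot \mathbb{P}(I_{v_1} \cdot I_{v_2} = 1 \mid A_4) = \mathbb{P}(A_4) = 1 - \mathbb{P}(A_1) - \mathbb{P}(A_2) - \mathbb{P}(A_3).$$

Now straightforward calculations show that

$$\mathbb{E}[I_{v_1} \cdot I_{v_2}] - \mathbb{E}[I_{v_1}] \cdot \mathbb{E}[I_{v_2}] = \left(\frac{r-2}{r}\right)^{d_3}\left(\frac{r-1}{r}\right)^{d_1 + d_2} - \left(\frac{r-1}{r}\right)^{d_1 + d_2 + 2d_3} \le 0,$$

where the final inequality holds because $r(r-2) \le (r-1)^2$.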

Thus (2) holds true and we have shown

$$\mathrm{Var}(X_H) \le \sum_{v \in V}\left(\mathbb{E}[I_v] - \mathbb{E}[I_v]^2\right).$$
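The closed form for the covariance of two indicators can be checked by brute force on small cases. The sketch below is an illustration under stated assumptions: it builds a small $r$-uniform multi-hypergraph in which $d_1$ edges contain only $v_1$, $d_2$ edges contain only $v_2$ and $d_3$ edges contain both (with $v_1 = 0$, $v_2 = 1$ and auxiliary filler vertices), enumerates every orientation with exact rational arithmetic, and compares the result with the closed-form expression. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def covariance_by_enumeration(r, d1, d2, d3):
    """E[I1*I2] - E[I1]*E[I2] for vertices v1 = 0 and v2 = 1, by enumeration."""
    fillers = tuple(range(2, r + 1))                 # r - 1 auxiliary vertices
    edges = ([(0,) + fillers] * d1                   # edges containing v1 only
             + [(1,) + fillers] * d2                 # edges containing v2 only
             + [(0, 1) + fillers[:r - 2]] * d3)      # edges containing both
    total = hit1 = hit2 = hit_both = 0
    for heads in product(*edges):                    # one head vertex per edge
        a, b = 0 in heads, 1 in heads
        total += 1
        hit1 += a
        hit2 += b
        hit_both += a and b
    return (Fraction(hit_both, total)
            - Fraction(hit1, total) * Fraction(hit2, total))

def covariance_closed_form(r, d1, d2, d3):
    a = Fraction(r - 1, r)
    s = Fraction(r - 2, r)
    return s ** d3 * a ** (d1 + d2) - a ** (d1 + d2 + 2 * d3)

for r, d1, d2, d3 in [(2, 1, 1, 1), (3, 2, 1, 1), (3, 1, 2, 2), (4, 1, 1, 2)]:
    print((r, d1, d2, d3),
          covariance_by_enumeration(r, d1, d2, d3),
          covariance_closed_form(r, d1, d2, d3))
```

For each parameter choice the two printed fractions should coincide and be non-positive.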

Now write

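$$\mathbb{E}[I_v] - \mathbb{E}[I_v]^2 = \mathbb{E}[I_v]\left(1 - \mathbb{E}[I_v]\right) = \left(\frac{r-1}{r}\right)^{\deg_H(v)}\left(1 - \left(\frac{r-1}{r}\right)^{\deg_H(v)}\right) \le \frac{1}{4},$$

where the last estimate follows from the inequality $x(1-x) \le \frac{1}{4}$, for $x \in [0, 1]$. Hence $\mathrm{Var}(X_H) \le n/4$ and Theorem 1.4 follows from Theorem 1.1.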
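Both the variance estimate and the entropy bound of Theorem 1.4 can be observed numerically on small hypergraphs. The following sketch enumerates all orientations of two hypothetical examples (a 6-cycle and a 3-uniform hypergraph on 5 vertices, chosen here purely for illustration) and prints the variance next to $n/4$ and the entropy next to the bound.

```python
import math
from collections import Counter
from itertools import product

def pmf_of_x(edges):
    """Exact pmf of the number of vertices with non-zero in-degree."""
    counts = Counter(len(set(heads)) for heads in product(*edges))
    total = sum(counts.values())
    return {x: c / total for x, c in counts.items()}

def variance(pmf):
    mean = sum(x * p for x, p in pmf.items())
    return sum(p * (x - mean) ** 2 for x, p in pmf.items())

def entropy_bits(pmf):
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def theorem_1_4_bound(n):
    return 0.5 * math.log2(n) + 0.5 * math.log2(math.pi * math.e)

# name -> (number of vertices n, list of edges); vertices are labelled 0..n-1.
examples = {
    "6-cycle (r = 2)": (6, [(i, (i + 1) % 6) for i in range(6)]),
    "3-uniform, n = 5": (5, [(0, 1, 2), (1, 2, 3), (2, 3, 4), (0, 3, 4)]),
}

for name, (n, edges) in examples.items():
    pmf = pmf_of_x(edges)
    print(name, variance(pmf), n / 4, entropy_bits(pmf), theorem_1_4_bound(n))
```

Note that $n$ counts all vertices, including isolated ones, so it is passed separately from the edge list.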

## 3 Proof of Theorem 1.5

Throughout this section, we denote by $\mathrm{Bin}(n, p)$ the binomial distribution of parameters $n$ and $p$ and we occasionally identify a random variable with its distribution. Given two random variables $X, Y$, the notation $X \sim Y$ indicates that they have the same distribution. Recall that we work within the class $\mathcal{D}_{n,n}$ and that, given $G \in \mathcal{D}_{n,n}$, we denote by $X_G$ the number of vertices with non-zero in-degree after a random orientation on the edges of $G$. We regard zero as an even integer. Finally, $\mathrm{Bin}(n, e)$ denotes a $\mathrm{Bin}(n, 1/2)$ random variable conditional on the event that it is even. In particular, notice that

$$\mathbb{P}[\mathrm{Bin}(n, e) = k] = \binom{n}{k}\frac{1}{2^{n-1}}, \quad \text{for even } k,$$

a fact that is immediate upon observing that the probability that a $\mathrm{Bin}(n, 1/2)$ random variable is even equals $1/2$ (see also [6, Lemma 1] for a more general result). We begin by showing that the entropy of $X_{C_n}$, where $C_n$ denotes a cycle on $n$ vertices, equals the entropy of a $\mathrm{Bin}(n, e)$ random variable. The following result is a direct consequence of [7, Theorem 4], but we include here an independent proof (which borrows ideas from [7]) for the sake of completeness.

###### Lemma 3.1.

Let $C_n$ be a cycle on $n$ vertices. Then $H(X_{C_n}) = H(\mathrm{Bin}(n, e))$.

###### Proof.

Let $Z$ be the number of vertices with zero in-degree after the random orientation on the edges of $C_n$, and let $E$ be the number of vertices of even in-degree. Notice that $X_{C_n} = n - Z$ and thus

$$H(X_{C_n}) = H(Z).$$

For $i \in \{1, 2\}$, let $V_i = \{v \in C_n : \deg^-_{C_n}(v) = i\}$. The in-degree sum formula yields

$$n = \sum_{v \in C_n} \deg^-_{C_n}(v) = |V_1| + 2|V_2|.$$

Since $C_n$ has $n$ vertices, we also have

$$n = Z + |V_1| + |V_2|.$$

If we subtract the last two equations we get $Z = |V_2|$ and thus, since $E = Z + |V_2|$, we conclude

$$E = 2 \cdot Z.$$

In particular, this implies that $E$ is even and

$$H(Z) = H(E).$$

Hence it is enough to determine the entropy of $E$. We claim that

$$\mathbb{P}(E = 2k) = \binom{n}{2k}\frac{1}{2^{n-1}}, \quad \text{for } 0 \le k \le n/2.$$

This is clearly true for $k = 0$, so assume that $k$ is such that $1 \le k \le n/2$. Now notice that, if we fix an orientation of the edges of $C_n$ for which $E = 2k$, then between any two vertices with zero in-degree there exists a vertex whose in-degree equals two and, conversely, between any two vertices with in-degree equal to two there exists a vertex whose in-degree equals zero. This implies that, if we fix $2k$ vertices $v_1, \ldots, v_{2k}$, listed in cyclic order, whose in-degree is even, and we fix the in-degree of vertex $v_1$, say $\deg^-_{C_n}(v_1) = 0$, then the in-degrees of all other vertices are determined (that is, $\deg^-_{C_n}(v_j) = 2$ for even $j$, $\deg^-_{C_n}(v_j) = 0$ for odd $j$, and all other vertices have in-degree one). Since there are two ways to choose the in-degree of $v_1$, the claim follows. Summarising, we have shown that $E$ has the same distribution as a $\mathrm{Bin}(n, e)$ random variable and so their entropies are equal. ∎
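The claim of Lemma 3.1 can be confirmed by exhaustive enumeration for small cycles. In the sketch below the cycle has vertices $0, \ldots, n-1$, edge $i$ joins vertices $i$ and $i+1 \pmod n$, and each of the $2^n$ orientations is encoded as a bit string; this encoding and the function names are choices made for the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from math import comb

def even_indegree_distribution(n):
    """Exact pmf of the number of even in-degree vertices of a randomly oriented n-cycle."""
    counts = Counter()
    for bits in product((0, 1), repeat=n):
        indeg = [0] * n
        # Bit 0 orients edge i towards vertex i, bit 1 towards vertex i + 1 (mod n).
        for i, b in enumerate(bits):
            indeg[(i + b) % n] += 1
        counts[sum(d % 2 == 0 for d in indeg)] += 1
    return {e: Fraction(c, 2 ** n) for e, c in sorted(counts.items())}

def parity_conditioned_binomial(n):
    """pmf of a Bin(n, 1/2) variable conditioned on being even."""
    return {k: Fraction(comb(n, k), 2 ** (n - 1)) for k in range(0, n + 1, 2)}

for n in (3, 4, 7):
    print(n, even_indegree_distribution(n) == parity_conditioned_binomial(n))
```

For $n = 3$, for instance, both computations give probability $1/4$ for zero even-degree vertices and $3/4$ for two.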

Using Lemma 3.1, we obtain a lower bound on the entropy of $X_{C_n}$ which is expressed in terms of the entropy of a binomial random variable.

###### Lemma 3.2.

Let $C_n$ be a cycle on $n$ vertices. Then

$$H(X_{C_n}) \ge H(\mathrm{Bin}(n-1, 1/2)) - 1.$$

###### Proof.

Lemma 3.1 implies that it is enough to show

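$$H(\mathrm{Bin}(n, e)) \ge H(\mathrm{Bin}(n-1, 1/2)) - 1.$$

We first show that an outcome of $\mathrm{Bin}(n, e)$ can be obtained as follows: First draw from a $\mathrm{Bin}(n-1, 1/2)$ random variable. If the outcome is even, then add $0$ to the outcome. If the outcome is odd, then add $1$. Formally, let $X \sim \mathrm{Bin}(n-1, 1/2)$ and define the random variable $\delta_X$ by

$$\delta_X = \begin{cases} 0, & \text{if } X \text{ is even}, \\ 1, & \text{if } X \text{ is odd}. \end{cases}$$

We claim that $X + \delta_X \sim \mathrm{Bin}(n, e)$. To prove the claim, first notice that $X + \delta_X$ is always even and fix an even integer $k$ from $\{0, 1, \ldots, n\}$. Then, using the convention $\binom{a}{b} = 0$, whenever $b < 0$ or $b > a$, and the relation $\binom{n-1}{k} + \binom{n-1}{k-1} = \binom{n}{k}$, we may write

$$\mathbb{P}(X + \delta_X = k) = \mathbb{P}(\mathrm{Bin}(n-1, 1/2) = k) + \mathbb{P}(\mathrm{Bin}(n-1, 1/2) = k-1) = \binom{n}{k}\frac{1}{2^{n-1}} = \mathbb{P}(\mathrm{Bin}(n, e) = k),$$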

and the claim follows. Hence it is enough to show

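$$H(X + \delta_X) \ge H(\mathrm{Bin}(n-1, 1/2)) - 1.$$

Now, using the chain rule for entropy, we have

$$H(X + \delta_X, X) = H(X) + H(X + \delta_X \mid X) = H(X + \delta_X) + H(X \mid X + \delta_X)$$

and therefore

$$H(X + \delta_X) = H(X) + H(X + \delta_X \mid X) - H(X \mid X + \delta_X) = H(\mathrm{Bin}(n-1, 1/2)) + H(X + \delta_X \mid X) - H(X \mid X + \delta_X),$$

which in turn implies that it is enough to show

$$H(X + \delta_X \mid X) - H(X \mid X + \delta_X) \ge -1.$$

Since $X$ determines $X + \delta_X$, we have $H(X + \delta_X \mid X) = 0$ and we are left with showing

$$H(X \mid X + \delta_X) \le 1.$$

To this end, write

$$H(X \mid X + \delta_X) = \sum_{k \text{ even}} H(X \mid X + \delta_X = k) \cdot \mathbb{P}(X + \delta_X = k).$$

Now, conditional on the event $\{X + \delta_X = k\}$, it follows that $X \in \{k-1, k\}$ and therefore

$$H(X \mid X + \delta_X = k) \le \log 2 = 1.$$

This yields

$$H(X \mid X + \delta_X) \le \sum_{k \text{ even}} \mathbb{P}(X + \delta_X = k) = 1,$$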

as desired. ∎
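The coupling used in this proof, and the resulting inequality, can be checked numerically. The sketch below rounds each outcome of a $\mathrm{Bin}(n-1, 1/2)$ variable up to the nearest even integer, compares the resulting distribution with that of a $\mathrm{Bin}(n, 1/2)$ variable conditioned to be even, and then compares the two entropies. The function names are assumptions made for this illustration.

```python
from fractions import Fraction
from math import comb, log2

def binomial_half(n):
    """pmf of Bin(n, 1/2) with exact rational probabilities."""
    return {k: Fraction(comb(n, k), 2 ** n) for k in range(n + 1)}

def parity_conditioned_binomial(n):
    """pmf of Bin(n, 1/2) conditioned on being even."""
    return {k: Fraction(comb(n, k), 2 ** (n - 1)) for k in range(0, n + 1, 2)}

def round_up_to_even(n):
    """pmf of X + delta_X, where X ~ Bin(n - 1, 1/2) and delta_X = X mod 2."""
    out = {}
    for x, p in binomial_half(n - 1).items():
        k = x + (x % 2)
        out[k] = out.get(k, 0) + p
    return out

def entropy_bits(pmf):
    return -sum(float(p) * log2(float(p)) for p in pmf.values() if p > 0)

for n in (2, 5, 20):
    same = round_up_to_even(n) == parity_conditioned_binomial(n)
    gap = entropy_bits(parity_conditioned_binomial(n)) - entropy_bits(binomial_half(n - 1))
    print(n, same, gap)
```

The printed gap is the difference $H(\mathrm{Bin}(n, e)) - H(\mathrm{Bin}(n-1, 1/2))$, which by Lemma 3.2 is at least $-1$.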

The proof of Theorem 1.5 is almost complete.

###### Proof of Theorem 1.5.

It is known (see [1, p. 4]) that

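$$H(\mathrm{Bin}(n-1, 1/2)) \ge \frac{1}{2}\log(n-1) + \frac{1}{2}\log(\pi e) - \frac{1}{2}.$$

Hence, using the inequality $\log(n-1) \ge \log(n) - \frac{1}{\ln(2)(n-1)}$, we conclude

$$H(\mathrm{Bin}(n-1, 1/2)) \ge \frac{1}{2}\log(n) + \frac{1}{2}\log(\pi e) - \frac{1}{2} - \frac{1}{2\ln(2)(n-1)}$$

and Lemma 3.2 yields

$$H(X_{C_n}) \ge \frac{1}{2}\log(n) + \frac{1}{2}\log(\pi e) - \frac{3}{2} - \frac{1}{2\ln(2)(n-1)}.$$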

The result follows. ∎
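To see how the lower bound of Theorem 1.5 compares with the actual entropy, one can evaluate both sides for a range of $n$. The sketch below computes $H(X_{C_n})$ through Lemma 3.1, i.e., as the entropy of a $\mathrm{Bin}(n, 1/2)$ variable conditioned to be even; the function names and the chosen values of $n$ are illustrative.

```python
import math

def parity_conditioned_binomial(n):
    """pmf of Bin(n, 1/2) conditioned on being even."""
    return {k: math.comb(n, k) / 2 ** (n - 1) for k in range(0, n + 1, 2)}

def entropy_bits(pmf):
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

def cycle_entropy(n):
    """Entropy of X for the n-cycle, via Lemma 3.1."""
    return entropy_bits(parity_conditioned_binomial(n))

def theorem_1_5_bound(n):
    return (0.5 * math.log2(n) + 0.5 * math.log2(math.pi * math.e)
            - 1.5 - 1 / (2 * math.log(2) * (n - 1)))

for n in (3, 10, 100):
    print(n, cycle_entropy(n), theorem_1_5_bound(n))
```

For the values tested here the entropy lies above the bound, and the difference between the two shrinks as $n$ grows.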

## 4 A conjecture

In this section we define a hypergraph from the class $\mathcal{D}_{n,n,r}$ which, we believe, maximizes the entropy of the number of vertices of non-zero in-degree, after a random orientation on its edges.

Fix a positive integer $r$. A circular $r$-uniform hypergraph on $n$ vertices is defined as follows: Begin with a cycle $C_n$ on $n$ vertices, identified with $\{1, \ldots, n\}$. A proper subset $P \subset \{1, \ldots, n\}$ is a path in $C_n$ if it induces a connected sub-graph of the graph $C_n$. The size of a path is the number of vertices in the corresponding induced sub-graph. We call a hypergraph circular if (up to isomorphism) its set of vertices is equal to $\{1, \ldots, n\}$ and its edges are paths of $C_n$ of size $r$. We denote by $C_{n,r}$ the circular hypergraph on $n$ vertices whose edge set consists of all paths of size $r$. Circular hypergraphs have attracted some attention due to the fact that they share similar properties with certain classes of matrices (see [8, 9]). Recall that $X_{C_{n,r}}$ denotes the number of vertices with non-zero in-degree, after a random orientation on the edges of $C_{n,r}$.
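A circular hypergraph whose edges are all paths of size $r$ is easy to construct explicitly. The sketch below labels the vertices $0, \ldots, n-1$ (an assumption made here for convenient modular arithmetic) and returns one edge per starting vertex; the function name is hypothetical.

```python
def circular_hypergraph(n, r):
    """All paths of r consecutive vertices on the cycle with vertices 0..n-1."""
    if not 1 <= r < n:
        raise ValueError("need 1 <= r < n so that every path is a proper subset")
    # The path starting at `start` covers start, start + 1, ..., start + r - 1 (mod n).
    return [tuple((start + j) % n for j in range(r)) for start in range(n)]

edges = circular_hypergraph(5, 3)
print(edges)
```

For $n = 5$ and $r = 3$ this yields the five edges $(0,1,2)$, $(1,2,3)$, $(2,3,4)$, $(3,4,0)$, $(4,0,1)$. In general there are $n$ edges and every vertex lies in exactly $r$ of them, so the hypergraph has $n$ vertices, $n$ edges, and a degree sequence that satisfies condition (1).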

We conjecture that circular hypergraphs are such that

$$H(X_{C_{n,r}}) \ge H(X_H), \quad \text{for all } H \in \mathcal{D}_{n,n,r}.$$

Moreover, we believe that the following holds true.

###### Conjecture 4.1.

Fix a positive integer $r$. Then for all $n > r$, it holds

$$H(X_{C_{n,r}}) \ge \frac{1}{2}\log(n/r) - O(1).$$

## References

• [1] J.A. Adell, A. Lekuona, Y. Yu. Sharp Bounds on the Entropy of the Poisson Law and Related Quantities, IEEE Transactions on Information Theory 56 (2010) 2299 – 2306.
• [2] T. M. Cover, J. A. Thomas. Elements of information theory, 2nd Edition, New York, Wiley, 2006.
• [3] O. Johnson, Information Theory and the Central Limit Theorem, Imperial College Press, 2004.
• [4] J.L. Massey, On the entropy of integer-valued random variables, In Proc. 1988 Beijing Int. Workshop on Information Theory, pages C1.1-C1.4, July 1988.
• [5] B.H. Mow. A tight upper bound on discrete entropy, IEEE Transactions on Information Theory 44 (2) (1998) 775 – 778.
• [6] C. Pelekis. Bernoulli trials of fixed parity, random and randomly oriented graphs, Graphs and Combinatorics 32 (4) (2016) 1521–1544.
• [7] C. Pelekis, M. Schauer. Network Coloring and Colored Coin Games. In: Search Theory: A Game-Theoretic Perspective, Alpern S., et al. (eds), Springer, New York, NY, (2013).
• [8] A. Quilliot, Circular representation problem on hypergraphs, Discrete Mathematics 51, Issue 3, p. 251–264, (1984).
• [9] A. Tucker, Matrix characterizations of circular-arc graphs, Pacific Journal of Mathematics, Volume 39, Number 2, p. 535–545, (1971).