On the Maximum Entropy of a Sum of Independent Discrete Random Variables
Let X_1, …, X_n be independent random variables taking values in the alphabet {0, 1, …, r}, and let S_n = X_1 + ⋯ + X_n. The Shepp–Olkin theorem states that, in the binary case (r = 1), the Shannon entropy of S_n is maximized when all the X_i are uniformly distributed, i.e., Bernoulli(1/2). In an attempt to generalize this theorem to arbitrary finite alphabets, we obtain a lower bound on the maximum entropy of S_n and prove that it is tight in several special cases. Beyond these special cases, we present an argument supporting the conjecture that the bound is in fact optimal for all n and r, i.e., that H(S_n) is maximized when X_1, …, X_{n-1} are uniformly distributed over {0, r}, while the probability mass function of X_n is a mixture (with explicitly defined non-zero weights) of the uniform distributions over {0, r} and {1, …, r-1}.
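The conjectured maximizer is concrete enough to evaluate numerically for small n and r. Since the abstract does not state the explicit mixture weight, the following sketch (in Python; the parameter values, helper names, and the sweep over the weight w are illustrative, and r ≥ 2 is assumed so that the interior set {1, …, r-1} is non-empty) treats w as a free parameter and computes H(S_n) exactly via convolution of the individual pmfs, which is valid because the X_i are independent.

```python
import numpy as np

def entropy_bits(p):
    """Shannon entropy (in bits) of a pmf given as a numpy array."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def pmf_of_sum(pmfs):
    """pmf of a sum of independent variables, via repeated convolution."""
    out = np.array([1.0])
    for p in pmfs:
        out = np.convolve(out, p)
    return out

n, r = 3, 2  # small illustrative case; requires r >= 2

# Uniform over the endpoints {0, r}, on the alphabet {0, 1, ..., r}.
u_ends = np.zeros(r + 1)
u_ends[0] = u_ends[r] = 0.5

# Uniform over the interior points {1, ..., r-1}.
u_mid = np.zeros(r + 1)
u_mid[1:r] = 1.0 / (r - 1)

# X_1, ..., X_{n-1} uniform over {0, r}; X_n a w-mixture of the two
# uniform laws. Sweep w and keep the largest entropy of S_n found.
best_h, best_w = max(
    (entropy_bits(pmf_of_sum([u_ends] * (n - 1)
                             + [(1 - w) * u_ends + w * u_mid])), w)
    for w in np.linspace(0.0, 1.0, 1001)
)
print(f"max H(S_n) over w: {best_h:.4f} bits at w = {best_w:.3f}")
```

The entropy found this way lower-bounds the true maximum of H(S_n) for the chosen n and r; the paper's claim is that, with its explicitly defined weight, this family already attains the maximum.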