Let $s$ denote the binary sum-of-digits function of a nonnegative integer $n$, that is, the number of times the digit $1$ occurs in the binary expansion of $n$. Since $s$ is increasing on average, it is natural to expect that
$$s(n+t) \ge s(n)$$
is a rather probable event. More precisely, T. W. Cusick asked (personal communication, 2012) whether the asymptotic densities
$$c_t = \operatorname{dens}\{n \ge 0 : s(n+t) \ge s(n)\}$$
satisfy, for all integers $t \ge 0$,
$$c_t > 1/2. \qquad (1)$$
Here and in what follows, $\operatorname{dens} A$ denotes the asymptotic density of a set $A \subseteq \mathbb{N}$. It will become clear later, see equation (8), that the density exists in our case. This question arose while Cusick was working on a similar combinatorial problem proposed by Tu and Deng, related to Boolean functions with desirable cryptographic properties, and the results of his work on that problem at that time have been published.
Concerning his question, Cusick “acquired more confidence in it over time” and consequently “would now refer to the question as a conjecture” (personal communication, September 23, 2015). Although it is quite easy to compute for every fixed (see Section 2), the full statement could not be tackled so far. Our numerical experiments show that (1) holds at least for all , which is a quite good support for Cusick’s conjecture. Moreover, by the same method we computed
for , and the result of this computation suggests that
should hold for all , which increases the significance of the original question.
The main result of this paper is the following asymptotic statement, which gives a positive answer to Cusick’s conjecture for almost all integers but also shows that the bound is tight. Moreover, this theorem gives analogous results concerning .
For any we have
as . In particular, holds for in a subset of of asymptotic density .
The proof is based on an appropriate averaging argument. More precisely we study the distribution of and for and show, using Chebyshev’s inequality, that the values of (resp. ) concentrate well above (resp. below)
. While the average value is relatively easy to handle, the computation of the variance relies on the asymptotic analysis of diagonals of a trivariate generating function, which is the most difficult step of the proof.
However, while this theorem shows that there exist many increasing sequences of integers such that , it does not give any concrete example of such a sequence. Of course, by the relation the sequence has this property, but this is admittedly not an interesting example.
We exhibit a more interesting sequence with this property. As it turns out, the sequence we are going to define even has the property that
from above, and we give a more precise asymptotic estimate of these values.
Let and (which has the binary representation for ). Then
Moreover, holds for all .
In the proof of this statement we will again make use of a diagonal of a multivariate generating function, but in this case two variables suffice and extracting the asymptotics is much easier than in the proof of Theorem 1.1.
2. An auxiliary lemma
The following lemma is an extension of the “Lemma of Bésineau” [3, Lemme 1] (note that we only handle the sum-of-digits function in base $2$, although an analogous statement holds for larger bases). It establishes the fundamental two-dimensional recurrence relation that we will use throughout this paper.
Let be an integer. There exists a partition of the set of nonnegative integers having the properties that
Each class is a residue class modulo for some , that is, it is of the form , where .
For all integers the set
is a finite (possibly empty) union of classes from the partition .
In particular, each of the sets possesses an asymptotic density . Moreover, for all and the densities satisfy the following recurrence relation:
We set . For all we have
which follows easily from the elementary property for . Moreover, we have
which follows by writing with and noting that . (We write to denote the exponent of the prime in the prime factorization of .) We prove the statements by induction on . In the case equation (4) implies
since the set of nonnegative exactly divisible by equals . This implies the first line of (3). Let be even, , and . Then
which is by the induction hypothesis a finite union of arithmetic progressions of the form . If
is odd,, we get by analogous reasoning
The recurrence relation for the densities allows us to compute these densities for any given value of . In Table 1 we list some values of the double family , omitting zeros for more clarity. (The rows are indexed by and the columns by .)
(Alternatively, we can also use equation (10) from below, which implies this statement in the form .) Therefore is a finite sum of values : if , we have
The first few values of are therefore all of which are clearly greater than . As already mentioned, a numerical experiment conducted by the authors, using the two-dimensional recurrence relation, reveals that in fact for all . (Note that in order to compute the -th column of , where , we only have to keep track of two adjacent columns with indices and as runs from to . Moreover, only odd have to be taken into account. This can be implemented in a quite efficient way, and the calculation only took a couple of hours on a standard machine.) The minimal value of for in this range is attained at the integer and at the integer obtained by reversing the base- representation of . (In fact, holds for all and , see the article  by Morgenbesser and the third author, which as of 2015 seems to be the only published result on the values .) The value of at these positions equals . Moreover, as we noted in the introduction, the values
which only differ by from , seem to satisfy for all .
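The two-dimensional recurrence lends itself to a short exact computation. The following sketch is not the authors' implementation. It assumes the reading $\delta(k,t) = \operatorname{dens}\{n : s(n+t) - s(n) = k\}$ and uses the relations that follow directly from $s(2n) = s(n)$ and $s(2n+1) = s(n)+1$, namely $\delta(k,2t) = \delta(k,t)$ and $\delta(k,2t+1) = \tfrac12\delta(k-1,t) + \tfrac12\delta(k+1,t+1)$, together with $\delta(k,1) = 2^{k-2}$ for $k \le 1$ and $\delta(k,1) = 0$ otherwise. The names `delta` and `c` are illustrative.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def delta(k: int, t: int) -> Fraction:
    """Assumed meaning: density of {n >= 0 : s(n+t) - s(n) = k}, for t >= 0."""
    if t == 0:
        return Fraction(1 if k == 0 else 0)
    if t == 1:
        # s(n+1) - s(n) = 1 - nu_2(n+1), and {n : nu_2(n+1) = j} has density 2^-(j+1).
        return Fraction(1, 2 ** (2 - k)) if k <= 1 else Fraction(0)
    if t % 2 == 0:
        # n and n + t have the same lowest bit, so that bit can be dropped.
        return delta(k, t // 2)
    u = t // 2  # t = 2u + 1 with u >= 1
    # Split n into even (n = 2m) and odd (n = 2m + 1) residues.
    return (delta(k - 1, u) + delta(k + 1, u + 1)) / 2


def c(t: int) -> Fraction:
    """Density of {n : s(n+t) >= s(n)}, summing delta(k, t) over k >= 0."""
    # s(n+t) - s(n) never exceeds s(t), so the sum is finite.
    max_k = bin(t).count("1")
    return sum((delta(k, t) for k in range(0, max_k + 1)), Fraction(0))


if __name__ == "__main__":
    for t in range(1, 9):
        print(t, c(t))
```

Under these assumptions, `c(1)`, `c(2)` and `c(3)` evaluate to 3/4, 3/4 and 11/16. Exact rationals are used so that the comparison with 1/2 is not affected by rounding.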
3. Equivalent formulations
There are several equivalent formulations of Cusick’s problem. In this section we present some of them.
3.1. Rising factorials
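Summing (4) from $n$ to $n+t-1$ yields
$$s(n+t) - s(n) = t - \nu_2\bigl((n+1)_t\bigr), \qquad (8)$$
where $(x)_t = x(x+1)\cdots(x+t-1)$ denotes the Pochhammer symbol (or “rising factorial”). (We note that (8) is essentially the special case $p = 2$ of the formula $\nu_p(n!) = (n - s_p(n))/(p-1)$ due to Legendre, involving the sum of digits $s_p$ in prime base $p$.) It follows that $s(n+t) \ge s(n)$ if and only if $\nu_2\bigl((n+1)_t\bigr) \le t$. Since the latter condition is periodic in $n$ with period $2^{t+1}$, the existence of the limit in the definition of $c_t$ follows immediately. Writing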
property (1) is equivalent to , that is, to
This reformulation naturally invites generalizations; we therefore pose the following informal problem, although we do not pursue it in the present article.
Find classes of polynomials of degree such that
Cusick’s question is an instance of this problem, taking the polynomials $X(X+1)\cdots(X+t-1)$, which should then have fewer than $2^t$ zeros in $\mathbb{Z}/2^{t+1}\mathbb{Z}$.
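The reformulation in terms of rising factorials can be checked numerically for small $t$. The sketch below assumes the identity $s(n+t) - s(n) = t - \nu_2((n+1)(n+2)\cdots(n+t))$ and the period $2^{t+1}$ discussed in this subsection. The helper names (`s`, `nu2`, `rising`, `c_by_period`) are illustrative and do not come from the paper.

```python
from fractions import Fraction


def s(n: int) -> int:
    """Binary sum of digits of n >= 0."""
    return bin(n).count("1")


def nu2(m: int) -> int:
    """Exponent of 2 in the prime factorization of m > 0."""
    return (m & -m).bit_length() - 1


def rising(x: int, t: int) -> int:
    """Rising factorial x (x+1) ... (x+t-1); the empty product is 1."""
    result = 1
    for i in range(t):
        result *= x + i
    return result


def c_by_period(t: int) -> Fraction:
    """Density of {n : nu_2((n+1)_t) <= t}, counted over one period 2^(t+1)."""
    period = 2 ** (t + 1)
    good = sum(1 for n in range(period) if nu2(rising(n + 1, t)) <= t)
    return Fraction(good, period)


if __name__ == "__main__":
    # Check the identity on a small grid, then print the densities.
    for t in range(1, 8):
        for n in range(64):
            assert s(n + t) - s(n) == t - nu2(rising(n + 1, t))
        print(t, c_by_period(t))
```

Counting over a full period is exponential in $t$, so this is only a sanity check for small values, not a practical way to reach large $t$.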
3.2. Columns in Pascal’s triangle
Combining (8) with the special case $n = 0$, we obtain the identity
$$s(n+t) - s(n) = s(t) - \nu_2\binom{n+t}{t}. \qquad (10)$$
Therefore we get a reformulation of (1) as a problem on columns in Pascal’s triangle, namely that
As before, the condition defining the set on the left hand side is periodic with period ; we will see later that the smallest period is in fact much smaller.
analogous to (11). Therefore, assuming that , the integer is the largest exponent such that at least half of the entries in the -th column of Pascal’s triangle are divisible by .
Finally, we note that $\nu_2\binom{n+t}{t}$ equals the number of carries that occur when adding $n$ to $t$ in base $2$ (see Kummer).
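Kummer's carry-counting statement and the resulting link to the sum of digits can be illustrated as follows. This is a minimal sketch that assumes the identity $s(n+t) - s(n) = s(t) - \nu_2\binom{n+t}{t}$. The function names are illustrative.

```python
from math import comb


def s(n: int) -> int:
    """Binary sum of digits of n >= 0."""
    return bin(n).count("1")


def nu2(m: int) -> int:
    """Exponent of 2 in the prime factorization of m > 0."""
    return (m & -m).bit_length() - 1


def carries(a: int, b: int) -> int:
    """Number of carries produced when adding a and b in base 2."""
    count = 0
    carry = 0
    while a or b:
        # A carry leaves this position iff the bit sum is at least 2.
        carry = ((a & 1) + (b & 1) + carry) >> 1
        count += carry
        a >>= 1
        b >>= 1
    return count


if __name__ == "__main__":
    for t in range(32):
        for n in range(32):
            k = carries(n, t)
            assert nu2(comb(n + t, t)) == k      # Kummer
            assert s(n + t) - s(n) == s(t) - k   # identity for s
    print("checked all n, t < 32")
```

In particular, $s(n+t) \ge s(n)$ holds exactly when the addition $n + t$ produces at most $s(t)$ carries.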
3.3. Rows in Pascal’s triangle
Questions on rows in Pascal’s triangle modulo powers of primes have received some attention in the literature. We refer to Barat and Grabner  and Rowland  and the references contained in these articles. In the article  the numbers
are studied (where means ), while the article  works with the closely related expression
where is a power of a prime. In both articles the corresponding integers are expressed in terms of polynomials involving block digital functions. In  explicit expressions for some prime powers are computed. For example, we have
In these formulas, is the number of times the finite word occurs as a subword in the binary representation of . (Note that .)
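To make the block-counting functions concrete, the following sketch counts (possibly overlapping) occurrences of a binary word in the binary representation of $n$. It then checks the classical modulus-$2$ case, usually attributed to Glaisher, that row $n$ of Pascal's triangle contains exactly $2^{s(n)}$ odd entries. The names `block_count` and `odd_entries_in_row` are illustrative, and using the representation without leading zeros is an assumption of this sketch.

```python
from math import comb


def block_count(n: int, w: str) -> int:
    """Occurrences (overlaps allowed) of the nonempty word w in the binary digits of n."""
    digits = bin(n)[2:] if n > 0 else ""  # no leading zeros; 0 has no digits
    return sum(
        1
        for i in range(len(digits) - len(w) + 1)
        if digits.startswith(w, i)
    )


def odd_entries_in_row(n: int) -> int:
    """Number of k in [0, n] with binomial(n, k) odd."""
    return sum(1 for k in range(n + 1) if comb(n, k) % 2 == 1)


if __name__ == "__main__":
    for n in range(64):
        # The count of the one-letter word "1" is the binary sum of digits.
        assert odd_entries_in_row(n) == 2 ** block_count(n, "1")
    print(block_count(0b110111, "11"))
```

Counting with overlaps matters for words such as `11`: in `111` the word occurs twice.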
The formulas for above, and also the case , had already been known before, see Glaisher  (), Carlitz  () and Howard  (). However, Rowland’s method allows (with increasing computational effort) to find an analogous expression for each modulus with prime and . Rowland also implemented this method in a Mathematica package called BinomialCoefficients, available from his website. Moreover, he proved  the following theorem, which is also contained implicitly in the older article .
Theorem (Rowland; Barat–Grabner).
Let be a prime and . Then is a polynomial of degree in , where ranges over the set of words in of length at most .
Moreover, Rowland notes that blocks such that or do not occur in this polynomial.
Surprisingly, these polynomials concerning the rows of Pascal’s triangle modulo powers of can also be used for the columns, which is due to the symmetry expressed by the identity
valid for integers and such that and . This formula can be proved easily via (10) and the identity that holds for . Moreover, Zabek [25, Theorem 3] proved (in particular) that the shortest period of the sequence equals , where and . (Note that this gives the minimal period for the set in (11).) In particular, this means that for . Writing
we obtain therefore
Moreover, for all blocks of length such that and (other blocks do not occur in the polynomials from the above theorem) we have
where . This is valid since the length of the most significant block of s in the binary representation of is at least . With the help of these observations and using the formula again we obtain
and so on. In particular, we obtain explicit formulas for for having a fixed sum of digits, since
see equation (11). Unfortunately we do not yet understand the coefficients of the polynomials well enough (for example, they always seem to be nonnegative, as remarked by Rowland ) in order to use these polynomials for deriving a proof of Cusick’s conjecture.
3.4. Hyperbinary expansions
There is an interesting connection between Cusick’s question and so-called hyperbinary expansions of a nonnegative integer that we would like to examine. We first define a “simplified array” related to
(by just changing the start vector of the recurrence). Define
In Table 2 we display some values of .
Note that by linearity the values can be recovered from the -th column of : we have
which is clearly valid for , and an easy induction yields the statement. Note moreover that
which is as easy to prove as the corresponding statement (7) for . Interestingly, the (combined) property that for all is implied by a statement on the quantity
According to our numerical experiments, we have for , and we suspect that this lower bound holds in general. Therefore the following lemma is of interest.
for all . Then holds for all .
Let and set . We prove that
for , from which one half of the statement of the lemma will follow immediately. Let and be integers. By the definition of we have , which, applied iteratively, implies that
It remains to treat the second half of the statement, concerning . To this end, we use the following symmetry property of the double family . For we define . Then for all and we have
We prove this by induction, the case that being trivial. The case follows from . Assume that . Then and . We obtain
This finishes the proof of the lemma. ∎
We note that and are not directly related to each other by an inequality. For example, we have and .
A hyperbinary expansion  of a nonnegative integer is a sequence such that . We call such an expansion proper if either or and . The following proposition connects these expansions to our problem.
For integers and let be the number of proper hyperbinary expansions of such that and . Then
As an example, we assume that . The proper hyperbinary expansions of are , and . These expansions correspond to and respectively and their weights, given by the factor , are and . This explains column in Table 2.
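Proper hyperbinary expansions of small integers are easy to enumerate. The sketch below assumes the usual definition: digits are taken from $\{0, 1, 2\}$, the digit at position $i$ has weight $2^i$, and an expansion is proper if it is empty or its leading digit is nonzero. It does not compute the weights appearing in the proposition. The function names are illustrative.

```python
from typing import List, Tuple


def proper_hyperbinary(n: int) -> List[Tuple[int, ...]]:
    """All proper hyperbinary expansions of n >= 0, most significant digit first."""
    if n == 0:
        return [()]  # only the empty expansion has no leading zero
    result = []
    for d in (0, 1, 2):
        # The last digit must have the same parity as n.
        if d <= n and (n - d) % 2 == 0:
            for prefix in proper_hyperbinary((n - d) // 2):
                result.append(prefix + (d,))
    return result


def value(expansion: Tuple[int, ...]) -> int:
    """Integer represented by a digit tuple (most significant digit first)."""
    return sum(d * 2 ** i for i, d in enumerate(reversed(expansion)))


if __name__ == "__main__":
    for n in range(8):
        print(n, proper_hyperbinary(n))
```

For example, the integer 4 has the three proper expansions `(1, 0, 0)`, `(2, 0)` and `(1, 2)`.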
Proof of Proposition 3.2. The integers satisfy the following recurrence relation, which can be proved easily by resorting to the definition of .
In order to prove (17), we proceed by induction. The statement is clearly valid for , and the case is a trivial consequence of the recurrences governing and . Moreover, we get for