 # On a Conjecture of Cusick Concerning the Sum of Digits of n and n + t

For a nonnegative integer $t$, let $c_t$ be the asymptotic density of natural numbers $n$ for which $s(n + t) \ge s(n)$, where $s(n)$ denotes the sum of digits of $n$ in base 2. We prove that $c_t > 1/2$ for $t$ in a set of asymptotic density 1, thus giving a partial solution to a conjecture of T. W. Cusick stating that $c_t > 1/2$ for all $t$. Interestingly, this problem has several equivalent formulations, for example that the polynomial $X(X + 1)\cdots(X + t - 1)$ has fewer than $2^t$ zeros modulo $2^{t+1}$. The proof of the main result is based on Chebyshev's inequality and the asymptotic analysis of a trivariate rational function, using methods from analytic combinatorics.


## 1. Introduction

Let $s(n)$ denote the binary sum-of-digits function of a nonnegative integer $n$, that is, the number of times the digit $1$ occurs in the binary expansion of $n$. Since $s$ is increasing on average, it is natural to expect that

$$s(n+t) \ge s(n)$$

is a rather probable event. More precisely, it was asked by T. W. Cusick (personal communication, 2012) whether the asymptotic densities

$$c_t = \operatorname{dens}\{n \ge 0 : s(n+t) \ge s(n)\}$$

satisfy, for all integers $t \ge 0$,

$$c_t > 1/2. \tag{1}$$

Here and in what follows, $\operatorname{dens} A$ denotes the asymptotic density of a set $A \subseteq \mathbb{N}$. It will become clear later, see equation (8), that the density exists in our case. Actually, this question arose while Cusick was working on a similar combinatorial problem proposed by Tu and Deng  related to Boolean functions with desirable cryptographic properties, and the results of his work on this problem at that time have been published in .

Concerning his question, Cusick “acquired more confidence in it over time” and consequently “would now refer to the question as a conjecture” (personal communication, September 23, 2015). Although it is quite easy to compute $c_t$ for every fixed $t$ (see Section 2), the full statement could not be tackled so far. Our numerical experiments show that (1) holds at least for all , which is quite good support for Cusick’s conjecture. Moreover, by the same method we computed

$$\tilde c_t = \operatorname{dens}\{n \ge 0 : s(n+t) > s(n)\}$$

for , and the result of this computation suggests that

$$\tilde c_t \le 1/2 \tag{2}$$

should hold for all , which increases the significance of the original question.

The main result of this paper is the following asymptotic statement, which gives a positive answer to Cusick’s conjecture for almost all integers $t$ but also shows that the bound $1/2$ is tight. Moreover, this theorem gives analogous results concerning $\tilde c_t$.

###### Theorem 1.1.

For any $\varepsilon > 0$ we have

 |{t≤T:1/2−ε<~ct<1/2

as $T \to \infty$. In particular, $c_t > 1/2$ holds for $t$ in a subset of $\mathbb{N}$ of asymptotic density $1$.

The proof is based on an appropriate averaging argument. More precisely, we study the distribution of $c_t$ and $\tilde c_t$ for  and show, using Chebyshev’s inequality, that the values of $c_t$ (resp. $\tilde c_t$) concentrate well above (resp. below) $1/2$. While the average value is relatively easy to handle, the computation of the variance relies on the asymptotic analysis of diagonals of a trivariate generating function, which is the most difficult step of the proof.

However, while this theorem shows that there exist many increasing sequences of integers such that , it does not give any concrete example of such a sequence. Of course, by the relation the sequence has this property, but this is admittedly not an interesting example.

We exhibit a more interesting sequence with this property. As it turns out, the sequence $(t_j)$ we are going to define even has the property that $c_{t_j} \to 1/2$ from above, and we give a more precise asymptotic estimate of these values.

###### Theorem 1.2.

Let and (which has the binary representation for ). Then

$$c_{t_j} = \frac{1}{2} + \frac{\sqrt{3}}{4\sqrt{2\pi j}} + O\bigl(j^{-3/2}\bigr).$$

Moreover, holds for all .

In the proof of this statement we will again make use of a diagonal of a multivariate generating function, but in this case two variables suffice and extracting the asymptotics is much easier than in the proof of Theorem 1.1.

## 2. An auxiliary lemma

The following lemma is an extension of the “Lemma of Bésineau” [3, Lemme 1] (note that we only handle the sum-of-digits function in base $2$, although an analogous statement holds for larger bases). It establishes the fundamental two-dimensional recurrence relation that we will use throughout this paper.

###### Lemma 2.1.

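Let $t \ge 1$ be an integer. There exists a partition of the set of nonnegative integers with the following properties.

1. Each class is a residue class modulo $2^m$ for some $m \ge 0$, that is, it is of the form $a + 2^m\mathbb{N}$, where $0 \le a < 2^m$.

2. For all integers $k$ the set

$$B(k,t) = \{n \in \mathbb{N} : s(n+t) - s(n) = k\}$$

is a finite (possibly empty) union of classes from the partition.

In particular, each of the sets $B(k,t)$ possesses an asymptotic density $\delta(k,t)$. Moreover, for all $k \in \mathbb{Z}$ and $t \ge 1$ the densities satisfy the following recurrence relation:

$$\begin{aligned}
\delta(k,1) &= \begin{cases} 2^{k-2}, & k \le 1,\\ 0 & \text{otherwise,} \end{cases}\\
\delta(k,2t) &= \delta(k,t),\\
\delta(k,2t+1) &= \tfrac{1}{2}\,\delta(k-1,t) + \tfrac{1}{2}\,\delta(k+1,t+1).
\end{aligned} \tag{3}$$

###### Proof.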

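We set $d(n,t) = s(n+t) - s(n)$. For all integers $n, t \ge 0$ we have

$$\begin{aligned}
d(2n,2t) &= d(n,t), & d(2n+1,2t) &= d(n,t),\\
d(2n,2t+1) &= d(n,t)+1, & d(2n+1,2t+1) &= d(n,t+1)-1,
\end{aligned}$$

which follows easily from the elementary property $s(2n+b) = s(n) + b$ for $b \in \{0,1\}$. Moreover, we have

$$d(i,1) = s(i+1) - s(i) = 1 - \nu_2(i+1), \tag{4}$$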

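which follows by writing $i = 2^{\ell+1}m + 2^{\ell} - 1$ with $\ell, m \ge 0$ and noting that $s(i) = s(m) + \ell$ and $s(i+1) = s(m) + 1$. (We write $\nu_2(a)$ to denote the exponent of the prime $2$ in the prime factorization of $a$.) We prove the statements by induction on $t$. In the case $t = 1$ equation (4) implies

$$B(1-\ell,1) = \begin{cases} -1 + 2^{\ell} + 2^{\ell+1}\mathbb{N}, & \ell \ge 0,\\ \emptyset & \text{otherwise,} \end{cases}$$

since the set of positive integers exactly divisible by $2^{\ell}$ equals $2^{\ell} + 2^{\ell+1}\mathbb{N}$. This implies the first line of (3). Let $t \ge 2$ be even, $t = 2u$, and $k \in \mathbb{Z}$. Then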

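$$\begin{aligned}
B(k,2u) &= \{n : d(n,2u) = k\}\\
&= 2\{n : d(2n,2u) = k\} \cup \bigl(2\{n : d(2n+1,2u) = k\} + 1\bigr)\\
&= 2\{n : d(n,u) = k\} \cup \bigl(2\{n : d(n,u) = k\} + 1\bigr),
\end{aligned} \tag{5}$$

which is by the induction hypothesis a finite union of arithmetic progressions of the form $a + 2^m\mathbb{N}$. If $t \ge 3$ is odd, $t = 2u+1$, we get by analogous reasoning

$$B(k,2u+1) = \{n : d(n,2u+1) = k\} = 2\{n : d(n,u) = k-1\} \cup \bigl(2\{n : d(n,u+1) = k+1\} + 1\bigr). \tag{6}$$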

The unions in (5) and (6) respectively are disjoint, therefore the statement on the densities follows. This finishes the proof. ∎
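The identities for $d(n,t)$, equation (4), and the description of the classes $B(1-\ell,1)$ used in the proof can be checked numerically. The following is a minimal sketch for illustration only; the names `s`, `d`, `nu2`, `check_lemma_identities` and `base_class`, as well as the search bounds, are assumptions made here and do not come from the paper.

```python
def s(n: int) -> int:
    """Binary sum of digits of n >= 0."""
    return bin(n).count("1")


def d(n: int, t: int) -> int:
    """d(n, t) = s(n + t) - s(n)."""
    return s(n + t) - s(n)


def nu2(m: int) -> int:
    """Exponent of 2 in the prime factorization of m >= 1."""
    return (m & -m).bit_length() - 1


def check_lemma_identities(limit: int = 64) -> bool:
    """Check equation (4) and the four identities for d for 0 <= n, t < limit."""
    for n in range(limit):
        if d(n, 1) != 1 - nu2(n + 1):  # equation (4)
            return False
        for t in range(limit):
            ok = (
                d(2 * n, 2 * t) == d(n, t)
                and d(2 * n + 1, 2 * t) == d(n, t)
                and d(2 * n, 2 * t + 1) == d(n, t) + 1
                and d(2 * n + 1, 2 * t + 1) == d(n, t + 1) - 1
            )
            if not ok:
                return False
    return True


def base_class(ell: int, bound: int) -> bool:
    """Check that {n < bound : d(n, 1) = 1 - ell} is the class -1 + 2^ell + 2^(ell+1) N."""
    lhs = {n for n in range(bound) if d(n, 1) == 1 - ell}
    rhs = {n for n in range(bound) if n % 2 ** (ell + 1) == 2 ** ell - 1}
    return lhs == rhs
```

A finite check of this kind does not replace the induction, but it is a quick way to catch sign or index slips when transcribing the recurrence.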

The recurrence relation for the densities allows us to compute these densities for any given value of $t$. In Table 1 we list some values of the double family $\delta(k,t)$, omitting zeros for more clarity. (The rows are indexed by $k$ and the columns by $t$.)
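Recurrence (3) translates directly into a short memoized function. The sketch below uses exact rational arithmetic; the names `delta` and `c` are illustrative choices made here, and `c` relies on property (7), stated just below, to truncate the sum at $k = s(t)$.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def delta(k: int, t: int) -> Fraction:
    """Density delta(k, t) of {n : s(n+t) - s(n) = k} for t >= 1, via recurrence (3)."""
    if t == 1:
        return Fraction(1, 2 ** (2 - k)) if k <= 1 else Fraction(0)
    if t % 2 == 0:
        return delta(k, t // 2)
    u = t // 2  # t = 2u + 1
    return (delta(k - 1, u) + delta(k + 1, u + 1)) / 2


def c(t: int) -> Fraction:
    """c_t as the finite sum of delta(k, t) over 0 <= k <= s(t)."""
    top = bin(t).count("1")  # s(t); delta(k, t) vanishes for k > s(t)
    return sum((delta(k, t) for k in range(top + 1)), Fraction(0))
```

For example, `c(1)` evaluates to `Fraction(3, 4)` and `c(3)` to `Fraction(11, 16)`; the latter agrees with a direct count of the residues $n$ modulo 16 with $s(n+3) \ge s(n)$.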

By induction, using Lemma 2.1, or by taking a close look at Table 1, we obtain

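$$\delta(k,t) = 0 \quad \text{for } k > s(t). \tag{7}$$

(Alternatively, we can also use equation (10) from below, which implies this statement in the form $s(n+t) - s(n) \le s(t)$.) Therefore $c_t$ is a finite sum of values $\delta(k,t)$: if $t < 2^{\lambda+1}$, we have

$$c_t = \sum_{k=0}^{\lambda+1} \delta(k,t).$$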

The first few values of $c_t$ are therefore all of which are clearly greater than $1/2$. As already mentioned, a numerical experiment conducted by the authors, using the two-dimensional recurrence relation, reveals that in fact $c_t > 1/2$ for all . (Note that in order to compute the $t$-th column of $\delta$, where , we only have to keep track of two adjacent columns with indices and as runs from to . Moreover, only odd $t$ have to be taken into account. This can be implemented quite efficiently, and the calculation only took a couple of hours on a standard machine.) The minimal value of $c_t$ for $t$ in this range is attained at the integer and at the integer obtained by reversing the base-$2$ representation of . (In fact, holds for all and , see the article  by Morgenbesser and the third author, which as of 2015 seems to be the only published result on the values $c_t$.) The value of $c_t$ at these positions equals . Moreover, as we noted in the introduction, the values

$$\tilde c_t = \operatorname{dens}\{n : s(n+t) > s(n)\} = \sum_{k=1}^{\lambda+1} \delta(k,t),$$

which only differ by $\delta(0,t)$ from $c_t$, seem to satisfy $\tilde c_t \le 1/2$ for all .
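The remark about keeping track of two adjacent columns can be read as follows: process the binary digits of $t$ from the most significant one, maintaining columns $m$ and $m+1$ of $\delta$, where $m$ is the prefix of $t$ read so far. The sketch below is one possible reading under that assumption, not the authors' implementation; the names `column` and `c_and_c_tilde` and the window bounds `lo`, `hi` are choices made here.

```python
from fractions import Fraction


def column(t: int) -> dict:
    """Return {k: delta(k, t)} for 0 <= k <= L + 1, where L + 1 is the bit length of t >= 1."""
    bits = bin(t)[3:]           # binary digits of t after the leading 1
    L = len(bits)
    lo, hi = -L, L + 1          # window of k; values stay exact for k >= 0 after L steps

    def start(k: int) -> Fraction:
        """First line of recurrence (3)."""
        return Fraction(1, 2 ** (2 - k)) if k <= 1 else Fraction(0)

    zero = Fraction(0)
    left = {k: start(k) for k in range(lo, hi + 1)}   # column m = 1
    right = dict(left)                                # column m + 1 = 2 equals column 1
    for b in bits:
        # Column 2m + 1 is built from columns m and m + 1. Entries below the
        # window are dropped, so one more low entry becomes inexact per step.
        odd = {
            k: (left.get(k - 1, zero) + right.get(k + 1, zero)) / 2
            for k in range(lo, hi + 1)
        }
        if b == "0":
            right = odd         # new pair (2m, 2m + 1); column 2m equals column m
        else:
            left = odd          # new pair (2m + 1, 2m + 2); column 2m + 2 equals column m + 1
    return {k: left[k] for k in range(0, hi + 1)}


def c_and_c_tilde(t: int) -> tuple:
    """Return (c_t, c~_t), which differ exactly by delta(0, t)."""
    col = column(t)
    c_t = sum(col.values(), Fraction(0))
    return c_t, c_t - col[0]
```

With this sketch, `c_and_c_tilde(3)` gives `(Fraction(11, 16), Fraction(3, 8))`. Looping over a small range of $t$ reproduces, on that range, the behaviour reported in the text: $c_t > 1/2$ and $\tilde c_t \le 1/2$.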

## 3. Equivalent formulations

There are several equivalent formulations of Cusick’s problem. In this section we present some of them.

### 3.1. Rising factorials

Summing (4) from $i = n$ to $i = n + t - 1$ yields

$$s(n+t) - s(n) = t - \nu_2\bigl((n+1)_t\bigr), \tag{8}$$

where $(x)_t = x(x+1)\cdots(x+t-1)$ denotes the Pochhammer symbol (or “rising factorial”). (We note that (8) is essentially the special case $p = 2$ of the formula $\nu_p(n!) = (n - s_p(n))/(p-1)$ due to Legendre, involving the sum of digits $s_p$ in prime base $p$.) It follows that $s(n+t) \ge s(n)$ if and only if $2^{t+1} \nmid (n+1)_t$. Since the latter condition is periodic in $n$ with period $2^{t+1}$, the existence of the limit in the definition of $c_t$ follows immediately. Writing

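$$a_{\lambda,t} = \frac{1}{2^{\lambda}} \bigl|\{n < 2^{\lambda} : 2^{\lambda} \nmid (n+1)_t\}\bigr|,$$

property (1) is equivalent to $a_{t+1,t} > 1/2$, that is, to

$$\bigl|\{n < 2^{t+1} : 2^{t+1} \nmid (n+1)_t\}\bigr| > 2^{t}.$$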
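Identity (8) and the resulting reformulation of $c_t$ as a count over one period can be checked by brute force for small $t$. This is a minimal sketch; the helper names `rising`, `c_direct` and `c_via_rising` are assumptions made here, and the exhaustive count over $2^{t+1}$ residues is only practical for small $t$.

```python
from fractions import Fraction
from math import prod


def s(n: int) -> int:
    """Binary sum of digits of n >= 0."""
    return bin(n).count("1")


def nu2(m: int) -> int:
    """Exponent of 2 in the prime factorization of m >= 1."""
    return (m & -m).bit_length() - 1


def rising(x: int, t: int) -> int:
    """Rising factorial (x)_t = x (x+1) ... (x+t-1); the empty product is 1."""
    return prod(range(x, x + t))


def c_direct(t: int) -> Fraction:
    """c_t counted over one period 2^(t+1) straight from the definition."""
    period = 2 ** (t + 1)
    hits = sum(1 for n in range(period) if s(n + t) >= s(n))
    return Fraction(hits, period)


def c_via_rising(t: int) -> Fraction:
    """c_t as the proportion of n < 2^(t+1) with 2^(t+1) not dividing (n+1)_t."""
    period = 2 ** (t + 1)
    hits = sum(1 for n in range(period) if rising(n + 1, t) % period != 0)
    return Fraction(hits, period)
```

Both functions return `Fraction(11, 16)` for `t = 3`, so 11 of the 16 residues satisfy the condition, which exceeds $2^t = 8$ as required by the last display.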

This reformulation obviously asks for generalizations. We therefore pose the following informal problem; however, we do not follow this path in the present article.

###### Problem.

Find classes of polynomials $f$ of degree $t$ such that

$$\bigl|\{n < 2^{t+1} : f(n) \equiv 0 \bmod 2^{t+1}\}\bigr| < 2^{t}. \tag{9}$$

Cusick’s question is an instance of this problem, taking the polynomials $f(X) = (X+1)_t$, which should then have fewer than $2^{t}$ zeros in $\mathbb{Z}/2^{t+1}\mathbb{Z}$.

On the other hand, property (2), if true, would imply that (9) fails for the polynomial of degree , that is, this polynomial would have at least zeros in the ring .

### 3.2. Columns in Pascal’s triangle

Combining (8) with the special case $n = 0$, that is, $s(t) = t - \nu_2(t!)$, we obtain the identity

$$s(n+t) - s(n) = s(t) - \nu_2\binom{n+t}{t}. \tag{10}$$

Therefore we get a reformulation of (1) as a problem on columns in Pascal’s triangle, namely that

$$\operatorname{dens}\Bigl\{n : 2^{s(t)+1} \nmid \binom{n+t}{t}\Bigr\} > 1/2. \tag{11}$$

As before, the condition defining the set on the left hand side is periodic with period $2^{t+1}$; we will see later that the smallest period is in fact much smaller.

Of course, property (2), which is complementary to (1), also translates to a statement concerning Pascal’s triangle: it is equivalent to the relation

$$\operatorname{dens}\Bigl\{n : 2^{s(t)} \nmid \binom{n+t}{t}\Bigr\} \le 1/2,$$

analogous to (11). Therefore, assuming that $\tilde c_t \le 1/2 < c_t$, the integer $s(t)$ is the largest exponent $\alpha$ such that at least half of the entries in the $t$-th column of Pascal’s triangle are divisible by $2^{\alpha}$.

Finally, we note that $\nu_2\binom{n+t}{t}$ equals the number of carries that occur when adding $n$ to $t$ in base $2$ (see Kummer ).
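Identity (10) and Kummer's carry interpretation are easy to test on a finite range. The sketch below is illustrative only; the function names and the range bounds used in any check are choices made here.

```python
from math import comb


def s(n: int) -> int:
    """Binary sum of digits of n >= 0."""
    return bin(n).count("1")


def nu2(m: int) -> int:
    """Exponent of 2 in the prime factorization of m >= 1."""
    return (m & -m).bit_length() - 1


def carries(n: int, t: int) -> int:
    """Number of carries when adding n and t in base 2."""
    carry = 0
    count = 0
    while n or t:
        carry = ((n & 1) + (t & 1) + carry) >> 1  # carry out of the current position
        count += carry
        n >>= 1
        t >>= 1
    return count


def check_identity_10(limit: int = 64) -> bool:
    """Check (10) and Kummer's theorem for 0 <= n, t < limit."""
    for n in range(limit):
        for t in range(limit):
            v = nu2(comb(n + t, t))
            if s(n + t) - s(n) != s(t) - v or v != carries(n, t):
                return False
    return True
```

For instance, `carries(3, 1)` is 2, matching $\nu_2\binom{4}{1} = 2$.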

### 3.3. Rows in Pascal’s triangle

Questions on rows in Pascal’s triangle modulo powers of primes have received some attention in the literature. We refer to Barat and Grabner  and Rowland  and the references contained in these articles. In the article  the numbers

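$$\vartheta_j(t) = \Bigl|\Bigl\{n \le t : p^{j} \,\Big\|\, \binom{t}{n}\Bigr\}\Bigr|$$

are studied (where $p^{j} \,\|\, m$ means that $p^{j} \mid m$ and $p^{j+1} \nmid m$), while the article  works with the closely related expression

$$a_m(t) = \Bigl|\Bigl\{n \le t : m \nmid \binom{t}{n}\Bigr\}\Bigr|,$$

where $m$ is a power of a prime. In both articles the corresponding integers are expressed in terms of polynomials involving block digital functions. In  explicit expressions for some prime powers are computed. For example, we have

$$\begin{aligned}
a_{2^1}(t) &= 2^{|t|_1},\\
a_{2^2}(t)/a_{2^1}(t) &= 1 + \tfrac{1}{2}|t|_{10},\\
a_{2^3}(t)/a_{2^1}(t) &= 1 + \tfrac{3}{8}|t|_{10} + |t|_{100} + \tfrac{1}{4}|t|_{110} + \tfrac{1}{8}|t|_{10}^2,\\
a_{2^4}(t)/a_{2^1}(t) &= 1 + \tfrac{5}{12}|t|_{10} + \tfrac{1}{2}|t|_{100} + \tfrac{1}{8}|t|_{110} + 2|t|_{1000} + \tfrac{1}{2}|t|_{1010} + \tfrac{1}{2}|t|_{1100}\\
&\quad + \tfrac{1}{8}|t|_{1110} + \tfrac{1}{16}|t|_{10}^2 + \tfrac{1}{2}|t|_{10}|t|_{100} + \tfrac{1}{8}|t|_{10}|t|_{110} + \tfrac{1}{48}|t|_{10}^3.
\end{aligned}$$

In these formulas, $|t|_w$ is the number of times the finite word $w$ occurs as a subword in the binary representation of $t$. (Note that $|t|_1 = s(t)$.)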
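The first three formulas above can be compared with a direct count along rows of Pascal's triangle. The following sketch assumes that "subword" means a contiguous block of the binary representation without leading zeros; the names `count_word`, `a`, `ratio4` and `ratio8` are chosen here for illustration.

```python
from fractions import Fraction
from math import comb


def count_word(t: int, w: str) -> int:
    """Number of (possibly overlapping) occurrences of the word w in the binary digits of t."""
    b = bin(t)[2:]
    return sum(1 for i in range(len(b) - len(w) + 1) if b[i:i + len(w)] == w)


def a(m: int, t: int) -> int:
    """a_m(t): number of n <= t such that m does not divide binomial(t, n)."""
    return sum(1 for n in range(t + 1) if comb(t, n) % m != 0)


def ratio4(t: int) -> Fraction:
    """Right-hand side of the formula for a_4(t) / a_2(t)."""
    return 1 + Fraction(1, 2) * count_word(t, "10")


def ratio8(t: int) -> Fraction:
    """Right-hand side of the formula for a_8(t) / a_2(t)."""
    x10 = count_word(t, "10")
    x100 = count_word(t, "100")
    x110 = count_word(t, "110")
    return 1 + Fraction(3, 8) * x10 + x100 + Fraction(1, 4) * x110 + Fraction(1, 8) * x10 ** 2
```

As a small example, row $t = 12$ (binary `1100`) has 13 entries, of which 11 are not divisible by 8, and $a_2(12) = 4$; `ratio8(12)` evaluates to `Fraction(11, 4)`.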

The formulas for above, and also the case , had already been known before, see Glaisher  (), Carlitz  () and Howard  (). However, Rowland’s method allows (with increasing computational effort) to find an analogous expression for each modulus with prime and . Rowland also implemented this method in a Mathematica package called BinomialCoefficients, available from his website. Moreover, he proved  the following theorem, which is also contained implicitly in the older article .

###### Theorem (Rowland; Barat–Grabner).

Let be a prime and . Then is a polynomial of degree in , where ranges over the set of words in of length at most .

Moreover, Rowland notes that blocks such that or do not occur in this polynomial.

Surprisingly, these polynomials concerning the rows of Pascal’s triangle modulo powers of $2$ can also be used for the columns, which is due to the symmetry expressed by the identity

$$\nu_2\binom{n+t}{t} = \nu_2\binom{2^{\lambda}-1-t}{n}$$

valid for integers and such that and . This formula can be proved easily via (10) and the identity that holds for . Moreover, Zabek [25, Theorem 3] proved (in particular) that the shortest period of the sequence equals , where and . (Note that this gives the minimal period for the set in (11).) In particular, this means that for . Writing

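$$b_{2^{\alpha}}(t) = \operatorname{dens}\Bigl\{n : 2^{\alpha} \nmid \binom{n+t}{t}\Bigr\},$$

we obtain therefore

$$b_{2^{\alpha}}(t) = \frac{1}{2^{\lambda}} \Bigl|\Bigl\{n < 2^{\lambda} : 2^{\alpha} \nmid \binom{n+t}{t}\Bigr\}\Bigr| = \frac{1}{2^{\lambda}} \Bigl|\Bigl\{n < 2^{\lambda}-1-t : 2^{\alpha} \nmid \binom{2^{\lambda}-1-t}{n}\Bigr\}\Bigr|.$$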

Moreover, for all blocks of length such that and (other blocks do not occur in the polynomials from the above theorem) we have

$$\bigl|2^{\lambda}-1-t\bigr|_{w_{\nu-1}\cdots w_0} = |t|_{w'_{\nu-1}\cdots w'_0},$$

where $w'_i = 1 - w_i$. This is valid since the length of the most significant block of $1$s in the binary representation of $2^{\lambda}-1-t$ is at least . With the help of these observations and using the formula again we obtain

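$$\begin{aligned}
b_{2^0}(t) &= 0,\\
b_{2^1}(t) &= 2^{-|t|_1},\\
b_{2^2}(t)/b_{2^1}(t) &= 1 + \tfrac{1}{2}|t|_{01},\\
b_{2^3}(t)/b_{2^1}(t) &= 1 + \tfrac{3}{8}|t|_{01} + |t|_{011} + \tfrac{1}{4}|t|_{001} + \tfrac{1}{8}|t|_{01}^2,\\
b_{2^4}(t)/b_{2^1}(t) &= 1 + \tfrac{5}{12}|t|_{01} + \tfrac{1}{2}|t|_{011} + \tfrac{1}{8}|t|_{001} + 2|t|_{0111} + \tfrac{1}{2}|t|_{0101} + \tfrac{1}{2}|t|_{0011}\\
&\quad + \tfrac{1}{8}|t|_{0001} + \tfrac{1}{16}|t|_{01}^2 + \tfrac{1}{2}|t|_{01}|t|_{011} + \tfrac{1}{8}|t|_{01}|t|_{001} + \tfrac{1}{48}|t|_{01}^3
\end{aligned}$$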

and so on. In particular, we obtain explicit formulas for $c_t$ for $t$ having a fixed sum of digits, since

$$c_t = b_{2^{s(t)+1}}(t),$$

see equation (11). Unfortunately we do not yet understand the coefficients of the polynomials well enough (for example, they always seem to be nonnegative, as remarked by Rowland ) in order to use these polynomials for deriving a proof of Cusick’s conjecture.

### 3.4. Hyperbinary expansions

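There is an interesting connection between Cusick’s question and so-called hyperbinary expansions of a nonnegative integer that we would like to examine. We first define a “simplified array” $\varphi$ related to $\delta$ (by just changing the start vector $\delta(\cdot,1)$ of the recurrence). Define

$$\begin{aligned}
\varphi(k,1) &= \begin{cases} 1, & k = 0,\\ 0 & \text{otherwise,} \end{cases}\\
\varphi(k,2t) &= \varphi(k,t),\\
\varphi(k,2t+1) &= \tfrac{1}{2}\,\varphi(k-1,t) + \tfrac{1}{2}\,\varphi(k+1,t+1).
\end{aligned}$$

In Table 2 we display some values of $\varphi$.

Note that by linearity the values $\delta(k,t)$ can be recovered from the $t$-th column of $\varphi$: we have

$$\delta(k,t) = \sum_{\substack{i,j \in \mathbb{Z}\\ i+j=k}} \varphi(i,t)\,\delta(j,1) = \sum_{j \ge 0} 2^{-1-j}\,\varphi(k-1+j,t), \tag{12}$$

which is clearly valid for $t = 1$, and an easy induction yields the statement. Note moreover that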

 (13) φ(k,t)=0fork≥s(t),

which is as easy to prove as the corresponding statement (7) for . Interestingly, the (combined) property that for all is implied by a statement on the quantity

 pt=∑k≥0φ(k,t)=s(t)−1∑k=0φ(k,t).

According to our numerical experiments, we have for , and we suspect that this minoration holds indefinitely. Therefore the following lemma is of interest.
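The array $\varphi$, relation (12), the vanishing property (13) and the quantity $p_t$ can all be explored with a few lines of exact arithmetic. This is a sketch for illustration; the function names are choices made here, and any finite check only covers the range it is run on.

```python
from fractions import Fraction
from functools import lru_cache


def s(n: int) -> int:
    """Binary sum of digits of n >= 0."""
    return bin(n).count("1")


@lru_cache(maxsize=None)
def phi(k: int, t: int) -> Fraction:
    """The simplified array phi(k, t) for t >= 1."""
    if t == 1:
        return Fraction(1 if k == 0 else 0)
    if t % 2 == 0:
        return phi(k, t // 2)
    u = t // 2  # t = 2u + 1
    return (phi(k - 1, u) + phi(k + 1, u + 1)) / 2


@lru_cache(maxsize=None)
def delta(k: int, t: int) -> Fraction:
    """The densities delta(k, t) from recurrence (3)."""
    if t == 1:
        return Fraction(1, 2 ** (2 - k)) if k <= 1 else Fraction(0)
    if t % 2 == 0:
        return delta(k, t // 2)
    u = t // 2
    return (delta(k - 1, u) + delta(k + 1, u + 1)) / 2


def delta_from_phi(k: int, t: int) -> Fraction:
    """Right-hand side of (12); by (13) only the terms with j <= s(t) - k can be nonzero."""
    return sum(
        (Fraction(1, 2 ** (1 + j)) * phi(k - 1 + j, t) for j in range(s(t) - k + 1)),
        Fraction(0),
    )


def p(t: int) -> Fraction:
    """p_t, the sum of phi(k, t) over 0 <= k < s(t)."""
    return sum((phi(k, t) for k in range(s(t))), Fraction(0))
```

For example, `p(3)` equals `Fraction(1, 2)`, so the bound $p_t \ge 1/2$ can be attained with equality.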

###### Lemma 3.1.

Assume that

$$p_t \ge 1/2 \tag{14}$$

for all $t \ge 1$. Then $\tilde c_t \le 1/2 \le c_t$ holds for all $t \ge 1$.

###### Proof.

Let and set . We prove that

$$\delta(k,t) = \varphi(k,t_1) \tag{15}$$

for , from which one half of the statement of the lemma will follow immediately. Let and be integers. By the definition of we have , which, applied iteratively, implies that

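$$\varphi(k,2^{\ell}t+1) = \tfrac{1}{2}\,\varphi(k-1,t) + \tfrac{1}{4}\,\varphi(k,t) + \cdots + \tfrac{1}{2^{\ell}}\,\varphi(k-2+\ell,t) + \tfrac{1}{2^{\ell}}\,\varphi(k+\ell,t+1).$$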

For the last summand equals zero by (13), and the remaining sum is the right hand side of (12).

It remains to treat the second half of the statement, concerning . To this end, we use the following symmetry property of the double family . For we define . Then for all and we have

$$\varphi(k,t) = \varphi(-k,t'). \tag{16}$$

We prove this by induction, the case that being trivial. The case follows from . Assume that . Then and . We obtain

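$$\begin{aligned}
\varphi(k,t) &= \tfrac{1}{2}\,\varphi(k-1,u) + \tfrac{1}{2}\,\varphi(k+1,u+1)\\
&= \tfrac{1}{2}\,\varphi(-k+1,u') + \tfrac{1}{2}\,\varphi\bigl(-k-1,(u+1)'\bigr)\\
&= \tfrac{1}{2}\,\varphi\bigl(-k-1,(t'-1)/2\bigr) + \tfrac{1}{2}\,\varphi\bigl(-k+1,(t'+1)/2\bigr)\\
&= \varphi(-k,t').
\end{aligned}$$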

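From (15), (16), the property $\sum_{k \in \mathbb{Z}} \varphi(k,t) = 1$ and the assumption (14) (in this order) it follows that

$$\tilde c_t = \sum_{k \ge 1} \delta(k,t) = \sum_{k \ge 1} \varphi(k,t_1) = \sum_{k \ge 1} \varphi(-k,t_1') = 1 - p_{t_1'} \le 1/2.$$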

This finishes the proof of the lemma. ∎

###### Remark.

We note that and are not directly related to each other by an inequality. For example, we have and .

Moreover, we do not get the strict inequality in Lemma 3.1; at the moment it does not seem obvious how to prove that for all .

A hyperbinary expansion  of a nonnegative integer is a sequence such that . We call such an expansion proper if either or and . The following proposition connects these expansions to our problem.

###### Proposition 3.2.

For integers and let be the number of proper hyperbinary expansions of such that and . Then

$$\varphi(k,t) = \sum_{\substack{i,j \ge 0\\ i-j=k}} 2^{-(i+j)}\,h_{i,j}(t). \tag{17}$$

As an example, we assume that . The proper hyperbinary expansions of are , and . These expansions correspond to and respectively and their weights, given by the factor , are and . This explains column  in Table 2.

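Proof of Proposition 3.2. The integers $h_{i,j}(t)$ satisfy the following recurrence relation, which can be proved easily by resorting to the definition of $h_{i,j}(t)$.

$$\begin{aligned}
h_{0,0}(1) &= 1, \qquad h_{i,j}(1) = 0 && \text{for } (i,j) \ne (0,0),\\
h_{i,j}(2t) &= h_{i,j}(t) && \text{for } i,j \ge 0,\\
h_{i,0}(2t+1) &= h_{i-1,0}(t) && \text{for } i \ge 1,\\
h_{0,j}(2t+1) &= h_{0,j-1}(t+1) && \text{for } j \ge 1,\\
h_{i,j}(2t+1) &= h_{i-1,j}(t) + h_{i,j-1}(t+1) && \text{for } i,j \ge 1.
\end{aligned}$$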
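The recurrence for $h_{i,j}(t)$ can be implemented directly and compared with $\varphi$ through (17). The sketch below works purely from the recurrence, not from the combinatorial definition of $h_{i,j}$. It assumes $h_{0,0}(2t+1) = 0$ for $t \ge 1$, a value the displayed recurrence leaves unspecified but which (17) forces; the names `h`, `phi` and `phi_from_h` are illustrative.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def h(i: int, j: int, t: int) -> int:
    """h_{i,j}(t) for t >= 1, computed from the recurrence; negative indices give 0."""
    if i < 0 or j < 0:
        return 0
    if t == 1:
        return 1 if (i, j) == (0, 0) else 0
    if t % 2 == 0:
        return h(i, j, t // 2)
    u = t // 2  # t = 2u + 1
    # The three odd cases collapse into one line, since out-of-range terms vanish;
    # for i = j = 0 this yields 0, which is the assumption stated in the lead-in.
    return h(i - 1, j, u) + h(i, j - 1, u + 1)


@lru_cache(maxsize=None)
def phi(k: int, t: int) -> Fraction:
    """The array phi(k, t) for t >= 1."""
    if t == 1:
        return Fraction(1 if k == 0 else 0)
    if t % 2 == 0:
        return phi(k, t // 2)
    u = t // 2
    return (phi(k - 1, u) + phi(k + 1, u + 1)) / 2


def phi_from_h(k: int, t: int) -> Fraction:
    """Right-hand side of (17); i + j never exceeds the bit length of t."""
    bound = t.bit_length() + 1
    return sum(
        (Fraction(h(i, j, t), 2 ** (i + j))
         for i in range(bound) for j in range(bound) if i - j == k),
        Fraction(0),
    )
```

For $t = 3$ the only nonzero values are $h_{1,0}(3) = h_{0,1}(3) = 1$, which gives $\varphi(1,3) = \varphi(-1,3) = 1/2$.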

In order to prove (17), we proceed by induction. The statement is clearly valid for , and the case is a trivial consequence of the recurrences governing and . Moreover, we get for

 φ(k,2t+1) =12φ(k−1,t)+12φ(k+1,t+1) =12∑i,j≥0i−j=k−12−(i+j)hi,j(t)+12∑i,j≥0i−j=k+12−(i+j)hi,j(t) =∑i≥1,j≥0i−j=khi−1,j(t)+∑i≥0,j≥1i−j=khi,j−1(t+1) =∑i≥1i=k2−ihi−1,0(t)+∑j≥1−j=k2−jh0,j−1(t+1)+∑i,j≥1i−j=k2−(i+1)hi,j(2t+1) =∑i,j≥0i−j=k2−(i+j)hi,j(2t+1