# Cantor-solus and Cantor-multus Distributions

The Cantor distribution is obtained from bitstrings; the Cantor-solus distribution (a new name) admits only strings without adjacent 1 bits. We review moments and order statistics associated with these. The Cantor-multus distribution is introduced – which instead admits only strings without isolated 1 bits – and more complicated formulas emerge.

## 1 Cantor Distribution

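Let $0 < \vartheta \le 1/2$; for instance, we could take $\vartheta = 1/3$ as in the classical case.  Let $\bar{\vartheta} = 1 - \vartheta$.  Consider a mapping [5]

$$F(\omega_1 \omega_2 \omega_3 \cdots \omega_m) = \frac{\bar{\vartheta}}{\vartheta} \sum_{i=1}^{m} \omega_i \vartheta^i$$

from the set $\Omega$ of finite bitstrings to the nonnegative reals.  The bitstrings in $\Omega$ of length $m$ are assumed to be equiprobable.  Consider the generating function [1]

$$G_n(z) = \sum_{\omega \in \Omega} F(\omega)^n z^{|\omega|}$$

where $|\omega|$ denotes the length of the bitstring.  Clearly

$$G_0(z) = \sum_{\omega \in \Omega} z^{|\omega|} = \sum_{m=0}^{\infty} 2^m z^m = \frac{1}{1 - 2z}.$$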

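The quantity

$$\frac{[z^m] G_n(z)}{[z^m] G_0(z)} = \frac{1}{2^m} [z^m] G_n(z)$$

is the $n$th Cantor moment for strings of length $m$; let $\mu_n$ denote the limit of this as $m \to \infty$.  Denote the empty string by $\varepsilon$.  From values

$$F(\varepsilon) = 0, \quad F(0\omega) = \vartheta F(\omega), \quad F(1\omega) = \bar{\vartheta} + \vartheta F(\omega)$$

and employing the recurrence [6]

$$\Omega = \varepsilon + \{0, 1\} \times \Omega,$$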

we have

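$$
\begin{aligned}
G_n(z) &= \sum_{\omega \in \Omega} \vartheta^n F(\omega)^n z^{1+|\omega|} + \sum_{\omega \in \Omega} \left(\bar{\vartheta} + \vartheta F(\omega)\right)^n z^{1+|\omega|} \\
&= \vartheta^n z\, G_n(z) + z \sum_{i=0}^{n} \binom{n}{i} \bar{\vartheta}^{\,n-i} \vartheta^i G_i(z) \\
&= 2 \vartheta^n z\, G_n(z) + z \sum_{i=0}^{n-1} \binom{n}{i} \bar{\vartheta}^{\,n-i} \vartheta^i G_i(z)
\end{aligned}
$$

for $n \ge 1$; thus

$$G_n(z) = \frac{z}{1 - 2\vartheta^n z} \sum_{i=0}^{n-1} \binom{n}{i} \bar{\vartheta}^{\,n-i} \vartheta^i G_i(z).$$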

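Dividing both sides by $G_0(z)$, we have [5, 7, 8, 9]

$$\mu_n = \frac{1}{2(1 - \vartheta^n)} \sum_{i=0}^{n-1} \binom{n}{i} \bar{\vartheta}^{\,n-i} \vartheta^i \mu_i$$

because

$$\lim_{z \to z_0} \frac{z}{1 - 2\vartheta^n z} = \frac{1}{2(1 - \vartheta^n)}$$

and the singularity $z_0 = 1/2$ of $G_0(z)$ is a simple pole. In particular, when $\vartheta = 1/3$,

$$\mu_1 = 1/2, \quad \mu_2 = 3/8, \quad \mu_2 - \mu_1^2 = 1/8$$

and, up to small periodic fluctuations [9, 10, 11],

$$\mu_n \sim C\, n^{-\ln(2)/\ln(3)},$$
$$C = \frac{1}{2\ln(3)} \int_0^{\infty} \left( \prod_{k=2}^{\infty} \frac{1 + e^{-2x/3^k}}{2} \right) e^{-2x/3}\, x^{\ln(2)/\ln(3) - 1}\, dx = 0.733874...$$

as $n \to \infty$.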
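The moment recurrence lends itself to exact evaluation. The following is a minimal sketch, assuming the recurrence exactly as stated above with $\mu_0 = 1$; the function name `cantor_moments` is a hypothetical helper, not something from the cited references.

```python
from fractions import Fraction
from math import comb

def cantor_moments(theta, n_max):
    """Return [mu_0, ..., mu_{n_max}] via the limiting recurrence for mu_n.

    `cantor_moments` is an illustrative name (an assumption of this sketch).
    """
    theta_bar = 1 - theta
    mu = [Fraction(1)]  # mu_0 = 1 for any probability distribution
    for n in range(1, n_max + 1):
        total = sum(comb(n, i) * theta_bar ** (n - i) * theta ** i * mu[i]
                    for i in range(n))
        mu.append(total / (2 * (1 - theta ** n)))
    return mu

mu = cantor_moments(Fraction(1, 3), 3)
print(mu[1], mu[2], mu[2] - mu[1] ** 2)  # mean, second moment, variance
# prints: 1/2 3/8 1/8
```

Passing a `Fraction` keeps every moment exact; passing a float such as `1/3` works as well but returns rounded values.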

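We merely mention a problem involving order statistics.  Let $\xi_n$ denote the expected value of the minimum of $n$ independent Cantor-distributed random variables.  It is known that [12]

$$\xi_n = \frac{1}{2^n - 2\vartheta} \left[ \bar{\vartheta} + \vartheta \sum_{i=1}^{n-1} \binom{n}{i} \xi_i \right]$$

in general.  In the special case $\vartheta = 1/3$, it follows that

$$\xi_1 = 1/2, \quad \xi_2 = 3/10, \quad \xi_3 = 1/5, \quad \xi_4 = 33/230, \quad \xi_5 = 5/46$$

and, up to small periodic fluctuations [13],

$$\xi_n \sim c\, n^{-\ln(3)/\ln(2)},$$
$$c = \frac{2}{3\ln(2)}\, \Gamma\!\left(\frac{\ln(3)}{\ln(2)}\right) \zeta\!\left(\frac{\ln(3)}{\ln(2)}\right) = 1.9967049717...$$

as $n \to \infty$.  If $\eta_n$ denotes the expected value of the maximum of $n$ variables, then

$$1 - \eta_n \sim c\, n^{-\ln(3)/\ln(2)}$$

by symmetry.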
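The listed values of the expected minimum can be reproduced directly. This is a small sketch under the assumption that the recurrence from [12] reads as above; `cantor_min_expectations` is a hypothetical name chosen for illustration.

```python
from fractions import Fraction
from math import comb

def cantor_min_expectations(theta, n_max):
    """Return {n: xi_n} for n = 1..n_max (illustrative helper, not from [12])."""
    theta_bar = 1 - theta
    xi = {}
    for n in range(1, n_max + 1):
        inner = sum(comb(n, i) * xi[i] for i in range(1, n))
        xi[n] = (theta_bar + theta * inner) / (2 ** n - 2 * theta)
    return xi

xi = cantor_min_expectations(Fraction(1, 3), 5)
print([str(xi[n]) for n in range(1, 6)])
# prints: ['1/2', '3/10', '1/5', '33/230', '5/46']
```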

A final problem concerns the sum of all moments of the classical Cantor distribution [14]:

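$$\sum_{n=0}^{\infty} \mu_n = -\frac{1}{3} + \frac{2}{3} \sum_{k=1}^{\infty} \left(\frac{2}{3}\right)^k \sum_{j=1}^{2^k} \frac{1}{j} = 3.3646507281...$$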

## 2 Cantor-solus Distribution

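We examine here the set $\Omega$ of finite solus bitstrings (those without adjacent 1 bits).  Let

$$f_k = f_{k-1} + f_{k-2}, \quad f_0 = 0, \quad f_1 = 1$$

denote the Fibonacci numbers.  The bitstrings in $\Omega$ of length $m$ are assumed to be equiprobable.  Clearly

$$G_0(z) = \sum_{\omega \in \Omega} z^{|\omega|} = \sum_{m=0}^{\infty} f_{m+2} z^m = \frac{1 + z}{1 - z - z^2}.$$

From the additional values

$$F(1) = \bar{\vartheta}, \quad F(10\omega) = \bar{\vartheta} + \vartheta^2 F(\omega)$$

and employing the recurrence [6]

$$\Omega = \varepsilon + 1 + \{0, 10\} \times \Omega,$$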

we have

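$$
\begin{aligned}
G_n(z) &= \bar{\vartheta}^{\,n} z + \sum_{\omega \in \Omega} \vartheta^n F(\omega)^n z^{1+|\omega|} + \sum_{\omega \in \Omega} \left(\bar{\vartheta} + \vartheta^2 F(\omega)\right)^n z^{2+|\omega|} \\
&= \bar{\vartheta}^{\,n} z + \vartheta^n z\, G_n(z) + z^2 \sum_{i+j=n} \binom{n}{i,j} \bar{\vartheta}^{\,i} \vartheta^{2j} G_j(z) \\
&= \bar{\vartheta}^{\,n} z + \vartheta^n z\, G_n(z) + \vartheta^{2n} z^2 G_n(z) + z^2 \sum_{\substack{i+j=n \\ j<n}} \binom{n}{i,j} \bar{\vartheta}^{\,i} \vartheta^{2j} G_j(z)
\end{aligned}
$$

for $n \ge 1$; thus

$$G_n(z) = \frac{1}{1 - \vartheta^n z - \vartheta^{2n} z^2} \left[ \bar{\vartheta}^{\,n} z + z^2 \sum_{\substack{i+j=n \\ j<n}} \binom{n}{i,j} \bar{\vartheta}^{\,i} \vartheta^{2j} G_j(z) \right].$$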

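The purpose of using multinomial coefficients here, rather than binomial coefficients as in Section 1, is simply to establish precedent for Section 3.  Let $\varphi = (1 + \sqrt{5})/2$ be the Golden mean.  Dividing both sides by $G_0(z)$, we have [1]

$$\mu_n = \frac{1}{1 - \vartheta^n/\varphi - \vartheta^{2n}/\varphi^2} \left[ 0 + \frac{1}{\varphi^2} \sum_{\substack{i+j=n \\ j<n}} \binom{n}{i,j} \bar{\vartheta}^{\,i} \vartheta^{2j} \mu_j \right]$$

because

$$\lim_{z \to z_0} \frac{\bar{\vartheta}^{\,n} z}{G_0(z)} = \lim_{z \to z_0} \frac{1 - z - z^2}{1 + z}\, \bar{\vartheta}^{\,n} z = 0$$

and the singularity $z_0 = 1/\varphi$ of $G_0(z)$ is a simple pole.  In particular, when $\vartheta = 1/3$,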

$$\mu_1 = 0.338826..., \quad \mu_2 = 0.203899..., \quad \mu_2 - \mu_1^2 = 0.089096...$$

and, up to small periodic fluctuations,

$$\mu_n \sim (0.616005...)\, n^{-\ln(\varphi)/\ln(3)}\, (3/4)^n$$

as $n \to \infty$.  An integral formula in [1] for the preceding numerical coefficient involves a generating function of exponential type:

$$M(x) = e^{-x/3} \sum_{k=0}^{\infty} \frac{\mu_k}{k!} \left(\frac{4x}{9}\right)^k,$$

namely

$$\frac{1}{2\varphi \ln(3)} \int_0^{\infty} M(x)\, e^{-2x/3}\, x^{\ln(\varphi)/\ln(3) - 1}\, dx$$

(we believe that the fifth decimal given in [1] is incorrect, perhaps a typo). Unlike the formula for $C$ earlier, this expression depends on the sequence of moments $\mu_k$ explicitly.
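The Cantor-solus moments for $\vartheta = 1/3$ can be checked numerically. The sketch below assumes the limiting recurrence for $\mu_n$ in this section, with the two-part multinomial $\binom{n}{i,j}$ written as the binomial $\binom{n}{j}$ (they coincide when $i + j = n$); `solus_moments` is a hypothetical name.

```python
from math import comb, sqrt

PHI = (1 + sqrt(5)) / 2  # Golden mean

def solus_moments(theta, n_max):
    """Return [mu_0, ..., mu_{n_max}] for the Cantor-solus distribution.

    Illustrative helper (an assumption of this sketch), floating point only.
    """
    theta_bar = 1 - theta
    mu = [1.0]
    for n in range(1, n_max + 1):
        # sum over i + j = n with j < n, where i = n - j
        total = sum(comb(n, j) * theta_bar ** (n - j) * theta ** (2 * j) * mu[j]
                    for j in range(n))
        denom = 1 - theta ** n / PHI - theta ** (2 * n) / PHI ** 2
        mu.append(total / PHI ** 2 / denom)
    return mu

mu = solus_moments(1 / 3, 2)
print(mu[1], mu[2], mu[2] - mu[1] ** 2)  # compare with 0.338826..., 0.203899..., 0.089096...
```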

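With regard to order statistics, it is known that [16]

$$\xi_n = \frac{1}{1 - \vartheta \varphi^{-n} - \vartheta^2 \varphi^{-2n}} \left[ \bar{\vartheta} \varphi^{-2n} + \vartheta \sum_{i=1}^{n-1} \binom{n}{i} \varphi^{-i} \varphi^{-2(n-i)} \xi_i \right],$$
$$\eta_n = \frac{1}{1 - \vartheta \varphi^{-n} - \vartheta^2 \varphi^{-2n}} \left[ \bar{\vartheta} \left(1 - \varphi^{-n}\right) + \vartheta^2 \sum_{j=1}^{n-1} \binom{n}{j} \varphi^{-2j} \varphi^{-(n-j)} \eta_j \right]$$

in general.  In the special case $\vartheta = 1/3$, we have, up to small periodic fluctuations,

$$\xi_n \sim (3.31661...)\, n^{-\ln(3)/\ln(\varphi)},$$
$$3/4 - \eta_n \sim (5.35114...)\, n^{-\ln(3)/\ln(\varphi)}$$

as $n \to \infty$.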
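Both order-statistic recurrences can be iterated side by side. This is a hedged sketch that assumes the two formulas from [16] as written above; `solus_order_statistics` is a hypothetical name. Two sanity checks come for free: $\xi_1 = \eta_1 = \mu_1$, and $\xi_2 + \eta_2 = 2\mu_1$ since the minimum and maximum of two variables sum to the total.

```python
from math import comb, sqrt

PHI = (1 + sqrt(5)) / 2  # Golden mean

def solus_order_statistics(theta, n_max):
    """Return ({n: xi_n}, {n: eta_n}) for n = 1..n_max (illustrative helper)."""
    theta_bar = 1 - theta
    xi, eta = {}, {}
    for n in range(1, n_max + 1):
        denom = 1 - theta * PHI ** -n - theta ** 2 * PHI ** (-2 * n)
        s_min = sum(comb(n, i) * PHI ** -i * PHI ** (-2 * (n - i)) * xi[i]
                    for i in range(1, n))
        s_max = sum(comb(n, j) * PHI ** (-2 * j) * PHI ** -(n - j) * eta[j]
                    for j in range(1, n))
        xi[n] = (theta_bar * PHI ** (-2 * n) + theta * s_min) / denom
        eta[n] = (theta_bar * (1 - PHI ** -n) + theta ** 2 * s_max) / denom
    return xi, eta

xi, eta = solus_order_statistics(1 / 3, 20)
print(xi[1], eta[1])           # both equal the mean 0.338826...
print(xi[20], 0.75 - eta[20])  # both gaps shrink as n grows
```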

## 3 Cantor-multus Distribution

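We examine here the set $\Omega$ of finite multus bitstrings (those without isolated 1 bits).  Let

$$f_k = 2 f_{k-1} - f_{k-2} + f_{k-3}, \quad f_0 = 0, \quad f_1 = f_2 = 1$$

denote the second upper Fibonacci numbers [17].  The bitstrings in $\Omega$ of length $m$ are assumed to be equiprobable.  Clearly

$$G_0(z) = \sum_{\omega \in \Omega} z^{|\omega|} = \sum_{m=0}^{\infty} f_{m+2} z^m = \frac{1 - z + z^2}{1 - 2z + z^2 - z^3}.$$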

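From the additional values

$$F(111) = \bar{\vartheta} + \bar{\vartheta}\vartheta + \bar{\vartheta}\vartheta^2,$$
$$F(11\omega) = \bar{\vartheta} + \bar{\vartheta}\vartheta + \vartheta^2 F(\omega),$$
$$F(1110\omega) = \bar{\vartheta} + \bar{\vartheta}\vartheta + \bar{\vartheta}\vartheta^2 + \vartheta^4 F(\omega)$$

and employing the recurrence

$$\Omega = \varepsilon + 111 + \{0, 11, 1110\} \times \Omega$$

(the string $111$ is the only nonempty multus string not of the form $0\omega$, $11\omega$ or $1110\omega$ with $\omega \in \Omega$; this decomposition reproduces $G_0(z)$ because $1 - z - z^2 - z^4 = (1 + z)(1 - 2z + z^2 - z^3)$ and $1 + z^3 = (1 + z)(1 - z + z^2)$), we have

$$
\begin{aligned}
G_n(z) &= \left(\bar{\vartheta} + \bar{\vartheta}\vartheta + \bar{\vartheta}\vartheta^2\right)^n z^3 + \sum_{\omega \in \Omega} \vartheta^n F(\omega)^n z^{1+|\omega|} + \sum_{\omega \in \Omega} \left(\bar{\vartheta} + \bar{\vartheta}\vartheta + \vartheta^2 F(\omega)\right)^n z^{2+|\omega|} \\
&\quad + \sum_{\omega \in \Omega} \left(\bar{\vartheta} + \bar{\vartheta}\vartheta + \bar{\vartheta}\vartheta^2 + \vartheta^4 F(\omega)\right)^n z^{4+|\omega|} \\
&= \left(\bar{\vartheta} + \bar{\vartheta}\vartheta + \bar{\vartheta}\vartheta^2\right)^n z^3 + \vartheta^n z\, G_n(z) + z^2 \sum_{i+j+k=n} \binom{n}{i,j,k} \bar{\vartheta}^{\,i} (\bar{\vartheta}\vartheta)^j (\vartheta^2)^k G_k(z) \\
&\quad + z^4 \sum_{i+j+k+\ell=n} \binom{n}{i,j,k,\ell} \bar{\vartheta}^{\,i} (\bar{\vartheta}\vartheta)^j (\bar{\vartheta}\vartheta^2)^k (\vartheta^4)^\ell G_\ell(z) \\
&= \left(\bar{\vartheta} + \bar{\vartheta}\vartheta + \bar{\vartheta}\vartheta^2\right)^n z^3 + \vartheta^n z\, G_n(z) + \vartheta^{2n} z^2 G_n(z) + z^2 \sum_{\substack{i+j+k=n \\ k<n}} \binom{n}{i,j,k} \bar{\vartheta}^{\,i} (\bar{\vartheta}\vartheta)^j \vartheta^{2k} G_k(z) \\
&\quad + \vartheta^{4n} z^4 G_n(z) + z^4 \sum_{\substack{i+j+k+\ell=n \\ \ell<n}} \binom{n}{i,j,k,\ell} \bar{\vartheta}^{\,i} (\bar{\vartheta}\vartheta)^j (\bar{\vartheta}\vartheta^2)^k \vartheta^{4\ell} G_\ell(z)
\end{aligned}
$$

for $n \ge 1$; thus

$$
\begin{aligned}
G_n(z) = \frac{1}{1 - \vartheta^n z - \vartheta^{2n} z^2 - \vartheta^{4n} z^4} \Bigg[ &\left(\bar{\vartheta} + \bar{\vartheta}\vartheta + \bar{\vartheta}\vartheta^2\right)^n z^3 + z^2 \sum_{\substack{i+j+k=n \\ k<n}} \binom{n}{i,j,k} \bar{\vartheta}^{\,i} (\bar{\vartheta}\vartheta)^j \vartheta^{2k} G_k(z) \\
&+ z^4 \sum_{\substack{i+j+k+\ell=n \\ \ell<n}} \binom{n}{i,j,k,\ell} \bar{\vartheta}^{\,i} (\bar{\vartheta}\vartheta)^j (\bar{\vartheta}\vartheta^2)^k \vartheta^{4\ell} G_\ell(z) \Bigg].
\end{aligned}
$$

Let

$$\psi = \frac{1}{3} \left[ 2 + \left( \frac{25 + 3\sqrt{69}}{2} \right)^{1/3} + \left( \frac{25 - 3\sqrt{69}}{2} \right)^{1/3} \right] = 1.7548776662...$$

be the second upper Golden mean [17, 18].  Dividing both sides by $G_0(z)$, we have

$$
\begin{aligned}
\mu_n = \frac{1}{1 - \vartheta^n/\psi - \vartheta^{2n}/\psi^2 - \vartheta^{4n}/\psi^4} \Bigg[ &0 + \frac{1}{\psi^2} \sum_{\substack{i+j+k=n \\ k<n}} \binom{n}{i,j,k} \bar{\vartheta}^{\,i} (\bar{\vartheta}\vartheta)^j \vartheta^{2k} \mu_k \\
&+ \frac{1}{\psi^4} \sum_{\substack{i+j+k+\ell=n \\ \ell<n}} \binom{n}{i,j,k,\ell} \bar{\vartheta}^{\,i} (\bar{\vartheta}\vartheta)^j (\bar{\vartheta}\vartheta^2)^k \vartheta^{4\ell} \mu_\ell \Bigg]
\end{aligned}
$$

because

$$\lim_{z \to z_0} \frac{\left(\bar{\vartheta} + \bar{\vartheta}\vartheta + \bar{\vartheta}\vartheta^2\right)^n z^3}{G_0(z)} = \lim_{z \to z_0} \frac{1 - 2z + z^2 - z^3}{1 - z + z^2} \left(\bar{\vartheta} + \bar{\vartheta}\vartheta + \bar{\vartheta}\vartheta^2\right)^n z^3 = 0$$

and the singularity $z_0 = 1/\psi$ of $G_0(z)$ is a simple pole.  In particular, when $\vartheta = 1/3$,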

$$\mu_1 = 0.504968..., \quad \mu_2 = 0.416013..., \quad \mu_2 - \mu_1^2 = 0.161020...$$

but no asymptotics for $\mu_n$ are known.  Order statistics likewise remain open.
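The Cantor-multus moments can be evaluated from the limiting recurrence for $\mu_n$. In this sketch (a simplification chosen here, not the author's stated method) the multinomial sums are collapsed by the multinomial theorem: summing over all indices except the last turns the first sum into $\sum_{k<n} \binom{n}{k} a^{n-k} \vartheta^{2k} \mu_k$ with $a = \bar{\vartheta} + \bar{\vartheta}\vartheta$, and the second into $\sum_{\ell<n} \binom{n}{\ell} b^{n-\ell} \vartheta^{4\ell} \mu_\ell$ with $b = \bar{\vartheta} + \bar{\vartheta}\vartheta + \bar{\vartheta}\vartheta^2$. The names `multus_moments`, `a` and `b` are hypothetical.

```python
from math import comb, sqrt

# Second upper Golden mean, from the closed form in this section
PSI = (2 + ((25 + 3 * sqrt(69)) / 2) ** (1 / 3)
         + ((25 - 3 * sqrt(69)) / 2) ** (1 / 3)) / 3

def multus_moments(theta, n_max):
    """Return [mu_0, ..., mu_{n_max}] for the Cantor-multus distribution.

    Illustrative helper (an assumption of this sketch), floating point only.
    """
    theta_bar = 1 - theta
    a = theta_bar + theta_bar * theta   # constant part attached to prefix 11
    b = a + theta_bar * theta ** 2      # constant part attached to prefix 1110
    mu = [1.0]
    for n in range(1, n_max + 1):
        s2 = sum(comb(n, k) * a ** (n - k) * theta ** (2 * k) * mu[k]
                 for k in range(n))
        s4 = sum(comb(n, k) * b ** (n - k) * theta ** (4 * k) * mu[k]
                 for k in range(n))
        denom = (1 - theta ** n / PSI - theta ** (2 * n) / PSI ** 2
                   - theta ** (4 * n) / PSI ** 4)
        mu.append((s2 / PSI ** 2 + s4 / PSI ** 4) / denom)
    return mu

mu = multus_moments(1 / 3, 2)
print(mu[1], mu[2], mu[2] - mu[1] ** 2)  # compare with 0.504968..., 0.416013..., 0.161020...
```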

## 4 Bitsums

We turn to a more fundamental topic: given a set $\Omega$ of finite bitstrings, what can be said about the bitsum $S_n$ of a random $\omega \in \Omega$ of length $n$?  If $\Omega$ is unconstrained, i.e., if all $2^n$ strings are included in the sample, then

$$E(S_n) = n/2, \quad V(S_n) = n/4$$

because a sum of $n$ independent Bernoulli($1/2$) variables is Binomial($n$, $1/2$).  Expressed differently, the average density of 1s in a random unconstrained string is $1/2$, with a corresponding variance $1/4$.

Let us impose constraints.  If consists of solus bitstrings, then the total bitsum of all of length has generating function [19, 20]

 ∞∑n=0anzn=z(1−z−z2)2=z+2z2+5z3+10z4+20z5+⋯

and the total bitsum squared has generating function

 ∞∑n=0bnzn=z(1−z+z2)(1−z−z2)3=z+2z2+7z3+16z4+38z5+⋯;

hence has generating function

 ∞∑n=0cnzn=z(1−z)(1+z)3(1−3z+z2)2=z+2z2+10z3+28z4+94z5+⋯

where is as in Section 2.  Standard techniques [6] give asymptotics

 limn→∞E(Sn)n=limn→∞annfn+2=5−√510=0.2763932022...,
 limn→∞V(Sn)n=limn→∞cnnf2n+2=15√5=0.0894427190...

for the average density of s in a random solus string and corresponding variance.
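The leading coefficients of the three solus series can be confirmed by exhaustive enumeration. The sketch below is a brute-force check only, practical for small $n$; the helper names are hypothetical, and the last returned quantity is computed as (number of strings) $\times\, b_n - a_n^2$.

```python
from itertools import product

def solus_strings(n):
    """Yield all bitstrings of length n with no two adjacent 1 bits."""
    for bits in product((0, 1), repeat=n):
        if all(not (x and y) for x, y in zip(bits, bits[1:])):
            yield bits

def bitsum_totals(n):
    """Return (count, a_n, b_n, c_n) for solus strings of length n."""
    sums = [sum(bits) for bits in solus_strings(n)]
    count = len(sums)                 # equals the Fibonacci number f_{n+2}
    a = sum(sums)                     # total bitsum
    b = sum(s * s for s in sums)      # total bitsum squared
    return count, a, b, count * b - a * a

for n in range(1, 6):
    print(n, bitsum_totals(n))
# prints:
# 1 (2, 1, 1, 1)
# 2 (3, 2, 2, 2)
# 3 (5, 5, 7, 10)
# 4 (8, 10, 16, 28)
# 5 (13, 20, 38, 94)
```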

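If instead $\Omega$ consists of multus bitstrings, then the total bitsum $a_n$ of all $\omega$ of length $n$ has generating function [21]

$$\sum_{n=0}^{\infty} a_n z^n = \frac{z^2 (2 - z)}{(1 - 2z + z^2 - z^3)^2} = 2z^2 + 7z^3 + 16z^4 + 34z^5 + \cdots$$

and the total bitsum squared $b_n$ has generating function

$$\sum_{n=0}^{\infty} b_n z^n = \frac{z^2 (4 - 7z + 4z^2 + 3z^3 - z^4)}{(1 - 2z + z^2 - z^3)^3} = 4z^2 + 17z^3 + 46z^4 + 116z^5 + \cdots;$$

hence $c_n = f_{n+2} b_n - a_n^2$ has generating function

$$\sum_{n=0}^{\infty} c_n z^n = \frac{z^2 (4 - 9z + 9z^2 - 9z^3 - 6z^4 + z^5 - 6z^6 + z^8)}{(1 - z + 2z^2 - z^3)^3 (1 - 2z - 3z^2 - z^3)^2} = 4z^2 + 19z^3 + 66z^4 + 236z^5 + \cdots$$

where $f_n$ is as in Section 3.  We obtain asymptotics

$$\lim_{n \to \infty} \frac{E(S_n)}{n} = \lim_{n \to \infty} \frac{a_n}{n f_{n+2}} = \frac{1}{3} \left[ 2 - \left( \frac{23 + 3\sqrt{69}}{1058} \right)^{1/3} + \left( \frac{-23 + 3\sqrt{69}}{1058} \right)^{1/3} \right] = 0.5885044113...,$$
$$\lim_{n \to \infty} \frac{V(S_n)}{n} = \lim_{n \to \infty} \frac{c_n}{n f_{n+2}^2} = \frac{1}{1587} \left( \frac{69}{2} \right)^{1/3} \left[ \left(404685 + 35053\sqrt{69}\right)^{1/3} + \left(404685 - 35053\sqrt{69}\right)^{1/3} \right] = 0.2810976123...$$

for the average density of 1s in a random multus string and corresponding variance.  Unsurprisingly $0.588... > 0.276...$ and $0.281... > 0.089...$; a clumping of 1s forces a higher density than a separating of 1s.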
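The multus generating functions can be expanded mechanically and cross-checked against one another. This sketch uses plain integer power-series division (a generic technique, not the Maple gfun computation mentioned in the Acknowledgements); `poly_mul`, `poly_pow` and `series` are hypothetical helpers, and polynomials are coefficient lists in increasing powers of $z$.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def poly_pow(p, e):
    """Raise a polynomial to a nonnegative integer power."""
    out = [1]
    for _ in range(e):
        out = poly_mul(out, p)
    return out

def series(num, den, terms):
    """First `terms` Taylor coefficients of num(z)/den(z), assuming den[0] == 1."""
    coeffs = []
    for n in range(terms):
        c = num[n] if n < len(num) else 0
        for k in range(1, min(n, len(den) - 1) + 1):
            c -= den[k] * coeffs[n - k]
        coeffs.append(c)
    return coeffs

D = [1, -2, 1, -1]  # 1 - 2z + z^2 - z^3
T = 7               # number of coefficients to compare
g = series([1, -1, 1], D, T)                             # f_{n+2}
a = series([0, 0, 2, -1], poly_pow(D, 2), T)             # total bitsum
b = series([0, 0, 4, -7, 4, 3, -1], poly_pow(D, 3), T)   # total bitsum squared
c_den = poly_mul(poly_pow([1, -1, 2, -1], 3), poly_pow([1, -2, -3, -1], 2))
c = series([0, 0, 4, -9, 9, -9, -6, 1, -6, 0, 1], c_den, T)

print(a)
print(b)
print(c)
# Check c_n = f_{n+2} b_n - a_n^2 coefficient by coefficient
print(all(c[n] == g[n] * b[n] - a[n] ** 2 for n in range(T)))
```

Increasing `T` extends the comparison to higher-order coefficients of the numerator polynomials.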

A famous example of an infinite aperiodic solus bitstring is the Fibonacci word [2, 3], which is the limit obtained recursively starting with $0$ and satisfying substitution rules $0 \mapsto 01$, $1 \mapsto 0$.  The density of 1s in this word is $1/\varphi^2 = 0.381966...$ [22], which exceeds the average $0.276...$ but falls well within the one-sigma upper limit $0.276... + \sqrt{0.089...} = 0.575...$.  We wonder if an analogously simple construction might give an infinite aperiodic multus bitstring with known density.
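The density claim is easy to observe on a long finite prefix. This is a minimal sketch assuming the standard substitution rules $0 \mapsto 01$, $1 \mapsto 0$ starting from $0$; `fibonacci_word` is a hypothetical helper name.

```python
from math import sqrt

def fibonacci_word(iterations):
    """Apply the substitution 0 -> 01, 1 -> 0 repeatedly, starting from '0'."""
    word = "0"
    for _ in range(iterations):
        word = "".join("01" if ch == "0" else "0" for ch in word)
    return word

PHI = (1 + sqrt(5)) / 2
w = fibonacci_word(25)
density = w.count("1") / len(w)
print(len(w), density, 1 / PHI ** 2)  # density approaches 1/phi^2 = 0.381966...
print("11" in w)                      # False: the prefix is a solus bitstring
```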

## 5 Longest Bitruns

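We turn to a different topic: given a set $\Omega$ of finite bitstrings, what can be said about the duration $R_{n,1}$ of the longest run of 1s in a random $\omega \in \Omega$ of length $n$?  If $\Omega$ is unconstrained, then [6]

$$E(R_{n,1}) = \frac{1}{2^n} [z^n] \sum_{k=1}^{\infty} \left( \frac{1}{1 - 2z} - \frac{1 - z^k}{1 - 2z + z^{k+1}} \right),$$

the Taylor expansion of the numerator series is [23]

$$z + 4z^2 + 11z^3 + 27z^4 + 62z^5 + 138z^6 + 300z^7 + 643z^8 + 1363z^9 + 2866z^{10} + \cdots$$

and, up to small periodic fluctuations [24, 25],

$$E(R_{n,1}) \sim \frac{\ln(n)}{\ln(2)} - \left( \frac{3}{2} - \frac{\gamma}{\ln(2)} \right)$$

as $n \to \infty$.  Of course, identical results hold for $R_{n,0}$, the duration of the longest run of 0s in $\omega$.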
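The coefficients of the numerator series are the totals of the longest 1-run over all $2^n$ strings, which a direct enumeration confirms for small $n$. The helper names below are hypothetical, and the approach is brute force rather than the generating-function method of [6].

```python
from itertools import groupby, product

def longest_run(bits, symbol):
    """Length of the longest run of `symbol` in `bits` (0 if absent)."""
    return max((len(list(grp)) for sym, grp in groupby(bits) if sym == symbol),
               default=0)

def total_longest_run(n, symbol=1):
    """Sum of longest-run lengths over all unconstrained bitstrings of length n."""
    return sum(longest_run(bits, symbol) for bits in product((0, 1), repeat=n))

totals = [total_longest_run(n) for n in range(1, 9)]
print(totals)
# prints: [1, 4, 11, 27, 62, 138, 300, 643]
print([t / 2 ** n for n, t in enumerate(totals, start=1)])  # E(R_{n,1})
```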

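If $\Omega$ consists of solus bitstrings, then it makes little sense to talk about 1-runs.  For 0-runs, over all $\omega \in \Omega$, we have

$$E(R_{n,0}) = \frac{1}{f_{n+2}} [z^n] \sum_{k=1}^{\infty} \left( \frac{1 + z}{1 - z - z^2} - \frac{1 + z - z^k - z^{k+1}}{1 - z - z^2 + z^{k+1}} \right)$$

and the Taylor expansion of the numerator series is [23]

$$z + 4z^2 + 9z^3 + 18z^4 + 34z^5 + 62z^6 + 110z^7 + 192z^8 + 331z^9 + 565z^{10} + \cdots$$

where $f_n$ is as in Section 2.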

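If instead $\Omega$ consists of multus bitstrings, then we can talk both about 1-runs [23]:

$$E(R_{n,1}) = \frac{1}{f_{n+2}} [z^n] \left\{ \frac{-z}{(1 - z)(1 - z + z^2)} + \sum_{k=1}^{\infty} \left( \frac{1 + z^2}{1 - 2z + z^2 - z^3} - \frac{1 + z^2 - z^{k-1} - z^k}{1 - 2z + z^2 - z^3 + z^{k+1}} \right) z \right\},$$
$$\text{num} = 2z^2 + 7z^3 + 16z^4 + 32z^5 + 62z^6 + 118z^7 + 221z^8 + 409z^9 + 751z^{10} + \cdots$$

and 0-runs:

$$E(R_{n,0}) = \frac{1}{f_{n+2}} [z^n] \sum_{k=1}^{\infty} \left( \frac{1 + z^2}{1 - 2z + z^2 - z^3} - \frac{1 + z^2 - z^{k-1} + z^k - 2z^{k+1}}{1 - 2z + z^2 - z^3 + z^{k+2}} \right) z,$$
$$\text{num} = z + 2z^2 + 5z^3 + 11z^4 + 23z^5 + 45z^6 + 87z^7 + 165z^8 + 309z^9 + 573z^{10} + \cdots$$

where $f_n$ is as in Section 3.  Proof: the number of nonempty multus bitstrings with no runs of $k$ 1s has generating function [26]

$$
\begin{cases}
\dfrac{1 + z^2 - z^{k-1} - z^k}{1 - 2z + z^2 - z^3 + z^{k+1}}\, z & \text{if } k > 1; \\[2ex]
\dfrac{z}{1 - z} & \text{if } k = 1;
\end{cases}
$$

the leading term $-z/((1 - z)(1 - z + z^2))$ in the braces compensates for the fact that the first expression, evaluated at $k = 1$, gives $-z^2/(1 - z + z^2)$ rather than $z/(1 - z)$.  We conclude by use of the summation identity

$$\sum_{j=0}^{\infty} j \cdot h_j(z) = \sum_{k=0}^{\infty} \left( \sum_{i=0}^{\infty} h_i(z) - \sum_{i=0}^{k} h_i(z) \right).$$

Study of runs of 0s proceeds analogously [27]. The solus and multus results here are new, as far as is known.  Asymptotics would be good to see someday.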
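The first few coefficients of the constrained numerator series can again be confirmed by enumeration. This brute-force sketch filters all $2^n$ bitstrings with hypothetical predicates `is_solus` and `is_multus`; it is a check for small $n$, not a substitute for the generating functions.

```python
from itertools import groupby, product

def runs(bits):
    """List of (symbol, run_length) pairs for a bit tuple."""
    return [(sym, len(list(grp))) for sym, grp in groupby(bits)]

def is_solus(bits):
    """True if every run of 1s has length exactly 1."""
    return all(length == 1 for sym, length in runs(bits) if sym == 1)

def is_multus(bits):
    """True if every run of 1s has length at least 2."""
    return all(length >= 2 for sym, length in runs(bits) if sym == 1)

def total_longest_run(n, symbol, admissible):
    """Sum of the longest run of `symbol` over admissible strings of length n."""
    total = 0
    for bits in product((0, 1), repeat=n):
        if admissible(bits):
            total += max((length for sym, length in runs(bits) if sym == symbol),
                         default=0)
    return total

print([total_longest_run(n, 0, is_solus) for n in range(1, 6)])   # [1, 4, 9, 18, 34]
print([total_longest_run(n, 1, is_multus) for n in range(1, 6)])  # [0, 2, 7, 16, 32]
print([total_longest_run(n, 0, is_multus) for n in range(1, 6)])  # [1, 2, 5, 11, 23]
```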

## 6 Acknowledgements

I am thankful to Alois Heinz for helpful discussions and for providing the generating function associated with via the Maple gfun package; R and Mathematica have been useful throughout. I am also indebted to a friend, who wishes to remain anonymous, for giving encouragement and support (in these dark days of the novel coronavirus outbreak).

## References

• [1] H. Prodinger, The Cantor-Fibonacci distribution, Applications of Fibonacci Numbers, v. 7, Proc. 1996 Graz conf., ed. G. E. Bergum, A. N. Philippou and A. F. Horadam, Kluwer Acad. Publ., 1998, pp. 311–318; MR1638457.
• [2] S. R. Finch, Prouhet-Thue-Morse constant, Mathematical Constants, Cambridge Univ. Press, 2003, pp. 436–441; MR2003519.
• [3] S. R. Finch, Substitution dynamics, Mathematical Constants II, Cambridge Univ. Press, 2019, pp. 599–603; MR3887550.
• [4] R. Austin and R. Guy, Binary sequences without isolated ones, Fibonacci Quart. 16 (1978) 84–86; MR0465892.
• [5] F. R. Lad and W. F. C. Taylor, The moments of the Cantor distribution, Statist. Probab. Lett. 13 (1992) 307–310; MR1160752.
• [6] R. Sedgewick and P. Flajolet, Introduction to the Analysis of Algorithms, Addison-Wesley, 1996, pp. 120–121, 159–161, 366–373, 379.
• [7] G. C. Evans, Calculation of moments for a Cantor-Vitali function, Amer. Math. Monthly 64 (1957) 22–27; MR0100204.
• [8] C. P. Dettmann and N. E. Frankel, Potential theory and analytic properties of a Cantor set, J. Phys. A 26 (1993) 1009–1022; MR1211344.
• [9] O. Dovgoshey, O. Martio, V. Ryazanov and M. Vuorinen, The Cantor function, Expo. Math. 24 (2006) 1–37; MR2195181.
• [10] W. Goh and J. Wimp, Asymptotics for the moments of singular distributions, J. Approx. Theory 74 (1993) 301–334; MR1233457.
• [11] P. J. Grabner and H. Prodinger, Asymptotic analysis of the moments of the Cantor distribution, Statist. Probab. Lett. 26 (1996) 243–248; MR1394899.
• [12] J. R. M. Hosking, Moments of order statistics of the Cantor distribution, Statist. Probab. Lett. 19 (1994) 161–165; MR1256706.
• [13] A. Knopfmacher and H. Prodinger, Explicit and asymptotic formulae for the expected values of the order statistics of the Cantor distribution, Statist. Probab. Lett. 27 (1996) 189–194; MR1400005.
• [14] H. Prodinger, On Cantor’s singular moments, Southwest J. Pure Appl. Math. (2000), n. 1, 27–29; arXiv:math/9904072; MR1770778.
• [15] H. G. Diamond, B. Reznick, K. F. Andersen and O. Kouba, Cantor’s singular moments, Amer. Math. Monthly 106 (1999) 175–176; MR1543421.
• [16] L.-L. Cristea and H. Prodinger, Order statistics for the Cantor-Fibonacci distribution, Aequationes Math. 73 (2007) 78–91; MR2311656.
• [17] V. Krčadinac, A new generalization of the golden ratio, Fibonacci Quart. 44 (2006) 335–340; MR2335005.
• [18] S. R. Finch, Feller’s coin tossing constants, Mathematical Constants, Cambridge Univ. Press, 2003, pp. 339–342; MR2003519.
• [19] N. J. A. Sloane, On-Line Encyclopedia of Integer Sequences, A000045, A001629 and A224227.
• [20] N. Gauthier, A. Plaza and S. Falcón, Binomial coefficients and Fibonacci and Lucas numbers, Fibonacci Quart. 50 (2012) 379–381.
• [21] N. J. A. Sloane, On-Line Encyclopedia of Integer Sequences, A005251, A259966 and A332863.
• [22] J. Grytczuk, Infinite self-similar words, Discrete Math. 161 (1996) 133–141; MR1420526.
• [23] N. J. A. Sloane, On-Line Encyclopedia of Integer Sequences, A119706, A333394, A333395 and A333396.
• [24] D. W. Boyd, Losing runs in Bernoulli trials, unpublished note (1975), https://www.math.ubc.ca/~boyd/bern.runs/bernoulli.html.
• [25] M. F. Schilling, The longest run of heads, College Math. J. 21 (1990) 196–207; MR1070635.
• [26] N. J. A. Sloane, On-Line Encyclopedia of Integer Sequences, A000930, A006498, A000570, A079816, A189593 and A189600.
• [27] N. J. A. Sloane, On-Line Encyclopedia of Integer Sequences, A000931, A003410 and A179070.

Steven Finch
MIT Sloan School of Management
Cambridge, MA, USA
steven_finch@harvard.edu