# The Absent-Minded Passengers Problem: A Motivating Challenge Solved by Computer Algebra

In (S.B. Ekhad and D. Zeilberger, 2020) an exciting case study has been initiated in which experimental mathematics and symbolic computation are utilized to discover new properties concerning the so-called Absent-Minded Passengers Problem. Based on these results, Doron Zeilberger raised some challenging tasks to gain further probabilistic insight. In this note we report on this enterprise. In particular, we demonstrate how the computer algebra packages of RISC can be used to carry out the underlying heavy calculations.

## Authors

• 20 publications
• ### Sparse Representations of Clifford and Tensor algebras in Maxima

Clifford algebras have broad applications in science and engineering. Th...
04/23/2016 ∙ by Dimiter Prodanov, et al. ∙ 0

• ### Computer Algebra in R with caracas

The capability of R to do symbolic mathematics is enhanced by the caraca...
04/12/2021 ∙ by Mikkel Meyer Andersen, et al. ∙ 0

• ### Computer Algebra for Microhydrodynamics

I describe a method for computer algebra that helps with laborious calcu...
08/19/2017 ∙ by Jonas Einarsson, et al. ∙ 0

• ### Deep Learning for Symbolic Mathematics

Neural networks have a reputation for being better at solving statistica...
12/02/2019 ∙ by Guillaume Lample, et al. ∙ 0

• ### Numeric Deduction in Symbolic Computation. Application to Normalizing Transformations

Algorithms of numeric (in exact arithmetic) deduction of analytical expr...
05/27/2016 ∙ by Ivan I. Shevchenko, et al. ∙ 0

• ### Computer Algebra and Material Design

12/06/2016 ∙ by Akihito Kikuchi, et al. ∙ 0

• ### Computer Algebra Methods in Control Systems

As dynamic and control systems become more complex, relying purely on nu...
12/26/2017 ∙ by Masoud Abbaszadeh, et al. ∙ 0

##### This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

## 1. A challenging Email

On January 22, 2020 I received the following email by Doron Zeilberger:

Dear Carsten,

I (and Shalosh) just posted a paper

https://arxiv.org/abs/2001.06839with a challenge to you (see the middle of page 4)

Can you (and Sigma) extend theorem 5 of that paper
to the general case with k absent-minded passengers?

These expressions should be expressions in n,k,
the harmonic numbers H_{n-1}, H_{k-1} and their
generalizations (the partial sum , to n-1, k-1, respectively of
zeta(n) and zeta(k)).

If you and Sigma can do the fourth moment, and derive
the asymptotic in n (with a fixed but arbitrary k), I will
donate $100$ to the OEIS in your honor.

...

Best wishes,
Doron



When I received this email, I was thrilled: First, if one gets such an email – in particular from Doron Zeilberger, one is automatically eager to solve it. And second, I highly appreciate “The On-Line Encyclopedia of Integer Sequences” (OEIS) at http://www.oeis.org and supporting it by a donation of Doron Zeilberger gave an extra strong motivation.

Summarizing, the above email provoked various heavy calculations by means of computer algebra that will be introduced in the following.

## 2. The underlying problem and symbolic summation

The combinatorial problem of absent-minded passengers can be introduced as follows.

###### Definition 2.1.

Consider a plane with seats and suppose that passengers enter the plane step-wise taking their seats. In addition, suppose that the first passengers are absent-minded, i.e., they lost their seat ticket and take a seat (uniformally) at random. The remaining not absent-minded (but shy) passengers take their dedicated seats (as given in the plane ticket) if it is still free; otherwise, they choose (uniformally) at random one of the still available free seats. For , let

be the probability that exactly

passengers sit in the wrong seat and let

be the random variable for “the number of passengers sitting in the wrong seat”. Then the expected value for the passengers sitting in the wrong seat is

 E(Xn) =n∑r=1rpn,k,r and its variance is V(Xn) =E(X2n)−E(Xn)2=n∑r=1r2pn,k,r−E(Xn)2.

The situation of one absent-minded passenger () has been considered in [17, 5] and has been explored further in [7] for the general case . Among other fascinating results, closed forms for and have been obtained in [7] by skillful combinatorial arguments. More precisely, the definite sum representations

 E(Xn) =k(−1+n)n+−k+n∑i=1k1−i+n (2.1) and V(Xn) =−k+n∑i=1(1−i−k+n)(1−1−i−k+n1−i+n)1−i+n (2.2) ] +2⎛⎜⎝(−1+k)k2(−1+n)n2+k∑i=1−k+n∑j=11−j−k+n−j+n−1−j−k+n1−j+nn⎞⎟⎠

have been derived and simplified to

 E(Xn) =k(−1+n)n−kS1(k)+kS1(n) (2.3) and V(Xn) =2k−k2−2n−2kn+2k2n+2n2−kn2(−1+n)n2 (2.4) −k(2+n)S1(k)n+k(2+n)S1(n)n+k2S2(k)−k2S2(n)

in terms of the harmonic numbers

 S1(n)=n∑i=11io

of order (often they are also denoted by with the special case ). In the following we will prefer to write such expressions in terms of the modified harmonic numbers

 ¯So(n)=So(n−1)=n−1∑i=11io

yielding, e.g., the more compact expressions

 E(Xn) =−1+k−k¯S1(k)+k¯S1(n) (2.5) and V(Xn) =(−1+k)k(−1+n)n−k(2+n)¯S1(k)n+k(2+n)¯S1(n)n+k2¯S2(k)−k2¯S2(n). (2.6)

While the simplification from (2.1) to (2.3) is straightforward, more work has to be carried out to derive (2.4) from (2.2). In [7] further details are suppressed how these simplifications have been carried out. Here I want to point out that such (usually painful) classical manipulations are meanwhile obsolete. For instance, by loading in the computer algebra packages

In[1]:=

Sigma - A summation package by Carsten Schneider © RISC-JKU

In[2]:=

HarmonicSums by Jakob Ablinger © RISC-JKU

In[3]:=

EvaluateMultiSums by Carsten Scneider © RISC-JKU

into Mathematica the multi-sum expression for the variance

In[4]:=

can be simplified to a closed form within seconds by executing the command111Within the Sigma package (and later the HarmonicSums package) is denoted by S[o,n].

In[5]:=

Out[5]=

Remark. Internally, the command EvaluateMultiSums of the package EvaluateMultiSums.m [14] uses the summation paradigms of telescoping, creative telescoping and recurrence solving of the summation package Sigma.m [12]; the underlying algorithms generalize the hypergeometric case [9] to the class of indefinite nested sums and products in the setting of difference fields and rings [8, 15, 16]. In addition, the calculations are supported by special function algorithms of the package HarmonicSums.m [2, 3]. We note that the user is completely dispensed from carrying out the summation tools explicitly. However, all the calculation steps are accomplished by proof certificates (based on the creative telescoping paradigm introduced in [18]). Thus if necessary, a rigorous correctness proof can be extracted.

## 3. The generating function approach for large moments

As elaborated in details in [6] the expectation, the variance and higher moments can be elegantly described with the generating function approach. Namely, consider the generating function

 f(k)n(w)=n∑r=0pn,k,rwr

of the probability introduced in Definition 2.1. Then the expectation and variance (see also Definition 2.1) can be straightforwardly connected to the generating function via

 E(Xn)=ddwf(k)n(w)|w=1

and

 V(Xn)=(wddw)2f(k)n(w)|w=1.

More generally, define

 M0(n,k)=1

and

 Ml(n,k)=(wddw)lf(k)n(w)|w=1 (3.1)

for ; note that . Then one can define the th moment by

 ml(n,k)=l∑i=0(li)(−1)l−iMi(n,k)Ml−i1(n,k). (3.2)

In the above email but also in [6] the task was assigned to compute (besides the known moments and ) further moments (at least for ).

The combinatorial approach of [7] to derive multi-sum expressions of for larger seems hopeless. Even though it would have been pure fun to challenge Sigma.m, similarly as in in [10], with more complicated sums than (2.3). Another exciting approach has been carried out in [6, Theorem 5] by guessing the moments for the special case and . As it turns out (and proposed by Doron Zeilberger in his email), this attempt can be pushed forward by the following powerful formula given in [6, Theorem 3]:

 f(k)n(w)=1nk∑r=0r!(kr)wr(1−w)k−rn−k−1∏i=0(rw+i+1)=1+rwnk∑r=0r!(kr)wr(1−w)k−r(2+rw)n−k−1; (3.3)

here denotes the Pochhammer symbol. As a consequence, one can calculate straightforwardly any value of and thus of for particularly chosen and . E.g., we can compute

In[6]:=

In[7]:=

Out[7]=

In[8]:=

Out[8]=

In[9]:=

Out[9]=

In[10]:=

Out[10]=

and activating the formula (3.2) for , and yields the third moment

In[11]:=

Out[11]=

But even more is possible with the hypergeometric sum representation (3.3). As already promoted in [6] one can derive easily the closed form expressions of for symbolic and . For this task we observe that

 Ml(n,k)= (wddw)lf(k)n(w)|w=1 = (wddw)l[1+rwnk∑r=0r!(kr)wr(1−w)k−r(2+rw)n−k−1]∣∣w=1 = (wddw)l⎡⎣1+rwnk∑r=max(0,k−l)r!(kr)wr(1−w)k−r(2+rw)n−k−1⎤⎦∣∣w=1, (3.4)

i.e., for any at most summands contribute and the remaining summands vanish with the evaluation . Moreover the differentiation of the arising building blocks in (3.4) can be carried out easily. For instance if we differentiate twice w.r.t.  by using the Mathematica-command D and the Sigma-commands ToSigma and CollectProdSum we get

In[12]:=

Out[12]=

In this way, one can compute and rediscovers (2.5). Similarly, one gets, e.g.,

 M2(n,k)=(−1+k)2+(−2k+kn−2k2n)¯S1(k)n+k2(¯S1(k))2+(2k−kn+2k2n)¯S1(n)n−2k2¯S1(k)¯S1(n)+k2(¯S1(n))2+k2¯S2(k)−k2¯S2(n)+(−1+k)k(−1+n)nδ(−2+k) (3.5)

with

 δ(x)={1 if x≥00 if x<0.

Note that in the special case the factor does not contribute; this comes from the fact that in the summation of (3.4) the lower bound should be not negative.

In a nutshell, one can calculate sufficiently many and using the formula (3.2) one gets the desired moments . E.g., with given in (2.5) and (3.5) one can reproduce as given in (2.6). Similarly, one obtains, e.g.,

 m3(n,k)= −(−2+k)(−1+k)k(−2+n)(−1+n)n+2k3¯S3(n)+(6k2−5kn−kn2)¯S1(k)(−1+n)n−3k(¯S1(k))2n +(−6k2+5kn+kn2)¯S1(n)(−1+n)n+6k¯S1(k)¯S1(n)n−3k(¯S1(n))2n+3(−k+2k2+k2n)¯S2(k)n −2k3¯S3(k)−3(−k+2k2+k2n)¯S2(n)n+3(2k2−k3−2kn+k2n)(−1+n)2nδ(−2+k) and m4(n,k)= (−3+k)(−2+k)(−1+k)k(−3+n)(−2+n)(−1+n)n+6k−12k2+30k3−6k4+18kn−29k2n−7k2n2(−1+n)n¯S2(n) +¯S2(k)(−1+n)n(−6k+12k2−30k3+6k4−18kn+29k2n+7k2n2) +¯S1(k)(−2+n)(−1+n)n(−52k+36k2−12k3+40kn−11kn2−kn3) +¯S1(n)(−2+n)(−1+n)n(52k−36k2+12k3−40kn+11kn2+kn3) +3(−2k+4k2−6kn+3k2n+k2n2)(¯S1(k))2(−1+n)n−4k(¯S1(k))3n+3k4(¯S2(n))2 −6(−2k+4k2−6kn+3k2n+k2n2)¯S1(k)¯S1(n)(−1+n)n+12k(¯S1(k))2¯S1(n)n +3(−2k+4k2−6kn+3k2n+k2n2)(¯S1(n))2(−1+n)n−12k¯S1(k)(¯S1(n))2n+4k(¯S1(n))3n −6(2k−4k2+2k3+k3n)¯S1(k)¯S2(k)n+6(2k−4k2+2k3+k3n)¯S1(n)¯S2(k)n −4(2k−6k2+6k3+3k3n)¯S3(k)n+6k4¯S4(k)+6(2k−4k2+2k3+k3n)¯S1(k)¯S2(n)n −6(2k−4k2+2k3+k3n)¯S1(n)¯S2(n)n−6k4¯S2(k)¯S2(n) +4(2k−6k2+6k3+3k3n)¯S3(n)n−6k4¯S4(n)−2(−2+n)2(−1+n)2n(−4k+36k2 −30k3+6k4−24kn+k2n+15k3n−4k4n+16kn2−15k2n2+3k3n2)δ(−3+k) +[−6¯S1(k)(−1+n)2n(−2k−5k2+3k3+10kn−7k2n+k3n) +6¯S1(n)(−1+n)2n(−2k−5k2+3k3+10kn−7k2n+k3n)+1(−1+n)3n(−k+37k2

For instance, specializing to gives

 m4(n,1) =(14+nn−6¯S2(n))¯S1(n)+3(−2+n)¯S1(n)2n+4¯S1(n)3n −(18+7n)¯S2(n)n+3¯S2(n)2+4(2+3n)¯S3(n)n−6¯S4(n)

for one absent-minded passenger (as derived in [6, Theorem 5]) and

 m4(n,2) =(2(−74+13n+13n2)(−1+n)n−24(1+2n)¯S2(n)n)¯S1(n)+48¯S2(n)2 +12(5−2n+n2)¯S1(n)2(−1+n)n+8¯S1(n)3n−4(21+19n)¯S2(n)n +16(7+6n)¯S3(n)n−96¯S4(n)+2(51−45n+19n2)(−1+n)n

for two absent-minded passengers.

Following this strategy we succeeded in computing the moments up to on a machine with 12 cores and 1.5TB memory, but failed to proceed due to time and space limitations. Luckily, another great trick from [19] enabled us to compute even more moments. Namely, the calculation of

 ¯Ml(Xn)=(ddw)lf(k)n(w)|w=1=(ddw)l⎡⎣1+rwnk∑r=max(0,k−l)r!(kr)wr(1−w)k−r(2+rw)n−k−1⎤⎦∣∣w=1 (3.6)

turns out to be much faster (and less memory consuming) than the calculation of (3.4). Given the moments of this so-called exponential moments, one gets back the ordinary moments by using the following formula from [19]:

 Ml(Xn)=l∑r=1S(l,r)¯Ml(Xn) (3.7)

where denote the Stirling Numbers of the Second kind. In summary, we carried out the following steps:

1. Calculate the exponential moments with the formula (3.6).

2. Calculate the ordinary moments with the formula (3.7).

3. Finally, calculate the desired moments with the formula (3.2).

Using this approach, we succeeded in calculating generously222Originally only was required in the challenge. the moments up to in a recent amount of time and memory. The moments up to order 16 are online available and the link can be found in [11]. The corresponding timings333The calculations have been distributed (parallelized) neatly using about 10 Mathematica subkernels. for the different steps, the total time and the size of the moments are summarized in the table of Figure 1.

We note that for the closed form expression can be given in terms of the harmonic numbers and (resp.  and ) with . We note further that the sequences generated by the harmonic numbers (resp. ) are algebraically independent among the rational sequences, i.e., the representation of in terms of the harmonic numbers is optimal. Interesting enough, the algebraic independence can be shown with the help of the summation paradigm of parameterized (creative) telescoping; see [13, Example 6.3]. For more general classes of harmonic sums we refer also to [4].

## 4. Asymptotic expansions

Two days after Doron Zeilberger’s first email, I got a slight update of his challenge:

Finally, to get the donation in your honor you have to complete
the challenge. Find asymptotic expressions for $m_l(n)$ for
at least l=7 (it would be nice to also have l=8), and
prove (hopefully automatically) that

 limn→∞ml(n)m2(n)l/2
equals 0 if l is odd and (2*l-1)(2*l-3)...1  if l is even.
This would be a partial elementary reproof of the
asymptotic normality proved for arbitrary r using "fancy" probability
of the asymptotic normality, in the Henze-Last paper.


Given the calculations of we proceed as follows. For we load in the computed closed form expression

In[13]:=

and use the HarmonicSums command to calculate the first terms of the expansion in :

In[14]:=

Out[14]=

where is Euler’s constant, , and denotes the Riemann-Zeta function. Summarizing, we have calculated

 m2(n,k)=7(−1+k)k6n3+k(−25+18k)12n2+k(−1+2k)2n−(k+2kn)S1(k)+k2S2(k)−k2ζ(2)+(k+2kn)(γ+log(n))+O(1n4). (4.1)

Analogously, we obtain, e.g., the expansion of :

 m3(n,k) =−k(31−38k+8k2)4n3−k(73−90k+12k2)12n2+12k(−1+6k)−3k(−1+2k)ζ(2)n +(γ+log(n))(k−k(−13+12k)2n3−3k(−3+2k)n2+6kn+6kS1(k)n) +(−k+k(−13+12k)2n3+3k(−3+2k)n2−6kn)S1(k) −3k(γ+log(n))2n+(−3(−2+k)2kn3+3(−2+k)kn2)δ(−2+k)+O(1n4).

In total we calculated these expansions of up to with the following timings

 l56789101112131415161718time1.2s3.6s9.6s2565s167s405s986s2419s1.6h3.8h9.8h22.7h53.4h.

Here we parallelized the calculations on 10 Mathematica kernels. E.g., we needed hours to obtain the expansion for ; however summing up all the timings of the used kernels we needed in total days of CPU time to tackle the case . The expansions up to order 16 are online available and the corresponding link can be found in [11].

To fully win Doron Zeilberger’s challenge, we also dealt with the limit

 cl(k)=limn→∞ml(n,k)m2(n,k)l/2,l≥2.

Looking at (4.1) the leading term in the expansion of equals . In addition the leading term in is also . Thus

 c3(k)=limn→∞m3(n,k)m3(n,k)3/2=limn→∞1√klog(n)=0.

Similarly, given the other computed expansions we get

 c2l+1=0,0≤l≤8

and

 l24681012141618cl(k)131510594510395135135202702534459425.

As a consequence we can confirm that the first values agree with the double factorial of odd numbers (sequence A001147 in OEIS), i.e.,

 c2l(k)=(2l−1)!!=l∏i=1(2i−1)=(2l−1)!(l−1)!2l−1

holds for all .

## 5. Conclusion

When I received Doron Zeilberger’s email, the first calculations were straightforward – except that I was (and I am still not) expert in probability theory and thus made some annoying errors in the beginning. However, pushing the calculations further to the

moment and computing its asymptotic expansions was a challenge. Here the computer algebra packages developed at RISC (Sigma.m and EvaluateMultiSums.m by myself and HarmonicSums.m by Jakob Ablinger) were of great help to calculate these huge expressions. I hope that this little note advertises how existing computer algebra tools in general and in particular those of the combinatorics group at RISC can push forward interesting (combinatorial) research topics. For instance, these packages have been heavily used in the last years to carry out large-scale QCD-calculations in the research field of elementary particle physics; see, e.g., [1] and references therein. Last but not least, I am very grateful to Doron Zeilberger who challenged me with these problems and initiated this fun project.