1 Introduction
Several randomness test suites have been proposed as evaluation methods for random or pseudorandom number generators (PRNGs) [1, 2], in which randomness is tested at two levels. The first-level test is an individual test that yields p-values as well as pass or fail results for each tested sequence, and the second-level test evaluates the results of the first-level tests. In one of the second-level tests, the uniformity of the p-values obtained by the first-level test is tested using the $\chi^2$ goodness-of-fit test. However, it is known that the exact distribution of p-values differs from the uniform distribution depending on the first-level test [3, 4, 5]. For the $\chi^2$ test adopted as one of the second-level tests in the test suite NIST SP 800-22, the effect of this difference on the test results was analyzed, and upper limits on the sample size (the number of tested sequences) were proposed by Pareschi et al. [3], by Haramoto [4], and in [5]. Pareschi et al. also considered adopting the Kolmogorov-Smirnov (KS) test as a second-level test [3], but their analysis was limited to the case where the first-level tests were based on the binomial distribution.
In this study, we adopt the KS test as the second-level test, without restricting the nature of the first-level tests. We analyze the effect of the deviation of the exact distribution of p-values from the uniform distribution on $[0, 1]$, which is usually assumed under the null hypothesis of randomness. Specifically, we derive an inequality that provides an upper bound on the expected value of the KS test statistic. The obtained inequality is numerically examined for a toy distribution of p-values and for some of the practical first-level tests in NIST SP 800-22. This inequality also allows us to estimate the maximum sample size that avoids a high probability of incorrectly identifying an ideal generator as nonrandom. To improve the second-level test, we propose using the KS test based on the empirical distribution of p-values generated by the first-level test results of ideal random sequences. In practice, we propose using pseudorandom sequences obtained from the chaotic true orbits of the Bernoulli map [6, 7] as a substitute for such ideal random sequences.
2 Second-level randomness test based on the KS test
Using the KS test, we can test the goodness of fit between an empirical distribution and a reference distribution, or between two empirical distributions.
Let $p_1, p_2, \dots, p_N$ be the p-values obtained by the first-level randomness test. The empirical distribution $F_N(x)$ with $N$ samples is defined as
$$F_N(x) = \frac{1}{N}\,\#\{\, i \mid p_i \le x \,\}, \tag{1}$$
where $x \in [0, 1]$ and $\#A$ denotes the number of elements in a set $A$.
Let the null hypothesis be that $p_1, \dots, p_N$ are drawn from the reference distribution $F_0(x)$. The reference distribution is usually assumed to be the uniform distribution $U(x) = x$. However, there are some cases in which the exact distribution of the p-value is different from $U(x)$ depending on the first-level randomness test [3].
The test statistic of the one-sample KS test with reference distribution $F_0(x)$ is defined as follows.
$$D_N = \sqrt{N}\,\sup_{x \in [0, 1]} \left| F_N(x) - F_0(x) \right| \tag{2}$$
The null hypothesis is accepted if
$$D_N < K_\alpha, \tag{3}$$
where $K_\alpha$ is the boundary value for the significance level $\alpha$. This boundary value can be approximated as $K_\alpha \approx \sqrt{-\frac{1}{2}\ln\frac{\alpha}{2}}$ for a large $N$ and small $\alpha$ [8]. The boundary values for and are given by and , respectively.
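The one-sample test described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reference distribution is taken to be continuous (the uniform distribution by default), the statistic is scaled by the square root of the sample size, and the function names `ks_statistic` and `boundary_value` are hypothetical. The boundary value uses the one-term approximation for small significance levels, and the levels 0.05 and 0.01 in the usage lines are illustrative choices.

```python
import math


def ks_statistic(p_values, cdf=lambda x: x):
    """Scaled one-sample KS statistic sqrt(N) * sup_x |F_N(x) - cdf(x)|.

    Assumes `cdf` is continuous; the default is the uniform CDF on [0, 1].
    """
    xs = sorted(p_values)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # F_N jumps from (i - 1) / n to i / n at x, so check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return math.sqrt(n) * d


def boundary_value(alpha):
    """One-term approximation of the boundary value for small alpha."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0))


def accept(p_values, alpha, cdf=lambda x: x):
    """Return True if the null hypothesis is accepted at level alpha."""
    return ks_statistic(p_values, cdf) < boundary_value(alpha)


if __name__ == "__main__":
    # Illustrative significance levels only.
    for alpha in (0.05, 0.01):
        print(alpha, round(boundary_value(alpha), 3))
```

For a discrete reference distribution, the supremum has to be evaluated at the jump points of both distributions, which this sketch does not attempt.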
3 Inequality for the expected value of test statistic
Let $F(x)$ be the exact distribution for $p_1, \dots, p_N$. The test statistic of the KS test with the exact reference distribution $F(x)$ is defined as
$$D^*_N = \sqrt{N}\,\sup_{x \in [0, 1]} \left| F_N(x) - F(x) \right|. \tag{4}$$
The distribution of $D^*_N$ asymptotically obeys the Kolmogorov distribution under the null hypothesis if the exact reference distribution $F(x)$ is continuous. If the distribution of p-values of the first-level test is discrete, $F(x)$ is not continuous but piecewise constant. Following Pareschi et al. [3], we assume that the distribution of $D^*_N$ still obeys the Kolmogorov distribution, even if $F(x)$ is piecewise constant.
In the following, we analyze the difference between the expected values of the test statistic $D_N$ of Equation (2) and $D^*_N$ under this assumption. Applying the triangle inequality $|F_N(x) - F_0(x)| \le |F_N(x) - F(x)| + |F(x) - F_0(x)|$ to the right-hand side of Equation (2), we obtain
$$D_N \le D^*_N + \sqrt{N}\,\Delta, \tag{5}$$
where
$$\Delta = \sup_{x \in [0, 1]} \left| F(x) - F_0(x) \right|. \tag{6}$$
This is a constant determined by the reference distribution $F_0(x)$ and the exact distribution $F(x)$ for the first-level test.
Taking the expectation of inequality (5) with respect to the direct product of the measure determined by $F(x)$, we obtain the inequality
$$E[D_N] \le E[D^*_N] + \sqrt{N}\,\Delta. \tag{7}$$
It is known that the expected value $E[D^*_N]$ converges to the constant
$$\sqrt{\frac{\pi}{2}}\,\ln 2 \approx 0.8687 \tag{8}$$
when $N \to \infty$, and the constant is independent of $F(x)$ [9].
Inequality (7) implies that the difference $E[D_N] - E[D^*_N]$ has an upper bound of $\sqrt{N}\,\Delta$. Note that for the $\chi^2$ test, the difference between the expected value of the test statistic based on a reference distribution that differs from the exact distribution and that based on the exact distribution is proportional to $N$ [10].
From this perspective, the KS test is regarded as more robust to increasing sample size than the $\chi^2$ test, because the difference in the test statistics is at most proportional to $\sqrt{N}$ for the KS test. However, for the same reason, the power of the KS test is expected to be lower than that of the $\chi^2$ test.
Furthermore, the safety of the randomness test can be evaluated using inequality (7). If the difference $\varepsilon$ is admissible for $E[D_N] - E[D^*_N]$, the maximum sample size within the difference is given by $N_{\max} = (\varepsilon / \Delta)^2$.
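The sample-size rule stated above amounts to a one-line computation, sketched below in Python. The function names are hypothetical, and the values of the admissible difference and of the distribution gap used in the usage lines are made-up illustrations, not values from this paper. The sketch assumes a bound of the form (square root of the sample size) times (the supremum gap between the exact and the reference distributions).

```python
import math


def bias_bound(n, delta):
    """Upper bound sqrt(n) * delta on the shift of the expected statistic."""
    return math.sqrt(n) * delta


def max_sample_size(epsilon, delta):
    """Largest integer n with sqrt(n) * delta <= epsilon."""
    if epsilon < 0 or delta <= 0:
        raise ValueError("epsilon must be >= 0 and delta must be > 0")
    n = math.floor((epsilon / delta) ** 2)
    # Adjust for floating-point rounding near the boundary.
    while bias_bound(n + 1, delta) <= epsilon:
        n += 1
    while n > 0 and bias_bound(n, delta) > epsilon:
        n -= 1
    return n


if __name__ == "__main__":
    # Hypothetical inputs: admissible difference 0.1, distribution gap 0.003.
    n_max = max_sample_size(0.1, 0.003)
    print(n_max, bias_bound(n_max, 0.003))
```

Because the bound grows with the square root of the sample size, halving the admissible difference reduces the safe sample size by a factor of four.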
Table 1: Results of (a) the one-sample KS test with the uniform distribution and (b) the two-sample KS test with the empirical distribution.

| No. | Test name | (a) p-value mean | (a) p-value SD | (a) Pass rate | (a) Pass rate | (b) p-value mean | (b) p-value SD | (b) Pass rate | (b) Pass rate |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Frequency Test | 0.033 | 0.025 | 8/10 | 10/10 | 0.510 | 0.234 | 10/10 | 10/10 |
| 2 | Block Frequency Test | 0.499 | 0.303 | 10/10 | 10/10 | 0.511 | 0.329 | 10/10 | 10/10 |
| 3 | Runs Test | 0.374 | 0.259 | 10/10 | 10/10 | 0.489 | 0.327 | 10/10 | 10/10 |
| 4 | Longest Run of Ones Test | 0.000 | 0.000 | 0/10 | 0/10 | 0.594 | 0.252 | 10/10 | 10/10 |
| 5 | Binary Matrix Rank Test | 0.000 | 0.000 | 0/10 | 0/10 | 0.504 | 0.321 | 10/10 | 10/10 |
| 6 | Discrete Fourier Transform Test | 0.000 | 0.000 | 0/10 | 0/10 | 0.618 | 0.354 | 10/10 | 10/10 |
| 7 | Non-overlapping Template Matching Test (1) | 0.394 | 0.314 | 10/10 | 10/10 | 0.656 | 0.303 | 10/10 | 10/10 |
| 8 | Overlapping Template Matching Test | 0.000 | 0.000 | 0/10 | 0/10 | 0.636 | 0.260 | 10/10 | 10/10 |
| 9 | Maurer's "Universal Statistical" Test | 0.000 | 0.000 | 0/10 | 0/10 | 0.439 | 0.240 | 10/10 | 10/10 |
| 10 | Linear Complexity Test | 0.064 | 0.121 | 6/10 | 10/10 | 0.489 | 0.291 | 10/10 | 10/10 |
| 11 | Serial Test (1) | 0.415 | 0.131 | 10/10 | 10/10 | 0.520 | 0.170 | 10/10 | 10/10 |
| 12 | Approximate Entropy Test | 0.000 | 0.000 | 0/10 | 0/10 | 0.394 | 0.347 | 9/10 | 10/10 |
| 13 | Cumulative Sums Test (1) | 0.089 | 0.138 | 7/10 | 10/10 | 0.409 | 0.247 | 10/10 | 10/10 |

4 Two-sample KS test with ideal empirical distribution
A simple method to improve the second-level test based on the KS test is to use the statistic $D^*_N$, computed with the exact distribution $F(x)$ as the reference, instead of the statistic $D_N$, computed with the assumed reference distribution, if $F(x)$ is known for the target first-level test. In this case, we can obtain test statistics without the error effect. However, it is not always possible to compute the exact distribution for a given first-level test. Therefore, as another method, we examine the use of the empirical distribution of p-values obtained from the first-level test for ideal random sequences as the reference distribution.
Let $q_1, q_2, \dots, q_M$ be the p-values obtained by the first-level test for $M$ ideal or nearly ideal random sequences. By definition, the distribution of $q_j$ obeys $F(x)$. Similar to Equation (1), the empirical distribution $G_M(x)$ of $q_1, \dots, q_M$ is defined as
$$G_M(x) = \frac{1}{M}\,\#\{\, j \mid q_j \le x \,\}. \tag{9}$$
By using the two-sample KS test, the goodness of fit between the empirical distribution $F_N(x)$ of the tested p-values $p_1, \dots, p_N$ and $G_M(x)$ is also tested as a second-level randomness test. The test statistic of this two-sample KS test is defined as
$$D_{N,M} = \sqrt{\frac{NM}{N+M}}\,\sup_{x \in [0, 1]} \left| F_N(x) - G_M(x) \right|. \tag{10}$$
For the two-sample KS test, the null hypothesis that $p_1, \dots, p_N$ and $q_1, \dots, q_M$ are from the same exact distribution is accepted if
$$D_{N,M} < K_\alpha \tag{11}$$
for the significance level $\alpha$.
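The two-sample statistic can be sketched as a single merge pass over the two sorted samples. This is a minimal illustration, not the implementation used in this study: the function name `ks_two_sample` is hypothetical, and the scaling by the square root of NM/(N+M), with N and M the two sample sizes, is the usual effective-sample-size convention, assumed here.

```python
import math


def ks_two_sample(p, q):
    """Scaled two-sample KS statistic for samples p (size N) and q (size M).

    Returns sqrt(N*M/(N+M)) * sup_x |F_N(x) - G_M(x)|, where F_N and G_M
    are the empirical distributions of p and q.
    """
    xs, ys = sorted(p), sorted(q)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(xs[i], ys[j])
        # Advance both pointers past every value <= x (handles ties).
        while i < n and xs[i] <= x:
            i += 1
        while j < m and ys[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    # Once one sample is exhausted the gap can only shrink, so d is final.
    return math.sqrt(n * m / (n + m)) * d


if __name__ == "__main__":
    tested = [0.12, 0.35, 0.51, 0.77, 0.93]
    reference = [0.05, 0.28, 0.44, 0.61, 0.82, 0.97]
    print(ks_two_sample(tested, reference))
```

The returned value would then be compared with the same boundary value as in the one-sample case.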
In this study, we propose to construct the reference empirical distribution using the chaotic true orbit of the Bernoulli map [6, 7]. The dynamical system given by the Bernoulli map is defined as
$$x_{n+1} = 2 x_n \bmod 1, \tag{12}$$
where $x_n \in [0, 1)$ and $n = 0, 1, 2, \dots$. By providing an irrational algebraic number as the initial state $x_0$, we can generate a chaotic true orbit with infinite precision. Then, we can obtain the binary sequence $\{\epsilon_n\}$ by assigning
$$\epsilon_n = \begin{cases} 0 & \text{if } x_n \in [0, 1/2), \\ 1 & \text{if } x_n \in [1/2, 1). \end{cases} \tag{13}$$
This binary sequence corresponds to the binary expansion of the initial state $x_0$. See [6] and [7] for mathematical support for the good statistical qualities of $\{\epsilon_n\}$.
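The correspondence between the orbit of the doubling map and the binary expansion of the initial state can be illustrated with exact integer arithmetic. The sketch below is a simplified stand-in and not the true-orbit method of [6, 7]: it assumes the initial state is the fractional part of the square root of a non-square integer, truncates it to a fixed number of bits with `math.isqrt`, and then iterates the map on that fixed-point value. The function names are hypothetical.

```python
from math import isqrt


def expansion_bits(m, count):
    """First `count` binary digits of x0 = sqrt(m) - floor(sqrt(m))."""
    r = isqrt(m)
    if r * r == m:
        raise ValueError("m must not be a perfect square")
    scaled = isqrt(m << (2 * count))  # floor(sqrt(m) * 2**count)
    frac = scaled - (r << count)      # floor(x0 * 2**count)
    return [(frac >> (count - 1 - k)) & 1 for k in range(count)]


def orbit_bits(m, count):
    """Same digits, obtained by iterating x -> 2x mod 1 and thresholding."""
    r = isqrt(m)
    if r * r == m:
        raise ValueError("m must not be a perfect square")
    one = 1 << count
    x = isqrt(m << (2 * count)) - (r << count)  # x0 as a numerator over 2**count
    bits = []
    for _ in range(count):
        bits.append(1 if 2 * x >= one else 0)  # 1 when x lies in [1/2, 1)
        x = (2 * x) % one                      # Bernoulli map on the numerator
    return bits


if __name__ == "__main__":
    print(expansion_bits(2, 16))
    print(orbit_bits(2, 16) == expansion_bits(2, 16))
```

Because the truncated state carries only `count` bits, the orbit is exact for the first `count` iterations only; longer sequences need a larger `count`.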
5 Numerical results
5.1 Examples of second-level tests based on the KS test
As a first numerical experiment, two second-level tests based on the KS test were applied to some of the first-level tests in NIST SP 800-22. One was based on the one-sample KS test with the uniform reference distribution, and the other was based on the two-sample KS test with a separately prepared empirical distribution. We performed these second-level tests ten times, wherein, for each second-level test, we used the p-values obtained by applying the first-level test to sequences with length . The tested sequences were generated by the Mersenne twister-based PRNG. The empirical distribution used as a reference was constructed based on the results of the first-level tests for the PRNG based on the chaotic true orbit of the Bernoulli map with and .
The results of the one-sample KS test and the two-sample KS test are shown in columns (a) and (b) of Table 1, respectively. Here, the mean and the standard deviation of the ten obtained p-values, and the pass rate (the number of passes divided by ten), are shown for each randomness test. For the first-level tests of Nos. 7, 11, and 13, which consist of several tests, only the result for one test is shown as an example. The random excursions test and the random excursions variant test were excluded because the number of obtained p-values varied depending on the tested sequences. The one-sample KS test with a uniform distribution failed in every run for the first-level tests of Nos. 4, 5, 6, 8, 9, and 12. However, almost all the results of the two-sample KS test with the empirical distribution were successful. These results suggest that the second-level test is improved by using the two-sample KS test with the empirical distribution constructed using a high-quality PRNG.
5.2 Examination of the derived inequality
To examine inequality (7), we numerically analyze the difference between the test statistic computed with the uniform reference distribution and that computed with the exact distribution, for a particular exact distribution. As a toy model, we consider the exact distribution , which is a piecewise linear function, given by
(14) 
where . The graph of is shown in Fig. 1. The constant in Equation (6) for and is equal to .
For a given sample size and constant parameter , we randomly generate p-values that obey the distribution and calculate the two test statistics for times. Then, we obtain the mean values of the two statistics. In Fig. 2, the difference of these mean values (circles) and the upper bound given by inequality (7) (solid line) are shown for the cases and . Here, ten samples of the difference are plotted for each sample size. As a result, the difference is less than the upper bound for both cases and converges to with increasing for . This result is consistent with inequality (7).
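An experiment of this kind can be sketched in Python as follows. The piecewise linear distribution used here is a hypothetical stand-in chosen for illustration and is not Equation (14): it has slope 1 + a on [0, 1/2] and slope 1 - a on [1/2, 1], with 0 < a < 1, so its supremum distance from the uniform distribution is a/2. All function names and parameter values are assumptions.

```python
import math
import random


def ks_statistic(samples, cdf):
    """Scaled one-sample KS statistic for a continuous reference `cdf`."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return math.sqrt(n) * d


def toy_cdf(x, a):
    """Hypothetical piecewise linear CDF on [0, 1] (not Equation (14))."""
    return (1 + a) * x if x <= 0.5 else (1 - a) * x + a


def toy_inverse(u, a):
    """Inverse of toy_cdf, used for inverse-transform sampling."""
    return u / (1 + a) if u <= (1 + a) / 2 else (u - a) / (1 - a)


def mean_difference(n, a, trials, rng):
    """Mean over `trials` of (statistic vs uniform) - (statistic vs exact)."""
    total = 0.0
    for _ in range(trials):
        p = [toy_inverse(rng.random(), a) for _ in range(n)]
        d_uniform = ks_statistic(p, lambda x: x)
        d_exact = ks_statistic(p, lambda x: toy_cdf(x, a))
        total += d_uniform - d_exact
    return total / trials


if __name__ == "__main__":
    rng = random.Random(0)
    a = 0.1
    for n in (100, 1000, 10000):
        bound = math.sqrt(n) * a / 2  # sqrt(n) times the supremum gap a/2
        print(n, mean_difference(n, a, 20, rng), bound)
```

Since the triangle inequality holds for every individual sample, the mean difference can never exceed the bound in this sketch, whatever seed is used.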
5.3 Safe sample sizes for the frequency test and the binary matrix rank test
Here, we analyze the frequency test and the binary matrix rank test shown in Table 1 as examples. The frequency test was analyzed by Pareschi et al. [3] as an example of tests based on the binomial distribution. As a different example, we analyzed the binary matrix rank test, which is based on the trinomial distribution. The binary matrix rank test also failed for the one-sample KS test with a uniform distribution. For these two tests, we calculated the exact distributions for the sequence length and obtained the exact value of the constant in Equation (6) [11]. The mean values of the test statistics with the uniform reference distribution and with the exact distribution, and their difference, were also calculated from the test results shown in column (a) of Table 1. The results are shown in Table 2, together with the range of the standard error of the mean (SEM). The difference is less than the upper bound given by inequality (7) for both tests, and these results are consistent with inequality (7).
For the safety of these tests, we can obtain the maximum sample size for the given admissible difference of the expected values of the two test statistics, as mentioned in Section 3. For example, if , which is 10% of the boundary value , is admissible, the maximum sample size is for the frequency test and for the binary matrix rank test. Furthermore, the sample size , which is the recommended parameter of NIST SP 800-22, is safe if is admissible for the frequency test, and is admissible for the binary matrix rank test.
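Table 2: Mean values (± SEM) of the test statistics, their difference, and the upper bound of the difference.

| Quantity | Frequency Test | Binary Matrix Rank Test |
|---|---|---|
| Mean of the statistic with the uniform reference distribution | 1.319 ± 0.114 | 5.082 ± 0.107 |
| Mean of the statistic with the exact distribution | 0.863 ± 0.057 | 0.840 ± 0.093 |
| Difference of the means | 0.456 ± 0.100 | 4.242 ± 0.120 |
| Upper bound from inequality (7) | 0.798 | 4.860 |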
6 Conclusion
In this work, we derived an inequality that provides an upper bound on the difference between the expected values of the test statistics for the second-level randomness test based on the KS test. The derived inequality was numerically examined, and consistent results were obtained. In addition, for several randomness tests in NIST SP 800-22, we examined the second-level test that uses the two-sample KS test with the nearly ideal empirical distribution constructed from the PRNG based on the chaotic true orbit. These results are expected to prove useful for evaluating the safety of the randomness test using the KS test. We intend to analyze other goodness-of-fit tests, such as the Cramér-von Mises test and the Anderson-Darling test, in future work.
Acknowledgement
This work was supported by JSPS KAKENHI Grant Numbers 16KK0005, 17K00355. The computation was carried out using the computer resources offered under the category of General Projects by the Research Institute for Information Technology, Kyushu University.
References
 [1] L. E. Bassham et al., NIST SP 800-22 Rev. 1a: A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications, 2010, https://csrc.nist.gov/publications/detail/sp/800-22/rev-1a/final (accessed 27 Sep. 2021).
 [2] P. L'Ecuyer and R. Simard, TestU01: A C Library for Empirical Testing of Random Number Generators, ACM Transactions on Mathematical Software, 33 (2007), 1–40.
 [3] F. Pareschi, R. Rovatti and G. Setti, On Statistical Tests for Randomness Included in the NIST SP 800-22 Test Suite and Based on the Binomial Distribution, IEEE Transactions on Information Forensics and Security, 7 (2012), 491–505.
 [4] H. Haramoto, Study on Upper Limit of Sample Size for a Two-level Test in NIST SP 800-22, Japan J. Indust. Appl. Math., 38 (2021), 193–209.
 [5] A. Yamaguchi and A. Saito, On the statistical test of randomness based on the uniformity of p-values used in NIST statistical test suite (in Japanese), in: Proc. of the 2015 JSIAM Annual Meeting, pp. 34–35, JSIAM, 2015.
 [6] A. Saito and A. Yamaguchi, Pseudorandom Number Generation using Chaotic True Orbits of the Bernoulli Map, Chaos, 26 (2016), 063112.
 [7] A. Saito and A. Yamaguchi, Pseudorandom Number Generator based on the Bernoulli Map on Cubic Algebraic Integers, Chaos, 28 (2018), 103122.
 [8] W. H. Press et al., Numerical Recipes in C: The Art of Scientific Computing, Cambridge University Press, Cambridge, 1992.
 [9] G. Marsaglia, W. W. Tsang and J. Wang, Evaluating Kolmogorov's Distribution, J. Statistical Software, 8 (2003), 1–4.
 [10] M. Matsumoto and T. Nishimura, A Nonempirical Test on the Weight of Pseudorandom Number Generators, in: Proc. of Monte Carlo and Quasi-Monte Carlo Methods 2000, pp. 381–395, Springer, 2002.
 [11] A. Yamaguchi and A. Saito, Analysis of the Effect of Discreteness of the p-value Distribution on the Randomness Test using the Goodness-of-Fit Test with a Uniform Distribution (in Japanese), in: Proc. of the 2018 JSIAM Annual Meeting, pp. 123–124, JSIAM, 2018.