1 Introduction
For two independent samples, the two-sample problem is concerned with determining whether the two samples are generated from the same population. Although it is considered an old problem in statistics, it continues to attract the attention of researchers due to its applications in different fields. For instance, in medical studies, one may want to assess the efficacy of a new drug administered to two groups of patients.
The two-sample problem can be stated formally as follows. Given two independent samples x = (x_1, ..., x_n) from F_X and y = (y_1, ..., y_m) from F_Y, with F_X and F_Y being unknown continuous cumulative distribution functions (cdf's), the aim is to test the null hypothesis

H_0 : F_X = F_Y

against all other alternatives. The methodology developed in this paper is Bayesian and is inspired by the recent work of Al-Labadi and Evans (2018) on model checking. First, two Dirichlet processes DP(a_1, H_1) and DP(a_2, H_2) are considered as priors for F_X and F_Y, respectively. Then the concentration of the posterior distribution of the distance between the two processes is compared to the concentration of the prior distribution of that distance. If the posterior is more concentrated about 0 than the prior, this is evidence in favour of H_0, and if the posterior is less concentrated, this is evidence against H_0. This comparison is made through a particular measure of evidence known as the relative belief ratio, which indicates whether there is evidence for or against H_0. Moreover, a calibration of this evidence is provided, indicating whether the evidence for or against the hypothesis is strong or weak. The proposed methodology is simple and general and does not require a closed form of the relative belief ratio. More details about relative belief ratios are given in Section 2 of this paper.
Developing procedures for hypothesis testing has recently received considerable attention in the literature on Bayesian nonparametric inference. A main stream of these procedures has focused on embedding the suggested model as a null hypothesis in a larger family of distributions. Then priors are placed on the null and the alternative and a Bayes factor is computed. For instance, Florens, Richard, and Rolin (1996) used a Dirichlet process for the prior on the alternative. Carota and Parmigiani (1996), Verdinelli and Wasserman (1998), Berger and Guglielmi (2001) and McVinish, Rousseau, and Mengersen (2009) considered a mixture of Dirichlet processes, a mixture of Gaussian processes, a mixture of Pólya trees and a mixture of triangular distributions, respectively, for the prior on the alternative. Another approach to model testing is based on placing a prior on the true distribution generating the data and measuring the distance between the posterior distribution and the proposed one. Swartz (1999) and Al-Labadi and Zarepour (2013, 2014a) considered the Dirichlet process prior and used the Kolmogorov distance to derive a goodness-of-fit test for continuous models. Viele (2000) used the Dirichlet process and the Kullback–Leibler distance to test discrete models. Hsieh (2011) used the Pólya tree prior and the Kullback–Leibler distance to test continuous distributions. The work described above focuses only on goodness-of-fit tests and model checking. With regard to the two-sample problem, the literature is very scarce and scattered. Some exceptions include the remarkable work of Holmes, Caron, Griffin, and Stephens (2015), who developed a way to compute the Bayes factor for testing the null hypothesis through the marginal likelihood of the data with Pólya tree priors centered either subjectively or using an empirical procedure.
Under the null hypothesis, they modeled the two samples as coming from a single random measure distributed as a Pólya tree, whereas under the alternative hypothesis the two samples come from two separate Pólya tree random measures. Ma and Wong (2011) allowed the two distributions to be generated jointly through optional coupling of Pólya tree priors. Borgwardt and Ghahramani (2009) discussed two-sample tests based on Dirichlet process mixture models and derived a formula to compute the Bayes factor in this case. An extension of the Bayes factor approach based on Pólya tree priors to cover censored and multivariate data was proposed by Chen and Hanson (2014). Huang and Ghosh (2014) considered two-sample hypothesis testing problems under Pólya tree priors and Lehmann alternatives. Shang and Reilly (2017) introduced a class of tests that use the connection between the Dirichlet process prior and the Wilcoxon rank sum test. They also extended their idea using the Dirichlet process mixture prior and developed a Bayesian counterpart to the Wilcoxon rank sum statistic and the weighted log rank statistic for right- and interval-censored data. In a recent work, Al-Labadi and Zarepour (2017) proposed a method based on the Kolmogorov distance and samples from the Dirichlet process to assess the equality of two unknown distributions, where the distance between two posterior Dirichlet processes is compared with a reference distance. The parameters of the two Dirichlet processes are chosen so that any discrepancy between the posterior distance and the reference distance can be attributed only to the difference between the two samples.
In Section 3, the Dirichlet process prior is briefly reviewed. In Section 4, the Cramér–von Mises distance between two Dirichlet processes is considered and several of its theoretical properties are developed. Section 5 addresses setting the parameters of the two Dirichlet processes. In Section 6, a computational algorithm for the approach is developed. Section 7 presents several examples where the behaviour of the approach is inspected. Finally, some concluding remarks are made in Section 8. The proofs are placed in the Appendix.
2 Relative Belief Ratios
In this section, for the reader's convenience, some background on relative belief ratios is provided. For more details on this topic consult, for example, Evans (2015). Let {f_θ : θ ∈ Θ} be a collection of densities on a sample space and let π be a prior on Θ. The posterior distribution of θ given the data x is denoted by π(θ | x). For an arbitrary parameter of interest ψ = Ψ(θ), the prior and posterior densities of ψ are denoted by π_Ψ and π_Ψ(· | x), respectively. The relative belief ratio for a value ψ is then defined by RB_Ψ(ψ | x) = lim_{δ→0} Π_Ψ(N_δ(ψ) | x) / Π_Ψ(N_δ(ψ)), where N_δ(ψ) is a sequence of neighbourhoods of ψ converging nicely (see, for example, Rudin (1974)) to {ψ} as δ → 0. Quite generally,

RB_Ψ(ψ | x) = π_Ψ(ψ | x) / π_Ψ(ψ),     (1)

the ratio of the posterior density to the prior density at ψ. That is, RB_Ψ(ψ | x) measures how beliefs that ψ is the true value have changed from a priori to a posteriori. Note that a relative belief ratio is similar to a Bayes factor, as both are measures of evidence, but the latter measures this via the change in an odds ratio. The relationship between relative belief ratios and Bayes factors is discussed in detail in Baskurt and Evans (2013). In particular, when a Bayes factor is defined via a limit in the continuous case, the limiting value is the corresponding relative belief ratio.
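As a concrete parametric illustration of (1) (not part of the paper's nonparametric setting; the Beta-binomial model and all numbers below are chosen purely for illustration), the relative belief ratio at a hypothesized value is simply the posterior density divided by the prior density:

```python
from scipy.stats import beta

# Illustrative conjugate setup: Beta(1, 1) prior on a binomial success
# probability theta, with s successes observed in n trials.
a0, b0 = 1.0, 1.0
n, s = 20, 10
theta0 = 0.5          # hypothesized value of theta

prior = beta.pdf(theta0, a0, b0)
post = beta.pdf(theta0, a0 + s, b0 + n - s)   # posterior is Beta(11, 11)
rb = post / prior                              # relative belief ratio (1)

# rb > 1 means the data increased the belief that theta0 is the true
# value, i.e. evidence in favour of theta0.
```

Here the data (10 successes in 20 trials) are compatible with theta0 = 0.5, so the ratio exceeds 1.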
By a basic principle of evidence, RB_Ψ(ψ | x) > 1 means that the data led to an increase in the probability that ψ is correct, and so there is evidence in favour of ψ, while RB_Ψ(ψ | x) < 1 means that the data led to a decrease in the probability that ψ is correct, and so there is evidence against ψ. Clearly, when RB_Ψ(ψ | x) = 1, there is no evidence either way. Thus, the value RB_Ψ(ψ_0 | x) measures the evidence for the hypothesis H_0 = {ψ_0}. It is also important to calibrate whether this is strong or weak evidence for or against H_0. As suggested in Evans (2015), a useful calibration of RB_Ψ(ψ_0 | x) is obtained by computing the tail probability

Π_Ψ( RB_Ψ(ψ | x) ≤ RB_Ψ(ψ_0 | x) | x ).     (2)

One way to view (2) is as the posterior probability that the true value of ψ has a relative belief ratio no greater than that of the hypothesized value ψ_0. When RB_Ψ(ψ_0 | x) < 1, so there is evidence against ψ_0, a small value for (2) indicates a large posterior probability that the true value has a relative belief ratio greater than RB_Ψ(ψ_0 | x), and so there is strong evidence against ψ_0. When RB_Ψ(ψ_0 | x) > 1, so there is evidence in favour of ψ_0, a large value for (2) indicates a small posterior probability that the true value has a relative belief ratio greater than RB_Ψ(ψ_0 | x), and so there is strong evidence in favour of ψ_0, while a small value of (2) indicates only weak evidence in favour of ψ_0.

3 The Dirichlet Process
In this section, a concise summary of the Dirichlet process is given. Because of its attractive features, the Dirichlet process, formally introduced in Ferguson (1973), is the most well-known and widely used prior in Bayesian nonparametric inference. Consider a space X with a σ-algebra A of subsets of X. Let H be a fixed probability measure on (X, A), called the base measure, and let a be a positive number, called the concentration parameter. Following Ferguson (1973), a random probability measure P is called a Dirichlet process on (X, A) with parameters a and H, denoted by DP(a, H), if for any finite measurable partition {A_1, ..., A_k} of X with k ≥ 2, (P(A_1), ..., P(A_k)) ~ Dirichlet(aH(A_1), ..., aH(A_k)). It is assumed that if H(A_j) = 0, then P(A_j) = 0 with probability one. Note that, for any A in A, E(P(A)) = H(A) and Var(P(A)) = H(A)(1 − H(A))/(1 + a). Thus, H can be viewed as the center of the process. On the other hand, a controls the concentration: the larger the value of a, the more likely it is that P will be close to H. We refer the reader to Al-Labadi and Abdelrazeq (2017) for additional interesting asymptotic properties of the Dirichlet process and other nonparametric priors.
A distinctive feature of the Dirichlet process, among many other nonparametric priors, is its conjugacy. Specifically, if x = (x_1, ..., x_n) is a sample from P ~ DP(a, H), then the posterior distribution of P is DP(a + n, H_x), where

H_x = a/(a + n) H + n/(a + n) F_n,     (3)

with F_n = n^{-1} Σ_{i=1}^n δ_{x_i} and δ_{x_i} the Dirac measure at x_i. Notice that H_x is a convex combination of the prior base distribution and the empirical distribution. Clearly, H_x → H as a → ∞, while H_x → F_n as n → ∞.
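The posterior base measure in (3) is straightforward to compute. A minimal sketch (the function name is ours, not the paper's; H is any prior base cdf supplied as a callable):

```python
import numpy as np

def posterior_base_cdf(H, data, a):
    """Return the posterior base cdf H_x = a/(a+n) H + n/(a+n) F_n
    from (3), where H is the prior base cdf and F_n the empirical
    cdf of the observed data."""
    data = np.asarray(data, dtype=float)
    n = len(data)
    def H_x(t):
        F_n = np.mean(data <= t)                # empirical cdf at t
        return (a * H(t) + n * F_n) / (a + n)
    return H_x
```

As the text notes, for small a the mixture is dominated by the empirical cdf, and for large a by the prior base cdf.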
Following Ferguson (1973), P ~ DP(a, H) has the series representation

P = Σ_{i=1}^∞ J_i δ_{Y_i},     (4)

where J_i = L^{-1}(Γ_i) / Σ_{j≥1} L^{-1}(Γ_j) with L(x) = a ∫_x^∞ t^{-1} e^{-t} dt, Γ_i = E_1 + ... + E_i with (E_i)_{i≥1} i.i.d. exponential(1) random variables, and (Y_i)_{i≥1} i.i.d. from H, independent of (Γ_i)_{i≥1}. It follows clearly from (4) that a realization of the Dirichlet process is a discrete probability measure, even when the base measure H is absolutely continuous. The discreteness of P may be compared with the discreteness of the empirical distribution F_n. Note that, since data are always measured to finite accuracy, the true distribution being sampled from is discrete, so the discreteness of P is no significant practical limitation. Indeed, under the weak topology, the support of the Dirichlet process is quite large: it is the set of all probability measures whose support is contained in the support of the base measure. This means that if the support of the base measure is X, then the space of all probability measures on X is the support of the Dirichlet process. In particular, if the base measure is normal, then the Dirichlet process can choose any probability measure on the real line.
Zarepour and Al-Labadi (2012) derived the following series approximation with monotonically decreasing weights for the Dirichlet process:

P_N = Σ_{i=1}^N ( G_N^{-1}(Γ_i / Γ_{N+1}) / Σ_{j=1}^N G_N^{-1}(Γ_j / Γ_{N+1}) ) δ_{Y_i},     (5)

where (Γ_i) and (Y_i) are as defined in (4) and G_N(x) is the co-cdf (survival function) of the gamma(a/N, 1) distribution. They proved that, as N → ∞, P_N converges almost surely to the representation (4). Note that G_N^{-1}(p) is the (1 − p)-th quantile of the gamma(a/N, 1) distribution. This provides the following algorithm.

Algorithm A: Approximately generating a value from DP(a, H)

1. Fix a relatively large positive integer N.
2. For i = 1, ..., N, generate Y_i i.i.d. from H.
3. Independent of (Y_i), for i = 1, ..., N + 1, generate E_i i.i.d. exponential(1) and put Γ_i = E_1 + ... + E_i.
4. For i = 1, ..., N, compute the normalized weights J_{i,N} = G_N^{-1}(Γ_i / Γ_{N+1}) / Σ_{j=1}^N G_N^{-1}(Γ_j / Γ_{N+1}).
5. Use P_N = Σ_{i=1}^N J_{i,N} δ_{Y_i} in (5) to obtain an approximate value from DP(a, H).
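Algorithm A can be sketched compactly in Python (assuming scipy's gamma distribution for the inverse survival function G_N^{-1}; the helper name dp_approx is ours):

```python
import numpy as np
from scipy.stats import gamma

def dp_approx(a, base_sampler, N=500, rng=None):
    """Algorithm A: return atoms Y_i and monotonically decreasing
    normalized weights J_{i,N} approximating a draw from DP(a, H),
    as in (5)."""
    rng = np.random.default_rng(rng)
    E = rng.exponential(size=N + 1)
    Gam = np.cumsum(E)                     # Gamma_1, ..., Gamma_{N+1}
    # G_N^{-1}(p) is the (1 - p)-th quantile of gamma(a/N, 1),
    # i.e. the inverse survival function.
    J = gamma.isf(Gam[:N] / Gam[N], a / N)
    w = J / J.sum()                        # normalized weights
    Y = base_sampler(N, rng)               # i.i.d. draws from H
    return Y, w
```

A draw from the random measure is then P_N = Σ_i w_i δ_{Y_i}; for a standard normal base measure one can pass, say, base_sampler = lambda N, rng: rng.normal(size=N).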
For other simulation methods for the Dirichlet process, see, for instance, Bondesson (1982), Sethuraman (1994), Wolpert and Ickstadt (1998) and Al-Labadi and Zarepour (2014b).
Throughout the paper, the notation P may refer either to a probability measure or to its corresponding cdf, with the context determining the appropriate interpretation. That is, P(t) = P((−∞, t]) for all t.
4 Cramér–von Mises Distance
A well-known and widely used distance between two distributions is the Cramér–von Mises distance. For cdf's F and G, it is defined as

d_CvM(F, G) = ∫_{−∞}^{∞} ( F(t) − G(t) )² dG(t).
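Since draws from the Dirichlet process are discrete, this distance can be computed exactly from atoms and weights. A sketch (the helper names are ours; it uses the convention d(F, G) = ∫ (F − G)² dG, which for discrete G reduces to a weighted sum over the atoms of G):

```python
import numpy as np

def discrete_cdf(atoms, weights, t):
    """cdf of the discrete measure sum_i weights[i] * delta_{atoms[i]},
    evaluated at each point of the array t."""
    order = np.argsort(atoms)
    a, w = np.asarray(atoms)[order], np.asarray(weights)[order]
    cw = np.cumsum(w)
    idx = np.searchsorted(a, t, side="right")
    out = np.zeros_like(np.asarray(t, dtype=float))
    out[idx > 0] = cw[idx[idx > 0] - 1]
    return out

def cvm_distance(ax, wx, ay, wy):
    """d(F, G) = integral of (F - G)^2 dG for discrete F and G,
    i.e. a weighted sum over the atoms of G."""
    t = np.asarray(ay, dtype=float)
    F = discrete_cdf(ax, wx, t)
    G = discrete_cdf(ay, wy, t)
    return float(np.sum(np.asarray(wy) * (F - G) ** 2))
```

For identical measures the distance is 0, and for two point masses far apart it equals 1 under this convention.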
The next lemma demonstrates that, as the sample sizes get large, the Cramér–von Mises distance between the posterior Dirichlet processes converges to the Cramér–von Mises distance between the true distributions generating the data.
Lemma 1
Let x = (x_1, ..., x_n) and y = (y_1, ..., y_m) be two independent samples from F_X and F_Y, with F_X and F_Y being continuous cdf's. Let P_X | x ~ DP(a_1 + n, H_{1,x}) and P_Y | y ~ DP(a_2 + m, H_{2,y}). Then, as n, m → ∞, d_CvM(P_X, P_Y) → d_CvM(F_X, F_Y) almost surely.
The next lemma shows that the posterior distribution of the distance becomes concentrated around 0 as the sample sizes increase if and only if H_0 holds. The proof follows straightforwardly from Lemma 1.
Lemma 2
Let P_X | x ~ DP(a_1 + n, H_{1,x}) and P_Y | y ~ DP(a_2 + m, H_{2,y}), with F_X and F_Y being continuous cdf's. As n, m → ∞, (i) if H_0 is true, then d_CvM(P_X, P_Y) → 0 almost surely, and (ii) if H_0 is false, then d_CvM(P_X, P_Y) → d_CvM(F_X, F_Y) > 0 almost surely.
The following result allows the use of the approximation (5) when considering the prior and posterior distributions of the Cramér–von Mises distance.
Lemma 3
Let P_X ~ DP(a_1, H_1) and P_Y ~ DP(a_2, H_2). Let P_{X,N} and P_{Y,N} be the approximations of P_X and P_Y, respectively, as defined in (5). Then, as N → ∞, d_CvM(P_{X,N}, P_{Y,N}) → d_CvM(P_X, P_Y) almost surely.
The next lemma demonstrates that the distribution of the distance between two Dirichlet processes is independent of the base measures. This result will play a key role in the proposed approach.
Lemma 4
Let P_X ~ DP(a_1, H_1) and P_Y ~ DP(a_2, H_2), where H_1 and H_2 are continuous. If H_1 = H_2, then the distribution of d_CvM(P_X, P_Y) does not depend on H_1 and H_2.
5 The Approach
Let x = (x_1, ..., x_n) and y = (y_1, ..., y_m) be independent samples from F_X and F_Y, with F_X and F_Y being unknown continuous cdf's. The goal is to test the null hypothesis H_0 : F_X = F_Y. To this end, we use the priors P_X ~ DP(a_1, H_1) and P_Y ~ DP(a_2, H_2) so, by (3), P_X | x ~ DP(a_1 + n, H_{1,x}) and P_Y | y ~ DP(a_2 + m, H_{2,y}). From Lemma 1, the posterior processes almost surely approximate F_X and F_Y. Thus, if H_0 is true, the posterior distribution of the distance D = d_CvM(P_X, P_Y) should be more concentrated about 0 than the prior distribution of the distance. For example, in Figure 1a (see Example 1), since H_0 is true, the plot of the posterior density of D is much more concentrated about 0 than the plot of the prior density of D. So, the proposed test compares the concentrations of the prior and posterior distributions of D via a relative belief ratio based on D, with the interpretation as discussed in Section 2.
The success of the approach depends significantly on a suitable selection of the parameters of DP(a_1, H_1) and DP(a_2, H_2). As illustrated below, inappropriate values of the parameters can lead to a failure in computing the relative belief ratio. We first discuss setting the values of H_1 and H_2. By Lemma 4, the distribution of the prior distance is independent of the choice of the base measures when H_1 = H_2, where both need to be continuous. Thus, we suggest setting H_1 = H_2, although other choices of continuous distributions are certainly possible. An additional and important reason supporting this choice concerns prior–data conflict (Evans and Moshonov, 2006; Al-Labadi and Evans, 2017). Prior–data conflict means that there is only a tiny overlap between the effective support regions of the prior and the data. In this context, the existence of prior–data conflict can lead to a failure in computing the distribution of D about 0. To avoid prior–data conflict, it is necessary that H_1 and H_2 share the same effective support as the data (note that P_X and P_Y have the same support as H_1 and H_2, respectively), which can be secured by choosing H_1 = H_2 compatible with the location and scale of the observed samples. The effect of prior–data conflict is demonstrated in Section 7, Table 2.
The selection of a_1 and a_2 is also important. It is possible to consider several values of a_1 and a_2. In general, suitable values of a_1 and a_2 depend on n and m, respectively. As indicated in Al-Labadi and Zarepour (2017), the concentration parameters should be kept small relative to the sample sizes, as otherwise the prior may become too influential. Holmes et al. (2015) recommend using values between 1 and 10 and checking the sensitivity of the results to the chosen values. The following algorithm outlines a procedure for selecting the concentration parameters.
Algorithm B: Selection of concentration parameters
1. Start by setting a_1 = a_2 = 1 and compute the relative belief ratio and its strength. Algorithm C in the next section addresses these computations.
2. Consider more concentrated priors by setting larger values of a_1 and a_2.
3. Compute the corresponding relative belief ratio and strength. There are two scenarios:

- If the value of the relative belief ratio in step 1 is less (greater) than 1 and the new value is also less (greater) than 1, then there is consistent evidence against (in favour of) H_0.

- If the values of the relative belief ratio in steps 1 and 3 fall on opposite sides of 1, then the evidence is sensitive to the concentration parameters, and step 2 should be repeated with other values before a conclusion is drawn.
Algorithm B is further explored in Table 1 of Section 7. In most cases, setting a_1 = a_2 = 1 is found to be adequate.
6 Computations
Closed forms of the densities of the prior and posterior distances are typically not available. Thus, the relative belief ratios need to be approximated via simulation. The following algorithm, a revised version of Algorithm B of Al-Labadi and Evans (2018), gives a computational procedure for testing H_0.

Algorithm C: Relative belief algorithm for the two-sample problem
1. Use Algorithm A to (approximately) generate a P_X from DP(a_1, H_1) and a P_Y from DP(a_2, H_2).
2. Compute the distance D = d_CvM(P_X, P_Y).
3. Repeat steps (1)–(2) to obtain a sample of r values from the prior distribution of D.
4. Use Algorithm A to (approximately) generate a P_X from DP(a_1 + n, H_{1,x}) and a P_Y from DP(a_2 + m, H_{2,y}).
5. Compute D = d_CvM(P_X, P_Y), now a value from the posterior distribution of D.
6. Repeat steps (4)–(5) to obtain a sample of r values from the posterior distribution of D.
7. Let M be a positive integer. Let F̂_D denote the empirical cdf of D based on the prior sample in step 3, and for p in [0, 1] let d̂_p be the estimate of d_p, the p-th prior quantile of D. Here d̂_0 = 0, and d̂_1 is the largest value of D in the prior sample. Let F̂_D(· | x, y) denote the empirical cdf of D based on the posterior sample in step 6. For d in [d̂_{(i−1)/M}, d̂_{i/M}), estimate RB_D(d | x, y) by

M { F̂_D(d̂_{i/M} | x, y) − F̂_D(d̂_{(i−1)/M} | x, y) },     (6)

the ratio of the estimates of the posterior and prior contents of [d̂_{(i−1)/M}, d̂_{i/M}). It follows that we estimate RB_D(0 | x, y) by the estimate (6) for the interval containing 0, where M is chosen so that the prior content 1/M of each interval is not too small.
8. Estimate the strength Π_D( RB_D(d | x, y) ≤ RB_D(0 | x, y) | x, y ) by the finite sum

Σ_{ i : RB estimate for [d̂_{(i−1)/M}, d̂_{i/M}) ≤ RB_D(0 | x, y) } { F̂_D(d̂_{i/M} | x, y) − F̂_D(d̂_{(i−1)/M} | x, y) }.     (7)

For fixed M, as r → ∞, d̂_{i/M} converges almost surely to d_{i/M}, and (6) and (7) converge almost surely to RB_D(d | x, y) and the strength, respectively.
9. As detailed in Algorithm B, repeat steps (1)–(8) for larger values of a_1 and a_2.
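Steps 7–8 can be sketched as follows (a simplified version in which the first prior-quantile bin plays the role of the interval containing 0; the function and variable names are ours):

```python
import numpy as np

def rb_and_strength(prior_d, post_d, M=20):
    """Estimate RB_D(0 | data) and its strength from Monte Carlo
    samples of the prior and posterior distance D."""
    # Bin edges at the prior quantiles i/M, so every bin has prior
    # content exactly 1/M.
    edges = np.quantile(prior_d, np.arange(1, M) / M)
    bins = np.concatenate(([-np.inf], edges, [np.inf]))
    post_content, _ = np.histogram(post_d, bins=bins)
    post_content = post_content / len(post_d)
    rb = M * post_content          # per-bin RB estimate, as in (6)
    rb0 = rb[0]                    # bin containing the smallest distances
    strength = post_content[rb <= rb0].sum()   # finite sum (7)
    return rb0, strength
```

When the posterior distances pile up near 0 (H_0 plausible), rb0 exceeds 1 with strength near 1; when they sit well away from 0, rb0 falls below 1 with strength near 0.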
The following proposition establishes the consistency of the approach to the two-sample problem as the sample sizes increase, so the procedure reaches the correct conclusion, as the sample sizes increase, both when H_0 is true and when it is false. The proof follows immediately from Evans (2015), Section 4.7.1.
Proposition 5
Consider the discretization { [0, d_{1/M}), [d_{1/M}, d_{2/M}), ..., [d_{(M−1)/M}, ∞) } of the prior distribution of D = d_CvM(P_X, P_Y). As n, m → ∞,
(i) if H_0 is true, then RB_D(0 | x, y) → M and the strength converges to 1, and
(ii) if H_0 is false and d_CvM(F_X, F_Y) > d_{1/M}, then RB_D(0 | x, y) → 0 and the strength converges to 0.
7 Examples
In this section, the approach is illustrated through three examples. In Examples 1 and 2, the methodology is assessed using simulated samples from a variety of distributions and in Example 3 an application to a real data set is presented.
The following notation is used for the distributions in the tables: N(μ, σ²) is the normal distribution with mean μ and variance σ², t(ν) is Student's t distribution with ν degrees of freedom, exp(θ) is the exponential distribution with mean θ, and U(a, b) is the uniform distribution over (a, b). For all cases, we fix N in Algorithm A and set the concentration parameters a_1 = a_2 = a in Algorithm B to the values reported in the tables. The results are also compared with the frequentist Cramér–von Mises (CvM) test. To calculate p-values of the CvM test, the R function "cramer.test" is used. We also compared our results with the Bayesian nonparametric tests of Holmes et al. (2015) and Al-Labadi and Zarepour (2017); since the obtained results are similar, only the results of the new approach are reported.

Example 1. Consider samples generated from the distributions in Table 1, where each sample is of size 50 (Cases 1–9). These distributions are also considered in Holmes et al. (2015) and Al-Labadi and Zarepour (2017). To study the sensitivity of the approach to the choice of the concentration parameters, various values of a are considered. The results are reported in Table 1. Recall that we want RB > 1 with strength close to 1 when H_0 is true, and RB < 1 with strength close to 0 when H_0 is false. It follows that the methodology performs well in all cases. For example, in Case 1, since RB > 1 and the strength is 1, there is no reason to doubt that the two sampling distributions are identical. On the other hand, in Case 2, since RB = 0 and the strength is 0, the two samples are drawn from two different distributions. We point out that the standard Cramér–von Mises test fails to recognize the difference in Case 6 (p-value 0.1169). Notice that, in all cases, the appropriate conclusion is attained with a = 1. The other values of a considered in Table 1 support the reached conclusions.
Figure 1 provides plots of the densities of the prior and posterior distances for some cases in Example 1. It follows, for instance, from Figure 1 that the posterior density of the distance is much more concentrated about 0 than the prior density when the two distributions are equal, but not otherwise.
Table 1: Relative belief ratios, strengths and CvM p-values for Cases 1–9.

Samples   a    RB (Strength)   CvM p-value
Case 1    1    9.40 (1)        0.2977
          10   8.54 (1)
          20   4.48 (0.776)
Case 2    1    0 (0)           0.0000
          10   0 (0)
          20   0 (0)
Case 3    1    0 (0)           0.0030
          10   0.08 (0.004)
          20   0 (0)
Case 4    1    0 (0)           0.0000
          10   0 (0)
          20   0 (0)
Case 5    1    9.40 (1)        0.4316
          10   8.60 (1)
          20   5.78 (1)
Case 6    1    0 (0)           0.1169
          10   0.28 (0.023)
          20   0.04 (0.002)
Case 7    1    0.02 (0.001)    0.0000
          10   0.02 (0.001)
          20   0.02 (0.001)
Case 8    1    0.10 (0)        0.0020
          10   0.12 (0.006)
          20   0.06 (0.003)
Case 9    1    9.02 (1)        0.6134
          10   7.30 (1)
          20   4.86 (1)
It is also interesting to consider the effect of prior–data conflict on the methodology. As discussed in Section 5, prior–data conflict occurs whenever there is only a tiny overlap between the effective supports of the prior and the data. Table 2 gives the outcomes for a particular pair of samples with various choices of a and of the common base measure H_1 = H_2. The correct conclusion is obtained only when the base measures are compatible with the data, which illustrates the importance of the choice of H_1 = H_2 in the priors.
Table 2: Effect of prior–data conflict.

Distribution   RB (Strength)   CvM p-value
                               0.0000
Figure 2 also provides plots of the densities of the prior and posterior distances for the cases in Table 2. It follows that the correct conclusion is obtained only when the base measures are chosen appropriately.
Example 2. In this example, we explore the performance of the proposed test as the sample sizes increase. We consider samples from two pairs of distributions: Case 1, where the two sampling distributions are equal, and Case 2, where they differ. The results are summarized in Table 3. The null hypothesis is not rejected in Case 1 but is rejected in Case 2. Clearly, the proposed approach works well even with small sample sizes.
Table 3: Results for increasing sample sizes.

Sample sizes   Case 1 RB (Strength)   p-value   Case 2 RB (Strength)   p-value
               1.80 (0.586)           0.7083    0.36 (0.02)            0.1628
               1.24 (0.250)           0.8132    0.48 (0.064)           0.1359
               3.48 (0.538)           0.9261    0.08 (0.004)           0.0069
               2.64 (0.422)           0.7103    0.12 (0.010)           0.0170
               5.60 (1)               0.5864    0.08 (0.006)           0.0020
               9.40 (1)               0.2977    0 (0)                  0.0030
               13.08 (1)              0.4236    0 (0)                  0.0000
               17.88 (1)              0.2697    0 (0)                  0.0000
Example 3. The proposed approach to the two-sample problem is illustrated on the chickwts data in R, where weights in grams are recorded for six groups of newly hatched chicks fed different supplements. The goal of this experiment was to measure and compare the effectiveness of various feed supplements on the growth rate of chickens. The first hypothesis of interest is whether the distributions of the weights of chicks fed soybean and linseed supplements differ. The second hypothesis examines whether the distributions of chick weights for the sunflower and linseed groups differ. The ordered chick weights for the three samples are:
soybean: 158 171 193 199 230 243 248 248 250 267 271 316 327 329
linseed: 141 148 169 181 203 213 229 244 257 260 271 309
sunflower: 226 295 297 318 320 322 334 339 340 341 392 423
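As a quick frequentist cross-check on these data, scipy's two-sample Cramér–von Mises test can be used (note the paper uses the R function cramer.test, so the p-values need not match Table 4 exactly):

```python
from scipy.stats import cramervonmises_2samp

soybean = [158, 171, 193, 199, 230, 243, 248, 248, 250, 267, 271, 316, 327, 329]
linseed = [141, 148, 169, 181, 203, 213, 229, 244, 257, 260, 271, 309]
sunflower = [226, 295, 297, 318, 320, 322, 334, 339, 340, 341, 392, 423]

r1 = cramervonmises_2samp(soybean, linseed)    # similar weight distributions
r2 = cramervonmises_2samp(sunflower, linseed)  # clearly separated distributions
```

The sunflower versus linseed comparison yields a far smaller p-value than soybean versus linseed, in line with the conclusions drawn from Table 4.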
The values recorded in Table 4 do not support the claim that the distributions of the weights of chicks fed soybean and linseed supplements differ. On the other hand, they indicate that the sunflower and linseed groups do differ.
Table 4: Results for the chickwts data.

Samples                           a   RB (Strength)   p-value
F_X: soybean & F_Y: linseed       1                   0.3487
                                  2
                                  3
                                  4
                                  5
F_X: soybean & F_Y: sunflower     1                   0
                                  2
                                  3
                                  4
                                  5
8 Concluding Remarks
A Bayesian approach to the two-sample problem based on the use of the Dirichlet process and relative belief has been developed. Implementing the approach is fairly simple and does not require obtaining a closed form of the relative belief ratio. Through several examples, it has been shown that the approach performs extremely well. While the Cramér–von Mises distance has been used in this paper, other distance measures such as the Anderson–Darling distance and the Kullback–Leibler distance are possible. It is also possible to extend the approach to cover the case of censored data.
References

[1] Al-Labadi, L., and Abdelrazeq, I. (2017). On functional central limit theorems of Bayesian nonparametric priors. Statistical Methods & Applications, 26, 215–229.
[2] Al-Labadi, L., and Evans, M. (2018). Prior-based model checking. To appear in Canadian Journal of Statistics.
[3] Al-Labadi, L., and Evans, M. (2017). Optimal robustness results for relative belief inferences and the relationship to prior–data conflict. Bayesian Analysis, 12, 705–728.
[4] Al-Labadi, L., and Zarepour, M. (2014a). Goodness of fit tests based on the distance between the Dirichlet process and its base measure. Journal of Nonparametric Statistics, 26, 341–357.
[5] Al-Labadi, L., and Zarepour, M. (2014b). On simulations from the two-parameter Poisson–Dirichlet process and the normalized inverse-Gaussian process. Sankhyā A, 76, 158–176.
[6] Al-Labadi, L., and Zarepour, M. (2013). A Bayesian nonparametric goodness of fit test for right censored data based on approximate samples from the beta-Stacy process. Canadian Journal of Statistics, 41, 466–487.
[7] Al-Labadi, L., and Zarepour, M. (2017). Two-sample Kolmogorov–Smirnov test using a Bayesian nonparametric approach. Mathematical Methods of Statistics, 26, 212–225.
[8] Baskurt, Z., and Evans, M. (2013). Hypothesis assessment and inequalities for Bayes factors and relative belief ratios. Bayesian Analysis, 8, 569–590.
[9] Berger, J. O., and Guglielmi, A. (2001). Bayesian testing of a parametric model versus nonparametric alternatives. Journal of the American Statistical Association, 96, 174–184.
[10] Bondesson, L. (1982). On simulation from infinitely divisible distributions. Advances in Applied Probability, 14, 855–869.
[11] Borgwardt, K. M., and Ghahramani, Z. (2009). Bayesian two-sample tests. http://arxiv.org/abs/0906.4032.
[12] Carota, C., and Parmigiani, G. (1996). On Bayes factors for nonparametric alternatives. In Bayesian Statistics 5 (J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith, eds.). Oxford University Press, London.
[13] Chen, Y., and Hanson, T. (2014). Bayesian nonparametric k-sample tests for censored and uncensored data. Computational Statistics and Data Analysis, 71, 335–346.
[14] Evans, M. (2015). Measuring Statistical Evidence Using Relative Belief. Monographs on Statistics and Applied Probability 144, CRC Press, Taylor & Francis Group.
[15] Evans, M., and Moshonov, H. (2006). Checking for prior–data conflict. Bayesian Analysis, 1, 893–914.
[16] Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. Annals of Statistics, 1, 209–230.
[17] Florens, J. P., Richard, J. F., and Rolin, J. M. (1996). Bayesian encompassing specification tests of a parametric model against a nonparametric alternative. Technical Report 9608, Université Catholique de Louvain, Institut de statistique.
[18] Hsieh, P. (2011). A nonparametric assessment of model adequacy based on Kullback–Leibler divergence. Statistics and Computing, 23, 149–162.
[19] Holmes, C. C., Caron, F., Griffin, J. E., and Stephens, D. A. (2015). Two-sample Bayesian nonparametric hypothesis testing. Bayesian Analysis, 10, 297–320.
[20] James, L. F. (2008). Large sample asymptotics for the two-parameter Poisson–Dirichlet process. In Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh (B. Clarke and S. Ghosal, eds.), Institute of Mathematical Statistics, 187–199.
[21] Lavine, M. (1992). Some aspects of Pólya tree distributions for statistical modelling. Annals of Statistics, 20, 1222–1235.
[22] Ma, L., and Wong, W. H. (2011). Coupling optional Pólya trees and the two sample problem. Journal of the American Statistical Association, 106, 1553–1565.
[23] McVinish, R., Rousseau, J., and Mengersen, K. (2009). Bayesian goodness of fit testing with mixtures of triangular distributions. Scandinavian Journal of Statistics, 36, 337–354.
[24] Rudin, W. (1974). Real and Complex Analysis, Second Edition. McGraw-Hill, New York.
[25] Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Statistica Sinica, 4, 639–650.
[26] Swartz, T. B. (1999). Nonparametric goodness-of-fit. Communications in Statistics: Theory and Methods, 28, 2821–2841.
[27] Verdinelli, I., and Wasserman, L. (1998). Bayesian goodness-of-fit testing using infinite-dimensional exponential families. Annals of Statistics, 26, 1215–1241.
[28] Viele, K. (2000). Evaluating fit using Dirichlet processes. Technical Report 384, University of Kentucky, Department of Statistics.
[29] Wolpert, R. L., and Ickstadt, K. (1998). Simulation of Lévy random fields. In Practical Nonparametric and Semiparametric Bayesian Statistics (D. Dey, P. Müller, and D. Sinha, eds.), Springer, 227–242.
[30] Zarepour, M., and Al-Labadi, L. (2012). On a rapid simulation of the Dirichlet process. Statistics & Probability Letters, 82, 916–924.
Appendix A Proofs
Proof of Lemma 1. For any cdf's F and G, we have 0 ≤ (F(t) − G(t))² ≤ 1. Since, as n, m → ∞, sup_t |P_X(t) − F_X(t)| → 0 and sup_t |P_Y(t) − F_Y(t)| → 0 almost surely (James, 2008; Al-Labadi and Abdelrazeq, 2017), the dominated convergence theorem completes the proof.
Proof of Lemma 3. The proof is similar to that of Lemma 1 and is included for the sake of completeness. For any cdf's F and G, we have 0 ≤ (F(t) − G(t))² ≤ 1. Since, as N → ∞, sup_t |P_{X,N}(t) − P_X(t)| → 0 and sup_t |P_{Y,N}(t) − P_Y(t)| → 0 almost surely, the result follows by the dominated convergence theorem.
Proof of Lemma 4. Write H = H_1 = H_2. It follows from (4) that P_X = Σ_{i≥1} J_i δ_{Y_i} with (Y_i) a sequence of i.i.d. random variables with continuous distribution H, and similarly P_Y = Σ_{i≥1} J'_i δ_{Y'_i} with (Y'_i) i.i.d. H. Since H is continuous, (H(Y_i)) and (H(Y'_i)) are sequences of i.i.d. random variables with a uniform distribution on [0, 1]. Hence P_X(t) has the same distribution as P_U(H(t)) and P_Y(t) the same distribution as Q_U(H(t)), where P_U = Σ_{i≥1} J_i δ_{U_i} and Q_U = Σ_{i≥1} J'_i δ_{U'_i} are Dirichlet processes whose base measure is the Lebesgue measure λ on [0, 1]. Since H is nondecreasing and continuous, the substitution u = H(t) yields that d_CvM(P_X, P_Y) has the same distribution as d_CvM(P_U, Q_U), which does not depend on H. This shows that the distribution of d_CvM(P_X, P_Y) does not depend on the base measures H_1 and H_2 whenever H_1 = H_2.