Asymptotic Distribution and Simultaneous Confidence Bands for Ratios of Quantile Functions

The ratio of medians or other suitable quantiles of two distributions is widely used in medical research to compare treatment and control groups, and in economics to compare various economic variables when repeated cross-sectional data are available. Inspired by the so-called growth incidence curves introduced in poverty research, we argue that the ratio of quantile functions is a more appropriate and informative tool for comparing two distributions. We present an estimator for the ratio of quantile functions and develop corresponding simultaneous confidence bands, which allow one to assess the significance of certain features of the quantile functions ratio. The derived simultaneous confidence bands rely on the asymptotic distribution of the quantile functions ratio and do not require re-sampling techniques. The performance of the simultaneous confidence bands is demonstrated in simulations. An analysis of expenditure data from Uganda for the years 1999, 2002 and 2005 illustrates the relevance of our approach.


1 Introduction

Let $X_1$ and $X_2$ be two independent random variables with cumulative distribution functions $\mathbb{F}_1$ and $\mathbb{F}_2$, respectively. The corresponding quantile functions are given by $\mathbb{Q}_j = \mathbb{F}_j^{-1}$, $j = 1, 2$. In many applications it is of interest to compare quantiles of two random variables at a given $p \in (0,1)$, which can be done by considering

$$g(p) = \frac{\mathbb{Q}_2(p)}{\mathbb{Q}_1(p)}.$$

For example, if $X_1$ is income in some population at time $t_1$ and $X_2$ is income at time $t_2$, then $g(p)$ reports the proportion by which the $p$-quantile of income changed from $t_1$ to $t_2$, with $g(p) > 1$ indicating income growth. In medical research one can compare quantiles of some measures obtained in treatment and control groups, and then $g(p)$ shows the effect of the treatment on the $p$-quantile.

In applications $g(p)$ is either considered and interpreted at a fixed $p$, or the curve $g(p)$, $p \in (0,1)$, is reduced to some number. For example, Cheng and Wu (2010) as well as Wu (2010) studied the effect of cancer treatment measured by the ratio of the cancer volumes in the treatment and the control group, the so-called $T/C$-ratio. The $T/C$-ratio can be formed for the mean cancer volume or for a certain quantile of the volume in the treatment and the control group, but typically is not considered as a function of $p$. Dominici et al. (2005) and Dominici and Zeger (2005) used the whole curve $g(p)$, $p \in (0,1)$, but only to calculate the mean difference

$$\Delta = E(X_1) - E(X_2) = \int_0^1 \{\mathbb{Q}_1(p) - \mathbb{Q}_2(p)\}\,dp = \int_0^1 \left[\mathbb{Q}_1(p)\{1 - g(p)\}\right] dp,$$

which is known as the average treatment effect (ATE). To obtain $\Delta$, $g$ is estimated by a smooth function. This approach has been applied to estimate the difference in medical expenditures between persons suffering from diseases attributable to smoking and persons without these diseases.

However, it is clearly more advantageous to view $g$ as a function of $p$. To the best of our knowledge, this has been done only in the poverty research context. In particular, Ravallion and Chen (2003) used the curve

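$$G(p) = \left\{\frac{\mathbb{Q}_2(p)}{\mathbb{Q}_1(p)}\right\}^{m} - 1 = \{g(p)\}^{m} - 1, \quad p \in (0,1), \quad m = \frac{1}{t_2 - t_1} \in (0,1],$$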

for the analysis of income distributions in developing countries at times $t_1$ and $t_2$ and called it the growth incidence curve (GIC). Poverty reduction can be understood as increasing the incomes of the poor. In this sense poverty is reduced from period $t_1$ to $t_2$ if $G$ takes positive values for all small quantiles up to the quantile where the poverty line was located in the first period. Such growth that increases the incomes of poor quantiles has been called “weak absolute” pro-poor growth, i.e. growth that is accompanied by absolute poverty reduction without making any statement about the distributional pattern of growth, see Klasen (2008). On the other hand, if $G$ has a negative slope, growth was pro-poor in the relative sense, i.e. the poor benefited (proportionately) more from growth than the non-poor. This means that such growth episodes led to a decrease in inequality and relative poverty. For a detailed discussion of different notions of pro-poor growth we refer to Ravallion (2004) and Klasen (2008). Growth incidence curves were also applied to non-income data in Grosse et al. (2008).
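As a hedged illustration of how the sign and the slope of a growth incidence curve are read, suppose both income distributions are log-normal, so that the $p$-quantile at time $t_j$ equals $\exp(\mu_j + \sigma_j z_p)$ with $z_p$ the standard normal quantile. The function name, the parameter values and the three-year gap ($m = 1/3$) below are hypothetical and chosen only to produce a curve that is positive at low quantiles and downward sloping.

```python
import numpy as np
from scipy.stats import norm

def lognormal_gic(p, mu1, sigma1, mu2, sigma2, m=1.0):
    """G(p) = {Q2(p)/Q1(p)}^m - 1 for log-normal incomes (illustrative helper).

    Uses Q_j(p) = exp(mu_j + sigma_j * z_p), z_p = standard normal quantile.
    """
    z = norm.ppf(p)
    return np.exp(m * ((mu2 - mu1) + (sigma2 - sigma1) * z)) - 1.0

# Hypothetical parameters: mean log-income rises, dispersion falls.
p = np.linspace(0.05, 0.95, 19)
G = lognormal_gic(p, mu1=10.0, sigma1=0.8, mu2=10.1, sigma2=0.7, m=1 / 3)

positive_for_poor = bool(np.all(G[p <= 0.3] > 0))   # "weak absolute" pro-poor
negative_slope = bool(np.all(np.diff(G) < 0))       # pro-poor in the relative sense
```

In this toy setting the curve is positive exactly where $z_p < (\mu_2 - \mu_1)/(\sigma_1 - \sigma_2)$, and the slope is negative because the scale parameter decreased between the two periods.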

Hence, considering the whole curves $g(p)$ or $G(p)$, $p \in (0,1)$, provides a more informative comparison of two distributions and can be applied not only in the poverty research context. The goal of this work is to derive the asymptotic distribution of an estimator of $g$ and build simultaneous confidence bands for $g$. Estimation and inference for $G$ is then straightforward.

Dominici et al. (2005) proposed an estimator for using smoothing splines. Venturini et al. (2015) extend the work by Dominici et al. (2005), employing a Bayesian approach to get a smooth estimator of , for some known monotone differentiable function . A much simpler approach, which we pursue, would be to replace the unknown in by some estimator , to get . There are several quantile estimators available (see e.g. Harrell and Davis, 1982; Kaigh and Lachenbruch, 1982; Cheng, 1985). In this work we employ the classical empirical quantile function.

Simultaneous inference about the curve $g(p)$, $p \in (0,1)$, is clearly crucial in applications but, to the best of our knowledge, has not been considered so far. Dominici et al. (2005) focused on estimation of the average treatment effect with the help of $g$ and do not discuss inference about $g$. Cheng and Wu (2010) consider estimation of $g(p)$ at a given $p$ and build a confidence interval for $g(p)$ using asymptotic normality arguments and several estimators for the variance of $\hat g(p)$. The World Bank Poverty Analysis Toolkit (available at http://go.worldbank.org/YF9PVNXJY0) also provides only point-wise confidence intervals for growth incidence curves, similar in spirit to those of Cheng and Wu (2010). More specifically, the confidence statement in this toolkit is constructed for a discretization of $(0,1)$ by $p_1, \ldots, p_k$. For every $p_i$, expectation and variance for some estimator of $G(p_i)$ are estimated with a bootstrap. Critical values $\underline{c}_i$ and $\overline{c}_i$ are then taken from the corresponding $t$-distribution for some level $\alpha$. This implicitly assumes that the estimator of $G(p_i)$ is asymptotically normal. The resulting confidence statement has the form

$$P\{\underline{c}_i \le G(p_i) \le \overline{c}_i\} = 1 - \alpha, \quad \text{for each } i = 1, 2, \ldots, k,$$

where $1 - \alpha$ is some pre-specified confidence level. Obviously, these confidence intervals provide inference only at a given $p_i$. For example, if we would like to test the significance of the poverty reduction (or treatment effect) at the median, it is enough to build a point-wise confidence interval for $G(0.5)$ (or for $g(0.5)$) and check if it includes zero (or one). However, to test whether the growth was pro-poor in the relative sense, a confidence statement about the slope of $G$ has to be made and, hence, simultaneous confidence bands should be considered. That is, the goal is to find $\underline{c}(p)$ and $\overline{c}(p)$ such that

$$P\{\underline{c}(p) \le G(p) \le \overline{c}(p) \ \text{for all } p \in (0,1)\} = 1 - \alpha.$$

The difference from the point-wise intervals is that the coverage statement holds not only separately for every $p$, but simultaneously for all $p \in (0,1)$.

To build simultaneous confidence bands for $g$ or $G$, the analysis of the asymptotic distribution of the function $\hat g$ is necessary. This involves the theory of empirical processes, which goes back to Glivenko (1933), Cantelli (1933), Donsker (1952), and Komlós et al. (1975). Our analysis builds on results for empirical quantile processes and their simultaneous confidence bands developed in Csörgő and Révész (1978), Csörgő and Révész (1984), and Csörgő (1983). The main benefit of this approach is that it allows for faster computation of the confidence bands without re-sampling techniques.

The paper is organized as follows. In Section 2 we introduce a simple sample counterpart estimator and analyse its asymptotic distribution. This estimator is also used by the World Bank Toolkit. The results about the asymptotic distribution motivate two constructions for asymptotic simultaneous confidence bands presented in Section 3. Section 4 evaluates the small sample properties of our confidence bands by Monte Carlo simulations. Expenditure data from Uganda are analysed with our confidence bands in Section 5 before we conclude in Section 6.

2 Estimation and asymptotic distribution

Throughout this section we assume that we have i.i.d. samples $X_{1,1}, \ldots, X_{1,n_1}$ of $X_1$ and $X_{2,1}, \ldots, X_{2,n_2}$ of $X_2$. Furthermore, we assume that the samples are stochastically independent of each other. This assumption is justified if the data are collected in two independent groups (e.g. treatment and control) or in repeated cross-sections. Note that there is a related concept of non-anonymous growth incidence curves proposed for panel data in Grimm (2007) and Bourguignon (2011). Non-anonymous growth incidence curves are built based on two dependent samples and are not treated in this work.

2.1 Quantile ratio estimator

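We start by presenting a simple sample estimator for $g$ and $G$. For $j = 1, 2$ we denote the $k$-th order statistic of the sample $X_{j,1}, \ldots, X_{j,n_j}$ by $X_{j,(k)}$. The sample quantile function is the inverse of the right-continuous empirical distribution function, which is known to be

$$\hat{\mathbb{Q}}_j(p) = \hat{\mathbb{F}}_j^{-1}(p) = X_{j,(k)}, \quad \text{for } \frac{k-1}{n_j} < p \le \frac{k}{n_j}, \quad k = 1, \ldots, n_j.$$

We now define estimators of $g$ and $G$ as

$$\hat g(p) = \frac{\hat{\mathbb{Q}}_2(p)}{\hat{\mathbb{Q}}_1(p)} \quad \text{and} \quad \hat G(p) = \{\hat g(p)\}^{m} - 1, \quad m \in (0,1]. \tag{2}$$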
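The estimators above can be sketched in a few lines of Python. This is a minimal illustration, assuming the sample quantile is the order statistic $X_{(k)}$ with $k = \lceil np \rceil$; the function names are hypothetical and not taken from any toolkit mentioned in the text.

```python
import numpy as np

def empirical_quantile(sample, p):
    """Inverse of the right-continuous empirical CDF.

    Returns X_(k) for (k-1)/n < p <= k/n, i.e. k = ceil(n * p).
    """
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    p = np.atleast_1d(np.asarray(p, dtype=float))
    if np.any((p <= 0) | (p > 1)):
        raise ValueError("p must lie in (0, 1]")
    # The factor (1 - 1e-12) guards against floating-point round-off in n * p.
    k = np.clip(np.ceil(n * p * (1 - 1e-12)).astype(int), 1, n)
    return x[k - 1]

def quantile_ratio(x1, x2, p):
    """g_hat(p): ratio of the empirical quantiles of sample 2 and sample 1."""
    return empirical_quantile(x2, p) / empirical_quantile(x1, p)

def growth_incidence(x1, x2, p, m=1.0):
    """G_hat(p) = g_hat(p)^m - 1 with m = 1 / (t2 - t1)."""
    return quantile_ratio(x1, x2, p) ** m - 1.0
```

For instance, `growth_incidence(x1, x2, np.linspace(0.05, 0.95, 19), m=1/3)` would evaluate the estimated curve on a grid for two samples observed three years apart.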

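It is well known that the quantile function and its empirical version are equivariant under strictly monotone transformations. Let us denote by $F_j$ and $Q_j$ the cumulative distribution and quantile functions of $\log(X_j)$, $j = 1, 2$, respectively. Also, let $\hat Q_j$ be the empirical quantile function of the log-transformed sample $\log(X_{j,i})$, $i = 1, \ldots, n_j$, $j = 1, 2$. Then, for the quantile function $\mathbb{Q}_j$ of $X_j$ and its empirical version $\hat{\mathbb{Q}}_j$, we have $Q_j(p) = \log\{\mathbb{Q}_j(p)\}$, as well as $\hat Q_j(p) = \log\{\hat{\mathbb{Q}}_j(p)\}$, $p \in (0,1)$. Consequently,

$$\begin{aligned}
\log\{g(p)\} &= Q_2(p) - Q_1(p), & \log\{\hat g(p)\} &= \hat Q_2(p) - \hat Q_1(p),\\
\log\{G(p) + 1\} &= m\{Q_2(p) - Q_1(p)\}, & \log\{\hat G(p) + 1\} &= m\{\hat Q_2(p) - \hat Q_1(p)\}.
\end{aligned} \tag{3}$$

Hence, a simultaneous confidence band for $g$ can be obtained by observing that

$$P\{\underline{c}(p) \le g(p) \le \overline{c}(p),\ \forall p \in (0,1)\} = P\left[\log\{\underline{c}(p)\} \le Q_2(p) - Q_1(p) \le \log\{\overline{c}(p)\},\ \forall p \in (0,1)\right].$$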

Note that the difference of two quantile functions is known as the quantile treatment effect (QTE), sometimes also named the percentile-specific effect between two populations, see Dominici et al. (2006). To the best of our knowledge, inference for the QTE is usually done at a fixed $p$, rather than simultaneously.

2.2 Point-wise asymptotic distribution

We first characterize the asymptotic distribution of $\hat G(p)$ at a fixed $p$. The following assumption usually holds for data on income, expenditure, cancer volume, etc.

Assumption 1.

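Two independent random variables $X_1$ and $X_2$ with finite second moments and cumulative distribution functions $\mathbb{F}_1$ and $\mathbb{F}_2$ are given together with random samples $X_{j,1}, \ldots, X_{j,n_j}$, $j = 1, 2$. The log-transformed $\log(X_j)$ has the cumulative distribution function $F_j$ and density $f_j$, $j = 1, 2$. The corresponding quantile function $Q_j$ has the quantile density $q_j(p) = Q_j'(p) = 1/f_j\{Q_j(p)\}$, $j = 1, 2$, $p \in (0,1)$.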

Theorem 1.

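Let Assumption 1 hold and $p \in (0,1)$ be fixed. Moreover, assume $F_1$ and $F_2$ are continuously differentiable at some $x_1$ and $x_2$, respectively, such that $F_j(x_j) = p$ and $f_j(x_j) > 0$, $j = 1, 2$.

• For $n_1, n_2 \to \infty$ the estimator $\hat G(p) + 1$ is asymptotically log-normal with the parameters $m\{Q_2(p) - Q_1(p)\}$ and

 $$\sigma(p) = \sqrt{m^2 p(1-p)\left[\frac{\{q_1(p)\}^2}{n_1} + \frac{\{q_2(p)\}^2}{n_2}\right]}.$$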
• If in addition and are continuously differentiable at some and , respectively, such that , for some , and , then the asymptotic distribution of is bivariate log-normal with the parameters and

 $$\sigma(p, \tilde p) = m^2 p(1 - \tilde p)\left\{\frac{q_1(p)\,q_1(\tilde p)}{n_1} + \frac{q_2(p)\,q_2(\tilde p)}{n_2}\right\}.$$
Corollary 1.

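Under the assumptions of Theorem 1 we have asymptotic normality for $\hat G(p)$ in the sense that

$$\frac{\hat G(p) + 1 - \{g(p)\}^{m}}{\{g(p)\}^{m}\,\sigma(p)} \overset{D}{\longrightarrow} N(0,1),$$

that is, the standardised estimator converges in distribution to a standard normal random variable for $n_1, n_2 \to \infty$ and for any fixed $p \in (0,1)$.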

The World Bank Toolkit and Cheng and Wu (2010) implicitly employ the asymptotic normality of $\hat G(p)$ and $\hat g(p)$ to build point-wise confidence intervals, but use different variance estimators, based either on the bootstrap or on certain approximations. To the best of our knowledge, the result of Corollary 1 is new. Note also that $\sigma(p)$ depends on the unknown $q_j(p)$, $j = 1, 2$, which have to be consistently estimated in practice.

Theorem 1 and Corollary 1 provide two different ways of deriving point-wise confidence statements about $G(p)$ (or about $g(p)$ by setting $m = 1$). We can approximate the distribution of $\hat G(p) + 1$ for a fixed $p$ either by a log-normal or by a normal distribution. However, the log-normal approximation is preferable for positive random variables. Indeed, $X_j > 0$, $j = 1, 2$, implies $\hat G(p) + 1 > 0$ for all $p$. Hence, a normal approximation of the distribution of $\hat G(p) + 1$ puts probability mass outside of $(0, \infty)$. This can cause confidence intervals to take impossible values, in particular in small samples, and affect the actual coverage of the band. Taking a log-normal approximation helps to avoid this. We use the log-normal approximation implicitly in our constructions of simultaneous confidence bands in Section 3.
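The difference between the two approximations can be made concrete with a small sketch. It assumes the quantile densities are known (in practice they must be estimated), plugs $\hat G(p) + 1$ in for $\{g(p)\}^m$ in the normal interval, and uses hypothetical inputs: the median, standard normal log-data and only ten observations per sample. The function names are illustrative.

```python
import numpy as np
from scipy.stats import norm

def sigma_p(p, q1, q2, n1, n2, m=1.0):
    """sigma(p) from Theorem 1; q1, q2 are quantile densities of the log-data at p."""
    return np.sqrt(m**2 * p * (1 - p) * (q1**2 / n1 + q2**2 / n2))

def pointwise_intervals(G_hat, sigma, alpha=0.05):
    """Point-wise intervals for G(p) from the log-normal and the normal approximation."""
    z = norm.ppf(1 - alpha / 2)
    lognormal = ((G_hat + 1) * np.exp(-z * sigma) - 1,
                 (G_hat + 1) * np.exp(z * sigma) - 1)
    normal = (G_hat - z * (G_hat + 1) * sigma,
              G_hat + z * (G_hat + 1) * sigma)
    return lognormal, normal

# Hypothetical inputs: p = 0.5, N(0, 1) log-data, n1 = n2 = 10.
q = np.sqrt(2 * np.pi)          # quantile density 1 / phi(0) of N(0, 1) at the median
sigma = sigma_p(0.5, q, q, n1=10, n2=10)
lognormal_ci, normal_ci = pointwise_intervals(G_hat=0.1, sigma=sigma)
```

With these inputs the lower end of the normal interval falls below $-1$, a value $G(p)$ cannot take, whereas the log-normal interval stays above $-1$ by construction.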

2.3 Approximation by Brownian bridges

In Section 2.2, the derivation of confidence statements about $g$ or $G$ at one or at a finite number of points reduces to finding the limiting distribution of $\hat Q_2(p) - \hat Q_1(p)$ at a fixed $p$. To obtain confidence statements about $g$ or $G$ that hold for all $p \in (0,1)$ simultaneously, we need to find the limiting distribution of $\hat Q_2 - \hat Q_1$, which is treated as a stochastic process indexed by $p$.

Let us define the following stochastic process

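$$D_{n_1,n_2}(p; s) = \sqrt{\frac{n_1 n_2}{n_1 + s^2 n_2}}\left\{s\,\frac{\hat Q_1(p) - Q_1(p)}{q_1(p)} - \frac{\hat Q_2(p) - Q_2(p)}{q_2(p)}\right\}, \quad p \in (0,1),$$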

where is a fixed scaling parameter independent of needed later for technical reasons. For the analysis of this process we need the following set of assumptions on and .

Assumption 2.

The cumulative distribution functions of the log-transformed , are twice differentiable on , where , , and on . In addition, there exists some such that

$$\sup_{x \in (a,b)} F_j(x)\{1 - F_j(x)\}\left|\frac{f_j'(x)}{\{f_j(x)\}^2}\right| \le \gamma, \quad j = 1, 2. \tag{4}$$
Assumption 3.

For and , one of the following conditions hold

• If , then is non-decreasing on an interval to the right of and non-increasing on an interval to the left of .

If and are log-normal, as typically the case for income, expenditure and similar positive random variables, then is the density of a normal distribution. Hence, existence, positivity and differentiability of on are trivially fulfilled. The supremum in (4) is for normally distributed random variables independent of expectation and variance. The property in Assumption 3 is called tail-monotonicity. For normal distributions and Assumption 3 (ii) obviously holds.

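The following result shows that $D_{n_1,n_2}(\cdot\,; s)$ converges uniformly to a Brownian bridge. Recall that a Brownian bridge $B$ is a standard Wiener process $W$ conditioned on $W(1) = 0$, i.e. $B(p) = W(p) - pW(1)$, $p \in [0,1]$. In particular, $E\{B(p)\} = 0$ and $\mathrm{Var}\{B(p)\} = p(1-p)$ for all $p \in [0,1]$.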
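A quick simulation can confirm the variance formula of the Brownian bridge. The sketch below builds Wiener paths on a grid from independent Gaussian increments and applies $B(p) = W(p) - pW(1)$; the grid size, number of paths and seed are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n_steps = 10000, 200
p = np.arange(1, n_steps + 1) / n_steps

# Wiener process on the grid: cumulative sums of N(0, 1/n_steps) increments.
increments = rng.normal(scale=np.sqrt(1.0 / n_steps), size=(n_paths, n_steps))
W = np.cumsum(increments, axis=1)

# Brownian bridge: subtract p * W(1) from every path.
B = W - p * W[:, [-1]]

# Largest gap between the empirical variance and p(1 - p) over the grid.
max_dev = np.max(np.abs(B.var(axis=0) - p * (1 - p)))
```

Every simulated path ends at zero, and `max_dev` is small relative to the maximal variance of $1/4$ attained at $p = 1/2$.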

Theorem 2.

Let Assumptions 1 and 2 hold and set . Then a series of Brownian bridges can be defined such that for any fixed

$$\sup_{p \in [\delta_n, 1 - \delta_n]} \left|D_{n_1,n_2}(p; s) - B_{n_1,n_2}(p)\right| \overset{a.s.}{=} O\{n^{-1/2}\log(n)\}$$

with . If in addition Assumption 3 holds, a Brownian bridge can be defined such that in case of Assumption 3 (i)

$$\sup_{p \in [0,1]} \left|D_{n_1,n_2}(p; s) - B_{n_1,n_2}(p)\right| \overset{a.s.}{=} O\{n^{-1/2}\log(n)\}$$

and in case of Assumption 3 (ii)

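$$\sup_{p \in (0,1)} \left|D_{n_1,n_2}(p; s) - B_{n_1,n_2}(p)\right| \overset{a.s.}{=} \begin{cases} O\{n^{-1/2}\log(n)\} & \text{if } \gamma < 2,\\ O\left[n^{-1/2}\{\log\log(n)\}^{\gamma}\{\log(n)\}^{(1+\varepsilon)(\gamma - 1)}\right] & \text{if } \gamma \ge 2 \end{cases}$$

for arbitrary $\varepsilon > 0$.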

For example, if are approximately log-normal in a way that has the tail behavior of a normal variable, then according to Theorem 2 the process converges to a Brownian bridge simultaneously on with the rate .

Constructing confidence sets for or requires knowledge of the asymptotic distribution of , while in Theorem 2 contains instead. Therefore, let us consider

$$D^{*}_{n_1,n_2}(p; s) = 2\sqrt{\frac{n_1 n_2}{n_1 + s^2 n_2}}\;\frac{\hat Q_1(p) - Q_1(p) - \{\hat Q_2(p) - Q_2(p)\}}{q_1(p)/s + q_2(p)}$$

and discuss the choice of $s$. First, we introduce the following assumption.

Assumption 4.

There exists a constant $s > 0$ such that the quantile densities satisfy $q_1(p) = s\,q_2(p)$, $p \in (0,1)$.

Obviously, under Assumption 4 we have that

$$D^{*}_{n_1,n_2}(p; s) = D_{n_1,n_2}(p; s) = \sqrt{\frac{n_1 n_2}{n_1 + s^2 n_2}}\;\frac{\hat Q_1(p) - Q_1(p) - \{\hat Q_2(p) - Q_2(p)\}}{q_2(p)}$$

and Theorem 2 can be applied to get the asymptotic distribution of and hence the simultaneous confidence bands for or .

It is shown in the Appendix, that if Assumption 4 is true, then

$$s = \frac{\int_{-\infty}^{\infty}\{f_2(x)\}^2\,dx}{\int_{-\infty}^{\infty}\{f_1(x)\}^2\,dx}. \tag{5}$$

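Moreover, if the $\log(X_j)$ have distributions from a location-scale family with locations $\mu_j$ and scales $\sigma_j$, $j = 1, 2$, then Assumption 4 implies that $s = \sigma_1/\sigma_2$. This can be seen directly from (5) by applying the change of variable $y = (x - \mu_j)/\sigma_j$. Also, let $Q_0$ denote the quantile function of the standardised variable $\{\log(X_j) - \mu_j\}/\sigma_j$ and $q_0$ the corresponding quantile density. Then $Q_j(p) = \mu_j + \sigma_j Q_0(p)$ and therefore $q_j(p) = \sigma_j q_0(p)$, $j = 1, 2$, $p \in (0,1)$. In particular, Assumption 4 implies that $Q_1(p) = s\,Q_2(p) + c$ for some constant $c$, and thus the distributions of $\log(X_1)$ and $\log(X_2)$ differ only in location and scale parameters.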

For example, if $X_1$ and $X_2$ are both log-normally distributed with arbitrary location parameters $\mu_1$, $\mu_2$ and scale parameters $\sigma_1$, $\sigma_2$, then $\log(X_j)$, $j = 1, 2$, are normally distributed and $s = \sigma_1/\sigma_2$. In applications, to check whether the distributions of $\log(X_1)$ and $\log(X_2)$ differ only in location and scale, one can inspect the QQ-plot of the standardised log-transformed data.
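The ratio of integrals in (5) is easy to check numerically in the normal case, where $\int f^2 = 1/(2\sigma\sqrt{\pi})$ and hence the ratio equals $\sigma_1/\sigma_2$. The sketch below uses numerical integration with hypothetical parameter values; `scale_ratio` is an illustrative name. With real data the densities would first have to be estimated.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def scale_ratio(f1, f2):
    """s = int f2(x)^2 dx / int f1(x)^2 dx, as in (5)."""
    num, _ = quad(lambda x: f2(x) ** 2, -np.inf, np.inf)
    den, _ = quad(lambda x: f1(x) ** 2, -np.inf, np.inf)
    return num / den

# Hypothetical normal densities of the log-transformed variables.
sigma1, sigma2 = 0.9, 0.6
f1 = norm(loc=1.0, scale=sigma1).pdf
f2 = norm(loc=1.3, scale=sigma2).pdf

s = scale_ratio(f1, f2)   # expected to be close to sigma1 / sigma2 = 1.5
```

The locations drop out of the ratio, which matches the statement that only the scale parameters enter $s$.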

If the quantile densities are not proportional, that is, Assumption 4 is not fulfilled, we have to handle the term

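$$D^{*}_{n_1,n_2}(p; s) - D_{n_1,n_2}(p; s) = \frac{q_1(p) - s\,q_2(p)}{q_1(p) + s\,q_2(p)}\sqrt{\frac{n_1 n_2}{n_1 + s^2 n_2}}\left\{\frac{\hat Q_1(p) - Q_1(p)}{q_1(p)/s} + \frac{\hat Q_2(p) - Q_2(p)}{q_2(p)}\right\}.$$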
Lemma 1.

Under Assumptions 1, 2 and 3

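$$\limsup_{n_1,n_2 \to \infty}\left(\log\log\sqrt{\frac{n_1 n_2}{n_1 + s^2 n_2}}\right)^{-1/2} \sup_{p \in (1/n,\,1 - 1/n)}\left|D^{*}_{n_1,n_2}(p; s) - D_{n_1,n_2}(p; s)\right| \overset{a.s.}{\le} 4^{\nu}\sqrt{2}\sup_{p \in (1/n,\,1 - 1/n)}\left|\frac{q_1(p) - s\,q_2(p)}{q_1(p) + s\,q_2(p)}\{p(1-p)\}^{\nu}\right|$$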

for all .

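Note that the bound on the right-hand side is always smaller than or equal to $\sqrt{2}$ for every $\nu$, because $|q_1(p) - s\,q_2(p)| \le q_1(p) + s\,q_2(p)$ and $\{p(1-p)\}^{\nu} \le 4^{-\nu}$. Since $q_1$ and $s\,q_2$ are usually similar functions in applications, much smaller bounds can be expected.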

3 Simultaneous confidence bands

Based on the results of the previous section, we can derive simultaneous confidence bands for $Q_2 - Q_1$ and transform them into simultaneous confidence bands for $g$ or $G$. Note that simultaneous confidence bands for the quantile treatment effect follow immediately. We make use of Theorem 2 and Lemma 1 from the last section, as well as the Kolmogorov distribution

$$P\left(\sup_{p \in [0,1]}|B(p)| \le c\right) = \sum_{k=-\infty}^{\infty}(-1)^{k}e^{-2k^2c^2}. \tag{6}$$

Throughout this section we assume a confidence level $1 - \alpha$ and denote the corresponding critical value for the Brownian bridge by $c_\alpha$, such that $P\left(\sup_{p \in [0,1]}|B(p)| \le c_\alpha\right) = 1 - \alpha$. In addition, we denote by $c_s$ an asymptotically almost sure upper bound from Lemma 1

with some .
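The critical value of the supremum of a Brownian bridge can be obtained by inverting the Kolmogorov series numerically. The following is a minimal sketch: the series is truncated at a fixed number of terms (an arbitrary choice that is ample for the range considered) and the root is bracketed on $[0.3, 3]$. SciPy's `kstwobign` distribution, which implements the same limiting law, is used only as a cross-check.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import kstwobign

def kolmogorov_cdf(c, terms=100):
    """P(sup |B(p)| <= c) = 1 + 2 * sum_{k>=1} (-1)^k exp(-2 k^2 c^2)."""
    k = np.arange(1, terms + 1)
    return 1.0 + 2.0 * float(np.sum((-1.0) ** k * np.exp(-2.0 * k**2 * c**2)))

def critical_value(alpha):
    """Solve kolmogorov_cdf(c) = 1 - alpha for c."""
    return brentq(lambda c: kolmogorov_cdf(c) - (1.0 - alpha), 0.3, 3.0)

c_05 = critical_value(0.05)            # roughly 1.358 for a 95% band
c_check = kstwobign.ppf(0.95)          # same quantity from SciPy
```

The symmetric form of the series used in the code is the two-sided sum written with the $k = 0$ term pulled out.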

In the following, we present two ways of using the approximation by Brownian bridges for the construction of simultaneous confidence band for . Similar approaches for the quantile function have been explored in Csörgő and Révész (1984).

3.1 Confidence bands with quantile density estimation

The first approach to the construction of confidence bands relies on the following argument

 1−α ≈P(∣∣Dn1,n2(p;s)∣∣≤cα, for all 0

The quantities $q_j(p)$, $j = 1, 2$, are unknown and have to be estimated. Various nonparametric methods for the estimation of $q_j$ have been proposed, typically based on kernel density estimation, see e.g. Csörgő et al. (1991), Jones (1992), Cheng (1995), Cheng and Parzen (1997), Soni et al. (2012), and Chesneau et al. (2016). We make the following assumption on the densities.

Assumption 5.

The densities , fulfill

 supx∈(a,b)[Fj(x){1−Fj(x)}]2fj(x)<∞andsupx∈(a,b)∣∣f′′j(x)∣∣<∞.

Now we can get the simultaneous confidence bands for the difference of two quantile functions.

Theorem 3.

Let Assumptions 1, 2, 3 and 5 hold and let be a second order kernel with support in . For set

$$\hat q_j(p) = h_{n_j}^{-1}\int_0^1 K\left(\frac{p - z}{h_{n_j}}\right) d\hat Q_j(z).$$

Then a series of Brownian bridges can be defined such that for any fixed

 supp∈[εn,1−εn]∣∣ ∣∣√n1n2n1+s2n2{ˆQ1(p)−Q1(p)ˆq1(p)/s−ˆQ2(p)−Q2(p)ˆq2(p)}−Bn1,n2(p)∣∣ ∣∣a.s.=O{√loglog(n)nδ}

and for

$$c^{*}_{\alpha}(p) = (c_\alpha + c_s)\sqrt{\frac{n_1 + s^2 n_2}{n_1 n_2}}\;\frac{\hat q_1(p)/s + \hat q_2(p)}{2} \tag{7}$$

we get

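$$1 - \alpha \le \liminf_{n_1,n_2 \to \infty} P\left\{\hat Q_2(p) - \hat Q_1(p) - c^{*}_{\alpha}(p) \le Q_2(p) - Q_1(p) \le \hat Q_2(p) - \hat Q_1(p) + c^{*}_{\alpha}(p),\ p \in (\varepsilon_n, 1 - \varepsilon_n)\right\} \tag{8}$$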

with , , , , and .

Note that if Assumption 4 holds, then $c_s$ in (7) is set to zero and $s$ is chosen as in (5). The simultaneous confidence bands (8) are given for the difference of two quantile functions, known as the quantile treatment effect. To get simultaneous confidence bands for $g$ and $G$, recall that $\log\{g(p)\} = Q_2(p) - Q_1(p)$, so that

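$$\begin{aligned}
&P\left\{\hat Q_2(p) - \hat Q_1(p) - c^{*}_{\alpha}(p) \le Q_2(p) - Q_1(p) \le \hat Q_2(p) - \hat Q_1(p) + c^{*}_{\alpha}(p),\ p \in (\varepsilon_n, 1 - \varepsilon_n)\right\}\\
&\quad = P\left\{\exp\{-c^{*}_{\alpha}(p)\}\,\hat g(p) \le g(p) \le \exp\{c^{*}_{\alpha}(p)\}\,\hat g(p),\ p \in (\varepsilon_n, 1 - \varepsilon_n)\right\}\\
&\quad = P\left[\{\hat G(p) + 1\}\exp\{-c^{*}_{\alpha}(p)\,m\} - 1 \le G(p) \le \{\hat G(p) + 1\}\exp\{c^{*}_{\alpha}(p)\,m\} - 1,\ p \in (\varepsilon_n, 1 - \varepsilon_n)\right].
\end{aligned}$$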
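The plug-in construction can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the Epanechnikov kernel, the bandwidths, the simulated log-normal data and all function names are hypothetical choices; Assumption 4 is taken to hold with $s = 1$, so $c_s = 0$; and $c_\alpha = 1.3581$ is the 95% critical value of the Kolmogorov distribution. The kernel integral is evaluated exactly, because the empirical quantile function of the log-data jumps by $X_{(k+1)} - X_{(k)}$ at $z = k/n$.

```python
import numpy as np

def epanechnikov(u):
    """Second-order kernel with support [-1, 1] (an assumed choice)."""
    return np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)

def quantile_density(sorted_log_sample, p, h):
    """q_hat(p) = h^{-1} * sum_k K((p - k/n) / h) * (X_(k+1) - X_(k))."""
    n = sorted_log_sample.size
    jumps = np.diff(sorted_log_sample)          # spacings of the order statistics
    z = np.arange(1, n) / n                     # jump locations k/n, k = 1..n-1
    u = (np.asarray(p, dtype=float)[:, None] - z[None, :]) / h
    return (epanechnikov(u) * jumps[None, :]).sum(axis=1) / h

def empirical_quantile(sorted_sample, p):
    n = sorted_sample.size
    k = np.clip(np.ceil(n * p * (1 - 1e-12)).astype(int), 1, n)
    return sorted_sample[k - 1]

def plug_in_band(x1, x2, p, h1, h2, s, c_alpha, c_s=0.0, m=1.0):
    """Band for G(p) obtained from (7) and the exponential transformation."""
    l1, l2 = np.sort(np.log(x1)), np.sort(np.log(x2))
    n1, n2 = l1.size, l2.size
    diff = empirical_quantile(l2, p) - empirical_quantile(l1, p)   # log g_hat(p)
    q1, q2 = quantile_density(l1, p, h1), quantile_density(l2, p, h2)
    c_star = (c_alpha + c_s) * np.sqrt((n1 + s**2 * n2) / (n1 * n2)) * (q1 / s + q2) / 2
    G_hat = np.exp(m * diff) - 1
    lower = (G_hat + 1) * np.exp(-m * c_star) - 1
    upper = (G_hat + 1) * np.exp(m * c_star) - 1
    return G_hat, lower, upper

# Hypothetical data: two log-normal samples with equal scale, so s = 1.
rng = np.random.default_rng(42)
x1 = np.exp(rng.normal(10.0, 0.8, size=2000))
x2 = np.exp(rng.normal(10.1, 0.8, size=2000))
p = np.linspace(0.05, 0.95, 19)
G_hat, lower, upper = plug_in_band(x1, x2, p, h1=0.05, h2=0.05, s=1.0, c_alpha=1.3581)
q_hat_median = quantile_density(np.sort(np.log(x1)), np.array([0.5]), 0.05)[0]
```

The grid is restricted to $[0.05, 0.95]$ so that the kernel window stays inside $(0,1)$; in this toy example the true quantile density of the log-data at the median is $0.8\sqrt{2\pi} \approx 2.0$, which `q_hat_median` should approximate.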

3.2 Direct confidence bands

The confidence band above depends on nonparametric estimation of quantile densities. Two smoothing parameters $h_{n_1}$, $h_{n_2}$ have to be chosen, which might be unfavourable in applications. This can be avoided with the alternative construction of confidence bands given in the following theorem.

Theorem 4.

Let Assumption 1 and 2 hold. Then

 1−α=limn1,n2→∞P{ˆQ2 ≤ˆQ2(p+cα√2n2)−ˆQ1(p−cα√2n1);εn≤y≤1−εn},

with for any .

Theorem 4 requires fewer assumptions than Theorem 3, but there is no explicit convergence rate given. However, these confidence bands give good results in numerical simulations. To obtain simultaneous confidence bands for or use .

4 Simulation study

We evaluate the properties of the confidence bands by using synthetic data and building confidence bands for growth incidence curves . Confidence bands for the quantile treatment effect and are equivalent. We consider two settings and in both of them fix . In the first setting and are drawn from log-normal distributions. Thereby, has location parameter and scale parameter , while has location parameter and scale parameter . As already discussed, Assumption 4 holds in this example with . This value is estimated in the simulations, while is set to zero. In the second setting, is as in the first setting, while

is drawn from the gamma distribution with the shape parameter

and scale parameter . In this setting Assumption 4 does not hold and is estimated for the plug-in confidence bands.

We considered four sample sizes . For probability values we used an equidistant grid of length to build the confidence bands; setting the grid length to does not change the results significantly, but increases the computation time in Monte Carlo simulations. The results are based on the Monte Carlo samples of size . The following Table 1 summarizes the actual coverage probability with simulated data for . The results are given in both settings for the confidence bands with plug-in estimators, for the direct confidence bands and for the confidence bands built with the World Bank algorithm.

First of all, the coverage of the confidence bands obtained with the World Bank algorithm is far too small. The reason is that we tested simultaneous coverage, while the World Bank algorithm constructs only point-wise confidence intervals.
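The gap between point-wise and simultaneous coverage can be reproduced in a stylised setting that does not use the simulation design of this section. The sketch below simulates Brownian bridges on a grid and compares a band built from point-wise 95% normal limits with the uniform band based on the Kolmogorov critical value $1.3581$; grid size, number of paths and seed are arbitrary.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n_paths, n_steps = 10000, 200
p = np.arange(1, n_steps + 1) / n_steps
W = np.cumsum(rng.normal(scale=np.sqrt(1 / n_steps), size=(n_paths, n_steps)), axis=1)
B = W - p * W[:, [-1]]                       # Brownian bridges on the grid

# Point-wise 95% limits +/- z * sqrt(p(1-p)), evaluated on [0.05, 0.95].
inner = (p >= 0.05) & (p <= 0.95)
z = norm.ppf(0.975)
inside = np.abs(B[:, inner]) <= z * np.sqrt(p[inner] * (1 - p[inner]))

coverage_at_each_p = inside.mean(axis=0)             # close to 0.95 at every grid point
simultaneous_pointwise = inside.all(axis=1).mean()   # share of paths inside everywhere
simultaneous_uniform = (np.abs(B).max(axis=1) <= 1.3581).mean()
```

Each point-wise limit holds with probability 0.95, yet far fewer paths stay inside all of them at once, while the uniform band keeps roughly the nominal level (slightly above it here, because the supremum is taken over a finite grid).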

The actual coverage probability of all our constructions (about ) is slightly larger than the theoretical probability , except for the plug-in confidence bands for , where the coverage is lower than the nominal. This can be attributed to the quality of the nonparametric estimates of the quantile densities in small samples, as also expected from Theorem 3. Once the sample size is large, both confidence bands perform very similar, even with the estimated correction for the plug-in bands in the second setting.

The plots in Figure 1 show typical estimates from the first setting together with plug-in and direct confidence bands for (left) and (right). The true growth incidence curve is the dashed line, while its estimate is the solid line. Plug-in confidence bands are shown as a grey area, while direct confidence bands are solid lines enveloping the growth incidence curve. In accordance with the simulation results, plug-in confidence bands are somewhat narrower for small , while for both confidence bands are nearly indistinguishable. As stated in Theorem 3 and Theorem 4 the confidence bands are not defined for close to and close to . The plots show the bands for probabilities between and .

5 Application to household data

Our work is motivated by the application of growth incidence curves to the evaluation of pro-poorness of growth in developing countries. Absolute poverty is reduced if the growth incidence curve is positive for all income quantiles below the poverty line, and such growth is called pro-poor using the weak absolute definition mentioned in the introduction. In this case, there is some income growth for the poor and absolute poverty is reduced. In addition, relative poverty is reduced if $G$ has a negative slope; such growth is called pro-poor using the relative definition, as it is associated with declining inequality and declining relative poverty.

We analyse data from the Uganda National Household Survey for the years , , and . This is a standard multi-purpose household survey that is regularly conducted to monitor trends in poverty and inequality and its most important correlates. The sample sizes are , , and . We measure welfare by household expenditure per adult equivalent in prices and compute the related growth incidence curves.

First, we consider the growth incidence curve for the time from to . Inspecting in Figure 2

QQ-plots of the standardised log-transformed data (left and middle), we can deduce that both samples show slight departures from the log-normal distribution, but differ from each other only in location and scale, up to four outliers. Hence, we can estimate

according to (5) and set .

The estimated growth incidence curve shown in Figure 3 is close to on the whole interval . It takes positive values up to the quantile and negative values for higher incomes. The slope tends to be negative. This might suggest that absolute poverty and relative poverty was reduced, and growth was pro-poor according to the weak absolute and relative definition. Both simultaneous confidence bands are shown in the left panel; the grey area corresponds to the plug-in confidence bands, while bold lines are the direct confidence bands. As in simulations for large samples, both approaches lead to nearly the same bands. Simultaneous confidence bands include the zero line, which suggests that none of the discussed effects is in fact significant. In contrast, the considerably tighter confidence bands of the World Bank Toolkit, shown in the right plot, would wrongly suggest otherwise, over-interpreting the non-significant poverty reduction and pro-poor growth.

Let us now consider the expenditure data from and