1 Introduction
The literature on regression methods for discrete response variables has traditionally been concerned with parametric distributions. Generalized linear models (GLMs) (McCullagh and Nelder, 1989), which assume that the conditional distribution of the response belongs to an exponential family, have long dominated the scene in methodological and applied research. The reasons are manifold and include convenient interpretation of the regression coefficients, e.g., as (log) odds ratios in logistic regression or (log) rate ratios in Poisson regression, universal availability in statistical software, and the benefits of a well-developed, unifying maximum likelihood theory.
Research has been increasingly directed toward the development of nonparametric (distribution-free) methods to overcome situations in which traditional approaches are unsatisfactory or, more generally, when the goal of the inference transcends the conditional mean of the response. In its classical formulation (Koenker and Bassett, 1978), quantile regression (QR) provides a distribution-free approach to the modeling and estimation of the effects of covariates on different quantiles of the conditional distribution of a continuous response variable. Median regression is a special case of QR and represents a robust alternative to mean regression.
While most of the progress in QR methods has revolved around continuous responses, relatively fewer contributions have been made in the discrete case so far. Major hindrances include lack of a general theory for handling different types of discreteness, practical estimation challenges, and the troublesome asymptotic behavior of sample quantiles in the presence of ties. Thus, it is not surprising that existing approaches to discrete QR rely on some notion of continuity, either postulated or artificially induced. Early works in the former category date back to the 1950s (Rosenblatt, 1958).
Prominent in the econometric literature, Maximum Score Estimation (MSE) deals with conditional median models of binary (Manski, 1975, 1985; Horowitz, 1992) and ordered discrete (Lee, 1992) response variables. More recently, Kordas (2006) extended Horowitz’s (1992) estimator for binary outcomes to quantiles other than the median. In the MSE approach, the key assumption is the existence of a continuous latent variable, say , which undergoes the working of a threshold mechanism resulting in the observable binary outcome
. The conditional quantiles of the observable outcome are obtained as transformed quantiles of the latent outcome. However, MSE is computationally expensive as it involves nonconvex loss functions. Jittering is another strategy used for quantile estimation with discrete responses. A continuous variable, say
, is obtained by adding random noise, say , to the observable discrete outcome, i.e.. Estimation then proceeds by applying standard algorithms for convex quantile loss functions (e.g., linear programming) and, successively, by averaging the noise out. This approach, which has been adopted for modeling count
(Machado and Santos Silva, 2005) and ordinal (Hong and He, 2010) response variables, may lack generality as it requires that adjacent values in the support of the response are equally spaced.
In this paper, we build on midquantiles (Parzen, 1993) to introduce an alternative estimation approach for conditional quantiles of discrete responses. Sample midquantiles, which are based on essentially the same idea as the mid-p-value (Lancaster, 1961), offer a unifying theory for quantile estimation with continuous or discrete variables and are well-behaved asymptotically (Ma, Genton and Parzen, 2011). In our approach, we develop a two-step estimator that can be applied to a large variety of discrete responses, including binary, ordinal, and count variables, and is shown to have good theoretical properties. In a simulation study and in real data analyses, we gather empirical evidence that conditional midquantile estimation is more stable and efficient than jittering. However, this evidence is contextual and may not be generalizable.
The rest of the paper is organized as follows. In the next section we discuss modeling, estimation, and theoretical properties of conditional midquantile estimators. In Section 3, we report the results of a simulation study to assess bias and efficiency of the proposed estimator, as well as confidence interval coverage. In Section 4, we illustrate two real data applications, one on global terrorism and the other on prescription drugs in the United States. We conclude with final remarks in Section 5.
2 Methods
2.1 Marginal midquantiles
Let
be a discrete random variable with probability mass function
and cumulative distribution function (CDF)
. The th quantile of , denoted by , is defined as for any . We may define the quantile function (QF) of , , as the generalized inverse of the CDF of Y, that is . In the discrete case, the CDF is not injective, thus a discrete QF is not the ordinary inverse of the CDF. Now let be an independent sample of size from the population . The sample CDF is defined as , , while the (classical) sample QF, defined as the inverse of the sample CDF, is such that , for . This function, too, is discrete.
In general, sample quantiles as defined above may not be consistent for the population quantiles when the underlying distribution is discrete (Jentsch and Leucht, 2016). For example, not only does the sample median lack asymptotic normality if is discrete, but its limiting distribution appears to be discrete in the presence of ties, even if is continuous (Genton, Ma and Parzen, 2006).
We now introduce the mid-cumulative distribution function (midCDF) (Parzen, 1993, 2004), a modification of the standard CDF that plays an important role in discrete modeling and in samples with ties. For a random variable with CDF , either continuous or discrete, the function
(1) 
is called the mid-distribution function (midCDF). If is continuous, then since .
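For reference, the mid-distribution function of Parzen (1993) is usually written as in the display below. The symbols $F_Y$, $m_Y$, and $G_Y$ are notation assumed here for illustration and may differ from the symbols used elsewhere in this paper.

```latex
G_Y(y) \;=\; F_Y(y) - \tfrac{1}{2}\, m_Y(y),
\qquad
F_Y(y) = \Pr(Y \le y),
\quad
m_Y(y) = \Pr(Y = y).
```

Under this form, a continuous variable has $m_Y(y) = 0$ for every $y$, so the midCDF coincides with the ordinary CDF.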
Further, let be the set of distinct values that the discrete random variable can take on, with corresponding probabilities . We also define the midprobabilities and , for . The following function
(2) 
is called midquantile function (midQF) (Ma, Genton and Parzen, 2011). If , then the last category is suppressed. One can verify that . Examples of when is discrete uniform, count, or binary are given in Figure 1. The continuous version of the midQF is piecewise linear and it connects the points (dashed lines in Figure 1).
The sample midCDF corresponding to (1) is , where is the sample relative frequency of . To define the sample midquantiles, we need to introduce some more notation. Let , , be distinct values that occur in the sample and let , , be the corresponding relative frequencies. Then , where is such that , , , , and . In samples with ties, is the piecewise linear function connecting the values . Ma, Genton and Parzen (2011) showed that if the underlying distribution is absolutely continuous, then the sample midquantiles have the same asymptotic properties as the classical sample quantiles. More importantly, if is discrete, then the sample midquantiles are consistent estimators of the population midquantiles and their sampling distribution is normal (Ma, Genton and Parzen, 2011).
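The construction of sample midquantiles described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names `sample_mid_cdf` and `sample_mid_quantile` are hypothetical, and the sketch assumes that the sample midQF is constant below the smallest and above the largest sample mid-probability, with linear interpolation in between.

```python
import numpy as np

def sample_mid_cdf(y):
    """Distinct sample values and the sample midCDF evaluated at each of them."""
    y = np.asarray(y)
    z, counts = np.unique(y, return_counts=True)  # sorted distinct values
    rel_freq = counts / y.size                    # relative frequencies
    cdf = np.cumsum(rel_freq)                     # sample CDF at each distinct value
    mid_cdf = cdf - 0.5 * rel_freq                # sample CDF minus half the point mass
    return z, mid_cdf

def sample_mid_quantile(y, p):
    """Sample midquantile at level p by piecewise-linear interpolation."""
    z, mid_cdf = sample_mid_cdf(y)
    # np.interp interpolates linearly between the points (mid_cdf[j], z[j])
    # and returns z[0] or z[-1] when p falls outside the mid-probabilities.
    return np.interp(p, mid_cdf, z)

y = [0, 0, 1, 1, 1, 2, 2, 2, 2, 2]
z, g = sample_mid_cdf(y)
q = sample_mid_quantile(y, 0.5)
```

Because every observed value has positive relative frequency, the sample mid-probabilities are strictly increasing, which is what makes the interpolation well defined.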
2.2 Conditional midquantiles
Analogously to (1), we define the conditional midCDF as
(3) 
where is a
dimensional vector of covariates (these may include a vector of ones). The
conditional midQF is given by .
We assume a model that is linear on the scale of , i.e.,
(4) 
where is a known monotone and differentiable transformation function, and is a vector of unknown regression coefficients. In our approach,
may simply be the identity or a linear transformation, the logarithmic function (typically used in the modeling of counts; e.g., see Machado and Santos Silva, 2005), the logistic function, or belong to a family of flexible transformation models (Chamberlain, 1994; Mu and He, 2007; Yin, Zeng and Li, 2008; Geraci and Jones, 2015). These often involve the Box-Cox (Box and Cox, 1964) or Aranda-Ordaz (Aranda-Ordaz, 1981) families.
Our definition of conditional midquantiles is general as it applies to any type of discrete response variable that can be ordered. An interesting special case is when is binary. According to (3), the midCDF is then given by , where . This leads to the following conditional midquantile function
where and . Therefore the conditional midmedian is exactly equal to , while all the other conditional midquantiles are shifted by .
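As a quick numerical check of the binary case, the sketch below evaluates the piecewise-linear midQF of a Bernoulli variable. The function name `binary_mid_quantile` and the argument `pi` (the conditional success probability) are assumptions introduced for illustration. Under linear interpolation between the two mid-probabilities, the midmedian equals the success probability, and any other level `p` inside the interpolation range is shifted by `2 * (p - 0.5)`.

```python
import numpy as np

def binary_mid_quantile(p, pi):
    """Midquantile at level p of a Bernoulli(pi) variable, with 0 < pi < 1."""
    g0 = 0.5 * (1.0 - pi)   # midCDF at y = 0: half the mass at zero
    g1 = 1.0 - 0.5 * pi     # midCDF at y = 1: total mass minus half the mass at one
    # Linear interpolation between (g0, 0) and (g1, 1); constant outside [g0, g1].
    return float(np.interp(p, [g0, g1], [0.0, 1.0]))

mid_median = binary_mid_quantile(0.5, 0.3)
upper = binary_mid_quantile(0.6, 0.3)
```

The two mid-probabilities are always 0.5 apart, so the interpolating line has slope 2 in `p` regardless of `pi`.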
2.3 Estimation
We define the sample equivalent of (3) as
(5) 
where, again, , , is the th distinct observation of that occurs in the sample. Also, define as an interpolation of the sample estimates , for a given .
Inference for model (4) proceeds in two steps. First, we estimate the midCDF (this is discussed in more detail at the end of this section). Secondly, we estimate by solving the implicit equation , where . Our objective function and estimator are thus given by
(6) 
and
(7) 
respectively.
We can make (7) explicit by using the following linear interpolating function
where and . The index identifies, for a given , the value , , such that . The indices and identify, respectively, the largest and smallest value such that and . If we restrict , where , then we find that our estimator, conditionally on , has the form
(8) 
where is a matrix with th row and is a vector with th element . It is straightforward to verify that (8) is a minimizer by plugging it into (6). The closed form of (8) is, clearly, computationally convenient.
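The two-step logic can be illustrated with a deliberately simplified sketch. It is not the estimator in (8) itself: the first step below uses within-cell relative frequencies for a single discrete covariate in place of a kernel conditional CDF estimator, and the names `cell_mid_quantile` and `midqr_closed_form` are hypothetical. The second step is an ordinary least-squares fit of the transformed, interpolated midquantiles on the design matrix.

```python
import numpy as np

def cell_mid_quantile(y_cell, p):
    """Interpolated sample midquantile at level p within one covariate cell."""
    z, counts = np.unique(y_cell, return_counts=True)
    f = counts / y_cell.size
    g = np.cumsum(f) - 0.5 * f      # sample mid-probabilities within the cell
    return np.interp(p, g, z)       # constant outside the range of g

def midqr_closed_form(x, y, p, h=lambda v: v):
    """Least-squares fit of h(interpolated midquantile) on (1, x).

    Assumes one discrete covariate x and a level p lying inside the range of
    mid-probabilities of every cell, so that interpolation is never clamped.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept
    # Step 1: conditional midquantile for each distinct covariate value.
    mid_q = {lev: cell_mid_quantile(y[x == lev], p) for lev in np.unique(x)}
    u = np.array([h(mid_q[xi]) for xi in x])    # transformed working response
    # Step 2: closed-form least-squares solution.
    beta, *_ = np.linalg.lstsq(X, u, rcond=None)
    return beta

x = np.array([0, 0, 0, 0, 1, 1, 1, 1])
y = np.array([0, 0, 1, 1, 2, 2, 3, 3])
beta = midqr_closed_form(x, y, p=0.5)
```

For a log-linear model one would pass `h=np.log`, which requires the interpolated midquantiles to be strictly positive.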
We can derive the variance-covariance of
via the law of total variance as follows:
(9) 
We estimate the first term on the right-hand side of (9) using a Huber-White variance-covariance estimator, which is given by , where and , . To obtain an estimate of the second term, we note that, by the delta method,
(10) 
The expression for depends on the variance of the estimator , which we discuss further below. We omit the tedious algebra for the Jacobian , which can be easily obtained via numerical differentiation. Also, note that the latter is carried out efficiently since the Jacobian is sparse, with sparsity no less than . This follows from the fact that the th partial derivatives of with respect to elements of with indices other than and are null (hence, there are at most nonzero partial derivatives).
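For the first term in (9), a Huber-White estimator has the familiar sandwich structure. The sketch below implements the textbook heteroscedasticity-consistent (HC0) form for a least-squares fit; it is an assumption that this matches the exact expression used for the first term, and the function name `huber_white_cov` is hypothetical. The second (delta-method) term of (9) is not covered here.

```python
import numpy as np

def huber_white_cov(X, residuals):
    """Textbook HC0 sandwich covariance for least-squares coefficients."""
    X = np.asarray(X, dtype=float)
    e = np.asarray(residuals, dtype=float)
    bread = np.linalg.inv(X.T @ X)              # (X'X)^{-1}
    meat = X.T @ (X * (e ** 2)[:, None])        # X' diag(e_i^2) X
    return bread @ meat @ bread

X = np.ones((4, 1))                 # intercept-only design
e = np.array([1.0, -1.0, 1.0, -1.0])
cov = huber_white_cov(X, e)
```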
Of course, one can still apply numerical optimization to obtain (7), regardless of whether is within the interval . In this case, the variance of can be obtained using nonparametric bootstrap (Jentsch and Leucht, 2016), an estimator of the asymptotic variance as derived in Theorem 2 below (more details are given at the end of Appendix A.2), or a numerical Hessian approximation within the optimization procedure.
The estimation of plays a key role in our approach. In our formulation, we require an estimator that can be applied to discrete responses and that admits continuous and discrete covariates (or a mix thereof). In line with the nonparametric flavor of our modeling strategy, we considered the conditional CDF estimator proposed by Li and Racine (2008). This takes the form
(11) 
where is the (product) kernel with bandwidth vector and is the kernel estimator of the marginal density of . We refer the reader to Li and Racine (2008), Li, Lin and Racine (2013), and Hayfield and Racine (2008) for technical and implementation details. By applying (11) to the sample observations, we obtain , , and , which we plug into (5). Here, we set , hence . It follows that the diagonal elements of the matrix are given by
, , and . In the expression above, we can neglect the covariance between and , , as this is asymptotically zero as shown in the proof of Theorem 2 (Section 2.4). This means that the off-diagonal elements of are also asymptotically negligible. Finally, the expression for the variance of is given in Li and Racine (2008).
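A kernel-weighted conditional CDF of the kind referred to in (11) can be sketched as follows for a single continuous covariate. This is a simplified illustration under stated assumptions, not the Li and Racine (2008) estimator as implemented in their software: it uses a Gaussian kernel (which is not compactly supported), a user-supplied bandwidth, no smoothing in the response direction, and hypothetical function names.

```python
import numpy as np

def kernel_cond_cdf(y_grid, x0, x, y, bandwidth):
    """Kernel-weighted estimate of P(Y <= v | X = x0) for each v in y_grid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)  # Gaussian kernel weights
    w = w / w.sum()                                 # normalize by the density estimate
    return np.array([np.sum(w * (y <= v)) for v in y_grid])

def kernel_cond_mid_cdf(x0, x, y, bandwidth):
    """Conditional midCDF at the distinct observed responses."""
    z = np.unique(y)
    F = kernel_cond_cdf(z, x0, x, y, bandwidth)
    m = np.diff(F, prepend=0.0)     # estimated conditional point masses
    return z, F - 0.5 * m           # CDF minus half the point mass

x = np.zeros(10)                    # identical covariate values: uniform weights
y = np.array([0, 0, 1, 1, 1, 2, 2, 2, 2, 2])
z, g = kernel_cond_mid_cdf(0.0, x, y, bandwidth=1.0)
```

With identical covariate values the weights are uniform, so the result reduces to the marginal sample midCDF, which gives a simple sanity check.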
Unfortunately, nonparametric estimation of entails a loss of performance when the dimension of is large, the design is sparse, or both. In these cases a semiparametric approach may be preferred. A natural choice is the binomial estimator , where is the predictor of a binomial model on the response , , with dimensional parameter vector and link function . An estimator of this sort is discussed by Peracchi (2002).
2.4 Theoretical results
We first show consistency of under general assumptions. We then provide its asymptotic distribution. Here, we assume that the conditional CDF estimator of Li and Racine (2008) is used to obtain
. The identity matrix of order
will be denoted by . The proofs of the theorems in this section are given in Appendix A.
Theorem 1.
Generate independent observations from a discrete distribution with parameters satisfying (4). Assume that the marginal density of is strictly positive and that . Assume also that the kernel in (11) is symmetric, bounded, and compactly supported, and that , while for all . The interpolation operator is assumed to be deterministic. Let denote the solution in (7) and let be its population counterpart.
Then, for all , . Additionally, and .
Theorem 2.
In addition to the assumptions in Theorem 1, assume that the interpolation operator is differentiable. Assume also that the design matrix is full rank, and that exists and is a positive-definite matrix. Finally, assume that . Then,
where , , and .
3 Simulation study
Data were generated according to four models with homoscedastic discrete uniform, heteroscedastic discrete uniform, Poisson, and Bernoulli errors. Each model was considered with either one discrete covariate (scenario a), or with two continuous covariates (scenario b) as follows:

, where and ;

, where , , and ;

, where and ;

, where , , and ;

, where , , and ;

as in scenario (3a) with , , and ;

, where , , and ;

as in scenario (4a) with , , and ;
where and
denote random variables with, respectively, discrete and continuous uniform distribution on
. Samples of size were independently drawn from each model for replications. We then fitted the linear midquantile model with data generated under models 1 and 2; the log-linear midquantile model with data generated under model 3; and the logistic midquantile model with data generated under model 4. All models were estimated for 7 deciles,
, except the logistic model, which was estimated for the median only.
Let denote the true midquantile at level for a given under any of the data-generating models defined above and be the corresponding estimate for replication . We assessed the performance of the proposed methods in terms of average bias and root mean squared error (RMSE) of the midQF, i.e.
and
respectively. We also report the average true midquantiles at
as a term of comparison for assessing the relative magnitude of the bias. Finally, we calculated confidence intervals to assess coverage of the slope parameter in midquantile models for
when data were generated under scenarios 1a, 2a, and 3a. The corresponding standard errors were computed based on expression (9).
Estimated bias and RMSE of the proposed estimator are shown in Tables 1-3 for scenarios 1a, 2a, and 3a, and in Tables B1-B3 (Appendix B) for scenarios 1b, 2b, and 3b. The bias was, in general, small, never exceeding of the average midquantile for the homoscedastic discrete uniform model (1a and 1b), for the heteroscedastic discrete uniform model (2a and 2b), and for the Poisson model with a discrete covariate (3a). In contrast, the bias was relatively higher (up to of the average midquantile) in the Poisson scenario with continuous covariates (3b), although this issue was limited to the tail quantiles at smaller sample sizes. Both bias and RMSE decreased with at approximately the expected rate for all three models. The estimated bias and RMSE of the proposed estimator for the Bernoulli model (4a and 4b) were extremely small at all sample sizes (results not shown). The estimated conditional midquantiles from all replications and the average estimated conditional midquantiles () are shown in Figure B1 (Appendix B) for scenarios 1b, 2b, and 3b, and in Figure B2 (Appendix B) for scenario 4b. All midquantiles are plotted as functions of , with set equal to the median of . The observed coverage at the nominal confidence level for the slope in selected scenarios is given in Table 4. The results are in general accurate, although frequencies are occasionally slightly away from the nominal level. This is not surprising since the sample estimator of (9) relies on the Huber-White estimator and on several approximations.
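The two performance measures used in this section can be computed along the following lines. This is a sketch under an assumed averaging scheme, namely errors pooled over all replications and all observations; the exact averaging in the definitions above may differ, and the function name `bias_rmse` is hypothetical.

```python
import numpy as np

def bias_rmse(estimates, truth):
    """Average bias and RMSE of estimated midquantiles.

    estimates: array of shape (R, n), one row per replication.
    truth: array broadcastable to (R, n) holding the true midquantiles.
    """
    err = np.asarray(estimates, dtype=float) - np.asarray(truth, dtype=float)
    bias = err.mean()                     # pooled average error
    rmse = np.sqrt((err ** 2).mean())     # pooled root mean squared error
    return bias, rmse

est = np.array([[1.0, 2.0], [3.0, 4.0]])  # two replications, two observations
true_q = np.array([1.0, 2.0])
bias, rmse = bias_rmse(est, true_q)
```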
It would be remiss of us not to compare our proposed estimator with existing alternatives. The estimator developed by Machado and Santos Silva (2005) (hereinafter referred to as MSS) is a natural candidate. However, the comparison is inevitably restricted: first of all, the response must be a count or ordinal variable; more importantly, neither the ‘true’ coefficients nor the population quantiles underlying midquantile and jittering-based estimation are, in general, the same quantities. Indeed, the quantiles modeled by MSS are defined as
, where and is uniformly distributed on . The jittered quantiles dominate the true quantiles since , uniformly over . In contrast, midquantiles interpolate the true quantiles. For these reasons, we consider only Poisson data as in scenario 3a. The ratio of the variance of the slope estimates for the jittering-based estimator relative to ours (VR) is given in Table 5. There is clear evidence that our estimator is generally more efficient in this particular instance, but clearly results are not generalizable.

Table 1. Bias and RMSE of the estimated midquantiles for scenario 1a, by quantile level and sample size, with the average true midquantile in the last column.
Quantile  n = 100         n = 500         n = 1000        Average true
level     Bias    RMSE    Bias    RMSE    Bias    RMSE    midquantile
0.2       0.015   0.508   0.023   0.221   0.006   0.153    8.494
0.3       0.097   0.548   0.004   0.246   0.011   0.172    9.494
0.4       0.113   0.574   0.008   0.263   0.016   0.182   10.494
0.5       0.121   0.587   0.010   0.271   0.016   0.186   11.494
0.6       0.121   0.579   0.012   0.269   0.013   0.183   12.494
0.7       0.135   0.550   0.014   0.251   0.012   0.167   13.494
0.8       0.211   0.532   0.045   0.223   0.028   0.146   14.494
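
Table 2. Bias and RMSE of the estimated midquantiles for scenario 2a, by quantile level and sample size, with the average true midquantile in the last column.
Quantile  n = 100         n = 500         n = 1000        Average true
level     Bias    RMSE    Bias    RMSE    Bias    RMSE    midquantile
0.2       0.417   1.689   0.281   1.056   0.071   0.791   14.737
0.3       0.482   1.940   0.389   1.108   0.229   0.856   18.234
0.4       0.397   2.038   0.181   1.071   0.048   0.817   21.731
0.5       0.444   2.100   0.418   1.068   0.297   0.810   25.228
0.6       0.308   2.134   0.322   1.138   0.290   0.887   28.725
0.7       0.073   1.933   0.252   0.967   0.168   0.724   32.222
0.8       0.484   1.864   0.273   0.817   0.259   0.594   35.719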
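
Table 3. Bias and RMSE of the estimated midquantiles for scenario 3a, by quantile level and sample size, with the average true midquantile in the last column.
Quantile  n = 100         n = 500         n = 1000        Average true
level     Bias    RMSE    Bias    RMSE    Bias    RMSE    midquantile
0.2       0.232   3.369   0.190   1.848   0.226   1.552   243.938
0.3       0.711   2.921   0.482   1.476   0.492   1.199   247.933
0.4       0.554   2.794   0.340   1.358   0.377   1.037   251.596
0.5       0.862   2.892   0.635   1.434   0.642   1.144   254.593
0.6       0.891   2.983   0.605   1.454   0.610   1.122   257.921
0.7       1.265   3.462   0.856   1.842   0.844   1.504   261.251
0.8       1.580   4.288   0.888   2.388   0.827   2.050   265.580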
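
Table 4. Observed coverage (%) of confidence intervals for the slope, by scenario, sample size, and quantile level.
Quantile  Scenario 1a              Scenario 2a              Scenario 3a
level     100     500     1000     100     500     1000     100     500     1000
0.3       97.70   97.70   97.90    96.60   95.90   93.30    93.70   94.59   95.09
0.5       95.90   94.90   96.10    94.60   94.60   92.50    93.90   95.19   95.09
0.7       98.30   96.70   98.50    96.00   96.00   96.80    97.60   97.49   96.99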
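
Table 5. Variance of the slope estimates for the proposed estimator and variance ratio (VR) of the jittering-based estimator relative to ours for scenario 3a, by quantile level and sample size.
Quantile  n = 100           n = 500           n = 1000
level     Variance  VR      Variance  VR      Variance  VR
0.2       0.997     0.878   0.189     0.895   0.089     0.978
0.3       0.489     1.316   0.097     1.312   0.055     1.214
0.4       0.409     1.384   0.081     1.307   0.043     1.364
0.5       0.386     1.442   0.073     1.257   0.039     1.308
0.6       0.372     1.357   0.072     1.332   0.037     1.399
0.7       0.423     1.250   0.081     1.220   0.040     1.387
0.8       0.578     0.939   0.114     1.025   0.058     1.072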
4 Applications
Two real data applications are discussed in this section. The first application concerns deaths from terrorist attacks around the world, while the second deals with prescription drug use in the United States (US).
4.1 Global terrorism
The Global Terrorism Database (GTD), accessible at https://www.start.umd.edu/gtd, is a comprehensive, open-source database that provides information on terrorist events around the world from 1970 through 2017 (National Consortium for the Study of Terrorism and Responses to Terrorism (START), 2018). For each GTD incident, several variables are available including date and location of the incident, method of attack, and the number of fatalities. Due to an issue of consistency with data collection up to 2007, we restricted our analysis to the period 2008-2017. We analyzed attacks in selected regions of the world (East Asia, North America, Western Europe). Moreover, we included observations of selected types of attacks (armed assaults, bombings, hijacking, facility/infrastructure attacks) and excluded assassinations, hostage takings, kidnappings, unarmed assaults and attacks of unknown type. Figure 2 shows the marginal distribution and midquantile function of fatalities estimated from a total of 2,491 incidents, by region. The maximum number of casualties in a single attack was 184 as a result of riots (armed assaults) in Urumqi, China. The extreme skewness of the distribution is apparent, as is the sparsity of the observations. The latter increases with the size of the counts, particularly in East Asia where the count distribution had the largest gap (going from 50 to 184 deaths, the two most extreme counts). This makes the application of the MSS estimator somewhat troublesome, as argued further below.
We fitted log-linear midquantile models, separately for each region, to estimate temporal trends in number of deaths. Here, the quantile level is an index of the lethality of the attacks in a given year. In Figure 3 we show the estimated conditional midquantiles by year, for , . During 2008-2017, the estimated number of fatalities in terrorist attacks increased in East Asia () and Western Europe (), but remained constant in North America (). In Western Europe, the yearly rates of increase differed by quantile level, with an estimated rate of at the median (90% confidence interval: to ), but at (90% confidence interval: to ). In other words, deadlier attacks in Europe are becoming more deadly over time. Of course, these results are to be interpreted with caution since the attacks included in this analysis originate, in general, from heterogeneous motives.
Predictions obtained with the MSS approach (based on jittered samples) showed some instability, resulting in quantile crossing between the 7th and the 8th predicted deciles of casualties in East Asia. These results were insensitive to increasing the number of jittered samples (up to ). The purpose of this comparison is not to claim a superiority of our method in general but simply to remark that, in the presence of large gaps between responses, our interpolation-based estimator might have an advantage over jittering.
4.2 Prescription drugs
In this section we illustrate another application of midquantile regression using data on prescription medications from the National Health and Nutrition Examination Survey (NHANES) (National Center for Health Statistics (NCHS), 2019). The US is the worldwide leader in per capita prescription drug spending (The Kaiser Family Foundation, 2015), and its pharmaceutical market represents a major economic sector worth hundreds of billions of dollars. In a recent quantile regression analysis of NHANES data, Hong et al. (2019) found a higher opioid use (morphine milligram equivalent) in adults with long-standing physical disability and those with inflammatory conditions as compared to individuals with other conditions. Differences were markedly larger at the 75th and 95th percentiles than those at lower percentiles. In the context of medication use, a higher percentile can be interpreted as an index of diminished health, lower quality of life, and higher financial burden. A quantile regression analysis of prescription medication use is therefore of both public health and health economics interest.
We abstracted data () on number of prescription medicines taken and the main reason for use from the 2015-2016 Dietary Supplement and Prescription Medication section of the Sample Person Questionnaire. We also obtained information on age (years), sex, and race. Before carrying out the analysis, we removed the effect of NHANES oversampling by first restricting the dataset to all observations for White persons (about of the overall sample), and subsequently adding observations for persons of other races that were subsampled with probabilities proportional to their NHANES weights. This resulted in a sample () composed of about of White persons and females.
For the purpose of this analysis, the main reason for medication use, originally converted by the NHANES to an ICD-10 (International Statistical Classification of Diseases, 10th revision) code, was grouped into: diseases of the circulatory system (“I codes”), metabolic diseases (“E codes”), mental disorders (“F codes”), diseases of the respiratory system (“J codes”), and all other codes (“other codes”). Race was categorized as White, Black, and other races. We restricted the dataset to adults aged 18-65 years. The final sample size for analysis was
. Figure 4 shows the marginal distribution and midquantile function of number of prescription medicines.
Preliminarily, we investigated a log-linear model with age (centered at 40 years and scaled by 10), sex (baseline: female), and the interaction between age and sex. The results (not shown) indicated that prescription medications count increases with age (as expected) and that the rate of increase is quantile-dependent, but not sex-specific. We subsequently fitted a log-linear model with reason for medication use (baseline: other codes) and race (baseline: White), in addition to sex and age (but no interaction between age and sex). Due to the high proportion of zeros, the admissible range for the application of (8) was . In Figure 5, we report the estimated coefficients and their confidence intervals for . Men tend to use fewer medications than women, and the inequality is more marked among high users. Age is confirmed to be an important predictor, with an estimated difference of about 4 prescriptions between a person aged 65 years and an 18 year old at . However, the difference climbs to 25 prescriptions at . The effect of race is not statistically significant for any of the quantiles. In contrast, a rather strong effect is due to reason for medication. As compared to persons in the group of ‘other’ ICD-10 codes, those assigned to the circulatory diseases group have the highest use of prescription drugs, followed by persons assigned to mental disorders, metabolic diseases, and respiratory diseases groups. Moreover, effects for all groups decrease with the quantile index . The decreasing magnitude of the estimated coefficients indicates that, with larger , predictions of one group relative to the reference group become proportionally smaller. Yet, the absolute differences between predictions increase with . For instance, as compared to persons in the reference group, those in the circulatory diseases group received 44 more prescription drugs at , but 98 at .
By comparison, an analogous GLM for Poisson data predicts a difference of 34 and 37 prescriptions at and , respectively, thus grossly underestimating the effect of the reason for medication.
Finally, we compared our estimates to those obtained using the MSS estimator. The results, reported in Table 6, confirm the findings from the simulation study according to which our proposed estimator might have an advantage in terms of efficiency.

Table 6. Estimated coefficients (Est.) and standard errors (SE) from midquantile regression (MIDQR) and the MSS estimator.
                 MIDQR          MSS            MIDQR          MSS
                 Est.    SE     Est.    SE     Est.    SE     Est.    SE
Intercept        0.126   0.073  0.607   0.123  1.041   0.104  0.368   0.126
Sex (male)       0.181   0.074  0.503   0.164  0.624   0.100  0.623   0.131
Age (years)      0.073   0.032  0.188   0.063  0.164   0.043  0.265   0.059
Race (Black)     0.039   0.112  0.385   0.268  0.040   0.151  0.299   0.172
Race (other)     0.083   0.083  0.385   0.162  0.137   0.113  0.262   0.132
Circulatory (I)  3.930   0.121  4.484   0.254  3.573   0.157  4.129   0.255
Metabolic (E)    2.987   0.138  3.682   0.263  2.634   0.183  3.347   0.202
Mental (F)       3.192   0.132  3.783   0.237  2.892   0.175  3.367   0.122
Respiratory (J)  2.438   0.203  3.197   0.409  2.359   0.285  3.223   0.237
5 Concluding remarks
We developed an approach to conditional quantile estimation with discrete responses. We established the theoretical properties of our conditional midquantile estimator under general conditions and showed its good performance in a simulation study with data generated from different discrete response models. Our two-step estimator is easy to implement. When constraining the quantile index to a data-driven admissible range, the second-step estimating equation has a least-squares type, closed-form solution, which is computationally efficient. In real data analyses, conditional midquantiles provided results that reveal interesting aspects of important matters like terrorism and prescription drug use.
We believe that midquantile regression is amenable to several possible extensions, including estimation in the presence of censoring and modeling of clustered data.
Acknowledgements
Appendix A  Proofs of Theorems
A.1 Proof of Theorem 1
Proof.
Under the conditions stated (Li and Racine, 2008) Since the interpolation operator is deterministic by assumption, then it follows that Consequently,
Consider now . It is straightforward to verify that
In fact, if for some value of and , then ; while all other values are obtained through interpolation. A consequence is that is, eventually, a solution of the minimization problem in (7). Additionally, there is only one such solution, since, by assumption, for all , and is monotonic for , where and are the midprobabilities corresponding to, respectively, the smallest and largest discrete value (if , then ). This implies consistency of , the minimizer in (7). Consistency of the predicted midquantiles follows directly. ∎
A.2 Proof of Theorem 2
Proof.
Since the differentiability of follows from the assumptions, we can apply a firstorder Taylor expansion to obtain
(A.1) 
where is a point in the interior of the hypercube delimited by and . Expressions for and are given in the next section. Note that since is the minimizer in (7). The assumption on the design matrix guarantees that the Hessian is positive definite. Hence, we can rewrite (A.1) as
(A.2) 
To derive the asymptotic distribution of , it suffices to study the asymptotic distribution of the right-hand side of (A.2). First, let . By using the consistency results in Theorem 1 and the triangle inequality, it is immediate to show that weakly converges elementwise to . Using the results in the web supplement, we can then write
We need to demonstrate that the expression above converges in distribution, thus we expand the quantities on the right-hand side as follows:
(A.3) 
where . First of all, as shown in Li and Racine (2008), converges in distribution to a Gaussian random variable for all . Additionally, the assumptions on the bandwidths guarantee asymptotic independence of and for and all . To see this, note that for all . According to the dominated convergence theorem, the asymptotic covariance of