1 Introduction
Quantile regression analysis of clustered data is a very active area of research. Since the seminal work of
Koenker and Bassett (1978) on methods for crosssectional observations, there have been a number of proposals on how to accommodate for the dependency induced by clustered (e.g., longitudinal) designs. As outlined by Geraci and Bottai (2014) and then extensively reviewed by Marino and Farcomeni (2015), approaches to linear quantile regression with clustered data can be classified into distributionfree and likelihoodbased approaches. The former include fixed effects
(Koenker, 2004; Lamarche, 2010; Galvao and MontesRojas, 2010; Galvao, 2011) and weighted (Lipsitz et al., 1997; Fu and Wang, 2012) approaches. The latter are mainly based on the asymmetric Laplace (AL) density (Geraci and Bottai, 2007, 2014; Yuan and Yin, 2010; Farcomeni, 2012) or other, usually flexible, parametric distributions (for example, Rigby and Stasinopoulos, 2005; Reich, Bondell and Wang, 2010; Noufaily and Jones, 2013). A different classification can be made into approaches that include clusterspecific effects (e.g., Koenker, 2004; Geraci and Bottai, 2014) and those that ignore or remove them (Lipsitz et al., 1997; Canay, 2011; Parente and Santos Silva, 2016).In some applications, the assumption of linearity may not be appropriate. This is often the case in, for example, pharmacokinetics (Lindsey, 2001) and growth curve modeling (Panik, 2014). Contributions to statistical methods for nonlinear mean regression when data are clustered can be found in the literature of mixedeffects modeling (Lindstrom and Bates, 1990; Pinheiro and Bates, 1995, 2000) as well as generalized estimating equations (Davidian and Giltinan, 1995, 2003; Contreras and Ryan, 2000; Vonesh et al., 2002). We examined the statistical literature on parametric nonlinear quantile regression functions with clustered data (thus, in our review, we did not consider smoothing for nonparametric quantile functions). To the best of our knowledge, there seem to be only a handful of published articles. Karlsson (2008) considered nonlinear longitudinal data and proposed weighting the standard quantile regression estimator (Koenker and Bassett, 1978) with prespecified weights. Wang (2012), taking her cue from Geraci and Bottai (2007), used the AL distribution to define the likelihood of a Bayesian nonlinear quantile regression model. Huang and Chen (2016) proposed a Bayesian joint model for timetoevent and longitudinal data. An approach based on copulas is developed by Chen, Koenker and Xiao (2009). Finally, Oberhofer and Haupt (2016) established the consistency of the norm nonlinear quantile estimator under weak dependency. None of the cited papers provides an approach to modeling and estimation of nonlinear quantile functions with random effects in a frequentist framework. Such an approach is desirable when the correlation between measurements is modeled by means of random effects, but the parameters of interest are assumed to be fixed.
In this paper, we propose an extension of Geraci and Bottai’s (2014) linear quantile mixed model (LQMM) to the nonlinear case. In Section 2, we briefly outline the LQMM approach and, in Section 3, we introduce the nonlinear quantile mixedeffects model or nonlinear quantile mixed model (NLQMM) for short. Estimation is carried out using a novel algorithm which is a combination of a smoothing algorithm for quantile regression and a second order Laplacian approximation for nonlinear mixed models. We then carry out a simulation study in Section 4 to assess the performance of the proposed methods. In Section 5, we consider an application of NLQMM to pharmacokinetics and growth curves modeling. We conclude with some remarks in Section 6.
2 Linear quantile mixed models
Consider data from a twolevel nested design in the form , for and , , where is the th row of a known matrix , is the th row of a known matrix and is the
th observation of the response vector
for the th cluster. Throughout the paper, the covariates and are assumed to be given. The vector of ones will be denoted by , the identity matrix by , and the matrix of zeros by .In a distributionfree approach, the linear quantile regression model for clustered (or panel) data (e.g., Koenker, 2004; Abrevaya and Dahl, 2008; Bache, Dahl and Kristensen, 2013) can be specified as
(1) 
where is the given quantile level, is a vector of coefficients common to all clusters, while the vector may vary with cluster. All the parameters are specific, although the clusterspecific effects are often specified simply as pure locationshift effects (Koenker, 2004; Lamarche, 2010). Fitting can be achieved by solving the classical norm regression problem
(2) 
where
is the loss function and
denotes the indicator function. The penalty on the clusterspecific effects controls the variability introduced by a large number of parameters and is usually based on the norm (Koenker, 2004; Lamarche, 2010; Bache, Dahl and Kristensen, 2013).To mimic the minimization problem (2) in a likelihood framework, Geraci and Bottai (2014) introduced the convenient assumption that the responses , , , conditionally on a vector of random effects , independently follow the asymmetric Laplace (AL) density
with location and scale parameters given by and , respectively, which we write as
. (The third parameter of the AL is the skew parameter
which, in this model, is fixed and defines the quantile level of interest.) Also, they assumed that , for , is a random vector independent from the model’s error term with mean zero and variancecovariance matrix . Note that all the parameters are dependent. The random effects vector depends on through the variancecovariance matrix. If we let and , the joint density based on clusters for the th linear quantile mixed model (LQMM) is given by (Geraci and Bottai, 2014)Geraci and Bottai (2014) proposed estimating LQMMs through a combination of Gaussian quadrature and nonsmooth optimization. They approximated the marginal (over the random effects) loglikelihood using the rule
(3) 
with , where and , ,
, denote, respectively, the abscissas and weights of the (onedimensional) Gaussian quadrature. In principle, one can consider different distributions for the random effects, which may be naturally linked to different quadrature rules (or penalties). For example, it is immediate to verify that the double exponential distribution confers robustness to the model and is akin to a GaussLaguerre quadrature
(Geraci and Bottai, 2014). Alternatively, one can avoid parametric assumptions altogether (Alfò, Salvati and Ranallli, 2016). Throughout this paper we assume that the random effects are normally distributed and we do not explore this issue any further. However, this assumption can be relaxed and the developments that follow can be modified as appropriate.
3 Nonlinear quantile mixed models
3.1 The model
We consider the nonlinear quantile regression function
(4) 
where is a nonlinear, smooth function of the random parameter , and are two given design matrices of dimensions and , respectively, which in general contain elements of the covariates .
To stress the functional dependence of the quantiles on the fixed parameter and on the random parameter , we write . For estimation purposes, model (4) can be equivalently written as
(5) 
conditionally on , where . Moreover, we assume , independently from .
Note the similarities and dissimilarities between the proposed model (5) and the traditional nonlinear mixedeffects (NLME) model
with and . First of all, conditionally on the random effects, both models impose a restriction on the error term (Powell, 1994). However, the NLME model requires , while the ALbased specification of the error given in (5) leads to or, equivalently, . Secondly, the fixed effects can be interpreted as the average value of the clusterspecific parameters, i.e. , or as the regression parameters of the ‘zeromedian’ cluster, i.e. a cluster with a zero randomeffect vector. However, the parameter is allowed to vary with the quantile level , while in the NLME model is not (except for a location shift). Finally, in both approaches the variancecovariance matrix of the random effects gives a measure of the variability of around but, again, estimates are allowed to differ by only for the quantile mixedeffects model.
In general, neither model (5
) nor the NLME model provide fixed parameters that can be interpreted as, respectively, regression quantiles or regression means for the population. This is because random effects are allowed to enter nonlinearly in the model. (Similarly, several generalized linear mixed models with nonlinear link functions lack marginal interpretability
(Ritz and Spiegelman, 2004; Gory, Craigmile and MacEachern, 2016).) In contrast, the fixed effects of a linear (normal) mixed model remain the same after the random effects are integrated out, whereas, in general, this is not true for the fixed effects of the LQMMs of Geraci and Bottai (2014).3.2 Laplacian approximation
Let be the scaled variancecovariance matrix of the random effects and define , where an an unrestricted dimensional vector, , of nonredundant parameters in . Our goal is to maximize the marginal loglikelihood
(6) 
where and .
First of all, we consider the following smooth approximation of (Madsen and Nielsen, 1993; Chen, 2007):
where and is a scalar “tuning” parameter. For , we have that . A similar approximation is given by Muggeo, Sciandra and Augugliaro (2012) who claimed that their method provides a better approximation than Chen’s (2007) algorithm. However, no analytical evidence was provided in their paper to support such claim. This point might offer scope for additional investigation but, here, it represents a secondary issue and will not be discussed any further.
Let be the vector of residuals , , for the th cluster, and define the corresponding sign vector with
(7) 
(Note that the notation above has been simplified since the ’s as well as the ’s should be written as functions of the ’s.) Then we have
(8) 
where is an diagonal matrix with diagonal elements , and are two vectors with elements
and
respectively.
We now define the function
(9) 
which is akin to a regularized, nonlinear, weighted leastsquares loss function. The gradient of with respect to is given by
(10) 
where and , while the Hessian is given by
(11) 
Moreover, let
(12) 
be the conditional mode of . For a given value of , this can be obtained by means of penalized leastsquares.
A secondorder approximation of around is given by
where , , and . Since is zero at , we have the following Laplacian approximation of the loglikelihood
(13) 
If we ignore the negligible contribution of the first term in expression (3.2) (Pinheiro and Bates, 1995), then only the firstorder partial derivatives of are required to compute (3.2).
Since does not depend on , the loglikelihood can be profiled on leading to
(14) 
where .
Estimation of the parameters can be carried out iteratively. A pseudocode of the algorithm is given in Appendix. The algorithm requires setting the starting value of and , the tuning parameter , the tolerance for the change in the loglikelihood, and the maximum number of iterations. Moreover, the modes of the random effects can be obtained by minimizing the objective function of the penalized leastsquares problem using a GaussNewton method. Let be the relative precision factor such that (Pinheiro and Bates, 2000). Then the function in (9) can be rewritten as
(15) 
where
When using the asymmetric Laplace as pseudolikelihood, inference must be restrained to point estimation (see for example Yang, Wang and He, 2016)
. Standard errors for nonrandom parameters can be calculated using block bootstrap, although this increases the computational cost. Bootstrap confidence intervals have been shown to have good coverage in LQMMs
(Geraci and Bottai, 2014).4 Simulation study
In this section, we perform a simulation study to evaluate the proposed methods. We start from a setting similar to the one used in Pinheiro and Bates (1995), which is ideal for normal NLME models, and then investigate scenarios more apposite for NLQMM.
In the first scenario, we simulated the data from the following logistic model
(16) 
where , , , and . The random effects are thus associated with the asymptotes ( and ) and the sigmoid’s midpoint (). Their variancecovariance matrix was defined as
In the second scenario, we used the same model (16
), but we sampled the errors from a standardized chisquared distribution with
degrees of freedom, i.e. .In the third scenario, we slightly changed model (16) and used
(17) 
where , , , and
. Note that the error is skewed as in the second scenario but now operates within the exponential function. In this heteroscedastic model, there is only one random effect associated with the sigmoid’s midpoint.
In the fourth and last scenario, we used the biexponential model
(18) 
where , , and , with parameters and .
In all scenarios, we used , , . Instances of replications are shown in Figure 1. For data sampled from models (16) and (17), we fitted mixedeffects logistic quantile functions with parameter , where
in the first 3 scenarios,
in the first and second scenarios, and in the third scenario. For data sampled from model (4), we fitted mixedeffects biexponential quantile functions with parameter , where .
For each scenario, we replicated datasets and fitted NLQMMs at three quantile levels using . Estimation was carried out by following the algorithm as described in Appendix. An attempt to maximize the approximated Laplacian loglikelihood (3.2) was made by using the BroydenFletcherGoldfarbShanno (BFGS) algorithm in the first instance. Upon failure of the BFGS algorithm during any iteration of the main estimation algorithm, the latter was started again and a new attempt to maximize (3.2) was made by using the NelderMead algorithm. The maximum number of iterations was set to 500, while the tolerance for the relative change in the loglikelihood was set to . Between two successive iterations, the tuning parameter was multiplied by the factor . All the parameters of the optimization algorithms in optim and nlm were set at their default values. Computations were performed using the R environment for statistical computing and graphics (R Core Team, 2016) version 3.3.2 on a desktop computer with a 3.60GHz quad core i74790 processor and 16 gigabytes of RAM.
Before we proceed with the analysis of the results, it is important to note that, in general, the nonlinearity of the models along with the presence of the random effects pose a difficulty for establishing the ‘true’ value of for quantiles other than the median (see for example the simulation study in Karlsson, 2008), even when the errors are normal. For example, in the logistic model not only the asymptotes and , but also the midpoint and the scale change with in a rather complicated way. (An exception is given by model (17) for which the lower and upper asymptotes ( and , respectively) do not change with
.) We find solace in observing that such limitation brings out one of the advantages of quantilebased over momentbased modeling, since direct estimation of conditional quantiles does not require nontrivial manipulation of nonlinear relationships
(Demidenko, 2013, p.435). As a reference, we can consider the corresponding results from standard nonlinear quantile regression (NLRQ) (Koenker and Park, 1996) under the assumption of independent observations. Similarity of the magnitude and direction of the estimates would support the interpretation of as regression parameters of the ‘zeromedian’ cluster, while comparing the variability of the estimates from NLQMM and NLRQ would inform us on whether accounting for clustering provides a gain in efficiency.The average estimates
and standard deviations of the estimates are reported in Tables
14. In summary, NLQMM estimates were close to NLRQ estimates in all scenarios. The variability of the estimates from NLQMM was either lower or close to that of the estimates from NLQR. Of all the results, perhaps those related to the quantile in the third scenario (Table 3) deserve more discussion. Both NLQMM and NLRQ clearly failed to provide meaningful estimates of the parameters. This is due to the fact that the range of the simulated values for was not wide enough to correctly estimate the upper asymptote at upper quantiles. This observation may have a particular relevance when modeling reference growth curves. Further, the estimated variancecovariance parameters (Table 5) and the predicted random effects obtained from (12) (Figures 25) were, in general, consistent with the parameters of the true distribution of the random effects, although, as noted before, direct comparisons are not straightforward.We now provide basic details about the algorithm’s performance and a few recommendations. On average, it took about 7 iterations (approximately 35 seconds) to fit one model for the quantiles or , and about 6 iterations (approximately 20 seconds) for the median in the first two scenarios. In the third scenario it took between and iterations (approximately seconds on average) to fit one model for any of the three quantile levels. In the fourth scenario, the algorithm needed a similar number of iterations as in the first two scenarios but the time to convergence was, on average, twice as long. This means that, within each iteration, the number of function evaluations required by optim to fit the more complex biexponential model was greater than that needed to fit the logistic model. In the first two scenarios, the median value of the factor at the last iteration was about for all considered quantiles. In the third scenario, it was less than for the tail quantiles and for the median. In the fourth scenario, it was less than for all considered quantiles.
In a separate analysis (results not shown), the average number of iterations to convergence increased to at least 10 when was increased to . In contrast, the algorithm converged too quickly to smaller values of the loglikelihood when setting to less than 0.2. This was to be expected since controls the speed at which the smoothing parameter approaches zero. As in Chen (2007), we recommend using in most situations.
Further, in the first three scenarios the average number of iterations and the values of the estimates were not particularly sensitive to the specific algorithm used for optimizing the loglikelihood, although the BFGS algorithm did fail to converge in about of the replications, more often when estimating tail quantiles () rather than when estimating the median (). In contrast, BFGS never failed to converge in the fourth scenario. We then ran a separate analysis (results not shown) in which biexponential models were fitted exclusively using NelderMead. For , estimates were unreasonable. We recommend using BFGS as default optimization algorithm.
NLQMM  

68.00 (0.73)  12.66 (0.40)  3.06 (0.10)  8.79 (0.34)  
70.23 (0.38)  9.99 (0.28)  3.04 (0.05)  9.70 (0.22)  
73.47 (0.56)  7.28 (0.38)  3.15 (0.10)  9.76 (0.52)  
NLRQ  
68.47 (2.76)  12.77 (0.58)  3.04 (0.27)  9.54 (0.53)  
69.79 (0.92)  9.99 (0.33)  3.00 (0.18)  10.11 (0.71)  
72.26 (0.85)  7.19 (0.53)  3.11 (0.32)  9.53 (2.75) 
NLQMM  

69.28 (0.68)  12.70 (0.40)  3.05 (0.08)  10.25 (0.22)  
71.39 (0.35)  9.97 (0.27)  3.05 (0.05)  10.64 (0.19)  
74.67 (0.55)  7.26 (0.38)  3.15 (0.10)  10.98 (0.58)  
NLRQ  
69.67 (2.53)  12.77 (0.55)  3.03 (0.24)  10.80 (0.46)  
71.03 (0.90)  9.98 (0.31)  3.01 (0.18)  11.25 (0.72)  
73.49 (0.82)  7.17 (0.54)  3.11 (0.32)  10.73 (2.86) 
NLQMM  

0.95 (0.06)  4.18 (0.20)  0.92 (0.08)  0.00 (0.00)  
1.00 (0.03)  3.19 (0.08)  0.85 (0.05)  0.00 (0.01)  
0.25 (0.93)  2.09 (1.34)  1.16 (1.40)  1.11 (0.36)  
NLRQ  
1.01 (0.03)  3.96 (0.06)  1.01 (0.02)  0.00 (0.00)  
1.00 (0.04)  3.18 (0.10)  0.87 (0.06)  0.00 (0.01)  
0.48 (0.57)  1.24 (1.65)  2.19 (0.98)  0.40 (0.22) 
NLQMM  

1.93 (0.19)  1.06 (0.12)  0.46 (0.13)  ()  
2.04 (0.15)  0.72 (0.10)  1.03 (0.11)  3.19 (0.17)  
2.05 (0.17)  0.61 (0.15)  1.82 (0.12)  2.48 (0.11)  
NLRQ  
1.86 (0.26)  0.98 (0.19)  0.61 (0.14)  ()  
2.04 (0.18)  0.70 (0.14)  1.01 (0.13)  3.27 (0.24)  
2.17 (0.21)  0.61 (0.19)  1.55 (0.14)  2.32 (1.89) 
First scenario  

1.73  0.97  1.68  
3.30  1.83  3.19  
6.61  3.67  6.36  
Second scenario  
1.76  0.99  1.73  
3.34  1.85  3.25  
6.62  3.65  6.38  
Third scenario  
0.01  0.04  0.00  
Fourth scenario  
0.16  0.22  0.04  
0.24  0.25  0.08  
0.16  0.18  0.17  
0.09  0.17  0.01 
5 Applications
5.1 Pharmacokinetics
We begin with the analysis of a dataset taken from an old pharmacokinetics study (Kwan et al., 1976), often used as a toy example in nonlinear regression modeling (Davidian and Giltinan, 1995; Pinheiro and Bates, 2000). The data consists of 11 measurements of plasma concentrations of indomethacin which was injected intravenously in 6 subjects. We used the biexponential model
where denotes the th measurement of drug concentration (mcg/ml), , on the th subject, , and is the time (hr) of the measurement since injection (given that the dataset is balanced, does not depend on ). We modeled the variancecovariance of the random effects using the diagonal matrix (variance components). Note that the regression model above includes 3 random effects, one for each of the first 3 fixed effects. In a separate analysis (results not shown), we found that the random effect associated with , , had nearzero variance (see also Pinheiro and Bates, 2000, p.283).
In this twocompartment model, the first exponential term determines the initial, rapidly declining distribution phase of the drug. The elimination of the drug is the predominant process during the second phase and is primarily determined by the second exponential term. Besides the average rates at which the drug is distributed and then eliminated, it might be of interest to assess the change over time of concentrations that are higher or lower than the mean. The left plot of Figure 6 shows the boxplots of indomethacin concentration at each measurement occasion. It appears that the scale and possibly even the shape of the distribution are changing over time. This would mean that the rates are not similar across the quantiles of the conditional distribution.
2.31 (0.48)  0.99 (0.16)  0.30 (0.13)  1.19 (0.57)  
2.55 (0.28)  0.58 (0.19)  0.44 (0.17)  1.33 (0.23)  
3.73 (0.52)  0.75 (0.35)  0.69 (0.34)  1.49 (0.37)  
NLME  2.83 (0.26)  0.77 (0.11)  0.46 (0.11)  1.35 (0.23) 
0.78  0.06  0.02  
0.59  0.08  0.02  
1.34  0.05  0.06  
NLME  0.33  0.03  0.01 
The fitted biexponential curves for are given in the right plot of Figure 6, while estimates of the fixed effects and their standard errors are reported in Table 6. The average rate (NLME) at which the drug decreases during the distribution phase was comparable to that of the 90th centile. However, the decrease was about faster at the lower 10th centile but about slower at the median as compared to the mean. During the second phase, the rate of decrease was, again, greatest at the 10th centile. However, the average rate was similar to that of the median and greater than that of the 90th centile. One implication is that the distribution of the response becomes increasingly rightskewed as time passes. This is an advantage of NLQMM over NLME as the latter cannot obviously model changes in the shape of the distribution.
Finally, there was substantial heterogeneity among subjects regarding starting concentration levels during the distribution phase, especially at the 90th centile (Table 7).
5.2 Growth curves
In this section, we analyze data on growth patterns of two genotypes of soybeans: Plant Introduction (P), an experimental strain, and Forrest (F), a commercial variety (Davidian and Giltinan, 1995)
. The response variable is the average leaf weight of 6 plants chosen at random from each experimental plot and measured at approximately weekly intervals, between two and eleven weeks after planting. The experiment was carried out over three different planting years: 1988, 1989, and 1990. Eight plots were planted with each genotype in each planting year, giving a total of 48 plots in the study
(Pinheiro and Bates, 2000).Figure 7 shows the temporal trajectories of the average leaf weight for individual plots. It is apparent that the experimental strain yielded heavier leaves that the F variety, at least on average. There also seem to be differences between planting years, with a wider spread of the curves in 1989. Given that previous analyses of these data focused on the average growth curves (Davidian and Giltinan, 1995; Pinheiro and Bates, 2000), we set out to investigate growth in the tails of the distribution. For our analysis, we used the same logistic model as that in Pinheiro and Bates (2000, p.293) which was selected over a number of alternative models, that is
where denotes the average leaf weight (g) observed on occasion , , in the th plot, , and is the time (days) of the measurement after planting. The number of measurements per plot ranged between 8 and 10. The design matrices of the parameter were given by
and . Thus, is a vector. The covariates in the
matrix are dummy variables for year of planting,
and , and genotype, . The baseline is represented by year 1988 and variety F. The only random effect included in the model was associated with the asymptote.The three plots in Figure 8 show the th centile, th centile, and mean predicted growth curves by variety and planting year, while the estimates and standard errors of the fixed effects are reported in Table 8. For the sake of brevity, we confine our discussion to the genotypic effect on the asymptote. In 1988, the experimental strain had an advantage over the commercial variety but only at the 95th centile of the leaf weight distribution, with an estimated asymptote difference of g. In the following year, there was a statistically significant interaction between variety and year at the 5th centile, corresponding to an estimated overall effect equal to g. The estimated overall effect of variety P on the asymptote at the 95th centile was 10.67 g, thus greater than the effect at the 5th centile and at the mean (7.19 g). Finally, the estimated differences between asymptotes of the growth curves for the experimental and commercial strains in 1990 were negligible: 0.58 g (5th centile), 2.81 g (95th centile) and 0.77 g (mean). In summary, the experimental strain did yield heavier leaves than the F variety, not only in 1989 as estimated by NLME, but also in 1988, and the magnitude of the genotypic effect was dependent on the quantile level.
NLME  
17.49 (1.47)  21.43 (2.34)  19.43 (0.95)  
7.99 (1.53)  7.02 (2.30)  8.84 (1.07)  
0.66 (2.06)  1.67 (2.49)  3.71 (1.18)  
1.64 (2.01)  6.31 (1.99)  1.62 (1.04)  
8.59 (1.93)  4.36 (2.41)  5.57 (1.17)  
2.22 (2.05)  3.50 (2.01)  0.15 (1.18)  
56.16 (1.13)  53.71 (2.57)  54.81 (0.75)  
3.30 (2.11)  0.86 (2.85)  2.24 (0.97)  
1.94 (2.48)  3.14 (2.79)  4.97 (0.97)  
2.50 (1.70)  0.51 (0.97)  1.30 (0.41)  
8.11 (0.32)  8.63 (0.79)  8.06 (0.15)  
0.29 (0.51)  0.76 (0.85)  0.90 (0.20)  
0.40 (0.49)  0.44 (0.91)  0.67 (0.21) 
6 Discussion
Mixedeffects modeling has a long tradition in statistical applications. There is a vast number of applications of mixed models to the analysis of clustered data in the social, life and physical sciences (Pinheiro and Bates, 2000; Demidenko, 2013). Linear quantile mixed models (Geraci and Bottai, 2007, 2014) have too been used in a wide range of research areas, including marine biology (Muir et al., 2015; Duffy et al., 2015; Barneche et al., 2016), environmental science (Fornaroli et al., 2015), cardiovascular disease (Degerud et al., 2014; Blankenberg et al., 2016), physical activity (Ng et al., 2014; Beets et al., 2016), and ophthalmology (Patel et al., 2015; Patel, Geraci and CortinaBorja, 2016). The present paper provides a novel and valuable contribution to the modeling of nonlinear quantile functions which broadens the applicability of quantile regression for clustered data.
NLQMMs represent a flexible alternative to nonlinear mixed models for the mean as they allow direct estimation of conditional quantile functions without imposing normal assumptions on the errors. As shown in two real data examples, NLQMMs reveal nonlinear relationships that may be quantitatively and qualitatively different at different quantiles. Also, changes in location, scale, and shape of the response distribution determined by the covariates are naturally brought into light by examining central and tail quantiles (Geraci, 2016). As compared to nonlinear quantile regression for independent data, our nonlinear estimators are more efficient and they provide additional information about the heterogeneity among clusters.
Acknowledgements
This research has been supported by the National Institutes of Health – National Institute of Child Health and Human Development (Grant Number: 1R03HD08480701A1).
Appendix
The estimation algorithm for NLQMM is based on a set of decreasing values of . This optimization approach has the appealing advantage of reducing the original problem to an approximated problem. The pseudocode is given below.
Smoothing Algorithm with Laplacian Approximation for Nonlinear Quantile Mixed Models

Set the maximum number of iterations ; the factor for reducing the tuning parameter ; the tolerance for the change in the loglikelihood; and . Estimate the starting values as follows:

obtain an estimate for using nonlinear quantile regression (Koenker and Park, 1996). See, for example, the function nlrq in the R package quantreg (Koenker, 2016) which supports selfstarting models such as SSlogis. If the nonlinear quantile regression algorithm fails, consider the estimate of the fixed effects from the NLME model in step (1.b) below or, if the latter fails too, a standard nonlinear least squares estimate (Bates and Watts, 1988);

obtain an estimate for from an NLME model. See, for example, the function nlme in the R package (Pinheiro et al., 2014). If the NLME algorithm fails, provide an arbitrary value ;

obtain an estimate for . For example, this can be estimated as the mean of the absolute residuals from step (1.a) above;

provide a starting value (see, for example, Chen, 2007, p.143);

using , , and , solve the penalized leastsquares problem (12) to obtain , . See, for example, the R function nlm.


Update and , .
References
 Abrevaya and Dahl (2008) [author] Abrevaya, J.J. Dahl, C. M.C. M. (2008). The effects of birth inputs on birthweight: Evidence from quantile estimation on panel data. Journal of Business and Economic Statistics 26 379397.
 Alfò, Salvati and Ranallli (2016) [author] Alfò, MarcoM., Salvati, NicolaN. Ranallli, M. GiovannaM. G. (2016). Finite mixtures of quantile and Mquantile regression models. Statistics and Computing 27 547570.
 Bache, Dahl and Kristensen (2013) [author] Bache, StefanHolstMiltonS., Dahl, ChristianMøllerC. Kristensen, JohannesTangJ. (2013). Headlights on tobacco road to low birthweight outcomes. Empirical Economics 44 15931633.
 Barneche et al. (2016) [author] Barneche, D. R.D. R., Kulbicki, M.M., Floeter, S. R.S. R., Friedlander, A. M.A. M. Allen, A. P.A. P. (2016). Energetic and ecological constraints on population density of reef fishes. Proceedings of the Royal Society B: Biological Sciences 283 18.
 Bates and Watts (1988) [author] Bates, Douglas MD. M. Watts, Donald GD. G. (1988). Nonlinear regression analysis and its applications. Wiley, Hoboken, NJ.
 Beets et al. (2016) [author] Beets, Michael W.M. W., Weaver, Robert GlennR. G., TurnerMcGrievy, GabrielleG., Moore, Justin B.J. B., Webster, CollinC., Brazendale, KeithK. et al. (2016). Are we there yet? Compliance with physical activity standards in YMCA afterschool programs. Childhood Obesity 12 237246.
 Blankenberg et al. (2016) [author] Blankenberg, StefanS., Salomaa, VeikkoV., Makarova, NataliyaN., Ojeda, FranciscoF., Wild, PhilippP., Lackner, Karl J.K. J. et al. (2016). Troponin I and cardiovascular risk prediction in the general population: the BiomarCaRE consortium. European Heart Journal 37 24282437.
 Canay (2011) [author] Canay, I. A.I. A. (2011). A simple approach to quantile regression for panel data. Econometrics Journal 14 368386.
 Chen (2007) [author] Chen, ColinC. (2007). A finite smoothing algorithm for quantile regression. Journal of Computational and Graphical Statistics 16 136164.
 Chen, Koenker and Xiao (2009) [author] Chen, XiaohongX., Koenker, RogerR. Xiao, ZhijieZ. (2009). Copulabased nonlinear quantile autoregression. Econometrics Journal 12 S50S67.
 Contreras and Ryan (2000) [author] Contreras, MarthaM. Ryan, Louise M.L. M. (2000). Fitting nonlinear and constrained generalized estimating equations with optimization software. Biometrics 56 12681271.
 Davidian and Giltinan (1995) [author] Davidian, M.M. Giltinan, D. M.D. M. (1995). Nonlinear models for repeated measurement data. Chapman and Hall/CRC, Boca Raton, FL.
 Davidian and Giltinan (2003) [author] Davidian, MarieM. Giltinan, David M.D. M. (2003). Nonlinear models for repeated measurement data: An overview and update. Journal of Agricultural, Biological, and Environmental Statistics 8 387.
 Degerud et al. (2014) [author] Degerud, EirikE., Løland, Kjetil HK. H., Nygård, OttarO., Midttun, ØivindØ., Ueland, Per MP. M., Seifert, ReinhardR. et al. (2014). Vitamin D status was not associated with ‘oneyear’ progression of coronary artery disease, assessed by coronary angiography in statintreated patients. European Journal of Preventive Cardiology 22 594602.
 Demidenko (2013) [author] Demidenko, E.E. (2013). Mixed models: Theory and applications with R, Second ed. John Wiley & Sons, Hoboken, NJ.
 Duffy et al. (2015) [author] Duffy, L. M.L. M., Olson, R. J.R. J., LennertCody, C. E.C. E., GalvánMagaña, F.F., BocanegraCastillo, N.N. Kuhnert, P. M.P. M. (2015). Foraging ecology of silky sharks, Carcharhinus falciformis, captured by the tuna purseseine fishery in the eastern Pacific Ocean. Marine Biology 162 571593.
 Farcomeni (2012) [author] Farcomeni, AlessioA. (2012). Quantile regression for longitudinal data based on latent Markov subjectspecific parameters. Statistics and Computing 22 141152.
 Fornaroli et al. (2015) [author] Fornaroli, R.R., Cabrini, R.R., Sartori, L.L., Marazzi, F.F., Vracevic, D.D., Mezzanotte, V.V. et al. (2015). Predicting the constraint effect of environmental characteristics on macroinvertebrate density and diversity using quantile regression mixed model. Hydrobiologia 742 153167.
 Fu and Wang (2012) [author] Fu, LiyaL. Wang, YouGanY.G. (2012). Quantile regression for longitudinal data with a working correlation model. Computational Statistics & Data Analysis 56 25262538.
 Galvao (2011) [author] Galvao, A. F.A. F. (2011). Quantile regression for dynamic panel data with fixed effects. Journal of Econometrics 164 142157.
 Galvao and MontesRojas (2010) [author] Galvao, A. F.A. F. MontesRojas, G. V.G. V. (2010). Penalized quantile regression for dynamic panel data. Journal of Statistical Planning and Inference 140 34763497.
 Geraci (2016) [author] Geraci, M.M. (2016). Qtools: A collection of models and other tools for quantile inference. R Journal 8 117138.
 Geraci and Bottai (2007) [author] Geraci, M.M. Bottai, M.M. (2007). Quantile regression for longitudinal data using the asymmetric Laplace distribution. Biostatistics 8 140154.
 Geraci and Bottai (2014) [author] Geraci, M.M. Bottai, M.M. (2014). Linear quantile mixed models. Statistics and Computing 24 461479.
 Gory, Craigmile and MacEachern (2016) [author] Gory, J. J.J. J., Craigmile, P. F.P. F. MacEachern, S. N.S. N. (2016). Marginally interpretable generalized linear mixed models. ArXiv eprints 1610.01526.
 Huang and Chen (2016) [author] Huang, YangxinY. Chen, JiaqingJ. (2016). Bayesian quantile regressionbased nonlinear mixedeffects joint models for timetoevent and longitudinal data with multiple features. Statistics in Medicine 35 56665685.
 Karlsson (2008) [author] Karlsson, A.A. (2008). Nonlinear quantile regression estimation of longitudinal data. Communications in StatisticsSimulation and Computation 37 114131.

Koenker (2004)
[author] Koenker, R.R. (2004). Quantile regression for longitudinal data. Journal of Multivariate Analysis 91 7489.
 Koenker (2016) [author] Koenker, R.R. (2016). quantreg: Quantile Regression R package version 5.29.
 Koenker and Bassett (1978) [author] Koenker, R.R. Bassett, G.G. (1978). Regression quantiles. Econometrica 46 3350.
 Koenker and Park (1996) [author] Koenker, R.R. Park, B. J.B. J. (1996). An interior point algorithm for nonlinear quantile regression. Journal of Econometrics 71 265283.
 Kwan et al. (1976) [author] Kwan, K. C.K. C., Breault, G. O.G. O., Umbenhauer, E. R.E. R., McMahon, F. G.F. G. Duggan, D. E.D. E. (1976). Kinetics of indomethacin absorption, elimination, and enterohepatic circulation in man. Journal of Pharmacokinetics and Biopharmaceutics 4 25580.
 Lamarche (2010) [author] Lamarche, CC. (2010). Robust penalized quantile regression estimation for panel data. Journal of Econometrics 157 396498.
 Lindsey (2001) [author] Lindsey, J. K.J. K. (2001). Nonlinear models for medical statistics. Oxford University Press, New York, NY.
 Lindstrom and Bates (1990) [author] Lindstrom, M. J.M. J. Bates, D. M.D. M. (1990). Nonlinear mixed effects models for repeated measures data. Biometrics 46 673687.
 Lipsitz et al. (1997) [author] Lipsitz, S. R.S. R., Fitzmaurice, G. M.G. M., Molenberghs, G.G. Zhao, L. P.L. P. (1997). Quantile regression methods for longitudinal data with dropouts: Application to CD4 cell counts of patients infected with the human immunodeficiency virus. Journal of the Royal Statistical Society C 46 463476.
 Madsen and Nielsen (1993) [author] Madsen, K.K. Nielsen, H. B.H. B. (1993). A finite smoothing algorithm for linear estimation. Siam Journal on Optimization 3 223235.
 Marino and Farcomeni (2015) [author] Marino, Maria FrancescaM. F. Farcomeni, AlessioA. (2015). Linear quantile regression models for longitudinal experiments: an overview. METRON 73 229247.
 Muggeo, Sciandra and Augugliaro (2012) [author] Muggeo, V. M. R.V. M. R., Sciandra, M.M. Augugliaro, L.L. (2012). Quantile regression via iterative least squares computations. Journal of Statistical Computation and Simulation 82 15571569.
 Muir et al. (2015) [author] Muir, P. R.P. R., Wallace, C. C.C. C., Done, T.T. Aguirre, J. D.J. D. (2015). Limited scope for latitudinal extension of reef corals. Science 348 11351138.
 Ng et al. (2014) [author] Ng, S. W.S. W., Howard, A. G.A. G., Wang, H. J.H. J., Su, C.C. Zhang, B.B. (2014). The physical activity transition among adults in China: 19912011. Obesity Reviews 15 2736.

Noufaily and Jones (2013)
[author] Noufaily, AngelaA. Jones, M. C.M. C. (2013). Parametric quantile regression based on the generalized gamma distribution. Journal of the Royal Statistical Society C 62 723740.
 Oberhofer and Haupt (2016) [author] Oberhofer, WalterW. Haupt, HarryH. (2016). Asymptotic theory for nonlinear quantile regression under weak dependence. Econometric Theory 32 686713.
 Panik (2014) [author] Panik, M. J.M. J. (2014). Growth curve modeling: Theory and applications. John Wiley and Sons, Hoboken, NJ.
 Parente and Santos Silva (2016) [author] Parente, P. M. D. C.P. M. D. C. Santos Silva, J. M. C.J. M. C. (2016). Quantile regression with clustered data. Journal of Econometric Methods 5 116.
 Patel, Geraci and CortinaBorja (2016) [author] Patel, D. E.D. E., Geraci, M.M. CortinaBorja, M.M. (2016). Modelling normative kinetic perimetry isopters using mixedeffects quantile regression. Journal of Vision 16 16.
 Patel et al. (2015) [author] Patel, D. E.D. E., Cumberland, P. M.P. M., Walters, B. C.B. C., RussellEggitt, I.I., CortinaBorja, M.M. Rahi, J. S.J. S. (2015). Study of optimal perimetric testing in children (OPTIC): Normative visual field values in children. Ophthalmology 122 17111717.
 Pinheiro and Bates (1995) [author] Pinheiro, JCJ. Bates, DMD. (1995). Approximations to the loglikelihood function in the nonlinear mixedeffects model. Journal of Computational and Graphical Statistics 4 1235.
 Pinheiro and Bates (2000) [author] Pinheiro, J. C.J. C. Bates, D. M.D. M. (2000). Mixedeffects models in S and SPLUS. Springer Verlag, New York.
 Pinheiro et al. (2014) [author] Pinheiro, J.J., Bates, D.D., DebRoy, S.S., Sarkar, D.D. R Core Team (2014). nlme: Linear and nonlinear mixed effects models R package version 3.1117. http://CRAN.Rproject.org/package=nlme.
 Powell (1994) [author] Powell, James L.J. L. (1994). Estimation of semiparametric models In Handbook of Econometrics Volume 4 41, 24432521. Elsevier.
 Reich, Bondell and Wang (2010) [author] Reich, Brian J.B. J., Bondell, Howard D.H. D. Wang, Huixia J.H. J. (2010). Flexible Bayesian quantile regression for independent and clustered data. Biostatistics 11 337352.
 Rigby and Stasinopoulos (2005) [author] Rigby, RAR. Stasinopoulos, DMD. (2005). Generalized additive models for location, scale and shape. Journal of the Royal Statistical Society C 54 507554.
 Ritz and Spiegelman (2004) [author] Ritz, JohnJ. Spiegelman, DonnaD. (2004). Equivalence of conditional and marginal regression models for clustered and longitudinal data. Statistical Methods in Medical Research 13 309323.
 R Core Team (2016) [author] R Core Team (2016). R: A Language and Environment for Statistical Computing R Foundation for Statistical Computing, Vienna, Austria.
 Vonesh et al. (2002) [author] Vonesh, E. F.E. F., Wang, H.H., Nie, L.L. Majumdar, D.D. (2002). Conditional secondorder generalized estimating equations for generalized linear and nonlinear mixedeffects models. Journal of the American Statistical Association 97 271283.
 Wang (2012) [author] Wang, JingJ. (2012). Bayesian quantile regression for parametric nonlinear mixed effects models. Statistical Methods & Applications 21 279295.
 Yang, Wang and He (2016) [author] Yang, YunwenY., Wang, Huixia JudyH. J. He, XumingX. (2016). Posterior inference in Bayesian quantile regression with asymmetric Laplace likelihood. International Statistical Review 84 327–344. doi:10.1111/insr.12114.

Yuan and Yin (2010)
[author] Yuan, Y.Y. Yin, G.G. (2010). Bayesian quantile regression for longitudinal studies with nonignorable missing data. Biometrics 66 105114.
Comments
There are no comments yet.