Nonlinear quantile mixed models

by   Marco Geraci, et al.
University of South Carolina

In regression applications, the presence of nonlinearity and correlation among observations offer computational challenges not only in traditional settings such as least squares regression, but also (and especially) when the objective function is non-smooth as in the case of quantile regression. In this paper, we develop methods for the modeling and estimation of nonlinear conditional quantile functions when data are clustered within two-level nested designs. This work represents an extension of the linear quantile mixed models of Geraci and Bottai (2014, Statistics and Computing). We develop a novel algorithm which is a blend of a smoothing algorithm for quantile regression and a second order Laplacian approximation for nonlinear mixed models. To assess the proposed methods, we present a simulation study and two applications, one in pharmacokinetics and one related to growth curve modeling in agriculture.



There are no comments yet.


page 9

page 14

page 15


Additive quantile regression for clustered data with an application to children's physical activity

Additive models are flexible regression tools that handle linear as well...

Communication-Efficient Distributed Quantile Regression with Optimal Statistical Guarantees

We address the problem of how to achieve optimal inference in distribute...

Letter to the Editor

Galarza, Lachos and Bandyopadhyay (2017) have recently proposed a method...

Bayesian Quantile Regression with Multiple Proxy Variables

Data integration has become more challenging with the emerging availabil...

Quantile Regression Modelling via Location and Scale Mixtures of Normal Distributions

We show that the estimating equations for quantile regression can be sol...

On the Unbiased Asymptotic Normality of Quantile Regression with Fixed Effects

Nonlinear panel data models with fixed individual effects provide an imp...

Predicting Conditional Quantiles via Reduction to Classification

We show how to reduce the process of predicting general order statistics...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.

1 Introduction

Quantile regression analysis of clustered data is a very active area of research. Since the seminal work of

Koenker and Bassett (1978) on methods for cross-sectional observations, there have been a number of proposals on how to accommodate for the dependency induced by clustered (e.g., longitudinal) designs. As outlined by Geraci and Bottai (2014) and then extensively reviewed by Marino and Farcomeni (2015)

, approaches to linear quantile regression with clustered data can be classified into distribution-free and likelihood-based approaches. The former include fixed effects

(Koenker, 2004; Lamarche, 2010; Galvao and Montes-Rojas, 2010; Galvao, 2011) and weighted (Lipsitz et al., 1997; Fu and Wang, 2012) approaches. The latter are mainly based on the asymmetric Laplace (AL) density (Geraci and Bottai, 2007, 2014; Yuan and Yin, 2010; Farcomeni, 2012) or other, usually flexible, parametric distributions (for example, Rigby and Stasinopoulos, 2005; Reich, Bondell and Wang, 2010; Noufaily and Jones, 2013). A different classification can be made into approaches that include cluster-specific effects (e.g., Koenker, 2004; Geraci and Bottai, 2014) and those that ignore or remove them (Lipsitz et al., 1997; Canay, 2011; Parente and Santos Silva, 2016).

In some applications, the assumption of linearity may not be appropriate. This is often the case in, for example, pharmacokinetics (Lindsey, 2001) and growth curve modeling (Panik, 2014). Contributions to statistical methods for nonlinear mean regression when data are clustered can be found in the literature of mixed-effects modeling (Lindstrom and Bates, 1990; Pinheiro and Bates, 1995, 2000) as well as generalized estimating equations (Davidian and Giltinan, 1995, 2003; Contreras and Ryan, 2000; Vonesh et al., 2002). We examined the statistical literature on parametric nonlinear quantile regression functions with clustered data (thus, in our review, we did not consider smoothing for nonparametric quantile functions). To the best of our knowledge, there seem to be only a handful of published articles. Karlsson (2008) considered nonlinear longitudinal data and proposed weighting the standard quantile regression estimator (Koenker and Bassett, 1978) with pre-specified weights. Wang (2012), taking her cue from Geraci and Bottai (2007), used the AL distribution to define the likelihood of a Bayesian nonlinear quantile regression model. Huang and Chen (2016) proposed a Bayesian joint model for time-to-event and longitudinal data. An approach based on copulas is developed by Chen, Koenker and Xiao (2009). Finally, Oberhofer and Haupt (2016) established the consistency of the -norm nonlinear quantile estimator under weak dependency. None of the cited papers provides an approach to modeling and estimation of nonlinear quantile functions with random effects in a frequentist framework. Such an approach is desirable when the correlation between measurements is modeled by means of random effects, but the parameters of interest are assumed to be fixed.

In this paper, we propose an extension of Geraci and Bottai’s (2014) linear quantile mixed model (LQMM) to the nonlinear case. In Section 2, we briefly outline the LQMM approach and, in Section 3, we introduce the nonlinear quantile mixed-effects model or nonlinear quantile mixed model (NLQMM) for short. Estimation is carried out using a novel algorithm which is a combination of a smoothing algorithm for quantile regression and a second order Laplacian approximation for nonlinear mixed models. We then carry out a simulation study in Section 4 to assess the performance of the proposed methods. In Section 5, we consider an application of NLQMM to pharmacokinetics and growth curves modeling. We conclude with some remarks in Section 6.

2 Linear quantile mixed models

Consider data from a two-level nested design in the form , for and , , where is the th row of a known matrix , is the th row of a known matrix and is the

th observation of the response vector

for the th cluster. Throughout the paper, the covariates and are assumed to be given. The vector of ones will be denoted by , the identity matrix by , and the matrix of zeros by .

In a distribution-free approach, the linear quantile regression model for clustered (or panel) data (e.g., Koenker, 2004; Abrevaya and Dahl, 2008; Bache, Dahl and Kristensen, 2013) can be specified as


where is the given quantile level, is a vector of coefficients common to all clusters, while the vector may vary with cluster. All the parameters are -specific, although the cluster-specific effects are often specified simply as pure location-shift effects (Koenker, 2004; Lamarche, 2010). Fitting can be achieved by solving the classical -norm regression problem



is the loss function and

denotes the indicator function. The penalty on the cluster-specific effects controls the variability introduced by a large number of parameters and is usually based on the -norm (Koenker, 2004; Lamarche, 2010; Bache, Dahl and Kristensen, 2013).

To mimic the minimization problem (2) in a likelihood framework, Geraci and Bottai (2014) introduced the convenient assumption that the responses , , , conditionally on a vector of random effects , independently follow the asymmetric Laplace (AL) density

with location and scale parameters given by and , respectively, which we write as

. (The third parameter of the AL is the skew parameter

which, in this model, is fixed and defines the quantile level of interest.) Also, they assumed that , for , is a random vector independent from the model’s error term with mean zero and variance-covariance matrix . Note that all the parameters are -dependent. The random effects vector depends on through the variance-covariance matrix. If we let and , the joint density based on clusters for the th linear quantile mixed model (LQMM) is given by (Geraci and Bottai, 2014)

Geraci and Bottai (2014) proposed estimating LQMMs through a combination of Gaussian quadrature and non-smooth optimization. They approximated the marginal (over the random effects) log-likelihood using the rule


with , where and , ,

, denote, respectively, the abscissas and weights of the (one-dimensional) Gaussian quadrature. In principle, one can consider different distributions for the random effects, which may be naturally linked to different quadrature rules (or penalties). For example, it is immediate to verify that the double exponential distribution confers robustness to the model and is akin to a Gauss-Laguerre quadrature

(Geraci and Bottai, 2014). Alternatively, one can avoid parametric assumptions altogether (Alfò, Salvati and Ranallli, 2016)

. Throughout this paper we assume that the random effects are normally distributed and we do not explore this issue any further. However, this assumption can be relaxed and the developments that follow can be modified as appropriate.

3 Nonlinear quantile mixed models

3.1 The model

We consider the nonlinear quantile regression function


where is a nonlinear, smooth function of the random parameter , and are two given design matrices of dimensions and , respectively, which in general contain elements of the covariates .

To stress the functional dependence of the quantiles on the fixed parameter and on the random parameter , we write . For estimation purposes, model (4) can be equivalently written as


conditionally on , where . Moreover, we assume , independently from .

Note the similarities and dissimilarities between the proposed model (5) and the traditional nonlinear mixed-effects (NLME) model

with and . First of all, conditionally on the random effects, both models impose a restriction on the error term (Powell, 1994). However, the NLME model requires , while the AL-based specification of the error given in (5) leads to or, equivalently, . Secondly, the fixed effects can be interpreted as the average value of the cluster-specific parameters, i.e. , or as the regression parameters of the ‘zero-median’ cluster, i.e. a cluster with a zero random-effect vector. However, the parameter is allowed to vary with the quantile level , while in the NLME model is not (except for a location shift). Finally, in both approaches the variance-covariance matrix of the random effects gives a measure of the variability of around but, again, estimates are allowed to differ by only for the quantile mixed-effects model.

In general, neither model (5

) nor the NLME model provide fixed parameters that can be interpreted as, respectively, regression quantiles or regression means for the population. This is because random effects are allowed to enter nonlinearly in the model. (Similarly, several generalized linear mixed models with nonlinear link functions lack marginal interpretability

(Ritz and Spiegelman, 2004; Gory, Craigmile and MacEachern, 2016).) In contrast, the fixed effects of a linear (normal) mixed model remain the same after the random effects are integrated out, whereas, in general, this is not true for the fixed effects of the LQMMs of Geraci and Bottai (2014).

3.2 Laplacian approximation

Let be the scaled variance-covariance matrix of the random effects and define , where an an unrestricted -dimensional vector, , of non-redundant parameters in . Our goal is to maximize the marginal log-likelihood


where and .

First of all, we consider the following smooth approximation of (Madsen and Nielsen, 1993; Chen, 2007):

where and is a scalar “tuning” parameter. For , we have that . A similar approximation is given by Muggeo, Sciandra and Augugliaro (2012) who claimed that their method provides a better approximation than Chen’s (2007) algorithm. However, no analytical evidence was provided in their paper to support such claim. This point might offer scope for additional investigation but, here, it represents a secondary issue and will not be discussed any further.

Let be the vector of residuals , , for the th cluster, and define the corresponding sign vector with


(Note that the notation above has been simplified since the ’s as well as the ’s should be written as functions of the ’s.) Then we have


where is an diagonal matrix with diagonal elements , and are two vectors with elements



We now define the function


which is akin to a regularized, nonlinear, weighted least-squares loss function. The gradient of with respect to is given by


where and , while the Hessian is given by


Moreover, let


be the conditional mode of . For a given value of , this can be obtained by means of penalized least-squares.

A second-order approximation of around is given by

where , , and . Since is zero at , we have the following Laplacian approximation of the log-likelihood


If we ignore the negligible contribution of the first term in expression (3.2) (Pinheiro and Bates, 1995), then only the first-order partial derivatives of are required to compute (3.2).

Since does not depend on , the log-likelihood can be profiled on leading to


where .

Estimation of the parameters can be carried out iteratively. A pseudo-code of the algorithm is given in Appendix. The algorithm requires setting the starting value of and , the tuning parameter , the tolerance for the change in the log-likelihood, and the maximum number of iterations. Moreover, the modes of the random effects can be obtained by minimizing the objective function of the penalized least-squares problem using a Gauss-Newton method. Let be the relative precision factor such that (Pinheiro and Bates, 2000). Then the function in (9) can be rewritten as



When using the asymmetric Laplace as pseudo-likelihood, inference must be restrained to point estimation (see for example Yang, Wang and He, 2016)

. Standard errors for non-random parameters can be calculated using block bootstrap, although this increases the computational cost. Bootstrap confidence intervals have been shown to have good coverage in LQMMs

(Geraci and Bottai, 2014).

4 Simulation study

In this section, we perform a simulation study to evaluate the proposed methods. We start from a setting similar to the one used in Pinheiro and Bates (1995), which is ideal for normal NLME models, and then investigate scenarios more apposite for NLQMM.

In the first scenario, we simulated the data from the following logistic model


where , , , and . The random effects are thus associated with the asymptotes ( and ) and the sigmoid’s midpoint (). Their variance-covariance matrix was defined as

In the second scenario, we used the same model (16

), but we sampled the errors from a standardized chi-squared distribution with

degrees of freedom, i.e. .

In the third scenario, we slightly changed model (16) and used


where , , , and

. Note that the error is skewed as in the second scenario but now operates within the exponential function. In this heteroscedastic model, there is only one random effect associated with the sigmoid’s midpoint.

In the fourth and last scenario, we used the biexponential model


where , , and , with parameters and .

In all scenarios, we used , , . Instances of replications are shown in Figure 1. For data sampled from models (16) and (17), we fitted mixed-effects logistic quantile functions with parameter , where

in the first 3 scenarios,

in the first and second scenarios, and in the third scenario. For data sampled from model (4), we fitted mixed-effects biexponential quantile functions with parameter , where .

Figure 1: Examples of data generated from the logistic (scenarios 1-3) and the biexponential (scenario 4) models.

For each scenario, we replicated datasets and fitted NLQMMs at three quantile levels using . Estimation was carried out by following the algorithm as described in Appendix. An attempt to maximize the approximated Laplacian log-likelihood (3.2) was made by using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm in the first instance. Upon failure of the BFGS algorithm during any iteration of the main estimation algorithm, the latter was started again and a new attempt to maximize (3.2) was made by using the Nelder-Mead algorithm. The maximum number of iterations was set to 500, while the tolerance for the relative change in the log-likelihood was set to . Between two successive iterations, the tuning parameter was multiplied by the factor . All the parameters of the optimization algorithms in optim and nlm were set at their default values. Computations were performed using the R environment for statistical computing and graphics (R Core Team, 2016) version 3.3.2 on a desktop computer with a 3.60GHz quad core i7-4790 processor and 16 gigabytes of RAM.

Before we proceed with the analysis of the results, it is important to note that, in general, the nonlinearity of the models along with the presence of the random effects pose a difficulty for establishing the ‘true’ value of for quantiles other than the median (see for example the simulation study in Karlsson, 2008), even when the errors are normal. For example, in the logistic model not only the asymptotes and , but also the midpoint and the scale change with in a rather complicated way. (An exception is given by model (17) for which the lower and upper asymptotes ( and , respectively) do not change with

.) We find solace in observing that such limitation brings out one of the advantages of quantile-based over moment-based modeling, since direct estimation of conditional quantiles does not require nontrivial manipulation of nonlinear relationships

(Demidenko, 2013, p.435). As a reference, we can consider the corresponding results from standard nonlinear quantile regression (NLRQ) (Koenker and Park, 1996) under the assumption of independent observations. Similarity of the magnitude and direction of the estimates would support the interpretation of as regression parameters of the ‘zero-median’ cluster, while comparing the variability of the estimates from NLQMM and NLRQ would inform us on whether accounting for clustering provides a gain in efficiency.

The average estimates

and standard deviations of the estimates are reported in Tables 

1-4. In summary, NLQMM estimates were close to NLRQ estimates in all scenarios. The variability of the estimates from NLQMM was either lower or close to that of the estimates from NLQR. Of all the results, perhaps those related to the quantile in the third scenario (Table 3) deserve more discussion. Both NLQMM and NLRQ clearly failed to provide meaningful estimates of the parameters. This is due to the fact that the range of the simulated values for was not wide enough to correctly estimate the upper asymptote at upper quantiles. This observation may have a particular relevance when modeling reference growth curves. Further, the estimated variance-covariance parameters (Table 5) and the predicted random effects obtained from (12) (Figures 2-5) were, in general, consistent with the parameters of the true distribution of the random effects, although, as noted before, direct comparisons are not straightforward.

We now provide basic details about the algorithm’s performance and a few recommendations. On average, it took about 7 iterations (approximately 35 seconds) to fit one model for the quantiles or , and about 6 iterations (approximately 20 seconds) for the median in the first two scenarios. In the third scenario it took between and iterations (approximately seconds on average) to fit one model for any of the three quantile levels. In the fourth scenario, the algorithm needed a similar number of iterations as in the first two scenarios but the time to convergence was, on average, twice as long. This means that, within each iteration, the number of function evaluations required by optim to fit the more complex biexponential model was greater than that needed to fit the logistic model. In the first two scenarios, the median value of the factor at the last iteration was about for all considered quantiles. In the third scenario, it was less than for the tail quantiles and for the median. In the fourth scenario, it was less than for all considered quantiles.

In a separate analysis (results not shown), the average number of iterations to convergence increased to at least 10 when was increased to . In contrast, the algorithm converged too quickly to smaller values of the log-likelihood when setting to less than 0.2. This was to be expected since controls the speed at which the smoothing parameter approaches zero. As in Chen (2007), we recommend using in most situations.

Further, in the first three scenarios the average number of iterations and the values of the estimates were not particularly sensitive to the specific algorithm used for optimizing the log-likelihood, although the BFGS algorithm did fail to converge in about of the replications, more often when estimating tail quantiles () rather than when estimating the median (). In contrast, BFGS never failed to converge in the fourth scenario. We then ran a separate analysis (results not shown) in which biexponential models were fitted exclusively using Nelder-Mead. For , estimates were unreasonable. We recommend using BFGS as default optimization algorithm.

68.00 (0.73) 12.66 (0.40) 3.06 (0.10) 8.79 (0.34)
70.23 (0.38) 9.99 (0.28) 3.04 (0.05) 9.70 (0.22)
73.47 (0.56) 7.28 (0.38) 3.15 (0.10) 9.76 (0.52)
68.47 (2.76) 12.77 (0.58) 3.04 (0.27) 9.54 (0.53)
69.79 (0.92) 9.99 (0.33) 3.00 (0.18) 10.11 (0.71)
72.26 (0.85) 7.19 (0.53) 3.11 (0.32) 9.53 (2.75)
Table 1: Estimates of the fixed effects from nonlinear quantile mixed-effects regression (NLQMM) and from nonlinear quantile regression (NLRQ) with for the first scenario. The estimates are averaged over 500 replications and the standard deviations are reported in brackets.
69.28 (0.68) 12.70 (0.40) 3.05 (0.08) 10.25 (0.22)
71.39 (0.35) 9.97 (0.27) 3.05 (0.05) 10.64 (0.19)
74.67 (0.55) 7.26 (0.38) 3.15 (0.10) 10.98 (0.58)
69.67 (2.53) 12.77 (0.55) 3.03 (0.24) 10.80 (0.46)
71.03 (0.90) 9.98 (0.31) 3.01 (0.18) 11.25 (0.72)
73.49 (0.82) 7.17 (0.54) 3.11 (0.32) 10.73 (2.86)
Table 2: Estimates of the fixed effects from nonlinear quantile mixed-effects regression (NLQMM) and from nonlinear quantile regression (NLRQ) with for the second scenario. The estimates are averaged over 500 replications and the standard deviations are reported in brackets.
0.95 (0.06) 4.18 (0.20) 0.92 (0.08) 0.00 (0.00)
1.00 (0.03) 3.19 (0.08) 0.85 (0.05) 0.00 (0.01)
0.25 (0.93) 2.09 (1.34) 1.16 (1.40) 1.11 (0.36)
1.01 (0.03) 3.96 (0.06) 1.01 (0.02) 0.00 (0.00)
1.00 (0.04) 3.18 (0.10) 0.87 (0.06) 0.00 (0.01)
0.48 (0.57) 1.24 (1.65) 2.19 (0.98) 0.40 (0.22)
Table 3: Estimates of the fixed effects from nonlinear quantile mixed-effects regression (NLQMM) and from nonlinear quantile regression (NLRQ) with for the third scenario. The estimates are averaged over 500 replications and the standard deviations are reported in brackets.
1.93 (0.19) 1.06 (0.12) 0.46 (0.13) ()
2.04 (0.15) 0.72 (0.10) 1.03 (0.11) 3.19 (0.17)
2.05 (0.17) 0.61 (0.15) 1.82 (0.12) 2.48 (0.11)
1.86 (0.26) 0.98 (0.19) 0.61 (0.14) ()
2.04 (0.18) 0.70 (0.14) 1.01 (0.13) 3.27 (0.24)
2.17 (0.21) 0.61 (0.19) 1.55 (0.14) 2.32 (1.89)
Table 4: Estimates of the fixed effects from nonlinear quantile mixed-effects regression (NLQMM) and from nonlinear quantile regression (NLRQ) with for the fourth scenario. The estimates are averaged over 500 replications and the standard deviations are reported in brackets.
First scenario
1.73 0.97 1.68
3.30 1.83 3.19
6.61 3.67 6.36
Second scenario
1.76 0.99 1.73
3.34 1.85 3.25
6.62 3.65 6.38
Third scenario
0.01 0.04 0.00
Fourth scenario
0.16 0.22 0.04
0.24 0.25 0.08
0.16 0.18 0.17
0.09 0.17 0.01
Table 5: Estimates of the variance-covariance parameters from nonlinear quantile mixed-effects regression with for all scenarios. The estimates are averaged over 500 replications.
Figure 2: Boxplots of random effects predicted from the median model for 100 clusters based on 500 replications from the first scenario. The dashed red lines mark the 25th, 50th, and 75th percentiles of the true distribution of the random effects.
Figure 3: Boxplots of random effects predicted from the median model for 100 clusters based on 500 replications from the second scenario. The dashed red lines mark the 25th, 50th, and 75th percentiles of the true distribution of the random effects.
Figure 4: Boxplots of random effects predicted from the median model for 100 clusters based on 500 replications from the third scenario. The dashed red lines mark the 25th, 50th, and 75th percentiles of the true distribution of the random effects.
Figure 5: Boxplots of random effects predicted from the median model for 100 clusters based on 500 replications from the fourth scenario. The dashed red lines mark the 25th, 50th, and 75th percentiles of the true distribution of the random effects.

5 Applications

5.1 Pharmacokinetics

Figure 6: Boxplots of indomethacin concentration by measurement occasion (left) and fitted biexponential curves at the 10th, 50th and 90th centiles of drug concentration conditional on time since injection (right).

We begin with the analysis of a dataset taken from an old pharmacokinetics study (Kwan et al., 1976), often used as a toy example in nonlinear regression modeling (Davidian and Giltinan, 1995; Pinheiro and Bates, 2000). The data consists of 11 measurements of plasma concentrations of indomethacin which was injected intravenously in 6 subjects. We used the biexponential model

where denotes the th measurement of drug concentration (mcg/ml), , on the th subject, , and is the time (hr) of the measurement since injection (given that the dataset is balanced, does not depend on ). We modeled the variance-covariance of the random effects using the diagonal matrix (variance components). Note that the regression model above includes 3 random effects, one for each of the first 3 fixed effects. In a separate analysis (results not shown), we found that the random effect associated with , , had near-zero variance (see also Pinheiro and Bates, 2000, p.283).

In this two-compartment model, the first exponential term determines the initial, rapidly declining distribution phase of the drug. The elimination of the drug is the predominant process during the second phase and is primarily determined by the second exponential term. Besides the average rates at which the drug is distributed and then eliminated, it might be of interest to assess the change over time of concentrations that are higher or lower than the mean. The left plot of Figure 6 shows the boxplots of indomethacin concentration at each measurement occasion. It appears that the scale and possibly even the shape of the distribution are changing over time. This would mean that the rates are not similar across the quantiles of the conditional distribution.

2.31 (0.48) 0.99 (0.16) 0.30 (0.13) 1.19 (0.57)
2.55 (0.28) 0.58 (0.19) 0.44 (0.17) 1.33 (0.23)
3.73 (0.52) 0.75 (0.35) 0.69 (0.34) 1.49 (0.37)
NLME 2.83 (0.26) 0.77 (0.11) 0.46 (0.11) 1.35 (0.23)
Table 6: Estimates of the fixed effects (standard errors) from three biexponential quantile mixed-effects models with and from the normal nonlinear mixed-effects model (NLME) using the Indomethacin Data. Standard errors for quantile regression estimates are based on 200 bootstrap replications. Bold denotes statistically significant at the level.
0.78 0.06 0.02
0.59 0.08 0.02
1.34 0.05 0.06
NLME 0.33 0.03 0.01
Table 7: Estimates of the variance components from three biexponential quantile mixed-effects models with and from the normal nonlinear mixed-effects model (NLME) using the Indomethacin Data.

The fitted biexponential curves for are given in the right plot of Figure 6, while estimates of the fixed effects and their standard errors are reported in Table 6. The average rate (NLME) at which the drug decreases during the distribution phase was comparable to that of the 90th centile. However, the decrease was about faster at the lower 10th centile but about slower at the median as compared to the mean. During the second phase, the rate of decrease was, again, greatest at the 10th centile. However, the average rate was similar to that of the median and greater than that of the 90th centile. One implication is that the distribution of the response becomes increasingly right-skewed as time passes. This is an advantage of NLQMM over NLME as the latter cannot obviously model changes in the shape of the distribution.

Finally, there was substantial heterogeneity among subjects regarding starting concentration levels during the distribution phase, especially at the 90th centile (Table 7).

5.2 Growth curves

In this section, we analyze data on growth patterns of two genotypes of soybeans: Plant Introduction (P), an experimental strain, and Forrest (F), a commercial variety (Davidian and Giltinan, 1995)

. The response variable is the average leaf weight of 6 plants chosen at random from each experimental plot and measured at approximately weekly intervals, between two and eleven weeks after planting. The experiment was carried out over three different planting years: 1988, 1989, and 1990. Eight plots were planted with each genotype in each planting year, giving a total of 48 plots in the study

(Pinheiro and Bates, 2000).

Figure 7: Observed growth curves of soybean plants. Each line represents the average leaf weight per plant in each experimental plot. Curves are grouped by variety (left) or by year (right).

Figure 7 shows the temporal trajectories of the average leaf weight for individual plots. It is apparent that the experimental strain yielded heavier leaves that the F variety, at least on average. There also seem to be differences between planting years, with a wider spread of the curves in 1989. Given that previous analyses of these data focused on the average growth curves (Davidian and Giltinan, 1995; Pinheiro and Bates, 2000), we set out to investigate growth in the tails of the distribution. For our analysis, we used the same logistic model as that in Pinheiro and Bates (2000, p.293) which was selected over a number of alternative models, that is

where denotes the average leaf weight (g) observed on occasion , , in the th plot, , and is the time (days) of the measurement after planting. The number of measurements per plot ranged between 8 and 10. The design matrices of the parameter were given by

and . Thus, is a vector. The covariates in the

matrix are dummy variables for year of planting,

and , and genotype, . The baseline is represented by year 1988 and variety F. The only random effect included in the model was associated with the asymptote.

The three plots in Figure 8 show the th centile, th centile, and mean predicted growth curves by variety and planting year, while the estimates and standard errors of the fixed effects are reported in Table 8. For the sake of brevity, we confine our discussion to the genotypic effect on the asymptote. In 1988, the experimental strain had an advantage over the commercial variety but only at the 95th centile of the leaf weight distribution, with an estimated asymptote difference of g. In the following year, there was a statistically significant interaction between variety and year at the 5th centile, corresponding to an estimated overall effect equal to g. The estimated overall effect of variety P on the asymptote at the 95th centile was 10.67 g, thus greater than the effect at the 5th centile and at the mean (7.19 g). Finally, the estimated differences between asymptotes of the growth curves for the experimental and commercial strains in 1990 were negligible: 0.58 g (5th centile), 2.81 g (95th centile) and 0.77 g (mean). In summary, the experimental strain did yield heavier leaves than the F variety, not only in 1989 as estimated by NLME, but also in 1988, and the magnitude of the genotypic effect was dependent on the quantile level.

Figure 8: Fitted logistic growth curves of soybean plants at the 5th centile (left), 95th centile (center), and at the mean.
17.49 (1.47) 21.43 (2.34) 19.43 (0.95)
7.99 (1.53) 7.02 (2.30) 8.84 (1.07)
0.66 (2.06) 1.67 (2.49) 3.71 (1.18)
1.64 (2.01) 6.31 (1.99) 1.62 (1.04)
8.59 (1.93) 4.36 (2.41) 5.57 (1.17)
2.22 (2.05) 3.50 (2.01) 0.15 (1.18)
56.16 (1.13) 53.71 (2.57) 54.81 (0.75)
3.30 (2.11) 0.86 (2.85) 2.24 (0.97)
1.94 (2.48) 3.14 (2.79) 4.97 (0.97)
2.50 (1.70) 0.51 (0.97) 1.30 (0.41)
8.11 (0.32) 8.63 (0.79) 8.06 (0.15)
0.29 (0.51) 0.76 (0.85) 0.90 (0.20)
0.40 (0.49) 0.44 (0.91) 0.67 (0.21)
Table 8: Estimates of the fixed effects (standard errors) from two logistic quantile mixed-effects models with and from the normal nonlinear mixed-effects model (NLME) using the Soybean Data. Standard errors for quantile regression estimates are based on 200 bootstrap replications. Bold denotes statistically significant at the level.

6 Discussion

Mixed-effects modeling has a long tradition in statistical applications. There is a vast number of applications of mixed models to the analysis of clustered data in the social, life and physical sciences (Pinheiro and Bates, 2000; Demidenko, 2013). Linear quantile mixed models (Geraci and Bottai, 2007, 2014) have too been used in a wide range of research areas, including marine biology (Muir et al., 2015; Duffy et al., 2015; Barneche et al., 2016), environmental science (Fornaroli et al., 2015), cardiovascular disease (Degerud et al., 2014; Blankenberg et al., 2016), physical activity (Ng et al., 2014; Beets et al., 2016), and ophthalmology (Patel et al., 2015; Patel, Geraci and Cortina-Borja, 2016). The present paper provides a novel and valuable contribution to the modeling of nonlinear quantile functions which broadens the applicability of quantile regression for clustered data.

NLQMMs represent a flexible alternative to nonlinear mixed models for the mean as they allow direct estimation of conditional quantile functions without imposing normal assumptions on the errors. As shown in two real data examples, NLQMMs reveal nonlinear relationships that may be quantitatively and qualitatively different at different quantiles. Also, changes in location, scale, and shape of the response distribution determined by the covariates are naturally brought into light by examining central and tail quantiles (Geraci, 2016). As compared to nonlinear quantile regression for independent data, our nonlinear estimators are more efficient and they provide additional information about the heterogeneity among clusters.


This research has been supported by the National Institutes of Health – National Institute of Child Health and Human Development (Grant Number: 1R03HD084807-01A1).


The estimation algorithm for NLQMM is based on a set of decreasing values of . This optimization approach has the appealing advantage of reducing the original problem to an approximated problem. The pseudo-code is given below.

Smoothing Algorithm with Laplacian Approximation for Nonlinear Quantile Mixed Models

  • Set the maximum number of iterations ; the factor for reducing the tuning parameter ; the tolerance for the change in the log-likelihood; and . Estimate the starting values as follows:

    • obtain an estimate for using nonlinear quantile regression (Koenker and Park, 1996). See, for example, the function nlrq in the R package quantreg (Koenker, 2016) which supports self-starting models such as SSlogis. If the nonlinear quantile regression algorithm fails, consider the estimate of the fixed effects from the NLME model in step (1.b) below or, if the latter fails too, a standard nonlinear least squares estimate (Bates and Watts, 1988);

    • obtain an estimate for from an NLME model. See, for example, the function nlme in the R package (Pinheiro et al., 2014). If the NLME algorithm fails, provide an arbitrary value ;

    • obtain an estimate for . For example, this can be estimated as the mean of the absolute residuals from step (1.a) above;

    • provide a starting value (see, for example, Chen, 2007, p.143);

    • using , , and , solve the penalized least-squares problem (12) to obtain , . See, for example, the R function nlm.

  • While

    • Update by minimizing (3.2) (or (14)). See, for example, the R function optim.

    • If the change in the log-likelihood is smaller than a given tolerance

      • then return ;

      • else set ; ; ; go to step (2.a).

  • Update and , .


  • Abrevaya and Dahl (2008) [author] Abrevaya, J.J. Dahl, C. M.C. M. (2008). The effects of birth inputs on birthweight: Evidence from quantile estimation on panel data. Journal of Business and Economic Statistics 26 379-397.
  • Alfò, Salvati and Ranallli (2016) [author] Alfò, MarcoM., Salvati, NicolaN. Ranallli, M. GiovannaM. G. (2016). Finite mixtures of quantile and M-quantile regression models. Statistics and Computing 27 547-570.
  • Bache, Dahl and Kristensen (2013) [author] Bache, StefanHolstMiltonS., Dahl, ChristianMøllerC. Kristensen, JohannesTangJ. (2013). Headlights on tobacco road to low birthweight outcomes. Empirical Economics 44 1593-1633.
  • Barneche et al. (2016) [author] Barneche, D. R.D. R., Kulbicki, M.M., Floeter, S. R.S. R., Friedlander, A. M.A. M. Allen, A. P.A. P. (2016). Energetic and ecological constraints on population density of reef fishes. Proceedings of the Royal Society B: Biological Sciences 283 1-8.
  • Bates and Watts (1988) [author] Bates, Douglas MD. M. Watts, Donald GD. G. (1988). Nonlinear regression analysis and its applications. Wiley, Hoboken, NJ.
  • Beets et al. (2016) [author] Beets, Michael W.M. W., Weaver, Robert GlennR. G., Turner-McGrievy, GabrielleG., Moore, Justin B.J. B., Webster, CollinC., Brazendale, KeithK. et al. (2016). Are we there yet? Compliance with physical activity standards in YMCA afterschool programs. Childhood Obesity 12 237-246.
  • Blankenberg et al. (2016) [author] Blankenberg, StefanS., Salomaa, VeikkoV., Makarova, NataliyaN., Ojeda, FranciscoF., Wild, PhilippP., Lackner, Karl J.K. J. et al. (2016). Troponin I and cardiovascular risk prediction in the general population: the BiomarCaRE consortium. European Heart Journal 37 2428-2437.
  • Canay (2011) [author] Canay, I. A.I. A. (2011). A simple approach to quantile regression for panel data. Econometrics Journal 14 368-386.
  • Chen (2007) [author] Chen, ColinC. (2007). A finite smoothing algorithm for quantile regression. Journal of Computational and Graphical Statistics 16 136-164.
  • Chen, Koenker and Xiao (2009) [author] Chen, XiaohongX., Koenker, RogerR. Xiao, ZhijieZ. (2009). Copula-based nonlinear quantile autoregression. Econometrics Journal 12 S50-S67.
  • Contreras and Ryan (2000) [author] Contreras, MarthaM. Ryan, Louise M.L. M. (2000). Fitting nonlinear and constrained generalized estimating equations with optimization software. Biometrics 56 1268-1271.
  • Davidian and Giltinan (1995) [author] Davidian, M.M. Giltinan, D. M.D. M. (1995). Nonlinear models for repeated measurement data. Chapman and Hall/CRC, Boca Raton, FL.
  • Davidian and Giltinan (2003) [author] Davidian, MarieM. Giltinan, David M.D. M. (2003). Nonlinear models for repeated measurement data: An overview and update. Journal of Agricultural, Biological, and Environmental Statistics 8 387.
  • Degerud et al. (2014) [author] Degerud, EirikE., Løland, Kjetil HK. H., Nygård, OttarO., Midttun, ØivindØ., Ueland, Per MP. M., Seifert, ReinhardR. et al. (2014). Vitamin D status was not associated with ‘one-year’ progression of coronary artery disease, assessed by coronary angiography in statin-treated patients. European Journal of Preventive Cardiology 22 594-602.
  • Demidenko (2013) [author] Demidenko, E.E. (2013). Mixed models: Theory and applications with R, Second ed. John Wiley & Sons, Hoboken, NJ.
  • Duffy et al. (2015) [author] Duffy, L. M.L. M., Olson, R. J.R. J., Lennert-Cody, C. E.C. E., Galván-Magaña, F.F., Bocanegra-Castillo, N.N. Kuhnert, P. M.P. M. (2015). Foraging ecology of silky sharks, Carcharhinus falciformis, captured by the tuna purse-seine fishery in the eastern Pacific Ocean. Marine Biology 162 571-593.
  • Farcomeni (2012) [author] Farcomeni, AlessioA. (2012). Quantile regression for longitudinal data based on latent Markov subject-specific parameters. Statistics and Computing 22 141-152.
  • Fornaroli et al. (2015) [author] Fornaroli, R.R., Cabrini, R.R., Sartori, L.L., Marazzi, F.F., Vracevic, D.D., Mezzanotte, V.V. et al. (2015). Predicting the constraint effect of environmental characteristics on macroinvertebrate density and diversity using quantile regression mixed model. Hydrobiologia 742 153-167.
  • Fu and Wang (2012) [author] Fu, LiyaL. Wang, You-GanY.-G. (2012). Quantile regression for longitudinal data with a working correlation model. Computational Statistics & Data Analysis 56 2526-2538.
  • Galvao (2011) [author] Galvao, A. F.A. F. (2011). Quantile regression for dynamic panel data with fixed effects. Journal of Econometrics 164 142-157.
  • Galvao and Montes-Rojas (2010) [author] Galvao, A. F.A. F. Montes-Rojas, G. V.G. V. (2010). Penalized quantile regression for dynamic panel data. Journal of Statistical Planning and Inference 140 3476-3497.
  • Geraci (2016) [author] Geraci, M.M. (2016). Qtools: A collection of models and other tools for quantile inference. R Journal 8 117-138.
  • Geraci and Bottai (2007) [author] Geraci, M.M. Bottai, M.M. (2007). Quantile regression for longitudinal data using the asymmetric Laplace distribution. Biostatistics 8 140-154.
  • Geraci and Bottai (2014) [author] Geraci, M.M. Bottai, M.M. (2014). Linear quantile mixed models. Statistics and Computing 24 461-479.
  • Gory, Craigmile and MacEachern (2016) [author] Gory, J. J.J. J., Craigmile, P. F.P. F. MacEachern, S. N.S. N. (2016). Marginally interpretable generalized linear mixed models. ArXiv e-prints 1610.01526.
  • Huang and Chen (2016) [author] Huang, YangxinY. Chen, JiaqingJ. (2016). Bayesian quantile regression-based nonlinear mixed-effects joint models for time-to-event and longitudinal data with multiple features. Statistics in Medicine 35 5666-5685.
  • Karlsson (2008) [author] Karlsson, A.A. (2008). Nonlinear quantile regression estimation of longitudinal data. Communications in Statistics-Simulation and Computation 37 114-131.
  • Koenker (2004)

    [author] Koenker, R.R. (2004). Quantile regression for longitudinal data. Journal of Multivariate Analysis 91 74-89.

  • Koenker (2016) [author] Koenker, R.R. (2016). quantreg: Quantile Regression R package version 5.29.
  • Koenker and Bassett (1978) [author] Koenker, R.R. Bassett, G.G. (1978). Regression quantiles. Econometrica 46 33-50.
  • Koenker and Park (1996) [author] Koenker, R.R. Park, B. J.B. J. (1996). An interior point algorithm for nonlinear quantile regression. Journal of Econometrics 71 265-283.
  • Kwan et al. (1976) [author] Kwan, K. C.K. C., Breault, G. O.G. O., Umbenhauer, E. R.E. R., McMahon, F. G.F. G. Duggan, D. E.D. E. (1976). Kinetics of indomethacin absorption, elimination, and enterohepatic circulation in man. Journal of Pharmacokinetics and Biopharmaceutics 4 255-80.
  • Lamarche (2010) [author] Lamarche, CC. (2010). Robust penalized quantile regression estimation for panel data. Journal of Econometrics 157 396-498.
  • Lindsey (2001) [author] Lindsey, J. K.J. K. (2001). Nonlinear models for medical statistics. Oxford University Press, New York, NY.
  • Lindstrom and Bates (1990) [author] Lindstrom, M. J.M. J. Bates, D. M.D. M. (1990). Nonlinear mixed effects models for repeated measures data. Biometrics 46 673-687.
  • Lipsitz et al. (1997) [author] Lipsitz, S. R.S. R., Fitzmaurice, G. M.G. M., Molenberghs, G.G. Zhao, L. P.L. P. (1997). Quantile regression methods for longitudinal data with drop-outs: Application to CD4 cell counts of patients infected with the human immunodeficiency virus. Journal of the Royal Statistical Society C 46 463-476.
  • Madsen and Nielsen (1993) [author] Madsen, K.K. Nielsen, H. B.H. B. (1993). A finite smoothing algorithm for linear estimation. Siam Journal on Optimization 3 223-235.
  • Marino and Farcomeni (2015) [author] Marino, Maria FrancescaM. F. Farcomeni, AlessioA. (2015). Linear quantile regression models for longitudinal experiments: an overview. METRON 73 229-247.
  • Muggeo, Sciandra and Augugliaro (2012) [author] Muggeo, V. M. R.V. M. R., Sciandra, M.M. Augugliaro, L.L. (2012). Quantile regression via iterative least squares computations. Journal of Statistical Computation and Simulation 82 1557-1569.
  • Muir et al. (2015) [author] Muir, P. R.P. R., Wallace, C. C.C. C., Done, T.T. Aguirre, J. D.J. D. (2015). Limited scope for latitudinal extension of reef corals. Science 348 1135-1138.
  • Ng et al. (2014) [author] Ng, S. W.S. W., Howard, A. G.A. G., Wang, H. J.H. J., Su, C.C. Zhang, B.B. (2014). The physical activity transition among adults in China: 1991-2011. Obesity Reviews 15 27-36.
  • Noufaily and Jones (2013)

    [author] Noufaily, AngelaA. Jones, M. C.M. C. (2013). Parametric quantile regression based on the generalized gamma distribution. Journal of the Royal Statistical Society C 62 723-740.

  • Oberhofer and Haupt (2016) [author] Oberhofer, WalterW. Haupt, HarryH. (2016). Asymptotic theory for nonlinear quantile regression under weak dependence. Econometric Theory 32 686-713.
  • Panik (2014) [author] Panik, M. J.M. J. (2014). Growth curve modeling: Theory and applications. John Wiley and Sons, Hoboken, NJ.
  • Parente and Santos Silva (2016) [author] Parente, P. M. D. C.P. M. D. C. Santos Silva, J. M. C.J. M. C. (2016). Quantile regression with clustered data. Journal of Econometric Methods 5 1-16.
  • Patel, Geraci and Cortina-Borja (2016) [author] Patel, D. E.D. E., Geraci, M.M. Cortina-Borja, M.M. (2016). Modelling normative kinetic perimetry isopters using mixed-effects quantile regression. Journal of Vision 16 1-6.
  • Patel et al. (2015) [author] Patel, D. E.D. E., Cumberland, P. M.P. M., Walters, B. C.B. C., Russell-Eggitt, I.I., Cortina-Borja, M.M. Rahi, J. S.J. S. (2015). Study of optimal perimetric testing in children (OPTIC): Normative visual field values in children. Ophthalmology 122 1711-1717.
  • Pinheiro and Bates (1995) [author] Pinheiro, JCJ. Bates, DMD. (1995). Approximations to the log-likelihood function in the nonlinear mixed-effects model. Journal of Computational and Graphical Statistics 4 12-35.
  • Pinheiro and Bates (2000) [author] Pinheiro, J. C.J. C. Bates, D. M.D. M. (2000). Mixed-effects models in S and S-PLUS. Springer Verlag, New York.
  • Pinheiro et al. (2014) [author] Pinheiro, J.J., Bates, D.D., DebRoy, S.S., Sarkar, D.D. R Core Team (2014). nlme: Linear and nonlinear mixed effects models R package version 3.1-117.
  • Powell (1994) [author] Powell, James L.J. L. (1994). Estimation of semiparametric models In Handbook of Econometrics Volume 4 41, 2443-2521. Elsevier.
  • Reich, Bondell and Wang (2010) [author] Reich, Brian J.B. J., Bondell, Howard D.H. D. Wang, Huixia J.H. J. (2010). Flexible Bayesian quantile regression for independent and clustered data. Biostatistics 11 337-352.
  • Rigby and Stasinopoulos (2005) [author] Rigby, RAR. Stasinopoulos, DMD. (2005). Generalized additive models for location, scale and shape. Journal of the Royal Statistical Society C 54 507-554.
  • Ritz and Spiegelman (2004) [author] Ritz, JohnJ. Spiegelman, DonnaD. (2004). Equivalence of conditional and marginal regression models for clustered and longitudinal data. Statistical Methods in Medical Research 13 309-323.
  • R Core Team (2016) [author] R Core Team (2016). R: A Language and Environment for Statistical Computing R Foundation for Statistical Computing, Vienna, Austria.
  • Vonesh et al. (2002) [author] Vonesh, E. F.E. F., Wang, H.H., Nie, L.L. Majumdar, D.D. (2002). Conditional second-order generalized estimating equations for generalized linear and nonlinear mixed-effects models. Journal of the American Statistical Association 97 271-283.
  • Wang (2012) [author] Wang, JingJ. (2012). Bayesian quantile regression for parametric nonlinear mixed effects models. Statistical Methods & Applications 21 279-295.
  • Yang, Wang and He (2016) [author] Yang, YunwenY., Wang, Huixia JudyH. J. He, XumingX. (2016). Posterior inference in Bayesian quantile regression with asymmetric Laplace likelihood. International Statistical Review 84 327–344. doi:10.1111/insr.12114.
  • Yuan and Yin (2010)

    [author] Yuan, Y.Y. Yin, G.G. (2010). Bayesian quantile regression for longitudinal studies with nonignorable missing data. Biometrics 66 105-114.