Percentile-Based Residuals for Model Assessment
Residuals are a key component of diagnosing model fit. The usual practice is to compute standardized residuals using expected values and standard deviations of the observed data, then use these values to detect outliers and assess model fit. Approximate normality of these residuals is key for this process to have good properties, but in many modeling contexts, especially for complex, multi-level models, normality may not hold. In these cases outlier detection and model diagnostics aren't properly calibrated. Alternatively, as we demonstrate, residuals computed from the percentile location of a datum's value in its full predictive distribution lead to well calibrated evaluations of model fit. We generalize an approach described by Dunn and Smyth (1996) and evaluate properties mathematically, via case-studies and by simulation. In addition, we show that the standard residuals can be calibrated to mimic the percentile approach, but that this extra step is avoided by directly using percentile-based residuals. For both the percentile-based residuals and the calibrated standard residuals, the use of full predictive distributions with the appropriate location, spread and shape is necessary for valid assessments.
READ FULL TEXT