Multilevel Models Allow Modular Specification of What and Where to Regularize, Especially in Small Area Estimation
Through the lens of multilevel model (MLM) specification and regularization, this paper is a connect-the-dots introductory summary of Small Area Estimation, e.g., small group prediction informed by a complex sampling design. Rao and Molina (2015) provide a comprehensive book-length treatment; the goal of this paper is instead to get interested researchers up to speed with some current developments. We first provide historical context for two kinds of regularization: 1) regularization 'within' the components of a predictor and 2) regularization 'between' outcome and predictor. We focus on the MLM framework because it allows the analyst to flexibly control the targets of the regularization. This flexible control is useful when analysts want to overcome shortcomings in design-based estimates. We describe the precision deficiencies (high variance) typical of design-based estimates for small groups. We then highlight an interesting MLM example from Chaudhuri and Ghosh (2011) that integrates both kinds of regularization (between and within). The key idea is to use the design-based variance to control the amount of 'between' regularization, and prior information to regularize the components 'within' a predictor. The goal is to let the design-based estimate have authority when it is precise, but defer to a model-based prediction when it is imprecise. We conclude by discussing optional criteria to incorporate into an MLM prediction and possible entry points for extensions.
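To make the 'between' regularization concrete, here is a minimal sketch of a precision-weighted blend of a design-based estimate and a model-based prediction. It assumes a simple shrinkage weight of the form model variance / (model variance + design variance), as in standard area-level models; it is not the method of Chaudhuri and Ghosh (2011). The function name `shrink_toward_model` and all numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def shrink_toward_model(direct_est, design_var, model_pred, model_var):
    """Blend direct (design-based) estimates with model predictions.

    Assumed weight: model_var / (model_var + design_var).
    A small design variance gives a weight near 1, so the direct estimate
    dominates; a large design variance gives a weight near 0, so the
    prediction defers to the model.
    """
    direct_est = np.asarray(direct_est, dtype=float)
    design_var = np.asarray(design_var, dtype=float)
    model_pred = np.asarray(model_pred, dtype=float)
    weight = model_var / (model_var + design_var)
    blended = weight * direct_est + (1.0 - weight) * model_pred
    return blended, weight

# Two hypothetical small areas with the same direct estimate and model
# prediction, but very different design-based variances.
direct_est = [10.0, 10.0]
design_var = [0.25, 4.0]   # area 1 is precisely estimated, area 2 is not
model_pred = [6.0, 6.0]

blended, weight = shrink_toward_model(direct_est, design_var, model_pred, model_var=1.0)
print(weight)
print(blended)
```

In this sketch the precisely estimated area stays close to its design-based estimate, while the imprecise area is pulled most of the way toward the model prediction.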