Recursive Neyman Algorithm for Optimum Sample Allocation under Box Constraints on Sample Sizes in Strata
Optimal sample allocation in stratified sampling is one of the basic issues of modern survey sampling methodology. It is a procedure for dividing the total sample among pairwise disjoint subsets of a finite population, called strata, such that, for the chosen survey sampling designs in strata, it produces the smallest variance for estimating a population total (or mean) of a given study variable. In this paper we are concerned with the optimal allocation of a sample under lower and upper bounds imposed jointly on the sample strata-sizes. We consider a family of sampling designs that give rise to variances of estimators of a natural generic form. In particular, this family includes simple random sampling without replacement (abbreviated as SI) in strata, which is perhaps the most important example of a stratified sampling design. First, we identify the allocation problem as a convex optimization problem. This approach allows us to establish a generic form of the optimal solution, the so-called optimality conditions. Second, based on these optimality conditions, we propose a new and efficient recursive algorithm, named RNABOX, which solves the allocation problem considered. This new algorithm can be viewed as a generalization of the classical recursive Neyman allocation algorithm, a popular tool for optimal sample allocation in stratified sampling with SI design in all strata, when only upper bounds are imposed on sample strata-sizes. We implement RNABOX in R as part of our package stratallo, which is available from the Comprehensive R Archive Network (CRAN). Finally, in the context of the established optimality conditions, we briefly discuss two existing methodologies dedicated to the allocation problem being studied: the noptcond algorithm introduced in Gabler, Ganninger and Münnich (2012), and the fixed iteration procedures from Münnich, Sachs and Wagner (2012).
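To make the starting point concrete, the classical recursive Neyman allocation with upper bounds only (the procedure that RNABOX is said to generalize) can be sketched as follows. This is a minimal illustration in Python, not the stratallo implementation and not RNABOX itself. The function name rna_upper and the argument names n, A and M are hypothetical. It is assumed that each stratum h has a positive constant A[h] (under SI this is commonly taken as the stratum size times the stratum standard deviation of the study variable), that M[h] is the upper bound on the sample size in stratum h, and that non-integer allocations are acceptable.

```python
def rna_upper(n, A, M):
    """Sketch of recursive Neyman allocation under upper bounds only.

    n : total sample size (assumed 0 < n <= sum(M))
    A : positive per-stratum constants (assumption: N_h * S_h under SI)
    M : per-stratum upper bounds on the sample sizes
    Returns a list of (possibly non-integer) stratum sample sizes.
    """
    if len(A) != len(M):
        raise ValueError("A and M must have the same length")
    if not 0 < n <= sum(M):
        raise ValueError("n must satisfy 0 < n <= sum(M)")

    alloc = [0.0] * len(A)
    active = list(range(len(A)))   # strata not yet fixed at their upper bound
    remaining = float(n)           # sample size left for the active strata

    while active:
        total_a = sum(A[h] for h in active)
        # Proportional (Neyman) allocation restricted to the active strata.
        x = {h: remaining * A[h] / total_a for h in active}
        over = [h for h in active if x[h] > M[h]]
        if not over:
            # No bound is violated: accept the Neyman allocation and stop.
            for h in active:
                alloc[h] = x[h]
            break
        # Fix violating strata at their bounds and re-allocate the rest.
        for h in over:
            alloc[h] = float(M[h])
            remaining -= M[h]
        active = [h for h in active if h not in over]

    return alloc
```

For instance, with n = 60, A = [10, 20, 70] and M = [50, 50, 30], the first pass gives (6, 12, 42); the third stratum exceeds its bound, so it is fixed at 30 and the remaining 30 units are re-allocated proportionally over the first two strata. Handling lower bounds jointly with upper bounds is the problem that RNABOX addresses and is not covered by this sketch.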