Harmonizing Child Mortality Data at Disparate Geographic Levels
There is an increasing focus on reducing inequalities in health outcomes in developing countries. Subnational variation is of particular interest, with geographic data used to understand the spatial risk of detrimental outcomes and to identify who is at greatest risk. While some health surveys provide observations with associated geographic coordinates, many others mask the locations of their data and report only the strata within which the data reside. How to harmonize these data sources for spatial analysis has been considered previously, but no method has been agreed upon and comparisons of the validity of the proposed methods are lacking. In this paper, we present a new method for analyzing masked survey data alongside traditional geolocated data in a way that is consistent with the data-generating process. In addition, we critique two previously proposed approaches to analyzing masked data and illustrate that they are fundamentally flawed methodologically. To validate our method, we compare it with previously formulated solutions in several realistic simulation environments in which the underlying structure of the risk field is known. We simulate samples from spatial fields in a way that mimics the sampling frame implemented in the most common health surveys in low- and middle-income countries, the DHS and MICS. In simulations, the newly proposed approach outperforms previously proposed approaches, yielding lower error while increasing the precision of estimates. The approaches are subsequently compared using child mortality data from the Dominican Republic, where our findings are reinforced. Accurately increasing the precision of child mortality estimates, and of health estimates in general, by leveraging various types of data improves our ability to implement precision public health initiatives and to better understand the landscape of geographic health inequalities.