Big data, big problems: Responding to "Are we there yet?"
Bradley et al. (arXiv:2106.05818v2), as part of an analysis of the performance of large-but-biased surveys during the COVID-19 pandemic, argue that the data defect correlation provides a useful tool to quantify the effects of sampling bias on survey results. We examine their analyses of results from the COVID-19 Trends and Impact Survey (CTIS) and show that, despite their claims, CTIS in fact performs well for its intended goals. Our examination reveals several limitations in the data defect correlation framework, including that it applies to only a single goal (population point estimation) and that it does not admit the possibility of measurement error. Through examples, we show that these limitations seriously affect the applicability of the framework for analyzing CTIS results. Through our own alternative analyses, we arrive at different conclusions, and we argue for a more expansive view of survey quality that accounts for the intended uses of the data and all sources of error, in line with the Total Survey Error framework that has been widely studied and implemented by survey methodologists.