Are You Sure You're Sure? – Effects of Visual Representation on the Cliff Effect in Statistical Inference
Common reporting styles for statistical results, such as confidence intervals (CIs), are prone to dichotomous interpretations, especially in null hypothesis testing frameworks: for example, claiming a significant difference between drug treatment and placebo groups because the CIs of the mean effects do not overlap, while disregarding the magnitudes of, and absolute difference between, the effect sizes. Techniques relying on the visual estimation of the strength of evidence have been recommended to limit such dichotomous interpretations, but their effectiveness has been challenged. We ran two experiments to compare several alternative representations of confidence intervals, and used Bayesian multilevel models to estimate the effects of the representation styles on differences in subjective confidence in the results and on preferences among visualization styles. Our results suggest that adding visual information to the classic CI representation can decrease the sudden drop in confidence around a p-value of 0.05, compared to classic CIs and textual representation of the CI with p-values. All data analysis and scripts are available at https://github.com/helske/statvis.