Histograms lie about distribution shapes and Pearson's coefficient of variation lies about variability

11/12/2021
by Paulo S. P. Silveira, et al.

Background and Objective: Histograms and Pearson's coefficient of variation are among the most popular summary statistics. Researchers judge the shape of a quantitative data distribution by visual inspection of histograms, and take the coefficient of variation as an estimator of the relative variability of these data. We explore the properties of histograms and of the coefficient of variation through examples in R, and offer better alternatives: density plots and Eisenhauer's relative dispersion coefficient. Methods: Hypothetical examples developed in R are used to create histograms and density plots and to compute the coefficient of variation and the relative dispersion coefficient. Results: These hypothetical examples clearly show that the two traditional approaches are flawed. Histograms are incapable of reflecting the distribution of probabilities. The coefficient of variation has issues when negative and positive values occur in the same dataset, is sensitive to outliers, and is severely affected by the mean value of a distribution. Potential replacements are explained and applied for contrast. Conclusions: With modern computers and the R language, it is easy to replace histograms with density plots, which are able to approximate the theoretical probability distribution. In addition, Eisenhauer's relative dispersion coefficient is suggested as a suitable estimator of relative variability, including corrections for lower and upper bounds.
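The abstract's claims that the coefficient of variation is severely affected by the mean and has trouble with mixed-sign data can be illustrated with a minimal sketch. It is written in Python rather than the R used by the authors. The data are made up for illustration, and `coefficient_of_variation` is a hypothetical helper, not code from the paper.

```python
import math
import statistics


def coefficient_of_variation(values):
    """Pearson's CV: sample standard deviation divided by the mean.

    Returns NaN when the mean is exactly zero, where the ratio is undefined.
    """
    mean = statistics.mean(values)
    if mean == 0:
        return math.nan
    return statistics.stdev(values) / mean


# Illustrative data (an assumption, not taken from the paper).
base = [8.0, 9.0, 10.0, 11.0, 12.0]
shifted = [x + 100.0 for x in base]   # same spread, larger mean
centered = [x - 10.0 for x in base]   # same spread, mean of zero

sd_base = statistics.stdev(base)
sd_shifted = statistics.stdev(shifted)

cv_base = coefficient_of_variation(base)          # about 0.158
cv_shifted = coefficient_of_variation(shifted)    # about 0.0144
cv_centered = coefficient_of_variation(centered)  # NaN: mean is zero

print(f"sd: base={sd_base:.4f}, shifted={sd_shifted:.4f}")
print(f"cv: base={cv_base:.4f}, shifted={cv_shifted:.4f}, centered={cv_centered}")
```

All three datasets have the same standard deviation, yet the coefficient of variation drops by roughly a factor of eleven after the shift and is undefined once the values are centered on zero. This is the kind of behavior the abstract attributes to the statistic.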
