A comparison of Gap statistic definitions with and without logarithm function

03/24/2011 ∙ by Mojgan Mohajer, et al.

The Gap statistic is a standard method for determining the number of clusters in a set of data. The Gap statistic standardizes the graph of log(W_k), where W_k is the within-cluster dispersion, by comparing it to its expectation under an appropriate null reference distribution of the data. We suggest using W_k instead of log(W_k) and comparing it to the expectation of W_k under a null reference distribution. In fact, whenever a candidate number of clusters fulfills the original Gap statistic inequality, it also fulfills the inequality of a Gap statistic using W_k, but not vice versa. The two definitions of the Gap function are evaluated on several simulated data sets and on a real data set of DCE-MR images.
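The two definitions contrasted above can be illustrated with a minimal sketch. It is not the authors' implementation: the plain k-means routine, the uniform reference distribution over the bounding box of the data, and the helper names (`within_dispersion`, `kmeans`, `gap_curves`, `choose_k`) are all assumptions made for illustration. The standard-error term for the variant without the logarithm is assumed to be formed in the same way as for the original Gap statistic.

```python
import numpy as np

def within_dispersion(X, labels, k):
    """W_k: sum over clusters of squared distances to the cluster mean."""
    total = 0.0
    for j in range(k):
        pts = X[labels == j]
        if len(pts) > 0:
            total += ((pts - pts.mean(axis=0)) ** 2).sum()
    return total

def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd iterations; a stand-in for any clustering routine."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

def gap_curves(X, k_max, n_ref=10, seed=0):
    """Return (gap_log, s_log, gap_raw, s_raw) for k = 1..k_max."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    ks = range(1, k_max + 1)
    W = np.array([within_dispersion(X, kmeans(X, k, rng), k) for k in ks])
    # Assumed null reference: uniform samples over the data's bounding box.
    W_ref = np.empty((n_ref, k_max))
    for b in range(n_ref):
        R = rng.uniform(lo, hi, size=X.shape)
        W_ref[b] = [within_dispersion(R, kmeans(R, k, rng), k) for k in ks]
    scale = np.sqrt(1 + 1 / n_ref)
    gap_log = np.log(W_ref).mean(axis=0) - np.log(W)  # original definition
    gap_raw = W_ref.mean(axis=0) - W                  # definition without log
    return gap_log, np.log(W_ref).std(axis=0) * scale, gap_raw, W_ref.std(axis=0) * scale

def choose_k(gap, s):
    """Smallest k with Gap(k) >= Gap(k+1) - s(k+1); falls back to k_max."""
    for i in range(len(gap) - 1):
        if gap[i] >= gap[i + 1] - s[i + 1]:
            return i + 1
    return len(gap)
```

Calling `gap_curves(X, k_max)` on an `(n, d)` array and passing each gap/standard-error pair to `choose_k` gives the cluster count selected under each definition, which makes it easy to see on a given data set where the two criteria agree and where they differ.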





