A comparison of Gap statistic definitions with and without logarithm function

03/24/2011 ∙ by Mojgan Mohajer, et al.

The Gap statistic is a standard method for determining the number of clusters in a data set. It standardizes the graph of log(W_k), where W_k is the within-cluster dispersion, by comparing it to its expectation under an appropriate null reference distribution of the data. We suggest using W_k instead of log(W_k) and comparing it to the expectation of W_k under a null reference distribution. In fact, whenever a candidate number of clusters fulfills the original Gap statistic inequality, it also fulfills the inequality of a Gap statistic using W_k, but not vice versa. The two definitions of the Gap function are evaluated on several simulated data sets and on a real data set of DCE-MR images.
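
The two Gap definitions compared in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not state: W_k is taken to be the pooled sum of squared Euclidean distances to the cluster means, the expectation under the null reference distribution is approximated by an average over W_k values from a handful of reference data sets, and the function names (`within_dispersion`, `gap_log`, `gap_plain`) are hypothetical. The numbers in the example are made up for illustration and are not results from the paper.

```python
import math
from statistics import mean


def within_dispersion(points, labels):
    """W_k, assumed here to be the pooled sum of squared Euclidean
    distances from each point to the mean of its cluster."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    total = 0.0
    for members in clusters.values():
        # Coordinate-wise mean of the cluster members.
        centre = [mean(coords) for coords in zip(*members)]
        for point in members:
            total += sum((x - c) ** 2 for x, c in zip(point, centre))
    return total


def gap_log(w_obs, w_ref):
    """Gap with logarithm: average of log(W_k) over the reference
    data sets minus log(W_k) of the observed data."""
    return mean(math.log(w) for w in w_ref) - math.log(w_obs)


def gap_plain(w_obs, w_ref):
    """Gap without logarithm: average of W_k over the reference
    data sets minus W_k of the observed data."""
    return mean(w_ref) - w_obs


# Illustrative data: two tight groups on a line, clustered with k = 2.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 0.0), (12.0, 0.0)]
labels = [0, 0, 1, 1]
w_obs = within_dispersion(points, labels)

# Made-up W_k values standing in for clusterings of reference data sets.
w_ref = [10.0, 14.0]

print("W_k:", w_obs)
print("Gap with log:", gap_log(w_obs, w_ref))
print("Gap without log:", gap_plain(w_obs, w_ref))
```

Both functions take the same inputs, so the two definitions can be evaluated side by side for each candidate number of clusters; the only difference is whether the dispersions are log-transformed before averaging and subtracting.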
