K-tree: Large Scale Document Clustering

01/06/2010

We introduce K-tree in an information retrieval context. It is an efficient approximation of the k-means clustering algorithm. Unlike k-means, it forms a hierarchy of clusters. It has been extended to address issues with sparse representations. We compare its performance and clustering quality to CLUTO using document collections. The K-tree has a low time complexity, which makes it suitable for large document collections. Its tree structure also allows for efficient disk-based implementations when space requirements exceed the capacity of main memory.
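To make the idea concrete, here is a minimal sketch of how such a tree can be built. It is not taken from the abstract. It assumes the commonly described K-tree construction: a height-balanced tree whose nodes hold at most `order` vectors, where each new vector descends to the nearest centroid and an overflowing node is split in two with k-means (k = 2). The names `KTree`, `Node`, `two_means` and `order` are illustrative, dense Python lists stand in for document vectors, and the split ignores subtree weights for brevity.

```python
def dist2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def two_means(vectors, iters=10):
    """Split vector indices into two non-empty groups with k-means, k = 2."""
    c0 = vectors[0]
    c1 = max(vectors, key=lambda v: dist2(v, c0))  # farthest-point seeding
    half = len(vectors) // 2
    for _ in range(iters):
        g0 = [i for i, v in enumerate(vectors) if dist2(v, c0) <= dist2(v, c1)]
        g1 = [i for i in range(len(vectors)) if i not in g0]
        if not g0 or not g1:  # degenerate case (e.g. duplicates): halve arbitrarily
            return list(range(half)), list(range(half, len(vectors)))
        c0 = mean([vectors[i] for i in g0])
        c1 = mean([vectors[i] for i in g1])
    return g0, g1


class Node:
    def __init__(self, leaf):
        self.leaf = leaf
        self.keys = []      # data vectors in leaves, child centroids otherwise
        self.children = []  # child nodes (internal nodes only)
        self.counts = []    # number of data vectors beneath each key

    def size(self):
        return sum(self.counts)

    def centroid(self):
        """Count-weighted mean of the keys, i.e. the mean of all data below."""
        total = self.size()
        dim = len(self.keys[0])
        return [sum(k[d] * c for k, c in zip(self.keys, self.counts)) / total
                for d in range(dim)]


class KTree:
    def __init__(self, order=4):
        self.order = order  # maximum number of keys per node (assumed parameter)
        self.root = Node(leaf=True)

    def insert(self, vec):
        halves = self._insert(self.root, vec)
        if halves:  # the root overflowed: grow the tree by one level
            self.root = Node(leaf=False)
            for child in halves:
                self._attach(self.root, child)

    @staticmethod
    def _attach(node, child):
        node.keys.append(child.centroid())
        node.children.append(child)
        node.counts.append(child.size())

    def _insert(self, node, vec):
        if node.leaf:
            node.keys.append(list(vec))
            node.counts.append(1)
        else:
            # Follow the nearest centroid down the tree.
            i = min(range(len(node.keys)), key=lambda j: dist2(node.keys[j], vec))
            child = node.children[i]
            halves = self._insert(child, vec)
            if halves:  # the child split: replace it with its two halves
                del node.keys[i], node.children[i], node.counts[i]
                for half in halves:
                    self._attach(node, half)
            else:       # otherwise refresh the child's centroid and count
                node.keys[i] = child.centroid()
                node.counts[i] = child.size()
        return self._split(node) if len(node.keys) > self.order else None

    def _split(self, node):
        halves = []
        for idxs in two_means(node.keys):
            half = Node(node.leaf)
            for j in idxs:
                half.keys.append(node.keys[j])
                half.counts.append(node.counts[j])
                if not node.leaf:
                    half.children.append(node.children[j])
            halves.append(half)
        return halves
```

Each insertion touches only one root-to-leaf path, which is where the low per-document cost comes from in this sketch. Because a node is a small, fixed-capacity block of vectors, it also maps naturally onto a disk page.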
