Random Indexing K-tree

by Christopher M. de Vries, et al.

Random Indexing (RI) K-tree combines two algorithms for clustering: Random Indexing and K-tree. Many large-scale problems exist in document clustering, and RI K-tree scales well with large inputs because of its low complexity. It also has features that are useful for managing a changing collection, and it resolves earlier issues with sparse document vectors when using K-tree. The algorithms and data structures are defined, explained and motivated, and specific modifications are made to K-tree for use with RI. Experiments were run to measure cluster quality. The results indicate that RI K-tree improves document cluster quality over the original K-tree algorithm.
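
As a rough illustration of the Random Indexing half of this combination, the sketch below follows the general Random Indexing idea rather than the authors' implementation: each term gets a sparse random index vector of a few +1 and -1 entries, and a document is represented by the weighted sum of the index vectors of its terms. The result is a dense, low-dimensional vector of the kind a tree-based clusterer could consume in place of a sparse term vector. The names `index_vector` and `document_vector`, and the parameters `dim`, `nonzeros` and `seed`, are hypothetical and chosen only for this example.

```python
import random


def index_vector(term, dim=16, nonzeros=4, seed=0):
    """Return a sparse ternary index vector for a term.

    Assumption for illustration: the vector is derived deterministically
    from the term string, so the same term always maps to the same vector.
    """
    rng = random.Random(f"{seed}:{term}")  # string seeds are reproducible
    vec = [0] * dim
    positions = rng.sample(range(dim), nonzeros)
    for i, pos in enumerate(positions):
        # Alternate signs so half the non-zero entries are +1 and half -1.
        vec[pos] = 1 if i % 2 == 0 else -1
    return vec


def document_vector(term_counts, dim=16, nonzeros=4, seed=0):
    """Sum the index vectors of a document's terms, weighted by count."""
    doc = [0.0] * dim
    for term, count in term_counts.items():
        iv = index_vector(term, dim, nonzeros, seed)
        for j in range(dim):
            doc[j] += count * iv[j]
    return doc


if __name__ == "__main__":
    doc = {"random": 2, "indexing": 1, "tree": 3}
    print(document_vector(doc))
```

The dimensionality stays fixed at `dim` however many distinct terms the collection contains, which is why a new document can be projected without rebuilding existing vectors.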



Document Clustering with K-tree

This paper describes the approach taken to the XML Mining track at INEX ...

K-tree: Large Scale Document Clustering

We introduce K-tree in an information retrieval context. It is an effici...

A comparison of two suffix tree-based document clustering algorithms

Document clustering as an unsupervised approach extensively used to navi...

Document Clustering based on Topic Maps

Importance of document clustering is now widely acknowledged by research...

Document Clustering Evaluation: Divergence from a Random Baseline

Divergence from a random baseline is a technique for the evaluation of d...

A computational study of Gomory-Hu tree algorithms

We present an experimental study of algorithms for computing the Gomory-...

On the Reproducibility of Experiments of Indexing Repetitive Document Collections

This work introduces a companion reproducible paper with the aim of allo...