Document Clustering with K-tree

01/06/2010
by Christopher M. de Vries, et al.

This paper describes the approach taken to the XML Mining track at INEX 2008 by a group at the Queensland University of Technology. We introduce the K-tree clustering algorithm in an Information Retrieval context by adapting it for document clustering. Document clustering poses many large-scale problems, and K-tree scales well with large inputs due to its low complexity. It offers promising results in terms of both efficiency and quality. Document classification was completed using Support Vector Machines.

research
01/06/2010

K-tree: Large Scale Document Clustering

We introduce K-tree in an information retrieval context. It is an effici...
research
01/06/2010

Random Indexing K-tree

Random Indexing (RI) K-tree is the combination of two algorithms for clu...
research
08/28/2012

Document Clustering Evaluation: Divergence from a Random Baseline

Divergence from a random baseline is a technique for the evaluation of d...
research
12/29/2011

A comparison of two suffix tree-based document clustering algorithms

Document clustering as an unsupervised approach extensively used to navi...
research
12/01/2021

Efficient Big Text Data Clustering Algorithms using Hadoop and Spark

Document clustering is a traditional, efficient and yet quite effective,...
research
10/30/2022

Recognizing Handwriting Styles in a Historical Scanned Document Using Scikit-Fuzzy c-means Clustering

The forensic attribution of the handwriting in a digitized document to m...
research
08/02/2016

Shape and Centroid Independent Clustring Algorithm for Crowd Management Applications

Clustering techniques play an important role in data mining and its rela...
