Which Clustering Do You Want? Inducing Your Ideal Clustering with Minimal Feedback

01/16/2014
by Sajib Dasgupta, et al.

While traditional research on text clustering has largely focused on grouping documents by topic, a user may want to cluster documents along other dimensions, such as the author's mood, gender, age, or sentiment. Without knowing the user's intention, a clustering algorithm will only group documents along the most prominent dimension, which may not be the one the user desires. To address the problem of clustering documents along the user-desired dimension, previous work has focused on learning a similarity metric from data manually annotated with the user's intention, or on having a human construct a feature space interactively during the clustering process. With the goal of reducing the reliance on human knowledge that these approaches require for fine-tuning the similarity function or selecting the relevant features, we propose a novel active clustering algorithm, which allows a user to easily select the dimension along which she wants to cluster the documents by inspecting only a small number of words. We demonstrate the viability of our algorithm on a variety of commonly used sentiment datasets.

research
04/15/2021

Vec2GC – A Graph Based Clustering Method for Text Representations

NLP pipelines with limited or no labeled data, rely on unsupervised meth...
research
07/18/2017

Discovering topics in text datasets by visualizing relevant words

When dealing with large collections of documents, it is imperative to qu...
research
06/19/2015

Representation Learning for Clustering: A Statistical Framework

We address the problem of communicating domain knowledge from a user to ...
research
08/17/2012

Content-based Text Categorization using Wikitology

A major computational burden, while performing document clustering, is t...
research
10/09/2009

Color Image Clustering using Block Truncation Algorithm

With the advancement in image capturing device, the image data been gene...
research
10/03/2021

Subtractive mountain clustering algorithm applied to a chatbot to assist elderly people in medication intake

Errors in medication intake among elderly people are very common. One of...
research
07/04/2012

Two-Way Latent Grouping Model for User Preference Prediction

We introduce a novel latent grouping model for predicting the relevance ...
