
Discovering topics in text datasets by visualizing relevant words

by Franziska Horn, et al.
Berlin Institute of Technology (Technische Universität Berlin)

When dealing with large collections of documents, it is imperative to quickly get an overview of the texts' contents. In this paper we show how this can be achieved in two steps: first, a clustering algorithm identifies topics in the dataset; second, for each topic, we select and visualize relevant words, i.e., words that distinguish a group of documents from the rest of the texts, to summarize the contents of the documents belonging to that topic. We demonstrate our approach by discovering trending topics in a collection of New York Times article snippets.
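
The two steps described in the abstract (cluster the documents, then pick the words that set each cluster apart) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the abstract names neither the clustering algorithm nor the relevance score, so the sketch assumes tf-idf features, a simple threshold-based single-link clustering, and a score equal to the difference between a word's mean tf-idf weight inside and outside a cluster. The toy documents, the function names, and the `threshold` value are all made up for the example, and the visualization step is left out.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one L2-normalised tf-idf dict per document (whitespace tokens)."""
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    doc_freq = Counter(w for tokens in tokenised for w in set(tokens))
    vectors = []
    for tokens in tokenised:
        vec = {w: c * math.log(n_docs / doc_freq[w])
               for w, c in Counter(tokens).items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({w: v / norm for w, v in vec.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity of two already-normalised sparse vectors."""
    return sum(v * b.get(w, 0.0) for w, v in a.items())

def cluster(vectors, threshold=0.2):
    """Assumed stand-in clustering: connected components of the graph that
    links two documents when their cosine similarity reaches the threshold."""
    labels = [-1] * len(vectors)
    next_label = 0
    for start in range(len(vectors)):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            j = stack.pop()
            for k in range(len(vectors)):
                if labels[k] == -1 and cosine(vectors[j], vectors[k]) >= threshold:
                    labels[k] = next_label
                    stack.append(k)
        next_label += 1
    return labels

def relevant_words(vectors, labels, label, top_n=3):
    """Assumed relevance score: mean tf-idf inside the cluster minus mean
    tf-idf in all other documents; ties are broken alphabetically."""
    inside = [v for v, l in zip(vectors, labels) if l == label]
    outside = [v for v, l in zip(vectors, labels) if l != label]

    def mean(group, word):
        return sum(v.get(word, 0.0) for v in group) / len(group) if group else 0.0

    vocab = {w for v in inside for w in v}
    scores = {w: mean(inside, w) - mean(outside, w) for w in vocab}
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]

# Invented toy corpus: three politics snippets, three weather snippets.
docs = [
    "senate vote election campaign",
    "senate vote debate",
    "senate election poll",
    "storm flood coast rain",
    "storm flood damage",
    "storm coast warning",
]
vectors = tfidf_vectors(docs)
labels = cluster(vectors)
topics = {label: relevant_words(vectors, labels, label) for label in set(labels)}
for label, words in sorted(topics.items()):
    print(label, words)
```

On this toy corpus the politics and weather snippets end up in separate clusters, and each cluster is summarized only by words from its own documents. With real data, the tokenizer, the clustering method, and the scoring function would all need to be chosen to fit the corpus.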


Exploring text datasets by visualizing relevant words

When working with a new dataset, it is important to first explore and fa...

Discovering topics with neural topic models built from PLSA assumptions

In this paper we present a model for unsupervised topic discovery in tex...

Which Clustering Do You Want? Inducing Your Ideal Clustering with Minimal Feedback

While traditional research on text clustering has largely focused on gro...

Topic Extraction of Crawled Documents Collection using Correlated Topic Model in MapReduce Framework

The tremendous increase in the amount of available research documents im...

Is there something I'm missing? Topic Modeling in eDiscovery

In legal eDiscovery, the parties are required to search through their el...

Probably Reasonable Search in eDiscovery

In eDiscovery, a party to a lawsuit or similar action must search throug...

Dataset of Philippine Presidents Speeches from 1935 to 2016

The dataset was collected to examine and identify possible key topics wi...