
Discovering topics in text datasets by visualizing relevant words

07/18/2017
by Franziska Horn, et al.
Berlin Institute of Technology (Technische Universität Berlin)

When dealing with large collections of documents, it is imperative to quickly get an overview of the texts' contents. In this paper, we show how this can be achieved in two steps. First, a clustering algorithm identifies topics in the dataset. Second, relevant words, i.e. words that distinguish a group of documents from the rest of the texts, are selected and visualized to summarize the contents of the documents belonging to each topic. We demonstrate our approach by discovering trending topics in a collection of New York Times article snippets.
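The pipeline described in the abstract (cluster the documents, then pick the words that set each cluster apart from the rest) can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or the relevance score, so everything below is an assumption made for illustration: tf-idf weighting, a simple greedy "leader" clustering on cosine similarity, and a score equal to a word's mean weight inside the cluster minus its mean weight outside. The function names `tfidf`, `cluster`, and `relevant_words` and the `threshold` parameter are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one L2-normalised {word: weight} dict per document."""
    tokenised = [doc.lower().split() for doc in docs]
    # Document frequency: in how many documents does each word occur?
    df = Counter(word for tokens in tokenised for word in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vec = {w: c * math.log((1 + n) / (1 + df[w])) for w, c in tf.items()}
        norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
        vectors.append({w: v / norm for w, v in vec.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity of two L2-normalised sparse vectors."""
    return sum(v * b.get(w, 0.0) for w, v in a.items())


def cluster(vectors, threshold=0.1):
    """Greedy leader clustering (an assumed stand-in, not the paper's method).

    A document joins the most similar existing cluster leader if the
    similarity reaches the threshold; otherwise it starts a new cluster.
    """
    leaders, labels = [], []
    for vec in vectors:
        sims = [cosine(vec, vectors[i]) for i in leaders]
        if sims and max(sims) >= threshold:
            labels.append(sims.index(max(sims)))
        else:
            leaders.append(len(labels))  # index of the current document
            labels.append(len(leaders) - 1)
    return labels


def relevant_words(vectors, labels, top=3):
    """Rank words per cluster by mean weight inside minus mean weight outside."""
    result = {}
    for c in sorted(set(labels)):
        inside = [v for v, l in zip(vectors, labels) if l == c]
        outside = [v for v, l in zip(vectors, labels) if l != c]
        vocab = {w for v in inside for w in v}

        def score(w):
            mean_in = sum(v.get(w, 0.0) for v in inside) / len(inside)
            mean_out = (
                sum(v.get(w, 0.0) for v in outside) / len(outside)
                if outside else 0.0
            )
            return mean_in - mean_out

        # Highest score first; ties are broken alphabetically.
        result[c] = sorted(vocab, key=lambda w: (-score(w), w))[:top]
    return result


if __name__ == "__main__":
    # Toy snippets invented for this example, not data from the paper.
    docs = [
        "election vote senate election",
        "senate vote campaign",
        "storm flood rain storm",
        "rain flood warning",
    ]
    vectors = tfidf(docs)
    labels = cluster(vectors)
    for topic, words in relevant_words(vectors, labels, top=2).items():
        print(topic, words)
```

The per-cluster word lists returned by `relevant_words` are what one would then pass to a visualization such as a word cloud. On real data, a library clustering algorithm and a more robust relevance score would likely replace the toy choices made here.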
