Diachronic Topics in New High German Poetry

09/24/2019
by Thomas N. Haider, et al.

Statistical topic models are increasingly popular among Digital Humanities scholars for distant reading of literary data, because they allow us to estimate what people talk about. Latent Dirichlet Allocation (LDA) in particular has proven useful: it is unsupervised, robust, easy to use, and scalable, and it offers interpretable results. In a preliminary study, we apply LDA to a corpus of New High German poetry (textgrid; 51k poems, 8M tokens) and use the distribution of topics over documents to classify poems into time periods and to attribute authorship.
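To make the pipeline concrete, the following is a minimal sketch of the general idea: infer a topic distribution per poem with LDA, then use that distribution as a feature vector for period classification. It is not the authors' implementation. The collapsed Gibbs sampler, the hyperparameters, the nearest-centroid classifier, the toy token lists, and the period labels are all assumptions chosen for illustration; the abstract does not say which LDA implementation or classifier was used, and in practice one would likely rely on an established topic modeling library.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative, not optimized).

    docs is a list of token lists. Returns one topic distribution
    (a list of num_topics probabilities) per document.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    # Count tables maintained by the sampler.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(num_topics)]
    topic_total = [0] * num_topics

    # Random initial topic assignment for every token.
    assignments = []
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(num_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(z_doc)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                z = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][z] -= 1
                topic_word[z][v] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][v] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                z = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][v] += 1
                topic_total[z] += 1

    # Smoothed estimate of each document's topic distribution.
    return [
        [
            (doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
            for k in range(num_topics)
        ]
        for d, doc in enumerate(docs)
    ]


def nearest_centroid(train_vecs, train_labels, query):
    """Assign query to the label whose mean topic vector is closest."""
    centroids = {}
    for label in set(train_labels):
        rows = [v for v, l in zip(train_vecs, train_labels) if l == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda l: sq_dist(centroids[l], query))


# Hypothetical toy data: tokenized "poems" with invented period labels.
poems = [
    ["night", "moon", "dream", "night", "stars"],
    ["moon", "stars", "dream", "forest", "night"],
    ["city", "machine", "street", "noise", "city"],
    ["street", "machine", "smoke", "city", "noise"],
    ["dream", "moon", "forest", "stars", "night"],  # held-out poem
]
labels = ["period_a", "period_a", "period_b", "period_b"]

theta = lda_gibbs(poems, num_topics=2)
predicted = nearest_centroid(theta[:4], labels, theta[4])
print(predicted)
```

Each row of `theta` is a probability vector over topics, so any standard classifier could replace the nearest-centroid rule; the same vectors could equally be paired with author labels for authorship attribution.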

research
06/28/2020

Mapping Topic Evolution Across Poetic Traditions

Poetic traditions across languages evolved differently, but we find that...
research
05/04/2012

Variable Selection for Latent Dirichlet Allocation

In latent Dirichlet allocation (LDA), topics are multinomial distributio...
research
06/28/2015

Topic2Vec: Learning Distributed Representations of Topics

Latent Dirichlet Allocation (LDA) mining thematic structure of documents...
research
02/03/2014

A high-reproducibility and high-accuracy method for automated topic classification

Much of human knowledge sits in large databases of unstructured text. Le...
research
06/23/2022

A Temporal Extension of Latent Dirichlet Allocation for Unsupervised Acoustic Unit Discovery

Latent Dirichlet allocation (LDA) is widely used for unsupervised topic ...
research
12/16/2014

Application of Topic Models to Judgments from Public Procurement Domain

In this work, automatic analysis of themes contained in a large corpora ...
research
10/29/2015

WarpLDA: a Cache Efficient O(1) Algorithm for Latent Dirichlet Allocation

Developing efficient and scalable algorithms for Latent Dirichlet Alloca...
