Automatic Identification of Types of Alterations in Historical Manuscripts

03/20/2020
by David Lassner, et al.

Alterations in historical manuscripts such as letters represent a promising field of research. On the one hand, they help us understand how a text was constructed. On the other hand, topics that were considered sensitive at the time of the manuscript gain coherence and context when alterations are taken into account, especially in the case of deletions. The analysis of alterations in manuscripts, though, has traditionally been very tedious work. In this paper, we present a machine learning-based approach to help categorize alterations in documents. In particular, we present a new probabilistic model (Alteration Latent Dirichlet Allocation, hereafter alterLDA) that categorizes content-related alterations. The proposed method was developed from experiments on the digital scholarly edition Berlin Intellectuals, for which alterLDA achieves high performance in the recognition of alterations on labelled data. On unlabelled data, applying alterLDA yields new insights into the alteration behavior of authors, editors and other manuscript contributors, as well as into sensitive topics in the correspondence of Berlin intellectuals around 1800. In addition to the findings based on Berlin Intellectuals, we present a general framework for the analysis of text genesis that can be used with other digital resources representing document variants. To that end, we describe in detail the methodological steps required to achieve such results, thereby giving a prime example of a machine learning application in the Digital Humanities.
