Decipherment of Historical Manuscript Images

10/09/2018
by Xusen Yin, et al.

European libraries and archives are filled with enciphered manuscripts from the early modern period. These include military and diplomatic correspondence, records of secret societies, private letters, and so on. Although they are enciphered with classical cryptographic algorithms, their contents are unavailable to working historians. We therefore attack the problem of automatically converting cipher manuscript images into plaintext. We develop unsupervised models for character segmentation, character-image clustering, and decipherment of cluster sequences. We experiment with both pipelined and joint models, and we give empirical results for multiple ciphers.
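
To make the final stage concrete, the sketch below shows what "decipherment of cluster sequences" takes as input and produces as output. It is not the authors' model. It assumes a simple one-to-one substitution cipher and swaps in classical frequency-rank matching, pairing the k-th most frequent cluster ID with the k-th most frequent letter of a reference text. The function names (`rank_by_frequency`, `guess_key`, `decipher`) and the toy data are hypothetical.

```python
from collections import Counter


def rank_by_frequency(symbols):
    """Return the distinct symbols ordered from most to least frequent.

    Ties are broken by first appearance, so the result is deterministic.
    """
    symbols = list(symbols)
    counts = Counter(symbols)
    first_seen = {}
    for index, symbol in enumerate(symbols):
        first_seen.setdefault(symbol, index)
    return sorted(counts, key=lambda s: (-counts[s], first_seen[s]))


def guess_key(cluster_ids, reference_text):
    """Pair the k-th most frequent cluster with the k-th most frequent letter.

    Assumption: the cipher is a one-to-one substitution, and reference_text
    is written in the same language as the hidden plaintext.
    """
    cluster_rank = rank_by_frequency(cluster_ids)
    letters = [c for c in reference_text.lower() if c.isalpha()]
    letter_rank = rank_by_frequency(letters)
    # zip stops at the shorter list; unmatched clusters get no key entry.
    return dict(zip(cluster_rank, letter_rank))


def decipher(cluster_ids, key):
    """Map each cluster ID to its guessed letter, or '?' if it has none."""
    return "".join(key.get(c, "?") for c in cluster_ids)


# Toy input: cluster IDs as a clustering stage might emit them, one per glyph.
toy_clusters = [3, 3, 3, 7, 7, 1]
toy_key = guess_key(toy_clusters, "aaabbc")
toy_plaintext = decipher(toy_clusters, toy_key)
```

Frequency-rank matching is only reliable on long texts with clearly separated letter frequencies. It is used here because it is short and deterministic, not because it is competitive with the unsupervised models the abstract describes.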


