Can Evolutionary Computation Help us to Crib the Voynich Manuscript?

07/07/2021
by   Daniel Devatman Hromada, et al.

Starting from the postulate that the Voynich Manuscript is not a hoax but encodes authentic content, our article presents an evolutionary algorithm which aims to find the optimal mapping between Voynichian glyphs and candidate phonemic values. The core component of the decoding algorithm is the maximization of a fitness function that searches for the optimal set of substitution rules for transcribing one part of the manuscript, which we call the Calendar, into lists of feminine names. This leads to sets of character substitution rules which allow us to consistently transcribe dozens of the three hundred Calendar tokens into feminine names: a result far surpassing both "popular" and "state of the art" attempts to crack the manuscript. What's more, by using name lists stemming from different languages as potential cribs, our "adaptive" method can also help identify the language in which the manuscript is written. As far as we can currently tell, the results of our experiments indicate that the Calendar part of the manuscript contains names from Balto-Slavic, Balkan or Hebrew language strata. Two further indications are also given. Primo, the highest fitness values were obtained when the crib list contained names with specific infixes at the token's penultimate position, as is the case, for example, for Slavic feminine diminutives (i.e. names ending with -ka and not -a). In the most successful scenario, 240 characters contained in 35 distinct Voynichese tokens were successfully transcribed. Secundo, in the case of a crib stemming from the Hebrew language, the whole adaptation process converges to significantly better fitness values when transcribing Voynichian tokens whose character order has been reversed, and when lists of feminine rather than masculine names are used as the crib.
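
To make the idea of crib-driven fitness maximization concrete, the following is a minimal sketch, not the authors' implementation. It assumes a one-to-one glyph-to-letter substitution, a fitness that counts the characters of distinct tokens whose transcription appears in the crib list, and a simple elitist selection scheme; the operators and parameters of the actual algorithm may differ. The tokens, the crib names, and all function names (`transcribe`, `fitness`, `mutate`, `evolve`) are hypothetical placeholders rather than real Voynichese data.

```python
import random

# Hypothetical toy data: placeholder "glyph" strings, not real Voynichese.
TOKENS = ["qxkz", "pxkz", "tmxkz", "qxkz"]
CRIB = {"anka", "inka", "lenka"}   # assumed crib of names with the -ka infix
ALPHABET = "aeiklnt"               # assumed candidate phonemic values


def transcribe(token, mapping, reverse=False):
    """Apply a glyph-to-letter substitution, optionally reversing the token first."""
    glyphs = token[::-1] if reverse else token
    return "".join(mapping[g] for g in glyphs)


def fitness(mapping, tokens, crib, reverse=False):
    """Sum the lengths of distinct tokens whose transcription is a crib name."""
    return sum(len(t) for t in set(tokens)
               if transcribe(t, mapping, reverse) in crib)


def mutate(mapping, alphabet, rng):
    """Return a copy of the mapping with one randomly chosen glyph reassigned."""
    child = dict(mapping)
    child[rng.choice(sorted(child))] = rng.choice(alphabet)
    return child


def evolve(tokens, crib, alphabet, generations=200, pop_size=30,
           seed=0, reverse=False):
    """Elitist loop: parents and mutated children compete, the fittest survive."""
    rng = random.Random(seed)
    glyphs = sorted(set("".join(tokens)))
    population = [{g: rng.choice(alphabet) for g in glyphs}
                  for _ in range(pop_size)]

    def score(m):
        return fitness(m, tokens, crib, reverse)

    for _ in range(generations):
        children = [mutate(rng.choice(population), alphabet, rng)
                    for _ in range(pop_size)]
        population = sorted(population + children,
                            key=score, reverse=True)[:pop_size]
    return population[0], score(population[0])
```

Calling `evolve(TOKENS, CRIB, ALPHABET)` returns the best mapping found and its fitness. Running it once with `reverse=False` and once with `reverse=True`, or with cribs drawn from different languages, mirrors the kind of comparison described above, where the crib and reading direction that reach higher fitness are taken as weak evidence about the underlying language.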
