Basic concepts and tools for the Toki Pona minimalist constructed language: Wordnet synsets; analysis of the vocabulary; synthesis and syntax highlighting of texts
A minimalist constructed language (conlang) is useful for experiments and convenient for building tools. The Toki Pona (TP) conlang is minimalist both in its vocabulary (with only 14 letters and 124 words) and in its ≈10 syntax rules. The language is of particular interest because it is in active use and somewhat established, with at least hundreds of fluent speakers. In this article, we describe current concepts and resources for TP, and make available Python routines for the analysis of the language, the synthesis of texts, the specification of syntax highlighting schemes, and the construction of a preliminary TP Wordnet. We focus on the analysis of the basic vocabulary, as corpus analyses are already available in prior work. The synthesis is based on sentence templates, relates to context by keeping track of used words, and renders larger texts with either a fixed number of phonemes (e.g. for poems) or a fixed number of sentences, words and letters (e.g. for paragraphs). Syntax highlighting reflects the morphosyntactic classes given in the official dictionary, and different solutions are described and implemented for the well-established Vim text editor. The tentative TP Wordnet is made available in three forms that reflect the choices of the synsets related to each word. In summary, this text presents potentially novel conceptualizations of the TP language, together with tools and results for analyzing, synthesizing and syntax highlighting it.
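The template-based synthesis summarized above can be illustrated with a minimal Python sketch. It is not the routine distributed with the article: the names `LEXICON`, `TEMPLATES`, `pick_word`, `make_sentence` and `make_text` are hypothetical, the grouping of words into roles is a simplification made for this example rather than the classification of the official dictionary, and the constraints on phonemes, words and letters are left out. It only shows the general idea of filling sentence templates while keeping track of the words already used.

```python
import random

# Hypothetical mini-lexicon. The words are Toki Pona vocabulary, but the
# grouping into roles is an assumption made for this sketch only.
LEXICON = {
    "noun": ["jan", "soweli", "kili", "telo", "tomo"],
    "verb": ["moku", "lukin", "pali", "olin"],
    "modifier": ["pona", "suli", "lili", "mute"],
}

# Hypothetical sentence templates. Slots named after a LEXICON key are
# filled with a word; "li" (predicate marker) and "e" (direct object
# marker) are copied literally.
TEMPLATES = [
    ["noun", "li", "verb", "e", "noun"],
    ["noun", "modifier", "li", "verb"],
    ["noun", "li", "modifier"],
]


def pick_word(role, used, rng):
    """Pick a word for a role, preferring words not used so far."""
    fresh = [w for w in LEXICON[role] if w not in used]
    # Fall back to the full list once every word of the role was used.
    word = rng.choice(fresh or LEXICON[role])
    used.add(word)
    return word


def make_sentence(used, rng):
    """Fill one randomly chosen template, updating the set of used words."""
    template = rng.choice(TEMPLATES)
    words = [pick_word(slot, used, rng) if slot in LEXICON else slot
             for slot in template]
    return " ".join(words)


def make_text(n_sentences, seed=0):
    """Return a list of sentences that share one used-word context."""
    rng = random.Random(seed)
    used = set()
    return [make_sentence(used, rng) for _ in range(n_sentences)]


if __name__ == "__main__":
    for sentence in make_text(4, seed=1):
        print(sentence)
```

Sharing one `used` set across sentences is what ties each sentence to its context in this sketch: a word is repeated only after every other word of the same role has appeared.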