Basic concepts and tools for the Toki Pona minimalist and constructed language: Wordnet synsets; analysis of the vocabulary; synthesis and syntax highlighting of texts

12/26/2017
by Renato Fabbri, et al.

A minimalist constructed language (conlang) is useful for experiments and convenient for building tools. The Toki Pona (TP) conlang is minimalist both in its vocabulary (only 14 letters and 124 words) and in its ≈10 syntax rules. The language is of particular interest because it is an actively used and somewhat established minimalist conlang, with at least hundreds of fluent speakers. In this article, we describe current concepts and resources for TP, and make available Python routines for the analysis of the language, the synthesis of texts, the specification of syntax highlighting schemes, and the construction of a preliminary TP Wordnet. We focus on the analysis of the basic vocabulary, as corpus analyses have already been reported elsewhere. The synthesis is based on sentence templates, relates to context by keeping track of used words, and renders larger texts by using a fixed number of phonemes (e.g. for poems) or a fixed number of sentences, words and letters (e.g. for paragraphs). Syntax highlighting reflects the morphosyntactic classes given in the official dictionary, and different solutions are described and implemented in the well-established Vim text editor. The tentative TP Wordnet is made available in three forms that reflect the choices of the synsets related to each word. In summary, this text holds potentially novel conceptualizations about, and tools and results in analyzing, synthesizing and syntax highlighting, the TP language.
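The template-based synthesis described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the vocabulary subset, the class labels (`noun`, `verb`, `modifier`), the three templates, and the function names `synthesize_sentence` and `synthesize_text` are all assumptions made for illustration. The sketch also assumes that "keeping track of used words" means preferring words not yet used in the text; the opposite policy (favoring reuse) would be an equally plausible reading.

```python
import random

# Tiny illustrative vocabulary. The class labels are assumptions for this
# sketch, not the classification given in the official dictionary.
VOCAB = {
    "noun": ["jan", "telo", "kili", "soweli", "tomo"],
    "verb": ["moku", "lukin", "pali", "olin"],
    "modifier": ["pona", "lili", "suli"],
}

# Each template is a sequence of slots. A slot that names a class in VOCAB
# is filled with a word of that class; any other slot (e.g. the particles
# "li" and "e") is copied literally.
TEMPLATES = [
    ("noun", "li", "verb", "e", "noun"),
    ("noun", "modifier", "li", "verb"),
    ("noun", "li", "modifier"),
]


def synthesize_sentence(rng, used):
    """Fill one randomly chosen template, preferring words not yet in `used`."""
    template = rng.choice(TEMPLATES)
    words = []
    for slot in template:
        if slot in VOCAB:
            candidates = VOCAB[slot]
            fresh = [w for w in candidates if w not in used]
            # Fall back to the full class once every word has been used.
            word = rng.choice(fresh or candidates)
            used.add(word)
            words.append(word)
        else:
            words.append(slot)
    return " ".join(words) + "."


def synthesize_text(n_sentences, seed=0):
    """Return a list of `n_sentences` sentences sharing one used-word set."""
    rng = random.Random(seed)
    used = set()
    return [synthesize_sentence(rng, used) for _ in range(n_sentences)]


if __name__ == "__main__":
    for sentence in synthesize_text(4):
        print(sentence)
```

Fixing the number of sentences is the simplest of the constraints mentioned in the abstract; constraints on the number of words, letters or phonemes could be added by rejecting or regenerating sentences until the totals match.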
