Random Wheeler Automata

07/14/2023
by Ruben Becker, et al.

Wheeler automata were introduced in 2017 as a tool to generalize existing indexing and compression techniques based on the Burrows-Wheeler transform. Intuitively, an automaton is said to be Wheeler if there exists a total order on its states reflecting the co-lexicographic order of the strings labeling the automaton's paths; this property makes it possible to represent the automaton's topology in a constant number of bits per transition and to efficiently solve pattern matching queries on its accepted regular language. Since their introduction, Wheeler automata have been the subject of a prolific line of research, from both the algorithmic and the language-theoretic points of view. A recurring issue in these studies is the lack of large datasets of Wheeler automata on which the developed algorithms and theories can be tested. One possible way to overcome this issue is to generate random Wheeler automata. Motivated by this observation, in this paper we initiate the theoretical study of random Wheeler automata, focusing on the deterministic case (Wheeler DFAs, or WDFAs). We start by extending the Erdős-Rényi random graph model to WDFAs, and proceed by providing an algorithm that generates uniform WDFAs according to this model. Our algorithm generates a uniform WDFA with n states, m transitions, and alphabet size σ in O(m) expected time (O(m log m) worst-case time with high probability) and constant working space, for all alphabets of size σ ≤ m/ln m. As a by-product, we also give formulas for the number of distinct WDFAs and obtain that nσ + (n - σ) log σ bits are necessary and sufficient to encode a WDFA with n states and alphabet of size σ, up to an additive Θ(n) term. We present an implementation of our algorithm and show that it is extremely fast in practice, with a throughput of over 8 million transitions per second.
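
The abstract describes the Wheeler property only intuitively. As a rough illustration, the sketch below checks whether a given total order on the states satisfies the Wheeler axioms as they are usually stated in the literature: states without incoming transitions come first, transitions with a smaller label lead to strictly smaller states, and transitions with equal labels preserve the order of their origins. The function name `is_wheeler_order`, the edge representation, and the toy automaton are assumptions made for this example; this is not the generation algorithm from the paper.

```python
from itertools import combinations


def is_wheeler_order(order, edges):
    """Return True if `order` (a list of states, smallest first) satisfies
    the usual Wheeler axioms for the labeled transitions in `edges`,
    given as (origin, label, target) triples. Hypothetical helper."""
    rank = {state: i for i, state in enumerate(order)}
    edges = list(edges)
    has_incoming = {v for _, _, v in edges}

    # Axiom 1: states with no incoming transition precede all other states.
    seen_incoming = False
    for state in order:
        if state in has_incoming:
            seen_incoming = True
        elif seen_incoming:
            return False

    # Axioms 2 and 3 are checked on every unordered pair of transitions.
    for (u1, a1, v1), (u2, a2, v2) in combinations(edges, 2):
        if a1 > a2:
            # Orient the pair so that a1 <= a2.
            (u1, a1, v1), (u2, a2, v2) = (u2, a2, v2), (u1, a1, v1)
        if a1 < a2:
            # Axiom 2: smaller label implies strictly smaller target.
            if not rank[v1] < rank[v2]:
                return False
        else:
            # Axiom 3: equal labels preserve the order of the origins.
            if rank[u1] < rank[u2] and rank[v1] > rank[v2]:
                return False
            if rank[u2] < rank[u1] and rank[v2] > rank[v1]:
                return False
    return True


# Toy DFA over {a, b}, invented for this example.
toy_edges = [(0, "a", 1), (1, "a", 2), (0, "b", 3), (2, "b", 3)]
good = is_wheeler_order([0, 1, 2, 3], toy_edges)
bad = is_wheeler_order([0, 2, 1, 3], toy_edges)
print(good, bad)
```

The pairwise check takes time quadratic in the number of transitions, which is fine for a small illustration but far from the linear-time behavior the abstract reports for its generator.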


