Linear-time Minimization of Wheeler DFAs

11/03/2021
by Jarno Alanko, et al.

Wheeler DFAs (WDFAs) are a subclass of finite-state automata that plays an important role in the emerging field of compressed data structures: unlike general automata, WDFAs can be stored in just log σ + O(1) bits per edge, where σ is the alphabet size, and they support optimal-time pattern matching queries on the substring closure of the language they recognize. An important step toward further compression is minimization. When the input 𝒜 is a general deterministic finite-state automaton (DFA), the state of the art is Hopcroft's classic algorithm, which runs in O(|𝒜| log |𝒜|) time. This algorithm is at the core of the only existing minimization algorithm for Wheeler DFAs, which inherits its complexity. In this work, we show that the minimum WDFA equivalent to a given input WDFA can be computed in linear O(|𝒜|) time. When run on de Bruijn WDFAs built from real DNA datasets, an implementation of our algorithm reduces the number of nodes by 14% to 51% at a speed of more than 1 million nodes per second.
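
For readers unfamiliar with DFA minimization, the following is a minimal sketch of the general idea using Moore-style partition refinement. It is neither Hopcroft's algorithm nor the linear-time WDFA algorithm of this work, and it runs in O(n²σ) time in the worst case for n states. The function name `minimize_dfa`, its parameters, and the dictionary encoding of the transition function are assumptions made for illustration; the sketch also assumes every state is reachable from the initial state and treats a missing transition as a move to an implicit rejecting sink.

```python
from typing import Dict, Hashable, Iterable, Sequence, Set, Tuple


def minimize_dfa(
    states: Sequence[Hashable],
    alphabet: Iterable[Hashable],
    delta: Dict[Tuple[Hashable, Hashable], Hashable],
    accepting: Set[Hashable],
) -> Dict[Hashable, int]:
    """Return a map from each state to the id of its equivalence class.

    Two states share an id exactly when they accept the same set of
    suffixes, so the classes are the states of the minimum DFA
    (assuming all input states are reachable).
    """
    symbols = list(alphabet)
    # Start from the coarsest useful partition: accepting vs. non-accepting.
    block = {q: int(q in accepting) for q in states}
    while True:
        ids: Dict[tuple, int] = {}
        new_block: Dict[Hashable, int] = {}
        for q in states:
            # Signature: the state's current block plus the block reached
            # on each symbol (None stands for a missing transition).
            sig = (block[q],) + tuple(
                block[delta[(q, a)]] if (q, a) in delta else None
                for a in symbols
            )
            new_block[q] = ids.setdefault(sig, len(ids))
        # Refinement only ever splits blocks, so an unchanged block count
        # means the partition is stable.
        if len(ids) == len(set(block.values())):
            return new_block
        block = new_block
```

Each round splits a class whenever two of its states disagree on the class reached by some symbol. Hopcroft's algorithm reaches the same final partition more cleverly, by always processing the smaller half of each split.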


