On Indexing and Compressing Finite Automata

by Nicola Cotumaccio, et al.
Luiss Guido Carli

An index for a finite automaton is a powerful data structure that supports locating paths labeled with a query pattern, thus solving pattern matching on the underlying regular language. In this paper, we solve the long-standing problem of indexing arbitrary finite automata. Our solution consists of finding a partial co-lexicographic order of the states and proving, as in the total order case, that states reached by a given string form one interval on the partial order, thus enabling indexing. We provide a lower bound stating that such an interval requires O(p) words to be represented, p being the order's width (i.e. the size of its largest antichain). Indeed, we show that p determines the complexity of several fundamental problems on finite automata: (i) Letting σ be the alphabet size, we provide an encoding for NFAs using ⌈log σ⌉ + 2⌈log p⌉ + 2 bits per transition and a smaller encoding for DFAs using ⌈log σ⌉ + ⌈log p⌉ + 2 bits per transition. This is achieved by generalizing the Burrows-Wheeler transform to arbitrary automata. (ii) We show that indexed pattern matching can be solved in Õ(m · p^2) query time on NFAs, where m is the length of the query pattern. (iii) We provide a polynomial-time algorithm to index DFAs, while matching the optimal value for p. On the other hand, we prove that the problem is NP-hard on NFAs. (iv) We show that, in the worst case, the classic powerset construction algorithm for NFA determinization generates an equivalent DFA of size 2^p (n - p + 1) - 1, where n is the number of states of the NFA.
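
To make the role of the width p concrete, the formulas quoted in the abstract can be evaluated directly. The following is a minimal sketch, not code from the paper: the function names are hypothetical, and it assumes that the logarithms are base 2 and that σ ≥ 1 and 1 ≤ p ≤ n.

```python
def ceil_log2(x: int) -> int:
    """Return ceil(log2(x)) for an integer x >= 1 (assumes base-2 logarithms)."""
    if x < 1:
        raise ValueError("x must be at least 1")
    return (x - 1).bit_length()


def nfa_bits_per_transition(sigma: int, p: int) -> int:
    """Bits per transition for NFAs: ceil(log sigma) + 2*ceil(log p) + 2."""
    return ceil_log2(sigma) + 2 * ceil_log2(p) + 2


def dfa_bits_per_transition(sigma: int, p: int) -> int:
    """Bits per transition for DFAs: ceil(log sigma) + ceil(log p) + 2."""
    return ceil_log2(sigma) + ceil_log2(p) + 2


def powerset_dfa_size_bound(n: int, p: int) -> int:
    """Worst-case DFA size stated in the abstract: 2^p * (n - p + 1) - 1."""
    if not 1 <= p <= n:
        raise ValueError("expected 1 <= p <= n")
    return 2 ** p * (n - p + 1) - 1


if __name__ == "__main__":
    # Hypothetical parameters: alphabet of size 4, width 3, 10 NFA states.
    print(nfa_bits_per_transition(4, 3))
    print(dfa_bits_per_transition(4, 3))
    print(powerset_dfa_size_bound(10, 3))
```

With p = 1 (a total order, as in the Wheeler case), the bound evaluates to 2n - 1, so the exponential growth of determinization is governed by p rather than by n under these formulas.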


