Faster Compression of Deterministic Finite Automata

06/22/2023
by Philip Bille, et al.

Deterministic finite automata (DFAs) are a classic tool for high-throughput matching of regular expressions, both in theory and in practice. Because DFAs consume a lot of space, extensive research has been devoted to compressed representations that still support efficient pattern matching queries. Kumar et al. [SIGCOMM 2006] introduced the delayed deterministic finite automaton (D²FA), which exploits the large redundancy between inter-state transitions in the automaton. They showed that it compresses real-world DFAs by up to two orders of magnitude, and their work formed the basis of numerous subsequent results. Their algorithm, as well as later algorithms based on their idea, has an inherent quadratic-time bottleneck, since it considers every pair of states to compute the optimal compression. In this work we present a simple, general framework based on locality-sensitive hashing for speeding up these algorithms to achieve sub-quadratic construction times for D²FAs. We apply the framework to speed up several algorithms to near-linear time, and experimentally evaluate their performance on real-world regular expression sets extracted from modern intrusion detection systems. We find an order of magnitude improvement in compression times, with little or no loss of compression, and in some cases significantly better compression.
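To make the two ideas in the abstract concrete, here is a minimal Python sketch of default-transition compression, with candidate states found by hashing instead of by comparing every pair. It is not the authors' algorithm. The bit-sampling hash, the parameters `k`, `rounds`, and `seed`, the rule that a default may only point to a lower-numbered state, and all function names are assumptions made for illustration.

```python
import random
from collections import defaultdict

def shared(delta, s, t):
    """Number of symbols on which states s and t go to the same target."""
    return sum(a == b for a, b in zip(delta[s], delta[t]))

def lsh_candidates(delta, k=2, rounds=8, seed=0):
    """Bit-sampling LSH (assumed scheme): states that agree on k randomly
    chosen symbols land in the same bucket and become candidate pairs."""
    rng = random.Random(seed)
    sigma = len(delta[0])
    cands = defaultdict(set)
    for _ in range(rounds):
        cols = rng.sample(range(sigma), k)
        buckets = defaultdict(list)
        for s, row in enumerate(delta):
            buckets[tuple(row[c] for c in cols)].append(s)
        for group in buckets.values():
            for s in group:
                # Defaults only point to lower-numbered states, so they
                # can never form a cycle.
                cands[s].update(t for t in group if t < s)
    return cands

def compress(delta, **kw):
    """Give each state a default transition to its best candidate and keep
    only the labeled transitions that differ from that candidate."""
    cands = lsh_candidates(delta, **kw)
    default = [None] * len(delta)
    labeled = [dict(enumerate(row)) for row in delta]
    for s, ts in cands.items():
        best = max(ts, key=lambda t: shared(delta, s, t), default=None)
        # The default pointer costs one entry, so require a saving above one.
        if best is not None and shared(delta, s, best) > 1:
            default[s] = best
            labeled[s] = {c: q for c, q in enumerate(delta[s])
                          if q != delta[best][c]}
    return default, labeled

def step(default, labeled, s, c):
    """Follow default transitions (consuming no input) until state s has a
    labeled transition on symbol c, then take it."""
    while c not in labeled[s]:
        s = default[s]
    return labeled[s][c]
```

Here `delta[s][c]` is the target of state `s` on symbol `c`, with symbols numbered from 0. States with identical rows collide in every round, so they are always compared. Less similar states collide only with some probability, which is what trades a little compression for speed. Large buckets are still compared pairwise in this sketch, so a practical version would need to cap the work done per bucket.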


