ReGiS: Regular Expression Simplification via Rewrite-Guided Synthesis

by Jordan Schmerge, et al.

Expression simplification is an important task in a variety of domains, such as compilers and digital logic design. Syntax-guided synthesis (SyGuS) with a cost function can be used for this purpose, but ordered enumeration through a large space of candidate expressions can be expensive. Equality saturation is an alternative approach that allows efficient construction and maintenance of expression equivalence classes generated by rewrite rules, but the procedure may not reach saturation, in which case global minimality cannot be confirmed. We present a new approach called rewrite-guided synthesis (ReGiS), in which a unique interplay between SyGuS and equality saturation-based rewriting helps to overcome these problems, resulting in an efficient, scalable framework for expression simplification. We demonstrate the flexibility and practicality of our approach by applying ReGiS to regular expression denial of service (ReDoS) attack prevention. Many real-world regular expression matching engines are vulnerable to these complexity-based attacks. While much research has focused on detecting vulnerable regular expressions, we provide a way for developers to go further by automatically transforming their regular expressions to remove vulnerabilities.
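
To make the idea of cost-guided simplification by rewriting concrete, here is a minimal sketch in Python. It is not the ReGiS algorithm: it uses neither an e-graph nor a SyGuS solver, and simply runs a bounded breadth-first search over a handful of well-known regular expression identities, keeping the cheapest expression it finds. The tuple encoding of expressions, the node-count cost function, and all function names (`cost`, `rewrites`, `simplify`, `show`) are assumptions made for this illustration.

```python
from collections import deque

# Hypothetical regex encoding for this sketch (nested tuples):
#   ("chr", "a"), ("eps",), ("cat", r, s), ("alt", r, s), ("star", r)
EPS = ("eps",)

def cost(e):
    """Cost of an expression: its number of AST nodes (an assumed metric)."""
    return 1 + sum(cost(c) for c in e[1:] if isinstance(c, tuple))

def root_rewrites(e):
    """Yield every result of applying one language-preserving rule at the root."""
    op = e[0]
    if op == "star":
        inner = e[1]
        if inner[0] == "star":              # (r*)* = r*
            yield inner
        if inner == EPS:                    # eps* = eps
            yield EPS
    elif op == "alt":
        if e[1] == e[2]:                    # r|r = r
            yield e[1]
    elif op == "cat":
        left, right = e[1], e[2]
        if left == EPS:                     # eps r = r
            yield right
        if right == EPS:                    # r eps = r
            yield left
        if left == right and left[0] == "star":   # r* r* = r*
            yield left

def rewrites(e):
    """Yield every expression reachable by one rule application anywhere in e."""
    yield from root_rewrites(e)
    for i, child in enumerate(e[1:], start=1):
        if isinstance(child, tuple):
            for new_child in rewrites(child):
                yield e[:i] + (new_child,) + e[i + 1:]

def simplify(e, max_seen=10_000):
    """Bounded breadth-first search over rewrites; return the cheapest expression found."""
    best, seen, queue = e, {e}, deque([e])
    while queue and len(seen) <= max_seen:
        cur = queue.popleft()
        if cost(cur) < cost(best):
            best = cur
        for nxt in rewrites(cur):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return best

def show(e):
    """Render an expression as a (fully parenthesized) regex string."""
    op = e[0]
    if op == "chr":
        return e[1]
    if op == "eps":
        return "ε"
    if op == "star":
        return "(" + show(e[1]) + ")*"
    if op == "cat":
        return show(e[1]) + show(e[2])
    return "(" + show(e[1]) + "|" + show(e[2]) + ")"

a = ("chr", "a")
nested = ("star", ("star", ("alt", a, a)))   # ((a|a)*)*
print(show(nested), "->", show(simplify(nested)))
```

Nested and redundant quantifiers of the kind removed here are a commonly cited source of excessive backtracking in backtracking-based matchers. The sketch also exhibits the limitation the abstract attributes to rewriting alone: when the search is cut off by `max_seen`, or when the rule set is incomplete, the returned expression is only the best one seen, with no guarantee of global minimality.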




