Sound Regular Expression Semantics for Dynamic Symbolic Execution of JavaScript

10/10/2018
by Blake Loring, et al.

Existing support for regular expressions in automated test generation or verification tools is lacking. Common aspects of regular expression engines found in mainstream programming languages, such as backreferences or greedy matching, are often ignored or imprecisely approximated, leading to poor test coverage or failed proofs. In this paper, we present the first complete strategy to faithfully reason about regular expressions in the context of symbolic execution, focusing on the operators found in JavaScript. We model regular expression operations using string constraints and classical regular expressions and use a refinement scheme to address the problem of matching precedence and greediness. Our survey of over 400,000 JavaScript packages from the NPM software repository shows that one fifth make use of complex regular expression features. We implemented our model in a dynamic symbolic execution engine for JavaScript and evaluated it on over 1,000 Node.js packages containing regular expressions, demonstrating that the strategy is effective and can increase line coverage of programs by up to 30%.
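To make the notions of greediness, matching precedence, and backreferences concrete, the following minimal sketch shows how they affect what a JavaScript program observes. It is an illustration only and is not taken from the paper; the patterns and the function name `classify` are hypothetical and chosen for brevity.

```javascript
// Greedy matching: the first group consumes as much as it can,
// so the second group is left with the empty string.
const greedy = /^(a*)(a*)$/.exec("aaa");

// Lazy matching: the first group consumes as little as it can,
// so the second group receives the whole input.
const lazy = /^(a*?)(a*)$/.exec("aaa");

// Backreference: \1 must repeat exactly what group 1 captured,
// which a classical regular expression cannot express.
const repeated = /^(\w+)-\1$/;

// Hypothetical function under test: which branch is reachable
// depends on capture values, not only on whether the input matches.
function classify(input) {
  const m = /^(a*)(a*)$/.exec(input);
  if (m === null) return "no match";
  // Under greedy semantics group 2 is always empty for this pattern,
  // so this branch can never be taken.
  if (m[2].length > 0) return "second group non-empty";
  return "matched";
}

console.log(greedy[1], JSON.stringify(greedy[2]));
console.log(JSON.stringify(lazy[1]), lazy[2]);
console.log(repeated.test("ab-ab"), repeated.test("ab-ac"));
console.log(classify("aaa"), "|", classify("b"));
```

A tool that models only whether a string belongs to the language of `^(a*)(a*)$` could wrongly conclude that the second branch of `classify` is reachable, which is the kind of imprecision the abstract attributes to approximate regular expression support.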
