Active Learning of Input Grammars

08/29/2017
by Matthias Höschele, et al.

Knowing the precise format of a program's input is a necessary prerequisite for systematic testing. Given a program and a small set of sample inputs, we (1) track the data flow of inputs through program execution and aggregate input fragments that share the same data flow into lexical and syntactic entities; (2) name these entities after the associated variable and function identifiers; and (3) systematically generalize production rules by means of membership queries. As a result, we need only a minimal set of sample inputs to obtain human-readable context-free grammars that reflect valid input structure. In our evaluation on inputs such as URLs, spreadsheets, and configuration files, our AUTOGRAM prototype obtains input grammars that are both accurate and very readable, and that can be fed directly into test generators for comprehensive automated testing.
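Step (3), generalizing by membership queries, can be illustrated with a minimal Python sketch. It is not the AUTOGRAM implementation. The names `CANDIDATES`, `generalize_fragment`, and `toy_program`, the probe strings, and the `port=<digits>` format are all hypothetical, chosen only for illustration. The idea shown: substitute probe strings for a fragment of a valid sample, ask the program whether each mutated input is still accepted, and generalize the fragment only to the classes whose probes all pass.

```python
from typing import Callable, Dict, List

# Hypothetical candidate classes, each with a few probe strings.
# A real tool would derive candidates and probes far more carefully.
CANDIDATES: Dict[str, List[str]] = {
    "DIGITS": ["0", "7", "42", "90210"],
    "ALPHANUM": ["a", "Z9", "x1y2"],
}


def generalize_fragment(
    sample: str,
    start: int,
    end: int,
    accepts: Callable[[str], bool],
) -> List[str]:
    """Return the candidate classes that may replace sample[start:end].

    Each probe is one membership query: the fragment is swapped for the
    probe and the program under test decides whether the input is valid.
    """
    accepted_classes = []
    for name, probes in CANDIDATES.items():
        if all(accepts(sample[:start] + probe + sample[end:]) for probe in probes):
            accepted_classes.append(name)
    return accepted_classes


def toy_program(text: str) -> bool:
    """Hypothetical program under test: accepts only 'port=<digits>'."""
    key, sep, value = text.partition("=")
    return key == "port" and sep == "=" and value.isdigit()


sample = "port=8080"
# Assume data-flow tracking has already mapped characters 5..9 ("8080")
# to a single entity, for example a variable named `port`.
result = generalize_fragment(sample, 5, 9, toy_program)
print(result)
```

In this toy setting the fragment generalizes to `DIGITS` but not to `ALPHANUM`, because the probe `port=a` is rejected. A production rule such as `PORT ::= DIGITS` could then replace the literal `8080` taken from the single sample.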

research
12/18/2018

Inputs from Hell: Generating Uncommon Inputs from Common Samples

Generating structured input files to test programs can be performed by t...
research
10/18/2018

Sample-Free Learning of Input Grammars for Comprehensive Software Fuzzing

Generating valid test inputs for a program is much easier if one knows t...
research
12/12/2019

Inferring Input Grammars from Dynamic Control Flow

A program is characterized by its input model, and a formal input model ...
research
11/30/2018

Zest: Validity Fuzzing and Parametric Generators for Effective Random Testing

Programs expecting structured inputs often consist of both a syntactic a...
research
12/21/2022

When and Why Test Generators for Deep Learning Produce Invalid Inputs: an Empirical Study

Testing Deep Learning (DL) based systems inherently requires large and r...
research
12/08/2022

SkipFuzz: Active Learning-based Input Selection for Fuzzing Deep Learning Libraries

Many modern software systems are enabled by deep learning libraries such...
research
11/02/2019

WEIZZ: Automatic Grey-box Fuzzing for Structured Binary Formats

Fuzzing technologies have evolved at a fast pace in recent years, reveal...