Inputs from Hell: Generating Uncommon Inputs from Common Samples

12/18/2018
by Esteban Pavese, et al.

Structured input files for testing programs can be generated from a grammar that serves as the specification for syntactically correct inputs. Two interesting scenarios then arise for effective testing. In the first scenario, software engineers would like to generate inputs that are as similar as possible to the inputs in common usage of the program, to test the reliability of the program. More interesting is the second scenario, where inputs should be as dissimilar as possible from normal usage. This is useful for robustness testing and for exploring as yet uncovered behavior. To provide test cases for both scenarios, we leverage a context-free grammar to parse a set of sample input files that represent the program's common usage, and determine probabilities for individual grammar productions as they occur while parsing the inputs. Replicating these probabilities during grammar-based test input generation, we obtain inputs that are close to the samples. Inverting these probabilities yields inputs that are strongly dissimilar to common inputs, yet still valid with respect to the grammar. Our evaluation on three common input formats (JSON, JavaScript, CSS) shows the effectiveness of these approaches in obtaining instances from both sets of inputs.
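The pipeline described in the abstract (count production uses in samples, replicate the resulting probabilities, then invert them) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the toy digit-string grammar, the hand-written counting routine that stands in for a real parser, the add-one smoothing, and the choice of inverting by weighting each alternative proportionally to 1/p. The paper's exact inversion scheme may differ.

```python
import random
from collections import Counter

# Toy grammar (hypothetical): non-empty strings of digits.
GRAMMAR = {
    "<digits>": [("<digit>",), ("<digit>", "<digits>")],
    "<digit>": [(d,) for d in "0123456789"],
}

def count_productions(samples):
    """Count production uses; a hand-written stand-in for a real parser."""
    counts = Counter()
    for s in samples:
        for i, ch in enumerate(s):
            is_last = i == len(s) - 1
            alt = ("<digit>",) if is_last else ("<digit>", "<digits>")
            counts[("<digits>", alt)] += 1
            counts[("<digit>", (ch,))] += 1
    return counts

def probabilities(counts, smoothing=1.0):
    """Per-nonterminal probabilities; smoothing (an assumption) avoids zeros."""
    probs = {}
    for nt, alts in GRAMMAR.items():
        weights = [counts[(nt, alt)] + smoothing for alt in alts]
        total = sum(weights)
        probs[nt] = [w / total for w in weights]
    return probs

def invert(probs):
    """One simple inversion: reweight each alternative proportionally to 1/p."""
    inverted = {}
    for nt, ps in probs.items():
        weights = [1.0 / p for p in ps]
        total = sum(weights)
        inverted[nt] = [w / total for w in weights]
    return inverted

def generate(probs, rng, start="<digits>"):
    """Expand the start symbol, choosing alternatives by probability."""
    out, stack = [], [start]
    while stack:
        symbol = stack.pop()
        if symbol in GRAMMAR:
            alt = rng.choices(GRAMMAR[symbol], weights=probs[symbol])[0]
            stack.extend(reversed(alt))  # leftmost symbol is expanded first
        else:
            out.append(symbol)
    return "".join(out)

samples = ["111", "121", "11"]            # made-up "common" inputs
common = probabilities(count_productions(samples))
uncommon = invert(common)
rng = random.Random(0)
common_inputs = [generate(common, rng) for _ in range(5)]
uncommon_inputs = [generate(uncommon, rng) for _ in range(5)]
```

In this sketch the digit `1` dominates the samples, so it is the most likely digit under `common` and the least likely under `uncommon`, while every generated string remains valid with respect to the grammar.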


