My Fuzzer Beats Them All! Developing a Framework for Fair Evaluation and Comparison of Fuzzers

08/16/2021
by David Paaßen, et al.

Fuzzing has become one of the most popular techniques for identifying bugs in software. To improve the fuzzing process, a plethora of techniques have recently appeared in the academic literature. However, evaluating and comparing these techniques is challenging because fuzzers depend on randomness when generating test inputs. Existing evaluations commonly follow best practices for fuzzing evaluations only partially. We argue that there are two reasons for this. First, it is unclear whether the proposed guidelines are necessary, because comprehensive empirical data for fuzz testing is lacking. Second, no framework yet exists that integrates statistical evaluation techniques to enable a fair comparison of fuzzers. To address these limitations, we introduce a novel fuzzing evaluation framework called SENF (Statistical EvaluatioN of Fuzzers). We demonstrate the practical applicability of our framework by using the most widespread fuzzer, AFL, as our baseline and exploring the impact of different evaluation parameters (e.g., the number of repetitions or the run-time), compilers, seeds, and fuzzing strategies. Using our evaluation framework, we show that seemingly small changes to these parameters can have a major influence on the measured performance of a fuzzer.
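
To illustrate why repeated runs call for a statistical comparison rather than a single number, here is a minimal sketch. The abstract does not say which statistical techniques SENF integrates, so the Vargha-Delaney A12 effect size used here is an assumption, chosen only because it is a common way to compare two randomized algorithms. The function name `a12` and the bug counts are hypothetical and are not results from the paper.

```python
from itertools import product


def a12(runs_a, runs_b):
    """Vargha-Delaney A12 effect size.

    Estimates the probability that a randomly chosen run of fuzzer A
    scores higher than a randomly chosen run of fuzzer B (ties count half).
    0.5 means no difference; values near 1.0 favor A, near 0.0 favor B.
    """
    wins = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a, b in product(runs_a, runs_b)
    )
    return wins / (len(runs_a) * len(runs_b))


# Hypothetical data: bugs found in five repeated runs of each fuzzer.
fuzzer_a = [12, 15, 14, 10, 13]
fuzzer_b = [11, 9, 14, 8, 10]

effect = a12(fuzzer_a, fuzzer_b)
print(f"A12 = {effect:.2f}")
```

Because a single pair of runs could rank the two fuzzers either way (for example, 10 bugs for A against 14 for B), an effect size computed over all repetitions gives a more stable picture than any one trial.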

research
09/02/2020

Magma: A Ground-Truth Fuzzing Benchmark

High scalability and low running costs have made fuzz testing the de fac...
research
08/29/2018

Evaluating Fuzz Testing

Fuzz testing has enjoyed great success at discovering security critical ...
research
07/24/2023

A conceptual framework for SPI evaluation

Software Process Improvement (SPI) encompasses the analysis and modifica...
research
03/04/2022

Towards Benchmarking and Evaluating Deepfake Detection

Deepfake detection automatically recognizes the manipulated medias throu...
research
08/15/2022

Evaluating Dense Passage Retrieval using Transformers

Although representational retrieval models based on Transformers have be...
research
03/09/2023

RCABench: Open Benchmarking Platform for Root Cause Analysis

Fuzzing has contributed to automatically identifying bugs and vulnerabil...
research
04/05/2022

Design Guidelines for Inclusive Speaker Verification Evaluation Datasets

Speaker verification (SV) provides billions of voice-enabled devices wit...
