Explainable Fuzzer Evaluation

12/19/2022
by Dylan Wolff, et al.

While the aim of fuzzer evaluation is to establish fuzzer performance in general, an evaluation is always conducted on a specific benchmark. In this paper, we investigate the degree to which the benchmarking result depends on the properties of the benchmark, and we propose a methodology to quantify the impact of benchmark properties on the benchmarking result relative to the impact of the choice of fuzzer. We found that the measured performance and ranking of a fuzzer depend substantially on the properties of the programs and seed corpora used during evaluation. For instance, if the benchmark contained larger programs or seed corpora with higher initial coverage, AFL's ranking would improve while LibFuzzer's ranking would worsen. We describe our methodology as explainable fuzzer evaluation because it explains why a specific evaluation setup yields the observed superiority or ranking of the fuzzers and how the result might change for different benchmarks. We envision that our analysis can be used to assess the degree to which evaluation results are overfitted to the benchmark and to identify the specific conditions under which different fuzzers perform better than others.
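
A back-of-the-envelope sketch of the kind of analysis this abstract suggests (not the paper's actual methodology): fit a linear model with fuzzer/benchmark-property interaction terms, so each fuzzer's expected performance can shift with properties such as program size or initial seed coverage. The file fuzzing_trials.csv and its column names are illustrative assumptions, not artifacts from the paper:

    # Hypothetical sketch, not the paper's actual methodology.
    # Assumed input: one row per fuzzing trial, recording which fuzzer ran,
    # two benchmark properties, and the coverage the trial achieved.
    import pandas as pd
    import statsmodels.formula.api as smf

    df = pd.read_csv("fuzzing_trials.csv")  # illustrative file name

    # Interaction terms (C(fuzzer) * property) let each fuzzer's expected
    # coverage vary with the benchmark properties, capturing effects like
    # "AFL gains rank as program size or initial seed coverage grows".
    model = smf.ols(
        "coverage ~ C(fuzzer) * (log_program_size + initial_seed_coverage)",
        data=df,
    ).fit()
    print(model.summary())

The interaction coefficients quantify how much each benchmark property moves a given fuzzer's measured performance relative to the others, which is the comparison the abstract describes.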

Related research

- Efficient Benchmarking (of Language Models) (08/22/2023)
  The increasing versatility of language models (LMs) has given rise to a ne...

- Rare-Seed Generation for Fuzzing (12/18/2022)
  Starting with a random initial seed, fuzzers search for inputs that trig...

- MEUZZ: Smart Seed Scheduling for Hybrid Fuzzing (02/20/2020)
  Seed scheduling is a prominent factor in determining the yields of hybri...

- Towards Benchmarking and Evaluating Deepfake Detection (03/04/2022)
  Deepfake detection automatically recognizes manipulated media throu...

- Does the Objective Matter? Comparing Training Objectives for Pronoun Resolution (10/06/2020)
  Hard cases of pronoun resolution have been used as a long-standing bench...

- RD-Suite: A Benchmark for Ranking Distillation (06/07/2023)
  The distillation of ranking models has become an important topic in both...

- BeFaaS: An Application-Centric Benchmarking Framework for FaaS Platforms (02/25/2021)
  Following the increasing interest and adoption of FaaS systems, benchmar...
