Magma: A Ground-Truth Fuzzing Benchmark

09/02/2020
by Ahmad Hazimeh, et al.

High scalability and low running costs have made fuzz testing the de facto standard for discovering software bugs. Fuzzing techniques are constantly being improved in a race to build the ultimate bug-finding tool. However, while fuzzing excels at finding bugs, evaluating and comparing fuzzer performance is challenging due to the lack of metrics and benchmarks. Crash count, the most common performance metric, is inaccurate due to imperfections in deduplication techniques. Moreover, the lack of a unified set of targets results in ad hoc evaluations that inhibit fair comparison. We tackle these problems by developing Magma, a ground-truth evaluation framework that enables uniform fuzzer evaluation and comparison. By introducing real bugs into real software, Magma allows for realistic evaluation of fuzzers against a broad set of targets. By instrumenting these bugs, Magma also enables the collection of bug-centric performance metrics independent of the fuzzer. Magma is an open benchmark consisting of seven targets that perform a variety of input manipulations and complex computations, presenting a challenge to state-of-the-art fuzzers. We evaluate six popular mutation-based greybox fuzzers (AFL, AFLFast, AFL++, FairFuzz, MOpt-AFL, and honggfuzz) against Magma over 200 000 CPU-hours. Based on the number of bugs reached, triggered, and detected, we draw conclusions about the fuzzers' exploration and detection capabilities. This provides insight into fuzzer performance evaluation, highlighting the importance of ground truth in performing more accurate and meaningful evaluations.
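The distinction between a bug being reached and being triggered can be illustrated with a minimal sketch. This is not Magma's actual instrumentation: the `BugOracle` class, the `parse_chunk` target, and the bug identifier `BUG001` are all hypothetical names chosen for illustration, and the sketch assumes that "reached" means the faulty code was executed while "triggered" means the fault condition held for that input.

```python
from collections import Counter


class BugOracle:
    """Hypothetical bookkeeping for instrumented bugs (not Magma's real API)."""

    def __init__(self):
        self.reached = Counter()    # faulty code was executed
        self.triggered = Counter()  # fault condition was satisfied

    def log(self, bug_id, condition):
        # Every call means execution reached the instrumented location.
        self.reached[bug_id] += 1
        # The bug counts as triggered only if the fault condition holds.
        if condition:
            self.triggered[bug_id] += 1


oracle = BugOracle()


def parse_chunk(data: bytes, capacity: int = 8) -> bytes:
    """Hypothetical target function carrying an injected bug 'BUG001'."""
    # Assumed fault condition: the input is longer than the buffer capacity.
    oracle.log("BUG001", len(data) > capacity)
    return data[:capacity]


# Three hypothetical fuzzer-generated inputs; only the second is oversized.
for test_input in (b"abc", b"0123456789", b""):
    parse_chunk(test_input)

print("reached:", oracle.reached["BUG001"])
print("triggered:", oracle.triggered["BUG001"])
```

Because the counters live in the target rather than in the fuzzer, the same measurements can be collected for any fuzzer that runs it. Whether a triggered bug is also detected depends on the fuzzer noticing a fault, which this sketch does not model.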


