afl-test-viz
Visualizing tests generated in AFL during fuzzing
Software fuzzing is a strong testing technique that has become the de facto approach for automated software testing and software vulnerability detection in industry. The random nature of fuzzing makes monitoring and understanding the behavior of fuzzers difficult. In this paper, we report the development of Fuzzer Mutation Visualizer (FMViz), a tool that focuses on visualizing byte-level mutations in fuzzers. In particular, FMViz extends American Fuzzy Lop (AFL) to visualize the generated test inputs and highlight changes between consecutively generated seeds as a fuzzing campaign progresses. The overarching goal of our tool is to help developers and students better comprehend the inner workings of the AFL fuzzer. We present the architecture of FMViz, discuss a sample case study, and outline future work. FMViz is open-source and publicly available at https://github.com/AftabHussain/afl-test-viz.
Fuzzing has become a widely popular tool for testing programs in the software industry. The overall simplicity of the design of coverage-guided fuzzers (e.g., AFL [21]) – generating test inputs by mutating other test inputs in a pseudo-random fashion while optimizing for code coverage in the test subject, and executing them on test subjects at scale – has been very effective in finding bugs and vulnerabilities in software systems. Large companies have integrated fuzzers in their testing ecosystem; for instance, Google is continuously running fuzzers on its Chrome browser to find vulnerabilities.
There has been significant research in building efficient fuzzers that can generate interesting test inputs faster in a fuzzing campaign: e.g., grammar-based fuzzing [16, 8], data-flow techniques [15, 14], stochastic scheduling methods [13], smarter test selection methods [10], etc. Nevertheless, there is one theme that all fuzzers have in common, albeit in varying degrees: randomness, which contributes to the “black-box” nature of their operation.
This randomness makes it challenging for developers to understand and interpret which operations are being carried out on which test inputs, and to reason about the behavior of fuzzers. While fuzzers such as AFL do offer high-level statistics on the operations being performed, the information is shown in a hard-to-follow descriptive manner (the stats are real-time and constantly changing). They also store data for coverage-increasing test inputs only, and provide no support for understanding how the other tests were generated (which may contain useful information). In addition, the real-time statistics do not show which test inputs are being changed (i.e., mutated). Furthermore, the initial choice of test inputs in a fuzzing campaign can considerably influence its progress [10]. We thus believe it is necessary to have an approach for understanding how the test inputs are being mutated, and we address the problem from a visualization angle.
The importance of software visualization (SV) cannot be overemphasized. The superiority of visual memory in cognition is discussed in Diehl’s seminal work [5], which mentions that 75% of all information from the real world is perceived visually, 13% through auditory senses, and the rest through other senses. Despite its value, SV still has huge unrealized potential in software engineering [4] and, to the best of our knowledge, even more so in fuzzing (we discuss the few existing works we found in Section 2). Towards the idea of bringing better visualization to fuzzing, we build a visualization approach for the integral component of almost all state-of-the-art fuzzers: mutation. Our tool, FMViz, helps us see which bytes of a test input undergo mutations during an AFL fuzzing campaign, and thereby makes mutation patterns in fuzzing more perceivable. The tool is light-weight and easy to extend to other fuzzers. We believe this work is a stepping stone towards inspecting the behavior of mutational fuzzers on various test inputs at a deeper level.
Contributions. The main contributions of this paper are as follows:
We provide a light-weight approach to visualize fuzzing mutation behavior in AFL by visualizing test inputs generated during fuzzing and highlighting the changes.
We instantiate the approach in FMViz by capturing and displaying mutation locations in test inputs that undergo mutation in the fuzzing process; observing a series of FMViz output images helps in seeing the various mutation patterns that take place during fuzzing.
We present a short demonstration of FMViz with an AFL fuzzing process on libxml2, where we present some mutation patterns captured by the tool.
Information visualization has been widely used in different realms of software engineering, including bug analysis [20], evolution [7], and refactoring [11, 1]; it makes it easier for developers to understand, analyze, and carry out various software engineering tasks. A few works have adopted visualization techniques in the fuzzing domain. For example, VisFuzz [22], an LLVM plugin that works on top of a modified version of AFL, is an interactive real-time visualization tool that visualizes constraints in the fuzz subject by extracting a call graph and a control flow graph from the subject code. FuzzSplore [6] provides statistical visualizations such as a coverage plot, which shows the number of edges covered over time by test inputs, and a plot that shows the number of interesting test inputs generated over the campaign. Vainio [18] provides a fuzzing visualization framework that adopts information visualization techniques (e.g., circle packing) to view fuzzing performance data such as CPU and memory usage statistics and power consumption. In [3], the coverage of a test subject’s call graph is visualized when fuzzed by Honggfuzz [9] and AFL. None of these works delve into visualizing mutations in fuzzers; in contrast, FMViz generates visuals of how mutations occur on the test inputs at the byte level.
In this section, we provide an overview of the architecture of FMViz. We also present information on its usage and performance. The implementation and documentation of FMViz are available on GitHub (https://github.com/AftabHussain/afl-test-viz).
Figure 1 depicts the architecture of FMViz and its main components, which are: Test Input Color Representation Generator and Test Input Image Generator.
Figure 2 shows a sample visualization output of FMViz of a test input.
FMViz extends AFL to capture the byte stream representations of new test inputs that are generated as AFL mutates original seeds (Figure 3(a)). FMViz saves these representations in a single file (a dump file in hexadecimal). Each byte’s hex code is chosen to represent a shade of red, depending on the value stored in the byte (we elaborate on the color representation in the following subsection). Each line of this file corresponds to the representation of a single test input.
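The dump format described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the actual generator lives inside the patched AFL code, and the exact on-disk layout (here, contiguous two-digit hex codes, one test input per line) is assumed rather than taken from the FMViz sources. The names to_dump_line and append_to_dump are hypothetical.

```python
def to_dump_line(test_input: bytes) -> str:
    """Return one dump line: two hex digits per byte of the test input.

    Assumption: bytes are written contiguously, with no separators.
    """
    return test_input.hex().upper()


def append_to_dump(path: str, test_input: bytes) -> None:
    """Append the hex representation of one test input as a new line."""
    with open(path, "a", encoding="ascii") as dump:
        dump.write(to_dump_line(test_input) + "\n")
```

For instance, the four-byte input `<a/>` would be stored as the line 3C612F3E under this assumed layout.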
The Test Input Image Generator (Figure 3(b)), written in Python, reads the file generated by the color representation generator line by line and generates PNG image files (each line corresponds to a test input, as mentioned previously). In the image output, each box represents a byte of a test input. To obtain the box colors we use the six-digit hex triplet, a three-byte hexadecimal number commonly used to specify colors in HTML, SVG, and similar applications. The three bytes give the red, green, and blue components of the color, respectively [19]. The box color for each byte of the test input is determined as follows: byte $xy$ is translated to the hex color code $\#xy0000$, where $xy$ is the hex representation of a test input byte, and $x$ and $y$ each belong to the set of 16 hexadecimal symbols ($0$–$9$, $A$–$F$). The box dimensions can be changed to vary the number of test input bytes displayed in the image. The PNG files can be used independently to represent individual test inputs, or can be used to generate a time-lapse video of the evolution of the test inputs using a linear image interpolator [17] or a screen recorder [2].
The present implementation is adapted for an AFL fuzzing campaign with a single test input. Although the overhead of writing to the color dump file is minimal during the fuzzing campaign, a single file is used, so the file can grow very large over long fuzzing periods. We are considering extending FMViz to reduce storage use through a more compressed representation of the test inputs.
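The byte-to-color mapping and the box layout can be illustrated with a small, standard-library-only sketch. It is not the code of viz_tests.py: the function names, the default grid width and box size, the white padding for unused cells, and the choice of zero for the green and blue components (suggested by the "shade of red" description) are all assumptions made for this example.

```python
import struct
import zlib


def byte_to_rgb(value: int) -> tuple:
    """Map a byte to a shade of red (green and blue assumed to be zero)."""
    return (value, 0, 0)


def hex_color(value: int) -> str:
    """Return the six-digit hex triplet for a byte, e.g. 0x3C -> '#3C0000'."""
    return "#{:02X}0000".format(value)


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type and data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def render_png(test_input: bytes, columns: int = 16, box: int = 8) -> bytes:
    """Render a test input as a grid of colored boxes, one box per byte."""
    rows = max(1, -(-len(test_input) // columns))  # ceiling division
    width, height = columns * box, rows * box
    raw = bytearray()
    for row in range(rows):
        line = bytearray()
        for col in range(columns):
            index = row * columns + col
            if index < len(test_input):
                rgb = byte_to_rgb(test_input[index])
            else:
                rgb = (255, 255, 255)  # assumed padding color for empty cells
            line += bytes(rgb) * box  # repeat the pixel across the box width
        for _ in range(box):
            raw += b"\x00" + line  # each scanline starts with filter type 0
    # IHDR: width, height, 8-bit depth, color type 2 (RGB), default methods
    header = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (b"\x89PNG\r\n\x1a\n"
            + _chunk(b"IHDR", header)
            + _chunk(b"IDAT", zlib.compress(bytes(raw)))
            + _chunk(b"IEND", b""))
```

Writing the returned bytes to a file opened in binary mode gives an image that a standard viewer can open. Changing columns and box trades off how many bytes fit in one image against how large each box appears.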
In this section, we present a short demo of FMViz. The purpose of this demo is to show how the mutation locations in a test input are visually captured.
We applied FMViz’s representation generator on top of AFL to fuzz the XML C parser library libxml2 [12] (we used the version at https://github.com/GNOME/libxml2.git, commit id 1fbcf40) for seconds. This step produced test inputs and a dump file containing color representations of each of those test inputs. Next, we used FMViz’s image generator to parse the dump file and generate images for each test. The image generation process took slightly over five minutes for all tests. Viewing a series of consecutive test input color matrix images (in PNG format) with the default system image viewer then revealed patterns (we also produced a time-lapse video from the sequence of images, which is available in the repository). For this experiment, we used a computer system with an Intel(R) Xeon(R) 1.90GHz CPU and 64 GB RAM running Ubuntu 18.04.5 LTS.
Figure 4 depicts the mutation patterns that we observed. For visualizing each pattern, seven consecutively generated tests are shown.
2-byte, shifting mutation pattern (Figure 4(a)). Here, in every mutation iteration, the fuzzer mutates a pair of bytes of the test input. This pair-mutation operation progresses by shifting by one byte in the next iteration.
4-byte, shifting mutation pattern (Figure 4(b)). Here, in every mutation iteration, the fuzzer mutates a set of four bytes of the test input. The 4-byte-mutation operation progresses by shifting by one byte in the next iteration.
Single-byte, fixed mutation pattern (Figure 4(c)). Here, in every mutation iteration, the fuzzer mutates the same byte. The changing byte in the figure is shown with the yellow arrow.
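The patterns above can also be checked programmatically by comparing byte offsets between inputs. The sketch below is a hedged illustration, not part of FMViz: changed_offsets is a hypothetical helper, and flip_window is a stand-in that inverts a window of bytes to imitate a shifting mutation; it is not AFL's mutation code.

```python
def changed_offsets(before: bytes, after: bytes) -> list:
    """Return the byte offsets at which two test inputs differ.

    Offsets past the end of the shorter input count as changed.
    """
    shared = min(len(before), len(after))
    offsets = [i for i in range(shared) if before[i] != after[i]]
    offsets.extend(range(shared, max(len(before), len(after))))
    return offsets


def flip_window(seed: bytes, start: int, width: int) -> bytes:
    """Invert `width` bytes of the seed starting at `start` (illustrative)."""
    mutated = bytearray(seed)
    for i in range(start, min(start + width, len(seed))):
        mutated[i] ^= 0xFF
    return bytes(mutated)


if __name__ == "__main__":
    seed = b"<root/>\n"
    # Imitate the 2-byte shifting pattern: the window moves one byte per step.
    for start in range(3):
        test = flip_window(seed, start, 2)
        print(start, changed_offsets(seed, test))
```

With a window width of 2, each step reports a pair of adjacent offsets that advances by one byte; a width of 4 gives the analogous four-byte pattern, and a fixed start with width 1 gives the single-byte fixed pattern.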
In this work, building on the motivation of software visualization, we presented an easy-to-extend, light-weight visualization tool, FMViz, that helps us better perceive the mutation process in the AFL fuzzer. In particular, we visualize bytes of a test input that undergo mutation during fuzzing. FMViz encodes bytes of test inputs as colors and mutations are captured by changes in the colors. In the next steps, we plan to augment the visual representation of test inputs by other information such as coverage. We also plan to explore more efficient ways to store representations of test inputs. Furthermore, we plan to evaluate the usefulness of FMViz and similar visualization tools in teaching software testing to undergraduate students.
The International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, pp. 3–11.
A new hierarchical clustering technique for restructuring software at the function level. In Proceedings of the 6th India Software Engineering Conference, ISEC ’13, pp. 45–54.
The use of data visualization in fuzz test monitoring. Master’s Thesis, University of Oulu.
In this section, we describe the steps to run the first release of FMViz, version 1.0. We use libxml2 as our fuzzing test subject. All commands provided next are for the Linux environment.
1. FMViz Setup
In any directory, we clone the FMViz repository as follows (currently, the name of the repository has purposely been kept different from FMViz):
git clone --recursive git@github.com:AftabHussain/afl-test-viz.git
Then we build and install the AFL fuzzer, patched with FMViz’s Test Input Color Representation Generator component, by performing the following command:
cd afl-test-viz/code/AFL-mut-viz/AFL && make -j32 && make install
2. libxml2 Setup
Once we have set up the fuzzer, we build the test subject (libxml2) with AFL’s compiler (afl-gcc), which prepares the libxml2 binaries as fuzzing targets. We first obtain libxml2 as follows, in a folder outside the afl-test-viz directory:
git clone https://github.com/GNOME/libxml2.git && cd libxml2 && git checkout 1fbcf40
Finally, we configure and build libxml2 with the following command (omit the leading cd if the shell is already inside the libxml2 directory after the previous step):
cd libxml2 && export CC=afl-gcc && ./autogen.sh && make -j32
We now invoke the first part of FMViz, the augmented AFL fuzzer, which produces color representations (in hex) of test inputs generated while fuzzing the test subject. In this demo, we fuzz the libxml2 binary, xmllint. We thus enter the libxml2 folder, create an input folder (input), and place in it any XML file as a test input (some sample inputs are available in the FMViz repository):
cd libxml2 && mkdir input && cp [path_to_xml_file] input/
Thereafter, we invoke the fuzzer as follows:
export AFL_SKIP_CPUFREQ=1 && export LD_LIBRARY_PATH=./.libs/ &&
afl-fuzz -i input/ -o output/ -- ./.libs/xmllint -o /dev/null @@
The fuzzing process can be terminated anytime using Ctrl+C – on termination all results are saved in the output folder, output. Inside this folder, the color dump file tests_generated contains color representations of all the tests created by the fuzzer.
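Before generating images, it can be useful to check how many test inputs the dump file holds and how long they are. The following sketch assumes that each line of tests_generated is a contiguous string of two-digit hex codes; that layout, and the helper names read_dump and summarize, are assumptions for illustration rather than documented FMViz behavior.

```python
def read_dump(path: str) -> list:
    """Read a color dump file and return one bytes object per test input."""
    tests = []
    with open(path, "r", encoding="ascii") as dump:
        for line in dump:
            text = line.strip()
            if text:  # skip blank lines
                tests.append(bytes.fromhex(text))
    return tests


def summarize(tests: list) -> dict:
    """Return the number of test inputs and their shortest and longest sizes."""
    lengths = [len(test) for test in tests]
    return {
        "count": len(tests),
        "min_len": min(lengths, default=0),
        "max_len": max(lengths, default=0),
    }
```

Calling summarize(read_dump("output/tests_generated")) from the libxml2 folder would then report the size of the campaign's dump, provided the file follows the assumed layout.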
To generate test input images, we process the color dump file obtained in the previous phase. We place this file along with the Image Generation program (viz_tests.py) in a separate directory:
mkdir process_color_rep
cp libxml2/output/tests_generated process_color_rep/
cp afl-test-viz/code/viz_tests.py process_color_rep/
Finally we invoke the script:
cd process_color_rep/ && python viz_tests.py
The above command generates, in the process_color_rep directory, a PNG image for every test represented in the color dump file:
ls | xargs -n 1
. . .
file_000005564.png
file_000005565.png
file_000005566.png
file_000005567.png
file_000005568.png
file_000005569.png
file_000005570.png
file_000005571.png
file_000005572.png
. . .
A sample screenshot of a test input image, opened with Image Viewer, a default image viewer in Ubuntu, is shown in Figure 5. Since the image files for the input tests are named in the order in which they were produced during fuzzing, toggling over consecutive images in the image viewer application shows the trends in mutations. In order to produce a time-lapse video, we use Simple Screen Recorder [2], which once installed can be invoked by the command simplescreenrecorder on the terminal. Then by starting recording and toggling over multiple images on Image Viewer by holding the left/right arrow key, we are able to record the mutation transitions that take place.