FormatFuzzer: Effective Fuzzing of Binary File Formats

09/23/2021
by   Rafael Dutra, et al.

Effective fuzzing of programs that process structured binary inputs, such as multimedia files, is a challenging task, since those programs expect a very specific input format. Existing fuzzers, however, are mostly format-agnostic, which makes them versatile but also ineffective when a specific format is required. We present FormatFuzzer, a generator for format-specific fuzzers. FormatFuzzer takes as input a binary template (a format specification used by the 010 Editor) and compiles it into C++ code that acts as a parser, a mutator, and a highly efficient generator of inputs conforming to the rules of the language. The resulting format-specific fuzzer can be used as a standalone producer or mutator in black-box settings, where no guidance from the program is available. In addition, by providing mutable decision seeds, it can be easily integrated with arbitrary format-agnostic fuzzers such as AFL to make them format-aware. In our evaluation on complex formats such as MP4 or ZIP, FormatFuzzer proved to be a highly effective producer of valid inputs and also detected previously unknown memory errors in ffmpeg and timidity.
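
The idea of mutable decision seeds can be illustrated with a minimal sketch. It is not FormatFuzzer's generated C++ code and does not use a real 010 Editor binary template; the "TOY1" format and the names DecisionSeed, generate, and parse are hypothetical, chosen only for illustration. The sketch assumes that every choice the generator makes is read from a byte string, so a format-agnostic fuzzer can mutate that byte string freely while the generated file stays structurally valid.

```python
import struct


class DecisionSeed:
    """Hypothetical helper: a byte string that drives every generator choice."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0

    def choose(self, n: int) -> int:
        """Return a value in range(n), consuming one seed byte (0 once exhausted)."""
        byte = self.data[self.pos] if self.pos < len(self.data) else 0
        self.pos += 1
        return byte % n


def generate(seed: bytes) -> bytes:
    """Produce a valid file in the made-up TOY1 format from a decision seed.

    Layout (an assumption for this sketch): magic "TOY1", one byte chunk
    count, then per chunk a big-endian 16-bit length followed by the payload.
    """
    s = DecisionSeed(seed)
    out = bytearray(b"TOY1")
    n_chunks = s.choose(4)
    out.append(n_chunks)
    for _ in range(n_chunks):
        length = s.choose(8)
        payload = bytes(s.choose(256) for _ in range(length))
        out += struct.pack(">H", length) + payload
    return bytes(out)


def parse(data: bytes) -> list:
    """Parse a TOY1 file into its chunk payloads; raise ValueError if malformed."""
    if len(data) < 5 or data[:4] != b"TOY1":
        raise ValueError("bad header")
    n_chunks, pos, chunks = data[4], 5, []
    for _ in range(n_chunks):
        if pos + 2 > len(data):
            raise ValueError("truncated length field")
        (length,) = struct.unpack_from(">H", data, pos)
        pos += 2
        if pos + length > len(data):
            raise ValueError("truncated payload")
        chunks.append(data[pos:pos + length])
        pos += length
    if pos != len(data):
        raise ValueError("trailing bytes")
    return chunks


# The same seed always yields the same file; a mutated seed yields a
# different file that still parses.
original = generate(b"\x02\x01\xaa\x00")
mutated = generate(b"\x03\x01\xaa\x00")
```

In this sketch, a byte-level mutator never touches the file itself. It only changes the seed, and the generator re-derives consistent length fields and chunk counts on every run, which is the property that lets a format-agnostic fuzzer behave as if it were format-aware.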


research
12/15/2020

Looking for non-compliant documents using error messages from multiple parsers

Whether a file is accepted by a single parser is not a reliable indicati...
research
11/02/2019

WEIZZ: Automatic Grey-box Fuzzing for Structured Binary Formats

Fuzzing technologies have evolved at a fast pace in recent years, reveal...
research
03/28/2023

Specification-based CSV Support in VDM

CSV is a widely used format for data representing systems control, infor...
research
03/13/2018

Narcissus: Deriving Correct-By-Construction Decoders and Encoders from Binary Formats

Every injective function has an inverse, although constructing the inver...
research
06/11/2023

Augmenting Greybox Fuzzing with Generative AI

Real-world programs expecting structured inputs often have a format-parsi...
research
01/25/2017

Learn&Fuzz: Machine Learning for Input Fuzzing

Fuzzing consists of repeatedly testing an application with modified, or ...
research
04/19/2021

Inferring Drop-in Binary Parsers from Program Executions

We present BIEBER (Byte-IdEntical Binary parsER), the first system to mo...
