A Systematic Impact Study for Fuzzer-Found Compiler Bugs
Despite much recent interest in randomised testing (fuzzing) of compilers, the practical impact of fuzzer-found miscompilations on real-world applications has barely been assessed. We present the first quantitative study of the tangible impact of fuzzer-found compiler bugs. We follow a novel methodology in which the impact of a miscompilation bug is evaluated based on (1) whether the bug appears to trigger during compilation; (2) whether the effects of triggering the bug propagate to the generated binary code; and (3) whether binary-level propagation leads to observable differences in the application's test suite results. The study covers the compilation of more than 11 million lines of C/C++ code from 318 Debian packages, using 45 historical bugs in the Clang/LLVM compiler that were found by four distinct fuzzers, by the Alive formal verification tool, or by human users. The results show that almost half of the fuzzer-found bugs propagate to the generated binaries for some packages, but never cause application test suite failures. User-reported and Alive bugs have a lower impact: they trigger less frequently and likewise cause no test failures. The major conclusions are that (1) either application test suites do not reflect real-world usage or the impact of compiler bugs on real-world code is limited, and (2) to the extent that compiler bugs matter, fuzzer-found compiler bugs are first-class citizens, having at least as much impact as bugs from other sources.
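
The three-stage evaluation described above can be read as a cascade in which each stage is only meaningful if the previous one held. The following is a minimal sketch of that reading, not the study's actual tooling: the names `Impact`, `Observation`, and `classify` are hypothetical, and the three boolean fields are assumed to have been measured elsewhere for a single (bug, package) pair.

```python
from dataclasses import dataclass
from enum import Enum


class Impact(Enum):
    """Hypothetical impact levels, ordered from weakest to strongest."""
    NOT_TRIGGERED = 0   # stage 1 failed: the bug never fired during compilation
    TRIGGERED_ONLY = 1  # stage 1 held, but the generated binary was unaffected
    BINARY_DIFFERS = 2  # stage 2 held, but test suite results were unchanged
    TEST_FAILURE = 3    # stage 3 held: test suite results differ


@dataclass(frozen=True)
class Observation:
    """Assumed measurements for one (bug, package) pair."""
    bug_triggered: bool   # (1) did the bug appear to trigger during compilation?
    binary_differs: bool  # (2) did the effect propagate to the binary code?
    tests_differ: bool    # (3) did the test suite results observably change?


def classify(obs: Observation) -> Impact:
    """Return the deepest stage reached, checking the stages in order."""
    if not obs.bug_triggered:
        return Impact.NOT_TRIGGERED
    if not obs.binary_differs:
        return Impact.TRIGGERED_ONLY
    if not obs.tests_differ:
        return Impact.BINARY_DIFFERS
    return Impact.TEST_FAILURE
```

Under this sketch, the headline result for fuzzer-found bugs corresponds to many pairs reaching `BINARY_DIFFERS` and none reaching `TEST_FAILURE`.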