Input Validation with Symbolic Execution
Symbolic execution has long struggled with programs that require highly structured inputs. Most often, the symbolic execution engine is overwhelmed by the sheer number of infeasible paths and fails to explore enough feasible paths to achieve meaningful coverage. In this paper, we propose a system, InVaSion, that attempts to solve this problem for forking-based symbolic execution engines. We propose an input specification language (ISL) that is based on a finite-state automaton but adds guarded transitions, a set of registers, and a set of commands that update the register states. We demonstrate that our language is expressive enough to handle complex input specifications, such as the TIFF image format, without requiring substantial human effort: even the TIFF image specification could be expressed in our language with an automaton of about 35 states. InVaSion translates the given program and the input specification into a non-deterministic program and uses symbolic execution to instantiate the non-determinism. This allows our tool to work with any forking-based symbolic execution engine, without requiring any special theory solver. On average over our set of benchmarks, InVaSion was able to increase branch coverage from 24.97 over baseline KLEE.
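To make the idea of an automaton with guarded transitions, registers, and register-update commands more concrete, the following is a minimal sketch in Python. It does not use InVaSion's actual ISL syntax, which the abstract does not show; the names `Transition`, `accepts`, `LENGTH_PREFIXED`, and `is_valid` are hypothetical, and the toy format (one length byte followed by exactly that many payload bytes) is an assumption chosen for illustration. The sketch only checks concrete inputs against the specification. It does not perform the translation into a non-deterministic program or any symbolic execution.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

Registers = Dict[str, int]


@dataclass
class Transition:
    """One guarded transition of a register automaton (hypothetical model)."""
    src: str
    dst: str
    guard: Callable[[int, Registers], bool]    # may the transition fire on this byte?
    update: Callable[[int, Registers], None]   # command that updates the registers


def accepts(data: bytes, transitions: List[Transition], start: str,
            accepting: Set[str], init_regs: Registers) -> bool:
    """Run the automaton on concrete input, tracking every enabled choice."""
    configs: List[Tuple[str, Registers]] = [(start, dict(init_regs))]
    for byte in data:
        next_configs: List[Tuple[str, Registers]] = []
        for state, regs in configs:
            for t in transitions:
                if t.src == state and t.guard(byte, regs):
                    new_regs = dict(regs)      # each branch gets its own registers
                    t.update(byte, new_regs)
                    next_configs.append((t.dst, new_regs))
        if not next_configs:
            return False                       # no transition enabled: reject
        configs = next_configs
    return any(state in accepting for state, _ in configs)


def set_remaining(byte: int, regs: Registers) -> None:
    regs["remaining"] = byte                   # remember the declared payload length


def consume_one(byte: int, regs: Registers) -> None:
    regs["remaining"] -= 1                     # one payload byte has been read


def no_op(byte: int, regs: Registers) -> None:
    pass


# Toy specification (an assumption): <length byte n> followed by exactly n bytes.
LENGTH_PREFIXED = [
    Transition("len", "body", lambda b, r: b > 0, set_remaining),
    Transition("len", "done", lambda b, r: b == 0, no_op),
    Transition("body", "body", lambda b, r: r["remaining"] > 1, consume_one),
    Transition("body", "done", lambda b, r: r["remaining"] == 1, consume_one),
]


def is_valid(data: bytes) -> bool:
    return accepts(data, LENGTH_PREFIXED, "len", {"done"}, {"remaining": 0})
```

Under these assumptions, `is_valid(b"\x03abc")` accepts, while `is_valid(b"\x03ab")` and `is_valid(b"\x02abc")` reject because the payload length disagrees with the `remaining` register. A plain finite-state automaton could not express this dependency without one state per possible length, which is presumably the kind of saving that registers and guards provide.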