Interval Parsing Grammars for File Format Parsing

04/10/2023
by   Jialun Zhang, et al.

File formats specify how data is encoded for persistent storage. They cannot be formalized as context-free grammars because their specifications include context-sensitive patterns such as the random access pattern and the type-length-value pattern. We propose a new grammar mechanism called Interval Parsing Grammars (IPGs) for file format specifications. An IPG attaches to every nonterminal/terminal an interval, which specifies the range of input the nonterminal/terminal consumes. By connecting intervals and attributes, IPGs can handle the context-sensitive patterns found in file formats. In this paper, we formalize the syntax and semantics of IPGs; the semantics leads naturally to a parser generator that produces a recursive-descent parser from an IPG. In general, IPGs are declarative, modular, and enable termination checking. We have used IPGs to specify a number of file formats, including ZIP, ELF, GIF, PE, and part of PDF; we have also evaluated the performance of the generated parsers.
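
To make the interval idea concrete, here is a minimal hand-written Python sketch of the two patterns the abstract names. It is not the paper's IPG notation or the output of its parser generator; the record layout (1-byte tag, 2-byte big-endian length, then the value), the 4-byte offset header, and all function names are assumptions chosen for illustration. Each parsing function receives an explicit interval [lo, hi) of the input, and the interval of a later field is computed from an attribute (the parsed length or offset) of an earlier one.

```python
from dataclasses import dataclass


@dataclass
class TLV:
    tag: int
    length: int
    value: bytes


def parse_uint(data: bytes, lo: int, hi: int) -> int:
    """Parse the interval [lo, hi) of data as a big-endian unsigned integer."""
    if not (0 <= lo < hi <= len(data)):
        raise ValueError(f"interval [{lo}, {hi}) is empty or out of bounds")
    return int.from_bytes(data[lo:hi], "big")


def parse_tlv(data: bytes, lo: int, hi: int) -> TLV:
    """Type-length-value pattern inside the enclosing interval [lo, hi).

    Hypothetical layout: 1-byte tag, 2-byte length, then `length` value bytes.
    """
    if lo + 3 > hi:
        raise ValueError("enclosing interval too small for the TLV header")
    tag = parse_uint(data, lo, lo + 1)
    length = parse_uint(data, lo + 1, lo + 3)
    # The value's interval depends on the parsed `length` attribute.
    value_lo, value_hi = lo + 3, lo + 3 + length
    if value_hi > hi:
        raise ValueError("value interval exceeds the enclosing interval")
    return TLV(tag, length, data[value_lo:value_hi])


def parse_file(data: bytes) -> TLV:
    """Random access pattern: a 4-byte header stores the record's offset."""
    offset = parse_uint(data, 0, 4)
    # Jump to the offset; the record may use the rest of the input.
    return parse_tlv(data, offset, len(data))


# Header says the record starts at byte 6; bytes 4-5 are skipped padding.
sample = b"\x00\x00\x00\x06" + b"\xff\xff" + b"\x01\x00\x03abc"
record = parse_file(sample)
```

In this sketch `parse_file(sample)` jumps over the two padding bytes and returns a record with tag 1, length 3, and value `b"abc"`. A length field that points past the enclosing interval raises `ValueError` instead of reading out of bounds, which is the kind of check that explicit intervals make easy to state.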

research
05/13/2020

Pika parsing: parsing in reverse solves the left recursion and error recovery problems

A recursive descent parser is built from a set of mutually-recursive fun...
research
01/13/2020

A Verified Packrat Parser Interpreter for Parsing Expression Grammars

Parsing expression grammars (PEGs) offer a natural opportunity for build...
research
12/04/2018

Using Binary File Format Description Languages for Documenting, Parsing, and Verifying Raw Data in TAIGA Experiment

The paper is devoted to the issues of raw binary data documenting, parsi...
research
06/28/2018

Syntax Error Recovery in Parsing Expression Grammars

Parsing Expression Grammars (PEGs) are a formalism used to describe top-...
research
03/14/2023

Happy-GLL: modular, reusable and complete top-down parsers for parameterized nonterminals

Parser generators and parser combinator libraries are the most popular t...
research
01/14/2020

The geometry of syntax and semantics for directed file transformations

We introduce a conceptual framework that associates syntax and semantics...
research
04/11/2023

flap: A Deterministic Parser with Fused Lexing

Lexers and parsers are typically defined separately and connected by a t...
